duplicates

Remove duplicate rows from Pandas dataframe where only some columns have the same value

安稳与你 submitted on 2019-11-26 16:41:32
I have a pandas dataframe as follows: A B C 1 2 x 1 2 y 3 4 z 3 5 x I want only one row to remain of the rows that share the same values in specific columns. In the example above I mean columns A and B . In other words, if a combination of values in columns A and B occurs more than once in the dataframe, only one row should remain (which one does not matter). FWIW: the maximum number of so-called duplicate rows (that is, rows where columns A and B are the same) is 2. The result should look like this: A B C 1 2 x 3 4 z 3 5 x or A B C 1 2 y 3 4 z 3 5 x Use drop_duplicates with the parameter subset , for keeping only last
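The snippet above cuts off at the answer; a minimal sketch of the `drop_duplicates(subset=...)` approach it points to, using the question's own data:

```python
import pandas as pd

# Build the dataframe from the question.
df = pd.DataFrame({'A': [1, 1, 3, 3],
                   'B': [2, 2, 4, 5],
                   'C': ['x', 'y', 'z', 'x']})

# Deduplicate on columns A and B only; keep='last' keeps the last of each
# duplicate pair (keep='first' is the default and works just as well here,
# since the question says either row may survive).
deduped = df.drop_duplicates(subset=['A', 'B'], keep='last').reset_index(drop=True)
```

With `keep='last'` this yields the second of the two acceptable results shown in the question (the row with C = y survives).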

Removing Duplicates From Dictionary

可紊 submitted on 2019-11-26 16:37:51
I have the following Python 2.7 dictionary data structure (I do not control the source data - it comes from another system as is): {112762853378: {'dst': ['10.121.4.136'], 'src': ['1.2.3.4'], 'alias': ['www.example.com'] }, 112762853385: {'dst': ['10.121.4.136'], 'src': ['1.2.3.4'], 'alias': ['www.example.com'] }, 112760496444: {'dst': ['10.121.4.136'], 'src': ['1.2.3.4'] }, 112760496502: {'dst': ['10.122.195.34'], 'src': ['4.3.2.1'] }, 112765083670: ... } The dictionary keys will always be unique. dst, src, and alias can be duplicates. All records will always have a dst and src but not every record
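The snippet stops before any answer; one way to collapse records whose payloads are identical is to fingerprint each value dict and keep the first key seen. This is a sketch, not the thread's accepted answer, and the helper name `dedupe_records` is my own; the key names and sample data mirror the question:

```python
# Collapse records whose dst/src/alias payloads are identical, keeping the
# lowest key of each duplicate group.
def dedupe_records(records):
    seen = {}
    for key in sorted(records):  # sort for a deterministic "first" key
        # Freeze the value dict into a hashable form: sorted (field, tuple) pairs.
        fingerprint = tuple(sorted((k, tuple(v)) for k, v in records[key].items()))
        if fingerprint not in seen:
            seen[fingerprint] = key
    return {key: records[key] for key in seen.values()}

data = {
    112762853378: {'dst': ['10.121.4.136'], 'src': ['1.2.3.4'], 'alias': ['www.example.com']},
    112762853385: {'dst': ['10.121.4.136'], 'src': ['1.2.3.4'], 'alias': ['www.example.com']},
    112760496444: {'dst': ['10.121.4.136'], 'src': ['1.2.3.4']},
    112760496502: {'dst': ['10.122.195.34'], 'src': ['4.3.2.1']},
}
unique = dedupe_records(data)
```

Note the record without an alias is treated as distinct from the two that have one, which matches the "not every record" caveat in the question.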

How to remove duplicates from unsorted std::vector while keeping the original ordering using algorithms?

谁都会走 submitted on 2019-11-26 16:26:38
I have an array of integers that I need to remove duplicates from while maintaining the order of the first occurrence of each integer. I can see doing it like this, but I imagine there is a better way that makes better use of STL algorithms. The insertion is out of my control, so I cannot check for duplicates before inserting. int unsortedRemoveDuplicates(std::vector<int> &numbers) { std::set<int> uniqueNumbers; std::vector<int>::iterator allItr = numbers.begin(); std::vector<int>::iterator unique = allItr; std::vector<int>::iterator endItr = numbers.end(); for (; allItr != endItr; ++allItr) {
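The C++ code above is cut off mid-loop; the underlying seen-set technique is language-agnostic, so here is the same single-pass idea sketched in Python (one linear walk, a set for O(1) average membership tests, first occurrence wins):

```python
# Order-preserving deduplication: keep an element only the first time it
# appears, using a set to test membership in O(1) on average.
def unsorted_remove_duplicates(numbers):
    seen = set()
    result = []
    for n in numbers:
        if n not in seen:
            seen.add(n)
            result.append(n)
    return result
```

The C++ version in the question does the same thing in place with iterators, which is why it needs both a `unique` write position and a `std::set` of already-seen values.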

How to repeat Pandas data frame?

☆樱花仙子☆ submitted on 2019-11-26 16:04:29
Question: This is my data frame that should be repeated 5 times: >>> x = pd.DataFrame({'a':1,'b':2},index = range(1)) >>> x a b 0 1 2 I want the result to look like this: >>> x.append(x).append(x).append(x) a b 0 1 2 0 1 2 0 1 2 0 1 2 But there must be a smarter way than appending over and over. The data frame I'm actually working on should be repeated 50 times. I haven't found anything practical, including things like np.repeat - it just doesn't work on a data frame. Could anyone help? Answer 1: You can use
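The answer is truncated after "You can use"; a minimal sketch of the usual replacement for chained `append` is `pd.concat` on a repeated list:

```python
import pandas as pd

# The one-row frame from the question.
x = pd.DataFrame({'a': 1, 'b': 2}, index=range(1))

# Repeat it 5 times in one call; ignore_index=False keeps the repeated
# 0, 0, 0, ... index shown in the question's expected output.
repeated = pd.concat([x] * 5, ignore_index=False)
```

For 50 repetitions the only change is `[x] * 50`, which avoids 49 intermediate copies that chained `append` would create.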

Scala: Remove duplicates in list of objects

我怕爱的太早我们不能终老 submitted on 2019-11-26 15:59:38
I've got a list of objects List[Object] which are all instantiated from the same class. This class has a field which must be unique: Object.property . What is the cleanest way to iterate the list of objects and remove all objects (but the first) with the same property? IttayD: list.groupBy(_.property).map(_._2.head) Explanation: The groupBy method accepts a function that converts an element to a key for grouping. _.property is just shorthand for elem: Object => elem.property (the compiler generates a unique name, something like x$1 ). So now we have a map Map[Property, List[Object]] . A Map[K,V]
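The groupBy-then-head idea from the answer can be sketched in Python, where a dict keyed on the property does the grouping and the keep-first in one pass (the `Record` class here is a hypothetical stand-in for the question's Object):

```python
from dataclasses import dataclass

@dataclass
class Record:
    property: str   # the field that must be unique
    payload: int

# Keep only the first record seen for each property value.
def first_per_property(items):
    by_prop = {}
    for item in items:
        by_prop.setdefault(item.property, item)  # no-op if key already present
    return list(by_prop.values())

records = [Record('a', 1), Record('b', 2), Record('a', 3)]
unique = first_per_property(records)
```

One caveat worth knowing: `_._2.head` in the Scala answer keeps the first element *within each group*, but a plain `Map` does not guarantee overall ordering; the dict-based Python sketch preserves first-seen order because dicts keep insertion order.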

Find duplicate rows with PostgreSQL

蓝咒 submitted on 2019-11-26 15:17:43
Question: We have a table of photos with the following columns: id, merchant_id, url . This table contains duplicate values for the combination merchant_id, url , so it's possible that one row appears several times: 234 some_merchant http://www.some-image-url.com/abscde1213 235 some_merchant http://www.some-image-url.com/abscde1213 236 some_merchant http://www.some-image-url.com/abscde1213 What is the best way to delete those duplicates? (I use PostgreSQL 9.2 and Rails 3.) Answer 1: Here is my take on
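The answer is cut off; one common pattern for this problem is to keep the row with the smallest id per (merchant_id, url) group and delete the rest. Demonstrated below with sqlite3 as a stand-in for PostgreSQL (this is a sketch of the general technique, not necessarily the thread's answer; table and column names follow the question):

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute("CREATE TABLE photos (id INTEGER PRIMARY KEY, merchant_id TEXT, url TEXT)")
conn.executemany("INSERT INTO photos VALUES (?, ?, ?)", [
    (234, 'some_merchant', 'http://www.some-image-url.com/abscde1213'),
    (235, 'some_merchant', 'http://www.some-image-url.com/abscde1213'),
    (236, 'some_merchant', 'http://www.some-image-url.com/abscde1213'),
    (300, 'other_merchant', 'http://www.some-image-url.com/xyz'),
])

# Keep the minimum id per (merchant_id, url); delete everything else.
conn.execute("""
    DELETE FROM photos
    WHERE id NOT IN (SELECT MIN(id) FROM photos
                     GROUP BY merchant_id, url)
""")
remaining = [row[0] for row in conn.execute("SELECT id FROM photos ORDER BY id")]
```

On PostgreSQL the same `DELETE ... WHERE id NOT IN (SELECT MIN(id) ... GROUP BY ...)` runs unchanged; a unique index on (merchant_id, url) afterwards prevents the duplicates from coming back.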

Does adding a duplicate value to a HashSet/HashMap replace the previous value

牧云@^-^@ submitted on 2019-11-26 15:14:40
Question: Please consider the piece of code below: HashSet hs = new HashSet(); hs.add("hi"); -- (1) hs.add("hi"); -- (2) hs.size() will give 1, as HashSet doesn't allow duplicates, so only one element will be stored. I want to know: if we add a duplicate element, does it replace the previous element or simply not get added? Also, what happens with HashMap in the same case? Answer 1: In the case of HashMap, it replaces the old value with the new one. In the case of HashSet, the item isn't
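The same semantics the answer describes can be observed in Python, offered here only as an analogy to the Java behavior: a dict (like HashMap) replaces the value for an existing key, while a set (like HashSet) keeps the element it already has and silently ignores the duplicate add:

```python
# dict ~ HashMap: putting an existing key replaces its value.
m = {}
m['hi'] = 1
m['hi'] = 2          # value replaced: m now holds 'hi' -> 2

# set ~ HashSet: adding an existing element is a no-op.
s = set()
s.add('hi')
s.add('hi')          # ignored: s still has exactly one element
```

In Java the signal is the return value: `HashSet.add` returns false for the duplicate at (2), and `HashMap.put` returns the old value being replaced.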

Merging rows with the same ID variable [duplicate]

一曲冷凌霜 submitted on 2019-11-26 14:51:41
Question: This question already has an answer here: how to spread or cast multiple values in r [duplicate] 2 answers I have a dataframe in R with 2186 obs. of 38 vars. Rows have an ID variable referring to unique experiments, and using length(unique(df$ID))==nrow(df) and n_occur<-data.frame(table(df$ID)) I know 327 of my rows have repeated IDs, with some IDs repeated more than once. I am trying to merge rows with the same ID, as these aren't duplicates but just second, third etc. observations within a given
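The question is in R (spread/cast territory), but the merge it describes has a direct pandas sketch: collapse rows sharing an ID into one row, taking the first non-missing value in each column. Column names below are illustrative, not the question's 38 variables:

```python
import pandas as pd

# Two rows share ID 1; each holds a different subset of the observations.
df = pd.DataFrame({
    'ID':   [1, 1, 2, 3],
    'val1': [10, None, 20, 30],
    'val2': [None, 'a', 'b', 'c'],
})

# GroupBy.first() takes the first non-null value per column within each group,
# so complementary partial rows fold into one complete row per ID.
merged = df.groupby('ID', as_index=False).first()
```

This matches the question's framing that repeated IDs are not duplicates but additional observations to be folded in; if the repeats could instead hold conflicting values, an explicit aggregation per column would be needed.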

How to combine duplicate rows and sum the values of 3 columns in Excel

谁说胖子不能爱 submitted on 2019-11-26 14:40:59
Question: Hello everyone, I am having trouble writing VBA in Excel to handle duplicate data. How do I combine duplicate rows and sum the values of 3 columns in Excel? Thank you. Answer 1: This one uses Remove Duplicates: Sub dupremove() Dim ws As Worksheet Dim lastrow As Long Set ws = Sheets("Sheet1") ' Change to your sheet With ws lastrow = .Range("A" & .Rows.Count).End(xlUp).Row With .Range("B2:C" & lastrow) .Offset(, 4).FormulaR1C1 = "=SUMIF(C1,RC1,C[-4])" .Offset(, 4).Value = .Offset(, 4).Value End With With .Range("A1
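The VBA answer (SUMIF plus Remove Duplicates) is cut off; the same combine-and-sum can be sketched in pandas for comparison. Column names below are illustrative, standing in for the question's key column and three value columns:

```python
import pandas as pd

# Rows with the same key should collapse into one, summing v1, v2, v3.
df = pd.DataFrame({
    'key': ['a', 'b', 'a', 'b'],
    'v1':  [1, 2, 3, 4],
    'v2':  [10, 20, 30, 40],
    'v3':  [100, 200, 300, 400],
})

# One groupby does what the VBA needs SUMIF + Remove Duplicates for.
combined = df.groupby('key', as_index=False)[['v1', 'v2', 'v3']].sum()
```

This mirrors the spreadsheet logic: the `SUMIF` formulas compute per-key totals, and Remove Duplicates then keeps one row per key.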

Removing duplicate elements from a List

点点圈 submitted on 2019-11-26 14:25:39
Question: I have created an ArrayList. ArrayList<String> list = new ArrayList<String>(); list.add("1"); list.add("2"); list.add("3"); list.add("3"); list.add("5"); list.add("6"); list.add("7"); list.add("7"); list.add("1"); list.add("10"); list.add("2"); list.add("12"); But as seen above it contains many duplicate elements. I want to remove all duplicates from the list. For this I think I first need to convert the list into a set. Does Java provide the functionality of converting a list into a set?
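Java does provide this: `new HashSet<>(list)` converts a list to a set, and `new ArrayList<>(new LinkedHashSet<>(list))` does it while preserving first-occurrence order. The order-preserving idea, sketched with the question's data in Python (where `dict.fromkeys` plays the role of LinkedHashSet):

```python
# The list from the question, with its duplicate "1", "2", "3", and "7".
items = ["1", "2", "3", "3", "5", "6", "7", "7", "1", "10", "2", "12"]

# dict.fromkeys keeps the first occurrence of each key, in insertion order.
deduped = list(dict.fromkeys(items))
```

A plain `set(items)` would also drop duplicates but, like Java's HashSet, makes no ordering guarantee.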