duplicates

How do I check if there are duplicates in a flat list?

末鹿安然 posted on 2019-12-17 03:50:34
Question: For example, given the list ['one', 'two', 'one'], the algorithm should return True, whereas given ['one', 'two', 'three'] it should return False.

Answer 1: Use set() to remove duplicates if all values are hashable:

    >>> your_list = ['one', 'two', 'one']
    >>> len(your_list) != len(set(your_list))
    True

Answer 2: Recommended for short lists only:

    any(thelist.count(x) > 1 for x in thelist)

Do not use on a long list -- it can take time proportional to the square of the number of items in the list! For
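A minimal sketch of a middle ground between the two answers above, assuming all items are hashable: it walks the list once and stops at the first repeat, so it avoids both the quadratic count() scan and building the full set when a duplicate appears early. The function name has_duplicates is made up for illustration.

```python
def has_duplicates(items):
    """Return True as soon as a repeated (hashable) item is seen."""
    seen = set()
    for item in items:
        if item in seen:
            return True   # first repeat found, no need to look further
        seen.add(item)
    return False


print(has_duplicates(['one', 'two', 'one']))
print(has_duplicates(['one', 'two', 'three']))
```

For unhashable items (lists, dicts) this raises TypeError, just like set() does.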

MySQL remove duplicates from big database quick

和自甴很熟 posted on 2019-12-17 03:22:30
Question: I've got a big (more than a million rows) MySQL database messed up by duplicates. I think anywhere from 1/4 to 1/2 of the whole database could be filled with them. I need to get rid of them quickly (I mean query execution time). Here's how it looks:

    id (index) | text1 | text2 | text3

The text1 and text2 combination should be unique. If there are any duplicates, only one combination with text3 NOT NULL should remain. Example:

    1 | abc | def | NULL
    2 | abc | def | ghi
    3 | abc | def | jkl
    4 | aaa | bbb | NULL
    5 | aaa | bbb | NULL
    ..
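The retention rule in the question can be pinned down with a small Python sketch before writing any SQL. This is only an illustration of the rule, not a fast MySQL query. It assumes two things the excerpt does not state: among several rows with text3 set, the first one seen is kept, and when every row in a group has text3 NULL, the first of those is kept. The name dedupe_rows is hypothetical.

```python
def dedupe_rows(rows):
    """rows: iterable of (id, text1, text2, text3) tuples.

    Keeps one row per (text1, text2), preferring a row whose text3
    is not None (assumption: the first such row wins).
    """
    kept = {}
    for row in rows:
        _id, text1, text2, text3 = row
        key = (text1, text2)
        current = kept.get(key)
        # Take the row if the key is new, or if it upgrades a NULL text3.
        if current is None or (current[3] is None and text3 is not None):
            kept[key] = row
    return list(kept.values())


rows = [
    (1, 'abc', 'def', None),
    (2, 'abc', 'def', 'ghi'),
    (3, 'abc', 'def', 'jkl'),
    (4, 'aaa', 'bbb', None),
    (5, 'aaa', 'bbb', None),
]
for row in dedupe_rows(rows):
    print(row)
```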

Remove duplicate rows from Pandas dataframe where only some columns have the same value

断了今生、忘了曾经 posted on 2019-12-17 02:38:12
Question: I have a pandas dataframe as follows:

    A B C
    1 2 x
    1 2 y
    3 4 z
    3 5 x

I want only one row to remain among rows that share the same values in specific columns. In the example above I mean columns A and B. In other words, if the values of columns A and B occur more than once in the dataframe, only one row should remain (which one does not matter). FWIW: the maximum number of so-called duplicate rows (that is, where columns A and B are the same) is 2. The result should look like this:

    A B C
    1 2 x
    3
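In pandas itself this is normally a single call to DataFrame.drop_duplicates with the subset argument, e.g. subset=['A', 'B']. The sketch below spells out the same keep-the-first-row logic in plain Python so it runs without dependencies. Rows are modeled as dicts, and the helper name drop_duplicates here is just a stand-in, not the pandas method.

```python
def drop_duplicates(rows, subset):
    """Keep the first row for each distinct combination of the subset columns."""
    seen = set()
    result = []
    for row in rows:
        key = tuple(row[col] for col in subset)
        if key not in seen:
            seen.add(key)
            result.append(row)
    return result


rows = [
    {"A": 1, "B": 2, "C": "x"},
    {"A": 1, "B": 2, "C": "y"},
    {"A": 3, "B": 4, "C": "z"},
    {"A": 3, "B": 5, "C": "x"},
]
unique = drop_duplicates(rows, subset=["A", "B"])
for row in unique:
    print(row["A"], row["B"], row["C"])
```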

JQuery: Remove duplicate elements?

假装没事ソ posted on 2019-12-16 22:12:07
Question: Say I have a list of links with duplicate values as below:

    <a href="#">Book</a>
    <a href="#">Magazine</a>
    <a href="#">Book</a>
    <a href="#">Book</a>
    <a href="#">DVD</a>
    <a href="#">DVD</a>
    <a href="#">DVD</a>
    <a href="#">Book</a>

How would I, using jQuery, remove the duplicates and be left with the following, for example:

    <a href="#">Book</a>
    <a href="#">Magazine</a>
    <a href="#">DVD</a>

Basically I am looking for a way to remove any duplicate values found and show one of each link.

Answer 1:

    var seen = {}; $

Regular Expression For Consecutive Duplicate Words

醉酒当歌 posted on 2019-12-16 20:13:36
Question: I'm a regular expression newbie, and I can't quite figure out how to write a single regular expression that would "match" any duplicate consecutive words such as:

    Paris in the the spring.
    Not that that is related.
    Why are you laughing? Are my my regular expressions THAT bad??

Is there a single regular expression that will match ALL of the bold strings above?

Answer 1: Try this regular expression:

    \b(\w+)\s+\1\b

Here \b is a word boundary and \1 references the captured match of the first group.

Answer 2
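To try the pattern from Answer 1, here is a short sketch using Python's re module on the three sample sentences from the question. The helper name find_repeated_words is made up for illustration. The match is case-sensitive as written, so "The the" would need re.IGNORECASE.

```python
import re

# Pattern from Answer 1: a word, whitespace, then the same word again.
DUPLICATE_WORD = re.compile(r"\b(\w+)\s+\1\b")


def find_repeated_words(text):
    """Return each doubled-word span found in text, e.g. 'the the'."""
    return [match.group(0) for match in DUPLICATE_WORD.finditer(text)]


samples = [
    "Paris in the the spring.",
    "Not that that is related.",
    "Why are you laughing? Are my my regular expressions THAT bad??",
]
for sample in samples:
    print(find_repeated_words(sample))
```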

pair-wise duplicate removal from dataframe [duplicate]

杀马特。学长 韩版系。学妹 posted on 2019-12-16 20:04:24
Question: This question already has an answer here: Select equivalent rows [A-B & B-A] [duplicate] (1 answer). Closed 2 years ago.

This seems like a simple problem but I can't seem to figure it out. I'd like to remove duplicates from a dataframe (df) if two columns have the same values, even if those values are in the reverse order. What I mean is, say you have the following data frame:

    a <- c(rep("A", 3), rep("B", 3), rep("C",2))
    b <- c('A','B','B','C','A','A','B','B')
    df <-data.frame(a,b)

      a b
    1 A A
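The idea of treating A-B and B-A as the same row can be sketched in Python (not R, so this is a stand-in for the logic rather than an answer in the question's language): use an order-insensitive key for each pair and keep only the first row per key. The function name is hypothetical, and the data mirrors the a and b vectors above.

```python
def drop_unordered_pair_duplicates(pairs):
    """Keep the first occurrence of each pair, ignoring the order within the pair."""
    seen = set()
    result = []
    for a, b in pairs:
        key = frozenset((a, b))   # {'A', 'B'} is the same key as {'B', 'A'}
        if key not in seen:
            seen.add(key)
            result.append((a, b))
    return result


a = ["A"] * 3 + ["B"] * 3 + ["C"] * 2
b = ["A", "B", "B", "C", "A", "A", "B", "B"]
unique_pairs = drop_unordered_pair_duplicates(zip(a, b))
print(unique_pairs)
```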

Algorithm: efficient way to remove duplicate integers from an array

白昼怎懂夜的黑 posted on 2019-12-16 20:03:34
Question: I got this problem from an interview with Microsoft. Given an array of random integers, write an algorithm in C that removes duplicated numbers and returns the unique numbers in the original array. E.g.

    Input:  {4, 8, 4, 1, 1, 2, 9}
    Output: {4, 8, 1, 2, 9, ?, ?}

One caveat is that the expected algorithm should not require the array to be sorted first. And when an element has been removed, the following elements must be shifted forward as well. Anyway, value of elements at the tail of the array
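A rough sketch of one way to do the in-place compaction, written in Python rather than the C the interviewer asked for. It assumes extra memory for a hash set is acceptable, which the truncated question may or may not allow. Unique values are shifted to the front in their original order, and the function returns how many there are. Whatever is left in the tail corresponds to the "?" slots in the example.

```python
def dedupe_in_place(values):
    """Compact unique values to the front of the list; return their count."""
    seen = set()
    write = 0                      # next slot for a unique value
    for value in values:           # write never runs ahead of the read position
        if value not in seen:
            seen.add(value)
            values[write] = value
            write += 1
    return write


data = [4, 8, 4, 1, 1, 2, 9]
count = dedupe_in_place(data)
print(data[:count])
```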

Removing duplicate rows from table in Oracle

对着背影说爱祢 posted on 2019-12-16 20:00:45
Question: I'm testing something in Oracle and populated a table with some sample data, but in the process I accidentally loaded duplicate records, so now I can't create a primary key using some of the columns. How can I delete all duplicate rows and leave only one of them?

Answer 1: Use the rowid pseudocolumn.

    DELETE FROM your_table
    WHERE rowid NOT IN (SELECT MIN(rowid)
                        FROM your_table
                        GROUP BY column1, column2, column3);

Where column1, column2, and column3 make up the identifying key for each record. You
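The same DELETE can be tried out without an Oracle instance. The sketch below uses Python's built-in sqlite3 module as a stand-in, relying on the fact that ordinary SQLite tables also expose a rowid. Oracle's rowid is a different kind of value, so treat this only as a way to see the query's effect. The sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (column1 TEXT, column2 TEXT, column3 TEXT)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?)",
    [
        ("a", "b", "c"),
        ("a", "b", "c"),   # duplicate
        ("x", "y", "z"),
        ("a", "b", "c"),   # duplicate
        ("x", "y", "q"),
    ],
)

# Keep only the lowest rowid in each (column1, column2, column3) group.
conn.execute(
    """
    DELETE FROM your_table
    WHERE rowid NOT IN (SELECT MIN(rowid)
                        FROM your_table
                        GROUP BY column1, column2, column3)
    """
)

remaining = conn.execute(
    "SELECT column1, column2, column3 FROM your_table ORDER BY rowid"
).fetchall()
print(remaining)
```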

How to delete duplicates on a MySQL table?

非 Y 不嫁゛ posted on 2019-12-16 19:17:14
Question: I need to DELETE duplicated rows for a specified sid on a MySQL table. How can I do this with an SQL query?

    DELETE (DUPLICATED TITLES) FROM table WHERE SID = "1"

Something like this, but I don't know how to do it.

Answer 1: This removes duplicates in place, without making a new table:

    ALTER IGNORE TABLE `table_name` ADD UNIQUE (title, SID)

Note: this only works well if the index fits in memory.

Answer 2: Suppose you have a table employee, with the following columns:

    employee (first_name, last_name, start_date)

In