duplicates

Find near-duplicates of comma-separated lists using Levenshtein distance [duplicate]

时光毁灭记忆、已成空白 posted on 2019-12-02 11:03:25
This question already has an answer here: "Potential Duplicates Detection, with 3 Severity Level" (1 answer).

This question is based on the answer to my question from yesterday. To solve my problem, Jean-François Corbett suggested a Levenshtein distance approach. I then found this code somewhere to get the Levenshtein distance as a percentage match:

    Public Function GetLevenshteinPercentMatch( _
        ByVal string1 As String, ByVal string2 As String, _
        Optional Normalised As Boolean = False) As Single
        Dim iLen As Integer
        If Normalised = False Then
            string1 = UCase$(WorksheetFunction.Trim(string1))
            string2 = UCase$
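The VBA snippet above is cut off, so the following is only a minimal Python sketch of the general idea, not the original function. It assumes the "percent match" is 1 minus the edit distance divided by the length of the longer string, and that normalisation means trimming, collapsing repeated spaces, and upper-casing. The names `levenshtein` and `percent_match` are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def percent_match(a: str, b: str) -> float:
    """Similarity in [0, 1]; the formula is an assumption, see the lead-in."""
    # Assumed normalisation: trim, collapse inner whitespace, upper-case.
    a = " ".join(a.split()).upper()
    b = " ".join(b.split()).upper()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - levenshtein(a, b) / longest
```

For comma-separated lists, one option would be to compare the whole normalised strings with `percent_match` and flag pairs above a chosen threshold as near-duplicates.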

Removing duplicates (with condition) in excel

梦想与她 posted on 2019-12-02 10:50:18
I have a sheet that consists of 3 columns and thousands of rows. The columns are First-Name, Last-Name, and E-Mail. Not all of the entries (rows) have data in the E-Mail column; for some it is just left empty. The sheet contains "duplicates", meaning two rows with both the same first name and the same last name. I'd like to remove the duplicates in the following manner: if one of the duplicate entries has an e-mail address, remove the other. If both have an e-mail address, remove one of them (whichever one; say the first, for example). And the same if neither has an e-mail address. How can I do this task?
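The rule described above can be sketched outside Excel as follows. This is a minimal Python illustration, assuming the rows have been exported as `(first_name, last_name, email)` tuples with an empty string for a missing e-mail, and that names match exactly (case-sensitive). The function name `dedupe_contacts` is hypothetical.

```python
def dedupe_contacts(rows):
    """Keep one row per (first, last) pair, preferring a row with an e-mail."""
    best = {}
    for first, last, email in rows:
        key = (first, last)
        kept = best.get(key)
        # Keep the first row seen, unless it has no e-mail and this one does.
        if kept is None or (not kept[2] and email):
            best[key] = (first, last, email)
    # Dicts preserve insertion order, so names appear in first-seen order.
    return list(best.values())


rows = [
    ("Ann", "Lee", ""),
    ("Ann", "Lee", "ann@example.com"),
    ("Bob", "Ray", "b1@example.com"),
    ("Bob", "Ray", "b2@example.com"),
    ("Cy", "Fox", ""),
    ("Cy", "Fox", ""),
]
unique_rows = dedupe_contacts(rows)
```

When both duplicates have an e-mail (or neither does), this sketch keeps the earlier row, which the question allows since either may be removed.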

Rename duplicate rows in MySQL

北城以北 posted on 2019-12-02 10:33:03
There are some similar topics on Stack Overflow, but I still haven't managed to rename my duplicate rows:

    CREATE TABLE IF NOT EXISTS `products` (
      `product_id` int(11) unsigned NOT NULL AUTO_INCREMENT,
      `product_code` varchar(32) NOT NULL,
      PRIMARY KEY (`product_id`)
    ) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=4 ;

    INSERT INTO `products` (`product_id`, `product_code`) VALUES
    (1, 'A'),
    (2, 'B'),
    (3, 'A');

Here the first and third rows have the same items. I just want to add a suffix like "Copy". And the result would be:

    product_id  product_code
    ----------  ------------
    1           A
    2           B
    3           A Copy

So how can I
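The renaming rule in the question can be illustrated with a short Python sketch. It is not a MySQL solution, only a demonstration of the logic, assuming rows are `(product_id, product_code)` tuples and that the row with the lowest `product_id` keeps its original code. The name `rename_duplicates` is hypothetical.

```python
def rename_duplicates(rows, suffix=" Copy"):
    """Append `suffix` to every code that already occurred at a lower id."""
    seen = set()
    result = []
    for product_id, code in sorted(rows):  # process in product_id order
        if code in seen:
            result.append((product_id, code + suffix))
        else:
            seen.add(code)
            result.append((product_id, code))
    return result


renamed = rename_duplicates([(1, "A"), (2, "B"), (3, "A")])
```

Note that a third row with code 'A' would also become 'A Copy' under this sketch; making every renamed code unique would need a counter in the suffix.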

Check if URL contains href of link I already clicked

安稳与你 posted on 2019-12-02 10:29:32
In a list I have a few links:

    <ul class="dropdowner" id="coll-filter">
      <li><a href="#black">Black</a></li>
      <li><a href="#white">White</a></li>
      <li><a href="#blue">Blue</a></li>
    </ul>

Another output I have uses + instead of # in the URL, i.e.:

    <ul class="dropdowner" id="coll-filter">
      <li><a href="+black">Black</a></li>
      <li><a href="+white">White</a></li>
      <li><a href="+blue">Blue</a></li>
    </ul>

If I click the link White, then "#white" is inserted into my URL (mydomain.com/#white). I want to avoid duplicates, so is there a way to check if "#white" already exists in the URL and, if so, don't allow the link to

Excel 2007: Remove rows by duplicates in column value

僤鯓⒐⒋嵵緔 posted on 2019-12-02 10:12:56
Question: I have a table in Excel, e.g.:

    col1  col2
    A     Something
    A     Something else
    A     Something more
    A     Something blahblah
    B     Something Fifth
    B     Something xth
    C     Som thin
    F     Summerthing
    F     Boom

And I want only rows without a duplicate in col1, e.g.:

    col1  col2
    A     Something
    B     Something Fifth
    C     Som thin
    F     Boom

Is there any way of filtering rows like this? :)

Answer 1: Found it myself: to remove duplicate values, use the Remove Duplicates command in the Data Tools group on the Data tab.

Source: https://stackoverflow.com/questions

Count CLOB Duplicates in a large Oracle Table

霸气de小男生 posted on 2019-12-02 10:06:48
Question: I have an Oracle database table LOG_MESSAGES with a CLOB column called MESSAGE. Some of the rows contain the same MESSAGE. For each MESSAGE that has at least one duplicate, I'd like to know the number of duplicates. Quite a number of these CLOBs are huge (> 100 kB), so converting to VARCHAR2 is out of the question. Since many traditional methods such as GROUP BY do not work with CLOB, could someone please enlighten me? For information, the table is very large (around 1 TB). So an optimised
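One general way to group large values that cannot be compared directly is to group by a fixed-size hash of each value. The sketch below shows only that idea in Python, assuming the messages are available as strings; it is not an Oracle solution and makes no claim about how the original question was answered. The name `count_duplicate_messages` is hypothetical.

```python
import hashlib
from collections import Counter


def count_duplicate_messages(messages):
    """Map SHA-256 digest -> number of rows sharing that message (only if > 1)."""
    counts = Counter(
        hashlib.sha256(message.encode("utf-8")).hexdigest()
        for message in messages
    )
    # Keep only digests that occur more than once, i.e. real duplicates.
    return {digest: n for digest, n in counts.items() if n > 1}


messages = ["a" * 200_000, "b", "a" * 200_000, "c", "b", "a" * 200_000]
duplicates = count_duplicate_messages(messages)
```

The counts here are the number of rows sharing a message; subtract one if "number of duplicates" should exclude the first occurrence. Equal digests are treated as equal messages, which ignores the very small chance of a hash collision.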

Remove duplicates in ArrayList - Java

雨燕双飞 posted on 2019-12-02 09:43:06
I have a problem with my Java code. I'm supposed to use loops and not any other method. Say that my ArrayList consists of [Dog Cat Dog Dog Cat Dog Horse]. My goal is to remove the copies of Dog and Cat so that my final result equals [Dog Cat Horse].

    public void removeDouble(){
        int counter = 0;
        for (int i = 0 ; i < animals.size(); i++) {
            for (int j = 1+i; j < animals.size() ; j++)
                //don't start on the same word or you'll eliminate it.
                if ( animals.get(j).equals( animals.get(i) ) ) {
                    animals.remove(animals.get(j));
                    counter++;
                }
        }
    }

It feels like the "logic" is correct but my code does not
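Two likely causes of trouble in the loop above, assuming `animals` is an `ArrayList<String>`: `animals.remove(animals.get(j))` removes the first element equal to that value (the one at index `i`) rather than the element at index `j`, and `j` still advances after a removal, so the element that slid into position `j` is skipped. The same loops-only approach, written as a minimal Python sketch with a hypothetical function name, deletes by position and only advances `j` when nothing was removed:

```python
def remove_doubles(animals):
    """Remove later copies of each element in place, using only loops."""
    i = 0
    while i < len(animals):
        j = i + 1
        while j < len(animals):
            if animals[j] == animals[i]:
                del animals[j]  # delete by position; the next item is now at j
            else:
                j += 1          # advance only when nothing was removed
        i += 1
    return animals


result = remove_doubles(["Dog", "Cat", "Dog", "Dog", "Cat", "Dog", "Horse"])
```

In Java the equivalent by-position removal would be `animals.remove(j)` with an `int` index, followed by not incrementing `j` for that iteration.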

delete non duplicate data in excel using VBA

孤街浪徒 posted on 2019-12-02 09:38:53
I am trying to remove non-duplicate data and keep the duplicate data. I've done some coding, but nothing happens. Oh, it's an error, lol. This is my code:

    Sub mukjizat2()
        Dim desc As String
        Dim sapnbr As Variant
        Dim shortDesc As String
        X = 1
        i = 2
        desc = Worksheets("process").Cells(i, 3).Value
        sapnbr = Worksheets("process").Cells(i, 1).Value
        shortDesc = Worksheets("process").Cells(i, 2).Value
        Do While Worksheets("process").Cells(i, 1).Value <> ""
            If desc = Worksheets("process").Cells(i + 1, 3).Value <> Worksheets("process").Cells(i, 3) Or Worksheets("process").Cells(i + 1, 2) <> Worksheets("process")
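The VBA above is cut off, so it is not clear which column decides what counts as a duplicate. As a minimal Python sketch of the stated goal (drop rows whose key occurs once, keep rows whose key occurs more than once), assuming rows are tuples and the key column is chosen by the hypothetical parameter `key_index`:

```python
from collections import Counter


def keep_only_duplicates(rows, key_index=0):
    """Return the rows whose key value appears more than once, in order."""
    counts = Counter(row[key_index] for row in rows)  # first pass: count keys
    return [row for row in rows if counts[row[key_index]] > 1]


rows = [("100", "x"), ("200", "y"), ("100", "z"), ("300", "w")]
kept = keep_only_duplicates(rows)
```

Counting first and filtering second avoids comparing only neighbouring rows, so the data does not need to be sorted beforehand.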

Expand data.frame by creating duplicates based on group condition (2)

℡╲_俬逩灬. posted on 2019-12-02 09:38:24
Starting from @AndrewGustar's answer/code in "Expand data.frame by creating duplicates based on group condition":

1) What if the input data.frame has ID values that are not in sequence and that can also repeat themselves? Example data.frame:

    df = read.table(text = 'ID Day Count Count_group
    18 1933 6 11
    33 1933 6 11
    37 1933 6 11
    18 1933 6 11
    16 1933 6 11
    11 1933 6 11
    111 1932 5 8
    34 1932 5 8
    60 1932 5 8
    88 1932 5 8
    18 1932 5 8
    33 1931 3 4
    13 1931 3 4
    56 1931 3 4
    23 1930 1 1
    6 1800 6 10
    37 1800 6 10
    98 1800 6 10
    52 1800 6 10
    18 1800 6 10
    76 1800 6 10
    55 1799 4 6
    6 1799 4 6
    52 1799 4 6
    133 1799

How to add only unique values from CSV into ComboBox?

浪子不回头ぞ posted on 2019-12-02 09:32:02
Question: I want to read a CSV file and put the words "Jakarta" and "Bandung" in a ComboBox. Here's the input:

    id,from,
    1,Jakarta
    2,Jakarta
    5,Jakarta
    6,Jakarta
    10,Bandung
    11,Bandung
    12,Bandung

I managed to get the words and put them in the ComboBox, but as you can see, the text file itself contains the words "Jakarta" and "Bandung" many times, while I want to show each only once in the ComboBox. Here's my temporary code, which works for now but is inefficient and probably can't be used if the word has more variety
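The asker's own code is not shown in this excerpt. A common way to avoid hard-coding the words is to collect the column's values into an order-preserving set and then add each unique value to the ComboBox once. Below is a minimal Python sketch of that collection step only, assuming the input shown above; the function name `unique_column_values` is hypothetical.

```python
import csv
import io


def unique_column_values(csv_text, column="from"):
    """Return the distinct non-empty values of `column`, in first-seen order."""
    reader = csv.DictReader(io.StringIO(csv_text))
    seen = {}  # dict used as an ordered set
    for row in reader:
        value = (row.get(column) or "").strip()
        if value:
            seen.setdefault(value, None)
    return list(seen)


sample = (
    "id,from,\n"
    "1,Jakarta\n2,Jakarta\n5,Jakarta\n6,Jakarta\n"
    "10,Bandung\n11,Bandung\n12,Bandung\n"
)
cities = unique_column_values(sample)
```

The resulting list can then be used to fill the ComboBox, and it keeps working when the file contains cities other than the two in the sample.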