duplicates

How can I combine rows within the same data frame in R (based on duplicate values under a specific column)?

早过忘川 submitted on 2019-12-19 04:37:07
Question: Sample of 2 (made-up) example rows in df:

userid  facultyid  courseid  schoolid
167     265        NA        1678
167     71111      301       NA

Suppose that I have a couple hundred duplicate userid values like in the above example. However, the vast majority of userid values are unique. How can I combine rows with a duplicate userid so that each column keeps the value from the 1st (of the 2) rows unless that value is NA (in which case the NA is replaced with whatever value came from the second row)? In essence
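
The question asks about R, but the rule it describes (keep the first row's value, fall back to the later row only where the first is NA) can be sketched in a few lines. This is a minimal Python sketch, not the accepted answer: rows are plain dictionaries, `None` stands in for NA, and the function name `combine_duplicates` is hypothetical.

```python
from typing import Dict, List, Optional

def combine_duplicates(rows: List[Dict[str, Optional[int]]],
                       key: str = "userid") -> List[Dict[str, Optional[int]]]:
    """Merge rows sharing `key`; earlier non-missing values win."""
    merged: Dict[Optional[int], Dict[str, Optional[int]]] = {}
    for row in rows:
        first = merged.get(row[key])
        if first is None:
            # First time this key is seen: keep a copy of the row.
            merged[row[key]] = dict(row)
            continue
        # Later duplicate: only fill the columns that are still missing.
        for column, value in row.items():
            if first.get(column) is None:
                first[column] = value
    return list(merged.values())

# The two made-up rows from the question, with None standing in for NA.
rows = [
    {"userid": 167, "facultyid": 265, "courseid": None, "schoolid": 1678},
    {"userid": 167, "facultyid": 71111, "courseid": 301, "schoolid": None},
]
combined = combine_duplicates(rows)
print(combined)
```

Because dictionaries preserve insertion order, the merged rows come out in the order their userid first appeared.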

How to merge cells based on similar values - Excel 2010

て烟熏妆下的殇ゞ submitted on 2019-12-19 04:20:12
Question: I have a problem merging cells in Excel based on similar values in one column. I would like to keep the data in the other columns. Let's view some screenshots and it will be clearer: This above is the initial state of the data; what I want to achieve is this: I'm sure there is a way to do it with VB or formulas. I need the simplest way possible, as this is for a customer and it needs to be easy. Thank you all in advance. Answer 1: Option Explicit Private Sub MergeCells() Application.ScreenUpdating =

Is there an efficient algorithm for fuzzy deduplication of string lists? [duplicate]

风流意气都作罢 submitted on 2019-12-19 03:56:13
Question: This question already has answers here: Fuzzy matching deduplication in less than exponential time? (6 answers) Closed 6 years ago. For example, I have a long list of strings, each string has about 30-50 characters, and I want to remove strings that are similar to some other string in that list (leaving only one occurrence from a family of duplicates). I looked at various string similarity algorithms, for example, Levenshtein distance and the method presented in this article. They do work,
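
For reference, the straightforward pairwise approach the question alludes to can be sketched as below. This is a baseline, not the efficient algorithm being asked for: it compares each string against every string kept so far, so it is quadratic in the worst case. It also swaps Levenshtein distance for the similarity ratio of Python's standard `difflib.SequenceMatcher`; the function name `fuzzy_dedupe` and the 0.85 threshold are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher
from typing import List

def fuzzy_dedupe(strings: List[str], threshold: float = 0.85) -> List[str]:
    """Keep the first string of each family of near-duplicates."""
    kept: List[str] = []
    for candidate in strings:
        # A candidate is dropped if it is similar enough to any kept string.
        is_duplicate = any(
            SequenceMatcher(None, candidate, existing).ratio() >= threshold
            for existing in kept
        )
        if not is_duplicate:
            kept.append(candidate)
    return kept

sample = [
    "the quick brown fox jumps over it",
    "the quick brown fox jumped over it",
    "an entirely different sentence here",
]
unique = fuzzy_dedupe(sample)
print(unique)
```

Which member of a family survives depends on input order, since the first occurrence is always the one kept.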

Can I use ON DUPLICATE KEY UPDATE with an INSERT query using the SET option?

情到浓时终转凉″ submitted on 2019-12-19 03:37:19
Question: I've seen the following (using the VALUES option): $query = "INSERT INTO $table (column-1, column-2, column-3) VALUES ('value-1', 'value-2', 'value-3') ON DUPLICATE KEY UPDATE SET column1 = value1, column2 = value2, column3 = value3, ID=LAST_INSERT_ID(ID)"; ... but I can't figure out how to add ON DUPLICATE KEY UPDATE to what I'm using: $query = "INSERT INTO $table SET column-1 ='value-1', column-2 ='value-2', column-3 ='value-3' "; e.g., in pseudo-code: $query = "INSERT INTO $table SET column-1 =

Remove duplicates within Excel cell

痴心易碎 submitted on 2019-12-19 03:33:50
Question: Say I have the following text string in one single Excel cell: John John John Mary Mary I want to create a formula (so no menu functions or VBA, please) that would give me, in another cell: John Mary How can I do this? So far I have searched the internet and SO about the issue, and all I could find were solutions involving Excel's built-in duplicate removal or something involving countif and the replacement of duplicates with "". I've also taken a look at the list of Excel functions,

How to raise error if duplicates keys in dictionary

邮差的信 submitted on 2019-12-18 19:41:47
Question: I am trying to raise an error if the user enters a duplicate key in a dictionary. The dictionary is in a file and the user can edit the file manually. Example: dico= {'root':{ 'a':{'some_key':'value',...}, 'b':{'some_key':'value',...}, 'c':{'some_key':'value',...}, ... 'a':{'some_key':'value',...}, } } The new key 'a' already exists... How can I test dico and warn the user when I load dico from the file? Answer 1: Write a subclass of dict, override __setitem__ such that it throws an error when replacing
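
As a complement to the answer excerpt above, here is a different technique sketched under stated assumptions: the file is assumed to contain Python source with a dictionary literal, and duplicates are detected by parsing it with the standard `ast` module before the literal is ever evaluated (once evaluated, the later key has already overwritten the earlier one). The function name `find_duplicate_keys` and the sample source string are hypothetical.

```python
import ast
from typing import Any, List

def find_duplicate_keys(source: str) -> List[Any]:
    """Return constant keys that appear more than once in any dict literal."""
    duplicates: List[Any] = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Dict):
            continue
        seen = set()
        for key in node.keys:
            # Only plain constant keys are checked; `**other` entries
            # (where the key is None) and computed keys are skipped.
            if isinstance(key, ast.Constant):
                if key.value in seen:
                    duplicates.append(key.value)
                seen.add(key.value)
    return duplicates

# A shortened version of the example from the question.
source = (
    "dico = {'root': {'a': {'some_key': 'value'}, "
    "'b': {'some_key': 'value'}, "
    "'a': {'some_key': 'other'}}}"
)
duplicates = find_duplicate_keys(source)
if duplicates:
    print(f"Duplicate keys found: {duplicates}")
```

Each dict literal is checked separately, so the same key may legitimately appear in sibling dictionaries (such as 'some_key' here) without being reported.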

Error: [ngRepeat:dupes] what does this mean?

一世执手 submitted on 2019-12-18 18:53:44
Question: repeat directive outputting wine records from an API. I have a factory function to serve up the wine API, which is then accessed in my controller: app.factory("Wine", function ($http){ var factory = {}; //getWines factory.getWines = function(){ return $http.get("http://www.greatwines.9000.com") } } Controller: app.controller("winesCtrl", function($scope, $http, Wine){ Wine.getWines() .success(function(wines){ $scope.wines = wines; }) .error(function(){ alert("Error!"); }); }); VIEW: <h2>Wine

Avoid duplicate Strings in Java

|▌冷眼眸甩不掉的悲伤 submitted on 2019-12-18 17:34:28
Question: I want to ask a question about avoiding String duplicates in Java. The context is: an XML with tags and attributes like this one: <product id="PROD" name="My Product"...></product> With JibX, this XML is marshalled/unmarshalled into a class like this: public class Product{ private String id; private String name; // constructor, getters, setters, methods and so on } The program is a long-running batch process, so Product objects are created, used, copied, etc. Well, the question is: When I

Avoid duplicate Strings in Java

蹲街弑〆低调 submitted on 2019-12-18 17:34:12
Question: I want to ask a question about avoiding String duplicates in Java. The context is: an XML with tags and attributes like this one: <product id="PROD" name="My Product"...></product> With JibX, this XML is marshalled/unmarshalled into a class like this: public class Product{ private String id; private String name; // constructor, getters, setters, methods and so on } The program is a long-running batch process, so Product objects are created, used, copied, etc. Well, the question is: When I

Merge multiple CSV files and remove duplicates in R

瘦欲@ submitted on 2019-12-18 17:27:41
Question: I have almost 3,000 CSV files (containing tweets) with the same format. I want to merge these files into one new file and remove the duplicate tweets. I have come across various topics discussing similar questions; however, the number of files is usually quite small. I hope you can help me write code in R that does this job both efficiently and effectively. The CSV files have the following format: Image of CSV format: I changed (in column 2 and 3) the usernames (on Twitter) to A-E and the
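
The question asks for R, but the overall approach (stream every file once, write the shared header once, and skip rows already seen) can be sketched in Python with only the standard library. The sketch assumes every file has the same header row and that a "duplicate tweet" means a row identical in every column; the function name `merge_csv_without_duplicates` and the tiny demo files are hypothetical.

```python
import csv
import tempfile
from pathlib import Path
from typing import Iterable

def merge_csv_without_duplicates(paths: Iterable[Path], output_path: Path) -> int:
    """Merge CSV files into one, dropping repeated rows; returns rows kept."""
    seen = set()
    header_written = False
    with open(output_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        for path in paths:
            with open(path, newline="", encoding="utf-8") as src:
                reader = csv.reader(src)
                header = next(reader, None)
                if header is None:
                    continue  # empty file
                if not header_written:
                    writer.writerow(header)
                    header_written = True
                for row in reader:
                    key = tuple(row)  # whole row identifies a duplicate
                    if key not in seen:
                        seen.add(key)
                        writer.writerow(row)
    return len(seen)

# Small demonstration with two temporary files that share one row.
with tempfile.TemporaryDirectory() as folder:
    first = Path(folder) / "a.csv"
    second = Path(folder) / "b.csv"
    first.write_text("id,text\n1,hello\n2,world\n", encoding="utf-8")
    second.write_text("id,text\n2,world\n3,again\n", encoding="utf-8")
    merged = Path(folder) / "merged.csv"
    kept = merge_csv_without_duplicates([first, second], merged)
    with open(merged, newline="", encoding="utf-8") as handle:
        merged_rows = list(csv.reader(handle))
```

Only the set of distinct rows is held in memory, so the number of input files matters less than the number of unique tweets. If duplicates should be judged on a single column such as a tweet ID, the `key` line is the place to change.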