duplicates

Remove duplicates keeping entry with largest absolute value

假如想象 submitted on 2019-11-26 18:55:10
Let's say I have four samples: id = 1, 2, 3, and 4, with one or more measurements on each of those samples:

```r
> a <- data.frame(id=c(1,1,2,2,3,4), value=c(1,2,3,-4,-5,6))
> a
  id value
1  1     1
2  1     2
3  2     3
4  2    -4
5  3    -5
6  4     6
```

I want to remove duplicates, keeping only one entry per ID: the one having the largest absolute value of the "value" column. I.e., this is what I want:

```r
> a[c(2,4,5,6), ]
  id value
2  1     2
4  2    -4
5  3    -5
6  4     6
```

How might I do this in R?

```r
aa <- a[order(a$id, -abs(a$value)), ]  # sort by id and decreasing abs(value)
aa[!duplicated(aa$id), ]               # take the first row within each id
  id value
2  1     2
4  2    -4
5  3    -5
6  4     6
```
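The same "keep the entry with the largest absolute value per id" logic can be sketched in plain Python for illustration (the data mirrors the R example above; this is not the R solution itself):

```python
# For each id, keep the value with the largest absolute magnitude.
rows = [(1, 1), (1, 2), (2, 3), (2, -4), (3, -5), (4, 6)]  # (id, value)

best = {}
for id_, value in rows:
    # Replace the stored value only when a larger |value| appears.
    if id_ not in best or abs(value) > abs(best[id_]):
        best[id_] = value

print(sorted(best.items()))  # [(1, 2), (2, -4), (3, -5), (4, 6)]
```

A single pass suffices because only the running maximum per id needs to be remembered, which is also why the R answer's sort-then-`!duplicated` trick works: after sorting by decreasing `abs(value)`, the first row per id is the winner.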

Remove duplicate records based on multiple columns?

大城市里の小女人 submitted on 2019-11-26 18:50:34
Question: I'm using Heroku to host my Ruby on Rails application and, for one reason or another, I may have some duplicate rows. Is there a way to delete duplicate records based on 2 or more criteria, but keep just 1 record of that duplicate collection? In my use case, I have a Make and Model relationship for cars in my database:

```
Make    Model
----    -----
Name    Name
        Year
        Trim
        MakeId
```

I'd like to delete all Model records that have the same Name, Year and Trim but keep 1 of those records (meaning, I need the record …
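The multi-column dedup-keep-one idea can be sketched in Python (the field names follow the Model schema above; the sample records and ids are hypothetical):

```python
# Collapse duplicate (name, year, trim) records, keeping the first
# occurrence of each combination.
models = [
    {"id": 1, "name": "Civic", "year": 2010, "trim": "LX"},
    {"id": 2, "name": "Civic", "year": 2010, "trim": "LX"},  # duplicate of id 1
    {"id": 3, "name": "Civic", "year": 2010, "trim": "EX"},
]

seen = set()
keep = []
for m in models:
    key = (m["name"], m["year"], m["trim"])  # the 2+ dedup criteria
    if key not in seen:
        seen.add(key)
        keep.append(m)

print([m["id"] for m in keep])  # [1, 3]
```

In a Rails setting the same grouping key would typically drive a SQL delete (keep the minimum id per group) rather than an in-memory loop, but the key construction is identical.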

PHP: check if an array has duplicates

随声附和 submitted on 2019-11-26 18:48:54
I'm sure this is an extremely obvious question, and that there's a function that does exactly this, but I can't seem to find it. In PHP, I'd like to know if my array has duplicates in it, as efficiently as possible. I don't want to remove them the way array_unique does, and I don't particularly want to run array_unique and compare the result to the original array to see if they're the same, as this seems very inefficient. As far as performance is concerned, the "expected condition" is that the array has no duplicates. I'd just like to be able to do something like:

```php
if (no_dupes($array)) // this deals with …
```
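The efficient check the asker wants is a hash-set scan with early exit, sketched here in Python for illustration (`no_dupes` in the question is the asker's hypothetical name; a PHP version would use the same loop with an associative array):

```python
def has_duplicates(items):
    """Return True as soon as a repeated element is found."""
    seen = set()
    for item in items:
        if item in seen:
            return True   # early exit: no need to scan the rest
        seen.add(item)
    return False

print(has_duplicates([1, 2, 3]))     # False
print(has_duplicates([1, 2, 2, 3]))  # True
```

This is O(n) and, unlike the compare-with-array_unique approach, it stops at the first duplicate, which matters little in the expected no-duplicates case but avoids building a second full array.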

How do I delete all the duplicate records in a MySQL table without temp tables

萝らか妹 submitted on 2019-11-26 18:44:29
I've seen a number of variations on this, but nothing quite matches what I'm trying to accomplish. I have a table, TableA, which contains the answers given by users to configurable questionnaires. The columns are member_id, quiz_num, question_num, answer_num. Somehow a few members got their answers submitted twice, so I need to remove the duplicated records, but make sure that one row is left behind. There is no primary column, so there could be two or three rows all with the exact same data. Is there a query to remove all the duplicates?

Answer (Saharsh Shah): Add a unique index on your table:

```sql
ALTER …
```

Keep first row by multiple columns in an R data.table

早过忘川 submitted on 2019-11-26 18:30:28
Question: I'd like to get the first row only from a data.table, grouped by multiple columns. This is straightforward with a single column, e.g.:

```r
(dt <- data.table(x = c(1, 1, 1, 2), y = c(1, 1, 2, 2), z = c(1, 2, 1, 2)))
#    x y z
# 1: 1 1 1
# 2: 1 1 2
# 3: 1 2 1
# 4: 2 2 2
dt[!duplicated(x)]  # Remove rows 2-3
#    x y z
# 1: 1 1 1
# 2: 2 2 2
```

But none of these approaches work when trying to remove based on two columns, i.e. in this case removing only row 2:

```r
dt[!duplicated(x, y)]  # Keeps only original …
```
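The "first row per combination of several columns" pattern can be sketched in plain Python (the rows mirror the data.table above; this is illustration, not the data.table answer):

```python
# Keep the first row for each distinct (x, y) pair, preserving order.
rows = [(1, 1, 1), (1, 1, 2), (1, 2, 1), (2, 2, 2)]  # (x, y, z)

seen = set()
first = []
for x, y, z in rows:
    if (x, y) not in seen:       # the multi-column group key
        seen.add((x, y))
        first.append((x, y, z))

print(first)  # [(1, 1, 1), (1, 2, 1), (2, 2, 2)]
```

In data.table itself, `unique(dt, by = c("x", "y"))` (supported in reasonably recent versions) expresses the same thing directly; the failure in the question comes from `duplicated(x, y)` not treating `y` as a second key column.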

How do I keep only unique words within each string in a vector

本小妞迷上赌 submitted on 2019-11-26 18:24:52
Question: I have data that looks like this:

```r
vector = c("hello I like to code hello", "Coding is fun", "fun fun fun")
```

I want to remove duplicate words (space delimited), i.e. the output should look like:

```r
vector_cleaned
[1] "hello I like to code"
[2] "Coding is fun"
[3] "fun"
```

Answer 1: Split it up (strsplit on spaces), use unique (in lapply), and paste it back together:

```r
vapply(lapply(strsplit(vector, " "), unique), paste, character(1L), collapse = " ")
# [1] "hello I like to code" "Coding is fun"        "fun"
```

## OR …
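The split / unique / rejoin pipeline translates almost word for word into Python, shown here for illustration (order-preserving, unlike a plain set):

```python
def dedupe_words(s):
    """Keep the first occurrence of each space-delimited word."""
    seen = set()
    out = []
    for word in s.split():
        if word not in seen:
            seen.add(word)
            out.append(word)
    return " ".join(out)

vector = ["hello I like to code hello", "Coding is fun", "fun fun fun"]
print([dedupe_words(s) for s in vector])
# ['hello I like to code', 'Coding is fun', 'fun']
```

Note that, like R's `unique`, this is case-sensitive: "Coding" and "coding" would count as different words unless the strings are lower-cased first.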

List of masked functions in R

荒凉一梦 submitted on 2019-11-26 18:19:06
Question: I use a lot of packages, and I know some functions are masked because they exist in several different packages. Is there a way to get the list of duplicated (or masked) functions? Ideally, I would have a list of the duplicated functions and, for each of them, the list of packages in which it exists.

Answer 1: In base R:

```r
conflicts(detail = TRUE)
```

And to find the list of environments that contain a version of functionA:

```r
getAnywhere(x = "functionA")
```

Note: getAnywhere also finds the functions which are not …
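A rough Python analogue of what `conflicts()` reports can illustrate the idea: scan a list of loaded modules and collect every public name defined by more than one of them (math/cmath are just convenient stand-ins for clashing packages):

```python
import math, cmath
from collections import defaultdict

# Map each public name to the modules that define it.
modules = [math, cmath]
owners = defaultdict(list)
for mod in modules:
    for name in dir(mod):
        if not name.startswith("_"):
            owners[name].append(mod.__name__)

# "Masked" names: defined in more than one module.
masked = {n: mods for n, mods in owners.items() if len(mods) > 1}
print(masked["pi"])  # ['math', 'cmath']
```

As with R's search path, which definition actually wins depends on lookup order; here the dict merely reports the collision, which is exactly what `conflicts(detail = TRUE)` gives you per environment.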

Mongoose duplicate key error with upsert

余生颓废 submitted on 2019-11-26 17:54:48
I have a problem with a duplicate key, and I haven't been able to find an answer for a long time. Please help me solve this problem or explain why I get the duplicate key error. Trace:

```
{ [MongoError: E11000 duplicate key error collection: project.monitor index: _id_ dup key: { : 24392490 }]
  name: 'MongoError',
  message: 'E11000 duplicate key error collection: project.monitor index: _id_ dup key: { : 24392490 }',
  driver: true,
  index: 0,
  code: 11000,
  errmsg: 'E11000 duplicate key error collection: project.monitor index: _id_ dup key: { : 24392490 }' }
    at /home/project/app/lib/monitor.js:67:12
    at callback (/home/project/app/node …
```
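One frequent cause of E11000 with upserts is a race: two concurrent upserts both miss the find, and both try to insert the same `_id`; the loser gets the duplicate-key error even though the operation was an upsert. The usual workaround is to retry on error code 11000. The sketch below models that pattern generically in Python (a dict standing in for the collection; this is not Mongoose API, and the `_id` is taken from the trace above):

```python
class DuplicateKeyError(Exception):
    pass

store = {}  # stands in for the collection, keyed by _id

def insert(doc):
    if doc["_id"] in store:
        raise DuplicateKeyError("E11000 duplicate key")
    store[doc["_id"]] = doc

def upsert(doc, retries=1):
    for _ in range(retries + 1):
        if doc["_id"] in store:
            store[doc["_id"]].update(doc)   # update path
            return
        try:
            insert(doc)                     # insert path
            return
        except DuplicateKeyError:
            continue                        # lost the race: retry as update
    raise RuntimeError("upsert failed")

insert({"_id": 24392490, "n": 1})   # simulates the concurrent winner
upsert({"_id": 24392490, "n": 2})   # retry logic lands on the update path
print(store[24392490]["n"])  # 2
```

With real Mongoose/MongoDB the same shape applies: catch the error, check `err.code === 11000`, and rerun the upsert, which will now find the document and update it.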

Error: java.util.zip.ZipException: duplicate entry

对着背影说爱祢 submitted on 2019-11-26 17:53:29
I'm trying to add a library to my project; right now my current build.gradle is:

```groovy
apply plugin: 'com.android.application'

android {
    compileSdkVersion 21
    buildToolsVersion "21.1.2"

    repositories {
        mavenCentral()
    }

    defaultConfig {
        applicationId "com.example.guycohen.cheaters"
        minSdkVersion 11
        targetSdkVersion 21
        versionCode 1
        versionName "1.0"
        // Enabling multidex support.
        multiDexEnabled true
    }

    buildTypes {
        release {
            minifyEnabled false
            proguardFiles getDefaultProguardFile('proguard-android.txt'), 'proguard-rules.pro'
        }
    }
}

dependencies {
    compile fileTree(dir: 'libs', include: ['*.jar'])
    compile …
}
```
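A "duplicate entry" ZipException during packaging usually means the same classes reach the build twice, for example once from a jar in libs/ and again as a (transitive) dependency. A common remedy is to drop one copy with an exclude. A hedged config sketch (the coordinates and module names below are placeholders, not from the original post; match them to whatever the error message reports as duplicated):

```groovy
dependencies {
    compile fileTree(dir: 'libs', include: ['*.jar'])
    // Hypothetical library declaration: exclude the transitive module that
    // is already present in libs/, so its classes are packaged only once.
    compile('com.example:some-library:1.0') {
        exclude group: 'com.android.support', module: 'support-v4'
    }
}
```

Alternatively, delete the duplicate jar from libs/ and rely solely on the declared dependency.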

Deleting duplicate rows from a table

邮差的信 submitted on 2019-11-26 17:52:06
I have a table in my database which has duplicate records that I want to delete. I don't want to create a new table with distinct entries for this; what I want is to delete duplicate entries from the existing table without creating any new table. Is there any way to do this? The fields are:

```
id, action,
L1_name, L1_data, L2_name, L2_data, L3_name, L3_data, L4_name, L4_data,
L5_name, L5_data, L6_name, L6_data, L7_name, L7_data, L8_name, L8_data,
L9_name, L9_data, L10_name, L10_data, L11_name, L11_data, L12_name, L12_data,
L13_name, L13_data, L14_name, L14_data, L15_name, L15_data
```

These are all my fields; id is unique for every …
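Because this table does have a unique id, the delete-in-place can key on it directly: keep the smallest id of each group of rows that agree on every other column. A runnable sketch via SQLite (the column list is cut down to two of the fields above for brevity, and the sample rows are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, action TEXT, L1_name TEXT)")
conn.executemany("INSERT INTO t (action, L1_name) VALUES (?, ?)",
                 [("add", "a"), ("add", "a"), ("del", "b")])  # rows 1 and 2 duplicate

# Keep the smallest id per group of identical non-id columns.
conn.execute("""
    DELETE FROM t
    WHERE id NOT IN (SELECT MIN(id) FROM t GROUP BY action, L1_name)
""")
print(conn.execute("SELECT id, action FROM t ORDER BY id").fetchall())
# [(1, 'add'), (3, 'del')]
```

With the full table, the GROUP BY clause would list all 30 of the L*_name/L*_data columns plus action; no temporary table is needed. (MySQL additionally disallows selecting from the table being deleted from in some versions, in which case the subquery is wrapped in a derived table.)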