apply

map(), apply(), and applymap() in Pandas

|▌冷眼眸甩不掉的悲伤 posted on 2019-12-03 17:09:15
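They differ in the objects they are applied to.

1. map()

map() is a Series method; DataFrame had no map() in the pandas versions current when this was written. map() applies a custom function to each element of a Series.

Example:

    import numpy as np
    import pandas as pd

    df = pd.DataFrame({'key1': ['a', 'a', 'b', 'b', 'a'],
                       'key2': ['one', 'two', 'one', 'two', 'one'],
                       'data1': np.arange(5),
                       'data2': np.arange(5, 10)})
    df

Now use map to format column data1 to three decimal places:

    df['data1'] = df['data1'].map(lambda x: "%.3f" % x)
    df

You can also use map to change a to c and b to d in key1:

    df['key1'] = df['key1'].map({'a': 'c', 'b': 'd'})
    df

2. apply()

apply() applies a function to each row or each column of a DataFrame.

Example: now use apply to add columns data1 and data2:

    # axis=1 applies the function to each row.
    # axis=0 applies it to each column (the default).
    df['total'] = df[['data1','data2']].apply(lambda x : x.sum(),axis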

How does the [].push.apply work?

折月煮酒 posted on 2019-12-03 16:39:25
Can someone please explain how this line of code works?

    [].push.apply(perms, permutation(arr.slice(0), start + 1, last));

This function generates an array of all permutations of an input array:

    var permutation = function(arr, start, last){
        var length = arr.length;
        if(!start){ start = 0; }
        if(!last){ last = length - 1; }
        if( last === start){ return [arr]; }
        var temp;
        var perms = [];
        for(var i = start; i < length; i++){
            swapIndex(arr, i, start);
            console.log(arr);
            [].push.apply(perms, permutation(arr.slice(0), start + 1, last));
            swapIndex(arr, i, start);
        }
        return perms;
    };

[].push creates
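As a hedged illustration of the line asked about above: `[]` is only a convenient way to reach `Array.prototype.push`, and `apply` calls it with `this` set to the first argument and the elements of the second argument passed as individual arguments. The names `target` and `source` are made up for this sketch and do not come from the post.

```javascript
// Minimal sketch of what [].push.apply(target, source) does.
// `target` and `source` are illustrative names, not from the original post.
const target = ['a'];
const source = ['b', 'c'];

// The empty array literal is only used to look up the shared push method.
const sameFunction = [].push === Array.prototype.push;

// Equivalent to target.push('b', 'c'): `this` is target, and the elements
// of source become separate arguments. push returns the new length.
const newLength = [].push.apply(target, source);

console.log(sameFunction); // true
console.log(target);       // [ 'a', 'b', 'c' ]
console.log(newLength);    // 3
```

In the permutation function, this means every array returned by the recursive call is appended to `perms` as its own element, instead of `perms` receiving one nested array.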

Using Apply in Pandas Lambda functions with multiple if statements

戏子无情 posted on 2019-12-03 16:13:02
I'm trying to infer a classification according to the size of a person in a dataframe like this one:

       Size
    1  80000
    2  8000000
    3  8000000000
    ...

I want it to look like this:

       Size        Classification
    1  80000       <1m
    2  8000000     1-10m
    3  8000000000  >1bi
    ...

I understand that the ideal process would be to apply a lambda function like this:

    df['Classification']=df['Size'].apply(lambda x: "<1m" if x<1000000 else "1-10m" if 1000000<x<10000000 else ...)

I checked a few posts regarding multiple ifs in a lambda function, here is an example link, but that syntax is not working for me for some reason in a multiple ifs

Why doesn't Array.push.apply work?

…衆ロ難τιáo~ posted on 2019-12-03 15:31:31
Question

As described here, a quick way to append array b to array a in JavaScript is a.push.apply(a, b). You'll note that the object a is used twice. Really we just want the push function, and b.push.apply(a, b) accomplishes exactly the same thing: the first argument of apply supplies the this for the applied function. I thought it might make more sense to directly use the methods of the Array object: Array.push.apply(a, b). But this doesn't work! I'm curious why not, and if there's a better way
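A small sketch of the behavior described in the question, assuming a current standard engine such as Node.js: `push` is defined on `Array.prototype`, not on the `Array` constructor itself, so `Array.push` is undefined there. The variable names are illustrative, and the spread form at the end is one possible alternative, not necessarily the one the original answers proposed.

```javascript
// Sketch, assuming a current standard engine (e.g. Node.js).
// All variable names here are illustrative.
const a = [1, 2];
const b = [3, 4];

// push lives on the prototype, not on the Array constructor.
const staticPushType = typeof Array.push;          // "undefined"
const protoPushType = typeof Array.prototype.push; // "function"

// Going through the prototype works: `this` is a, arguments are 3 and 4.
Array.prototype.push.apply(a, b);

// With spread syntax (ES2015+) apply is not needed at all.
const c = [1, 2];
c.push(...b);

console.log(staticPushType, protoPushType); // undefined function
console.log(a); // [ 1, 2, 3, 4 ]
console.log(c); // [ 1, 2, 3, 4 ]
```

Both forms pass every element of `b` as a separate argument, so very large arrays can exceed the engine's argument limit.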

MDN bind why concat arguments when calling apply

不羁的心 posted on 2019-12-03 15:16:57
MDN specifies a polyfill bind method for those browsers without a native bind method: https://developer.mozilla.org/en-US/docs/JavaScript/Reference/Global_Objects/Function/bind

This code has the following line:

    aArgs.concat(Array.prototype.slice.call(arguments))

Which is passed as the args to the apply method on the function:

    fToBind.apply(this instanceof fNOP && oThis ? this : oThis,
                  aArgs.concat(Array.prototype.slice.call(arguments)));

However, this line actually repeats the arguments, so that if I called the bind method as:

    fnX.bind({value: 666}, 1, 2, 3)

the arguments passed to fnX are: [1
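One way to read the concat line is sketched below with a simplified, hypothetical helper (`simplifiedBind` is not MDN's polyfill and omits its `new` handling). The point it tries to show is that `arguments` inside the returned function belongs to that function's own call, so it holds the call-time arguments and not the ones given at bind time. `fnX` is defined here only to make the arguments visible.

```javascript
// Hypothetical, simplified bind; NOT the MDN polyfill (no `new` support).
function simplifiedBind(fToBind, oThis) {
  // Arguments after oThis, captured once when binding.
  const aArgs = Array.prototype.slice.call(arguments, 2);
  return function fBound() {
    // Here `arguments` is fBound's own: whatever the caller passes later.
    const callArgs = Array.prototype.slice.call(arguments);
    return fToBind.apply(oThis, aArgs.concat(callArgs));
  };
}

// Illustrative target function that reports `this.value` and its arguments.
function fnX() {
  return [this.value].concat(Array.prototype.slice.call(arguments));
}

const bound = simplifiedBind(fnX, { value: 666 }, 1, 2, 3);
const result = bound(4, 5);

console.log(result); // [ 666, 1, 2, 3, 4, 5 ]
```

Under this reading the bound arguments appear once, followed by whatever is supplied at call time, which is how partial application through bind is meant to behave.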

How to use apply, cat and print, without getting NULL

情到浓时终转凉″ posted on 2019-12-03 14:17:48
I am trying to use cat() as a function inside apply(). I can almost make R do what I want, but I'm getting some very confusing (to me) NULLs at the end of the return. Here is a silly example, to highlight what I'm getting.

    val1 <- 1:10
    val2 <- 25:34
    values <- data.frame(val1, val2)
    apply(values, 1, function(x) cat(x[1], x[2], fill=TRUE))

This "works" in that R accepts it and it runs, but I don't understand the results.

    > apply(values, 1, function(x) cat(x[1], x[2], fill=TRUE))
    1 25
    2 26
    3 27
    4 28
    5 29
    6 30
    7 31
    8 32
    9 33
    10 34
    NULL

But, I want to get:

    > apply(values, 1, function(x) cat(x[1], x

using data.table with multiple threads in R

穿精又带淫゛_ posted on 2019-12-03 13:59:10
Is there a way to utilize multiple threads for computation using data.table in R? For example, let's say I have the following data.table:

    dtb <- data.table(id=rep(1:10000, 1000), x=1:1e7)
    setkey(dtb, id)
    f <- function(m) {
        # some really complicated function
    }
    res <- dtb[, f(x), by=id]

Is there a way to get R to multithread this if f takes a while to compute? What about in the case that f is quick: will multithreading help, or is most of the time going to be taken by data.table in splitting things up into groups?

42- I am not sure that this is "multi-threading", but perhaps you meant to include a

Applying pnorm to columns of a data frame

為{幸葍}努か posted on 2019-12-03 12:30:54
I'm trying to normalize some data which I have in a data frame. I want to take each value and run it through the pnorm function along with the mean and standard deviation of the column the value lives in. Using loops, here's how I would write out what I want to do:

    # example data
    hist_data <- data.frame( matrix( rnorm( 200,mean=5,sd=.5 ),nrow=20 ) )
    n <- dim( hist_data )[2]  # columns = 10
    k <- dim( hist_data )[1]  # rows = 20

    # set up the data frame which we will populate with a loop
    normalized <- data.frame( matrix( nrow = nrow( hist_data ), ncol = ncol( hist_data ) ) )

    # hot loop in loop action
    for (

R return the index of the minimum column for each row

心已入冬 posted on 2019-12-03 12:06:21
I have a data.frame that contains 4 columns (given below). I want to find the index of the minimum column (NOT THE VALUE) for each row. Any idea how to achieve that?

    > d
               V1         V2         V3         V4
    1 0.388116155 0.98999967 0.41548536 0.76093748
    2 0.495971331 0.47173142 0.51582728 0.06789924
    3 0.436495321 0.48699268 0.21187838 0.54139290
    4 0.313514389 0.50265539 0.08054103 0.46019601
    5 0.277275961 0.39055360 0.29594162 0.70622532
    6 0.264804739 0.86996266 0.85708635 0.61136741
    7 0.627344463 0.54277873 0.96769568 0.80399490
    8 0.814420492 0.35362949 0.39023446 0.39246250
    9 0.517459983 0.65895805 0.93662382 0

Pandas - combine column values into a list in a new column

▼魔方 西西 posted on 2019-12-03 12:03:08
Question

I have a Python Pandas dataframe df:

    d=[['hello',1,'GOOD','long.kw'],
       [1.2,'chipotle',np.nan,'bingo'],
       ['various',np.nan,3000,123.456]]
    t=pd.DataFrame(data=d, columns=['A','B','C','D'])

which looks like this:

    print(t)
             A         B     C        D
    0    hello         1  GOOD  long.kw
    1      1.2  chipotle   NaN    bingo
    2  various       NaN  3000  123.456

I am trying to create a new column which is a list of the values in A, B, C, and D. So it would look like this:

    t['combined']
    Out[125]:
    0 [hello, 1, GOOD, long.kw]
    1 [1.2, chipotle, nan, bingo