apply | 易学教程

Why does pandas apply calculate twice

I'm using the apply method on a pandas DataFrame object. When my DataFrame has a single column, it appears that the applied function is being called twice. The questions are: why? And can I stop that behavior?

Code:

    import pandas as pd

    def mul2(x):
        print('hello')
        return 2*x

    df = pd.DataFrame({'a': [1,2,0.67,1.34]})
    print(df.apply(mul2))

Output:

    hello
    hello
    0    2.00
    1    4.00
    2    1.34
    3    2.68

I'm printing 'hello' from within the function being applied. I know it's being applied twice because 'hello' printed twice. What's more, if I had two columns, 'hello' prints 3 times. Even more still is when

Sum every nth points

I have a vector and I need to sum every n numbers and return the results. This is the way I plan on doing it currently. Is there a better way to do this?

    v = 1:100
    n = 10
    sidx = seq.int(from=1, to=length(v), by=n)
    eidx = c((sidx-1)[2:length(sidx)], length(v))
    thesum = sapply(1:length(sidx), function(i) sum(v[sidx[i]:eidx[i]]))

This gives:

    thesum
    [1]  55 155 255 355 455 555 655 755 855 955

    unname(tapply(v, (seq_along(v)-1) %/% n, sum))
    # [1]  55 155 255 355 455 555 655 755 855 955

UPDATE: If you want to sum every n consecutive numbers, use colSums. If you want to sum every nth number, use rowSums as per
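The two readings mentioned in the excerpt (summing n consecutive numbers versus summing every nth number) can be sketched in plain Python. This is an illustrative translation, not the R answer itself; the helper names `sum_chunks` and `sum_every_nth` are hypothetical and do not come from the original question.

```python
def sum_chunks(values, n):
    """Sum each run of n consecutive items; the last run may be shorter."""
    return [sum(values[i:i + n]) for i in range(0, len(values), n)]


def sum_every_nth(values, n):
    """Sum items that share the same position modulo n (a strided sum)."""
    return [sum(values[k::n]) for k in range(n)]


v = list(range(1, 101))      # same data as v = 1:100 in the question
print(sum_chunks(v, 10))     # [55, 155, 255, 355, 455, 555, 655, 755, 855, 955]
print(sum_every_nth(v, 10))  # [460, 470, 480, 490, 500, 510, 520, 530, 540, 550]
```

The first function reproduces the result shown in the question; the second corresponds to the "every nth number" reading.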

How to use Pandas groupby apply() without adding an extra index

Question: I very often want to create a new DataFrame by combining multiple columns of a grouped DataFrame. The apply() function allows me to do that, but it requires that I create an unneeded index:

    In [359]: df = pandas.DataFrame({'x': 3 * ['a'] + 2 * ['b'], 'y': np.random.normal(size=5), 'z': np.random.normal(size=5)})

    In [360]: df
    Out[360]:
       x         y         z
    0  a  0.201980 -0.470388
    1  a  0.190846 -2.089032
    2  a -1.131010  0.227859
    3  b -0.263865 -1.906575
    4  b -1.335956 -0.722087

    In [361]: df.groupby('x').apply

Remove columns from dataframe where ALL values are NA

I'm having trouble with a data frame and couldn't resolve the issue myself. The data frame has arbitrary properties as columns, and each row represents one data set. The question is: how do I get rid of columns where the value is NA for ALL rows?

Try this:

    df <- df[,colSums(is.na(df))<nrow(df)]

The two approaches offered thus far fail with large data sets because (among other memory issues) they create is.na(df), which is an object the same size as df. Here are two approaches that are more memory- and time-efficient.

An approach using Filter:

    Filter(function(x)!all(is.na(x)), df)

and an

JS: understanding and using apply and call

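I have thought about call and apply for a long time. Many times I believed I had them memorized, but they kept defeating me. Today I am writing these notes in the hope of mastering them completely, and of helping anyone else who wants to learn them.

1. Definition and understanding of apply, starting from the apply function

On MDN, apply is defined as: "The apply() method calls a function with a given this value, and arguments provided as an array (or an array-like object)."

My understanding is this: in front of apply there is an object that has its own this, call it A, and the argument of apply() is also an object that has its own this, call it B. Then A.apply(B) means that A's code is executed and calls B; B's code runs as usual, and its result becomes the argument of apply; apply then replaces A's own this with the this that this result refers to, and A's code runs.

For example:

    var aa = {
        _name: 111,
        _age: 222,
        _f: function() {
            console.log(this)
            console.log(this._name)
        }
    }
    var cc = {
        _name: 0,
        _age: 0,
        _f: function() {
            console.log(this)
            console.log(this._name)
        }
    }
    cc._f.apply(aa)/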

Last Observation Carried Forward In a data frame? [duplicate]

This question already has an answer here: Replacing NAs with latest non-NA value (15 answers).

I wish to implement a "Last Observation Carried Forward" for a data set I am working on which has missing values at the end of it. Here is simple code to do it (question after it):

    LOCF <- function(x) {
      # Last Observation Carried Forward (for a left to right series)
      LOCF <- max(which(!is.na(x)))  # the location of the Last Observation to Carry Forward
      x[LOCF:length(x)] <- x[LOCF]
      return(x)
    }

    # example:
    LOCF(c(1,2,3,4,NA,NA))
    LOCF(c(1,NA,3,4,NA,NA))

Now this works great for simple vectors. But if I

apply a function over groups of columns

How can I use apply or a related function to create a new data frame that contains the results of the row averages of each pair of columns in a very large data frame? I have an instrument that outputs n replicate measurements on a large number of samples, where each single measurement is a vector (all measurements are the same length vectors). I'd like to calculate the average (and other stats) on all replicate measurements of each sample. This means I need to group n consecutive columns together and do row-wise calculations. For a simple example, with three replicate measurements on two

Results transposed with R apply [duplicate]

Question: This question already has an answer here: Why apply() returns a transposed xts matrix? (1 answer). Closed 6 years ago.

Apologies, I just realised that this has already been answered here. This should be pretty basic, but I do not really understand why it is happening. Can someone help? This is the simple code with the example 'data':

    applyDirichletPrior <- function (row_vector) {
      row_vector_added <- row_vector + min (row_vector)
      row_vector_result <- row_vector_added / sum(row_vector_added)
    }

Why apply() returns a transposed xts matrix?

I want to run a function on all periods of an xts matrix. apply() is very fast, but the returned matrix has transposed dimensions compared to the original object:

    > dim(myxts)
    [1] 7429   48
    > myxts.2 = apply(myxts, 1 , function(x) { return(x) })
    > dim(myxts.2)
    [1]   48 7429
    > str(myxts)
    An 'xts' object from 2012-01-03 09:30:00 to 2012-01-30 16:00:00 containing:
      Data: num [1:7429, 1:48] 4092500 4098500 4091500 4090300 4095200 ...
     - attr(*, "dimnames")=List of 2
      ..$ : NULL
      ..$ : chr [1:48] "Open" "High" "Low" "Close" ...
      Indexed by objects of class: [POSIXlt,POSIXt] TZ:
      xts Attributes:
     NULL
    > str

Apply a function to every row of a matrix or a data frame

Suppose I have an n by 2 matrix and a function that takes a 2-vector as one of its arguments. I would like to apply the function to each row of the matrix and get an n-vector. How do I do this in R?

For example, I would like to compute the density of a 2D standard Normal distribution on three points:

    bivariate.density <- function(x = c(0, 0), mu = c(0, 0), sigma = c(1, 1), rho = 0){
      exp(-1/(2*(1-rho^2))*(x[1]^2/sigma[1]^2+x[2]^2/sigma[2]^2-2*rho*x[1]*x[2]/(sigma[1]*sigma[2]))) * 1/(2*pi*sigma[1]*sigma[2]*sqrt(1-rho^2))
    }

    out <- rbind(c(1, 2), c(3, 4), c(5, 6))

How do I apply the function to each row of out?
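As a rough illustration of the row-wise pattern being asked about, here is a minimal Python sketch rather than the R answer. The function name `bivariate_density` and the list-of-tuples representation of `out` are assumptions made for this example. The formula mirrors the one in the question, which accepts `mu` but does not use it.

```python
import math


def bivariate_density(x, mu=(0.0, 0.0), sigma=(1.0, 1.0), rho=0.0):
    """Density formula as written in the question; mu is accepted but unused there."""
    quad = (x[0] ** 2 / sigma[0] ** 2
            + x[1] ** 2 / sigma[1] ** 2
            - 2 * rho * x[0] * x[1] / (sigma[0] * sigma[1]))
    norm = 2 * math.pi * sigma[0] * sigma[1] * math.sqrt(1 - rho ** 2)
    return math.exp(-quad / (2 * (1 - rho ** 2))) / norm


# Each tuple plays the role of one row of the n-by-2 matrix `out`.
out = [(1, 2), (3, 4), (5, 6)]

# Apply the function to every row and collect one value per row.
densities = [bivariate_density(row) for row in out]
print(densities)
```

The list comprehension yields one density per row, which is the n-vector the question asks for.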