apply | 易学教程

Set column names while calling a function

Suppose we have a numeric data.frame foo and want to find the sum of each pair of columns:

    foo <- data.frame(x=1:5, y=4:8, z=10:14, w=8:4)
    bar <- combn(colnames(foo), 2, function(x) foo[,x[1]] + foo[,x[2]])
    bar
    #      [,1] [,2] [,3] [,4] [,5] [,6]
    # [1,]    5   11    9   14   12   18
    # [2,]    7   13    9   16   12   18
    # [3,]    9   15    9   18   12   18
    # [4,]   11   17    9   20   12   18
    # [5,]   13   19    9   22   12   18

Everything is fine, except that the column names are missing from bar. I want the column names of bar to show the related columns in foo, for instance in this example:

    colnames(bar) <- apply(combn(colnames(foo), 2), 2, paste0, collapse="")
    colnames(bar)

Pandas groupby/apply has different behaviour with int and string types

I have the following dataframe

       X    Y
    0  A   10
    1  A    9
    2  A    8
    3  A    5
    4  B  100
    5  B   90
    6  B   80
    7  B   50

and two different functions that are very similar

    def func1(x):
        if x.iloc[0]['X'] == 'A':
            x['D'] = 1
        else:
            x['D'] = 0
        return x[['X', 'D']]

    def func2(x):
        if x.iloc[0]['X'] == 'A':
            x['D'] = 'u'
        else:
            x['D'] = 'v'
        return x[['X', 'D']]

Now I can groupby/apply these functions

    df.groupby('X').apply(func1)
    df.groupby('X').apply(func2)

The first line gives me what I want, i.e.

       X  D
    0  A  1
    1  A  1
    2  A  1
    3  A  1
    4  B  0
    5  B  0
    6  B  0
    7  B  0

But the second line returns something quite strange

       X  D
    0  A  u
    1  A  u
    2  A  u
    3  A  u
    4  A  u
    5
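The excerpt above is cut off before any explanation, so the following is only a sketch of a different technique, not the answer from the original thread. It builds the same X/D result as func2 without groupby/apply and without modifying the groups in place, using a plain boolean comparison and Series.map. The frame mirrors the sample data shown above; the variable name result is an assumption made for illustration.

```python
import pandas as pd

# Sample frame copied from the excerpt above
df = pd.DataFrame({"X": list("AAAABBBB"), "Y": [10, 9, 8, 5, 100, 90, 80, 50]})

# Vectorized alternative (not the thread's method): label rows by group key.
# 'u' where X == 'A', 'v' otherwise, mirroring the if/else in func2.
d_column = df["X"].eq("A").map({True: "u", False: "v"})

# Build a new frame instead of mutating df, keeping only X and D
result = df[["X"]].assign(D=d_column)
print(result)
```

Because nothing is assigned inside a per-group function, the result does not depend on how groupby/apply treats the object passed to the function.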

`tapply()` to return data frame

I have a dataset with a datetime (POSIXct), a "node" (factor) and a "c" (numeric) column, for example:

                     date node           c
    1 2011-08-14 10:30:00    2 0.051236000
    2 2011-08-14 10:30:00    2 0.081230000
    3 2011-08-14 10:31:00    1 0.000000000
    4 2011-08-14 10:31:00    4 0.001356337
    5 2011-08-14 10:31:00    3 0.001356337
    6 2011-08-14 10:32:00    2 0.000000000

I need to take the mean of column "c" for all pairs of "date" and "node", so I did this:

    tapply(data$c, list(data$node, data$date), mean)

The result I obtain is what I want, but in a strange structure:

    num [1:5, 1:8923] 0 0 0.00092 0.00146 NA ...
    - attr(*,

Help me replace a for loop with an “apply” function

...if that is possible.

My task is to find the longest streak of consecutive days on which a user participated in a game. Instead of writing an SQL function, I chose to use R's rle function to get the longest streaks and then update my db table with the results. The (attached) dataframe is something like this:

    day         user_id
    2008/11/01  2001
    2008/11/01  2002
    2008/11/01  2003
    2008/11/01  2004
    2008/11/01  2005
    2008/11/02  2001
    2008/11/02  2005
    2008/11/03  2001
    2008/11/03  2003
    2008/11/03  2004
    2008/11/03  2005
    2008/11/04  2001
    2008/11/04  2003
    2008/11/04  2004
    2008/11/04  2005

I tried the following to get per user

apply a function on rolling window in Dataframe where whole dataframe is passed to function

I have a dataframe of 5 columns indexed by YearMo:

    import numpy as np
    import pandas as pd

    # 120 monthly keys of the form YYYYMM, from 200001 to 200912
    yearmo = np.repeat(np.arange(2000, 2010) * 100, 12) + [x for x in range(1, 13)] * 10
    # np.random.random takes the shape as a single tuple
    rates = pd.DataFrame(data=np.random.random((120, 5)),
                         index=pd.Series(data=yearmo, name='YearMo'),
                         columns=['A', 'B', 'C', 'D', 'E'])
    rates.head()

    YearMo         A         B         C         D         E
    200411  0.237696  0.341937  0.258713  0.569689  0.470776
    200412  0.601713  0.313006  0.221821  0.720162  0.889891
    200501  0.024379  0.761315  0.225032  0.293682  0.302431
    200502  0.996778  0.388783  0.026448  0.056188  0.744850
    200503  0.942024  0.768416  0.484236  0.102904  0.287446

What I would like to do is to be able to

R plyr, data.table, apply certain columns of data.frame

I am looking for ways to speed up my code. I am looking into the apply / ply methods as well as data.table. Unfortunately, I am running into problems. Here is a small data sample:

    ids1 <- c(1, 1, 1, 1, 2, 2, 2, 2)
    ids2 <- c(1, 2, 3, 4, 1, 2, 3, 4)
    chars1 <- c("aa", " bb ", "__cc__", "dd ", "__ee", NA, NA, "n/a")
    chars2 <- c("vv", "_ ww_", " xx ", "yy__", " zz", NA, "n/a", "n/a")
    data <- data.frame(col1 = ids1, col2 = ids2, col3 = chars1, col4 = chars2, stringsAsFactors = FALSE)

Here is a solution using loops:

    library("plyr")
    cols_to_fix <- c("col3","col4")
    for (i in 1:length(cols_to_fix)) {

Passing multiple arguments to apply (Python)

Question

I'm trying to clean up some code in Python to vectorize a set of features and I'm wondering if there's a good way to use apply to pass multiple arguments. Consider the following (current version):

    def function_1(x):
        if "string" in x:
            return 1
        else:
            return 0

    df['newFeature'] = df['oldFeature'].apply(function_1)

With the above I have to write a new function (function_1, function_2, etc.) to test for each substring "string" that I want to find. In an ideal world I could combine all of these
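A minimal sketch of one way to avoid writing function_1, function_2 and so on: pandas Series.apply forwards extra positional arguments through its args parameter and passes extra keyword arguments on to the function. The helper name contains_substring, the sample rows, and the new column names are assumptions made for illustration; they do not come from the original question.

```python
import pandas as pd


def contains_substring(text, substring):
    """Return 1 if `substring` occurs in `text`, else 0."""
    return 1 if substring in text else 0


# Hypothetical sample data standing in for the asker's df
df = pd.DataFrame({"oldFeature": ["a string here", "nothing", "another string"]})

# Extra positional arguments go in the `args` tuple
df["hasString"] = df["oldFeature"].apply(contains_substring, args=("string",))

# Extra keyword arguments are passed straight through to the function
df["hasAnother"] = df["oldFeature"].apply(contains_substring, substring="another")

print(df)
```

The same single function now serves every substring, so each new feature only needs a different argument rather than a new function definition.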

Row-wise iteration like apply with purrr

Question

How do I achieve row-wise iteration using purrr::map? Here's how I'd do it with a standard row-wise apply.

    df <- data.frame(a = 1:10, b = 11:20, c = 21:30)
    lst_result <- apply(df, 1, function(x){
      var1 <- (x[['a']] + x[['b']])
      var2 <- x[['c']]/2
      return(data.frame(var1 = var1, var2 = var2))
    })

However, this is not too elegant, and I would rather do it with purrr. It may (or may not) be faster, too.

Answer 1

You can use pmap for row-wise iteration. The columns are used as the arguments of whatever

multiply multiple column and find sum of each column for multiple values

Question

I'm trying to multiply columns and get the names of the results. I have a data frame:

    v1 v2 v3 v4 v5
     0  1  1  1  1
     0  1  1  0  1
     1  0  1  1  0

I'm trying to multiply each column with the others, like v1v2 v1v3 v1v4 v1v5 and v2v3 v2v4 v2v5 etc., then v1v2v3 v1v2v4 v1v2v5 v2v3v4 v2v3v5, then the 4-column and 5-column combinations; if there are n columns, up to n-column combinations. I'm trying to use the following code in a while loop, but it is not working:

    i<-1
    while(i<=ncol(data) {
      results<-data.frame()
      v<-i
      results<- t(apply(data,1,function(x) combn(x,v

How to apply rolling functions in a group by object in pandas

I'm having difficulty solving a look-back or roll-over problem in a dataframe, or perhaps in groupby. The following is a simple example of the dataframe I have:

              fruit   amount
    20140101  apple        3
    20140102  apple        5
    20140102  orange      10
    20140104  banana       2
    20140104  apple       10
    20140104  orange       4
    20140105  orange       6
    20140105  grape        1
    …
    20141231  apple        3
    20141231  grape        2

I need to calculate, for every day, the average value of 'amount' of each fruit over the previous 3 days, and create the following data frame:

              fruit   average_in_last 3 days
    20140104  apple    4
    20140104  orange  10
    ...

For example on 20140104, the previous 3
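One possible way to compute the look-back average described above, sketched under some assumptions: the integer-like dates are parsed into a DatetimeIndex, and "the previous 3 days" is read as the three calendar days before the current day, excluding the day itself. The column name date and the variable names are hypothetical, and only the rows visible in the excerpt are used. The sketch combines groupby with a time-based rolling window; closed="left" drops the current day from each window.

```python
import pandas as pd

# Rows visible in the excerpt above; dates parsed from the YYYYMMDD keys
df = pd.DataFrame(
    {
        "date": pd.to_datetime(
            ["20140101", "20140102", "20140102", "20140104",
             "20140104", "20140104", "20140105", "20140105"],
            format="%Y%m%d",
        ),
        "fruit": ["apple", "apple", "orange", "banana",
                  "apple", "orange", "orange", "grape"],
        "amount": [3, 5, 10, 2, 10, 4, 6, 1],
    }
)

# Per fruit, average `amount` over the window [day - 3 days, day).
# closed="left" excludes the current day, so only earlier days count.
avg_prev_3d = (
    df.set_index("date")
      .groupby("fruit")["amount"]
      .rolling("3D", closed="left")
      .mean()
)

print(avg_prev_3d)
```

The result is a Series indexed by (fruit, date). Rows with no earlier observation inside the window come out as NaN, and the entries for apple and orange on 2014-01-04 match the 4 and 10 shown in the desired output above.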