apply

Unexpected apply function behaviour in R

孤者浪人 submitted on 2019-11-29 14:16:58
I've discovered a surprising behaviour of apply that I wonder if anyone can explain. Let's take a simple matrix:

    > (m = matrix(1:8, ncol=4))
         [,1] [,2] [,3] [,4]
    [1,]    1    3    5    7
    [2,]    2    4    6    8

We can flip it vertically thus:

    > apply(m, MARGIN=2, rev)
         [,1] [,2] [,3] [,4]
    [1,]    2    4    6    8
    [2,]    1    3    5    7

This applies the rev() vector reversal function to each column in turn. But when we try to apply rev by row we get:

    > apply(m, MARGIN=1, rev)
         [,1] [,2]
    [1,]    7    8
    [2,]    5    6
    [3,]    3    4
    [4,]    1    2

... a 90 degree anti-clockwise rotation! apply delivers the same result using FUN=function(v) {v[length(v):1]} so it is
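The rotation seen above follows from how apply() assembles its result: each call's return value becomes a column of the output, regardless of which margin was iterated over. The sketch below is a minimal illustration using the question's matrix; the names by_row and flipped are introduced here for the example only.

```r
m <- matrix(1:8, ncol = 4)

# apply() stores each call's result as a COLUMN of the output,
# so iterating over the 2 rows of m yields a 4 x 2 matrix.
by_row <- apply(m, MARGIN = 1, rev)
dim(by_row)

# Transposing puts the reversed rows back into row orientation.
flipped <- t(by_row)

# The same horizontal flip can be had by plain column indexing.
flipped_index <- m[, ncol(m):1]
```

If a horizontal flip is all that is needed, the indexing form avoids apply() altogether.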

Looping over combinations of regression model terms

烈酒焚心 submitted on 2019-11-29 12:49:38
I'm running a regression of the form

    reg = lm(y ~ x1 + x2 + x3 + z1, data = mydata)

In place of the last term, z1, I want to loop through a set of different variables, z1 through z10, running a regression for each with it as the last term. E.g. in the second run I want to use

    reg = lm(y ~ x1 + x2 + x3 + z2, data = mydata)

and in the third run:

    reg = lm(y ~ x1 + x2 + x3 + z3, data = mydata)

How can I automate this by looping through the list of z-variables?

With this dummy data:

    dat1 <- data.frame(y = rpois(100, 5), x1 = runif(100), x2 = runif(100),
                       x3 = runif(100), z1 = runif(100), z2 = runif(100))

You could get your list of two lm
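One possible way to build such a list of fits is sketched here. It is not necessarily the approach the truncated answer goes on to describe. It uses the dummy data from above; set.seed() is added only for reproducibility, and z_vars and fits are names chosen for this example.

```r
set.seed(1)  # assumption: fixed seed so the dummy data is reproducible
dat1 <- data.frame(y = rpois(100, 5), x1 = runif(100), x2 = runif(100),
                   x3 = runif(100), z1 = runif(100), z2 = runif(100))

# Names of the columns to swap in as the last term.
z_vars <- grep("^z", names(dat1), value = TRUE)

# Build one formula per z-variable and fit a model for each.
fits <- lapply(z_vars, function(z) {
  f <- reformulate(c("x1", "x2", "x3", z), response = "y")
  lm(f, data = dat1)
})
names(fits) <- z_vars

# Coefficients of each fitted model, as a named list.
coefs <- lapply(fits, coef)
```

Building the formula with reformulate() keeps the variable name visible in each model's call and coefficient names, which string pasting followed by as.formula() would also achieve.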

When to use cross apply and when to use inner join in SQL Server 2012

一笑奈何 submitted on 2019-11-29 12:29:41
I have a small question about SQL Server. When do we use cross apply, and when do we use inner join? Why use cross apply at all in SQL Server? I have emp and dept tables; based on those two tables, I wrote an inner join query and a cross apply query like this:

    -- using cross apply
    SELECT *
    FROM Department D
    CROSS APPLY (SELECT * FROM Employee E WHERE E.DepartmentID = D.DepartmentID) A

    -- using inner join
    SELECT *
    FROM Department D
    INNER JOIN Employee E ON D.DepartmentID = E.DepartmentID

Both queries return the same result. So why is cross apply needed in SQL Server? Is there a performance difference?

R count times word appears in element of list

爱⌒轻易说出口 submitted on 2019-11-29 12:28:26
I have a list comprised of words.

    > head(splitWords2)
    [[1]]
     [1] "Some" "additional" "information" "that" "we" "would" "need" "to" "replicate" "the"
    [11] "experiment" "is" "how" "much" "vinegar" "should" "be" "placed" "in" "each"
    [21] "identical" "container" "or" "what" "tool" "use" "measure" "mass" "of" "four"
    [31] "different" "samples" "and" "distilled" "water" "rinse" "after" "taking" "them" "out"

    [[2]]
     [1] "After" "reading" "the" "expirement" "I" "realized" "that" "additional" "information" "you"
    [11] "need" "to" "replicate" "expireiment" "is" "one" "amant" "of" "vinegar" "poured"
    [21] "in"
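For a list shaped like the one above, counting how often a word occurs in each element can be sketched as follows. split_words is a small hypothetical stand-in for splitWords2, and "the" is an arbitrary target word chosen for illustration.

```r
# Hypothetical stand-in for splitWords2: a list of character vectors.
split_words <- list(
  c("Some", "additional", "information", "that", "we", "need", "the", "the"),
  c("After", "reading", "the", "expirement", "I", "realized", "that")
)

# Occurrences of one word in each list element (comparison is case-sensitive).
the_counts <- sapply(split_words, function(w) sum(w == "the"))

# Full word-frequency table for each element.
freq_tables <- lapply(split_words, table)
```

Wrapping the words in tolower() before comparing would make the count case-insensitive.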

Get the (t-1) data within groups

混江龙づ霸主 submitted on 2019-11-29 10:50:53
Apologies if this has been asked before, but I couldn't find any question which answers this exactly. I have data like this:

    Project  Date       price
    A        30/3/2013  2082
    B        19/3/2013  1567
    B        22/2/2013  1642
    C        12/4/2013  1575
    C        5/6/2013   1582

I want to have a column with last-instance prices by group. For example, for row 2, the last-instance price for the same group will be 1642. The final data will look somewhat like this:

    Project  Date       price  lastPrice
    A        30/3/2013  2082   0
    B        19/3/2013  1567   1642
    B        22/2/2013  1642   0
    C        12/4/2013  1575   0
    C        5/6/2013   1582   1575

How can I do this? The main issue I'm facing is that the data
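A base R sketch of the lookup described above is shown here, assuming the dates are day/month/year strings and that 0 marks "no earlier row in the group", as in the desired output. The names dat, ord and lagged are introduced for this example.

```r
dat <- data.frame(
  Project = c("A", "B", "B", "C", "C"),
  Date    = c("30/3/2013", "19/3/2013", "22/2/2013", "12/4/2013", "5/6/2013"),
  price   = c(2082, 1567, 1642, 1575, 1582)
)

# Parse the day/month/year strings so rows can be ordered in time.
dat$Date <- as.Date(dat$Date, format = "%d/%m/%Y")

# Order by project and date, then shift prices down by one within each project.
ord <- order(dat$Project, dat$Date)
lagged <- ave(dat$price[ord], dat$Project[ord],
              FUN = function(p) c(0, head(p, -1)))

# Write the shifted prices back in the original row order.
dat$lastPrice <- numeric(nrow(dat))
dat$lastPrice[ord] <- lagged
dat
```

Sorting first and writing back through ord means the input rows do not need to be in date order, which matters here because project B's rows are listed newest first.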

How do I count the number of words in a text (string)?

这一生的挚爱 submitted on 2019-11-29 09:55:20
I have this string vector (for example):

    str <- c("this is a string current trey", "feather rtttt", "tusla", "laq")

To count the number of words in this vector I used this (as given in "Count the number of words in a string in R?", which is a possible duplicate but with another issue):

    No_words <- sapply(gregexpr("\\W+", str), length) + 1

but it returns

    6 2 2 2

The last two strings ("tusla" and "laq") contain only one word each, so it should return

    6 2 1 1

How do I get around this problem?

You can try

    sapply(gregexpr("\\S+", str), length)
    ## [1] 6 2 1 1

Or as suggested in comments you can try
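The extra count for one-word strings comes from gregexpr() returning a single -1 when nothing matches, which still has length 1. The sketch below shows that, alongside the \\S+ approach and a strsplit()-based alternative. The names no_match, n_words and n_split are introduced here, and the strsplit() variant is an illustration rather than the suggestion cut off above.

```r
str <- c("this is a string current trey", "feather rtttt", "tusla", "laq")

# With no separator to find, gregexpr() returns a single -1, so length()
# is 1 and the "+ 1" in the original code reports 2 words.
no_match <- gregexpr("\\W+", "tusla")[[1]][1]   # -1

# Counting runs of non-whitespace counts the words themselves.
n_words <- sapply(gregexpr("\\S+", str), length)

# Splitting on whitespace and counting the pieces gives the same numbers.
n_split <- lengths(strsplit(str, "\\s+"))
```

Both variants assume every string contains at least one word; an empty string would still be counted as 1 by the gregexpr() version, for the same -1 reason.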

calculate row sum and product in data.frame

﹥>﹥吖頭↗ submitted on 2019-11-29 09:36:04
I would like to append columns to my data.frame in R that contain row sums and products. Consider the following data frame:

    x y z
    1 2 3
    2 3 4
    5 1 2

I want to get the following:

    x y z sum prod
    1 2 3   6    6
    2 3 4   9   24
    5 1 2   8   10

I have tried

    sum = apply(ages,1,add)

but it gives me a row vector. Can someone please show me an efficient command to compute the sum and product and append them to the original data frame as shown above?

Try

    transform(df, sum=rowSums(df), prod=x*y*z)
    #  x y z sum prod
    #1 1 2 3   6    6
    #2 2 3 4   9   24
    #3 5 1 2   8   10

Or

    transform(df, sum=rowSums(df), prod=Reduce(`*`, df))
    #  x y z sum prod
    #1 1 2 3   6    6
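For comparison with the transform() answers above, an apply()-based sketch is shown here. The data frame is rebuilt from the question's table, and row_sum and row_prod are names chosen for this example.

```r
df <- data.frame(x = c(1, 2, 5), y = c(2, 3, 1), z = c(3, 4, 2))

# Compute both summaries from the original columns before appending either,
# so the new "sum" column is not folded into the product.
row_sum  <- rowSums(df)
row_prod <- apply(df, 1, prod)

df$sum  <- row_sum
df$prod <- row_prod
df
```

rowSums() is generally preferable to apply(df, 1, sum) for the sum; base R has no matching rowProds(), which is why the product goes through apply() or Reduce().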

Python if elif else: the difference

我怕爱的太早我们不能终老 submitted on 2019-11-29 04:13:27
    if data_ori == '医疗':  # medical
        df = pd.read_excel(path_apply + 'apply/YS_ZY_HZSQ_样例.xls', encoding='gbk', error_bad_lines=False)
        df = df[['HZMD']]
        df = df[~df['HZMD'].isnull()]
    else:  # China Daily
        df = pd.read_csv(path_apply + 'apply/原始文本.txt', header=None, encoding='gbk')
    return df

    if data_ori == '医疗':  # medical
        df = pd.read_excel(path_apply + 'apply/YS_ZY_HZSQ_样例.xls', encoding='gbk', error_bad_lines=False)
        df = df[['HZMD']]
        df = df[~df['HZMD'].isnull()]
    elif:  # China Daily
        df = pd.read_csv(path_apply + 'apply/原始文本.txt', header=None, encoding='gbk')
        return df

The difference between if and elif. Note that elif must be followed by a condition, so the bare "elif:" in the second version is a syntax error as written; a branch with no condition has to be written as else.

Source: https://www.cnblogs.com/jfdwd/p/11451878.html

Find and replace missing values with row mean

落爺英雄遲暮 submitted on 2019-11-29 04:10:11
I have a data frame with NAs and I want to replace the NAs with row means:

    c1 = c(1,2,3,NA)
    c2 = c(3,1,NA,3)
    c3 = c(2,1,3,1)
    df = data.frame(c1,c2,c3)
    > df
      c1 c2 c3
    1  1  3  2
    2  2  1  1
    3  3 NA  3
    4 NA  3  1

so that

    > df
      c1 c2 c3
    1  1  3  2
    2  2  1  1
    3  3  3  3
    4  2  3  1

Very similar to @baptiste's answer:

    > ind <- which(is.na(df), arr.ind=TRUE)
    > df[ind] <- rowMeans(df, na.rm = TRUE)[ind[,1]]

I think this works,

    df[which(is.na(df), arr.ind=TRUE)] <- rowMeans(df[!complete.cases(df), ], na.rm=TRUE)

(Note that this pairs the means with the NA cells purely by position: which() reports cells column by column, while rowMeans() runs row by row. It is therefore only correct when the two orders coincide, and for the example above they do not.)

Using apply (note the returned object is a matrix):

    t( apply( df , 1 , function(x) { x[ is.na(x) ] = mean( x , na.rm
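The two working ideas above can be put side by side in a runnable sketch. The apply() version here is one plausible way to write such a function, not a reconstruction of the truncated answer, and by_index and by_apply are names introduced for this example.

```r
df <- data.frame(c1 = c(1, 2, 3, NA), c2 = c(3, 1, NA, 3), c3 = c(2, 1, 3, 1))

# Index-based fill: look up each NA cell's own row mean via its row index,
# so the result does not depend on the order which() reports the cells in.
by_index <- df
ind <- which(is.na(by_index), arr.ind = TRUE)
by_index[ind] <- rowMeans(by_index, na.rm = TRUE)[ind[, 1]]

# apply()-based fill: patch each row, transpose back, and convert the
# resulting matrix to a data frame again.
by_apply <- as.data.frame(t(apply(df, 1, function(x) {
  x[is.na(x)] <- mean(x, na.rm = TRUE)
  x
})))
```

The index-based form keeps the data frame's column types, whereas apply() first coerces everything to a single matrix type, so it is only suitable for all-numeric data.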

Dataframe create new column based on other columns

故事扮演 submitted on 2019-11-29 03:58:59
I have a dataframe:

    df <- data.frame('a'=c(1,2,3,4,5), 'b'=c(1,20,3,4,50))
    df
      a  b
    1 1  1
    2 2 20
    3 3  3
    4 4  4
    5 5 50

and I want to create a new column based on existing columns. Something like this:

    if (df[['a']] == df[['b']]) {
      df[['c']] <- df[['a']] + df[['b']]
    } else {
      df[['c']] <- df[['b']] - df[['a']]
    }

The problem is that the if condition is checked only for the first row... If I create a function from the above if statement and then use apply() (or mapply() ...), it is the same. In Python/pandas I can use this:

    df['c'] = df[['a', 'b']].apply(lambda x: x['a'] + x['b'] if (x['a'] == x['b']) \
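In R the row-wise condition described above is usually expressed with the vectorised ifelse() rather than if, which expects a single TRUE or FALSE. A minimal sketch using the question's data:

```r
df <- data.frame(a = c(1, 2, 3, 4, 5), b = c(1, 20, 3, 4, 50))

# if () takes one condition; ifelse() evaluates the test for every row
# and picks the matching element from the "yes" or "no" vector.
df$c <- ifelse(df$a == df$b, df$a + df$b, df$b - df$a)
df
```

Because the whole column is computed in one call, no apply() or mapply() wrapper is needed.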