apply | 易学教程

Unexpected apply function behaviour in R

I've discovered a surprising behaviour of apply that I wonder if anyone can explain. Let's take a simple matrix:

    > (m = matrix(1:8,ncol=4))
         [,1] [,2] [,3] [,4]
    [1,]    1    3    5    7
    [2,]    2    4    6    8

We can flip it vertically thus:

    > apply(m, MARGIN=2, rev)
         [,1] [,2] [,3] [,4]
    [1,]    2    4    6    8
    [2,]    1    3    5    7

This applies the rev() vector reversal function iteratively to each column. But when we try to apply rev by row we get:

    > apply(m, MARGIN=1, rev)
         [,1] [,2]
    [1,]    7    8
    [2,]    5    6
    [3,]    3    4
    [4,]    1    2

... a 90 degree anti-clockwise rotation! apply delivers the same result using FUN=function(v) {v[length(v):1]} so it is
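One way to read the result above: apply() stores the value returned by each call as a column of its output, so reversing by row produces a 4 x 2 matrix whose columns are the reversed rows. The following is a minimal sketch, assuming the goal is a left-to-right flip of m; the names by_row and flipped are illustrative only.

```r
m <- matrix(1:8, ncol = 4)

# Each row of m is reversed; apply() stores every result as a COLUMN.
by_row <- apply(m, MARGIN = 1, rev)
dim(by_row)          # 4 2

# Transposing puts the reversed rows back as rows.
flipped <- t(by_row)
flipped
#      [,1] [,2] [,3] [,4]
# [1,]    7    5    3    1
# [2,]    8    6    4    2

# Plain column indexing gives the same flip without apply().
all(flipped == m[, ncol(m):1])   # TRUE
```

For MARGIN=2 no transpose is needed, because the per-column results are already laid out as columns.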

Looping over combinations of regression model terms

I'm running a regression of the form

    reg=lm(y ~ x1+x2+x3+z1,data=mydata)

In place of the last term, z1, I want to loop through a set of different variables, z1 through z10, running a regression for each with it as the last term. E.g. in the second run I want to use

    reg=lm(y ~ x1+x2+x3+z2,data=mydata)

and in the third run:

    reg=lm(y ~ x1+x2+x3+z3,data=mydata)

How can I automate this by looping through the list of z-variables?

With this dummy data:

    dat1 <- data.frame(y = rpois(100,5),
                       x1 = runif(100), x2 = runif(100), x3 = runif(100),
                       z1 = runif(100), z2 = runif(100))

You could get your list of two lm
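A possible way to build such a list of models is sketched below. It is not necessarily what the truncated answer went on to show: it assumes the dummy data above, builds each formula with reformulate(), and uses illustrative names (z_vars, models).

```r
set.seed(1)  # only to make the dummy data reproducible
dat1 <- data.frame(y = rpois(100, 5),
                   x1 = runif(100), x2 = runif(100), x3 = runif(100),
                   z1 = runif(100), z2 = runif(100))

# Names of the variables that take turns as the last term.
z_vars <- c("z1", "z2")

# Fit one model per z-variable and collect the fits in a named list.
models <- lapply(z_vars, function(z) {
  f <- reformulate(c("x1", "x2", "x3", z), response = "y")
  lm(f, data = dat1)
})
names(models) <- z_vars

formula(models$z2)   # y ~ x1 + x2 + x3 + z2
```

With ten variables, z_vars could be written as paste0("z", 1:10), provided those columns exist in the data.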

When we go for cross apply and when we go for inner join in SQL Server 2012

I have a small question about SQL Server. When do we use cross apply, and when do we use inner join? Why use cross apply at all in SQL Server? I have emp and dept tables; based on those two tables, I wrote an inner join query and a cross apply query like this:

    -- using cross apply
    SELECT *
    FROM Department D
    CROSS APPLY (SELECT * FROM Employee E WHERE E.DepartmentID = D.DepartmentID) A

    -- using inner join
    SELECT *
    FROM Department D
    INNER JOIN Employee E ON D.DepartmentID = E.DepartmentID

Both queries return the same result. So why is cross apply needed in SQL Server? Is there a performance difference?

R count times word appears in element of list

I have a list comprised of words.

    > head(splitWords2)
    [[1]]
     [1] "Some" "additional" "information" "that" "we" "would" "need" "to" "replicate" "the"
    [11] "experiment" "is" "how" "much" "vinegar" "should" "be" "placed" "in" "each"
    [21] "identical" "container" "or" "what" "tool" "use" "measure" "mass" "of" "four"
    [31] "different" "samples" "and" "distilled" "water" "rinse" "after" "taking" "them" "out"

    [[2]]
     [1] "After" "reading" "the" "expirement" "I" "realized" "that" "additional" "information" "you"
    [11] "need" "to" "replicate" "expireiment" "is" "one" "amant" "of" "vinegar" "poured"
    [21] "in"
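The excerpt stops before stating the exact goal, so the following is only a sketch of what the title suggests: counting how often one word appears in each list element. The list split_words and the word target are hypothetical stand-ins, not the asker's data.

```r
# Hypothetical stand-in for the list in the question.
split_words <- list(
  c("Some", "additional", "information", "that", "we", "need", "the", "the"),
  c("After", "reading", "the", "experiment")
)

target <- "the"

# One count per list element: compare every word with the target and sum the hits.
counts <- sapply(split_words, function(w) sum(w == target))
counts   # 2 1

# Case-insensitive variant, in case "The" should also count.
counts_ci <- sapply(split_words, function(w) sum(tolower(w) == tolower(target)))
```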

Get the (t-1) data within groups

Apologies if this has been asked before, but I couldn't find any question which answers this exactly. I have data like this:

    Project  Date       price
    A        30/3/2013  2082
    B        19/3/2013  1567
    B        22/2/2013  1642
    C        12/4/2013  1575
    C        5/6/2013   1582

I want to have a column with last-instance prices by group. For example, for row 2, the last-instance price for the same group will be 1642. The final data will look somewhat like this:

    Project  Date       price  lastPrice
    A        30/3/2013  2082   0
    B        19/3/2013  1567   1642
    B        22/2/2013  1642   0
    C        12/4/2013  1575   0
    C        5/6/2013   1582   1575

How to do this? The main issue I'm facing is that the data
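The desired lastPrice column can be reproduced in base R along the following lines. This is a sketch under two assumptions: dates are in day/month/year format, and "last instance" means the price on the previous date within the same Project, with 0 when there is none. The helper names ord and lagged are illustrative.

```r
dat <- data.frame(
  Project = c("A", "B", "B", "C", "C"),
  Date = as.Date(c("30/3/2013", "19/3/2013", "22/2/2013", "12/4/2013", "5/6/2013"),
                 format = "%d/%m/%Y"),
  price = c(2082, 1567, 1642, 1575, 1582)
)

# Sort positions by project, then chronologically within each project.
ord <- order(dat$Project, dat$Date)

# Within each project, shift prices down by one and pad the first row with 0.
lagged <- ave(dat$price[ord], dat$Project[ord],
              FUN = function(p) c(0, head(p, -1)))

# Write the lagged values back in the original row order.
dat$lastPrice <- numeric(nrow(dat))
dat$lastPrice[ord] <- lagged

dat$lastPrice   # 0 1642 0 0 1575
```

Sorting through an index vector rather than reordering dat keeps the rows in the order shown in the question.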

How do I count the number of words in a text (string)?

I have this string vector (for example):

    str <- c("this is a string current trey", "feather rtttt", "tusla", "laq")

To count the number of words in this vector I used this (as given in "Count the number of words in a string in R?", which is a possible duplicate but with another issue):

    No_words <- sapply(gregexpr("\\W+", str), length) + 1

but it returns

    6 2 2 2

The string has only 1 word in the last two places (i.e. "tusla" and "laq"), so it should return

    6 2 1 1

How do I get around this problem?

You can try

    sapply(gregexpr("\\S+", str), length)
    ## [1] 6 2 1 1

Or as suggested in comments you can try
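The extra count comes from how gregexpr() reports a failed match: it returns a single -1, which still has length 1, so a one-word string is counted as 1 + 1. The sketch below compares the two patterns on the question's vector (renamed words here purely for illustration) and adds a strsplit()-based alternative that is not taken from the thread.

```r
words <- c("this is a string current trey", "feather rtttt", "tusla", "laq")

# No separator in "tusla": gregexpr() returns -1, a vector of length 1.
no_match <- gregexpr("\\W+", "tusla")[[1]]
as.integer(no_match)   # -1

# Counting separators and adding 1 therefore over-counts single words.
by_separators <- sapply(gregexpr("\\W+", words), length) + 1   # 6 2 2 2

# Counting runs of non-whitespace counts the words themselves.
by_words <- sapply(gregexpr("\\S+", words), length)            # 6 2 1 1

# Alternative: split on whitespace and count the pieces.
by_split <- lengths(strsplit(words, "\\s+"))                   # 6 2 1 1
```

An empty string still yields a length-1 result with the gregexpr() approach, so it would need separate handling.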

calculate row sum and product in data.frame

I would like to append columns to my data.frame in R that contain row sums and products. Consider the following data frame:

    x y z
    1 2 3
    2 3 4
    5 1 2

I want to get the following:

    x y z sum prod
    1 2 3 6   6
    2 3 4 9   24
    5 1 2 8   10

I have tried

    sum = apply(ages,1,add)

but it gives me a row vector. Can someone please show me an efficient command to compute the sum and product and append them to the original data frame as shown above?

Try

    transform(df, sum=rowSums(df), prod=x*y*z)
    #  x y z sum prod
    #1 1 2 3   6    6
    #2 2 3 4   9   24
    #3 5 1 2   8   10

Or

    transform(df, sum=rowSums(df), prod=Reduce(`*`, df))
    #  x y z sum prod
    #1 1 2 3   6    6
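Since the question started from apply(), here is a sketch of the same result computed row by row with apply() and prod(). It assumes the three-column data frame from the question; the vector cols is an illustrative helper that keeps the new columns out of later calculations.

```r
df <- data.frame(x = c(1, 2, 5), y = c(2, 3, 1), z = c(3, 4, 2))

# Fix the input columns first so that "sum" is not included in the product.
cols <- c("x", "y", "z")

df$sum  <- rowSums(df[cols])
df$prod <- apply(df[cols], 1, prod)   # one product per row

df
#   x y z sum prod
# 1 1 2 3   6    6
# 2 2 3 4   9   24
# 3 5 1 2   8   10
```

apply() with MARGIN=1 returns one value per row here because prod() returns a single number for each row.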

Python: the difference between if, elif, and else

    if data_ori=='医疗':  # medical data
        df = pd.read_excel(path_apply + 'apply/YS_ZY_HZSQ_样例.xls', encoding='gbk', error_bad_lines=False)
        df=df[['HZMD']]
        df=df[~df['HZMD'].isnull()]
    else:  # China Daily
        df = pd.read_csv(path_apply + 'apply/原始文本.txt', header=None, encoding='gbk')
    return df

    if data_ori=='医疗':  # medical data
        df = pd.read_excel(path_apply + 'apply/YS_ZY_HZSQ_样例.xls', encoding='gbk', error_bad_lines=False)
        df=df[['HZMD']]
        df=df[~df['HZMD'].isnull()]
    elif:  # China Daily
        df = pd.read_csv(path_apply + 'apply/原始文本.txt', header=None, encoding='gbk')
    return df

Note that elif must be followed by a condition, so the bare "elif:" in the second snippet is a syntax error as written.

The difference between if and elif. Source: https://www.cnblogs.com/jfdwd/p/11451878.html

Find and replace missing values with row mean

I have a data frame with NAs and I want to replace the NAs with row means:

    c1 = c(1,2,3,NA)
    c2 = c(3,1,NA,3)
    c3 = c(2,1,3,1)
    df = data.frame(c1,c2,c3)
    > df
      c1 c2 c3
    1  1  3  2
    2  2  1  1
    3  3 NA  3
    4 NA  3  1

so that

    > df
      c1 c2 c3
    1  1  3  2
    2  2  1  1
    3  3  3  3
    4  2  3  1

Very similar to @baptiste's answer:

    > ind <- which(is.na(df), arr.ind=TRUE)
    > df[ind] <- rowMeans(df, na.rm = TRUE)[ind[,1]]

I think this works:

    df[which(is.na(df), arr.ind=TRUE)] <- rowMeans(df[!complete.cases(df), ], na.rm=TRUE)

Using apply (note the returned object is a matrix):

    t( apply( df , 1 , function(x) { x[ is.na(x) ] = mean( x , na.rm
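The index-based answer can be run end to end as below. The sketch uses the question's data and adds one illustrative name, row_means. The point to notice is that the means are looked up by the row number of each missing cell, which keeps every mean aligned with its own row regardless of the order in which which() lists the cells.

```r
df <- data.frame(c1 = c(1, 2, 3, NA),
                 c2 = c(3, 1, NA, 3),
                 c3 = c(2, 1, 3, 1))

# Row/column positions of the missing cells (listed column by column).
ind <- which(is.na(df), arr.ind = TRUE)

# Mean of each row, ignoring NAs.
row_means <- rowMeans(df, na.rm = TRUE)

# ind[, 1] holds the row number of each missing cell, so each NA
# receives the mean of its own row.
df[ind] <- row_means[ind[, 1]]

df
#   c1 c2 c3
# 1  1  3  2
# 2  2  1  1
# 3  3  3  3
# 4  2  3  1
```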

Dataframe create new column based on other columns

I have a dataframe:

    df <- data.frame('a'=c(1,2,3,4,5), 'b'=c(1,20,3,4,50))
    df
      a  b
    1 1  1
    2 2 20
    3 3  3
    4 4  4
    5 5 50

and I want to create a new column based on existing columns. Something like this:

    if (df[['a']] == df[['b']]) {
      df[['c']] <- df[['a']] + df[['b']]
    } else {
      df[['c']] <- df[['b']] - df[['a']]
    }

The problem is that the if condition is checked only for the first row... If I wrap the above if statement in a function and then use apply() (or mapply() ...), the result is the same. In Python/pandas I can use this:

    df['c'] = df[['a', 'b']].apply(lambda x: x['a'] + x['b'] if (x['a'] == x['b']) \
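In R, if() expects a single TRUE or FALSE, which is why only the first row was tested (recent R versions raise an error for longer conditions). A vectorised sketch of the same rule with ifelse(), using the question's data frame, is shown here; it is one common approach rather than the thread's confirmed answer.

```r
df <- data.frame(a = c(1, 2, 3, 4, 5), b = c(1, 20, 3, 4, 50))

# ifelse() evaluates the test for every row and picks a value element by element.
df$c <- ifelse(df$a == df$b, df$a + df$b, df$b - df$a)

df
#   a  b  c
# 1 1  1  2
# 2 2 20 18
# 3 3  3  6
# 4 4  4  8
# 5 5 50 45
```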