apply | 易学教程

How to build a custom function that uses externally defined values with a string condition in R

Question: I'm working on a function that performs calculations on a single numeric variable (double). It should take its components from another data frame that stores different equations, broken up into their individual pieces (I use linear regression equations here, so the two variables/columns are slope and intercept). Depending on one condition (a name/specific string), which is also stored in the equations table, the function should use the slope and intercept from the same row. The actual

apply - test multiple conditions before moving rows

Question: I want to use apply to iterate over a matrix, comparing open and high prices to a limit. I originally used a while loop, but it was slow, so I moved to apply. I have tried adding 1 to StartingRow, as below.

    Summary <- matrix(data=NA, nrow=1, ncol=1)
    Overall <- matrix(data=NA, nrow=1, ncol=2)
    Open <- matrix(data=NA, nrow=1, ncol=1)
    MSingle <- function(x, StartingRow=1, Limit=0.01, StopLoss=0.01){
      Open = x[1]
      High = x[2]
      Low = x[3]
      #If the difference between High and Open exceeds Limit the function

pandas return column name apply function for each row

Question: I am working with a pandas dataset. For a 2D dataframe, I am trying to return/append one column that contains the name of the column whose value is over 0.95.

    import pandas as pd
    import numpy as np
    Exp_day_list = ["EXP_DAY_1","EXP_DAY_2","EXP_DAY_3","EXP_DAY_4","EXP_DAY_5","EXP_DAY_6","EXP_DAY_7","EXP_DAY_8","EXP_DAY_9","EXP_DAY_10","EXP_GT_DAY_10"]
    test = raw_databased.head()
    Exp_day_percentage = test[Exp_day_list]
    def over_95_percent(x):
        for column in x:
            if x[column] > 0.95:
                return column
                break
    Exp_day
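The goal described in this question (one extra column holding the label of the column whose value exceeds 0.95) can be sketched as below. This is only an illustration under assumptions: the small frame `exp_day_percentage` and its values are hypothetical stand-ins for the question's data, and returning NaN when no column qualifies is an assumed choice, since the excerpt does not say what should happen in that case.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's Exp_day_percentage frame;
# the values are made up for illustration.
exp_day_percentage = pd.DataFrame(
    {
        "EXP_DAY_1": [0.97, 0.10, 0.20],
        "EXP_DAY_2": [0.02, 0.96, 0.30],
        "EXP_DAY_3": [0.01, 0.00, 0.50],
    }
)


def over_95_percent(row, threshold=0.95):
    """Return the label of the first column in `row` above `threshold`.

    Returns NaN when no column qualifies (an assumed convention).
    """
    hits = row[row > threshold]  # boolean mask keeps only qualifying entries
    return hits.index[0] if not hits.empty else np.nan


# axis=1 passes each row to the function as a Series indexed by column label.
result = exp_day_percentage.copy()
result["over_95"] = exp_day_percentage.apply(over_95_percent, axis=1)
print(result)
```

Filtering the row with a boolean mask avoids looping over the Series directly, which would yield values rather than column labels.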

R: apply a user-defined function on data frame columns

Question: In R, I have a function defined to calculate the intersection between two strings:

    containedin <- function(t1,t2){
      return(length(Reduce(intersect, strsplit(c(t1,t2), "\\s+"))))
    }

I want to apply this function to a data frame that contains two string columns:

    data.selected[c('keywords','title')]
      keywords              title
    1 Samsung UN48H6350 48" Samsung UN48H6350 48" Full 1080p Smart HDTV 120Hz with Wi-Fi +$50 Visa Gift Card
    2 Samsung UN48H6350 48" Samsung UN48H6350 48" Full HD Smart LED TV -Bundle- (See Below for

R data.frame: rowSums of selected columns by grouping vector

Question: I have a data frame with a sequence of numeric columns, surrounded on both sides by (irrelevant) character columns. I want to obtain a new data frame that keeps the position of the irrelevant columns and adds the numeric columns to each other according to a grouping vector (or applies some other row-wise function to the data frame, by group). Example:

    sample = data.frame(cha1 = c("A","B"),num1=1:2,num2=3:4,num3=11:12,num4=13:14,cha2=c("C","D"))
    > sample
      cha1 num1 num2 num3 num4 cha2
    1    A    1    3

Filling a pandas column based on another column

Question: I would like to fill each row of a column of my dataframe based on the entries in another column. In particular, I want to fill each row with the name corresponding to that stock's ticker, like so:

    dict1 = [{'ticker': 'AAPL','Name': 'Apple Inc.'}, {'ticker': 'MSFT','Name': 'Microsoft Corporation'}]
    df1 = pd.DataFrame(dict1)

This function provides the name for a given ticker:
So I can pull the name for, say, MSFT:

    dict1 = [{'ticker': 'AAPL','Name': 'Apple Inc.'}, {'ticker':
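A common way to fill one column from another through a lookup table is `Series.map`. The sketch below reuses `dict1` and `df1` from the question; the target frame `df2` and the `GOOG` row are hypothetical additions for illustration, and this is not necessarily the approach the original poster ended up using.

```python
import pandas as pd

# Lookup table taken from the question.
dict1 = [
    {"ticker": "AAPL", "Name": "Apple Inc."},
    {"ticker": "MSFT", "Name": "Microsoft Corporation"},
]
df1 = pd.DataFrame(dict1)

# Hypothetical frame whose "Name" column should be filled from its tickers.
df2 = pd.DataFrame({"ticker": ["MSFT", "AAPL", "MSFT", "GOOG"]})

# Build a ticker -> Name Series, then map every ticker through it.
ticker_to_name = df1.set_index("ticker")["Name"]
df2["Name"] = df2["ticker"].map(ticker_to_name)
print(df2)
```

Tickers missing from the lookup table (here `GOOG`) come back as NaN, which makes unmatched rows easy to spot.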

R - conditional incrementation

Question: This should be trivial to code, but I could not think of an elegant one-liner in R. I have a data frame as below:

    data <- data.frame( index= seq(1:20), event=rep(0,20) )
    data$event[10] <- 1
    data$event[15] <- 1

I simply want to add start and stop counter columns that increment in 10s and reset right after an event=1 is observed. So the desired output with these two additional columns would be:

      index event start stop
    1     1     0     0   10
    2     2     0    10   20
    3     3     0    20   30
    4     4     0    30   40
    5     5     0    40   50
    6     6     0    50   60
    7     7     0    60

create new column that compares across rows in pandas dataframe

Question: I am looking to create a new column in a dataframe based on the values seen in the next 2 rows. Specifically, if any values in the next 2 rows are below 4, then I want the new value in the current row to be 0 (and if all values in the next 2 rows are above 4, then I want the new value in the current row to be 1).

    >>> df = pandas.DataFrame({"A": [5,6,7,8,2]})
    >>> df
       A
    0  5
    1  6
    2  7
    3  8
    4  2
    >>> desired_result = pandas.DataFrame({"A": [5,6,7,8,2], "new": [1,1,0,0,0]})
    >>> desired_result
       A  new
    0  5    1
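The look-ahead rule in this question can be sketched with `Series.shift`, using the values from the question's `desired_result` frame. Two points are assumptions read off the desired output rather than stated in the text: rows with fewer than two following rows get 0, and a value of exactly 4 is treated as not above 4.

```python
import pandas as pd

# Values taken from the question's desired_result frame.
df = pd.DataFrame({"A": [5, 6, 7, 8, 2]})

# shift(-k) aligns the value k rows ahead with the current row;
# positions past the end of the frame become NaN.
next1 = df["A"].shift(-1)
next2 = df["A"].shift(-2)

# A comparison against NaN is False, so rows without two following
# rows end up with 0 (assumed from the desired output).
df["new"] = ((next1 > 4) & (next2 > 4)).astype(int)
print(df)
```

Because the comparison is vectorized, no row-wise `apply` is needed for this kind of fixed-window look-ahead.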

R: how to vapply across rows for xts object?

Question: I have the following xts object:

    x <- structure(c(30440.5, 30441, 30441.5, 30441.5, 30441, 30439.5, 30440.5, 30441, 30441.5, NA, NA, 30439.5, NA, NA, NA, 30441.5, 30441, NA), .indexTZ = "", class = c("xts", "zoo"), .indexCLASS = c("POSIXct", "POSIXt"), tclass = c("POSIXct", "POSIXt"), tzone = "", index = structure(c(1519866931.1185, 1519866931.1255, 1519866931.1255, 1519866931.1905, 1519866931.1905, 1519866931.1915), tzone = "", tclass = c("POSIXct", "POSIXt")), .indexFormat = "%Y-%m-%d %H:%M

Improve this code by eliminating nested for cycles

Question: The R package corrplot contains, among other things, this nifty function:

    cor.mtest <- function(mat, conf.level = 0.95){
      mat <- as.matrix(mat)
      n <- ncol(mat)
      p.mat <- lowCI.mat <- uppCI.mat <- matrix(NA, n, n)
      diag(p.mat) <- 0
      diag(lowCI.mat) <- diag(uppCI.mat) <- 1
      for(i in 1:(n-1)){
        for(j in (i+1):n){
          tmp <- cor.test(mat[,i], mat[,j], conf.level = conf.level)
          p.mat[i,j] <- p.mat[j,i] <- tmp$p.value
          lowCI.mat[i,j] <- lowCI.mat[j,i] <- tmp$conf.int[1]
          uppCI.mat[i,j] <- uppCI.mat[j,i] <- tmp