rolling-computation

apply custom function on pandas dataframe on a rolling window

回眸只為那壹抹淺笑 submitted on 2020-07-09 19:37:02
Question: Suppose you have a dataframe with 1000 closing prices. You want to apply a risk calculation function (say, VaR) named compute_var() to the last 90 closing prices, on a rolling basis. How would you do it? I presume with apply():

    def compute_var(df):
        return do_calculations_on(df[-90:])

    def compute_rolling_var(self):
        self.var = self.closing.apply(compute_var)

The problem is that .apply passes only a single day's closing price to compute_var, not a dataframe, so it raises an error. The only working solution I
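One way to hand a 90-price window to the function is Series.rolling(...).apply(...). The sketch below is only an illustration, not the asker's code: the question does not show do_calculations_on(), so the body of compute_var (a historical-simulation VaR at an assumed 5% level on simple returns) and the synthetic price series are assumptions.

```python
import numpy as np
import pandas as pd


def compute_var(window: np.ndarray, level: float = 0.05) -> float:
    """Assumed VaR definition: the `level` quantile of simple returns, as a positive loss."""
    returns = np.diff(window) / window[:-1]
    return -np.quantile(returns, level)


# Synthetic prices stand in for the 1000 closing prices in the question.
rng = np.random.default_rng(0)
closing = pd.Series(100 * np.exp(np.cumsum(rng.normal(0, 0.01, 1000))))

# raw=True passes each 90-value window to compute_var as a NumPy array.
# The first 89 results are NaN because those windows are incomplete.
rolling_var = closing.rolling(window=90).apply(compute_var, raw=True)
```

Because the window length is set in rolling(), the function no longer needs to slice df[-90:] itself.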

Pandas rolling apply using multiple columns

|▌冷眼眸甩不掉的悲伤 submitted on 2020-07-09 04:25:44
Question: I am trying to use a pandas.DataFrame.rolling.apply() rolling function on multiple columns. Python version is 3.7, pandas is 1.0.2.

    import pandas as pd

    # function to calculate
    def masscenter(x):
        print(x)  # for debug purposes
        return 0

    # simple DF creation routine
    df = pd.DataFrame(
        [['02:59:47.000282', 87.60, 739],
         ['03:00:01.042391', 87.51, 10],
         ['03:00:01.630182', 87.51, 10],
         ['03:00:01.635150', 88.00, 792],
         ['03:00:01.914104', 88.00, 10]],
        columns=['stamp', 'price', 'nQty'])
    df['stamp'] = pd
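rolling.apply() calls the function once per column, so the function never sees two columns together. A possible workaround, sketched here under assumptions, is to roll over one column with raw=False and use the window's index labels to look up the full rows. The weighted-average body of masscenter and the fixed two-row window are hypothetical; the excerpt only shows a placeholder that returns 0 and does not show which window the asker used.

```python
import pandas as pd

df = pd.DataFrame(
    [['02:59:47.000282', 87.60, 739],
     ['03:00:01.042391', 87.51, 10],
     ['03:00:01.630182', 87.51, 10],
     ['03:00:01.635150', 88.00, 792],
     ['03:00:01.914104', 88.00, 10]],
    columns=['stamp', 'price', 'nQty'])


def masscenter(window: pd.Series) -> float:
    # Assumed definition: quantity-weighted average price over the window.
    rows = df.loc[window.index]  # recover every column for the rows in this window
    return (rows['price'] * rows['nQty']).sum() / rows['nQty'].sum()


# raw=False passes each window as a Series that keeps the original index labels.
df['masscenter'] = df['price'].rolling(window=2).apply(masscenter, raw=False)
```

This relies on the index labels being unique, which holds for the default integer index used here.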

Filtering out outliers in Pandas dataframe with rolling median

风格不统一 submitted on 2020-07-06 11:57:38
Question: I am trying to filter out some outliers from a scatter plot of GPS elevation displacements against dates. I'm trying to use df.rolling to compute a median and standard deviation for each window and then remove the point if it is more than 3 standard deviations away. However, I can't figure out a way to loop through the column and compare each value with the rolling median. Here is the code I have so far:

    import pandas as pd
    import numpy as np

    def median_filter(df, window):
        cnt = 0
        median = df[
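The comparison can be done without an explicit loop, because rolling statistics are aligned to the original index. The following is a minimal sketch and not the asker's unfinished function: it assumes the rule "drop a point when it lies more than 3 rolling standard deviations from the rolling median", uses a centered window, and runs on synthetic data in place of the GPS elevations.

```python
import numpy as np
import pandas as pd


def median_filter(series: pd.Series, window: int, n_sigmas: float = 3.0) -> pd.Series:
    """Drop points farther than n_sigmas rolling std devs from the rolling median."""
    rolling = series.rolling(window, center=True, min_periods=1)
    deviation = (series - rolling.median()).abs()
    # A NaN std (single-point window) compares as False, so such points are kept.
    is_outlier = deviation > n_sigmas * rolling.std()
    return series[~is_outlier]


# Synthetic stand-in for the elevation data: a smooth signal with one spike.
elevation = pd.Series(np.sin(np.linspace(0, 4 * np.pi, 200)))
elevation.iloc[100] += 50.0
filtered = median_filter(elevation, window=21)
```

Note that a spike inflates the standard deviation of its own window. For a lone spike in an otherwise flat window of n points the standard deviation is the spike height divided by sqrt(n), so a 3-sigma rule only catches it when the window holds more than 9 points.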

Rolling window function for irregular time series that can handle duplicates

耗尽温柔 submitted on 2020-06-27 14:11:10
Question: I have the following data.frame:

        grp  nr   yr
     1:   A 1.0 2009
     2:   A 2.0 2009
     3:   A 1.5 2009
     4:   A 1.0 2010
     5:   B 3.0 2009
     6:   B 2.0 2010
     7:   B  NA 2011
     8:   C 3.0 2014
     9:   C 3.0 2019
    10:   C 3.0 2020
    11:   C 4.0 2021

Desired output:

       grp  nr   yr nr_roll_period_3
    1    A 1.0 2009               NA
    2    A 2.0 2009               NA
    3    A 1.5 2009               NA
    4    A 1.0 2010               NA
    5    B 3.0 2009               NA
    6    B 2.0 2010               NA
    7    B  NA 2011               NA
    8    C 3.0 2014               NA
    9    C 3.0 2019               NA
    10   C 3.0 2020               NA
    11   C 4.0 2021         3.333333

The logic: I want to calculate a rolling mean for the period of length k
