Curve fitting for each column in Pandas + extrapolate values

那年仲夏 posted on 2019-12-06 05:19:30
Vishnu Kunchur

Some changes have been made after the conversation with the principal author of this answer and with his approval.

First of all, since the fit is performed on log-transformed quantities, we need to find, for each column, the range of rows over which the values are non-negative.

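import numpy as np
import pandas as pd
from scipy import stats

# df_drop_depth is assumed to be df1 without its 'depth' column.
# Collect the position of the first negative value of each column, if any.
negative_idx = [idx for c in df_drop_depth.columns
                for idx in np.flatnonzero(df_drop_depth[c].to_numpy() < 0)[:1]]

if len(negative_idx) > 0:
    max_idx = np.min(negative_idx)
else:
    max_idx = None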
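
As a quick check of this step, here is a minimal sketch on a made-up table. The column names `Iz_450` and `Iz_500`, the values, and the helper `first_negative_row` are all hypothetical and only illustrate how the cut-off row is chosen.

```python
import numpy as np
import pandas as pd

# Hypothetical irradiance-like columns; the values are invented for illustration.
toy = pd.DataFrame({
    "Iz_450": [5.0, 3.0, 1.0, -0.2, -0.5],
    "Iz_500": [4.0, 2.0, -0.1, 0.3, -0.4],
})

def first_negative_row(frame):
    """Smallest row position at which any column is negative, or None."""
    firsts = []
    for col in frame.columns:
        hits = np.flatnonzero(frame[col].to_numpy() < 0)
        if hits.size:
            firsts.append(int(hits[0]))
    return min(firsts) if firsts else None

max_idx_toy = first_negative_row(toy)
safe = toy.iloc[0:max_idx_toy]   # rows that can be log-transformed in every column
print(max_idx_toy)
print(safe)
```

When no column contains a negative value the helper returns None, and slicing with `[0:None]` keeps every row, which matches the `else` branch above.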

Compared to the original, I only merged the loops so that the slope and the intercept are obtained in a single pass.

iz_cols = df1.columns.difference(['depth'])
slp_int = {}
for c in iz_cols:
    # Linear fit of ln(I) against depth, restricted to the rows before max_idx
    slope, intercept, r_value, p_value, std_err = stats.linregress(
        df1['depth'][0:max_idx], np.log(df1[c][0:max_idx]))
    slp_int[c] = [intercept, slope]

slp_int = pd.DataFrame(slp_int, index = ['intercept', 'slope'])

Exponentiating the intercept gives us the value of I at the surface:

slp_int.loc['intercept'] = np.exp(slp_int.loc['intercept'])
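
To see why the exponentiated intercept is the surface value, the sketch below fits a synthetic profile that follows I(z) = I0 * exp(-k z) exactly. The column name `Iz_demo` and the numbers `true_I0` and `true_k` are assumptions chosen for illustration, not values from the question.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Synthetic profile measured only between 5 and 14 depth units.
z = np.arange(5.0, 15.0, 1.0)
true_I0, true_k = 100.0, 0.2
profile = pd.DataFrame({"depth": z, "Iz_demo": true_I0 * np.exp(-true_k * z)})

# ln(I) = ln(I0) - k * z, so a straight-line fit recovers both parameters.
fit = stats.linregress(profile["depth"], np.log(profile["Iz_demo"]))
surface_value = np.exp(fit.intercept)   # estimate of I at z = 0
decay_rate = fit.slope                  # negative when I decreases with depth

print(surface_value, decay_rate)
```

Because the data are noise-free, the fit should return the chosen parameters up to floating-point error; with real measurements the estimates carry the uncertainty reported in `fit.stderr`.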

The last part of the post has been corrected because the final concept had been misunderstood. The dataframe is now rebuilt with new values for the surface depths (above the depth range of df1), keeping df1 for the values below.

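First, the missing depths between z = 0 and the shallowest measured depth are generated with a fixed step, keeping the value at z = 0, and joined to the measured depths so that the whole range runs from z = 0 to the maximum value of the depth column: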

depth = np.asarray(df1.depth)
depth_min = np.min(depth)
depth_min_arr = np.array([depth_min])
step = 0.5
missing_vals_aux = np.arange(depth_min - step, 0, -step)[::-1]
missing_vals = np.concatenate(([0.], missing_vals_aux), axis=0)
depth_tot = np.concatenate((missing_vals, depth), axis=0)

df_boundary = pd.DataFrame(columns = iz_cols) 
df_up = pd.DataFrame(columns = iz_cols) 
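
The grid construction above can be checked in isolation. This is a small sketch with hypothetical measured depths; the helper name `surface_depths` is introduced here for illustration only.

```python
import numpy as np

def surface_depths(depth_min, step):
    """Depths from 0 up to, but not including, depth_min, spaced by step."""
    # Count down from just above depth_min, stop before 0, then reverse.
    above = np.arange(depth_min - step, 0, -step)[::-1]
    return np.concatenate(([0.0], above))

measured = np.array([2.0, 2.5, 3.0])    # hypothetical measured depths
grid = surface_depths(measured.min(), 0.5)
full = np.concatenate((grid, measured))

print(grid)
print(full)
```

Note that the spacing is anchored at the shallowest measured depth, so when that depth is not a multiple of the step the first interval after z = 0 is shorter than the others.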

Create a dataframe holding the depth levels of the upward extrapolation:

for c in iz_cols: 
    df_up[c]       = missing_vals

Fill the data with the regression-obtained parameters:

upper_df = slp_int.loc['intercept']*np.exp(slp_int.loc['slope']*df_up)
upper_df['depth'] = missing_vals
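
The expression above relies on pandas aligning a Series with the columns of a DataFrame. The following sketch shows that behaviour with hypothetical fitted parameters; the column names and numbers are invented.

```python
import numpy as np
import pandas as pd

# Hypothetical parameters: surface value (already exponentiated) and slope per column.
params = pd.DataFrame(
    {"Iz_450": [100.0, -0.2], "Iz_500": [80.0, -0.1]},
    index=["intercept", "slope"],
)
z_up = np.array([0.0, 0.5, 1.0])

# One copy of the depths per fitted column, mirroring df_up.
depths = pd.DataFrame({c: z_up for c in params.columns})

# Series * DataFrame matches the Series index against the DataFrame columns,
# so every column is evaluated with its own surface value and slope.
extrap = params.loc["intercept"] * np.exp(params.loc["slope"] * depths)
extrap["depth"] = z_up

print(extrap)
```

The first row corresponds to z = 0, where the exponential equals 1 and each column simply reproduces its surface value.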

Merge df1 and upper_df to obtain the whole profile:

lower_df = df1
lower_df['depth'] = depth

# Stack the extrapolated surface rows on top of the measured profile
df_profile_tot = pd.concat([upper_df, lower_df], ignore_index=True)