Appending data to a dataframe but changing rows after certain # of columns

天大地大妈咪最大 submitted on 2021-01-28 18:49:02

Question


Here is the code I've written. It increments three variables (i, k, and j), which are used as loc row and column labels when computing p-values:

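import pandas as pd
from scipy.stats import ttest_ind_from_stats

i = 0  # first row of the current three-row block (rows i, i + 1, i + 2)
k = 2  # column supplying the first sample's statistics
j = 2  # column supplying the second sample's statistics

result = []
df = pd.DataFrame()

while j < data.shape[1]:
    # ttest_ind_from_stats returns (statistic, p-value); data_stat holds the p-value
    tstat, data_stat = ttest_ind_from_stats(data.loc[i][k], data.loc[i + 1][k], data.loc[i + 2][k], data.loc[i][j],
                                        data.loc[i + 1][j], data.loc[i + 2][j])
    result.append([data_stat])
    j += 1
    if j == 8:  # past the last column: move to the next three-row block
        j = 2
        i = i + 3
    if i == data.shape[0]:  # past the last row: move to the next k column
        k = k + 1
        i = 0
        if k > 7:
            break

data_result = pd.DataFrame(result)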

Where data.shape[0] = 150 and data.shape[1] = 8.

This code produces the correct p-values, but as a dataframe of 1800 rows x 1 column. I would like to split the result so that the code produces six different dataframes, each with data.shape[1] - 2 columns (so 6 columns). Some examples:

1) The data_result dataframe from my current code:

1
0.658
0.1067
0.777
0.459
0.3307
1
0.622
0.4178
0.3158
0.7674
0.7426

2) What I want:

col1    col2   col3    col4    col5    col6
1       0.658  0.1067  0.777   0.459   0.3307
1       0.622  0.4178  0.3158  0.7674  0.7426

There should be six of the above dataframes from the code.

3) I would then preferably add a column to the left of each dataframe, which would hold the placeholder values for each row (screenshot omitted). This step is optional.

So basically, I am splitting the resulting dataframe every 6 rows, transposing each group from a single column into six columns, then repeating for the next six values, and so on. I thought about building a Series or a new df until j = 8 and then appending it to the overall df as a row, but wasn't sure whether that would work or even be possible. Thanks!

edit)

So basically, I want to create six separate dataframes, each with a shape of 50 rows x 6 columns. My current dataframe has 1800 rows x 1 column.


Answer 1:


For point 2, you can try it with numpy:

import numpy as np
import pandas as pd

result_array = np.asarray(result)
# reshape returns a new array, so assign the result;
# 1800 values give 300 rows of 6 columns
result_array = result_array.reshape(300, 6)
# if the number of rows is not known in advance, let numpy infer it:
# result_array = result_array.reshape(-1, 6)

data_result = pd.DataFrame(result_array)

For point 3, I'm not sure I understand it, but once you have the dataframe you can do anything pandas allows...
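
One possible reading of point 3 is that a label column should sit at position 0 of each reshaped dataframe. A minimal sketch under that assumption, using DataFrame.insert; the synthetic values, the "label" column name, and the row labels are all hypothetical stand-ins, not taken from the question's data:

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the collected p-values (assumption: 12 values, 6 per row).
values = np.arange(12, dtype=float) / 100

# Reshape to 6 columns, as in the answer above.
columns = [f"col{n}" for n in range(1, 7)]
frame = pd.DataFrame(values.reshape(-1, 6), columns=columns)

# Hypothetical placeholder labels, one per row; insert() puts them at position 0
# and modifies the dataframe in place.
labels = [f"row{n}" for n in range(len(frame))]
frame.insert(0, "label", labels)

print(frame)
```

insert modifies the dataframe in place, so there is nothing to assign back.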




Answer 2:


This would get you the df you need (credit should go to Renaud)

# df here is the single-column dataframe of p-values (data_result in the question)
a = np.array(df)
b = a.reshape(df.shape[0] // 6, 6)
df_new = pd.DataFrame(b)
df_new.columns = ['col1', 'col2', 'col3', 'col4', 'col5', 'col6']
df_new

Output

   col1     col2    col3        col4    col5    col6
0   1.0     0.658   0.106743    0.7770  0.4590  0.3307
1   1.0     0.622   0.417800    0.3158  0.7674  0.7426
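
The reshape above gives one 300 x 6 dataframe, while the question's edit asks for six dataframes of 50 x 6. Here is a rough sketch of one way to get there, assuming the values arrive in the loop's order (k changes slowest, j fastest), so that each consecutive run of 300 values belongs to one k. The helper name split_into_frames and the synthetic input are hypothetical:

```python
import numpy as np
import pandas as pd

def split_into_frames(values, n_frames=6, n_cols=6):
    """Split a flat run of values into n_frames dataframes with n_cols columns each."""
    # ravel() also flattens a list of one-element lists such as `result`.
    flat = np.asarray(values, dtype=float).ravel()
    # Shape (n_frames, rows_per_frame, n_cols); -1 lets numpy infer the row count.
    blocks = flat.reshape(n_frames, -1, n_cols)
    columns = [f"col{n}" for n in range(1, n_cols + 1)]
    return [pd.DataFrame(block, columns=columns) for block in blocks]

# Synthetic stand-in for the 1800 p-values (assumption, not real data).
demo_values = [[v] for v in np.arange(1800) / 1800]
frames = split_into_frames(demo_values)

print(len(frames), frames[0].shape)
```

If the values were collected in a different order, the reshape would need a different axis layout, so it is worth checking the first row of each frame against the original column.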


Source: https://stackoverflow.com/questions/59817359/appending-data-to-a-dataframe-but-changing-rows-after-certain-of-columns
