I have some data I'm trying to organize into a DataFrame in Pandas. I was trying to make each row a Series and append it to the DataFrame. I found a way to do it by appending the Series to an empty list and then converting the list of Series to a DataFrame
e.g. DF = DataFrame([series1,series2],columns=series1.index)
This list to DataFrame step seems to be excessive. I've checked out a few examples on here but none of the Series preserved the Index labels from the Series to use them as column labels.
My long way where columns are id_names and rows are type_names:
Is it possible to append Series to rows of DataFrame without making a list first?
#!/usr/bin/python
DF = DataFrame()
for sample,data in D_sample_data.items():
SR_row = pd.Series(data.D_key_value)
DF.append(SR_row)
DF.head()
TypeError: Can only append a Series if ignore_index=True or if the Series has a name
Then I tried
DF = DataFrame()
for sample,data in D_sample_data.items():
SR_row = pd.Series(data.D_key_value,name=sample)
DF.append(SR_row)
DF.head()
Empty DataFrame
Tried Insert a row to pandas dataframe Still getting an empty dataframe :/
I am trying to get the Series to be the rows, where the index of the Series becomes the column labels of the DataFrame
Maybe an easier way would be to add the pandas.Series into the pandas.DataFrame with ignore_index=True argument to DataFrame.append(). Example -
DF = DataFrame()
for sample,data in D_sample_data.items():
SR_row = pd.Series(data.D_key_value)
DF = DF.append(SR_row,ignore_index=True)
Demo -
In [1]: import pandas as pd
In [2]: df = pd.DataFrame([[1,2],[3,4]],columns=['A','B'])
In [3]: df
Out[3]:
A B
0 1 2
1 3 4
In [5]: s = pd.Series([5,6],index=['A','B'])
In [6]: s
Out[6]:
A 5
B 6
dtype: int64
In [36]: df.append(s,ignore_index=True)
Out[36]:
A B
0 1 2
1 3 4
2 5 6
Another issue in your code is that DataFrame.append() is not in-place, it returns the appended dataframe, you would need to assign it back to your original dataframe for it to work. Example -
DF = DF.append(SR_row,ignore_index=True)
To preserve the labels, you can use your solution to include name for the series along with assigning the appended DataFrame back to DF. Example -
DF = DataFrame()
for sample,data in D_sample_data.items():
SR_row = pd.Series(data.D_key_value,name=sample)
DF = DF.append(SR_row)
DF.head()
DataFrame.append does not modify the DataFrame in place. You need to do df = df.append(...) if you want to reassign it back to the original variable.
Something like this could work...
mydf.loc['newindex'] = myseries
Here is an example where I used it...
stats = df[['bp_prob', 'ICD9_prob', 'meds_prob', 'regex_prob']].describe()
stats
Out[32]:
bp_prob ICD9_prob meds_prob regex_prob
count 171.000000 171.000000 171.000000 171.000000
mean 0.179946 0.059071 0.067020 0.126812
std 0.271546 0.142681 0.152560 0.207014
min 0.000000 0.000000 0.000000 0.000000
25% 0.000000 0.000000 0.000000 0.000000
50% 0.000000 0.000000 0.000000 0.013116
75% 0.309019 0.065248 0.066667 0.192954
max 1.000000 1.000000 1.000000 1.000000
medians = df[['bp_prob', 'ICD9_prob', 'meds_prob', 'regex_prob']].median()
stats.loc['median'] = medians
stats
Out[36]:
bp_prob ICD9_prob meds_prob regex_prob
count 171.000000 171.000000 171.000000 171.000000
mean 0.179946 0.059071 0.067020 0.126812
std 0.271546 0.142681 0.152560 0.207014
min 0.000000 0.000000 0.000000 0.000000
25% 0.000000 0.000000 0.000000 0.000000
50% 0.000000 0.000000 0.000000 0.013116
75% 0.309019 0.065248 0.066667 0.192954
max 1.000000 1.000000 1.000000 1.000000
median 0.000000 0.000000 0.000000 0.013116
来源:https://stackoverflow.com/questions/33094056/is-it-possible-to-append-series-to-rows-of-dataframe-without-making-a-list-first

