Question
I have two Series s1 and s2 with the same (non-consecutive) indices. How do I combine s1 and s2 into two columns of a DataFrame and keep one of the indices as a third column?
Answer 1:
I think concat is a nice way to do this. If the Series have name attributes, concat uses them as the column names (otherwise it simply numbers the columns):
In [1]: s1 = pd.Series([1, 2], index=['A', 'B'], name='s1')
In [2]: s2 = pd.Series([3, 4], index=['A', 'B'], name='s2')
In [3]: pd.concat([s1, s2], axis=1)
Out[3]:
   s1  s2
A   1   3
B   2   4
In [4]: pd.concat([s1, s2], axis=1).reset_index()
Out[4]:
  index  s1  s2
0     A   1   3
1     B   2   4
Note: This extends to more than 2 Series.
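As a rough sketch of the note above, the same pattern with three Series on a non-consecutive integer index might look like this. The names s3, shared_index and the column label "idx" are made up for illustration. The keys argument of concat is used here to label the columns, which also covers a Series that has no name.

```python
import pandas as pd

shared_index = [7, 2, 8]  # non-consecutive, as in the question
s1 = pd.Series([1, 2, 3], index=shared_index, name="s1")
s2 = pd.Series([4, 5, 6], index=shared_index, name="s2")
s3 = pd.Series([7, 8, 9], index=shared_index)  # deliberately unnamed

# keys= supplies the column labels, so the unnamed Series gets one too
df = pd.concat([s1, s2, s3], axis=1, keys=["s1", "s2", "s3"])

# Name the index first so that reset_index() yields a column called "idx"
# instead of the default "index"
out = df.rename_axis("idx").reset_index()
print(out)
```

Naming the index before calling reset_index is optional; without it the new column is simply called "index", as in the output above.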
Answer 2:
Pandas automatically aligns the Series that are passed in and creates the joint index (the indices happen to be identical here). reset_index then moves the index into a column.
In [1]: from numpy.random import randn
In [2]: from pandas import Series, DataFrame
In [3]: s1 = Series(randn(5), index=[1, 2, 4, 5, 6])
In [4]: s2 = Series(randn(5), index=[1, 2, 4, 5, 6])
In [5]: DataFrame(dict(s1=s1, s2=s2)).reset_index()
Out[5]:
   index        s1        s2
0      1 -0.176143  0.128635
1      2 -1.286470  0.908497
2      4 -0.995881  0.528050
3      5  0.402241  0.458870
4      6  0.380457  0.072251
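To make the alignment behaviour concrete, here is a small sketch in which the two indices only partly overlap (the values are invented for illustration). The DataFrame gets the union of both indices, and positions missing from one Series are filled with NaN.

```python
import pandas as pd

s1 = pd.Series([1.0, 2.0, 3.0], index=[1, 2, 4])
s2 = pd.Series([10.0, 20.0, 30.0], index=[2, 4, 6])

# The dict constructor aligns on the index: rows 1 and 6 exist in only
# one of the Series, so the other column holds NaN there
df = pd.DataFrame({"s1": s1, "s2": s2})

# reset_index() turns the joint index into an ordinary column
out = df.reset_index()
print(out)
```

When both Series share exactly the same index, as in the question, no NaN values are introduced.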
Answer 3:
Why not just use .to_frame(), given that both Series have the same index?
For pandas >= v0.23:
a.to_frame().join(b)
For pandas < v0.23:
a.to_frame().join(b.to_frame())
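A minimal sketch of the join approach, assuming a recent pandas version and two named Series (the data and the third Series c are invented for illustration). Note that join expects a Series it receives to have a name; an unnamed one can be given a column label through to_frame first.

```python
import pandas as pd

a = pd.Series([1, 2, 3], index=[7, 2, 8], name="a")
b = pd.Series([4, 5, 6], index=[7, 2, 8], name="b")

# The Series names become the column names
df = a.to_frame().join(b)

# Hypothetical unnamed Series: to_frame("c") supplies the column label
c = pd.Series([7, 8, 9], index=[7, 2, 8])
df2 = df.join(c.to_frame("c"))
print(df2.reset_index())
```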
Answer 4:
Example code:
a = pd.Series([1,2,3,4], index=[7,2,8,9])
b = pd.Series([5,6,7,8], index=[7,2,8,9])
data = pd.DataFrame({'a': a, 'b': b, 'idx_col': a.index})
Pandas allows you to create a DataFrame from a dict with Series as the values and the column names as the keys. When it finds a Series as a value, it uses the Series index as part of the DataFrame index. This data alignment is one of the main perks of Pandas. Consequently, unless you have other needs, the freshly created DataFrame holds the same values twice: in the above example, data['idx_col'] has the same data as data.index.
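The duplication can be checked directly. The sketch below rebuilds the example and compares the extra column with the index. It then shows one possible alternative, which is an assumption about what the asker wants rather than part of the answer: name the index and move it into a column with reset_index, so the values are stored only once.

```python
import pandas as pd

a = pd.Series([1, 2, 3, 4], index=[7, 2, 8, 9])
b = pd.Series([5, 6, 7, 8], index=[7, 2, 8, 9])

data = pd.DataFrame({"a": a, "b": b, "idx_col": a.index})

# The extra column repeats what the index already holds
same = data["idx_col"].tolist() == data.index.tolist()
print(same)

# Alternative sketch: keep the values only once, as a regular column
alt = pd.DataFrame({"a": a, "b": b}).rename_axis("idx_col").reset_index()
print(alt)
```

Passing a.index as a plain column relies on both Series sharing exactly the same index; the column is filled by position, not aligned by label.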
Answer 5:
The key to converting Series to a DataFrame is to understand that:
1. Conceptually, every column in a DataFrame is a Series.
2. Every column name is a key that maps to a Series.
With these two concepts in mind, you can think of many ways to convert Series to a DataFrame. One easy solution looks like this:
Create two Series:
import pandas as pd
series_1 = pd.Series(list(range(10)))
series_2 = pd.Series(list(range(20, 30)))
Create an empty DataFrame with just the desired column names:
df = pd.DataFrame(columns=['Column_name#1', 'Column_name#2'])
Put the Series values into the DataFrame using the mapping concept:
df['Column_name#1'] = series_1
df['Column_name#2'] = series_2
Check the results:
df.head(5)
Answer 6:
I'm not sure I fully understand your question, but is this what you want to do?
pd.DataFrame(data=dict(s1=s1, s2=s2), index=s1.index)
(index=s1.index is not even necessary here.)
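A quick sketch to confirm that remark, using two small Series invented for illustration: with identical indices, building the DataFrame with and without the explicit index argument should give the same result.

```python
import pandas as pd

s1 = pd.Series([1, 2], index=["A", "B"])
s2 = pd.Series([3, 4], index=["A", "B"])

with_index = pd.DataFrame(data=dict(s1=s1, s2=s2), index=s1.index)
without_index = pd.DataFrame(dict(s1=s1, s2=s2))

# Both frames carry the same index, columns and values
print(with_index.equals(without_index))
```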
Answer 7:
A simplification of the solution based on join():
df = a.to_frame().join(b)
Source: https://stackoverflow.com/questions/18062135/combining-two-series-into-a-dataframe-in-pandas