Convert pandas data frame to series

这一生的挚爱 提交于 2019-11-27 17:56:58

It's not smart enough to realize it's still a "vector" in math terms.

Say rather that it's smart enough to recognize a difference in dimensionality. :-)

I think the simplest thing you can do is select that row positionally using iloc, which gives you a Series with the columns as the new index and the values as the values:

>>> df = pd.DataFrame([list(range(5))], columns=["a{}".format(i) for i in range(5)])
>>> df
   a0  a1  a2  a3  a4
0   0   1   2   3   4
>>> df.iloc[0]
a0    0
a1    1
a2    2
a3    3
a4    4
Name: 0, dtype: int64
>>> type(_)
<class 'pandas.core.series.Series'>

You can transpose the single-row dataframe (which still results in a dataframe) and then squeeze the results into a series (the inverse of to_frame).

df = pd.DataFrame([list(range(5))], columns=["a{}".format(i) for i in range(5)])

>>> df.T.squeeze()  # Or more simply, df.squeeze() for a single row dataframe.
a0    0
a1    1
a2    2
a3    3
a4    4
Name: 0, dtype: int64

Note: To accommodate the point raised by @IanS (even though it is not in the OP's question), test for the dataframe's size. I am assuming that df is a dataframe, but the edge cases are an empty dataframe, a dataframe of shape (1, 1), and a dataframe with more than one row in which case the use should implement their desired functionality.

if df.empty:
    # Empty dataframe, so convert to empty Series.
    result = pd.Series()
elif df.shape == (1, 1)
    # DataFrame with one value, so convert to series with appropriate index.
    result = pd.Series(df.iat[0, 0], index=df.columns)
elif len(df) == 1:
    # Convert to series per OP's question.
    result = df.T.squeeze()
else:
    # Dataframe with multiple rows.  Implement desired behavior.
    pass

This can also be simplified along the lines of the answer provided by @themachinist.

if len(df) > 1:
    # Dataframe with multiple rows.  Implement desired behavior.
    pass
else:
    result = pd.Series() if df.empty else df.iloc[0, :]

You can retrieve the series through slicing your dataframe using one of these two methods:

http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.iloc.html http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.loc.html

import pandas as pd
import numpy as np
df = pd.DataFrame(data=np.random.randn(1,8))

series1=df.iloc[0,:]
type(series1)
pandas.core.series.Series

Another way -

Suppose myResult is the dataFrame that contains your data in the form of 1 col and 23 rows

// label your columns by passing a list of names
myResult.columns = ['firstCol']

// fetch the column in this way, which will return you a series
myResult = myResult['firstCol']

print(type(myResult))

In similar fashion, you can get series from Dataframe with multiple columns.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!