Returning a dataframe in python function

Submitted by 主宰稳场 on 2020-08-01 04:41:23

Question


I am trying to create and return a data frame from a Python function:

import pandas as pd

def create_df():
    data = {'state': ['Ohio','Ohio','Ohio','Nevada','Nevada'],
            'year': [2000,2001,2002,2001,2002],
            'pop': [1.5,1.7,3.6,2.4,2.9]}
    df = pd.DataFrame(data)
    return df

create_df()
df

I get an error that df is not defined. If I replace 'return' with 'print', the data frame prints correctly. Is there a way to do this? Thanks.


Answer 1:


When you call create_df(), Python runs the function but doesn't save the result in any variable. The df inside the function is a local name that no longer exists once the function returns, which is why you got the error.

Assign the result of create_df() to df, like this: df = create_df()
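
For illustration, here is a minimal sketch of that fix, reusing create_df from the question. The only change is that the return value is bound to a name at the call site; the print of the shape is just a quick check and not part of the original answer.

```python
import pandas as pd

def create_df():
    data = {'state': ['Ohio', 'Ohio', 'Ohio', 'Nevada', 'Nevada'],
            'year': [2000, 2001, 2002, 2001, 2002],
            'pop': [1.5, 1.7, 3.6, 2.4, 2.9]}
    # df is local to the function and is gone once the function returns
    df = pd.DataFrame(data)
    return df

# Bind the returned DataFrame to a name in the calling scope
df = create_df()
print(df.shape)
```

The name on the left-hand side does not have to be df; any variable name works, since it is unrelated to the local name inside the function.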




Answer 2:


I'm kind of late here, but what about creating a global variable within the function? It should save a step for you.

import pandas as pd

def create_df():
    # Declare df as a module-level name so the assignment below is visible outside
    global df

    data = {
        'state': ['Ohio','Ohio','Ohio','Nevada','Nevada'],
        'year': [2000,2001,2002,2001,2002],
        'pop': [1.5,1.7,3.6,2.4,2.9]
    }

    df = pd.DataFrame(data)

Then when you run create_df(), you'll be able to just use df.

Of course, be careful in your naming strategy if you have a large program so that the value of df doesn't change as various functions execute.

EDIT: I noticed I got some points for this. Here's another (probably worse) way to do this using exec. This also allows for multiple dataframes to be created, if desired.

import pandas as pd

def create_df():
    data = {'state': ['Ohio','Ohio','Ohio','Nevada','Nevada'],
           'year': [2000,2001,2002,2001,2002],
           'pop': [1.5,1.7,3.6,2.4,2.9]}
    df = pd.DataFrame(data)
    return df

### We'll create three dataframes for an example
for i in range(3):
    exec(f'df_{i} = create_df()')

Then, you can test them out:

Input: df_0

Output:

    state  year  pop
0    Ohio  2000  1.5
1    Ohio  2001  1.7
2    Ohio  2002  3.6
3  Nevada  2001  2.4
4  Nevada  2002  2.9

Input: df_1

Output:

    state  year  pop
0    Ohio  2000  1.5
1    Ohio  2001  1.7
2    Ohio  2002  3.6
3  Nevada  2001  2.4
4  Nevada  2002  2.9

Etc.
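
As a possible alternative to exec (a sketch that is not part of the original answer), the frames could be kept in a dictionary keyed by name. The variable name frames is an assumption chosen for this example.

```python
import pandas as pd

def create_df():
    data = {'state': ['Ohio', 'Ohio', 'Ohio', 'Nevada', 'Nevada'],
            'year': [2000, 2001, 2002, 2001, 2002],
            'pop': [1.5, 1.7, 3.6, 2.4, 2.9]}
    return pd.DataFrame(data)

# The dictionary keys play the role of the generated variable names
frames = {f'df_{i}': create_df() for i in range(3)}

print(frames['df_0'])
```

This keeps every frame reachable by name without generating variables at runtime, which tends to be easier to loop over and to debug.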




Answer 3:


A function can explicitly return two DataFrames:

import pandas as pd
import numpy as np

def return_2DF():
    date = pd.date_range('today', periods=20)
    # Two random columns, so exactly two column labels are needed
    DF1 = pd.DataFrame(np.random.rand(20, 2), index=date, columns=list('xy'))
    DF2 = pd.DataFrame(np.random.rand(20, 4), index=date, columns='A B C D'.split())
    # Both frames are returned together as a tuple
    return DF1, DF2

Calling the function and unpacking the two data frames:

one, two = return_2DF()
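
To show what happens under the hood, here is a small sketch with fixed data instead of random values. The function name make_two_frames and its contents are hypothetical, chosen only for this example. Returning two values packs them into a single tuple, and the assignment unpacks it.

```python
import pandas as pd

def make_two_frames():  # hypothetical name, for illustration only
    left = pd.DataFrame({'x': [1, 2], 'y': [3, 4]})
    right = pd.DataFrame({'A': [5, 6]})
    # "return left, right" builds one tuple holding both frames
    return left, right

result = make_two_frames()
print(type(result).__name__, len(result))

# Tuple unpacking binds each frame to its own name
one, two = result
```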



Answer 4:


You can return a dataframe from a function by making a copy of the dataframe that was passed in, like this:

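import pandas as pd

def my_function(dataframe):
    my_df = dataframe.copy()   # work on a copy of the caller's frame
    my_df = my_df.drop(0)      # drop the row labelled 0
    return my_df

# old_df is an example input (an assumption); any DataFrame with a row labelled 0 works
old_df = pd.DataFrame({'a': [1, 2, 3]})
new_df = my_function(old_df)
print(type(new_df))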

Output: <class 'pandas.core.frame.DataFrame'>
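
The copy matters most when the function changes the frame in place. The sketch below assumes a hypothetical function drop_first_row and a made-up input frame; it suggests that an in-place drop on the copy leaves the caller's frame untouched.

```python
import pandas as pd

def drop_first_row(dataframe):  # hypothetical name, for illustration only
    my_df = dataframe.copy()
    # The in-place drop modifies only the copy, not the caller's frame
    my_df.drop(0, inplace=True)
    return my_df

old_df = pd.DataFrame({'a': [1, 2, 3]})  # example data, an assumption
new_df = drop_first_row(old_df)

print(len(old_df), len(new_df))
```

Without inplace=True, drop already returns a new frame, so in that case the explicit copy is a safeguard rather than a requirement.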



Source: https://stackoverflow.com/questions/45579525/returning-a-dataframe-in-python-function
