Import data frame from one Jupyter Notebook file to another

依然范特西╮ submitted on 2021-01-22 03:38:46

Question


I have 3 separate Jupyter notebook files that deal with separate data frames. I clean and manipulate the data in these notebooks for each df. Is there a way to reference the cleaned-up/final data in a separate notebook?

My concern is that if I work on all 3 dfs in one notebook and then do more with it after (merge/join), it will be a mile long. I also don't want to re-write a bunch of code just to get data ready for use in my new notebook.


Answer 1:


If you are using pandas data frames then one approach is to use pandas.DataFrame.to_csv() and pandas.read_csv() to save and load the cleaned data between each step.

  1. Notebook1 loads input1 and saves result1.
  2. Notebook2 loads result1 and saves result2.
  3. Notebook3 loads result2 and saves result3.

If this is your data:

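import pandas as pd

raw_data = {'id': [10, 20, 30],
            'name': ['foo', 'bar', 'baz']}
# named input_df to avoid shadowing the built-in input()
input_df = pd.DataFrame(raw_data, columns=['id', 'name'])
# write the file that notebook1 reads
input_df.to_csv('input.csv')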

Then in notebook1.ipynb, process it like this:

import pandas as pd  # each notebook runs in its own kernel

# load
df = pd.read_csv('input.csv', index_col=0)
# manipulate frame here
# ...
# save
df.to_csv('result1.csv')

...and repeat that process for each stage in the chain. For example, in notebook2.ipynb:

import pandas as pd

# load
df = pd.read_csv('result1.csv', index_col=0)
# manipulate frame here
# ...
# save
df.to_csv('result2.csv')
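
The load/manipulate/save pattern can also be tried end to end in a single script before splitting it across notebooks. The following is only a minimal sketch: the `run_stage` helper, the temporary directory, and the two cleaning steps are hypothetical stand-ins, not part of the original answer.

```python
import tempfile
from pathlib import Path

import pandas as pd


def run_stage(src: Path, dst: Path, transform) -> pd.DataFrame:
    """Load a CSV, apply one cleaning step, and save the result (hypothetical helper)."""
    df = pd.read_csv(src, index_col=0)  # first column holds the saved index
    df = transform(df)
    df.to_csv(dst)
    return df


workdir = Path(tempfile.mkdtemp())  # throwaway directory for the demo

# Stand-in for the raw input file.
pd.DataFrame({"id": [10, 20, 30], "name": ["foo", "bar", "baz"]}).to_csv(
    workdir / "input.csv"
)

# Stage 1 (assumed cleaning step): upper-case the names.
run_stage(
    workdir / "input.csv",
    workdir / "result1.csv",
    lambda df: df.assign(name=df["name"].str.upper()),
)

# Stage 2 (assumed cleaning step): keep only rows with id above 10.
result2 = run_stage(
    workdir / "result1.csv",
    workdir / "result2.csv",
    lambda df: df[df["id"] > 10],
)

print(result2)
```

Each call corresponds to what one notebook would do; in practice the body of each `transform` would live in its own notebook, and only the CSV files would be shared.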

At the end, your notebook collection will look like this:

  • input.csv
  • notebook1.ipynb
  • notebook2.ipynb
  • notebook3.ipynb
  • result1.csv
  • result2.csv
  • result3.csv
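
If the three notebooks clean three independent frames (as in the question) rather than forming a chain, the same files can feed a fourth notebook that loads each saved result and merges them. This is a sketch under assumptions: the shared `id` key, the extra `score` and `city` columns, and the sample values are invented for illustration.

```python
import tempfile
from pathlib import Path

import pandas as pd

workdir = Path(tempfile.mkdtemp())  # throwaway directory for the demo

# Stand-ins for what each cleaning notebook would have saved (hypothetical data).
pd.DataFrame({"id": [10, 20, 30], "name": ["foo", "bar", "baz"]}).to_csv(
    workdir / "result1.csv"
)
pd.DataFrame({"id": [10, 20, 30], "score": [1.5, 2.5, 3.5]}).to_csv(
    workdir / "result2.csv"
)
pd.DataFrame({"id": [10, 30], "city": ["Oslo", "Lima"]}).to_csv(
    workdir / "result3.csv"
)

# In the combining notebook: load each cleaned frame...
df1 = pd.read_csv(workdir / "result1.csv", index_col=0)
df2 = pd.read_csv(workdir / "result2.csv", index_col=0)
df3 = pd.read_csv(workdir / "result3.csv", index_col=0)

# ...then join them on the shared key (assumed here to be "id").
merged = df1.merge(df2, on="id", how="inner").merge(df3, on="id", how="inner")

print(merged)
```

An inner join keeps only the ids present in all three files; `how="left"` or `how="outer"` would be the alternatives if unmatched rows should survive.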

Documentation:

  • pandas.read_csv
  • pandas.DataFrame.to_csv
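
One detail of these two functions that is easy to miss: `to_csv()` writes the index as an unnamed first column by default, which is why the snippets above read it back with `index_col=0`. The short sketch below (using an in-memory string instead of a file, purely for illustration) shows what happens with and without that argument.

```python
import io

import pandas as pd

df = pd.DataFrame({"id": [10, 20, 30], "name": ["foo", "bar", "baz"]})

# With no path argument, to_csv returns the CSV text as a string.
csv_text = df.to_csv()

# index_col=0 turns the first column back into the index.
with_index = pd.read_csv(io.StringIO(csv_text), index_col=0)

# Without it, the saved index comes back as an extra data column.
without_index = pd.read_csv(io.StringIO(csv_text))

# Alternative: skip writing the index at all.
no_index_text = df.to_csv(index=False)
plain = pd.read_csv(io.StringIO(no_index_text))

print(list(with_index.columns))     # ['id', 'name']
print(list(without_index.columns))  # ['Unnamed: 0', 'id', 'name']
print(list(plain.columns))          # ['id', 'name']
```

Either convention works as long as the saving and loading notebooks agree on it.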


Source: https://stackoverflow.com/questions/46674086/import-data-frame-from-one-jupyter-notebook-file-to-another
