pandas | 易学教程

Several time series to DataFrame

Question: I have a problem merging several time series into a common DataFrame. The example code I'm using:

    import pandas
    import datetime
    import numpy as np

    start = datetime.datetime(2001, 1, 1)
    end = datetime.datetime(2001, 1, 10)
    dates = pandas.date_range(start, end)
    serie_1 = pandas.Series(np.random.randn(10), index=dates)

    start = datetime.datetime(2001, 1, 2)
    end = datetime.datetime(2001, 1, 11)
    dates = pandas.date_range(start, end)
    serie_2 = pandas.Series(np.random.randn(10), index=dates)

    start =
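The excerpt is cut off before the remaining series and any answer, so the following is only a minimal sketch of one common way to line up series with different date ranges: `pandas.concat` along the columns axis. It uses just the two series visible in the question; the column labels passed to `keys` and the name `combined` are illustrative choices, not taken from the original.

```python
import datetime

import numpy as np
import pandas as pd

# The two series shown in the question: same length, date ranges offset by one day.
dates_1 = pd.date_range(datetime.datetime(2001, 1, 1), datetime.datetime(2001, 1, 10))
serie_1 = pd.Series(np.random.randn(10), index=dates_1)

dates_2 = pd.date_range(datetime.datetime(2001, 1, 2), datetime.datetime(2001, 1, 11))
serie_2 = pd.Series(np.random.randn(10), index=dates_2)

# axis=1 puts each series in its own column; the default outer join keeps
# every date that appears in either index and fills the gaps with NaN.
# The column names given in `keys` are illustrative.
combined = pd.concat([serie_1, serie_2], axis=1, keys=["serie_1", "serie_2"])

print(combined)
```

With these inputs the result has one row per date from 2001-01-01 to 2001-01-11, with NaN wherever a series has no value for that date.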

return default if pandas dataframe.loc location doesn't exist

Question: I often find myself having to check whether a column or row exists in a dataframe before trying to reference it. For example, I end up adding a lot of code like:

    if 'mycol' in df.columns and 'myindex' in df.index:
        x = df.loc['myindex', 'mycol']
    else:
        x = mydefault

Is there any way to do this more nicely? For example, on an arbitrary object I can do x = getattr(anobject, 'id', default). Is there anything similar to this in pandas? Really, any way to achieve what I'm doing more gracefully?

Answer 1: There
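The answer text is truncated here, so what it recommended is not known. As a hedged illustration of the `getattr`-style behaviour the question asks for, one option is to wrap the lookup in a small helper that catches the `KeyError` raised by `.loc` for a missing label. The helper name `loc_or_default` and the sample frame are hypothetical.

```python
import pandas as pd


def loc_or_default(df, row, col, default=None):
    """Return df.loc[row, col], or `default` if either label is missing.

    Hypothetical helper, not part of pandas.
    """
    try:
        return df.loc[row, col]
    except KeyError:
        # .loc raises KeyError when the row label or the column label is absent.
        return default


# Small sample frame for demonstration only.
df = pd.DataFrame({"mycol": [1, 2]}, index=["myindex", "other"])

x = loc_or_default(df, "myindex", "mycol", default=-1)        # label pair exists
y = loc_or_default(df, "missing_row", "mycol", default=-1)    # row is missing
z = loc_or_default(df, "myindex", "missing_col", default=-1)  # column is missing
```

Catching the exception avoids the two membership checks and keeps the call site to a single line, at the cost of hiding any `KeyError` that comes from a different cause.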

Python pandas idxmax for multiple indexes in a dataframe

Question: I have a series that looks like this:

                delivery
    2007-04-26  706           23
    2007-04-27  705           10
                706         1089
                708           83
                710           13
                712           51
                802            4
                806            1
                812            3
    2007-04-29  706           39
                708            4
                712            1
    2007-04-30  705            3
                706         1016
                707            2
    ...
    2014-11-04  1412          53
                1501           1
                1502           1
                1512           1
    2014-11-05  1411          47
                1412        1334
                1501          40
                1502         433
                1504         126
                1506         100
                1508           7
                1510           6
                1512          51
                1604           1
                1612           5
    Length: 26255, dtype: int64

where the query is:

    df.groupby([df.index.date, 'delivery']).size()

For each day, I need to pull out the delivery number which has
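The question is cut off before it states the criterion. Assuming the goal is the delivery number with the largest count on each day (an assumption suggested by the `idxmax` in the title), a minimal sketch is to group the series by its first index level and take `idxmax` within each group. The sample values are a few rows copied from the output above; the level name `date` and the variable names are illustrative.

```python
import pandas as pd

# A few rows from the series in the question, rebuilt as a two-level index.
# The level name "date" is an illustrative choice.
index = pd.MultiIndex.from_tuples(
    [
        ("2007-04-26", 706),
        ("2007-04-27", 705),
        ("2007-04-27", 706),
        ("2007-04-27", 708),
        ("2007-04-29", 706),
        ("2007-04-29", 708),
        ("2007-04-30", 705),
        ("2007-04-30", 706),
    ],
    names=["date", "delivery"],
)
counts = pd.Series([23, 10, 1089, 83, 39, 4, 3, 1016], index=index)

# idxmax within each day returns the full (date, delivery) label of the
# largest count for that day.
top_labels = counts.groupby(level="date").idxmax()

# Keep only the delivery part of each label.
top_delivery = top_labels.map(lambda label: label[1])

print(top_delivery)
```

If ties matter, note that `idxmax` reports only the first label that reaches the maximum in each group.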

Cythonising Pandas: ctypes for content, index and columns

Question: I am very new to Cython, yet I am already seeing extraordinary speedups just from copying my .py to .pyx (with cimport cython, numpy, etc.) and importing it into ipython3 with pyximport. Many tutorials start with this approach, the next step being to add cdef declarations for every data type, which I can do for the iterators in my for loops and so on. But unlike most Pandas Cython tutorials or examples, I am not applying functions, so to speak; I am mostly manipulating data using slices, sums and division (etc).
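The excerpt ends before any answer, so the following is only a plain-Python sketch of a usual first step in this situation, under the assumption that the numeric work can be done on NumPy arrays: pull the content, index and columns out of the DataFrame, do the slicing, summing and dividing on the arrays, then rebuild the frame. Arrays like these are what typed Cython declarations would normally be attached to. The sample data and all variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Illustrative frame; the question does not show its data.
df = pd.DataFrame(
    {"a": [1.0, 2.0, 3.0], "b": [4.0, 5.0, 6.0]},
    index=["x", "y", "z"],
)

# Separate the three parts named in the title.
values = df.to_numpy(dtype=np.float64)  # content as a 2-D float64 array
index = df.index.to_numpy()             # row labels
columns = df.columns.to_numpy()         # column labels

# Slice / sum / divide on the raw array: each value as a share of its row total.
row_totals = values.sum(axis=1, keepdims=True)
row_share = values / row_totals

# Reattach the labels once the numeric work is done.
result = pd.DataFrame(row_share, index=index, columns=columns)
```

Keeping the labels out of the numeric section means only `values` needs a fixed numeric type; the index and columns can stay as ordinary Python objects.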

dateutil 2.5.0 is the minimum required version

Question: I'm running Jupyter Notebook (Enthought Canopy Python distribution 2.7) on Mac OS X (v 10.13.6). When I try to import pandas (import pandas as pd), I get this error:

    ImportError: dateutil 2.5.0 is the minimum required version

I have these package versions:

    Canopy version 2.1.3.3542 (64 bit)
    jupyter version 1.0.0-25
    pandas version 0.23.1-1
    python_dateutil version 2.6.0-1

I don't get this error when I run with the Canopy Editor, so it must be some jupyter compatibility
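The question is truncated and no answer is shown, so the cause is not established here. One way to check whether the notebook kernel is picking up a different `dateutil` than the Canopy Editor is to print, from inside each environment, which interpreter is running and where the module is loaded from. This is a diagnostic sketch only; the helper name `describe_module` is hypothetical.

```python
import importlib
import sys


def describe_module(name):
    """Return the version and file location of an importable module.

    Hypothetical helper for diagnosis; returns None if the import fails.
    """
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    return {
        "name": name,
        # Not every module defines __version__, so fall back to a placeholder.
        "version": getattr(module, "__version__", "unknown"),
        "file": getattr(module, "__file__", "unknown"),
    }


# Run this in both the notebook and the editor, then compare the output.
print("interpreter:", sys.executable)
print("dateutil:", describe_module("dateutil"))
```

If the two environments report different interpreter paths or different `dateutil` locations, the notebook is likely importing an older copy than the 2.6.0-1 listed above.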