pandas

Time series date tick problems when using the pandas.DataFrame.plot method

折月煮酒 submitted on 2021-02-07 09:31:34
Question: I just discovered something really strange when using the plot method of pandas.DataFrame. I am using pandas 0.19.1. Here is my MWE:

    import numpy as np
    import matplotlib.pyplot as plt
    import matplotlib.dates as mdates
    import pandas as pd

    t = pd.date_range('1990-01-01', '1990-01-08', freq='1H')
    x = pd.DataFrame(np.random.rand(len(t)), index=t)
    fig, axe = plt.subplots()
    x.plot(ax=axe)
    plt.show()
    xt = axe.get_xticks()

When I try to format my xticklabels I get strange behaviours, then I
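
The excerpt is cut off, so the exact symptom is not known. One common cause of surprising tick values here is that, for a regularly spaced DatetimeIndex, pandas draws the x axis in its own period-based units rather than matplotlib date numbers, so matplotlib.dates formatters misread the ticks. A minimal sketch, assuming that is the issue: passing x_compat=True to plot makes pandas use matplotlib's date handling. The Agg backend, the Timedelta frequency (equivalent to the question's '1H') and the label format are choices made for this illustration only.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display

import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hourly index over one week; a Timedelta frequency avoids version-specific aliases.
t = pd.date_range("1990-01-01", "1990-01-08", freq=pd.Timedelta(hours=1))
x = pd.DataFrame(np.random.rand(len(t)), index=t)

fig, axe = plt.subplots()
# x_compat=True asks pandas to plot with matplotlib date units.
x.plot(ax=axe, x_compat=True)

# With matplotlib date units, the standard locators and formatters apply.
axe.xaxis.set_major_locator(mdates.DayLocator())
axe.xaxis.set_major_formatter(mdates.DateFormatter("%Y-%m-%d"))

fig.canvas.draw()  # tick labels are only populated after a draw
labels = [tick.get_text() for tick in axe.get_xticklabels()]
```

After the draw, labels holds one date string per day tick.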

How to Replace All the “nan” Strings with Empty String in My DataFrame?

雨燕双飞 submitted on 2021-02-07 09:27:30
Question: I have "None" and "nan" strings scattered in my dataframe. Is there a way to replace all of those with an empty string "" or NaN so they do not show up when I export the dataframe as an Excel sheet?

Simplified example (note: the nan values in col4 are not strings):

    ID  col1   col2    col3    col4
    1   Apple  nan     nan     nan
    2   None   orange  None    nan
    3   None   nan     banana  nan

The output should look like this after all the "None" and "nan" strings have been replaced by empty strings "":

    ID  col1   col2    col3    col4
    1   Apple                  nan
    2
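
One way to do what the question asks, sketched with the example table rebuilt by hand: DataFrame.replace accepts a list of values to match, so both strings can be swapped for "" in a single call. The frame below is an assumption reconstructed from the excerpt, and the name cleaned is made up for the illustration.

```python
import numpy as np
import pandas as pd

# Example data from the question; col4 holds real NaN values, not strings.
df = pd.DataFrame({
    "ID": [1, 2, 3],
    "col1": ["Apple", "None", "None"],
    "col2": ["nan", "orange", "nan"],
    "col3": ["nan", "None", "banana"],
    "col4": [np.nan, np.nan, np.nan],
})

# Replace the literal strings "None" and "nan" with an empty string.
# Real NaN values in col4 are not strings, so they are left untouched.
cleaned = df.replace(["None", "nan"], "")
```

Passing np.nan instead of "" as the replacement value would turn the strings into real missing values. DataFrame.to_excel writes those as empty cells by default (na_rep='').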

Annualized Return in Pandas

我的未来我决定 submitted on 2021-02-07 09:24:06
Question: I am seeking to confirm that my representation of the annualized return formula (using monthly returns) is optimal. The annualized return formula I am using (where M is a monthly return and D is the total count of monthly returns), where the count of monthly returns is greater than 12, is as follows:

Alternatively, this would change in the case of the monthly return count being less than 12:

Here is my representation of this formula in Pandas:

    ann_return = observations.apply(lambda y: y
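
The formulas themselves did not survive in this excerpt, so the sketch below is an assumption: it uses the conventional geometric annualization, (prod(1 + M)) ** (12 / D) - 1 when there is at least a year of data, and the plain cumulative return prod(1 + M) - 1 for shorter histories. The function name annualized_return and the sample columns are hypothetical. Only observations and ann_return come from the question.

```python
import pandas as pd


def annualized_return(monthly: pd.Series) -> float:
    """Geometric annualized return from monthly returns (assumed formula)."""
    d = monthly.count()               # D: number of non-missing monthly returns
    growth = (1.0 + monthly).prod()   # prod() skips NaN by default
    if d >= 12:
        # At D == 12 both branches agree, since the exponent is 1.
        return growth ** (12.0 / d) - 1.0
    # Shorter than a year: report the cumulative return without annualizing.
    return growth - 1.0


# Hypothetical data: one column with 24 months, one with only 6 months.
observations = pd.DataFrame({
    "fund_a": [0.01] * 24,
    "fund_b": [0.02] * 6 + [None] * 18,
})

ann_return = observations.apply(annualized_return)
```

A named function keeps the two cases readable. The same logic can be written as a lambda inside apply, as the question's own code does.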

easy multidimensional numpy ndarray to pandas dataframe method?

二次信任 submitted on 2021-02-07 09:17:12
Question: Having a 4-D numpy.ndarray, e.g.

    myarr = np.random.rand(10, 4, 3, 2)
    dims = {'time': range(1, 11), 'sub': range(1, 5), 'cond': ['A', 'B', 'C'], 'measure': ['meas1', 'meas2']}

but with possibly higher dimensions: how can I create a pandas.DataFrame with a MultiIndex, just passing the dimensions as indexes, without further manual adjustments (reshaping the ndarray into a 2-D shape)? I can't wrap my head around the reshaping, not even really in 3 dimensions quite yet, so I'm searching for an 'automatic' method if possible. What
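
A minimal sketch of one possible approach, not necessarily the answer the original thread settled on: pd.MultiIndex.from_product builds every combination of the dimension labels with the last level varying fastest. That order matches a C-order flatten of the array, so no manual reshaping is needed. The column name value is made up for the illustration.

```python
import numpy as np
import pandas as pd

myarr = np.random.rand(10, 4, 3, 2)

# One entry per axis, in the same order as the array's axes.
dims = {
    "time": range(1, 11),
    "sub": range(1, 5),
    "cond": ["A", "B", "C"],
    "measure": ["meas1", "meas2"],
}

# Cartesian product of the labels; the last level varies fastest,
# which matches the order produced by myarr.ravel() (C order).
index = pd.MultiIndex.from_product(list(dims.values()), names=list(dims.keys()))

df = pd.DataFrame({"value": myarr.ravel()}, index=index)
```

The same two lines work for any number of dimensions, provided dims lists the axes in array order and each label list has the length of its axis.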

How to split a pandas dataframe into many columns after groupby

时间秒杀一切 submitted on 2021-02-07 09:01:27
Question: I want to be able to use groupby in pandas to group the data by a column, but then split it so each group is its own column in a dataframe, e.g.:

       time  data
    0     1   2.0
    1     2   3.0
    2     3   4.0
    3     1   2.1
    4     2   3.1
    5     3   4.1
    etc.

into

          data1  data2  ...  dataN
    time
    1       2.0    2.1  ...
    2       3.0    3.1  ...
    3       4.0    4.1  ...

I am sure the place to start is df.groupby('time') but then I can't seem to figure out the right way to use concat (or another function) to build the split data frame that I want. There is probably some simple
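
One way to get the wide layout shown above, offered as a sketch rather than as the thread's accepted answer: number each row within its time group with groupby(...).cumcount(), then pivot on that counter. The helper names occurrence and col are invented for the illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "time": [1, 2, 3, 1, 2, 3],
    "data": [2.0, 3.0, 4.0, 2.1, 3.1, 4.1],
})

# 1 for the first row seen with a given time, 2 for the second, and so on.
occurrence = df.groupby("time").cumcount() + 1

# Build the target column labels (data1, data2, ...) and pivot to wide form.
wide = (
    df.assign(col="data" + occurrence.astype(str))
      .pivot(index="time", columns="col", values="data")
)
wide.columns.name = None  # drop the helper name from the column axis
```

If some time values occur fewer times than others, the missing cells in the wide frame are filled with NaN.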

pandas read_csv parses the header as string type but I want integer

折月煮酒 submitted on 2021-02-07 08:37:57
Question: For example, the CSV file is as below, where (1,2,3) is the header:

    1,2,3
    0,0,0

I read the CSV file using pd.read_csv and print a column:

    import pandas as pd
    df = pd.read_csv('./test.csv')
    print(df[1])

This raises KeyError: 1. It seems that read_csv parses the header as strings. Is there any way to use integer type for the dataframe columns?

Answer 1: I think a more general approach is to cast the column names to integer with astype:

    df = pd.read_csv('./test.csv')
    df.columns = df.columns.astype(int)

Another way is first get only first column and
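
The astype approach from the answer can be tried without a file on disk. In this sketch the CSV text is fed through io.StringIO as a stand-in for ./test.csv, and the variable names csv_text, original_labels and first are made up for the illustration.

```python
import io

import pandas as pd

# Same content as the question's test.csv, held in memory.
csv_text = "1,2,3\n0,0,0\n"

df = pd.read_csv(io.StringIO(csv_text))
original_labels = list(df.columns)   # header values are parsed as strings

# Cast the column labels to integers so that df[1] works.
df.columns = df.columns.astype(int)
first = df[1]
```

The cast fails with a ValueError if any header cell is not a valid integer, so it only suits files whose header is entirely numeric.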