pandas

Parse pandas (multi)index to datetime

时间秒杀一切 submitted on 2021-02-15 11:35:14
Question: I have a multi-index df as follows:

                     x  y
    id  date
    abc 3/1/1994   100  7
        9/1/1994    90  8
        3/1/1995    80  9

where the dates are stored as str. I want to parse the date index. The following statement

    df.index.levels[1] = pd.to_datetime(df.index.levels[1])

returns an error: TypeError: 'FrozenList' does not support mutable operations.

Answer 1: As mentioned, you have to recreate the index:

    df.index = df.index.set_levels([df.index.levels[0], pd.to_datetime(df.index.levels[1])])

Answer 2: You cannot modify it in-place. You can use
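The index replacement described in the first answer can be sketched as follows. This is a minimal example, not the answerer's exact code: it rebuilds the sample frame from the question, replaces only the date level through the `level=1` argument of `set_levels`, and assumes the strings are month/day/year (the source does not say which order is meant).

```python
import pandas as pd

# Rebuild the sample frame from the question; dates start out as plain strings.
index = pd.MultiIndex.from_tuples(
    [("abc", "3/1/1994"), ("abc", "9/1/1994"), ("abc", "3/1/1995")],
    names=["id", "date"],
)
df = pd.DataFrame({"x": [100, 90, 80], "y": [7, 8, 9]}, index=index)

# Assumption: the strings are month/day/year.
parsed = pd.to_datetime(df.index.levels[1], format="%m/%d/%Y")

# set_levels returns a new MultiIndex; assigning it back replaces the old one.
df.index = df.index.set_levels(parsed, level=1)

dates = df.index.get_level_values("date")
print(dates)
```

Because `set_levels` swaps the level values and leaves the integer codes untouched, each row keeps its own date even though the level itself is stored in sorted order.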

How to read data in Python dataframe without concatenating?

戏子无情 submitted on 2021-02-15 10:15:54
Question: I want to read the file f (file size: 85 GB) in chunks into a dataframe. The following code was suggested:

    chunksize = 5
    TextFileReader = pd.read_csv(f, chunksize=chunksize)

However, this code gives me a TextFileReader, not a dataframe. Also, I don't want to concatenate these chunks to convert the TextFileReader to a dataframe, because of the memory limit. Please advise.

Answer 1: As you are trying to process an 85 GB CSV file, if you try to read all the data by breaking it into chunks and converting it into
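One common way to work within the memory limit is to treat each chunk as a small dataframe, reduce it to the figures you need, and discard it. The sketch below is an illustration under assumptions, not the truncated answer's method: the in-memory CSV and its `value` column are hypothetical stand-ins for the 85 GB file.

```python
import io

import pandas as pd

# Hypothetical stand-in for the large file: one "value" column, rows 1..12.
csv_source = io.StringIO("value\n" + "\n".join(str(n) for n in range(1, 13)))

total = 0
row_count = 0

# Each iteration yields an ordinary DataFrame of at most `chunksize` rows.
for chunk in pd.read_csv(csv_source, chunksize=5):
    total += int(chunk["value"].sum())
    row_count += len(chunk)

print(total, row_count)
```

Only one chunk is held in memory at a time, so the peak footprint depends on `chunksize` rather than on the size of the file.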

Pandas `.to_pydatetime()` not working inside a DataFrame

為{幸葍}努か submitted on 2021-02-15 06:28:28
Question: I have strings like '03-21-2019' that I want to convert to the native Python datetime object, that is, of the datetime.datetime type. The conversion is easy enough through pandas:

    import pandas as pd
    import datetime as dt

    date_str = '03-21-2019'
    pd_Timestamp = pd.to_datetime(date_str)
    py_datetime_object = pd_Timestamp.to_pydatetime()
    print(type(py_datetime_object))

with the result

    <class 'datetime.datetime'>

This is precisely what I want, since I want to compute timedeltas by subtracting
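The question is cut off before it shows the failing DataFrame code, so the following is only a sketch of one way to keep native `datetime.datetime` objects in a column: store them in a Series with `dtype=object`, which stops pandas from converting them back to its own datetime64 type. The column names `as_text` and `as_python` are hypothetical.

```python
import datetime as dt

import pandas as pd

date_strings = ["03-21-2019", "03-25-2019"]

# Parse with pandas, then convert each Timestamp to a native datetime.
parsed = pd.to_datetime(pd.Series(date_strings), format="%m-%d-%Y")
py_values = [ts.to_pydatetime() for ts in parsed]

# dtype=object keeps the Python objects as they are instead of
# re-inferring a datetime64 column.
df = pd.DataFrame(
    {
        "as_text": date_strings,
        "as_python": pd.Series(py_values, dtype=object),
    }
)

first = df["as_python"].iloc[0]
delta = df["as_python"].iloc[1] - first
print(type(first), delta)
```

An object column gives up vectorised datetime operations, so this trade-off is mainly worthwhile when downstream code strictly requires `datetime.datetime` values.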

How to implode(reverse of pandas explode) based on a column

点点圈 submitted on 2021-02-15 05:53:17
Question: I have a dataframe df like below:

       NETWORK  config_id APPLICABLE_DAYS  Case  Delivery
    0  Grocery       5399             SUN    10         1
    1  Grocery       5399             MON    20         2
    2  Grocery       5399             TUE    30         3
    3  Grocery       5399             WED    40         4

I want to implode it (combine APPLICABLE_DAYS from multiple rows into a single row, like below) and get the average case and delivery per config_id:

       NETWORK  config_id  APPLICABLE_DAYS  Avg_Cases  Avg_Delivery
    0  Grocery       5399  SUN,MON,TUE,WED         90            10

Using groupby on NETWORK and config_id, I can get Avg_Cases and Avg_Delivery like
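A possible way to do both steps in one pass is named aggregation, joining the day strings and averaging the numeric columns in the same `groupby`. This is a sketch built on the sample rows from the question, not the accepted answer. Note that the plain mean of those rows is 25 and 2.5; the question's expected output lists 90 and 10, which the excerpt does not explain.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "NETWORK": ["Grocery"] * 4,
        "config_id": [5399] * 4,
        "APPLICABLE_DAYS": ["SUN", "MON", "TUE", "WED"],
        "Case": [10, 20, 30, 40],
        "Delivery": [1, 2, 3, 4],
    }
)

# One output row per (NETWORK, config_id): join the days, average the numbers.
result = df.groupby(["NETWORK", "config_id"], as_index=False).agg(
    APPLICABLE_DAYS=("APPLICABLE_DAYS", ",".join),
    Avg_Cases=("Case", "mean"),
    Avg_Delivery=("Delivery", "mean"),
)

print(result)
```

The join keeps the days in their original row order within each group, so sort the frame first if a specific weekday order matters.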

Converting generator from read_sql in pandas to dataframe has failed

情到浓时终转凉″ submitted on 2021-02-15 05:34:13
Question: I want to read data from my Oracle database. I use pandas's read_sql and set the parameter chunksize=20000:

    from sqlalchemy import create_engine
    import pandas as pd

    engine = create_engine("my oracle")
    df = pd.read_sql("select clause", engine, chunksize=20000)

It returns an iterator, and I want to convert this generator to a dataframe using df = pd.DataFrame(df), but that fails. How can the iterator be converted to a dataframe?

Answer 1: This iterator can be concatenated, which then returns a dataframe:

    df =
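The concatenation the answer refers to can be sketched with `pd.concat`, which accepts any iterable of dataframes, including the generator that `read_sql` returns when `chunksize` is set. To stay runnable without an Oracle server, this example assumes an in-memory SQLite database and a hypothetical `numbers` table.

```python
import sqlite3

import pandas as pd

# Assumption: in-memory SQLite stands in for the Oracle connection.
conn = sqlite3.connect(":memory:")
pd.DataFrame({"n": range(10)}).to_sql("numbers", conn, index=False)

# With chunksize set, read_sql yields DataFrames of up to 4 rows each.
chunks = pd.read_sql("SELECT n FROM numbers ORDER BY n", conn, chunksize=4)

# concat consumes the generator; ignore_index gives one continuous index.
df = pd.concat(chunks, ignore_index=True)
conn.close()

print(df.shape)
```

Concatenating every chunk rebuilds the full result in memory, so this only helps when the complete table fits; otherwise process each chunk inside a loop.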

How to set two time formatters in matplotlib?

China☆狼群 submitted on 2021-02-15 05:27:07
Question: This chart was built in Excel. How can I do the same using matplotlib? I mean, how do I add two formatters, one for years and one for months? Right now I use something like this:

    fig, ax = plt.subplots(1,1)
    ax.margins(x=0)
    ax.plot(list(df['Date']), list(df['Value']), color="g")
    ax.xaxis.set_major_locator(matplotlib.dates.YearLocator())
    ax.xaxis.set_major_formatter(matplotlib.dates.DateFormatter('%Y'))
    plt.text(df["Date"].iloc[-1], df["Value"].iloc[-1], df["Value"].iloc[-1])
    plt.title(title)
    plt.get_current_fig_manager()
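One way to get two levels of date labels is to give the major ticks a year formatter and the minor ticks a month formatter. The sketch below uses made-up monthly data and the non-interactive Agg backend; the padding value and the choice to skip January on the minor ticks are assumptions made for readability, not something the question specifies.

```python
import datetime as dt

import matplotlib

matplotlib.use("Agg")  # render off-screen so the example runs without a display

import matplotlib.dates as mdates
import matplotlib.pyplot as plt

# Hypothetical data: one point per month over two years.
dates = [dt.date(2019 + m // 12, m % 12 + 1, 1) for m in range(24)]
values = list(range(24))

fig, ax = plt.subplots(1, 1)
ax.margins(x=0)
ax.plot(dates, values, color="g")

# Major ticks: one per year, labelled with the year.
ax.xaxis.set_major_locator(mdates.YearLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y"))

# Minor ticks: February to December, labelled with the month abbreviation.
# January is skipped because the year label already sits there.
ax.xaxis.set_minor_locator(mdates.MonthLocator(bymonth=list(range(2, 13))))
ax.xaxis.set_minor_formatter(mdates.DateFormatter("%b"))

# Push the year labels down so they sit below the month labels.
ax.tick_params(axis="x", which="major", pad=15)

fig.canvas.draw()
```

With many years on the axis the month labels will crowd each other, in which case a sparser `bymonth` list or a smaller minor label size is worth trying.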
