pandas | 易学教程

Syntax to use df.apply() with datetime.strptime [duplicate]

Question: This question already has answers here: How to convert string to datetime format in pandas python? (2 answers). Closed 2 years ago.

Consider the following table 'df':

       date        sales
    0  2021-04-10  483
    1  2022-02-03  226
    2  2021-09-23  374
    3  2021-10-17  186
    4  2021-07-17  35

I would like to convert the column date, which is currently a string, to a date by using apply() and datetime.strptime(). I tried the following:

    format_date = "%Y-%m-%d"
    df["date_new"] = df.loc[:,"date"].apply(datetime.strptime,df.loc[:
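A minimal sketch of two ways the conversion described above could be written, assuming the column holds strings in the `%Y-%m-%d` format shown in the table. The column name `date_vec` is introduced here only for illustration. `Series.apply` forwards extra positional arguments through its `args` parameter, and `pd.to_datetime` is the vectorized alternative.

```python
from datetime import datetime

import pandas as pd

# Sample data copied from the question's table.
df = pd.DataFrame({
    "date": ["2021-04-10", "2022-02-03", "2021-09-23", "2021-10-17", "2021-07-17"],
    "sales": [483, 226, 374, 186, 35],
})
format_date = "%Y-%m-%d"

# Option 1: apply() passes each string as the first argument and
# forwards the format string through args=(...).
df["date_new"] = df["date"].apply(datetime.strptime, args=(format_date,))

# Option 2 (hypothetical column name): parse the whole column at once.
df["date_vec"] = pd.to_datetime(df["date"], format=format_date)

print(df)
```

The vectorized call is usually the more idiomatic choice on large columns, since it avoids calling a Python function once per row.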

Vectorizing an iterative function on Pandas DataFrame

Question: I have a dataframe where the first row is the initial condition.

    df = pd.DataFrame({"Year": np.arange(4), "Pop": [0.4] + [np.nan]* 3})

and a function f(x,r) = r*x*(1-x), where r = 2 is a constant and 0 <= x <= 1. I want to produce the following dataframe by applying the function to column Pop row-by-row iteratively, i.e., df.Pop[i] = f(df.Pop[i-1], r=2):

    df = pd.DataFrame({"Year": np.arange(4), "Pop": [0.4, 0.48, 0.4992, 0.49999872]})

Is it possible to do this in a vectorized way? I
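Because each value depends on the previous one, the recurrence is inherently sequential. The sketch below fills the column with a plain loop over a NumPy array. It is one possible approach, not necessarily the answer given on the original page. For the specific case r = 2, the logistic map also has a known closed form, x_n = (1 - (1 - 2*x_0)^(2^n)) / 2, which can be evaluated for all rows at once. The variable names `pop`, `steps`, and `closed_form` are illustrative.

```python
import numpy as np
import pandas as pd


def f(x, r=2):
    """Logistic map step from the question: f(x, r) = r*x*(1-x)."""
    return r * x * (1 - x)


df = pd.DataFrame({"Year": np.arange(4), "Pop": [0.4] + [np.nan] * 3})

# Sequential fill: work on a NumPy copy, then write the column back.
pop = df["Pop"].to_numpy(copy=True)
for i in range(1, len(pop)):
    pop[i] = f(pop[i - 1], r=2)
df["Pop"] = pop

# Vectorized alternative, valid only for r = 2 (closed-form solution).
x0 = 0.4
steps = np.arange(len(df))  # number of iterations applied to each row
closed_form = 0.5 * (1 - (1 - 2 * x0) ** (2.0 ** steps))

print(df)
print(closed_form)
```

For other values of r no such closed form is available in general, so a loop (optionally compiled with a tool such as Numba) is the usual fallback.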

Set Xticks frequency to dataframe index

Question: I currently have a dataframe whose index is the years from 1990 to 2014 (25 rows). I want my plot to show all the years on the X axis. I'm using add_subplot because I plan to have 4 plots in this figure (all of them with the same X axis). To create the dataframe:

    import pandas as pd
    import numpy as np
    index = np.arange(1990,2015,1)
    columns = ['Total Population','Urban Population']
    pop_plot = pd.DataFrame(index=index, columns=columns)
    pop_plot = pop_plot.fillna(0)
    pop_plot['Total
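One way to get a tick for every year is to pass the dataframe index to `Axes.set_xticks`. The sketch below assumes matplotlib is used directly (as the mention of add_subplot suggests) and fills the frame with placeholder zeros, since the real population values are not shown in the preview. The figure size, the 2x2 grid, and the label rotation are arbitrary choices.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

index = np.arange(1990, 2015, 1)
columns = ["Total Population", "Urban Population"]
# Placeholder zeros; the real values are not part of the preview.
pop_plot = pd.DataFrame(0, index=index, columns=columns)

fig = plt.figure(figsize=(12, 8))
ax = fig.add_subplot(2, 2, 1)  # first of four planned subplots
ax.plot(pop_plot.index, pop_plot["Total Population"])

# Place one tick on every year in the index and rotate the labels
# so that 25 of them fit without overlapping.
ax.set_xticks(pop_plot.index)
ax.tick_params(axis="x", labelrotation=90)
```

If all four subplots share the same X axis, creating them with `plt.subplots(2, 2, sharex=True)` would let the ticks be set once instead of per axes.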

Bokeh: Generating graphs in a loop, the output graph's file sizes keep increasing

Question: I'm using bokeh to plot 100 graph files in a loop.

    for k in files:
        # Read the log file data into a df.
        log_file_name = str(k) + ".csv"
        logged_data = pd.read_csv("csv/"+log_file_name, parse_dates=["dttm_utc"], date_parser=dateparse)
        new_logged_data = logged_data.set_index("dttm_utc")
        mean_data = new_logged_data.resample("3D", how=[np.mean])
        # Extract the energy values and time stamps out into two ds.
        energy_data = mean_data["value"]["mean"]
        time_data = mean_data.index
        # Plotting
        output_file(

How to lag data by x specific days on a multi index pandas dataframe?

Question: I have a dataframe that has dates, assets, and then price/volume data. I'm trying to pull in data from 7 days ago, but the issue is that I can't use shift() because my table has missing dates in it.

    date      cusip  price  price_7daysago
    1/1/2017  a      1
    1/1/2017  b      2
    1/2/2017  a      1.2
    1/2/2017  b      2.3
    1/8/2017  a      1.1    1
    1/8/2017  b      2.2    2

I've tried creating a lambda function that uses loc and timedelta to create this shift, but I was only able to output empty numpy arrays:

    def row_delta(x, df, days,
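A sketch of one way to look up the price from exactly 7 calendar days earlier without relying on shift(): copy the frame, move its dates forward by 7 days, and left-merge it back on date and cusip. This is an illustrative approach, not necessarily the answer from the original page. It keeps date and cusip as ordinary columns; a frame indexed by them could call `reset_index()` first. The names `lagged` and `result` are introduced here.

```python
import pandas as pd

# Sample data copied from the question's table.
df = pd.DataFrame({
    "date": pd.to_datetime(
        ["1/1/2017", "1/1/2017", "1/2/2017", "1/2/2017", "1/8/2017", "1/8/2017"],
        format="%m/%d/%Y",
    ),
    "cusip": ["a", "b", "a", "b", "a", "b"],
    "price": [1, 2, 1.2, 2.3, 1.1, 2.2],
})

# Shift the dates of a copy forward by 7 days, so that each row lines up
# with the row that should receive it as "price 7 days ago".
lagged = df[["date", "cusip", "price"]].copy()
lagged["date"] = lagged["date"] + pd.Timedelta(days=7)
lagged = lagged.rename(columns={"price": "price_7daysago"})

# Left merge keeps every original row; rows with no match 7 days back get NaN.
result = df.merge(lagged, on=["date", "cusip"], how="left")
print(result)
```

Because the match is on the actual calendar date, gaps in the date column simply produce missing values instead of pulling the wrong row.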

Use Pandas to Get Multiple Tables From Webpage

Question: I am using Pandas to parse the data from the following page: http://kenpom.com/index.php?y=2014

To get the data, I am writing:

    dfs = pd.read_html(url)

The data looks great and is perfectly parsed, except that it only takes data from the first 40 rows. It seems to be a problem with the separation of the tables, which means pandas does not get all the information. How do you get pandas to read all the data from all the tables on that webpage?

Answer 1: The HTML of the page you have posted have

Python for merging multiple files from a directory into one single file

Question: I need a single file with many columns (one per file in the directory), built from multiple files in the directory. Each file has unique IDs that are the same across all files, so I need to merge these files based on that id. For example, file_1 looks like this:

    id     pool1
    ABL1   1352
    ABL12  1236
    ABL13  1022
    ABL14  815
    ABL15  1591
    ABL16  2703

The other files are laid out the same way: the first column is the same in every file in the directory and the second column differs. I am looking for an output which looks
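A minimal sketch of merging every file in a directory on the shared id column with pandas. It assumes whitespace-separated files with a header row, as in the sample above. The temporary directory, the file names, the `.txt` extension, and the `pool2` values are made up for illustration so that the example is self-contained.

```python
import glob
import os
import tempfile
from functools import reduce

import pandas as pd

# Hypothetical input files written to a temporary directory.
# The pool1 rows come from the question; the pool2 values are invented.
tmpdir = tempfile.mkdtemp()
samples = {
    "file_1.txt": "id pool1\nABL1 1352\nABL12 1236\nABL13 1022\n",
    "file_2.txt": "id pool2\nABL1 10\nABL12 20\nABL13 30\n",
}
for name, text in samples.items():
    with open(os.path.join(tmpdir, name), "w") as fh:
        fh.write(text)

# Read every file, then fold the frames together with successive merges on "id".
paths = sorted(glob.glob(os.path.join(tmpdir, "*.txt")))
frames = [pd.read_csv(path, sep=r"\s+") for path in paths]
merged = reduce(lambda left, right: left.merge(right, on="id", how="outer"), frames)

# Write the combined table as a single tab-separated file.
merged.to_csv(os.path.join(tmpdir, "merged.tsv"), sep="\t", index=False)
print(merged)
```

An outer merge keeps ids that are missing from some files; if every file is guaranteed to hold the same ids, `how="inner"` gives the same result.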