pandas

Python: Calculate average for each hour in CSV?

谁都会走 submitted on 2021-02-08 08:58:01
Question: I want to calculate the average for each hour using a CSV file. Below is my data set:

    Timestamp Temperature
    9/1/2016 0:00:08 53.8
    9/1/2016 0:00:38 53.8
    9/1/2016 0:01:08 53.8
    9/1/2016 0:01:38 53.8
    9/1/2016 0:02:08 53.8
    9/1/2016 0:02:38 54.1
    9/1/2016 0:03:08 54.1
    9/1/2016 0:03:38 54.1
    9/1/2016 0:04:38 54
    9/1/2016 0:05:38 54
    9/1/2016 0:06:08 54
    9/1/2016 0:06:38 54
    9/1/2016 0:07:08 54
    9/1/2016 0:07:38 54
    9/1/2016 0:08:08 54.1
    9/1/2016 0:08:38 54.1
    9/1/2016 0:09:38 54.1
    9/1/2016 0:10:32 54
    9/1
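One possible way to get per-hour averages is to parse the timestamps and resample by hour. The sketch below is only an illustration: the sample rows are made up in the same two-column shape as the question's data, and it assumes the timestamps are written month/day/year.

```python
import io

import pandas as pd

# Made-up sample rows in the same shape as the question's CSV.
csv_text = """Timestamp,Temperature
9/1/2016 0:00:08,53.8
9/1/2016 0:00:38,53.8
9/1/2016 0:02:38,54.1
9/1/2016 1:00:08,54.0
9/1/2016 1:30:08,55.0
"""

df = pd.read_csv(io.StringIO(csv_text))

# Assumption: timestamps are month/day/year hour:minute:second.
df["Timestamp"] = pd.to_datetime(df["Timestamp"], format="%m/%d/%Y %H:%M:%S")

# Resample into one-hour bins and average the readings in each bin.
hourly_mean = (
    df.set_index("Timestamp")["Temperature"]
      .resample(pd.offsets.Hour())
      .mean()
)
print(hourly_mean)
```

Hours with no readings show up as NaN in the result, which can be dropped with `dropna()` if they are not wanted.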

pandas dataframe.apply — converting hex string to int number

泪湿孤枕 submitted on 2021-02-08 08:42:13
Question: I am very new to both Python and pandas. I would like to know how to convert DataFrame elements from hex-string input to integers. I have followed the solution provided in "convert pandas dataframe column from hex string to int"; however, it is still not working. The following is my code:

    df = pd.read_csv(filename, delim_whitespace=True, header=None, usecols=range(7,23,2))
    for i in range(num_frame):
        skipheader = lineNum[header_padding + i*2]
        data = df.iloc[skipheader:skipheader
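A minimal sketch of one way to convert every cell of a DataFrame from a hex string to an integer. The small frame below is invented for illustration; it assumes every cell is a valid hex string with no missing values.

```python
import pandas as pd

# Invented sample: two columns of hex strings, labelled like usecols positions.
# When loading from a file, passing dtype=str to read_csv keeps cells such as
# "10" as strings instead of letting pandas parse them as decimal numbers.
df = pd.DataFrame({7: ["0A", "ff", "10"], 9: ["00", "7F", "1b"]})

# int(s, 16) parses one hex string; map applies it to each cell of a column,
# and apply runs that over every column.
as_int = df.apply(lambda col: col.map(lambda s: int(s, 16)))
print(as_int)
```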

Pandas Multiindex Groupby aggregate column with value from another column

邮差的信 submitted on 2021-02-08 08:41:13
Question: I have a pandas DataFrame with a MultiIndex in which I want to aggregate the rows with duplicate keys, as follows:

    import numpy as np
    import pandas as pd
    df = pd.DataFrame({'S':[0,5,0,5,0,3,5,0],'Q':[6,4,10,6,2,5,17,4],'A': ['A1','A1','A1','A1','A2','A2','A2','A2'], 'B':['B1','B1','B2','B2','B1','B1','B1','B2']})
    df.set_index(['A','B'])

            Q  S
    A  B
    A1 B1   6  0
       B1   4  5
       B2  10  0
       B2   6  5
    A2 B1   2  0
       B1   5  3
       B1  17  5
       B2   4  0

and I would like to group this DataFrame to aggregate the Q values (sum) and keep the S value
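The excerpt is cut off before it says which S value should be kept, so the following is only a sketch under an assumed rule: sum Q per (A, B) key and keep the S from the row with the largest Q. The rule and the use of a stable sort are assumptions, not something the question states.

```python
import pandas as pd

df = pd.DataFrame({
    "S": [0, 5, 0, 5, 0, 3, 5, 0],
    "Q": [6, 4, 10, 6, 2, 5, 17, 4],
    "A": ["A1", "A1", "A1", "A1", "A2", "A2", "A2", "A2"],
    "B": ["B1", "B1", "B2", "B2", "B1", "B1", "B1", "B2"],
})

# Assumed rule: keep S from the row with the largest Q in each group.
# Sorting by Q first means "last" picks that row's S within every group.
result = (
    df.sort_values("Q", kind="mergesort")
      .groupby(["A", "B"])
      .agg(Q=("Q", "sum"), S=("S", "last"))
)
print(result)
```

A different rule (for example the first S, or the maximum S) would only change the second aggregation.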

How to convert monthly return to yearly return after checking if all 12 months are present in pandas?

六月ゝ 毕业季﹏ submitted on 2021-02-08 08:38:19
Question: I have monthly returns and I want to convert them into yearly returns for each company (using cusip6; I am using CRSP data). I also want to keep only those years that have all 12 months. I am currently using the following code, but I would like to know whether there are built-in functions in pandas that can do this.

    def monthly_to_ann_ret(data):
        """ function to check if all 12 months are present and calculate yearly returns from monthly returns """
        data['year'] = data['date'].dt.year
        data.sort(
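One way this could be expressed with built-in groupby aggregation is sketched below. The column name `ret` and the sample data are hypothetical (the excerpt does not show the return column's name), and the annual return is assumed to be the compounded product of (1 + monthly return) minus 1.

```python
import pandas as pd

def monthly_to_annual(data, ret_col="ret"):
    """Compound monthly returns per company-year, keeping only full years."""
    out = data.copy()
    out["year"] = out["date"].dt.year
    out["month"] = out["date"].dt.month
    yearly = out.groupby(["cusip6", "year"]).agg(
        n_months=("month", "nunique"),                       # distinct months present
        annual_ret=(ret_col, lambda r: (1 + r).prod() - 1),  # compounded return
    )
    # Keep only company-years in which all 12 months appear.
    return yearly.loc[yearly["n_months"] == 12, ["annual_ret"]]

# Hypothetical sample: one company, 15 consecutive months at 1% each,
# so 2019 is complete and 2020 has only three months.
dates = pd.period_range("2019-01", periods=15, freq="M").to_timestamp()
sample = pd.DataFrame({"cusip6": "AAAAAA", "date": dates, "ret": 0.01})

annual = monthly_to_annual(sample)
print(annual)
```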

Trying to insert pandas dataframe to temporary table

≡放荡痞女 submitted on 2021-02-08 08:37:15
Question: I'm looking to create a temp table and insert some data into it. I have used pyodbc extensively to pull data, but I am not familiar with writing data to SQL from a Python environment. I am doing this at work, so I don't have the ability to create tables, but I can create temp and global temp tables. My intent is to insert a relatively small DataFrame (150 rows x 4 cols) into a temp table and reference it throughout my session. My program structure makes it so that a global variable in the session
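The general pattern (create the temp table, then bulk-insert the DataFrame rows with a parameterised statement) can be sketched as follows. To keep the example runnable without a database server, the standard-library `sqlite3` module stands in for a pyodbc connection; the table name, column names, and sample data are all hypothetical. pyodbc cursors also offer `executemany` with `?` placeholders, and on SQL Server a local temp table name would start with `#`.

```python
import sqlite3

import pandas as pd

# Hypothetical 4-column frame standing in for the real data.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "name": ["a", "b", "c"],
    "qty": [10, 20, 30],
    "price": [1.5, 2.5, 3.5],
})

# sqlite3 is only a stand-in here for a pyodbc connection.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# The temp table lives as long as this connection (session) stays open.
cur.execute(
    "CREATE TEMP TABLE staging (id INTEGER, name TEXT, qty INTEGER, price REAL)"
)

# itertuples yields plain Python tuples, one per row, in column order.
rows = list(df.itertuples(index=False, name=None))
cur.executemany("INSERT INTO staging VALUES (?, ?, ?, ?)", rows)
conn.commit()

row_count = cur.execute("SELECT COUNT(*) FROM staging").fetchone()[0]
total_qty = cur.execute("SELECT SUM(qty) FROM staging").fetchone()[0]
print(row_count, total_qty)
```

Because a temp table is tied to its connection, the same connection object has to be reused for every later query that references it.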

Changing monthly values to daily by evenly distributing between dates

℡╲_俬逩灬. submitted on 2021-02-08 08:32:10
Question: I have a monthly dataset:

    df = pd.DataFrame({'Month':[1,2], 'Plan':[310,620], 'Month_start_date': ['2020-01-01','2020-02-01']})
    print(df)
    df['Month_start_date'] = (pd.to_datetime(df['Month_start_date'], format='%Y/%m/%d')
                              .dt.to_period('m').dt.to_timestamp())
    df = df.set_index('Month_start_date')

I created a list of dates in the format I would like to reindex to:

    start = '2020-01-01'
    end = '2020-02-29'
    dates = pd.date_range(start, end, freq='D')
    dates

When I try to change the dataframe to daily using
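A minimal sketch of one way to spread each monthly value evenly over its days: forward-fill the monthly rows onto the daily index, then divide by the number of days in each month. The `Plan_daily` column name is made up for the example.

```python
import pandas as pd

df = pd.DataFrame({
    "Month": [1, 2],
    "Plan": [310, 620],
    "Month_start_date": ["2020-01-01", "2020-02-01"],
})
df["Month_start_date"] = pd.to_datetime(df["Month_start_date"])
df = df.set_index("Month_start_date")

dates = pd.date_range("2020-01-01", "2020-02-29", freq="D")

# Each day inherits the row of the month it falls in.
daily = df.reindex(dates, method="ffill")

# Divide the monthly total by the length of that month (31, 29, ...).
daily["Plan_daily"] = daily["Plan"] / daily.index.days_in_month.values
print(daily.head())
```

With this data every January day gets 310 / 31 = 10, and every February 2020 day gets 620 / 29.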

Find the business days between two columns in a pandas dataframe, which contain NaTs

纵饮孤独 submitted on 2021-02-08 08:27:19
Question: I have 2 columns in my pandas DataFrame, and I want to calculate the business days between them. Data:

    ID   On hold     Off Hold
    101  09/15/2017  09/16/2017
    102  NA          NA
    103  09/22/2017  09/26/2017
    104  10/12/2017  10/30/2017
    105  NA          NA
    106  08/05/2017  08/06/2017
    107  08/08/2017  08/03/2017
    108  NA          NA

I tried the code below, using busday_count from numpy:

    df1['On hold'] = pd.to_datetime(df1['On hold'])
    df1['Off Hold'] = pd.to_datetime(df1['Off Hold'])
    np.busday_count(df1['On hold'].values.astype('datetime64[D]'
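One possible approach is to call `np.busday_count` only on the rows where both dates are present and leave the rest as NaN. The sketch below uses the first four rows of the question's data; the `bus_days` column name is made up, and the dates are assumed to be month/day/year.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({
    "ID": [101, 102, 103, 104],
    "On hold": ["09/15/2017", None, "09/22/2017", "10/12/2017"],
    "Off Hold": ["09/16/2017", None, "09/26/2017", "10/30/2017"],
})
# Assumption: dates are month/day/year; missing values become NaT.
df1["On hold"] = pd.to_datetime(df1["On hold"], format="%m/%d/%Y")
df1["Off Hold"] = pd.to_datetime(df1["Off Hold"], format="%m/%d/%Y")

# Rows where both endpoints are real dates.
valid = df1["On hold"].notna() & df1["Off Hold"].notna()

# Start with NaN everywhere, then fill in only the valid rows.
df1["bus_days"] = np.nan
df1.loc[valid, "bus_days"] = np.busday_count(
    df1.loc[valid, "On hold"].values.astype("datetime64[D]"),
    df1.loc[valid, "Off Hold"].values.astype("datetime64[D]"),
)
print(df1)
```

`np.busday_count` counts the start date but not the end date, so Friday 09/15/2017 to Saturday 09/16/2017 gives 1.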

Sum a range of cells in a single column in pandas dataframe

廉价感情. submitted on 2021-02-08 08:24:46
Question: I have three columns in a DataFrame. I want to take the number in the STREAK_COUNT column and sum that many cells of the returns in MON TOTAL. The result is shown under WANTED RESULT below. The issue I can't figure out is summing a number of cells that can vary, in this example between 1 and 4.

    MON TOTAL STREAK_COUNT WANTED RESULT
    1/2/1992 1.123077 1 1.123077 (only 1 so 1.12)
    2/3/1992 -1.296718 0
    3/2/1992 -6.355612 2 -7.65233 (sum of -1.29 and -6.35)
    4
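A variable-length trailing sum like this can be sketched with a cumulative sum: the sum of the last k values ending at row i equals the running total at i minus the running total k rows earlier. The sketch uses the three complete rows from the excerpt, treats `MON TOTAL` as the name of the returns column (an assumption about the flattened header), and assumes a streak count never reaches back before the first row. `RESULT` is a made-up column name.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "MON TOTAL": [1.123077, -1.296718, -6.355612],
    "STREAK_COUNT": [1, 0, 2],
})

total = df["MON TOTAL"].to_numpy()
k = df["STREAK_COUNT"].to_numpy()

# csum[j] is the sum of the first j values, so csum[0] == 0.
csum = np.concatenate(([0.0], np.cumsum(total)))

end = np.arange(1, len(df) + 1)  # one past the current row
start = end - k                  # assumes k never exceeds the rows available
window_sum = csum[end] - csum[start]

# Rows with a zero streak get no result.
df["RESULT"] = np.where(k > 0, window_sum, np.nan)
print(df)
```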

Splitting Columns' Values in Pandas by delimiter without losing delimiter

时光怂恿深爱的人放手 submitted on 2021-02-08 08:21:39
Question: Hi, I have a DataFrame that follows this format:

    df = pd.DataFrame(np.array([[1, 2, 'Apples 20pk ABC123', 4, 5],
                                [6, 7, 'Oranges 40pk XYZ123', 9, 0],
                                [5, 6, 'Bananas 20pk ABC123', 8, 9]]),
                      columns=['Serial #', 'Branch ID', 'Info', 'Value1', 'Value2'])

       Serial#  Branch ID  Info                 Value1  Value2
    0  1        2          Apples 20pk ABC123   4       5
    1  6        7          Bananas 20pk ABC123  9       0
    2  5        6          Oranges 40pk XYZ123  8       9

I want to split the "Info" column's values on the "pk" delimiter. Essentially, I want to create two new columns,
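One way to split on a delimiter while keeping it is to capture it with a regular expression instead of using `str.split`. In the sketch below the new column names `Product` and `Code` are hypothetical (the excerpt is cut off before naming them), and each value is assumed to contain "pk" followed by whitespace.

```python
import pandas as pd

df = pd.DataFrame({
    "Info": ["Apples 20pk ABC123", "Oranges 40pk XYZ123", "Bananas 20pk ABC123"],
})

# Group 1: everything up to and including the first "pk".
# Group 2: whatever follows the whitespace after it.
parts = df["Info"].str.extract(r"^(.*?pk)\s+(.*)$")

df["Product"] = parts[0]  # hypothetical column name
df["Code"] = parts[1]     # hypothetical column name
print(df)
```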

Pandas DataFrame currency conversion

巧了我就是萌 submitted on 2021-02-08 08:18:45
Question: I have a DataFrame with two columns:

    col1 | col2
    20     EUR
    31     GBP
    5      JPY

I may have 10000 rows like this. How can I do a fast currency conversion to a base currency of GBP? Should I use easymoney? I know how to apply a conversion to a single row, but I do not know how to iterate through all the rows quickly. EDIT: I would like to apply something like:

    def convert_currency(amount, currency_symbol):
        converted = ep.currency_converter(amount=1000, from_currency=currency_symbol, to_currency="GBP")
        return converted
    df.loc[df
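Rather than converting row by row, one option is to look up a rate once per distinct currency and then multiply whole columns. The sketch below assumes this approach; `get_rate_to_gbp` is a hypothetical stand-in for whatever rate source is used (for example a single call to a converter library per currency), and the rates in it are placeholder numbers, not real exchange rates.

```python
import pandas as pd

df = pd.DataFrame({"col1": [20, 31, 5], "col2": ["EUR", "GBP", "JPY"]})

def get_rate_to_gbp(symbol):
    """Hypothetical rate lookup; the values are placeholders, not real rates."""
    placeholder_rates = {"GBP": 1.0, "EUR": 0.5, "JPY": 0.01}
    return placeholder_rates[symbol]

# One lookup per distinct currency instead of one per row.
rates = {sym: get_rate_to_gbp(sym) for sym in df["col2"].unique()}

# Map each row's currency to its rate and multiply column-wise.
df["amount_gbp"] = df["col1"] * df["col2"].map(rates)
print(df)
```

With 10000 rows and only a handful of currencies, this keeps the number of rate lookups equal to the number of currencies.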