pandas

Convert Julian dates to normal dates in a dataframe?

痞子三分冷 submitted on 2021-02-07 19:18:07
Question: I have a date column in a pandas DataFrame that holds Julian dates. How can I convert these Julian dates to mm-dd-yyyy format?

Sample data:

       ORG  CHAIN_NBR  SEQ_NBR INT_STATUS BLOCK_CODE_1  DATA_BLOCK_CODE_1
    0  523          1        0          A            C            2012183
    1  523          2        1          I            A            2013025
    2  521          3        1          A            H            2007067
    3  513          4        1          D            H            2001046
    4  513          5        1          8            I            2006075

I tried the jd2gcal function, but it is not working. I also tried writing code like this, without success:

    for i, row in amna.iterrows():
        amna['DATE_BLOCK_CODE_1'] = datetime.datetime.strptime
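The sample values look like a four-digit year followed by a three-digit day of the year (for example 2012183), so one possible approach is to parse them with the `%Y%j` format instead of looping over rows. This is a minimal sketch under that assumption: the column name `DATE_BLOCK_CODE_1` is taken from the asker's loop, while `DATE_FORMATTED` is a hypothetical name for the output column.

```python
import pandas as pd

# Sample values from the question; assumed layout is YYYYDDD (year + day of year).
amna = pd.DataFrame(
    {"DATE_BLOCK_CODE_1": [2012183, 2013025, 2007067, 2001046, 2006075]}
)

# Convert to strings so the whole column can be parsed with one format:
# %Y is the four-digit year, %j is the zero-padded day of the year.
parsed = pd.to_datetime(amna["DATE_BLOCK_CODE_1"].astype(str), format="%Y%j")

# Render as mm-dd-yyyy strings (hypothetical output column name).
amna["DATE_FORMATTED"] = parsed.dt.strftime("%m-%d-%Y")
print(amna)
```

Keeping `parsed` as a datetime column and formatting only for display is usually more convenient if the dates are needed for later filtering or arithmetic.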

Pandas: Count time interval intersections over a group by

人盡茶涼 submitted on 2021-02-07 19:00:45
Question: I have a dataframe of the following form:

    import pandas as pd
    df = pd.DataFrame({'id': [1, 2, 3, 4, 5],
                       'group': ['A', 'A', 'A', 'B', 'B'],
                       'start': ['2012-08-19', '2012-08-22', '2013-08-19', '2012-08-19', '2013-08-19'],
                       'end': ['2012-08-28', '2013-09-13', '2013-08-19', '2012-12-19', '2014-08-19']})

    Out[1]:
       id group       start         end
    0   1     A  2012-08-19  2012-08-28
    1   2     A  2012-08-22  2013-09-13
    2   3     A  2013-08-19  2013-08-21
    3   4     B  2012-08-19  2012-12-19
    4   5     B  2013-08-19  2014-08-19

For a given row in my dataframe I'd like to count
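The question text is cut off, so what exactly should be counted is not known. Going by the title, the sketch below assumes the goal is to count, for each row, how many other rows in the same group have a `[start, end]` interval that overlaps it, with both endpoints treated as inclusive. It uses the values from the `pd.DataFrame(...)` call in the question; `n_overlaps` is a hypothetical column name.

```python
import pandas as pd

df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "group": ["A", "A", "A", "B", "B"],
    "start": ["2012-08-19", "2012-08-22", "2013-08-19", "2012-08-19", "2013-08-19"],
    "end": ["2012-08-28", "2013-09-13", "2013-08-19", "2012-12-19", "2014-08-19"],
})
df["start"] = pd.to_datetime(df["start"])
df["end"] = pd.to_datetime(df["end"])

counts = pd.Series(0, index=df.index)
for _, g in df.groupby("group"):
    starts = g["start"].to_numpy()
    ends = g["end"].to_numpy()
    # Closed intervals i and j overlap when start_i <= end_j and start_j <= end_i.
    overlaps = (starts[:, None] <= ends[None, :]) & (starts[None, :] <= ends[:, None])
    # Every interval overlaps itself, so subtract one from each row total.
    counts.loc[g.index] = overlaps.sum(axis=1) - 1

df["n_overlaps"] = counts  # hypothetical column name
print(df)
```

The pairwise comparison is quadratic in the size of each group, which is fine for small groups but may need a sort-based approach for very large ones.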

Pandas: groupby with condition

谁说我不能喝 submitted on 2021-02-07 18:43:27
Question: I have a dataframe:

    ID,used_at,active_seconds,subdomain,visiting,category
    123,2016-02-05 19:39:21,2,yandex.ru,2,Computers
    123,2016-02-05 19:43:01,1,mail.yandex.ru,2,Computers
    123,2016-02-05 19:43:13,6,mail.yandex.ru,2,Computers
    234,2016-02-05 19:46:09,16,avito.ru,2,Automobiles
    234,2016-02-05 19:48:36,21,avito.ru,2,Automobiles
    345,2016-02-05 19:48:59,58,avito.ru,2,Automobiles
    345,2016-02-05 19:51:21,4,avito.ru,2,Automobiles
    345,2016-02-05 19:58:55,4,disk.yandex.ru,2,Computers
    345,2016-02-05 19

Calculating subtractions of pairs of columns in pandas DataFrame

微笑、不失礼 submitted on 2021-02-07 18:21:54
Question: I work with fairly large DataFrames (48K rows, up to tens of columns). At a certain point in their manipulation I need to do pairwise subtractions of column values, and I was wondering whether there is a more efficient way to do so than my current approach (see below).

My current code:

    # matrix is the pandas DataFrame containing all the data
    comparison_df = pandas.DataFrame(index=matrix.index)
    combinations = itertools.product(group1, group2)
    for observed, reference in combinations:
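The body of the loop is cut off, so the sketch below assumes that each iteration computes `matrix[observed] - matrix[reference]` and stores it as a new column. Under that assumption, one way to avoid the Python-level loop is NumPy broadcasting over the two column groups. The toy `matrix`, the column lists, and the `_minus_` naming scheme are all hypothetical.

```python
import itertools

import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the asker's DataFrame and column groups.
matrix = pd.DataFrame(
    np.arange(12, dtype=float).reshape(3, 4), columns=["a", "b", "x", "y"]
)
group1 = ["a", "b"]
group2 = ["x", "y"]

obs = matrix[group1].to_numpy()  # shape (n_rows, len(group1))
ref = matrix[group2].to_numpy()  # shape (n_rows, len(group2))

# Broadcast to shape (n_rows, len(group1), len(group2)): every observed
# column minus every reference column, computed in one vectorized step.
diffs = obs[:, :, None] - ref[:, None, :]

# Flattening the last two axes yields the same pair order as itertools.product.
names = [f"{o}_minus_{r}" for o, r in itertools.product(group1, group2)]
comparison_df = pd.DataFrame(
    diffs.reshape(len(matrix), -1), index=matrix.index, columns=names
)
print(comparison_df)
```

This builds the whole result in a single allocation rather than adding columns one at a time, which tends to matter more as the number of column pairs grows.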

Using ols function with parameters that contain numbers/spaces

寵の児 submitted on 2021-02-07 18:14:32
Question: I am having a lot of difficulty using the statsmodels.formula.api function ols(formula, data).fit().rsquared_adj because of the names of my predictors. The predictors contain numbers, spaces and similar characters, which the formula parser clearly doesn't like. I understand that I need to use something like patsy.builtins.Q. So, let's say my predictor is weight.in.kg; it should be entered as Q("weight.in.kg"). I need to build my formula from a list, and the difficulty arises in modifying every
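Wrapping every name from a list in `Q("...")` can be done with plain string formatting before the formula is handed to `ols`. The sketch below only builds the formula string; `build_formula`, the response name `height`, and the second predictor `age 2015` are hypothetical, and it assumes none of the names contain a double quote or a backslash.

```python
def build_formula(response, predictors):
    """Build a patsy-style formula with every variable name wrapped in Q("...").

    Assumes no name contains a double quote or backslash.
    """
    terms = " + ".join(f'Q("{name}")' for name in predictors)
    return f'Q("{response}") ~ {terms}'


# Hypothetical names; "weight.in.kg" is the example from the question.
predictors = ["weight.in.kg", "age 2015"]
formula = build_formula("height", predictors)
print(formula)
```

The resulting string can then be passed as the `formula` argument of `ols(formula, data)`, provided `data` has columns with exactly these names.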

Rank by groupby column aggregate

我是研究僧i submitted on 2021-02-07 17:38:29
Question: I want to create a column manager_rank that ranks each manager by the sum of their returns. I have come up with one solution, posted below, but was hoping someone has something more elegant.

    import pandas as pd
    df = pd.DataFrame([['2012', 'A', 1], ['2012', 'B', 4], ['2011', 'A', 5], ['2011', 'B', 4]],
                      columns=['year', 'manager', 'return'])

Desired result:

       year manager  return  manager_rank
    0  2012       A       1             2
    1  2011       A       5             2
    2  2012       B       4             1
    3  2011       B       4             1

Answer 1:

    df['ranking'] = df.groupby('manager')['return']
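The answer text is cut off, so the following is a separate sketch and not necessarily what the answerer went on to write. It assumes that rank 1 goes to the manager with the largest total return, which matches the desired result in the question (B totals 8, A totals 6).

```python
import pandas as pd

df = pd.DataFrame(
    [["2012", "A", 1], ["2012", "B", 4], ["2011", "A", 5], ["2011", "B", 4]],
    columns=["year", "manager", "return"],
)

# Broadcast each manager's total return back onto that manager's rows.
totals = df.groupby("manager")["return"].transform("sum")

# Dense ranking keeps ranks consecutive (1, 2, ...) even though each total
# repeats once per row; ascending=False gives rank 1 to the largest total.
df["manager_rank"] = totals.rank(method="dense", ascending=False).astype(int)
print(df)
```

The rows stay in their original order here; sorting by `manager` afterwards would reproduce the row layout shown in the desired result.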