dataframe

How to insert missing dates and forward fill columns after grouping by another column in pandas dataframe

拜拜、爱过 submitted on 2021-02-10 14:53:47
Question: I have data available on a monthly basis (for different securities) which I want to convert to a daily basis by adding the missing dates and forward filling the monthly data for all the days of the month (i.e. data on 12/3/2015 = data on 12/1/2015, and so on for all securities). My data looks like this: x = pd.DataFrame({'ticker': ['a','a','a','b','b'], 'dt': ['12/1/2015','1/1/2016','2/1/2016','1/1/2016','2/1/2016'], 'score': [2.8,3.8,3.8,1.9,1.7]}) I tried creating a multi-index using dates and
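One possible approach to the task described in this excerpt, as a minimal sketch: parse `dt`, use it as the index, and resample each ticker to daily frequency with a forward fill. The name `daily` is only illustrative, and the date format is assumed to be month/day/year. Note that this fills up to each ticker's last observed date; it does not extend through the end of the final month.

```python
import pandas as pd

x = pd.DataFrame({
    'ticker': ['a', 'a', 'a', 'b', 'b'],
    'dt': ['12/1/2015', '1/1/2016', '2/1/2016', '1/1/2016', '2/1/2016'],
    'score': [2.8, 3.8, 3.8, 1.9, 1.7],
})

# Assumption: the strings are month/day/year.
x['dt'] = pd.to_datetime(x['dt'], format='%m/%d/%Y')

# Resample each ticker separately to daily frequency and carry the
# last monthly score forward over the inserted days.
daily = (
    x.set_index('dt')
     .groupby('ticker')['score']
     .resample('D')
     .ffill()
     .reset_index()
)

print(daily.head())
```

Grouping before resampling keeps one ticker's values from leaking into another ticker's date range.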

How to insert missing dates and forward fill columns after grouping by another column in pandas dataframe

前提是你 submitted on 2021-02-10 14:52:27
Question: I have data available on a monthly basis (for different securities) which I want to convert to a daily basis by adding the missing dates and forward filling the monthly data for all the days of the month (i.e. data on 12/3/2015 = data on 12/1/2015, and so on for all securities). My data looks like this: x = pd.DataFrame({'ticker': ['a','a','a','b','b'], 'dt': ['12/1/2015','1/1/2016','2/1/2016','1/1/2016','2/1/2016'], 'score': [2.8,3.8,3.8,1.9,1.7]}) I tried creating a multi-index using dates and

DataFrame: if value in a cell, copy value to cells below it

天大地大妈咪最大 submitted on 2021-02-10 14:47:39
Question: I'm working on a stock analysis program and need to find 'SPLIT' amounts in the 'UNP_action' column, and then copy the corresponding 'UNP_action_amount' to the rows above it only. I'm able to do this in a complicated way via loops, but I'm wondering if there's a more efficient way to do this within Pandas.
Current:
Date        UNP_Adj_Close  UNP_action  UNP_action_amount
2008-05-23  31.83157
2008-05-27  33.032365
2008-05-28  32.965423
2008-05-29  33.61812       SPLIT       0.5
2008-05-30  34.438176
Desired: Date UNP_Adj_Close
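A loop-free way to do what this excerpt describes might look like the sketch below: keep the amount only on 'SPLIT' rows, then back-fill so earlier rows pick it up. The desired output is cut off in the excerpt, so this assumes the split row itself also keeps its amount and that later rows stay empty. The column name `split_factor` is hypothetical.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Date': pd.to_datetime(['2008-05-23', '2008-05-27', '2008-05-28',
                            '2008-05-29', '2008-05-30']),
    'UNP_Adj_Close': [31.83157, 33.032365, 32.965423, 33.61812, 34.438176],
    'UNP_action': [None, None, None, 'SPLIT', None],
    'UNP_action_amount': [np.nan, np.nan, np.nan, 0.5, np.nan],
})

# Keep amounts only where the action is a split; everything else becomes NaN.
split_amount = df['UNP_action_amount'].where(df['UNP_action'] == 'SPLIT')

# Back-fill so each row takes the amount of the next split at or after it.
# Rows after the last split have nothing to copy and remain NaN.
df['split_factor'] = split_amount.bfill()

print(df)
```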

Expand nested list of dictionaries in a pandas dataframe column

巧了我就是萌 submitted on 2021-02-10 14:36:29
Question: I have this dataframe called "leads" that I got from saving the output of an SFDC SOQL query into a dataframe. I have been trying to expand the column "Leads__r.record". Company Month Amount Leads__r.done Leads__r.record Leads__r.totalSize 0 A1 September 500000 True [{u'Id': u'Q500', u'Company': u'... 1.0 1 B1 December 16200 True [{u'Id': u'Q600', u'Company': u'... 1.0 2 C1 December 35000 True [{u'Id': u'Q700', u'Company': u'... 1.0 3 D1 December 16200 True [{u'Id': u'Q800', u'Company': u'... 1.0 4 E1

How to count observations with certain value in a group conditionally?

独自空忆成欢 submitted on 2021-02-10 14:33:26
Question: I am working with the following data frame:
Year  Month     Day  X    Y    Color
2018  January   1    4.5  6    Red
2018  January   4    3.2  8.1  Red
2018  January   11   1.1  2.3  Blue
2018  February  7    5.4  2.2  Blue
2018  February  15   1.5  4.4  Red
2019  January   3    8.6  2.3  Red
2019  January   22   1.1  2.5  Blue
2019  January   23   5.5  7.8  Red
2019  February  5    6.9  1.1  Red
2019  February  10   1.8  1.3  Red
I am looking to create a new column that indicates the number of observations where X is greater than Y and the Color is 'Red' for a given month.
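The conditional count described in this excerpt can be sketched by building a boolean flag and summing it within each (Year, Month) group. The output column name `Count` is hypothetical, and "a given month" is assumed to mean a month within a specific year.

```python
import pandas as pd

df = pd.DataFrame({
    'Year': [2018] * 5 + [2019] * 5,
    'Month': ['January', 'January', 'January', 'February', 'February',
              'January', 'January', 'January', 'February', 'February'],
    'Day': [1, 4, 11, 7, 15, 3, 22, 23, 5, 10],
    'X': [4.5, 3.2, 1.1, 5.4, 1.5, 8.6, 1.1, 5.5, 6.9, 1.8],
    'Y': [6, 8.1, 2.3, 2.2, 4.4, 2.3, 2.5, 7.8, 1.1, 1.3],
    'Color': ['Red', 'Red', 'Blue', 'Blue', 'Red',
              'Red', 'Blue', 'Red', 'Red', 'Red'],
})

# 1 where both conditions hold, 0 otherwise.
flag = ((df['X'] > df['Y']) & (df['Color'] == 'Red')).astype(int)

# Sum the flags per (Year, Month) and broadcast the total back to every row.
df['Count'] = flag.groupby([df['Year'], df['Month']]).transform('sum')

print(df)
```

Using `transform` rather than `agg` keeps the result aligned with the original rows, so it can be assigned directly as a new column.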

R: Is there a way to sort messy data where it pivots from long to wide, and as it moves across variables, into one logical key:value column?

微笑、不失礼 submitted on 2021-02-10 14:24:06
Question: I have extremely messy data. A portion of it looks like the following example. x1_01=c("bearing_coordinates", "bearing_coordinates", "bearing_coordinates", "roadkill") x1_02=c(146,122,68,1) x2_01=c("tree_density","animals_on_road","animals_on_road", "tree_density") x2_02=c(13,2,5,11) x3_01=c("animals_on_road", "tree_density", "roadkill", "bearing_coordinates") x3_02=c(3,10,1,1000) x4_01=c("roadkill","roadkill", "tree_density", "animals_on_road") x4_02=c(1,1,12,6) testframe = data.frame(x1_01

List of dict of dict in Pandas

可紊 submitted on 2021-02-10 14:18:19
Question: I have a list of dicts of dicts in the following form: [{0:{'city':'newyork', 'name':'John', 'age':'30'}}, {0:{'city':'newyork', 'name':'John', 'age':'30'}},] I want to create a pandas DataFrame in the following form:
city     name  age
newyork  John  30
newyork  John  30
I tried a lot without any success; can you help me?
Answer 1: Use a list comprehension with concat and DataFrame.from_dict: L = [{0:{'city':'newyork', 'name':'John', 'age':'30'}}, {0:{'city':'newyork', 'name':'John', 'age':'30'}}] df = pd
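The answer in this excerpt is cut off after `df = pd`. The sketch below is one way to combine the pieces it names (a list comprehension, `concat`, and `DataFrame.from_dict`); it is not necessarily the original answer's code.

```python
import pandas as pd

L = [{0: {'city': 'newyork', 'name': 'John', 'age': '30'}},
     {0: {'city': 'newyork', 'name': 'John', 'age': '30'}}]

# Each outer dict maps a row label to a record, so orient='index'
# turns it into a one-row frame. concat then stacks the rows, and
# ignore_index renumbers them 0..n-1.
df = pd.concat(
    [pd.DataFrame.from_dict(d, orient='index') for d in L],
    ignore_index=True,
)

print(df)
```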

Concatenating data frame rows based on column condition

安稳与你 submitted on 2021-02-10 13:58:29
Question: For subsequent discussion, I will refer to the example data frame below: Now, what I wish to achieve is to group all the packet times that are similar, i.e. all the 7s, 12s, etc. Furthermore, the PacketTime field should contain the difference between the min and max ( max(PacketTime) - min(PacketTime) ), and the FrameLen, IPLen and TCPLen fields should be lists of all the values that correspond to the grouped time. For example, for the 7s group, FrameLen would contain c(304, 276, 276). My solution

Difference between consecutive dates in pandas groupby [duplicate]

时光总嘲笑我的痴心妄想 submitted on 2021-02-10 13:22:09
Question: This question already has an answer here: Pandas find duration between dates where a condition is met? (1 answer) Closed 2 years ago. I have a data frame as follows: df_raw_dates = pd.DataFrame({"id": [102, 102, 102, 103, 103, 103, 104], "val": [9,2,4,7,6,3,2], "dates": [pd.Timestamp(2002, 1, 1), pd.Timestamp(2002, 3, 3), pd.Timestamp(2003, 4, 4), pd.Timestamp(2003, 8, 9), pd.Timestamp(2005, 2, 3), pd.Timestamp(2005, 2, 8), pd.Timestamp(2005, 2, 3)]}) id val dates 0 102 9 2002-01-01 1 102 2
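The excerpt ends before the desired output, but the title suggests a per-group difference between consecutive dates. A minimal sketch of that, assuming the gap should be measured within each `id` after sorting by date (the column name `gap` is hypothetical):

```python
import pandas as pd

df_raw_dates = pd.DataFrame({
    "id": [102, 102, 102, 103, 103, 103, 104],
    "val": [9, 2, 4, 7, 6, 3, 2],
    "dates": [pd.Timestamp(2002, 1, 1), pd.Timestamp(2002, 3, 3),
              pd.Timestamp(2003, 4, 4), pd.Timestamp(2003, 8, 9),
              pd.Timestamp(2005, 2, 3), pd.Timestamp(2005, 2, 8),
              pd.Timestamp(2005, 2, 3)],
})

# Sort so that "consecutive" means consecutive in time within each id.
df_raw_dates = df_raw_dates.sort_values(["id", "dates"])

# diff() within each group; the first row of every id has no
# predecessor and therefore gets NaT.
df_raw_dates["gap"] = df_raw_dates.groupby("id")["dates"].diff()

print(df_raw_dates)
```

The result is a timedelta column; `df_raw_dates["gap"].dt.days` converts it to a number of days if that is more convenient.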

How to test string contains elements in list and assign the target element to another column via Pandas

天涯浪子 submitted on 2021-02-10 13:16:11
Question: I have a one-column list of company names. Some of those names contain country names (e.g., "China" in "China A1", "Finland" in "C1 in Finland"). I want to extract the country each company belongs to, based on the company name and a pre-defined list of country names. The original dataframe df looks like this:
   Company name     Country
0  China A1
1  Australia-A2
2  Belgium_C1
3  C1 in Finland
4  D1 of Greece
5  E2 for Pakistan
For now, I can only come up with an inefficient method. Here
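A vectorized alternative to the task in this excerpt might be to build one regular expression from the country list and let `Series.str.extract` pull out the first match. The `countries` list below is a hypothetical stand-in for the pre-defined list mentioned in the question, and the sketch assumes each name contains at most one listed country.

```python
import re
import pandas as pd

df = pd.DataFrame({
    'Company name': ['China A1', 'Australia-A2', 'Belgium_C1',
                     'C1 in Finland', 'D1 of Greece', 'E2 for Pakistan'],
})

# Hypothetical pre-defined list of country names.
countries = ['China', 'Australia', 'Belgium', 'Finland', 'Greece', 'Pakistan']

# One capture group with all countries as alternatives; re.escape
# guards against names containing regex metacharacters.
pattern = '(' + '|'.join(re.escape(c) for c in countries) + ')'

# expand=False returns a Series; rows with no match become NaN.
df['Country'] = df['Company name'].str.extract(pattern, expand=False)

print(df)
```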