pandas

pandas shift time series with missing values

筅森魡賤 submitted on 2021-02-08 02:10:41
Question: I have a time series with some missing entries that looks like this:

date  value
-----------
2000  5
2001  10
2003  8
2004  72
2005  12
2007  13

I would like to create a column for the "previous_value", but I only want it to show values for consecutive years. So I want it to look like this:

date  value  previous_value
---------------------------
2000  5      nan
2001  10     5
2003  8      nan
2004  72     8
2005  12     72
2007  13     nan

However just applying pandas shift function directly to the column 'value' would
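The question text is cut off here. As a minimal sketch of one way to get the expected output (not necessarily the approach taken in the original thread), the shifted column can be masked wherever the gap to the previous row is not exactly one year. The sketch assumes `date` holds integer years and the rows are already sorted by `date`.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "date": [2000, 2001, 2003, 2004, 2005, 2007],
    "value": [5, 10, 8, 72, 12, 13],
})

# True only where the previous row is exactly one year earlier.
is_consecutive = df["date"].diff().eq(1)

# Shift by one row, then keep the shifted value only for consecutive years;
# every other position becomes NaN.
df["previous_value"] = df["value"].shift().where(is_consecutive)

print(df)
```

Because the column contains NaN, pandas stores `previous_value` as floats (5.0, 8.0, 72.0).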

Flatten of dict of lists into a dataframe

橙三吉。 submitted on 2021-02-08 01:50:50
Question: I have a dict of lists, say:

    data = {'a': [80, 130], 'b': [64], 'c': [58,80]}

How do I flatten it and convert it into a dataframe like the one below:

Answer 1: Use a nested list comprehension with if-else if you do not want to number the keys of one-element lists:

    df = pd.DataFrame([('{}{}'.format(k, i), v1) if len(v) > 1 else (k, v1)
                       for k, v in data.items()
                       for i, v1 in enumerate(v, 1)],
                      columns=['Index','Data'])
    print (df)

      Index  Data
    0    a1    80
    1    a2   130
    2     b    64
    3    c1    58
    4    c2    80

EDIT: data = {'a': [80, 130], 'b': np.nan, 'c':

Create new column based on condition on other categorical column

我只是一个虾纸丫 submitted on 2021-02-07 23:57:12
Question: I have a dataframe as shown below:

Category  Value
A         10
B         22
A         2
C         30
B         23
B         4
C         8
C         24
A         9

I need to create a Flag column based on the following conditions:

If the value of Category A is greater than or equal to 5, then Flag=1, else 0
If the value of Category B is greater than or equal to 20, then Flag=1, else 0
If the value of Category C is greater than or equal to 25, then Flag=1, else 0

Expected output as shown below:

Category  Value  Flag
A         10     1
B         22     1
A         2      0
C         30     1
B         23     1
B         4      0
C         8      0
C         24     0
A         9      1

I tried
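The asker's attempt is cut off here. One way to express the three conditions, offered as a sketch rather than the thread's accepted answer, is to map each category to its threshold and compare row by row. The `thresholds` dictionary is a name introduced for this illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Category": ["A", "B", "A", "C", "B", "B", "C", "C", "A"],
    "Value": [10, 22, 2, 30, 23, 4, 8, 24, 9],
})

# Per-category cut-offs from the question (the dict name is hypothetical).
thresholds = {"A": 5, "B": 20, "C": 25}

# Look up each row's threshold, compare, and turn True/False into 1/0.
df["Flag"] = (df["Value"] >= df["Category"].map(thresholds)).astype(int)

print(df)
```

A category missing from `thresholds` maps to NaN, so the comparison is False and the flag is 0. Adding a category only requires a new dictionary entry.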

How to split csv file into respective csv files based on date in first column (python)?

你说的曾经没有我的故事 submitted on 2021-02-07 22:54:11
Question: I have a large CSV with multiple years of electricity load data, and I would like to split it into multiple files on a month and year basis, i.e. to return individual CSVs for Jan, Feb, Mar etc. for 2013, 2014, 2015 etc. I have reviewed a lot of the solutions in the forums and have not had any luck. My current file is structured as follows:

01-Jan-11,1,34606,34677,35648,35685,31058,484,1730
01-Jan-11,2,35092,35142,36089,36142,31460,520,1730
01-Jan-11,3,34725,34761,36256,36234,31109,520,1730
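A minimal sketch of one possible approach is to parse the first column as a date, group the rows by calendar month, and write each group to its own file. The sketch assumes the file has no header row, that dates follow the `01-Jan-11` pattern with English month abbreviations, and that the whole file fits in memory. The in-memory sample, its third row (a made-up February row so that two files are produced), the temporary output directory, and the `YYYY-MM.csv` naming are all illustrative choices.

```python
import io
import tempfile
from pathlib import Path

import pandas as pd

# Stand-in for the real file: two rows from the question plus one
# hypothetical February row so the split is visible.
sample = io.StringIO(
    "01-Jan-11,1,34606,34677,35648,35685,31058,484,1730\n"
    "01-Jan-11,2,35092,35142,36089,36142,31460,520,1730\n"
    "01-Feb-11,1,34725,34761,36256,36234,31109,520,1730\n"
)

df = pd.read_csv(sample, header=None)

# Parse column 0 only to build the grouping key; the original date
# strings stay untouched in the output files.
dates = pd.to_datetime(df[0], format="%d-%b-%y")

out_dir = Path(tempfile.mkdtemp())  # hypothetical output location
written = []
for period, chunk in df.groupby(dates.dt.to_period("M")):
    path = out_dir / f"{period}.csv"  # e.g. 2011-01.csv
    chunk.to_csv(path, header=False, index=False)
    written.append(path.name)

print(written)
```

For a file too large to load at once, the same grouping could be applied per chunk via the `chunksize` argument of `pd.read_csv`, appending to the monthly files instead of overwriting them.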

Pandas read_excel percentages as strings

a 夏天 submitted on 2021-02-07 22:47:16
Question: My Excel sheet has a column of percentages stored with the percent symbol (e.g. "50%"). How can I coerce pandas.read_excel to read the string "50%" instead of casting it to a float? Currently the read_excel implementation parses the percentage into the float 0.5. Additionally, if I add a converters = {col_with_percentage: str} argument, it parses it into the string '0.5'. Is there a way to read the raw percentage value ("50%")?

Answer 1: You can pass your own function with the converters. Something to
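The answer is cut off here, so the following is only a sketch of the idea it names (a custom function passed through `converters`), not the answerer's code. It assumes the converter receives the cell as the float 0.5, as the question reports, and formats it back into a percent string. The function name, the file name, and the column name in the commented call are hypothetical.

```python
import math


def to_percent_string(value):
    """Format a numeric cell such as 0.5 as '50%'; pass anything else through."""
    is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
    if is_number and not math.isnan(value):
        # ':g' drops trailing zeros, so 0.5 -> '50%' and 0.125 -> '12.5%'.
        return f"{value * 100:g}%"
    return value


print(to_percent_string(0.5))    # 50%
print(to_percent_string(0.125))  # 12.5%

# Hypothetical usage with read_excel (file and column names are placeholders):
# df = pd.read_excel("book.xlsx", converters={"col_with_percentage": to_percent_string})
```

This rebuilds the string from the stored number rather than reading the cell's display text, so the result follows the formatting rule above and not necessarily the number format set in Excel.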

How to read .odt using python? [closed]

我的未来我决定 submitted on 2021-02-07 22:43:14
Question (closed: needs details or clarity; not currently accepting answers): I tried to read an .odt file using Python with the "odfpy" library, but it still doesn't work. Could you suggest how to read an .odt file using Python, or give me some simple source code? Thank you.

Answer 1: First install the odfpy library, then:

    In [21]: from odf import text, teletype
        ...: from odf

Combine multiple dictionaries into one pandas dataframe in long format

筅森魡賤 submitted on 2021-02-07 21:48:18
Question: I have several dictionaries set up as follows:

    Dict1 = {'Orange': ['1', '2', '3', '4']}
    Dict2 = {'Red': ['3', '4', '5']}

And I'd like the output to be one combined dataframe:

| Type   | Value |
|--------|-------|
| Orange | 1     |
| Orange | 2     |
| Orange | 3     |
| Orange | 4     |
| Red    | 3     |
| Red    | 4     |
| Red    | 5     |

I tried splitting everything out, but I only get Dict2 in this dataframe.

    mydicts = [Dict1, Dict2]
    for x in mydicts:
        for k, v in x.items():
            df = pd.DataFrame(v)
            df['Type'] = k

Answer 1: One option is using
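The answer is cut off here. The asker's loop keeps only Dict2 because `df` is reassigned on every iteration, so each pass discards the previous frame. One sketch that avoids this, not necessarily the option the truncated answer goes on to describe, collects every (key, value) pair first and builds the frame once.

```python
import pandas as pd

# Dictionaries copied from the question.
Dict1 = {'Orange': ['1', '2', '3', '4']}
Dict2 = {'Red': ['3', '4', '5']}
mydicts = [Dict1, Dict2]

# One (Type, Value) tuple per list element, across all dictionaries.
rows = [
    (key, value)
    for d in mydicts
    for key, values in d.items()
    for value in values
]

# Build the long-format frame in a single step.
df = pd.DataFrame(rows, columns=["Type", "Value"])

print(df)
```

The values stay strings, as in the question. Converting them with `df["Value"].astype(int)` would be an extra step if numbers are needed.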

Making a regression line through a bar chart using pandas or seaborn

馋奶兔 submitted on 2021-02-07 21:02:55
Question: I am new to pandas and seaborn and trying to learn. I am trying to add a trend line and a bar plot on the same graph. I have some data that looks like:

Year  Sample Size
2000  500
2001  3000
2003  10000
2004  20000
2004  23000

I am attempting to draw a line through the bar plot showing a decreasing or an increasing trend, but I am struggling to do it on the same graph. So far I have a bar plot. Below you can find the code. sampleSizes['Sample Size'] -> is the column I
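The asker's code is cut off here. As a rough sketch under stated assumptions, the bars and a fitted line can share one matplotlib Axes if both are drawn against the same x positions. The example fits a straight line to the bar positions (0, 1, 2, ...) rather than to the year values, uses plain matplotlib and `numpy.polyfit` instead of seaborn, and reuses the `sampleSizes` name from the question for a frame rebuilt from the sample data. All of these are illustrative choices.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Sample data copied from the question (2004 appears twice there).
sampleSizes = pd.DataFrame({
    "Year": [2000, 2001, 2003, 2004, 2004],
    "Sample Size": [500, 3000, 10000, 20000, 23000],
})

# Bars sit at positions 0..n-1, so the trend line uses the same x values.
positions = np.arange(len(sampleSizes))

# Least-squares straight line through the bar heights.
slope, intercept = np.polyfit(positions, sampleSizes["Sample Size"], deg=1)

fig, ax = plt.subplots()
ax.bar(positions, sampleSizes["Sample Size"], label="Sample Size")
ax.plot(positions, slope * positions + intercept, color="red", label="Linear trend")

# Label each bar with its year.
ax.set_xticks(positions)
ax.set_xticklabels(sampleSizes["Year"])
ax.legend()
```

A positive `slope` indicates an increasing trend and a negative one a decreasing trend. Calling `fig.savefig(...)` or `plt.show()` afterwards renders the combined chart.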