dataframe

day of Year values starting from a particular date

与世无争的帅哥 submitted on 2021-02-05 06:40:54
Question: I have a dataframe with a date column. The duration is 365 days, starting from 02/11/2017 and ending at 01/11/2018.

    Date
    02/11/2017
    03/11/2017
    05/11/2017
    .
    .
    01/11/2018

I want to add an adjacent column called Day_Of_Year as follows:

    Date        Day_Of_Year
    02/11/2017  1
    03/11/2017  2
    05/11/2017  4
    .
    .
    01/11/2018  365

I apologize if it's a very basic question, but unfortunately I haven't been able to start with this. I could use datetime(), but that would return values such as 1 for 1st January, 2 for 2nd
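One way to get a day counter that starts at an arbitrary date is to subtract the first date from every date and add 1. The sketch below assumes the dates are day-first strings (dd/mm/yyyy, which is what makes the range 365 days long) and that the earliest date in the column is the intended starting point; the sample rows are just the ones shown in the question.

```python
import pandas as pd

# Sample rows taken from the question; the real column has more dates.
df = pd.DataFrame({"Date": ["02/11/2017", "03/11/2017", "05/11/2017", "01/11/2018"]})

# Assumption: dates are day-first (dd/mm/yyyy).
dates = pd.to_datetime(df["Date"], format="%d/%m/%Y")

# Assumption: the earliest date in the column counts as day 1.
start = dates.min()

# Days elapsed since the start date, shifted so the start date itself is 1.
df["Day_Of_Year"] = (dates - start).dt.days + 1

print(df)
```

Missing dates (such as 04/11/2017 here) simply leave a gap in the counter, which matches the expected output in the question.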

merge pandas dataframes under new index level

馋奶兔 submitted on 2021-02-05 06:40:30
Question: I have 2 pandas DataFrames, act and exp, that I want to combine into a single dataframe df:

    import pandas as pd
    from numpy.random import rand
    act = pd.DataFrame(rand(3,2), columns=['a', 'b'])
    exp = pd.DataFrame(rand(3,2), columns=['a', 'c'])

    act  # have
              a         b
    0  0.853910  0.405463
    1  0.822641  0.255832
    2  0.673718  0.313768

    exp  # have
              a         c
    0  0.464781  0.325553
    1  0.565531  0.269678
    2  0.363693  0.775927

Dataframe df should contain one more column index level than act and exp, and contain each under its own
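The question is cut off, so the sketch below assumes the goal is to place each frame under its own top-level column label ("act" and "exp"). Under that assumption, pd.concat with axis=1 and keys adds the extra column level.

```python
import pandas as pd
from numpy.random import rand

act = pd.DataFrame(rand(3, 2), columns=["a", "b"])
exp = pd.DataFrame(rand(3, 2), columns=["a", "c"])

# keys= adds an outer column level, one label per input frame.
# Assumption: the desired outer labels are "act" and "exp".
df = pd.concat([act, exp], axis=1, keys=["act", "exp"])

print(df.columns.tolist())
print(df["act"])  # selecting the outer label returns the original columns
```

Selecting df["act"] or df["exp"] then gives back the columns of the corresponding input frame.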

combine two data frames and aggregate

☆樱花仙子☆ submitted on 2021-02-05 06:35:07
Question: I have 2 data frames in the format below:

    dt1
    id   col1  col2  col3  col4
    ---  ----  ----  ----  ----
    1    2     3     1     2
    2    3     4     1     1
    3    1     1     1     1
    4    1     2     1     2
    5    1     1     1     1
    6    1     2     1     2

    dt2
    id   col1  col2  col3  col4
    ---  ----  ----  ----  ----
    1    1     3     1     2
    2    3     4     1     0
    4    1     1     1     1
    6    1     2     1     2
    9    2     1     1     1
    12   1     2     1     2

and I want to combine these two data frames and aggregate them by id, so that the resulting dataframe looks like:

    dt3
    id   col1  col2  col3  col4
    ---  ----  ----  ----  ----
    1    3     6     2     4
    2    6     8     2     1
    3    1     1     1     1
    4    2     3     2     3
    5    1     1     1     1
    6    2     4     2     4
    9    2     1     1     1
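The expected dt3 rows shown above are consistent with summing the columns of matching ids and keeping ids that appear in only one frame. Assuming that is the intended aggregation, one possible sketch is to stack the two frames and group by id:

```python
import pandas as pd

cols = ["id", "col1", "col2", "col3", "col4"]

dt1 = pd.DataFrame(
    [[1, 2, 3, 1, 2], [2, 3, 4, 1, 1], [3, 1, 1, 1, 1],
     [4, 1, 2, 1, 2], [5, 1, 1, 1, 1], [6, 1, 2, 1, 2]],
    columns=cols,
)
dt2 = pd.DataFrame(
    [[1, 1, 3, 1, 2], [2, 3, 4, 1, 0], [4, 1, 1, 1, 1],
     [6, 1, 2, 1, 2], [9, 2, 1, 1, 1], [12, 1, 2, 1, 2]],
    columns=cols,
)

# Stack the rows of both frames, then sum every column per id.
# Assumption: "aggregate" means a per-id sum, as the sample output suggests.
dt3 = pd.concat([dt1, dt2]).groupby("id", as_index=False).sum()

print(dt3)
```

Ids present in only one frame (3, 5, 9, 12) pass through unchanged because their group contains a single row.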

Take long list of items and reshape into dataframe “rows” - pandas python 3

故事扮演 submitted on 2021-02-05 06:35:06
Question: I have a long list of items that I want to put in a data frame at set intervals. I have another list with "column names". E.g.

    colnames = ['Title', 'Date', 'Abstract', 'ID', 'Volume']
    data = [a, b, c, d, e, f, g, h, i, j, k, l, m, n, o]

I want to create a data frame that looks like:

       Title  Date  Abstract  ID  Volume
    0  a      b     c         d   e
    1  f      g     h         i   j
    2  k      l     m         n   o

Thanks for any suggestions!

Answer 1: You need DataFrame constructor with
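The answer above is cut off, so the following is not the answerer's code, only one workable sketch: slice the flat list into chunks of len(colnames) and pass the chunks to the DataFrame constructor. The items are assumed to be strings here, since the question leaves a, b, c, ... undefined.

```python
import pandas as pd

colnames = ["Title", "Date", "Abstract", "ID", "Volume"]
# Assumption: placeholder string items standing in for the real data.
data = list("abcdefghijklmno")

n = len(colnames)
# Assumption: len(data) is a multiple of n; otherwise the last row is short
# and pandas pads it with missing values.
rows = [data[i:i + n] for i in range(0, len(data), n)]

df = pd.DataFrame(rows, columns=colnames)
print(df)
```

Plain slicing keeps each item's original Python type, which can matter when the list mixes strings, numbers, and dates.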

Converting to long panel data format with pandas

自古美人都是妖i submitted on 2021-02-05 06:22:06
Question: I have a DataFrame where rows represent time and columns represent individuals. I want to turn it into long panel data format in pandas in an efficient manner, as the DataFrames are rather large. I would like to avoid looping. Here is an example. The following DataFrame:

    id          1    2
    date
    20150520  3.0  4.0
    20150521  5.0  6.0

should be transformed into:

    date      id  value
    20150520   1    3.0
    20150520   2    4.0
    20150521   1    5.0
    20150521   2    6.0

Speed is what's really important to me, due to the data size. I prefer it
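A loop-free way to go from wide to long is DataFrame.stack, which moves the column labels into an inner index level. The sketch below rebuilds the small example from the question and assumes the index is named "date" and the columns are named "id", as the sample layout suggests.

```python
import pandas as pd

wide = pd.DataFrame(
    [[3.0, 4.0], [5.0, 6.0]],
    index=pd.Index([20150520, 20150521], name="date"),
    columns=pd.Index([1, 2], name="id"),
)

# stack() turns the column labels into an inner index level, giving one
# value per (date, id) pair; reset_index() turns both levels into columns.
long = wide.stack().rename("value").reset_index()

print(long)
```

DataFrame.melt is an alternative that produces the same columns, but it orders the result column by column, so a sort would be needed to get this date-major order.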

Renaming columns on slice of dataframe not performing as expected

♀尐吖头ヾ submitted on 2021-02-05 06:16:45
Question: I was trying to clean up the column names in a dataframe, but only for some of the columns. Replacing column names on a slice of the dataframe doesn't work; why is that? Let's say we have the following dataframe (note: copy-able code to reproduce the data is at the bottom):

       Value  ColAfjkj  ColBhuqwa  ColCouiqw
    0      1         a          e          i
    1      2         b          f          j
    2      3         c          g          k
    3      4         d          h          l

I want to clean up the column names (expected output):

       Value  ColA  ColB  ColC
    0      1     a     e     i
    1      2     b     f     j
    2      3     c     g     k
    3      4     d     h     l

Approach 1: I
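The attempted approaches are cut off above, so this sketch does not reproduce them. In general, a slice such as df.iloc[:, 1:] is a separate DataFrame object, so assigning to its .columns changes only that temporary object and leaves df untouched. One way around this is to build a mapping for just the columns to clean and pass it to rename on the full frame. The rule "keep the first four characters" is an assumption inferred from the expected output.

```python
import pandas as pd

df = pd.DataFrame({
    "Value": [1, 2, 3, 4],
    "ColAfjkj": list("abcd"),
    "ColBhuqwa": list("efgh"),
    "ColCouiqw": list("ijkl"),
})

# Assumption: the clean name is the first four characters (ColA, ColB, ColC).
# Only the columns after "Value" are included in the mapping.
mapping = {old: old[:4] for old in df.columns[1:]}

# rename() on the whole frame touches only the columns named in the mapping.
df = df.rename(columns=mapping)

print(df.columns.tolist())
```

Columns not listed in the mapping, such as Value, keep their names.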

Numpy Where with more than 2 conditions

可紊 submitted on 2021-02-05 05:52:07
Question: Good morning, I have a dataframe with two columns of integers and a Series (diff) computed as:

    diff = (df["col_1"] - df["col_2"]) / (df["col_2"])

I would like to create a column of the dataframe whose values are:

    equal to 0, if (diff >= 0) & (diff <= 0.35)
    equal to 1, if (diff > 0.35)
    equal to 2, if (diff < 0) & (diff >= - 0.35)
    equal to 3, if (diff < - 0.35)

I tried with:

    df["Class"] = np.where( (diff >= 0) & (diff <= 0.35), 0, np.where( (diff > 0.35), 1, np.where( (diff < 0) &
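When there are more than two branches, np.select is often easier to read than nested np.where calls: it takes a list of conditions and a matching list of values. The sketch below uses made-up sample data (the question does not show any) and an assumed default of -1 for rows that match no condition, for example where diff is NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data chosen to hit every class.
df = pd.DataFrame({"col_1": [10, 15, 8, 5, 10], "col_2": [10, 10, 10, 10, 8]})

diff = (df["col_1"] - df["col_2"]) / df["col_2"]  # 0.0, 0.5, -0.2, -0.5, 0.25

conditions = [
    (diff >= 0) & (diff <= 0.35),
    diff > 0.35,
    (diff < 0) & (diff >= -0.35),
    diff < -0.35,
]
choices = [0, 1, 2, 3]

# The first matching condition wins; -1 marks rows that match none
# (an assumption, e.g. NaN produced by a division by zero).
df["Class"] = np.select(conditions, choices, default=-1)

print(df)
```

The four conditions here are mutually exclusive, so their order does not affect the result.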

Fill column of a dataframe from another dataframe

非 Y 不嫁゛ submitted on 2021-02-05 04:54:59
Question: I'm trying to fill a column of a dataframe from another dataframe based on conditions. Let's say my first dataframe is named df1 and the second is named df2. df1 is described as below:

    Col1  Col2
    A     1
    B     2
    C     3
    A     1

And df2 is described as below:

    Col1  Col2
    A     NaN
    B     NaN
    D     NaN

Each distinct value of Col1 has its own id number (in Col2), so what I want is
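The question is cut off before the desired result is stated. The sketch below assumes the goal is to fill df2's Col2 with the id that df1 associates with the same Col1 value, leaving NaN where df1 has no match (D in this example).

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({"Col1": ["A", "B", "C", "A"], "Col2": [1, 2, 3, 1]})
df2 = pd.DataFrame({"Col1": ["A", "B", "D"], "Col2": [np.nan, np.nan, np.nan]})

# Build a Col1 -> Col2 lookup; duplicates are dropped first because each
# distinct Col1 value is stated to have a single id.
lookup = df1.drop_duplicates("Col1").set_index("Col1")["Col2"]

# map() looks up each Col1 value of df2; values missing from df1 stay NaN.
df2["Col2"] = df2["Col1"].map(lookup)

print(df2)
```

The filled column is float because it still contains NaN for the unmatched value.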