dataframe | 易学教程

day of Year values starting from a particular date

Question: I have a dataframe with a date column. The duration is 365 days, starting on 02/11/2017 and ending on 01/11/2018.

    Date
    02/11/2017
    03/11/2017
    05/11/2017
    .
    .
    01/11/2018

I want to add an adjacent column called Day_Of_Year as follows:

    Date        Day_Of_Year
    02/11/2017  1
    03/11/2017  2
    05/11/2017  4
    .
    .
    01/11/2018  365

I apologize if it's a very basic question, but unfortunately I haven't been able to get started on this. I could use datetime(), but that would return values such as 1 for 1st January, 2 for 2nd
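One way to approach this, as a minimal sketch: subtract the first date from every date and add 1, so the count starts at the start date rather than at 1 January. The sketch assumes the dates are day-first strings (dd/mm/yyyy) and that the earliest date in the column is the intended starting point. The four sample rows are the ones visible in the question.

```python
import pandas as pd

# Sample rows taken from the question; the dates are assumed to be day-first (dd/mm/yyyy).
df = pd.DataFrame({"Date": ["02/11/2017", "03/11/2017", "05/11/2017", "01/11/2018"]})

# Parse explicitly so 02/11/2017 is read as 2 November, not 11 February.
dates = pd.to_datetime(df["Date"], format="%d/%m/%Y")

# Whole days elapsed since the earliest date, shifted so that date is day 1.
df["Day_Of_Year"] = (dates - dates.min()).dt.days + 1

print(df)
```

Because the value comes from a date difference, gaps in the column (such as the missing 04/11/2017) are skipped in the numbering as well, which matches the expected output in the question.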

merge pandas dataframes under new index level

Question: I have 2 pandas DataFrames, act and exp, that I want to combine into a single dataframe df:

    import pandas as pd
    from numpy.random import rand

    act = pd.DataFrame(rand(3,2), columns=['a', 'b'])
    exp = pd.DataFrame(rand(3,2), columns=['a', 'c'])

    act  # have
              a         b
    0  0.853910  0.405463
    1  0.822641  0.255832
    2  0.673718  0.313768

    exp  # have
              a         c
    0  0.464781  0.325553
    1  0.565531  0.269678
    2  0.363693  0.775927

Dataframe df should contain one more column index level than act and exp, and contain each under its own
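The question is cut off, so the following is only a sketch under the assumption that each input frame should sit under its own top-level column label. pd.concat with axis=1 and keys= adds such an outer level; the labels "act" and "exp" are chosen here for illustration.

```python
import pandas as pd
from numpy.random import rand

act = pd.DataFrame(rand(3, 2), columns=["a", "b"])
exp = pd.DataFrame(rand(3, 2), columns=["a", "c"])

# keys= adds an outer column level with one label per input frame.
df = pd.concat([act, exp], axis=1, keys=["act", "exp"])

print(df)
print(df["act"])  # selecting the outer label returns the original columns a and b
```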

combine two data frames and aggregate

Question: I have 2 data frames in the format below:

    dt1
    id  col1  col2  col3  col4
     1     2     3     1     2
     2     3     4     1     1
     3     1     1     1     1
     4     1     2     1     2
     5     1     1     1     1
     6     1     2     1     2

    dt2
    id  col1  col2  col3  col4
     1     1     3     1     2
     2     3     4     1     0
     4     1     1     1     1
     6     1     2     1     2
     9     2     1     1     1
    12     1     2     1     2

and I want to aggregate and combine these two data frames by id, so that the resulting dataframe looks like:

    dt3
    id  col1  col2  col3  col4
     1     3     6     2     4
     2     6     8     2     1
     3     1     1     1     1
     4     2     3     2     3
     5     1     1     1     1
     6     2     4     2     4
     9     2     1     1     1
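The expected dt3 is consistent with summing the rows that share an id, so one possible sketch is to stack both frames and group by id. This assumes the aggregation is a plain sum, which the sample values suggest but the excerpt does not state outright.

```python
import pandas as pd

cols = ["id", "col1", "col2", "col3", "col4"]
dt1 = pd.DataFrame(
    [[1, 2, 3, 1, 2], [2, 3, 4, 1, 1], [3, 1, 1, 1, 1],
     [4, 1, 2, 1, 2], [5, 1, 1, 1, 1], [6, 1, 2, 1, 2]],
    columns=cols,
)
dt2 = pd.DataFrame(
    [[1, 1, 3, 1, 2], [2, 3, 4, 1, 0], [4, 1, 1, 1, 1],
     [6, 1, 2, 1, 2], [9, 2, 1, 1, 1], [12, 1, 2, 1, 2]],
    columns=cols,
)

# Stack the rows of both frames, then sum every column within each id.
dt3 = pd.concat([dt1, dt2]).groupby("id", as_index=False).sum()

print(dt3)
```

Ids that appear in only one frame (3, 5, 9, 12) pass through unchanged, since their group contains a single row.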

Take long list of items and reshape into dataframe “rows” - pandas python 3

Question: I have a long list of items that I want to put in a data frame at set intervals. I have another list with "column names". E.g.

    colnames = ['Title', 'Date', 'Abstract', 'ID', 'Volume']
    data = [a, b, c, d, e, f, g, h, i, j, k, l, m, n, o]

I want to create a data frame that looks like:

       Title  Date  Abstract  ID  Volume
    0  a      b     c         d   e
    1  f      g     h         i   j
    2  k      l     m         n   o

Thanks for any suggestions!

Answer 1: You need DataFrame constructor with
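The answer is cut off in the excerpt, so the following is not necessarily what it went on to suggest. As a minimal sketch, the flat list can be cut into chunks of len(colnames) items and passed to the DataFrame constructor. The single-letter strings stand in for the undefined names a, b, c, ... in the question.

```python
import pandas as pd

colnames = ["Title", "Date", "Abstract", "ID", "Volume"]
data = list("abcdefghijklmno")  # placeholder values standing in for a, b, c, ...

# Slice the flat list into consecutive rows of len(colnames) items each.
width = len(colnames)
rows = [data[i:i + width] for i in range(0, len(data), width)]

df = pd.DataFrame(rows, columns=colnames)
print(df)
```

If the list length is not a multiple of the number of columns, the last row is shorter and pandas pads it with missing values.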

Converting to long panel data format with pandas

Question: I have a DataFrame where rows represent time and columns represent individuals. I want to turn it into long panel data format in pandas in an efficient manner, as the DataFrames are rather large. I would like to avoid looping. Here is an example. The following DataFrame:

    id          1    2
    date
    20150520  3.0  4.0
    20150521  5.0  6.0

should be transformed into:

    date      id  value
    20150520  1   3.0
    20150520  2   4.0
    20150521  1   5.0
    20150521  2   6.0

Speed is what's really important to me, due to the data size. I prefer it
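One loop-free option, sketched here with DataFrame.melt: move the date index into a column, then unpivot the remaining columns into id/value pairs. The frame name wide and the final sort are choices made for this illustration, and no claim is made about how it performs on the asker's data.

```python
import pandas as pd

# Wide frame from the question: one row per date, one column per individual.
wide = pd.DataFrame(
    [[3.0, 4.0], [5.0, 6.0]],
    index=pd.Index([20150520, 20150521], name="date"),
    columns=pd.Index([1, 2], name="id"),
)

# Unpivot: every (date, id) cell becomes its own row.
long = (
    wide.reset_index()
        .melt(id_vars="date", var_name="id", value_name="value")
        .sort_values(["date", "id"], ignore_index=True)
)

print(long)
```

wide.stack() reaches the same shape through a MultiIndexed Series and may be worth timing against melt on real data.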

Renaming columns on slice of dataframe not performing as expected

Question: I was trying to clean up column names in a dataframe, but only for some of the columns. Replacing column names on a slice of the dataframe doesn't work. Why is that? Let's say we have the following dataframe (note: copyable code to reproduce the data is at the bottom):

       Value ColAfjkj ColBhuqwa ColCouiqw
    0      1        a         e         i
    1      2        b         f         j
    2      3        c         g         k
    3      4        d         h         l

I want to clean up the column names (expected output):

       Value ColA ColB ColC
    0      1    a    e    i
    1      2    b    f    j
    2      3    c    g    k
    3      4    d    h    l

Approach 1: I
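The asker's approaches are cut off in the excerpt, so the cause can only be guessed at. A likely one is that selecting a slice such as df.iloc[:, 1:] produces a new object, so renaming its columns leaves df itself untouched. A sketch that sidesteps this renames on the full frame and decides per column whether to change the name. The rule used here (keep the first four characters of names starting with "Col") is an assumption read off the expected output.

```python
import pandas as pd

df = pd.DataFrame({
    "Value": [1, 2, 3, 4],
    "ColAfjkj": list("abcd"),
    "ColBhuqwa": list("efgh"),
    "ColCouiqw": list("ijkl"),
})

# Rename on the whole frame; only names starting with "Col" are shortened.
df = df.rename(columns=lambda name: name[:4] if name.startswith("Col") else name)

print(df.columns.tolist())
```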

Numpy Where with more than 2 conditions

Question: Good morning, I have a dataframe with two columns of integers and a Series (diff) computed as:

    diff = (df["col_1"] - df["col_2"]) / (df["col_2"])

I would like to create a column of the dataframe whose values are:

    equal to 0, if (diff >= 0) & (diff <= 0.35)
    equal to 1, if (diff > 0.35)
    equal to 2, if (diff < 0) & (diff >= - 0.35)
    equal to 3, if (diff < - 0.35)

I tried with:

    df["Class"] = np.where( (diff >= 0) & (diff <= 0.35), 0,
                  np.where( (diff > 0.35), 1,
                  np.where( (diff < 0) &
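Nested np.where calls become hard to read beyond two conditions. As an alternative sketch, np.select takes a list of conditions and a matching list of values. The sample numbers are made up for illustration, and the default of -1 (used when no condition holds, for example when diff is NaN) is an assumption, not part of the question.

```python
import numpy as np
import pandas as pd

# Illustrative data; the resulting diff values are 0.0, 0.5, -0.1, -0.5, 0.2.
df = pd.DataFrame({"col_1": [10, 15, 9, 5, 12], "col_2": [10, 10, 10, 10, 10]})
diff = (df["col_1"] - df["col_2"]) / df["col_2"]

# One condition per class, in the same order as the class labels below.
conditions = [
    (diff >= 0) & (diff <= 0.35),
    diff > 0.35,
    (diff < 0) & (diff >= -0.35),
    diff < -0.35,
]
classes = [0, 1, 2, 3]

# Rows matching none of the conditions get the default value.
df["Class"] = np.select(conditions, classes, default=-1)

print(df)
```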

Fill column of a dataframe from another dataframe

Question: I'm trying to fill a column of a dataframe from another dataframe based on conditions. Let's say my first dataframe is named df1 and the second df2. df1 is described as below:

    +------+------+
    | Col1 | Col2 |
    +------+------+
    | A    | 1    |
    | B    | 2    |
    | C    | 3    |
    | A    | 1    |
    +------+------+

And df2 is described as below:

    +------+------+
    | Col1 | Col2 |
    +------+------+
    | A    | NaN  |
    | B    | NaN  |
    | D    | NaN  |
    +------+------+

Each distinct value of Col1 has its own id number (in Col2), so what I want is
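The question breaks off before stating the goal, so this sketch assumes the aim is to fill df2's Col2 with the id that df1 assigns to the same Col1 value. What should happen to values absent from df1 (such as D) is not visible in the excerpt; here they simply stay NaN.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({"Col1": ["A", "B", "C", "A"], "Col2": [1, 2, 3, 1]})
df2 = pd.DataFrame({"Col1": ["A", "B", "D"], "Col2": [np.nan, np.nan, np.nan]})

# Build a Col1 -> Col2 lookup; duplicates are dropped so the index is unique.
lookup = df1.drop_duplicates("Col1").set_index("Col1")["Col2"]

# Map each Col1 value in df2 to its id; unmatched values become NaN.
df2["Col2"] = df2["Col1"].map(lookup)

print(df2)
```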