pandas

Read in multiple csv into separate dataframes in Pandas

拜拜、爱过 submitted on 2021-02-11 13:54:58
Question: I have a long list of CSV files that I want to read as dataframes, naming each dataframe after its file. For example, I want to read the file status.csv and assign its dataframe the name status. Is there an efficient way to do this with pandas? One approach I found still requires writing the name of each CSV in my loop, which I want to avoid. Another reads multiple CSVs into one dataframe instead of many. Answer 1: You can list all CSV files under a directory using os.listdir
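The os.listdir suggestion above can be sketched as follows. This is a minimal illustration, not the answer's own code: the helper name read_csvs_by_name and the demo files are assumptions, and the dataframes are kept in a dictionary keyed by file name rather than bound to separate variables.

```python
import os
import tempfile

import pandas as pd


def read_csvs_by_name(directory):
    """Return {file stem: DataFrame} for every .csv file in `directory`.

    Hypothetical helper for illustration; not part of pandas.
    """
    frames = {}
    for filename in sorted(os.listdir(directory)):
        stem, ext = os.path.splitext(filename)
        if ext.lower() == ".csv":
            frames[stem] = pd.read_csv(os.path.join(directory, filename))
    return frames


# Demo on a throwaway directory holding two small, made-up CSV files.
with tempfile.TemporaryDirectory() as tmp:
    pd.DataFrame({"id": [1, 2]}).to_csv(os.path.join(tmp, "status.csv"), index=False)
    pd.DataFrame({"id": [3]}).to_csv(os.path.join(tmp, "orders.csv"), index=False)
    frames = read_csvs_by_name(tmp)

# The dataframe read from status.csv is available under the key "status".
status = frames["status"]
```

A dictionary lookup such as frames["status"] is usually easier to maintain than creating one variable per file, since the set of names is only known at run time.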

Pandas - Date ranges that don't overlap

十年热恋 submitted on 2021-02-11 13:50:54
Question: I'm getting lost trying to find an easy way to determine when date ranges from two dataframes do not overlap. I have two dataframes:

    df1 = pd.DataFrame({
        'START': ['2019-01-01 09:00:00', '2019-01-01 18:00:00'],
        'END': ['2019-01-01 16:00:00', '2019-01-02 02:00:00']})
    df2 = pd.DataFrame({
        'START': ['2019-01-01 08:00:00', '2019-01-01 14:00:00', '2019-01-01 22:00:00', '2019-01-02 01:00:00'],
        'END': ['2019-01-01 11:00:00', '2019-01-01 15:00:00', '2019-01-01 23:00:00', '2019-01-02 04:00:00']})
    df1

replacing the value in new data frame python

只愿长相守 submitted on 2021-02-11 13:48:31
Question: As you can see in the image, we have three columns. I need to write code that creates a new column called aa and fills it by replacing each value: wherever it sees A, replace it with 1; wherever it sees B, replace it with 2; and so on. Thank you. Answer 1: IIUC, use: df['aa'] = df['categories'].map(df.drop_duplicates('categories').set_index('categories')['WOF']) Source: https://stackoverflow.com/questions/62656932/replacing-the-value-in-new-data-frame-python
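To see what the map expression in the answer does, here is a small sketch. The image from the question is not available, so the rows below are invented for illustration; only the column names categories and WOF come from the answer.

```python
import pandas as pd

# Made-up data: the original image is missing, so these values are assumptions.
df = pd.DataFrame({
    "categories": ["A", "B", "A", "C", "B"],
    "WOF": [1, 2, 9, 3, 8],
})

# Keep the first row seen for each category and use its WOF value as the code.
lookup = df.drop_duplicates("categories").set_index("categories")["WOF"]

# Map every category to the code taken from its first occurrence.
df["aa"] = df["categories"].map(lookup)
print(df)
```

Because drop_duplicates keeps the first occurrence by default, every A receives the WOF value from the first A row, even where later rows carry a different WOF.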

How to find highest combination in dataframe [duplicate]

偶尔善良 submitted on 2021-02-11 13:47:07
Question: This question already has answers here: Get the row(s) which have the max value in groups using groupby (11 answers). Closed 9 months ago. I have a dataframe that has repeating values in two columns, and I only want to keep the highest value of each combination. For the following dataframe:

    df = pd.DataFrame(
        np.array([['A', 'B', 3], ['A', 'B', 6], ['C', 'D', 9], ['C', 'D', 2], ['C', 'B', 4]]))
    df

how would I get this dataframe as a result:

    |A|B|6|
    |C|D|9|
    |C|B|4|

Answer 1: Use groupby and
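One way to get the result the question asks for is groupby followed by max, in line with the linked duplicate. The answer text above is cut off, so this is a sketch rather than the answer's exact code, and the column names first, second, and value are assumptions. The frame is built from a plain list instead of np.array, because a NumPy array mixing strings and integers stores everything as strings, which would make max compare text.

```python
import pandas as pd

# Same rows as in the question; the column names are invented for illustration.
df = pd.DataFrame(
    [["A", "B", 3], ["A", "B", 6], ["C", "D", 9], ["C", "D", 2], ["C", "B", 4]],
    columns=["first", "second", "value"],
)

# Keep the largest value for each (first, second) combination.
# sort=False preserves the order in which combinations first appear.
result = df.groupby(["first", "second"], as_index=False, sort=False)["value"].max()
print(result)
```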

comparing values in two pandas dataframes to keep a running count

*爱你&永不变心* submitted on 2021-02-11 13:41:00
Question: My apologies for the length of this, but I want to explain as fully as possible. I am completely stumped on how to solve this. The setup: I have two dataframes. The first has a list of all possible values in its first column, with no duplicate values in that column. Let's call it df_01. These are all the common possible values in each list. All additional columns represent independent lists. Each contains a number that represents how many days any given value of all possible values has

Python pandas Convert a Column to Column Header

时光怂恿深爱的人放手 submitted on 2021-02-11 13:30:43
Question: I have a list of dicts containing x and y. I want to make x the index and y the column headers. How can I do it? Thanks.

    import pandas
    pt1 = {"x": 0, "y": 1, "val": 3,}
    pt2 = {"x": 0, "y": 2, "val": 6,}
    lst = [pt1, pt2]
    print(lst)
    # [{'x': 0, 'y': 1, 'val': 3}, {'x': 0, 'y': 2, 'val': 6}]
    df = pandas.DataFrame(lst)
    print(df)
    #    val  x  y
    # 0    3  0  1
    # 1    6  0  2

How can I convert df to this format? Thanks.

    #    1  2
    # 0  3  6

Answer 1: You can use df.pivot:

    pt1 = {"x": 0, "y": 1, "val": 3,}
    pt2 = {"x": 0, "y":
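Since the answer's code is cut off above, here is a minimal sketch of the df.pivot approach on the question's own data. The variable name wide is an assumption; the keyword arguments index, columns, and values are the standard parameters of DataFrame.pivot.

```python
import pandas as pd

pt1 = {"x": 0, "y": 1, "val": 3}
pt2 = {"x": 0, "y": 2, "val": 6}
df = pd.DataFrame([pt1, pt2])

# x becomes the row index, each distinct y becomes a column, val fills the cells.
wide = df.pivot(index="x", columns="y", values="val")
print(wide)
```

Note that pivot raises an error if the same (x, y) pair appears more than once; pivot_table with an aggregation function is the usual alternative in that case.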

Find all shortest Euclidean distances between two groups of point coordinates

左心房为你撑大大i submitted on 2021-02-11 13:15:05
Question: I have a pandas DataFrame where columns X1, Y1 hold the point coordinates for the first group and columns X2, Y2 hold the point coordinates for the second group. The two groups are independent of each other; they just happen to be in the same dataframe. Example:

    X1,Y1,X2,Y2
    41246.438,0.49,38791.673,0.49
    41304.5,0.491,38921.557,0.491
    41392.062,0.492,39037.135,0.492
    41515.5,0.493,39199.972,0.493
    41636.062,0.494,39346.561,0.494
    41795.188,0.495,39477.63,0.495
    42027
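The question is truncated and no answer is shown, so the following is only one possible reading: for each point in the first group, find the distance to the nearest point in the second group. It is a sketch using NumPy broadcasting on the first three example rows; the names pts1, pts2, dist, and nearest are assumptions.

```python
import numpy as np
import pandas as pd

# First three rows of the example data from the question.
df = pd.DataFrame({
    "X1": [41246.438, 41304.5, 41392.062],
    "Y1": [0.49, 0.491, 0.492],
    "X2": [38791.673, 38921.557, 39037.135],
    "Y2": [0.49, 0.491, 0.492],
})

pts1 = df[["X1", "Y1"]].to_numpy()  # shape (n, 2)
pts2 = df[["X2", "Y2"]].to_numpy()  # shape (m, 2)

# Pairwise differences via broadcasting: shape (n, m, 2).
diff = pts1[:, None, :] - pts2[None, :, :]

# dist[i, j] is the Euclidean distance from point i of group 1 to point j of group 2.
dist = np.sqrt((diff ** 2).sum(axis=-1))

# For each point in group 1: the shortest distance and the index of its nearest neighbour.
nearest = dist.min(axis=1)
nearest_idx = dist.argmin(axis=1)
```

The full distance matrix needs n × m entries, which is fine for small inputs; for large groups a spatial index would be a more memory-friendly choice.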

How to reconstruct a conversation from Watson Speech-to-Text output?

那年仲夏 submitted on 2021-02-11 12:53:28
Question: I have the JSON output from Watson's Speech-to-Text service, which I have converted into a list and then into a pandas dataframe. I'm trying to work out how to reconstruct the conversation (with timings), akin to the following:

    Speaker 0: Said this [00.01 - 00.12]
    Speaker 1: Said that [00.12 - 00.22]
    Speaker 0: Said something else [00.22 - 00.56]

My dataframe has a row for each word, and columns for the word, its start/end time, and the speaker tag (either 0 or 1).

    words = [['said', 0.01, 0.06
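One possible way to rebuild the turns from a per-word dataframe is to start a new group whenever the speaker tag changes and then aggregate each group. This is a sketch under assumptions: the column names word, start, end, and speaker are invented, and every row after the first is made-up sample data, since the question's list is cut off.

```python
import pandas as pd

# Only the first row comes from the question; the rest is illustrative data.
words = pd.DataFrame(
    [
        ["said", 0.01, 0.06, 0],
        ["this", 0.06, 0.12, 0],
        ["said", 0.12, 0.17, 1],
        ["that", 0.17, 0.22, 1],
        ["said", 0.22, 0.30, 0],
        ["something", 0.30, 0.45, 0],
        ["else", 0.45, 0.56, 0],
    ],
    columns=["word", "start", "end", "speaker"],
)

# A new turn begins on every row whose speaker differs from the previous row.
turn_id = (words["speaker"] != words["speaker"].shift()).cumsum().rename("turn")

# Collapse each turn into one row: who spoke, what was said, and when.
turns = words.groupby(turn_id).agg(
    speaker=("speaker", "first"),
    text=("word", " ".join),
    start=("start", "min"),
    end=("end", "max"),
)

for turn in turns.itertuples():
    print(f"Speaker {turn.speaker}: {turn.text} [{turn.start:05.2f} - {turn.end:05.2f}]")
```

Comparing each speaker tag with its shifted predecessor keeps separate turns by the same speaker apart, which a plain groupby on the speaker column would merge.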