pandas | 易学教程

Read in multiple csv into separate dataframes in Pandas

Question: I have a long list of CSV files that I want to read as dataframes, naming each one after its file. For example, I want to read in the file status.csv and assign its dataframe the name status. Is there a way to do this efficiently using pandas? One approach I found still requires writing the name of each CSV in the loop, which I want to avoid. Another reads multiple CSVs into one dataframe instead of many.

Answer 1: You can list all CSV files under a directory using os.listdir
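
The os.listdir idea from the answer can be sketched as follows. This is a minimal illustration, not the answer's full code: the function name read_csv_folder is hypothetical, and it stores the dataframes in a dict keyed by file name rather than creating one variable per file.

```python
import os

import pandas as pd


def read_csv_folder(directory):
    """Read every .csv file in `directory` into its own DataFrame.

    Returns a dict that maps the file name without its extension
    (e.g. 'status' for status.csv) to the corresponding DataFrame.
    """
    frames = {}
    for filename in sorted(os.listdir(directory)):
        name, ext = os.path.splitext(filename)
        if ext.lower() == ".csv":
            frames[name] = pd.read_csv(os.path.join(directory, filename))
    return frames
```

Assuming the files live in a folder called data (a placeholder), frames = read_csv_folder("data") followed by frames["status"] gives the dataframe read from status.csv. A dict avoids generating variable names dynamically, which is usually harder to maintain.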

Pandas - Date ranges that doesn't overlap

Question: I'm getting lost trying to find an easy way to determine when date ranges from two dataframes do not overlap. I have two dataframes:

    df1 = pd.DataFrame({
        'START': ['2019-01-01 09:00:00', '2019-01-01 18:00:00'],
        'END': ['2019-01-01 16:00:00', '2019-01-02 02:00:00']})
    df2 = pd.DataFrame({
        'START': ['2019-01-01 08:00:00', '2019-01-01 14:00:00', '2019-01-01 22:00:00', '2019-01-02 01:00:00'],
        'END': ['2019-01-01 11:00:00', '2019-01-01 15:00:00', '2019-01-01 22:00:00'.replace('22:00:00', '23:00:00'), '2019-01-02 04:00:00']})
    df1
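
The excerpt is cut off before the desired output, so the following is only a sketch under one possible reading: flag each range in df2 by whether it overlaps any range in df1, then keep the ones that do not. The column name OVERLAPS_DF1 and the variable non_overlapping are assumptions introduced for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({
    'START': ['2019-01-01 09:00:00', '2019-01-01 18:00:00'],
    'END': ['2019-01-01 16:00:00', '2019-01-02 02:00:00']})
df2 = pd.DataFrame({
    'START': ['2019-01-01 08:00:00', '2019-01-01 14:00:00',
              '2019-01-01 22:00:00', '2019-01-02 01:00:00'],
    'END': ['2019-01-01 11:00:00', '2019-01-01 15:00:00',
            '2019-01-01 23:00:00', '2019-01-02 04:00:00']})

# Parse the strings into real timestamps so they compare correctly.
for frame in (df1, df2):
    frame['START'] = pd.to_datetime(frame['START'])
    frame['END'] = pd.to_datetime(frame['END'])

# Shape (len(df2), 1) against (1, len(df1)) broadcasts to every pair.
start2 = df2['START'].to_numpy()[:, None]
end2 = df2['END'].to_numpy()[:, None]
start1 = df1['START'].to_numpy()[None, :]
end1 = df1['END'].to_numpy()[None, :]

# Two ranges overlap when each one starts before the other ends.
pairwise_overlap = (start2 < end1) & (start1 < end2)

# Hypothetical column name: True if the df2 range overlaps any df1 range.
df2['OVERLAPS_DF1'] = pairwise_overlap.any(axis=1)
non_overlapping = df2[~df2['OVERLAPS_DF1']]
```

The pairwise comparison costs len(df1) * len(df2) checks, which is fine for small frames. If the goal is instead to compute the uncovered gaps inside each range, a different approach is needed.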

replacing the value in new data frame python

Question: As you can see in the image, we have three columns. I need to write code that creates a new column called aa and fills it in by replacing each category: wherever it sees A, use 1; wherever it sees B, use 2; and so on. Thank you.

Answer 1: If I understand correctly, use:

    df['aa'] = df['categories'].map(df.drop_duplicates('categories').set_index('categories')['WOF'])

Source: https://stackoverflow.com/questions/62656932/replacing-the-value-in-new-data-frame-python
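
The image with the original data is not available, so the sketch below uses made-up sample rows and an explicit dict lookup instead of the lookup the answer derives from the WOF column. Both the sample values and the mapping variable are assumptions; only the column names categories and aa come from the excerpt.

```python
import pandas as pd

# Hypothetical sample data; the real table was only shown as an image.
df = pd.DataFrame({'categories': ['A', 'B', 'A', 'C', 'B']})

# Explicit lookup table: which number replaces which category.
mapping = {'A': 1, 'B': 2, 'C': 3}

# Series.map looks each value up in the dict; unmapped values become NaN.
df['aa'] = df['categories'].map(mapping)
```

The answer's one-liner follows the same pattern, except that it builds the lookup from the dataframe itself by taking the first WOF value seen for each category.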

How to find highest combination in dataframe [duplicate]

Question: This question already has answers here: Get the row(s) which have the max value in groups using groupby (11 answers). Closed 9 months ago.

I have a dataframe that has repeating values in two columns, and I only want to keep the highest value of each combination. For the following dataframe:

    df = pd.DataFrame(
        np.array([['A', 'B', 3], ['A', 'B', 6], ['C', 'D', 9], ['C', 'D', 2], ['C', 'B', 4]]))
    df

how would I get this dataframe as a result:

    |A|B|6|
    |C|D|9|
    |C|B|4|

Answer 1: Use groupby and
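
One way to do this with groupby is sketched below; the answer text is cut off, so this is not necessarily what it went on to recommend. The column names first, second and value are hypothetical. The frame is built from a plain list rather than np.array, because a NumPy array of mixed strings and numbers stores everything as strings, which would make max compare text instead of numbers.

```python
import pandas as pd

# Column names are assumptions; the question's frame has default labels.
df = pd.DataFrame(
    [['A', 'B', 3], ['A', 'B', 6], ['C', 'D', 9], ['C', 'D', 2], ['C', 'B', 4]],
    columns=['first', 'second', 'value'])

# Group on the two key columns and keep the largest value in each group.
# sort=False keeps the groups in order of first appearance.
result = df.groupby(['first', 'second'], as_index=False, sort=False)['value'].max()
```

If the full rows are needed (for example when there are more columns than these three), df.loc[df.groupby(['first', 'second'])['value'].idxmax()] selects the row holding each maximum instead.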

comparing values in two pandas dataframes to keep a running count

Question: My apologies for the length of this, but I want to explain as fully as possible. I am completely stumped on how to solve this. The setup: I have two dataframes. The first has a list of all possible values in its first column, and there are no duplicate values in this column. Let's call it df_01. These are all the common possible values in each list. All additional columns represent independent lists. Each contains a number that represents how many days any given value of all possible values has

Python pandas Convert a Column to Column Header

Question: I have a list of dicts containing x and y. I want to make x the index and y the column headers. How can I do it? Thanks.

    import pandas
    pt1 = {"x": 0, "y": 1, "val": 3,}
    pt2 = {"x": 0, "y": 2, "val": 6,}
    lst = [pt1, pt2]
    print(lst)
    # [{'x': 0, 'y': 1, 'val': 3}, {'x': 0, 'y': 2, 'val': 6}]
    df = pandas.DataFrame(lst)
    print(df)
    #    val  x  y
    # 0    3  0  1
    # 1    6  0  2

How can I convert df to this format? Thanks.

    #    1  2
    # 0  3  6

Answer 1: You can use df.pivot: pt1 = {"x": 0, "y": 1, "val": 3,} pt2 = {"x": 0, "y":
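
The answer is cut off, so here is a minimal sketch of the df.pivot call it names, using the data from the question. The variable name wide is an assumption.

```python
import pandas

pt1 = {"x": 0, "y": 1, "val": 3}
pt2 = {"x": 0, "y": 2, "val": 6}
df = pandas.DataFrame([pt1, pt2])

# x becomes the row index, each distinct y becomes a column header,
# and val fills the cells.
wide = df.pivot(index="x", columns="y", values="val")
```

pivot raises an error if the same (x, y) pair appears more than once; in that case pivot_table with an aggregation function is the usual alternative.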

Find all shortest Euclidean distances between two groups of point coordinates

Question: I have a pandas DataFrame where columns X1, Y1 hold the point coordinates for the first group and columns X2, Y2 hold the point coordinates for the second group. The two groups are independent of each other. They just happen to be in the same dataframe. Example:

    X1,Y1,X2,Y2
    41246.438,0.49,38791.673,0.49
    41304.5,0.491,38921.557,0.491
    41392.062,0.492,39037.135,0.492
    41515.5,0.493,39199.972,0.493
    41636.062,0.494,39346.561,0.494
    41795.188,0.495,39477.63,0.495
    42027
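
The excerpt ends before any answer, so the following is only a sketch of one common approach: compute every pairwise distance with NumPy broadcasting, then take the minimum per point of the first group. It uses the six complete rows shown above; the variable names are assumptions.

```python
from io import StringIO

import numpy as np
import pandas as pd

# The six complete rows from the question's example.
csv_text = """X1,Y1,X2,Y2
41246.438,0.49,38791.673,0.49
41304.5,0.491,38921.557,0.491
41392.062,0.492,39037.135,0.492
41515.5,0.493,39199.972,0.493
41636.062,0.494,39346.561,0.494
41795.188,0.495,39477.63,0.495
"""
df = pd.read_csv(StringIO(csv_text))

group1 = df[['X1', 'Y1']].to_numpy()   # shape (n, 2)
group2 = df[['X2', 'Y2']].to_numpy()   # shape (m, 2)

# Broadcasting (n, 1, 2) against (1, m, 2) gives all pairwise differences.
diff = group1[:, None, :] - group2[None, :, :]
dist = np.sqrt((diff ** 2).sum(axis=2))  # shape (n, m)

# For every point in group 1: index of and distance to its nearest
# neighbour in group 2.
nearest_idx = dist.argmin(axis=1)
nearest_dist = dist.min(axis=1)
```

The distance matrix needs n * m entries, so for large groups a spatial index such as a k-d tree is usually a better fit than the dense matrix.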

How to reconstruct a conversation from Watson Speech-to-Text output?

Question: I have the JSON output from Watson's Speech-to-Text service, which I have converted into a list and then into a pandas dataframe. I'm trying to work out how to reconstruct the conversation (with timings), akin to the following:

    Speaker 0: Said this [00.01 - 00.12]
    Speaker 1: Said that [00.12 - 00.22]
    Speaker 0: Said something else [00.22 - 00.56]

My dataframe has a row for each word, and columns for the word, its start/end time, and the speaker tag (either 0 or 1).

    words = [['said', 0.01, 0.06
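
The question's data is cut off, so the sketch below uses made-up rows shaped like the description (word, start, end, speaker tag). The sample values, the column names and the turn helper column are all assumptions. The idea is to start a new turn whenever the speaker tag changes, then join the words in each turn.

```python
import pandas as pd

# Hypothetical sample rows: [word, start time, end time, speaker tag].
words = [
    ['said', 0.01, 0.06, 0],
    ['this', 0.06, 0.12, 0],
    ['said', 0.12, 0.17, 1],
    ['that', 0.17, 0.22, 1],
    ['said', 0.22, 0.30, 0],
    ['something', 0.30, 0.45, 0],
    ['else', 0.45, 0.56, 0],
]
df = pd.DataFrame(words, columns=['word', 'start', 'end', 'speaker'])

# A new turn starts on every row whose speaker differs from the previous
# row; the cumulative sum turns those change points into turn numbers.
turn = (df['speaker'] != df['speaker'].shift()).cumsum().rename('turn')

turns = df.groupby(turn).agg(
    speaker=('speaker', 'first'),
    text=('word', ' '.join),
    start=('start', 'min'),
    end=('end', 'max'),
).reset_index(drop=True)

lines = [
    f"Speaker {t.speaker}: {t.text} [{t.start:05.2f} - {t.end:05.2f}]"
    for t in turns.itertuples()
]
print("\n".join(lines))
```

Grouping on the change points rather than on the speaker column itself keeps separate turns by the same speaker apart, which is what preserves the order of the conversation.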