dataframe

python: cumulative concatenate in pandas dataframe

只谈情不闲聊 posted on 2021-02-11 15:45:07
Question: How do I do a cumulative concatenation in a pandas DataFrame? I found a number of solutions in R, but can't find one in Python. Here is the problem: suppose we have a DataFrame with columns date and name:

    import pandas as pd
    d = {'date': [1,1,2,2,3,3,3,4,4,4], 'name':['A','B','A','C','A','B','B','A','B','C']}
    df = pd.DataFrame(data=d)

I want to get CUM_CONCAT, which is a cumulative concatenation grouped by date:

       date name CUM_CONCAT
    0     1    A        [A]
    1     1    B      [A,B]
    2     2    A        [A]
    3     2    C      [A,C]
    4     3    A        [A]
    5     3
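One way to approach the question above, offered as a minimal sketch rather than the accepted answer: walk the rows in order and keep a running list per date. The helper dictionary `running` and the list `cum` are names introduced here for illustration; only `df`, `date`, `name`, and `CUM_CONCAT` come from the question.

```python
import pandas as pd

d = {'date': [1, 1, 2, 2, 3, 3, 3, 4, 4, 4],
     'name': ['A', 'B', 'A', 'C', 'A', 'B', 'B', 'A', 'B', 'C']}
df = pd.DataFrame(data=d)

running = {}  # hypothetical helper: date -> names seen so far for that date
cum = []      # one accumulated list per row, in the original row order

for date, name in zip(df['date'], df['name']):
    # Build a new list each time so earlier rows keep their own snapshot.
    running[date] = running.get(date, []) + [name]
    cum.append(running[date])

# Wrap in a Series on the same index so each cell holds one list.
df['CUM_CONCAT'] = pd.Series(cum, index=df.index)
print(df)
```

This plain loop avoids relying on how `groupby` handles list-valued results, at the cost of speed on very large frames.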

python: cumulative concatenate in pandas dataframe

五迷三道 posted on 2021-02-11 15:44:51
Question: How do I do a cumulative concatenation in a pandas DataFrame? I found a number of solutions in R, but can't find one in Python. Here is the problem: suppose we have a DataFrame with columns date and name:

    import pandas as pd
    d = {'date': [1,1,2,2,3,3,3,4,4,4], 'name':['A','B','A','C','A','B','B','A','B','C']}
    df = pd.DataFrame(data=d)

I want to get CUM_CONCAT, which is a cumulative concatenation grouped by date:

       date name CUM_CONCAT
    0     1    A        [A]
    1     1    B      [A,B]
    2     2    A        [A]
    3     2    C      [A,C]
    4     3    A        [A]
    5     3

Group by and fill missing datetime values with duplicates

筅森魡賤 posted on 2021-02-11 15:24:49
Question: This question follows on from this one: Group by and fill missing datetime values. I'm trying to group a pandas DataFrame by contract, check whether there are duplicated datetime values, and fill these in. If there are duplicates, there will be a total of 25 hours; if not, 24. My input is this:

    contract  datetime             value1  value2
    x         2019-01-01 00:00:00  50      60
    x         2019-01-01 02:00:00  30      60
    x         2019-01-01 02:00:00  70      80
    x         2019-01-01 03:00:00  70      80
    y         2019-01-01 00:00:00  30      100

With this Dataframe my

Size-1 array error when preparing decision model

半腔热情 posted on 2021-02-11 15:02:26
Question: I have a DataFrame called data with 477154 rows.

      PDB_ID Chain Sequence Secstr
    0 101M A GEWQLVLHVWAKVEA | HHHH HHHHGG|
    1 102L A MVLSEGEWKVEA |HHHH HHHHHH|
    2 102M A MVLSEGEWQLVLHVWAKVEA |HHHHHHHHHGGHH HHH |
    3 103L A MVLSEGEWQLVLHVWAKV | HHHHH HHHHHH HH|
    4 103L B MVLSEGEWQLVLHVWAKVEAVAL | HHHHH HHHHHH HHHHH |

My goal is to take each character, one by one, from the columns 'Sequence' and 'Secstr', put them into arrays, and make them usable for classification. Every row has a different number of elements. I tried to do it
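A minimal sketch of the character-by-character goal described above, assuming each row's 'Sequence' and 'Secstr' strings have equal length. The two-row frame is made-up illustration data, not the asker's, and `seq_chars` and `sec_chars` are hypothetical names.

```python
import numpy as np
import pandas as pd

# Made-up stand-in for the asker's frame; rows deliberately differ in length.
data = pd.DataFrame({
    "Sequence": ["GEW", "MVLS"],
    "Secstr":   ["HH ", " HHG"],
})

# Flatten every string into single characters, row after row.
seq_chars = np.array([ch for s in data["Sequence"] for ch in s])
sec_chars = np.array([ch for s in data["Secstr"] for ch in s])

# One (residue, label) pair per position, a shape suitable for a classifier
# once the characters are encoded numerically.
pairs = np.column_stack([seq_chars, sec_chars])
print(pairs.shape)
```

Flattening sidesteps the ragged-row problem, because the result is one long 1-D array per column instead of an array of unequal-length arrays.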

Error while trying to append data to columns in Python

丶灬走出姿态 posted on 2021-02-11 15:00:25
Question: I am trying to reverse geocode data, and for that I have the code below:

    import overpy
    import pandas as pd
    import numpy as np
    df = pd.read_csv("/home/runner/sample.csv")
    df.sort_values(by=['cvdt35_timestamp_s'],inplace=True)
    api= overpy.Overpass()
    box = 0.0005
    queries = []
    results = []
    df['Name']=''
    df['Highway'] =''
    with open("sample.csv") as f:
        for row in df.index:
            query = 'way('+str(df.gps_lat_dd.iloc[row]-box)+','+str(df.gps_lon_dd.iloc[row]-box)+','+str(df.gps_lat_dd.iloc[row]+box)+','+str(df

pandas multiindex add labels to an index level

半世苍凉 posted on 2021-02-11 14:59:18
Question: I have a pandas DataFrame with a MultiIndex like the following:

                                    TALLY
    DAY        NODE          CLASS
    2018-02-04 pdk2r08o005   3        7.0
    2018-02-05 pdk2r08o005   3       24.0
    2018-02-06 dsvtxvCsdbc02 3        2.0
               pdk2r08o005   3       28.0
    2018-02-07 pdk2r08o005   3       24.0
    2018-02-08 dsvtxvCsdbc02 3        3.0
               pdk2r08o005   3       24.0
    2018-02-09 pdk2r08o005   3       24.0
    2018-02-10 dsvtxvCsdbc02 3        2.0
               pdk2r08o005   3       24.0
    2018-02-11 pdk2r08o005   3       31.0
    2018-02-12 pdk2r08o005   3       24.0
    2018-02-13 pdk2r08o005   3       20.0
    2018-02-14 dsvtxvCsdbc02 3        4.0
               pdk2r08o005   3       24.0
    2018-02

How to do intersection of dataframes in pandas

女生的网名这么多〃 posted on 2021-02-11 14:53:31
Question: I have a dataframe like the following: <table border="1" class="dataframe"> <thead> <tr style="text-align: right;"> <th></th> <th>Title</th> <th>ASIN</th> <th>State</th> <th>SellerSKU</th> <th>Quantity</th> <th>FBAStock</th> <th>QuantityToShip</th> </tr> </thead> <tbody> <tr> <th>1</th> <td>Daedal crafters- Pack of Two Gajra (Orange and...</td> <td>B075T64ZWJ</td> <td>WEST BENGAL</td> <td>DC216</td> <td>1</td> <td>0</td> <td>1</td> </tr> <tr> <th>2</th> <td>Daedal Dream Catchers - Intricate Web

How to implement tkinter scrollbars?

让人想犯罪 __ posted on 2021-02-11 14:46:21
Question: I'm struggling to implement horizontal and vertical scrollbars to display the values of a dataframe. I have two frames/canvases: the first for the headers, the second for the values. So far, my code displays the dataframe, but I can't get the scrollbars working properly. I would also like the horizontal scrollbar to affect both canvases, so that my values and headers scroll together. But obviously the headers must not scroll with the vertical scrollbar (headers shall always remain

Clean wrong header inside Dataframe with Python/Pandas

懵懂的女人 posted on 2021-02-11 14:37:49
Question: I've got a corrupt data frame with random duplicates of the header inside the data. How can I ignore or delete these rows while loading the data frame? Because this stray header is inside the data, pandas raises an error while loading. I would like to ignore this row while loading it with pandas, or delete it somehow before loading it with pandas. The file looks like this:

    col1, col2, col3
    0, 1, 1
    0, 0, 0
    1, 1, 1
    col1, col2, col3 <- this is the random copy of the header inside the dataframe
    0, 1
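The "delete it before loading" option mentioned above could be sketched as follows, assuming the stray rows are exact copies of the first line. The string `raw` stands in for the file contents (its last row is invented to complete the truncated sample), and `kept` is a hypothetical name.

```python
import io
import pandas as pd

# Stand-in for the file on disk; the final data row is made up for illustration.
raw = """col1, col2, col3
0, 1, 1
0, 0, 0
1, 1, 1
col1, col2, col3
0, 1, 1
"""

lines = raw.splitlines()
header = lines[0]

# Keep the first header, drop any later line that repeats it exactly.
kept = [header] + [ln for ln in lines[1:] if ln.strip() != header.strip()]

# skipinitialspace=True strips the blank after each comma, in names and values.
df = pd.read_csv(io.StringIO("\n".join(kept)), skipinitialspace=True)
print(df)
```

For a real file, the same filter can be applied to the lines read from `open(path)` before handing them to `pd.read_csv`; filtering first lets pandas infer numeric dtypes without a later conversion step.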

Clean wrong header inside Dataframe with Python/Pandas

只谈情不闲聊 posted on 2021-02-11 14:35:12
Question: I've got a corrupt data frame with random duplicates of the header inside the data. How can I ignore or delete these rows while loading the data frame? Because this stray header is inside the data, pandas raises an error while loading. I would like to ignore this row while loading it with pandas, or delete it somehow before loading it with pandas. The file looks like this:

    col1, col2, col3
    0, 1, 1
    0, 0, 0
    1, 1, 1
    col1, col2, col3 <- this is the random copy of the header inside the dataframe
    0, 1