pandas

Select rows from a DataFrame based on string values in a column in pandas

白昼怎懂夜的黑 submitted on 2021-02-10 09:34:26
Question: How do I select rows from a DataFrame based on string values in a column in pandas? I want to display only the states, which are the rows written in all caps. Each state row holds the totals for its cities.
    import pandas as pd
    import matplotlib.pyplot as plt
    %pylab inline
    df = pd.read_csv("states.csv")
    print(df)
    #    States/cities  B  C  D
    # 0  FL             3  5  6
    # 1  Orlando        1  2  3
    # 2  Miami          1  1  3
    # 3  Jacksonville   1  2  0
    # 4  CA             8  3  2
    # 5  San diego      3  1  0
    # 6  San Francisco  5  2  2
    # 7  WA             4  2  1
    # 8  Seattle        3  1  0
    # 9
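One way to approach the question above is a boolean mask built with `Series.str.isupper()`. This is a minimal sketch: the frame is rebuilt inline from the rows visible in the excerpt (an assumption, since the real data lives in `states.csv` and the listing is cut off after row 8).

```python
import pandas as pd

# Assumption: rows 0-8 copied from the excerpt; the original reads states.csv.
df = pd.DataFrame({
    "States/cities": ["FL", "Orlando", "Miami", "Jacksonville", "CA",
                      "San diego", "San Francisco", "WA", "Seattle"],
    "B": [3, 1, 1, 1, 8, 3, 5, 4, 3],
    "C": [5, 2, 1, 2, 3, 1, 2, 2, 1],
    "D": [6, 3, 3, 0, 2, 0, 2, 1, 0],
})

# str.isupper() is True only when every cased character is upper case,
# so "FL" passes while "Orlando" and "San diego" do not.
states = df[df["States/cities"].str.isupper()]
print(states)
```

If the column can contain missing values, the mask would need `na=False` handling (for example `.str.isupper().fillna(False)`) before it is used for indexing.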

How to use str.replace to replace multiple pairs at once? [duplicate]

独自空忆成欢 submitted on 2021-02-10 09:31:15
Question: This question already has answers here: How to replace multiple substrings of a string? (23 answers); Replace multiple substrings in a Pandas series with a value (5 answers). Closed 8 months ago. Currently I am using the following code to make the replacements, which is a little cumbersome:
    df1['CompanyA'] = df1['CompanyA'].str.replace('.','')
    df1['CompanyA'] = df1['CompanyA'].str.replace('-','')
    df1['CompanyA'] = df1['CompanyA'].str.replace(',','')
    df1['CompanyA'] = df1['CompanyA'].str.replace(
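The chained calls above can usually be collapsed into a single regular-expression replacement. The sketch below covers only the three characters visible in the excerpt (`.`, `-`, `,`); the fourth replacement is cut off in the listing, so it is not guessed at. The sample company names are hypothetical.

```python
import pandas as pd

# Hypothetical sample data; the original df1 is not shown in the excerpt.
df1 = pd.DataFrame({"CompanyA": ["Acme, Inc.", "Foo-Bar Ltd."]})

# One character class removes all three characters in a single pass.
# regex=True is passed explicitly because the default of this parameter
# differs between pandas versions.
df1["CompanyA"] = df1["CompanyA"].str.replace(r"[.,\-]", "", regex=True)
print(df1)
```

When each pattern needs a different replacement, `Series.replace` with a dictionary and `regex=True` is another option worth checking against the pandas documentation.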

Extract date from timestamps of multiple time zones in Pandas

可紊 submitted on 2021-02-10 09:26:55
Question: I have a pandas DataFrame in which I've converted hour to local_hour based on the time_zone column. I now want to extract the date from local_hour as local_date, but I get an error saying "Tz-aware datetime.datetime cannot be converted to datetime64 unless utc=True". How can I do this?
    # Create dataframe
    import pandas as pd
    df = pd.DataFrame({
        'hour': ['2019-01-01 05:00:00', '2019-01-01 07:00:00', '2019-01-01 08:00:00'],
        'time_zone': ['US/Eastern', 'US/Central', 'US/Mountain']
    })
    # Convert hour
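The error arises because a single datetime64 column cannot hold several time zones at once. A possible workaround is to take the date per row, so that no mixed-zone column is ever converted. This is a sketch under one loud assumption: the raw `hour` strings are treated as UTC, which the truncated excerpt does not confirm. The helper name `to_local_date` is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    "hour": ["2019-01-01 05:00:00", "2019-01-01 07:00:00", "2019-01-01 08:00:00"],
    "time_zone": ["US/Eastern", "US/Central", "US/Mountain"],
})

# Assumption: the naive "hour" strings represent UTC instants.
df["hour"] = pd.to_datetime(df["hour"], utc=True)

def to_local_date(row):
    """Convert one UTC timestamp to the row's own zone and return its date."""
    return row["hour"].tz_convert(row["time_zone"]).date()

# Row-wise apply returns plain datetime.date objects (object dtype),
# so pandas never has to build a mixed-time-zone datetime64 column.
df["local_date"] = df.apply(to_local_date, axis=1)
print(df[["hour", "time_zone", "local_date"]])
```

Row-wise `apply` is slow on large frames; grouping by `time_zone` and converting each group with the vectorised `.dt.tz_convert` would likely scale better.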

Convert csv file to pandas dataframe

只谈情不闲聊 submitted on 2021-02-10 08:40:21
Question: I have a CSV file in the following format:
    DATES, 01-12-2010, 01-12-2010, 01-12-2010, 02-12-2010, 02-12-2010, 02-12-2010
    UNITS, Hz, kV, MW, Hz, kV, MW
    Interval, , , , , ,
    00:15, 49.82, 33.73755, 34.65, 49.92, 33.9009, 36.33,
    00:30, 49.9, 33.7722, 35.34, 49.89, 33.8382, 37.65,
    00:45, 49.94, 33.8316, 33.5, 50.09, 34.07745, 37.41,
    01:00, 49.86, 33.94875, 30.91, 50.18, 34.20945, 36.11,
    01:15, 49.97, 34.2243, 27.28, 50.11, 34.3596, 33.24,
    01:30, 50.02, 34.3332, 26.91, 50.12, 34.452, 31.03,
    01:45,
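The excerpt is cut off before the desired output is stated, so the following is only one plausible layout: the DATES and UNITS rows become a two-level column index and the interval times become the row index. The sketch parses the text with the standard `csv` module first, which sidesteps the trailing commas; the line breaks in the sample are an assumption about how the flattened text was originally laid out.

```python
import csv
import io

import pandas as pd

# Assumption: first two data rows of the excerpt, with guessed line breaks.
text = (
    "DATES, 01-12-2010, 01-12-2010, 01-12-2010, 02-12-2010, 02-12-2010, 02-12-2010\n"
    "UNITS, Hz, kV, MW, Hz, kV, MW\n"
    "Interval, , , , , ,\n"
    "00:15, 49.82, 33.73755, 34.65, 49.92, 33.9009, 36.33,\n"
    "00:30, 49.9, 33.7722, 35.34, 49.89, 33.8382, 37.65,\n"
)

# Strip the padding around every field.
rows = [[cell.strip() for cell in row] for row in csv.reader(io.StringIO(text))]

dates, units = rows[0][1:], rows[1][1:]
# Skip the "Interval" label row and any blank lines.
data = [row for row in rows[3:] if row and row[0]]

columns = pd.MultiIndex.from_arrays([dates, units], names=["DATES", "UNITS"])
df = pd.DataFrame(
    # Slice to len(dates) so the empty field after a trailing comma is dropped.
    [[float(value) for value in row[1:1 + len(dates)]] for row in data],
    index=pd.Index([row[0] for row in data], name="Interval"),
    columns=columns,
)
print(df)
```

A single reading can then be selected with a tuple key, for example `df.loc["00:15", ("02-12-2010", "MW")]`.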

Using pandas to identify nearest objects

爱⌒轻易说出口 submitted on 2021-02-10 07:56:15
Question: I have an assignment that can be done in any programming language. I chose Python and pandas because I have little experience with them and thought it would be a good way to learn. I completed the assignment using the explicit loops I know from traditional programming, and it ran fine over thousands of rows, but it brought my laptop to a screeching halt once I let it process millions of rows. The assignment is outlined below. You have a two-lane road on

How to groupby two columns and calculate the summation of rows using Pandas?

谁说我不能喝 submitted on 2021-02-10 07:36:17
Question: I have a pandas DataFrame df like this:
    Name  Hour  Activity
    A     4     TT
    A     3     TT
    A     5     UU
    B     1     TT
    C     1     TT
    D     1     TT
    D     2     TT
    D     3     UU
    D     4     UU
The next step is to sum Hour over the rows that share the same values in the Name and Activity columns. For example, Name A with Activity TT gives a sum of 7. The result should be presented as below:
       TT  UU
    A  7   5
    B  1   0
    C  1   0
    D  3   7
Is it possible to do something like this using pandas groupby?
Answer 1: Try groupby.sum and unstack:
    df_final = df.groupby(['Name',
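The answer is truncated in this listing, so the following is a sketch of what a `groupby` plus `sum` plus `unstack` pipeline could look like, not a reconstruction of the answerer's exact line. Filling missing combinations with 0 via `fill_value` is an assumption made to match the zeros in the expected table.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["A", "A", "A", "B", "C", "D", "D", "D", "D"],
    "Hour": [4, 3, 5, 1, 1, 1, 2, 3, 4],
    "Activity": ["TT", "TT", "UU", "TT", "TT", "TT", "TT", "UU", "UU"],
})

# Sum Hour per (Name, Activity) pair, then pivot Activity into columns.
# fill_value=0 covers pairs that never occur, such as (B, UU).
df_final = df.groupby(["Name", "Activity"])["Hour"].sum().unstack(fill_value=0)
print(df_final)
```

`pd.pivot_table(df, index="Name", columns="Activity", values="Hour", aggfunc="sum", fill_value=0)` should give an equivalent table in one call.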

Pandas replace zero as the nearest average non-zero value

别说谁变了你拦得住时间么 submitted on 2021-02-10 07:33:58
Question: I have a DataFrame: df = pd.DataFrame({'A':[0,0,15,0,0,12,0,0,0,5]}) I want to replace each 0 with a value derived from the nearest non-zero values. For example, the first value is 0 and its nearest non-zero value is 15, so I replace it with 15, and the data becomes [15,0,15,0,0,12,0,0,0,5]. Then, for every value except the first, I need to find the nearest non-zero value on both sides and average them. So the second 0 becomes (15+15)/2, and the third zero becomes (15+12)/2. I
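One reading of the rule described above is: for each zero, average the nearest original non-zero value on the left and on the right, and fall back to the single available neighbour at the edges. The sketch below implements that reading with forward and backward fills. Because the question is cut off, it is an assumption that already-replaced values are not fed back into later averages.

```python
import pandas as pd

df = pd.DataFrame({"A": [0, 0, 15, 0, 0, 12, 0, 0, 0, 5]})

# Turn zeros into NaN so the fill methods treat them as gaps.
gaps = df["A"].where(df["A"] != 0)

previous_nonzero = gaps.ffill()  # nearest non-zero value to the left
next_nonzero = gaps.bfill()      # nearest non-zero value to the right

# Row-wise mean skips NaN, so leading zeros simply take the next value.
# Non-zero entries are unchanged because both neighbours equal the value.
df["A_filled"] = pd.concat([previous_nonzero, next_nonzero], axis=1).mean(axis=1)
print(df)
```

The result column is floating point because averages such as (15+12)/2 are not integers.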