pandas

How to get frequency count of column values for each unique pair of columns in pandas?

我与影子孤独终老i submitted on 2021-02-11 06:52:27
Question: I have a DataFrame that looks like this:
    data = [
        (datetime.datetime(2021, 2, 10, 7, 49, 7, 118658), u'12.100.90.10', u'100.100.12.1', u'LT_DOWN'),
        (datetime.datetime(2021, 2, 10, 7, 49, 14, 312273), u'12.100.90.10', u'100.100.12.1', u'LT_UP'),
        (datetime.datetime(2021, 2, 10, 7, 49, 21, 535932), u'12.100.90.10', u'100.100.12.1', u'LT_UP'),
        (datetime.datetime(2021, 2, 10, 7, 50, 28, 725961), u'12.100.90.10', u'100.100.12.1', u'PL_DOWN'),
        (datetime.datetime(2021, 2, 10, 7, 50, 32, 450853), u'10
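One possible way to get the per-pair frequency count is sketched below. It rests on assumptions: the column names `time`, `src`, `dst`, and `event` are hypothetical (the question does not show them), and only the four complete rows visible in the snippet are used.

```python
import datetime

import pandas as pd

# Only the four complete rows visible in the question; the rest is truncated.
data = [
    (datetime.datetime(2021, 2, 10, 7, 49, 7, 118658), "12.100.90.10", "100.100.12.1", "LT_DOWN"),
    (datetime.datetime(2021, 2, 10, 7, 49, 14, 312273), "12.100.90.10", "100.100.12.1", "LT_UP"),
    (datetime.datetime(2021, 2, 10, 7, 49, 21, 535932), "12.100.90.10", "100.100.12.1", "LT_UP"),
    (datetime.datetime(2021, 2, 10, 7, 50, 28, 725961), "12.100.90.10", "100.100.12.1", "PL_DOWN"),
]

# Column names are assumptions; the original frame's names are not shown.
df = pd.DataFrame(data, columns=["time", "src", "dst", "event"])

# One row per unique (src, dst) pair, one column per event value, cells = counts.
counts = pd.crosstab([df["src"], df["dst"]], df["event"])
print(counts)
```

An equivalent formulation would be `df.groupby(["src", "dst", "event"]).size().unstack(fill_value=0)`. `crosstab` is used here only because it states the intent in one call.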

How to highlight a plotline chart with vertical color bar for specific weekdays (saturday and sunday)? [duplicate]

冷暖自知 submitted on 2021-02-11 06:52:11
Question: I plotted a daily line plot of flights, and I would like to highlight all the Saturdays and Sundays. I'm trying to do it with axvspan, but I'm struggling with how to use it. Any suggestions on how this can be coded?
    (flights.loc[flights['date'].dt.month.between(1, 2), 'date']
        .dt.to_period('D')
        .value_counts()
        .sort_index()
        .plot(kind="line", figsize=(12, 6))
    )
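A minimal sketch of weekend shading with `axvspan` follows. The `flights` frame here is synthetic stand-in data, since the asker's data is not shown. The sketch also swaps the `to_period('D')` index for a plain `DatetimeIndex` and plots with matplotlib directly, so the shaded bands and the line share the same date axis.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for the asker's `flights` frame (one row per flight).
rng = np.random.default_rng(0)
days = pd.date_range("2021-01-01", "2021-02-28", freq="D")
flights = pd.DataFrame({"date": days[rng.integers(0, len(days), size=2000)]})

# Flights per calendar day, kept on a DatetimeIndex rather than a PeriodIndex.
daily = flights["date"].dt.normalize().value_counts().sort_index()

fig, ax = plt.subplots(figsize=(12, 6))
ax.plot(daily.index, daily.to_numpy())

# dayofweek: Monday=0 ... Saturday=5, Sunday=6.
weekend_days = [d for d in days if d.dayofweek >= 5]
half_day = pd.Timedelta(hours=12)
for d in weekend_days:
    # Shade a one-day band centred on each weekend data point.
    ax.axvspan(d - half_day, d + half_day, color="orange", alpha=0.3, linewidth=0)

# Call plt.show() or fig.savefig(...) to view the result.
```

Centring each band on midnight is a design choice for daily counts plotted at midnight. Shading from Saturday 00:00 to Monday 00:00 would work equally well if the bands should cover calendar days instead.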

Pandas. A pretty way to delete cell and shift left others in row?

核能气质少年 submitted on 2021-02-11 06:51:44
Question: In a DataFrame, I need to delete some cells and shift the remaining cells in the row to the left:
    df = pd.DataFrame({'X0': ['anytext', 'anytext', 'anytext', 'anytext', 'anytext'],
                       'X1': ['12:40', 'boss', 'engen', '15:44', '16:01'],
                       'X2': ['anytext', '12:44', '14:06', 'anytext', 'anytext'],
                       'X3': ['anytext', 'anytext', 'anytext', 'anytext', 'anytext']})
    df
            X0     X1       X2       X3
    0  anytext  12:40  anytext  anytext
    1  anytext   boss    12:44  anytext
    2  anytext  engen    14:06  anytext
    3  anytext  15:44  anytext  anytext
    4  anytext  16:01  anytext  anytext
I want to delete
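The question is cut off before it says which cells to delete, so the sketch below assumes one plausible reading: drop the `X1` values that are not `HH:MM` times (`boss`, `engen`) and shift the rest of the row left, padding on the right with `NaN`. The helper name `shift_left` is made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "X0": ["anytext", "anytext", "anytext", "anytext", "anytext"],
    "X1": ["12:40", "boss", "engen", "15:44", "16:01"],
    "X2": ["anytext", "12:44", "14:06", "anytext", "anytext"],
    "X3": ["anytext", "anytext", "anytext", "anytext", "anytext"],
})

# Assumption: the cells to delete are the X1 values that are not HH:MM times.
to_delete = ~df["X1"].str.match(r"^\d{1,2}:\d{2}$")
masked = df.copy()
masked.loc[to_delete, "X1"] = np.nan


def shift_left(row):
    """Keep the non-missing values in order and pad the right side with NaN."""
    kept = [v for v in row if pd.notna(v)]
    return pd.Series(kept + [np.nan] * (len(row) - len(kept)), index=row.index)


shifted = masked.apply(shift_left, axis=1)
print(shifted)
```

A row-wise `apply` is easy to read but slow on large frames. For those, a NumPy-based sort of the missing-value mask per row is a common alternative.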

Pandas. A pretty way to delete cell and shift left others in row?

时光总嘲笑我的痴心妄想 submitted on 2021-02-11 06:51:21

Pandas custom function to find whether it is the 1st, 2nd etc Monday, Tuesday, etc - all suggestions welcome

独自空忆成欢 submitted on 2021-02-11 06:50:30
Question: I have the following code, which reads in five columns: the date plus OHLC. It then creates a column 'dow' to hold the day of the week. So far so good:
    import numpy as np
    import pandas as pd
    df = pd.read_csv('/content/drive/MyDrive/Forex/EURUSD-2018_12_18-2020_11_01.csv', parse_dates=True, names=['date', '1', '2', '3', '4'])
    df['date'] = pd.to_datetime(df['date'])
    df.index = df['date']
    df['dow'] = df['date'].dt.dayofweek
    #df['downum'] = df.apply(lambda x: downu(x['date']))
    df
Producing the following output: date
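One vectorised way to label each date as the 1st, 2nd, etc. occurrence of its weekday within the month is sketched below. It uses a stand-in date range because the CSV is not available, and it reuses the `downum` column name from the commented-out line in the question.

```python
import pandas as pd

# Stand-in dates; the asker's CSV is not available here.
df = pd.DataFrame({"date": pd.date_range("2021-02-01", "2021-02-28", freq="D")})
df["dow"] = df["date"].dt.dayofweek  # Monday=0 ... Sunday=6

# Days 1-7 hold the first occurrence of every weekday, days 8-14 the second,
# and so on, so integer division of (day - 1) by 7 gives the occurrence number.
df["downum"] = (df["date"].dt.day - 1) // 7 + 1
print(df.head(10))
```

This avoids a row-wise `apply` entirely, and the result depends only on the day of the month, not on the weekday itself.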

Variable number of unwanted white spaces resulting into distorted column

China☆狼群 submitted on 2021-02-11 06:35:34
Question: Recently, I asked the following question - Unwanted white spaces resulting into distorted column - and the answer by @sharathnatraj was satisfactory and worked like a charm. The answer was:
    import io
    import re
    import pandas as pd
    with open('trial1.txt', 'r') as f:
        lines = f.readlines()
    l = [re.sub(r"([a-z]{5,})\s([a-z]{5,})", r"\1\2", line) for line in lines]
    df = pd.read_csv(io.StringIO('\n'.join(l)), delim_whitespace=True)
Sample data set: 1 CAgF3O3S silver trifluoromethanesulfonate 2923-28-6 256.937 629.15 1 --- --- --- ---
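The follow-up question is truncated, so the sketch below is only one possible direction, not the accepted answer. It assumes each row has the shape seen in the sample: two leading fields, a free-text name with any number of internal spaces, then a CAS-style number (digits-digits-digit) followed by the remaining fields. The helper `split_row` and the `CAS` pattern are made up for illustration.

```python
import re

import pandas as pd

# The single sample row visible in the question.
lines = [
    "1 CAgF3O3S silver trifluoromethanesulfonate 2923-28-6 256.937 629.15 1 --- --- --- ---",
]

# Assumed shape of a CAS registry number: 2-7 digits, 2 digits, 1 check digit.
CAS = re.compile(r"^\d{2,7}-\d{2}-\d$")


def split_row(line):
    """Split on whitespace, then re-join the tokens between field 2 and the CAS number."""
    tokens = line.split()
    cas_pos = next(i for i, tok in enumerate(tokens) if CAS.match(tok))
    name = " ".join(tokens[2:cas_pos])
    return tokens[:2] + [name] + tokens[cas_pos:]


df = pd.DataFrame([split_row(line) for line in lines])
print(df)
```

Anchoring on the CAS number makes the number of spaces inside the name irrelevant, at the cost of raising `StopIteration` on rows that lack such a token.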

Variable number of unwanted white spaces resulting into distorted column

半世苍凉 submitted on 2021-02-11 06:35:32

Too many open files: '/home/USER/PATH/SERVICE_ACCOUNT.json' when calling Google's Natural Language API

无人久伴 submitted on 2021-02-11 06:27:39
Question: I'm working on a sentiment analysis project using the Google Cloud Natural Language API and Python. This question might be similar to this other question. What I'm doing is the following: (1) read a CSV file of approximately 7000 records from Google Cloud Storage; (2) convert the CSV into a pandas DataFrame; (3) iterate over the DataFrame and call the Natural Language API to perform sentiment analysis on one of the DataFrame's columns; in the same for loop I extract the score and magnitude

Calculate percentage on DataFrame

只谈情不闲聊 submitted on 2021-02-11 06:07:37
Question: I'm trying to calculate the percentage of each crime in the following DataFrame:
          Violent   Murder  Larceny_Theft  Vehicle_Theft
    Year
    1960   288460  3095700        1855400         328200
    1961   289390  3198600        1913000         336000
    1962   301510  3450700        2089600         366800
    1963   316970  3792500        2297800         408300
    1964   364220  4200400        2514400         472800
So I should first calculate the total number of crimes per year and then use that to calculate each crime's percentage. I tried the following:
    perc = (crime * 100) / crime.sum(axis=1)
Any
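The attempted expression returns all `NaN` because dividing a DataFrame by a Series with `/` aligns the Series index (the years) with the DataFrame's columns (the crime names), and the two share no labels. A minimal sketch of a row-wise division with `DataFrame.div(..., axis=0)`, using the five rows shown above:

```python
import pandas as pd

crime = pd.DataFrame(
    {
        "Violent": [288460, 289390, 301510, 316970, 364220],
        "Murder": [3095700, 3198600, 3450700, 3792500, 4200400],
        "Larceny_Theft": [1855400, 1913000, 2089600, 2297800, 2514400],
        "Vehicle_Theft": [328200, 336000, 366800, 408300, 472800],
    },
    index=pd.Index([1960, 1961, 1962, 1963, 1964], name="Year"),
)

# Total crimes per year (one value per row).
totals = crime.sum(axis=1)

# axis=0 aligns `totals` with the row index, so each row is divided by its own total.
perc = crime.div(totals, axis=0) * 100
print(perc.round(2))
```

Each row of `perc` sums to 100, which is a quick sanity check that the alignment went along the intended axis.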

Calculate percentage on DataFrame

喜欢而已 submitted on 2021-02-11 06:07:27