pandas

Plotly Dash: Why is my figure failing to show with a multi dropdown selection?

时间秒杀一切 submitted on 2021-02-08 11:01:07
Question: I am building a simple Python dashboard using Dash and Plotly. I am also new to Python (as is probably evident!) and I'm happy for any/all corrections. I would like to plot a time series of data from a pre-determined CSV file. I have added a dropdown selection box with which I would like to allow multiple different columns to be plotted. Sample data: "TOA5","HE605_RV50_GAF","CR6","7225","CR6.Std.07","CPU:BiSP5_GAF_v2d.CR6","51755","SensorStats" "TIMESTAMP","RECORD","BattV_Min","BattV_Avg",
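One thing worth checking in a setup like this: a Dash dropdown created with `multi=True` passes its value to the callback as a list (or `None` when nothing is selected), not as a single string. The sketch below is not the asker's code; `make_figure` and the sample values are hypothetical, and only the column names `TIMESTAMP`, `BattV_Min`, and `BattV_Avg` come from the question. It builds a plain figure dict with one trace per selected column, which could be returned from a callback to a `dcc.Graph`.

```python
import pandas as pd


def make_figure(df, selected, x_col="TIMESTAMP"):
    """Build a Plotly-style figure dict with one line per selected column.

    `selected` may be None, a single column name, or a list of names,
    so all three cases are normalized to a list first.
    """
    if selected is None:
        selected = []
    elif isinstance(selected, str):
        selected = [selected]

    traces = [
        {
            "type": "scatter",
            "mode": "lines",
            "x": df[x_col].tolist(),
            "y": df[col].tolist(),
            "name": col,
        }
        for col in selected
        if col in df.columns  # silently skip unknown column names
    ]
    return {"data": traces, "layout": {"xaxis": {"title": {"text": x_col}}}}


# Made-up sample values, for illustration only.
sample = pd.DataFrame(
    {
        "TIMESTAMP": ["2021-01-01 00:00", "2021-01-01 01:00"],
        "BattV_Min": [12.1, 12.0],
        "BattV_Avg": [12.3, 12.2],
    }
)

fig_multi = make_figure(sample, ["BattV_Min", "BattV_Avg"])
fig_single = make_figure(sample, "BattV_Min")
fig_empty = make_figure(sample, None)
```

Normalizing the value up front keeps the callback from failing when the selection is empty or contains a single item.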

Converting CSV file to HDF5 using pandas

不打扰是莪最后的温柔 submitted on 2021-02-08 11:01:05
Question: When I use pandas to convert CSV files to HDF5 files, the resulting file is extremely large. For example, a test CSV file (23 columns, 1.3 million rows) of 170 MB results in an HDF5 file of 2 GB. However, if pandas is bypassed and the HDF5 file is written directly (using PyTables), it is only 20 MB. In the following code (used to do the conversion in pandas), the values of the object columns in the dataframe are explicitly converted to string objects (to prevent pickling): # Open the csv file

'DataFrame' object has no attribute 'to_frame'

。_饼干妹妹 submitted on 2021-02-08 10:56:14
Question: I am new to Python and am following this tutorial: https://www.hackerearth.com/practice/machine-learning/machine-learning-projects/python-project/tutorial/ This is the dataframe miss: miss = train.isnull().sum()/len(train) miss = miss[miss>0] miss.sort_values(inplace = True) miss Electrical 0.000685 MasVnrType 0.005479 MasVnrArea 0.005479 BsmtQual 0.025342 BsmtCond 0.025342 BsmtFinType1 0.025342 BsmtExposure 0.026027 BsmtFinType2 0.026027 GarageCond 0.055479 GarageQual 0.055479 GarageFinish 0
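For context on the error in the title: `to_frame` is a method of `Series`, and `DataFrame` does not have it, so the message appears when the call is made on an object that is already a DataFrame. The sketch below uses a small made-up `train` frame (only the column names echo the question) to show that `miss` as computed above is a Series and converts cleanly; the names `fraction_missing` and `column` are illustrative choices.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the tutorial's `train` data.
train = pd.DataFrame(
    {
        "Electrical": ["SBrkr", None, "SBrkr", "FuseA"],
        "MasVnrArea": [196.0, np.nan, np.nan, 0.0],
        "LotArea": [8450, 9600, 11250, 9550],
    }
)

# Fraction of missing values per column; the result is a Series.
miss = train.isnull().sum() / len(train)
miss = miss[miss > 0].sort_values()

# Series.to_frame turns the Series into a one-column DataFrame.
miss_df = miss.to_frame(name="fraction_missing")
miss_df.index.name = "column"
miss_df = miss_df.reset_index()

# Calling miss_df.to_frame() at this point would raise AttributeError,
# because miss_df is already a DataFrame.
```

If the conversion has already run once (for example in an earlier notebook cell), the second call is the one that fails.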

'DataFrame' object has no attribute 'to_frame'

大城市里の小女人 submitted on 2021-02-08 10:53:35
Question: I am new to Python and am following this tutorial: https://www.hackerearth.com/practice/machine-learning/machine-learning-projects/python-project/tutorial/ This is the dataframe miss: miss = train.isnull().sum()/len(train) miss = miss[miss>0] miss.sort_values(inplace = True) miss Electrical 0.000685 MasVnrType 0.005479 MasVnrArea 0.005479 BsmtQual 0.025342 BsmtCond 0.025342 BsmtFinType1 0.025342 BsmtExposure 0.026027 BsmtFinType2 0.026027 GarageCond 0.055479 GarageQual 0.055479 GarageFinish 0

Pandas read_excel() parses date columns with blank values to NaT

谁都会走 submitted on 2021-02-08 10:50:47
Question: I am trying to read an Excel file that has date columns, using the code below: src1_df = pd.read_excel("src_file1.xlsx", keep_default_na = False) Even though I have specified keep_default_na = False, the data frame has 'NaT' values for the corresponding blank cells in the Excel date columns. Please suggest how to get a blank string instead of 'NaT' when parsing Excel files. I am using Python 3.x and pandas 0.23.4. Answer 1: src1_df = pd.read_excel("src_file1.xlsx", na_filter=False) Then you
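Independently of how the file is read, one possible approach is to post-process the date column once it is in the DataFrame. The sketch below is not the truncated answer above; it assumes a hypothetical column named `start_date` that already has a datetime dtype, formats it as text, and replaces the missing entries with empty strings.

```python
import pandas as pd

# Hypothetical frame standing in for the result of pd.read_excel:
# the second row corresponds to a blank cell in a date column.
src1_df = pd.DataFrame({"start_date": pd.to_datetime(["2021-02-08", None])})

# Format the dates as text; NaT becomes a missing value, which fillna
# then replaces with an empty string.
src1_df["start_date_text"] = (
    src1_df["start_date"].dt.strftime("%Y-%m-%d").fillna("")
)
```

The trade-off is that the new column holds strings, so it can no longer be used for date arithmetic.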

Pandas parsing csv error - expected 1 fields found 9

拈花ヽ惹草 submitted on 2021-02-08 10:50:23
Question: I'm trying to parse a .csv file: planets = pd.read_csv("planets.csv", sep=',') But I always end up with this error: ParserError: Error tokenizing data. C error: Expected 1 fields in line 13, saw 9 This is what the first few lines of my CSV file look like: # This file was produced by the test # Tue Apr 3 06:03:27 2018 # # COLUMN pl_hostname: Host Name # COLUMN pl_discmethod: Discovery Method # COLUMN pl_pnum: Number of Planets in System # COLUMN pl_orbper: Orbital Period [days] # COLUMN pl
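The file shown starts with a block of `#` lines, which the parser tries to read as one-field rows before it reaches the real comma-separated data. A minimal sketch of one way to handle this, assuming every preamble line starts with `#` and that `#` never appears inside the data itself, is to pass `comment="#"` to `pd.read_csv`. The in-memory text and the data rows below are made up for illustration; only the comment lines and column names come from the question.

```python
import io

import pandas as pd

# Made-up miniature of the file: a '#' preamble followed by real CSV rows.
raw = """# This file was produced by the test
# Tue Apr 3 06:03:27 2018
#
# COLUMN pl_hostname: Host Name
# COLUMN pl_discmethod: Discovery Method
# COLUMN pl_pnum: Number of Planets in System
pl_hostname,pl_discmethod,pl_pnum
HostA,Radial Velocity,1
HostB,Transit,2
"""

# Lines that begin with the comment character are skipped entirely,
# so the first remaining line is used as the header.
planets = pd.read_csv(io.StringIO(raw), sep=",", comment="#")
```

With a file on disk, the same keyword would apply to `pd.read_csv("planets.csv", comment="#")`.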

pandas sum the differences between two columns in each group

自作多情 submitted on 2021-02-08 10:48:33
Question: I have a df that looks like this:
A           B           C  D
2017-10-01  2017-10-11  M  2017-10
2017-10-02  2017-10-03  M  2017-10
2017-11-01  2017-11-04  B  2017-11
2017-11-08  2017-11-09  B  2017-11
2018-01-01  2018-01-03  A  2018-01
The dtype of A and B is datetime64; C and D are strings. I would like to group by C and D and get the differences between B and A: df.groupby(['C', 'D']).apply(lambda row: row['B'] - row['A']) but I don't know how to sum such differences in each group and assign the values to a new column, say E ,
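One possible way to get a per-group total onto every row, sketched here with the sample data from the question, is to compute the difference for the whole frame first and then use a grouped `transform("sum")`, which returns a result aligned with the original rows.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame(
    {
        "A": pd.to_datetime(
            ["2017-10-01", "2017-10-02", "2017-11-01", "2017-11-08", "2018-01-01"]
        ),
        "B": pd.to_datetime(
            ["2017-10-11", "2017-10-03", "2017-11-04", "2017-11-09", "2018-01-03"]
        ),
        "C": ["M", "M", "B", "B", "A"],
        "D": ["2017-10", "2017-10", "2017-11", "2017-11", "2018-01"],
    }
)

# Row-wise difference B - A (a timedelta Series), summed within each
# (C, D) group and broadcast back to the original rows.
diff = df["B"] - df["A"]
df["E"] = diff.groupby([df["C"], df["D"]]).transform("sum")

# E in days per row: 11, 11, 4, 4, 2
```

Using `transform` instead of `apply` avoids the extra index levels that `apply` adds, so the result can be assigned directly as a new column.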

Pandas parsing csv error - expected 1 fields found 9

女生的网名这么多〃 submitted on 2021-02-08 10:47:20
Question: I'm trying to parse a .csv file: planets = pd.read_csv("planets.csv", sep=',') But I always end up with this error: ParserError: Error tokenizing data. C error: Expected 1 fields in line 13, saw 9 This is what the first few lines of my CSV file look like: # This file was produced by the test # Tue Apr 3 06:03:27 2018 # # COLUMN pl_hostname: Host Name # COLUMN pl_discmethod: Discovery Method # COLUMN pl_pnum: Number of Planets in System # COLUMN pl_orbper: Orbital Period [days] # COLUMN pl

Scraping wrong table

你。 submitted on 2021-02-08 10:46:31
Question: I'm trying to get the advanced stats of players into an Excel sheet, but the table being scraped is the first one instead of the advanced stats table. ValueError: Length of passed values is 23, index implies 21 If I try to use the id instead, I get another error about tbody. Also, I get an error about lname=name.split(" ")[1] IndexError: list index out of range. I think that has to do with 'Nene' in the list. Is there a way to fix that? import requests from bs4 import BeautifulSoup
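On the `IndexError` specifically: `name.split(" ")[1]` fails for any name without a space, such as 'Nene', because the resulting list has only one element. A minimal sketch of a tolerant alternative uses `str.partition`, which always returns three parts. The helper `split_name` and the two-word example name are hypothetical, and this does not address the table-selection problem.

```python
def split_name(name):
    """Split a player name into (first, last); last is '' for one-word names."""
    first, _, last = name.strip().partition(" ")
    return first, last.strip()


# 'Nene' comes from the question; the other name is a made-up example.
single = split_name("Nene")
double = split_name("Example Player")
```

Everything after the first space is kept as the last name, so multi-part surnames stay intact.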

pandas sum the differences between two columns in each group

試著忘記壹切 submitted on 2021-02-08 10:45:26
Question: I have a df that looks like this:
A           B           C  D
2017-10-01  2017-10-11  M  2017-10
2017-10-02  2017-10-03  M  2017-10
2017-11-01  2017-11-04  B  2017-11
2017-11-08  2017-11-09  B  2017-11
2018-01-01  2018-01-03  A  2018-01
The dtype of A and B is datetime64; C and D are strings. I would like to group by C and D and get the differences between B and A: df.groupby(['C', 'D']).apply(lambda row: row['B'] - row['A']) but I don't know how to sum such differences in each group and assign the values to a new column, say E ,