pandas | 易学教程

How to compare numerical values to categorical ranges in column headers in pandas?

Question: I have one dataframe that looks like this:
    import pandas as pd
    import datetime
    df1 = pd.DataFrame.from_dict( {'Unnamed: 4': {0: 'Values'}, datetime.datetime(2021, 1, 1, 0, 0): {0: 8}, datetime.datetime(2021, 1, 2, 0, 0): {0: 12}, datetime.datetime(2021, 1, 3, 0, 0): {0: 99}, datetime.datetime(2021, 1, 4, 0, 0): {0: 25}, datetime.datetime(2021, 1, 5, 0, 0): {0: 35}} )
and a second dataframe that looks like this:
    df2 = pd.DataFrame.from_dict( {'Level': {0: 'Range', 1: 'Middle point', 2: 'Total

MemoryError: Unable to allocate array with shape (118, 840983) and data type float64

Question: I'm getting the following error:
    MemoryError: Unable to allocate array with shape (118, 840983) and data type float64
whenever my Python code calls pandas.read_csv() to read a text file. Why is this? This is my code:
    import pandas as pd
    df = pd.read_csv("LANGEVIN_DATA.txt", delim_whitespace=True)
Answer 1: The MemoryError means your file is too large for read_csv to load in one pass; you need to use the chunksize parameter to avoid the error, like this:
    import pandas as pd
    df = pd.read_csv(
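A minimal sketch of chunked reading, assuming the goal is a per-column summary rather than the whole table in memory at once. The helper name `column_means_chunked`, the tiny temporary file, and the column names `x` and `y` are made up for illustration; `sep=r"\s+"` stands in for `delim_whitespace=True`, which newer pandas releases deprecate. For scale, a float64 array of shape (118, 840983) needs roughly 118 × 840983 × 8 bytes, or about 794 MB.

```python
import os
import tempfile

import pandas as pd


def column_means_chunked(path, chunksize=100_000):
    """Compute column means while holding only one chunk in memory."""
    total = None
    rows = 0
    # With chunksize set, read_csv returns an iterator of DataFrames.
    for chunk in pd.read_csv(path, sep=r"\s+", chunksize=chunksize):
        part = chunk.sum(numeric_only=True)
        total = part if total is None else total + part
        rows += len(chunk)
    # Assumes no missing values; otherwise count non-null cells per column.
    return total / rows


# Tiny stand-in for the real data file (hypothetical contents).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as fh:
    fh.write("x y\n1 2\n3 4\n5 6\n")
    demo_path = fh.name

means = column_means_chunked(demo_path, chunksize=2)
os.remove(demo_path)
print(means)
```

Concatenating every chunk back into one DataFrame would need about as much memory as the original call, so chunking mainly helps when each piece can be reduced or filtered before it is kept.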

List of dict of dict in Pandas

Question: I have a list of dicts of dicts in the following form:
    [{0:{'city':'newyork', 'name':'John', 'age':'30'}}, {0:{'city':'newyork', 'name':'John', 'age':'30'}},]
I want to create a pandas DataFrame in the following form:
    city     name  age
    newyork  John  30
    newyork  John  30
I tried a lot without any success. Can you help me?
Answer 1: Use a list comprehension with concat and DataFrame.from_dict:
    L = [{0:{'city':'newyork', 'name':'John', 'age':'30'}}, {0:{'city':'newyork', 'name':'John', 'age':'30'}}]
    df = pd
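One way the combination of a list comprehension, `concat` and `DataFrame.from_dict` might look is sketched below. The answer's own code is cut off in the excerpt, so this is an assumption about the approach, not a reproduction of it; the variable name `frames` is illustrative.

```python
import pandas as pd

L = [
    {0: {"city": "newyork", "name": "John", "age": "30"}},
    {0: {"city": "newyork", "name": "John", "age": "30"}},
]

# orient="index" turns each outer key (0) into a row label
# and the inner keys into columns.
frames = [pd.DataFrame.from_dict(d, orient="index") for d in L]

# ignore_index=True replaces the repeated row label 0 with 0, 1, ...
df = pd.concat(frames, ignore_index=True)
print(df)
```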

Take difference between pivot table columns in Python

Question: I have a dataframe with Week, Campaign, Placement and Count columns. To compare counts per week by Campaign and Placement, I created a pivot table that works great. How do I create a new column with the difference between these 2 weeks (as a percentage if possible)?
Code:
    dfPivot = pd.pivot_table(dfPivot, values='Count', index=['Campaign', 'Placement'], columns=['Week'], aggfunc=np.sum)
Current output:
    Week                 2019-10-27  2019-11-03
    Campaign  Placement
    Code A    111111111      4288.0       615.0
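A minimal sketch of one way to add the difference columns, assuming the pivot has exactly two week columns. Only the `Code A` row comes from the excerpt; the `Code B` rows and the column names `Diff` and `Diff_pct` are invented for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Week": ["2019-10-27", "2019-11-03", "2019-10-27", "2019-11-03"],
    "Campaign": ["Code A", "Code A", "Code B", "Code B"],
    "Placement": [111111111, 111111111, 222222222, 222222222],
    "Count": [4288, 615, 100, 150],  # Code B values are hypothetical
})

pivot = pd.pivot_table(
    df,
    values="Count",
    index=["Campaign", "Placement"],
    columns="Week",
    aggfunc="sum",
)

# Capture the two week labels before adding any new columns.
first, last = pivot.columns[0], pivot.columns[-1]

pivot["Diff"] = pivot[last] - pivot[first]                # absolute change
pivot["Diff_pct"] = pivot["Diff"] / pivot[first] * 100    # relative to first week
print(pivot)
```

The percentage here is taken relative to the earlier week; dividing by the later week instead is an equally valid convention if that is what the report needs.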

How can I convert a png to a dataframe for python?

Question: I trained a model for Digit Recognizer (https://www.kaggle.com/c/digit-recognizer/data). The input data is a CSV file. Each row in the file represents an image that is 28 pixels high and 28 pixels wide, for a total of 784 pixels. The model is ready to use, but I wonder how I can create test data for this input. If I have an image of a digit, how can I convert it to a 28 by 28 pixel array? I tried the code below, but it renders the image background as
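The row layout described above can be sketched with NumPy and pandas alone. This is only an illustration of flattening a 28 by 28 array into one 784-column row: the synthetic image stands in for a PNG that has already been loaded and resized by an image library, and the `pixel0` … `pixel783` column names are an assumption about the CSV header.

```python
import numpy as np
import pandas as pd

# Synthetic 28x28 grayscale image: black background with a white vertical stroke.
img = np.zeros((28, 28), dtype=np.uint8)
img[4:24, 13:15] = 255

# Flatten row by row into a single record of 784 pixel values.
row = img.reshape(1, -1)

# Assumed column naming; adjust to match the training CSV header.
columns = [f"pixel{i}" for i in range(row.shape[1])]
sample = pd.DataFrame(row, columns=columns)
print(sample.shape)
```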

How do I subclass or otherwise extend a pandas DataFrame without breaking DataFrame.append()?

Question: I have a complex object I'd like to build around a pandas DataFrame. I've tried to do this with a subclass, but appending to the DataFrame reinitializes all properties in a new instance even when using _metadata, as recommended here. I know subclassing pandas objects is not recommended, but I don't know how to do what I want with composition (or any other method), so if someone can tell me how to do this without subclassing, that would be great. I'm working with the following code: import
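A minimal sketch of the composition route the question asks about, under the assumption that the extra state is a single attribute. The class name `Experiment` and the attribute `label` are hypothetical; `pd.concat` is used because `DataFrame.append` was removed in pandas 2.0.

```python
import pandas as pd


class Experiment:
    """Hypothetical wrapper that holds a DataFrame plus extra attributes."""

    def __init__(self, frame, label):
        self.frame = frame
        self.label = label

    def append(self, other):
        # Build a new wrapper so the extra attributes are carried over explicitly.
        combined = pd.concat([self.frame, other], ignore_index=True)
        return Experiment(combined, self.label)


exp = Experiment(pd.DataFrame({"a": [1, 2]}), label="run-1")
exp2 = exp.append(pd.DataFrame({"a": [3]}))
print(exp2.label, len(exp2.frame))
```

Because the wrapper decides what is copied, the extra attributes survive every operation that goes through its own methods; the cost is that DataFrame methods must be reached through `frame` or forwarded by hand.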

adding column based on count and unique count in python

Question: I have a dataframe as shown below:
    type  item
    new   apple
    new   apple
    new   io
    new   io
    old   apple
    old   io
    old   io
    old   se
    old   pj
    etc   el
I need to create a new dataframe based on count and unique count:
    type  type_count  unique_item_count
    new   4           2
    old   5           4
    etc   1           1
Column 'type_count' is based on the frequency of labels in column 'type'. Column 'unique_item_count' is based on the unique count of labels in column 'item' for each unique label in column 'type'. Also, if I add a new column:
    type  item   val
    new   apple  20
    new   apple  6
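The two counts can be sketched with a single `groupby` and named aggregation. This is one possible approach, not an answer taken from the excerpt; the variable name `summary` is illustrative, and `"count"` tallies non-null items, which matches the row count here because the sample has no missing values.

```python
import pandas as pd

df = pd.DataFrame({
    "type": ["new"] * 4 + ["old"] * 5 + ["etc"],
    "item": ["apple", "apple", "io", "io",
             "apple", "io", "io", "se", "pj",
             "el"],
})

summary = (
    df.groupby("type", sort=False)["item"]   # sort=False keeps first-seen order
      .agg(type_count="count", unique_item_count="nunique")
      .reset_index()
)
print(summary)
```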

Python - Pandas: number/index of the minimum value in the given row

Question: I have one pandas dataframe with one row and multiple columns. I want to get the column number/index of the minimum value in the given row. The code I found was:
    df.columns.get_loc('colname')
The above code asks for a column name, but my dataframe doesn't have column names. I want to get the column location of the minimum value.
Answer 1: Use argmin after converting the DataFrame to an array with values; this only works if all the data are numeric:
    df = pd.DataFrame({ 'B':[4,5,4,5,5,4], 'C':[7,8,9,4,2,3], 'D':[1,3,5,7,1
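For the one-row case in the question, the `argmin` idea can be sketched as follows. The sample numbers are invented, and `to_numpy()` is used as the currently recommended equivalent of `values`.

```python
import pandas as pd

# One row, no column names: pandas assigns the default labels 0..4.
df = pd.DataFrame([[7, 3, 9, 1, 5]])

# argmin along axis=1 gives, per row, the position of the smallest value.
pos = int(df.to_numpy().argmin(axis=1)[0])

# The label at that position (equal to the position for default labels).
label = df.columns[pos]
print(pos, label)
```

If several columns tie for the minimum, `argmin` reports the first one.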
