dataframe | 易学教程

pandas dataframe concat is giving unwanted NA/NaN columns

Question: Instead of this example, where the concatenation is horizontal ("After Pandas Dataframe pd.concat I get NaNs"), I'm trying a vertical one: import pandas a=[['Date', 'letters', 'numbers', 'mixed'], ['1/2/2014', 'a', '6', 'z1'], ['1/2/2014', 'a', '3', 'z1'], ['1/3/2014', 'c', '1', 'x3']] df = pandas.DataFrame.from_records(a[1:],columns=a[0]) f=[] for i in range(0,len(df)): f.append(df['Date'][i] + ' ' + df['letters'][i]) df['new']=f c=[x for x in range(0,5)] b=[] b += [['NA'] * (5 - len(b))] df_a = pandas.DataFrame.from
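One common cause of this symptom is that the two frames do not share column names. This is offered as an assumption, because the excerpt is cut off before the concat call. In that case `pd.concat` takes the union of the columns and fills the gaps with NaN. A minimal sketch, where the frames `top` and `bottom` are hypothetical and not the asker's data:

```python
import pandas as pd

# Hypothetical frames: "bottom" is built without column names,
# so pandas labels its columns 0 and 1.
top = pd.DataFrame([["1/2/2014", "a"]], columns=["Date", "letters"])
bottom = pd.DataFrame([["NA", "NA"]])

# Vertical concat aligns on column labels; labels that do not match
# become separate columns padded with NaN.
misaligned = pd.concat([top, bottom], ignore_index=True)

# Giving both frames the same labels before concatenating avoids the padding.
renamed = bottom.copy()
renamed.columns = top.columns
aligned = pd.concat([top, renamed], ignore_index=True)

print(misaligned)
print(aligned)
```

If the frames are meant to line up by position rather than by name, renaming the columns first, as above, is usually the simplest fix.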

How to find Average directional movement for stocks using Pandas?

Question: I have a dataframe of OHLCV data. Does anyone know of a tutorial or a way to compute the ADX (average directional movement) using pandas? import pandas as pd import yfinance as yf import matplotlib.pyplot as plt import datetime as dt import numpy as nm start=dt.datetime.today()-dt.timedelta(59) end=dt.datetime.today() df=pd.DataFrame(yf.download("MSFT", start=start, end=end)) The average directional index, or ADX, is the primary technical indicator among the five indicators
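A rough sketch of one way to compute ADX with plain pandas follows. It is not taken from the original thread. It assumes separate high, low, and close series, uses synthetic prices instead of a `yfinance` download, and approximates Wilder's smoothing with `ewm(alpha=1/n, adjust=False)`. The function name `adx` and the default period of 14 are illustrative choices.

```python
import numpy as np
import pandas as pd


def adx(high: pd.Series, low: pd.Series, close: pd.Series, n: int = 14) -> pd.Series:
    """Approximate ADX; smoothing uses an exponential mean with alpha = 1/n."""
    up_move = high.diff()
    down_move = -low.diff()

    # Directional movement: keep only the dominant, positive move of the bar.
    plus_dm = pd.Series(
        np.where((up_move > down_move) & (up_move > 0), up_move, 0.0), index=high.index
    )
    minus_dm = pd.Series(
        np.where((down_move > up_move) & (down_move > 0), down_move, 0.0), index=high.index
    )

    # True range: the largest of the three candidate ranges for each bar.
    prev_close = close.shift(1)
    true_range = pd.concat(
        [high - low, (high - prev_close).abs(), (low - prev_close).abs()], axis=1
    ).max(axis=1)

    def smooth(series: pd.Series) -> pd.Series:
        return series.ewm(alpha=1 / n, adjust=False).mean()

    atr = smooth(true_range)
    plus_di = 100 * smooth(plus_dm) / atr
    minus_di = 100 * smooth(minus_dm) / atr
    dx = 100 * (plus_di - minus_di).abs() / (plus_di + minus_di)
    return smooth(dx)


# Synthetic prices, used only so the sketch runs without network access.
rng = np.random.default_rng(0)
close = pd.Series(100 + rng.normal(0, 1, 60).cumsum())
high = close + rng.uniform(0.1, 1.0, 60)
low = close - rng.uniform(0.1, 1.0, 60)

adx_values = adx(high, low, close)
print(adx_values.tail())
```

With a downloaded frame, the `High`, `Low`, and `Close` columns would take the place of the synthetic series. Those column names are an assumption about the download's layout.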

Merge lines that share the same key into one line

Question: I have a Dataframe and would like to make another column that combines the columns whose names begin with the same value in Answer and QID. That is to say, having the following Dataframe QID Category Text QType Question: Answer0 Answer1 Country 0 16 Automotive Access to car Single Do you have access to a car? I own a car/cars I own a car/cars UK 1 16 Automotive Access to car Single Do you have access to a car? I lease/ have a company car I lease/have a company car UK 2 16 Automotive Access

Using formulas with aliases to perform multi-column operations

Question: This question is related to a previous one I asked, but tries to be more generic. I want to use formulas to perform operations on multiple "groups" of data (i.e. a_data1, a_data2, b_data1, b_data2, and then make operations using the *_data1 columns). Based on @akrun's answer to that question, I created the following function. It takes a one-sided formula and applies it to all the "groups of data": suppressPackageStartupMessages({ library(dplyr) library(tidyr) }) polymutate <- function(df

Error message when appending data to pandas dataframe

Question: Can someone give me a hand with this? I created a loop to append successive intervals of historical price data from Coinbase. My loop iterates successfully a few times, then crashes. Error message (under the data_temp code line): "ValueError: If using all scalar values, you must pass an index" days = 10 end = datetime.now().replace(microsecond=0) start = end - timedelta(days=days) data_price = pd.DataFrame() for i in range(1,50): print(start) print(end) data_temp = pd.DataFrame(public_client.get
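pandas raises this error when `pd.DataFrame` receives a dict whose values are all scalars. One plausible trigger in a loop like this one is an API call that occasionally returns an error dict instead of a list of rows, but that is an assumption, since the excerpt is truncated. The `response` payload below is hypothetical and is not an actual Coinbase reply:

```python
import pandas as pd

# Hypothetical payload: a dict of scalars rather than a list of rows.
response = {"message": "rate limit exceeded"}

try:
    pd.DataFrame(response)
except ValueError as exc:
    error_text = str(exc)
    print(error_text)

# Wrapping the dict in a list tells pandas it is a single row.
one_row = pd.DataFrame([response])
print(one_row)

# When accumulating in a loop, collecting frames in a list and
# concatenating once keeps each chunk easy to validate first.
chunks = [pd.DataFrame([[1, 2.0]], columns=["time", "price"]),
          pd.DataFrame([[2, 2.5]], columns=["time", "price"])]
combined = pd.concat(chunks, ignore_index=True)
```

Checking the type of each response before building a frame from it would show whether an unexpected payload is what stops the loop.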

How do I assign categories in a dataframe if they contain any element from another dataframe?

Question: I have two Excel sheets. One contains summaries and the other contains categories with potential filter words. I need to assign categories to the first dataframe if any element matches in the second dataframe. I have attempted to expand the list in the second dataframe and map by matching the terms to any words in the first dataframe. Data for the test: import pandas as pd data1 = {'Bucket':['basket', 'bushel', 'peck', 'box'], 'Summary':['This is a basket of red apples. They are sour.', 'We

Dataframe add element from a column based on values contiguity from another columns

Question: I have a df like this: a=[1,2,10,11,15,16,17,18,30] b=[5,6,7,8,9,1,2,3,4] df=pd.DataFrame(list(zip(a,b)),columns=['s','i']) Using a, I need to add up elements of b. The result I would like: (1-2)=5+6=11 (10-11)=7+8=15 (15-18)=9+1+2+3=15 (30)=4 My idea was to create a list of values that are continuous, take the difference (+1), and use it to calculate the sum of the corresponding b elements. #find continuous integer def r (nums): nums= list(df['s']) gaps = [[s, e] for s, e in zip(nums, nums[1:]) if s+1
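The desired sums can be reproduced by labelling each run of consecutive integers and grouping on that label. This is a sketch of one possible approach, not the asker's function; the names `run_id` and `result` are illustrative:

```python
import pandas as pd

a = [1, 2, 10, 11, 15, 16, 17, 18, 30]
b = [5, 6, 7, 8, 9, 1, 2, 3, 4]
df = pd.DataFrame(list(zip(a, b)), columns=["s", "i"])

# A new run starts wherever the step from the previous value is not 1.
# The first row has no predecessor (NaN), so it also starts a run.
run_id = df["s"].diff().ne(1).cumsum().rename("run")

# Sum column "i" within each run and record the run's first and last "s".
result = (
    df.groupby(run_id)
    .agg(start=("s", "first"), end=("s", "last"), total=("i", "sum"))
    .reset_index(drop=True)
)
print(result)
```

The `total` column holds 11, 15, 15, and 4, matching the sums listed in the question.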

ParserError: Error tokenizing data. C error

Question: I'm using a script, ScriptGlobal.py, that calls and executes two other scripts, script1.py and script2.py: exec(open("./script2.py").read()) AND exec(open("./script1.py").read()) The output of script1 is a csv file. df1.to_csv('file1.csv',index=False) The output of script2 is another csv file. df2.to_csv('file2.csv',index=False) In ScriptGlobal.py I want to read the two files, file1.csv and file2.csv, and then I get this error. ParserError: Error tokenizing

Rolling average calculating some values it shouldn't?

Question: Going off my question here, I was redirected to another thread and was able to adapt the code presented in that answer to get to where I want to be. I'm running into one problem now, though, and I'm a bit confused as to how it's coming about. My dataframe in essence looks as follows: Date HomeTeam AwayTeam HGoals AGoals HGRollA AGRollA 1/1 AAA BBB 4 2 2.67 1.67 Link to a more detailed image of said dataframe with some extra columns. Basically, every row has: -the date of the match -the home

Combine Two Lists in python in dataframe with file name added in one column and content in another

Question: I have a list of files in a folder on my system: file_list= ["A", "B", "C"] I have read the files using a for loop and have obtained content which is as follows: A = ["A1", "B1", "C1"] B = ["E1", "F1"] C = [] I would like the following output: Content Name A1 A B1 A C1 A D1 B E1 B C How do I accomplish this? Answer 1: Try this: import pandas as pd data = list(zip((A, B, C), file_list)) df = pd.DataFrame(data, columns=['Content', 'Name']) df = df.explode('Content') print(df) Output: Content Name 0 A1 A
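For reference, the answer's snippet can be run end to end as sketched below, using the lists given in the question. The `reset_index` call is an addition for readability and is not part of the quoted answer. An empty list such as `C` explodes to a single row whose `Content` is NaN.

```python
import pandas as pd

file_list = ["A", "B", "C"]
A = ["A1", "B1", "C1"]
B = ["E1", "F1"]
C = []

# Pair each file's content list with its file name, one row per file.
data = list(zip((A, B, C), file_list))
df = pd.DataFrame(data, columns=["Content", "Name"])

# explode() emits one row per list element; an empty list yields NaN.
df = df.explode("Content").reset_index(drop=True)
print(df)
```

Whether the NaN row for `C` should be kept or dropped depends on the output the asker needs; `df.dropna(subset=["Content"])` would remove it.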