loc

How to lag data by x specific days on a multi index pandas dataframe?

只谈情不闲聊 posted on 2021-02-08 09:58:56
Question: I have a dataframe that has dates, assets, and then price/volume data. I'm trying to pull in data from 7 days ago, but the issue is that I can't use shift() because my table has missing dates in it.
    date      cusip  price  price_7daysago
    1/1/2017  a      1
    1/1/2017  b      2
    1/2/2017  a      1.2
    1/2/2017  b      2.3
    1/8/2017  a      1.1    1
    1/8/2017  b      2.2    2
I've tried creating a lambda function that uses loc and timedelta to do this shifting, but I was only able to output empty numpy arrays: def row_delta(x, df, days,
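One way to lag by calendar days when dates are missing is to shift the date column itself and merge the result back, rather than shifting rows. The following is a minimal sketch, not the asker's code: it assumes date and cusip are ordinary columns (call reset_index() first if they live in a MultiIndex), and the names lagged and result are made up for illustration.

```python
import pandas as pd

# Sample data copied from the question; dates are month/day/year.
df = pd.DataFrame({
    "date": pd.to_datetime(["1/1/2017", "1/1/2017", "1/2/2017",
                            "1/2/2017", "1/8/2017", "1/8/2017"]),
    "cusip": ["a", "b", "a", "b", "a", "b"],
    "price": [1.0, 2.0, 1.2, 2.3, 1.1, 2.2],
})

# Build a copy whose dates are moved 7 days forward, so that a price
# observed on day D lines up with the row for day D + 7.
lagged = df[["date", "cusip", "price"]].copy()
lagged["date"] = lagged["date"] + pd.Timedelta(days=7)
lagged = lagged.rename(columns={"price": "price_7daysago"})

# A left merge keeps every original row; rows with no observation
# exactly 7 days earlier get NaN in price_7daysago.
result = df.merge(lagged, on=["date", "cusip"], how="left")
print(result)
```

Because the match is on exact dates, a row only receives a lagged price when the same cusip has an observation exactly 7 days earlier.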

Apply loc for 2 columns values Pandas

≡放荡痞女 posted on 2021-01-27 14:50:41
Question: I'm trying to filter a dataframe with loc using conditions on 2 columns. If I do paises_cpm = df.loc[a] it works, but if I do paises_cpm = df.loc[a,b] I receive an error: IndexingError: Unalignable boolean Series provided as indexer (index of the boolean Series and of the indexed object do not match).
    import pandas as pd
    import time
    fecha = time.strftime(str((int(time.strftime("%d")))-1))
    subastas = int(fecha) * 5000
    impresiones = int(fecha) * 1000
    df = pd.read_csv('Cliente_x_Pais.csv')
    a = df['Subastas'] >
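The error is consistent with how loc reads its arguments: in df.loc[a, b] the second item is treated as a column indexer, so a row-aligned boolean Series in that position cannot be aligned with the column labels. A minimal sketch of combining both row conditions into one mask follows. The sample data, the Pais and Impresiones column names, and the fixed thresholds are assumptions for illustration, since the original code is cut off.

```python
import pandas as pd

# Hypothetical stand-in for Cliente_x_Pais.csv; only 'Subastas' appears
# in the original snippet, the other columns are assumed.
df = pd.DataFrame({
    "Pais": ["AR", "BR", "CL", "MX"],
    "Subastas": [12000, 3000, 9000, 20000],
    "Impresiones": [500, 2500, 1800, 4000],
})

subastas = 5000      # assumed threshold
impresiones = 1000   # assumed threshold

a = df["Subastas"] > subastas
b = df["Impresiones"] > impresiones

# Combine the two row masks with & and pass a single row indexer to loc.
paises_cpm = df.loc[a & b]
print(paises_cpm)
```

Use | instead of & if a row should be kept when either condition holds.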

Select row using the length of list in pandas cell

柔情痞子 posted on 2020-01-07 05:38:29
Question: I have a table df
       a  b  c
    1  x  y  [x]
    2  x  z  [c,d]
    3  x  t  [e,f,g]
Just wondering how to select rows by the length of the list in column c, with something like df.loc[len(df.c) >1]. I know this is not right... what should the right one be?
Answer 1: You can use df.loc[np.array(list(map(len,df.c.values)))>1]
Answer 2: Try this: df[df.c.map(len)>1]
Source: https://stackoverflow.com/questions/47720421/select-row-using-the-length-of-list-in-pandas-cell
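A short runnable version of the element-wise length filter, using the table from the question. The variable names long_rows and same_rows are made up for illustration; Series.str.len() is shown as an alternative that also measures list elements.

```python
import pandas as pd

# Table from the question: column c holds Python lists.
df = pd.DataFrame(
    {
        "a": ["x", "x", "x"],
        "b": ["y", "z", "t"],
        "c": [["x"], ["c", "d"], ["e", "f", "g"]],
    },
    index=[1, 2, 3],
)

# len(df.c) is the number of rows (3), not a per-row length, which is
# why the original attempt fails. Map len over each cell instead.
long_rows = df[df.c.map(len) > 1]

# Equivalent alternative: the .str accessor computes per-element length.
same_rows = df[df.c.str.len() > 1]

print(long_rows)
```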

Cryptic warning pops up when doing pandas assignment with loc and iloc

南楼画角 posted on 2019-12-25 05:32:09
Question: There is a statement in my code that goes: df.loc[i] = [df.iloc[0][0], i, np.nan] where i is the iteration variable of the for loop that this statement resides in, np is my imported numpy module, and df is a DataFrame that looks something like:
       build_number  name   cycles
    0  390           adpcm  21598
    1  390           aes    5441
    2  390           dfadd  463
    3  390           dfdiv  1323
    4  390           dfmul  167
    5  390           dfsin  39589
    6  390           gsm    6417
    7  390           mips   4205
    8  390           mpeg2  1993
    9  390           sha    348417
So as you can see, the statement in my code

Returning subset of each group from a pandas groupby object

半城伤御伤魂 posted on 2019-12-24 15:00:27
Question: I have a multilevel dataframe that looks like:
                date_time            name  note  value
    list index
    1    0      2015-05-22 05:37:59  Tom   129   False
         1      2015-05-22 05:38:59  Tom   0     True
         2      2015-05-22 05:39:59  Tom   0     False
         3      2015-05-22 05:40:59  Tom   45    True
    2    4      2015-05-22 05:37:59  Kate  129   True
         5      2015-05-22 05:41:59  Kate  0     False
         5      2015-05-22 05:37:59  Kate  0     True
I want to iterate over list and, for the first row of each list group, check the value of column value; if it is False, delete this row. So the final goal is to delete

Finding the row number for the header row in a CSV file / Pandas Dataframe

南楼画角 posted on 2019-12-24 01:16:01
Question: I am trying to get an index or row number for the row that holds the headers in my CSV file. The issue is that the header row can move up and down depending on the output of the report from our system (I have no control to change this). Code:
    ht = pd.read_csv(file.csv)
    test = ht.get_loc('Code') #Code being header im using to locate the header row
    csv1 = read_csv(file.csv, header=test)
    df1 = df1.append(csv1) #Appending as have many files
If I were to print test, I would expect a number around 4 or 5
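One possible approach is to scan the raw file for the row that contains the known header cell and then tell read_csv to start there. This is a sketch under assumptions, not the accepted answer: the helper find_header_row and the sample report text are hypothetical, and the scan assumes no quoted field spans multiple lines before the header.

```python
import csv
import io

import pandas as pd


def find_header_row(text, marker="Code"):
    """Return the 0-based row number of the first CSV row containing marker."""
    for row_number, row in enumerate(csv.reader(io.StringIO(text))):
        if marker in row:
            return row_number
    raise ValueError(f"no row contains the header cell {marker!r}")


# Hypothetical report: two preamble lines, then the real header row.
raw = (
    "Report generated by system\n"
    "Run date,2019-12-24\n"
    "Code,Name,Value\n"
    "A1,alpha,1\n"
    "B2,beta,2\n"
)

header_row = find_header_row(raw)

# Skip the preamble lines so the located row is parsed as the header.
df1 = pd.read_csv(io.StringIO(raw), skiprows=header_row)
print(header_row)
print(df1)
```

For a real file, read its text with open(path) and pass the path to read_csv. When combining many files, collecting the frames in a list and calling pd.concat once avoids DataFrame.append, which was removed in pandas 2.0.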

Resources containing cross-language benchmarks?

烂漫一生 posted on 2019-12-22 08:24:27
Question: What resources are available that use benchmarks for comparing programming languages? I am interested in both how quickly a program in a given language can execute a given benchmark, and how many lines of code are required in a given language to implement a given benchmark. There is a long-standing web site called the Computer Language Benchmarks Game, originally created by Doug Bagley as the "Great Computer Language Shootout". (You can view a little history at Portland Patterns Repository.) Is

interacting over a dateframe with functions

这一生的挚爱 posted on 2019-12-21 20:36:27
Question: If I have a dataframe like this:
                NEG_00_04  NEG_04_08  NEG_08_12  NEG_12_16  NEG_16_20  NEG_20_24  \
    datum_von
    2017-10-12  21.69      15.36      0.87       1.42       0.76       0.65
    2017-10-13  11.85      8.08       1.39       2.86       1.02       0.55
    2017-10-14  7.83       5.88       1.87       2.04       2.29       2.18
    2017-10-15  14.64      11.28      2.62       3.35       2.13       1.25
    2017-10-16  5.11       5.82       -0.30      -0.38      -0.24      -0.10
    2017-10-17  12.09      9.61       0.20       1.09       0.39       0.57
I want to keep the values that are above 0 and change the ones that are lower to zero. I'm not sure how I should use the function
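Replacing negative values with zero does not need a row-by-row function; a vectorized call covers the whole frame. Below is a minimal sketch using two rows and three columns taken from the table above. The names clipped and masked are made up for illustration.

```python
import pandas as pd

# Subset of the data shown in the question.
df = pd.DataFrame(
    {
        "NEG_00_04": [5.11, 12.09],
        "NEG_08_12": [-0.30, 0.20],
        "NEG_12_16": [-0.38, 1.09],
    },
    index=pd.to_datetime(["2017-10-16", "2017-10-17"]).rename("datum_von"),
)

# clip(lower=0) raises every value below 0 up to 0 and leaves the rest.
clipped = df.clip(lower=0)

# Equivalent with where: keep values where the condition holds, else 0.
masked = df.where(df > 0, 0)

print(clipped)
```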

Pandas loc multiple conditions [duplicate]

泄露秘密 posted on 2019-12-20 03:19:55
Question: This question already has answers here: Selecting with complex criteria from pandas.DataFrame (4 answers). Closed 6 months ago. I have a dataframe and I want to delete all rows where column A is equal to blue and column B is also equal to green. I thought the below should work, but it does not. Can anyone see the problem? df=df.loc[~(df['A']=='blue' & df['B']=='green')]
Answer 1: You should separate the two propositions with parentheses: df1=df.loc[~((df['A']=='blue') & (df['B']=='green'))]
Answer 2: use eq instead of
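The underlying issue is operator precedence: & binds more tightly than ==, so each comparison needs its own parentheses before the masks are combined. The sketch below uses made-up sample data. The eq form is an assumption about where the truncated second answer was heading, and the variable names are illustrative.

```python
import pandas as pd

# Hypothetical data covering all four colour combinations.
df = pd.DataFrame({
    "A": ["blue", "blue", "red", "red"],
    "B": ["green", "red", "green", "red"],
})

# Parenthesize each comparison, combine with &, then negate the whole mask.
kept = df.loc[~((df["A"] == "blue") & (df["B"] == "green"))]

# Series.eq avoids the precedence problem because it is a method call.
kept_eq = df.loc[~(df["A"].eq("blue") & df["B"].eq("green"))]

# Negating each condition separately is a different filter: it also drops
# rows where only one of the two conditions holds.
too_strict = df.loc[~(df["A"] == "blue") & ~(df["B"] == "green")]

print(kept)
print(too_strict)
```

Only the blue/green row is removed in kept, while too_strict keeps just the row that is neither blue nor green.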