pandas | 易学教程

pandas: Dataframe.replace() with regex

Question: I have a table which looks like this:

    df_raw = pd.DataFrame(dict(A = pd.Series(['1.00','-1']), B = pd.Series(['1.0','-45.00','-'])))

          A       B
    0  1.00     1.0
    1    -1  -45.00
    2   NaN       -

I would like to replace '-' with '0.00' using DataFrame.replace(), but it struggles because of the negative values '-1' and '-45.00'. How can I ignore the negative values and replace only '-' with '0.00'?

My code:

    df_raw = df_raw.replace(['-','\*'], ['0.00','0.00'], regex=True).astype(np.float64)

Error:

    ValueError: invalid
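One possible fix, sketched here under the assumption that only cells consisting of a lone '-' or '*' should become '0.00': anchor each pattern with `^` and `$` so it has to match the whole cell. The name `df_clean` is introduced for illustration only.

```python
import numpy as np
import pandas as pd

df_raw = pd.DataFrame(dict(A=pd.Series(['1.00', '-1']),
                           B=pd.Series(['1.0', '-45.00', '-'])))

# '^-$' matches a cell that is exactly '-', so '-1' and '-45.00' are left alone.
# '^\*$' does the same for a cell that is exactly a literal asterisk.
df_clean = (
    df_raw
    .replace([r'^-$', r'^\*$'], ['0.00', '0.00'], regex=True)
    .astype(np.float64)
)

print(df_clean)
```

Without the anchors, the pattern '-' matches the minus sign inside '-1', which turns it into a string that cannot be parsed as a float.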

Pandas Vectorized lookup of Dictionary

Question: This seems like it should be a common use case, but I'm not finding any good guidance on it. I have a solution that works, but I would rather use a vectorized lookup than the pandas apply() function. Here is an example of what I am doing:

    import pandas as pd

    example_dict = {
        "category1": {"field1": 0.0, "field2": 5.0},
        "category2": {"field1": 5.0, "field2": 8.0}}
    d = {"ids": range(10),
         "category": ["category1" if x % 2 == 0 else "category2" for x in range(10)]}
    df = pd.DataFrame
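The excerpt is cut off before the apply() call, so the exact lookup the asker wanted is not known. As a minimal sketch, assuming the goal is to attach the per-category field values to each row, the nested dictionary can be turned into a lookup table and joined on the category column. The names `lookup` and `df_joined` are illustrative.

```python
import pandas as pd

example_dict = {
    "category1": {"field1": 0.0, "field2": 5.0},
    "category2": {"field1": 5.0, "field2": 8.0},
}
df = pd.DataFrame({
    "ids": range(10),
    "category": ["category1" if x % 2 == 0 else "category2" for x in range(10)],
})

# One row per category, one column per field.
lookup = pd.DataFrame.from_dict(example_dict, orient="index")

# Join every field at once, matching df["category"] against the lookup index.
df_joined = df.join(lookup, on="category")

# For a single field, Series.map with a Series performs the same lookup.
df["field1"] = df["category"].map(lookup["field1"])

print(df_joined.head())
```

Both calls operate on whole columns, so no Python-level function runs per row.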

Problems with isin pandas

Question: Sorry, I just asked this question: "Pythonic Way to have multiple Or's when conditioning in a dataframe", but I marked it as answered prematurely because it passed my overly simplistic test case; it isn't working more generally. (If it is possible to merge and reopen the question, that would be great...) Here is the full issue:

    sum(data['Name'].isin(eligible_players))
    > 0
    sum(data['Name'] == "Antonio Brown")
    > 68
    "Antonio Brown" in eligible_players
    > True

Basically if I understand correctly, I am
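The excerpt does not show how `eligible_players` was built, so the cause cannot be confirmed from it. One situation that reproduces the same symptoms, shown here purely as an assumption, is `eligible_players` being a pandas Series with the names in its index: the `in` operator tests index labels, while `isin` compares against values. The data below is made up for illustration.

```python
import pandas as pd

data = pd.DataFrame({"Name": ["Antonio Brown", "Julio Jones", "Antonio Brown"]})

# Hypothetical: names live in the index, some other quantity in the values.
eligible_players = pd.Series([9000, 8500], index=["Antonio Brown", "Julio Jones"])

# `in` on a Series checks the index labels, not the values.
in_index = "Antonio Brown" in eligible_players

# isin compares against the Series values, so no name matches.
matches_values = int(data["Name"].isin(eligible_players).sum())

# Passing the index explicitly compares against the names.
matches_index = int(data["Name"].isin(eligible_players.index).sum())

print(in_index, matches_values, matches_index)
```

If `eligible_players` is instead a plain list or set, other causes such as stray whitespace in the names would need checking.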

Pandas groupby for multiple values in a column

Question: I have a data frame similar to the following:

    +-----------------+-------+
    | class           | year  |
    +-----------------+-------+
    | ['A', 'B']      | 2001  |
    | ['A']           | 2002  |
    | ['B']           | 2001  |
    | ['A', 'B', 'C'] | 2003  |
    | ['B', 'C']      | 2001  |
    | ['C']           | 2003  |
    +-----------------+-------+

I want to create a data frame from this so that the resulting table shows the count of each category in class per year:

    +-----+----+----+----+
    |year | A  | B  | C  |
    +-----+----+----+----+
    |2001 | 1  | 3  | 1  |
    |2002 | 1  | 0  | 0  |
    |2003 |
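A minimal sketch of one way to get such a table, assuming the `class` column holds real Python lists (not their string representation): expand each list into one row per category with `DataFrame.explode`, then count year/category pairs with `pd.crosstab`. The names `exploded` and `counts` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "class": [["A", "B"], ["A"], ["B"], ["A", "B", "C"], ["B", "C"], ["C"]],
    "year": [2001, 2002, 2001, 2003, 2001, 2003],
})

# One row per (year, category); reset the index so labels are unique again.
exploded = df.explode("class").reset_index(drop=True)

# Rows are years, columns are categories, cells are occurrence counts.
counts = pd.crosstab(exploded["year"], exploded["class"])

print(counts)
```

If the lists are stored as strings, they would first need parsing, for example with `ast.literal_eval`.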

Pandas: Apply function to each pair of columns

Question: I have a function f(x, y) that takes two pandas Series and returns a floating point number. I would like to apply f to each pair of columns in a DataFrame D and construct another DataFrame E of the returned values, so that f(D[i], D[j]) is the value in the i-th row and j-th column. The straightforward solution is to run a nested loop over all pairs of columns:

    E = pd.DataFrame([[f(D[i], D[j]) for j in D] for i in D], columns=D.columns, index=D.columns)

But is there a more elegant solution that perhaps
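One loop-free spelling, offered as a sketch and not as a performance improvement, nests two `DataFrame.apply` calls over the columns. The function `f` below is a hypothetical stand-in, chosen to be asymmetric so the row/column orientation is visible.

```python
import pandas as pd


def f(x: pd.Series, y: pd.Series) -> float:
    # Hypothetical example function: difference of the column sums.
    return float(x.sum() - y.sum())


D = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 7]})

# The outer apply walks over columns j and produces the columns of E.
# The inner apply walks over columns i and produces the rows of E.
E = D.apply(lambda col_j: D.apply(lambda col_i: f(col_i, col_j)))

print(E)
```

Here `E.loc[i, j]` equals `f(D[i], D[j])`. The function is still called once per ordered pair of columns, so the cost is the same as the nested loop.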
