pandas

Python Multiindex Dataframe remove maximum

ぐ巨炮叔叔 submitted on 2021-02-07 04:08:15
Question: I am struggling with a MultiIndex DataFrame in Python pandas. Suppose I have a df like this:

                    count        day
    group name
    A     Anna         10     Monday
          Beatrice     15    Tuesday
    B     Beatrice     15  Wednesday
          Cecilia      20   Thursday

What I need is to find the maximum in name for each group and remove it from the dataframe. The final df would look like:

                    count        day
    group name
    A     Anna         10     Monday
    B     Beatrice     15  Wednesday

Does any of you have any idea how to do this? I am running out of ideas... Thanks in advance! EDIT: What if the original
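One possible approach, sketched below, assumes that "maximum" means the row with the largest `count` within each `group`. The question's wording is ambiguous, although the sample data gives the same result if the alphabetically last `name` is meant instead. The frame is rebuilt from the values shown in the question; the variable names `max_labels` and `result` are illustrative only.

```python
import pandas as pd

# Rebuild the example frame from the question.
index = pd.MultiIndex.from_tuples(
    [("A", "Anna"), ("A", "Beatrice"), ("B", "Beatrice"), ("B", "Cecilia")],
    names=["group", "name"],
)
df = pd.DataFrame(
    {
        "count": [10, 15, 15, 20],
        "day": ["Monday", "Tuesday", "Wednesday", "Thursday"],
    },
    index=index,
)

# Assumption: "maximum" refers to the largest count in each group.
# idxmax on the grouped column yields one (group, name) label per group.
max_labels = df.groupby(level="group")["count"].idxmax()

# Drop those labels from the MultiIndex.
result = df.drop(index=max_labels.tolist())
print(result)
```

Note that `idxmax` returns only the first label when several rows tie for the maximum, so ties would need separate handling.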

Pandas - Merge two DataFrame with partial match

孤街醉人 submitted on 2021-02-07 03:47:48
Question: Given the data frames illustrated in the image below, I would like to merge on ['A','B','C'] and ['X','Y','Z'] first, then gradually look for a match with one less column, i.e. ['A','B'] and ['X','Y'], then ['A'] and ['X'], without duplicating the rows of the result. In the example below, a,y,y,v3 is left out since a,d,d already matched. My code so far matches on all 3 columns:

    df1 = pd.DataFrame({"A":['a','b','c'],"B":['d','e','f'],"C":['d','e','f']})
    df2 = pd.DataFrame({"X":['a','b','a','c'],
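The cascading match described above could be sketched roughly as follows. Because the question's `df2` is cut off, the `Y`, `Z`, and `V` columns here are hypothetical values chosen only to reproduce the behaviour described (the a,y,y,v3 row is skipped because the a,d,d row already matched). The helper column `row_id` is likewise an assumption, used to track which rows of `df1` are still unmatched.

```python
import pandas as pd

df1 = pd.DataFrame({"A": ["a", "b", "c"], "B": ["d", "e", "f"], "C": ["d", "e", "f"]})

# Hypothetical df2: only column X comes from the question; Y, Z and V are assumed.
df2 = pd.DataFrame(
    {
        "X": ["a", "b", "a", "c"],
        "Y": ["d", "e", "y", "x"],
        "Z": ["d", "x", "y", "x"],
        "V": ["v1", "v2", "v3", "v4"],
    }
)

left_keys = ["A", "B", "C"]
right_keys = ["X", "Y", "Z"]

# Keep an explicit row id so already-matched df1 rows can be excluded later.
remaining = df1.reset_index().rename(columns={"index": "row_id"})
pieces = []

# Try 3 key columns, then 2, then 1; each pass only sees unmatched df1 rows.
for n in range(len(left_keys), 0, -1):
    hit = remaining.merge(
        df2, left_on=left_keys[:n], right_on=right_keys[:n], how="inner"
    )
    pieces.append(hit)
    remaining = remaining[~remaining["row_id"].isin(hit["row_id"])]

result = pd.concat(pieces, ignore_index=True).drop(columns="row_id")
print(result)
```

This sketch stops looking for a `df1` row once it has matched at any level, but it does not prevent one `df1` row from matching several `df2` rows at the same level; that would need an extra de-duplication step.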

Python: Grouping by date and finding the average of a column inside a dataframe

前提是你 submitted on 2021-02-07 03:33:49
Question: I have a data frame that has three columns. Time represents every day of the month for various months. What I am trying to do is take the 'Count' value per day, average it over each month, and do this for each country. The output must be in the form of a data frame. Current data:

    Time        Country  Count
    2017-01-01  us       7827
    2017-01-02  us       7748
    2017-01-03  us       7653
    ..          ..
    2017-01-30  us       5432
    2017-01-31  us       2942
    2017-01-01  us       5829
    2017-01-02  ca       9843
    2017-01-03  ca       7845
    ..          ..
    2017-01-30  ca       8654
    2017-01-31  ca
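A minimal sketch of one way to get the monthly average per country, assuming `Time` can be parsed as a date. The January values are taken from the question (with the 2017-01-01 value of 5829 assigned to `ca` purely for illustration), and the February rows are made-up data added so that more than one month appears.

```python
import pandas as pd

# Small sample; February rows are hypothetical, added only for illustration.
df = pd.DataFrame(
    {
        "Time": ["2017-01-01", "2017-01-02", "2017-01-01", "2017-01-02",
                 "2017-02-01", "2017-02-02"],
        "Country": ["us", "us", "ca", "ca", "us", "us"],
        "Count": [7827, 7748, 5829, 9843, 7000, 8000],
    }
)

# Parse the dates, then group by country and calendar month.
df["Time"] = pd.to_datetime(df["Time"])
monthly = (
    df.groupby(["Country", df["Time"].dt.to_period("M")])["Count"]
    .mean()
    .reset_index()  # back to an ordinary DataFrame with Country, Time, Count
)
print(monthly)
```

`reset_index()` is what turns the grouped result back into a plain data frame, which is the output form the question asks for.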

combine/merge two csv using pandas/python

别说谁变了你拦得住时间么 submitted on 2021-02-07 03:30:58
Question: I have two CSVs that I want to combine as a left join. My key column is "id". Both CSVs have the same non-key column, "result", and I want to override the "result" column wherever a value exists in the "result" column of the 2nd CSV. How can I achieve that using pandas or any scripting language? Please see my final expected output.

Input

input.csv:

    id,scenario,data1,data2,result
    1,s1,300,400,"{s1,not added}"
    2,s2,500,101,"{s2 added}"
    3,s3,600,202,

output.csv:

    id,result
    1,"{s1,added}"
    3,"{s3
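The left join with an override could be sketched as below. The CSV text is held in `io.StringIO` objects so the example runs without files; the second row of the second CSV is truncated in the question, so `{s3 placeholder}` is a made-up stand-in and not the original value. The `_new` suffix is also just an illustrative choice.

```python
import io
import pandas as pd

input_csv = io.StringIO(
    'id,scenario,data1,data2,result\n'
    '1,s1,300,400,"{s1,not added}"\n'
    '2,s2,500,101,"{s2 added}"\n'
    '3,s3,600,202,\n'
)
# The value for id 3 is a hypothetical placeholder (truncated in the question).
output_csv = io.StringIO(
    'id,result\n'
    '1,"{s1,added}"\n'
    '3,"{s3 placeholder}"\n'
)

left = pd.read_csv(input_csv)
right = pd.read_csv(output_csv)

# Left join on id; the right-hand result column arrives as result_new.
merged = left.merge(right, on="id", how="left", suffixes=("", "_new"))

# Prefer the new value where one exists, otherwise keep the original.
merged["result"] = merged["result_new"].combine_first(merged["result"])
merged = merged.drop(columns="result_new")
print(merged)
```

With real files, the two `StringIO` objects would simply be replaced by the file paths passed to `pd.read_csv`.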

creating a pandas dataframe from a database query that uses bind variables

人盡茶涼 submitted on 2021-02-07 03:30:35
Question: I'm working with an Oracle database. I can do this much:

    import pandas as pd
    import pandas.io.sql as psql
    import cx_Oracle as odb
    conn = odb.connect(_user +'/'+ _pass +'@'+ _dbenv)
    sqlStr = "SELECT * FROM customers"
    df = psql.frame_query(sqlStr, conn)

But I don't know how to handle bind variables, like so:

    sqlStr = """SELECT * FROM customers WHERE id BETWEEN :v1 AND :v2 """

I've tried these variations:

    params = (1234, 5678)
    params2 = {"v1":1234, "v2":5678}
    df = psql.frame_query((sqlStr,params
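In current pandas, `frame_query` is no longer available, and `pd.read_sql_query` accepts a `params` argument that is handed to the database driver. The sketch below uses an in-memory `sqlite3` database as a stand-in for Oracle so that it can run anywhere; the table contents are made up. `sqlite3` happens to accept the same `:v1` / `:v2` named style with a dict, and cx_Oracle also takes a dict for named binds, but whether a given driver and pandas version work together should be checked rather than assumed.

```python
import sqlite3
import pandas as pd

# Stand-in database: sqlite3 instead of Oracle, with hypothetical rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1000, "a"), (1234, "b"), (3000, "c"), (5678, "d"), (9000, "e")],
)

sql = "SELECT * FROM customers WHERE id BETWEEN :v1 AND :v2 ORDER BY id"

# Named bind variables are supplied as a dict through params=.
df = pd.read_sql_query(sql, conn, params={"v1": 1234, "v2": 5678})
print(df)
```

Passing the values through `params` rather than formatting them into the SQL string keeps the statement reusable and avoids SQL injection.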
