pandas

Create a dataframe from arrays python

社会主义新天地 submitted on 2021-02-08 14:16:09
Question: I'm trying to construct a DataFrame (I'm using the pandas library) from some arrays and one matrix. In particular, if I have two arrays like this:

    A = [A, B, C]
    B = [D, E, F]

and one matrix like this:

    1 2 2
    3 3 3
    4 4 4

can I create a dataset like this?

      A B C
    D 1 2 2
    E 3 3 3
    F 4 4 4

Maybe this is a stupid question, but I'm very new to Python and pandas. I have seen this: https://pandas.pydata.org/pandas-docs/version/0.23.4/generated/pandas.DataFrame.html but it specifies only 'columns'. I should read the matrix row for ...
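A minimal sketch of one way to get the layout asked for above, assuming the first array holds the column labels and the second the row labels. The variable names `columns`, `index`, and `matrix` are illustrative and not from the question. The `pandas.DataFrame` constructor accepts an `index` argument alongside `columns`, so the matrix does not have to be read row by row.

```python
import numpy as np
import pandas as pd

# Assumed inputs: the labels are plain strings, the matrix is a 3x3 array.
columns = ["A", "B", "C"]   # first array from the question -> column labels
index = ["D", "E", "F"]     # second array from the question -> row labels
matrix = np.array([[1, 2, 2],
                   [3, 3, 3],
                   [4, 4, 4]])

# index= labels the rows, columns= labels the columns.
df = pd.DataFrame(matrix, index=index, columns=columns)
print(df)
```

A single cell can then be looked up by its row and column label, for example `df.loc["E", "A"]`.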

Pandas how to get rows with consecutive dates and sales more than 1000?

江枫思渺然 submitted on 2021-02-08 14:09:13
Question: I have a data frame called df:

    Date        Sales
    01/01/2020  812
    02/01/2020  981
    03/01/2020  923
    04/01/2020  1033
    05/01/2020  988
    ...         ...

How can I get the first occurrence of 7 consecutive days with sales above 1000? This is what I am doing to find the rows where sales are above 1000:

    In [221]: df.loc[df["Sales"] >= 1000]
    Out[221]:
    Date        Sales
    04/01/2020  1033
    08/01/2020  1008
    09/01/2020  1091
    17/01/2020  1080
    18/01/2020  1121
    19/01/2020  1098
    ...         ...

Answer 1: You can assign a unique identifier per consecutive ...
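The answer is cut off after "a unique identifier per consecutive", so the following is only a sketch of one common way to build such an identifier, not necessarily the answerer's method. It assumes `Date` has already been parsed to datetime, reads "above 1000" as strictly greater, and uses made-up sample data. The helper name `first_consecutive_run` is hypothetical.

```python
import pandas as pd

# Made-up sample data (not from the question): 16 daily rows.
df = pd.DataFrame({
    "Date": pd.date_range("2020-01-01", periods=16, freq="D"),
    "Sales": [812, 981, 1100, 1033, 988] + [1010] * 8 + [900] * 3,
})

def first_consecutive_run(frame, threshold=1000, length=7):
    """Hypothetical helper: first `length` consecutive days with Sales > threshold."""
    frame = frame.sort_values("Date")
    above = frame["Sales"] > threshold
    # A new run starts when the above/below flag flips
    # or when the previous row is not exactly one day earlier.
    new_run = above.ne(above.shift()) | frame["Date"].diff().ne(pd.Timedelta(days=1))
    run_id = new_run.cumsum()                      # identifier per consecutive run
    run_size = run_id.map(run_id.value_counts())   # length of the run each row is in
    hits = frame[above & (run_size >= length)]
    if hits.empty:
        return hits
    first_id = run_id.loc[hits.index[0]]
    return frame[run_id == first_id].head(length)

result = first_consecutive_run(df)
print(result)
```

The cumulative sum over the "new run starts here" flag is what gives every consecutive stretch its own identifier; once each row knows the size of its stretch, picking the first stretch that is long enough is a plain filter.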

pandas: Group by splitting string value in all rows (a column) and aggregation function

最后都变了- submitted on 2021-02-08 13:49:19
Question: If I have a dataset like this:

    id  person_name                  salary
    0   [alexander, william, smith]  45000
    1   [smith, robert, gates]       65000
    2   [bob, alexander]             56000
    3   [robert, william]            80000
    4   [alexander, gates]           70000

If we sum the salary column, we get 316000. What I really want to know is how much each distinct name ('alexander', 'smith', etc.) makes in total, summing the salaries of every row whose name list contains that name.

Output:

    group      sum_salary
    alexander  171000
    ...
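A minimal sketch of one possible approach, assuming `person_name` holds Python lists (if the column holds strings such as `"[alexander, william, smith]"`, they would need to be split into lists first). It uses `DataFrame.explode`, available since pandas 0.25, to give each name its own row before grouping.

```python
import pandas as pd

df = pd.DataFrame({
    "person_name": [
        ["alexander", "william", "smith"],
        ["smith", "robert", "gates"],
        ["bob", "alexander"],
        ["robert", "william"],
        ["alexander", "gates"],
    ],
    "salary": [45000, 65000, 56000, 80000, 70000],
})

# One row per (name, salary) pair, then sum the salaries per name.
sums = df.explode("person_name").groupby("person_name")["salary"].sum()

# Shape the result like the output sketched in the question.
out = sums.rename_axis("group").reset_index(name="sum_salary")
print(out)
```

Because a salary is counted once for every name in its row, the per-name sums add up to more than the 316000 column total.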

Python/Pandas: Converting numbers by comma separated for thousands

我怕爱的太早我们不能终老 submitted on 2021-02-08 13:45:24
Question: I have a dataframe with a column containing long numbers. I am trying to convert all the values in that column to strings with comma separators for thousands.

    df
    col_1    col_2
    Rooney   34590927
    Ronaldo  5467382
    John     25647398

How do I iterate and get the following result?

Expected result:

    col_1    col_2
    Rooney   34,590,927
    Ronaldo  5,467,382
    John     25,647,398

Answer 1: You can use string formatting:

    df['col_2'] = pd.to_numeric(df['col_2'].fillna(0), errors='coerce')
    df['col_2'] = df['col_2'].map('{:,.2f}'.format)

Do ...
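A short sketch of the string-formatting idea, assuming `col_2` already holds integers with no missing values. The format spec `'{:,.2f}'` quoted in the answer would append two decimals (`34,590,927.00`); the plain `'{:,}'` spec used here matches the expected result in the question.

```python
import pandas as pd

df = pd.DataFrame({
    "col_1": ["Rooney", "Ronaldo", "John"],
    "col_2": [34590927, 5467382, 25647398],
})

# "{:,}" inserts a comma every three digits; no explicit loop is needed.
df["col_2"] = df["col_2"].map("{:,}".format)
print(df)
```

After this step `col_2` contains strings, so any arithmetic should be done before formatting.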

Set values based on df.query?

大兔子大兔子 submitted on 2021-02-08 13:15:43
Question: I'd like to set the value of a column based on a query. I could probably use .where to accomplish this, but the criteria for .query are strings, which are easier for me to maintain, especially when the criteria become complex.

    import numpy as np
    import pandas as pd
    np.random.seed(51723)
    df = pd.DataFrame(np.random.rand(n, 3), columns=list('abc'))

I'd like to make a new column, d, and set its value to 1 where these criteria are met:

    criteria = '(a < b) & (b < c)'

Among other things, I've tried:
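The snippet ends before the attempts are shown, so this is only a sketch of two ways the string criteria could be reused, under stated assumptions: `n = 100` is a made-up value (the question leaves `n` undefined), rows that do not match are set to 0 (the question does not say what they should hold), and the second column `e` exists purely for comparison.

```python
import numpy as np
import pandas as pd

np.random.seed(51723)
n = 100  # assumed value; not given in the question
df = pd.DataFrame(np.random.rand(n, 3), columns=list("abc"))

criteria = "(a < b) & (b < c)"

# Option 1: DataFrame.eval evaluates the string to a boolean Series.
df["d"] = df.eval(criteria).astype(int)

# Option 2: DataFrame.query returns the matching rows; reuse their index with .loc.
df["e"] = 0
df.loc[df.query(criteria).index, "e"] = 1

print(df.head())
```

Both options keep the criteria as a single string. The `.loc` variant relies on the index labels being unique.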
