pandas

Pandas DataFrame currency conversion

做~自己de王妃 submitted on 2021-02-08 08:17:07
Question: I have a DataFrame with two columns:
    col1 | col2
    20   | EUR
    31   | GBP
    5    | JPY
I may have 10000 rows like this. How can I quickly convert every amount to a base currency of GBP? Should I use easymoney? I know how to apply the conversion to a single row, but I do not know how to iterate through all the rows quickly. EDIT: I would like to apply something like:
    def convert_currency(amount, currency_symbol):
        converted = ep.currency_converter(amount=amount, from_currency=currency_symbol, to_currency="GBP")
        return converted
    df.loc[df
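
One way to avoid a per-row conversion call is to look up a rate once per distinct currency and then multiply the whole column at once. The sketch below assumes such a rate table is available. The `rates_to_gbp` values are made-up placeholders, not market data, and the `rate` and `gbp` column names are introduced only for illustration.

```python
import pandas as pd

df = pd.DataFrame({"col1": [20, 31, 5], "col2": ["EUR", "GBP", "JPY"]})

# Hypothetical rates: GBP received for one unit of each currency.
# Placeholder numbers only; in practice, fetch one rate per distinct
# currency in df["col2"] instead of converting row by row.
rates_to_gbp = {"EUR": 0.9, "GBP": 1.0, "JPY": 0.007}

# Vectorized lookup and multiplication over the whole column.
df["rate"] = df["col2"].map(rates_to_gbp)
df["gbp"] = df["col1"] * df["rate"]

print(df)
```

With 10000 rows and only a handful of currencies, this needs a handful of rate lookups rather than 10000 conversion calls.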

reshaping a data frame in pandas

安稳与你 submitted on 2021-02-08 08:16:19
Question: Is there a simple way in pandas to reshape the following data frame:
    df = pd.DataFrame({'n': [1,1,2,2,1,1,2,2],
                       'l': ['a','b','a','b','a','b','a','b'],
                       'v': [12,43,55,19,23,52,61,39],
                       'g': [0,0,0,0,1,1,1,1]})
to this format?
    g  a1  b1  a2  b2
    0  12  43  55  19
    1  23  52  61  39
Answer 1:
    In [75]: df['ln'] = df['l'] + df['n'].astype(str)
    In [76]: df.set_index(['g', 'ln'])['v'].unstack('ln')
    Out[76]:
    ln  a1  a2  b1  b2
    g
    0   12  55  43  19
    1   23  61  52  39
    [2 rows x 4 columns]
If you need that ordering then:
    In [77]: df.set
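
The answer's approach can be run end to end as below. The final column selection is just one possible way to get the a1, b1, a2, b2 order asked for in the question. It is not necessarily what the truncated part of the answer did.

```python
import pandas as pd

df = pd.DataFrame({
    "n": [1, 1, 2, 2, 1, 1, 2, 2],
    "l": ["a", "b", "a", "b", "a", "b", "a", "b"],
    "v": [12, 43, 55, 19, 23, 52, 61, 39],
    "g": [0, 0, 0, 0, 1, 1, 1, 1],
})

# Build the combined label (a1, b1, a2, b2) from the letter and the number.
df["ln"] = df["l"] + df["n"].astype(str)

# Move the labels from rows to columns, one row per value of g.
wide = df.set_index(["g", "ln"])["v"].unstack("ln")

# unstack sorts the labels alphabetically; select them in the desired order.
wide = wide[["a1", "b1", "a2", "b2"]]

print(wide)
```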

Split DataFrame Randomly (dependent on unique values)

可紊 submitted on 2021-02-08 08:13:12
Question: I have a DataFrame df that looks like this:
    | A     | B   | ... |
    ---------------------
    | one   | ... | ... |
    | one   | ... | ... |
    | one   | ... | ... |
    | two   | ... | ... |
    | three | ... | ... |
    | three | ... | ... |
    | four  | ... | ... |
    | five  | ... | ... |
    | five  | ... | ... |
As you can see, there are 5 unique values in A. I want to split the DataFrame randomly. For example, I want 3 unique values in DataFrame df1 and 2 unique values in DataFrame df2. My problem is that they aren't unique. I don't
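
A minimal sketch of one reading of this question: sample the distinct values of A rather than the rows, then route every row by whether its value was sampled. The contents of column B and the fixed `random_state` are assumptions made so the example is reproducible.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["one", "one", "one", "two", "three", "three", "four", "five", "five"],
    "B": range(9),  # placeholder data; the original column is elided
})

# Sample 3 of the 5 distinct values of A (not 3 rows).
chosen = df["A"].drop_duplicates().sample(n=3, random_state=0)

# Rows whose A value was chosen go to df1, all others to df2.
mask = df["A"].isin(chosen)
df1 = df[mask]
df2 = df[~mask]

print(df1["A"].unique(), df2["A"].unique())
```

Because the split is made on values, all rows sharing a value of A end up in the same frame.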

Mapping rows of a Pandas dataframe to numpy array

柔情痞子 submitted on 2021-02-08 08:00:46
Question: Sorry, I know there are so many questions relating to indexing, and it's probably staring me in the face, but I'm having a little trouble with this. I am familiar with the .loc, .iloc, and .index methods and with slicing in general. The method .reset_index may not have been (and may not be able to be) called on our dataframe, and therefore the index labels may not be in order. The dataframe and numpy array(s) are actually different length subsets of the dataframe, but for this example I'll keep them the

Collating timestamped events into date ranges with pandas

99封情书 submitted on 2021-02-08 08:00:13
Question: I have a master data frame with batch numbers and a datetime range in which these batches occurred, like so:
    BatchNo     StartTime            Event A  Event B
    BATCH23797  2013-09-06 02:22:00  0        0
    BATCH23798  2013-09-06 06:06:00  0        0
    BATCH23799  2013-09-06 14:33:00  0        0
    BATCH23800  2013-09-06 18:12:00  0        0
    BATCH23801  2013-09-06 21:38:00  0        0
And then I have another data frame of timestamps for events that I am interested in. I have multiple ones of these with the data in different formats, but at the end of the day I will have a list of
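
One possible way to collate events into batches, assuming each batch runs from its StartTime until the next batch starts: `pd.merge_asof` attaches to each event the most recent batch start at or before it. The `events` frame and its `EventTime` column are hypothetical, since the question's event data is not shown.

```python
import pandas as pd

batches = pd.DataFrame({
    "BatchNo": ["BATCH23797", "BATCH23798", "BATCH23799", "BATCH23800", "BATCH23801"],
    "StartTime": pd.to_datetime([
        "2013-09-06 02:22:00", "2013-09-06 06:06:00", "2013-09-06 14:33:00",
        "2013-09-06 18:12:00", "2013-09-06 21:38:00",
    ]),
})

# Hypothetical event timestamps, invented for this example.
events = pd.DataFrame({
    "EventTime": pd.to_datetime([
        "2013-09-06 03:00:00", "2013-09-06 07:15:00", "2013-09-06 07:45:00",
    ]),
})

# Both frames must be sorted on their keys. The default direction
# ("backward") picks the last StartTime that is <= EventTime.
tagged = pd.merge_asof(
    events.sort_values("EventTime"),
    batches.sort_values("StartTime"),
    left_on="EventTime",
    right_on="StartTime",
)

# Count events per batch and write the counts back to the master frame.
counts = tagged["BatchNo"].value_counts()
batches["Event A"] = batches["BatchNo"].map(counts).fillna(0).astype(int)

print(batches)
```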

Set the color for scatter-plot with DataFrame.plot

回眸只為那壹抹淺笑 submitted on 2021-02-08 07:50:20
Question: I am using Python to plot a pandas DataFrame. I set the color for plotting like this:
    allDf = pd.DataFrame({
        'x': [0,1,2,4,7,6],
        'y': [0,3,2,4,5,7],
        'a': [1,1,1,0,0,0],
        'c': ['red','green','blue','red','green','blue']
    }, index=['p1','p2','p3','p4','p5','p6'])
    allDf.plot(kind='scatter', x='x', y='y', c='c')
    plt.show()
However, it doesn't work (every point is blue). If I change the DataFrame definition to
    'c': [1,2,1,2,1,2]
the points are colored, but only in black and white. I want to use blue
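
A minimal sketch of one workaround, assuming the goal is for each point to take the color named in column c: pass the color names themselves to matplotlib's `Axes.scatter` instead of passing the column name to `DataFrame.plot`. The `Agg` backend line is only there so the example also runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

allDf = pd.DataFrame({
    "x": [0, 1, 2, 4, 7, 6],
    "y": [0, 3, 2, 4, 5, 7],
    "a": [1, 1, 1, 0, 0, 0],
    "c": ["red", "green", "blue", "red", "green", "blue"],
}, index=["p1", "p2", "p3", "p4", "p5", "p6"])

fig, ax = plt.subplots()

# Hand matplotlib one color name per point.
points = ax.scatter(allDf["x"], allDf["y"], c=allDf["c"].tolist())
ax.set_xlabel("x")
ax.set_ylabel("y")

# Call plt.show() or fig.savefig(...) here as needed.
```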

Set the color for scatter-plot with DataFrame.plot

落花浮王杯 submitted on 2021-02-08 07:50:15
Question: I am using Python to plot a pandas DataFrame. I set the color for plotting like this:
    allDf = pd.DataFrame({
        'x': [0,1,2,4,7,6],
        'y': [0,3,2,4,5,7],
        'a': [1,1,1,0,0,0],
        'c': ['red','green','blue','red','green','blue']
    }, index=['p1','p2','p3','p4','p5','p6'])
    allDf.plot(kind='scatter', x='x', y='y', c='c')
    plt.show()
However, it doesn't work (every point is blue). If I change the DataFrame definition to
    'c': [1,2,1,2,1,2]
the points are colored, but only in black and white. I want to use blue

Comparing two excel file with pandas

时光怂恿深爱的人放手 submitted on 2021-02-08 07:50:04
Question: I have two Excel files, A and B. A is the master copy, where the updated record of employee name and organization name (Name and Org) is available. File B contains Name and Org columns with slightly older records, plus many other columns that we are not interested in.
       Name  Org
    0  abc   ddc systems
    1  sdc   ddc systems
    2  csc   ddd systems
    3  rdc   kbf org
    4  rfc   kbf org
I want to do two operations on this: 1) I want to compare Excel B (columns Name and Org) with Excel A (columns Name and Org) and update file B with all
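
A rough sketch of the first operation, assuming the aim is to refresh the Org values in B from the master copy A by matching on Name. In-memory frames stand in for the two `pd.read_excel` results. The rows of `b`, its `Other` column, and the `changed` flag are hypothetical and exist only for illustration.

```python
import pandas as pd

# Stand-in for the master file A (values taken from the table in the question).
a = pd.DataFrame({
    "Name": ["abc", "sdc", "csc", "rdc", "rfc"],
    "Org": ["ddc systems", "ddc systems", "ddd systems", "kbf org", "kbf org"],
})

# Stand-in for the older file B; these rows are invented for the example.
b = pd.DataFrame({
    "Name": ["abc", "csc", "xyz"],
    "Org": ["old org", "ddd systems", "zzz org"],
    "Other": [1, 2, 3],
})

# Look up each Name from B in the master copy.
master_org = a.drop_duplicates("Name").set_index("Name")["Org"]
new_org = b["Name"].map(master_org)

# Flag rows whose Org differs from the master, then update them.
# Names missing from A keep their existing Org.
b["changed"] = new_org.notna() & (new_org != b["Org"])
b["Org"] = new_org.fillna(b["Org"])

print(b)
```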

Read multiple csv files into separate pandas dataframes

折月煮酒 submitted on 2021-02-08 07:49:57
Question: I've seen a few answers on reading multiple csv files into separate pandas dataframes, and I am still running into trouble. I've read my csv files and file names into a dictionary:
    path = os.getcwd()
    file_names = ['file1', 'thisisanotherfile', 'file3']
    df_dict = {x: pd.read_csv('{}/{}.csv'.format(path, x)) for x in file_names}
Which seems to work:
    print(df_dict['file1'])
However, what I'm looking for is a pandas dataframe called 'file1' where I can access the data. Is it possible to get this
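
A small self-contained sketch of the dictionary approach from the question. The temporary directory and the `value` column are assumptions added so the example can create its own CSV files. A plain assignment from the dictionary gives an ordinary variable for any frame that is needed by name.

```python
import tempfile
from pathlib import Path

import pandas as pd

file_names = ["file1", "thisisanotherfile", "file3"]

# Create small throwaway CSV files so the example runs anywhere.
tmp_dir = Path(tempfile.mkdtemp())
for name in file_names:
    pd.DataFrame({"value": [1, 2, 3]}).to_csv(tmp_dir / f"{name}.csv", index=False)

# One DataFrame per file, keyed by file name.
df_dict = {name: pd.read_csv(tmp_dir / f"{name}.csv") for name in file_names}

# Bind a regular variable to the frame that is needed by name.
file1 = df_dict["file1"]
print(file1)
```

Keeping the frames in the dictionary and binding names explicitly tends to be easier to maintain than generating variable names dynamically.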

Set the color for scatter-plot with DataFrame.plot

不羁岁月 submitted on 2021-02-08 07:49:51
Question: I am using Python to plot a pandas DataFrame. I set the color for plotting like this:
    allDf = pd.DataFrame({
        'x': [0,1,2,4,7,6],
        'y': [0,3,2,4,5,7],
        'a': [1,1,1,0,0,0],
        'c': ['red','green','blue','red','green','blue']
    }, index=['p1','p2','p3','p4','p5','p6'])
    allDf.plot(kind='scatter', x='x', y='y', c='c')
    plt.show()
However, it doesn't work (every point is blue). If I change the DataFrame definition to
    'c': [1,2,1,2,1,2]
the points are colored, but only in black and white. I want to use blue