pandas | 易学教程

Pandas DataFrame currency conversion

Read more about Pandas DataFrame currency conversion

Question: I have a DataFrame with two columns:

    col1 | col2
    20     EUR
    31     GBP
    5      JPY

I may have 10,000 rows like this. How can I quickly convert every amount to a base currency of GBP? Should I use easymoney? I know how to apply the conversion to a single row, but I do not know how to iterate through all the rows quickly.

EDIT: I would like to apply something like:

    def convert_currency(amount, currency_symbol):
        converted = ep.currency_converter(amount=1000, from_currency=currency_symbol, to_currency="GBP")
        return converted

    df.loc[df
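As a rough sketch of how a column-wise conversion could avoid iterating over rows: look up one rate per currency code, then multiply whole columns. The `RATES_TO_GBP` table and the `rate` and `gbp` column names are hypothetical and the rates are placeholder values, not real exchange rates. In practice the table might be filled by calling a converter once per distinct currency instead of once per row.

```python
import pandas as pd

# Hypothetical placeholder rates to GBP, for illustration only (not real rates).
RATES_TO_GBP = {"GBP": 1.0, "EUR": 0.5, "JPY": 0.25}

df = pd.DataFrame({"col1": [20, 31, 5], "col2": ["EUR", "GBP", "JPY"]})

# Look up the rate for each row's currency code in one vectorised step.
df["rate"] = df["col2"].map(RATES_TO_GBP)

# Multiply the amount column by the rate column; no Python-level row loop.
df["gbp"] = df["col1"] * df["rate"]

print(df)
```

With 10,000 rows but only a handful of distinct currencies, this needs one lookup per currency, not one per row.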

reshaping a data frame in pandas

Read more about reshaping a data frame in pandas

Question: Is there a simple way in pandas to reshape the following data frame:

    df = pd.DataFrame({'n':[1,1,2,2,1,1,2,2], 'l':['a','b','a','b','a','b','a','b'], 'v':[12,43,55,19,23,52,61,39], 'g':[0,0,0,0,1,1,1,1] })

to this format?

    g  a1  b1  a2  b2
    0  12  43  55  19
    1  23  52  61  39

Answer 1:

    In [75]: df['ln'] = df['l'] + df['n'].astype(str)

    In [76]: df.set_index(['g', 'ln'])['v'].unstack('ln')
    Out[76]:
    ln  a1  a2  b1  b2
    g
    0   12  55  43  19
    1   23  61  52  39

    [2 rows x 4 columns]

If you need that ordering then:

    In [77]: df.set
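The rest of that answer is cut off in this listing. As a hedged sketch, one way to get the requested a1, b1, a2, b2 order is to select the columns explicitly after unstacking. This is an assumption about the intent, not necessarily what the original answer did, and the name `wide` is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "n": [1, 1, 2, 2, 1, 1, 2, 2],
    "l": ["a", "b", "a", "b", "a", "b", "a", "b"],
    "v": [12, 43, 55, 19, 23, 52, 61, 39],
    "g": [0, 0, 0, 0, 1, 1, 1, 1],
})

# Build the combined label (a1, b1, ...) and pivot it into columns.
df["ln"] = df["l"] + df["n"].astype(str)
wide = df.set_index(["g", "ln"])["v"].unstack("ln")

# unstack sorts the column labels; pick them in the order the question asks for.
wide = wide[["a1", "b1", "a2", "b2"]]

print(wide)
```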

Split DataFrame Randomly (dependent on unique values)

Read more about Split DataFrame Randomly (dependent on unique values)

Question: I have a DataFrame df that looks like this:

    | A     | B   | ... |
    ---------------------
    | one   | ... | ... |
    | one   | ... | ... |
    | one   | ... | ... |
    | two   | ... | ... |
    | three | ... | ... |
    | three | ... | ... |
    | four  | ... | ... |
    | five  | ... | ... |
    | five  | ... | ... |

As you can see, A has 5 unique values. I want to split the DataFrame randomly. For example, I want 3 unique values in DataFrame df1 and 2 unique values in DataFrame df2. My problem is that they aren't unique. I don't
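A minimal sketch of one way to split on unique values: sample the distinct values of A, then route every row by whether its value was sampled. The contents of column B, the `random_state` seed, and the names `keys`, `picked`, and `mask` are assumptions for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["one", "one", "one", "two", "three", "three", "four", "five", "five"],
    "B": range(9),  # hypothetical payload column
})

# Sample among the distinct values of A, not among the rows.
keys = pd.Series(df["A"].unique())
picked = keys.sample(n=3, random_state=0)

# Every row whose A value was picked goes to df1; the rest go to df2.
mask = df["A"].isin(picked)
df1 = df[mask]
df2 = df[~mask]

print(df1["A"].unique(), df2["A"].unique())
```

Because the sample is drawn from the unique values, all rows sharing a value of A end up in the same half.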

Mapping rows of a Pandas dataframe to numpy array

Read more about Mapping rows of a Pandas dataframe to numpy array

Question: Sorry, I know there are so many questions relating to indexing, and it's probably staring me in the face, but I'm having a little trouble with this. I am familiar with the .loc, .iloc, and .index methods and slicing in general. The method .reset_index may not have been (and may not be able to be) called on our dataframe, and therefore index labels may not be in order. The dataframe and numpy array(s) are actually different-length subsets of the dataframe, but for this example I'll keep them the

Collating timestamped events into date ranges with pandas

Read more about Collating timestamped events into date ranges with pandas

Question: I have a master data frame with batch numbers and a datetime range for which these batches occurred, like so:

    BatchNo     StartTime            Event A  Event B
    BATCH23797  2013-09-06 02:22:00  0        0
    BATCH23798  2013-09-06 06:06:00  0        0
    BATCH23799  2013-09-06 14:33:00  0        0
    BATCH23800  2013-09-06 18:12:00  0        0
    BATCH23801  2013-09-06 21:38:00  0        0

And then I have another of timestamps for events that I am interested in. I have multiple of these, with the data in different formats, but at the end of the day I will have a list of

Set the color for scatter-plot with DataFrame.plot

Read more about Set the color for scatter-plot with DataFrame.plot

Question: I am using Python to plot a pandas DataFrame. I set the color for plotting like this:

    allDf = pd.DataFrame({
        'x':[0,1,2,4,7,6],
        'y':[0,3,2,4,5,7],
        'a':[1,1,1,0,0,0],
        'c':['red','green','blue','red','green','blue']
    },index = ['p1','p2','p3','p4','p5','p6'])
    allDf.plot(kind='scatter',x='x',y='y',c='c')
    plt.show()

However, it doesn't work (every point is blue). If I change the definition of the DataFrame like this:

    'c':[1,2,1,2,1,2]

the points are colored, but only in black and white. I want to use blue

Comparing two excel file with pandas

Read more about Comparing two excel file with pandas

Question: I have two Excel files, A and B. A is the master copy, which holds the updated record of employee name and organization name (Name and Org). File B contains Name and Org columns with slightly older records, plus many other columns that we are not interested in.

       Name  Org
    0  abc   ddc systems
    1  sdc   ddc systems
    2  csc   ddd systems
    3  rdc   kbf org
    4  rfc   kbf org

I want to do two operations on this: 1) I want to compare Excel B (columns Name and Org) with Excel A (columns Name and Org) and update file B with all

Read multiple csv files into separate pandas dataframes

Read more about Read multiple csv files into separate pandas dataframes

Question: I've seen a few answers on reading multiple CSV files into separate pandas dataframes, and am still running into trouble. I've read my CSV files and file names into a dictionary:

    path = os.getcwd()
    file_names = ['file1', 'thisisanotherfile', 'file3']
    df_dict = {x: pd.read_csv('{}/{}.csv'.format(path, x)) for x in file_names}

Which seems to work:

    print(df_dict['file1'])

However, what I'm looking for is a pandas dataframe called 'file1' where I can access the data. Is it possible to get this
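As a hedged sketch of one possible approach: keep the dictionary as the container and bind an ordinary variable to the entry you need. The temporary directory, the sample CSV contents, and the column name `x` are assumptions added only so the example runs on its own.

```python
import os
import tempfile

import pandas as pd

file_names = ["file1", "thisisanotherfile", "file3"]

with tempfile.TemporaryDirectory() as path:
    # Write small hypothetical CSV files so there is something to read.
    for name in file_names:
        sample = pd.DataFrame({"x": [1, 2]})
        sample.to_csv(os.path.join(path, f"{name}.csv"), index=False)

    # One DataFrame per file, keyed by file name.
    df_dict = {
        name: pd.read_csv(os.path.join(path, f"{name}.csv"))
        for name in file_names
    }

# Bind a plain variable to the DataFrame stored under 'file1'.
file1 = df_dict["file1"]
print(file1)
```

An explicit assignment like this keeps the set of variable names visible in the code, while the dictionary still handles any number of files.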
