pandas

Pandas merge without duplicating columns

喜你入骨 submitted on 2021-02-10 23:35:37
Question: I need to merge two dataframes without creating duplicate columns. The first dataframe (dfa) has missing values. The second dataframe (dfb) has unique values. This would be the same as a VLOOKUP in Excel.

dfa looks like this:

    postcode  lat  lon  ...plus 32 more columns
    M20       2.3  0.2
    LS1       NaN  NaN
    LS1       NaN  NaN
    LS2       NaN  NaN
    M21       2.4  0.3

dfb only contains unique postcodes and values where lat and lon were NaN in dfa. It looks like this:

    postcode  lat  lon
    LS1       1.4  0.1
    LS2       1.5  0.2

The output I would like is:
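The desired output is cut off in this excerpt, so the following is only a sketch under the assumption that the goal is to fill the NaN lat/lon values in dfa from dfb by postcode, without a merge that would create lat_x/lat_y columns. The sample frames reproduce only the three columns shown in the question.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question (the 32 extra columns are omitted).
dfa = pd.DataFrame({
    "postcode": ["M20", "LS1", "LS1", "LS2", "M21"],
    "lat": [2.3, np.nan, np.nan, np.nan, 2.4],
    "lon": [0.2, np.nan, np.nan, np.nan, 0.3],
})
dfb = pd.DataFrame({
    "postcode": ["LS1", "LS2"],
    "lat": [1.4, 1.5],
    "lon": [0.1, 0.2],
})

# Index the lookup table by postcode; this relies on dfb postcodes being unique.
lookup = dfb.set_index("postcode")

# Fill only the missing cells; existing values in dfa are left untouched.
for col in ["lat", "lon"]:
    dfa[col] = dfa[col].fillna(dfa["postcode"].map(lookup[col]))

print(dfa)
```

Because the lookup is applied column by column with fillna, no suffixed duplicate columns appear and the other columns of dfa are not affected.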

Pandas Get Sequence of 1s and 0s Given Strings

为君一笑 submitted on 2021-02-10 23:19:45
Question: Given the following:

    import pandas as pd
    df = pd.DataFrame({'a':['K','1','1,2,3']})
    df

           a
    0      K
    1      1
    2  1,2,3

I would like to convert the values in column a to a corresponding sequence of 1s and 0s given this map:

    K 1 2 3 4 5
    1 1 1 1 1 1

If a value is present, a 1 is put in place of a 0. If the value is not present, the place is held by a 0. If no value is present, the sequence would be a string of six 0s. So "K" would be:

    100000

And "1,2,3" would be:

    011100

Desired result:

           a       b
    0      K  100000
    1      1  010000
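One possible way to build the sequences described above is sketched here. The names `positions` and `to_bits` are illustrative helpers, not part of the original question, and the code assumes the values in column a are comma-separated without surrounding spaces.

```python
import pandas as pd

df = pd.DataFrame({'a': ['K', '1', '1,2,3']})

# Slot order taken from the map in the question.
positions = ['K', '1', '2', '3', '4', '5']

def to_bits(value):
    """Return a 6-character string with '1' for each slot present in value."""
    present = set(value.split(','))
    return ''.join('1' if p in present else '0' for p in positions)

df['b'] = df['a'].apply(to_bits)
print(df)
```

An empty string splits into a single empty token that matches no slot, so it yields six 0s, as the question requires.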

Transforming a correlation matrix to a 3 column dataframe in pandas?

陌路散爱 submitted on 2021-02-10 22:49:41
Question: I have a correlation matrix like so:

       a    b    c
    a  1    0.5  0.3
    b  0.5  1    0.7
    c  0.3  0.7  1

And I want to transform this into a dataframe where the columns are like this:

    Letter1  letter2  correlation
    a        a        1
    a        b        0.5
    a        c        0.3
    b        a        0.5
    b        b        1
    .        .        .
    .        .        .

Is there a pandas command to allow me to do this? Thanks in advance.

And a follow-up to this: can I assign a value to the letters in Letter1 like so?

    Value1  Letter1  Value2  letter2  correlation
    1       a        1       a        1
    1       a        2       b        0.5
    1       a        3       c        0.3
    2       b        1       a        0.5
    2       b        2       b        1
    .       .        .       .        .
    .       .        .       .
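A minimal sketch of one approach, using `DataFrame.stack` to turn the matrix into long form. The letter-to-number mapping (`values`) is an assumption inferred from the example rows in the question (a is 1, b is 2, c is 3); the question does not say where those numbers come from.

```python
import pandas as pd

# Correlation matrix copied from the question.
corr = pd.DataFrame(
    [[1.0, 0.5, 0.3],
     [0.5, 1.0, 0.7],
     [0.3, 0.7, 1.0]],
    index=list('abc'),
    columns=list('abc'),
)

# stack() moves the column labels into a second index level, giving one row
# per (row label, column label) pair; reset_index() turns both into columns.
long = corr.stack().reset_index()
long.columns = ['Letter1', 'letter2', 'correlation']

# Follow-up: attach a number to each letter (assumed mapping, see above).
values = {'a': 1, 'b': 2, 'c': 3}
long.insert(0, 'Value1', long['Letter1'].map(values))
long.insert(2, 'Value2', long['letter2'].map(values))

print(long)
```

Depending on the pandas version, `stack` may emit a FutureWarning about its new implementation; the result for a matrix without missing values is the same either way.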

How to include multiple data columns in a seaborn barplot?

自闭症网瘾萝莉.ら submitted on 2021-02-10 22:19:35
Question: I have a dataframe that looks like this: I have used a barplot to represent the subscribers for each row. This is what I did:

    data = channels.sort_values('subscribers', ascending=False).head(5)
    chart = sns.barplot(x='name', y='subscribers', data=data)
    chart.set_xticklabels(chart.get_xticklabels(), rotation=90)
    for p in chart.patches:
        chart.annotate("{:,.2f}".format(p.get_height(), '.2f'),
                       (p.get_x() + p.get_width() / 2., p.get_height()),
                       ha='center', va='center', xytext=(0, 10),
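The dataframe itself is not shown in this excerpt, so the following is only a sketch with made-up data. A common way to show several data columns in one seaborn barplot is to reshape the frame to long form with `melt` and pass the new label column as `hue`. Only `name` and `subscribers` come from the question; the `videos` column, the sample values, and the helper `plot_grouped` are hypothetical.

```python
import pandas as pd

# Hypothetical data: 'videos' is an invented second metric for illustration.
channels = pd.DataFrame({
    'name': ['A', 'B', 'C'],
    'subscribers': [300, 200, 100],
    'videos': [30, 50, 10],
})

top = channels.sort_values('subscribers', ascending=False).head(5)

# Long form: one row per (channel, metric) pair.
long_df = top.melt(id_vars='name',
                   value_vars=['subscribers', 'videos'],
                   var_name='metric',
                   value_name='count')

def plot_grouped(data):
    """Draw one bar per metric for each channel (requires seaborn)."""
    import seaborn as sns  # imported lazily so the reshaping runs without it
    return sns.barplot(x='name', y='count', hue='metric', data=data)

print(long_df)
```

Calling `plot_grouped(long_df)` would draw the grouped bars; metrics on very different scales may still need separate axes or normalisation to be readable.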

Combine values of two columns of dataframe into one column

♀尐吖头ヾ submitted on 2021-02-10 21:55:12
Question: Hi, I want to append two column values into a single column in pandas, something like what is shown below. Can anyone help me out with that?

| t1   | t2   | v1 | v2 |
|------|------|----|----|
| 0.0  | 10   | 1  | -1 |
| 0.42 | 0.78 | 1  | -1 |

new dataframe

| t1,t2 combined | v1,v2 combined |
|----------------|----------------|
| 0.0            | 1              |
| 0.42           | 1              |
| 10             | -1             |
| 0.78           | -1             |

Answer 1: pd.wide_to_long should work:

    df['value'] = list(range(0,2))
    pd.wide_to_long(df, stubnames=['t', 'v'], i='value', j='dropme',
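The answer above is cut off, so rather than guess at its remaining arguments, here is a sketch of a different technique that gives the layout requested in the question: concatenating the paired columns with `pd.concat`. The output column names `t` and `v` are chosen for illustration.

```python
import pandas as pd

# Data copied from the question.
df = pd.DataFrame({
    't1': [0.0, 0.42],
    't2': [10, 0.78],
    'v1': [1, 1],
    'v2': [-1, -1],
})

# Stack t2 under t1 and v2 under v1; ignore_index renumbers the rows 0..3.
out = pd.DataFrame({
    't': pd.concat([df['t1'], df['t2']], ignore_index=True),
    'v': pd.concat([df['v1'], df['v2']], ignore_index=True),
})

print(out)
```

This keeps the row order from the question (all t1/v1 rows first, then all t2/v2 rows) and does not need a helper id column.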

Set legend position when plotting a pandas dataframe with a second y-axis via pandas plotting interface [duplicate]

你说的曾经没有我的故事 submitted on 2021-02-10 20:41:40
Question: This question already has answers here: How to plot two pandas time series on same plot with legends and secondary y-axis? (2 answers) Closed 2 years ago.

I am plotting a pandas dataframe with a second y-axis via the pandas plotting interface, as described in the documentation, like this:

    df = pd.DataFrame(np.random.randn(24*3, 3), index=pd.date_range('1/1/2019', periods=24*3, freq='h'))
    df.columns = ['A (left)', 'B (right)', 'C (right)']
    ax = df.plot(secondary_y=['B (right)', 'C (right)'], mark

Interactively change a point plot in bokeh using RangeSlider to select columns in a pandas dataframe

自闭症网瘾萝莉.ら submitted on 2021-02-10 20:31:40
Question: I have a pandas dataframe df where the first two columns represent x, y coordinates and the remaining columns represent time slices (t0, ..., tn), where the presence (1) or absence (0) of each point at each time slice (ti) is recorded. I would like to use a RangeSlider (not a Slider) so that I can slide across a range of time slices and plot the points that are present within that range. This is what I have so far:

    from bokeh.layouts import column
    from bokeh.plotting import figure, show
    from bokeh
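The bokeh code is cut off in this excerpt, so the sketch below covers only the pandas side of the problem: picking the points present within a range of time slices. It assumes "present within that range" means present in at least one slice of the range (use `.all(axis=1)` instead for presence in every slice). The sample data and the helper `points_in_range` are hypothetical.

```python
import pandas as pd

# Hypothetical data in the layout described: x, y, then 0/1 time-slice columns.
df = pd.DataFrame({
    'x': [0.0, 1.0, 2.0, 3.0],
    'y': [0.0, 1.0, 0.5, 2.0],
    't0': [1, 0, 0, 0],
    't1': [1, 1, 0, 0],
    't2': [0, 1, 0, 1],
})
time_cols = list(df.columns[2:])

def points_in_range(frame, start, end):
    """Return x, y of points present in at least one slice in [start, end]."""
    selected = time_cols[start:end + 1]
    mask = frame[selected].any(axis=1)
    return frame.loc[mask, ['x', 'y']]

visible = points_in_range(df, 0, 1)
print(visible)
```

In a bokeh callback, the RangeSlider's `value` pair would supply `start` and `end` (converted to integers), and the returned rows would replace the data of the plotted source.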

Error saving files into google drive via google colab

℡╲_俬逩灬. submitted on 2021-02-10 20:07:54
Question: I am trying to save files to my Google Drive from a Colab notebook and I keep getting the same error. I have already mounted my drive. When I call pwd, I get the following, which seems right:

    /content/drive/My Drive/

Here is some example code and its read-out:

    from google.colab import drive
    drive.mount('/content/drive')
    import pandas as pd
    import numpy as np
    df = pd.DataFrame(np.random.randint(0,100,size=(100, 4)), columns=list('ABCD'))
    print(df)
    df.to_csv('test.csv')

         A   B   C   D
    0   38  28  18  74
    1   36  54  84  13
    2    2   1