pandas

Joining a list of tuples within a pandas dataframe

↘锁芯ラ submitted on 2021-02-05 07:12:05
Question: I want to join a list of tuples within a dataframe. I have tried several methods of doing this within the dataframe, with join and with lambda:

    import pandas as pd
    from nltk import word_tokenize, pos_tag, pos_tag_sents

    data = {'Categories': ['animal','plant','object'],
            'Type': ['tree','dog','rock'],
            'Comment': ['The NYC tree is very big', 'NY The cat from the UK is small', 'The rock was found in LA.']}

    def posTag(data):
        data = pd.DataFrame(data)
        comments = data['Comment'].tolist()
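The excerpt is cut off before the desired output is shown, so the target format is an assumption here. A minimal sketch, assuming each cell holds a list of `(word, tag)` tuples (as `pos_tag_sents` would return) and that the goal is one `word_tag` string per row. The `Tagged` and `Joined` column names, the `join_tagged` helper, and the sample tags are hypothetical:

```python
import pandas as pd

# Hypothetical tagged output; in the question this would come from nltk's pos_tag_sents.
tagged = [
    [("The", "DT"), ("tree", "NN")],
    [("The", "DT"), ("cat", "NN")],
]
df = pd.DataFrame({"Tagged": tagged})


def join_tagged(pairs, sep="_"):
    """Join each (word, tag) tuple with `sep`, then join the pieces with spaces."""
    return " ".join(f"{word}{sep}{tag}" for word, tag in pairs)


# apply() calls the helper once per cell, so each list of tuples becomes one string.
df["Joined"] = df["Tagged"].apply(join_tagged)
print(df["Joined"].tolist())
```

Keeping the joining logic in a named function rather than an inline lambda makes it easier to test on a single row before applying it to the whole column.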

How do you fill NaN with mean of a subset of a group?

假如想象 submitted on 2021-02-05 07:11:30
Question: I have a data frame with some values by year and type. I want to replace all NaN values in each year with the mean of the values in that year that have a specific type. I would like to do this in the most elegant way possible. I'm dealing with a lot of data, so less computation would be good as well. Example:

    df = pd.DataFrame({'year': [1,1,1,2,2,2],
                       'type': [1,1,2,1,1,2],
                       'val': [np.nan,5,10,100,200,np.nan]})

I want ALL NaNs, regardless of their type, to be replaced with their respective year mean of all
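One possible approach, sketched under the assumption that the "specific type" is type 1 (the excerpt is truncated before it says which). The per-year mean is computed only over rows of that type, then mapped back onto every row by year. The helper name `fill_with_type_mean` and its `type_value` parameter are hypothetical:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'year': [1, 1, 1, 2, 2, 2],
                   'type': [1, 1, 2, 1, 1, 2],
                   'val': [np.nan, 5, 10, 100, 200, np.nan]})


def fill_with_type_mean(frame, type_value=1):
    """Fill NaN in 'val' with the per-year mean of rows whose type == type_value."""
    # Mean per year, computed only over the chosen type (NaN is skipped by mean()).
    year_means = (frame.loc[frame['type'] == type_value]
                       .groupby('year')['val']
                       .mean())
    out = frame.copy()
    # map() looks up each row's year in year_means; fillna() only touches NaN rows.
    out['val'] = out['val'].fillna(out['year'].map(year_means))
    return out


filled = fill_with_type_mean(df)
print(filled)
```

This does a single groupby over the filtered rows plus one vectorised lookup, which avoids a Python-level function call per group.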

Detect when matplotlib tick labels overlap

冷暖自知 submitted on 2021-02-05 07:09:22
Question: I have a matplotlib bar chart generated by pandas, like this:

    index = ["Label 1", "Label 2", "Lorem ipsum dolor sit amet", "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Duis ac vehicula leo, vitae sodales orci."]
    df = pd.DataFrame([1, 2, 3, 4], columns=["Value"], index=index)
    df.plot(kind="bar", rot=0)

As you can see, with 0 rotation, the xtick labels overlap. How can I detect when two labels overlap, and rotate just those two labels to 90 degrees?

Answer 1: There is no easy way to
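The answer is truncated, so the following is not the original answer's method. It is a minimal sketch of one way to detect overlap: draw the figure so that text positions are computed, then compare the bounding boxes of neighbouring tick labels. The helper name `rotate_overlapping_xticklabels` is hypothetical, only adjacent pairs are compared, and the Agg backend is chosen here just so the example runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; an assumption for this sketch
import pandas as pd


def rotate_overlapping_xticklabels(ax, rotation=90):
    """Rotate x tick labels whose bounding box overlaps a neighbour's.

    Returns the sorted positions of the labels that were rotated.
    """
    # Text extents are only meaningful after the figure has been drawn once.
    ax.figure.canvas.draw()
    labels = ax.get_xticklabels()
    boxes = [label.get_window_extent() for label in labels]

    overlapping = set()
    for i in range(len(boxes) - 1):
        # Bbox.overlaps() is True when the two rectangles intersect.
        if boxes[i].overlaps(boxes[i + 1]):
            overlapping.update((i, i + 1))

    for i in overlapping:
        labels[i].set_rotation(rotation)
    return sorted(overlapping)


index = ["Label 1", "Label 2", "Lorem ipsum dolor sit amet",
         "Lorem ipsum dolor sit amet, consectetur adipiscing elit. "
         "Duis ac vehicula leo, vitae sodales orci."]
df = pd.DataFrame([1, 2, 3, 4], columns=["Value"], index=index)
ax = df.plot(kind="bar", rot=0)
rotated = rotate_overlapping_xticklabels(ax)
print(rotated)
```

Because the extents are in display pixels, the result depends on figure size, DPI, and font, so the check would need to be repeated if the figure is resized.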

pandas dataframe count unique list

眉间皱痕 submitted on 2021-02-05 07:00:46
Question: If the type of a column in a dataframe is int, float or string, we can get its unique values with columnName.unique(). But what if this column is a list, e.g. [1, 2, 3]? How could I get the unique values of this column?

Answer 1: I think you can convert the values to tuples, and then unique works nicely:

    df = pd.DataFrame({'col':[[1,1,2],[2,1,3,3],[1,1,2],[1,1,2]]})
    print (df)
                col
    0     [1, 1, 2]
    1  [2, 1, 3, 3]
    2     [1, 1, 2]
    3     [1, 1, 2]

    print (df['col'].apply(tuple).unique())
    [(1, 1, 2) (2, 1, 3, 3)]

    L = [list(x)
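Since the title asks about counting, the same tuple conversion can be extended to counts. A minimal sketch, assuming "count" means either the number of distinct lists or how often each one occurs; the variable names are illustrative:

```python
import pandas as pd

df = pd.DataFrame({'col': [[1, 1, 2], [2, 1, 3, 3], [1, 1, 2], [1, 1, 2]]})

# Lists are unhashable, so convert each one to a tuple first.
as_tuples = df['col'].apply(tuple)

# Number of distinct lists in the column.
n_unique = as_tuples.nunique()

# Occurrences of each distinct list, most frequent first.
counts = as_tuples.value_counts()

print(n_unique)
print(counts)
```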

Folium Choropleth Map renders grey shading instead of true colors of the thematic map

一笑奈何 submitted on 2021-02-05 06:59:05
Question: My choropleth map is not rendering correctly. I have a set of ride-hailing data for the city of Chicago, and I would like to create a choropleth map by census tract. I checked that the key_on feature is "geoid10" in the geojson file and ensured that the Pickup Census Tracts all match. I also ensured that the data types of the key in the geojson file and in the dataframe are the same (they are both objects). Yet my choropleth map still renders a black/grey tone
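The excerpt does not show the cause, so this is only a diagnostic sketch, not a fix. It assumes that unmatched keys are worth ruling out, and it compares the keys as strings, because two object-dtype columns can still hold differently formatted values. The helper `unmatched_keys`, the tiny inline GeoJSON, and the sample tract IDs are all hypothetical:

```python
import pandas as pd


def unmatched_keys(geojson, frame, column, prop):
    """Compare GeoJSON property values with a dataframe column, both as strings."""
    geo_keys = {str(f["properties"][prop]) for f in geojson["features"]}
    data_keys = set(frame[column].astype(str))
    return {"only_in_geojson": geo_keys - data_keys,
            "only_in_data": data_keys - geo_keys}


# Hypothetical stand-ins for the real geojson file and ride-hailing data.
geojson = {"type": "FeatureCollection", "features": [
    {"type": "Feature", "properties": {"geoid10": "17031010100"}, "geometry": None},
    {"type": "Feature", "properties": {"geoid10": "17031010201"}, "geometry": None},
]}
rides = pd.DataFrame({"Pickup Census Tract": [17031010100.0, 17031010201.0],
                      "trips": [12, 7]})

# Floats stringify with a trailing ".0", so nothing matches at first.
before = unmatched_keys(geojson, rides, "Pickup Census Tract", "geoid10")

# Normalising to integer strings makes the keys line up.
rides["Pickup Census Tract"] = (rides["Pickup Census Tract"]
                                .astype("int64").astype(str))
after = unmatched_keys(geojson, rides, "Pickup Census Tract", "geoid10")
print(before)
print(after)
```

If both sets come back empty on the real data, the keys are not the problem and the cause lies elsewhere.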

Pandas Rename a Single Row of MultiIndex by Tuple

孤街浪徒 submitted on 2021-02-05 06:57:25
Question: I'm trying to rename a single row of a pandas dataframe by its tuple. For example:

    import pandas as pd
    df = pd.DataFrame(data={'i1':[0,0,0,0,1,1,1,1],
                            'i2':[0,1,2,3,0,1,2,3],
                            'x':[1.,2.,3.,4.,5.,6.,7.,8.],
                            'y':[9,10,11,12,13,14,15,16]})
    df.set_index(['i1','i2'], inplace=True)

Creates df:

             x   y
    i1 i2
    0  0   1.0   9
       1   2.0  10
       2   3.0  11
       3   4.0  12
    1  0   5.0  13
       1   6.0  14
       2   7.0  15
       3   8.0  16

I'd like to be able to use something like:

    df.rename(index={(0,1):(0,9)},inplace=True)

to get:

             x   y
    i1 i2
    0  0   1.0   9
       9   2.0
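No answer is included in the excerpt, so the following is only one possible workaround: rebuild the MultiIndex from its tuples, swapping the one tuple to be renamed. The helper name `rename_index_tuple` is hypothetical:

```python
import pandas as pd

df = pd.DataFrame(data={'i1': [0, 0, 0, 0, 1, 1, 1, 1],
                        'i2': [0, 1, 2, 3, 0, 1, 2, 3],
                        'x': [1., 2., 3., 4., 5., 6., 7., 8.],
                        'y': [9, 10, 11, 12, 13, 14, 15, 16]})
df = df.set_index(['i1', 'i2'])


def rename_index_tuple(frame, old, new):
    """Return a copy of `frame` with the index tuple `old` replaced by `new`."""
    # Iterating a MultiIndex yields one tuple per row.
    tuples = [new if t == old else t for t in frame.index]
    out = frame.copy()
    out.index = pd.MultiIndex.from_tuples(tuples, names=frame.index.names)
    return out


renamed = rename_index_tuple(df, (0, 1), (0, 9))
print(renamed)
```

Row order is preserved, so the renamed index is no longer sorted; a `sort_index()` call afterwards would restore sorted order if that matters for later lookups.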