pandas

How to Read a Text File of Dictionaries into a DataFrame

拥有回忆 submitted on 2021-02-07 10:09:50
Question: I have a text file of Clash Royale stats from Kaggle. It is formatted as Python dictionaries. I am struggling to work out how to read it into a DataFrame in a meaningful way, and I am curious what the best approach is. It is a fairly complex dict containing lists. Original dataset: https://www.kaggle.com/s1m0n38/clash-royale-matches-dataset {'players': {'right': {'deck': [['Mega Minion', '9'], ['Electro Wizard', '3'], ['Arrows', '11'], ['Lightning', '5'], ['Tombstone', '9'], ['The Log', '2'], [
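One possible approach, sketched under assumptions: if each line of the file is a single Python dict literal, `ast.literal_eval` can parse it safely and `pd.json_normalize` can flatten the nested keys into columns. The two sample lines below are invented for illustration (only the `players` / `right` / `deck` nesting comes from the excerpt; the `left` key is a guess), and the real file may need different handling.

```python
import ast
import io

import pandas as pd

# Invented sample standing in for the Kaggle file: one dict literal per line.
raw = io.StringIO(
    "{'players': {'right': {'deck': [['Mega Minion', '9'], ['Arrows', '11']]}, "
    "'left': {'deck': [['Lightning', '5']]}}}\n"
    "{'players': {'right': {'deck': [['The Log', '2']]}, "
    "'left': {'deck': [['Tombstone', '9']]}}}\n"
)

# literal_eval only accepts Python literals, so it is safer than eval().
records = [ast.literal_eval(line) for line in raw if line.strip()]

# Nested dicts become dotted column names such as 'players.right.deck'.
df = pd.json_normalize(records)

# Each deck is a list of [card, level] pairs; explode gives one row per card.
right_deck = df["players.right.deck"].explode()
cards = pd.DataFrame(
    right_deck.tolist(), index=right_deck.index, columns=["card", "level"]
)
print(cards)
```

The `cards` frame keeps the original match number as its index, so it can be joined back to `df` if needed.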

Extracting sentences using pandas with specific words

别等时光非礼了梦想. submitted on 2021-02-07 10:09:27
Question: I have an Excel file with a text column. All I need to do is extract, for each row, the sentences from the text column that contain specific words. I have tried defining a function.
import pandas as pd
from nltk.tokenize import sent_tokenize
from nltk.tokenize import word_tokenize
# Reading in the Excel file
str_df = pd.read_excel("C:\\Users\\HP\\Desktop\\context.xlsx")
# Defining a function
def sentence_finder(text,word):
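The body of `sentence_finder` is cut off in the excerpt, so the following is only a sketch of what such a function might look like. It swaps NLTK's `sent_tokenize` for a simple regular-expression split so that it runs without downloading tokenizer data; the column name `text` and the sample rows are hypothetical.

```python
import re

import pandas as pd


def sentence_finder(text, word):
    """Return the sentences in `text` that contain `word` as a whole word."""
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text)
    pattern = re.compile(rf"\b{re.escape(word)}\b", re.IGNORECASE)
    return [sentence for sentence in sentences if pattern.search(sentence)]


# Hypothetical data standing in for the Excel sheet.
str_df = pd.DataFrame(
    {"text": ["The cat sat. A dog barked! The cat left.", "No match here."]}
)

# Apply the function row by row; each cell receives a list of sentences.
str_df["matches"] = str_df["text"].apply(lambda t: sentence_finder(t, "cat"))
print(str_df["matches"].tolist())
```

For text with abbreviations or unusual punctuation, a real sentence tokenizer such as the one the question imports would likely split more accurately.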

Could there be an easier way to use pandas read_clipboard to read a Series?

北战南征 submitted on 2021-02-07 10:01:10
Question: Sometimes I want to use read_clipboard to read a Series, and I have to do: pd.Series(pd.read_clipboard(header=None).values[:,0]) Is there an easier way? For DataFrames it is very easy: pd.read_clipboard() and that's it. For a Series, though, the one-liner is much longer. Is there an easier way that I don't know about?
Answer 1: Copy this to the clipboard:
1
2
3
Better would be to use squeeze=True as an argument. pd.read_clipboard(header=None,
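A note on the `squeeze=True` suggestion: to the best of my knowledge that keyword was deprecated in pandas 1.4 and removed in 2.0, with the `DataFrame.squeeze("columns")` method as the replacement. The sketch below uses `pd.read_csv` on an in-memory string in place of the clipboard (an assumption made so the example runs anywhere); with a real clipboard the equivalent call would be `pd.read_clipboard(header=None).squeeze("columns")`.

```python
import io

import pandas as pd

# Stand-in for the clipboard contents "1\n2\n3".
clipboard_text = "1\n2\n3\n"

# read_clipboard forwards its arguments to read_csv, so this mirrors it.
frame = pd.read_csv(io.StringIO(clipboard_text), header=None)

# A single-column DataFrame squeezes down to a Series.
series = frame.squeeze("columns")
print(series)
```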

Cannot replace special characters in a Python pandas dataframe

我与影子孤独终老i submitted on 2021-02-07 09:55:47
Question: I'm working with Python 3.5 on Windows. I have a DataFrame in which a 'titles' column of type str contains headline titles, some of which have special characters such as â , € , ˜ . I am trying to replace these with a space '' using pandas.replace . I have tried various iterations and nothing works. I am able to replace regular characters, but these special characters just don't seem to work. The code runs without error, but the replacement simply does not occur, and instead the original title
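One likely cause, offered as an assumption because the question's code is cut off: `Series.replace` and `DataFrame.replace` match whole cell values when given a plain string, so a character embedded inside a longer title is left alone. Substring replacement needs `Series.str.replace` (or `replace(..., regex=True)`). The sample titles below are invented.

```python
import pandas as pd

# Invented titles containing the characters mentioned in the question.
df = pd.DataFrame({"titles": ["Marketâ€˜s rally", "Plain title"]})

# Whole-value matching: no cell equals "â", so nothing changes.
unchanged = df["titles"].replace("â", " ")

# Substring matching: every listed character is replaced with a space.
cleaned = df["titles"].str.replace("[â€˜]", " ", regex=True)
print(cleaned.tolist())
```

If the characters are mojibake from a wrong decoding step, fixing the encoding when the file is read may be a cleaner remedy than replacing them afterwards.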

How to replace a dataframe column values with NaN based on a condition?

坚强是说给别人听的谎言 submitted on 2021-02-07 09:53:45
Question: For example, the DataFrame looks like:
DF=pd.DataFrame([[1,1],[2,120],[3,25],[4,np.nan],[5,45]],columns=["ID","Age"])
In the Age column, the values below 5 and greater than 100 have to be converted to NaN. Any help is appreciated!
Answer 1: Using where and between
df.Age=df.Age.where(df.Age.between(5,100))
df
   ID   Age
0  10   NaN
1  20   NaN
2  30  25.0
Source: https://stackoverflow.com/questions/53983563/how-to-replace-a-dataframe-column-values-with-nan-based-on-a-condition
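For reference, here is the same `where` / `between` technique applied to the frame given in the question, as a small self-contained sketch. `between` is inclusive on both ends by default, so 5 and 100 themselves are kept.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [[1, 1], [2, 120], [3, 25], [4, np.nan], [5, 45]],
    columns=["ID", "Age"],
)

# where() keeps values where the condition is True and writes NaN elsewhere.
df["Age"] = df["Age"].where(df["Age"].between(5, 100))
print(df)
```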

save multiple pd.DataFrames with hierarchy to hdf5

◇◆丶佛笑我妖孽 submitted on 2021-02-07 09:48:30
Question: I have multiple pd.DataFrames with a hierarchical organization. Let's say I have:
day_temperature_london_df = pd.DataFrame(...)
night_temperature_london_df = pd.DataFrame(...)
day_temperature_paris_df = pd.DataFrame(...)
night_temperature_paris_df = pd.DataFrame(...)
I want to group them in an HDF5 file so that two of them go to the group 'london' and the other two go to 'paris'. If I use h5py I lose the format of the pd.DataFrame , including its index and columns.
f = h5py.File("temperature.h5", "w")
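One way this could be done without h5py is `pd.HDFStore`, which accepts slash-separated keys that act as HDF5 groups and stores the index and columns with each frame. This is a sketch rather than the accepted answer: it requires the optional PyTables dependency (`tables`), and the frame contents, key names, and temporary file path are all made up for illustration.

```python
import os
import tempfile

import pandas as pd

# Made-up frames; keys of the form "<city>/<name>" create HDF5 groups.
frames = {
    "london/day_temperature": pd.DataFrame({"temp_c": [12.5, 14.0]}, index=[9, 15]),
    "london/night_temperature": pd.DataFrame({"temp_c": [6.0, 7.5]}, index=[21, 3]),
    "paris/day_temperature": pd.DataFrame({"temp_c": [15.0, 17.5]}, index=[9, 15]),
    "paris/night_temperature": pd.DataFrame({"temp_c": [8.0, 9.5]}, index=[21, 3]),
}

path = os.path.join(tempfile.mkdtemp(), "temperature.h5")

# Write every frame under its hierarchical key.
with pd.HDFStore(path, mode="w") as store:
    for key, frame in frames.items():
        store.put(key, frame)

# Read one frame back; index and column labels come with it.
with pd.HDFStore(path, mode="r") as store:
    keys = sorted(store.keys())
    restored = store["london/day_temperature"]

print(keys)
print(restored)
```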

How to group all labels (index) which shares at least one “1” in the same column?

一笑奈何 submitted on 2021-02-07 09:48:26
Question: Grouping rules:
- has at least one "1" in the same column
- shares any number of rows in common (see example)
For example:
   c0  c1  c2  c3
A   1   0   0   1
B   0   0   1   0
C   0   0   0   1
D   0   1   1   0
E   0   1   0   0
Expected output: [[A, C], [B, D, E]]
As you can see, B and E do not share a "1" in any column, but they both have "D" in common, so all three should be grouped.
Answer 1: Here is a solution with networkx.
import numpy as np
import networkx as nx
a = np.where(df.T, df.index, '').sum(axis=1)
g = [list(x) for x in a if len(x) > 1]
G = nx.Graph(g)
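The networkx answer is cut off, so as an illustration of the same idea (labels that share a column are connected, and groups are the connected components) here is a dependency-free sketch using a small union-find structure. It is an alternative technique, not the answerer's code; the helper names `find` and `union` are my own.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 0, 0, 1], [0, 0, 1, 0], [0, 0, 0, 1], [0, 1, 1, 0], [0, 1, 0, 0]],
    index=list("ABCDE"),
    columns=["c0", "c1", "c2", "c3"],
)

# Each label starts as the root of its own group.
parent = {label: label for label in df.index}


def find(label):
    """Return the root of the group containing `label`, compressing the path."""
    while parent[label] != label:
        parent[label] = parent[parent[label]]
        label = parent[label]
    return label


def union(a, b):
    """Merge the groups containing `a` and `b`."""
    root_a, root_b = find(a), find(b)
    if root_a != root_b:
        parent[root_b] = root_a


# Labels with a 1 in the same column belong to the same group.
for column in df.columns:
    members = df.index[df[column] == 1]
    for other in members[1:]:
        union(members[0], other)

# Collect labels by their root, preserving the original row order.
groups = {}
for label in df.index:
    groups.setdefault(find(label), []).append(label)
result = list(groups.values())
print(result)  # [['A', 'C'], ['B', 'D', 'E']]
```

Because merging is transitive, B and E end up together through D even though they never share a column directly.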
