Python pandas: replacing strings in a DataFrame with numbers

名媛妹妹 2020-12-08 04:22

Is there any way to use the mapping function, or something better, to replace values in an entire DataFrame?

I only know how to perform the mapping on a Series.

I

9 Answers
  • 2020-12-08 04:28

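    You can build the dictionary from the column's own values and then map the column through it:

    # Distinct values, ordered from most to least frequent
    item_list = df['Item_Type'].value_counts().index
    # Assign each distinct value its position in that ordering
    item_type_mapping = {item: i for i, item in enumerate(item_list)}

    # Series.map accepts the dictionary directly
    df['Item_Type'] = df['Item_Type'].map(item_type_mapping)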
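
    Since the question asks about a whole DataFrame, here is a minimal sketch that repeats the same frequency-based encoding for several columns. The sample data, the second column name Outlet_Size, and the mappings dictionary are assumptions made up for illustration.

```python
import pandas as pd

# Hypothetical sample data; only Item_Type comes from the answer above.
df = pd.DataFrame({
    "Item_Type": ["Dairy", "Snacks", "Dairy", "Meat", "Dairy", "Snacks"],
    "Outlet_Size": ["Small", "Medium", "Medium", "Medium", "High", "Small"],
})

mappings = {}  # keep each column's dictionary so the codes can be decoded later
for col in ["Item_Type", "Outlet_Size"]:
    # value_counts() orders values from most to least frequent
    categories = df[col].value_counts().index
    mappings[col] = {value: code for code, value in enumerate(categories)}
    df[col] = df[col].map(mappings[col])

print(df)
print(mappings)
```

    Keeping the per-column dictionaries around is a design choice: without them the integer codes cannot be translated back to the original strings.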
    
  • 2020-12-08 04:29

    I know this is old, but I'm adding this for anyone searching as I was. Assuming your pandas DataFrame is named df:

    ip_addresses = df.source_ip.unique()
    ip_dict = dict(zip(ip_addresses, range(len(ip_addresses))))
    

    That gives you a dictionary mapping each IP address to an integer without having to write it out by hand.
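
    The snippet above only builds the dictionary. A minimal sketch of also applying it might look like this; the sample addresses and the new column name source_ip_code are assumptions for illustration.

```python
import pandas as pd

# Hypothetical sample data with a source_ip column, as in the answer above.
df = pd.DataFrame({"source_ip": ["10.0.0.1", "10.0.0.2", "10.0.0.1", "10.0.0.3"]})

# unique() keeps values in order of first appearance
ip_addresses = df.source_ip.unique()
ip_dict = dict(zip(ip_addresses, range(len(ip_addresses))))

# Write the integer codes to a new column so the original strings are kept
df["source_ip_code"] = df["source_ip"].map(ip_dict)
print(df)
```

    Writing to a separate column keeps the original addresses available for later lookups.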

  • 2020-12-08 04:34

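    To convert strings like 'volvo' and 'bmw' into numbers, load the data into a DataFrame and pass it to pandas.get_dummies(), which returns one indicator column per distinct value:

      import pandas as pd

      df = pd.read_csv("myFile.csv")  # replaces DataFrame.from_csv, removed in pandas 1.0
      df_transform = pd.get_dummies(df)
      print(df_transform)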
    

    A better alternative is to pass a dictionary to map() on a single pandas Series (df.myCol), here the brand column:

    df.brand = df.brand.map( {'volvo':0 , 'bmw':1, 'audi':2} )
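
    To make the difference between the two approaches concrete, here is a small sketch on made-up data (the sample values are an assumption): get_dummies produces one indicator column per brand, while map produces a single column of integer codes.

```python
import pandas as pd

# Hypothetical sample data using the brand values from the answer above.
df = pd.DataFrame({"brand": ["volvo", "bmw", "audi", "bmw"]})

# One indicator column per distinct value: brand_audi, brand_bmw, brand_volvo
dummies = pd.get_dummies(df)

# One integer column; values missing from the dictionary would become NaN
codes = df["brand"].map({"volvo": 0, "bmw": 1, "audi": 2})

print(dummies)
print(codes)
```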
    
  • 2020-12-08 04:37

    You can also do this with pandas' rename_categories. You first need to define the column with dtype="category", e.g.:

    In [66]: s = pd.Series(["a","b","c","a"], dtype="category")
    
    In [67]: s
    Out[67]: 
    0    a
    1    b
    2    c
    3    a
    dtype: category
    Categories (3, object): [a, b, c]
    

    and then rename them:

    In [70]: s.cat.rename_categories([1,2,3])
    Out[70]: 
    0    1
    1    2
    2    3
    3    1
    dtype: category
    Categories (3, int64): [1, 2, 3]
    

    You can also pass a dict-like object that maps the current categories to their new names, e.g.:

    In [72]: s.cat.rename_categories({'a': 1, 'b': 2, 'c': 3})
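
    Note that the result of rename_categories is still a categorical Series. If plain integers are wanted, a sketch along these lines may help; the variable names renamed, as_int and codes are illustrative only.

```python
import pandas as pd

s = pd.Series(["a", "b", "c", "a"], dtype="category")

# Dict keys are the current categories, values are the new ones
renamed = s.cat.rename_categories({"a": 1, "b": 2, "c": 3})

# Convert the categorical result to an ordinary integer Series
as_int = renamed.astype("int64")

# Alternatively, every categorical already carries zero-based integer codes
codes = s.cat.codes

print(as_int.tolist())
print(codes.tolist())
```

    The cat.codes route avoids choosing the numbers yourself, at the cost of the codes depending on the category order.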
    
  • 2020-12-08 04:41

    You can use the DataFrame applymap function to do this (pandas 2.1 renamed it to DataFrame.map and deprecated applymap):

    In [26]: df = pd.DataFrame({"A": [1,2,3,4,5], "B": ['a','b','c','d','e'],
                                "C": ['b','a','c','c','d'], "D": ['a','c',7,9,2]})
    In [27]: df
    Out[27]:
       A  B  C  D
    0  1  a  b  a
    1  2  b  a  c
    2  3  c  c  7
    3  4  d  c  9
    4  5  e  d  2
    
    In [28]: mymap = {'a':1, 'b':2, 'c':3, 'd':4, 'e':5}
    
    In [29]: df.applymap(lambda s: mymap.get(s, s))
    Out[29]:
       A  B  C  D
    0  1  1  2  1
    1  2  2  1  3
    2  3  3  3  7
    3  4  4  3  9
    4  5  5  4  2
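
    Because applymap is deprecated from pandas 2.1 onward in favor of DataFrame.map, a script that has to run on both older and newer versions could pick whichever method is available. This is only a sketch; the names elementwise and result are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4, 5], "B": ["a", "b", "c", "d", "e"],
                   "C": ["b", "a", "c", "c", "d"], "D": ["a", "c", 7, 9, 2]})
mymap = {"a": 1, "b": 2, "c": 3, "d": 4, "e": 5}

# DataFrame.map exists from pandas 2.1; older versions only have applymap
elementwise = df.map if hasattr(df, "map") else df.applymap

# dict.get(s, s) returns the mapped number, or the value itself if unmapped
result = elementwise(lambda s: mymap.get(s, s))
print(result)
```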
    
  • 2020-12-08 04:44

    From @Ishnark's comment on the accepted answer: df.replace(to_replace=['set', 'test'], value=[1, 2]) replaces the values across the whole DataFrame.
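
    As a runnable sketch of that call on made-up data (the column names and the 'other' value are assumptions), the list form and the equivalent dictionary form can be compared like this:

```python
import pandas as pd

# Hypothetical sample data; object dtype lets a column hold both ints and strings
df = pd.DataFrame({"A": ["set", "test", "other"],
                   "B": ["test", "other", "set"]}, dtype=object)

# List form from the comment: 'set' -> 1, 'test' -> 2, everywhere in the frame
replaced_list = df.replace(to_replace=["set", "test"], value=[1, 2])

# Dictionary form expressing the same replacements
replaced_dict = df.replace({"set": 1, "test": 2})

print(replaced_dict)
```

    Values that are not listed, such as 'other' here, are left unchanged.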
