How to round/remove trailing “.0” zeros in a pandas column?

轮回少年 2020-12-06 10:15

I'm trying to see if I can remove the trailing zeros from this phone number column.

Example:

0
1      8.00735e+09
2      4.35789e+09
3      6.10644e         


        
11 Answers
  • 2020-12-06 10:19
    import numpy as np
    import pandas as pd
    
    s = pd.Series([ None, np.nan, '',8.00735e+09,  4.35789e+09, 6.10644e+09])
    
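    # Strip only a trailing ".0"; a literal replace of ".0" would also hit values such as "1.05"
    s_new = s.fillna('').astype(str).str.replace(r"\.0$", "", regex=True)
    s_new
    

    Here I fill null values with an empty string, convert the Series to string type, and strip a trailing ".0" from each value.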
    This outputs:

    0              
    1              
    2              
    3    8007350000
    4    4357890000
    5    6106440000
    dtype: object
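
    As a small illustration of why the pattern matters when stripping ".0" from strings, the sketch below compares a literal replacement with a regex anchored to the end of the string. The sample values are hypothetical (not from the question) and were chosen so that one of them contains ".0" in the middle.

```python
import pandas as pd

# Hypothetical sample values, already stored as strings.
s = pd.Series(["8007350000.0", "1.05", "20.0"])

# Literal replacement: removes every ".0" pair, wherever it occurs.
literal = s.str.replace(".0", "", regex=False)

# Anchored regex: removes ".0" only when it ends the string.
anchored = s.str.replace(r"\.0$", "", regex=True)

print(literal.tolist())
print(anchored.tolist())
```

    The literal version turns "1.05" into "15", while the anchored version leaves it untouched. For a column that only ever holds whole numbers the two behave the same.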
    
  • 2020-12-06 10:22

    Here is a solution using pandas nullable integers (the solution assumes that input Series values are either empty strings or floating point numbers):

    import pandas as pd, numpy as np
    s = pd.Series(['', 8.00735e+09, 4.35789e+09, 6.10644e+09])
    s.replace('', np.nan).astype('Int64')
    

    Output (pandas-0.25.1):

    0           NaN
    1    8007350000
    2    4357890000
    3    6106440000
    dtype: Int64
    

    Advantages of the solution:

    • The output values are either integers or missing values (not 'object' data type)
    • Efficient
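
    On newer pandas versions it may be more robust to convert to a numeric dtype first and only then to the nullable integer type. This is a sketch under that assumption, not part of the tested 0.25.1 code above; the variable names `numeric` and `phones` are made up for illustration.

```python
import pandas as pd

s = pd.Series(['', 8.00735e+09, 4.35789e+09, 6.10644e+09])

# errors="coerce" turns anything that cannot be parsed (here the empty string) into NaN.
numeric = pd.to_numeric(s, errors="coerce")

# The nullable "Int64" dtype (capital I) can hold integers and missing values together.
phones = numeric.astype("Int64")

print(phones)
```

    The cast to "Int64" only succeeds when every non-missing value is a whole number, which is the case for phone numbers stored as floats.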
  • 2020-12-06 10:26

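    Use astype(np.int64) on the numeric entries only:

    import numpy as np
    import pandas as pd

    s = pd.Series(['', 8.00735e+09, 4.35789e+09, 6.10644e+09])
    # Rows that parse as numbers (the empty string becomes NaN and is skipped)
    mask = pd.to_numeric(s).notnull()
    s.loc[mask] = s.loc[mask].astype(np.int64)
    s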
    
    0              
    1    8007350000
    2    4357890000
    3    6106440000
    dtype: object
    
  • 2020-12-06 10:28

    If anybody is still interested: my problem was that after rounding the df I still got trailing zeros. Here is what I did.

    new_df = np.round(old_df,3).astype(str)
    

    After that, the padded trailing zeros were gone in new_df (whole numbers still keep a single ".0", e.g. "2.0").
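
    A minimal sketch of what this does, using a made-up `old_df` with one hypothetical column `x`. Note that the string form of a whole-number float still ends in ".0", so this removes display padding rather than the ".0" itself.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one value with many decimals, one whole number.
old_df = pd.DataFrame({"x": [1.23456, 2.0]})

# Round to 3 decimals, then convert each value to its shortest string form.
new_df = np.round(old_df, 3).astype(str)

print(new_df["x"].tolist())
```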

  • 2020-12-06 10:29

    In pandas/NumPy, the standard integer dtypes cannot hold NaN values, and arrays/Series (including DataFrame columns) are homogeneous in their datatype, so a plain integer column where some entries are None/np.nan is impossible (the nullable 'Int64' extension dtype is the exception).

    EDIT: data.phone.astype('object') should do the trick. In this case, pandas treats your column as a series of generic Python objects rather than a specific datatype (e.g. str/float/int), at the cost of performance if you intend to run any heavy computations on this data (probably not in your case).

    Assuming you want to keep those NaN entries, your approach of converting to strings is a valid possibility:

    data.phone.astype(str).str.split('.', expand = True)[0]

    should give you what you're looking for (there are alternative string methods you can use, such as .replace or .extract, but .split seems the most straightforward in this case).

    Alternatively, if you are only interested in the display of floats (unlikely I'd suppose), you can do pd.set_option('display.float_format','{:.0f}'.format), which doesn't actually affect your data.
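
    To illustrate the display-only option, here is a small sketch that applies the format temporarily with pd.option_context instead of pd.set_option, so the setting is restored afterwards. The Series `s` and the name `shown` are made up for the example.

```python
import pandas as pd

s = pd.Series([8.00735e+09, 4.35789e+09])

# Apply the float format only inside this block; the global option is restored on exit.
with pd.option_context("display.float_format", "{:.0f}".format):
    shown = s.to_string()

print(shown)
print(s.dtype)  # the underlying data is still float64
```

    Only the rendered text changes; the values stay floats, so exporting the column (for example to CSV) would still write the ".0".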

  • 2020-12-06 10:30

    Just do

    data['phone'] = data['phone'].astype(str)
    data['phone'] = data['phone'].str.replace(r'\.0$', '', regex=True)
    

    which uses a regex to match a trailing '.0' in each entry of the column and replaces it with an empty string. For example

    import pandas as pd

    data = pd.DataFrame(
        data = [['bob','39384954.0'],['Lina','23827484.0']], 
        columns = ['user','phone'], index = [1,2]
    )
    
    data['phone'] = data['phone'].astype(str)
    # Escape the dot and anchor at the end so only a trailing ".0" is removed
    data['phone'] = data['phone'].str.replace(r'\.0$', '', regex=True)
    print(data)
    
       user     phone
    1   bob  39384954
    2  Lina  23827484
    