I'm trying to remove the trailing zeros from this phone number column.
Example:
0
1 8.00735e+09
2 4.35789e+09
3 6.10644e
This answer by cs95 removes the trailing “.0” in one line:
df = df.round(decimals=0).astype(object)
import numpy as np
tt = 8.00735e+09
time = int(np.format_float_positional(tt)[:-1])
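A side note on the snippet above: the [:-1] is there because format_float_positional keeps a trailing decimal point by default. Here is a minimal sketch, assuming the trim parameter is available in your NumPy version, that asks for the point to be dropped instead of slicing it off. The names without_dot and phone are only illustrative.

```python
import numpy as np

tt = 8.00735e+09

# Default formatting (trim='k') keeps the decimal point at the end.
with_dot = np.format_float_positional(tt)

# trim='-' drops trailing zeros and any trailing decimal point.
without_dot = np.format_float_positional(tt, trim='-')

# The cleaned string converts straight to an integer.
phone = int(without_dot)
print(with_dot, without_dot, phone)
```

This also avoids shadowing the standard library module name time.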
It depends on the format in which the telephone number is stored.
If it is in a numeric format, converting it to an integer might solve the problem:
import pandas as pd
df = pd.DataFrame({'TelephoneNumber': [123.0, 234]})
# int64 rather than int32: a 10-digit phone number overflows a 32-bit integer
df['TelephoneNumber'] = df['TelephoneNumber'].astype('int64')
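The width of the integer type matters for real phone numbers, since a 10-digit value does not fit into 32 bits. A quick sketch of how you might check that before casting, using the values from the question (fits_int32 and int32_max are names made up for this example):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'TelephoneNumber': [8.00735e+09, 4.35789e+09]})

# Largest value a signed 32-bit integer can hold (2147483647).
int32_max = np.iinfo(np.int32).max

# False here: both numbers are larger than the int32 maximum.
fits_int32 = bool((df['TelephoneNumber'] <= int32_max).all())

# A 64-bit integer has plenty of room for 10-digit numbers.
df['TelephoneNumber'] = df['TelephoneNumber'].astype('int64')
print(fits_int32)
print(df)
```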
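If it is actually a string, you can strip the suffix and re-assign the column. Escape the dot and anchor the pattern so that only a literal ".0" at the end is removed:
df2 = pd.DataFrame({'TelephoneNumber': ['123.0', '234']})
df2['TelephoneNumber'] = df2['TelephoneNumber'].str.replace(r'\.0$', '', regex=True)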
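One thing to watch with a pattern like '.0': when it is treated as a regular expression, the dot matches any character, so digits in the middle of a number can disappear too. A small sketch comparing a loose and an anchored pattern on made-up sample strings (loose and strict are illustrative names):

```python
import pandas as pd

s = pd.Series(['123.0', '234', '1005', '20.05'])

# As a regex, '.' matches any character, so '.0' also matches '10' in '1005'.
loose = s.str.replace('.0', '', regex=True)

# Escaping the dot and anchoring with '$' strips only a literal '.0' suffix.
strict = s.str.replace(r'\.0$', '', regex=True)

print(loose.tolist())
print(strict.tolist())
```

Only the anchored version leaves '1005' and '20.05' untouched.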
Try str.isnumeric with astype and loc:
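import numpy as np
import pandas as pd
s = pd.Series(['', 8.00735e+09, 4.35789e+09, 6.10644e+09])
# isnumeric() is False for '' and NaN for the floats; astype(bool) turns NaN into True
c = s.str.isnumeric().astype(bool)
s.loc[c] = s.loc[c].astype(np.int64)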
print(s)
Outputs:
0
1 8007350000
2 4357890000
3 6106440000
dtype: object
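If you would rather not rely on how the mask is built, another option is to coerce the whole column to numbers first. This is a sketch of an alternative, not what the answer above does: it assumes a pandas version with the nullable Int64 dtype, and it turns the empty string into a missing value instead of keeping it.

```python
import pandas as pd

s = pd.Series(['', 8.00735e+09, 4.35789e+09, 6.10644e+09])

# errors='coerce' turns anything that is not a number (here '') into NaN.
as_float = pd.to_numeric(s, errors='coerce')

# 'Int64' (capital I) is the nullable integer dtype; NaN becomes <NA>.
numbers = as_float.astype('Int64')
print(numbers)
```

The result has a real integer dtype instead of object, which can be convenient if you need to sort or compare the numbers later.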
Pandas assigns a dtype automatically by looking at the data. When a column holds mixed data, for example some rows are NaN and others hold integer values, it will most likely be assigned dtype object or float64.
EX 1:
import pandas as pd
data = [['tom', 10934000000], ['nick', 1534000000], ['juli', 1412000000]]
df = pd.DataFrame(data, columns = ['Name', 'Phone'])
>>> df
Name Phone
0 tom 10934000000
1 nick 1534000000
2 juli 1412000000
>>> df.dtypes
Name object
Phone int64
dtype: object
In the example above, pandas infers int64 because no row is NaN and every row in the Phone column holds an integer value.
EX 2:
>>> data = [['tom'], ['nick', 1534000000], ['juli', 1412000000]]
>>> df = pd.DataFrame(data, columns = ['Name', 'Phone'])
>>> df
Name Phone
0 tom NaN
1 nick 1.534000e+09
2 juli 1.412000e+09
>>> df.dtypes
Name object
Phone float64
dtype: object
To answer your actual question: to get rid of the .0 at the end, you can do something like this.
Solution 1:
>>> data = [['tom', 9785000000.0], ['nick', 1534000000.0], ['juli', 1412000000]]
>>> df = pd.DataFrame(data, columns = ['Name', 'Phone'])
>>> df
Name Phone
0 tom 9.785000e+09
1 nick 1.534000e+09
2 juli 1.412000e+09
>>> df['Phone'] = df['Phone'].astype('int64').astype(str)
>>> df
Name Phone
0 tom 9785000000
1 nick 1534000000
2 juli 1412000000
Solution 2:
>>> df['Phone'] = df['Phone'].astype(str).str.replace('.0', '', regex=False)
>>> df
Name Phone
0 tom 9785000000
1 nick 1534000000
2 juli 1412000000
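Both solutions assume that every row has a number. If the column contains NaN, as in EX 2, the integer cast raises an error. A minimal sketch of one way around that, using a hypothetical helper phone_to_str (not part of pandas) that leaves missing rows as empty strings:

```python
import numpy as np
import pandas as pd


def phone_to_str(value):
    """Hypothetical helper: format a float phone number without '.0'."""
    if pd.isna(value):
        return ''  # assumption: show missing numbers as an empty string
    return str(int(value))


df = pd.DataFrame({
    'Name': ['tom', 'nick', 'juli'],
    'Phone': [np.nan, 1534000000.0, 1412000000.0],
})

# map() calls the helper once per row, including the NaN row.
df['Phone'] = df['Phone'].map(phone_to_str)
print(df)
```

Whether an empty string is the right placeholder depends on what you do with the column afterwards.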