Converting exponential notation numbers to strings - explanation

风流意气都作罢 submitted on 2019-12-11 06:35:17

Question


I have a DataFrame from this question:

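import pandas as pd
from io import StringIO

temp=u"""Total,Price,test_num
0,71.7,2.04256e+14
1,39.5,2.04254e+14
2,82.2,2.04188e+14
3,42.9,2.04171e+14"""
# pd.compat.StringIO was removed from newer pandas versions; io.StringIO does the same job
df = pd.read_csv(StringIO(temp))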

print (df)
   Total  Price      test_num
0      0   71.7  2.042560e+14
1      1   39.5  2.042540e+14
2      2   82.2  2.041880e+14
3      3   42.9  2.041710e+14

If I convert the floats to strings, I get a trailing .0:

print (df['test_num'].astype('str'))
0    204256000000000.0
1    204254000000000.0
2    204188000000000.0
3    204171000000000.0
Name: test_num, dtype: object

The solution is to convert the floats to int64 first:

print (df['test_num'].astype('int64'))
0    204256000000000
1    204254000000000
2    204188000000000
3    204171000000000
Name: test_num, dtype: int64

print (df['test_num'].astype('int64').astype(str))
0    204256000000000
1    204254000000000
2    204188000000000
3    204171000000000
Name: test_num, dtype: object

The question is: why does the conversion behave this way?

I added this poor explanation, but I feel it could be better:

Poor explanation:

You can check the dtype of the column; it returns float64.

print (df['test_num'].dtype)
float64

Converting to string removes the exponential notation but still renders the values as floats, so a trailing .0 is added:

print (df['test_num'].astype('str'))
0    204256000000000.0
1    204254000000000.0
2    204188000000000.0
3    204171000000000.0
Name: test_num, dtype: object

Answer 1:


When you use pd.read_csv to import data and do not define datatypes, pandas makes an educated guess and in this case decides that column values like "2.04256e+14" are best represented by a float value.

Converting that float back to a string adds a ".0". As you correctly write, converting to int64 first fixes this.
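The same effect can be reproduced without pandas, which suggests the ".0" comes from how a whole-number float is rendered as text, not from the CSV import itself. A minimal sketch in plain Python (the variable names are illustrative and not part of the original question):

```python
# A value written in exponential notation is parsed as a float.
value = float("2.04256e+14")

# str() of a whole-number float of this size keeps a ".0" suffix.
as_float_text = str(value)

# Converting to int first means the text is rendered from an integer.
as_int_text = str(int(value))

print(as_float_text)  # 204256000000000.0
print(as_int_text)    # 204256000000000
```

For floats of 1e16 and above, str() switches back to exponential notation (str(1e16) gives '1e+16'), so the integer conversion matters there as well.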

If you know before reading the file that the column holds only int64 values (and no empty values, which np.int64 cannot handle), you can force this type on import to avoid the unneeded conversions.

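import numpy as np
import pandas as pd
from io import StringIO

temp=u"""Total,Price,test_num
0,71.7,2.04256e+14
1,39.5,2.04254e+14
2,82.2,2.04188e+14
3,42.9,2.04171e+14"""

# pd.compat.StringIO was removed from newer pandas versions; io.StringIO does the same job
df = pd.read_csv(StringIO(temp), dtype={2: np.int64})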

print(df)

returns

   Total  Price         test_num
0      0   71.7  204256000000000
1      1   39.5  204254000000000
2      2   82.2  204188000000000
3      3   42.9  204171000000000
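If the column may contain empty values, forcing np.int64 on import is not an option. One possible workaround, sketched here on the assumption that the installed pandas provides the nullable "Int64" extension dtype, is to let pandas read the column as float and convert afterwards. The sample data with a blank field is made up for illustration:

```python
from io import StringIO

import pandas as pd

# Hypothetical sample: the second row has an empty test_num field.
temp = """Total,Price,test_num
0,71.7,2.04256e+14
1,39.5,
2,82.2,2.04188e+14"""

# Without an explicit dtype the column is read as float64, with NaN for the gap.
df = pd.read_csv(StringIO(temp))

# "Int64" (capital I) is the nullable integer dtype; the NaN becomes a missing value.
nullable = df["test_num"].astype("Int64")

# Render only the present values as text; missing entries are kept as None.
as_text = [str(v) if pd.notna(v) else None for v in nullable]
print(as_text)
```

This keeps the integer formatting for the rows that have a value and leaves the decision about how to display missing entries to the caller.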


Source: https://stackoverflow.com/questions/51325032/converting-exponential-notation-numbers-to-strings-explanation
