Question:
I tried to convert a column from data type float64 to int64 using:
df['column name'].astype(int64)
but got an error:
NameError: name 'int64' is not defined
The column holds a number of people but is formatted as 7500000.0. Any idea how I can simply change this float64 into int64?
Answer 1:
I think you need to cast to numpy.int64:
df['column name'].astype(np.int64)
Sample:
import numpy as np
import pandas as pd

df = pd.DataFrame({'column name': [7500000.0, 7500000.0]})
print(df['column name'])
0    7500000.0
1    7500000.0
Name: column name, dtype: float64

df['column name'] = df['column name'].astype(np.int64)
# the old alias pd.np.int64 did the same, but pd.np was removed in pandas 2.0
print(df['column name'])
0    7500000
1    7500000
Name: column name, dtype: int64
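One thing worth knowing before casting: astype to an integer type drops the fractional part instead of rounding. The values in the question are whole numbers, so it makes no difference there. The sketch below uses made-up values (1.9 and -1.9, not from the question) just to show the difference.

```python
import numpy as np
import pandas as pd

# Hypothetical values with fractional parts, not taken from the question.
s = pd.Series([1.9, -1.9])

# Casting alone drops the fractional part (truncation toward zero).
truncated = s.astype(np.int64)

# Rounding first gives the nearest integer instead.
rounded = s.round().astype(np.int64)

print(truncated.tolist())
print(rounded.tolist())
```

If the column might hold fractional values, round it (or check it) before the cast.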
If the column contains NaNs, replace them with some integer (e.g. 0) using fillna first, because NaN is a float:
df = pd.DataFrame({'column name': [7500000.0, np.nan]})
df['column name'] = df['column name'].fillna(0).astype(np.int64)
print(df['column name'])
0    7500000
1          0
Name: column name, dtype: int64
Also check the documentation on missing data casting rules.
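If replacing the NaNs with 0 would distort the data, one possible alternative is the pandas nullable integer dtype, spelled 'Int64' with a capital I. This is not part of the original answer; it is a sketch that assumes a reasonably recent pandas (the nullable dtype arrived in 0.24, and missing values display as <NA> from 1.0 on).

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"column name": [7500000.0, np.nan]})

# "Int64" (capital I) is the pandas nullable integer extension dtype.
# The missing value stays missing instead of being replaced by 0.
nullable = df["column name"].astype("Int64")

print(nullable.dtype)
print(nullable.isna().tolist())
```

Note that this is a pandas extension dtype, not NumPy's int64, so code that expects a plain NumPy integer array may need the fillna approach instead.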
EDIT:
Converting values that contain NaNs directly on the underlying NumPy array gives a wrong result:
df = pd.DataFrame({'column name': [7500000.0, np.nan]})
df['column name'] = df['column name'].values.astype(np.int64)
print(df['column name'])
0                7500000
1   -9223372036854775808
Name: column name, dtype: int64
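To avoid ending up with a garbage value like the one above, the cast can be guarded with an explicit check. The helper name to_int64_checked below is made up for illustration and is not a pandas function; it is only a minimal sketch of the idea.

```python
import numpy as np
import pandas as pd


def to_int64_checked(series: pd.Series) -> pd.Series:
    """Cast to int64, refusing NaN/inf instead of producing garbage values.

    Hypothetical helper for illustration; not part of pandas.
    """
    if not np.isfinite(series.to_numpy(dtype="float64")).all():
        raise ValueError("series contains NaN or inf; fill or drop them first")
    return series.astype(np.int64)


clean = pd.Series([7500000.0, 7500000.0])
dirty = pd.Series([7500000.0, np.nan])

converted = to_int64_checked(clean)

try:
    to_int64_checked(dirty)
    rejected = False
except ValueError:
    rejected = True

print(converted.tolist(), rejected)
```

As far as I know, calling astype on the Series itself (rather than on .values) already raises an error for NaN in current pandas versions, so the explicit check matters mostly when working with the raw NumPy array.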
Answer 2:
You need to pass in the string 'int64':
>>> import pandas as pd
>>> df = pd.DataFrame({'a': [1.0, 2.0]})  # some test dataframe
>>> df['a'].astype('int64')
0    1
1    2
Name: a, dtype: int64
There are some alternative ways to specify 64-bit integers:
>>> df['a'].astype('i8')  # integer with 8 bytes (64 bit)
0    1
1    2
Name: a, dtype: int64
>>> import numpy as np
>>> df['a'].astype(np.int64)  # native numpy 64 bit integer
0    1
1    2
Name: a, dtype: int64
Or use np.int64 directly on your column (but this returns a NumPy array rather than a Series):
>>> np.int64(df['a'])
array([1, 2], dtype=int64)
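The dtype spellings used in this answer ('int64', 'i8' and np.int64) should all resolve to the same NumPy dtype. A quick way to convince yourself, reusing the same small test dataframe:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1.0, 2.0]})  # same test dataframe as above

# Three spellings of the same 64-bit integer dtype.
specs = ["int64", "i8", np.int64]
results = [df["a"].astype(spec) for spec in specs]

# Every result should carry NumPy's int64 dtype and the same values.
same_dtype = all(r.dtype == np.dtype("int64") for r in results)
same_values = all(r.tolist() == [1, 2] for r in results)

print(same_dtype, same_values)
```

Whichever spelling you pick, remember that astype returns a new Series, so assign the result back to the column to keep it.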