Question
I used this code to convert my float numbers into integers; however, it does not work. Here are all the steps I have gone through so far:
Step 1: I converted timestamp1 and timestamp2 to datetime in order to subtract them and get the number of days:
a=pd.to_datetime(df['timestamp1'], format='%Y-%m-%dT%H:%M:%SZ')
b=pd.to_datetime(df['timestamp2'], format='%Y-%m-%dT%H:%M:%SZ')
df['delta'] = (b-a).dt.days
Step 2: I converted the strings into integers representing the day:
df['delta'] = pd.to_datetime(df['delta'], format='%Y-%m-%d', errors='coerce')
df['delta'] = df['delta'].dt.day
Step 3: I am trying to convert floats into integers.
categorical_feature_mask = df.dtypes==object
categorical_cols = df.columns[categorical_feature_mask].tolist()
from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()
df[categorical_cols] = df[categorical_cols].apply(lambda col: le.fit_transform(col))
df[categorical_cols].head(10)
However, it throws an error: TypeError: ('argument must be a string or number', 'occurred at index col1')
Answer 1:
To convert a float column that contains NaN values into an integer column, there are two things you can do:
Convert to a plain NumPy integer type and replace the NaN values with an arbitrary value, like this:
df[col].fillna(0).astype("int32")
If you want to preserve the NaN values, use this:
df[col].astype("Int32")
Note the difference: the capital "I". For further information on this pandas feature, see the "Nullable integer data type" page in the pandas documentation.
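As a minimal sketch of the two options above, assuming a hypothetical column named delta that holds whole-number floats plus one NaN (the data is made up for illustration, not taken from the question):

```python
import numpy as np
import pandas as pd

# Hypothetical data: a float column with one missing value.
df = pd.DataFrame({"delta": [1.0, 2.0, np.nan, 4.0]})

# Option 1: replace NaN with a placeholder (0 here), then cast to a plain NumPy integer.
filled = df["delta"].fillna(0).astype("int32")

# Option 2: keep the missing value by casting to the nullable integer dtype (capital "I").
nullable = df["delta"].astype("Int32")

print(filled.dtype)           # int32
print(nullable.dtype)         # Int32
print(nullable.isna().sum())  # 1, the missing value is still there
```

The first option loses the information that a value was missing, so the placeholder should be one that cannot be confused with real data. The second option only works when the non-missing floats are whole numbers.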
Why do you need to do that? Because by default, pandas treats a column that has at least one NaN value as a float column, since this is how NumPy behaves.
The same thing happens with strings: if you have at least one string value in your column, pandas labels the whole column as object, which is why your first attempt failed.
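The dtype behaviour described here can be checked with a few small Series. This is only an illustrative sketch; the variable names and values are assumptions, not part of the original question:

```python
import numpy as np
import pandas as pd

ints = pd.Series([1, 2, 3])          # only integers
with_nan = pd.Series([1, 2, np.nan])  # one NaN forces a float dtype
mixed = pd.Series([1, "a", 3.5])      # one string forces the object dtype

print(ints.dtype)      # an integer dtype (typically int64)
print(with_nan.dtype)  # float64
print(mixed.dtype)     # object
```

Checking df.dtypes in this way before encoding or casting shows which columns hold NaN or mixed values and need cleaning first.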
Source: https://stackoverflow.com/questions/57264509/float-is-not-converting-to-integer-pandas