How to check if float pandas column contains only integer numbers?

挽巷 2020-12-15 18:07

I have a dataframe

df = pd.DataFrame(data=np.arange(10), columns=['v']).astype(float)

How can I make sure that the numbers in v are all whole numbers?

4 Answers
  • 2020-12-15 18:34

    Here's a simpler, and probably faster, approach:

    (df[col] % 1 == 0).all()
    

    To ignore nulls:

    (df[col].fillna(-9999) % 1 == 0).all()
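
    The modulo check above can be wrapped in a small helper. This is only a sketch: the function name is_whole_number_column and its ignore_nulls parameter are hypothetical, and it drops nulls with dropna() rather than filling them with a sentinel such as -9999.

```python
import numpy as np
import pandas as pd


def is_whole_number_column(series: pd.Series, ignore_nulls: bool = True) -> bool:
    """Return True if every (non-null) value in a float Series is a whole number."""
    values = series.dropna() if ignore_nulls else series
    # NaN % 1 is NaN, and NaN == 0 is False, so nulls that are kept fail the check.
    return bool((values % 1 == 0).all())


df = pd.DataFrame({"v": np.arange(10, dtype=float)})
print(is_whole_number_column(df["v"]))  # True
```

    Dropping nulls avoids having to pick a placeholder value, at the cost of building a filtered copy of the column.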
    
  • 2020-12-15 18:42

    For completeness, pandas v1.0+ offers the convert_dtypes() method, which (among three other conversions) casts every DataFrame column (or Series) that contains only whole numbers to a nullable integer dtype.

    If you wanted to limit the conversion to a single column only, you could do the following:

    >>> df.dtypes          # inspect previous dtypes
    v                      float64
    
    >>> df["v"] = df["v"].convert_dtypes()
    >>> df.dtypes          # inspect converted dtypes
    v                      Int64
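
    If the goal is only to test columns rather than convert them, one possible sketch is to run convert_dtypes() on a copy and inspect the resulting dtypes. The column names whole and mixed and the dictionary is_integer_only are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "whole": [1.0, 2.0, np.nan],   # whole numbers plus a missing value
    "mixed": [1.0, 2.5, 3.0],      # contains a fractional value
})

# convert_dtypes() returns a new DataFrame; df itself is left unchanged.
converted = df.convert_dtypes()

# A column counts as integer-only if convert_dtypes() gave it an integer dtype.
is_integer_only = {
    col: pd.api.types.is_integer_dtype(converted[col]) for col in converted.columns
}
print(is_integer_only)  # {'whole': True, 'mixed': False}
```

    Note that missing values do not block the conversion here, because the nullable Int64 dtype can hold them.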
    
  • 2020-12-15 18:45

    Comparison with astype(int)

    Tentatively convert your column to int and test with np.array_equal:

    np.array_equal(df.v, df.v.astype(int))
    True
    

    float.is_integer

    You can use this Python method together with apply:

    df.v.apply(float.is_integer).all()
    True
    

    Or, using Python's built-in all with a generator expression, for space efficiency:

    all(x.is_integer() for x in df.v)
    True
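
    The approaches above behave differently when the column contains missing values. The following sketch, using a made-up three-element Series, shows what to expect: float.is_integer reports NaN as not an integer, while astype(int) raises.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, 2.0, np.nan])

# float.is_integer treats NaN as "not an integer", so the strict check fails.
strict = bool(s.apply(float.is_integer).all())

# Dropping nulls first restricts the check to the values that are present.
lenient = bool(s.dropna().apply(float.is_integer).all())

# astype(int) cannot represent NaN and raises instead of returning a result.
try:
    s.astype(int)
    cast_failed = False
except ValueError:
    cast_failed = True

print(strict, lenient, cast_failed)  # False True True
```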
    
  • 2020-12-15 18:57

    If you want to check multiple float columns in your dataframe, you can do the following:

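    # applymap is deprecated since pandas 2.1; DataFrame.map is its replacement there
    col_should_be_int = df.select_dtypes(include=['float']).applymap(float.is_integer).all()
    float_to_int_cols = col_should_be_int[col_should_be_int].index
    # assign whole columns; df.loc[:, cols] = ... may keep the float dtype in pandas 2.x
    df[float_to_int_cols] = df[float_to_int_cols].astype(int)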
    

    Keep in mind that a float column containing only whole numbers will not be selected if it also has np.nan values. To cast float columns with missing values to integer, you first need to fill or remove the missing values, for example with median imputation:

    float_cols = df.select_dtypes(include=['float'])
    float_cols = float_cols.fillna(float_cols.median().round()) # median imputation
    col_should_be_int = float_cols.applymap(float.is_integer).all()
    float_to_int_cols = col_should_be_int[col_should_be_int].index
    # assign whole columns so the int dtype is kept (df.loc[:, cols] = ... may stay float in pandas 2.x)
    df[float_to_int_cols] = float_cols[float_to_int_cols].astype(int)
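
    As an alternative to imputation (a different technique from the one above), missing values can be kept by casting to the nullable "Int64" dtype. The sketch below assumes that nulls should count as passing the whole-number test. The column names a, b, c and the variables is_whole and whole_cols are hypothetical.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "a": [1.0, 2.0, np.nan],   # whole numbers with a missing value
    "b": [0.5, 1.0, 2.0],      # has a fractional value, must stay float
    "c": ["x", "y", "z"],      # non-float column, left untouched
})

float_cols = df.select_dtypes(include=["float"])

# Vectorised whole-number test; nulls are treated as passing (assumption).
is_whole = ((float_cols % 1 == 0) | float_cols.isna()).all()
whole_cols = list(is_whole[is_whole].index)

# "Int64" (capital I) is the nullable integer dtype, so NaN becomes <NA>.
df = df.astype({col: "Int64" for col in whole_cols})

print(df.dtypes)
```

    This keeps the original missing values instead of replacing them with the median, which may or may not be what a given analysis needs.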
    