Specify correct dtypes to pandas.read_csv for datetimes and booleans

甜味超标 2020-12-16 12:14

I am loading a csv file into a Pandas DataFrame. For each column, how do I specify what type of data it contains using the dtype argument?

  • I can d
1 answer
  • 2020-12-16 13:03

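    read_csv has options that handle all the cases you mentioned. The dtype argument is not the way to get datetime columns, though: dtype={'A': datetime.datetime} will not give you a parsed datetime column, and dates are handled by parse_dates instead. For many other columns you won't need dtype at all, as pandas can infer the types.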
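
    For the column types that dtype does handle, a minimal sketch might look like the following. The sample data and the column names id and count are made up for illustration; they are not from the question.

```python
import io

import pandas as pd

# Hypothetical sample data; "id" and "count" are illustrative column names.
csv_text = "id,count\n007,3\n042,5\n"

# dtype maps column names to types. Reading "id" as str keeps the
# leading zeros that integer inference would otherwise drop.
df = pd.read_csv(io.StringIO(csv_text), dtype={"id": str, "count": "int64"})

print(df.dtypes)
print(df)
```

    Without the dtype entry for id, pandas would infer an integer column and 007 would become 7.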

    For dates, you need to specify the parse_dates options:

    parse_dates : boolean, list of ints or names, list of lists, or dict
    keep_date_col : boolean, default False
    date_parser : function
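
    As a rough sketch of the simplest case, parse_dates can be given a list of column names. The data and the column name when are assumptions made up for this example. Note that newer pandas releases deprecate date_parser in favor of date_format, so it is left out here.

```python
import io

import pandas as pd

# Hypothetical sample data; "when" and "value" are illustrative column names.
csv_text = "when,value\n2020-12-16 12:14:00,1\n2020-12-17 08:30:00,2\n"

# parse_dates lists the columns that should be converted to datetimes.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["when"])

print(df.dtypes)
```

    The when column comes back with a datetime64 dtype, while value is still inferred as an integer.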
    

    In general for converting boolean values you will need to specify:

    true_values  : list  Values to consider as True
    false_values : list  Values to consider as False
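
    A small sketch of how these two options might be used, assuming a made-up column active that stores yes/no strings:

```python
import io

import pandas as pd

# Hypothetical sample data; "name" and "active" are illustrative column names.
csv_text = "name,active\nalice,yes\nbob,no\n"

# Every value in "active" appears in one of the two lists,
# so the column is converted to booleans.
df = pd.read_csv(
    io.StringIO(csv_text),
    true_values=["yes"],
    false_values=["no"],
)

print(df.dtypes)
print(df)
```

    The conversion only applies when every value in a column is covered by the two lists; a column with other values is left as it was read.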
    

    These transform any value in the corresponding list to the boolean True or False. For more general conversions you will most likely need:

    converters : dict, optional  Dict of functions for converting values in certain columns. Keys can either be integers or column labels
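
    To illustrate converters, here is a minimal sketch that turns percentage strings into floats. The helper parse_percent and the column name share are hypothetical, introduced only for this example.

```python
import io

import pandas as pd


def parse_percent(text):
    """Hypothetical helper: turn a string like '25%' into the float 0.25."""
    return float(text.rstrip("%")) / 100.0


# Hypothetical sample data; "item" and "share" are illustrative column names.
csv_text = "item,share\na,25%\nb,50%\n"

# Each raw string in "share" is passed through parse_percent while reading.
df = pd.read_csv(io.StringIO(csv_text), converters={"share": parse_percent})

print(df)
```

    The converter receives the raw string from each cell, so it can handle formats that neither dtype nor the boolean lists cover.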

    The documentation is dense, but it has the full list of options: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.io.parsers.read_csv.html
