Python - Iterate over a list of attributes

借酒劲吻你 2020-12-11 11:01

I have a feature in my data set that is a pandas timestamp object. It has (among many others) the following attributes: year, hour, dayofweek, month.

I can create ne

3 answers
  •  难免孤独
    2020-12-11 11:29

    Don't use .apply here. pandas has built-in utilities for handling datetime data: use the .dt accessor on the Series.

    In [10]: from datetime import datetime
        ...: import pandas as pd
        ...:

    In [11]: start = datetime(2011, 1, 1)
        ...: end = datetime(2012, 1, 1)
        ...:
    
    In [12]: df = pd.DataFrame({'data':pd.date_range(start, end)})
    
    In [13]: df.dtypes
    Out[13]:
    data    datetime64[ns]
    dtype: object
    
    In [14]: df['year'] = df.data.dt.year
    
    In [15]: df['hour'] = df.data.dt.hour
    
    In [16]: df['month'] = df.data.dt.month
    
    In [17]: df['dayofweek'] = df.data.dt.dayofweek
    
    In [18]: df.head()
    Out[18]:
            data  year  hour  month  dayofweek
    0 2011-01-01  2011     0      1          5
    1 2011-01-02  2011     0      1          6
    2 2011-01-03  2011     0      1          0
    3 2011-01-04  2011     0      1          1
    4 2011-01-05  2011     0      1          2
    

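To make the point about .apply concrete, here is a minimal sketch that computes the year both ways and checks that the results agree. The variable names (year_dt, year_apply, same_values) are mine, not from the question; the only claim is that the .dt accessor works on the whole column at once while .apply calls a Python function per row.

```python
from datetime import datetime

import pandas as pd

start = datetime(2011, 1, 1)
end = datetime(2012, 1, 1)
df = pd.DataFrame({"data": pd.date_range(start, end)})

# Vectorized: the .dt accessor works on the whole Series at once.
year_dt = df["data"].dt.year

# Row by row: .apply calls the lambda once per Timestamp.
year_apply = df["data"].apply(lambda ts: ts.year)

# Both approaches should give the same values.
same_values = (year_dt == year_apply).all()
print(same_values)
```
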
    Or do it dynamically, as you wanted, using getattr:

    In [24]: df = pd.DataFrame({'data':pd.date_range(start, end)})
    
    In [25]: nomtimes = ["year", "hour", "month", "dayofweek"]
        ...:
    
    In [26]: df.head()
    Out[26]:
            data
    0 2011-01-01
    1 2011-01-02
    2 2011-01-03
    3 2011-01-04
    4 2011-01-05
    
    In [27]: for t in nomtimes:
        ...:     df[t] = getattr(df.data.dt, t)
        ...:
    
    In [28]: df.head()
    Out[28]:
            data  year  hour  month  dayofweek
    0 2011-01-01  2011     0      1          5
    1 2011-01-02  2011     0      1          6
    2 2011-01-03  2011     0      1          0
    3 2011-01-04  2011     0      1          1
    4 2011-01-05  2011     0      1          2
    

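If you need this in more than one place, the getattr loop can be wrapped in a small function. This is only a sketch: add_datetime_parts is a hypothetical helper name, and it assumes every name in parts is a property of the .dt accessor (such as "year"), not a method (such as "strftime").

```python
import pandas as pd


def add_datetime_parts(df, column, parts):
    """Return a copy of df with one new column per name in parts.

    Hypothetical helper: each name must be a property of the
    Series .dt accessor (for example "year" or "dayofweek").
    """
    out = df.copy()
    accessor = out[column].dt
    for name in parts:
        # Fail early with a clear message on a misspelled name.
        if not hasattr(accessor, name):
            raise AttributeError(f"{name!r} is not a .dt attribute")
        out[name] = getattr(accessor, name)
    return out


df = pd.DataFrame({"data": pd.date_range("2011-01-01", periods=5)})
result = add_datetime_parts(df, "data", ["year", "hour", "month", "dayofweek"])
print(result)
```

Working on a copy leaves the original frame untouched, which is a design choice for the sketch and not something the question requires.
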
    And if you must use a one-liner, go with:

    In [30]: df = pd.DataFrame({'data':pd.date_range(start, end)})
    
    In [31]: df.head()
    Out[31]:
            data
    0 2011-01-01
    1 2011-01-02
    2 2011-01-03
    3 2011-01-04
    4 2011-01-05
    
    In [32]: df = df.assign(**{t:getattr(df.data.dt,t) for t in nomtimes})
    
    In [33]: df.head()
    Out[33]:
            data  dayofweek  hour  month  year
    0 2011-01-01          5     0      1  2011
    1 2011-01-02          6     0      1  2011
    2 2011-01-03          0     0      1  2011
    3 2011-01-04          1     0      1  2011
    4 2011-01-05          2     0      1  2011
    

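Note that the output above lists the new columns alphabetically. As far as I know, that is how older pandas versions ordered assign() keywords, while pandas 0.23 and later on Python 3.6+ keep the order in which the keywords are passed. If the order matters to you, one option (a sketch, not the only way) is to select the columns explicitly afterwards:

```python
import pandas as pd

nomtimes = ["year", "hour", "month", "dayofweek"]
df = pd.DataFrame({"data": pd.date_range("2011-01-01", periods=5)})

# Build all new columns in one expression.
df = df.assign(**{t: getattr(df["data"].dt, t) for t in nomtimes})

# Select the columns explicitly so the order does not depend on
# how a particular pandas version orders assign() keywords.
df = df[["data"] + nomtimes]
print(df.columns.tolist())
```
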