How to access a field of a namedtuple using a variable for the field name?

忘掉有多难 2020-12-28 11:19

I can access the elements of a named tuple by name as follows (*):

from collections import namedtuple
Car = namedtuple('Car', 'color mileage')
my_car = Car('


        
4 Answers
  • 2020-12-28 12:00

    Since Python 3.6 you can inherit from typing.NamedTuple and override __getitem__ so that string keys are looked up as field names:

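    import typing as tp
    
    
    class HistoryItem(tp.NamedTuple):
        inp: str
        tsb: float
        rtn: int
        frequency: tp.Optional[int] = None
    
        def __getitem__(self, item):
            # A string is treated as a field name; anything else is a normal tuple index.
            if isinstance(item, str):
                return getattr(self, item)
            # typing.NamedTuple itself has no __getitem__, so call tuple's directly.
            return tuple.__getitem__(self, item)
    
        def get(self, item, default=None):
            # dict-style lookup that returns a default for unknown field names
            try:
                return self[item]
            except (KeyError, AttributeError):
                return default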
    

    Then both item[num] and item["fld_name"] work.
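
A quick check of both access styles, offered as a sketch rather than a definitive implementation. The field values and the "missing" key are made up for illustration, the class is repeated so the snippet runs on its own, and tuple.__getitem__ is used for the non-string case on the assumption that positional lookup should behave like a plain tuple's.

```python
import typing as tp


class HistoryItem(tp.NamedTuple):
    inp: str
    tsb: float
    rtn: int
    frequency: tp.Optional[int] = None

    def __getitem__(self, item):
        # Strings are field names; ints and slices go to the tuple machinery.
        if isinstance(item, str):
            return getattr(self, item)
        return tuple.__getitem__(self, item)

    def get(self, item, default=None):
        try:
            return self[item]
        except (KeyError, AttributeError):
            return default


# Hypothetical values, chosen only for the demonstration.
item = HistoryItem(inp="ls", tsb=1.5, rtn=0)

field = "tsb"                       # field name held in a variable
by_name = item[field]               # lookup by field name
by_position = item[0]               # ordinary positional lookup still works
fallback = item.get("missing", "n/a")  # unknown name falls back to the default

print(by_name, by_position, fallback)
```

Normal attribute access (item.tsb) and tuple unpacking are unaffected by the override.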

  • 2020-12-28 12:04

    The getattr answer works, but for rows produced by pandas' DataFrame.itertuples there is another option that was faster in my timings.

    idx = {name: i for i, name in enumerate(list(df), start=1)}
    for row in df.itertuples(name=None):
        example_value = row[idx['product_price']]
    

    Explanation

    Build a dictionary that maps each column name to its position in the row tuple. Call itertuples with name=None so that it yields plain tuples, then look up each value by the position stored under its column name. The positions start at 1 because position 0 of each tuple holds the DataFrame index.

    1. Make a dictionary to find the indexes.

    idx = {name: i for i, name in enumerate(list(df), start=1)}

    2. Use the dictionary to access the desired values by name in the row tuples.

    for row in df.itertuples(name=None):
        example_value = row[idx['product_price']]
    

    Note: use start=0 in enumerate if you call itertuples with index=False.

    Here is a working example showing both methods and the timing of both methods.

    import numpy as np
    import pandas as pd
    import timeit
    
    data_length = 3 * 10**5
    fake_data = {
        "id_code": list(range(data_length)),
        "letter_code": np.random.choice(list('abcdefgz'), size=data_length),
        "pine_cones": np.random.randint(low=1, high=100, size=data_length),
        "area": np.random.randint(low=1, high=100, size=data_length),
        "temperature": np.random.randint(low=1, high=100, size=data_length),
        "elevation": np.random.randint(low=1, high=100, size=data_length),
    }
    df = pd.DataFrame(fake_data)
    
    
    def iter_with_idx():
        result_data = []
        
        idx = {name: i for i, name in enumerate(list(df), start=1)}
        
        for row in df.itertuples(name=None):
            
            row_calc = row[idx['pine_cones']] / row[idx['area']]
            result_data.append(row_calc)
            
        return result_data
    
          
    def iter_with_getattr():
        
        result_data = []
        for row in df.itertuples():
            row_calc = getattr(row, 'pine_cones') / getattr(row, 'area')
            result_data.append(row_calc)
            
        return result_data
        
    
    dict_idx_method = timeit.timeit(iter_with_idx, number=100)
    get_attr_method = timeit.timeit(iter_with_getattr, number=100)
    
    print(f'Dictionary index Method {dict_idx_method:0.4f} seconds')
    print(f'Get attribute method {get_attr_method:0.4f} seconds')
    

    Result:

    Dictionary index Method 49.1814 seconds
    Get attribute method 80.1912 seconds
    

    I assume the difference comes from the lower overhead of creating a plain tuple rather than a named tuple, and of indexing by position rather than calling getattr, but both are guesses. If anyone knows better, please comment.

    I have not explored how the number of columns versus the number of rows affects the timing results.

  • 2020-12-28 12:14

    Another way to access a field is:

    field_idx = my_car._fields.index(field)
    my_car[field_idx]
    

    Extract the index of the field, then use it to index the namedtuple.
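
A minimal, self-contained sketch of the index lookup, assuming the Car namedtuple from the question. The field values, the second car, and the idx dictionary are made up for illustration. Precomputing the positions once, as in the pandas answer, is an optional variation for when the same fields are read from many tuples.

```python
from collections import namedtuple

Car = namedtuple("Car", "color mileage")
my_car = Car(color="red", mileage=3812.4)  # hypothetical values

field = "mileage"  # field name known only at run time

# _fields is a tuple of field names in positional order;
# .index() raises ValueError if the name is not a field.
field_idx = my_car._fields.index(field)
value = my_car[field_idx]
print(value)

# Optional: build the name -> position mapping once and reuse it.
idx = {name: i for i, name in enumerate(Car._fields)}
cars = [my_car, Car(color="blue", mileage=120.0)]
mileages = [car[idx["mileage"]] for car in cars]
print(mileages)
```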

  • 2020-12-28 12:21

    You can use getattr:

    getattr(my_car, field)
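
Since the question's snippet is cut off, here is a complete sketch of the getattr approach. The values "red" and 3812.4 and the nonexistent field name "owner" are assumptions made for illustration.

```python
from collections import namedtuple

Car = namedtuple("Car", "color mileage")
my_car = Car(color="red", mileage=3812.4)  # hypothetical values

field = "mileage"  # field name held in a variable
value = getattr(my_car, field)
print(value)

# A third argument is returned instead of raising AttributeError
# when the name is not a field of the tuple.
missing = getattr(my_car, "owner", None)
print(missing)
```

This works because each namedtuple field is exposed as an ordinary attribute, so the built-in getattr needs no special support from the class.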
    