How to convert a pandas DataFrame subset of columns AND rows into a numpy array?

野性不改 2020-12-04 10:31

I'm wondering if there is a simpler, memory-efficient way to select a subset of rows and columns from a pandas DataFrame.

For instance, given this dataframe:

3 Answers
  • 2020-12-04 10:48

    .loc accepts row and column selectors simultaneously (as does .iloc; the older .ix did too, but it has since been removed from pandas). The selection is also done in a single pass.

    In [1]: import numpy as np
       ...: from pandas import DataFrame
       ...: df = DataFrame(np.random.rand(4,5), columns = list('abcde'))
    
    In [2]: df
    Out[2]: 
              a         b         c         d         e
    0  0.669701  0.780497  0.955690  0.451573  0.232194
    1  0.952762  0.585579  0.890801  0.643251  0.556220
    2  0.900713  0.790938  0.952628  0.505775  0.582365
    3  0.994205  0.330560  0.286694  0.125061  0.575153
    
    In [5]: df.loc[df['c']>0.5,['a','d']]
    Out[5]: 
              a         d
    0  0.669701  0.451573
    1  0.952762  0.643251
    2  0.900713  0.505775
    

    And if you want the values as a NumPy array, use .values (though the frame should pass directly to sklearn as is, since frames support the array interface):

    In [6]: df.loc[df['c']>0.5,['a','d']].values
    Out[6]: 
    array([[ 0.66970138,  0.45157274],
           [ 0.95276167,  0.64325143],
           [ 0.90071271,  0.50577509]])
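
As a hedged sketch of the same selection on a current pandas release: the example below uses a seeded random generator (an assumption added for reproducibility, not part of the original answer) and converts the result with DataFrame.to_numpy(), which the pandas documentation has recommended over .values since version 0.24. The names mask, subset, and arr are illustrative only.

```python
import numpy as np
import pandas as pd

# Seeded generator so the example is reproducible (assumption, not in the original).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((4, 5)), columns=list("abcde"))

# Boolean row mask and column list passed to .loc together.
mask = df["c"] > 0.5
subset = df.loc[mask, ["a", "d"]]

# Convert the selected block to a NumPy array.
arr = subset.to_numpy()

print(arr.shape)  # (number of rows where c > 0.5, 2)
print(arr.dtype)  # float64, since both selected columns hold floats
```

Because both selected columns share one dtype, the resulting array is a plain float64 block; mixing dtypes across the selected columns would instead produce a wider common dtype such as object.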
    
  • 2020-12-04 10:57

    For the first problem, perhaps something like this. You can simply access the columns by their names:

    >>> import numpy as np
    >>> import pandas as pd
    >>> df = pd.DataFrame(np.random.rand(4,5), columns = list('abcde'))
    >>> df[df['c']>.5][['b','e']]
              b         e
    1  0.071146  0.132145
    2  0.495152  0.420219
    

    For the second problem:

    >>> df[df['c']>.5][['b','e']].values
    array([[ 0.07114556,  0.13214495],
           [ 0.49515157,  0.42021946]])
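
The chained form above filters the rows first and then picks columns from the intermediate frame, so it builds a temporary DataFrame holding every column of the matching rows. A minimal sketch, assuming a seeded random frame (the seed and the names chained and single are illustrative, not from the original answer), showing that a single .loc call yields the same array:

```python
import numpy as np
import pandas as pd

# Seeded generator so the comparison is reproducible (assumption).
rng = np.random.default_rng(1)
df = pd.DataFrame(rng.random((4, 5)), columns=list("abcde"))

# Chained indexing: row filter, then column selection on the intermediate frame.
chained = df[df["c"] > 0.5][["b", "e"]].values

# Single .loc call: rows and columns selected together.
single = df.loc[df["c"] > 0.5, ["b", "e"]].values

print(np.array_equal(chained, single))  # True
```

For read-only selection either spelling works; the .loc form mainly avoids the intermediate frame and is the safer habit if the selection is later used for assignment.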
    
  • 2020-12-04 11:01

    Use its .values attribute directly:

    In [79]: df[df.c > 0.5][['b', 'e']].values
    Out[79]: 
    array([[ 0.98836259,  0.82403141],
           [ 0.337358  ,  0.02054435],
           [ 0.29271728,  0.37813099],
           [ 0.70033513,  0.69919695]])
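
Here df.c is attribute-style access to column c. It is equivalent to df['c'] as long as the column name is a valid Python identifier and does not clash with an existing DataFrame attribute or method. A small sketch under those assumptions (the frame, the seed, and the names by_attr and by_key are made up for illustration):

```python
import numpy as np
import pandas as pd

# Seeded generator so the example is reproducible (assumption).
rng = np.random.default_rng(2)
df = pd.DataFrame(rng.random((6, 5)), columns=list("abcde"))

# Attribute access and bracket access return the same column here.
same_column = df.c.equals(df["c"])

# Both spellings therefore produce the same filtered array.
by_attr = df[df.c > 0.5][["b", "e"]].values
by_key = df[df["c"] > 0.5][["b", "e"]].values

print(same_column)                      # True
print(np.array_equal(by_attr, by_key))  # True
```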
    