SQL join or R's merge() function in NumPy?

鱼传尺愫 2020-12-03 08:14

Is there an implementation where I can join two arrays based on their keys? Speaking of which, is the canonical way to store keys in one of the NumPy columns (NumPy doesn't

1 answer
  • 2020-12-03 08:42

    If you want to use only NumPy, you can use structured arrays and the numpy.lib.recfunctions.join_by function (see http://pyopengl.sourceforge.net/pydoc/numpy.lib.recfunctions.html). A small example:

    In [1]: import numpy as np
       ...: import numpy.lib.recfunctions as rfn
       ...: a = np.array([(1, 10.), (2, 20.), (3, 30.)], dtype=[('id', int), ('A', float)])
       ...: b = np.array([(2, 200.), (3, 300.), (4, 400.)], dtype=[('id', int), ('B', float)])
    
    In [2]: rfn.join_by('id', a, b, jointype='inner', usemask=False)
    Out[2]: 
    array([(2, 20.0, 200.0), (3, 30.0, 300.0)], 
          dtype=[('id', '<i4'), ('A', '<f8'), ('B', '<f8')])
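
The same call can also produce outer-style joins. The following is a minimal sketch, reusing the arrays a and b from the example above; the variable names outer and left are only illustrative. With usemask=True the result is a masked array, and fields that have no matching row are masked rather than filled with a default value.

```python
import numpy as np
import numpy.lib.recfunctions as rfn

# Same structured arrays as in the example above.
a = np.array([(1, 10.), (2, 20.), (3, 30.)], dtype=[('id', int), ('A', float)])
b = np.array([(2, 200.), (3, 300.), (4, 400.)], dtype=[('id', int), ('B', float)])

# Full outer join: every id from either array appears once;
# A or B is masked where the other array has no matching id.
outer = rfn.join_by('id', a, b, jointype='outer', usemask=True)

# Left outer join: every row of a is kept; B is masked where b has no match.
left = rfn.join_by('id', a, b, jointype='leftouter', usemask=True)

print(outer)
print(left)
```

Masked fields can be replaced afterwards with the masked array's filled() method if a plain ndarray is needed.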
    

    Another option is to use pandas. I have no experience with it, but it provides more powerful data structures and functionality than standard NumPy, "to make working with “relational” or “labeled” data both easy and intuitive". It also has joining and merging functions (for example, see http://pandas.sourceforge.net/merging.html#joining-on-a-key).
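
For comparison, here is a minimal sketch of the pandas route, assuming pandas is installed. The DataFrames simply mirror the arrays a and b from the NumPy example, and the names inner and outer are only illustrative. pandas.merge takes the key column through on and the join type through how, much like a SQL join or R's merge().

```python
import pandas as pd

# DataFrames mirroring the structured arrays from the NumPy example.
a = pd.DataFrame({"id": [1, 2, 3], "A": [10.0, 20.0, 30.0]})
b = pd.DataFrame({"id": [2, 3, 4], "B": [200.0, 300.0, 400.0]})

# Inner join: keep only ids present in both frames (2 and 3).
inner = pd.merge(a, b, on="id", how="inner")

# Outer join: keep all ids; missing values become NaN.
outer = pd.merge(a, b, on="id", how="outer")

print(inner)
print(outer)
```

Unlike join_by, the outer merge marks missing values with NaN instead of a mask, so the numeric columns stay float.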
