How to optimize VLOOKUP for a high search count? (alternatives to VLOOKUP)

名媛妹妹 2020-11-27 15:48

I am looking for alternatives to VLOOKUP that perform better in the context described below.

The context is the following:

  • I have a data set of
4 answers
  •  夕颜 (original poster)
    2020-11-27 16:02

    I considered the following alternatives:

    • VLOOKUP array-formula
    • MATCH / INDEX
    • VBA (using a dictionary)

    The measured performance compares as follows (performance index relative to the simple VLOOKUP formula):

    • VLOOKUP simple formula: ~10 minutes (baseline)
    • VLOOKUP array-formula: ~10 minutes (1:1 performance index)
    • MATCH / INDEX: ~2 minutes (5:1 performance index)
    • VBA (using a dictionary): ~6 seconds (100:1 performance index)
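
    The answer does not say how these timings were taken. One way to obtain comparable figures is to time a full recalculation from VBA. A minimal sketch, assuming the workbook under test is the only one open; TimeFullRecalc is a hypothetical helper name:

    ```vba
    ' Hypothetical helper: times one full recalculation of all open workbooks.
    Sub TimeFullRecalc()
      Dim startTime As Double
      startTime = Timer                 ' seconds elapsed since midnight
      Application.CalculateFull         ' force every formula to recalculate
      Debug.Print "Full recalculation: " & Format(Timer - startTime, "0.00") & " s"
    End Sub
    ```

    The result appears in the Immediate window of the VBA editor. Because Timer resets at midnight, a run that spans midnight would report a negative duration.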

    All three versions use the same reference sheet (sheet1); only the lookup sheet differs:

    1) Lookup sheet: (vlookup array formula version)

             A          B
         1
         2   key51359    {=VLOOKUP(A2:A100001;sheet1!$A$2:$B$100001;2;FALSE)}
         3   key41232    formula in B2
         4   key10102    ... extends to
       ...   ...         ... 
     99999   key4153     ... cell B100001
    100000   key12818    ... (select whole range, and press
    100001   key35032    ... CTRL+SHIFT+ENTER to make it an array formula)
    100002
    

    2) Lookup sheet: (match+index version)

             A           B                                       C
          1
          2  key51359    =MATCH(A2;sheet1!$A$2:$A$100001;)       =INDEX(sheet1!$B$2:$B$100001;B2)
          3  key41232    =MATCH(A3;sheet1!$A$2:$A$100001;)       =INDEX(sheet1!$B$2:$B$100001;B3)
          4  key10102    =MATCH(A4;sheet1!$A$2:$A$100001;)       =INDEX(sheet1!$B$2:$B$100001;B4)
        ...  ...         ...                                     ...
      99999  key4153     =MATCH(A99999;sheet1!$A$2:$A$100001;)   =INDEX(sheet1!$B$2:$B$100001;B99999)
     100000  key12818    =MATCH(A100000;sheet1!$A$2:$A$100001;)  =INDEX(sheet1!$B$2:$B$100001;B100000)
     100001  key35032    =MATCH(A100001;sheet1!$A$2:$A$100001;)  =INDEX(sheet1!$B$2:$B$100001;B100001)
     100002
    

    3) Lookup sheet: (vbalookup version)

           A          B
         1
         2  key51359    {=vbalookup(A2:A50001;sheet1!$A$2:$B$100001;2)}
         3  key41232    formula in B2
         4  key10102    ... extends to
       ...  ...         ...
     50000  key91021    ... 
     50001  key42       ... cell B50001
     50002  key21873    {=vbalookup(A50002:A100001;sheet1!$A$2:$B$100001;2)}
     50003  key31415    formula in B50002 extends to
       ...  ...         ...
     99999  key4153     ... cell B100001
    100000  key12818    ... (select whole range, and press
    100001  key35032    ... CTRL+SHIFT+ENTER to make it an array formula)
    100002
    

    NB: For some reason (apparently internal to Excel), vbalookup fails to return more than 65536 values at a time, so I had to split the array formula in two.

    The associated VBA code:

    Function vbalookup(lookupRange As Range, refRange As Range, dataCol As Long) As Variant
      Dim dict As New Scripting.Dictionary
      Dim myRow As Range
      Dim I As Long, J As Long
      Dim vResults() As Variant
    
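      ' 1. Build a dictionary mapping each key (first column of refRange) to its value
      For Each myRow In refRange.Columns(1).Cells
        ' Add key -> value (column dataCol of refRange); raises an error on duplicate keys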
        dict.Add myRow.Value, myRow.Offset(0, dataCol - 1).Value
      Next myRow
    
      ' 2. Use it over all lookup data
      ReDim vResults(1 To lookupRange.Rows.Count, 1 To lookupRange.Columns.Count) As Variant
      For I = 1 To lookupRange.Rows.Count
        For J = 1 To lookupRange.Columns.Count
          If dict.Exists(lookupRange.Cells(I, J).Value) Then
            vResults(I, J) = dict(lookupRange.Cells(I, J).Value)
          End If
        Next J
      Next I
    
      vbalookup = vResults
    End Function
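
    The function above reads every cell individually, which is usually the slowest part of a VBA lookup. A possible variant, sketched under stated assumptions, reads both ranges into Variant arrays in one go. vbalookupFast is a hypothetical name. The duplicate-key guard and the #N/A result for missing keys are additions that are not in the original code, and no timing was measured for this variant:

    ```vba
    ' Hypothetical variant of vbalookup: same arguments, but both ranges are
    ' read into Variant arrays once instead of being accessed cell by cell.
    Function vbalookupFast(lookupRange As Range, refRange As Range, dataCol As Long) As Variant
      Dim dict As New Scripting.Dictionary   ' needs the Microsoft Scripting Runtime reference
      Dim refData As Variant, lookupData As Variant
      Dim vResults() As Variant
      Dim I As Long, J As Long

      ' Bulk-read both ranges. Assumes each range holds more than one cell
      ' (otherwise .Value returns a scalar, not a 2-D array) and that
      ' refRange has at least dataCol columns.
      refData = refRange.Value
      lookupData = lookupRange.Value

      ' 1. Build the dictionary, keeping the first occurrence of a duplicate
      '    key, as an exact-match VLOOKUP does
      For I = 1 To UBound(refData, 1)
        If Not dict.Exists(refData(I, 1)) Then
          dict.Add refData(I, 1), refData(I, dataCol)
        End If
      Next I

      ' 2. Look up every cell of lookupRange
      ReDim vResults(1 To UBound(lookupData, 1), 1 To UBound(lookupData, 2))
      For I = 1 To UBound(lookupData, 1)
        For J = 1 To UBound(lookupData, 2)
          If dict.Exists(lookupData(I, J)) Then
            vResults(I, J) = dict(lookupData(I, J))
          Else
            vResults(I, J) = CVErr(xlErrNA)   ' mirror VLOOKUP's #N/A for missing keys
          End If
        Next J
      Next I

      vbalookupFast = vResults
    End Function
    ```

    It is entered as an array formula in the same way, e.g. {=vbalookupFast(A2:A50001;sheet1!$A$2:$B$100001;2)}. Note that Scripting.Dictionary compares string keys case-sensitively by default, whereas VLOOKUP ignores case, so results can differ when keys vary only in case.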
    

    NB: Scripting.Dictionary requires a reference to Microsoft Scripting Runtime, which must be added manually (Tools->References menu in the VBA editor).
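
    As an alternative to adding the reference by hand, the dictionary can be created with late binding through CreateObject. A minimal sketch, assuming Excel on Windows, where the Scripting Runtime library is available; NewDictionary and LateBoundDemo are hypothetical names, not part of the original code:

    ```vba
    ' Late-bound creation: no Tools->References entry is needed.
    Function NewDictionary() As Object
      Set NewDictionary = CreateObject("Scripting.Dictionary")
    End Function

    ' Small demonstration of the same Add / Exists / Item calls used by vbalookup.
    Sub LateBoundDemo()
      Dim dict As Object
      Set dict = NewDictionary()
      dict.Add "key42", "value42"
      Debug.Print dict.Exists("key42")
      Debug.Print dict.Item("key42")
    End Sub
    ```

    In vbalookup this would mean declaring dict As Object and assigning it with Set dict = CreateObject("Scripting.Dictionary") instead of Dim dict As New Scripting.Dictionary. Late-bound calls tend to be somewhat slower than early-bound ones, so the timings above may not carry over unchanged.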

    Conclusion:

    In this context, VBA using a dictionary is about 100x faster than VLOOKUP (~6 seconds versus ~10 minutes) and about 20x faster than MATCH/INDEX (~6 seconds versus ~2 minutes).
