How to delete entire row when case sensitive duplicates are found in Excel (for 100k records or more)?

大兔子大兔子 提交于 2019-12-04 19:14:29
CallumDA

You can use a Dictionary to check for binary uniqueness and variant arrays to speed things up. To use the dictionary you will need to include a reference to Microsoft Scripting Runtime Library

(Tools > References > Microsoft Scripting Runtime library)

I've tested this with 100,000 rows which takes on average 0.25 seconds on my laptop.

Sub RemoveDuplicateRows()
    Dim data As Range
    Set data = ThisWorkbook.Worksheets("Sheet1").UsedRange

    Dim v As Variant, tags As Variant
    v = data
    ReDim tags(1 To UBound(v), 1 To 1)
    tags(1, 1) = 0 'keep the header

    Dim dict As Dictionary
    Set dict = New Dictionary
    dict.CompareMode = BinaryCompare

    Dim i As Long
    For i = LBound(v, 1) To UBound(v, 1)
        With dict
            If Not .Exists(v(i, 1)) Then 'v(i,1) comparing the values in the first column 
                tags(i, 1) = i
                .Add Key:=v(i, 1), Item:=vbNullString
            End If
        End With
    Next i

    Dim rngTags As Range
    Set rngTags = data.Columns(data.Columns.count + 1)
    rngTags.Value = tags

    Union(data, rngTags).Sort key1:=rngTags, Orientation:=xlTopToBottom, Header:=xlYes

    Dim count As Long
    count = rngTags.End(xlDown).Row

    rngTags.EntireColumn.Delete
    data.Resize(UBound(v, 1) - count + 1).Offset(count).EntireRow.Delete
End Sub

Based on the brilliant answer from this question

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!