Improving the performance of FOR loop

后端 未结 3 810
时光取名叫无心
时光取名叫无心 2020-12-04 00:49

I am comparing sheets in a workbook. The workbook has two sheets named PRE and POST with the same 19 columns in each. The number of rows varies every day but are same for th

3条回答
  •  执念已碎
    2020-12-04 01:14

    If I may put my two cents in, here is my suggestion.

    I've tested the original code (with the the only alteration being the For y = 1 to 10 instead of For y = 1 to 20) and my code against 2 sheets with 10 columns and (initially 500,000) 250,000 rows data each. The reason I've used 10 instead of 20 lies in the fact that I don't know what data is in the columns and as a substitute I have used a random value of either 1 or 2.

    • For 10 columns that means that there are 2^10 = 1,024 possibilities.
    • For 20 columns that means that there are 2^20 = 1,048,576 possibilities.

    As I wanted to have at least the possibility of a few equal rows in each table I opted for the 10 column scenario.

    To time the macro I set up a timer macro which calls the macro to compare and delete data.

    In order to be able to compare the results both macros were executed directly after starting Excel and opening the file with the exact same data.

    I have

    • avoided all instances of Active
    • minimized the reading and writing of data between Excel and VBA which is accomplished by collecting all data on a sheet in a 2D array and then analyzing the array.
    • collected the rows to delete in ranges (1 per sheet) and deleted all the rows to be deleted outside the loop

    The Code

    Sub CompareAndDelete()
        Dim WsPre As Worksheet, WsPost As Worksheet
        Dim Row As Long, Column As Long
        Dim ArrPre() As Variant, ArrPost() As Variant
        Dim DeleteRow As Boolean
        Dim DeletePre As Range, DeletePost As Range
    
        With Application
            .ScreenUpdating = False
            .Calculation = xlCalculationManual
            .EnableEvents = False
        End With
    
        With ThisWorkbook
            Set WsPre = .Worksheets("PRE")
            Set WsPost = .Worksheets("Post")
        End With
    
        ArrPre = WsPre.Range(WsPre.Cells(1, 1), WsPre.Cells(WsPre.Cells(WsPre.Rows.Count, 1).End(xlUp).Row, 20))
        ArrPost = WsPost.Range(WsPost.Cells(1, 1), WsPost.Cells(WsPost.Cells(WsPost.Rows.Count, 1).End(xlUp).Row, 20))
    
        If Not UBound(ArrPre, 1) = UBound(ArrPost, 1) Then
            MsgBox "Unequal number of rows in sheets PRE and POST. Exiting macro.", vbCritical, "Unequal sheets"
        Else
    
            For Row = 2 To UBound(ArrPre, 1)
                DeleteRow = True
                For Column = 1 To UBound(ArrPre, 2)
                    If Not ArrPre(Row, Column) = ArrPost(Row, Column) Then
                        DeleteRow = False
                        Exit For
                    End If
                Next Column
    
                If DeleteRow = True Then
                    If DeletePre Is Nothing Then
                        Set DeletePre = WsPre.Rows(Row)
                        Set DeletePost = WsPost.Rows(Row)
                    Else
                        Set DeletePre = Union(DeletePre, WsPre.Rows(Row))
                        Set DeletePost = Union(DeletePost, WsPost.Rows(Row))
                    End If
    
                End If
            Next Row
    
            If Not DeletePre Is Nothing Then DeletePre.Delete
            If Not DeletePost Is Nothing Then DeletePost.Delete
    
        End If
    
        With Application
            .ScreenUpdating = True
            .Calculation = xlCalculationAutomatic
            .EnableEvents = True
        End With
    
    End Sub
    

    Results

    My Code - 500,000 rows of data.

    Datasheet with 500.000 rows and 10 columns has been processed in 14,23 seconds, 561 rows have been found equal and have been deleted.

    Original Code - 500,000 rows of data.

    Unfortunately my system couldn't handle this task and Excel stopped working.


    My Code - 250,000 rows of data.

    Datasheet with 250.000 rows and 10 columns has been processed in 4,72 seconds, 313 rows have been found equal and have been deleted.

    Original Code - 250,000 rows of data.

    Datasheet with 250.000 rows and 10 columns has been processed in 14,07 seconds, 313 rows have been found equal and have been deleted.

提交回复
热议问题