Excel Parsing multi-line Data column via macro with concatenated strings

笑着哭i 提交于 2020-01-05 07:29:48

问题


I have a excel column which runs in more than 20000 rows of data. Data format is almost consistent and requires parsing into newly inserted columns via macro. few parsing are straight forward and few require stacking and concatenated strings.

Example of repetitive multi-line data in single cell

LEGAL DETAILS FOR US5515106
Actual or expected expiration date=2014-05-26   
Legal state=DEAD   
Status=EXPIRED
Event publication date=1994-05-26 
Event code=US/APP 
Event indicator=Pos 
Event type=Examination events 
Application details
Application country=US US08249915 
Application date=1994-05-26 
Standardized application number=1994US-08249915
Event publication date=1994-07-21 
Event code=US/AS 
Event type=Change of name or address 
Event type=Reassignment 
Assignment
OWNER: THOMSON CONSUMER ELECTRONICS, INC., INDIANA
Effective date of the event=1994-06-08 
ASSIGNMENT OF ASSIGNORS INTEREST ASSIGNORS:CHANEY, JOHN WILLIAM BRIDGEWATER, KEVIN ELLIOTT REEL/FRAME:007121/0966
Event publication date=1996-05-07 
Event code=US/A 
Event indicator=Pos 
Event type=Event indicating In Force 
Patents Granted before 2001-04-15
Publication country=US 
Publication number=US5515106 
Publication stage Code=A 
Publication date=1996-05-07 
Standardized publication number=US5515106
Event publication date=1999-05-18 
Event code=US/NMFP 
Event type=Payment or non-payment notifications 
Publication of First Notice of Maintenance Fees Payable.
PAYMENT NOTICE YEAR: 
Year of payment of annual fees=3
Event publication date=1999-09-27 
Event code=US/FPAY 
Event indicator=Pos 
Event type=Event indicating In Force 
Event type=Payment or non-payment notifications 
Fee payment
Annual fees payment date=1999-09-27    
Year of payment of annual fees=4
Event publication date=1999-10-26 
Event code=US/NMFP 
Event type=Payment or non-payment notifications 
Publication of First Notice of Maintenance Fees Payable.
PAYMENT NOTICE YEAR: 
Year of payment of annual fees=3
Event publication date=2003-05-13 
Event code=US/NMFP 
Event type=Payment or non-payment notifications 
Publication of First Notice of Maintenance Fees Payable.
PAYMENT NOTICE YEAR: 
Year of payment of annual fees=7
Event publication date=2003-10-03 
Event code=US/FPAY 
Event indicator=Pos 
Event type=Event indicating In Force 
Event type=Payment or non-payment notifications 
Fee payment
Annual fees payment date=2003-10-03    
Year of payment of annual fees=8
Event publication date=2007-05-22 
Event code=US/NMFP 
Event type=Payment or non-payment notifications 
Publication of First Notice of Maintenance Fees Payable.
PAYMENT NOTICE YEAR: 
Year of payment of annual fees=11
Event publication date=2007-10-18 
Event code=US/FPAY 
Event indicator=Pos 
Event type=Event indicating In Force 
Event type=Payment or non-payment notifications 
Fee payment
Annual fees payment date=2007-10-18    
Year of payment of annual fees=12
Event publication date=2014-05-26 
Event code=US/EEDX 
Event indicator=Neg 
Event type=Event indicating Not In Force 
Patent has expired

If we look closely first four lines are different and data followed by '=' is parsed individually. After this repetitive order is followed:-

Event publication date

Event code

Event indicator

Event type

I am interested to do following with this data present in single cell:-

1. I have inserted certain column which has same starting of multline i.e. 'Event Publication date' in this case date written infront of '=' sign is parsed. further since each multiline has many such repetitive dates we need to stack and single respective row and all dates are grouped in chronological order.

2. In one of the special case of 'Event Type' column i need to concatenated two respective fields 'Event Publication Date' and Event Type together and stack them in single cell

3. In second special case i need to get only last Multiline section wherein Event Publication Date and Event Type concatenated together.

To explain this further Sample Excel Data can be downloaded from LINK and Desired Result Format Manually done can be downloaded from LINK Desired Format

I have worked out following code till now:-

Sub LegalStatus()
On Error GoTo eh
  If HeaderExists("Table1", "Event publication date") = True Then
 MsgBox "You have Already Done Legal Split!"
    Exit Sub
  Else

         Dim x       As Variant
    Dim y       As Variant
    Dim a()     As Variant
    Dim r       As Long
    Dim i       As Long
    Dim j       As Long

      Dim colNum As Integer
colNum = ActiveSheet.Rows(1).Find(what:="Legal Status", lookat:=xlWhole).Column
ActiveSheet.Columns(colNum + 1).Insert
ActiveSheet.Columns(colNum + 1).Insert
ActiveSheet.Columns(colNum + 1).Insert
ActiveSheet.Columns(colNum + 1).Insert
ActiveSheet.Columns(colNum + 1).Insert
ActiveSheet.Columns(colNum + 1).Insert
ActiveSheet.Columns(colNum + 1).Insert
ActiveSheet.Columns(colNum + 1).Insert
ActiveSheet.Cells(1, colNum + 1).Value = "Actual or expected expiration date"
ActiveSheet.Cells(1, colNum + 2).Value = "Legal state"
ActiveSheet.Cells(1, colNum + 3).Value = "Status"
ActiveSheet.Cells(1, colNum + 4).Value = "Event publication date"
ActiveSheet.Cells(1, colNum + 5).Value = "Event type"
ActiveSheet.Cells(1, colNum + 6).Value = "Latest Event Type"
ActiveSheet.Cells(1, colNum + 7).Value = "Year of payment of annual fees"
ActiveSheet.Cells(1, colNum + 8).Value = "Annual fees payment date"

    For r = 2 To Cells(Rows.Count, 1).End(xlUp).Row
        y = "Event publication date=" & SplitByLastOccurrence(Range("B" & r).Value, "Event publication date")(1)
        x = Split(y, vbLf)
        For i = LBound(x) To UBound(x)
            If InStr(x(i), "=") Then
                ReDim Preserve a(j)
                a(UBound(a)) = Split(x(i), "=")(1)
                j = j + 1
            End If
        Next i
        Range("C" & r).Resize(, UBound(a) + 1).Value = a
        Erase x: Erase a: j = 0
    Next r
       End If
eh:
    MsgBox "Sorry No Legal Status Column: " & Err.Description
End Sub


Function SplitByLastOccurrence(s As String, delimiter As String)
    Dim arr, i As Long

    If Len(s) = 0 Or Len(delimiter) = 0 Then
        SplitByLastOccurrence = CVErr(2001)
    Else
        i = InStrRev(s, delimiter)
        If i = 0 Then
            SplitByLastOccurrence = Array(s)
        Else
            ReDim arr(0 To 1)
            arr(0) = Trim(Left$(s, i - 1))
            arr(1) = Trim(Mid$(s, i + Len(delimiter) + 1))
            SplitByLastOccurrence = arr
        End If
    End If
End Function

I believe only experts can help me out.

来源:https://stackoverflow.com/questions/43422134/excel-parsing-multi-line-data-column-via-macro-with-concatenated-strings

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!