问题
I have a excel column which runs in more than 20000 rows of data. Data format is almost consistent and requires parsing into newly inserted columns via macro. few parsing are straight forward and few require stacking and concatenated strings.
Example of repetitive multi-line data in single cell
LEGAL DETAILS FOR US5515106
Actual or expected expiration date=2014-05-26
Legal state=DEAD
Status=EXPIRED
Event publication date=1994-05-26
Event code=US/APP
Event indicator=Pos
Event type=Examination events
Application details
Application country=US US08249915
Application date=1994-05-26
Standardized application number=1994US-08249915
Event publication date=1994-07-21
Event code=US/AS
Event type=Change of name or address
Event type=Reassignment
Assignment
OWNER: THOMSON CONSUMER ELECTRONICS, INC., INDIANA
Effective date of the event=1994-06-08
ASSIGNMENT OF ASSIGNORS INTEREST ASSIGNORS:CHANEY, JOHN WILLIAM BRIDGEWATER, KEVIN ELLIOTT REEL/FRAME:007121/0966
Event publication date=1996-05-07
Event code=US/A
Event indicator=Pos
Event type=Event indicating In Force
Patents Granted before 2001-04-15
Publication country=US
Publication number=US5515106
Publication stage Code=A
Publication date=1996-05-07
Standardized publication number=US5515106
Event publication date=1999-05-18
Event code=US/NMFP
Event type=Payment or non-payment notifications
Publication of First Notice of Maintenance Fees Payable.
PAYMENT NOTICE YEAR:
Year of payment of annual fees=3
Event publication date=1999-09-27
Event code=US/FPAY
Event indicator=Pos
Event type=Event indicating In Force
Event type=Payment or non-payment notifications
Fee payment
Annual fees payment date=1999-09-27
Year of payment of annual fees=4
Event publication date=1999-10-26
Event code=US/NMFP
Event type=Payment or non-payment notifications
Publication of First Notice of Maintenance Fees Payable.
PAYMENT NOTICE YEAR:
Year of payment of annual fees=3
Event publication date=2003-05-13
Event code=US/NMFP
Event type=Payment or non-payment notifications
Publication of First Notice of Maintenance Fees Payable.
PAYMENT NOTICE YEAR:
Year of payment of annual fees=7
Event publication date=2003-10-03
Event code=US/FPAY
Event indicator=Pos
Event type=Event indicating In Force
Event type=Payment or non-payment notifications
Fee payment
Annual fees payment date=2003-10-03
Year of payment of annual fees=8
Event publication date=2007-05-22
Event code=US/NMFP
Event type=Payment or non-payment notifications
Publication of First Notice of Maintenance Fees Payable.
PAYMENT NOTICE YEAR:
Year of payment of annual fees=11
Event publication date=2007-10-18
Event code=US/FPAY
Event indicator=Pos
Event type=Event indicating In Force
Event type=Payment or non-payment notifications
Fee payment
Annual fees payment date=2007-10-18
Year of payment of annual fees=12
Event publication date=2014-05-26
Event code=US/EEDX
Event indicator=Neg
Event type=Event indicating Not In Force
Patent has expired
If we look closely first four lines are different and data followed by '=' is parsed individually. After this repetitive order is followed:-
Event publication date
Event code
Event indicator
Event type
I am interested to do following with this data present in single cell:-
1. I have inserted certain column which has same starting of multline i.e. 'Event Publication date' in this case date written infront of '=' sign is parsed. further since each multiline has many such repetitive dates we need to stack and single respective row and all dates are grouped in chronological order.
2. In one of the special case of 'Event Type' column i need to concatenated two respective fields 'Event Publication Date' and Event Type together and stack them in single cell
3. In second special case i need to get only last Multiline section wherein Event Publication Date and Event Type concatenated together.
To explain this further Sample Excel Data can be downloaded from LINK and Desired Result Format Manually done can be downloaded from LINK Desired Format
I have worked out following code till now:-
Sub LegalStatus()
On Error GoTo eh
If HeaderExists("Table1", "Event publication date") = True Then
MsgBox "You have Already Done Legal Split!"
Exit Sub
Else
Dim x As Variant
Dim y As Variant
Dim a() As Variant
Dim r As Long
Dim i As Long
Dim j As Long
Dim colNum As Integer
colNum = ActiveSheet.Rows(1).Find(what:="Legal Status", lookat:=xlWhole).Column
ActiveSheet.Columns(colNum + 1).Insert
ActiveSheet.Columns(colNum + 1).Insert
ActiveSheet.Columns(colNum + 1).Insert
ActiveSheet.Columns(colNum + 1).Insert
ActiveSheet.Columns(colNum + 1).Insert
ActiveSheet.Columns(colNum + 1).Insert
ActiveSheet.Columns(colNum + 1).Insert
ActiveSheet.Columns(colNum + 1).Insert
ActiveSheet.Cells(1, colNum + 1).Value = "Actual or expected expiration date"
ActiveSheet.Cells(1, colNum + 2).Value = "Legal state"
ActiveSheet.Cells(1, colNum + 3).Value = "Status"
ActiveSheet.Cells(1, colNum + 4).Value = "Event publication date"
ActiveSheet.Cells(1, colNum + 5).Value = "Event type"
ActiveSheet.Cells(1, colNum + 6).Value = "Latest Event Type"
ActiveSheet.Cells(1, colNum + 7).Value = "Year of payment of annual fees"
ActiveSheet.Cells(1, colNum + 8).Value = "Annual fees payment date"
For r = 2 To Cells(Rows.Count, 1).End(xlUp).Row
y = "Event publication date=" & SplitByLastOccurrence(Range("B" & r).Value, "Event publication date")(1)
x = Split(y, vbLf)
For i = LBound(x) To UBound(x)
If InStr(x(i), "=") Then
ReDim Preserve a(j)
a(UBound(a)) = Split(x(i), "=")(1)
j = j + 1
End If
Next i
Range("C" & r).Resize(, UBound(a) + 1).Value = a
Erase x: Erase a: j = 0
Next r
End If
eh:
MsgBox "Sorry No Legal Status Column: " & Err.Description
End Sub
Function SplitByLastOccurrence(s As String, delimiter As String)
Dim arr, i As Long
If Len(s) = 0 Or Len(delimiter) = 0 Then
SplitByLastOccurrence = CVErr(2001)
Else
i = InStrRev(s, delimiter)
If i = 0 Then
SplitByLastOccurrence = Array(s)
Else
ReDim arr(0 To 1)
arr(0) = Trim(Left$(s, i - 1))
arr(1) = Trim(Mid$(s, i + Len(delimiter) + 1))
SplitByLastOccurrence = arr
End If
End If
End Function
I believe only experts can help me out.
来源:https://stackoverflow.com/questions/43422134/excel-parsing-multi-line-data-column-via-macro-with-concatenated-strings