How to iterate through text in the Word document by VB macro

核能气质少年 提交于 2019-12-11 10:23:31

问题


I wanted to count chars in the Word document by Macro I have no idea how to get reference two the text in visual basic macro and go through it.

I would like to count how many of every char was in the document. For example in document:

ABZBB

A x 1
B x 3
Z x 1

   Sub Macro1()
Dim Box As Shape
Set Box = ActiveDocument.Shapes.AddTextbox( _
Orientation:=msoTextOrientationHorizontal, _
Left:=50, Top:=50, Width:=200, Height:=400)
Box.TextFrame.TextRange.Text = "My text comes this way" + Chr(10)
Dim s As String
Application.ScreenUpdating = False
docLength = ActiveDocument.Range.Characters.Count

Box.TextFrame.TextRange.Text = Box.TextFrame.TextRange.Text + "Text length is: " + Str(docLength) + Chr(10)

Dim arr(128) As Integer
Dim character As Integer
For i = 1 To docLength - 1

        character = Asc(ActiveDocument.Range.Characters(i))
If iAsc >= 0 And iAsc <= 127 Then
         arr(character) = arr(character) + 1
 End If
Next i


End Sub

回答1:


Below is a simplistic, and perhaps slow, example of counting individual letters (and some other characters) in a document.

Sub CountChars()
    Dim iCount(57) As Integer
    Dim x As Integer
    Dim iTotal As Integer
    Dim iAsc As Integer

    Application.ScreenUpdating = False
    iTotal = ActiveDocument.Range.Characters.Count

    For x = 1 To iTotal
        iAsc = Asc(ActiveDocument.Range.Characters(x))
        If iAsc >= 65 And iAsc <= 122 Then
        iCount(iAsc - 65) = iCount(iAsc - 65) + 1
        End If
    Next x
    For x = 0 To 57
        Debug.Print x, iCount(x)
    Next x
    Application.ScreenUpdating = True
End Sub

Change to

Debug.Print Chr(x + 65), iCount(x)

to display the characters themselves.

It may be possible to use Find (somehow) to count occurrences of characters; otherwise it would require Regex.

Alternative using Replace:

'Tools, References: Microsoft Scripting Runtime
Sub CountCharsWithReplace()
    Dim doc As Document
    Dim rDupe As Range
    Dim dicChars As Scripting.Dictionary
    Dim s As String
    Dim iTotalChars As Integer
    Dim iTempChars As Integer
    Dim iDiff As Integer
    Dim n As Integer
    Dim blnExec As Boolean
    Dim lett As Variant
    Application.ScreenUpdating = False
    Set doc = ActiveDocument
    iTotalChars = doc.Range.Characters.Count
    Set rDupe = doc.Range
    Set dicChars = New Scripting.Dictionary
    Do While rDupe.Characters.Count > 1
        s = rDupe.Characters(1).Text
        blnExec = rDupe.Find.Execute(s, , , , , , , , , "", wdReplaceAll)
        iTempChars = doc.Range.Characters.Count
        iDiff = iTotalChars - iTempChars
        iTotalChars = iTempChars
        If Asc(s) >= 65 And Asc(s) <= 122 Then
            dicChars.Add s, iDiff
        End If
        n = n + 1
    Loop
    ActiveDocument.Undo Times:=n
    Application.ScreenUpdating = True
    For Each lett In dicChars.Keys
        Debug.Print lett, dicChars(lett)
    Next lett
End Sub



回答2:


Using VBA, to count the number of characters in the active document do:

ActiveDocument.Range.ComputeStatistics(wdStatisticCharacters)

or

Activedocument.Range.Characters.Count

To get the count for the current selection:

Selection.Range.ComputeStatistics(wdStatisticCharacters)

or

Selection.Range.Characters.Count

The second method in each example counts spaces as characters, the first doesn't.

EDIT: I did some speed testing on various methods to count the instances of a char in a document. Regular expressions and stuffing the document contents into a string are fastest - many times faster than looping through each character or FIND

For my test document I copied the contents of this web page into a Word document. As an accuracy check, I used Word's Find function/panel to find the number of instances of lower case "a". Before I edited this answer that was 409 instances.

I then created four functions to count the number of instances of a character (any string actually) in a Word document. The first simply loops through each character in the doc, similar to Andrew's. The second uses the Find function. The third stuffs the contents of the document into a string and loops through it. The fourth does the same thing but check the matches using a regular expression:

Function GetCharCountLoop(doc As Word.Document, char As String) As Long
Dim i As Long
Dim CharCount As Long

With doc.Content.Characters
    For i = 1 To .Count
        If .Item(i) = char Then
            CharCount = CharCount + 1
        End If
    Next i
End With
GetCharCountLoop = CharCount
End Function

Function GetCharCountFind(doc As Word.Document, char As String) As Long
Dim i As Long
Dim CharCount As Long

With doc.Content.Find
    Do While .Execute(FindText:=char, Forward:=True, MatchWholeWord:=False, MatchCase:=True) = True
        CharCount = CharCount + 1
    Loop
    GetCharCountFind = CharCount
End With
End Function

Function GetCharCountString(doc As Word.Document, char As String) As Long
Dim chars As String
Dim i As Long
Dim CharCount As Long

chars = doc.Content
For i = 1 To Len(chars)
    If Mid$(chars, i, 1) = char Then
            CharCount = CharCount + 1
        End If
    Next i
GetCharCountString = CharCount
End Function

Function GetCharCountRegex(doc As Word.Document, char As String) As Long
Dim chars As String
Dim CharCount As Long
Dim objRegExp As Object

chars = doc.Content
Set objRegExp = CreateObject("VBScript.RegExp")
With objRegExp
    .Pattern = char
    .IgnoreCase = False
    .Global = True
    CharCount = .Execute(chars).Count
End With
GetCharCountRegex = CharCount
End Function

I then tested them using this sub, running a single loop:

Sub TimeMethods()
Dim char As String
Dim CharCount As Long
Dim LoopCounter As Long
Dim NumLoops As Long
Dim StartTime As Double

char = "a"
NumLoops = 1

StartTime = Timer
For LoopCounter = 1 To NumLoops
CharCount = GetCharCountLoop(ActiveDocument, char)
Next LoopCounter
Debug.Print CharCount
Debug.Print Timer - StartTime

StartTime = Timer
For LoopCounter = 1 To NumLoops
CharCount = GetCharCountFind(ActiveDocument, char)
Next LoopCounter
Debug.Print CharCount
Debug.Print Timer - StartTime

StartTime = Timer
For LoopCounter = 1 To NumLoops
CharCount = GetCharCountString(ActiveDocument, char)
Next LoopCounter
Debug.Print CharCount
Debug.Print Timer - StartTime

StartTime = Timer
For LoopCounter = 1 To NumLoops
CharCount = GetCharCountRegex(ActiveDocument, char)
Next LoopCounter
Debug.Print CharCount
Debug.Print Timer - StartTime

End Sub

The results are dramatic:

GetCharCountLoop - 514.3046875 seconds

GetCharCountFind - 0.5859375 seconds

GetCharCountString - 0.015625 seconds

GetCharCountRegex - 0.015625 seconds

I dropped GetCharCountLoop from the running and ran the other three 100 times. According to this rudimentary timing, stuffing the contents into a string and counting, or using a regular expression, are almost 50 times faster than the Find method:

GetCharCountFind - 30.984375 seconds

GetCharCountString - 0.6328125 seconds

GetCharCountRegex - 0.578125 seconds

Note that the slowness of the first method, looping through each character is most evident with longer docs. In my initial testing - a file with just a few words - it was only twice as slow as the Find method.

Also note that I originally turned off ScreenUpdating per Andrew's subroutine, but it seems that makes no difference.



来源:https://stackoverflow.com/questions/17126690/how-to-iterate-through-text-in-the-word-document-by-vb-macro

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!