Finding on which page a search string is located in a PDF document using Python

旧巷少年郎 2020-12-15 12:36

Which Python packages can I use to find out on which page a specific "search string" is located?

I looked into several Python PDF packages but couldn't figur

3 answers
  •  清歌不尽
    2020-12-15 13:09

    I was able to successfully get the output using the code below.

    Code:

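    import re

    import PyPDF2

    # Open the PDF file.
    # Note: PyPDF2 3.0+ renamed these calls to PdfReader, len(reader.pages),
    # reader.pages[i] and page.extract_text().
    reader = PyPDF2.PdfFileReader(r"C:\TEST.pdf")

    # Get the number of pages
    num_pages = reader.getNumPages()

    # Enter the text to search for here
    search_string = "Enter_the_text_to_Search_here"

    # Extract the text of each page and search it
    for i in range(num_pages):
        page = reader.getPage(i)
        text = page.extractText()
        # re.escape matches the string literally; drop it to search with a regex
        if re.search(re.escape(search_string), text):
            # i is zero-based, so add 1 to get the human-readable page number
            print("Pattern Found on Page: " + str(i + 1))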
    

    Sample Output:

    Pattern Found on Page: 7
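
For anyone who wants to reuse the search, here is a minimal sketch that separates the matching logic from the PDF extraction. The names `find_pages` and `pdf_page_texts` are hypothetical helpers, not part of any library. The extraction helper assumes PyPDF2 3.0 or later (`PdfReader`, `reader.pages`, `page.extract_text()`). Collapsing whitespace before matching is an added assumption, since extracted text often contains unexpected line breaks.

```python
import re
from typing import Iterable, List


def find_pages(page_texts: Iterable[str], needle: str, ignore_case: bool = True) -> List[int]:
    """Return the 1-based numbers of the pages whose text contains needle."""

    def squash(s: str) -> str:
        # Collapse runs of whitespace (assumption: extracted text has stray line breaks)
        return " ".join(s.split())

    flags = re.IGNORECASE if ignore_case else 0
    # re.escape makes the needle match literally rather than as a regex
    pattern = re.compile(re.escape(squash(needle)), flags)
    return [
        number
        for number, text in enumerate(page_texts, start=1)
        if pattern.search(squash(text or ""))
    ]


def pdf_page_texts(path: str) -> Iterable[str]:
    """Yield the extracted text of each page (assumes PyPDF2 >= 3.0 is installed)."""
    from PyPDF2 import PdfReader  # imported lazily so find_pages works without it

    reader = PdfReader(path)
    for page in reader.pages:
        yield page.extract_text() or ""


if __name__ == "__main__":
    # Plain strings stand in for extracted page text
    pages = ["Introduction", "Search  string\nexample", "Appendix: SEARCH STRING"]
    print(find_pages(pages, "search string"))  # prints [2, 3]
```

With a real file this would be called as `find_pages(pdf_page_texts(r"C:\TEST.pdf"), "Enter_the_text_to_Search_here")`. Keeping the matching separate from the extraction means the search can be checked on plain strings without a PDF at hand.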
    
