Comparison of two PDF files

感动是毒 2020-12-16 19:49

I need to compare the contents of two nearly identical files and highlight the differing portions in the corresponding PDF file. I am using PDFBox. Please help me at least with

4 answers
  •  慢半拍i (original poster)
    2020-12-16 20:30

    I have put together a JAR, built on Apache PDFBox, for comparing PDF files. It can compare them pixel by pixel and highlight the differences.

    See my blog for examples and the download: http://www.testautomationguru.com/introducing-pdfutil-to-compare-pdf-files-extract-resources/


    To get page count

    import com.taguru.utility.PDFUtil;
    
    PDFUtil pdfUtil = new PDFUtil();
    pdfUtil.getPageCount("c:/sample.pdf"); //returns the page count
    

    To get page content as plain text

    //returns the pdf content - all pages
    pdfUtil.getText("c:/sample.pdf");
    
    // returns the pdf content from page number 2
    pdfUtil.getText("c:/sample.pdf",2);
    
    // returns the pdf content from page number 5 to 8
    pdfUtil.getText("c:/sample.pdf", 5, 8);
    

    To extract attached images from PDF

    //set the path where we need to store the images
     pdfUtil.setImageDestinationPath("c:/imgpath");
     pdfUtil.extractImages("c:/sample.pdf");
    
    // extracts & saves the images from page number 3
    pdfUtil.extractImages("c:/sample.pdf", 3);
    
    // extracts & saves the images from page 2 only (start page 2, end page 2)
    pdfUtil.extractImages("c:/sample.pdf", 2, 2);
    

    To store PDF pages as images

    //set the path where we need to store the images
     pdfUtil.setImageDestinationPath("c:/imgpath");
     pdfUtil.savePdfAsImage("c:/sample.pdf");
    

    To compare PDF files in text mode (faster, but it does not compare the formatting, images, etc. in the PDF)

    String file1="c:/files/doc1.pdf";
    String file2="c:/files/doc2.pdf";
    
    // compares the pdf documents & returns a boolean
    // true if both files have same content. false otherwise.
    pdfUtil.comparePdfFilesTextMode(file1, file2);
    
    // compare the 3rd page alone
    pdfUtil.comparePdfFilesTextMode(file1, file2, 3, 3);
    
    // compare the pages from 1 to 5
    pdfUtil.comparePdfFilesTextMode(file1, file2, 1, 5);
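
The text-mode comparison only reports true or false. To see which lines differ, one option is to compare the extracted text line by line. The following is a minimal sketch and not part of PDFUtil: the class `TextLineDiff` and the method `differingLines` are hypothetical names, and it assumes the text of each PDF has already been extracted into a String (for example via `getText`, assuming it returns the content as a String). It compares lines by position, so an inserted line makes every later line count as different.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of PDFUtil or PDFBox.
class TextLineDiff {

    /** Returns the 1-based numbers of the lines that differ between two texts. */
    static List<Integer> differingLines(String textA, String textB) {
        // "\\R" matches any line break; the -1 limit keeps trailing empty lines
        String[] a = textA.split("\\R", -1);
        String[] b = textB.split("\\R", -1);
        List<Integer> result = new ArrayList<>();
        int max = Math.max(a.length, b.length);
        for (int i = 0; i < max; i++) {
            // A line missing from the shorter text counts as a difference
            String lineA = i < a.length ? a[i] : null;
            String lineB = i < b.length ? b[i] : null;
            if (lineA == null || !lineA.equals(lineB)) {
                result.add(i + 1);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Stand-ins for the text extracted from two PDF files
        String doc1 = "Invoice 42\nTotal: 100\nThank you";
        String doc2 = "Invoice 42\nTotal: 120\nThank you";
        System.out.println(differingLines(doc1, doc2)); // prints [2]
    }
}
```

For documents where lines may be inserted or removed, a proper diff algorithm (for example one based on the longest common subsequence) would give more useful results than this positional check.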
    

    To compare PDF files in binary mode (slower: compares the PDF documents pixel by pixel, highlights the differences, and stores the result as an image)

    String file1="c:/files/doc1.pdf";
    String file2="c:/files/doc2.pdf";
    
    // compares the pdf documents & returns a boolean
    // true if both files have same content. false otherwise.
    pdfUtil.comparePdfFilesBinaryMode(file1, file2);
    
    // compare the 3rd page alone
    pdfUtil.comparePdfFilesBinaryMode(file1, file2, 3, 3);
    
    // compare the pages from 1 to 5
    pdfUtil.comparePdfFilesBinaryMode(file1, file2, 1, 5);
    
    //if you need to store the result
    pdfUtil.highlightPdfDifference(true);
    pdfUtil.setImageDestinationPath("c:/imgpath");
    pdfUtil.comparePdfFilesBinaryMode(file1, file2);
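
For readers who want to see what a pixel-by-pixel comparison with highlighting involves, here is a minimal sketch using only `java.awt`. It is an illustration of the general technique, not necessarily how PDFUtil implements it: the class `PixelDiff` and its members are hypothetical names, and it assumes each PDF page has already been rendered to a `BufferedImage` of the same size (with PDFBox 2.x this can be done with `PDFRenderer.renderImage`).

```java
import java.awt.Color;
import java.awt.image.BufferedImage;

// Hypothetical helper, not part of PDFUtil or PDFBox.
class PixelDiff {
    final BufferedImage highlighted; // copy of 'actual' with differing pixels painted over
    final int differingPixels;       // number of pixels that did not match

    private PixelDiff(BufferedImage highlighted, int differingPixels) {
        this.highlighted = highlighted;
        this.differingPixels = differingPixels;
    }

    /** Compares two equally sized images pixel by pixel and marks differences in magenta. */
    static PixelDiff compare(BufferedImage expected, BufferedImage actual) {
        int width = expected.getWidth();
        int height = expected.getHeight();
        if (width != actual.getWidth() || height != actual.getHeight()) {
            throw new IllegalArgumentException("Images must have the same dimensions");
        }
        BufferedImage result = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        int highlight = Color.MAGENTA.getRGB();
        int count = 0;
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                int rgb = actual.getRGB(x, y);
                if (rgb != expected.getRGB(x, y)) {
                    rgb = highlight; // paint over the mismatching pixel
                    count++;
                }
                result.setRGB(x, y, rgb);
            }
        }
        return new PixelDiff(result, count);
    }

    public static void main(String[] args) {
        // Two in-memory images standing in for rendered PDF pages
        BufferedImage page1 = new BufferedImage(4, 4, BufferedImage.TYPE_INT_RGB);
        BufferedImage page2 = new BufferedImage(4, 4, BufferedImage.TYPE_INT_RGB);
        page2.setRGB(2, 1, 0xFFFFFF); // change a single pixel to white
        PixelDiff diff = compare(page1, page2);
        System.out.println(diff.differingPixels); // prints 1
    }
}
```

The highlighted image can then be written to disk with `javax.imageio.ImageIO.write`. An exact comparison like this is sensitive to anti-aliasing and rendering resolution, so both documents should be rendered with the same settings.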
    
