Extract Images and Words with coordinates and sizes from PDF

暖寄归人 2021-01-02 16:29

I've read a lot about PDF extraction and libraries (such as iText), but I just haven't found a solution for extracting images and text (with coordinates) from a PDF.

The tas

3 answers
  •  天涯浪人
    2021-01-02 17:13

    Use XPDF (http://www.foolabs.com/xpdf/)

    It can extract all the characters in the PDF with their coordinates (pdftotext -bbox [sourcefile] [outputfile]), as well as all the images and SVGs in the PDF.

    It's open source (GPLv2) and supports many additional extraction features as well.
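As a rough illustration of the pdftotext -bbox command quoted above, here is a minimal Python sketch that runs the tool and turns its output into a list of words with positions and sizes. It rests on assumptions that are not in the answer: that the installed pdftotext accepts -bbox (Poppler's build documents this flag; other builds may not), and that the output is an XHTML file containing page elements with word children carrying xMin, yMin, xMax and yMax attributes. The function names parse_bbox_xhtml and extract_words are hypothetical, and the output of your own pdftotext version should be checked before relying on this.

```python
import subprocess
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop an XML namespace prefix such as '{http://www.w3.org/1999/xhtml}'."""
    return tag.rsplit("}", 1)[-1]


def parse_bbox_xhtml(xhtml):
    """Parse the assumed -bbox output (str or bytes) into one dict per word."""
    root = ET.fromstring(xhtml)
    words = []
    page_number = 0
    # iter() walks the tree in document order, so each <page> element is
    # visited before the <word> elements it contains.
    for element in root.iter():
        name = local_name(element.tag)
        if name == "page":
            page_number += 1
        elif name == "word":
            x_min = float(element.get("xMin"))
            y_min = float(element.get("yMin"))
            x_max = float(element.get("xMax"))
            y_max = float(element.get("yMax"))
            words.append({
                "page": page_number,
                "text": element.text or "",
                "x": x_min,
                "y": y_min,
                "width": x_max - x_min,
                "height": y_max - y_min,
            })
    return words


def extract_words(pdf_path, xhtml_path):
    """Run pdftotext -bbox (assumed to be on PATH) and parse the file it writes."""
    subprocess.run(["pdftotext", "-bbox", pdf_path, xhtml_path], check=True)
    with open(xhtml_path, "rb") as handle:
        return parse_bbox_xhtml(handle.read())
```

Calling extract_words("input.pdf", "words.html") would then return entries with the keys page, text, x, y, width and height. Image extraction is not covered by this sketch; it only handles the text side of the question.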
