A good library for converting PDF to TIFF? [closed]

Posted by 白昼怎懂夜的黑 on 2019-11-26 22:49:06

Question


I need a Java library to convert PDFs to TIFF images. The PDFs are faxes, and I will be converting them to TIFF so that I can then do barcode recognition on the images. Can anyone recommend a good, free, open-source library for converting PDF to TIFF?


Answer 1:


Disclaimer: I work for Atalasoft

We have an SDK that can convert PDF to TIFF. The rendering is powered by Foxit software, which makes a very powerful and efficient PDF renderer.




Answer 2:


I can't recommend any code library, but it's easy to use GhostScript to convert PDF into bitmap formats. I've personally used the script below (which also uses the netpbm utilities) to convert the first page of a PDF into a JPEG thumbnail:

#!/bin/sh

/opt/local/bin/gs -q -dLastPage=1 -dNOPAUSE -dBATCH -dSAFER -r300 \
    -sDEVICE=pnmraw -sOutputFile=- "$@" |
    pnmcrop |
    pnmscale -width 240 |
    cjpeg

You can use -sDEVICE=tiff... (for example tiffg3, tiffg4, or tiff24nc) to get direct TIFF output in various TIFF sub-formats from GhostScript.
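
Since the question asks for Java, here is a minimal sketch of driving the same Ghostscript command line from Java with ProcessBuilder. It assumes a Ghostscript executable is installed and reachable under the name you pass in; the class and method names (GhostscriptTiff, buildCommand, convert) are made up for illustration and are not part of any library. A multi-page PDF written to a single output file with a TIFF device should come out as one multi-page TIFF.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: shells out to Ghostscript to render a PDF as TIFF.
final class GhostscriptTiff {

    // Builds the argument list. gsExecutable is an assumption about your
    // installation, e.g. "gs" on Unix or "gswin32c" on Windows.
    public static List<String> buildCommand(String gsExecutable, String pdfPath,
                                            String tiffPath, String device, int dpi) {
        List<String> cmd = new ArrayList<>();
        cmd.add(gsExecutable);
        cmd.add("-q");                        // suppress startup messages
        cmd.add("-dNOPAUSE");                 // do not pause between pages
        cmd.add("-dBATCH");                   // exit when the input is processed
        cmd.add("-dSAFER");                   // restrict file access
        cmd.add("-r" + dpi);                  // rendering resolution
        cmd.add("-sDEVICE=" + device);        // e.g. tiffg4 for bilevel fax pages
        cmd.add("-sOutputFile=" + tiffPath);
        cmd.add(pdfPath);
        return cmd;
    }

    // Runs Ghostscript and returns its exit code (0 means success).
    public static int convert(String gsExecutable, String pdfPath, String tiffPath,
                              String device, int dpi)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(
                buildCommand(gsExecutable, pdfPath, tiffPath, device, dpi))
                .inheritIO()                  // forward Ghostscript's output to our console
                .start();
        return process.waitFor();
    }
}
```

A call such as GhostscriptTiff.convert("gs", "fax.pdf", "fax.tif", "tiffg4", 300) would then write fax.tif next to the working directory. Passing the arguments as a list rather than one string avoids shell quoting problems with file names that contain spaces.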




Answer 3:


We are also converting PDFs to G3 TIFFs, at both high and low resolution. In my experience the best tool you can have is the Adobe PDF SDK; the only problem with it is its insane price, so we don't use it.

What works fine for us is Ghostscript. Recent versions are quite robust and render the majority of PDFs correctly, and we have quite a few of them coming in during the day. In production the conversion is done using gsdll32.dll, but if you want to try it, use the following command line:

gswin32c -dNOPAUSE -dBATCH -dMaxStripSize=8192 -sDEVICE=tiffg3 -r204x196 -dDITHERPPI=200 -sOutputFile=test.tif prefix.ps test.pdf

It will convert your PDF into a high-res G3 TIFF. The prefix.ps code is here:

<< currentpagedevice /InputAttributes get
0 1 2 index length 1 sub {1 index exch undef } for
/InputAttributes exch dup 0 <</PageSize [0 0 612 1728]>> put
/Policies << /PageSize 3 >> >> setpagedevice

Another advantage of Ghostscript is that it's open source; you get both the C and PS (PostScript) source code for it. Also, if you go with another tool, check which engine powers its PDF rendering. It may turn out to be using gs, as LeadTools does, for instance.

Hope this helps, regards




Answer 4:


You can use the ICEpdf library (Apache 2.0 License). They even provide this exact use case as one of their source code examples: http://wiki.icesoft.org/display/PDF/Multi-page+Tiff+Capture
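
Whichever library renders the pages, the last step is usually the same: writing a list of page images into one multi-page TIFF. The following is a minimal sketch of that step using only javax.imageio. It assumes Java 9 or later, where a TIFF writer ships with the JDK (on older JDKs a TIFF ImageIO plugin would be needed), and it assumes the pages have already been rendered to BufferedImage objects by a PDF library. The class name MultiPageTiffWriter is made up for illustration and this is not the code from the linked ICEpdf example.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.util.Iterator;
import java.util.List;
import javax.imageio.IIOImage;
import javax.imageio.ImageIO;
import javax.imageio.ImageWriteParam;
import javax.imageio.ImageWriter;
import javax.imageio.stream.ImageOutputStream;

// Hypothetical helper: writes already-rendered page images as one multi-page TIFF.
final class MultiPageTiffWriter {

    public static void write(List<BufferedImage> pages, File output) throws IOException {
        Iterator<ImageWriter> writers = ImageIO.getImageWritersByFormatName("tiff");
        if (!writers.hasNext()) {
            throw new IOException("No TIFF ImageWriter found (needs Java 9+ or a TIFF plugin)");
        }
        ImageWriter writer = writers.next();

        // The file-backed stream does not truncate, so remove any old file first.
        if (output.exists() && !output.delete()) {
            throw new IOException("Cannot replace " + output);
        }

        try (ImageOutputStream out = ImageIO.createImageOutputStream(output)) {
            writer.setOutput(out);

            // LZW works for any image type; for bilevel fax pages
            // (TYPE_BYTE_BINARY) "CCITT T.6" is the usual Group 4 choice.
            ImageWriteParam param = writer.getDefaultWriteParam();
            param.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
            param.setCompressionType("LZW");

            writer.prepareWriteSequence(null);
            for (BufferedImage page : pages) {
                writer.writeToSequence(new IIOImage(page, null, null), param);
            }
            writer.endWriteSequence();
        } finally {
            writer.dispose();
        }
    }
}
```

Writing the pages as a sequence keeps the whole fax in a single file, which tends to be more convenient for a later barcode pass than one file per page.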




Answer 5:


Maybe it is not necessary to convert the PDF into TIFF at all. The fax will most likely be an embedded image in the PDF, so you could just extract these images again. That should be possible with the iText library mentioned in another answer.

I don't know if this is easier than the other approach.




Answer 6:


Take a look at Apache PDFBox - A Java PDF Library




Answer 7:


No, iText cannot convert PDFs to TIFF.

However, there are commercial libraries that can do that. jPDFImages is a 100% Java library that can convert PDF to images in TIFF, JPEG or PNG formats (and maybe JBIG? I am not sure). It can also do the reverse and create PDFs from images. It starts at $300 for a server.




Answer 8:


Here is a good article, with wrapper classes, on using GhostScript from C# .NET. I ended up using this in production:

http://www.codeproject.com/KB/cs/GhostScriptUseWithCSharp.aspx




Answer 9:


I have had great experience with iText (I'm currently using version 5.0.6), and this is the code for converting TIFF into PDF:

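import java.io.FileOutputStream;

import com.itextpdf.text.Document;
import com.itextpdf.text.Image;
import com.itextpdf.text.PageSize;
import com.itextpdf.text.Rectangle;
import com.itextpdf.text.pdf.PdfContentByte;
import com.itextpdf.text.pdf.PdfWriter;
import com.itextpdf.text.pdf.RandomAccessFileOrArray;
import com.itextpdf.text.pdf.codec.TiffImage;

// Assumes the enclosing class defines a `logger` field (e.g. log4j).
private static String convertTiff2Pdf(String tiff) {

    // Target PDF path: same name as the TIFF, with a .pdf extension
    String pdf = tiff.substring(0, tiff.lastIndexOf('.') + 1) + "pdf";
    RandomAccessFileOrArray ra = null;

    try {
        // Start with LETTER and zero margins; each page is resized to its image below
        Document document = new Document(PageSize.LETTER, 0, 0, 0, 0);
        PdfWriter writer = PdfWriter.getInstance(document, new FileOutputStream(pdf));
        document.open();
        PdfContentByte cb = writer.getDirectContent();

        ra = new RandomAccessFileOrArray(tiff);
        int comps = TiffImage.getNumberOfPages(ra);
        int pages = 0;

        // Conversion loop: one PDF page per TIFF page (TIFF pages are 1-based)
        for (int c = 0; c < comps; ++c) {
            Image img = TiffImage.getTiffImage(ra, c + 1);
            if (img != null) {
                System.out.println("page " + (c + 1));
                // Scale from the image's DPI to PDF units (72 per inch)
                img.scalePercent(7200f / img.getDpiX(), 7200f / img.getDpiY());
                // Set the size first, then start the page, so that each page
                // takes the size of its own image rather than the previous one
                document.setPageSize(new Rectangle(img.getScaledWidth(), img.getScaledHeight()));
                document.newPage();
                img.setAbsolutePosition(0, 0);
                cb.addImage(img);
                ++pages;
            }
        }

        document.close();
        logger.debug("[" + tiff + "] -> [" + pdf + "] OK, " + pages + " page(s)");

    } catch (Exception e) {
        logger.error("Convert fail");
        logger.debug("", e);
        pdf = null;
    } finally {
        // Release the TIFF file even when the conversion fails
        if (ra != null) {
            try {
                ra.close();
            } catch (Exception ignored) {
                // nothing useful to do if closing fails
            }
        }
    }

    return pdf;

}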


Source: https://stackoverflow.com/questions/356550/a-good-library-for-converting-pdf-to-tiff
