iText7 Performance Issue Compared With iTextSharp

旧时难觅i 2020-12-12 08:09

I have tested iTextSharp and iText7 for HTML to PDF conversion. In terms of performance, iTextSharp takes 3 minutes to create 10,000 PDFs, but iText7 takes 17 min

1 answer
  • 2020-12-12 08:33

    The answer to your question is simple: at iText Group, we are constantly improving the iText software, and there is certainly room for improving the performance. However, we won't ever be able to make the pdfHTML add-on as fast as the obsolete HTMLWorker. The reason is simple: HTMLWorker didn't support CSS, HTMLWorker only supported a small selection of tags, and so on... HTMLWorker was very simple and was only to be used for simple needs.

    We have created the pdfHTML add-on to support CSS (including functionality to add headers, footers, page numbers, etc.). We support plenty of HTML tags that weren't supported in HTMLWorker. We support absolute positioning of elements in pdfHTML. All of this functionality comes with a cost. That cost is CPU.

    It is intellectually unfair of you to compare the CPU use by HTMLWorker with the CPU use by pdfHTML.

    This being said: you can already save plenty of time by using ConverterProperties. Right now, you don't provide any ConverterProperties. This means that iText has to instantiate the default properties for every PDF you are creating. If you would create the ConverterProperties up-front, and reuse them, you could already save plenty of time, but you have to understand that the extra functionality provided by pdfHTML comes with a cost in CPU.

    This is how you create a ConverterProperties instance:

    ConverterProperties converterProperties = new ConverterProperties()
        .setBaseUri(".")
        .setCreateAcroForm(false)
        .setCssApplierFactory(new DefaultCssApplierFactory())
        .setFontProvider(new DefaultFontProvider())
        .setMediaDeviceDescription(MediaDeviceDescription.createDefault())
        .setOutlineHandler(new OutlineHandler())
        .setTagWorkerFactory(new DefaultTagWorkerFactory());
    

    As you can see, we create plenty of default objects: the default CSS applier factory, the default font provider, the default media description, the default outline handler, and the default tag worker factory. Creating each of these objects costs only a tiny bit of time, but when you multiply that time by 10,000 because you create 10,000 documents, the CPU needed to create those default objects can become significant, and that is what happens when you convert an HTML file to PDF like this:

    HtmlConverter.convertToPdf(
        new FileInputStream("resources/test.html"),
        new FileOutputStream("results/test.pdf"));
    

    Since you are not adding a ConverterProperties parameter, iText will create a new instance of ConverterProperties internally for every document that you convert. All the default components of that ConverterProperties will be null, which means that for every document, new instances of the CSS applier factory, the font provider, etc. need to be created.

    It will save you some time (but not that much) if you create a ConverterProperties up-front (only once), as well as all the components. It is then important that you reuse that object when converting HTML to PDF:

    HtmlConverter.convertToPdf(
        new FileInputStream("resources/test.html"),
        new FileOutputStream("results/test.pdf"),
        converterProperties);
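
    To make the multiplication argument above concrete, here is a minimal sketch that only counts object creations. It does not call iText and does not model its internals: `Component`, `Props`, and `convert` are hypothetical stand-ins for the default components, for ConverterProperties, and for the conversion call. The five fields mirror the five defaults listed in the answer.

```java
import java.util.concurrent.atomic.AtomicLong;

public class ReuseSketch {
    // Counts how many stand-in component objects have been constructed.
    static final AtomicLong created = new AtomicLong();

    // Hypothetical stand-in for one default component
    // (font provider, tag worker factory, and so on).
    static class Component {
        Component() {
            created.incrementAndGet();
        }
    }

    // Hypothetical stand-in for ConverterProperties holding five defaults.
    static class Props {
        final Component cssApplierFactory = new Component();
        final Component fontProvider = new Component();
        final Component mediaDeviceDescription = new Component();
        final Component outlineHandler = new Component();
        final Component tagWorkerFactory = new Component();
    }

    // Hypothetical stand-in for the conversion call; the real work is omitted.
    static void convert(Props props) {
        // no-op: only the setup cost is being illustrated
    }

    // A fresh set of defaults for every document.
    static long convertWithFreshProps(int documents) {
        created.set(0);
        for (int i = 0; i < documents; i++) {
            convert(new Props());
        }
        return created.get();
    }

    // One set of defaults, created up-front and reused.
    static long convertWithSharedProps(int documents) {
        created.set(0);
        Props shared = new Props();
        for (int i = 0; i < documents; i++) {
            convert(shared);
        }
        return created.get();
    }

    public static void main(String[] args) {
        System.out.println("fresh:  " + convertWithFreshProps(10_000));  // prints "fresh:  50000"
        System.out.println("shared: " + convertWithSharedProps(10_000)); // prints "shared: 5"
    }
}
```

    The sketch counts constructions and measures no time. As the answer says, the real saving from reuse is modest, because most of the CPU goes into the conversion itself.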
    