How can I convert an XHTML nested list to PDF with iText?

Posted by ◇◆丶佛笑我妖孽 on 2019-11-27 05:30:32

Please take a look at the example NestedListHtml.

In this example, I take your code snippet list.html:

<ul>
  <li>First
    <ol>
      <li>Second</li>
      <li>Second</li>
    </ol>
  </li>
  <li>First</li>
</ul>

And I parse it into an ElementList:

// CSS
CSSResolver cssResolver =
    XMLWorkerHelper.getInstance().getDefaultCssResolver(true);

// HTML
HtmlPipelineContext htmlContext = new HtmlPipelineContext(null);
htmlContext.setTagFactory(Tags.getHtmlTagProcessorFactory());
htmlContext.autoBookmark(false);

// Pipelines
ElementList elements = new ElementList();
ElementHandlerPipeline end = new ElementHandlerPipeline(elements, null);
HtmlPipeline html = new HtmlPipeline(htmlContext, end);
CssResolverPipeline css = new CssResolverPipeline(cssResolver, html);

// XML Worker
XMLWorker worker = new XMLWorker(css, true);
XMLParser p = new XMLParser(worker);
p.parse(new FileInputStream(HTML));

Now I can add this list to the Document:

for (Element e : elements) {
    document.add(e);
}

Or I can add this list to a Paragraph:

Paragraph para = new Paragraph();
for (Element e : elements) {
    para.add(e);
}
document.add(para);

You will get the desired result, as shown in nested_list.pdf.

You cannot add nested lists to a PdfPCell or to a ColumnText. For instance, this will not work:

PdfPTable table = new PdfPTable(2);
table.addCell("Nested lists don't work in a cell");
PdfPCell cell = new PdfPCell();
for (Element e : elements) {
    cell.addElement(e);
}
table.addCell(cell);
document.add(table);

This is due to a limitation in the ColumnText class that has been there for many years. We have evaluated the problem, and the only way to fix it would be to rewrite ColumnText entirely. This is not an item on our current technical roadmap.

Suresh N

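Here's a workaround for nested ordered and unordered lists.

The rich text editor I am using does not nest the lists. Instead, it puts a class attribute of the form "ql-indent-N" (ql-indent-1, ql-indent-2, and so on) on the indented li tags. Based on that attribute, the code below adds the opening and closing ul/ol tags.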

public String replaceIndentSubList(String htmlContent) {
    org.jsoup.nodes.Document document = Jsoup.parseBodyFragment(htmlContent);
    Elements element_UL = document.select("ul");
    Elements element_OL = document.select("ol");
    if (!element_UL.isEmpty()) {
        htmlContent = replaceIndents(htmlContent, element_UL, "ul");
    }
    if (!element_OL.isEmpty()) {
        htmlContent = replaceIndents(htmlContent, element_OL, "ol");
    }
    return htmlContent;
}


// Wraps the <li class="ql-indent-N"> items found in htmlContent in N nested
// <tagType> lists. Needs java.util.{HashMap, HashSet, Map, Set},
// java.util.regex.{Matcher, Pattern} and org.jsoup.select.Elements.
public String replaceIndents(String htmlContent, Elements element, String tagType) {
    String attributeKey = "class";
    String startingTags = "<" + tagType + ">";
    String endingTags = "</" + tagType + ">";
    int indentPrefixLength = "ql-indent-".length();
    // First and last <li> markup seen for each indent class (e.g. "ql-indent-2")
    Map<String, String> firstLiTagMap = new HashMap<String, String>();
    Map<String, String> lastLiTagMap = new HashMap<String, String>();
    Pattern regex = Pattern.compile("ql-indent-\\d");
    Set<String> indentClasses = new HashSet<String>();
    for (org.jsoup.nodes.Element li : element.select("li")) {
        if (!li.hasAttr(attributeKey)) {
            continue;
        }
        Matcher matcher = regex.matcher(li.attr(attributeKey));
        if (matcher.find()) {
            // Key the maps by the matched class, not the whole attribute value,
            // so the lookups below still work when the <li> has other classes too
            String indentClass = matcher.group(0);
            if (!firstLiTagMap.containsKey(indentClass)) {
                firstLiTagMap.put(indentClass, li.toString());
            }
            indentClasses.add(indentClass);
            if (!firstLiTagMap.get(indentClass).equalsIgnoreCase(li.toString())) {
                lastLiTagMap.put(indentClass, li.toString());
            }
        }
    }
    System.out.println(htmlContent); // debug: HTML before the rewrite
    for (String indentClass : indentClasses) {
        int indentCount = Integer.parseInt(indentClass.substring(indentPrefixLength));
        // One extra pair of list tags per additional indent level
        for (int i = 1; i < indentCount; i++) {
            startingTags = startingTags + "<" + tagType + ">";
            endingTags = endingTags + "</" + tagType + ">";
        }
        String firstLi = firstLiTagMap.get(indentClass);
        String lastLi = lastLiTagMap.get(indentClass);
        htmlContent = htmlContent.replace(firstLi, startingTags + firstLi);
        if (lastLi != null) {
            htmlContent = htmlContent.replace(lastLi, lastLi + endingTags);
        } else {
            // Only one item at this level: close the list right after it
            htmlContent = htmlContent.replace(firstLi, firstLi + endingTags);
        }
        // Reset for the next indent class
        startingTags = "<" + tagType + ">";
        endingTags = "</" + tagType + ">";
    }
    System.out.println(htmlContent);
    return htmlContent;
}
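
For comparison, here is a minimal sketch of the same idea using only the standard library. It is not the code from the answer above: it assumes the list items have already been extracted as (class value, text) pairs, in document order. The names QuillIndentSketch, Item, indentOf and toNestedHtml are hypothetical and chosen for illustration. Instead of patching the HTML string, it rebuilds the list from the indent levels and places each nested list inside its parent li, which is the shape of the list.html snippet in the first answer. The item text is appended as-is, so it is assumed to be valid XHTML already.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class QuillIndentSketch {

    /** One flat list item: indent level (0 = no ql-indent class) and its text. */
    static final class Item {
        final int indent;
        final String text;

        Item(int indent, String text) {
            this.indent = indent;
            this.text = text;
        }
    }

    // Matches ql-indent-N as a whole class token, with one or more digits
    private static final Pattern INDENT =
            Pattern.compile("(?:^|\\s)ql-indent-(\\d+)(?:\\s|$)");

    /** Returns N for a class value containing ql-indent-N, or 0 if there is none. */
    static int indentOf(String classValue) {
        Matcher m = INDENT.matcher(classValue);
        return m.find() ? Integer.parseInt(m.group(1)) : 0;
    }

    /** Builds nested XHTML from flat items; tagType is "ul" or "ol". */
    static String toNestedHtml(List<Item> items, String tagType) {
        if (items.isEmpty()) {
            return "";
        }
        String open = "<" + tagType + ">";
        String close = "</" + tagType + ">";
        StringBuilder out = new StringBuilder();
        int depth = -1; // level of the innermost open list; -1 = none open yet
        for (Item item : items) {
            if (item.indent > depth) {
                // Going deeper: open lists inside the parent <li>, which is still open
                while (depth < item.indent) {
                    out.append(open);
                    depth++;
                    if (depth < item.indent) {
                        out.append("<li>"); // empty wrapper item for a skipped level
                    }
                }
            } else {
                // Same level or shallower: close the previous item and any deeper lists
                out.append("</li>");
                while (depth > item.indent) {
                    out.append(close).append("</li>");
                    depth--;
                }
            }
            out.append("<li>").append(item.text);
        }
        // Close whatever is still open
        out.append("</li>");
        while (depth > 0) {
            out.append(close).append("</li>");
            depth--;
        }
        return out.append(close).toString();
    }

    public static void main(String[] args) {
        List<Item> items = Arrays.asList(
                new Item(indentOf(""), "First"),
                new Item(indentOf("ql-indent-1"), "Second"),
                new Item(indentOf("ql-indent-1"), "Second"),
                new Item(indentOf(""), "First"));
        System.out.println(toNestedHtml(items, "ul"));
    }
}
```

For the four items in main, this prints <ul><li>First<ul><li>Second</li><li>Second</li></ul></li><li>First</li></ul>, which could then be passed to the XML Worker code from the first answer. Unlike that answer's list.html, the sketch uses a single tag type for every level.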