pandoc command line parameters for resolving internal links

Submitted by 回眸只為那壹抹淺笑 on 2019-12-22 12:25:34

Question


My problem is similar to this post, but not identical. I can't figure out the correct pandoc command-line parameters for preserving or resolving cross-document links when the input is a set of interlinked HTML files.

Let's say I have two files, chapter1.xhtml and chapter2.xhtml, located in the /home/user/Documents folder, with the following contents:

<?xml version="1.0" encoding="utf-8"?><!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"><html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title></title>
</head>
<body>
<h3>Chapter 1</h3>
<p><a href="/home/user/Documents/chapter2.xhtml">Next chapter</a><br /></p>

<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit.</p>
</body>
</html>

which contains a link to the next document.

and

<?xml version="1.0" encoding="utf-8"?><!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"><html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title></title>
</head>
<body>
<h3>Chapter 2</h3>
<p><a href="/home/user/Documents/chapter1.xhtml">Previous chapter</a><br /></p>

<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit.</p>
</body>
</html>

which contains a link to the previous document.

I used the following command line parameters:

$ pandoc -s --toc --verbose -o /home/user/Documents/output.markdown /home/user/Documents/chapter1.xhtml /home/user/Documents/chapter2.xhtml

And I got the following output:

---
---

-   [Chapter 1](#chapter-1)
-   [Chapter 2](#chapter-2)

### Chapter 1

[Next chapter](/home/user/Documents/chapter2.xhtml)\

Lorem ipsum dolor sit amet, consectetur adipiscing elit.

### Chapter 2

[Previous chapter](/home/user/Documents/chapter1.xhtml)\

Lorem ipsum dolor sit amet, consectetur adipiscing elit.

This problem also occurs when I select docx or latex/pdf as the output format. I also tried to use relative links, but nothing worked.

What are the correct parameters for resolving cross-document links?

tl;dr: I don't want link references that contain the original paths; I want them to point to the new output document.


Answer 1:


The problem is that your links contain absolute paths (/home/user/Documents/chapter1.xhtml) instead of relative ones (chapter1.xhtml). I cannot imagine the ePUB file containing absolute paths, and if it does, the links in the file will only ever work correctly on your computer. So the solution will have to be fixing those ePUB files before feeding them to pandoc.
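One way to make that fix is sketched below. It assumes every absolute link shares the /home/user/Documents/ prefix from the question and that no other href values need to change; the function name relativize_links is hypothetical, not part of pandoc.

```shell
# Hypothetical helper (not a pandoc feature): strip the absolute directory
# prefix from href attributes, so that
#   href="/home/user/Documents/chapter2.xhtml"
# becomes
#   href="chapter2.xhtml"
# Assumes the prefix used in the question. Reads XHTML on stdin and writes
# the rewritten XHTML to stdout.
relativize_links() {
  sed -E 's|href="/home/user/Documents/([^"/]+)"|href="\1"|g'
}
```

It could be applied per file, for example by redirecting chapter1.xhtml into the function and saving the result under a new name, before passing the rewritten files to pandoc.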

Note that round-tripping with pandoc from Markdown to EPUB and back to HTML works as expected:

$ pandoc -o foo.epub
# foo

adfs

# bar

go [to foo](#foo)


$ unzip foo.epub

$ cat ch002.xhtml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
  <meta http-equiv="Content-Style-Type" content="text/css" />
  <meta name="generator" content="pandoc" />
  <title>bar</title>
  <link rel="stylesheet" type="text/css" href="stylesheet.css" />
</head>
<body>
<div id="bar" class="section level1">
<h1>bar</h1>
<p>go <a href="ch001.xhtml#foo">to foo</a></p>
</div>
</body>
</html>

$ pandoc foo.epub

<p><span id="ch001.xhtml"></span></p>
<div id="ch001.xhtml#foo" class="section level1">
<h1>foo</h1>
<p>adfs</p>
</div>
<p><span id="ch002.xhtml"></span></p>
<div id="ch002.xhtml#bar" class="section level1">
<h1>bar</h1>
<p>go <a href="#ch001.xhtml#foo">to foo</a></p>
</div>

P.S.

Using two input documents like:

pandoc -o output.md chapter1.xhtml chapter2.xhtml

works as the pandoc README states:

If multiple input files are given, pandoc will concatenate them all (with blank lines between them) before parsing.

So pandoc parses the inputs as a single document, which is why links that point to the separate source files do not resolve.
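Because the chapters end up in one document, a possible workaround is to rewrite each cross-file link into an in-document anchor before conversion. The sketch below assumes the files are named chapterN.xhtml and that pandoc assigns the headings the identifiers chapter-1, chapter-2, and so on, as the table of contents in the question's output suggests; links_to_anchors is a hypothetical name.

```shell
# Hypothetical helper (not a pandoc feature): turn links to chapterN.xhtml,
# with or without the absolute prefix from the question, into in-document
# anchors of the form #chapter-N.
# Assumption: pandoc's automatic heading identifiers are chapter-1, chapter-2,
# ..., matching the TOC entries shown in the question's output.
# Reads XHTML on stdin and writes the rewritten XHTML to stdout.
links_to_anchors() {
  sed -E 's|href="(/home/user/Documents/)?chapter([0-9]+)\.xhtml"|href="#chapter-\2"|g'
}
```

Each chapter would be filtered into a new file, and those files passed to pandoc in place of the originals. If the heading text differs from the file name, the mapping from file to identifier has to be adjusted by hand.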



Source: https://stackoverflow.com/questions/37327374/pandoc-command-line-parameters-for-resolving-internal-links
