Use compact lists when converting from docx to markdown

Posted by 大憨熊 on 2019-12-11 06:16:27

Question


I'm using pandoc on Windows to convert from a .docx file to a .md file.

The flags I'm using are the following:

pandoc --wrap none --to markdown_github --output fms.md "FMS.docx"

When I view the output markdown file, it has newlines separating each list item. The documentation defines this as a loose list such as the one below.

- one

- two

- three

I want to use a compact list for the output such as the one below.

- one
- two
- three

Is there a flag to make pandoc output a compact list?

If not, how can I use a filter to achieve the desired output?


Answer 1:


There is no flag to achieve this, but there is a simple solution using pandoc's filter functionality. Internally, each list item is represented as a list of blocks; a list is compact if every item consists only of Plain blocks. If every item consists of a single paragraph, it is sufficient to change the type of that block from Para (for paragraph) to Plain.

The Lua program below does just that. Save it and use it as a Lua filter: pandoc -t markdown --lua-filter the-filter.lua your-document.docx (requires pandoc 2.1 or later):

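local List = require 'pandoc.List'

-- Turn an item that holds exactly one Para into one holding a Plain;
-- items with several blocks are returned unchanged.
function compactifyItem (blocks)
  return (#blocks == 1 and blocks[1].t == 'Para')
    and {pandoc.Plain(blocks[1].content)}
    or blocks
end

-- Apply compactifyItem to every item of the list.
function compactifyList (l)
  l.content = List.map(l.content, compactifyItem)
  return l
end

-- Run the same function on bullet lists and ordered lists.
return {{
    BulletList = compactifyList,
    OrderedList = compactifyList
}}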

If one prefers Haskell over Lua, it's also possible to use the filter below with pandoc -t markdown --filter the-filter.hs your-document.docx:

import Text.Pandoc.JSON

main = toJSONFilter compactifyList

compactifyList :: Block -> Block
compactifyList blk = case blk of
  (BulletList items)         -> BulletList $ map compactifyItem items
  (OrderedList attrbs items) -> OrderedList attrbs $ map compactifyItem items
  _                          -> blk

compactifyItem :: [Block] -> [Block]
compactifyItem [Para bs] = [Plain bs]
compactifyItem item      = item

The same would also be possible using a Python filter in case neither Lua nor Haskell is an option. See pandoc's filters page for details.
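
As a rough illustration of that Python option, here is a minimal sketch that mirrors the Lua and Haskell filters above using only the standard library. It is not taken from pandoc's documentation: the function names compactify_item, walk and main are made up for this example, and the code assumes the JSON layout used by recent pandoc versions, where a BulletList holds a list of items and an OrderedList holds a pair of list attributes and items.

```python
#!/usr/bin/env python3
"""Sketch of a pandoc JSON filter that makes single-paragraph list items compact."""
import json
import sys


def compactify_item(blocks):
    """Return [Plain] if the item is exactly one Para; otherwise return it unchanged."""
    if len(blocks) == 1 and blocks[0].get("t") == "Para":
        return [{"t": "Plain", "c": blocks[0]["c"]}]
    return blocks


def walk(node):
    """Recursively copy the JSON AST, rewriting the items of every list found."""
    if isinstance(node, list):
        return [walk(child) for child in node]
    if isinstance(node, dict):
        # Visit children first so nested lists are handled as well.
        node = {key: walk(value) for key, value in node.items()}
        if node.get("t") == "BulletList":
            node["c"] = [compactify_item(item) for item in node["c"]]
        elif node.get("t") == "OrderedList":
            # Assumed layout: [list attributes, list of items].
            attrs, items = node["c"]
            node["c"] = [attrs, [compactify_item(item) for item in items]]
        return node
    return node


def main():
    """Read the document as JSON from stdin and write the modified JSON to stdout."""
    # Reading bytes lets the json module decode pandoc's UTF-8 output
    # regardless of the console encoding.
    doc = json.load(sys.stdin.buffer)
    json.dump(walk(doc), sys.stdout)


# pandoc calls JSON filters with the output format as first argument;
# without arguments the script does nothing, so it can be imported safely.
if __name__ == "__main__" and len(sys.argv) > 1:
    main()
```

Saved under a name such as the-filter.py (a placeholder), it would be used the same way as the Haskell version: pandoc -t markdown --filter the-filter.py your-document.docx. Libraries such as pandocfilters or panflute offer more convenient ways to walk the document; the hand-written walk above is only meant to keep the sketch free of dependencies.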



Source: https://stackoverflow.com/questions/39576747/use-compact-lists-when-converting-from-docx-to-markdown
