I am trying to convert a large number of HTML files into Markdown using Pandoc in Windows, and have found an answer on how to do this on a Mac, but receive errors when attem
for ...
solutions) are for cmd.exe, not PowerShell. The functionally equivalent PowerShell command is:
Get-ChildItem -File -Recurse -Filter *.md | ForEach-Object {
pandoc -o ($_.FullName + '.txt') $_.FullName
}
To convert files in folders recursively, try this (Windows Command Prompt):
for /r "startfolder" %i in (*.htm *.html) do pandoc -f html -t markdown "%~fi" -o "%~dpni.txt"
For use in a batch file, double the % (write %%i instead of %i).
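Endoro's answer is great; don't get confused by the modifiers added to %i. %~fi expands to the file's fully qualified path, and %~dpni expands to its drive, path, and name without the extension, so appending .txt places the output next to the input file.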
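To make the FOR variable modifiers less mysterious, here is a minimal Python sketch that approximates what cmd.exe substitutes for each one. The function name `expand_for_modifiers` and the sample path are hypothetical, chosen only for illustration; the sketch assumes the variable already holds a full path, as it does inside a `for /r` loop.

```python
import ntpath  # Windows path rules; importable on any operating system


def expand_for_modifiers(full_path):
    """Approximate cmd.exe's expansions for a FOR variable %i holding full_path."""
    drive, rest = ntpath.splitdrive(full_path)  # "C:" and "\docs\a\page.html"
    folder, filename = ntpath.split(rest)       # "\docs\a" and "page.html"
    name, ext = ntpath.splitext(filename)       # "page" and ".html"
    if not folder.endswith("\\"):
        folder += "\\"                          # %~pi keeps the trailing backslash
    return {
        "%~fi": full_path,                # fully qualified path
        "%~di": drive,                    # drive letter only
        "%~pi": folder,                   # path only
        "%~ni": name,                     # file name without extension
        "%~xi": ext,                      # extension only
        "%~dpni": drive + folder + name,  # combined: drive + path + name
    }


if __name__ == "__main__":
    # Hypothetical sample path, used only to show the expansions.
    for modifier, value in expand_for_modifiers(r"C:\docs\a\page.html").items():
        print(modifier, "->", value)
```

With the sample path, "%~dpni.txt" becomes C:\docs\a\page.txt, which is why the converted file lands in the same folder as its source.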
To help others: I needed to convert from RST (reStructuredText) to DokuWiki syntax, so I created a convert.bat with:
FOR /r "startfolder" %%i IN (*.rst) DO pandoc -f rst -t dokuwiki "%%~fi" -o "%%~dpni.txt"
Works for all rst files in folders and subfolders.
If you want to go recursively through a directory and its subdirectories to compile all the files of a given type, say *.md, you can use the batch file I wrote in answer to another question, "How can I use pandoc for all files in the folder in Windows?". I call it pancompile.bat, and its usage is below. See the other answer for the code.
Usage: pancompile DIRECTORY FILENAME [filemask] ["options"]
Uses pandoc to compile all documents in specified directory and subdirectories to a single output document

  DIRECTORY   the directory/folder to parse recursively (passed to pandoc -s);
              use quotation marks if there are spaces in the directory name
  FILENAME    the output file (passed to pandoc -o); use quotation marks if spaces
  filemask    an optional file mask/filter, e.g. *.md; leave blank for all files
  "options"   optional list of pandoc commands (must be in quotation marks)

Minimal example: pancompile docs complete_book.docx
Typical example: pancompile "My Documents" "Complete Book.docx" *.md "-f markdown -t docx --standalone --toc"
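For readers who cannot look up the other answer, the general idea (gather every matching file recursively and hand them all to a single pandoc call) can be sketched in Python. This is not the code of pancompile.bat, which is not shown here; the function name `build_pandoc_command` and its parameters are assumptions that merely mirror the usage text above.

```python
import shlex
from pathlib import Path


def build_pandoc_command(directory, output, filemask="*", options=""):
    """Build the argument list for one pandoc call covering every matching file.

    Hypothetical helper: it mirrors the DIRECTORY, FILENAME, filemask and
    "options" arguments from the usage text, not the batch file's real logic.
    """
    # Collect matching files from the directory and all of its subdirectories.
    inputs = sorted(str(p) for p in Path(directory).rglob(filemask) if p.is_file())
    # pandoc concatenates multiple input files into one output document.
    return ["pandoc", *shlex.split(options), *inputs, "-o", output]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be checked first.
    print(build_pandoc_command("docs", "complete_book.docx", "*.md",
                               "-f markdown -t docx --standalone --toc"))
```

The returned list could be passed to `subprocess.run(cmd, check=True)` once it looks right. Sorting the inputs is a design choice of this sketch that keeps the chapter order predictable.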
Using the PowerShell built-in alias gci (Get-ChildItem):
gci -r -i *.md | foreach { $docx = $_.DirectoryName + "\" + $_.BaseName + ".docx"; pandoc $_.FullName -o $docx }
Source: https://github.com/jgm/pandoc/issues/5429