Is there a program or workflow to convert .doc or .docx files to Markdown or similar text?
PS: Ideally, I would welcome the option that a spec
I've tested these three: (1)-Pandoc / (2)-Mammoth / (3)-w2m
Pandoc is by far the superior tool for conversions, with support for a multitude of file types (see Pandoc's man page for the full list):
pandoc -f docx -t gfm somedoc.docx -o somedoc.md
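If you have a whole folder of documents, the same command can be wrapped in a loop. This is only a rough sketch: the helper names `out_name` and `convert_all` are my own, and it assumes pandoc is on your PATH and the .docx files sit in the current directory. The `--extract-media` and `--wrap` options are standard pandoc options; the per-document media folder name is just a convention I picked to avoid image name clashes.

```shell
#!/bin/sh
# Derive the Markdown output name from a .docx input name (hypothetical helper).
out_name() {
    printf '%s\n' "${1%.docx}.md"
}

# Convert every .docx in the current directory to GitHub Flavored Markdown.
convert_all() {
    for f in ./*.docx; do
        [ -e "$f" ] || continue   # the glob matched nothing, skip
        # --extract-media saves embedded images under the given directory
        # --wrap=none stops pandoc from hard-wrapping paragraphs
        pandoc -f docx -t gfm --wrap=none \
            --extract-media="${f%.docx}_media" \
            "$f" -o "$(out_name "$f")"
    done
}

convert_all
```

Each somedoc.docx ends up as somedoc.md next to it, with any images in somedoc_media. Swap `gfm` for another output format if you need one.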
To get pandoc to export Markdown tables ('pipe_tables' in pandoc), use the markdown_mmd (MultiMarkdown) or gfm output format.
When converting to PDF, pandoc uses LaTeX by default, so you may need to install a LaTeX distribution for your OS if the PDF command does not work out of the box. Instructions at LaTeX Installation
In answer to this specific question (docx --> markdown), use the Writeage plugin for Microsoft Word. It also works the other way round (markdown --> docx).
If you wish to preserve Unicode characters and emojis and keep better fonts, you'll get some mileage from the editors below when copying and pasting between file formats. Note that these do not read or write docx natively.
Outside the US, set the geometry variable so the PDF uses A4 paper:
pandoc -s -V geometry:a4paper -o outfile.pdf infile.md
It's worth mentioning here something that is not obvious when you first discover Markdown: MultiMarkdown is by far the most feature-rich Markdown format, supporting, amongst other things, metadata, table of contents, footnotes, maths, tables and YAML.
GitHub's default format, however, is gfm (GitHub Flavored Markdown), which also supports tables. I use gfm for GitHub/GitLab and MultiMarkdown for everything else.