How can doc/docx files be converted to Markdown or structured text?
Question:
Is there a program or workflow to convert .doc or .docx files to Markdown or similar text?
PS: Ideally, I would welcome an option to render a specific font (e.g. Consolas) in the MS Word document as code text: ```....```.
Answer:
Pandoc supports conversion from docx to Markdown directly:
pandoc -f docx -t markdown foo.docx -o foo.markdown
Several Markdown output formats are supported:
-t gfm (GitHub-Flavored Markdown)
-t markdown_mmd (MultiMarkdown)
-t markdown (pandoc’s extended Markdown)
-t markdown_strict (original unextended Markdown)
-t markdown_phpextra (PHP Markdown Extra)
-t commonmark (CommonMark Markdown)
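
For converting many files at once, the command above can be wrapped in a small script. The following is only a sketch, assuming pandoc is installed and on the PATH; the names `MARKDOWN_FORMATS`, `build_pandoc_command`, and `convert` are made up for illustration, and writing the output next to the source with a `.md` suffix is an arbitrary choice.

```python
import shutil
import subprocess
from pathlib import Path

# Output formats from the list above (hypothetical constant name).
MARKDOWN_FORMATS = {
    "gfm",
    "markdown_mmd",
    "markdown",
    "markdown_strict",
    "markdown_phpextra",
    "commonmark",
}


def build_pandoc_command(src, fmt="gfm"):
    """Return the pandoc argument list for converting one .docx file.

    The output path is the source path with its suffix replaced by .md
    (an assumption of this sketch, not something pandoc requires).
    """
    if fmt not in MARKDOWN_FORMATS:
        raise ValueError(f"unsupported output format: {fmt}")
    src = Path(src)
    out = src.with_suffix(".md")
    return ["pandoc", "-f", "docx", "-t", fmt, str(src), "-o", str(out)]


def convert(src, fmt="gfm"):
    """Run pandoc on one file; raises if pandoc is missing or fails."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc not found on PATH")
    subprocess.run(build_pandoc_command(src, fmt), check=True)
```

Calling `convert(path)` for each path in `Path(".").glob("*.docx")` would then convert every docx file in the current directory to GitHub-Flavored Markdown; pass a different `fmt` to pick another flavor from the list.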