python-docx

The Most Complete Summary | Python Office Automation: Word (Part 2)

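拥有回忆 submitted on 2021-02-16 05:45:35
1. Preface
The previous article summarized common operations for writing data to Word: The Most Complete Summary | Python Office Automation: Word (Part 1). Reading data is just as useful as writing it. This article covers how to read all the data in a Word document and points out a few things to watch for.

2. Basic information
We again use the python-docx library to read Word documents. First we read the document's basic information: sections, page margins, header and footer distances, page width and height, page orientation, and so on. Before reading this information, we build a Document object from the document path:

    from docx import Document
    # Source file path
    self.word_path = './output.docx'
    # Open the document and build a Document object
    self.doc = Document(self.word_path)

1 - Section

    # 1. Get section information
    # Note: a section can set its own page size, header, and footer
    msg_sections = self.doc.sections
    print("Section list:", msg_sections)
    # Number of sections
    print('Number of sections:', len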

python-docx cannot be imported to python

泄露秘密 submitted on 2021-02-10 19:41:57
Question
I'm trying to install python-docx, so I typed this in cmd:

    easy_install python-docx

and got:

    Searching for python-docx
    Best match: python-docx 0.7.4
    Processing python_docx-0.7.4-py2.6.egg
    python-docx 0.7.4 is already the active version in easy-install.pth
    Using c:\python26\lib\site-packages\python_docx-0.7.4-py2.6.egg
    Processing dependencies for python-docx
    Finished processing dependencies for python-docx

But when I open Python and type:

    import docx

I get:

    File "c:\python26\lib\site-packages

Modify the XML in paragraph.style._element.xml in python-docx

ⅰ亾dé卋堺 submitted on 2021-02-10 14:19:06
Question
I want to modify the color of the border, and I get its XML by calling style._element.xml:

    >>> document = Document()
    >>> run = document.add_heading(u'', 0).add_run('hello world')
    >>> paragraphs = document.paragraphs
    >>> print(paragraphs[0].style._element.xml)
    <w:style xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns

ImportError: cannot import name Document

て烟熏妆下的殇ゞ submitted on 2021-02-08 04:29:31
Question
When I run from docx import Document, I get the error ImportError: cannot import name Document. I am working on Python 2.7.

Answer 1:
It looks like you have installed only docx. Replacing it with python-docx worked for me:

    pip uninstall docx
    pip install python-docx

This way you will be using the newest version of the library. Hope you find it useful.

Source: https://stackoverflow.com/questions/53674449/importerror-cannot-import-name-document
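A quick way to tell whether this answer applies is to check if the importable docx module actually exposes Document. The sketch below is only an illustration: it assumes Python 3.4 or later (the question itself uses Python 2.7, where importlib.util is not available), and has_attribute is a hypothetical helper name, not part of python-docx.

```python
import importlib
import importlib.util


def has_attribute(module_name, attr):
    """Return True if module_name can be imported and exposes attr.

    Hypothetical helper for illustration; not part of python-docx.
    """
    # find_spec returns None when no top-level module of that name is installed.
    if importlib.util.find_spec(module_name) is None:
        return False
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module exists but fails while importing its own dependencies.
        return False
    return hasattr(module, attr)


if __name__ == "__main__":
    if has_attribute("docx", "Document"):
        print("docx.Document is available")
    else:
        print("docx is missing or does not provide Document")
```

If the second message is printed even though a docx package is installed, that is consistent with the situation the answer describes, and reinstalling as shown there is worth trying.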

How to add hyperlink to an image in python-docx

北城以北 submitted on 2021-02-08 01:44:22
Question
I added an image using python-docx. Now I want to add a hyperlink to it. How can I do that?

    import io
    import urllib.request
    from docx import Document
    from docx.shared import Inches

    document = Document()
    p = document.add_paragraph()
    r = p.add_run()
    url = r'http://www.example.com/a.jpg'
    io_url = io.BytesIO(urllib.request.urlopen(url).read())
    r.add_picture(io_url)
    # TODO: add a hyperlink 'http://mywebsite.com' to r
    document.save('example.docx')

Thank you very much.

Answer 1:
It seems that the feature to add a

How to reset page number to 1 in python-docx?

|▌冷眼眸甩不掉的悲伤 submitted on 2021-01-28 06:42:11
Question
I've added a 'Page' field to my document using this code:

    def _add_field(run, field):
        """ add a field to a run """
        fldChar1 = OxmlElement('w:fldChar')  # creates a new element
        fldChar1.set(qn('w:fldCharType'), 'begin')  # sets attribute on element
        instrText = OxmlElement('w:instrText')
        instrText.set(qn('xml:space'), 'preserve')  # sets attribute on element
        instrText.text = field
        fldChar2 = OxmlElement('w:fldChar')
        fldChar2.set(qn('w:fldCharType'), 'separate')
        t = OxmlElement('w:t')
        t.text = "Right-click

Python auto-numbering in headings to text [docx files]

a 夏天 submitted on 2021-01-28 02:28:18
Question
I'm trying to find a way to convert headings' auto-numbering to text. In MS Word's VBA it is just:

    Sub Test()
        ActiveDocument.Range.ListFormat.ConvertNumbersToText
    End Sub

But how about Python 3.x?

Answer 1:
python-docx has the styles attribute. Documentation and examples of use are here: http://python-docx.readthedocs.io/en/latest/user/styles-using.html

Source: https://stackoverflow.com/questions/49194396/python-auto-numbering-in-headings-to-text-docx-files

python-docx: tables missing from document.tables

故事扮演 submitted on 2021-01-28 00:03:06
Question
When trying to access a table in the Word document below, tables before the table of contents are missing from document.tables:
https://www.fedramp.gov/assets/resources/templates/FedRAMP-SSP-High-Baseline-Template.docx
Here is an example of me importing the doc and checking the first table in the tables list and the corresponding table in section 1 of the document (after the table of contents): https://puu.sh/DBm0O/86ee455e03.png
Here is the table I'm trying to access: https://puu.sh/DBm2f
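One possible explanation, which the excerpt above does not confirm for this particular file, is that document.tables in python-docx lists only top-level tables, so a table wrapped in another element (for example a content control, w:sdt) would not appear. The sketch below uses only the standard library rather than python-docx to compare the two counts. count_tables, read_document_xml, and SAMPLE_XML are hypothetical names and data made up for illustration.

```python
import xml.etree.ElementTree as ET
import zipfile

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def read_document_xml(path):
    """Read the main document part from a .docx file (a zip archive)."""
    with zipfile.ZipFile(path) as archive:
        return archive.read("word/document.xml")


def count_tables(document_xml):
    """Return (top_level_count, total_count) of w:tbl elements in the body."""
    root = ET.fromstring(document_xml)
    body = root.find(f"{{{W_NS}}}body")
    # Direct children of w:body only.
    top_level = body.findall(f"{{{W_NS}}}tbl")
    # Every w:tbl at any depth, including tables nested in other elements.
    all_tables = list(body.iter(f"{{{W_NS}}}tbl"))
    return len(top_level), len(all_tables)


# Made-up sample: one table inside a content control, one at the top level.
SAMPLE_XML = (
    f'<w:document xmlns:w="{W_NS}"><w:body>'
    "<w:sdt><w:sdtContent><w:tbl/></w:sdtContent></w:sdt>"
    "<w:tbl/>"
    "</w:body></w:document>"
)

print(count_tables(SAMPLE_XML))  # → (1, 2)
```

For a real file, count_tables(read_document_xml(path)) returning a larger second number than first would suggest that some tables are nested and need to be reached through the underlying XML.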

Setting pgNumType property in python-docx is without effect

丶灬走出姿态 submitted on 2021-01-27 10:47:40
Question
I'm trying to set page numbers in a Word document using python-docx. I found an attribute pgNumType (pageNumberType), which I'm setting with this code:

    document = Document()
    section = document.add_section(WD_SECTION.CONTINUOUS)
    sections = document.sections
    sectPr = sections[0]._sectPr
    pgNumType = OxmlElement('w:pgNumType')
    pgNumType.set(qn('w:fmt'), 'decimal')
    pgNumType.set(qn('w:start'), '1')
    sectPr.append(pgNumType)

This code does nothing; no page numbers are in the output document. I did
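As far as I understand the format, w:pgNumType only configures the number format and starting value for a section. A visible page number still needs a PAGE field in a header or footer, which may be why the code above appears to do nothing. Separately, the schema expects w:pgNumType at a particular position among the children of w:sectPr, so appending it at the end may not be ideal. The sketch below shows the positioning idea with the standard library only, not python-docx. set_page_number_start and the SUCCESSORS list are my own illustration and reflect my reading of the schema order, so treat them as assumptions.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W_NS)


def w(tag):
    """Return the namespace-qualified name for a WordprocessingML tag."""
    return f"{{{W_NS}}}{tag}"


# Assumed list of w:sectPr children that come after w:pgNumType in the schema.
SUCCESSORS = ["cols", "formProt", "vAlign", "noEndnote", "titlePg",
              "textDirection", "bidi", "rtlGutter", "docGrid",
              "printerSettings", "sectPrChange"]


def set_page_number_start(sect_pr, start=1, fmt="decimal"):
    """Create or update w:pgNumType inside sect_pr (hypothetical helper)."""
    pg_num_type = sect_pr.find(w("pgNumType"))
    if pg_num_type is None:
        pg_num_type = ET.Element(w("pgNumType"))
        successor_tags = {w(name) for name in SUCCESSORS}
        # Insert before the first successor, or at the end if there is none.
        index = len(sect_pr)
        for i, child in enumerate(sect_pr):
            if child.tag in successor_tags:
                index = i
                break
        sect_pr.insert(index, pg_num_type)
    pg_num_type.set(w("fmt"), fmt)
    pg_num_type.set(w("start"), str(start))
    return pg_num_type


# Made-up section properties for demonstration.
sect_pr = ET.Element(w("sectPr"))
for name in ("pgSz", "pgMar", "cols", "docGrid"):
    ET.SubElement(sect_pr, w(name))

set_page_number_start(sect_pr, start=1)
print([child.tag.split("}")[1] for child in sect_pr])
# → ['pgSz', 'pgMar', 'pgNumType', 'cols', 'docGrid']
```

The same positioning logic could be applied to the element obtained from sections[0]._sectPr, but the displayed number would still come from a PAGE field placed in the footer or header.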
