docx

Number of pages of a word document with Python

冷暖自知 submitted on 2019-11-28 06:03:05
Question: Is there an efficient way to get the number of pages of a Word document (.doc, .docx) with Python? And for an .odt file? I want to use this for a web application based on Web2py on Linux. Thank you! Answer 1: You can read the value <Properties> <Pages>CountValue</Pages> from docProps/app.xml in the docx package, or <office:document-meta> <office:meta> <meta:document-statistic meta:page-count="CountValue"> from meta.xml in the odt package. If these values do not exist (they are optional), you have
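
The lookup described in the answer can be sketched with the Python standard library alone. This is a minimal sketch, not the answerer's code: the function names `docx_page_count` and `odt_page_count` are hypothetical, and it assumes the file is a valid zip package. It returns None when the optional value is absent, and the stored count is only as fresh as the last save by the editing application.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by docProps/app.xml (OOXML) and meta.xml (ODF).
APP_NS = "http://schemas.openxmlformats.org/officeDocument/2006/extended-properties"
META_NS = "urn:oasis:names:tc:opendocument:xmlns:meta:1.0"


def docx_page_count(source):
    """Return the <Pages> value stored in a .docx, or None if it is missing.

    `source` (hypothetical parameter name) is a path or a binary file object.
    """
    with zipfile.ZipFile(source) as package:
        try:
            root = ET.fromstring(package.read("docProps/app.xml"))
        except KeyError:  # the part itself is optional
            return None
    pages = root.find(f"{{{APP_NS}}}Pages")
    if pages is None or not (pages.text or "").strip():
        return None
    return int(pages.text)


def odt_page_count(source):
    """Return meta:page-count stored in an .odt, or None if it is missing."""
    with zipfile.ZipFile(source) as package:
        try:
            root = ET.fromstring(package.read("meta.xml"))
        except KeyError:
            return None
    statistic = root.find(f".//{{{META_NS}}}document-statistic")
    if statistic is None:
        return None
    value = statistic.get(f"{{{META_NS}}}page-count")
    return int(value) if value is not None else None
```

Both functions only read metadata, so they avoid rendering the document. Legacy .doc files are not zip packages, so this sketch does not cover them.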

Convert html to doc in java

别等时光非礼了梦想. submitted on 2019-11-28 01:14:54
I would like to convert either an HTML or XHTML document (preferably with styles) to Microsoft .doc and/or .docx format. There seem to be plenty of examples for doing this the other way around, but I haven't found any useful examples for converting to MS document formats. Can anyone point me to an API or provide an example for doing this, please? Many thanks. docx4j 2.8.0 supports converting XHTML documents and fragments to docx content. Disclosure: I wrote some of the code. Yet another solution would be to use jodconverter, which seems to do basic HTML to doc conversion... it doesn't claim to do it

How to Search and Replace in odt Open Office document?

◇◆丶佛笑我妖孽 submitted on 2019-11-28 01:13:20
Question: In my Delphi application I currently do search and replace programmatically for doc and docx Word documents using Office OLE automation. Does anyone have the code to do the same (for doc, docx, odt) in OpenOffice? I also asked a related question on saving to pdf. Answer 1: You should focus on the XReplaceable interface. Here is an example. Please note that there is no error handling. I've tested it with LibreOffice Writer and it works fine for me. uses ComObj; procedure OpenOfficeReplace(const

How to make git support doc documents

ε祈祈猫儿з submitted on 2019-11-28 01:10:13
This problem is easy to solve: just add a .gitattributes file with the following content:
/////////////////////////////////////////////////////////////////////////
# Auto detect text files and perform LF normalization
* text=auto
# Custom for Visual Studio
*.cs diff=csharp
*.sln merge=union
*.csproj merge=union
*.vbproj merge=union
*.fsproj merge=union
*.dbproj merge=union
# Standard to msysgit
*.doc diff=astextplain
*.DOC diff=astextplain
*.docx diff=astextplain
*.DOCX diff=astextplain
*.dot diff=astextplain
*.DOT diff=astextplain
*.pdf diff=astextplain
*.PDF diff=astextplain
*.rtf diff=astextplain
*.RTF diff=astextplain
//////////////////////

PyInstaller and python-docx module do not work together

|▌冷眼眸甩不掉的悲伤 submitted on 2019-11-28 01:07:15
Question: I am trying to make an executable of my program to give to my FTC team. Everything works, but when I try to use my script that includes python-docx, it does not complete the whole thing. It works when I run it in PyCharm and from the terminal. Here is the code. I have Python 3. from tkinter import * import sys,math,random,datetime,os,time import tkinter.messagebox from tkinter import filedialog from tkinter.filedialog import askopenfilename from tkinter.messagebox import showerror from

Is there any java library (maybe poi?) which allows to merge docx files? [closed]

谁说胖子不能爱 submitted on 2019-11-27 22:21:50
I need to write a Java application which can merge docx files. Any suggestions? The following Java APIs are available to handle OpenXML MS Word documents: Apache POI XWPF, OpenOffice.org API, OpenXML4J, Docx4J. There was one more, but I don't recall the name anymore. As to your functional requirement: merging two documents so that the result is what the end user would expect is technically tricky. Most APIs won't allow that. You'll need to extract the desired information from the two documents and then create one new document based on this information yourself. With POI my solution is: public

Convert Html to Docx in c# [closed]

家住魔仙堡 submitted on 2019-11-27 22:13:23
I want to convert an HTML page to docx in C#. How can I do it? The code below does the same thing as Luis's code, but is a bit more readable and applied to an ASP.NET MVC application: var word = new Microsoft.Office.Interop.Word.Application(); word.Visible = false; var filePath = Server.MapPath("~/MyFiles/Html2PdfTest.html"); var savePathPdf = Server.MapPath("~/MyFiles/Html2PdfTest.pdf"); var wordDoc = word.Documents.Open(FileName: filePath, ReadOnly: false); wordDoc.SaveAs2(FileName: savePathPdf, FileFormat: WdSaveFormat.wdFormatPDF); You can also save in other formats such as docx like this: var

Knitr & Rmarkdown docx tables

青春壹個敷衍的年華 submitted on 2019-11-27 20:28:06
Question: When using knitr and rmarkdown together to create a Word document, you can use an existing document to style the output. For example, in my YAML header: output: word_document: reference_docx: style.docx fig_caption: TRUE Within this style I have created a default table style; the goal here is to have the kable table output in the correct style. When I knit the Word document and use style.docx, the tables are not styled according to the table style. Using the style inspector has not been helpful

Using OpenXML SDK to replace text on a docx file with a line break (newline)

Deadly submitted on 2019-11-27 19:26:16
Question: I am trying to use C# to replace a specific string of text throughout an entire DOCX file with a line break (newline). The string of text that I am searching for could be in a paragraph or in a table in the file. I am currently using the code below to replace text. using (WordprocessingDocument doc = WordprocessingDocument.Open("yourdoc.docx", true)) { var body = doc.MainDocumentPart.Document.Body; foreach (var text in body.Descendants<Text>()) { if (text.Text.Contains("##Text1##")) { text.Text =

Reading doc and docx files using C# without having MS Office installed on server

99封情书 submitted on 2019-11-27 19:20:03
Question: I'm working on a project (ASP.NET, C#, VB 2010, .NET 4) and I need to read both DOC and DOCX files that I've previously uploaded (I've done the uploading part). The tricky part is that I don't have MS Office installed on the server and I can't use it. Is there any public library that I can include in my project without having to install anything? Both docs are very simple: NUMBER TAB STRING NUMBER TAB STRING NUMBER TAB STRING ... I need to extract the number and string for each row (paragraph). May