docx

What could be causing this corruption in .docx files during httpwebrequest?

旧街凉风 submitted on 2019-11-29 23:46:06
Question: I am using HttpWebRequest to post a file with some additional form data from an MVC app to a classic ASP site. If the file is a .docx, it always arrives corrupted. Other file types seem to open fine, but that could be because their formats are more tolerant. When I opened the original and the corrupted file in Sublime Text, I noticed that the corrupted file is missing a block of 0000 at the end. When I manually restore this block, the file opens fine. Is there something I'm doing incorrectly in my .NET code
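A .docx file is a ZIP package, and a ZIP archive keeps its directory records at the very end of the file, so lost trailing bytes typically make the package unreadable. The following is a minimal diagnostic sketch in Python, not a fix for the .NET code; `docx_is_intact` is a hypothetical helper name, and it only checks that the received bytes still open as a ZIP archive with valid member checksums.

```python
import io
import zipfile


def docx_is_intact(data: bytes) -> bool:
    """Return True if the bytes open as a ZIP archive and every member passes its CRC check."""
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            # testzip() returns the name of the first bad member, or None if all are fine
            return archive.testzip() is None
    except zipfile.BadZipFile:
        # Raised when the end-of-archive records are missing or damaged
        return False
```

Running this on the bytes before sending and again on the bytes as received would show whether the damage happens in transit or in how the receiving side writes the file.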

Add styling rules in pandoc tables for odt/docx output (table borders)

我只是一个虾纸丫 submitted on 2019-11-29 23:05:42
I'm generating some odt/docx reports via markdown using knitr and pandoc, and am now wondering how to go about formatting tables. Primarily I'm interested in adding rules (at least top, bottom and one below the header, but being able to add arbitrary ones inside the table would be nice too). Running the following example from the pandoc documentation through pandoc (without any special parameters) just yields a "plain" table without any kind of rules/colours/guides (in either -t odt or -t docx).
+---------------+---------------+--------------------+
| Fruit         | Price         | Advantages         |
+==========

How to convert .docx to .odt with Libreoffice on Ubuntu bash

为君一笑 submitted on 2019-11-29 18:41:00
Question: There is a problem converting DOCX to PDF using LibreOffice (in RTL documents), but converting the same document saved in ODT format works fine. Does anyone know how to convert an existing DOCX file to ODT using Ubuntu bash?
Answer 1: You can run this command directly from the command line: libreoffice --headless --convert-to odt *.docx
Answer 2: You can save it directly in ODT format in LibreOffice. Click Save As, select ODT as the format, name the file, and click OK.
Source: https://stackoverflow.com
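The command from the first answer can also be driven from a script. This is a minimal sketch in Python, assuming the `libreoffice` binary is on the PATH; `build_convert_command` and `convert_all` are hypothetical helper names, and `--outdir` is added here so the output location is explicit.

```python
import subprocess
from pathlib import Path


def build_convert_command(docx_paths, out_dir="."):
    """Build the argument list for a headless DOCX-to-ODT conversion."""
    return [
        "libreoffice", "--headless",
        "--convert-to", "odt",
        "--outdir", str(out_dir),
        *[str(p) for p in docx_paths],
    ]


def convert_all(folder):
    """Convert every .docx file in `folder`, writing the .odt files next to them."""
    paths = sorted(Path(folder).glob("*.docx"))
    if paths:
        # check=True raises CalledProcessError if LibreOffice exits with an error
        subprocess.run(build_convert_command(paths, folder), check=True)
    return paths
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.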

How to extract plain text from a DOCX file using the new OOXML support in Apache POI 3.5?

*爱你&永不变心* submitted on 2019-11-29 17:24:33
Question: On September 28, 2009 the Apache POI project released version 3.5, which officially supports the OOXML formats introduced in Office 2007, like DOCX and XLSX. Please provide a code sample for extracting a DOCX file's content as plain text, ignoring any styles or formatting. I am asking this because I have been unable to find any Apache POI examples covering the new OOXML support.
Answer 1: This worked for me. Make sure you add the required jars (upgrade xmlbeans, etc.) public String extractText
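For comparison, the same idea can be illustrated without Apache POI. This is a Python sketch using only the standard library, not the POI code from the answer above; `docx_plain_text` is a hypothetical helper name. It reads `word/document.xml` from the package and joins the text runs of each paragraph, ignoring all formatting.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace, in ElementTree's {uri}tag notation
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_plain_text(source):
    """Return the document body as plain text, one line per paragraph."""
    with zipfile.ZipFile(source) as package:
        root = ET.fromstring(package.read("word/document.xml"))
    lines = []
    for paragraph in root.iter(W_NS + "p"):
        # Concatenate every text run (w:t) found inside this paragraph
        lines.append("".join(node.text or "" for node in paragraph.iter(W_NS + "t")))
    return "\n".join(lines)
```

The sketch skips headers, footers and footnotes, and paragraphs nested inside text boxes may appear more than once.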

Apache POI XWPF adding shapes to header

生来就可爱ヽ(ⅴ<●) submitted on 2019-11-29 16:39:49
I'm trying to add some shapes and a logo file to the header of my Word docx document. Adding a picture works for me, but I couldn't find a way to add a shape. Can anyone help me?
String imgFile = "logo.png";
XWPFDocument document = new XWPFDocument(new FileInputStream("myfile.docx"));
CTSectPr sectPr = document.getDocument().getBody().addNewSectPr();
XWPFHeaderFooterPolicy headerFooterPolicy = new XWPFHeaderFooterPolicy(document, sectPr);
XWPFHeader header = headerFooterPolicy.createHeader(XWPFHeaderFooterPolicy.DEFAULT);
XWPFParagraph paragraph = header.getParagraphArray(0);

Change image layout or wrap in DOCX with Apache POI

隐身守侯 submitted on 2019-11-29 16:17:06
I insert an image into a docx file programmatically, but the resulting layout does not suit me, and I ran into a lack of documentation. I need to change the image wrap (layout). For example, now I have this: [image] But I want this: [image]
UPD1: What I do: I iterate through the paragraphs, then through the runs, and find a certain run with a special bookmark. In this run I add the picture:
XWPFPicture pic = run.addPicture(new ByteArrayInputStream(picSource), Document.PICTURE_TYPE_PNG, "pic", Units.toEMU(100), Units.toEMU(30));
UPD2: I found something interesting inside this class: org.openxmlformats.schemas.drawingml.x2006

How to read metadata information from docx documents?

不打扰是莪最后的温柔 submitted on 2019-11-29 15:11:19
Question: What I need to achieve is a Word document template (docx) that contains a title, author name, date, etc. Users will then fill in this template. I need to create a C# program that takes in the docx file and reads all the information of interest (title, name, date, ...). So my questions are: How do I put the metadata into the template, saying: this is Title, this is Date, this is Name, etc.? (not programmatically) How do I programmatically read that information? Answer 1:

Number of pages of a word document with Python

自作多情 submitted on 2019-11-29 12:08:22
Is there an efficient way to get the number of pages of a Word document (.doc, .docx) with Python? And for an .odt file? I want to use this in a web application based on Web2py on Linux. Thank you!
You can read the value <Properties> <Pages>CountValue</Pages> from docProps/app.xml in the docx package, or <office:document-meta> <office:meta> <meta:document-statistic meta:page-count="CountValue"> from meta.xml in the odt package. If these values do not exist (they are optional), you have to compute the count from the entire document, in effect rendering it, which is much more difficult. Only for
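Reading the stored values described above can be sketched with the Python standard library alone. The helper names `docx_page_count` and `odt_page_count` are hypothetical; both return None when the optional value is missing, and the number is whatever the last application to save the file wrote, so it may be stale.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespaces of docProps/app.xml (OOXML) and meta.xml (OpenDocument)
APP_NS = "http://schemas.openxmlformats.org/officeDocument/2006/extended-properties"
META_NS = "urn:oasis:names:tc:opendocument:xmlns:meta:1.0"


def _read_member(source, name):
    """Return the parsed XML root of one package member, or None if it is absent."""
    with zipfile.ZipFile(source) as package:
        try:
            return ET.fromstring(package.read(name))
        except KeyError:
            return None


def _to_int(value):
    """Convert a digit string to int; anything else becomes None."""
    value = (value or "").strip()
    return int(value) if value.isdigit() else None


def docx_page_count(source):
    """Page count stored in <Pages> of docProps/app.xml, if present."""
    root = _read_member(source, "docProps/app.xml")
    if root is None:
        return None
    return _to_int(root.findtext("{%s}Pages" % APP_NS))


def odt_page_count(source):
    """Page count stored in meta:document-statistic of meta.xml, if present."""
    root = _read_member(source, "meta.xml")
    if root is None:
        return None
    stat = root.find(".//{%s}document-statistic" % META_NS)
    if stat is None:
        return None
    return _to_int(stat.get("{%s}page-count" % META_NS))
```

Both functions accept a file path or a file-like object, which suits a web application that receives uploads in memory.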

PyInstaller and python-docx module do not work together

女生的网名这么多〃 submitted on 2019-11-29 11:54:14
I am trying to make an executable of my program to give to my FTC team. Everything else works, but when I run the script that uses python-docx from the executable, it does not complete. It works when I run it in PyCharm and from the terminal. Here is the code. I have Python 3.
from tkinter import *
import sys, math, random, datetime, os, time
import tkinter.messagebox
from tkinter import filedialog
from tkinter.filedialog import askopenfilename
from tkinter.messagebox import showerror
from time import gmtime, strftime
from docx import Document
from docx.shared import Inches
import

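Exporting Word documents from Java with FreeMarker templates (docx format; stream-based input and output)

不打扰是莪最后的温柔 submitted on 2019-11-29 10:39:38
Preface: It has been a while since my last post. Recently I have been working on another project that exports Word documents. There are plenty of blog posts about exporting online, and most of them are much alike, but they fall far short of what I need. I wrote an article on exporting Word documents before; it was not very mature, and as the business requirements grew, quite a lot has changed. So I'll simply call today's article the sequel, and I hope it helps!
The previous article covered the doc format; see that article for details. Here are some problems with it:
1. In the previous article, the template was obtained via new File("url"). I don't actually recommend this approach unless your requirements demand it. In my export process, the template (testword.ftl) is stored in MySQL as a Blob, and I retrieve it as a stream. Throughout the export, whenever file input or output is involved, try to use streams. Here is the code:
I imagine most of the export articles you found on Baidu obtain the template like this:
System.out.println("---进入createDocArea---");
this.configuration.setDirectoryForTemplateLoading(new File("/template/")); // second kind of template path
Template t = null;
File outFile = null;
byte[] bFile = null;
try { t =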