docx

How to change font encoding when converting docx -> pdf with docx4j?

为君一笑 submitted on 2019-12-22 18:28:15
Question: When I'm converting a docx document to PDF, my national characters turn into "#" marks. Is there any way to set a font encoding for PDF documents? I used xdocreport in the past and it can handle that, but I had problems with images, headers and footers. Docx4j handles those, but not fonts. After conversion, fonts have ANSI encoding while I'd like to have windows-1250. Is there an option to set this? Answer 1: My problem was missing proper TrueType fonts on the Linux server. The default

Docx: can't seem to get a bulleted list to render

北城以北 submitted on 2019-12-22 15:09:22
Question: I'm building a DocX document essentially from scratch with XML. I have a very simple goal: create a bullet-point list, like a ul in HTML. Reading the WordProcessingML specification for numbered lists (section 2.9), I created what I thought would satisfy this. Here's what my numbering.xml looks like: <?xml version="1.0" encoding="UTF-8" standalone="yes"?> <w:numbering xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math" xmlns:mo="http://schemas.microsoft.com/office/mac/office

Convert PDF to .docx with Python

£可爱£侵袭症+ submitted on 2019-12-22 11:19:18
Question: I'm trying very hard to find a way to convert a PDF file to a .docx file with Python. I have seen other posts related to this, but none of them seem to work correctly in my case. Specifically, I'm using

    import os
    import subprocess

    for top, dirs, files in os.walk('/my/pdf/folder'):
        for filename in files:
            if filename.endswith('.pdf'):
                abspath = os.path.join(top, filename)
                subprocess.call('lowriter --invisible --convert-to doc "{}"'
                                .format(abspath), shell=True)

This gives me Output[1], but

Reading docx files, recognizing and storing italicized text

陌路散爱 submitted on 2019-12-22 09:06:12
Question: How should I go about reading a .docx file with Python, recognizing the italicized text, and storing it as a string? I looked at the docx Python package, but all I see are features for writing to a .docx file. I appreciate the help in advance. Answer 1: Here's what my example document, TestDocument.docx, looks like. Note: the word "Italic" is in italics, but "Emphasis" uses the style Emphasis. If you install the python-docx module, this is a fairly simple exercise. >>> from docx
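For readers who want to see the idea without installing anything, here is a minimal sketch that uses only the Python standard library instead of python-docx. It relies on the fact that a .docx file is a zip archive whose body lives in word/document.xml, and that directly applied italics appear as a w:i element inside a run's w:rPr. The function name italic_runs is hypothetical, and the sketch only detects direct formatting: text that is italic because of a style such as Emphasis is not reported.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace in the {uri}tag form that ElementTree expects.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def italic_runs(docx_file):
    """Return the text of every run that carries direct italic formatting.

    docx_file may be a path or a binary file-like object.
    Style-based italics (for example the Emphasis style) are not detected.
    """
    with zipfile.ZipFile(docx_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    found = []
    for run in root.iter(W + "r"):
        italic = run.find(W + "rPr/" + W + "i")
        if italic is None:
            continue
        # <w:i w:val="false"/> (or "0"/"off") explicitly switches italics off.
        if italic.get(W + "val", "true") in ("false", "0", "off"):
            continue
        text = "".join(t.text or "" for t in run.iter(W + "t"))
        if text:
            found.append(text)
    return found
```

Calling italic_runs("TestDocument.docx") would return a list of strings, one per italic run; joining them gives a single string if that is what is needed.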

Django create .odt or .docx documents to download

喜欢而已 submitted on 2019-12-21 20:24:42
Question: I need to generate either .odt or .docx files based on the information I have in my database. Let's say I have a model:

    class Contact(models.Model):
        first_name = models.CharField()
        last_name = models.CharField()
        email = models.EmailField()

I want users to be able to generate an office document that contains that information and also some other text. I looked at this example, which uses python-docx, and it gives me an idea of how to generate that document. But I can't figure out where

Is there any way to generate a thumbnail image of a DOCX file?

微笑、不失礼 submitted on 2019-12-21 17:32:06
Question: I have done this using "pay" tools like ASPOSE, but I was curious whether there are any open source tools out there that will do this. Answer 1: I am sure there may be tools out there to do that, but if you can get the file into a format that can then be rasterized easily, it may be worth exploring that, e.g. converting the Word doc to PDF and then thumbnailing that. Source: https://stackoverflow.com/questions/5921761/is-there-any-way-to-generate-a-thumbnail-image-of-a-docx-file

Read Microsoft Word Documents into Plain Text (DOC, DOCX) in Java

故事扮演 submitted on 2019-12-21 16:46:52
Question: I'm looking for something in Java to read in Word documents and process their text. All I need is their text, nothing fancy. I know about Apache POI, however it doesn't include support for DOCX right now. Is there anything out there? Answer 1: If you don't require formatting information, images and all the other fancy stuff, then the job is a lot easier. Just some 5 to 10 lines of code will do. Treat DOCX as a zip file. It consists of a bunch of files, including 'document.xml'. Use ZipInputStream and extract
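The zip-based idea in this answer is language-independent. As a rough illustration, here is a minimal sketch in Python rather than Java, using only the standard library. It assumes the main body is stored at word/document.xml inside the archive and that visible text sits in w:t elements grouped by w:p paragraphs. The function name docx_to_text is hypothetical, and headers, footers, footnotes and legacy .doc files are not handled.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace in the {uri}tag form that ElementTree expects.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_text(docx_file):
    """Return the document body as plain text, one line per paragraph.

    docx_file may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(docx_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    paragraphs = []
    for para in root.iter(W + "p"):
        # Concatenate the text of every run in this paragraph.
        paragraphs.append("".join(t.text or "" for t in para.iter(W + "t")))
    return "\n".join(paragraphs)
```

In Java the same steps would map onto opening the archive, locating the document.xml entry, and walking the XML with any parser; the sketch only shows the shape of the approach.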

Xamarin free HTML or DOC to PDF Conversion

妖精的绣舞 submitted on 2019-12-21 05:26:06
Question: I'm currently searching for a library or a way to convert HTML or DOCX files into PDF on the phone/tablet. Primarily I am searching for a way on Android or iOS; I don't know whether it should be a PCL or a platform-specific approach. I could do this for every platform independently, because our app requires iOS 8 or Android KitKat, both of which support native PDF conversion, but I want it to be seamless for the user. So the question is whether anyone has done this before without first loading it into a visible WebView, or has

pandoc convert html with style sheet to docx

℡╲_俬逩灬. submitted on 2019-12-20 19:36:14
Question: I've been banging my head on this one for a few hours, and I'm sure the solution is quite simple, or non-existent. I'm trying to convert an HTML file to docx!

    <!DOCTYPE html>
    <html>
    <head>
    <style>
    body { background-color: #d0e4fe; }
    h1 { color: orange; text-align: center; }
    p { font-family: "Times New Roman"; font-size: 20px; }
    </style>
    </head>
    <body>
    <h1>My First CSS Example</h1>
    <p>This is a paragraph.</p>
    </body>
    </html>

I can convert it no problem, but I can't get the styles to stick.

What can a DOC file reader do?

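浪尽此生 submitted on 2019-12-20 18:59:20
DOC file reader: a reader built specifically for DOC files. Many people ask what kind of file a DOC is. It is one of the common file extensions on a computer; put another way, it is the format that Word 2003 and earlier versions save in. Setting aside what the format is, the practical question is how to open it: the DOC file reader fully supports DOC files as well as TXT files, which makes them more convenient to work with. In addition, the software does not require Word to be installed, because it is a standalone desktop program.
Features
1. Open, read, and print Word documents with FoxPDF Doc Reader;
2. It requires no Microsoft software at all. Doc Reader displays Word documents (Doc, DocX) and similar files in high quality;
3. Standalone software; it needs neither Microsoft software nor Microsoft Word;
4. Doc Reader opens, views, and prints at high speed;
5. Supported operating systems include Windows 2000/XP/2003/Vista/2008/7/8;
6. Supports both 32-bit and 64-bit systems;
7. Doc Reader is easy to use: just drag and drop to open, view, and print Word files;
8. Supports English, French, German, Italian, Simplified Chinese, Traditional Chinese, Japanese, and other languages;
More: Doc Reader can not only open Word (Doc, Docx) and RTF files but also display TXT files. It does not require Microsoft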