doc

Convert doc to pdf using Apache POI

爱⌒轻易说出口 submitted on 2019-11-28 09:28:46
I am trying to convert a .doc file to PDF using Apache POI, but the resulting PDF contains only text; formatting such as images, tables, and alignment is lost. How can I convert .doc to PDF while keeping all the formatting, including tables, images, and alignment? Here is my code: import java.io.File; import java.io.FileInputStream; import java.io.FileOutputStream; import java.io.OutputStream; import com.lowagie.text.Document; import com.lowagie.text.DocumentException; import com.lowagie.text.Paragraph; import com.lowagie.text.pdf.PdfWriter; import org.apache.poi.hwpf.HWPFDocument; import org.apache

Working with Elasticsearch (es) in Python

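被刻印的时光 ゝ submitted on 2019-11-28 08:45:18
Areas covered
Result filtering: filter the returned results, mainly to trim the response content.
Working directly with the elasticsearch object: handle some simple index information. All of the following areas build on the es object.
Indices: detailed index operations, such as creating custom mappings.
Cluster: cluster-related operations.
Nodes: node-related operations.
Cat API: an alternative way to query. Responses are normally JSON, whereas cat returns compact results.
Snapshot: snapshot-related operations. A snapshot is a backup taken from a running Elasticsearch cluster. We can take a snapshot of a single index or of the whole cluster and store it in a repository on a shared file system, and there are plugins that support remote repositories on S3, HDFS, Azure, Google Cloud Storage, and others.
Task Management API: the task management API is new and should still be considered a beta feature. The API may change in ways that are not backward compatible.
Result filtering
The filter_path parameter reduces the information es returns. It lets you specify which parts of the response to return, and it also supports wildcards (*).
    body = {
        "query": {
            "match": {
                "name": "成都"
            }
        }
    }
    # print(es.search(index="p1", body=body))
    print(es.search(index="p1", body=body, filter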
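To make the effect of filter_path concrete without a running cluster, here is a minimal local sketch of that kind of pruning applied to a plain dict. It is an emulation for illustration only, not Elasticsearch's implementation: the names filter_response and _prune are hypothetical, the sample response is made up, wildcards are handled with Python's fnmatch per path segment, and the `**` form is not covered.

```python
from fnmatch import fnmatchcase

_MISSING = object()  # sentinel: nothing under this node matched


def _prune(node, patterns):
    # A fully consumed pattern means everything below this point is kept.
    if [] in patterns:
        return node
    # Lists are traversed transparently: the same patterns apply to each item.
    if isinstance(node, list):
        kept = [_prune(item, patterns) for item in node]
        kept = [item for item in kept if item is not _MISSING]
        return kept or _MISSING
    # A pattern that goes deeper than the data matches nothing.
    if not isinstance(node, dict):
        return _MISSING
    result = {}
    for key, value in node.items():
        # Keep the remainder of every pattern whose first segment matches this key.
        rest = [p[1:] for p in patterns if fnmatchcase(key, p[0])]
        if rest:
            sub = _prune(value, rest)
            if sub is not _MISSING:
                result[key] = sub
    return result or _MISSING


def filter_response(response, paths):
    """Return a copy of response containing only the dotted paths given."""
    patterns = [path.split(".") for path in paths]
    pruned = _prune(response, patterns)
    return {} if pruned is _MISSING else pruned


if __name__ == "__main__":
    # Made-up response in the general shape of a search result.
    sample = {
        "took": 3,
        "hits": {"total": 1, "hits": [{"_id": "1", "_source": {"name": "成都"}}]},
    }
    print(filter_response(sample, ["hits.hits._source"]))
    print(filter_response(sample, ["hits.t*"]))
```

With the real client, the pruning happens on the server, so the filtered response is also smaller on the wire; this sketch only shows which parts of the structure survive.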

Number of pages of a word document with Python

冷暖自知 submitted on 2019-11-28 06:03:05
Question: Is there an efficient way to get the number of pages of a Word document (.doc, .docx) with Python? And for an .odt file? I want to use this for a web application based on Web2py on Linux. Thank you! Answer 1: You can read the value <Properties> <Pages>CountValue</Pages> from docProps/app.xml in the docx package, or <office:document-meta> <office:meta> <meta:document-statistic meta:page-count="CountValue"> from meta.xml in the odt package. If these values do not exist (they are optional), you have
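The metadata lookup described in the answer above can be sketched with the standard library alone, since both .docx and .odt files are ZIP packages. This is a minimal sketch: the function name stored_page_count is hypothetical, it returns None when the optional value is absent, and it does not handle the binary .doc format.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the two metadata files.
APP_NS = "http://schemas.openxmlformats.org/officeDocument/2006/extended-properties"
META_NS = "urn:oasis:names:tc:opendocument:xmlns:meta:1.0"


def stored_page_count(path):
    """Return the page count recorded in a .docx or .odt package, or None.

    The value is whatever the last editing application saved; it is
    optional metadata, so it may be missing or out of date.
    """
    with zipfile.ZipFile(path) as package:
        names = package.namelist()
        if "docProps/app.xml" in names:  # .docx
            root = ET.fromstring(package.read("docProps/app.xml"))
            pages = root.find(f"{{{APP_NS}}}Pages")
            if pages is None or not pages.text:
                return None
            return int(pages.text)
        if "meta.xml" in names:  # .odt
            root = ET.fromstring(package.read("meta.xml"))
            stat = root.find(f".//{{{META_NS}}}document-statistic")
            if stat is None:
                return None
            count = stat.get(f"{{{META_NS}}}page-count")
            return int(count) if count else None
    return None
```

Calling stored_page_count("report.docx") (a hypothetical file name) returns an int or None; a None result is the case where the answer's fallback would be needed.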

Advice for learning Linux x86-64 assembly & documentation [closed]

邮差的信 submitted on 2019-11-28 04:34:18
Does anyone have documentation for learning the fundamentals of Linux x86-64 assembly? I'm not sure whether to learn it as is, or to learn x86 first and x86-64 later, but since I have an x86-64 computer and not an x86, I was thinking of learning x86-64 instead ;) Maybe someone could give me some incentive, and direction as to what to learn, how, and with what documentation. Kindly give me your most favoured documentation titles. I code a little Python, this is my first attempt at a lower-level language, and I'm more than ready to dedicate myself to it. Thanks all Callum General

Convert html to doc in java

别等时光非礼了梦想. submitted on 2019-11-28 01:14:54
I would like to convert either an HTML or XHTML document (preferably with styles) to Microsoft .doc and/or .docx format. There seem to be plenty of examples for doing this the other way around, but I haven't found any useful examples for converting to MS document formats. Can anyone point me to an API or provide an example for doing this, please? Many thanks. docx4j 2.8.0 supports converting XHTML documents and fragments to docx content. Disclosure: I wrote some of the code. Yet another solution would be to use jodconverter, which seems to do basic HTML to doc conversion... it doesn't claim to do it

How to avoid java.lang.NoSuchMethodError: org.apache.poi.util.IOUtils.copy(Ljava/io/InputStream;Ljava/io/OutputStream;) in Apache POI

只谈情不闲聊 submitted on 2019-11-28 00:47:41
I have code for adding a watermark to an existing .doc file. The following is the code I have tried so far: public static void main(String[] args) { try { XWPFDocument xDoc = new XWPFDocument(new FileInputStream("test.doc")); XWPFHeaderFooterPolicy xFooter = new XWPFHeaderFooterPolicy(xDoc); xFooter.createWatermark("My Watermark"); } catch(Exception e) { e.printStackTrace(); } } The following is what I got: Exception in thread "main" java.lang.NoSuchMethodError: org.apache.poi.util.IOUtils.copy(Ljava/io/InputStream;Ljava/io/OutputStream;)V at org.apache.poi.util.PackageHelper.open(PackageHelper

How do you display a formatted Word Doc in HTML/PHP?

时光毁灭记忆、已成空白 submitted on 2019-11-27 22:37:04
What is the best way to display a formatted Word Doc in HTML/PHP? Here is the code I currently have, but it doesn't preserve the formatting: $word = new COM("word.application") or die ("Could not initialise MS Word object."); $word->Documents->Open(realpath("ACME.doc")); // Extract content. $content = (string) $word->ActiveDocument->Content; echo $content; $word->ActiveDocument->Close(false); $word->Quit(); $word = null; unset($word); I know nothing about COM, but poking around the Word API docs on MSDN, it looks like your best bet is going to be using Document.SaveAs to save as wdFormatFilteredHTML to a

Reading doc and docx files using C# without having MS Office installed on server

99封情书 submitted on 2019-11-27 19:20:03
Question: I'm working on a project (asp.net, c#, vb 2010, .net 4) and I need to read both DOC and DOCX files that I've previously uploaded (I've done the uploading part). The tricky part is that I don't have MS Office installed on the server and I can't use it. Is there any public library that I can include in my project without having to install anything? Both docs are very simple: NUMBER TAB STRING NUMBER TAB STRING NUMBER TAB STRING ... I need to extract the number and string for each row (paragraph). May
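As an illustration of why Office is not needed for the DOCX half of this, here is a minimal Python sketch (not C#, and not the library the question asks for) that reads the NUMBER TAB STRING rows straight from the package. It assumes the .docx stores its body in word/document.xml and that each tab is a w:tab element inside a run; the names paragraph_texts and parse_rows are hypothetical, and the binary .doc format is not covered.

```python
import zipfile
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W = f"{{{W_NS}}}"


def paragraph_texts(path):
    """Yield the plain text of each paragraph, with tabs as tab characters."""
    with zipfile.ZipFile(path) as package:
        root = ET.fromstring(package.read("word/document.xml"))
    for para in root.iter(W + "p"):
        parts = []
        # Only look at direct children of runs, so tab-stop definitions
        # in the paragraph properties are not mistaken for tab characters.
        for run in para.iter(W + "r"):
            for node in run:
                if node.tag == W + "t":
                    parts.append(node.text or "")
                elif node.tag == W + "tab":
                    parts.append("\t")
        yield "".join(parts)


def parse_rows(path):
    """Yield (number, string) pairs from paragraphs shaped NUMBER TAB STRING."""
    for text in paragraph_texts(path):
        if "\t" not in text:
            continue  # skip paragraphs that do not follow the row layout
        number, string = text.split("\t", 1)
        yield number.strip(), string.strip()
```

The number is left as a string here because the question does not say whether it is always an integer; converting it is a one-line change once the format is known.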