How can I get useful git diff of files saved by Libre Office Writer, with output in the command line?

青春壹個敷衍的年華 submitted on 2020-05-28 04:54:26

Question


By default, git diff on .odt files does not show what was changed. It only reports:

Binary files i/filename.odt and w/filename.odt differ

Is there a way to show what actually changed while keeping the file directly editable in LibreOffice?


Answer 1:


You could also use the flat XML format offered by LibreOffice: the .fodt file format.

See "Libreoffice and version control" or this answer, which provides good links.

From the link:

If a document is saved as .fodt file it keeps the same data the .odt file would contain. Only that this time the data is represented as human-readable text (which makes the work much easier for the version control system) and not compressed. So saving a document as flat xml makes it possible to keep server space requirements and network load low at the relatively low cost of wasting a few kilobytes on the local hard disks.
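Existing .odt files can also be converted from the command line instead of through Save As. A minimal sketch, assuming LibreOffice's soffice binary is on your PATH; the function name odt_to_fodt is made up for this example:

```shell
# Hypothetical helper: convert one .odt file to flat XML (.fodt),
# writing the result next to the original.
# Assumes LibreOffice's `soffice` binary is on PATH.
odt_to_fodt() {
    soffice --headless --convert-to fodt --outdir "$(dirname "$1")" "$1"
}
```

Calling odt_to_fodt summary.odt should leave a summary.fodt beside the original. LibreOffice opens and saves .fodt directly, so the file stays editable.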




Answer 2:


Note: As mentioned, ideally one should avoid versioning binary files, as they make comparing, integrating and resolving conflicts more difficult.


In git, you can configure a diff driver for each office file type that converts the file to a plain-text representation before comparing versions.

Here are a few examples of tools that can be used:

  • catdoc (for Word)
  • catppt (for Powerpoint)
  • odt2txt (for Writer)
  • xls2csv (for Excel)

First, map each office file extension to a diff driver globally in the $HOME/.config/git/attributes file:

*.doc binary diff=doc
*.odt binary diff=odt
*.ppt binary diff=ppt
*.xls binary diff=xls

Then configure the conversion command for each of those diff drivers globally:

git config --global diff.doc.textconv catdoc
git config --global diff.odt.textconv odt2txt
git config --global diff.ppt.textconv catppt
git config --global diff.xls.textconv xls2csv

Source: https://medium.com/@mbrehin/git-advanced-diff-odt-pdf-doc-xls-ppt-25afbf4f1105
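The same setup also works per repository rather than globally. A minimal sketch in a throwaway repository, assuming only that git is installed; report.odt is a hypothetical file name and does not need to exist for the check:

```shell
# Per-repository variant of the setup above, in a throwaway repository.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# Same attribute line as the global file, but stored with the project.
printf '*.odt binary diff=odt\n' > .gitattributes

# The driver goes into this repository's .git/config instead of the global config.
git config diff.odt.textconv odt2txt

# Ask git which diff driver it would pick for a Writer file,
# and which conversion command that driver runs.
attr=$(git check-attr diff -- report.odt)
driver=$(git config --get diff.odt.textconv)
echo "$attr"
echo "$driver"
```

git check-attr should report odt as the diff attribute for the file. Committing .gitattributes shares the mapping with other contributors, but each of them still has to run the git config step and install odt2txt.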




Answer 3:


Don't store .odt files in git. You can unzip them and store the contents, which are XML, instead. You might need to add newlines to the XML files because they are, IIRC, just XML one-liners.
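A rough sketch of that unpack step, assuming unzip and xmllint are installed; the function name unpack_odt and its arguments are made up for this example:

```shell
# Hypothetical helper: unpack an .odt into a directory and reindent its XML files.
# Usage: unpack_odt FILE.odt DEST_DIR
# Assumes `unzip` and `xmllint` are installed.
unpack_odt() {
    mkdir -p "$2"
    unzip -q -o "$1" -d "$2"
    # xmllint --format puts each element on its own line; write the result
    # to a temporary file first, then replace the original.
    find "$2" -name '*.xml' -exec sh -c '
        for f; do
            xmllint --format "$f" --output "$f.tmp" && mv "$f.tmp" "$f"
        done' sh {} +
}
```

After unpack_odt summary.odt summary_odt, the directory can be added to git like any other text tree. The document then has to be zipped again before LibreOffice can open it.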




Answer 4:


For the basics: to diff the text in any zipped-XML format, you can use xmllint to format the XML files and diff those. Say you've done

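# Write both versions of the document to temporary files
git show master:summary.odt >"${file1=$(mktemp)}"
git show feature:summary.odt >"${file2=$(mktemp)}"
# Unpack each archive (7z needs the directory attached to -o, with no space)
7z x -o"${extract1=$(mktemp -d)}" "$file1"
7z x -o"${extract2=$(mktemp -d)}" "$file2"
# Pretty-print every XML file next to the original
find "$extract1" "$extract2" -iname \*.xml -execdir xmllint --format {} -o {}.pretty \;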

and you can now diff the .pretty files to see what changed. Pack that up with the usual scaffolding and you've got yourself a basic diff driver. You can even replace the XML with the prettified XML, edit it, and repack it; it all works.
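One way that scaffolding might look is a small converter that prints the prettified document body, which is the shape git expects from a textconv command. This is a sketch, not the answer's own driver: it assumes unzip and xmllint are installed, reads only content.xml (where ODF keeps the document body), and the function name is made up:

```shell
# Hypothetical helper: print the main XML of an .odt file, one element per line.
# Assumes `unzip` and `xmllint` are installed.
odt_to_pretty_xml() {
    # unzip -p streams the named archive member to stdout;
    # xmllint --format - reindents the XML it reads from stdin.
    unzip -p "$1" content.xml | xmllint --format -
}
```

To use it as a diff driver, the same pipeline would go into an executable script on your PATH, and diff.odt.textconv would point at that script, since git calls the textconv command with the file path as its only argument.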



Source: https://stackoverflow.com/questions/52873519/how-can-i-get-useful-git-diff-of-files-saved-by-libre-office-writer-with-output
