compression

Spark: writing DataFrame as compressed JSON

南笙酒味 submitted on 2019-11-30 02:00:16
Apache Spark's DataFrameReader.json() can handle gzipped JSON Lines files automatically, but there doesn't seem to be a way to get DataFrameWriter.json() to write compressed JSON Lines files. The extra network I/O is very expensive in the cloud. Is there a way around this problem? giorgioca The following solutions use PySpark, but I assume the code in Scala would be similar. The first option is to set the following when you initialise your SparkConf: conf = SparkConf() conf.set("spark.hadoop.mapred.output.compress", "true") conf.set("spark.hadoop.mapred.output.compression.codec", "org.apache.hadoop

File compression before upload on the client-side

Deadly submitted on 2019-11-30 01:28:04
Basically, I'll be working with large XML files (approx. 20-50 MB). These files need to be uploaded to a server. I know it isn't possible to touch the files with JavaScript, nor to implement HTTP compression on the client side. My question is whether any solution exists (Flash / ActionScript) that compresses a file and has a JavaScript API. The scenario is this: (1) try to upload a 50 MB XML file; (2) before upload, grab it with JavaScript and send it to the compressor; (3) upload the compressed file instead of the original one. Flash's built-in implementation of ByteArray has a method ( ByteArray:

How to tell if a file is gzip compressed?

自闭症网瘾萝莉.ら submitted on 2019-11-30 00:28:21
Question: I have a Python program which is going to take text files as input. However, some of these files may be gzip compressed. Is there a cross-platform way, usable from Python, to determine whether a file is gzip compressed? Is the following reliable, or could an ordinary text file 'accidentally' look gzip-like enough for me to get false positives? try: gzip.GzipFile(filename, 'r') # compressed # ... except: # not compressed # ... Answer 1: The magic number for gzip compressed files is 1f 8b . Although
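
The magic-number check mentioned in the answer can be sketched in Python roughly as follows. This is a minimal illustration, not the answerer's code: the helper names `looks_gzipped` and `open_text` are hypothetical, and the check only confirms that the file starts with the bytes 1f 8b, not that the rest of the file is a valid gzip stream.

```python
import gzip

# The first two bytes of a gzip file, as stated in the answer (1f 8b).
GZIP_MAGIC = b"\x1f\x8b"


def looks_gzipped(path):
    """Return True if the file at `path` starts with the gzip magic number.

    Hypothetical helper: it inspects only the first two bytes, so a
    corrupted gzip file would still be reported as gzipped.
    """
    with open(path, "rb") as fh:
        return fh.read(2) == GZIP_MAGIC


def open_text(path, encoding="utf-8"):
    """Open a plain or gzip-compressed text file for reading as text."""
    if looks_gzipped(path):
        return gzip.open(path, "rt", encoding=encoding)
    return open(path, "r", encoding=encoding)
```

Reading the two leading bytes avoids the bare `except:` in the question, which would also swallow unrelated errors such as a missing file.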

Compressing content with PHP ob_start() vs Apache Deflate/Gzip?

北城以北 submitted on 2019-11-29 23:01:36
Most sites want to compress their content to save on bandwidth. However, when it comes to Apache servers running PHP, there are two ways to do it: with PHP or with Apache. So which one is faster or easier on your server? For example, in PHP I run the following function at the start of my pages to enable it: /** * Gzip compress page output * Original function came from wordpress.org */ function gzip_compression() { //If no encoding was given - then it must not be able to accept gzip pages if( empty($_SERVER['HTTP_ACCEPT_ENCODING']) ) { return false; } //If zlib is not ALREADY compressing the
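
The first check in the PHP function above (no Accept-Encoding header means the client cannot take gzip) can be illustrated in Python. This is a rough sketch of application-level compression in general, not a port of the WordPress function: `accepts_gzip` and `maybe_gzip` are hypothetical names, and the header parsing is simplified (it ignores q-values such as `gzip;q=0`).

```python
import gzip


def accepts_gzip(accept_encoding):
    """Return True if an Accept-Encoding header value lists gzip.

    Simplified on purpose: parameters after ';' are dropped, so q-values
    are not honoured.
    """
    if not accept_encoding:
        # No header given: assume the client cannot accept gzip.
        return False
    codings = [part.split(";")[0].strip().lower()
               for part in accept_encoding.split(",")]
    return "gzip" in codings


def maybe_gzip(body, accept_encoding):
    """Return (body, extra_headers), gzipping the body only when allowed."""
    if not accepts_gzip(accept_encoding):
        return body, {}
    headers = {"Content-Encoding": "gzip", "Vary": "Accept-Encoding"}
    return gzip.compress(body), headers
```

For example, `maybe_gzip(page_bytes, "gzip, deflate")` returns the compressed body together with the response headers to add, while a missing header leaves the body untouched.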

Implementing in-memory compression for objects in Java

╄→尐↘猪︶ㄣ submitted on 2019-11-29 22:55:34
We have a use case where we would like to compress and store objects (in memory) and decompress them as and when required. The data we want to compress is quite varied, from float vectors to strings to dates. Can someone suggest a good compression technique for this? We are looking at ease of compression and speed of decompression as the most important factors. Thanks. If you want to compress instances of MyObject you could have it implement Serializable and then stream the objects into a compressed byte array, like so: ByteArrayOutputStream baos = new ByteArrayOutputStream();
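
The serialize-then-compress idea from the answer can be shown with a Python analogue. This is not the Java code from the answer (which is cut off above): it swaps in `pickle` for Java serialization and `zlib` for the compressed stream, and the names `pack` and `unpack` are hypothetical.

```python
import pickle
import zlib


def pack(obj, level=1):
    """Serialize `obj` and compress the bytes in memory.

    A low zlib level is assumed here to favour speed; that default is an
    illustrative choice, not a measured recommendation.
    """
    raw = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
    return zlib.compress(raw, level)


def unpack(blob):
    """Decompress and deserialize a blob produced by pack().

    pickle must only be used on data you created yourself; never unpickle
    untrusted input.
    """
    return pickle.loads(zlib.decompress(blob))
```

A mixed record, for instance a dict holding a float vector, a string and a date, round-trips through `unpack(pack(record))` unchanged.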

javascript string compression with localStorage

允我心安 submitted on 2019-11-29 21:44:38
I am using localStorage in a project, and it will need to store lots of data, mostly of type int, bool and string. I know that JavaScript strings are Unicode, but when stored in localStorage , do they stay Unicode? If so, is there a way I could compress the string to use all of the data in a Unicode byte, or should I just use base64 and have less compression? All of the data will be stored as one large string. EDIT: Now that I think about it, base64 wouldn't do much compression at all; the data is already in base 64: a-zA-Z0-9 ;: is 65 characters. Oren Trutner "when stored in localStorage, do

GZipStream And DeflateStream will not decompress all bytes

本秂侑毒 submitted on 2019-11-29 21:20:46
I needed a way to compress images in .NET, so I looked into using the .NET GZipStream class (or DeflateStream). However, I found that decompression was not always successful: sometimes the images would decompress fine, and other times I would get a GDI+ error saying that something was corrupted. After investigating the issue, I found that the decompression was not giving back all the bytes it had compressed. So if I compressed 2257974 bytes, I would sometimes get back only 2257870 bytes (real numbers). The funniest thing is that sometimes it would work. So I created this little test method that

What is the easiest way to add compression to WCF in Silverlight?

拈花ヽ惹草 submitted on 2019-11-29 20:59:58
I have a Silverlight 2 Beta 2 application that accesses a WCF web service. Because of this, it currently can only use basicHttp binding. The web service will return fairly large amounts of XML data. This seems fairly wasteful from a bandwidth usage standpoint, as the response, if zipped, would be smaller by a factor of 5 (I actually pasted the response into a txt file and zipped it). The request does have the "Accept-Encoding: gzip, deflate" header. Is there any way to have the WCF service gzip (or otherwise compress) the response? I did find this link but it sure seems a bit complex for functionality

Data Compression Algorithms

女生的网名这么多〃 submitted on 2019-11-29 20:17:58
I was wondering if anyone has a list of data compression algorithms. I know basically nothing about data compression and I was hoping to learn more about different algorithms and see which ones are the newest and have yet to be developed on a lot of ASICs. I'm hoping to implement a data compression ASIC which is independent of the type of data coming in (audio, video, images, etc.). If my question is too open-ended, please let me know and I'll revise. Thank you. There are a ton of compression algorithms out there. What you need here is a lossless compression algorithm. A lossless compression
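
To make "lossless" and "independent of the type of data" concrete, here is a small Python sketch that runs the same bytes through three general-purpose lossless codecs from the standard library and checks that each round trip returns the input exactly. The selection of codecs is only what Python happens to ship, not the list the answer goes on to give, and `CODECS` and `compare_codecs` are hypothetical names.

```python
import bz2
import lzma
import zlib

# Three lossless codecs available in the Python standard library.
CODECS = {
    "zlib (DEFLATE)": (zlib.compress, zlib.decompress),
    "bz2 (bzip2)": (bz2.compress, bz2.decompress),
    "lzma (LZMA/XZ)": (lzma.compress, lzma.decompress),
}


def compare_codecs(data):
    """Return {codec name: compressed size in bytes} for `data`.

    Each codec is also checked for losslessness: decompressing must give
    back exactly the original bytes.
    """
    sizes = {}
    for name, (compress, decompress) in CODECS.items():
        packed = compress(data)
        if decompress(packed) != data:
            raise ValueError(name + " did not round-trip the input")
        sizes[name] = len(packed)
    return sizes
```

Because the codecs see only bytes, the same function works on any payload, although the achieved sizes depend heavily on how much redundancy the input contains.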

How can I automatically compress and minimize JavaScript files in an ASP.NET MVC app?

五迷三道 submitted on 2019-11-29 18:49:35
So I have an ASP.NET MVC app that references a number of JavaScript files in various places (in the site master and additional references in several views as well). I'd like to know if there is an automated way to compress and minimize such references into a single .js file where possible. Such that this ... <script src="<%= ResolveUrl("~") %>Content/ExtJS/Ext.ux.grid.GridSummary/Ext.ux.grid.GridSummary.js" type="text/javascript"></script> <script src="<%= ResolveUrl("~") %>Content/ExtJS/ext.ux.rating/ext.ux.ratingplugin.js" type="text/javascript"></script> <script src="<%= ResolveUrl("~