compression

How to use CompressionCodec in Hadoop

Submitted by 倾然丶 夕夏残阳落幕 on 2019-12-06 10:44:19
I am doing the following to compress the output files from my reducer:

    OutputStream out = ipFs.create( new Path( opDir + "/" + fileName ) );
    CompressionCodec codec = new GzipCodec();
    OutputStream cs = codec.createOutputStream( out );
    BufferedWriter cout = new BufferedWriter( new OutputStreamWriter( cs ) );
    cout.write( ... )

But I got a null pointer exception in line 3:

    java.lang.NullPointerException
        at org.apache.hadoop.io.compress.zlib.ZlibFactory.isNativeZlibLoaded(ZlibFactory.java:63)
        at org.apache.hadoop.io.compress.GzipCodec.createOutputStream(GzipCodec.java:92)
        at myFile$myReduce.reduce(myFile
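The NullPointerException at ZlibFactory.isNativeZlibLoaded is the classic symptom of constructing the codec with `new GzipCodec()`: the codec never receives a Hadoop Configuration. Codecs are normally obtained via ReflectionUtils.newInstance(GzipCodec.class, conf) or a CompressionCodecFactory. As a JDK-only sketch of the intended stream layering (GzipWrite and writeGzip are illustrative names; no Hadoop classes are used):

```java
import java.io.*;
import java.nio.file.*;
import java.util.zip.GZIPOutputStream;

public class GzipWrite {
    // Wrap the raw output stream in a gzip stream, then in a writer —
    // the same layering the reducer code attempts with Hadoop's codec.
    static void writeGzip(Path out, String text) throws IOException {
        try (Writer w = new BufferedWriter(new OutputStreamWriter(
                new GZIPOutputStream(Files.newOutputStream(out)), "UTF-8"))) {
            w.write(text);
        }
    }
}
```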

Android - reading a text file from Assets seems to include a LOT of junk before/after the actual data?

Submitted by 一世执手 on 2019-12-06 09:24:40
I package a text file with my Android app (in Assets) which I read within the app itself. To avoid the file being compressed, it's named 'mytestfile.mp3', and until recently that worked just fine. With one of the recent SDK/ADT changes, something 'odd' seems to happen when reading from Assets, and I'm open to ideas as to what it is... I use code something like this:

    AssetFileDescriptor descriptor = getAssets().openFd("mytextfile.mp3");
    BufferedReader f = new BufferedReader(new FileReader(descriptor.getFileDescriptor()));
    String line = f.readLine();
    while (line != null) {
        // do stuff
        Log.d(
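A likely explanation: descriptor.getFileDescriptor() is a descriptor for the whole APK file, and the asset is only a slice of it starting at descriptor.getStartOffset() and running for descriptor.getLength() bytes. A FileReader built on that descriptor starts at byte 0 and happily reads the surrounding archive, which is the "junk". The usual fix is getAssets().open("mytextfile.mp3"), which returns a correctly bounded InputStream. A JDK-only sketch of slice-bounded reading (readSlice and its arguments are illustrative, not Android API):

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Path;

public class AssetSlice {
    // Read only [offset, offset + length) of a file, the way AssetManager
    // bounds an uncompressed asset inside the APK.
    static String readSlice(Path file, long offset, int length) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            raf.seek(offset);
            byte[] buf = new byte[length];
            raf.readFully(buf);
            return new String(buf, "UTF-8");
        }
    }
}
```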

Compress Image before Saving to disk in Java

Submitted by 安稳与你 on 2019-12-06 09:21:27
Is it possible to compress an image before saving it? I'm using the Robot class to capture images, and it returns a BufferedImage. How can I compress this image and then save it?

.png files are (losslessly) compressed images. You can use ImageIO.write() to save a .png image:

    ImageIO.write(myBufferedImage, "png", outputfile);

There is colour compression ("compression quality") and there is resolution compression (resizing). E.g. I got a 4 MB photo down to 270 KB using a very low compression quality, but it looked awful; I got it down to 12 KB using a reasonable quality but a smaller size. My
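For the lossy "compression quality" route, javax.imageio exposes JPEG quality through ImageWriteParam rather than plain ImageIO.write(). A sketch (toJpeg is an illustrative helper; quality runs from 0.0, smallest, to 1.0, best):

```java
import javax.imageio.*;
import javax.imageio.stream.MemoryCacheImageOutputStream;
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

public class JpegQuality {
    // Encode a BufferedImage as JPEG at an explicit compression quality.
    static byte[] toJpeg(BufferedImage img, float quality) throws IOException {
        ImageWriter writer = ImageIO.getImageWritersByFormatName("jpg").next();
        ImageWriteParam p = writer.getDefaultWriteParam();
        p.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
        p.setCompressionQuality(quality);
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        writer.setOutput(new MemoryCacheImageOutputStream(bos));
        writer.write(null, new IIOImage(img, null, null), p);
        writer.dispose();
        return bos.toByteArray();
    }
}
```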

Is there a fast and non-fancy C# code/algorithm to compress a string of comma separated digits close to maximum info density?

Submitted by 亡梦爱人 on 2019-12-06 08:09:13
In a nutshell, I programmed myself into a corner by creating a CLR aggregate that performs row id concatenation, so I say:

    select SumKeys(id), name from SomeTable where name='multiple rows named this'

and I get something like:

    SumKeys  name
    -------- ---------
    1,4,495  multiple rows named this

But it dies when SumKeys gets > 8000 chars and I don't think I can do anything about it. As a quick fix (it's only failing 1% of the time for my application) I thought I might compress the string down and I thought some of you bright people out there might know a slick way to do this. Something like base64
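One hedged idea, sketched in Java since the trick is language-neutral: the string only ever uses the 11 symbols 0-9 and ',', so each character fits in 4 bits, and simple nibble packing halves the length before any further encoding. (True maximum density would mean delta-encoding the sorted ids or arithmetic coding; this is just the cheapest win. NibblePack is an illustrative name.)

```java
public class NibblePack {
    private static final String ALPHABET = "0123456789,";

    // Pack each character of the 11-symbol alphabet into one 4-bit nibble,
    // two per byte; nibble value 0 marks padding in the final byte.
    static byte[] pack(String s) {
        byte[] out = new byte[(s.length() + 1) / 2];
        for (int i = 0; i < s.length(); i++) {
            int code = ALPHABET.indexOf(s.charAt(i)) + 1;  // 1..11
            out[i / 2] |= code << ((i % 2 == 0) ? 4 : 0);
        }
        return out;
    }

    // Reverse: read high nibble then low nibble, stopping at padding.
    static String unpack(byte[] b) {
        StringBuilder sb = new StringBuilder();
        for (byte x : b) {
            for (int shift : new int[]{4, 0}) {
                int code = (x >> shift) & 0xF;
                if (code == 0) break;
                sb.append(ALPHABET.charAt(code - 1));
            }
        }
        return sb.toString();
    }
}
```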

Noise image compression

Submitted by 断了今生、忘了曾经 on 2019-12-06 08:06:31
I have a "noise" image... what is the best way to compress this image? It can be slightly lossy. Techniques based on the DCT and wavelets are bad for this sort of problem. My idea was to generate some recreatable noise and then store only the differences... but I can't find any solution for recreatable noise images. Image example:

Well, it's almost the definition of noise that it is not compressible. This statement applies to "real noise", where there is no known correlation between the perceived output and any applicable rule. So if your image is just that, or built to look like it is, then sorry, it's not
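The "recreatable noise" idea only works when the noise was produced deterministically in the first place: then a seed plus per-pixel residuals is enough. A minimal Java illustration of that determinism using java.util.Random (for captured real-world noise no such seed exists, which is the answer's point):

```java
import java.util.Random;

public class SeededNoise {
    // Regenerate a noise buffer from a seed; identical seeds yield identical
    // bytes, so only the seed (plus any residuals) needs to be stored.
    static byte[] noise(long seed, int n) {
        byte[] b = new byte[n];
        new Random(seed).nextBytes(b);
        return b;
    }
}
```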

Comparing different compression methods for JSON data in Python 3

Submitted by 情到浓时终转凉″ on 2019-12-06 07:58:30
Question: So, I want to compress my JSON data using different compressors. I used this to compress the JSON:

    import gzip
    import json

    with gzip.GzipFile('2.json', 'r') as isfile:
        for line in isfile:
            obj = json.loads(line)

which raises an error:

    raise OSError('Not a gzipped file (%r)' % magic)
    OSError: Not a gzipped file (b'[\n')

I also tried compressing directly using:

    zlib_data = zlib.compress(data)

which raises an error:

    return lz4.block.compress(*args, **kwargs)
    TypeError: a bytes-like object is required,
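Both tracebacks point at direction and type mix-ups rather than the compressors themselves: gzip.GzipFile(..., 'r') expects an already-gzipped file (the b'[\n' magic shows 2.json is plain JSON), and zlib.compress / lz4.block.compress want bytes, not str. A minimal round-trip sketch (the sample records stand in for the asker's data):

```python
import gzip
import json
import zlib

# Assumed sample data standing in for the contents of 2.json.
records = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
raw = json.dumps(records).encode("utf-8")   # gzip/zlib need bytes, not str

gz = gzip.compress(raw)                     # compress plain JSON first...
zl = zlib.compress(raw)

# ...then GzipFile/decompress can read it back.
assert json.loads(gzip.decompress(gz).decode("utf-8")) == records
assert json.loads(zlib.decompress(zl).decode("utf-8")) == records
```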

Decompress all Gzip files in a Hadoop hdfs directory

Submitted by ≯℡__Kan透↙ on 2019-12-06 07:42:32
Question: On my HDFS, I have a bunch of gzip files that I want to decompress to a normal format. Is there an API for doing this? Or how could I write a function to do this? I don't want to use any command-line tools; instead, I want to accomplish this task by writing Java code.

Answer 1: You need a CompressionCodec to decompress the file. The implementation for gzip is GzipCodec. You get a CompressionInputStream via the codec and write out the result with simple IO. Something like this: say you have a file file.gz
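With Hadoop classes the pattern is roughly CompressionCodecFactory → codec.createInputStream(fs.open(path)) → IOUtils.copyBytes(...). Since that needs a cluster to run, here is the same copy loop sketched with only the JDK against local files (Gunzip is an illustrative name):

```java
import java.io.*;
import java.nio.file.*;
import java.util.zip.GZIPInputStream;

public class Gunzip {
    // Stream-decompress src (a .gz file) to dst with a fixed-size copy loop,
    // mirroring what IOUtils.copyBytes does in the Hadoop version.
    static void gunzip(Path src, Path dst) throws IOException {
        try (InputStream in = new GZIPInputStream(Files.newInputStream(src));
             OutputStream out = Files.newOutputStream(dst)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) > 0) {
                out.write(buf, 0, n);
            }
        }
    }
}
```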

Huffman compression algorithm

Submitted by 一个人想着一个人 on 2019-12-06 07:42:16
Question: I've implemented file compression using Huffman's algorithm, but the problem I have is that, to enable decompression of the compressed file, the coding tree used, or the codes themselves, must be written to the file too. The question is: how do I do that? What is the best way to write the coding tree at the beginning of the compressed file?

Answer 1: There's a pretty standard implementation of Huffman coding in the Basic Compression Library (BCL), including a recursive function that writes the tree
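A common way to store the tree is a pre-order traversal: emit a 1 followed by the symbol for each leaf, and a 0 for each internal node; the decoder rebuilds the tree by consuming the stream in the same order. A sketch (bits shown as chars for clarity — a real file would pack them; canonical Huffman codes would avoid storing a tree at all):

```java
import java.util.Iterator;

public class HuffTree {
    // Minimal Huffman node: a leaf holds a symbol, an internal node two children.
    static final class Node {
        final Character sym; final Node left, right;
        Node(char s) { sym = s; left = right = null; }
        Node(Node l, Node r) { sym = null; left = l; right = r; }
    }

    // Pre-order serialization: '1' + symbol for a leaf, '0' for an internal node.
    static String serialize(Node n) {
        if (n.sym != null) return "1" + n.sym;
        return "0" + serialize(n.left) + serialize(n.right);
    }

    // Rebuild the tree by consuming the stream in the same pre-order.
    static Node deserialize(Iterator<Character> it) {
        if (it.next() == '1') return new Node(it.next());
        Node l = deserialize(it);
        Node r = deserialize(it);
        return new Node(l, r);
    }
}
```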

What is a suitable buffer size for uncompressing large gzip files with gzopen in PHP?

Submitted by 坚强是说给别人听的谎言 on 2019-12-06 07:38:33
    function uncompress($srcName, $dstName) {
        $sfp = gzopen($srcName, "rb");
        $dstName = str_replace('.gz', '', $dstName);
        $fp = fopen($dstName, "w");
        fseek($FileOpen, -4, SEEK_END);
        $buf = fread($FileOpen, 4);
        $GZFileSize = end(unpack("V", $buf));
        while ($string = gzread($sfp, $GZFileSize)) {
            fwrite($fp, $string, strlen($string));
        }
        gzclose($sfp);
        fclose($fp);
    }

I use this code for uncompressing, but it does not work and I get the following error:

    Internal Server Error
    The server encountered an internal error or misconfiguration and was unable to complete your request. Please contact the server
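Two things stand out before buffer size even matters: $FileOpen is never defined (the fseek/fread footer trick needs its own handle to the raw file), and passing the full uncompressed size to gzread makes the script hold the whole file in memory. The usual answer to the title question is a fixed chunk of a few KiB — 4 to 64 KiB is typical and keeps memory flat regardless of file size. The same loop, sketched in Python for brevity (the chunk size is an assumption):

```python
import gzip
import shutil

# Stream-decompress src to dst in fixed 64 KiB chunks so memory use stays
# constant no matter how large the uncompressed data is.
def uncompress(src, dst, chunk=64 * 1024):
    with gzip.open(src, "rb") as sfp, open(dst, "wb") as fp:
        shutil.copyfileobj(sfp, fp, chunk)
```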

How to write Huffman coding to a file using Python?

Submitted by 半世苍凉 on 2019-12-06 07:21:56
I created a Python script to compress text using the Huffman algorithm. Say I have the following string:

    string = 'The quick brown fox jumps over the lazy dog'

Running my algorithm returns the following 'bits':

    result = '01111100111010101111010011111010000000011000111000010111110111110010100110010011010100101111100011110001000110101100111101000010101101110110111000111010101110010111111110011000101101000110111000'

Comparing the number of bits in the result with the input string, the algorithm seems to work:

    >>> print len(result), len(string) * 8
    194 344

But now comes the question: how do
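The usual next step is to pack the '0'/'1' characters into real bytes before writing (writing the string as text would cost 8× the space), padding the last byte and storing the pad count so decoding knows where the bits end. A minimal sketch (Python 3, whereas the question's print is Python 2; pack_bits/unpack_bits are illustrative names):

```python
def pack_bits(bits):
    # Pad to a byte boundary and remember how many pad bits were added.
    pad = (8 - len(bits) % 8) % 8
    bits += "0" * pad
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return pad, data

def unpack_bits(pad, data):
    # Rebuild the bit string and drop the padding from the end.
    bits = "".join(format(b, "08b") for b in data)
    return bits[:len(bits) - pad] if pad else bits
```

The pad count (0-7) fits in a single header byte, so a complete file format can be as simple as: one pad byte, the serialized tree, then the packed data.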