huffman-code

Jpeg restart markers

匆匆过客 submitted on 2019-12-04 10:40:31
Question: I wrote a JPEG decoder, but I didn't implement restart marker logic. That is why my program doesn't work on some images (for example, images saved with Photoshop: File->Save As->jpeg). I want to implement restart marker logic, but I can't find a detailed explanation online of how it works. Can anyone tell me more about restart markers, or suggest an online resource where I can read more about them? Thanks! Answer 1: Restart markers are quite simple. They were designed to allow

How to create Huffman tree from FFC4 (DHT) header in jpeg file?

六眼飞鱼酱① submitted on 2019-12-04 07:38:58
I thought I could work this one out myself, but I don't seem to be moving forward at all. OK, the background: I need to create a Huffman tree of codes from the information provided by the FFC4 DHT (Define Huffman Table) header in a JPEG file. The DHT header defines the Huffman table in this way: 1) A series of 16 bytes. Each byte defines how many symbols have a Huffman code of n bits, where n is the position of the byte in the series. (Did that make any sense?!) For example, the raw data in hex is: 00 01 05 01 01 01 ... 00. This means that: Num of bits: 1 2 3 4 5 6 7 ... 16 Num of codes
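A minimal sketch of how the 16 count bytes can be turned into codes, assuming the canonical assignment that JPEG uses: codes are handed out in order of increasing length, the code value is incremented after each symbol, and a zero bit is appended whenever the length grows. This is written in Python for brevity and is not the asker's code. The function name `build_huffman_codes` and the symbol values are hypothetical, and the bytes elided by "..." in the example are assumed to be zero purely for illustration.

```python
def build_huffman_codes(counts, symbols):
    """Assign canonical Huffman codes from the two parts of a DHT table.

    counts  -- 16 integers; counts[i] is the number of codes of length i + 1
    symbols -- the symbol bytes that follow, ordered by increasing code length
    """
    if len(counts) != 16:
        raise ValueError("a DHT table has exactly 16 length counts")
    if sum(counts) != len(symbols):
        raise ValueError("symbol count does not match the length counts")

    codes = {}
    code = 0
    index = 0
    for length, count in enumerate(counts, start=1):
        for _ in range(count):
            # Store the code as a bit string padded to its length.
            codes[symbols[index]] = format(code, "0{}b".format(length))
            code += 1
            index += 1
        code <<= 1  # moving to the next length appends a zero bit
    return codes


# Counts from the question's example (00 01 05 01 01 01), elided bytes assumed zero.
counts = [0, 1, 5, 1, 1, 1] + [0] * 10
# Hypothetical symbol values; the real ones follow the 16 count bytes in the file.
symbols = [0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08]

codes = build_huffman_codes(counts, symbols)
for symbol, bits in codes.items():
    print("0x{:02X} -> {}".format(symbol, bits))
```

A tree, if one is needed, can be built by walking each bit string from the root and creating child nodes on demand; many decoders skip the tree and use lookup tables keyed on the code instead.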

Optimized order of HTML attributes for compression

家住魔仙堡 submitted on 2019-12-04 02:01:55
I read somewhere that organizing HTML attributes in a certain order can improve the compression ratio of the HTML document. (I think I read this in a Google or Yahoo recommendation for faster sites.) If I recall correctly, the recommendation was to put the most common attributes first (e.g. id, etc.) and then put the rest in alphabetical order. I'm a bit confused by this. For example, if id attributes were put right after every p tag, the id would contain unique values. Thus, the duplicated string would be limited to this: <p id=" (say there were <p id="1"> and <p id="2"/> ). Because the value

Is it possible to achieve Huffman decoding in GPU?

☆樱花仙子☆ submitted on 2019-12-04 01:10:53
We have a database encoded with Huffman coding. The aim is to copy it to the GPU along with its associated decoder, then decode the database on the GPU and work on the decoded data without copying it back to the CPU. I am far from being a Huffman specialist, but the little I know suggests that it is an algorithm essentially based on control structures. With the basic algorithm, I am afraid there will be a lot of serialized operations. My two questions are: do you know of any efficient GPU version of Huffman coding? If not, do you think there exists a Huffman algorithm

Fast search in compressed text files

梦想与她 submitted on 2019-12-03 13:27:55
Question: I need to be able to search for text in a large number of zipped files (.txt). The compression may be changed to something else, or even become proprietary. I want to avoid unpacking all the files; instead I would compress (encode) the search string and search within the compressed files. This should be possible using Huffman compression with the same codebook for all files. I don't want to reinvent the wheel, so does anyone know a library that does something like this, or a Huffman algorithm that is implemented and

Converting a String representation of bits to a byte

自古美人都是妖i submitted on 2019-12-03 12:29:11
I'm just beginning to learn about file compression and I've run into a bit of a roadblock. I have an application that will encode a string such as "program" as a compressed binary representation "010100111111011000" (note this is still stored as a String). Encoding: g 111, r 10, a 110, p 010, o 011, m 00. Now I need to write this to the file system using a FileOutputStream. The problem I'm having is: how can I convert the string "010100111111011000" to a byte[] / bytes to be written to the file system with FileOutputStream? I've never worked with bits/bytes before, so I'm kind of at a dead end here
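One way to approach the conversion, sketched in Python rather than the asker's Java: cut the bit string into groups of eight, pad the last group with zero bits, and parse each group as a base-2 integer. The helper name `bits_to_bytes` is hypothetical, and the choice to pad with zeros on the right and return the pad count is an assumption; a decoder needs that count (or the original bit length) to know where the real data ends.

```python
def bits_to_bytes(bit_string):
    """Pack a string of '0'/'1' characters into bytes, most significant bit first.

    Returns (data, pad_bits), where pad_bits is the number of zero bits
    appended to complete the last byte.
    """
    if any(ch not in "01" for ch in bit_string):
        raise ValueError("expected only '0' and '1' characters")
    pad_bits = (8 - len(bit_string) % 8) % 8
    padded = bit_string + "0" * pad_bits
    # Parse each 8-character group as a binary number in the range 0..255.
    data = bytes(int(padded[i:i + 8], 2) for i in range(0, len(padded), 8))
    return data, pad_bits


data, pad_bits = bits_to_bytes("010100111111011000")
print(data.hex(), pad_bits)  # prints: 53f600 6
```

The 18-bit string from the question becomes three bytes instead of eighteen characters. In Java the same grouping works with `Integer.parseInt(group, 2)` cast to `byte`.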

Writing files in bit form to a file in C

最后都变了- submitted on 2019-12-03 12:10:06
I am implementing the Huffman algorithm in C. I have the basic functionality working up to the point where the binary codewords are obtained, so for example abcd will be 100011000 or something similar. Now the question is: how do you write this code in binary form to the compressed file? If I write it normally, each 1 and 0 will be one character, so there is no compression. I need to write those 1s and 0s in their bit form. Is that possible in C? If so, how? Collect bits until you have enough to fill a byte, and then write it. E.g. something like this: int current_bit = 0; unsigned
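The answer's code is cut off in this excerpt, so the following is only a sketch of the "collect bits until a byte is full" idea it describes, written in Python and not a reconstruction of the original C. The class name `BitWriter` and its methods are hypothetical. The same shift-and-OR logic carries over to C with an `unsigned char` accumulator and `fputc`.

```python
import io


class BitWriter:
    """Accumulate single bits and write a byte each time eight have been collected."""

    def __init__(self, stream):
        self.stream = stream     # any binary file-like object
        self.current_byte = 0    # bits collected so far
        self.bit_count = 0       # how many bits current_byte holds

    def write_bit(self, bit):
        # Shift earlier bits left and place the new bit in the lowest position.
        self.current_byte = (self.current_byte << 1) | (bit & 1)
        self.bit_count += 1
        if self.bit_count == 8:
            self.stream.write(bytes([self.current_byte]))
            self.current_byte = 0
            self.bit_count = 0

    def flush(self):
        """Pad the last partial byte with zero bits and write it."""
        while self.bit_count != 0:
            self.write_bit(0)


out = io.BytesIO()
writer = BitWriter(out)
for ch in "100011000":  # the example codeword string from the question
    writer.write_bit(int(ch))
writer.flush()
print(out.getvalue().hex())  # prints: 8c00
```

Nine codeword bits end up in two bytes. Because the final byte is padded, the file also needs to record the number of valid bits or use an end-of-data symbol, otherwise the decoder may read the padding as extra codewords.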

Jpeg restart markers

有些话、适合烂在心里 submitted on 2019-12-03 08:06:22
I wrote a JPEG decoder, but I didn't implement restart marker logic. That is why my program doesn't work on some images (for example, images saved with Photoshop: File->Save As->jpeg). I want to implement restart marker logic, but I can't find a detailed explanation online of how it works. Can anyone tell me more about restart markers, or suggest an online resource where I can read more about them? Thanks! Restart markers are quite simple. They were designed to allow resynchronization after an error. Since most JPEG images are transmitted over error-free channels, they're
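As a rough illustration of the resynchronization idea, here is a minimal Python sketch (not the answerer's code) that splits entropy-coded scan data at restart markers. It assumes the standard layout: RST0 to RST7 are the byte pairs FF D0 to FF D7, they appear in cyclic order once per restart interval (the interval, in MCUs, comes from the DRI segment), and a literal FF data byte is always stuffed as FF 00, so FF Dn cannot occur inside the coded data. The function name `split_at_restart_markers` is hypothetical.

```python
def split_at_restart_markers(scan_data):
    """Split entropy-coded scan bytes into the segments between RSTn markers.

    Returns (segments, markers): the byte runs between markers, and the
    marker numbers 0..7 in the order they were found.
    """
    segments = []
    markers = []
    start = 0
    i = 0
    while i < len(scan_data) - 1:
        if scan_data[i] == 0xFF and 0xD0 <= scan_data[i + 1] <= 0xD7:
            segments.append(scan_data[start:i])
            markers.append(scan_data[i + 1] - 0xD0)
            i += 2          # skip the two marker bytes
            start = i
        else:
            i += 1          # ordinary data, including stuffed FF 00 pairs
    segments.append(scan_data[start:])
    return segments, markers


# Synthetic bytes for illustration only, not real JPEG data.
scan = bytes([0x12, 0x34, 0xFF, 0x00, 0x56, 0xFF, 0xD0, 0x78, 0xFF, 0xD1, 0x9A])
segments, markers = split_at_restart_markers(scan)
print([s.hex() for s in segments], markers)
```

A decoder would then handle each segment independently: start with an empty bit buffer, reset the DC predictors to zero, and decode one restart interval's worth of MCUs. Checking that the marker numbers count up modulo 8 is how a damaged or missing segment can be detected.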

What are the real-world applications of huffman coding?

家住魔仙堡 submitted on 2019-12-03 04:27:18
I am told that Huffman coding is used as a lossless data compression algorithm, but I am also told that real data compression software does not employ Huffman coding, because if the keys are not distributed decentralized enough, the compressed file could be even larger than the original file. This leaves me wondering: are there any real-world applications of Huffman coding? Huffman is widely used in all the mainstream compression formats that you might encounter, from GZIP, PKZIP (WinZip etc.) and BZIP2 to image formats such as JPEG and PNG. All compression schemes have pathological data sets that

Fast search in compressed text files

与世无争的帅哥 submitted on 2019-12-03 03:26:37
I need to be able to search for text in a large number of zipped files (.txt). The compression may be changed to something else, or even become proprietary. I want to avoid unpacking all the files; instead I would compress (encode) the search string and search within the compressed files. This should be possible using Huffman compression with the same codebook for all files. I don't want to reinvent the wheel, so does anyone know a library that does something like this, or a Huffman algorithm that is implemented and tested, or maybe have a better idea? Thanks in advance. Most text files are compressed with one of the LZ