• Post Reply Bookmark Topic Watch Topic
  • New Topic
programming forums Java Mobile Certification Databases Caching Books Engineering Micro Controllers OS Languages Paradigms IDEs Build Tools Frameworks Application Servers Open Source This Site Careers Other Pie Elite all forums
this forum made possible by our volunteer staff, including ...
Marshals:
  • Campbell Ritchie
  • Jeanne Boyarsky
  • Ron McLeod
  • Paul Clapham
  • Liutauras Vilda
Sheriffs:
  • paul wheaton
  • Rob Spoor
  • Devaka Cooray
Saloon Keepers:
  • Stephan van Hulst
  • Tim Holloway
  • Carey Brown
  • Frits Walraven
  • Tim Moores
Bartenders:
  • Mikalai Zaikin

I need some help with Huffman encoding...

 
Greenhorn
Posts: 3
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
OK, so one of my teachers from college just made me do a file compressing utility, using the Huffman encoding. I don't have much problem with the code tree, since I've been reading quite a bit about data structures in Java. The problem, however, is with the encoding process itself.

The thing is, that Huffman encoding creates code words of absolutely any length, from 1 bit to whatever the **** you get. But guess what? A computer can only handle full bytes. That's a big problem for me, 'cause I have absolutely no idea on how to handle these "partial bytes", and I don't think padding with zeroes would do, because that would make my program much, much less efficient. I've been looking on the Internets a way to read and handle "partial bytes" with Java, partly because I'm really good at Java, partly because I know absolutely nothing about C++, but so far I've found nothing at all.

Any idea on how can I keep my Huffman encoding efficient?
 
Rancher
Posts: 43081
77
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Welcome to JavaRanch.

Is performance really that important in this assignment? I would think that correctness is foremost.

One way around this might be the java.util.BitSet class, which lets you handle "bit strings" of any length.
 
author
Posts: 23951
142
jQuery Eclipse IDE Firefox Browser VI Editor C++ Chrome Java Linux Windows
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
You may want to ask your professor on what you are allowed to use.

If I remember correctly, way back in college, when I did this assignment, I was expected to use the bit operators to mask and shift the bits into place. I did it in C back then -- I wanted to do it in assembly (for my Commodor 64), but my professor insisted that it ran on the school's DEC system.

Anyway, Java has all of the bit operators from the C language, so you should check if this is what you are expected to use.

Henry
 
Rancher
Posts: 4803
7
Mac OS X VI Editor Linux
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Originally posted by Acoyani Garrido Sandoval:
But guess what? A computer can only handle full bytes. That's a big problem for me, 'cause I have absolutely no idea on how to handle these "partial bytes", and I don't think padding with zeroes would do, because that would make my program much, much less efficient.



As others have said, why worry about efficiency.

Sounds like a good assignment. One thing it will teach you is that its only modern computers driven from a particular set of engineering assumptions that even have a concept of a 'byte' which seems to be universally defined as 8 bits.

Some of the greatest old computers were word oriented, and could pick up and write any number of bits anywhere in the word (which was often 36 bits). So it is good to break out of the mindset that a byte is an important concept, and that all bytes contain 8 bits.

Most compression systems use heavy packing, not only Huffman, but LZW, and similar approaches.

You can always use shift and XOR/AND/OR operators to do anything.
 
With a little knowledge, a cast iron skillet is non-stick and lasts a lifetime.
reply
    Bookmark Topic Watch Topic
  • New Topic