This week's book giveaway is in the Cloud/Virtualizaton forum.We're giving away four copies of Mesos in Action and have Roger Ignazio on-line!See this thread for details.
Win a copy of Mesos in Action this week in the Cloud/Virtualizaton forum!

# Base64 Encoding

Rick Beaver
Ranch Hand
Posts: 464
Hi All

Right, hit my bit level limit - I am trying to write a base64 encoder to improve my understanding of bit level operations in Java.

I am clearly missing some fundamental understanding because I am having difficulty manipulating groups of bits.

So far I have a byte arrray containing my characters to be converted.

I am struggling with the following steps to get me going:

1. Place the first three bytes in to a 24 bit block. What structure can I use to do this? I guess I can use a BitSet somehow but that is a massive overhead for what I want to do.
2. Strip off 6 bits at a time (from most significant bit) of the 24 bit block - I do not understand how to represent the 6 bits.
3. Get the base 10 representation of the 6 bits - I guess this will be easier to understand once I have figured out how to get the 6 bits in to some kind of structure.

Can anyone give me a prod to get me started please?

Cheers

Rick

Jim Yingst
Wanderer
Sheriff
Posts: 18671
You can think of an int as a set of 32 bits. If you need a set of 24 or 6 bits, well you just don't use the higher-order bits. This allows you to perform operations on groups of bits all at once, e.g.

If you can figure out what the result of the above code is, you can use very similar operations to perform most of the steps you described above.

Ranch Hand
Posts: 886
6
Shifting (>> << and ANDing (&) allow you to get at various combinations of bits. You can load bytes into an int by ORing. For example to put three bytes into an int: init the int to 0, OR in the first byte, shift the int left 8 bits, OR in the second byte, shift the int left 8 bits and OR in the third byte.
A problem you may have is with bytes being signed. The effect of this is that the sign bit of the byte is propagated thru the h.o. bits of an int when the byte is expanded to an int. To get rid of those bits, use the AND as done in the last step in the above code.

Use lots of println() statements to see your intermediate results. Do the steps one at a time.
Integer.toHexString() with also help.

Rick Beaver
Ranch Hand
Posts: 464
Hi guys

OK - here is what I have so far

Does this do it?

If so now I have my bytes in the least significant 24 bits of my 32 bit int. I can ignore the most significant 8 bits that I am not using.

The next thing I need to do is get my 24 bits into four values consisting of 6 bits each.

So my non-bitwise brain is thinking of a clunky solution something like this:

Absolutely no laughing allowed - I know I suck at this low level stuff

Can anyone hint at how I can improve on my second block?

Thanks again for helping out a dummy.
[ September 23, 2005: Message edited by: Rick Beaver ]

Rick O'Shay
Ranch Hand
Posts: 531
I believe you are adding complexity needlessly if your goal is to understand bit manipulation. I was going to add "in java" but it's essentially identical in C, C++ and C#. No chuckling but this is a beginner topic in any language.

http://java.sun.com/docs/books/tutorial/java/nutsandbolts/bitwise.html

Nail down bitwise operations then you can do things like Base64 encoding or anything else that manipulates bits and bit fields.

Rick Beaver
Ranch Hand
Posts: 464
Hi Rick

Thanks for the reply. I understand the operators and what their purpose is. The part I am struggling with is visualisation which will enable me to better manipulate bits.
For example I know I can use a mask integer like:

00000000111111000000000000000000

and do an & on my int in my second code block to get first block - The thing that I struggle with is how to get that mask in the first place - apart from working out the base 10 value of that bit pattern and assigning an int that value.

I hope that it is clearer where my problem lies. Thanks.

Rick Beaver
Ranch Hand
Posts: 464
OK - using a mask this time I think I have a workable solution. Any comments would be appreciated... thanks again.

[ September 23, 2005: Message edited by: Rick Beaver ]

Jim Yingst
Wanderer
Sheriff
Posts: 18671
[Rick O'Shay]: No chuckling but this is a beginner topic in any language.

I think that bit manipulation has become much less commonly used over the years. It used to be a very standard thing for all programmers to know well; nowadays it's possible for programmers to spend most of their careers with little or no contact with bit manipulation. I think there's definitely value to understanding how it works, much like there's value in understanding how to perform long division by hand, even though calculators are common. But I'm not at all surprised that there are people who haven't spent much time with this topic, or who have forgotten some of the details of something they learned once and barely used since then.

Rick Beaver: Looking good. I haven't tested it at all; I leave that to you.

I see you're using >>> rather than >> now, which is good for this application. Earlier code had some problems there.

If you apply the bitMask after right shifting, you won't have to keep changing the bitMask. This would be slightly easier to understand, and also be slightly faster.

Note that programmers doing bit manipulation often use hexadeximal literals rather than decimal, as it's easier to convert that to binary (in your head, with a little practice) and thus it's easy to understand the underlying bit patterns. If I see 63 I have to think several seconds about what the bit pattern would be; if I see 0x3F it takes me less than a second.

Rick Beaver
Ranch Hand
Posts: 464
If you apply the bitMask after right shifting, you won't have to keep changing the bitMask. This would be slightly easier to understand, and also be slightly faster.

Hi Jim

Thanks for the feedback - very useful in helping me to understand this subject.

Regarding your comment in quotes, yes, I should have thought through the flow a little bit more so I have changed the order things happen at the start of the for loop so that the line hass gone away.

While working on this my understanding has improved a lot. So, thanks for all your help.

Rick Beaver
Ranch Hand
Posts: 464
Right - final solution

Rick O'Shay
Ranch Hand
Posts: 531
>> The thing that I struggle with is how to get that mask in the first place

Work with hex values so that you can picture the pattern in your head. If you work with bits for a short while you will be able to look at 0xFB and visualize it as a binary number without even thinking. It literally becomes autonomic. Just like the symbols "apple" immediately appear as something concrete in your mind's eye.

So to generate a pattern you just write it in hex and there you have it. You already know how to mask, shift, rotate and such.

Well, sadly there is no rotate but you can still rotate bits:

[ September 23, 2005: Message edited by: Rick O'Shay ]

Rick Beaver
Ranch Hand
Posts: 464
Thanks Rick

I am going to start learning the hex value of numbers. Hopefully that will be the piece that will help my understanding...

Thanks all

Tony Morris
Ranch Hand
Posts: 1608
Here is a snippet of a base64 encoder.

[ September 23, 2005: Message edited by: Tony Morris ]