Some Questions About IO

Simon John
Greenhorn

Joined: Jan 04, 2002
Posts: 20
Q: What is the difference between a binary stream and a character stream?
Q: When should we use a binary stream, and when a character stream?
* Any link (a good resource) that differentiates and visualizes the above concepts thoroughly.
Q: Why is character encoding used? Is there an example that shows the IMPORTANCE of character encoding?
* Any link that describes character encoding with examples.
Thanks in advance
Rob Ross
Bartender

Joined: Jan 07, 2002
Posts: 2205
You're likely to get more responses if you ask simple questions, or at least break up your many questions into different threads. I'll answer the first one for you, though.
What is the difference between a binary stream and a character stream?

A binary stream is interpreted as a raw sequence of bytes. Higher-level streams (like DataInputStream or DataOutputStream) build on top of this and read and write more than one byte at a time, for example when reading an int, which is composed of 4 bytes.
In the stream itself, the data is just one long sequence of bytes. I'm going to use hexadecimal notation because it's usually used to describe bytes.
But first a quick analogy: if I write this sequence of decimal characters:
1038475689328469283738493749983793
And tell you this represents a "stream" of decimal numbers, you also need to know where to start, and how many characters to read, to re-assemble a meaningful number. For example, I could now tell you that this stream represents a sequence of 3-digit decimal numbers, starting at the first character. Now you can impose some order on this stream and extract information:
103, 847, 568, etc. These are the numbers stored in this stream. Of course, if I tell you that this stream is composed of five-digit numbers, that creates an entirely different number sequence, doesn't it? 10384, 75689, ... etc.
So we have a similar situation with a binary stream. Instead of decimal characters, we have bytes, which represent 8 bits of binary information. Each byte can range in value from 0x00-0xFF. So if you have a binary stream that looks like this:
AC03FC248793FF2C...
To extract meaningful information you have to know how many bytes to read for a value. If you're just reading bytes, you can read one byte at a time: AC, 03, FC, 24, etc.
If you're reading ints though, each of those is made of 4 bytes, so your values would be: AC03FC24, 8793FF2C,... etc.
So that's binary streams in a nutshell.
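The byte-versus-int reading described above can be sketched in Java. This is only an illustration: the class name BinaryStreamDemo and its helper methods are made up for this example, and the data is the first eight bytes of the hex sequence above, held in memory rather than in a file.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class BinaryStreamDemo {
    // The first eight bytes of the example stream: AC 03 FC 24 87 93 FF 2C
    static final byte[] DATA = {
        (byte) 0xAC, 0x03, (byte) 0xFC, 0x24,
        (byte) 0x87, (byte) 0x93, (byte) 0xFF, 0x2C
    };

    // Read one byte at a time and show each as two hex digits.
    static String readAsBytes(byte[] data) {
        StringBuilder sb = new StringBuilder();
        try (InputStream in = new ByteArrayInputStream(data)) {
            int b;
            while ((b = in.read()) != -1) {
                if (sb.length() > 0) sb.append(", ");
                sb.append(String.format("%02X", b));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    // Read four bytes at a time as big-endian ints, the way DataInputStream does.
    static String readAsInts(byte[] data) {
        StringBuilder sb = new StringBuilder();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            for (int i = 0; i + 4 <= data.length; i += 4) {
                if (sb.length() > 0) sb.append(", ");
                sb.append(String.format("%08X", in.readInt()));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(readAsBytes(DATA)); // AC, 03, FC, 24, 87, 93, FF, 2C
        System.out.println(readAsInts(DATA));  // AC03FC24, 8793FF2C
    }
}
```

The same eight bytes produce eight values or two values depending only on how many bytes the reader takes per value.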
A character stream is a very different type of stream. You are no longer storing binary representations of numeric values; you are storing codes, each of which maps to a particular character in some character set. First, you have to know what character set to map to. For example, I could create a simple encoding scheme that looks like this:
A = 1, B = 2, C = 3,...
This means that I will be using the number "1" to represent an "A", and a "2" to represent a "B", etc. Now if I write the string "ABC" out to my simple character stream, it might look like this:
123.
And when I read it back in, I am not really reading a "1", "2", "3", etc., but I am using these values to reconstitute the characters A, B, and C.
But this only works if the encoding scheme is the same when I am writing AND reading. If I saved this character file "123" in my simple encoding, but you tried to read it with a different encoding, say one in which
A = 3, B = 2, C = 1
Then when you read the "123", you would get the characters "CBA", which is not the original sequence I wrote to the file.
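The toy encoding above is small enough to try out directly. The sketch below is hypothetical throughout (ToyEncodingDemo, MINE, YOURS, encode and decode are names invented for this example, and Map.of assumes Java 9 or later); it just shows that decoding with a different table than the one used for writing changes the text.

```java
import java.util.HashMap;
import java.util.Map;

class ToyEncodingDemo {
    // The writer's scheme: A = 1, B = 2, C = 3
    static final Map<Character, Character> MINE = Map.of('A', '1', 'B', '2', 'C', '3');
    // The reader's different scheme: A = 3, B = 2, C = 1
    static final Map<Character, Character> YOURS = Map.of('A', '3', 'B', '2', 'C', '1');

    // Replace each letter with the code the table assigns to it.
    static String encode(String text, Map<Character, Character> table) {
        StringBuilder sb = new StringBuilder();
        for (char c : text.toCharArray()) {
            sb.append(table.get(c));
        }
        return sb.toString();
    }

    // Invert the table, then replace each code with the letter it stands for.
    static String decode(String codes, Map<Character, Character> table) {
        Map<Character, Character> inverse = new HashMap<>();
        for (Map.Entry<Character, Character> e : table.entrySet()) {
            inverse.put(e.getValue(), e.getKey());
        }
        StringBuilder sb = new StringBuilder();
        for (char c : codes.toCharArray()) {
            sb.append(inverse.get(c));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String stored = encode("ABC", MINE);
        System.out.println(stored);                // 123
        System.out.println(decode(stored, MINE));  // ABC
        System.out.println(decode(stored, YOURS)); // CBA
    }
}
```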
Back to Java character streams now. Java uses Unicode internally: a String is stored as UTF-16, so each char is 16 bits (characters outside the Basic Multilingual Plane take two chars). So you can represent a lot of characters. But when you want to write out a string like "ABC", you won't necessarily write it out as 16-bit values. You might want to use ASCII, which defines only 7-bit codes and stores each character in one byte, or Latin-1 (ISO-8859-1), which uses 8 bits for each character. Again, as above, it really doesn't matter what encoding you choose as long as it can represent the characters you need and the reader and writer agree to use the same one. It's almost like you're taking a sequence of characters and using a "secret" code to translate a message into a sequence of bytes. In fact, this is exactly what you're doing, although the "code" is not secret; it's well known.
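To see how the same string turns into different bytes under different encodings, here is a minimal sketch using String.getBytes with the constants from java.nio.charset.StandardCharsets. The class name EncodingBytesDemo and the hex helper are invented for this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingBytesDemo {
    // Encode the text with the given charset and show the resulting bytes in hex.
    static String hex(String text, Charset charset) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(charset)) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // One byte per character in ASCII and Latin-1
        System.out.println(hex("ABC", StandardCharsets.US_ASCII));   // 41 42 43
        System.out.println(hex("ABC", StandardCharsets.ISO_8859_1)); // 41 42 43
        // Two bytes per character in big-endian UTF-16
        System.out.println(hex("ABC", StandardCharsets.UTF_16BE));   // 00 41 00 42 00 43
    }
}
```

For plain "ABC", ASCII and Latin-1 happen to agree byte for byte; the difference between encodings only shows up in size here, and in content once characters outside ASCII are involved.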
The most important thing to remember is that if you're dealing with binary data, use binary streams, like FileInputStream and FileOutputStream. If you're dealing with character data, use Readers and Writers, and make sure you use the same encoding for both reading and writing.
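As a rough sketch of that advice, the example below writes text through a Writer and reads it back through a Reader, each wrapped around a file stream with an explicitly named encoding. ReaderWriterDemo and roundTrip are hypothetical names, and the temporary file exists only for the demonstration.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class ReaderWriterDemo {
    // Write text to a temp file with one encoding, read it back with another.
    static String roundTrip(String text, Charset writeWith, Charset readWith) {
        try {
            Path file = Files.createTempFile("encoding-demo", ".txt");
            try {
                // Writer: characters in, bytes out, using the chosen encoding
                try (Writer out = new OutputStreamWriter(
                        new FileOutputStream(file.toFile()), writeWith)) {
                    out.write(text);
                }
                // Reader: bytes in, characters out, using the chosen encoding
                StringBuilder sb = new StringBuilder();
                try (Reader in = new InputStreamReader(
                        new FileInputStream(file.toFile()), readWith)) {
                    int c;
                    while ((c = in.read()) != -1) {
                        sb.append((char) c);
                    }
                }
                return sb.toString();
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String original = "caf\u00e9"; // "cafe" with an accented e
        String same = roundTrip(original, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        String mixed = roundTrip(original, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println(original.equals(same));  // true
        System.out.println(original.equals(mixed)); // false
    }
}
```

Naming the charset explicitly on both sides avoids depending on the platform default encoding, which can differ between the machine that wrote the file and the one that reads it.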


Rob
SCJP 1.4
 