I/O Wrappers

 
Betty Reynolds
Ranch Hand
Posts: 111
I am trying to get a handle on I/O wrappers. The more I experiment, the more
questions I find. To begin with:
What is really happening when you wrap an underlying I/O stream? For example, the API states:
<h6>
In general, a Writer sends its output immediately to the underlying character or
byte stream. Unless prompt output is required, it is advisable to wrap a
BufferedWriter around any Writer whose write() operations may be costly, such as
FileWriters and OutputStreamWriters. For example,

PrintWriter out
= new PrintWriter(new BufferedWriter(new FileWriter("foo.out")));
will buffer the PrintWriter's output to the file. Without buffering, each invocation of
a print() method would cause characters to be converted into bytes that would then be
written immediately to the file, which can be very inefficient.
</h6>
Doesn't it depend on what method gets invoked for the overloaded version of write?
For example, consider this code that writes two strings to two files:


How is (1) any more efficient than (2) in terms of execution? Or, to get back to the
original question, what is really happening when you wrap I/O streams? How do you know
which construction is best to use? It seems that you can do almost anything. For
example:

How does (1) differ from (2) in machine execution (outside of the fact that you have
to use different methods)? If I wanted to buffer the output based upon using the
DataOutputStream class (suppose I just wanted to write various primitives), would
(2) be the correct way to accomplish this, or would using RandomAccessFile be better?
(3) is just an example of how carried away you can get with wrapping. I really have no
idea what it's doing, but it's still more efficient than (2) in terms of space:


Any clarification on these issues would be appreciated. Thanks.

[This message has been edited by Betty Reynolds (edited April 18, 2000).]
 
Jim Yingst
Wanderer
Posts: 18671
In this first section, (1) and (2) will refer to Betty's first block of sample code.
"Doesn't it depend on what method gets invoked for the overloaded version of write?"
I'm not sure what you mean here. All writers and output streams ultimately convert their output to bytes. Yes, there is a choice of methods that could be invoked by the outer stream or writer. In general, the methods which operate on an array of chars or bytes are probably more efficient than those which handle one byte/char at a time. The thing is, to use these methods it helps to have saved up an array of bytes/chars to begin with, as in a buffer. In your example, you've basically already done this, since there's only one string to write and the Writers have the whole thing from the beginning of the process. But consider this (somewhat extreme) example:
<code><pre>// fw is an unbuffered FileWriter opened earlier
fw.write("A");
fw.write("B");
fw.write("C");
fw.write("D");
fw.write("E");
fw.write("F");</pre></code>
This can result in six separate accesses to the file. The execution of this thread may pause each time the file is accessed (probably on a hard disk). If the FileWriter fw were instead wrapped in a BufferedWriter, the BufferedWriter would save up the characters "ABCDEF" in a buffer and write the whole set at once in a single file access. The only catch is that it probably won't get around to actually doing the write until the buffer fills up or you call flush() or close() on the stream.
Sure, it would be foolish to intentionally split the write into 6 separate statements as I've done above. But it's not at all unusual to have multiple writes to the same stream, perhaps separated by other processing that needs to be done.
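To make the buffering effect concrete, here is a minimal sketch that is not from the original posts. CountingWriter is a hypothetical stand-in for a costly sink such as a file; it only counts how many write requests reach it. One assumption to keep in mind: a real FileWriter also accumulates bytes inside its encoder, so the number of actual disk accesses may differ from the count shown here.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.Writer;

class BufferingDemo {

    /** Hypothetical sink (not a JDK class): counts the write requests it receives. */
    static class CountingWriter extends Writer {
        int writeCalls = 0;
        final StringBuilder received = new StringBuilder();

        @Override
        public void write(char[] cbuf, int off, int len) {
            writeCalls++;
            received.append(cbuf, off, len);
        }

        @Override public void flush() { }
        @Override public void close() { }
    }

    /** Writes "A" through "F" one string at a time; returns how many calls reached the sink. */
    static int countSinkWrites(boolean buffered) {
        CountingWriter sink = new CountingWriter();
        try {
            // Either talk to the sink directly or put a BufferedWriter in front of it.
            Writer out = buffered ? new BufferedWriter(sink) : sink;
            for (String s : new String[] {"A", "B", "C", "D", "E", "F"}) {
                out.write(s);
            }
            out.close(); // for the BufferedWriter, this flushes the saved-up characters
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.writeCalls;
    }

    public static void main(String[] args) {
        System.out.println("unbuffered sink writes: " + countSinkWrites(false));
        System.out.println("buffered sink writes:   " + countSinkWrites(true));
    }
}
```

With the direct path, each of the six write calls is passed straight to the sink. With the BufferedWriter in front, the six characters are held back and handed over in one request when close() flushes the buffer.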
"How is (1) any more efficient than (2) in terms of execution?"
In your example, it probably isn't. But for multiple writes it is, as discussed above.
"How do you know what is the best construction to use?"
Generally, I try to look at:
  • Is all the data character data? If so, Readers and Writers are typically easier (especially if you want UTF-8 / ASCII output). Otherwise, use InputStream / OutputStream.
  • Where do you want the data to go / come from? This probably determines the innermost stream, e.g. FileWriter, or System.in.
  • What set of methods would you find most convenient to access the data? This probably determines the outermost stream, like DataOutputStream, or LineNumberReader.
  • Consider putting a Buffered Input/Output Stream/Reader/Writer thingy in the middle for efficiency, as discussed above. Unless of course you were already able to use one as the outermost stream.

  • You may still find several possible constructs. In many cases there may not be much real difference between your choices. I'd usually go with the one that looks simplest. Try it and see if it works the way you hoped. If not, try another.
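As an illustration of the checklist above, here is one possible chain for plain character data, offered as a sketch rather than the only reasonable choice. The class name, the temp-file name, and the sample lines are all made up for the example. Note that FileWriter and FileReader use the platform default charset here.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class TextChainDemo {

    /** Writes two lines through a three-layer Writer chain, then reads them back. */
    static List<String> roundTrip() {
        List<String> lines = new ArrayList<>();
        try {
            File file = File.createTempFile("chain-demo", ".txt"); // hypothetical scratch file
            try {
                // Outermost: PrintWriter, chosen for its println() methods.
                // Middle:    BufferedWriter, for efficiency.
                // Innermost: FileWriter, because the data should end up in a file.
                try (PrintWriter out =
                         new PrintWriter(new BufferedWriter(new FileWriter(file)))) {
                    out.println("alpha");
                    out.println("beta");
                }
                // When reading, BufferedReader is both the buffer and the outermost
                // layer, since it already offers readLine().
                try (BufferedReader in = new BufferedReader(new FileReader(file))) {
                    String line;
                    while ((line = in.readLine()) != null) {
                        lines.add(line);
                    }
                }
            } finally {
                file.delete();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        System.out.println(roundTrip());
    }
}
```

The reading side shows the last point of the fourth bullet: when the buffered class already has the methods you want, no extra layer is needed on top of it.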


    OK, now (1), (2), and (3) refer to Betty's second block of sample code.
    "How does (1) differ from (2) in machine execution (outside of the fact that you have to use different methods)?"
    Well, the methods used are the main difference. The main point of using a DataOutputStream is its wide selection of convenience methods. If it's not the outermost stream, then you can't access those methods, and what's the point then? So in (1) the DataOutputStream is essentially dead weight. If you remove it, then (1) may be slightly faster than (2) and use slightly less memory. But it doesn't have access to all those nifty DataOutput methods.
    "If I wanted to buffer the output based upon using the DataOutputStream class (suppose I just wanted to write various primitives), would (2) be the correct way to accomplish this or would using RandomAccessFile be better?"
    I think (2) would be more efficient. The main possible reasons I see to use RandomAccessFile are that (a) it's easier to set up - you don't have to decide on a combination of streams/readers/writers, (b) you can read and write using the same object, and (c) you can use seek() for, well, random access, if that's important. Otherwise, though, I imagine (2) would give better performance if you just want to write a bunch of data to a file.
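A small sketch of the trade-off described above, with made-up values and a hypothetical temp file: the buffered DataOutputStream chain does the sequential writing, and a RandomAccessFile then reads and overwrites one int in place using seek(). It makes no claim about relative speed; it only shows points (b) and (c).

```java
import java.io.BufferedOutputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;

class RandomAccessDemo {

    /** Returns {value before overwrite, value after overwrite, file length in bytes}. */
    static int[] demo() {
        try {
            File file = File.createTempFile("raf-demo", ".bin"); // hypothetical scratch file
            try {
                // Sequential writing: buffered DataOutputStream chain.
                try (DataOutputStream out = new DataOutputStream(
                         new BufferedOutputStream(new FileOutputStream(file)))) {
                    for (int i = 0; i < 5; i++) {
                        out.writeInt(i * 10); // 0, 10, 20, 30, 40 at four bytes each
                    }
                }
                // Random access: one object both reads and writes, and seek() jumps around.
                try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
                    long offset = 3 * 4L;      // byte position of the fourth int
                    raf.seek(offset);
                    int before = raf.readInt();
                    raf.seek(offset);
                    raf.writeInt(99);          // overwrite in place
                    raf.seek(offset);
                    int after = raf.readInt();
                    return new int[] {before, after, (int) raf.length()};
                }
            } finally {
                file.delete();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] r = demo();
        System.out.println("before=" + r[0] + " after=" + r[1] + " length=" + r[2]);
    }
}
```

Both classes use the same four-byte, high-byte-first layout for writeInt and readInt, which is why the RandomAccessFile can read what the DataOutputStream wrote.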
    In (3) again, the DataOutputStream is dead weight in the middle - but the others are potentially useful. A PrintStream offers different access methods than other Streams, which you might find more to your liking. The BufferedOutputStream is good for efficiency, and the FileOutputStream is necessary if you want the output to go to a file.
    As for the lengths of the three files in your last example, that's determined by the three different methods you chose to write with. The inner streams/writers have nothing to do with it - they just do what the outer stream tells them to:
    (1) bufBDF.write(j) - uses the write(int) method of BufferedOutputStream. Takes the low-order 8 bits of the argument and writes them as a single byte, ignoring the rest. So if you've got a value outside the range of a byte, this is probably a bad thing. If your value is within the range of a byte, this is as compact as you can get. What's written is not a string representation, it's the raw data - which means that if you try to look at the file in a text editor, you'll see gibberish.
    (2) datBF.writeInt(j) - uses the writeInt(int) method of DataOutputStream. This takes an int and writes it as exactly four bytes. Again, it's not a string representation, it's raw data. Four bytes may seem inefficient for writing the value 10, but it's pretty compact for a value like 1234567890. If you've got a lot of int values that do in fact use the full range of possible int values (or at least a significant portion of it), then this is the way to go.
    (3) prtPDBF.print(j) - uses the print(int) method of PrintStream. This takes an int and converts it into a readable character string representation - in this case, "10", which takes two bytes as an ASCII / UTF-8 string. Of course, if you're also writing other stuff, you'll want to put in commas or line breaks or something so that we can tell where one number ends and another begins. On average, this method will probably take the most space and take a bit longer to execute (doing int -> string conversions) but will produce nice friendly output that humans can look at and understand. Which is useless for some applications and invaluable for others.
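The three size differences can be checked with a short sketch. Betty's original stream declarations are not shown in the thread, so the chains below are assumptions, and a ByteArrayOutputStream stands in for the file so the bytes can be inspected directly. The method names are made up for the example.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.PrintStream;
import java.io.UncheckedIOException;

class WriteSizesDemo {

    /** write(int): only the low-order 8 bits are written, as one raw byte. */
    static byte[] viaWrite(int j) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (BufferedOutputStream out = new BufferedOutputStream(sink)) {
            out.write(j);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    /** writeInt(int): always four raw bytes, high byte first. */
    static byte[] viaWriteInt(int j) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(new BufferedOutputStream(sink))) {
            out.writeInt(j);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    /** print(int): the decimal digits as text, one byte per digit in ASCII. */
    static byte[] viaPrint(int j) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (PrintStream out =
                 new PrintStream(new BufferedOutputStream(sink), false, "US-ASCII")) {
            out.print(j);
        } catch (IOException e) { // the charset-name constructor can throw this
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println("write(10):    " + viaWrite(10).length + " byte(s)");
        System.out.println("writeInt(10): " + viaWriteInt(10).length + " byte(s)");
        System.out.println("print(10):    " + viaPrint(10).length + " byte(s)");
        System.out.println("write(300) stores: " + (viaWrite(300)[0] & 0xFF));
    }
}
```

For the value 10 this gives one, four, and two bytes respectively. The last line shows the hazard mentioned in (1): 300 does not fit in a byte, so only its low-order 8 bits (44) survive.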
    And on that note, I should probably get back to work.

    [This message has been edited by Jim Yingst (edited April 18, 2000).]
 
Betty Reynolds
Ranch Hand
Posts: 111
Now that I look at this post by the light of day, it seems like I threw an awful lot of questions out there. Thanks Jim for taking the time to answer them all.
Your assistance is first-rate as always!
 