JavaRanch » Java Forums » Java » I/O and Streams

Binary Files

Stephen McDermott
Greenhorn

Joined: Mar 11, 2008
Posts: 4
Hello,

I know how to read/write to binary files, but only static record sizes.

Can someone help me develop an algorithm for writing/reading variable record sizes?
Nicholas Jordan
Ranch Hand

Joined: Sep 17, 2006
Posts: 1282
Read in an int or something at the beginning of the file. Use that value to control the length of the file reading.


"The differential equations that describe dynamic interactions of power generators are similar to that of the gravitational interplay among celestial bodies, which is chaotic in nature."
Stephen McDermott
Greenhorn

Joined: Mar 11, 2008
Posts: 4
But if each individual record is variable, that won't quite work...
Nicholas Jordan
Ranch Hand

Joined: Sep 17, 2006
Posts: 1282
Well, if each record is of variable length it needs some thought, but variable-length traffic goes all over the place, so several approaches should be within reach.
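// Sketch of a custom file wrapper (not java.io.File).
class File {
    // Header stored at the start of the file.
    static class FileHeader {
        int numberOfRecords;
        int firstRecordLength;
    }
}
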
The problem already resembles a linked list, which is well-known computer science. The tape archive (tar) format has been ported to Java, and it is by nature a variable-length record format.
Stephen McDermott
Greenhorn

Joined: Mar 11, 2008
Posts: 4
I'm not sure what you're getting at here...

I already have my files organized in a LinkedList (for sorting/searching)

What's with this File class?
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 42950
    
Using binary files with variable-length content is tricky. The RandomAccessFile class can read them fairly efficiently, but you can use it for writing only by overwriting bytes, not by inserting or deleting anything.

That means you can't use it (or any other class of the JRE) to replace a record of length N with a record of length M in the middle of a file.
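
To illustrate the overwrite-only behaviour described above, here is a minimal sketch. The class and method names (OverwriteDemo, overwriteAt) are made up for this example; only RandomAccessFile and its seek() and write() calls come from the JRE.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical helper, named for illustration only.
class OverwriteDemo {

    // Overwrites bytes starting at the given offset.
    // Nothing after the written range is shifted, so a file that is already
    // long enough keeps its length.
    static void overwriteAt(File file, long offset, byte[] replacement) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            raf.seek(offset);
            raf.write(replacement);
        }
    }
}
```

If a file contains ABCDEFGH and two bytes are written at offset 2, the bytes C and D are replaced and the rest stays where it was. There is no call that would push EFGH further back to make room for a longer record.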
Jim Yingst
Wanderer
Sheriff

Joined: Jan 30, 2000
Posts: 18671
Stephen, has the format of these files already been established, or do you get to choose the format?

And what does "static record sizes" mean? I thought maybe all the records were the same size, but according to your second post, that's not the case.

If you can choose the format, there are several options:
  • Put a number at the beginning of each record indicating the length of that record. This is probably simplest for a binary file. This might be what Nick was trying to get at, but you need a record length for each record, not just one at the beginning of the file. And you don't necessarily need to know the number of records in advance - the end of the file can signal the end of records.
  • Use a particular delimiter between records. This works well for text formats, where you can use a newline, or XML, or any other delimiter that seems convenient. You need to be able to escape the delimiter if it occurs naturally within a record, e.g. replacing a newline with "\n". For a binary format it may be too much trouble to guard against the possibility of delimiters occurring accidentally within a record.
  • Create an index that identifies where each record begins. This might be located at the beginning of the file, or in a separate file. This is more complex, but it allows the records to be accessed in random order, without needing to read every previous record in order to get a record near the end.
  • Use some other existing tool to write records so that you don't need to know the format yourself. The main example that comes to mind is Java's object serialization format, using writeObject() and readObject() from ObjectOutputStream and ObjectInputStream. Another possibility is XStream. You could even use a file-backed database such as HSQL.
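
The first option in the list above (a length in front of every record) could look roughly like the following. This is a sketch under assumptions, not an established format: the class name LengthPrefixedRecords and the choice of a 4-byte int for the length are illustrative. DataOutputStream and DataInputStream are standard java.io classes.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; the name and the record layout are illustrative only.
class LengthPrefixedRecords {

    // Writes each record as a 4-byte length followed by that many bytes.
    static void writeAll(OutputStream out, List<byte[]> records) throws IOException {
        DataOutputStream data = new DataOutputStream(out);
        for (byte[] record : records) {
            data.writeInt(record.length);
            data.write(record);
        }
        data.flush();
    }

    // Reads records until the stream ends; no record count is stored anywhere.
    static List<byte[]> readAll(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        List<byte[]> records = new ArrayList<>();
        while (true) {
            int length;
            try {
                length = data.readInt();
            } catch (EOFException endOfFile) {
                break; // no further length prefix: treat as end of records
            }
            // Note: the length is not validated here; a corrupt file could
            // yield a negative or huge value.
            byte[] record = new byte[length];
            data.readFully(record); // throws EOFException if the record is cut short
            records.add(record);
        }
        return records;
    }
}
```

Wrapping a FileOutputStream or FileInputStream (ideally buffered) in these calls gives the on-disk version. Reading is strictly sequential: to reach the last record you still have to walk past every length prefix before it, which is what the index option addresses.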

Ulf's comment applies equally to any of these techniques (except perhaps the database option where it's all handled for you). If you need to be able to change these records, you pretty much need to rewrite the entire file.

    Exception #1: if you only need to add records at the end of the file, that's fine, you can just append them. (This gets more complicated if you have an index though.)

    Exception #2: if you include a delete flag as part of the format for each record, you can delete a record by setting its delete flag, without changing its length. Then you can delete a record from the middle of the file, and write a new version of the record at the end of the file, without having to rewrite the entire file. This can be very fast initially, but eventually you may want to rewrite the entire file and remove those deleted records entirely. Note that if you start down this road, you're well on your way to writing your own database program, and you might well be better off using an existing database instead.
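
The delete-flag idea in Exception #2 can be sketched as follows. Everything about the layout is an assumption made for the example: one flag byte, then a 4-byte length, then the payload. The class name TombstoneFile and its methods are hypothetical.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.ArrayList;
import java.util.List;

// Hypothetical record layout: [1-byte flag][4-byte length][payload bytes].
class TombstoneFile {
    static final byte LIVE = 0;
    static final byte DELETED = 1;

    // Appends a record at the end of the file and returns its starting offset.
    static long append(File file, byte[] payload) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            long offset = raf.length();
            raf.seek(offset);
            raf.writeByte(LIVE);
            raf.writeInt(payload.length);
            raf.write(payload);
            return offset;
        }
    }

    // Flips the flag byte in place; the record's length does not change.
    static void markDeleted(File file, long offset) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            raf.seek(offset);
            raf.writeByte(DELETED);
        }
    }

    // Returns the payloads of all records that are not flagged as deleted.
    static List<byte[]> readLive(File file) throws IOException {
        List<byte[]> live = new ArrayList<>();
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            while (raf.getFilePointer() < raf.length()) {
                byte flag = raf.readByte();
                int length = raf.readInt();
                if (flag == DELETED) {
                    raf.seek(raf.getFilePointer() + length); // skip the payload
                } else {
                    byte[] payload = new byte[length];
                    raf.readFully(payload);
                    live.add(payload);
                }
            }
        }
        return live;
    }
}
```

An update then becomes markDeleted() on the old offset followed by append() of the new version. The deleted records keep taking up space until the file is rewritten without them.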
    [ March 12, 2008: Message edited by: Jim Yingst ]

    "I'm not back." - Bill Harding, Twister
    Stephen McDermott
    Greenhorn

    Joined: Mar 11, 2008
    Posts: 4
    Let's see...

    Each record will be of variable length, because the user will be able to store data in them over time, so they progressively get larger.

    I know that it'd be best to use a text file, but for my project I need a binary file (to get 3/15 credits).
    Nicholas Jordan
    Ranch Hand

    Joined: Sep 17, 2006
    Posts: 1282
    String has a getBytes() method, e.g. byte[] lineBuffer = text.getBytes(); which can be used directly, and several of the classes in java.io have methods that will write a string. I strongly advise against getting into char/byte translation. In general, review the four approaches Jim provides; as you can see, he was even able to work out where I was trying to go with one approach, and he gives effective and complete overviews of all four. Persisting data from one invocation of the program to the next can also involve cross-checking the data against multiple files, but this may not be part of the spec you were given. If you want the credits, then you need to take the approach that comes down to coding what you were told to code.
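
As a rough illustration of the two routes mentioned above (getBytes() directly, or a java.io class that writes strings for you), here is a small sketch. The class StringRecords and its method names are invented for the example. Passing an explicit charset is an addition not in the original post; it keeps the bytes independent of the platform default. DataOutputStream.writeUTF() stores a 2-byte length followed by the string in modified UTF-8, so each string is already a variable-length record.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, named for illustration only.
class StringRecords {

    // Route 1: convert explicitly, naming the charset on both sides.
    static byte[] encode(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Route 2: let DataOutputStream handle the conversion and the length prefix.
    // writeUTF() is limited to strings whose encoded form fits in 65535 bytes.
    static byte[] writeStrings(String... values) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buffer);
        for (String value : values) {
            out.writeUTF(value);
        }
        out.flush();
        return buffer.toByteArray();
    }

    // Reads back strings written by writeStrings() until the data runs out.
    static List<String> readStrings(byte[] data) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        List<String> values = new ArrayList<>();
        while (in.available() > 0) {
            values.add(in.readUTF());
        }
        return values;
    }
}
```

The same writeUTF()/readUTF() calls work on a DataOutputStream or DataInputStream wrapped around a file stream; the in-memory buffers here only keep the example short. With a file stream, catching EOFException is a more dependable end-of-data check than available().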

    Start with the 'code you can understand' method for selecting which of the four approaches to use. Set your first goal as 'anything that works', then go back to the beginning and look for ways to make it work properly.
     