JavaRanch » Java Forums » Java » I/O and Streams
Performance Problem?

Santana Iyer
Ranch Hand

Joined: Jun 13, 2005
Posts: 335
I have a web application.

I have a file with 2 million records, one record per line. The record length is variable, but at most 32 characters.

I display the records on pages, 1,000 records per page.

On each page the user can add records, and can modify/edit any of the 1,000 records shown, since all of them are editable. The user can also delete a record by checking the checkbox that appears next to every record.

Assume the user is on page 5, viewing records 4001 to 5000. After adding and modifying, he clicks Save, and the following happens:

1. I create a temp file.
2. I read records sequentially using readLine() from the original file and copy them to the temp file (records 1 to 4000).
3. Next I write the 1,000 records from the form (so that the user's modifications to the displayed records are saved).
4. I skip to record 5001 in the original file and copy all remaining records to the temp file. (The temp file now contains the updated data.)
5. I take a backup of the original and rename the temp file to the original.
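The five save steps above can be sketched like this. This is a minimal sketch under assumptions of my own: the method and file names are illustrative, and error handling and character encoding are simplified.

```java
import java.io.*;
import java.nio.file.*;
import java.util.*;

public class SaveRewrite {
    // Rewrite the data file: copy records before the edited page, write the
    // edited page from the form, copy the rest, then back up the original
    // and swap the temp file into place.
    static void save(Path original, List<String> editedPage,
                     int pageStart, int pageSize) throws IOException {
        Path temp = original.resolveSibling(original.getFileName() + ".tmp");
        try (BufferedReader in = Files.newBufferedReader(original);
             BufferedWriter out = Files.newBufferedWriter(temp)) {
            String line;
            int n = 0;
            // steps 1-2: copy records 1 .. pageStart-1 unchanged
            while (n < pageStart - 1 && (line = in.readLine()) != null) {
                out.write(line); out.newLine(); n++;
            }
            // step 3: write the (possibly modified) page from the form
            for (String record : editedPage) { out.write(record); out.newLine(); }
            // step 4: skip the old copy of the page, then copy the remainder
            for (int i = 0; i < pageSize && in.readLine() != null; i++) { }
            while ((line = in.readLine()) != null) { out.write(line); out.newLine(); }
        }
        // step 5: back up the original and rename temp to original
        Files.move(original, original.resolveSibling(original.getFileName() + ".bak"),
                   StandardCopyOption.REPLACE_EXISTING);
        Files.move(temp, original);
    }

    public static void main(String[] args) throws IOException {
        Path f = Files.createTempFile("records", ".txt");
        Files.write(f, Arrays.asList("a", "b", "c", "d", "e"));
        save(f, Arrays.asList("B2", "C2"), 2, 2);   // replace records 2-3
        System.out.println(Files.readAllLines(f));  // [a, B2, C2, d, e]
    }
}
```

Note that every save touches all 2 million records, which is why this costs so much.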


I also show page links, so when the user clicks on, say, page 100, I have to read the file sequentially (skipping 99,000 lines using readLine()) and then read the 1,000 records that will be shown to the user.
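The skip-then-read pagination described above might look like the following sketch (names and the page-numbering convention are illustrative assumptions, not the poster's actual code):

```java
import java.io.*;
import java.util.*;

public class PageReader {
    // Skip (page-1)*pageSize lines, then collect one page of records.
    // Pages are numbered from 1.
    static List<String> readPage(Reader src, int page, int pageSize) throws IOException {
        BufferedReader in = new BufferedReader(src);
        for (long i = 0, skip = (long) (page - 1) * pageSize; i < skip; i++) {
            if (in.readLine() == null) return Collections.emptyList(); // past EOF
        }
        List<String> records = new ArrayList<>();
        String line;
        while (records.size() < pageSize && (line = in.readLine()) != null) {
            records.add(line);
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        String data = "r1\nr2\nr3\nr4\nr5\n";
        System.out.println(readPage(new StringReader(data), 2, 2)); // [r3, r4]
    }
}
```

Because the records are variable-length, there is no way to seek directly to page N; every line before the page must be read and discarded.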



I don't find this solution very good, but I can't think of a better one.

I have to do so many readLine() calls because the record size is not fixed.
Also, since records can be added in between and deleted, I have to create a temp file and copy all the records.

I cannot use a database; that is a restriction.

Is there a better solution?
Ernest Friedman-Hill
author and iconoclast
Marshal

Joined: Jul 08, 2003
Posts: 24187
If you can make all the records the same size, then you can use RandomAccessFile, which will let you skip directly to a record and modify it without rewriting the whole file.
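A minimal sketch of the fixed-size-record idea (the 32-byte record size comes from the poster's stated maximum; the padding scheme, class, and method names are assumptions for illustration):

```java
import java.io.*;
import java.nio.charset.StandardCharsets;

public class FixedRecords {
    static final int RECORD_SIZE = 32; // every record padded to 32 bytes

    // Overwrite record `index` (0-based) in place -- no rewrite of the file.
    static void writeRecord(RandomAccessFile raf, long index, String value) throws IOException {
        byte[] buf = new byte[RECORD_SIZE]; // zero-padded
        byte[] src = value.getBytes(StandardCharsets.US_ASCII);
        System.arraycopy(src, 0, buf, 0, Math.min(src.length, RECORD_SIZE));
        raf.seek(index * RECORD_SIZE);      // jump straight to the record
        raf.write(buf);
    }

    static String readRecord(RandomAccessFile raf, long index) throws IOException {
        byte[] buf = new byte[RECORD_SIZE];
        raf.seek(index * RECORD_SIZE);
        raf.readFully(buf);
        return new String(buf, StandardCharsets.US_ASCII).trim(); // strip padding
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("fixed", ".dat");
        try (RandomAccessFile raf = new RandomAccessFile(f, "rw")) {
            writeRecord(raf, 0, "alpha");
            writeRecord(raf, 1, "beta");
            writeRecord(raf, 1, "BETA-EDITED");      // modify in place
            System.out.println(readRecord(raf, 1));  // BETA-EDITED
        }
    }
}
```

With fixed-size records, showing page 100 is a single seek to byte `99000 * RECORD_SIZE` instead of 99,000 readLine() calls.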

If you can't, then you could do something a little more complex: make an "index file" which shows the starting offset of each record. Then when you modify a record, append the new one to the end of the file, and modify the index file to point to the new offset (you'd use RandomAccessFile on the index.) Occasionally (overnight?) you could rewrite the main file, omitting all the "dead" records.
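The index-file scheme might be sketched like this, assuming 8-byte offsets, one per record, and newline-terminated data records (all names and layout choices here are illustrative, not a prescribed format):

```java
import java.io.*;
import java.nio.charset.StandardCharsets;

public class IndexedFile {
    // The index file holds one 8-byte offset per record, updatable in place.
    static long readOffset(RandomAccessFile index, long record) throws IOException {
        index.seek(record * 8);
        return index.readLong();
    }

    static void writeOffset(RandomAccessFile index, long record, long offset) throws IOException {
        index.seek(record * 8);
        index.writeLong(offset);
    }

    // "Modify" a record: append the new version to the end of the data file
    // and repoint the index entry. The old bytes become dead space until the
    // occasional (overnight?) compaction rewrite.
    static void updateRecord(RandomAccessFile data, RandomAccessFile index,
                             long record, String value) throws IOException {
        long offset = data.length();
        data.seek(offset);
        data.write((value + "\n").getBytes(StandardCharsets.US_ASCII));
        writeOffset(index, record, offset);
    }

    static String readRecord(RandomAccessFile data, RandomAccessFile index,
                             long record) throws IOException {
        data.seek(readOffset(index, record));
        return data.readLine();
    }

    public static void main(String[] args) throws IOException {
        File d = File.createTempFile("data", ".txt");
        File i = File.createTempFile("index", ".idx");
        try (RandomAccessFile data = new RandomAccessFile(d, "rw");
             RandomAccessFile index = new RandomAccessFile(i, "rw")) {
            updateRecord(data, index, 0, "first");
            updateRecord(data, index, 1, "second");
            updateRecord(data, index, 0, "first-v2");      // appended, index repointed
            System.out.println(readRecord(data, index, 0)); // first-v2
            System.out.println(readRecord(data, index, 1)); // second
        }
    }
}
```

A modification now costs one append plus one 8-byte in-place write, regardless of file size, and page N is found by seeking to index entry N*1000.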


[Jess in Action][AskingGoodQuestions]
Santana Iyer
Ranch Hand

Joined: Jun 13, 2005
Posts: 335
Thanks, but there are problems:

1. I have to write records in between two existing records, not at the end.
2. With RandomAccessFile I can only replace a record with one of the same length. If a record is ABCD;, I can overwrite it with exactly 5 characters, no more and no less, or my next record gets corrupted.
3. For deletion I have to rewrite the file anyway.

Because of these problems, it seems that every time I have to rewrite the file, i.e. create a temp file and rename it to the original.
Ernest Friedman-Hill
author and iconoclast
Marshal

Joined: Jul 08, 2003
Posts: 24187

If you have to keep the records in a file, in order, and they're variable length, then yes, you're pretty much hosed. In this kind of problem, you can improve performance only by changing the data structures you use. Since this is apparently not possible, you have two options:

1) Go to whoever is insisting on this storage format, explain why it's a bad choice, and offer alternatives.

2) Wait until that same person complains to you about the performance of the deployed system, tell them it's because of the storage format, and suffer the consequences at that time.

But there's no magic way to make rewriting the file go ten times faster.
Santana Iyer
Ranch Hand

Joined: Jun 13, 2005
Posts: 335
Thanks a lot.
I am more worried about corruption of data than about performance.
But it seems this is the only way; I just wanted a senior's opinion. Thank you, sir.
Ernest Friedman-Hill
author and iconoclast
Marshal

Joined: Jul 08, 2003
Posts: 24187

Originally posted by Santana Iyer:

I am more worried about corruption of data rather than performance.


If you write a new file, while saving the old one, and only rename the new one once you know that the file writing went OK, as you've described, then this is generally a safe thing to do. Of course, you do have to worry about concurrent updates, something you haven't mentioned here. If there is more than one user of the system at a time, then rewriting a single file obviously becomes a difficult and dangerous thing to manage.
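One way to manage the concurrent-update danger mentioned above is to serialize writers with a file lock, sketched below. This is an assumption-laden sketch, not part of the thread's solution: it uses `java.nio.channels.FileLock`, which is advisory only (every writer in the application must take the same lock) and coordinates across processes but, per its documentation, not reliably between threads of one JVM.

```java
import java.io.*;
import java.nio.channels.*;

public class LockedSave {
    // Hold an exclusive lock on a separate lock file while rewriting and
    // renaming the data file, so two users saving at once cannot interleave.
    static void withExclusiveLock(File lockFile, Runnable action) throws IOException {
        try (FileChannel ch = new RandomAccessFile(lockFile, "rw").getChannel();
             FileLock lock = ch.lock()) {   // blocks until the lock is free
            action.run();                   // do the rewrite + rename here
        }                                   // lock released automatically
    }

    public static void main(String[] args) throws IOException {
        File lock = File.createTempFile("records", ".lock");
        withExclusiveLock(lock, () -> System.out.println("saving under lock"));
    }
}
```

The rewrite-and-rename then happens entirely inside the locked region, so a second saver waits rather than reading a half-written file.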
Amit A. Patil
Ranch Hand

Joined: May 04, 2006
Posts: 38
Even though you can't have a database, the records in your file should still have a valid maximum size.
So practically you should be able to use fixed-width records, though that may mean wasted space and a larger file.
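Padding variable-length records out to the known maximum is simple; a small sketch (the 32-character width follows the poster's stated maximum, and the helper name is illustrative):

```java
public class FixedWidth {
    static final int WIDTH = 32;

    // Pad a record out to the fixed width with spaces on the right; records
    // longer than WIDTH are rejected rather than silently truncated.
    static String pad(String record) {
        if (record.length() > WIDTH)
            throw new IllegalArgumentException("record too long: " + record.length());
        return String.format("%-" + WIDTH + "s", record);
    }

    public static void main(String[] args) {
        String padded = pad("hello");
        System.out.println(padded.length());           // 32
        System.out.println("[" + padded.trim() + "]"); // [hello]
    }
}
```

For 2 million records the worst-case overhead is bounded: 2M x 32 bytes is about 64 MB, which may be an acceptable price for random access.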
 