UNIX 'tail' clone - implementation help

Gerry Giese
Ranch Hand

Joined: Aug 02, 2001
Posts: 247
Hello all!
I'm working on a simple Java version of the UNIX 'tail' command to show the last X lines of a text file. My question is how to implement it best, considering both memory requirements and performance of Java API classes used.
I currently have a version that works, but I'm not positive that it's the best/fastest way to do it. Here's how it works:
I declare list = new Vector(5000,500) to set the initial capacity and increment (I plan to work with log files, which right now are small), and the constructor of the Tail class takes a filename and the 'tailSize', i.e. how many lines from the end of the file to show. I then read the file with BufferedReader.readLine() and append each resulting string to the list vector. When I reach the end I calculate the last element index and that index minus tailSize, and show those lines in a for loop.
Pretty straightforward, and it seems to work on the files I've tried, but they've only been 1k-20k. I'm pretty sure when I get to 1MB log files it'll have problems. My best guess for an optimization with files that big is that once I've read tailSize + 1 elements I should start removing elements from the front of the list, so the vector doesn't grow to the same size as the text file. I also wonder if I should just use an array of strings, or an ArrayList, instead of Vector, but I would guess the code would be a little more complicated.
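The front-trimming idea above could be sketched roughly as follows. This is only an illustration, not the Tail class from this thread: it assumes Java 7 or later (generics, try-with-resources, ArrayDeque), and the names BoundedTail and lastLines are made up for the example.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical helper: keeps only the newest tailSize lines while reading.
class BoundedTail {
    // Returns at most tailSize lines from the end of the file,
    // never holding more than tailSize lines in memory at once.
    static List<String> lastLines(String fileName, int tailSize) throws IOException {
        Deque<String> window = new ArrayDeque<>();
        if (tailSize <= 0) {
            return new ArrayList<>();
        }
        try (BufferedReader in = new BufferedReader(new FileReader(fileName))) {
            String line;
            while ((line = in.readLine()) != null) {
                if (window.size() == tailSize) {
                    window.removeFirst(); // drop the oldest line
                }
                window.addLast(line);
            }
        }
        return new ArrayList<>(window);
    }

    public static void main(String[] args) throws IOException {
        for (String line : lastLines(args[0], Integer.parseInt(args[1]))) {
            System.out.println(line);
        }
    }
}
```

The whole file is still read once from start to finish, so this limits memory but not I/O time.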
I'm sure there are lots of other ways to do this, especially if I knew file I/O better, so any help would be appreciated!


CJP (Certifiable Java Programmer), AMSE (Anti-Microsoft Software Engineer)
Author of Posts in the Saloon
Zkr Ryz
Ranch Hand

Joined: Jan 04, 2001
Posts: 187

Hello Gerry
You are right, there's another way, and it should work almost the same. I suppose you're using a File object to open your file.
You can use a RandomAccessFile object to open your file and read from it. This class has a method ( public void seek( long pos ); ) which lets you set a "file pointer" to any position you want and then read from that place. But there's a problem with it: the position is a byte offset (a long) counted from the start of the file, so you have to guess at which byte the line you want to start reading from begins.
The method public long length(); returns the length of the file, so you have to guess where "10 lines before the end" would be (10 lines being the default for tail). If you know that each line has, mmhh, let's say 100 bytes, you would do something like:
// create a RandomAccessFile instance for log.txt in
// read-only mode
RandomAccessFile raf = new RandomAccessFile("log.txt","r");
// set the position to file length - 1000 bytes
// (but never before the start of the file)
raf.seek( Math.max( 0, raf.length() - ( 10 * 100 ) ) );
// read from that position
String s = "";
while( ( s = raf.readLine() ) != null ){
yourOldVector.add( s );
}
raf.close();
Then, if vector.size() returns more than 10, you read more than 10 lines and only need the last 10 of them.
The problem here is finding a good guess
of where to start reading.
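One way to sidestep the guess entirely, sketched here as a different technique from the one described above, is to scan backwards from the end of the file and count newline bytes until the last n lines are found. The names SeekTail, startOfLastLines and lastLines are hypothetical, and the sketch assumes '\n' line endings and Java 7 or later.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical helper: finds the last n lines by scanning backwards.
class SeekTail {
    // Returns the byte offset at which the last n lines begin.
    static long startOfLastLines(RandomAccessFile raf, int n) throws IOException {
        long length = raf.length();
        if (n <= 0) {
            return length;
        }
        long pos = length - 1;
        // A newline at the very end terminates the last line;
        // it does not start another one, so skip it.
        if (pos >= 0) {
            raf.seek(pos);
            if (raf.read() == '\n') {
                pos--;
            }
        }
        int newlines = 0;
        while (pos >= 0) {
            raf.seek(pos);
            if (raf.read() == '\n' && ++newlines == n) {
                return pos + 1; // first byte after the n-th newline from the end
            }
            pos--;
        }
        return 0; // fewer than n lines: start from the beginning
    }

    // Returns the last n lines, each terminated by '\n'.
    static String lastLines(String fileName, int n) throws IOException {
        StringBuilder out = new StringBuilder();
        try (RandomAccessFile raf = new RandomAccessFile(fileName, "r")) {
            raf.seek(startOfLastLines(raf, n));
            String line;
            while ((line = raf.readLine()) != null) {
                out.append(line).append('\n');
            }
        }
        return out.toString();
    }
}
```

Seeking and reading one byte at a time is simple but slow; reading backwards in blocks of a few KB would be the obvious refinement. Also note that RandomAccessFile.readLine() turns each byte into one char, so multi-byte encodings will not come out right.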
Hope it helps.
DISCLAIMER: I've never used a RandomAccessFile before (at least I don't remember doing so). Everything I've said is based on the API docs, so it's not guaranteed to work, but it should.
I'll try to code something, and if it works I'll post it here
and let you know.

Have a nice coding. . . . . .
Regards, Zkr Ryz
Zkr Ryz
Ranch Hand

Joined: Jan 04, 2001
Posts: 187
After a while.. . . . .
I tried the next program with a 6 MB file and it ran in about 2 secs.

A way to guess the size of a line in a file could be:
read a few lines (maybe 10), take the average,
and set lineSize to that.
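The averaging idea could look something like the sketch below. It is not the program mentioned above, just an illustration under assumptions: the names LineSizeGuess, averageLineSize and guessStart are invented here, and the sample is taken from the start of the file.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical helper: estimates line size from a small sample.
class LineSizeGuess {
    // Averages the byte length of up to `sample` lines at the start of the file.
    static long averageLineSize(RandomAccessFile raf, int sample) throws IOException {
        raf.seek(0);
        int lines = 0;
        while (lines < sample && raf.readLine() != null) {
            lines++;
        }
        // The file pointer is now just past the sampled lines,
        // line terminators included.
        return lines == 0 ? 0 : Math.max(1, raf.getFilePointer() / lines);
    }

    // Guesses the byte offset where the last n lines start.
    static long guessStart(RandomAccessFile raf, int n, int sample) throws IOException {
        long lineSize = averageLineSize(raf, sample);
        return Math.max(0, raf.length() - (long) n * lineSize);
    }
}
```

After raf.seek( guessStart( raf, 10, 10 ) ) the pointer will usually land in the middle of a line, so the first readLine() result is probably a fragment. If line lengths vary a lot, the guess can come up short, and you would have to back up further and try again.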

I hope this helps.
Peter Tran
Bartender

Joined: Jan 02, 2001
Posts: 783
Gerry, Zkr,
These are good starting attempts, but one of the unique features of the UNIX tail command is that you can run tail -f against a file and see new lines as they are written to it. Bonus question: can you see how to do this with the current java.io.* classes? I'll check in later to review your solution. Figure this one out and you'll really feel like a java.io guru.
-Peter
James Gray
Ranch Hand

Joined: Sep 10, 2001
Posts: 30
Heck, I'm interested in the answer to that question! Would available() do the trick, occasionally checked from a background thread?
Gerry Giese
Ranch Hand

Joined: Aug 02, 2001
Posts: 247
Just checked in to see the replies - I've been busy putting an app into production... the horror! the horror!
Zkr Ryz: Interesting solution. Bound to be fast, yes, but as you note it's difficult to get elegant results because of variation in line length. The lines in the log files I'm trying to Tail vary from 20 chars to 150-200 on a fairly regular basis, but in chunks (i.e., an app or class has debugging turned 'on' briefly and is extremely verbose, then is turned off). I'll definitely consider using your solution or some version of it if performance becomes an issue. I just don't know how big the log files I'm dealing with will become.
Peter: I was purposely not trying to do follow since my intended display is a webpage. I have logfile access permissions, but others on my development team do not, and I wanted them to be able to check out the logfile for debugging purposes until we can get them permissions or have split the logfile out into app logs and system logs so we can get access. I am intrigued by trying to get it to work anyway, for my own personal knowledge. I'm pretty weak in Java I/O, and need to work on networking and threads, too.
Peter & James: I looked around the API just briefly and I'm particularly interested in the 'LineNumberReader' because it allows you to track line numbers as well as mark/reset to a point in the stream or skip to a particular spot. It would take some work to understand the class and how to do it properly, but you could read off the end, track your position, then either reset or skip to it on subsequent reads performed on a timed basis using some sort of wait timer. Peter: am I close?
All: Thanks for the replies!
I probably should have done this to begin with, but here's the first version of the code FYI (I have a servlet also that uses Tail to see the last X lines, which is why I wrote the dang thing in the first place!):
<pre>
import java.io.*;
import java.util.Vector;

class Tail
{
    private String _fileName = "";
    private int _tailSize = 25; // default tailsize

    Vector _theList = new Vector(5000,500);

    public Tail( String fileName ) { _fileName = fileName; }

    public Tail( String fileName, int tailSize )
    {
        _fileName = fileName; _tailSize = tailSize;
    }

    public StringBuffer run()
    {
        StringBuffer output = new StringBuffer("");
        String thisLine;

        try
        {
            FileReader fr = new FileReader( _fileName );
            BufferedReader myInput = new BufferedReader( fr );

            while( ( thisLine = myInput.readLine() ) != null )
            {
                _theList.add( thisLine );
            }

        } // end try
        catch( IOException e )
        {
            output.append( "Error reading file: " + e );
        }

        _theList.trimToSize();

        int end = _theList.size();
        int start = _theList.size() - _tailSize;
        if( start < 0 ) { start = 0; _tailSize = end; }
        output.append( "========== Tail [" + _fileName + "]\n" );
        output.append( "========== Showing last [" + _tailSize + "] lines\n" );

        for( int i = start; i < end; i++ )
        {
            output.append( (String)_theList.get(i) + "\n" );
        }
        output.append( "========== END\n" );

        return output;
    }

    public static void main( String args[] )
    {
        int argsLength = args.length;

        if( argsLength == 0 ) System.exit(0);
        if( argsLength == 1 )
        {
            Tail tail = new Tail( args[0] );
            System.out.println( tail.run().toString() );
        }
        if( argsLength == 2 )
        {
            Tail tail = new Tail( args[0], new Integer( args[1] ).intValue() );
            System.out.println( tail.run().toString() );
        }
    }
}
</pre>
Zkr Ryz
Ranch Hand

Joined: Jan 04, 2001
Posts: 187
Peter: probably by using a thread that checks whether the file size has grown . . .
mmhhh, I don't know, maybe using the ready() method in Reader??
To be honest, I don't know.
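For what it's worth, the "check whether the file has grown" idea might be sketched like this. It is a guess at one possible approach, not a confirmed answer to Peter's question: the names Follow and readNew are made up, the one-second poll interval is arbitrary, and bytes are decoded as ISO-8859-1 purely to keep the example simple.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

// Hypothetical polling follower in the spirit of tail -f.
class Follow {
    // Appends whatever was written after `offset` to `out`
    // and returns the new offset.
    static long readNew(String fileName, long offset, StringBuilder out) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(fileName, "r")) {
            long length = raf.length();
            if (length < offset) {
                offset = 0; // file was truncated or replaced: start over
            }
            raf.seek(offset);
            byte[] chunk = new byte[8192];
            while (offset < length) {
                int want = (int) Math.min(chunk.length, length - offset);
                int read = raf.read(chunk, 0, want);
                if (read <= 0) {
                    break;
                }
                out.append(new String(chunk, 0, read, StandardCharsets.ISO_8859_1));
                offset += read;
            }
            return offset;
        }
    }

    public static void main(String[] args) throws Exception {
        // Start at the current end so only newly written data is shown.
        long offset = new File(args[0]).length();
        while (true) {
            StringBuilder out = new StringBuilder();
            offset = readNew(args[0], offset, out);
            System.out.print(out);
            System.out.flush();
            Thread.sleep(1000); // poll once per second
        }
    }
}
```

Reopening the file on every poll is wasteful but means a rotated or recreated log is picked up again; keeping one RandomAccessFile open between polls would be the cheaper variant.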



Pho Tek
Ranch Hand

Joined: Nov 05, 2000
Posts: 761

There's a Java implementation called "follow"
at http://follow.sourceforge.net/


Regards,

Pho
Gerry Giese
Ranch Hand

Joined: Aug 02, 2001
Posts: 247
Thanks, Pho! It will take some adaptation to make it work with a servlet, but the core FileFollow class seems to be separated enough from the GUI to make it possible. My biggest challenge will be figuring out how to keep the 'connection' alive so the program can keep pushing followed updates to the web page. I guess I'll need to just keep sending output in a loop and flush the buffer every so often, or maybe use a meta-refresh and keep the latest updates buffered in a Session object. Again, thanks!
 