Recursive Read of Files from a Directory.

Arun Seshadri
Greenhorn

Joined: Dec 22, 2005
Posts: 18
I am trying to recursively read all the files under a given directory.
Is there an API to do this? Any help is appreciated.

Thanks
Arun
Jeff Albertson
Ranch Hand

Joined: Sep 16, 2005
Posts: 1780
Do you need anything besides java.io.File?


There is no emoticon for what I am feeling!
Casper Maxwell
Ranch Hand

Joined: Aug 04, 2005
Posts: 88
You may look at the following tips:

How to follow a directory structure

How to get the list of specific file types in a directory
Jim Yingst
Wanderer
Sheriff

Joined: Jan 30, 2000
Posts: 18671
In File, the listFiles() method is your friend. I've never understood why people keep using list() when they need File objects immediately after.
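A minimal sketch of a recursive listing built on listFiles(), in case it helps. The class and method names (RecursiveLister, listRecursively) are made up for illustration and are not from any code posted in this thread.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

class RecursiveLister {

    /** Collects every non-directory entry under root, descending into subdirectories. */
    static List<File> listRecursively(File root) {
        List<File> found = new ArrayList<File>();
        collect(root, found);
        return found;
    }

    private static void collect(File dir, List<File> found) {
        // listFiles() returns null if dir is not a directory or cannot be read.
        File[] entries = dir.listFiles();
        if (entries == null) {
            return;
        }
        for (File entry : entries) {
            if (entry.isDirectory()) {
                collect(entry, found);
            } else {
                found.add(entry);
            }
        }
    }

    public static void main(String[] args) {
        // Hypothetical usage: java RecursiveLister <directory>
        File root = new File(args.length > 0 ? args[0] : ".");
        for (File f : listRecursively(root)) {
            System.out.println(f.getPath());
        }
    }
}
```

The null check matters: an unreadable or missing directory makes listFiles() return null rather than throw, so without it the recursion fails with a NullPointerException.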


"I'm not back." - Bill Harding, Twister
Arun Seshadri
Greenhorn

Joined: Dec 22, 2005
Posts: 18
Thanks for the tips. The code was indeed helpful.

I have a list of files under the directory (in excess of 20K) and would
like to know if there is a faster way to implement the listing of files.

Is there anything in the Java NIO package that we can use to speed up the file listing?

Many Thanks
Arun
Arun Seshadri
Greenhorn

Joined: Dec 22, 2005
Posts: 18
Another small problem, noticed on Windows XP Professional when running Java 1.5.

If I use the following command, the program does not work:
java TestFiles "X:\Java5\Transfer\tests"

The following command works fine.

java TestFiles "X:\Java5\Transfer"

Could anyone suggest what the problem could be?

Thanks
Arun
Jim Yingst
Wanderer
Sheriff

Joined: Jan 30, 2000
Posts: 18671
Does X:\Java5\Transfer\tests exist?

Is it a directory?

Do you have read permission in that directory?

Is there anything in the directory?

If the above questions do not suggest an explanation, please show the code you're using here. And describe exactly how it "doesn't work".
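The questions above can also be asked from Java directly. This is only a rough sketch; DirectoryCheck and diagnose are hypothetical names, and the messages are illustrative.

```java
import java.io.File;

class DirectoryCheck {

    /** Returns a short description of why listing dir might fail, or "ok". */
    static String diagnose(File dir) {
        if (!dir.exists()) {
            return "does not exist";
        }
        if (!dir.isDirectory()) {
            return "not a directory";
        }
        if (!dir.canRead()) {
            return "not readable";
        }
        // list() returns null on an I/O error, even if the checks above passed.
        String[] names = dir.list();
        if (names == null) {
            return "could not be listed";
        }
        if (names.length == 0) {
            return "empty";
        }
        return "ok";
    }

    public static void main(String[] args) {
        // Hypothetical usage: java DirectoryCheck <directory>
        File dir = new File(args.length > 0 ? args[0] : ".");
        System.out.println(dir.getAbsolutePath() + ": " + diagnose(dir));
    }
}
```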

Hope that helps...
[ January 05, 2006: Message edited by: Jim Yingst ]
Arun Seshadri
Greenhorn

Joined: Dec 22, 2005
Posts: 18
The files were indeed not accessible. I have sorted that part out. The only problem that remains is the speed of the following code.

Could you please tell me what is wrong with the following method? It takes way too long to produce any output.

Am I doing something wrong?


Thanks
Arun
[ January 06, 2006: Message edited by: Jim Yingst ]
Jim Yingst
Wanderer
Sheriff

Joined: Jan 30, 2000
Posts: 18671
Please use code tags in the future. I added some to your previous post for you.

The code you show wouldn't have any output. Do you mean you uncommented the System.out.println()? Was there some output, just very slow? Or was there no output, despite a long wait?

Unfortunately, list() and listFiles() can be very slow on some machines if the directory is very large. Now that I think about it, you might get faster results here using list() instead, though I doubt it. Maybe you should try it though. Anyway, there's not a lot you can do to speed this up unless you're willing to do some platform-specific stuff.

Try using the command line to list the contents of this huge directory, and compare this to the time it takes to call File.listFiles() or File.list() on the same directory. Is the command line much faster than list()? If not, then the problem is probably with your underlying system, and changing your program is unlikely to help you. But if the command line is much faster, then you might want to use Runtime.exec() to do the directory listing, then read lines from the InputStream of the Process object which exec() returns. Each line would be the name of one of the files or subdirectories. This is inelegant, but if the command line is much faster than list(), this will allow you to improve performance on that particular system at least.
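A rough sketch of the Runtime.exec() approach, for what it's worth. ExecLister and readLines are made-up names, and the command in main is an assumption: it uses the Windows "dir /s /b" form, which prints one full path per line, so swap in whatever listing command your platform has.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

class ExecLister {

    /** Runs command and returns each line it prints to standard output. */
    static List<String> readLines(String[] command) throws IOException, InterruptedException {
        Process process = Runtime.getRuntime().exec(command);
        List<String> lines = new ArrayList<String>();
        BufferedReader reader =
                new BufferedReader(new InputStreamReader(process.getInputStream()));
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } finally {
            reader.close();
        }
        // Note: stderr is not drained here. A command that writes a lot to
        // stderr could block; a real program should read that stream as well.
        process.waitFor();
        return lines;
    }

    public static void main(String[] args) throws Exception {
        String dir = args.length > 0 ? args[0] : ".";
        // Assumption: running on Windows, where "dir" is a cmd built-in.
        String[] command = {"cmd", "/c", "dir", "/s", "/b", dir};
        long start = System.currentTimeMillis();
        List<String> lines = readLines(command);
        long elapsed = System.currentTimeMillis() - start;
        System.out.println(lines.size() + " entries in " + elapsed + "ms");
    }
}
```

Timing this against a listFiles() run on the same directory gives the comparison described above.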

You could also use JNI to call a C method. I haven't used JNI in ages, and my experience is that using exec() will probably be easier.

Hope that helps...
[ January 06, 2006: Message edited by: Jim Yingst ]
Arun Seshadri
Greenhorn

Joined: Dec 22, 2005
Posts: 18
I tried the command line for the recursive listing, and there is about a 10 second difference between running the command on the command line and running the Java program using listFiles(). I did not notice a major difference between listFiles() and list().

I guess performing a JNI system call may be a good idea. Would some methods from the NIO FileChannels be useful to get the filename?

Many Thanks
Arun
Jim Yingst
Wanderer
Sheriff

Joined: Jan 30, 2000
Posts: 18671
[arun]: it has about 10 second difference

So is that like the difference between 0.1 sec and 10.1 sec? Or the difference between 1000 sec and 1010 sec? Is ten seconds a big difference or a small one?

[arun]: Would some methods from the NIO FileChannels be useful to get the filename?

Not that I know of. I've seen occasional mentions over the last few years of a possible new filesystem API to come out sometime. But it didn't make it into JDK 5 it seems, and doesn't seem to be in JDK 6 so far, so who knows? Don't hold your breath.
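For readers on Java 7 or later: a filesystem API did eventually ship as the java.nio.file package, which is not available on the JDK 5 setup discussed in this thread. A minimal sketch of a recursive listing with Files.walkFileTree follows; NioLister and listRecursively are hypothetical names, and whether this is any faster on a slow network drive is not something this sketch claims.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;

class NioLister {

    /** Collects every regular file under root using the Java 7+ file tree walker. */
    static List<Path> listRecursively(Path root) throws IOException {
        final List<Path> found = new ArrayList<Path>();
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                if (attrs.isRegularFile()) {
                    found.add(file);
                }
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFileFailed(Path file, IOException exc) {
                // Skip unreadable entries instead of aborting the whole walk.
                return FileVisitResult.CONTINUE;
            }
        });
        return found;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical usage: java NioLister <directory>
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        for (Path p : listRecursively(root)) {
            System.out.println(p);
        }
    }
}
```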

Good luck...
Arun Seshadri
Greenhorn

Joined: Dec 22, 2005
Posts: 18
Running the test with listFiles() gives the following output:

Elapsed = 49487
Listing Files to Array took 49487ms
Total Files Found 5

And if I run the command line to list all the files, it takes approximately 39 seconds. I would say the Java program does indeed take longer. It would be good to see if there are any other ways to decrease the time to search for files.

How about JNI? How can that help me speed up the search?

Thanks
Arun
Jim Yingst
Wanderer
Sheriff

Joined: Jan 30, 2000
Posts: 18671
Can't help you much with JNI other than to direct you to google it or read the tutorial. I doubt it will help much here. More importantly:

Total Files Found 5

Is this correct? Are there really just 5 files here? And it takes 39 seconds even from the command line? Something is seriously, seriously wrong with your system, having nothing to do with Java. Or maybe that 5 is wrong? Could you show the code you're actually using here? How is the 5 counted? Are you doing anything else with each file?
Arun Seshadri
Greenhorn

Joined: Dec 22, 2005
Posts: 18
As requested the code is below.



I think there may be something wrong with the system, or I am doing something wrong. I am in the process of writing an application which continuously monitors the folders to see if there are any new files. But when the application starts, it needs to check whether any files already exist.

On the command line, it took 39 seconds to look for the files.

[ January 09, 2006: Message edited by: Jim Yingst ]
[ January 09, 2006: Message edited by: Arun Seshadri ]
Jim Yingst
Wanderer
Sheriff

Joined: Jan 30, 2000
Posts: 18671
I edited your post to fix the code tag (you had [\code] instead of [/code]) and fix the indentation. Mixing spaces and tabs for indentation is a bad idea, especially if you set the tab size to anything other than 8 spaces. Most browsers will convert the tabs to 8 spaces, and if that's not what you intended, the indentation gets screwed up.

Anyway: are there really only 5 files? And the command line takes 39 seconds?
Arun Seshadri
Greenhorn

Joined: Dec 22, 2005
Posts: 18
Sorry about the formatting.

No, in that case there were 5 files, but in reality we may have anywhere up to 10,000 files of variable length. That is the reason I was looking to see if something better can be done.
[ January 09, 2006: Message edited by: Arun Seshadri ]
Jim Yingst
Wanderer
Sheriff

Joined: Jan 30, 2000
Posts: 18671
All right, for the moment let's ignore the up to 10,000 files in real life. And forget about your Java program. With just 5 files - listing them recursively from the command line takes 39 sec? Really? What command are you giving exactly?

The only time I've ever witnessed behavior that slow, it was on a networked drive in a badly overworked network. Are you using a drive that's on a very busy network? Is there anything else unusual about your system that you know of?
Arun Seshadri
Greenhorn

Joined: Dec 22, 2005
Posts: 18
Jim,

You are brilliant.
On the command line, dir /S takes 39 seconds, and dir /S /B takes a lot less, approx 9 seconds.

As you suggested, listing files on the local C: drive works fine, but on the network drive it is stuffed. So I guess the problem is not with listFiles() but with the network performance.





Thank you very much for your valuable help.

Many Thanks
Arun
[ January 10, 2006: Message edited by: Arun Seshadri ]
 
 