It's not a secret anymore!
The moose likes Java in General and the fly likes searching file among 3 hundred thousand files Big Moose Saloon

searching file among 3 hundred thousand files

gaurav kumar

Joined: Apr 20, 2006
Posts: 16
Hi all,
I have to search for a particular PDF file among a collection of more than 300,000 files. I first create a File object and then read all the file names into a String array for further processing. But when I run this, my server hangs. Could this be because of the very large number of file names in the String array? Is there some other data structure that would give better performance?
Ramen Chatterjee
Ranch Hand

Joined: Apr 27, 2006
Posts: 62
Could you be clearer about what you are trying to achieve, i.e. why are you looking for this file? Also, 300,000 files is a lot! How are these stored? Do you have all 300,000 in one directory (is this possible?)?

Could try harder
gaurav kumar

Joined: Apr 20, 2006
Posts: 16
The files are reports that get generated periodically. The requirements mean that this many files must be present in a single directory at one time (300,000 is the maximum; there can also be fewer than this).
Also, the piece of code that is causing the problem is below:

File file = new File("D:\\project\\ftp\\Example test invoiced");
String[] filelst = file.list();

Thanks in advance.
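
For the question above, a minimal sketch of two ways to avoid building a 300,000-element String array with file.list(). It assumes Java 7 or later (the java.nio.file API) and that either the exact file name or a name pattern is known in advance; the class name ReportLookup, the method names, and the file name invoice-0001.pdf are made up for illustration.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class ReportLookup {

    // Case 1: the exact file name is known. Ask the file system directly;
    // no directory listing is built at all.
    static boolean reportExists(Path dir, String fileName) {
        return Files.isRegularFile(dir.resolve(fileName));
    }

    // Case 2: only a pattern is known. DirectoryStream hands back entries
    // lazily, so the full list of names is never held in memory at once;
    // only the matching entries are collected.
    static List<Path> findMatching(Path dir, String glob) throws IOException {
        List<Path> matches = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, glob)) {
            for (Path entry : stream) {
                matches.add(entry);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Directory taken from the post above; the file name is a made-up example.
        Path dir = Paths.get("D:\\project\\ftp\\Example test invoiced");
        System.out.println(reportExists(dir, "invoice-0001.pdf"));
    }
}
```

If the name is known, the first method is likely all that is needed. The second still has to walk the directory, so it saves memory rather than time.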
Dave Wingate
Ranch Hand

Joined: Mar 26, 2002
Posts: 262
One possible improvement would be to make your directory structure less flat. So instead of having one report directory with 300,000 files, maybe you could group the reports in some meaningful way:

That way, you don't have to create an array with 300,000 members just to iterate through all of the file names.
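
One way a less flat layout could look, as a rough sketch only. The grouping here (a bucket derived from the hash of the file name) is an assumption chosen because it needs no knowledge of the reports themselves; grouping by date or customer may be more meaningful. ShardedReports, BUCKETS, and the method names are hypothetical, and the code assumes Java 8 or later.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class ShardedReports {

    // Hypothetical bucket count: with 300,000 reports, 256 buckets leaves
    // roughly 1,200 files per subdirectory on average.
    static final int BUCKETS = 256;

    // Maps a report name to root/<two hex digits>/<name>. The same name
    // always maps to the same bucket, so lookups never need a listing.
    static Path shardedPath(Path root, String fileName) {
        int bucket = Math.floorMod(fileName.hashCode(), BUCKETS);
        return root.resolve(String.format("%02x", bucket)).resolve(fileName);
    }

    // Writes a report into its bucket, creating the subdirectory if needed.
    static Path store(Path root, String fileName, byte[] content) throws IOException {
        Path target = shardedPath(root, fileName);
        Files.createDirectories(target.getParent());
        return Files.write(target, content);
    }

    // Checks for a report by recomputing its bucket from the name.
    static boolean exists(Path root, String fileName) {
        return Files.isRegularFile(shardedPath(root, fileName));
    }

    public static void main(String[] args) {
        // Made-up root and file name, printed only to show the layout.
        System.out.println(shardedPath(Paths.get("reports"), "invoice-0001.pdf"));
    }
}
```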
[ June 23, 2006: Message edited by: Dave Wingate ]

Fun programming etcetera!
Stan James
(instanceof Sidekick)
Ranch Hand

Joined: Jan 29, 2003
Posts: 8791
You might want to index these files as they are created so you don't have to search through every file every time you need something. I use Lucene for indexing with good results.
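
To illustrate the idea of indexing at creation time, here is a minimal sketch that uses a plain in-memory map instead of Lucene. It is not the Lucene API, just a stand-in showing the principle: register each report when it is written, then look it up by name without touching the directory. ReportIndex and its methods are hypothetical names, and the index is assumed to live in the same JVM that creates the reports (it is lost on restart unless rebuilt or persisted).

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

class ReportIndex {

    // File name -> full path. ConcurrentHashMap so that report writers
    // and searchers can use the index from different threads.
    private final Map<String, Path> byName = new ConcurrentHashMap<>();

    // Call this from whatever code writes a new report to disk.
    void register(Path report) {
        byName.put(report.getFileName().toString(), report);
    }

    // Constant-time lookup by name; no directory scan involved.
    Optional<Path> find(String fileName) {
        return Optional.ofNullable(byName.get(fileName));
    }

    int size() {
        return byName.size();
    }

    public static void main(String[] args) {
        // Made-up path, for illustration only.
        ReportIndex index = new ReportIndex();
        index.register(Paths.get("reports", "invoice-0001.pdf"));
        System.out.println(index.find("invoice-0001.pdf").isPresent());
    }
}
```

A real full-text index such as Lucene would also allow searching on report contents, which a name-to-path map cannot do.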

A good question is never answered. It is not a bolt to be tightened into place but a seed to be planted and to bear more seed toward the hope of greening the landscape of the idea. John Ciardi