Searching for content in a file

 
N Naresh
Ranch Hand
Posts: 66
Hi, I want to search for content in different types of files, such as MS Word, text, Excel, PDF, PPT, etc.

Could anybody tell me the best way to search for content in any type of file using BufferedReader?

Right now I am using the following code, where "text" is the content to search for. The problem is that the file is reported as a match even when I search for content it does not contain.

BufferedReader in = null;
try
{
    in = new BufferedReader( new FileReader( f ) );
    String line;

    while ( ( line = in.readLine() ) != null )
    {
        if ( line.toLowerCase().indexOf( text ) != -1 )
        {
            return true;
        }
    }
}
catch ( IOException e )
{
    cLog.error( cConfiguration.getFormattedString("search.error",new Object[]{f}), e );
    return false;
}
finally
{
    if ( in != null )
    {
        try { in.close(); } catch ( IOException e ) {}
    }
}
return false;
 
Ulf Dittmer
Rancher
Posts: 42966
Text files are the only ones you'll be able to search reliably using this approach. The other formats are binary, structured files that are not amenable to this kind of simplistic approach.
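For the plain-text case, here is a minimal sketch of a case-insensitive line search using only the standard library (Java 7 or later assumed). The class name TextFileSearch and the explicit charset parameter are illustrative and not from the code in the question. The sketch lower-cases the search text as well as each line, and it rejects an empty search text, because indexOf("") is 0 for every line. That is one possible cause of unexpected matches, though not necessarily the one in the question.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Locale;

// Illustrative helper; the name is an assumption, not part of the original code.
final class TextFileSearch {

    /**
     * Returns true if any line of the file contains the search text, ignoring case.
     * An empty or null search text never matches.
     */
    static boolean contains(Path file, String text, Charset charset) throws IOException {
        if (text == null || text.isEmpty()) {
            return false;
        }
        // Lower-case the search text once so it is compared like-for-like with each line.
        String needle = text.toLowerCase(Locale.ROOT);
        // try-with-resources closes the reader even if readLine() throws.
        try (BufferedReader in = Files.newBufferedReader(file, charset)) {
            String line;
            while ((line = in.readLine()) != null) {
                if (line.toLowerCase(Locale.ROOT).contains(needle)) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("search-demo", ".txt");
        Files.write(tmp, "Hello World\nsecond line\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(contains(tmp, "WORLD", StandardCharsets.UTF_8));   // prints true
        System.out.println(contains(tmp, "missing", StandardCharsets.UTF_8)); // prints false
        Files.delete(tmp);
    }
}
```

Pointing this at a binary file with a strict charset such as UTF-8 will typically throw a MalformedInputException (a subclass of IOException) rather than return a meaningful answer, which is consistent with the point that only text files can be searched this way.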

I'd look into using a search library like Lucene for this, but you'll still need to use particular libraries to get at the contents of those formats, like Apache POI for DOC, XLS and PPT, and PDFBox for PDFs.
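One way to structure this, sketched under assumptions: pick a text extractor by file extension, then search the extracted text. The ContentSearch class and the TextExtractor interface below are hypothetical names made up for illustration. Only the plain-text extractor is implemented, and it assumes UTF-8. Extractors for DOC, XLS, PPT and PDF would have to be written against the real Apache POI and PDFBox APIs, which are deliberately not shown here.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical dispatcher: chooses an extractor by file extension, then searches its output.
final class ContentSearch {

    /** Hypothetical interface: turns a file of some format into plain text. */
    interface TextExtractor {
        String extract(Path file) throws IOException;
    }

    private final Map<String, TextExtractor> extractors = new HashMap<>();

    ContentSearch() {
        // Plain text needs no extra library. UTF-8 is an assumption.
        register("txt", file -> new String(Files.readAllBytes(file), StandardCharsets.UTF_8));
        // Extractors for doc, xls, ppt and pdf would be registered here,
        // each backed by a format library such as Apache POI or PDFBox.
    }

    void register(String extension, TextExtractor extractor) {
        extractors.put(extension.toLowerCase(Locale.ROOT), extractor);
    }

    /** Returns true if the text extracted from the file contains the search text, ignoring case. */
    boolean contains(Path file, String text) throws IOException {
        if (text == null || text.isEmpty()) {
            return false;
        }
        String name = file.getFileName().toString();
        int dot = name.lastIndexOf('.');
        String ext = dot < 0 ? "" : name.substring(dot + 1).toLowerCase(Locale.ROOT);
        TextExtractor extractor = extractors.get(ext);
        if (extractor == null) {
            return false; // no extractor for this format: report "not found" instead of guessing
        }
        String content = extractor.extract(file).toLowerCase(Locale.ROOT);
        return content.contains(text.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("content-demo", ".txt");
        Files.write(tmp, "Quarterly Report".getBytes(StandardCharsets.UTF_8));
        System.out.println(new ContentSearch().contains(tmp, "report"));
        Files.delete(tmp);
    }
}
```

Reading the whole file into memory keeps the sketch short. For a large collection of documents, an index built with a search library such as Lucene would likely be a better fit than scanning each file on every query.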
 
N Naresh
Ranch Hand
Posts: 66
Thanks for your reply. Could you please give me a small example of how to search binary files?
 
Ulf Dittmer
Rancher
Posts: 42966
Apache POI has extensive documentation and examples online. For PDFBox, look for "text extraction" on its web site.
 
N Naresh
Ranch Hand
Posts: 66
Thank you very much. Could you please give me the website URL?
 
Rob Spoor
Sheriff
Posts: 20381
http://www.google.com
 
I agree. Here's the link: http://aspose.com/file-tools