Hi, I want to search for content in different types of files, such as MS Word, text, Excel, PDF, PPT, etc.
Could anybody tell me the best way to search for content in any type of file using BufferedReader?
Right now I am using the following code, where "text" is the content to be searched for. The problem is that I get the filename back even when I search for content that is not in the file.
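import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;

// The method header and the opening of the try block were not part of the
// original post; they are assumed here so that the fragment is complete.
// cLog and cConfiguration are the poster's own logging/configuration objects.
private boolean containsText( File f, String text )
{
    BufferedReader in = null;
    // Lower-case the search term too, since each line is lower-cased before
    // the comparison. Note that an empty search term matches every line.
    String needle = text.toLowerCase();
    try
    {
        in = new BufferedReader( new FileReader( f ) );
        String line;
        while ( ( line = in.readLine() ) != null )
        {
            if ( line.toLowerCase().indexOf( needle ) != -1 )
            {
                return true;
            }
        }
    }
    catch ( IOException e )
    {
        cLog.error( cConfiguration.getFormattedString( "search.error", new Object[]{ f } ), e );
        return false;
    }
    finally
    {
        if ( in != null )
        {
            try { in.close(); } catch ( IOException e ) { /* nothing useful to do if close fails */ }
        }
    }
    return false;
}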
Ulf Dittmer
Text files are the only ones you'll be able to search reliably using this approach. The other formats are binary, structured files that can't be searched line by line with a BufferedReader.
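As a rough illustration of that distinction, here is a minimal sketch of a check that could be run before the line-by-line search to skip files that are probably binary. The class name BinarySniffer, the method looksBinary and the 8000-byte window are made up for this example, and the NUL-byte test is only a heuristic: UTF-16 text files contain NUL bytes too, and a binary file is not guaranteed to have one near the start.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of any library.
class BinarySniffer {

    // Assumption: inspecting the first 8000 bytes is enough for a guess.
    private static final int SAMPLE_SIZE = 8000;

    /** Reads the start of the file and reports whether it contains a NUL byte. */
    static boolean looksBinary(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            byte[] head = new byte[SAMPLE_SIZE];
            int n = in.readNBytes(head, 0, head.length); // requires Java 9 or later
            return looksBinary(head, n);
        }
    }

    /** Heuristic only: plain text in an 8-bit encoding normally has no NUL bytes. */
    static boolean looksBinary(byte[] data, int length) {
        for (int i = 0; i < length; i++) {
            if (data[i] == 0) {
                return true;
            }
        }
        return false;
    }
}
```

A file that fails this check could simply be skipped, or handed to a format-specific reader instead of the BufferedReader loop.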
I'd look into using a search library like Lucene for this, but you'll still need to use particular libraries to get at the contents of those formats, like Apache POI for DOC, XLS and PPT, and PDFBox for PDFs.
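One way to organise this, sketched below using only the JDK, is to pick a text extractor by file extension and run the same case-insensitive search on whatever text it returns. ContentSearch, TextExtractor, register and contains are hypothetical names invented for this example. Only the plain-text extractor is implemented, and it assumes UTF-8. Extractors for DOC, XLS, PPT and PDF would wrap Apache POI or PDFBox and are deliberately left out rather than guessed at.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical class, a sketch rather than a finished design.
class ContentSearch {

    /** Hypothetical interface: turns one file into plain text. */
    interface TextExtractor {
        String extract(Path file) throws IOException;
    }

    private final Map<String, TextExtractor> extractors = new HashMap<>();

    ContentSearch() {
        // Plain text needs no extra library. Assumption: the file is UTF-8.
        register("txt", file -> new String(Files.readAllBytes(file), StandardCharsets.UTF_8));
        // Extractors for "doc", "xls", "ppt" and "pdf" would be registered here,
        // each one wrapping the library that understands that format.
    }

    void register(String extension, TextExtractor extractor) {
        extractors.put(extension.toLowerCase(Locale.ROOT), extractor);
    }

    /** Case-insensitive search; files with no registered extractor never match. */
    boolean contains(Path file, String text) throws IOException {
        if (text.isEmpty()) {
            return false; // an empty search term would otherwise match every file
        }
        String name = file.getFileName().toString();
        int dot = name.lastIndexOf('.');
        String extension = dot < 0 ? "" : name.substring(dot + 1).toLowerCase(Locale.ROOT);
        TextExtractor extractor = extractors.get(extension);
        if (extractor == null) {
            return false; // unknown format: do not search the raw bytes
        }
        String content = extractor.extract(file).toLowerCase(Locale.ROOT);
        return content.contains(text.toLowerCase(Locale.ROOT));
    }
}
```

With this shape, new ContentSearch().contains(path, "invoice") only reports a hit for formats that have an extractor registered, so a search term that is not in the file no longer matches by accident against raw binary data. For a large number of files, the extracted text could be fed into a Lucene index instead of being scanned on every search.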