question about lucene index creating

jonathan ford
Greenhorn

Joined: Nov 07, 2007
Posts: 9
The two arguments are c:\index and c:\data. There is one text file in c:\data. When I run the program, nothing is created in c:\index. The code follows:


package indexer;

import java.io.*;
import java.util.*;
import org.apache.lucene.index.*;
import org.apache.lucene.analysis.standard.*;
import org.apache.lucene.document.*;

public class Indexer {

    /**
     * @param args index directory, then data directory
     */
    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            throw new Exception("Usage: java " + Indexer.class.getName()
                    + " <index dir> <data dir>");
        }
        File indexDir = new File(args[0]);
        File dataDir = new File(args[1]);
        long start = new Date().getTime();
        int numIndexed = index(indexDir, dataDir);
        long end = new Date().getTime();
        System.out.println("Indexing " + numIndexed + " files took "
                + (end - start) + " milliseconds");
    }

    // open an index and start file directory traversal
    public static int index(File indexDir, File dataDir) throws IOException {
        if (!dataDir.exists() || !dataDir.isDirectory()) {
            throw new IOException(dataDir
                    + " does not exist or is not a directory");
        }
        //IndexWriter writer = new IndexWriter(indexDir, new StandardAnalyzer(), true);
        IndexWriter writer = new IndexWriter(indexDir, new StandardAnalyzer());
        writer.setUseCompoundFile(false);
        indexDirectory(writer, dataDir);
        int numIndexed = writer.docCount();
        writer.optimize();
        writer.close();
        return numIndexed;
    }

    // recursive method that calls itself when it finds a directory
    private static void indexDirectory(IndexWriter writer, File dir) throws IOException {
        File[] files = dir.listFiles();
        for (int i = 0; i < files.length; i++) {
            File f = files[i];
            if (f.isDirectory()) {
                indexDirectory(writer, f);
            }
            else if (f.getName().endsWith(".txt")) {
                indexFile(writer, f);
            }
        }
    }

    // method to actually index a file using Lucene
    private static void indexFile(IndexWriter writer, File f) throws IOException {
        if (f.isHidden() || !f.exists() || !f.canRead()) {
            return;
        }
        System.out.println("Indexing " + f.getCanonicalPath());
        Document doc = new Document();
        //doc.add(Field.Text("contents", new FileReader(f)));
        doc.add(new Field("contents", new FileReader(f)));

        doc.add(new Field("filename", f.getCanonicalPath(), Field.Store.YES, Field.Index.UN_TOKENIZED));
        writer.addDocument(doc);
    }
}
Stan James
(instanceof Sidekick)
Ranch Hand

Joined: Jan 29, 2003
Posts: 8791
Hi, welcome to the ranch! For future reference, see the CODE button below the post editor when posting code. That preserves your formatting and makes things much easier to read.

I don't see anything that jumps out at me here. I assume you've confirmed, via the System.out.println in indexFile, that the program actually reaches the file. I've never used the Field constructor that takes a Reader. For what it's worth, here's part of my code:

[ November 08, 2007: Message edited by: Stan James ]
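
One way to check the point about actually reaching the file, without involving Lucene at all, is to run only the traversal and print what it would pick up. The following is a minimal sketch, not code from either post: the class name TxtFileFinder and the method findTextFiles are hypothetical, and it uses only java.io. It applies the same filters as indexDirectory and indexFile in the original post (name ends with ".txt", not hidden, readable) and adds a null check, since File.listFiles() returns null when the path is not a readable directory.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic helper; not part of Lucene or of the posted Indexer.
class TxtFileFinder {

    // Collects the readable, non-hidden .txt files under dir, applying the
    // same filters as indexDirectory() and indexFile() in the original post.
    static List<File> findTextFiles(File dir) throws IOException {
        List<File> found = new ArrayList<File>();
        File[] files = dir.listFiles();
        if (files == null) {
            // listFiles() returns null if dir is not a directory or cannot be read
            throw new IOException(dir + " is not a readable directory");
        }
        for (File f : files) {
            if (f.isDirectory()) {
                found.addAll(findTextFiles(f));
            } else if (f.getName().endsWith(".txt") && !f.isHidden() && f.canRead()) {
                found.add(f);
            }
        }
        return found;
    }

    public static void main(String[] args) throws IOException {
        // Defaults to the current directory when no data directory is given
        File dataDir = new File(args.length > 0 ? args[0] : ".");
        List<File> hits = findTextFiles(dataDir);
        for (File f : hits) {
            System.out.println("Would index " + f.getCanonicalPath());
        }
        System.out.println(hits.size() + " file(s) match");
    }
}
```

Running it as java TxtFileFinder c:\data should list the one text file. If it reports zero matches, the traversal is where to look: String.endsWith is case-sensitive, so a file named DATA.TXT would be skipped by the ".txt" test.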
