
Reindex

 
Ranch Hand
Posts: 17424
I reindexed all my posts (postId=0 to postId=99999) after which I couldn't find some posts (via search).

I looked into net.jforum.search.LuceneReindexer and saw that lastPostId was limited by:
int fetchCount = SystemGlobals.getIntValue(ConfigKeys.LUCENE_INDEXER_DB_FETCH_COUNT);

This means only postId=0 to postId=50 (the default FETCH_COUNT value) were reindexed.

In SystemGlobals.properties we can read:



This value is only read in LuceneReindexer.reindex(). I changed it to ConfigKeys.LUCENE_INDEXER_RAM_NUMDOCS (10000 by default) and all my posts were finally indexed.

I'm not at all familiar with Lucene. My question is: what do you mean by "Number of posts to retrieve on each read from the database"?
Why is it used in the reindex process? I sometimes reindex the database and want to be sure every post was reindexed. What should I do?

Thanks
[originally posted on jforum.net by Exo7]
 
Migrated From Jforum.net
Ranch Hand
Posts: 17424
"on each read" == "while there are data in the database, retrieve them as 50 by 50, to not overload the system memory".
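
The "50 by 50" reading can be pictured as splitting the requested ID range into fixed-size windows. The sketch below is only an illustration, not JForum's code: BatchWindows and windows are hypothetical names, and the inclusive bounds are an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of "retrieve 50 by 50"; not the JForum source.
class BatchWindows {
    // Splits the inclusive range [first, last] into windows of at most fetchCount IDs.
    static List<int[]> windows(int first, int last, int fetchCount) {
        if (fetchCount < 1) {
            throw new IllegalArgumentException("fetchCount must be at least 1");
        }
        List<int[]> result = new ArrayList<>();
        for (int from = first; from <= last; from += fetchCount) {
            int to = Math.min(from + fetchCount - 1, last);
            result.add(new int[] { from, to });
        }
        return result;
    }

    public static void main(String[] args) {
        // With fetchCount = 50, IDs 0..120 are read in three passes.
        for (int[] w : windows(0, 120, 50)) {
            System.out.println(w[0] + " to " + w[1]);
        }
    }
}
```

Each pass holds at most fetchCount posts in memory, which is the point of the setting.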

When did you download JForum? I upgraded the package yesterday (October 13) because of a bug in a Lucene query, which may be related to your problem (or not).

Are you able to always reproduce this problem? Which options did you select on the reindex page?

Rafael
[originally posted on jforum.net by Rafael Steil]
 
Migrated From Jforum.net
Ranch Hand
Posts: 17424
I downloaded the new version 3 days ago and corrected the Lucene query myself after warning you about it.

Rafael Steil wrote: Are you able to always reproduce this problem? Which options did you select on the reindex page?


Yes:
Lucene Statistics > By Message ID From 0 to 99999 > Check if message exists before adding to index AND Recreate index from scratch

It only reindexes from 0 to 50.

Rafael Steil wrote: "on each read" == "while there are data in the database, retrieve them as 50 by 50, to not overload the system memory".

OK, I may have found the problem:

Let's say I only have 3 posts in my DB, with postId 300, 301 and 302.
The reindex method will first check from 0 to 50.
The List l = dao.getPostsToIndex(firstPostId, toPostId);
is empty.

And hasMorePosts = hasMorePosts && l.size() > 0; will be set to false.
The loop exits.

None of my three posts has been reindexed.
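
The scenario above can be reproduced with a small stand-alone sketch. This is not the JForum source: EmptyBatchDemo, the in-memory list standing in for the database, and the way the window advances are all assumptions made for illustration. Only the hasMorePosts condition is taken from the post.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the reindex loop described above; not the JForum source.
class EmptyBatchDemo {
    // Fake DAO: returns the stored post IDs that fall inside [from, to].
    static List<Integer> getPostsToIndex(List<Integer> db, int from, int to) {
        List<Integer> batch = new ArrayList<>();
        for (int id : db) {
            if (id >= from && id <= to) {
                batch.add(id);
            }
        }
        return batch;
    }

    // Walks the ID range in windows of fetchCount and stops at the first empty window.
    static List<Integer> reindex(List<Integer> db, int firstPostId, int lastPostId, int fetchCount) {
        List<Integer> indexed = new ArrayList<>();
        boolean hasMorePosts = true;
        while (hasMorePosts) {
            int toPostId = Math.min(firstPostId + fetchCount, lastPostId);
            List<Integer> l = getPostsToIndex(db, firstPostId, toPostId);
            indexed.addAll(l);
            firstPostId = toPostId + 1;
            // The condition quoted in the post: an empty window ends the loop.
            hasMorePosts = hasMorePosts && l.size() > 0;
        }
        return indexed;
    }

    public static void main(String[] args) {
        List<Integer> db = List.of(300, 301, 302);
        // The window 0..50 is empty, so the loop ends before reaching ID 300.
        System.out.println(reindex(db, 0, 99999, 50).size());
    }
}
```

Calling reindex(db, 300, 99999, 50) on the same data picks up all three posts, because the first window is no longer empty.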
[originally posted on jforum.net by Exo7]
 
Migrated From Jforum.net
Ranch Hand
Posts: 17424
Yep, I understand the problem now. A fix is in the CVS already.

A workaround is to reindex by date instead of post id or, of course, to start with the first post id that's in the database. But anyway, a fix for it exists. I'll update the package in the next few days.
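
The fix committed to CVS is not shown in this thread. As a rough sketch of one possible approach, the loop can be bounded by the requested ID range rather than by whether the last window returned rows. RangeBoundedReindex and the in-memory list standing in for the database are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; the actual fix committed to CVS is not shown in this thread.
class RangeBoundedReindex {
    // Fake DAO: returns the stored post IDs that fall inside [from, to].
    static List<Integer> getPostsToIndex(List<Integer> db, int from, int to) {
        List<Integer> batch = new ArrayList<>();
        for (int id : db) {
            if (id >= from && id <= to) {
                batch.add(id);
            }
        }
        return batch;
    }

    // Visits every window up to lastPostId; an empty window is skipped, not treated as the end.
    static List<Integer> reindex(List<Integer> db, int firstPostId, int lastPostId, int fetchCount) {
        if (fetchCount < 1) {
            throw new IllegalArgumentException("fetchCount must be at least 1");
        }
        List<Integer> indexed = new ArrayList<>();
        while (firstPostId <= lastPostId) {
            int toPostId = Math.min(firstPostId + fetchCount - 1, lastPostId);
            indexed.addAll(getPostsToIndex(db, firstPostId, toPostId));
            firstPostId = toPostId + 1;
        }
        return indexed;
    }

    public static void main(String[] args) {
        // IDs start at 300 and have a gap wider than fetchCount before 900.
        List<Integer> db = List.of(300, 301, 302, 900);
        System.out.println(reindex(db, 0, 99999, 50));
    }
}
```

Starting at the first existing post id avoids the leading empty window, but a loop that stops on any empty window would presumably still end early at a later gap in the IDs wider than the fetch count. Bounding the loop by lastPostId covers both cases, at the cost of issuing queries for empty windows.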

Rafael
[originally posted on jforum.net by Rafael Steil]
 
Migrated From Jforum.net
Ranch Hand
Posts: 17424
Hi,
I have a similar issue.
I tried reindexing by date, but the system indexes only some messages. I have 16 messages, and at the end of the reindex process the number of indexed documents is always 10.

Because I'm using Oracle, I also fixed a query by adding this to Oracle.sql under
SearchModel:



Where can I download the fix from CVS?

TIA
[originally posted on jforum.net by radar]
 
Migrated From Jforum.net
Ranch Hand
Posts: 17424
Try reindexing by message id if the date option doesn't work for you.

Rafael
[originally posted on jforum.net by Rafael Steil]
 
Migrated From Jforum.net
Ranch Hand
Posts: 17424
Hi,
HAPPY NEW YEAR!!
I tried with the ID instead of the date, but it works only for a few messages.
It indexes some messages and skips others.

MDT
[originally posted on jforum.net by radar]
 