Build Identifier:

It seems that the hash in the database is not updated when the modified and visited flags are set on changed records. Delta indexing therefore always compares the current hash value with the one from the initial crawl. This implies that a changed record is marked as changed by every subsequent crawl job and is always added to the index again.

Reproducible: Always

Steps to Reproduce:
1. Crawl a folder containing a file (with delta indexing set to full and the JPA implementation).
2. The record is added.
3. Crawl the folder again.
4. obsoleteIdIterator could not find any obsolete ids.
5. Modify the file.
6. Crawl the folder again.
7. The record is added.
8. Crawl the folder again.
9. The record is added again. Expected: obsoleteIdIterator could not find any obsolete ids (no record is added).
10. Repeating steps 8 and 9 always adds the same record, even if it has not changed between two crawl jobs.
Created attachment 191089 [details] log file
Created attachment 191091 [details] possible solution patch This patch should fix the problem by updating the hash when changed records are marked as modified and visited.
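To make the reported behaviour easier to follow, here is a minimal in-memory sketch of the problem and of the proposed fix. It is not the actual SMILA code or the attached patch: the class `DeltaIndexSketch`, its methods, and the `updateHashOnVisit` switch are hypothetical names chosen for illustration, and a plain map stands in for the JPA-backed store.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a delta indexing store; not SMILA's real API. */
class DeltaIndexSketch {
    // Maps a record id to the hash remembered from an earlier crawl.
    private final Map<String, String> storedHashes = new HashMap<>();
    // false = behaviour described in this report, true = proposed fix.
    private final boolean updateHashOnVisit;

    DeltaIndexSketch(boolean updateHashOnVisit) {
        this.updateHashOnVisit = updateHashOnVisit;
    }

    /** True if the record is unknown or its hash differs from the stored one. */
    boolean checkForUpdate(String id, String currentHash) {
        return !currentHash.equals(storedHashes.get(id));
    }

    /** Marks a new or changed record as modified and visited. */
    void visit(String id, String currentHash) {
        if (updateHashOnVisit) {
            // Fix: always remember the hash that was just indexed.
            storedHashes.put(id, currentHash);
        } else {
            // Bug: only the hash of the initial crawl is ever stored.
            storedHashes.putIfAbsent(id, currentHash);
        }
    }

    /** One crawl pass over a single record; returns true if it is (re)indexed. */
    boolean crawl(String id, String currentHash) {
        boolean changed = checkForUpdate(id, currentHash);
        if (changed) {
            visit(id, currentHash);
        }
        return changed;
    }
}
```

With `updateHashOnVisit` set to false, crawling the same record with hashes h1, h1, h2, h2 reindexes it on the first, third and fourth pass, which corresponds to steps 2, 7 and 9 above. With the switch set to true, the fourth pass no longer reindexes the record.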
Thank you Peter for opening this issue and providing a patch. @Daniel: Could you please take a look at this? Cheers Igor
Hi Peter, thanks for your contribution. I could easily reproduce the problem you described and also fix it with your proposal. I also extended the JUnit tests to cover the described scenario. Without the fix they fail; with the fix they are successful. Everything is checked into trunk with revision 1016. Thanks, Daniel
Closing this