Bug 420991

Summary: search can't find stuff
Product: [ECD] Orion
Reporter: Rafael Chaves <eclipse>
Component: Server
Assignee: John Arthorne <john.arthorne>
Status: RESOLVED FIXED
QA Contact:
Severity: critical
Priority: P3
CC: ken_walker, libingw, mamacdon, Michael_Rennie
Version: 4.0
Target Milestone: 5.0 M2
Hardware: PC
OS: Linux
Whiteboard:

Description Rafael Chaves CLA 2013-11-04 10:09:40 EST
I was having an issue with search not returning some hits, or not returning hits at all. I tried on orionhub.org and, although the issue fails differently there, I can never get any hits.

1 - create an account
2 - create a root folder
3 - create a .txt file in it and type in some text you can search for
4 - search for a string you know is in the file created in #3
5 - result: no hits
Comment 1 Mark Macdonald CLA 2013-11-04 19:53:23 EST
I can reproduce this on OrionHub. Searching with my existing user account, I see search results in files that I created several months ago, but 0 hits from newly-created files.
Comment 2 John Arthorne CLA 2013-11-14 13:35:46 EST
I think I have found the root cause here, but it is hard to be certain because I can't reproduce the problem on any smaller server. We introduced a bug during search optimization work last year, where we stopped indexing last modified time for each file. The indexer does a query based on last modified time, and if it cannot find a match this is what triggers re-indexing of a given file. 

Because that field was removed from the index, this query never returned any result, and as a result the indexer re-indexes every file on every pass. On a smaller server this doesn't seem to hurt much, but on orionhub, where the workspace is quite large, the indexer can no longer keep up.
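
The failure mode above can be illustrated with a minimal sketch. This is not the Orion server code: `ToyIndex`, `index_pass`, and the `store_last_modified` flag are hypothetical names, and the real indexer is assumed to work against a search index rather than a dictionary. The sketch only shows why a lookup by last modified time always misses once that field is no longer stored.

```python
from typing import Dict, List, Tuple


class ToyIndex:
    """In-memory stand-in for the search index (hypothetical, not Orion's API)."""

    def __init__(self, store_last_modified: bool = True) -> None:
        # False mimics the regression: last modified time is not indexed.
        self.store_last_modified = store_last_modified
        self.docs: Dict[str, Dict[str, object]] = {}

    def has_current_entry(self, path: str, last_modified: int) -> bool:
        """Look up a file by path and last modified time; a miss means re-index."""
        doc = self.docs.get(path)
        return doc is not None and doc.get("last_modified") == last_modified

    def add(self, path: str, text: str, last_modified: int) -> None:
        doc: Dict[str, object] = {"text": text}
        if self.store_last_modified:
            doc["last_modified"] = last_modified
        self.docs[path] = doc


def index_pass(index: ToyIndex, files: Dict[str, Tuple[str, int]]) -> List[str]:
    """Run one indexer pass and return the paths that were (re-)indexed."""
    reindexed = []
    for path, (text, last_modified) in files.items():
        if not index.has_current_entry(path, last_modified):
            index.add(path, text, last_modified)
            reindexed.append(path)
    return reindexed
```

With `store_last_modified=True`, a second pass over unchanged files does no work. With `store_last_modified=False`, every pass re-indexes every file, and that cost grows with the size of the workspace.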

This is compounded by a throttling heuristic that puts the indexer job to sleep if it is consuming too much CPU time. This makes the indexer "back off" if it is doing too much work, making it hard for the indexer to ever catch up.

I have released a fix that both adds last modified time back to the index and tweaks the throttling heuristic so it never backs off for too long. This could likely use further tuning for very large deployments. The fix for both is found here:

http://git.eclipse.org/c/orion/org.eclipse.orion.server.git/commit/?id=9bbc49bb2e65b3d85812b6e953066caf102b58b3
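
For readers unfamiliar with this kind of throttling, here is a rough sketch of a back-off whose sleep time is capped. It is an illustration only and does not reproduce the heuristic in the commit above: the function name `next_delay` and the parameters `idle_fraction`, `min_delay`, and `max_delay`, along with their default values, are all assumptions.

```python
def next_delay(work_seconds: float,
               idle_fraction: float = 0.75,
               min_delay: float = 1.0,
               max_delay: float = 60.0) -> float:
    """Return how long a background job sleeps after a pass of work_seconds.

    The uncapped delay keeps the job idle for roughly idle_fraction of the
    time. Clamping to max_delay means a very long pass can no longer push
    the next pass arbitrarily far into the future.
    All parameter names and defaults here are hypothetical.
    """
    if not 0.0 <= idle_fraction < 1.0:
        raise ValueError("idle_fraction must be in [0, 1)")
    uncapped = work_seconds * idle_fraction / (1.0 - idle_fraction)
    return max(min_delay, min(uncapped, max_delay))
```

Without the `max_delay` clamp, a 600 second pass at these settings would be followed by a 1800 second sleep. With it, the sleep is limited to 60 seconds, so a busy indexer still gets regular chances to catch up.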

I am leaving this open for now because I have not confirmed that this fixes the problem on orionhub.
Comment 3 John Arthorne CLA 2014-07-07 15:07:08 EDT
This was resolved but never closed.