| Summary: | [search] index holds stale references to external locations that have been deleted | | |
|---|---|---|---|
| Product: | [ECD] Orion | Reporter: | John Arthorne <john.arthorne> |
| Component: | Client | Assignee: | Jay Arthanareeswaran <jarthana> |
| Status: | RESOLVED FIXED | QA Contact: | |
| Severity: | normal | | |
| Priority: | P3 | CC: | jarthana, malgorzata.tomczyk, susan |
| Version: | 0.2 | Keywords: | helpwanted |
| Target Milestone: | 0.2 | Flags: | john.arthorne: review+ |
| Hardware: | PC | | |
| OS: | Windows 7 | | |
| Whiteboard: | | | |
| Attachments: | Proposed Patch (attachment 192837) | | |
Description
John Arthorne
*** Bug 339559 has been marked as a duplicate of this bug. ***

The duplicate bug also mentions duplicate entries in the dialog:

> The file displays twice, the file links differ: one has the root project
> directory uppercase and the other lowercase
I will take a look. We either have to keep the search framework in the loop during delete operations or, during indexing, look for modified folders and re-index them if that's allowed.

(In reply to comment #3)
> I will take a look. We either have to keep the search framework in the loop during
> delete operations or, during indexing, look for modified folders and re-index
> them if that's allowed.

It's impossible for the search framework to know about deleted resources all the time. So that leaves us with only the second option, and I can think of two approaches.

1. Before returning the search results, check whether each resource exists and, if it doesn't, remove it from the results and from the search index. The problem here is that this will make search operations slower.

2. In the background job, look for all the modified folders and clear the index for them. The problem with this is that we may never be able to find the exact delta, and this will result in redundant work.

Anyone have any thoughts on the above? The best way forward would be to have a resource change notification. Or do we already have one?

There are no resource change events. The deletion could easily happen in another process, so an event within a single server process won't necessarily help. Currently the indexer crawls the file system and compares it to the index. I think we would also need to do the inverse: crawl the index and see if each entry exists in the file system. If that is very expensive, it could be done less frequently than the file->index synchronization. However, I think if we could do this comparison it would be much more efficient than flushing the index and recomputing on any folder change.
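A rough sketch of the "inverse crawl" described above might look like the following. `SearchIndex`, `IndexEntry` handling, and the method names are hypothetical placeholders, not the real Orion search API; the point is only to show the shape of the purge pass: walk the index instead of the file system, test each referenced file for existence, and drop the stale entries.

```java
import java.io.File;
import java.util.List;

/** Hypothetical view of the search index; stands in for whatever Orion actually exposes. */
interface SearchIndex {
	/** Returns the file system path recorded for every indexed document. */
	List<String> getAllIndexedPaths();

	/** Removes the index entry for the given path. */
	void remove(String path);
}

public class IndexPurger {
	private final SearchIndex index;

	public IndexPurger(SearchIndex index) {
		this.index = index;
	}

	/**
	 * Crawls the index (rather than the file system) and deletes every entry
	 * whose backing file no longer exists. Returns the number of purged entries.
	 */
	public int purgeStaleEntries() {
		int purged = 0;
		for (String path : index.getAllIndexedPaths()) {
			if (!new File(path).exists()) {
				index.remove(path);
				purged++;
			}
		}
		return purged;
	}
}
```

Because this pass only touches index metadata and file existence checks, it can run on a much longer cycle than the regular file-to-index synchronization if it turns out to be expensive.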
Created attachment 192837 [details]
Proposed Patch
Patch contains a new job as John suggested. Hope the patch is alright, as this is my first time using Git.
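For reference, a minimal self-rescheduling background job on the Eclipse Jobs API could be structured as below. This is only a sketch of the general approach, not the contents of the attached patch; the 30-second period and the `IndexPurger` helper from the earlier sketch are assumptions.

```java
import org.eclipse.core.runtime.IProgressMonitor;
import org.eclipse.core.runtime.IStatus;
import org.eclipse.core.runtime.Status;
import org.eclipse.core.runtime.jobs.Job;

/**
 * Background job that periodically purges stale entries from the search index.
 * Kept separate from the indexing job so its frequency can be tuned independently.
 */
public class IndexPurgeJob extends Job {
	// Default purge period (assumed value for illustration): 30 seconds.
	private static final long DEFAULT_PURGE_DELAY = 30000;

	private final IndexPurger purger;

	public IndexPurgeJob(IndexPurger purger) {
		super("Purging stale search index entries");
		this.purger = purger;
		setSystem(true); // hide from the user-visible progress view
	}

	@Override
	protected IStatus run(IProgressMonitor monitor) {
		purger.purgeStaleEntries();
		// Reschedule so the purge repeats on its own cadence.
		schedule(DEFAULT_PURGE_DELAY);
		return Status.OK_STATUS;
	}
}
```

The job would be scheduled once at server startup (for example `new IndexPurgeJob(purger).schedule(DEFAULT_PURGE_DELAY);`) and keeps itself alive by rescheduling at the end of each run, independently of the indexing job's schedule.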
Thanks Jay, the patch looks great. I tried it out on a server with about 20MB of files, and the purge job took 200ms vs 1.7s for indexing, so it is actually quite efficient. However I still like the idea of them being separate jobs so we can control their frequency independently if desired. I have set the default purge period to be 30s rather than the 3m that you wrote. Otherwise the fix works well and the patch is good! |
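If the two periods ever need to be tuned per deployment, one lightweight option (again just a sketch; `orion.search.purgeDelay` is a made-up property name, not an existing Orion setting) is to read the delay from a system property with the hard-coded default as a fallback:

```java
// Hypothetical property name; the default matches the 30s purge period mentioned above.
long purgeDelay = Long.getLong("orion.search.purgeDelay", 30000L);
new IndexPurgeJob(purger).schedule(purgeDelay);
```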