| Summary: | [index] PatternSearchJob ignores participant index entries | ||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Product: | [Eclipse Project] JDT | Reporter: | Ian Tewksbury <itewksbu> | ||||||||||
| Component: | Core | Assignee: | Satyam Kandula <satyam.kandula> | ||||||||||
| Status: | VERIFIED FIXED | QA Contact: | |||||||||||
| Severity: | normal | ||||||||||||
| Priority: | P3 | CC: | amj87.iitr, daniel_megert, frederic_fusier, jarthana, min123, nsand.dev, Olivier_Thomann, srikanth_sankaran, thatnitind | ||||||||||
| Version: | 3.6 | Flags: | frederic_fusier:
review+
|
||||||||||
| Target Milestone: | 3.6.1 | ||||||||||||
| Hardware: | All | ||||||||||||
| OS: | All | ||||||||||||
| Whiteboard: | |||||||||||||
| Attachments: |
|
||||||||||||
|
Description
Ian Tewksbury
Has there been any further investigation into this yet? Thanks for the update. (In reply to comment #1) > Has there been any further investigation into this yet? Thanks for the update. Unfortunately no :-( This is to late in the 3.6 game to have a look at such bugs. If, early 3.7, you will not see any change on this request, then do not hesitate to ping us again... Are the indexes getting created through SearchParticipant#scheduleDocumentIndexing()? (In reply to comment #3) > Are the indexes getting created through > SearchParticipant#scheduleDocumentIndexing()? We are not currently overriding SearchParticipant#scheduleDocumentIndexing() in our SearchParticipant but we do call #scheduleDocumentIndexing passing in our implimentation of SearchDocument and the location of our index file. (In reply to comment #4) > We are not currently overriding SearchParticipant#scheduleDocumentIndexing() in > our SearchParticipant but we do call #scheduleDocumentIndexing passing in our > implimentation of SearchDocument and the location of our index file. You don't have to override scheduleDocumentIndexing() but should be calling to really get the indexes register with the JavaModelManager. In the JUnit test case we have, I could make sure that calling scheduleDocumentIndexing() was being good enough. Can you make sure that the call is happening properly in your case? If not, could you give me reproducible steps? (In reply to comment #5) > You don't have to override scheduleDocumentIndexing() but should be calling to > really get the indexes register with the JavaModelManager. Right now we are calling JSPSearchParticipant#scheduleDocumentIndexing(JavaSearchDocumentDelegate, pathToIndexFile) where JSPSearchPartiicpant is our implimentation of SearchParticipant and JavaSearchDocumentDelegate is our implimentation of SearchDocument for JSP files so the path returned is actully one to our translated java file. > In the JUnit test > case we have, I could make sure that calling scheduleDocumentIndexing() was > being good enough. Can you make sure that the call is happening properly in > your case? Not sure what you mean by "properly". > If not, could you give me reproducible steps? Not sure what to give you here either. The side effect for us is that if we open a new workspace and add a bunch of JSP files and index them all, which includes calling #scheduleDocumentIndexing then our results will appear in Java search results. If you then close the workspace and open it back up Java searches will not find our search results unless we completely re-index everything and call #scheduleDocumentIndexing again. (In reply to comment #6) > If you then close the workspace and open it back up Java > searches will not find our search results unless we completely re-index > everything and call #scheduleDocumentIndexing again I understand the problem now. The IndexManager is not completely persisting the information of the index files that can be reused. A workaround could be to call #scheduleDocumentIndexing for atleast one document in a project. This should ensure that the data structures in IndexManager to be updated properly and doesn't need the complete indexes to be created. However, this is not a good solution. There are mainly two reasons for this limitation -- 1. IndexManager currently persists only the index file names. It doesn't store the complete path and also doesn't store the container information. 2. If the IndexManager finds out the file is not there or finds out that it has to reindex for some reason, it doesn't have a way to ask the participant to reindex. The first limitation probably can be updated by just modifying the code or even by using the workaround, but to have a better story to include the second limitation, the API should be modified. I will look at the best way that could be done. (In reply to comment #7) > I will look at the best way that could be done. Thank you for continuing to look into this, it is very much appreciated. Satyam, Any news to report? Thanks again for the investigation thus far, it is great to see some movement on this. (In reply to comment #9) > Satyam, Any news to report? Thanks again for the investigation thus far, it is > great to see some movement on this. As I mentioned, the best way I think should be is to introduce a new API SearchPaticipant#initializeIndex(IPath indexLocation), which will return true if the file is good and JDT could read that file properly. If it returns false, it means that the JDT couldn't make use of the file and hence scheduleDocumentIndexing() had to be called once again. Is that good for you? Do you need a fix for previous versions of Eclipse? Frederic, What do you think about this API proposal? (In reply to comment #10) > As I mentioned, the best way I think should be is to introduce a new API > SearchPaticipant#initializeIndex(IPath indexLocation), which will return true > if the file is good and JDT could read that file properly. If it returns false, > it means that the JDT couldn't make use of the file and hence > scheduleDocumentIndexing() had to be called once again. > > Is that good for you? Do you need a fix for previous versions of Eclipse? > > Frederic, What do you think about this API proposal? I didn't mention it completely in my last comment. This API should be called for all the index locations at the startup. This will register the index files with the IndexManager. (In reply to comment #10) > (In reply to comment #9) > > Satyam, Any news to report? Thanks again for the investigation thus far, it is > > great to see some movement on this. > > As I mentioned, the best way I think should be is to introduce a new API > SearchPaticipant#initializeIndex(IPath indexLocation), which will return true > if the file is good and JDT could read that file properly. If it returns false, > it means that the JDT couldn't make use of the file and hence > scheduleDocumentIndexing() had to be called once again. > > Is that good for you? It sounds good. I won't know for sure until I actually get my hands on it and can test it in our situation. > Do you need a fix for previous versions of Eclipse? We would like this in 3.6.1 if at all possible so that we have it for use in WTP 3.2.2 (In reply to comment #11) > (In reply to comment #10) > > As I mentioned, the best way I think should be is to introduce a new API > > SearchPaticipant#initializeIndex(IPath indexLocation), which will return true > > if the file is good and JDT could read that file properly. If it returns false, > > it means that the JDT couldn't make use of the file and hence > > scheduleDocumentIndexing() had to be called once again. > > > > Is that good for you? Do you need a fix for previous versions of Eclipse? > > > > Frederic, What do you think about this API proposal? > > I didn't mention it completely in my last comment. This API should be called > for all the index locations at the startup. This will register the index files > with the IndexManager. I'm a little bit afraid by adding an API which should be called at each startup otherwise it would not work properly. IMO, but maybe Im' wrong, I do not think an additional API is really necessary. I'm thinking about storing the location into the table when testing whether the index does exist or not (typically in JMM.ensureIndexExists(IPath, IPath) method when state == null)... Wouldn't it do the trick? As soon as the location would be in the table, then we could benefit of the JMM mechanism at startup to retrieve index files and the client specific ones would be taken into account without any other additional changes... Do I miss something or could that work? Moreover, adding an API would be a NO GO for a change in 3.6.1 version, hence it would be better to find a non-API change to fix this issue... (In reply to comment #13) > I'm a little bit afraid by adding an API which should be called at each startup > otherwise it would not work properly. > > IMO, but maybe Im' wrong, I do not think an additional API is really necessary. > I'm thinking about storing the location into the table when testing whether the > index does exist or not (typically in JMM.ensureIndexExists(IPath, IPath) > method when state == null)... Wouldn't it do the trick? As soon as the location > would be in the table, then we could benefit of the JMM mechanism at startup to > retrieve index files and the client specific ones would be taken into account > without any other additional changes... > > Do I miss something or could that work? I tried to explain that in comment 7, but I think it is not that clear. You are right, this problem can be fixed by just adding it appropriately. However, if the index version changes or the index file doesn't exist, there is no way to tell the caller that a re-index needs to be done. Hence, I thought an API will help to do that. We could do the fix as part of 3.6.1 and introduce the API in 3.7 - what do you say? (In reply to comment #15) > We could do the fix as part of 3.6.1 and introduce the API in 3.7 - what do you > say? Understanding that no new API can go into 3.6.1 this sounds like the best solution and would at least put us in a better place for our WTP 3.2.2 release. Created attachment 176958 [details]
Proposed patch on HEAD
I will attach a patch for 3.6.1. shortly. Created attachment 176961 [details]
Proposed patch for 3.6.1
Frederic, Here is the patch for 3.6.1. Please review.
Sorry, but I have to reject this patch as there's an issue in SearchParticipant:
the identity test at line 210 sounds invalid to me. IMO, it should be replaced by an equality test between the two paths.
Another points in IndexManager:
1) getIndexes(...)
a) I think that the participant index should also be rebuilt when the java
like names extension has changed.
b) why test if the index file exists and if so, then create the index
instead of using the same method than when the containerPath is not null:
index = getIndex(containerPath, indexLocation, true /*reuse index file*/,
false /*do not create if none*/);
Hence, IMO, adding a) and b) would lead to a change in this method which should only be something like:
// only need containerPath if the index must be built
IPath containerPath = (IPath) this.indexLocations.keyForValue(indexLocation);
if (containerPath == null &&
!getJavaPluginWorkingLocation().isPrefixOf(indexLocation)) {
// the index might belong to non-jdt search participant...
containerPath = getParticipantsContainer(indexLocation);
}
if (containerPath != null) { // sanity check
... unchanged code here ...
}
2) readParticipantsIndexNamesFile()
a) one thing worry me in the fact that the participantsContainers can be
null after having called this method... If there's no participant index
files, the participant index names file does not exist, hence the
participantsContainers will always stays null. Then, each time the
getParticipantsContainer(IPath) method will be called, the
readParticipantsIndexNamesFile will also be called. So, each time a new
File will be created, Util.getFileCharContent will be called with
this file, a FileNotFoundException raised and finally caught to exit the
method without having done anything... :-(
IMO, it would be really interesting in this method to initialize the
participantContainers to an empty table and have a boolean saying that
either the participant index names file does not exist or that it has
been read...
b) It seems that savedSignature is an unnecessary local variable
c) The file itself could be store as a final field instead of having a
static string PARTICIPANTS_INDEX_NAMES (like savedIndexNamesFile)
When a) is taken into account, I agree this will be less interesting
but I would prefer this implementation (it's just a personal feeling
nothing mandatory in the new patch)
3) removeIndexPath(IPath)
a) The added if block should be inside the block of the
if (locations != null) statement.
b) The added test should be rewritten after changes done in 2)
c) I think the participant index names files should be updated when a key
is removed from the participantsContainers table, shouldn't it?
4) updateParticipant(IPath, IPath)
a) Shouldn't the participantsContainers table be refreshed when the
index location already exists but the container path is different?
I do not know if this is a concrete case, but I guess it might happen
5) writeSavedIndexNamesFile()
a) The method may delete the file when there's no longer any participant
container, but we can leave with an empty file, it doesn't hurt too
much...
Hope my remarks are clear enough...
(In reply to comment #20) Frederic thanks for the comments! I am just adding the comments for those I don't agree completely. > Another points in IndexManager: > 1) getIndexes(...) > a) I think that the participant index should also be rebuilt when the java > like names extension has changed. > b) why test if the index file exists and if so, then create the index > instead of using the same method than when the containerPath is not null: > index = getIndex(containerPath, indexLocation, true /*reuse index file*/, > false /*do not create if none*/); As I understand, the search participant may not be really worried about the java like name extensions. I think the files can be anything and hence I haven't really taken care of it. As we don't know how to really do a re-indexing as of now, I am ignoring the re-indexing part as of now. > 4) updateParticipant(IPath, IPath) > a) Shouldn't the participantsContainers table be refreshed when the > index location already exists but the container path is different? > I do not know if this is a concrete case, but I guess it might happen I actually implemented this way first and then removed :). Indexing story is a little weak here. The participant can use the same index file for two containers and we don't really handle that well in the other places. Hence I didn't handle that because it is arguable and here I get a better performance :) > 5) writeSavedIndexNamesFile() > a) The method may delete the file when there's no longer any participant > container, but we can leave with an empty file, it doesn't hurt too > much... I will leave it as it won't hurt:) This is all looking really great, thanks a bunch. I have a general question related to the index. Currently in our implementation of SearchParticipant we have a different index file for each folder containing JSPs in the workspace. Looking at the JDT implementation it seems you use one index per project and JAR file. Would it make more sense to only have one index file per project rather then per folder? Created attachment 177018 [details]
Proposed patch for 3.6.1
Frederic, Here is the new patch updated with your comments.
(In reply to comment #21) > (In reply to comment #20) > > 1) getIndexes(...) > As I understand, the search participant may not be really worried about the > java like name extensions. I think the files can be anything and hence I > haven't really taken care of it. As we don't know how to really do a > re-indexing as of now, I am ignoring the re-indexing part as of now. > Sounds right, ok to keep your implementation > > > 4) updateParticipant(IPath, IPath) > I actually implemented this way first and then removed :). Indexing story is > a little weak here. The participant can use the same index file for two > containers and we don't really handle that well in the other places. Hence I > didn't handle that because it is arguable and here I get a better performance > :) > ok > > 5) writeSavedIndexNamesFile() > I will leave it as it won't hurt:) > Sure (In reply to comment #23) > Created an attachment (id=177018) [details] > Proposed patch for 3.6.1 > > Frederic, Here is the new patch updated with your comments. I didn't see any changes for the remarks 2.a), 3.b) and 3.c) done in comment 20. As your comment 21 didn't say anything about these remarks, I was expecting to see the corresponding implementation in the patch. Do I miss something? (In reply to comment #25) > I didn't see any changes for the remarks 2.a), 3.b) and 3.c) done in comment > 20. > As your comment 21 didn't say anything about these remarks, I was expecting to > see the corresponding implementation in the patch. Do I miss something? 2.a) Now, the participantContainers is not null at the end of readParticipantsIndexNamesFile() and this function is called only when the participantContainers is null. Hence, it should not be called once again. However, I am not using any other flag as I thought the null check and a get() should be good enough. I misunderstood 3.b. However, as I am not using any flag, I guess this is ok. 3.c. The participantUpdated flag should do the trick. Once this flag is triggered, saveIndexes() called during the idle time should write the participantfile. Do you think we should write it right away? I think it shouldn't be a problem. (In reply to comment #26) > (In reply to comment #25) > > I didn't see any changes for the remarks 2.a), 3.b) and 3.c) done in comment > > 20. > > As your comment 21 didn't say anything about these remarks, I was expecting to > > see the corresponding implementation in the patch. Do I miss something? > 2.a) Now, the participantContainers is not null at the end of > readParticipantsIndexNamesFile() and this function is called only when the > participantContainers is null. Hence, it should not be called once again. > However, I am not using any other flag as I thought the null check and a get() > should be good enough. Ooops, sorry, I missed your change in this method... :-S You're right, with it, it's OK now :-) > I misunderstood 3.b. However, as I am not using any flag, I guess this is ok. Yes, again, with your change in readParticipantsIndexNamesFile() method 3.b is also OK now :-)) > 3.c. The participantUpdated flag should do the trick. Once this flag is > triggered, saveIndexes() called during the idle time should write the > participantfile. Do you think we should write it right away? I think it > shouldn't be a problem. The saveIndexes is called from notifyIdle(long) only when the needToSave flag has been set to true. As the removeIndexPath(IPath) does change this flag, there's a way where the participantsContainers table could be updated and the change not persisted in the corresponding file. Hence, IMO, we should write this file right away... (In reply to comment #27) > > > 3.c. The participantUpdated flag should do the trick. Once this flag is > > triggered, saveIndexes() called during the idle time should write the > > participantfile. Do you think we should write it right away? I think it > > shouldn't be a problem. > > The saveIndexes is called from notifyIdle(long) only when the needToSave flag > has been set to true. As the removeIndexPath(IPath) does change this flag, > there's a way where the participantsContainers table could be updated and the > change not persisted in the corresponding file. Hence, IMO, we should write > this file right away... Obviously, you should read: "... As the removeIndexPath(IPath) does NOT change this flag,..." Sorry for the confusion Created attachment 177073 [details]
Proposed patch for 3.6.1
In this patch, I am calling the writeParticipantsIndexNamesFile() while removing the index.
(In reply to comment #29) > Created an attachment (id=177073) [details] > Proposed patch for 3.6.1 > > In this patch, I am calling the writeParticipantsIndexNamesFile() while > removing the index. I understood why I missed your change in readParticipantsIndexNamesFile() in previous patch... It's because the getParticipantsContainer(IPath) still had the null test on participantsContainers. Seeing that, I wrongly supposed that the method didn't change either... This patch fixes all this issue now and looks really good to me. Great work Satyam :-) (In reply to comment #30) > (In reply to comment #29) > > Created an attachment (id=177073) [details] [details] > > Proposed patch for 3.6.1 > > > > In this patch, I am calling the writeParticipantsIndexNamesFile() while > > removing the index. > > I understood why I missed your change in readParticipantsIndexNamesFile() in > previous patch... It's because the getParticipantsContainer(IPath) still had > the null test on participantsContainers. Seeing that, I wrongly supposed that > the method didn't change either... > > This patch fixes all this issue now and looks really good to me. > > Great work Satyam :-) I have applied this fix to my workspace and it does not seem to fix my issue :( Do we need to do anything new to take advantage of this fix? (In reply to comment #31) > (In reply to comment #30) > > (In reply to comment #29) > > > Created an attachment (id=177073) [details] [details] [details] > > > Proposed patch for 3.6.1 > > > > > > In this patch, I am calling the writeParticipantsIndexNamesFile() while > > > removing the index. > > > > I understood why I missed your change in readParticipantsIndexNamesFile() in > > previous patch... It's because the getParticipantsContainer(IPath) still had > > the null test on participantsContainers. Seeing that, I wrongly supposed that > > the method didn't change either... > > > > This patch fixes all this issue now and looks really good to me. > > > > Great work Satyam :-) > > I have applied this fix to my workspace and it does not seem to fix my issue :( > > Do we need to do anything new to take advantage of this fix? From the quick debugging I just did this is what I found: 1. in new workspace I created a dynamic web 2. created a Java class file with one method 3. created a JSP with a java region creating an instance of the java class and calling the method 4. do a references to search from the java class on the new method 5. result found in JSP 6. close workspace 7. open workspace 8. do search on same method, result in JSP not found 9. close project and open again 10. do search, result found while debugging I have noticed that in Index#query on any of the test cases I get no results from the disk index but on step 4/5 and step 10 results come from the memory index, on step 8 no results come from either the disk index or the memory index. Thus it seems to me that after shutting down the workspace our memory index must not be preserving itself to the disk index correctly and then when the workspace opens back up (step 7) there is no memory index yet because nothing has changed (like is caused in step 9 by opening and closing the project) and the disk index seems to be invalid, so I get no results. Suggestions? (In reply to comment #32) > > From the quick debugging I just did this is what I found: > > 1. in new workspace I created a dynamic web > 2. created a Java class file with one method > 3. created a JSP with a java region creating an instance of the java class and > calling the method > 4. do a references to search from the java class on the new method > 5. result found in JSP > 6. close workspace > 7. open workspace > 8. do search on same method, result in JSP not found > 9. close project and open again > 10. do search, result found > > while debugging I have noticed that in Index#query on any of the test cases I > get no results from the disk index but on step 4/5 and step 10 results come > from the memory index, on step 8 no results come from either the disk index or > the memory index. > > Thus it seems to me that after shutting down the workspace our memory index > must not be preserving itself to the disk index correctly and then when the > workspace opens back up (step 7) there is no memory index yet because nothing > has changed (like is caused in step 9 by opening and closing the project) and > the disk index seems to be invalid, so I get no results. > > Suggestions? Some more debugging. I have found that our index is correctly being saved to disk on the workspace exit. But when the workspace comes back up and I do my first search (step 8) our index is being saved over by a blank index file. This means when the search actually happens nothing is in our disk index. (In reply to comment #33) > > Some more debugging. I have found that our index is correctly being saved to > disk on the workspace exit. But when the workspace comes back up and I do my > first search (step 8) our index is being saved over by a blank index file. > This means when the search actually happens nothing is in our disk index. Sorry for the furry of comments, probably could have waited until I was done looking... So I found the problem. IndexbinaryFolder#exicute, when it gets to line 130 "this.manager.request(new SaveIndex(this.containerPath, this.manager));" the value of "this.manager" is: Enable count:1 Jobs in queue:3 0 - job[0]: indexing binary folder /Foo/WebContent 1 - job[1]: removing src/__2F_Foo_2F_WebContent_2F_bla_2E_jsp.java from index /Foo/WebContent 2 - job[2]: removing src/__2F_Foo_2F_WebContent_2F_Blarg_2E_jsp.java from index /Foo/WebContent .... Thous two Java files with the funky looking names were our temporary in memory Java translations of our JSP files. Probably is we don't ever save thous translations to disk so they are definitely not there when the workspace loads back up. We were counting on their indexed state being saved in the index and thus not having to actually keep the translated file around. But the JDT index is obviously detecting these Java translations are no where to be found and thus decides they need to be removed from the index. Any way to prevent this? (In reply to comment #34) Ian, Thanks for trying out the patch and debugging the problem. The problem seems to be different. It is likely possible that the indexer is confused with the build path. The same component which is a source folder seems to be added as a class folder. Can you please verify the build path. (In reply to comment #35) > (In reply to comment #34) > Ian, Thanks for trying out the patch and debugging the problem. The problem > seems to be different. Yes the problem is definitely a new one. Your patch has definitely worked as expected. Where as before our index file was not even found, now it is. >It is likely possible that the indexer is confused with > the build path. The same component which is a source folder seems to be added > as a class folder. Can you please verify the build path. What do you mean by "check the build path"? Like checking the eclipse preferences, or something while debugging? (In reply to comment #36) > Yes the problem is definitely a new one. Your patch has definitely worked as > expected. Where as before our index file was not even found, now it is. Could I go ahead and release the patch? > What do you mean by "check the build path"? Like checking the eclipse > preferences, or something while debugging? I meant the project build path(.classpath file). (In reply to comment #37) > (In reply to comment #36) > > Yes the problem is definitely a new one. Your patch has definitely worked as > > expected. Where as before our index file was not even found, now it is. > Could I go ahead and release the patch? > Yes, Satyam, go ahead :-) Released the patch on 3.6.1 (In reply to comment #36) > (In reply to comment #35) > > (In reply to comment #34) > > Ian, Thanks for trying out the patch and debugging the problem. The problem > > seems to be different. > > Yes the problem is definitely a new one. Your patch has definitely worked as > expected. Where as before our index file was not even found, now it is. Frederic, Satyam what follow up is needed for the new issue ? (In reply to comment #39) > Released the patch on 3.6.1 Satyam, shouldn't this also be on HEAD ? (In reply to comment #36) > (In reply to comment #35) > > (In reply to comment #34) > > Ian, Thanks for trying out the patch and debugging the problem. The problem > > seems to be different. > > Yes the problem is definitely a new one. Your patch has definitely worked as > expected. Where as before our index file was not even found, now it is. > Ian, please open a new bug for this new issue, thanks (In reply to comment #40) > (In reply to comment #39) > > Released the patch on 3.6.1 > > Satyam, shouldn't this also be on HEAD ? As per comment 15, we might introduce an new API to make this fix better. We're waiting for the come back of Jerome who designed the SearchParticipant API initially to verify whether it's accurate or no... So, we'll continue to work on this for 3.7 soon, but as we do not know whether it will be before or after 3.6.1 delivery. Hence, we preferred to set this bug as resolved for 3.6.1 and reopen it when the work on this for 3.7 will restart. I talked with Jerome about this and we agreed that a new API is not really necessary. The proposed fix for 3.6.1 addresses the client needs without any change in his code and seems definitely acceptable even for 3.7. Furthermore, not adding a new API will make the fix identical for both streams which it simpler and also sounds better to verify that it has no undesirable side effects (as it will be used in both versions)... So, from my side, I would vote to release the patch as this in HEAD... I have opened Bug 323392 to track my issue mentioned in comment 32, comment 33 and comment 34. (In reply to comment #29) > Created an attachment (id=177073) [details] > Proposed patch for 3.6.1 Patch looks good, +1 for 3.6.1. I think we should release it as it is for 3.7 M2 even as we explore any additional changes that are required in the context of Bug 323392. Released on HEAD Verified for 3.6.1 RC2 by code inspection. Verified for 3.7M2 |