| Summary: | Indexer holds exclusive index lock for too long | ||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Product: | [Tools] CDT | Reporter: | Sergey Prigogin <eclipse.sprigogin> | ||||||||||||||||||||||
| Component: | cdt-indexer | Assignee: | Sergey Prigogin <eclipse.sprigogin> | ||||||||||||||||||||||
| Status: | RESOLVED FIXED | QA Contact: | Markus Schorn <mschorn.eclipse> | ||||||||||||||||||||||
| Severity: | normal | ||||||||||||||||||||||||
| Priority: | P3 | CC: | andre.bossert, eichelberg, mschorn.eclipse, tparker, yevshif | ||||||||||||||||||||||
| Version: | 6.0 | ||||||||||||||||||||||||
| Target Milestone: | 7.0 | ||||||||||||||||||||||||
| Hardware: | PC | ||||||||||||||||||||||||
| OS: | Linux | ||||||||||||||||||||||||
| Whiteboard: | |||||||||||||||||||||||||
| Attachments: |
|
||||||||||||||||||||||||
|
Description
Sergey Prigogin
I think ASTProvider.WAIT_NO would not help much because the index needs to be read-locked during access of the AST anyway. It might make sense to add a check whether the indexer is idle or not. Another possibility is to use a SharedASTJob and wait only a minimal amount of time for completion. (In reply to comment #1) > I think ASTProvider.WAIT_NO would not help much because the index needs to be > read-locked during access of the AST anyway. > It might make sense to add a check whether the indexer is idle or not. > Another possibility is to use a SharedASTJob and wait only a minimal amount of > time for completion. When the indexer is not idle this does not imply that it holds a write lock on an index. It first parses a file, resolves all names and only then acquires a write lock to fill the data into the index. In theory this should not take terribly long and we may need to figure out why the indexer holds the write lock for a substantial amount of time. Created attachment 149821 [details]
More granular index locking
I've added logging that triggers every time when a write lock on an index is held for longer than 1 second. During index rebuild with 2030 sources and 3343 headers
write lock was held once for 101 seconds. and 9 times for longer than 10 seconds. With this patch the longest lock duration was reduced to 52 seconds, which is still way too long.
Here is timing of longest calls to WritableCIndex.setFileContent: Names Time (ms) Names/sec 39599 22940 1726 31951 6996 4567 41841 18446 2268 39272 6274 6259 18359 53249 345 21106 12510 1687 13088 6342 2064 29443 26293 1120 67060 8517 7874 31357 6128 5117 Wide range of names/sec numbers is quite surprising. I'd like to gather feedback on a proposed solution to the problem of indexer holding write index lock for too long. When adding a file to index do it in five steps: 1. Add a new PDOMFile with a special "hidden" location. 2. Add names to that file. 3. Change the location of the old file from real to hidden and and location of the new file from "hidden" to real. 4. Remove names belonging to the old file. 5. Remove the old file. During potentially lengthy steps 2 and 4 watch for threads waiting for read locks and release write lock between adding of deleting index names if there is a thread waiting for read lock. Modify all index queries to filter out all names belonging to a file with "hidden" location. Note, that releasing the write lock also clears the result-cache in the PDOM. That is something I'd like to avoid while one AST is written to the index. Too much switching between the writer and the readers will yield an overall performance penalty. If it is necessary we need to do some testing to figure out whether or not it slows down the overall indexing time and if so if the added time can be justified. My strategy so far was to compute all the data first and only then take the write lock and write the stuff to the database. This is done by the method PDOMWriter.resolveNames(). However, resolving all the names does not compute the entire information that is stored in the index (e.g. whether a function is inline or not, the type of a variable, ...). It would be interesting to see whether the time is spent for the actual writing of the data or whether a lot of time is spent for computing a certain piece of information. In addition to any improvements on the maximum time the indexer holds the write lock, comment 1 proposes a safer approach to solve the issue with the hyperlink detector. (In reply to comment #5) > I'd like to gather feedback on a proposed solution to the problem of indexer > holding write index lock for too long. > When adding a file to index do it in five steps: > 1. Add a new PDOMFile with a special "hidden" location. > 2. Add names to that file. > 3. Change the location of the old file from real to hidden and and location of > the new file from "hidden" to real. > 4. Remove names belonging to the old file. > 5. Remove the old file. > During potentially lengthy steps 2 and 4 watch for threads waiting for read > locks and release write lock between adding of deleting index names if there is > a thread waiting for read lock. > Modify all index queries to filter out all names belonging to a file with > "hidden" location. You cannot do concurrent read and write operations on the index: * The caching of chunks depends on the locking mechanism. * Adding a name to the index has the side effect of modifying or adding bindings. * The table to find files would be shared between readers and the writer. Sergey, I have tested your patch and I do not see a performance problem. Before making any further changes(In reply to comment #6) > It would be interesting to see > whether the time is spent for the actual writing of the data or whether a lot > of time is spent for computing a certain piece of information. I agree that this information is key to making a right decision. I'll work on getting it. (In reply to comment #7) > You cannot do concurrent read and write operations on the index: > * The caching of chunks depends on the locking mechanism. This should be manageable if release of write locks between names only happens when the lock was held for a long time, say 2 seconds, and there is somebody waiting for it. > * Adding a name to the index has the side effect of modifying or adding > bindings. This is a tough one, but it should not be any different than moving a declaration from one file to another and changing it at the same time with the destination file saved before the source one. > * The table to find files would be shared between readers and the writer. Access to this table would have to be protected. I assume this table will be modified only during steps 1 and 5, during which the write lock will be held. (In reply to comment #9) > (In reply to comment #7) > > You cannot do concurrent read and write operations on the index: > > ... I misunderstood your proposal in comment 5 of breaking up the writing into 5 write-locked steps. There would not be any concurrent read for a write, so please ignore my comment 7. The challenge with this approach is to leave the index in a consistent state whenever the write lock is released. As a minimum you would need to make sure that the information about which bindings are declared/defined is exchanged in a single step. I think there is more potential in preparing the writing to the index a bit further: * We need to lookup all the bindings from the ast in the index. We could do that without a write-lock. * Specializations in the ast specialize information as late as possible. In the index we store the specialized versions of everything. With that we might end up doing template instantiations while we hold the write lock. CPPFindBinding.findBinding(PDOMNode,PDOMLinkage,char[],int,int,long) and FindBinding.findBinding(PDOMNode,PDOMLinkage,char[],int[],long) together are responsible for 82% of time spent while holding exclusive index lock. Here are profiling results for methods called by WritableCIndex.setFileContent: Time (ms) % Time # Calls Method 50005 100.00% 1 WritableCIndex.setFileContent 50004 100.00% 1 WritablePDOM.addFileContent 50003 100.00% 1 PDOMFile.addNames 49939 99.87% 17658 PDOMFile.createPDOMName 49605 99.20% 14693 PDOMCPPLinkage.addBinding(IASTName) 49570 99.13% 14693 PDOMCPPLinkage.addBinding(IBinding,IASTName) 41123 82.24% 5281 PDOMCPPLinkage.adaptBinding 41104 82.20% 5249 PDOMCPPLinkage.doAdaptBinding 41029 82.05% 5237 CPPFindBinding.findBinding(PDOMNode,PDOMLinkage,IBinding,long) 29680 59.35% 3502 CPPFindBinding.findBinding(PDOMNode,PDOMLinkage,char[],int,int,long) 11258 22.51% 1735 FindBinding.findBinding(PDOMNode,PDOMLinkage,char[],int[],long) 6090 12.18% 5210 PDOMCPPLinkage.createBinding More than 99% of the findBinding calls return null since they try to find bindings that are about to be added to the index, but haven't been added yet.
All such calls are made from the following code fragment in PDOMCPPLinkage.doAdaptBinding:
if (parent instanceof IPDOMMemberOwner) {
PDOMBinding glob= CPPFindBinding.findBinding(parent, this, binding, 0);
Created attachment 150451 [details]
Test case demonstrating nonlinear behavior
It looks like the worst cases of prolonged holding of exclusive lock are caused by adding classes with large number of members. The attached example demonstrates nonlinear behavior with respect to number of class members.
When indexing test2000.h the lock is held for 58 seconds, but when indexing test500x4.h it is held for 13 seconds only.
Created attachment 150633 [details]
increases speed for classes with lots of members
I have tested and applied the above patch for 6.1 > 20091027. (In reply to comment #15) The fix works quite nicely. The longest lock holding time has reduced to 16 sec. I'll get a new more detailed profile to see if further improvement is possible. Could this issue be related to issue I posted in http://www.eclipse.org/forums/index.php?t=msg&th=156739&start=0&S=ed1c4b79e546903617fc1ff8840fe556 ? (In reply to comment #17) > Could this issue be related to issue I posted in > http://www.eclipse.org/forums/index.php?t=msg&th=156739&start=0&S=ed1c4b79e546903617fc1ff8840fe556 > ? Yes, this is the same issue. (In reply to comment #15) > I have tested and applied the above patch for 6.1 > 20091027. Hi, If possible the fix for this issue should be also applied to 6.0.2 (not only 6.1), so eclipse user will get it as quickly as possible. Thanks, Yevgeny (In reply to comment #15) Nothing stood up during further profiling. Long duration of holding write lock is caused by huge size of files being added to the index. All offending files are auto generated code. The worst offender contains 70K lines. The second worst is ten times smaller, but contains a class declaration with 6K lines. The next step is to try to implement the plan outlined in comment #5. I've attempted to implement index update procedure described in comment #5, but I'm not getting the same index contents as the current approach. Is there a utility method I can use to dump the contents of the index in human-readable form? I'd like to compare index contents with two different update styles. (In reply to comment #21) > Is there a> utility method I can use to dump the contents of the index in > human-readable form? Found PDOMPrettyPrinter. I've realized that I don't know how to cleanly remove a binding from the index. Removal of bindings is necessary for file-local bindings. A new binding is associated with a new PDOMFile object, so the old binding belonging to the old PDOMFile has to be deleted. Surprisingly PDOMFile.clear doesn't delete any bindings except macros. If I create a file containing a file-local variable: static int x; then comment it out and save the file, PDOMNode.delete method is not being called. Is it intentional?. Aren't we leaking index space in situations like this? Markus, could you please comment on deletion of bindings from the index. Thanks. (In reply to comment #24) > Markus, could you please comment on deletion of bindings from the index. > Thanks. There is not much to say, bindings are not deleted from the index. We do not track the usage of a binding, so we don't have the means to determine whether a binding is referenced in the index or not. This also holds for file-local bindings which are really local to a compilation unit and can therefore be referenced by entities from other files. Bug 154563 is about deleting bindings in the index. Created attachment 155176 [details] Experimental patch with lock yielding The patch implements the approach outlined in comment 5. Full index rebuild with the patch is noticeable slower, responsiveness during indexing is greatly improved. Hopefully the slowdown can be reduced by tuning. I haven't yet figured out a bug in counting of source files, only headers are currently counted. I'm not sure about other side effects. (In reply to comment #27) The slowdown of index rebuild turned out to be a false alarm. Created attachment 155189 [details]
Experimental patch with lock yielding (take 2)
Fixed the bug in source file counting and cleaned up code a bit.
Sergey, I had a quick look at your patch. From what I saw it looks good. One thing caught my attention: * PDOM.addName(...) could be called very oftern, it should probably be optimized using the same pattern as PDOM.hasLastingDefinition(...) Beyond that: * WritablePDOM.hasLastingDefintion(): We could avoid the creation of the PDOMFile object by introducing a method IIndexFragmentName.getFileRecord(), I am not sure whether this is an important optimization, though. Created attachment 155709 [details]
Experimental patch with lock yielding
Fixed few bugs in the previous version of the patch.
(In reply to comment #30) > * PDOM.addName(...) could be called very oftern, it should probably be > optimized using the same pattern as PDOM.hasLastingDefinition(...) Not sure what optimization you have in mind. > Beyond that: > * WritablePDOM.hasLastingDefintion(): We could avoid the creation of the > PDOMFile object by introducing a method IIndexFragmentName.getFileRecord(), > I am not sure whether this is an important optimization, though. Done. Didn't notice any speedup. (In reply to comment #32) > (In reply to comment #30) > > * PDOM.addName(...) could be called very oftern, it should probably be > > optimized using the same pattern as PDOM.hasLastingDefinition(...) > Not sure what optimization you have in mind. The method addName() creates the PDOMFile object and checks its timestamp. This check is necessary only, when there is an uncommitted file. By overriding the method in WritablePDOM you can first check whether there is a file being updated (without any access to the database): if (fileBeingUpdated != null) { // Check whether name belongs to uncommitted file ... } else { // Add name without checks. } This an the other optimization suggested does not influence the indexer, however it could be important for clients that query the index, i.e. search, open element or content assist. private void addName(IIndexFragmentName name, ArrayList<IIndexFragmentName> names) throws CoreException { + IIndexFile file = name.getFile(); + if (file == null || file.getTimestamp() > 0) { + names.add(name); + } + } Created attachment 156475 [details] Updated patch The patch fixes few bugs that were present in the previous version and implements the optimization suggested in comment 33. In the process of testing the patch I discovered an index update issue that exists even without the patch. I'm not sure what to do with it. To reproduce the issue change CModelListener.UPDATE_LR_CHANGED_FILES_COUNT to 1 and create a project consisting of two files: test.h ------ class A {}; test.cc ------- #include "test.h" A a; 1. Reindex the project to make sure everything is clean. 2. Rename test.h to test.h_. The include in test.cc becomes unresolved. 3. Rename test.h_ back to test.h. The include in test.cc is still unresolved. This seems to be caused by the following code in AbstractIndexerTask.extractFiles: for (IIndexFragmentFile ifile : indexFiles) { if (ifile != null && ifile.hasContent()) { IIndexInclude ctx= ifile.getParsedInContext(); ... } } When test.h is renamed to test.h_, the corresponding PDOMFile is cleared and its timestamp becomes -1. The file maintains its IncludedBy info. When test.h is restored, it doesn't get reindexed because ifile.hasContent() returns false. Removing ifile.hasContent() condition solves the problem, but I don't know what side effects it may cause. (In reply to comment #34) > ... > This seems to be caused by the following code in > AbstractIndexerTask.extractFiles: > for (IIndexFragmentFile ifile : indexFiles) { > if (ifile != null && ifile.hasContent()) { > IIndexInclude ctx= ifile.getParsedInContext(); > ... > } > } > Removing ifile.hasContent() condition solves the problem, but I don't know > what side effects it may cause. You are right, the file must be updated independent of whether it currently has content or not. I don't expect this fix to have a side effect. Created attachment 157044 [details]
Final patch as submitted to HEAD
Fixed in HEAD > 20100123. (In reply to comment #37) > Fixed in HEAD > 20100123. We have similar issue with 6.0.1: eclipse get stuck for a few seconds if using CTRL + C/V. Can it be fixed for 6.0.x too? Thanks, Andre (In reply to comment #38) > We have similar issue with 6.0.1: eclipse get stuck for a few seconds if using > CTRL + C/V. Can it be fixed for 6.0.x too? Unfortunately the change is to large and too risky for 6.0.2. Hi, Does anyone knows when CDT 7.0 is planned to be released? Currently CDT 6.0.x it is almost impossible to work while the indexing is being done. Thanks, Yevgeny (In reply to comment #40) > Does anyone knows when CDT 7.0 is planned to be released? Currently CDT 6.0.x > it is almost impossible to work while the indexing is being done. Helios release schedule is in http://www.eclipse.org/projects/project-plan.php?projectid=eclipse#release_milestones. An alternative to waiting until June is to use the next CDT build from http://download.eclipse.org/tools/cdt/builds/6.1.0/index.html. This build will require platform 3.5M4 or higher. (In reply to comment #41) > An alternative to waiting until June is to use the next CDT build from > http://download.eclipse.org/tools/cdt/builds/6.1.0/index.html. This build will > require platform 3.5M4 or higher. This link is outdated. Use http://download.eclipse.org/tools/cdt/builds/7.0.0/index.html Created attachment 157435 [details]
Clear result cache only once per Ist-translation unit
As follow up I made sure that the result cache of the index is cleared only after one IASTTranslationUnit has been written. This may be important because this cache holds instantiations for which the AST cannot determine equality, otherwise (these instances are no PDOM objects). Created attachment 157438 [details]
Clear result cache only once per IASTTranslationUnit
Great job! I have run the indexer on a bunch of projects. I get the same results as before and do not observe a performance penalty. |