Community
Participate
Working Groups
Build Identifier: 20110218-0911 The performance of the builder management (both parallel and non-parallel build) is severely slowed down by the fact that the ErrorParser is an outputstream and connected to all build processes, and that the ErrorParserManager.write is synchronized. A lot of output from the build will slow down the process significantly. Infact a quite simple case, a command line with a few hundred characters, slows down the process. This could easily happen with a few include directories as arguments to the compiler. This is especially clear when parallel building, this leads to severe starvation of new jobs, and in my tests more or less leading to single thread performance. Since the processLine() occurs inside the synchronized write() of the outputstream that is natural. I'm not very familiar with CDT code overall so any input is appreciated. It should perhaps be that the error parsing is done in a separate thread, and not necessarily exactly when the error messages appear. This would require storing the build output in some kind of buffer. Please discuss. I have written a small patch which is a hackish workaround for the slowness of the ErrorParser. Since the ErrorParser is used as a outputstream, it is also used to print the commandlines. My patch is done on the assumption that the command lines themselves never contain anything worth parsing for errors. It patches the printCommandLine() methods of the parallel builder and standard builder to only output the string, not errorparse it. Just doing patch this lets me fully saturate 4 cpus on a quad cpu machine, whereas before I could only get the cpu power of about 1 cpu. Reproducible: Always Steps to Reproduce: 1. Create a standard c project 2. Add a few include directories to build the command line length 3. Add a few files just to have something to compile for testing 4. Notice the speed difference when building with error parsers disabled, and enabled. most easily noticed for parallel builder.
Created attachment 191106 [details] org.eclipse.cdt.core patch Patch is for version v201009241320 of org.eclipse.cdt.core
Created attachment 191108 [details] org.eclipse.cdt.managedbuilder.core patch Patch is for version v201009241320 of org.eclipse.cdt.managedbuilder.core
The patches attached do not apply cleanly. If you attach your patches as a single workspace patch that applies cleanly, it will make it easier to review. This all being said, I'm not sure we can really apply this patch. The assumption is that the error parser gets to see everything that happens on the command line during the build, and having the error parser manager bypass the error parsers for certain lines breaks that assumption. Although most error parsers would not rely upon it, it's hard to say whether there are error parsers out there relying on seeing this output. I know in the past my team has used this to try to get around the issue of parsing errors from tools that do not put a full file context beside each error that is output, so I think it's a very plausible scenario. I believe the synchronization on write is done so that messages that the build system outputs to the console are not messed up by writes done by the build output reader thread.
(In reply to comment #0) > Build Identifier: 20110218-0911 > My patch is done on the assumption that the command lines themselves never > contain anything worth parsing for errors. > It patches the printCommandLine() methods of the parallel builder and standard > builder to only output the string, not errorparse it. > > Just doing patch this lets me fully saturate 4 cpus on a quad cpu machine, > whereas before I could only get the cpu power of about 1 cpu. How does not parsing just command lines printout could solve this issue? If there is any output from the compiler the problem is back, isn't it? Regardless, this sounds like a good idea to move handling of output by error parsers into a separate thread. It sounds like a different issue but should be looked into along with changes suggested in bug 314428.
Created attachment 191132 [details] Workspace patch instead Patch is against v201009241320
(In reply to comment #4) > (In reply to comment #0) > > Build Identifier: 20110218-0911 > > My patch is done on the assumption that the command lines themselves never > > contain anything worth parsing for errors. > > It patches the printCommandLine() methods of the parallel builder and standard > > builder to only output the string, not errorparse it. > > > > Just doing patch this lets me fully saturate 4 cpus on a quad cpu machine, > > whereas before I could only get the cpu power of about 1 cpu. > How does not parsing just command lines printout could solve this issue? If > there is any output from the compiler the problem is back, isn't it? > > Regardless, this sounds like a good idea to move handling of output by error > parsers into a separate thread. It sounds like a different issue but should be > looked into along with changes suggested in bug 314428. Indeed, this patch isn't really a solution. But it is a quick workaround to solve the problem in most cases I have been. In projects where most of the code compiles cleanly, and only a few short warnings on a few files it works good.
(In reply to comment #3) > The patches attached do not apply cleanly. If you attach your patches as a > single workspace patch that applies cleanly, it will make it easier to review. > I attached a workspace patch that should work, sorry about that. > I believe the synchronization on write is done so that messages that the build > system outputs to the console are not messed up by writes done by the build > output reader thread. Yeah, and that's all fine, but coupled with the sometimes-slow-errorparsing happening inside of that synchronized write it leads to performance issues. I think it would be nice to have all the build output in some kind of syncrhonized buffer so it still is synchronized build output, and error parsing can take place at the same time, without blocking for longer times then it takes to retrieve a line for example. But then again, I'm not that familiar with the CDT code so this might not be the best suggestion.