Community
Participate
Working Groups
The current org.eclipse.xtext.builder.builderState.IMarkerUpdater (used by the AbstractBuilderState) defines a batch interface to validate resources and update / delete the corresponding markers. For the builder it would however make more sense to update the markers for one resource at the time. The reason being that the resource validation may (and is in fact likely to) load additional resources into the resource set. Not only will the builder discard these additionally loaded resources right after the validation (the resource set is cleared) but this may also lead to an OOME during validation if the resource set contains a lot of loaded resources. Also note that the current batch interface doesn't really come with any performance benefits. As pointed out the builder would even be able to benefit from a non batch interface. I propose to drop the IMarkerUpdater interface entirely and inline its remaining functionality in AbstractBuilderState. A patch with a solution proposal will follow.
Created attachment 207955 [details] proposed patch The proposed patch removes the IMarkerUpdater and inlines it into the AbstractBuilderState. Further, as described, the builder now validates resources one by one right after resolving them in the second build phase. An alternative to dropping IMarkerUpdater would be to change its interface (copied from AbstractBuilderState in the proposal): void updateMarkers(Resource resource, IProgressMonitor monitor); void deleteMarkers(URI uri, IProgressMonitor monitor); Note that IMarkerUpdater is defined in the builder plug-in and is therefore currently not a published API.
Scheduled for review.
I am not quite convinced. Could you elaborate a bit more why the patch would improve performance? If I get it right, the batch strategy avoids loading resources needed in the validation of several reosurces in the cluster multiple times. How will your solution improve that situation? As far as the OOME is concerned, we need of course less memory when not keeping the whole cluster in memory, but that sounds like the performance for memory trade-off we chose when introducing the clutering strategy. Maybe we should rather look for a better clustering strategy to improve performance?
(In reply to comment #3) > I am not quite convinced. Could you elaborate a bit more why the patch would > improve performance? You are quite correct: The performance benefits are at the moment hypothetical and would need to be exploited by the builder: It is possible that a resource loaded lazily during validation is in the build queue (i.e. has not been processed yet). The builder would have the ability to process that resource next. This is not possible if all resources are validated as a last step of processing a cluster. > If I get it right, the batch strategy avoids loading resources needed in the > validation of several reosurces in the cluster multiple times. How will your > solution improve that situation? The current batch strategy doesn't have any benefits for performance as resources are always loaded lazily into the same resource set. > As far as the OOME is concerned, we need of course less memory when not keeping > the whole cluster in memory, but that sounds like the performance for memory > trade-off we chose when introducing the clutering strategy. > > Maybe we should rather look for a better clustering strategy to improve > performance? I agree that the clustering strategy can be improved. But it would also be possible to implement a build queue reordering as mentioned above (prioritize resources which were loaded on demand into the resource set). In any case I think the proposed change would provide the clustering and loading strategy with a better basis to make the decision. There is an additional improvement we've implemented in our builder based on this resource-per-resource validation: Once a resource has been validated the builder itself doesn't need it anymore, thus the builder can free up resources associated with it. In our builder we then simply clear the resource's cache (XtextResource#getCache()). A more advanced clustering strategy may decide to unload the resource (somehow...). That is one of the next things we would like to do: Some kind of "moving window" clustering strategy. In summary: * I agree that the refactoring doesn't come with any direct performance benefits, but it should form a better basis for future performance improvements. * The IMarkerUpdater would be cleaner (as pointed out the current batch interface doesn't buy anything). * The risk of OOMEs is reduced as the clustering strategy is also given a chance to cap the cluster in between resource validations.
Knut, I am feeling a little bad to remove even an non-API interface without an immediate improvement, e.g. a particular problem solved by the change. This is why I am unassigning this bug. Maybe someone else is willing to take over?
IMHO the cleaner and simpler API would already be reason enough to make the change. I also cannot think of any drawbacks with the proposed change (unless one considers evolving an internal API for better clarity a drawback). And let's also not forget about the resulting lower risk of encountering OOMEs.
The only specialization of the MarkerUpdateImpl that I found via google implements a workaround for something that was fixed for 2.0.x so it *should* not hurt to delete this class and inline its functionality in the builder itself. Personally I don't have a strong opinion about this task but I see the potential that we have if the builder itself is in charge of validating / resource loading / unloading and can therefore exploit the dependency graph between resources to optimize the build order. So please go ahead and update the patch according to the latest changes in the MarkerUpdaterImpl
Pushed fix to master: http://git.eclipse.org/c/tmf/org.eclipse.xtext.git/commit/?id=cb7482175198e3a32325514dcba193309757e77c Instead of completely removing IMarkerUpdater its API was changed to only process a single resource at the time.
Closing all bugs that were set to RESOLVED before Neon.0