Some Eclipse Foundation services are deprecated, or will be soon. Please ensure you've read this important communication.
Bug 342903 - Improving m2e performance
Summary: Improving m2e performance
Status: RESOLVED FIXED
Alias: None
Product: z_Archived
Classification: Eclipse Foundation
Component: m2e (show other bugs)
Version: unspecified   Edit
Hardware: PC All
: P3 normal (vote)
Target Milestone: ---   Edit
Assignee: Project Inbox CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-04-14 17:46 EDT by Snjezana Peco CLA
Modified: 2021-04-19 13:24 EDT (History)
3 users (show)

See Also:


Attachments
0001-M2e-Performance-All.patch (30.26 KB, patch)
2011-04-14 17:48 EDT, Snjezana Peco CLA
no flags Details | Diff
0001-M2e-Performance-Caching-Lifecycle-Metadata.patch (3.64 KB, patch)
2011-04-14 17:49 EDT, Snjezana Peco CLA
no flags Details | Diff
0001-M2e-Performance-Markers.patch (8.57 KB, patch)
2011-04-14 17:49 EDT, Snjezana Peco CLA
no flags Details | Diff
0001-M2e-Performance-Nexus-indexer (4.76 KB, patch)
2011-04-14 17:50 EDT, Snjezana Peco CLA
no flags Details | Diff
0001-M2e-Performance-Pom-Editor-leak (5.80 KB, patch)
2011-04-14 17:50 EDT, Snjezana Peco CLA
no flags Details | Diff
0001-M2e-Performance-Project-refresh.patch (2.43 KB, patch)
2011-04-14 17:51 EDT, Snjezana Peco CLA
no flags Details | Diff
0001-M2e-Performance-Project-Refresh-Job.patch (3.96 KB, patch)
2011-04-14 17:52 EDT, Snjezana Peco CLA
no flags Details | Diff
0001-M2e-Performance-Source-Attachment.patch (1.75 KB, patch)
2011-04-14 17:53 EDT, Snjezana Peco CLA
no flags Details | Diff
0001-M2e-Performance-Workspace-State.patch (2.18 KB, patch)
2011-04-14 17:54 EDT, Snjezana Peco CLA
no flags Details | Diff
0001-M2e-Performance-Caching-Lifecycle-Metadata.patch (5.35 KB, patch)
2011-04-17 18:08 EDT, Snjezana Peco CLA
no flags Details | Diff
0001-M2e-Performance-Project-Refresh-Job.patch (4.47 KB, patch)
2011-04-17 18:10 EDT, Snjezana Peco CLA
no flags Details | Diff
0001-M2e-Performance-Source-Attachment.patch (3.03 KB, patch)
2011-04-17 18:12 EDT, Snjezana Peco CLA
no flags Details | Diff
0001-M2e-Performance-Markers.patch (8.57 KB, patch)
2011-04-17 18:13 EDT, Snjezana Peco CLA
no flags Details | Diff
0001-M2e-Performance-Nexus-indexer (4.77 KB, patch)
2011-04-17 18:13 EDT, Snjezana Peco CLA
no flags Details | Diff
0001-M2e-Performance-Workspace-State.patch (2.18 KB, patch)
2011-04-17 18:14 EDT, Snjezana Peco CLA
igor: iplog+
Details | Diff
0001-M2E-Performance-All-patches-except-Project-Refresh.patch (26.58 KB, patch)
2011-04-17 18:17 EDT, Snjezana Peco CLA
no flags Details | Diff
0001-M2e-Performance-Caching-Lifecycle-Metadata.patch (6.85 KB, patch)
2011-04-18 17:46 EDT, Snjezana Peco CLA
no flags Details | Diff
0001-M2e-Performance-Caching-Lifecycle-Metadata.patch (6.87 KB, patch)
2011-04-20 13:51 EDT, Snjezana Peco CLA
igor: iplog+
Details | Diff
0001-m2e-performance-Updating-a-classpath.patch (2.90 KB, patch)
2011-04-28 17:54 EDT, Snjezana Peco CLA
igor: iplog+
Details | Diff
0002-m2e-performance-Updating-a-classpath.patch (4.69 KB, patch)
2011-04-29 12:15 EDT, Snjezana Peco CLA
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Snjezana Peco CLA 2011-04-14 17:46:21 EDT
Build Identifier: master

The issue is described in http://dev.eclipse.org/mhonarc/lists/m2e-dev/msg00354.html
I have tested JBoss AS 7 in the offline mode (JBoss AS 7 has been built using Maven CLI).
I have created separate patches, but the enhancement is significant only when using all the patches (0001-M2e-Performance-All.patch).
The enhancement is about three times.

> 1) ProjectRegistryRefreshJob changes a workspace and fires a lot of resource change listeners. 
When creating a marker, for instance, this job fires five resource change listeners (createMarker and four the setAttribute methods). This job should be WorkspaceJob.

> 2) ProjectRegistryRefreshJob and MavenBuilder (a builder job) aren't sinchronized what often causes StaleMutableProjectRegistryException (ProjectRegistryManager, line 319) and that is the cause of starting this job again.

Issues #1 and #2 are related to each other. The easiest way to synchronize these two jobs is that ProjectRegistryRefreshJob is a workspace job.
If you think a workspace lock is too aggressive, it is possible to create this patch in the following way:

- creating markers within WorkspaceRunnable
- synchronizing the jobs using the standard locking mechanism

I haven't used Job.getJobManager().join(...) because it could cause a deadlock when there is EPP's "eventConsumer" job. See https://bugs.eclipse.org/bugs/show_bug.cgi?id=300124 and https://bugs.eclipse.org/bugs/show_bug.cgi?id=306449

Patch:  0001-M2e-Performance-Project-Refresh-Job.patch
speeding up: the Maven>Update Dependencies action about 10%
The enhancement is more significant when Eclipse is heavy loaded.

> 3) LifecycleMappingFactory.getBundleMetadataSources is too often called. Every time it is called it reads the extension point registry and lifecycle mapping xml files. Caching would speed up the overall performance.

Patch: 0001-M2e-Performance-Caching-Lifecycle-Metadata.patch
Speeding up: < 1%

The enhancement could be more significant if additional maven configurators (wtp, JBoss Tools, Google ...) are created.

> 4) the BuildPathManager.configureAttchedSourcesAndJavadoc method is very slow. I have improved it a little (scheduleDownload isn't called when the download preferences are off), but it is possible to improve it more.

Patch:  0001-M2e-Performance-Source-Attachement.patch
Speeding up: 5-10%
The patch includes a fix for issue #8.

> 5) NexusIndexManager.mavenProjectChanged(...) always removes and adds back an artifact to the index repository. Performance will be much better if the index repository is updated only when > the artifact doesn't exist or is changed. Searching the index repository is much faster than removing and adding artifacts. 

Patch: 0001-M2e-performance-Nexus-indexer.patch
Speeding up: 15-20%

> 6) MarkerLocationService.addEditorHintMarkers calls two methods that acquire the WTP's IDOMModel (StructuredModelManager.getModelManager().getModelForRead(...)) and release it. WTP's methods can be slow for larger files and it would be better to acquire/release the model once for MarkerLocationService.addEditorHintMarkers. MarkerLocationService.addEditorHintMarkers is used only once.

Patch: 0001-M2e-Performance-Markers.patch
Speeding up: 10%

> 7) ProjectRegistryManager.applyMutableProjectRegistry calls stateReader.writeWorkspaceState(projectRegistry) when refreshing a project. Since the workspaceState.ser file is only used when starting a workspace, this method can be used only when stopping the org.eclipse.m2e.core bundle

Patch: 0001-M2e-Performance-Workspace-State.patch
Speeding up: 2-3 %

> 8) BuildPathManager.mavenProjectChanged updates Maven classpath containers of all projects no matter they are changed or not. They should be updated only for relevant events  (event.getFlags() != 0).

See issue #4.

> 9) ProjectRegistryManager.refresh(MutableProjectRegistry newState, DependencyResolutionContext context, IProgressMonitor monitor) aggressively
read Maven projects.
When calling Maven>Update Project Configuration on the jboss-as-parent project, the MavenImpl.readProject  is called for about 2-6 times for each project in the workspace. Sometimes this method lasts short time (Maven cache is active), but sometimes takes a longer time.
I have tried to cache MavenProjectFacade in each of the two phases so that MavenImpl.readProject is called two times for every project (once per each phase). Not sure if it is possible to make this method to be run only once.

Patch: 0001-M2e-Performance-Project-refresh.patch
Speeding up: about 60%

10) The Maven Pom Editor has a huge memory leak. Just opening/closing jboss-as-parent/pom.xml will cause the JVM heap size to increase by 2-10MB which can cause Eclipse to crash (OOM).

Patch: 0001-M2e-Performance-Pom-Editor-Leak.patch

The leak happens when clicking the Effective POM tab. M2eclipse calls effectivePomSourcePage.setInput(editorInput) every time when clicking this tab. Calling StructuredTextEditor.setInput the second time adds a selection change listener that isn't released which retains the complete MavenPomEditor structure in the heap. The patch changes the contents of the Effective POM page using the IDocument interface.


Reproducible: Always

Steps to Reproduce:
1. Apply some patch against m2e-core master
2. Call Maven>Update Project Configuration, Maven>Update Dependencies or edit a pom.xml file
Comment 1 Snjezana Peco CLA 2011-04-14 17:48:24 EDT
Created attachment 193307 [details]
0001-M2e-Performance-All.patch
Comment 2 Snjezana Peco CLA 2011-04-14 17:49:04 EDT
Created attachment 193308 [details]
0001-M2e-Performance-Caching-Lifecycle-Metadata.patch
Comment 3 Snjezana Peco CLA 2011-04-14 17:49:43 EDT
Created attachment 193309 [details]
0001-M2e-Performance-Markers.patch
Comment 4 Snjezana Peco CLA 2011-04-14 17:50:16 EDT
Created attachment 193310 [details]
0001-M2e-Performance-Nexus-indexer
Comment 5 Snjezana Peco CLA 2011-04-14 17:50:49 EDT
Created attachment 193311 [details]
0001-M2e-Performance-Pom-Editor-leak
Comment 6 Snjezana Peco CLA 2011-04-14 17:51:37 EDT
Created attachment 193312 [details]
0001-M2e-Performance-Project-refresh.patch
Comment 7 Snjezana Peco CLA 2011-04-14 17:52:20 EDT
Created attachment 193313 [details]
0001-M2e-Performance-Project-Refresh-Job.patch
Comment 8 Snjezana Peco CLA 2011-04-14 17:53:33 EDT
Created attachment 193314 [details]
0001-M2e-Performance-Source-Attachment.patch
Comment 9 Snjezana Peco CLA 2011-04-14 17:54:13 EDT
Created attachment 193315 [details]
0001-M2e-Performance-Workspace-State.patch
Comment 10 Igor Fedorenko CLA 2011-04-15 09:05:04 EDT
Sigh. It's unfortunate you decided to file all separate/unrelated patches under the same bugzilla... Overall changed line count is over 250 LOC, which means that we will have to review (and possibly ask for fixes) all patches first, then file a CQ for entire submission and wait for it approved before we'll be able to apply any of these patches (http://www.eclipse.org/legal/EclipseLegalProcessPoster.pdf).
Comment 11 Igor Fedorenko CLA 2011-04-15 09:15:18 EDT
For issues 1 and 2, I believe there is a better solution. When workspace autobuild is enabled, there is no need to run ProjectRegistryRefreshJob at all because MavenBuilder will be called for all the same workspace changes and will re-read the projects as necessary. Regardless of workspace autobuild on/off state, ProjectRegistryManager should batch marker changes and apply all of them as part of manager.applyMutableProjectRegistry call. Please provide reworked 0001-M2e-Performance-Project-Refresh-Job.patch.
Comment 12 Igor Fedorenko CLA 2011-04-15 09:20:45 EDT
Comment on attachment 193308 [details]
0001-M2e-Performance-Caching-Lifecycle-Metadata.patch

0001-M2e-Performance-Caching-Lifecycle-Metadata.patch does not handle dynamic installation and uninstallation of bundles. It also does not properly synchronize access to static lazy-initialized member. If you believe that <1% performance improvement justifies the added complexity of the proper fix, please provide reworked patch. We will also need unit tests that verify how the new code handles installation/uninstallation of bundles.
Comment 13 Igor Fedorenko CLA 2011-04-15 09:22:12 EDT
Comment on attachment 193313 [details]
0001-M2e-Performance-Project-Refresh-Job.patch

see comment #11 for the reasons why
Comment 14 Snjezana Peco CLA 2011-04-15 12:26:37 EDT
(In reply to comment #11)
> For issues 1 and 2, I believe there is a better solution. When workspace
> autobuild is enabled, there is no need to run ProjectRegistryRefreshJob at all
> because MavenBuilder will be called for all the same workspace changes and will
> re-read the projects as necessary. Regardless of workspace autobuild on/off
> state, ProjectRegistryManager should batch marker changes and apply all of them
> as part of manager.applyMutableProjectRegistry call. Please provide reworked
> 0001-M2e-Performance-Project-Refresh-Job.patch.


Do you mean ProjectRegistryRefreshJob calls ProjectRegistryManager.refresh only when auto-build is on, but ProjectRegistryManager.applyMutableProjectRegistry is always called?
ProjectRegistryRefreshJob is also called from the Maven>Update Dependencies and Maven>Update Snapshots action. Then, both ProjectRegistryManager.refresh and ProjectRegistryManager.applyMutableProjectRegistry would be called. Right?

This would probably improve performance when auto-build is on, but the build job and ProjectRegistryRefreshJob still should be synchronized because the user could call a manual build and edit pom.xml while building.
Comment 15 Igor Fedorenko CLA 2011-04-15 15:09:58 EDT
(In reply to comment #14)
> 
> Do you mean ProjectRegistryRefreshJob calls ProjectRegistryManager.refresh only
> when auto-build is on, but ProjectRegistryManager.applyMutableProjectRegistry
> is always called?
> ProjectRegistryRefreshJob is also called from the Maven>Update Dependencies and
> Maven>Update Snapshots action. Then, both ProjectRegistryManager.refresh and
> ProjectRegistryManager.applyMutableProjectRegistry would be called. Right?
> 
> This would probably improve performance when auto-build is on, but the build
> job and ProjectRegistryRefreshJob still should be synchronized because the user
> could call a manual build and edit pom.xml while building.

What I mean is, ProjectRegistryRefreshJob should not listen/react to resource change events when workspace autobuild is enabled. 

So far, I am not convinced that MavenBuilder and ProjectRegistryRefreshJob must be synchronized to handle the case when the user manually forces workspace build and saves pom.xml at the same time. In fact, I believe it is desirable to allow better concurrency at the expanse of somewhat slower overall throughput in this case.
Comment 16 Igor Fedorenko CLA 2011-04-15 15:14:53 EDT
Comment on attachment 193312 [details]
0001-M2e-Performance-Project-refresh.patch

This patch breaks some workspace parent pom resolution scenarios. I've added a unit test to show the problem https://github.com/sonatype/m2e-core-tests/commit/4a1b6cd05d20ff42a74338fc64b205a9d2e6248b .

Beware that we are fixing some bugs in dependency tracking code to address issues identified in bug 342910, so you probably want to wait until we push our changes. There is a chance we'll implement this particular performance improvement too ;-)
Comment 17 Igor Fedorenko CLA 2011-04-16 00:19:57 EDT
fyi, most of the work re bug 342910 is done now. m2e is expected to refresh projects more efficiently now http://git.eclipse.org/c/m2e/m2e-core.git/commit/?id=9d1d5dfa5bf140ee229fbcfb67929c78e7d92f01
Comment 18 Snjezana Peco CLA 2011-04-17 18:03:20 EDT
> fyi, most of the work re bug #342910 is done now. m2e is expected to refresh
> projects more efficiently now
> http://git.eclipse.org/c/m2e/m2e-core.git/commit/?id=9d1d5dfa5bf140ee229fbcfb67929c78e7d92f01

It seems to cache projects the same way as in the attachment  #193312 [details] 0001-M2e-Performance-Project-refresh.patch. 
Speeding up is about 20-30% (less than 60% that is provided by the attached patch, probably due to the listeners you added).
You can ignore the patch.

> bug #342910
> * Introduce new "Do not automatically update dependencies from remote
> repositories" boolean preference. When checked (the default) m2e will force
> "never" global snapshot/release update policy, i.e. only resolve new/unknown
> artifacts from remote repositories and "never" re-resolve artifacts already
> present locally.

I have proposed that in http://dev.eclipse.org/mhonarc/lists/m2e-dev/msg00371.html and think that will significantly increase overall m2e performance.
Users will be able to work offline and download artifacts when necessary.
Comment 19 Snjezana Peco CLA 2011-04-17 18:08:42 EDT
Created attachment 193445 [details]
0001-M2e-Performance-Caching-Lifecycle-Metadata.patch

> Comment on attachment  #193308 [details]
> 0001-M2e-Performance-Caching-Lifecycle-Metadata.patch

> 0001-M2e-Performance-Caching-Lifecycle-Metadata.patch does not handle dynamic
> installation and uninstallation of bundles. It also does not properly
> synchronize access to static lazy-initialized member. If you believe that <1%
> performance improvement justifies the added complexity of the proper fix,
> please provide reworked patch. We will also need unit tests that verify how the
> new code handles installation/uninstallation of bundles.

This patch could be important in the case of existing more bundles with lifecycle metadata. Now it is synchronized and cleaned up when uninstalling its bundle.
As to unit tests, I suppose it is also necessary to test other m2e features (ProjectConfigurationManager, ProjectRegistryRefreshJob, MavenProjectManager, PlexusContainer ...).
I haven't succeeded to find tests that test installation/uninstallation of bundles.

Attached is a new patch rebased to m2e-core master.
Comment 20 Snjezana Peco CLA 2011-04-17 18:10:15 EDT
Created attachment 193446 [details]
0001-M2e-Performance-Project-Refresh-Job.patch

(In reply to comment #15)

Now, ProjectRegistryRefreshJob.resourceChanged is disabled when workspace autobuild is enabled.
I believe this will improve performance when editing pom.xml files.
Synchronization code is the same as in the previous patch.

Attached is a new patch rebased to m2e-core master.
Comment 21 Snjezana Peco CLA 2011-04-17 18:12:11 EDT
Created attachment 193447 [details]
0001-M2e-Performance-Source-Attachment.patch

I have enhanced this patch. 
Namely, the ProjectRegistry.hasDependencyChange method calls the ProjectRegistryManager.hasDiff method in order to decide if a classpath is changed or not.
The ProjectRegistryManager.hasDiff method tests if two sets are equal and is too restrictive.
The ProjectRegistryManager.hasDiff method requires the sets to be ordered and in the most of the cases returns false for the same sets (classpaths). 
The patch will enhance refreshing (attachment #193312 [details] 0001-M2e-Performance-Project-refresh.patch and bug #342910).
If the sets need to be ordered, m2e would need to use a sorted set.

Now the patch speeds up the Maven>Update Configuration action by 20%.
Comment 22 Snjezana Peco CLA 2011-04-17 18:13:10 EDT
Created attachment 193448 [details]
0001-M2e-Performance-Markers.patch
Comment 23 Snjezana Peco CLA 2011-04-17 18:13:50 EDT
Created attachment 193449 [details]
0001-M2e-Performance-Nexus-indexer
Comment 24 Snjezana Peco CLA 2011-04-17 18:14:52 EDT
Created attachment 193450 [details]
0001-M2e-Performance-Workspace-State.patch
Comment 25 Snjezana Peco CLA 2011-04-17 18:17:13 EDT
Created attachment 193451 [details]
0001-M2E-Performance-All-patches-except-Project-Refresh.patch

I have rebased all the patches except:

- attachment  #193312 [details] 0001-M2e-Performance-Project-refresh.patch that is obsolete because it is included in bug #342910.
- attachment  #193311 [details] 0001-M2e-Performance-Pom-Editor-leak - I will add it to a separate bug.

A new cumulative patch includes all the rebased patches. Enhancement is still about 3 times.
Patch: 0001-M2E-Performance-All-patches-except-Project-Refresh

Now the cumulative patch is < 250 lines in size.
Comment 26 Igor Fedorenko CLA 2011-04-18 14:25:01 EDT
Comment on attachment 193445 [details]
0001-M2e-Performance-Caching-Lifecycle-Metadata.patch

I don't see how this patch handles dynamically installed/removed bundles better than the original version. For example, if the user installed m2e-tycho project configuration without restarting eclipse, m2e is expected to see the new bundle's metadata right away. With this patch, m2e will continue to use the cached metadata until eclipse or at least m2e is restarted.
Comment 27 Igor Fedorenko CLA 2011-04-18 14:36:59 EDT
Comment on attachment 193446 [details]
0001-M2e-Performance-Project-Refresh-Job.patch

At this point I am not convinced that blocking workspace updates for entire duration of project dependency update is necessary. I agree with the change to #resourceChanged method, but not the rest of this patch (which, btw, looks rather hackish). 

Please provide sample project and steps I can use to see the benefits of the additional synchronization. Alternatively, provide reworked patch that only includes the change to #resourceChanged method.
Comment 28 Igor Fedorenko CLA 2011-04-18 15:01:22 EDT
Comment on attachment 193447 [details]
0001-M2e-Performance-Source-Attachment.patch

This patch breaks 13 unit tests (see the list below). If you believe the tests are wrong, please provide a patch that corrects the tests. I do not believe the tests are wrong, however, so some explanation is needed as well.

Also, we need some tests that demonstrate why the old ProjectRegistryManager.hasDiff implementation was wrong. More specifically, dependency order IS important as it affects compile and runtime classpath order. Depending on classpath order different versions of the same classes and/or resources will be used and I need to understand why you believe that dependency order change can be ignored.

List of unit tests that fail after applying this patch

testPomDelete(org.eclipse.m2e.tests.ResourceChangeListenerTest)
testPomRename(org.eclipse.m2e.tests.ResourceChangeListenerTest)
test(org.eclipse.m2e.tests.ClasspathProviderTest)
testDownloadSources_001_basic(org.eclipse.m2e.tests.BuildPathManagerTest)
testDownloadSources_001_sourceAttachment(org.eclipse.m2e.tests.BuildPathManagerTest)
testDownloadSources_002_javadoconly(org.eclipse.m2e.tests.BuildPathManagerTest)
testDownloadSources_003_customRemoteRepository(org.eclipse.m2e.tests.BuildPathManagerTest)
testDownloadSources_004_testsClassifier(org.eclipse.m2e.tests.BuildPathManagerTest)
testDownloadSources_005_classifier(org.eclipse.m2e.tests.BuildPathManagerTest)
testDownloadSources_007_missingTestsSources(org.eclipse.m2e.tests.BuildPathManagerTest)
testClassifiers(org.eclipse.m2e.tests.BuildPathManagerTest)
test005_dependencyAvailableFromLocalRepoAndWorkspace(org.eclipse.m2e.tests.BuildPathManagerTest)
test013_missingDependencyMarker(org.eclipse.m2e.tests.ProjectRegistryManagerTest)
Comment 29 Igor Fedorenko CLA 2011-04-18 15:37:21 EDT
Comment on attachment 193450 [details]
0001-M2e-Performance-Workspace-State.patch

I do not see any measurable performance improvement during project dependency update after applying this patch. Can you provide a sample project and steps I need to take to see the performance improvement.
Comment 30 Igor Fedorenko CLA 2011-04-18 15:46:57 EDT
Comment on attachment 193448 [details]
0001-M2e-Performance-Markers.patch

I do not see any measurable performance improvement during project dependency
update after applying this patch. Can you provide a sample project and steps I
need to take to see the performance improvement?
Comment 31 Igor Fedorenko CLA 2011-04-18 16:02:05 EDT
Comment on attachment 193449 [details]
0001-M2e-Performance-Nexus-indexer

This patch looks promising and I can confirm ~30% performance improvement compared to master@d86b67 during dependency refresh on my hardware using jboss-as as a test project. Before I apply this change

1. Please reconcile the new #getArtifactInfo with existing #getIndexedArtifactFile methods. They are virtually identical.
2a. Workspace index only contains information about project GAV coordinates (see how it is setup with NexusIndex.DETAILS_MIN around line 186). So you only really need to compare project's old/new artifact keys to know if index update is required. This should significantly simplify the patch.
2b. If you still believe more elaborate "should-index-be-updated" logic is necessary, we will need unit tests to justify/verify it.
Comment 32 Snjezana Peco CLA 2011-04-18 17:44:45 EDT
(In reply to comment #26)
> Comment on attachment 193445 [details]
> 0001-M2e-Performance-Caching-Lifecycle-Metadata.patch
> 
> I don't see how this patch handles dynamically installed/removed bundles better
> than the original version. For example, if the user installed m2e-tycho project
> configuration without restarting eclipse, m2e is expected to see the new
> bundle's metadata right away. With this patch, m2e will continue to use the
> cached metadata until eclipse or at least m2e is restarted.

I supposed you meant installation/uninstallation of the o.e.m2e.core bundle. 
Attached is a new patch.
Comment 33 Snjezana Peco CLA 2011-04-18 17:46:26 EDT
Created attachment 193534 [details]
0001-M2e-Performance-Caching-Lifecycle-Metadata.patch
Comment 34 Snjezana Peco CLA 2011-04-18 17:47:55 EDT
(In reply to comment #27)
> Comment on attachment 193446 [details]
> 0001-M2e-Performance-Project-Refresh-Job.patch
> 
> At this point I am not convinced that blocking workspace updates for entire
> duration of project dependency update is necessary. I agree with the change to
> #resourceChanged method, but not the rest of this patch (which, btw, looks
> rather hackish). 
> 
> Please provide sample project and steps I can use to see the benefits of the
> additional synchronization. Alternatively, provide reworked patch that only
> includes the change to #resourceChanged method.

I have tested JBoss AS 7 (git://github.com/jbossas/jboss-as.git) and the Maven>Update Project Configuration action on the jboss-as-parent project.
You can try any other project containing a maven marker as follows:

- set a breakpoint on some IResourceChangeListener.resourceChanged method (doesn't have to be within m2e)
- call the Maven>Update Project Configuration action

If you do this before applying the patch, you will see that IResourceChangeListener.resourceChanged gets called a lot of time (depending on maven markers).
If you apply the patch, IResourceChangeListener.resourceChanged will be called only once.
An Eclipse installation can have a lot of resource change listeners and the patch will improve performance when there are a large number of plugins in a distribution and projects in the workspace.
When refreshing the jboss-as-parent project, for instance, all resource change listeners in Eclipse will be called about 1000 times (53 projects X 3-4 maven markers X 5). 
You can also create markers using WorkspaceRunnable as described in comment #0. That would decrease the number of calling resource change listeners 5 times.
IMO this patch brings significant enhancement. If you don't think so, ignore it.
Comment 35 Snjezana Peco CLA 2011-04-18 17:49:06 EDT
(In reply to comment #28)
> Comment on attachment 193447 [details]
> 0001-M2e-Performance-Source-Attachment.patch
> 
> This patch breaks 13 unit tests (see the list below). If you believe the tests
> are wrong, please provide a patch that corrects the tests. I do not believe the
> tests are wrong, however, so some explanation is needed as well.
> 
> Also, we need some tests that demonstrate why the old
> ProjectRegistryManager.hasDiff implementation was wrong. More specifically,
> dependency order IS important as it affects compile and runtime classpath
> order. Depending on classpath order different versions of the same classes
> and/or resources will be used and I need to understand why you believe that
> dependency order change can be ignored.
> 
> List of unit tests that fail after applying this patch
> 
> testPomDelete(org.eclipse.m2e.tests.ResourceChangeListenerTest)
> testPomRename(org.eclipse.m2e.tests.ResourceChangeListenerTest)
> test(org.eclipse.m2e.tests.ClasspathProviderTest)
> testDownloadSources_001_basic(org.eclipse.m2e.tests.BuildPathManagerTest)
> testDownloadSources_001_sourceAttachment(org.eclipse.m2e.tests.BuildPathManagerTest)
> testDownloadSources_002_javadoconly(org.eclipse.m2e.tests.BuildPathManagerTest)
> testDownloadSources_003_customRemoteRepository(org.eclipse.m2e.tests.BuildPathManagerTest)
> testDownloadSources_004_testsClassifier(org.eclipse.m2e.tests.BuildPathManagerTest)
> testDownloadSources_005_classifier(org.eclipse.m2e.tests.BuildPathManagerTest)
> testDownloadSources_007_missingTestsSources(org.eclipse.m2e.tests.BuildPathManagerTest)
> testClassifiers(org.eclipse.m2e.tests.BuildPathManagerTest)
> test005_dependencyAvailableFromLocalRepoAndWorkspace(org.eclipse.m2e.tests.BuildPathManagerTest)
> test013_missingDependencyMarker(org.eclipse.m2e.tests.ProjectRegistryManagerTest)

My test was the following:

- imported JBoss AS 7 - git://github.com/jbossas/jboss-as.git
- built it
- set a breakpoint on ProjectRegistry.hasDependencyChange
- called Maven>Update Project Configuration

ProjectRegistry.hasDependencyChange always returns true (there are changes) although I haven't changed any project.  
The ClasspathDescriptor class saves entries in ArrayList what means they are in random order. This is probably the reason why ProjectRegistry.hasDependencyChange always returns false . Since MavenProject.getArtifacts() uses LinkedHashSet, I suppose ClasspathDescriptor.entries should be LinkedList. Do you agree?
Comment 36 Snjezana Peco CLA 2011-04-18 18:02:20 EDT
(In reply to comment #29)

I have tested JBoss AS 7 (git://github.com/jbossas/jboss-as.git) .
In my workspace, the workspaceState.ser is about 700k  in size and it is unnecessarily written after refreshing a project (53 times for the jboss-as-parent project).
Comment 37 Snjezana Peco CLA 2011-04-18 18:03:19 EDT
(In reply to comment #30)
> Comment on attachment 193448 [details]
> 0001-M2e-Performance-Markers.patch
> 
> I do not see any measurable performance improvement during project dependency
> update after applying this patch. Can you provide a sample project and steps I
> need to take to see the performance improvement?

See comment #36
Comment 38 Igor Fedorenko CLA 2011-04-18 18:16:32 EDT
(In reply to comment #34)
> 
> I have tested JBoss AS 7 (git://github.com/jbossas/jboss-as.git) and the
> Maven>Update Project Configuration action on the jboss-as-parent project.
> You can try any other project containing a maven marker as follows:
> 
> - set a breakpoint on some IResourceChangeListener.resourceChanged method
> (doesn't have to be within m2e)
> - call the Maven>Update Project Configuration action
> 
> If you do this before applying the patch, you will see that
> IResourceChangeListener.resourceChanged gets called a lot of time (depending on
> maven markers).
> If you apply the patch, IResourceChangeListener.resourceChanged will be called
> only once.
> An Eclipse installation can have a lot of resource change listeners and the
> patch will improve performance when there are a large number of plugins in a
> distribution and projects in the workspace.
> When refreshing the jboss-as-parent project, for instance, all resource change
> listeners in Eclipse will be called about 1000 times (53 projects X 3-4 maven
> markers X 5). 
> You can also create markers using WorkspaceRunnable as described in comment #0.
> That would decrease the number of calling resource change listeners 5 times.
> IMO this patch brings significant enhancement. If you don't think so, ignore
> it.

I suggest we split this patch in two.

The change to #resourceChanged is good and I will extract/apply just this part if you do not mind (please confirm).

To reduce number of resource change events, however, we'd need a new patch that *collects* marker information but defers actual marker creation until #applyMutableProjectRegistry is called. This should eliminate unnecessary resource change events without blocking workspace modifications for extended periods of time.
Comment 39 Igor Fedorenko CLA 2011-04-18 18:28:04 EDT
(In reply to comment #35)
> (In reply to comment #28)
> > Comment on attachment 193447 [details] [details]
> > 0001-M2e-Performance-Source-Attachment.patch
> > 
> > This patch breaks 13 unit tests (see the list below). If you believe the tests
> > are wrong, please provide a patch that corrects the tests. I do not believe the
> > tests are wrong, however, so some explanation is needed as well.
> > 
> > Also, we need some tests that demonstrate why the old
> > ProjectRegistryManager.hasDiff implementation was wrong. More specifically,
> > dependency order IS important as it affects compile and runtime classpath
> > order. Depending on classpath order different versions of the same classes
> > and/or resources will be used and I need to understand why you believe that
> > dependency order change can be ignored.
> > 
> > List of unit tests that fail after applying this patch
> > 
> > testPomDelete(org.eclipse.m2e.tests.ResourceChangeListenerTest)
> > testPomRename(org.eclipse.m2e.tests.ResourceChangeListenerTest)
> > test(org.eclipse.m2e.tests.ClasspathProviderTest)
> > testDownloadSources_001_basic(org.eclipse.m2e.tests.BuildPathManagerTest)
> > testDownloadSources_001_sourceAttachment(org.eclipse.m2e.tests.BuildPathManagerTest)
> > testDownloadSources_002_javadoconly(org.eclipse.m2e.tests.BuildPathManagerTest)
> > testDownloadSources_003_customRemoteRepository(org.eclipse.m2e.tests.BuildPathManagerTest)
> > testDownloadSources_004_testsClassifier(org.eclipse.m2e.tests.BuildPathManagerTest)
> > testDownloadSources_005_classifier(org.eclipse.m2e.tests.BuildPathManagerTest)
> > testDownloadSources_007_missingTestsSources(org.eclipse.m2e.tests.BuildPathManagerTest)
> > testClassifiers(org.eclipse.m2e.tests.BuildPathManagerTest)
> > test005_dependencyAvailableFromLocalRepoAndWorkspace(org.eclipse.m2e.tests.BuildPathManagerTest)
> > test013_missingDependencyMarker(org.eclipse.m2e.tests.ProjectRegistryManagerTest)
> 
> My test was the following:
> 
> - imported JBoss AS 7 - git://github.com/jbossas/jboss-as.git
> - built it
> - set a breakpoint on ProjectRegistry.hasDependencyChange
> - called Maven>Update Project Configuration
> 
> ProjectRegistry.hasDependencyChange always returns true (there are changes)
> although I haven't changed any project.  
> The ClasspathDescriptor class saves entries in ArrayList what means they are in
> random order. This is probably the reason why
> ProjectRegistry.hasDependencyChange always returns false . Since
> MavenProject.getArtifacts() uses LinkedHashSet, I suppose
> ClasspathDescriptor.entries should be LinkedList. Do you agree?

* Both LinkedHashSet and ArrayList guarantee order of elements, so this can't explain the behaviour you see
* To help us understand the problem better and come up with a solution, please provide a standalone test project that demonstrates the problem. The test project should be relatively small so it can be turned into a regression test we will use to make use the problem does not reappear.
Comment 40 Snjezana Peco CLA 2011-04-18 18:41:09 EDT
(In reply to comment #38)
> I suggest we split this patch in two.
> 
> The change to #resourceChanged is good and I will extract/apply just this part
> if you do not mind (please confirm).
>

Agreed.
 
> To reduce number of resource change events, however, we'd need a new patch that
> *collects* marker information but defers actual marker creation until
> #applyMutableProjectRegistry is called. This should eliminate unnecessary
> resource change events without blocking workspace modifications for extended
> periods of time.
Comment 41 Igor Fedorenko CLA 2011-04-18 20:25:16 EDT
Applied #resourceChanged part of 0001-M2e-Performance-Project-Refresh-Job.patch http://git.eclipse.org/c/m2e/m2e-core.git/commit/?id=175da5ea95f3be06087c5c4959b9f86f5b880f42
Comment 42 Igor Fedorenko CLA 2011-04-18 20:26:03 EDT
Comment on attachment 193450 [details]
0001-M2e-Performance-Workspace-State.patch

Applied http://git.eclipse.org/c/m2e/m2e-core.git/commit/?id=bd28b4ed9db64973915d0ad8d4cbedb2504032e3
Comment 43 Snjezana Peco CLA 2011-04-19 18:02:54 EDT
(In reply to comment #31)
> Comment on attachment 193449 [details]
> 0001-M2e-Performance-Nexus-indexer
> 
> This patch looks promising and I can confirm ~30% performance improvement
> compared to master@d86b67 during dependency refresh on my hardware using
> jboss-as as a test project. Before I apply this change
> 
> 1. Please reconcile the new #getArtifactInfo with existing
> #getIndexedArtifactFile methods. They are virtually identical.
> 2a. Workspace index only contains information about project GAV coordinates
> (see how it is setup with NexusIndex.DETAILS_MIN around line 186). So you only
> really need to compare project's old/new artifact keys to know if index update
> is required. This should significantly simplify the patch.
> 2b. If you still believe more elaborate "should-index-be-updated" logic is
> necessary, we will need unit tests to justify/verify it.

I could try to make those changes.
Re #2b, I think we would have to check timestamp.
Comment 44 Igor Fedorenko CLA 2011-04-20 12:33:49 EDT
(In reply to comment #43)
> (In reply to comment #31)
> > Comment on attachment 193449 [details] [details]
> > 0001-M2e-Performance-Nexus-indexer
> > 
> > This patch looks promising and I can confirm ~30% performance improvement
> > compared to master@d86b67 during dependency refresh on my hardware using
> > jboss-as as a test project. Before I apply this change
> > 
> > 1. Please reconcile the new #getArtifactInfo with existing
> > #getIndexedArtifactFile methods. They are virtually identical.
> > 2a. Workspace index only contains information about project GAV coordinates
> > (see how it is setup with NexusIndex.DETAILS_MIN around line 186). So you only
> > really need to compare project's old/new artifact keys to know if index update
> > is required. This should significantly simplify the patch.
> > 2b. If you still believe more elaborate "should-index-be-updated" logic is
> > necessary, we will need unit tests to justify/verify it.
> 
> I could try to make those changes.
> Re #2b, I think we would have to check timestamp.

With help and advice from Maven Indexer developers, we've implemented http://git.eclipse.org/c/m2e/m2e-core.git/commit/?id=40d3424ae941113e36a054b719d0fe1fef5359e0 , which puts this optimization where it belongs, i.e. in Maven Indexer itself.
Comment 45 Igor Fedorenko CLA 2011-04-20 12:37:01 EDT
Comment on attachment 193534 [details]
0001-M2e-Performance-Caching-Lifecycle-Metadata.patch

This patch does not apply. Do you mind rebasing it off latest master HEAD?
Comment 46 Snjezana Peco CLA 2011-04-20 13:51:58 EDT
Created attachment 193730 [details]
0001-M2e-Performance-Caching-Lifecycle-Metadata.patch

Rebased to the latest master.
Comment 47 Igor Fedorenko CLA 2011-04-20 15:27:12 EDT
Comment on attachment 193730 [details]
0001-M2e-Performance-Caching-Lifecycle-Metadata.patch

I've added missing synchronized modified to setBundleMetadataSources and applied this patch. Thank you. 

http://git.eclipse.org/c/m2e/m2e-core.git/commit/?id=cd221a5336069559b6d764282b4a34393a8b2871
Comment 48 Snjezana Peco CLA 2011-04-20 15:48:02 EDT
Now Maven>Update Project Configuration is much faster. 
The jboss-as-parent project is updated for less than 10 sec.
Before applying these patches, it took over 30 sec to update this project.
Comment 49 Snjezana Peco CLA 2011-04-28 17:53:03 EDT
(In reply to comment #28)

M2e updates a classpath every time when refreshing a project because the BasicProjectRegistry.copy method uses HashMap and HashSet that break ordering of the project's requirements.
The attached patch (0001-m2e-performance-Updating-a-classpath.patch) uses LinkedHashMap/LinkedHashSet what causes a classpath to be updated only if it has been changed.
The Maven>Update Project Configuration action on the jboss-as-parent project is speeded up by 30%. Overall performance is much better because updating the project's classpath causes the project to be rebuilt. Importing all the JBoss AS 7 projects is faster at least 5 times, for instance.
Comment 50 Snjezana Peco CLA 2011-04-28 17:54:07 EDT
Created attachment 194319 [details]
0001-m2e-performance-Updating-a-classpath.patch
Comment 51 Igor Fedorenko CLA 2011-04-28 22:09:05 EDT
Comment on attachment 194319 [details]
0001-m2e-performance-Updating-a-classpath.patch

Thank you for spotting the problem with HashMap/HashSet and providing the patch. 

Unfortunately, after applying the patch m2e does not properly update classpath when projects added to or removed from the workspace, which is detected by a number of failing unit tests. Reverting your changes to BuildPathManager solves the problem without significant performance degradation, so I applied your change to BasicProjectRegistry only.

If you still want to optimize BuildPathManager#mavenProjectChanged, please submit new patch but make sure that all existing m2e tests pass with your changes.

I also have to ask -- did you run m2e unit tests? If not, why? If yes, did all tests pass for you?
Comment 52 Snjezana Peco CLA 2011-04-29 12:14:15 EDT
(In reply to comment #51)

I have created a new patch (0002-m2e-performance-Updating-a-classpath.patch) that updates the project's classpath when either all or some of the following requirements are fulfilled:
- project's requirements have been changed
- some project has been added to the workspace or removed from it

All the org.eclipse.m2e.tests tests passed except:

testCatalogSelectedByPackaging(org.eclipse.m2e.tests.discovery.MavenDiscoveryTest)
testCatalogSelectedByMojoExecution(org.eclipse.m2e.tests.discovery.MavenDiscoveryTest)
testMultipleSelections(org.eclipse.m2e.tests.discovery.MavenDiscoveryTest)

that fail with and without this patch.

Now importing JBoss AS 7 projects is faster more than 5 times.
Comment 53 Snjezana Peco CLA 2011-04-29 12:15:37 EDT
Created attachment 194379 [details]
0002-m2e-performance-Updating-a-classpath.patch
Comment 54 Igor Fedorenko CLA 2011-04-29 19:58:25 EDT
Comment on attachment 194379 [details]
0002-m2e-performance-Updating-a-classpath.patch

This patch makes some FLAG_NONE events into FLAG_DEPENDENCIES events. This will result in longer refresh/rebuild after projects are removed during git-pull, for example. I've added new unit test to show the problem https://github.com/sonatype/m2e-core-tests/commit/11a22ea34226db84c7f9cb1c8bc0e4b22ffc09bc .
Comment 55 Snjezana Peco CLA 2011-05-02 17:18:23 EDT
(In reply to comment #54)
> Comment on attachment 194379 [details]
> 0002-m2e-performance-Updating-a-classpath.patch
> 
> This patch makes some FLAG_NONE events into FLAG_DEPENDENCIES events. This will
> result in longer refresh/rebuild after projects are removed during git-pull,
> for example.

Could you give an example of a project that takes a longer time when the patch is included?
Without the patch, m2eclipse always sets project's classpath no matter if an event is of the type FLAG_NONE or FLAG_DEPENDENCIES.
With this patch included, m2eclipse sets project's classpath only for the FLAG_DEPENDENCIES events.
I don't know how this could cause longer refresh/rebuild.

> I've added new unit test to show the problem
> https://github.com/sonatype/m2e-core-tests/comm/11a22ea34226db84c7f9cb1c8bc0e4b22ffc09bc

ProjectRegistryManagerTest.test022_noChangeReloadWithUnrelatedRemoveProject fails on the following line:

assertEquals(MavenProjectChangedEvent.FLAG_NONE, e1.getFlags());

I don't understand why this is a problem. M2eclipse doesn't use FLAG_NONE or FLAG_DEPENDENCIES anywhere in the code and I don't see what this line is supposed to test.
Comment 56 Igor Fedorenko CLA 2011-05-02 18:11:51 EDT
(In reply to comment #55)
> 
> ProjectRegistryManagerTest.test022_noChangeReloadWithUnrelatedRemoveProject
> fails on the following line:
> 
> assertEquals(MavenProjectChangedEvent.FLAG_NONE, e1.getFlags());
> 
> I don't understand why this is a problem. M2eclipse doesn't use FLAG_NONE or
> FLAG_DEPENDENCIES anywhere in the code and I don't see what this line is
> supposed to test.

MavenProjectChangedEvent.FLAG_DEPENDENCIES events should be sent only when project dependencies change and this is what the test is meant to validate for one specific scenario. Or, put it differently, the test validates in one particular scenario that MavenProjectChangedEvent.FLAG_NONE is sent when there are no project dependency changes.
Comment 57 Snjezana Peco CLA 2011-05-02 19:09:37 EDT
(In reply to comment #56)
> 
> MavenProjectChangedEvent.FLAG_DEPENDENCIES events should be sent only when
> project dependencies change and this is what the test is meant to validate for
> one specific scenario. Or, put it differently, the test validates in one
> particular scenario that MavenProjectChangedEvent.FLAG_NONE is sent when there
> are no project dependency changes.

I understand what the test validates, but don't understand why that is important since m2eclipse doesn't use either MavenProjectChangedEvent.FLAG_DEPENDENCIES or MavenProjectChangedEvent.FLAG_NONE. Before adding LinkedHashSet/LinkedHashMap to the BasicProjectRegistry.copy method, m2eclipse was exclusively creating MavenProjectChangedEvent.FLAG_DEPENDENCIES events.
Anyway, all your tests passed, the patch offers a significant enhancement and it is on you is to decide whether to apply it or not.
Comment 58 Igor Fedorenko CLA 2011-05-02 19:34:03 EDT
There are no plans to apply the remaining part of the patch. It introduces a bug in the way m2e emits project change events. Besides, the remaining part of the patch provides only modest performance improvements over current code (5-10% in my tests).
Comment 59 Fred Bricon CLA 2013-02-11 19:20:05 EST
cleaning IP Log up
Comment 60 Denis Roy CLA 2021-04-19 13:24:38 EDT
Moved to https://github.com/eclipse-m2e/m2e-core/issues/