
Bug 569521

Summary: An option in the Tycho configuration that lets Tycho always use cached metadata content.
Product: z_Archived Reporter: Johan Compagner <jcompagner>
Component: Tycho Assignee: Project Inbox <tycho-inbox>
Status: CLOSED MOVED QA Contact:
Severity: normal    
Priority: P3 CC: laeubi, mistria
Version: unspecified   
Target Milestone: ---   
Hardware: PC   
OS: Windows 10   
See Also: https://bugs.eclipse.org/bugs/show_bug.cgi?id=571195
https://github.com/eclipse/tycho/issues/140
Whiteboard:

Description Johan Compagner CLA 2020-12-07 07:44:49 EST
Currently we use a target file to define what tycho uses to resolve all the dependencies:

<plugin>
  <groupId>org.eclipse.tycho</groupId>
  <artifactId>target-platform-configuration</artifactId>
  <version>${tycho.version}</version>
  <configuration>
    <!-- the target definition is consumed as a Maven artifact -->
    <target>
      <artifact>
        <groupId>com.servoy</groupId>
        <artifactId>com.servoy.eclipse.target</artifactId>
        <version>${project.parent.version}</version>
      </artifact>
    </target>
  </configuration>
</plugin>

The thing is that this target file is fully static: as long as its content is the same, what it resolves to is the same, because we use fixed versions for everything:

<location includeAllPlatforms="false" includeConfigurePhase="true" includeMode="planner" includeSource="true" type="InstallableUnit">
 <unit id="net.sourceforge.sqlexplorer.feature.feature.group" version="3.6.2"/>
 <repository location="http://developer.servoy.com/sqlexplorer/"/>
</location>

So once Tycho has fetched http://developer.servoy.com/sqlexplorer/ for that target file,
it never has to download or check it again until the contents of the target file (sha1 hash?) have changed.
The next time, it doesn't need to get artifacts.jar or content.jar from those sites; it can use the cached metadata directly.

so it would be nice to have an extra option:

<forceUsingCachedMetadata>true</forceUsingCachedMetadata>

in the configuration area

It should not cache based on the target version if possible, because for our master branch that will not really work
(it would work for the more fixed, released versions of our product).

This is because master will be on a version that is not released yet, and the contents of the target file may change a few times (to update plugins and sites for the next release).

So in pseudocode it would be nice to have something like:

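MetaData metadata;
if (forceUsingCachedMetadata) {
   // the key changes whenever the target file contents change
   String sha = calculateSHA(targetFileContents);

   metadata = getCached(sha, siteUrl);
   if (metadata == null) {
     // first build for this target content: download once and remember it
     metadata = download(siteUrl);
     cache(sha, siteUrl, metadata);
   }
}
else {
   // current behaviour: always contact the site
   metadata = download(siteUrl);
}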
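
For illustration, the pseudocode above could be fleshed out roughly as follows. This is only a sketch under assumptions: TargetKeyedMetadataCache, sha1Hex and metadataFor are hypothetical names, not Tycho API; the metadata is modelled as a plain String; and the cache lives in memory, whereas a real one would have to persist between builds.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch, not Tycho code: metadata cached per (target file hash, site URL). */
final class TargetKeyedMetadataCache {
    private final Map<String, String> cache = new HashMap<>();
    private final Function<String, String> downloader; // stands in for the real p2 download

    TargetKeyedMetadataCache(Function<String, String> downloader) {
        this.downloader = downloader;
    }

    /** Hex-encoded SHA-1 of the given text (UTF-8). */
    static String sha1Hex(String text) {
        try {
            byte[] hash = MessageDigest.getInstance("SHA-1")
                    .digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    String metadataFor(String targetFileContents, String siteUrl,
                       boolean forceUsingCachedMetadata) {
        if (!forceUsingCachedMetadata) {
            return downloader.apply(siteUrl); // current behaviour: always contact the site
        }
        // The key changes whenever the target file contents change.
        String key = sha1Hex(targetFileContents) + "|" + siteUrl;
        // Download only if nothing is cached for this key yet.
        return cache.computeIfAbsent(key, k -> downloader.apply(siteUrl));
    }
}
```

With the flag set, a second call with the same target contents and site URL never reaches the downloader; changing a single character of the target file produces a new key and therefore one fresh download.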


This way we improve, in my eyes, 2 things:

1> speed: no network IO happens anymore after the first time.
2> reliability: builds don't suddenly break with IO exceptions because a site can't be reached, even if everything needed to do the build is already on the system.
Comment 1 Mickael Istria CLA 2020-12-07 07:47:24 EST
As the discussion went on, it was identified that such an option is not the best approach; relying on a go-offline mojo (to cache locally) plus the plain Maven --offline flag would be a better solution for this case.
Please open separate issues for those.
Comment 2 Mickael Istria CLA 2020-12-07 07:52:21 EST
I'm actually reopening, because this may be somewhat valid; but realistically, it's not something that will be easy to implement properly in Tycho, and we don't want to make the Tycho code base dirtier for that case.
Comment 3 Johan Compagner CLA 2020-12-07 08:05:19 EST
I won't make a case for maven --offline,
because that won't help us.

It comes with extra steps: as far as I understand, you need to make sure you run the go-offline mojo and so on.
That makes it all very tricky for us.
We have many git repos with Tycho and Maven that together build up 1 branch (and we have 3 or 4 such branches), all using the target platform and mvn stuff.
So I would need to run the offline mojo on all of them each time something changes in the target: many manual things to do and to configure.
I won't go that way.
Comment 5 Christoph Laeubrich CLA 2020-12-07 08:26:29 EST
I think calculating a SHA doesn't make sense in the light of 

> It should not cache based on the target version if possible

as the version is the part of the target that tells PDE whether a target needs to be reloaded. And comparing targets by content would be a bit too heavy.
But I think it isn't needed anyway. You said your update sites are static, so the metadata will never change.

If there would be such an option I would expect Tycho to behave like a caching proxy:
1) file is there -> return it
2) file is not there -> download it and store it locally
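
A minimal sketch of that two-step behaviour, for illustration only. FileBackedFetcher and its constructor arguments are hypothetical names rather than Tycho API, the remote side is passed in as a plain function, and the URL-to-file-name mapping is deliberately naive (two different URLs could collide).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.Function;

/** Hypothetical sketch, not Tycho code: a fetcher that behaves like a caching proxy. */
final class FileBackedFetcher {
    private final Path cacheDir;
    private final Function<String, byte[]> remote; // stands in for the real HTTP download

    FileBackedFetcher(Path cacheDir, Function<String, byte[]> remote) {
        this.cacheDir = cacheDir;
        this.remote = remote;
    }

    byte[] fetch(String url) {
        // Naive flat file name derived from the URL.
        Path cached = cacheDir.resolve(url.replaceAll("[^A-Za-z0-9._-]", "_"));
        try {
            if (Files.exists(cached)) {
                return Files.readAllBytes(cached); // 1) file is there -> return it
            }
            byte[] content = remote.apply(url);    // 2) file is not there -> download it
            Files.createDirectories(cacheDir);
            Files.write(cached, content);          //    ... and store it locally
            return content;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Note that this sketch never revalidates: once a file is stored it is served forever, which is only safe for update sites that really are static.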
Comment 6 Johan Compagner CLA 2020-12-07 08:46:07 EST
It was just a quick way for me to say when you don't need to do anything.
Tycho could be a bit smarter, yes.

If it hits this:

<location includeAllPlatforms="false" includeConfigurePhase="true" includeMode="planner" includeSource="true" type="InstallableUnit">
 <unit id="net.sourceforge.sqlexplorer.feature.feature.group" version="3.6.2"/>
 <repository location="http://developer.servoy.com/sqlexplorer/"/>
</location>

then for me it could also work like this: if it sees that this unit is already in the cache:
<unit id="net.sourceforge.sqlexplorer.feature.feature.group" version="3.6.2"/>

for that version

then just use that directly.
don't go to the http://developer.servoy.com/sqlexplorer/ to check it.

Then it will be even better, because even if i update my target file
and i change another unit that another version is needed or i add another location, then the above one is still just gotten from the cache.

I thought I would make it easier by caching on 1 key ;)

But caching per unit is even better in my eyes.
Comment 7 Mickael Istria CLA 2020-12-07 09:03:02 EST
(In reply to Johan Compagner from comment #6)
> then just use that directly.

Again, you're missing that the .target is just 1 input to dependency resolution, together with many other things. Depending on what you're building, this unit may be missing, conflicting with other constraints, or replaced by another one in the current reactor... Getting the p2 metadata helps to figure out all that and report some useful errors.

Then, if the unit is cached already, it is just used from the cache and not downloaded again.
If you want a tool that doesn't do dependency management, and which just takes a .target and uses it and only it without any extra dependency management checks, then Tycho is not the right candidate.

> Then it will be even better, because even if i update my target file
> and i change another unit that another version is needed or i add another
> location, then the above one is still just gotten from the cache.

That's already the case. If dependency resolution says that it wants a unit, and the unit is in the cache, it just uses it.

As mentioned, at this point you're sharing a lot of ideas, but we're missing the main point: what is the actual issue you want to fix? Some useless downloads? Of what specifically? An improvement to some cache would probably be enough...
Comment 8 Christoph Laeubrich CLA 2020-12-07 09:06:45 EST
BTW: You can greatly increase the likelihood of getting such a feature by providing a minimal reproducer project with steps to reproduce your issue. That way it is much easier to see what needs to be done.

You can even earn some bonus points by providing a Gerrit change that enhances Tycho's itests with an example that reproduces the issue without killing the actual internet connection (e.g. some tests start an embedded Jetty that could be shut down after the first build, then start another build to show the behavior).

This greatly increases the chance that this feature does not get broken somewhere in the future because people are no longer aware of it.
Comment 9 Johan Compagner CLA 2020-12-07 09:49:27 EST
I already said what it improves in my first comment:

1> speed: no network IO happens anymore after the first time.
2> reliability: builds don't suddenly break with IO exceptions because a site can't be reached, even if everything needed to do the build is already on the system.

For our build, the first section:

[INFO] Computing target platform for MavenProject: 
[INFO] Performing subquery
[INFO] Resolving dependencies of MavenProject:

is by far the most time-consuming area there is.

In a small project (just 1 plugin) that takes 48 seconds to complete,
it is busy in the above stages for 30+ seconds.

If I kill my network, it comes back within 1 second with:

[ERROR] Failed to resolve target definition C:\Users\jcomp\.m2\repository\com\servoy\com.servoy.eclipse.target\2020.12.0.3620_rc\com.servoy.eclipse.target-2020.12.0.3620_rc.target: Failed to load p2 metadata repository from location http://developer.servoy.com/sqlexplorer/: Unknown Host: http://developer.servoy.com/sqlexplorer/content.xml: Unknown host developer.servoy.com -> [Help 1]

So I really think that, for me, it is all the IO requests to all the servers.

But I can't know for sure, because mvn install -X doesn't really tell me anything in the area of "Computing target platform for MavenProject";
it just hangs there for a while.

So yes, it could be that this is not network IO but really crunching locally through my things.

But then at least builds would not bomb out because of network problems,

because in my eyes it's just not needed to go to http://developer.servoy.com/sqlexplorer/

Tycho should be able, just like Maven itself, to run the second time without a network connection, because all the artifacts should already be local.

If I update the target file to take a different, newer version of something, I do see all kinds of downloads, which is normal. But I still don't get (maybe I just don't understand the dependency management well enough) why you need a connection and why you really need to call out if everything is already there locally.
Comment 10 Christoph Laeubrich CLA 2020-12-07 10:07:19 EST
You can simply take network IO out of the game by downloading the update site, extracting it and referencing it via file:/<location goes here>

Tycho downloads the metadata to check whether newer versions are available; otherwise it can't know if the version actually is there. Especially if you're using planner mode, the outcome might be different even if your target references the same version of a feature!

This metadata might or might not be cached at the moment, and an example project would help to investigate this. Anyway, Tycho is not an HA/load balancer/network manager tool that helps with failing network servers...
Comment 11 Johan Compagner CLA 2020-12-07 10:42:46 EST
Again, for me it's all fixed.
Tycho will never find a newer version than the ones that I have in my target file, because everything is fixed.

After the initial change in my target and one build, I never see it download stuff again, and I wouldn't expect it to.

But I guess there are people using "maven snapshot" versions in target files? So the same version (snapshot) can constantly change and be downloaded at every build? So you have a kind of remote snapshots?
Then I guess you can't do a network optimization and you constantly need to check; I guess this would only really be true for the developer/master branch in those scenarios.
Comment 12 Johan Compagner CLA 2020-12-17 03:20:32 EST
As an example:
for us, over the past day, all our builds broke.
One with:

[ERROR] Failed to resolve target definition /var/lib/jenkins/jobs/lts/jobs/servoy-eclipse/workspace/launch_targets/com.servoy.eclipse.target.target: Failed to load p2 metadata repository from location https://download.eclipse.org/tools/orbit/downloads/drops/R20190827152740/repository: Unable to read repository at https://download.eclipse.org/tools/orbit/downloads/drops/R20190827152740/repository. Artifact not found: https://download.eclipse.org/tools/orbit/downloads/drops2/R20190827152740/repository/content.xml.xz. -> [Help 1]

the other with

[INFO] Adding repository https://download.eclipse.org/egit/updates-5.6.1
[ERROR] Failed to resolve target definition /var/lib/jenkins/.m2/repository/com/servoy/com.servoy.eclipse.target/2020.3.2.3564_LTS/com.servoy.eclipse.target-2020.3.2.3564_LTS.target: Failed to load p2 metadata repository from location https://download.eclipse.org/egit/updates-5.6.1: Artifact not found: https://download.eclipse.org/egit/updates-5.6.1/content.xml.xz. -> [Help 1]


and another

[INFO] Computing target platform for MavenProject: com.servoy:com.servoy.jre.linux.x86_64.feature:12.0.0 @ /var/lib/jenkins/jobs/master/jobs/servoy-eclipse/workspace/com.servoy.jre.linux.x86_64/pom.xml
[ERROR] Failed to resolve target definition /var/lib/jenkins/.m2/repository/com/servoy/com.servoy.eclipse.target/2021.3.0.3640_rc/com.servoy.eclipse.target-2021.3.0.3640_rc.target: Failed to load p2 metadata repository from location https://download.eclipse.org/tools/orbit/downloads/drops/R20200529191137/repository: Unable to read repository at https://download.eclipse.org/tools/orbit/downloads/drops/R20200529191137/repository/content.xml. ClientProtocolException: Circular redirect to 'https://download.eclipse.org/errors/404.php/' -> [Help 1]

So am I really the only one who is annoyed by this behavior?

that tycho can't run without an internet connection?
You can't work and build stuff when you are in a plane?
(which with plain maven is fine)
Comment 13 Mickael Istria CLA 2020-12-17 03:33:18 EST
(In reply to Johan Compagner from comment #12)
> that tycho can't run without an internet connection?
> You can't work and build stuff when you are in a plane?

Indeed, Tycho currently doesn't support Maven's offline mode (-o) flag. Can you please open a dedicated Bugzilla entry about that?
Comment 14 Johan Compagner CLA 2020-12-17 03:54:37 EST
It's not about the offline flag.
I can build my normal mvn projects just fine with standard:

mvn clean install

no offline mode needed
Comment 15 Mickael Istria CLA 2020-12-17 03:56:36 EST
(In reply to Johan Compagner from comment #14)
> it's not about the offline flag

As long as we don't have offline mode with a flag, there is no chance we can get a "fall back to offline" strategy working. There is a strong relationship between the two; supporting the offline flag is IMO a necessary and good 1st iteration to continue on this topic.
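
To illustrate the relationship between the two, here is a rough sketch in which the explicit offline flag and the fallback share the same cached-read path. FallbackMetadataLoader and its members are hypothetical names, not Tycho API; the metadata is modelled as a String and kept in memory only.

```java
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch, not Tycho code: offline flag plus fallback to the last good copy. */
final class FallbackMetadataLoader {
    private final Map<String, String> lastKnownGood = new HashMap<>();
    private final Function<String, String> remote; // stands in for the real p2 download
    private final boolean offline;                 // would mirror Maven's -o flag

    FallbackMetadataLoader(Function<String, String> remote, boolean offline) {
        this.remote = remote;
        this.offline = offline;
    }

    String load(String url) {
        if (!offline) {
            try {
                String fresh = remote.apply(url);
                lastKnownGood.put(url, fresh); // remember the last successful download
                return fresh;
            } catch (UncheckedIOException e) {
                // Site unreachable: fall through to the same path the offline flag uses.
            }
        }
        String cached = lastKnownGood.get(url);
        if (cached == null) {
            throw new IllegalStateException("No cached metadata for " + url);
        }
        return cached;
    }
}
```

The fallback branch is just the offline branch reached by a different route, which is why the flag is a natural first step in this sketch.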
Comment 16 Christoph Laeubrich CLA 2020-12-17 05:32:17 EST
Whether or not it's about offline mode, your problem seems to be related to unreliable web servers. To lower your pain here until a final solution is found, I would suggest simply installing a local caching proxy that guards you from network outages.

Apart from that, all that caching stuff is sometimes a painful source of confusion and unpredictable build results, as well as of complex code paths in Tycho itself.
I think we should take as much of that work away from Tycho as possible, so maybe a much better solution would be for Tycho to include some kind of embedded, caching, mojo-configurable proxy server that is used to query remote sites, and to throw away all this custom caching and guessing about when to contact a remote site.
Comment 17 Mickael Istria CLA 2021-04-08 18:04:55 EDT
Eclipse Tycho is moving away from this bugs.eclipse.org issue tracker to https://github.com/eclipse/tycho/issues/ instead. If this issue is relevant to you, your action is required.
0. Verify this issue is still happening with the latest Tycho 2.4.0-SNAPSHOT
  if the issue has disappeared, please change the status of this issue to "CLOSED WORKFORME" with some details about your testing environment and how you verified the issue; and you're done
  if the issue is still present with the latest release:
* Create a new issue at https://github.com/eclipse/tycho/issues/
  ** Use as title in GitHub the title of this Bugzilla ticket (may include the bug number or not, at your own convenience)
  ** In the GitHub description, start with a link to this bugzilla ticket
  ** Optionally add new content to the description if it can help towards resolution
  ** Submit GitHub issue
* Update bugzilla ticket
  ** Add to "See also" property (up right column) the link to the newly created GitHub issue
  ** Add a comment "Migrated to <link-to-newly-created-GitHub-issue>"
  ** Set status as CLOSED MOVED
  ** Submit

All issues that remain open will be automatically closed next week or so. Then the Bugzilla component for Tycho will be archived and made read-only.
Comment 18 Johan Compagner CLA 2021-05-31 09:38:33 EDT
Migrated to https://github.com/eclipse/tycho/issues/140