| Summary: | signing service not working | ||||||
|---|---|---|---|---|---|---|---|
| Product: | [Technology] CBI | Reporter: | Sam Davis <sam.davis> | ||||
| Component: | signing-service | Assignee: | CBI Inbox <cbi-inbox> | ||||
| Status: | RESOLVED FIXED | QA Contact: | |||||
| Severity: | blocker | ||||||
| Priority: | P3 | CC: | david_williams, denis.roy, mikael.barbero, stephan.herrmann, stepper | ||||
| Version: | unspecified | ||||||
| Target Milestone: | --- | ||||||
| Hardware: | PC | ||||||
| OS: | Windows 7 | ||||||
| See Also: | https://bugs.eclipse.org/bugs/show_bug.cgi?id=487857 https://git.eclipse.org/r/69178 https://git.eclipse.org/r/69207 https://git.eclipse.org/c/cbi/org.eclipse.cbi.git/commit/?id=565c4f2f53dcfb5906ec4d41cea74af40f19c7db https://git.eclipse.org/r/69214 https://git.eclipse.org/c/cbi/org.eclipse.cbi.git/commit/?id=4163219772a8a658a02c7ae6f14773314eff24ed | ||||||
| Whiteboard: | |||||||
| Attachments: | |||||||
|
Description
Sam Davis
Restart your HIPP from your account control panel, it should fix it. (see bug 489550 comment 16 for details)

Thanks for the pointer Mikael. I'll reopen this if I can't get it working.

Mikael, I am still having problems after restarting a couple of times, although the error is different now. Do you have any idea what would cause this?

[exec] Buildfile: /home/data/httpd/download-staging.priv/tools/mylyn/signing/mylyn/3.19.0-SNAPSHOT/site.zip.ant.xml
[exec]
[exec] doIt:
[exec]
[exec] BUILD FAILED
[exec] /home/data/httpd/download-staging.priv/tools/mylyn/signing/mylyn/3.19.0-SNAPSHOT/site.zip.ant.xml:9: java.io.FileNotFoundException: /tmp/genie.mylyn/site/plugins/org.eclipse.mylyn.monitor.core.source_3.19.0.v20160315-2132.jar (No such file or directory)
https://hudson.eclipse.org/mylyn/job/mylyn-3.19.x-release/15/console

(In reply to Sam Davis from comment #3)
> Mikael, I am still having problems after restarting a couple of times,
> although the error is different now. Do you have any idea what would cause
> this?
>
> [exec] Buildfile:
> /home/data/httpd/download-staging.priv/tools/mylyn/signing/mylyn/3.19.0-SNAPSHOT/site.zip.ant.xml
> [exec]
> [exec] doIt:
> [exec]
> [exec] BUILD FAILED
> [exec]
> /home/data/httpd/download-staging.priv/tools/mylyn/signing/mylyn/3.19.0-SNAPSHOT/site.zip.ant.xml:9:
> java.io.FileNotFoundException:
> /tmp/genie.mylyn/site/plugins/org.eclipse.mylyn.monitor.core.source_3.19.0.v20160315-2132.jar
> (No such file or directory)
>
> https://hudson.eclipse.org/mylyn/job/mylyn-3.19.x-release/15/console

Similar to bug 490014 ?

I think the signing service was changed to include an ant task, and I think that task is deleting a temp dir before its time.

*** Bug 490238 has been marked as a duplicate of this bug. ***

*** Bug 490014 has been marked as a duplicate of this bug. ***

I'm reading some notes from Mikael, as he is currently unavailable. Please stand by.

Are all those affected using Buckminster?

CDO is affected and uses Buckminster.

Does buckminster call the signer as an ant task?

I've issued a commandline signing request for a single jar, and for a zip file, and they both signed successfully.

Created attachment 260545 [details]
Bucky's signer.ant
I've found this in Buckminster.

(In reply to comment #9)
> Are all those affected using Buckminster?

Mylyn is not using Buckminster (as far as I know).

(In reply to Denis Roy from comment #11)
> I've issued a commandline signing request for a single jar, and for a zip
> file, and they both signed successfully.

This is one of mine that caused a failure:
/shared/download-staging.priv/modeling/emf/cdo/emf-cdo-integration/site_2062397902.zip

(In reply to Denis Roy from comment #5)
> I think the signing service was changed to include an ant task, and I think
> that task is deleting a temp dir before its time.

It might interact with how "java.io.tmpdir" is defined ... if that changes for anyone.

In the code that is called by the ant task, the "temp directory" is created based on
http://git.eclipse.org/c/cbi/org.eclipse.cbi.git/tree/utilities/org.eclipse.cbi.releng.tools/src/org/eclipse/cbi/releng/tools/UpdatePackPropertiesFile.java#n519
and the whole path of the "temp directory" is defined at
http://git.eclipse.org/c/cbi/org.eclipse.cbi.git/tree/utilities/org.eclipse.cbi.releng.tools/src/org/eclipse/cbi/releng/tools/UpdatePackPropertiesFile.java#n123
but it is true that if we find it "already exists" we assume that was from some previous failed run and remove it before re-creating it:
http://git.eclipse.org/c/cbi/org.eclipse.cbi.git/tree/utilities/org.eclipse.cbi.releng.tools/src/org/eclipse/cbi/releng/tools/UpdatePackPropertiesFile.java#n261

So if that "temporary destination" happens to "be the same as" the signing directory, then for sure that would break.

And then at the end we do "clean up" what the "updatepackproperties" task is doing at
http://git.eclipse.org/c/cbi/org.eclipse.cbi.git/tree/utilities/org.eclipse.cbi.releng.tools/src/org/eclipse/cbi/releng/tools/UpdatePackPropertiesFile.java#n205

Again, that all assumed we had our own "unique" location that would not remove anything we didn't create, but we do not actually check for that.

If you change verbose to "true" in the ant task on the deployed system, you might get better error messages as to what is going on.

(In reply to David Williams from comment #15)
> (In reply to Denis Roy from comment #5)
> > I think the signing service was changed to include an ant task, and I think
> > that task is deleting a temp dir before its time.
>
> It might interact with how "java.io.tmpdir" is defined ... if that changes
> for anyone.

I should have mentioned, by default, java.io.tmpdir should be defined as /tmp, unless someone is changing that. If changed, it is typically changed as an ANT_OPT or directly on java call, as
-Djava.io.tmpdir="<some absolute path>"

I have looked at the Mylyn log in detail, and it does look like the temp destination disappears "in the middle of things". One explanation might be if there are several zip files of the same name being signed "in parallel"? For Mylyn it does look like they have several "site/targets" in their build, from looking at
https://hudson.eclipse.org/mylyn/job/mylyn-3.19.x-release/lastSuccessfulBuild/artifact/

Not sure if they get "queued up" in parallel or not. Also not sure if that is the same case for other builds, such as CDO? So, might be a red-herring.

Does anyone know if "several zips to sign in the same build" is a common thread among those that are failing?

David, I think you're on to something. We should introduce some random value in the. In *nix we usually use the process ID.

My guess is that perhaps a nested signing call is reusing (and deleting) the temp directory.

New Gerrit change created: https://git.eclipse.org/r/69178

(In reply to David Williams from comment #17)
> Does anyone know if "several zips to sign in the same build" is a common
> thread among those that are failing?

In bug 490014 I had a single zip file and only one invocation of /usr/bin/sign.

I could work around the issue by renaming otdt.zip to otdt.jar.
It looks that this way I simply bypassed some checks for nested zips(?), the code for which seem to contain the bug?

(In reply to Eclipse Genie from comment #19)
> New Gerrit change created: https://git.eclipse.org/r/69178

I am not sure how to get "process id" from within Java (and from a quick search, it is not easy). I was already about done creating a change that used "system time" to nearest milliseconds to append to "destinationdirectory" if we find it already exists, in order to come up with a (long) unique name.

I am not sure where this is built on
https://hudson.eclipse.org/cbi/
And much less know how it is deployed. But, thought I would at least submit it for review.

(In reply to Stephan Herrmann from comment #20)
> (In reply to David Williams from comment #17)
> > Does anyone know if "several zips to sign in the same build" is a common
> > thread among those that are failing?
>
> In bug 490014 I had a single zip file and only one invocation of
> /usr/bin/sign.
>
> I could work around the issue by renaming otdt.zip to otdt.jar. It looks
> that this way I simply bypassed some checks for nested zips(?), the code for
> which seem to contain the bug?

I don't think that would accomplish much Stephan, unless you are saying there is only one bundle in that zip file. And, not sure it is a matter of "nested zips", so much as parallel calls.

I think "parallel calls" could *maybe* happen under a number of situations, and depend on details I know nothing of, so I do think we should fix the underlying code.

But in the meantime :) if you could name your zip uniquely, such as otdt.zip to otdt201603231646.zip then THAT *might* work around the problem in the underlying code.

I gather that something changed which has caused this problem. Would it be possible to revert the change until the problem is fixed? If I'm not able to do a build in the next few days we may have to delay our release.

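The timestamp idea described above (append the system time in milliseconds to the destination directory name when that name is already taken) can be sketched roughly as follows. This is only an illustration, not the code from the Gerrit change: the class name `UniqueTempDir` and the method `chooseUnusedDirectory` are hypothetical, and the sketch assumes that an existing directory should be left untouched rather than deleted.

```java
import java.io.File;

public class UniqueTempDir {

    /**
     * Returns a path that does not exist yet. If the preferred path is free
     * it is returned unchanged; otherwise the current time in milliseconds
     * is appended to the preferred name until an unused path is found.
     * (Hypothetical helper, not part of the CBI code base.)
     */
    static File chooseUnusedDirectory(File preferred) {
        File candidate = preferred;
        while (candidate.exists()) {
            String name = preferred.getName() + System.currentTimeMillis();
            candidate = new File(preferred.getParentFile(), name);
        }
        return candidate;
    }

    public static void main(String[] args) {
        // "site_1985289617" is just the example name mentioned in this bug.
        File base = new File(System.getProperty("java.io.tmpdir"), "site_1985289617");
        File chosen = chooseUnusedDirectory(base);
        if (!chosen.mkdirs()) {
            throw new IllegalStateException("Could not create " + chosen);
        }
        System.out.println("Using temp directory: " + chosen);
    }
}
```

A small race remains between `exists()` and `mkdirs()` if two processes pick the same millisecond. The JDK's `java.nio.file.Files.createTempDirectory` avoids that by creating a uniquely named directory itself, so it may be a simpler choice where the exact directory name does not matter.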
(In reply to Eclipse Genie from comment #19)
> New Gerrit change created: https://git.eclipse.org/r/69178

FYI, I am testing this locally and something seems wrong. It might be my "testing environment", but just wanted to let you know, so it would not be deployed "more broken". :/

(In reply to David Williams from comment #22)
> (In reply to Stephan Herrmann from comment #20)
> > (In reply to David Williams from comment #17)
> > > Does anyone know if "several zips to sign in the same build" is a common
> > > thread among those that are failing?
> >
> > In bug 490014 I had a single zip file and only one invocation of
> > /usr/bin/sign.
> >
> > I could work around the issue by renaming otdt.zip to otdt.jar. It looks
> > that this way I simply bypassed some checks for nested zips(?), the code for
> > which seem to contain the bug?
>
> I don't think that would accomplish much Stephan, unless you are saying
> there is only one bundle in that zip file.

Not just one bundle. The amazing fact is: it did accomplish s.t.: after that rename I *was* able to sign, which previously I wasn't.
(sorry for ambiguity in language: "could" was intended as past tense, not as subjunctive :) )

New Gerrit change created: https://git.eclipse.org/r/69207

(In reply to Eclipse Genie from comment #26)
> New Gerrit change created: https://git.eclipse.org/r/69207

Ok, this patch I have tested and seems ok. (There were two bugs with the other one, one badly breaking the function :( the other simply would not fix "parallel processing" :)

I could not test with true "parallel processes" but did "hack" a test of a /tmp directory with the same name and confirmed fix did create a new /tmp directory with milliseconds appended to the pre-existing name.

So, if that is the (only) problem, I think this will help.

If only someone knew how to build it and deploy it. :)

(In reply to Sam Davis from comment #23)
> I gather that something changed which has caused this problem.
> Would it be possible to revert the change until the problem is fixed? If I'm
> not able to do a build in the next few days we may have to delay our release.

It was a pretty involved fix. I think better to push forward, even if it means a delay.

Oh, and BTW, we do need to delay anyway, because whoever planned our schedule (i.e. me :) did not take into account a popular holiday. Looks like we will release on "Tuesday",
https://dev.eclipse.org/mhonarc/lists/cross-project-issues-dev/msg13062.html
But would still like to get the Sim. Release repo completed by Thursday evening.

I was referring to the Mylyn 3.19 release. Tasktop would like to consume the final build ASAP.

Gerrit change https://git.eclipse.org/r/69207 was merged to [master].
Commit: http://git.eclipse.org/c/cbi/org.eclipse.cbi.git/commit/?id=565c4f2f53dcfb5906ec4d41cea74af40f19c7db

(In reply to David Williams from comment #27)
> If only someone knew how to build it and deploy it. :)

It appears this job is the one that builds "cbi releng tools":
https://hudson.eclipse.org/cbi/job/cli-signing-ant-tasks/
and
https://hudson.eclipse.org/cbi/job/cli-signing-ant-tasks/8/
has the cbiRelengTools.jar in it that has my fix. I downloaded it, and confirmed running it creates a "/tmp" folder of
/tmp/site_19852896171458773698199
if
/tmp/site_1985289617
already exists. And, its function still works too! :)

From there ... I am not sure. I suspect Mikael would run something on
https://hudson.eclipse.org/cbi/view/signing-packaging/
but, hard for me to see what.

BTW, for those of you having trouble with this issue, if you can provide the zip file here, I can test a bit more, and make sure there is not some other basic assumption we are making that is related to the problem.

You can provide a Hudson link if you have it, or in some cases I can probably get from your "signing directory", or you can attach it to this bug if not too huge.

Thanks,

(In reply to David Williams from comment #32)
> BTW, for those of you having trouble with this issue, if you can provide the
> zip file here, I can test a bit more, and make sure there is not some other
> basic assumption we are making that is related to the problem.
>
> You can provide a Hudson link if you have it, or in some cases I can
> probably get from your "signing directory", or you can attach it to this bug
> if not too huge.
>
> Thanks,

Of the 41 zip files I had access to under the "signing directory", a fair number were already "processed" so I guess those worked ok. A moderate number had not been processed yet but worked ok. But, three of them, Virgo, Mylyn, and OTDT failed with similar stack traces of "not found" files. And all that is without any parallel processing. I'll be taking a close look to see why those 3 are "special". (Of course, it *might* be they were "harmed" by previous runs of the utility?)

Just wanted to keep you informed.

As another workaround, for any of you that have control over your zip files, if the zip file already contains a file at the "top level" named "pack.properties" then the utility will assume you already have it covered, and will not do anything, and just move on to the jarprocessor step.

What is "supposed" to be in there, is a list of jars you do not want signed or processed for pack200. See
https://wiki.eclipse.org/JarProcessor_Options

But, if you had an empty file in there, the "utility" would not mind. (I am not sure what the jarprocessor would do with it, though.)

Following is an example of a valid "empty" pack.properties file. But, literally empty might work too -- that is just one I use from time to time.

= = = =
pack200.default.args=-E4
pack.excludes=.
= = = =

For anyone not familiar with the background of this issue, the point was to prevent users of "CBI" from having to know all that, and do all the work themselves (though, some may, already). So, I am not saying you *should* use this workaround.
Just letting you know.

New Gerrit change created: https://git.eclipse.org/r/69214

Gerrit change https://git.eclipse.org/r/69214 was merged to [master].
Commit: http://git.eclipse.org/c/cbi/org.eclipse.cbi.git/commit/?id=4163219772a8a658a02c7ae6f14773314eff24ed

(In reply to Eclipse Genie from comment #34)
> New Gerrit change created: https://git.eclipse.org/r/69214

This fixes the problem with the 3 "special" zip files.

Zip Files 101: Zip file entries need not contain "directory" entries for each directory in the zip file (though most do) but you have to check and "infer" if there are any implied directory names in the "filename" entry and if so create it on the file system before writing the entry. Otherwise, you get the non-helpful "FileNotFoundException" while trying to create the file. (Gee, would it have killed them to have created a "DirectoryNotFoundException" :)

Good test cases to add to our pool! :)

= = = = = = =

I have re-built the jar, and it is contained in
https://hudson.eclipse.org/cbi/job/cli-signing-ant-tasks/10/
(though, again, I am not sure of Mikael's "full process" and it might also be built by some other job?)

The CDO build still fails: https://hudson.eclipse.org/cdo/job/emf-cdo-integration/300/console ;-(

This is the respective zip file:
/shared/download-staging.priv/modeling/emf/cdo/emf-cdo-integration/site_1841147332.zip

(In reply to Eike Stepper from comment #37)
> The CDO build still fails:
> https://hudson.eclipse.org/cdo/job/emf-cdo-integration/300/console ;-(

Nothing has been changed yet in the _deployed_ signing service.

> This is the respective zip file:
> /shared/download-staging.priv/modeling/emf/cdo/emf-cdo-integration/site_1841147332.zip

Good, another test case.

David, what's the name of the jar you've built? Mikael left me some instructions, and I can deploy it to production if I know what it is.

(In reply to Denis Roy from comment #39)
> David, what's the name of the jar you've built?
> Mikael left me some instructions, and I can deploy it to production if I
> know what it is.

Its name is cbiRelengTools.jar. It's at that "number 10" build at
https://hudson.eclipse.org/cbi/job/cli-signing-ant-tasks/10/artifact/utilities/org.eclipse.cbi.releng.tools/lib/cbiRelengTools.jar

I believe during deployment it goes "under" a folder called 'lib' (and any old ones there removed or renamed so they do not end with '.jar'.)

I can't seem to locate that jar anywhere... closest I can see is:

./org.eclipse.cbi.git/cli-tools/signing/jar/ant-tasks/cbi-ant-tasks.jar
./org.eclipse.cbi.git/cli-tools/signing/jar/jarprocessors/jarprocessor.jar
./org.eclipse.cbi.git/cli-tools/signing/jar/jarprocessors/org.eclipse.equinox.p2.jarprocessor_1.0.300.v20131211-1531.jar

(In reply to Denis Roy from comment #41)
> I can't seem to locate that jar anywhere... closest I can see is:
>
> ./org.eclipse.cbi.git/cli-tools/signing/jar/ant-tasks/cbi-ant-tasks.jar
> ./org.eclipse.cbi.git/cli-tools/signing/jar/jarprocessors/jarprocessor.jar
> ./org.eclipse.cbi.git/cli-tools/signing/jar/jarprocessors/org.eclipse.equinox.p2.jarprocessor_1.0.300.v20131211-1531.jar

It is definitely not those with "jarprocessor" in the name or path. Is it "nested" in cbi-ant-tasks.jar?

org.eclipse.cbi-sandbox.git/cli-tools/signing/jar/ant-tasks # jar tf cbi-ant-tasks.jar
META-INF/
META-INF/MANIFEST.MF
org/
org/eclipse/
org/eclipse/cbi/
org/eclipse/cbi/releng/
org/eclipse/cbi/releng/tools/
org/eclipse/cbi/releng/tools/UpdatePackPropertiesFile.class
org/eclipse/cbi/releng/tools/UpdatePackPropertiesFile$JarFileFilter.class

Doesn't appear so.

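The "Zip Files 101" point made earlier in this thread (a zip need not carry an entry for every directory, so parent directories have to be inferred from the entry name and created before the file is written) can be illustrated with a short sketch. This is not the code from Gerrit change 69214; the class `ZipExtractSketch` and its `extract` method are hypothetical names, and the check that entries stay inside the destination is an extra precaution that this bug does not discuss.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

public class ZipExtractSketch {

    /** Extracts every entry of the zip stream below the destination directory. */
    static void extract(InputStream zipStream, Path destination) throws IOException {
        Path root = destination.toAbsolutePath().normalize();
        try (ZipInputStream zip = new ZipInputStream(zipStream)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                Path target = root.resolve(entry.getName()).normalize();
                // Precaution (an addition for this sketch): refuse entries
                // such as "../x" that would land outside the destination.
                if (!target.startsWith(root)) {
                    throw new IOException("Entry escapes destination: " + entry.getName());
                }
                if (entry.isDirectory()) {
                    Files.createDirectories(target);
                } else {
                    // The zip may contain "site/plugins/a.jar" without any
                    // entry for "site/" or "site/plugins/", so create the
                    // implied parent directories before writing the file.
                    Files.createDirectories(target.getParent());
                    Files.copy(zip, target, StandardCopyOption.REPLACE_EXISTING);
                }
            }
        }
    }
}
```

Without the `createDirectories` call on the parent, writing the file fails in the way described in this bug: the parent directory is missing and the error surfaces as a "file not found" style exception.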
(In reply to Denis Roy from comment #43)
> org.eclipse.cbi-sandbox.git/cli-tools/signing/jar/ant-tasks # jar tf
> cbi-ant-tasks.jar
> META-INF/
> META-INF/MANIFEST.MF
> org/
> org/eclipse/
> org/eclipse/cbi/
> org/eclipse/cbi/releng/
> org/eclipse/cbi/releng/tools/
> org/eclipse/cbi/releng/tools/UpdatePackPropertiesFile.class
> org/eclipse/cbi/releng/tools/UpdatePackPropertiesFile$JarFileFilter.class
>
>
> Doesn't appear so.

Those are the right package names and class files. I guess he renames it. If you unzip it, and the jar I originally pointed to, you can tell if there is anything "extra" in his, or if he literally just renames it.

Let's take a chance.

I've replaced the "old" jar with your new one from Hudson:

-rw-rw-r-- 1 genie signers 9583 Mar 23 19:26 cbi-ant-tasks.jar
-rw-rw-r-- 1 genie signers 9259 Mar 24 11:27 cbi-ant-tasks.jar.pre-fix

Can you test the service?

(In reply to Denis Roy from comment #45)
> Let's take a chance.
>
> I've replaced the "old" jar with your new one from Hudson:
>
> -rw-rw-r-- 1 genie signers 9583 Mar 23 19:26 cbi-ant-tasks.jar
> -rw-rw-r-- 1 genie signers 9259 Mar 24 11:27 cbi-ant-tasks.jar.pre-fix
>
>
> Can you test the service?

I tested from the command line with three of the zips I snagged last night, from Mylyn, Virgo, and CDO. I started these at roughly the same time. They all three worked fine and all came back with validly signed jars.

I've started an Orbit build, but that takes 2 or 3 hours to finish.

I suggest we mark as "fixed" -- but naturally if anyone still finds issues in "real builds" they should reopen.

[If anyone does find issues, please provide log and original "zip submitted for signing" if possible.]

Thanks for getting this working. I haven't tested the output of the build but it at least produced artifacts: https://hudson.eclipse.org/mylyn/job/mylyn-3.19.x-release/16/artifact/

Thank you David and Denis!

I've kicked the CDO M6 build: https://hudson.eclipse.org/cdo/job/emf-cdo-integration/301/console and will report here how it went...

Actually, if you don't mind, I'll leave this open since I want to circle back with Mikael and make sure that a) I've deployed the jar correctly (I don't think I did) and b) to make sure we document the process of maintaining the signing service, since it has undergone a major overhaul.

David, thanks for digging into this and for the patch.

@Denis, don't mind at all (but, you could have opened a new "reminder bug" :) Sometimes I prefer that, just so people don't get the impression "it is still a problem and they don't even try". But, in this case ... it is your bug!

= = = = = = = =

Also, from a short chat with Eike, his build was getting further, but still failing. It almost seemed to me (half guessing) that Buckminster's "wait until signed" heuristic was somehow thinking "things were signed" and "moved on" too early and then of course failed. (A signed zip was created where it was supposed to be, but by that time his build was already done failing).

If any other Buckminster users experience that, please document (even if in a new bug :).

I know in Orbit I initially had a similar problem because when first tested there were "extra files" being created in the signing directory, and Orbit's "brain dead" heuristics somehow interpreted that as "all done signing".

While it would be nice if such heuristics were fixed at the source, if that is not easy, it may be possible for us "not to touch" the input file (for example) and create a new zip with the pack.properties in it and pass it to the "jarsigner". That is why I would like to know if it is a common problem ... or, just Eike. :)

= = = = = = = =

Thank you, Denis for deploying, and thanks to everyone else for your reports and patience.

Good idea. Closing.

(In reply to David Williams from comment #50)
> Also, from a short chat with Eike, his build was getting further, but still
> failing.

I could fix the problem with my build and I believe I know what has happened.

Interestingly even my new, good build produced a console log with 50% of it (adds 2.6 MB!) being filled with these [addPackProperties] lines. I have never seen these lines before and I would be happy to not see them in my logs. One reason is that transferring the logs and navigating in them is quite slow and nasty now. But, more importantly, I think they caused my failures because:

a) Buckminster's signing.ant script checks the /usr/bin/sign output for occurrences of the string "ERROR":

    <exec executable="/usr/bin/sign" outputproperty="sign.output">
      <arg value="${eclipse.staging.area}/${subject.file}"/>
      <arg value="nomail"/>
      <arg value="${staging.output.folder}"/>
    </exec>
    <fail message="${sign.output}">
      <condition>
        <or>
          <contains string="${sign.output}" substring="ERROR"/>
          <contains string="${sign.output}" substring="Usage:"/>
        </or>
      </condition>
    </fail>

b) CDO has a file with the string "ERROR" in the name.

So my work-around was to rename this CDO file to some other name. And then the build no longer failed.

But I'd prefer that the verbosity of the /usr/bin/sign executable is reduced back to its original level. What do you think?

Denis, all my failed builds have left quite a bit of junk under /shared/download-staging.priv/modeling/emf/cdo. I'm lacking write permissions there, so can you please empty the following two folders for me?

/shared/download-staging.priv/modeling/emf/cdo/emf-cdo-integration
/shared/download-staging.priv/modeling/emf/cdo/emf-cdo-maintenance

Please let me know if you prefer a separate bugzilla for this. Thanks ;-)

(In reply to Eike Stepper from comment #52)
> But I'd prefer that the verbosity of the /usr/bin/sign executable is reduced
> back to its original level. What do you think?

Fully agree.
It was supposed to have been "on" just during early testing, and honestly, I think Mikael thought he turned it off already (both issues are mentioned in bug 489326 and in Mikael's comment, bug 489326 comment 4). I will add a reminder there to set the default to 'false' AND will see if there is an easy way to pipe it somewhere else besides to "users output" if it ever needs to be turned on.

(But, honestly, I think "ERROR" in file names should be illegal in Java -- I am always having to fine tune my regex to avoid them as I scan logs. :)

P.S.
A. Thanks for tracking it down, Eike.
B. And apologies that I "wasn't around" when I said I would be, but I forgot and I am on vacation a few days. :/

(In reply to David Williams from comment #54)
> (In reply to Eike Stepper from comment #52)
>
> > But I'd prefer that the verbosity of the /usr/bin/sign executable is reduced
> > back to its original level. What do you think?
>
> Fully agree. It was supposed to have been "on" just during early testing,
> and honestly, I think Mikael thought he turned it off already (both issues
> mention in bug 489326 and Mikael's comment in bug 489326 comment 4. I will
> add a reminder there to set the default to 'false' AND will see if there is
> an easy way to pipe it somewhere else besides to "users output" if it ever
> needs to be turned on.

I actually did it in http://git.eclipse.org/c/cbi/org.eclipse.cbi.git/commit/cli-tools/signing/jar?id=59697bc0bb25f01bb26928260eabb1664b8abab9 and I've checked that the deployed version really is containing this change. Do you still see the verbose logging?

Thanks Mikael! My next signed build is on Friday. I'll let you know then... |
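The false positive described in comment 52 (a plain substring test for "ERROR" also fires on a verbose log line that merely names a file containing "ERROR") can be reproduced with a small sketch. The class `SignOutputCheck`, both method names, and the sample log line are hypothetical, and the stricter check rests on an assumption this bug does not confirm: that genuine failures from /usr/bin/sign begin a line with "ERROR" or "Usage:".

```java
import java.util.regex.Pattern;

public class SignOutputCheck {

    /** Mirrors the two Ant <contains> conditions quoted in comment 52. */
    static boolean substringCheck(String output) {
        return output.contains("ERROR") || output.contains("Usage:");
    }

    // Assumption for this sketch only: real failures start a line with
    // "ERROR" or "Usage:", optionally preceded by spaces or tabs.
    private static final Pattern FAILURE_LINE =
            Pattern.compile("^[ \\t]*(ERROR|Usage:)", Pattern.MULTILINE);

    /** Flags the output only when a line starts with a failure marker. */
    static boolean lineStartCheck(String output) {
        return FAILURE_LINE.matcher(output).find();
    }

    public static void main(String[] args) {
        // Hypothetical verbose line; the jar name is made up for illustration.
        String verbose = "[addPackProperties] processing plugins/org.example.ERROR_codes.jar\n";
        System.out.println(substringCheck(verbose)); // true: the file name trips the check
        System.out.println(lineStartCheck(verbose)); // false: no line starts with ERROR
    }
}
```

Anchoring the match to the start of a line is only one option. Reducing the verbosity of the signing output, as agreed in the comments above, removes the trigger at the source.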