| Summary: | Need download file checksum verifier service | | |
|---|---|---|---|
| Product: | Community | Reporter: | Konstantin Komissarchik <konstantin> |
| Component: | Servers | Assignee: | Eclipse Webmaster <webmaster> |
| Status: | RESOLVED WONTFIX | QA Contact: | |
| Severity: | enhancement | | |
| Priority: | P3 | CC: | david_williams, denis.roy, gunnar |
| Version: | unspecified | | |
| Target Milestone: | --- | | |
| Hardware: | All | | |
| OS: | All | | |
| Whiteboard: | | | |

Description
Konstantin Komissarchik
Would it be possible to give this higher priority? We had to disable checksum verification of eclipse.org downloads because we are constantly running into issues. The latest problem file is this one:

/modeling/emf/emf/downloads/drops/2.7.0/S201101281446/emf-xsd-Update-2.7.0M5.zip

In general, I see the following two cases where projects either do not understand that they should not overwrite already-published download files or are purposefully disregarding the instructions:

1. A late-breaking issue in an important milestone build causes a re-spin after the previous artifacts have already been published.
2. A nightly build is published with exactly the same file name every night.

If implementing a cache scrubber would take too much work, can we at least disable the checksum cache for now and have checksums computed on request?

I am changing the severity of this to a major problem, as the combination of how the download server works and how projects choose to use it leads to severe problems in securely and reliably using the eclipse.org download facility.

A few things come to mind. If you're building from within our firewall, using our local file system, I'm not sure what the purpose is of running a checksum on the files you are accessing. Won't your results always be the same as my results? Unlike Internet downloads, our local filesystems are robust, so any/all transit errors would either be caught and retried by the stack, or lead to a hard (and evident) I/O error in your build.

Checksum on the fly could work, but it's not something I'd want to expose to the world. But again, if you're accessing files from our filesystem directly, I don't see the point of checksumming.

On the other hand, there is a true risk of tampering, and this is where the current service shines: the sum is computed upon the file's initial upload. If the file is tampered with (maliciously or otherwise), that will be evident. If the sums were simply scrubbed and recomputed regularly, we would simply be generating a new sum for a file that has potentially been tampered with. Not secure at all.

As far as implementing a cache scrubber, I'm not sure how I'd choose which files to scrub. There are over 215,000 files registered with the download file index. I can't rely on a file's timestamp on disk, since that is easily changed by the owners (and, in fact, the emf example in comment 1 still has the same timestamp on disk as the timestamp in the database from the moment the sum was computed). I can't imagine re-computing all the files regularly -- that's too much disk I/O, and it does nothing to promote security.

If you ask me -- and I know you haven't -- I would like to assume that a file change after initial upload is an exception, not a rule, and that before erasing a known checksum and replacing it with a new one, the change should be validated by the owning project. In the case of comment 1, what would happen if the EMF team confirmed that they have _not_ changed the bits? That would confirm that the file has either been tampered with or is corrupted, and would warrant deeper investigation.

I'm open to suggestions. But I do not believe scrubbing the cache is the right thing to do.

> If you're building from within our firewall, using our local file
> system, I'm not sure what the purpose is of running a checksum on the
> files you are accessing. Won't your results always be the same as my
> results?

In Sapphire, the same build script that runs on eclipse.org Hudson is used daily by the developers on their local machines.
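For illustration only, here is a minimal sketch of the kind of download verification such a build script might perform, assuming the server publishes a digest alongside each artifact. The URLs and the ".sha512" suffix convention below are hypothetical, not Sapphire's or eclipse.org's actual layout.

```python
# Minimal illustrative sketch (not Sapphire's actual build code): download an
# artifact and verify it against a digest published next to it. The URLs and
# the ".sha512" suffix convention are hypothetical placeholders.
import hashlib
import urllib.request

ARTIFACT_URL = "https://download.example.org/drops/some-build/artifact.zip"  # hypothetical
CHECKSUM_URL = ARTIFACT_URL + ".sha512"                                       # hypothetical

def sha512_of(path, chunk_size=1 << 20):
    """Compute the SHA-512 digest of a file in chunks to keep memory use low."""
    digest = hashlib.sha512()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()

def main():
    local_path, _ = urllib.request.urlretrieve(ARTIFACT_URL, "artifact.zip")
    # Checksum files commonly contain "<digest>  <file name>"; take the first field.
    expected = urllib.request.urlopen(CHECKSUM_URL).read().decode().split()[0]
    actual = sha512_of(local_path)
    if actual != expected:
        raise SystemExit(f"Checksum mismatch for {local_path}: expected {expected}, got {actual}")
    print(f"Checksum OK for {local_path}")

if __name__ == "__main__":
    main()
```

The same check run on Hudson and on a developer machine should agree whenever the published file and its recorded checksum are consistent, which is exactly what breaks when a published file is silently overwritten.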
Of course, the problem with unreliable checksums goes beyond Sapphire use cases.

> Checksum on the fly could work but it's not something I'd want to expose to the
> world.

Why not? If you are worried about DDoS, you can set up the checksum service behind a volatile cache with an expiry of, say, a minute. Note that we aren't looking for an internal-only solution.

> On the other hand, there is a true risk of tampering, and this is where the
> current service shines

There seem to be two distinct requirements:

1. Publish a checksum that the public can use to verify downloads against corruption and external tampering.
2. Prevent internal tampering on the server itself.

It might be easier to fulfill these requirements separately. Right now, requirement #2 isn't really fulfilled very well anyway. An external party has to report the problem, and then you have to go back and forth trying to figure out whether the issue is with the mirror, external transit, local tampering, etc. I do agree that safeguarding against internal tampering is important, but to really address it, you need a system that automatically notices the problem or prevents it from happening in the first place. For instance, if we don't want projects overwriting their artifacts, why not devise a system that makes it impossible to do that?

> As far as implementing a cache scrubber, I'm not sure how I'd choose which
> files to scrub. There are over 215,000 files registered with the download
> file index.

I would recommend scrubbing everything with a fairly low-priority service. It should be able to do several full scrubs in a day without impacting things too much. Of course, generating checksums on the fly behind a short-life cache would likely be much cheaper on CPU and disk resources: you never generate a checksum that isn't needed.

> If you ask me -- and I know you haven't -- I would like to assume that a file
> change after initial upload is an exception, not a rule, and that before
> erasing a known checksum and replacing it with a new one, the change should
> be validated by the owning project.

I've seen numerous cases of this problem, so the issue isn't exactly rare either. There are at least two projects that overwrite the same file name every night for their nightly builds. If you do decide to solve this problem with the scrubber, I don't have a problem with notifying the project first, but make sure that it is an automated system and that there are consequences for projects that do not take action, such as "respond if you didn't make this change; we will overwrite the checksum in 24 hours if we don't hear from you". Of course, solving #1 and #2 separately would be much better, since in the time between notice and regeneration the checksums retrieved from the official download server are wrong, and that has the potential to impact and confuse a lot of people.

I'll write a small utility that will compare cached sums for the day's downloads with the latest version of the file on disk, and send me a report. This should provide a clue as to how many files are changing on disk after the initial upload.

(In reply to comment #4)
> I'll write a small utility that will compare cached sums for the day's
> downloads with the latest version of the file on disk, and send me a report.
> This should provide a clue as to how many files are changing on disk after the
> initial upload.

That would be an interesting report.
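For illustration, a rough sketch of what such a comparison utility could look like, assuming the cached sums can be exported as a simple "digest  path" text file. That file format, its location, and the choice of MD5 are assumptions, not a description of the actual eclipse.org checksum index.

```python
# Rough illustrative sketch (not the actual eclipse.org utility): recompute the
# digest of every file that has a cached sum and report files that no longer
# match. The "digest  path" text-file format, its location, and the use of MD5
# are assumptions standing in for whatever index the real service keeps.
import hashlib
import os

CACHE_INDEX = "/path/to/cached-sums.txt"  # hypothetical export of the checksum cache

def load_cached_sums(index_path):
    """Parse lines of the form '<hex digest>  <absolute path>'."""
    sums = {}
    with open(index_path) as f:
        for line in f:
            if not line.strip():
                continue
            digest, path = line.split(None, 1)
            sums[path.strip()] = digest
    return sums

def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 digest of a file in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()

def report_mismatches(cached):
    """Print one line per file that is missing or whose content has changed."""
    for path, expected in cached.items():
        if not os.path.exists(path):
            print(f"MISSING   {path}")
        elif md5_of(path) != expected:
            print(f"MISMATCH  {path}")

if __name__ == "__main__":
    report_mismatches(load_cached_sums(CACHE_INDEX))
```

Run at low I/O priority (for example under ionice on Linux), something along these lines could walk the full index without much impact, which is roughly the low-priority scrub suggested above.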
Besides "scrubbing the cache" any projects that do replace a file, with same name, but different content should be sure to be counseled not to do it! At least, if they do it on a regular basis. I'm also surprised they would deliberately tweak mtime to back date a file? (But, admit ... I don't fully understand ctime, mtime, and the funky atime :) But, maybe this bug is just about a few rare occasions. I find it an interesting problem. This bug hasn't had any activity in quite some time. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet. If you have further information on the current state of the bug, please add it. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant. -- The automated Eclipse Genie. This bug hasn't had any activity in quite some time. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet. If you have further information on the current state of the bug, please add it. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant. -- The automated Eclipse Genie. This bug hasn't had any activity in quite some time. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet. If you have further information on the current state of the bug, please add it. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant. -- The automated Eclipse Genie. We won't be addressing this. |