Bug 362354 - Perhaps provide some audit-able record of backups?
Status: RESOLVED WORKSFORME
Alias: None
Product: Community
Classification: Eclipse Foundation
Component: Servers
Version: unspecified
Hardware: PC Linux
Importance: P3 enhancement
Target Milestone: ---
Assignee: Eclipse Webmaster CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-10-28 14:34 EDT by David Williams CLA
Modified: 2012-02-29 16:19 EST

See Also:


Description David Williams CLA 2011-10-28 14:34:37 EDT
Given the outcome of bug 361707, especially comment 361707#c38, I'm wondering if there should be some improved "test of the backup system"? Please take this as a minor suggestion ... I do not mean to overly add to anyone's workload or exaggerate one incident ... but ... backups are pretty important. It sounds like perhaps the backups were missing for a few months?

In any case, I was wondering if there should be some simple "test" or "report" of the backups ... perhaps the ISP could provide a one page summary of "number of files backed up by server name or top level directories" or something, and it be "published" on http://archive.eclipse.org/arch/, or somewhere. Think anomalies would be noticeable from that? If that's no good, maybe just a weekly, monthly, or quarterly "test of the emergency broadcast system"? (I say that, since, here in the States, anyway, the TV stations are required to run those tests once a month or so, which you normally only see if you stay up till 3 AM like I do, and half the time they fail :/ ... like no sound or sound is too low or message flashes for just 3 seconds, or something). 

Anyway ... just suggestions ... feel free to close as "won't fix" if you think there is nothing worthwhile to do here. (I know at home, I have backed up frequently, for years, and never once tested whether the backup is accurate ... the one time I really needed one of those backups, I had made the mistake of buying Vista Home Edition, which did not even back up JSP (and other) files, by design ... I should have been testing it, I guess! ... but I'm not sure I could ever have thought to devise a test to spot that weirdness.)
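
As a rough illustration of the "number of files backed up by top level directory" summary suggested above, here is a minimal Python sketch. It assumes the backup system can export a plain list of backed-up file paths; the function names, the example paths, and the 50% drop threshold are all hypothetical and are not anything the ISP or the webmasters are known to provide.

```python
from collections import Counter
from pathlib import PurePosixPath


def count_by_top_level(paths):
    """Count backed-up files per top-level directory.

    `paths` is an iterable of POSIX-style file paths (assumed input format).
    """
    counts = Counter()
    for p in paths:
        # Drop the leading "/" component of absolute paths.
        names = [part for part in PurePosixPath(p).parts if part != "/"]
        if names:
            counts[names[0]] += 1
    return counts


def flag_anomalies(previous, current, max_drop=0.5):
    """Return top-level directories whose file count fell by more than
    `max_drop` (as a fraction) since the previous report, or vanished."""
    flagged = []
    for name, before in sorted(previous.items()):
        after = current.get(name, 0)
        if before > 0 and (before - after) / before > max_drop:
            flagged.append(name)
    return flagged


# Hypothetical manifests from two consecutive reports.
previous = count_by_top_level(["/cvsroot/a", "/cvsroot/b", "/home/x", "/home/y"])
current = count_by_top_level(["/home/x", "/home/y", "/home/z"])
print(flag_anomalies(previous, current))
```

Comparing per-directory totals between two reports, rather than looking at one overall number, is what would make a directory that silently dropped out of the backup stand out.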
Comment 1 David Williams CLA 2011-10-28 14:37:24 EDT
I never remember how to type in short-hand comment references, such as for 

https://bugs.eclipse.org/bugs/show_bug.cgi?id=361707#c38

perhaps prefaced with bug instead of comment, such as 

bug 361707#c38
Comment 2 Denis Roy CLA 2011-10-28 17:37:51 EDT
Yes, and yes.  And yes.

We've been using the same managed backup system since 2005.

Since then, we've been receiving an email each month to remind us to test our restores. For the first few years, we did.  We even brought in a blank box to simulate the case where fire would have destroyed all our hardware.

But as the years passed, testing our restores went to the backburner.  What came out was always what went in.  Even in September our restores were tested -- they just didn't include code in the SCMs.  This broke when we recently swapped the NFS servers and changed some mount points.

We receive a nightly report that shows us the size of the backup.  Since the daily diffs in the code repos are very, very small compared to the diffs of everything else, no red flags were raised.  There were still a ton of files being backed up on dev.eclipse.org.

Anyway, we've been doing everything you've suggested, more or less religiously, and that has obviously failed, so I'm open to suggestions.  Perhaps in our test restore we should select files that we know should always be available in a restore (such as one directory in each SCM).
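
A minimal Python sketch of that last idea, checking a test restore for directories that should always be present. It assumes the test restore has been unpacked under a local directory; the function name and the sentinel paths are hypothetical, chosen only for illustration.

```python
from pathlib import Path


def missing_sentinels(restore_root, sentinels):
    """Return the sentinel directories that are absent or empty
    under `restore_root` after a test restore."""
    root = Path(restore_root)
    missing = []
    for rel in sentinels:
        target = root / rel
        # A sentinel must be a directory containing at least one entry.
        if not target.is_dir() or not any(target.iterdir()):
            missing.append(rel)
    return missing


if __name__ == "__main__":
    # Hypothetical layout: one known directory per SCM.
    sentinels = ["cvsroot/some.project", "svnroot/some.project", "gitroot/some.project"]
    for rel in missing_sentinels("/tmp/test-restore", sentinels):
        print("MISSING from restore:", rel)
```

Requiring each sentinel to be non-empty, and not merely present, is a deliberate choice here: an empty mount point left behind after a change of mount points would otherwise pass the check.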
Comment 3 David Williams CLA 2011-10-28 18:54:45 EDT
Sounds like you have it covered, and this was just a hard-to-catch case that slipped through. I think you are right to add some specific tests to your test restore that would have caught this particular case ... just like we do with our code ... if a regression slips in, we try to add a JUnit test that would have caught the regression, just to make sure it doesn't happen again. (And, honestly, we are not that good about doing that. :)

Thanks for the info.
Comment 4 Denis Roy CLA 2011-10-28 19:15:00 EDT
(In reply to comment #1)
> I never remember how to type in short-hand comment references, such as for 
> 
> https://bugs.eclipse.org/bugs/show_bug.cgi?id=361707#c38
> 
> perhaps prefaced with bug instead of comment, such as 
> 
> bug 361707#c38


bug 12345 comment 4 should work
Comment 5 Denis Roy CLA 2012-02-29 16:19:58 EST
> Sounds like you have it covered, and this was just an hard-to-catch case that
> slipped though.

David, I'll close this as WORKSFORME.  We'll be deploying a brand-new backup solution -- one which will give us more control -- so I'm hoping to write some automated backup tests.