
Bug 300306

Summary: pserver delays
Product: Community
Component: CVS
Status: RESOLVED WONTFIX
Severity: normal
Priority: P3
Version: unspecified
Target Milestone: ---
Hardware: All
OS: All
Reporter: Christian Campo <christian.campo>
Assignee: Eclipse Webmaster <webmaster>
QA Contact:
CC: anne.jacko, b.muskalla, elias, erwin, pascal, ruediger.herrmann, sharon.corbett, wayne.beaton, webmaster, yang.meyer
Whiteboard:

Description Christian Campo CLA 2010-01-21 03:40:59 EST
Hi,

Since the last technical restructuring of the CVS server infrastructure, the pservers are no longer in sync with extssh access. There is also no easy way to tell how up-to-date pserver is.

While this was done to make extssh quicker (to help committers), it creates a problem for contributors. More than once, active contributors to our project were searching for fixes for bugs that only existed on contributors' computers and not on committers' computers, because pserver is several hours behind the committed code.

At the same time, running automated builds and tests is now also error-prone: after fixing a bug, running the build again sometimes works, and sometimes we have to wait a few hours and try another build.

I know that you could make the build check out using a committer's user ID and password, but I am not sure I want my committer user ID and password spread across scripts or property files. To my mind, that contradicts the idea that the user ID identifies me as a person and makes me personally responsible for changes in CVS.

Still, it wouldn't help the contributors.

I am pretty unsatisfied with the number of disadvantages we experience with the pserver delays.

I would like to know why pserver cvs is several hours behind (and not only minutes or seconds)?

Would it be possible to have a status page, either per CVS repository or for all of them, showing the last timestamp that was synced to the pserver CVS copy? Then, when we commit something at 13:32, we would at least have an easy way to check when that code is in the pserver copy.

Last but not least, I would vote for reverting the change and putting pserver and extssh back in sync. The time we lost to issues related to this in the past month is WAY MORE than the time we lost to delayed CVS access in the two years since Riena started.

thanks
christian campo
Comment 1 Denis Roy CLA 2010-01-21 10:02:41 EST
> I would like to know why pserver cvs is several hours behind (and not only
> minutes or seconds)?

I have hooked into all the CVS facilities to initiate a sync after a commit.  For the most part, it works.  For some reason, that doesn't seem to be working for Riena.  I will investigate.
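
The actual hook used on eclipse.org is not shown in this thread. As a rough illustration of the general idea only: CVS can run a program after a commit (for example via the CVSROOT/loginfo file), and that program could mirror the affected module to the read-only copy with rsync, which is mentioned later in this thread. In the sketch below, the function names, the host name "mirror" and the paths are all assumptions made for the example.

```python
import subprocess
from typing import List


def build_sync_command(module: str, src_root: str, dest_root: str) -> List[str]:
    """Build an rsync command that mirrors one CVS module to the read-only copy.

    All arguments are hypothetical; they are not taken from the real setup.
    """
    module = module.strip("/")
    # Trailing slashes make rsync copy the directory contents instead of
    # nesting the source directory inside the destination directory.
    return [
        "rsync", "-a", "--delete",
        f"{src_root.rstrip('/')}/{module}/",
        f"{dest_root.rstrip('/')}/{module}/",
    ]


def sync_after_commit(module: str, src_root: str, dest_root: str) -> int:
    """Run the sync and return rsync's exit status (0 means success)."""
    return subprocess.run(build_sync_command(module, src_root, dest_root)).returncode
```

A hook that never runs for a given account (or runs but fails) would leave that commit unsynced until the next successful sync of the same module, which would match the kind of per-committer delays reported here.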

> Would it be possible to have a status page, either per CVS repository or for
> all of them, showing the last timestamp that was synced to the pserver CVS
> copy? Then, when we commit something at 13:32, we would at least have an easy
> way to check when that code is in the pserver copy.

Again, most commits are propagated immediately, so I'm not sure what timestamp you'd like for me to show.

> Last but not least, I would vote for reverting the change and putting pserver
> and extssh back in sync. The time we lost to issues related to this in the past
> month is WAY MORE than the time we lost to delayed CVS access in the two years
> since Riena started.

I apologize for the inconvenience this has caused you, and I will investigate this further to attempt to solve the problem with Riena, but the primary motivation for splitting CVS is that we have hardware constraints. The servers are too busy.  We have plans for adding capacity in 2010, but until then, the situation will be the same.

You can minimize the impact of this by using extssh as much as possible (for builds, for instance).
Comment 2 Christian Campo CLA 2010-01-21 10:36:36 EST
Denis,

I don't think that this problem is Riena-specific. The other day Benjamin Muskalla complained about the exact same problem, with delays of up to 8 (EIGHT) hours.

here is what you had to say:
----------
Date: Thu, 14 Jan 2010 15:25:39 -0500
From: Eclipse Webmaster (Denis Roy) <webmaster@eclipse.org>
To: Benjamin Muskalla <bmuskalla@eclipsesource.com>

Hi Benny.

Happy New Year to you too.  We made the change to pserver because we're
starting to run low on hardware resources.  Until we get some more
hardware, we'll have to live with the delayed sync.  Sorry.

Denis

----------

If we have to live (as I understand your mail above) with such a situation where pserver in general is unreliable, where syncs are delayed by 8 hours, then a status page for the individual CVS repositories would help us see up to what timestamp the sync has crawled forward. That would help a lot.

Any change or status page would be much appreciated.

thanks

christian
Comment 3 Denis Roy CLA 2010-01-21 10:49:54 EST
(In reply to comment #2)
> I don't think that this problem is Riena-specific.

I never said it was.

> If we have to live (as I understand your mail above) with such a situation
> where pserver in general is unreliable,

Who is 'we' ?  Committers can use extssh and never have any issues.  Unfortunately, this does cause problems for anonymous users, including contributors, but at this point my back is against a wall.


> Any change or status page would be much appreciated.

Before going there, can you help me solve the problem?  It seems that, on disk, the file below was changed Jan. 20 at 23:29 ET.  However, when I look at the CVS history using Eclipse, it seems the last modification was made Jan. 13.

Jan 20 23:29 /cvsroot/rt/org.eclipse.equinox/p2/bundles/org.eclipse.equinox.p2.ui/src/org/eclipse/equinox/internal/p2/ui/model/IIUElement.java

Can you help me understand what has changed?
Comment 4 Christian Campo CLA 2010-01-21 11:10:09 EST
There is a tag on the file that says v20100120-1129. So while a tag does not change the file (I am not a CVS expert here), the time 11:29 and the day seem to correlate with the last modification date.

Maybe that helps?
Comment 5 Christian Campo CLA 2010-01-21 11:38:55 EST
Currently, for an arbitrary file in Riena,

/cvsroot/rt/org.eclipse.riena/org.eclipse.riena.tests/src/org/eclipse/riena/navigation/model/NavigationProcessorTest

the delay is more than 10 minutes but less than an hour.

I would like to understand the syncing better. Is it actually syncing some log file, so that I can be sure that, e.g., if one commit made at 5:32 is synced, then EVERY other commit from 5:32 and any time before 5:32 is synced too?

Then a status page could help to tell me what the current lag is, since it seems to vary: sometimes it is 1 hour and sometimes 8 hours, depending on the day and other factors.

However, I have heard from Benjamin that sometimes builds are breaking because implementation classes and test classes that are in different bundles but were committed at the same time are not synced together.

In that case you would be unable to provide, on a status page, a definite time up to which ALL commits for a CVS repo are synced.

So how does it work ?
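
To make the question above concrete: a status page could only show a single "synced up to" time if that time is a watermark, meaning every commit at or before it has reached pserver. A minimal sketch of that calculation follows, assuming a list of (commit time, synced?) pairs is available; the function name and the input format are hypothetical and do not describe how the real sync works.

```python
from datetime import datetime
from typing import Iterable, Optional, Tuple


def sync_watermark(commits: Iterable[Tuple[datetime, bool]]) -> Optional[datetime]:
    """Return the latest commit time T such that every commit at or before T is synced.

    Returns None if there are no commits or the earliest commit is unsynced.
    """
    commits = list(commits)
    synced = [t for t, is_synced in commits if is_synced]
    unsynced = [t for t, is_synced in commits if not is_synced]
    if not unsynced:
        # Everything is synced, so the watermark is the newest commit.
        return max(synced, default=None)
    first_gap = min(unsynced)
    # Only synced commits strictly before the first unsynced one count.
    return max((t for t in synced if t < first_gap), default=None)


commits = [
    (datetime(2010, 1, 21, 5, 30), True),
    (datetime(2010, 1, 21, 5, 32), False),  # this commit was skipped
    (datetime(2010, 1, 21, 5, 40), True),
]
print(sync_watermark(commits))
```

If commits can be skipped, as in this example, a later commit being visible on pserver says nothing about earlier ones, and the watermark stays stuck before the first gap.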
Comment 6 Elias Volanakis CLA 2010-01-21 17:51:04 EST
From my POV I've two problems with the current sync:

- The delay is too big (checked in SubModuleViewTest 15m ago - still not synced)
- The delay is too variable (if I knew the delay was *always* under X minutes, I could adjust to that. Right now it can be seconds or minutes or hours)
Comment 7 Elias Volanakis CLA 2010-01-21 18:18:29 EST
For what it's worth: it's now 40+ minutes and I'm not seeing the last check-in yet. That's too long to be usable, from my POV.
Comment 8 Christian Campo CLA 2010-01-22 07:50:31 EST
Wayne, I am adding you in CC just to raise awareness that, while Denis is doing what he can, this is a major problem for contributors from the community. 
And of course for buildmasters of the individual eclipse projects. They will step by step invest hours and days to convert their buildprocess from pserver access to extssh.
Comment 9 Denis Roy CLA 2010-01-22 09:40:35 EST
Christian,

I noticed that you did a commit on UITestHelper.java, here:

http://dev.eclipse.org/viewcvs/index.cgi/org.eclipse.riena/org.eclipse.riena.ui.swt/src/org/eclipse/riena/internal/ui/swt/test/?root=RT_Project

That commit was propagated immediately to pserver as it should be.  The ViewCVS application uses the "delayed" pserver data, and it shows the up-to-date information.  Unfortunately, this immediate sync is not working for some committer accounts, and that is what I'm working on solving.  Please bear with me.




For what it's worth, I started discussing this change in October 2009:
http://dev.eclipse.org/mhonarc/lists/eclipse.org-committers/msg00789.html


All the logic behind this implementation is described in bug 293355.  
Bug 293355 comment 8 also explains that this type of delayed-sync for anonymous users is not an unusual setup.

(In reply to comment #8)
> And of course for buildmasters of the individual eclipse projects. They will
> step by step invest hours and days to convert their buildprocess from pserver
> access to extssh.

In bug 293355 comment 7, Kim said that she can easily update the Platform's build system to accommodate the change.  If it takes you hours and days, I would respectfully suggest something is wrong with your build process.
Comment 10 Christian Campo CLA 2010-01-22 10:15:11 EST
(In reply to comment #9)
> Christian,
> 
> I noticed that you did a commit on UITestHelper.java, here:
> 
> http://dev.eclipse.org/viewcvs/index.cgi/org.eclipse.riena/org.eclipse.riena.ui.swt/src/org/eclipse/riena/internal/ui/swt/test/?root=RT_Project
> 
> That commit was propagated immediately to pserver as it should be.  The ViewCVS
> application uses the "delayed" pserver data, and it shows the up-to-date
> information.  Unfortunately, this immediate sync is not working for some
> committer accounts, and that is what I'm working on solving.  Please bear with
> me.
> 
Not sure what "immediately" means here. At the moment I wrote the comment, the class was not yet synced. If you look at comment 7, you see that Elias also had a delay of 40+ minutes for a commit he made.
> 
> 
> 
> For what it's worth, I started discussing this change in October 2009:
> http://dev.eclipse.org/mhonarc/lists/eclipse.org-committers/msg00789.html
I looked at it and saw that the main benefit you explain is that CVS extssh access was so slow because the world was accessing the same disk through pserver for their builds. Now the (Eclipse) world all needs to switch to extssh to get reliable builds again. So with luck, in a few months we will have a very busy disk again, won't we?
> 
> 
> All the logic behind this implementation is described in bug 293355.  
> Bug 293355 comment 8 also explains that this type of delayed-sync for anonymous
> users is not an unusual setup.
> 
> (In reply to comment #8)
> > And of course for buildmasters of the individual eclipse projects. They will
> > step by step invest hours and days to convert their buildprocess from pserver
> > access to extssh.
> 
> In bug 293355 comment 7, Kim said that she can easily update the Platform's
> build system to accommodate the change.  If it takes you hours and days, I
> would respectfully suggest something is wrong with your build process.
most likely :-)

But it seems nothing really good can be done. So I guess the only solution is to switch to extssh for the build and ask contributors for patience.
Comment 11 Denis Roy CLA 2010-01-22 15:10:36 EST
I fixed a bug which was preventing committers with restricted shells from properly triggering the sync mechanism.  



> I looked at it and saw that the main benefit you explain is that CVS extssh
> access was so slow because the world was accessing the same disk through
> pserver for their builds. Now the (Eclipse) world all needs to switch to extssh
> to get reliable builds again. So with luck, in a few months we will have a very
> busy disk again, won't we?

The Eclipse world is so much smaller than the entire world.  You would not believe how often a [university|research team|company xyz|some guy] pulls Gigabytes of code across countless branches and versions for hours and hours and hours for purposes that have no direct benefit to the Eclipse community.  The cost to them is virtually zero.  The cost to us is slower servers for the people that count the most.

So at this point, I think pserver should be in almost perfect sync with extssh at all times, but I'm sure there are commits that may slip through the cracks.  I'll keep monitoring for those to see why they happen and I'll do my best to fix the problem.

In the meantime, if you have contributors who work so close to HEAD that these syncs affect their work, wouldn't it be worth considering voting them in as committers?
Comment 12 Elias Volanakis CLA 2010-01-22 15:27:18 EST
Thanks, Denis. 

I hope that at least my commits sync quickly now. If not, I will report back here.

Our main problem is that we don't want our build server to run under another committer's name. However, maybe we should vote our build server in as a committer?

Elias.
Comment 13 Denis Roy CLA 2010-01-22 15:36:08 EST
(In reply to comment #12)
> Thanks, Denis. 
> 
> I hope that at least my commits sync quickly now. If not, I will report back
> here.

I just saw you commit AuthorizationServiceITest.launch,v and Riena - AllFastTests.launch,v and they zipped by the rsync onto pserver.  Let me know if this is what you're seeing.


 
> Our main problem is that we don't want our build server to run under another
> committer's name. However, maybe we should vote our build server in as a
> committer?

Build servers have feelings too.
Comment 14 Elias Volanakis CLA 2010-01-22 15:41:43 EST
:-) 

Yes, this is fixed for me now. I also just committed build.properties and it took just a few seconds to show up via pserver.
Comment 15 Elias Volanakis CLA 2010-01-25 14:02:17 EST
Thanks,
Denis.

@Christian: I'm closing this - the delays are gone for me. If you notice anything, please reopen.
Comment 18 Christian Campo CLA 2010-02-01 05:23:38 EST
So I made a small change to RienaHessianProxyFactory, the first file whose sync was missing, and it got synced immediately. So in some cases it seems to fail, and then the sync is lost.
Comment 19 Christian Campo CLA 2010-02-01 08:51:12 EST
I tagged all projects in Riena, something like 30 projects. While the extssh access shows all single files as being tagged, the pserver sync only ALMOST has all the tags. The tag in question is called v20100201_2_0_0_M5. You can see it applied on http://dev.eclipse.org/viewcvs/index.cgi/org.eclipse.riena/org.eclipse.riena.navigation.ui.swt/schema/moduleGroupView.exsd?root=RT_Project&view=log in the extssh and pserver version.

It seems, however, that only the tags from src and schema were synced. So in the META-INF directory, the MANIFEST.MF does not have that tag on pserver, http://dev.eclipse.org/viewcvs/index.cgi/org.eclipse.riena/org.eclipse.riena.navigation.ui.swt/META-INF/MANIFEST.MF?root=RT_Project&view=log
There the revision 1.44 still only has the tag HEAD, while on extssh it also has the tag mentioned above. Many, many other projects seem to have been tagged and synced correctly.
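
A check like the one described above can be automated once the tags per file have been collected from both sides. The following is a minimal sketch, assuming two mappings from file path to the set of tags seen via extssh and via pserver; how those sets are gathered is left out, and the function name and the shortened paths are assumptions made for the example.

```python
from typing import Dict, Set


def missing_tags(primary: Dict[str, Set[str]],
                 mirror: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Return, per file, the tags present on the primary but absent on the mirror."""
    gaps: Dict[str, Set[str]] = {}
    for path, tags in primary.items():
        # A file unknown to the mirror is treated as having no tags there.
        lost = tags - mirror.get(path, set())
        if lost:
            gaps[path] = lost
    return gaps


TAG = "v20100201_2_0_0_M5"
extssh = {
    "schema/moduleGroupView.exsd": {"HEAD", TAG},
    "META-INF/MANIFEST.MF": {"HEAD", TAG},
}
pserver = {
    "schema/moduleGroupView.exsd": {"HEAD", TAG},
    "META-INF/MANIFEST.MF": {"HEAD"},  # the tag did not arrive here
}
print(missing_tags(extssh, pserver))
```

With the sample data, only META-INF/MANIFEST.MF is reported, which mirrors the situation described in this comment.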
Comment 20 Christian Campo CLA 2010-02-09 02:17:53 EST
Closing as WONTFIX, since this won't get better anytime soon. It currently works most of the time, except for the few cases where it doesn't.
Comment 21 Denis Roy CLA 2010-03-15 11:20:24 EDT
*** Bug 305861 has been marked as a duplicate of this bug. ***
Comment 22 Denis Roy CLA 2010-03-20 23:14:30 EDT
*** Bug 306615 has been marked as a duplicate of this bug. ***