Bug 293416 - CVS unusably slow
Summary: CVS unusably slow
Status: RESOLVED NOT_ECLIPSE
Alias: None
Product: Community
Classification: Eclipse Foundation
Component: CVS
Version: unspecified
Hardware: PC Linux-GTK
Importance: P3 normal
Target Milestone: ---
Assignee: Eclipse Webmaster CLA
Reported: 2009-10-27 06:44 EDT by Dani Megert CLA
Modified: 2009-11-03 12:00 EST
Description Dani Megert CLA 2009-10-27 06:44:05 EDT
Since last week Eclipse CVS has become unusably slow (or times out, depending on the connection timeout set on the Team > CVS > Connection preference page). This affects tagging but also comparing or synchronizing a project.

While bug 293355 talks about redirecting pserver connections to a read-only copy, this bug here is about finding out what happened that it is so much worse since about last week.

I mostly see the issue in big projects that have many folders and files. A common lock point (and hence my test case) is e.g. inside /org.eclipse.jdt.ui.tests.refactoring/resources/*. Comparing that project with HEAD takes > 1 hour, which makes working impossible.

Very interesting is that it is *not* a "one lock locks all" situation: while the synchronization (or tagging) is blocked from my Linux computer I can easily do the same operation (at the same time) from my Windows XP laptop (inside same network as the Linux machine) with the same extssh user (dmegert) i.e. it is constantly very very slow on the Linux computer but always works fine on the other. I tried on both machines with a new workspace and hence the only difference I can see is the OS of the machine. Maybe the different ssh versions make a difference?
Comment 1 Dani Megert CLA 2009-10-27 10:29:55 EDT
Comparing org.eclipse.jdt.ui.tests.refactoring with HEAD on a Windows 7 machine also seems to work at this point, but still no luck with the Linux machine.
Comment 2 Denis Roy CLA 2009-10-27 10:43:57 EDT
(In reply to comment #1)
> Comparing org.eclipse.jdt.ui.tests.refactoring with HEAD on a Windows 7 machine
> also seems to work at this point, but still no luck with the Linux machine.

So it works on one machine but not on the other within the same network... The only variable here seems to be at your end. Why are you blaming the repository? Can you explain (in layman's terms) how I can reproduce this?
Comment 3 Denis Roy CLA 2009-10-27 11:09:21 EDT
I checked out /cvsroot/eclipse/org.eclipse.jdt.ui.tests.refactoring (using extssh) and did a Synchronize.  It took a couple of minutes, but there are 21,700 files in there...  There are currently  no locks for that plugin.
Comment 4 Dani Megert CLA 2009-10-27 11:18:38 EDT
>So it works on one machine but not on the other within the same network...
Yes.

> checked out /cvsroot/eclipse/org.eclipse.jdt.ui.tests.refactoring
Strangely enough, that also works fine for me, but what takes 1 to 2 hours is:
1. select it in the Package Explorer
2. Compare With > Latest from HEAD

> The only variable here seems to be at your end.
I doubted that because others seemed to have same performance issues (note that they are not in the same network as I am, even if they are IBMers) and because when you killed the locks yesterday it immediately continued.
Comment 5 Dani Megert CLA 2009-10-27 11:25:24 EDT
I just started another compare on that project which is processing very very slowly. You should be able to see it now.
Comment 6 Karl Matthias CLA 2009-10-27 11:59:23 EDT
Dani, are all IBM people running a corporate Linux build?
Comment 7 Dani Megert CLA 2009-10-27 12:06:17 EDT
>Dani, are all IBM people running a corporate Linux build?
We have a different install which is quite old:

Suse Linux Enterprise Server 9
GNOME: 2.4.1
gtk2-2.2.4-125.17
Comment 8 Dani Megert CLA 2009-10-27 13:37:15 EDT
After various tries I've replaced dev.eclipse.org in my CVS connection with the IP of node1 (206.191.52.51) and guess what: it now works (previously I was on node5 according to Denis).

Maybe somehow related to bug 289408 (but there it was node1 with the issues).
Comment 9 Karl Matthias CLA 2009-10-27 16:18:21 EDT
Dani, if it always worked fine from Windows, and not from Linux (even for different users), then I don't think it has anything to do with the nodes themselves.  If you put it back to dev.eclipse.org is that now working as well?
Comment 10 Dani Megert CLA 2009-10-28 03:59:26 EDT
>then I don't think it has anything to do with the nodes
>themselves.  If you put it back to dev.eclipse.org is that now working as well?
No, unfortunately it does not (just tried again as of 08:55 CET). The fact is that if I use "dev.eclipse.org" as host in the CVS repository connection it takes more than an hour, but using "206.191.52.51" it finishes in under 5 minutes.
Comment 11 Dani Megert CLA 2009-10-29 05:14:14 EDT
Any update on this? Now it's also blocking on my Windows machine :-( and releng had to restart the builds because they hung in CVS.
Comment 12 Dani Megert CLA 2009-10-30 05:26:50 EDT
This morning, the Windows machine (with dev.eclipse.org) works again but on the Linux machine it's still the same: working OK with 206.191.52.51 but almost dead (it continues but very very slow) when using dev.eclipse.org.
Comment 13 Gunnar Wagenknecht CLA 2009-10-30 06:30:48 EDT
(In reply to comment #12)
> This morning, the Windows machine (with dev.eclipse.org) works again but on the
> Linux machine it's still the same: working OK with 206.191.52.51 but almost
> dead (it continues but very very slow) when using dev.eclipse.org.

Can you check the network settings of both machines? Are they in the same network, behind the same switch/hub? Do they have the same DNS settings, same default gateway, etc.?

Looks like DNS problems on your side with the Linux machine.
Comment 14 Dani Megert CLA 2009-10-30 06:38:09 EDT
>Looks like DNS problems on your side with the Linux machine.
I doubt it, since the server can be resolved and I can do all the CVS operations - they just take very very long. I don't think that it resolves the name of the server again during the same CVS op.
Comment 15 Denis Roy CLA 2009-10-30 09:55:06 EDT
Everything here is pointing at a specific problem with your Linux machine. As root, can you paste the output of:

- /sbin/ifconfig
- /sbin/ethtool ethX  <-- where ethX is the primary interface you use to connect (typically eth0)
- free
Comment 16 Dani Megert CLA 2009-10-30 10:20:48 EDT
>Everything here is pointing at a specific problem with your Linux machine.
I had the same issue yesterday on one of the Windows machines

Here's the info (/sbin/ethtool is not available):

eth0      Link encap:Ethernet  HWaddr 00:02:55:7B:69:62
          inet addr:9.4.202.202  Bcast:9.4.202.255  Mask:255.255.255.0
          inet6 addr: fe80::202:55ff:fe7b:6962/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1237807 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1010576 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:902374937 (860.5 Mb)  TX bytes:374280499 (356.9 Mb)
          Base address:0x2500 Memory:fbfe0000-fc000000
 
lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:7256954 errors:0 dropped:0 overruns:0 frame:0
          TX packets:7256954 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:3408726951 (3250.8 Mb)  TX bytes:3408726951 (3250.8 Mb)
 
             total       used       free     shared    buffers     cached
Mem:       1034280     987324      46956          0      38716     261416
-/+ buffers/cache:     687192     347088
Swap:      4200484      20768    4179716
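
As a quick sanity check on the memory figures above, the "-/+ buffers/cache" row of `free` can be recomputed from the "Mem:" row. This is only a minimal sketch: the numbers are copied from the pasted output and the variable names are illustrative.

```python
# Values in kB, copied from the `free` output pasted in this comment.
total, used, free, buffers, cached = 1034280, 987324, 46956, 38716, 261416

# Memory held by applications: "used" minus reclaimable buffers and page cache.
used_by_apps = used - buffers - cached
# Memory that could be handed to applications: "free" plus buffers and page cache.
available = free + buffers + cached

print(used_by_apps, available)  # 687192 347088
```

Both results match the "-/+ buffers/cache" row, so about a third of the RAM is still reclaimable and swap is barely used. On these figures alone, memory pressure does not look like an obvious cause of the slowdown.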
Comment 17 Dani Megert CLA 2009-11-03 08:52:38 EST
Made some more tests: it's definitely not DNS resolution. When using node1.eclipse.org instead of its IP (206.191.52.51) it also works. When I use the IP of dev.eclipse.org (206.191.52.50) it also fails.

I also tried rebooting the machine - no luck.

Could it be that this specific machine has some stale state on your load balancer?
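
One way to compare the endpoints independently of Eclipse is to time a plain TCP connect to each of them from the affected machine. The sketch below is only an illustration: `time_connect` is a hypothetical helper, port 22 is assumed because the connection uses extssh, and it measures connection setup only, not CVS throughput.

```python
import socket
import time


def time_connect(host, port=22, timeout=10.0):
    """Return (resolved_ip, seconds) for a single TCP connect to host:port.

    Hypothetical helper; port 22 is an assumption for extssh connections.
    """
    # Resolve first so that the DNS lookup is not counted in the connect time.
    ip = socket.getaddrinfo(host, port, socket.AF_INET, socket.SOCK_STREAM)[0][4][0]
    start = time.monotonic()
    with socket.create_connection((ip, port), timeout=timeout):
        elapsed = time.monotonic() - start
    return ip, elapsed
```

Calling it for dev.eclipse.org, node1.eclipse.org, 206.191.52.50 and 206.191.52.51 would show whether connection setup already differs between the two paths. If the times are similar, the slowdown more likely occurs later, during the CVS session itself.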
Comment 18 Denis Roy CLA 2009-11-03 09:16:58 EST
> Could it be that this specific machine has some stale state on your load
> balancer?

Just for the one Linux box you're using? That seems far-fetched. I'm going to close this as not eclipse -- neither Karl nor I can reproduce this, and the problem seems quite isolated.  If there is anything else I can do to help, please let me know.  Otherwise, I'm out of ideas.
Comment 19 Dani Megert CLA 2009-11-03 09:45:26 EST
> If there is anything else I can do to help,
The only thing would be to reset the load balancer (if it has state). For now we simply work against node1.
Comment 20 Karl Matthias CLA 2009-11-03 12:00:15 EST
Hi Dani, it does keep state, but your session will have been flushed within no more than a few hours after you used it last.  It is also kept by IP address and destination port, so if you're coming from a corporate firewall you almost certainly are using the same load balancer state no matter which machine you're coming from.  This has to be a localized issue.
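
The persistence behaviour described above can be modelled roughly as a table keyed by (client IP, destination port) with an idle timeout. This is a toy sketch, not the actual load balancer: the class name, the round-robin fallback and the timeout value are all assumptions made for illustration.

```python
import time


class StickyTable:
    """Toy model of load-balancer persistence keyed by (client IP, destination port)."""

    def __init__(self, nodes, ttl_seconds, clock=time.monotonic):
        self.nodes = list(nodes)
        self.ttl = ttl_seconds      # idle timeout; the real value is not given in this bug
        self.clock = clock
        self.entries = {}           # (client_ip, dest_port) -> (node, last_used)
        self.next_index = 0

    def route(self, client_ip, dest_port):
        key = (client_ip, dest_port)
        now = self.clock()
        entry = self.entries.get(key)
        if entry is not None and now - entry[1] <= self.ttl:
            node = entry[0]  # entry still fresh: keep sending this client to the same node
        else:
            # Unknown or expired entry: pick the next node round-robin (an assumption).
            node = self.nodes[self.next_index % len(self.nodes)]
            self.next_index += 1
        self.entries[key] = (node, now)  # every use refreshes the idle timer
        return node
```

In this model, all machines behind one firewall address share a single entry, which is why two machines on the same corporate network would be expected to land on the same node until the entry goes idle and expires.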