
Bug 576498

Summary: SSH slowdown on 147.75.85.211 and 147.75.85.214
Product: Community
Component: Servers
Reporter: Adam Farley <adfarley>
Assignee: Mikaël Barbero <mikael.barbero>
Status: RESOLVED FIXED
QA Contact:
Severity: normal
Priority: P3
CC: denis.roy, mikael.barbero, sxa
Version: unspecified
Target Milestone: ---
Hardware: PC
OS: Windows 10
Whiteboard:
Attachments:
Description: ps output from 147.75.85.211; Flags: none

Description Adam Farley CLA 2021-10-07 09:15:07 EDT
Summary: 

SSH sessions for both machines are very slow, with login taking minutes, where it's possible at all. Please help resolve this.

Details:

Where logins succeed, typed characters are slow to appear, PuTTY windows occasionally crash, processes can fail to start altogether, and ps -ef reports a massive number of "/var/tmp/cruner" processes. It is unknown whether this is a virus.

uptime output:
1:44pm  up 99 day(s), 20:42,  2 users,  load average: 4282.57, 4858.22, 5379.39

The last thing I did on either of those machines was to run a suite of JCK tests with concurrency set to 6. "psrinfo -p" returned 8, which I took to mean 8 cores, so I set the concurrency to 6 to try to prevent this exact scenario. Given the uptime results, it's possible that 6 Java threads are enough to overwhelm the machine anyway. Additionally, I only ran this test suite on 147.75.85.214, so if this *is* the cause, I presume both machines share the same hardware.
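
As a rough illustration of how far the reported load exceeds what the hardware can service, the uptime line above can be parsed and divided by the CPU count. This is a minimal Python sketch, not something from the report: the function names are hypothetical, and the CPU count of 8 is an assumption taken from the "psrinfo -p" figure quoted above (that flag counts physical processors, so the real number of hardware threads may differ).

```python
import re


def parse_load_averages(uptime_line):
    """Return the 1-, 5-, and 15-minute load averages found in an uptime line."""
    match = re.search(
        r"load averages?:\s*([\d.]+),?\s+([\d.]+),?\s+([\d.]+)", uptime_line
    )
    if match is None:
        raise ValueError("no load average found in: %r" % uptime_line)
    return tuple(float(value) for value in match.groups())


def load_per_cpu(uptime_line, cpu_count):
    """Divide each load average by the CPU count to get a per-CPU ratio."""
    return tuple(value / cpu_count for value in parse_load_averages(uptime_line))


# The uptime line as quoted in the report.
UPTIME_LINE = (
    "1:44pm  up 99 day(s), 20:42,  2 users,  "
    "load average: 4282.57, 4858.22, 5379.39"
)
ASSUMED_CPUS = 8  # assumption: taken from the "psrinfo -p" value in the report

if __name__ == "__main__":
    print(load_per_cpu(UPTIME_LINE, ASSUMED_CPUS))
```

With the figures in this report the ratio comes out in the hundreds per CPU, which points toward the sheer number of processes rather than the six-way test run as the likelier cause.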
Comment 1 Adam Farley CLA 2021-10-07 09:15:56 EDT
P.S. The slowdown and process launching issues are blocking all work on Solaris x86.
Comment 2 Denis Roy CLA 2021-10-07 10:17:40 EDT
> uptime output:
> 1:44pm  up 99 day(s), 20:42,  2 users,  load average: 4282.57, 4858.22,
> 5379.39

Would top be helpful in finding which process(es) are hogging the system?
Comment 3 Adam Farley CLA 2021-10-07 10:57:00 EDT
Unfortunately, top isn't a command the Solaris x86 systems recognize.

A website recommended "prstat -a", but that failed to run:

-bash-3.2$ prstat -a
-bash: fork: Resource temporarily unavailable

If it helps, ps -ef still works, and I saved the output to a file, ps_output.txt, attached.
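
Because new processes cannot be forked on the affected host, a saved ps -ef listing can be summarized offline on another machine. The following is a minimal Python sketch under stated assumptions: it assumes the usual UID, PID, PPID, C, STIME, TTY, TIME, CMD column order, the function name and the sample rows are hypothetical (the rows are invented for illustration and are not taken from the attachment), and only the "/var/tmp/cruner" path comes from this report.

```python
import re
from collections import Counter

# Assumed layout: UID PID PPID C STIME TTY TIME CMD.
# STIME may contain a space (for example "Jun 29"), so it is matched lazily.
PS_LINE = re.compile(
    r"^\s*(?P<uid>\S+)\s+(?P<pid>\d+)\s+(?P<ppid>\d+)\s+(?P<c>\d+)\s+"
    r"(?P<stime>\S.*?)\s+(?P<tty>\S+)\s+"
    r"(?P<time>(?:\d+-)?(?:\d+:)?\d+:\d{2})\s+(?P<cmd>.*)$"
)


def summarize_ps(lines):
    """Count processes per (owner, executable); also count lines that did not parse."""
    counts = Counter()
    skipped = 0
    for line in lines:
        match = PS_LINE.match(line.rstrip("\n"))
        if match is None:
            skipped += 1  # header line or an unexpected layout
            continue
        words = match.group("cmd").split()
        executable = words[0] if words else ""
        counts[(match.group("uid"), executable)] += 1
    return counts, skipped


# Hypothetical sample rows, invented for illustration only.
SAMPLE = """\
     UID   PID  PPID   C    STIME TTY         TIME CMD
    root     1     0   0   Jun 29 ?           0:17 /sbin/init
    root  4101     1   0 13:01:02 ?           0:00 /var/tmp/cruner
    root  4102     1   0 13:01:02 ?           0:00 /var/tmp/cruner
   user1  5000  4999   0 13:40:11 pts/1       0:00 ps -ef
"""

if __name__ == "__main__":
    counts, skipped = summarize_ps(SAMPLE.splitlines())
    for (owner, executable), count in counts.most_common(3):
        print(count, owner, executable)
    print("unparsed lines:", skipped)
```

To run it against a real listing, pass an open file (for example the saved ps_output.txt) to summarize_ps in place of the sample rows; a large unparsed count would indicate that the column assumption does not hold for that listing.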
Comment 4 Adam Farley CLA 2021-10-07 10:57:53 EDT
Created attachment 287274 [details]
ps output from 147.75.85.211
Comment 5 Stewart Addison CLA 2021-10-07 11:10:56 EDT
I have the `ps` listing from the machine - there is an "obvious" problem with processes owned by root (therefore out of our control) which needs addressing by the Eclipse sysadmins :-)
Comment 6 Mikaël Barbero CLA 2021-10-07 11:11:32 EDT
I'm looking into it as we speak.
Comment 7 Denis Roy CLA 2021-10-07 14:08:05 EDT
These two machines were compromised, and will not be coming back.

We're investigating -- we'll prepare a report. 

We'll revise our processes for validating machines before adding them to the build process. I suggest new ones be rebuilt.
Comment 8 Adam Farley CLA 2021-10-12 13:11:21 EDT
Thanks Denis & Mikaël. After consulting with the team, it seems a good idea for us to wait until the report is ready before adding more Solaris x86 machines to the pool.

That way we can be confident that the new machines avoid the issue(s) that caused the old ones to be compromised.
Comment 9 Denis Roy CLA 2021-10-12 13:32:17 EDT
The report has been authored and reviewed, and we've circulated it to the Executive team as there are legal implications here.

We'll release the report as soon as we can, as it contains a few recommendations.

Thx for your patience.
Comment 10 Denis Roy CLA 2021-11-05 10:14:03 EDT
This has been resolved.