Bug 311700 - Launch Error, cannot create routing file, unable to determine process location
Status: RESOLVED FIXED
Alias: None
Product: PTP
Classification: Tools
Component: RM.MPICH2
Version: 4.0
Hardware: Macintosh Mac OS X - Carbon (unsup.)
Importance: P3 major
Target Milestone: 4.0.3
Assignee: Project Inbox CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-05-05 07:18 EDT by chuan.bai CLA
Modified: 2010-08-19 20:17 EDT

See Also:


Attachments
The binary version of the core with this modification. (69.16 KB, application/octet-stream)
2010-05-31 12:25 EDT, Swatch Puppy CLA

Description chuan.bai CLA 2010-05-05 07:18:39 EDT
Build Identifier: 20100218-1602

An error message appears every time I try to start a debug session. Run mode seems to work fine. Other errors occur at the same time, but without a pop-up window, such as:

Unexpected exception:

org.eclipse.core.runtime.CoreException: Cannot create routing file: unable to determine process location
	at org.eclipse.ptp.debug.sdm.core.SDMDebugger.newCoreException(SDMDebugger.java:367)
	at org.eclipse.ptp.debug.sdm.core.SDMDebugger.writeRoutingFile(SDMDebugger.java:436)
	at org.eclipse.ptp.debug.sdm.core.SDMDebugger.createDebugSession(SDMDebugger.java:155)
	at org.eclipse.ptp.debug.core.PDebugModel.createDebugSession(PDebugModel.java:530)
	at org.eclipse.ptp.launch.ParallelLaunchConfigurationDelegate$1$1.run(ParallelLaunchConfigurationDelegate.java:184)
	at org.eclipse.jface.operation.ModalContext$ModalContextThread.run(ModalContext.java:121)


Unexpected IO:

java.io.IOException: Stream closed
	at java.io.BufferedReader.ensureOpen(BufferedReader.java:97)
	at java.io.BufferedReader.ready(BufferedReader.java:417)
	at org.eclipse.rse.internal.services.local.shells.LocalShellOutputReader.internalReadLine(LocalShellOutputReader.java:205)
	at org.eclipse.rse.services.shells.AbstractHostShellOutputReader.handle(AbstractHostShellOutputReader.java:74)
	at org.eclipse.rse.services.shells.AbstractHostShellOutputReader.run(AbstractHostShellOutputReader.java:180)


Reproducible: Always

Steps to Reproduce:
1. Start resource manager
2. Start a program in debug mode
Comment 1 Swatch Puppy CLA 2010-05-31 12:25:02 EDT
Created attachment 170558 [details]
The binary version of the core with this modification.
Comment 2 Swatch Puppy CLA 2010-05-31 12:25:50 EDT
Hello,

This error is not restricted to Macs. I'm working on a Sony with Ubuntu 10.04 and have the same problem. It is also not caused by the mpdlistjobs output changes in mpich2-1.2.1p1.
Apparently the process.getNode() method needs some time before it returns something other than null, so I worked around it as follows (it's not a real solution because it can cause a lot of problems, but it works for me so far):

	private void writeRoutingFile(IPLaunch launch) throws CoreException {
		DebugUtil.trace(DebugUtil.SDM_MASTER_TRACING, Messages.SDMDebugger_12);
		IProgressMonitor monitor = new NullProgressMonitor();
		OutputStream os = null;
		try {
			os = fRoutingFileStore.openOutputStream(0, monitor);
		} catch (CoreException e) {
			throw newCoreException(e.getLocalizedMessage());
		}
		PrintWriter pw = new PrintWriter(os);
		IPProcess processes[] = launch.getPJob().getProcesses();
		pw.format("%d\n", processes.length); //$NON-NLS-1$
		int base = 50000;
		int range = 10000;
		Random random = new Random();
		for (IPProcess process : processes) {
			String index = process.getProcessIndex();
			/*
			 * Busy-wait to make sure that getNode() returns a node for every process
			 */
			while (process.getNode() == null){}
			IPNode	node = process.getNode();
			if (node != null) {
				String nodeName = node.getName();
				int portNumber = base + random.nextInt(range);
				pw.format("%s %s %d\n", index, nodeName, portNumber); //$NON-NLS-1$
			} else {
				throw newCoreException(Messages.SDMDebugger_15);
			}
		}
		pw.close();
		try {
			os.close();
		} catch (IOException e) {
			throw newCoreException(e.getLocalizedMessage());
		}
	}
Comment 3 Swatch Puppy CLA 2010-06-01 19:57:41 EDT
I've had a little more time to think it through. I believe the thread that reads the node info runs before the thread that writes the node pointers into the array of nodes has written them.
Also, the workaround I gave before is fine for debugging, but it cannot determine when the program has finished, which in fact makes a lot of sense.

I'm flooded with work right now, so I hope this can help someone fix this bug.


Best regards, and good luck,

SwatchPuppy
Comment 4 chuan.bai CLA 2010-06-04 10:23:37 EDT
I tried the code you posted. During startup, the resource manager shows a good connection to the MPD server. However, once I try to start a debug session, it tells me that it cannot connect to the MPD server. But it's a good starting point. Thanks.

(In reply to comment #3)
> I've got a little bit more time to think it trough, i believe that the thread
> that it's trying to access the nodes info, it's doing it earlier than the
> thread that it's writing the nodes pointer onto the array of nodes is writing
> in it. 
> And apparently the solution that i gave before is cool for debugging but it's
> not able to determine when the program it's finished, which in fact make a lot
> of sense.
> 
> I'm flooded with work right now, so hope that this can help someone fixing this
> bug.
> 
> 
> The best regards, and the best of lucks,
> 
> SwatchPuppy
Comment 5 Greg Watson CLA 2010-06-04 10:33:55 EDT
It looks like this is a race condition between when the debug job is launched and when the process information is updated. It's more common in the MPICH case because the mpdlistjobs command is only run periodically. I think the solution to wait for the process information is correct, but it needs to be interruptible and must not block the UI. I'll work on a fix.
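
For illustration only, the kind of wait described above could be sketched as follows. This is not the fix that was committed; it is a minimal, self-contained example that assumes the node lookup can be modeled as a Supplier and the cancel request as a BooleanSupplier. The names NodeWait and waitForValue are hypothetical. In PTP the lookup would presumably be process.getNode() and the cancel check would presumably come from an IProgressMonitor, with the wait running off the UI thread.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.BooleanSupplier;
import java.util.function.Supplier;

public class NodeWait {
    /**
     * Polls source until it yields a non-null value, the caller cancels,
     * or the timeout elapses. Returns null on cancellation or timeout.
     * Thread.sleep makes the wait interruptible, unlike an empty busy loop.
     */
    public static <T> T waitForValue(Supplier<T> source, BooleanSupplier cancelled,
                                     long timeoutMillis, long pollMillis)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            T value = source.get();
            if (value != null) {
                return value;
            }
            if (cancelled.getAsBoolean() || System.nanoTime() - deadline >= 0) {
                return null;
            }
            Thread.sleep(pollMillis);
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the process information that is filled in later
        // by a periodic update (hypothetical, for demonstration only).
        AtomicReference<String> node = new AtomicReference<>();
        Thread updater = new Thread(() -> {
            try {
                Thread.sleep(200);
            } catch (InterruptedException e) {
                return;
            }
            node.set("node0");
        });
        updater.start();

        // Wait up to 5 seconds, polling every 50 ms, never cancelled here.
        String name = waitForValue(node::get, () -> false, 5000, 50);
        System.out.println(name);
        updater.join();
    }
}
```

A caller that gets null back can then report the "unable to determine process location" error instead of hanging, which the unbounded loop in comment 2 cannot do.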
Comment 6 Greg Watson CLA 2010-06-04 17:02:15 EDT
I've committed a fix for this to HEAD. I don't have MPICH installed, so I can't test it. If you could test asap that would be appreciated.
Comment 7 chuan.bai CLA 2010-06-04 17:46:59 EDT
I am sorry but where can I download your fix? Thanks. BTW which parallel environment are you working with?
Comment 8 Greg Watson CLA 2010-06-04 18:55:59 EDT
There will be a new build tonight at 10pm. Builds are available from http://wiki.eclipse.org/PTP/builds/4.0.0

I work mainly with Open MPI and PE.

Thanks!
Comment 9 chuan.bai CLA 2010-06-05 19:01:10 EDT
When I was installing on Eclipse 3.6, the debugger couldn't be built with "sh BUILD". The "libmi" directory is missing.


configure: WARNING: You must have XMLTO to compile the XML documentation for libaif.
configure: creating ./config.status
config.status: creating Makefile
config.status: creating doc/Makefile
config.status: creating config.h
config.status: config.h is unchanged
config.status: executing depfiles commands
Making install in libaif
make[2]: Nothing to be done for `install-exec-am'.
make[2]: Nothing to be done for `install-data-am'.
Making install in libmi
/bin/sh: line 0: cd: libmi: No such file or directory
make: *** [install-recursive] Error 1
Comment 10 Greg Watson CLA 2010-06-05 19:43:30 EDT
Sorry about that. Please try the latest build: http://www.eclipse.org/downloads/download.php?file=/tools/ptp/builds/helios/I.I201006051912/ptp-master-4.0.0-I201006051912.zip

Thanks,
Greg
Comment 11 chuan.bai CLA 2010-06-06 11:27:06 EDT
The problem persists when I use sh BUILD to compile the sdm debugger. Should I use an older version of sdm to debug? Thank you very much for your help.

Max

(In reply to comment #10)
> Sorry about that. Please try the latest build:
> http://www.eclipse.org/downloads/download.php?file=/tools/ptp/builds/helios/I.I201006051912/ptp-master-4.0.0-I201006051912.zip
> 
> Thanks,
> Greg
Comment 12 chuan.bai CLA 2010-06-06 13:21:36 EDT
After reinstalling Eclipse entirely, the sdm debugger compiled. The job was launched on the mpd server, but Eclipse can't connect to it. I will investigate this in the meantime. Thank you.

Max
Comment 13 chuan.bai CLA 2010-06-06 18:13:11 EDT
As I said earlier, jobs can be detected by mpdlistjobs from a terminal, but Eclipse can't detect them. The pop-up window shown when I start the debugging job says "waiting for job information..." and stays stuck there. I have decided to use OpenMPI instead to continue my work, but if there are any further patches I would be pleased to try them. Thank you Greg and Swatch Puppy for your kind support.

Best regards,
Max

(In reply to comment #10)
> Sorry about that. Please try the latest build:
> http://www.eclipse.org/downloads/download.php?file=/tools/ptp/builds/helios/I.I201006051912/ptp-master-4.0.0-I201006051912.zip
> 
> Thanks,
> Greg
Comment 14 Greg Watson CLA 2010-08-19 20:16:44 EDT
Fixed in 4.0 and HEAD.