| Summary: | [JAXB] resource manager not picking right application to run | ||
|---|---|---|---|
| Product: | [Tools] PTP | Reporter: | Praful Hebbar <prafhebb> |
| Component: | RM.PE | Assignee: | Greg Watson <g.watson> |
| Status: | RESOLVED FIXED | QA Contact: | |
| Severity: | normal | ||
| Priority: | P3 | CC: | jalameda |
| Version: | 5.0.1 | ||
| Target Milestone: | 5.0.3 | ||
| Hardware: | PC | ||
| OS: | Windows XP | ||
| Whiteboard: | |||
Description
Praful Hebbar

Another problem also occurs when creating a new launch configuration. The first time the application is launched, the executable is not found. However, if the launch configuration is opened and saved again, it works on the second launch. I believe that both problems are caused by the JAXB variable maps not getting updated correctly.

First reported by Jay Alameda:

Starting from scratch – creating new resource managers (pbs-batch and pbs-interactive), and then creating new runtime configs, one for each one, being careful to hit “apply” at the resources tab and then at the application tab, which should “set” everything – I noticed that the first time through (this one interactive), the command line is not set:

    This job will be charged to account: adn
    qsub: waiting for job 4085820.abem5.ncsa.uiuc.edu to start
    qsub: job 4085820.abem5.ncsa.uiuc.edu ready
    ----------------------------------------
    Begin Torque Prologue (Tue Sep 13 13:02:20 2011)
    Job ID: 4085820
    Username: jalameda
    Group: acp
    Job Name: STDIN
    Limits: mem=8gb,ncpus=1,neednodes=1,nodes=1,walltime=00:10:00
    Job Queue: lincoln_debug
    Account: lincoln.adn
    Nodes: abe1208
    /dev/sda2 on /tmp type ext2 (rw)
    End Torque Prologue
    ----------------------------------------
    mpirun -np 4
    [jalameda@abe1208 ~]$ mpirun -np 4
    --------------------------------------------------------------------------
    No executable was specified on the mpirun command line.
    Aborting.
    --------------------------------------------------------------------------
    [jalameda@abe1208 ~]$
    [jalameda@abe1208 ~]$

Killing the job, going back to the config's apps tab, hitting “apply” again, and rerunning appears to make things OK:

    This job will be charged to account: adn
    qsub: waiting for job 4085821.abem5.ncsa.uiuc.edu to start
    qsub: job 4085821.abem5.ncsa.uiuc.edu ready
    ----------------------------------------
    Begin Torque Prologue (Tue Sep 13 13:06:24 2011)
    Job ID: 4085821
    Username: jalameda
    Group: acp
    Job Name: STDIN
    Limits: mem=8gb,ncpus=1,neednodes=1,nodes=1,walltime=00:10:00
    Job Queue: lincoln_debug
    Account: lincoln.adn
    Nodes: abe1208
    /dev/sda2 on /tmp type ext2 (rw)
    End Torque Prologue
    ----------------------------------------
    mpirun -np 4 /u/ncsa/jalameda/mpi/ring
    [jalameda@abe1208 ~]$ mpirun -np 4 /u/ncsa/jalameda/mpi/ring
    Master: end of trip 1 of 1: after receiving passed_num=4 (should be =trip*numprocs=4) from source=3
    [jalameda@abe1208 ~]$

This may also be the case for batch, but I’m not sure. I just tried again to repeat the steps, but moving from apps to resources and then viewing the script seems to show the correct command line.

I have committed a temporary fix to this problem that should be available in the next PTP build. Please test prior to COB SR1 RC4 (9/14) if possible. I will have to contact Al Rossi to work out a permanent fix.

Greg's temporary fix seems to have worked - my quick test today did not show the issue.

I have committed Al's fix to ptp_5_0 and HEAD.
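
The cause suspected in the description, a variable map that has not been updated before the first launch, would be consistent with the truncated `mpirun -np 4` command in the first log. The following is a minimal Python sketch under that assumption only. The template, the variable names, and the `resolve_command` helper are hypothetical illustrations and are not taken from PTP or its JAXB resource manager code.

```python
from string import Template

# Hypothetical command template; the placeholder names are assumptions,
# not the variable names used by the PTP JAXB resource manager.
COMMAND_TEMPLATE = Template("mpirun -np ${num_procs} ${executable_path}")


def resolve_command(variables):
    """Fill the template from a variable map; unset variables resolve to ''."""
    defaults = {"num_procs": "", "executable_path": ""}
    return COMMAND_TEMPLATE.substitute({**defaults, **variables}).strip()


# First launch (assumed): the value from the resources tab reached the map,
# but the value from the application tab did not.
stale_map = {"num_procs": "4"}
first_launch = resolve_command(stale_map)

# Second launch (assumed): reopening the configuration and applying again
# puts the executable path into the map.
updated_map = {**stale_map, "executable_path": "/u/ncsa/jalameda/mpi/ring"}
second_launch = resolve_command(updated_map)

print(first_launch)
print(second_launch)
```

In this sketch the first resolved command has no executable and the second one does, which mirrors the two logs above. It does not show where the real map update is skipped.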