itaps-parallel Orphan processes beware
Tim Tautges
tautges at mcs.anl.gov
Wed Oct 29 10:09:11 CDT 2008
Hi all,
The load problem on the mesh machine was caused by orphan processes that
were not killed along with their job. So, if a job you're running dies or
you have to kill it, make sure all of its processes are actually gone
(ps -ef | grep <username>). Note that those processes don't show up in the
output of 'top'.
Also, please keep the jobs to 4 procs most of the time, or check with
others if you need larger jobs.
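
For anyone who wants something a little more targeted than the grep, here is
a rough sketch of one way to spot likely orphans. It assumes a Linux-style
ps and that orphaned processes get reparented to PID 1, which is common but
not guaranteed (some systems hand them to a per-user session manager
instead). The function names filter_orphans and list_orphans are made up
for this example.

```shell
# Keep only "PID PPID COMMAND" lines whose parent PID is 1.
# Assumption: orphans are reparented to PID 1 on this machine.
filter_orphans() {
    awk '$2 == 1'
}

# List a user's processes (default: the current user) and keep the ones
# that look orphaned. The trailing '=' suppresses the ps header line.
list_orphans() {
    ps -u "${1:-$(id -un)}" -o pid=,ppid=,args= | filter_orphans
}
```

Running list_orphans prints candidates only; nothing is killed. Look the
list over first, since legitimate background processes can also have PID 1
as their parent, then remove the stale ones by hand with kill <pid>.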
Thanks.
- tim
--
================================================================
"You will keep in perfect peace him whose mind is
steadfast, because he trusts in you." Isaiah 26:3
Tim Tautges               Argonne National Laboratory
(tautges at mcs.anl.gov)  (telecommuting from UW-Madison)
phone: (608) 263-8485     1500 Engineering Dr.
fax: (608) 263-4499       Madison, WI 53706