Pavan,<br><br>Thank you!<br><br>I tried reconfiguring with --with-hydra-bss=rsh,ssh,fork,slurm, but I got a segmentation fault at run time. I'll try --with-hydra-bss=rsh,ssh,fork,slurm,ll,lsf,sge,pbs,none,persist later.<br>
<br>Below, I first run with "-bootstrap rsh -bootstrap-exec /usr/bin/bprsh", and it looks great. Then I set the HYDRA_BOOTSTRAP and HYDRA_BOOTSTRAP_EXEC environment variables and run without the bootstrap options. It still runs, but a bunch of garbage also spills out. Are these normal messages, and can I somehow silence them?<br>
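<br>Until the environment variables behave, a small shell wrapper might serve as a stopgap. This is only a sketch: the function name mpiexec_bprsh is made up for illustration, and it does nothing more than pass the two command-line options that already work for me.<br>

```shell
# Hypothetical helper (the name is an assumption, not part of MPICH2):
# forwards all arguments to mpiexec, always selecting the rsh bootstrap
# with /usr/bin/bprsh as the launcher executable.
mpiexec_bprsh() {
    mpiexec -bootstrap rsh -bootstrap-exec /usr/bin/bprsh "$@"
}
```

With that defined in the shell startup file, "mpiexec_bprsh -n 12 -f machinefile ./cpi" would expand to the first command shown in the log.<br>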
<br>[root@flatline examples]# mpiexec -bootstrap rsh -bootstrap-exec /usr/bin/bprsh -n 12 -f machinefile ./cpi<br>Process 4 of 12 is on n1<br>Process 8 of 12 is on n2<br>Process 0 of 12 is on n0<br>Process 5 of 12 is on n1<br>
Process 9 of 12 is on n2<br>Process 1 of 12 is on n0<br>Process 6 of 12 is on n1<br>Process 10 of 12 is on n2<br>Process 2 of 12 is on n0<br>Process 7 of 12 is on n1<br>Process 11 of 12 is on n2<br>Process 3 of 12 is on n0<br>
pi is approximately 3.1415926544231256, Error is 0.0000000008333325<br>wall clock time = 0.004450<br>[root@flatline examples]# export HYDRA_BOOTSTRAP=rsh<br>[root@flatline examples]# export HYDRA_BOOTSTRAP_EXEC=/usr/bin/bprsh<br>
[root@flatline examples]# mpiexec -n 12 -f machinefile ./cpi<br>Process 1 of 12 is on n0<br>Process 4 of 12 is on n1<br>Process 8 of 12 is on n2<br>Process 2 of 12 is on n0<br>Process 6 of 12 is on n1<br>Process 9 of 12 is on n2<br>
Process 3 of 12 is on n0<br>Process 7 of 12 is on n1<br>Process 10 of 12 is on n2<br>Process 0 of 12 is on n0<br>Process 5 of 12 is on n1<br>Process 11 of 12 is on n2<br>pi is approximately 3.1415926544231256, Error is 0.0000000008333325<br>
wall clock time = 0.003011<br>*** glibc detected *** /home/lgu/mpich2-install/bin/hydra_pmi_proxy: munmap_chunk(): invalid pointer: 0x00007fff63b6ec5d ***<br>*** glibc detected *** /home/lgu/mpich2-install/bin/hydra_pmi_proxy: munmap_chunk(): invalid pointer: 0x00007fffcf689c5d ***<br>
*** glibc detected *** /home/lgu/mpich2-install/bin/hydra_pmi_proxy: munmap_chunk(): invalid pointer: 0x00007fff94651c5d ***<br>======= Backtrace: =========<br>======= Backtrace: =========<br>/lib64/libc.so.6(cfree+0x166)[0x3ef10729d6]<br>
/home/lgu/mpich2-install/bin/hydra_pmi_proxy[0x41a52b]<br>/home/lgu/mpich2-install/bin/hydra_pmi_proxy[0x404ad5]<br>/lib64/libc.so.6(__libc_start_main+0xf4)[0x3ef101d994]<br>/home/lgu/mpich2-install/bin/hydra_pmi_proxy[0x403ab9]<br>
======= Memory map: ========<br>/lib64/libc.so.6(cfree+0x166)[0x3ef10729d6]<br>/home/lgu/mpich2-install/bin/hydra_pmi_proxy[0x41a52b]<br>/home/lgu/mpich2-install/bin/hydra_pmi_proxy[0x404ad5]<br>/lib64/libc.so.6(__libc_start_main+0xf4)[0x3ef101d994]<br>
/home/lgu/mpich2-install/bin/hydra_pmi_proxy[0x403ab9]<br>======= Memory map: ========<br>00400000-0043c000 r-xp 00000000 00:16 2810395 /home/lgu/mpich2-install/bin/hydra_pmi_proxy<br>0063c000-0063e000 rw-p 0003c000 00:16 2810395 /home/lgu/mpich2-install/bin/hydra_pmi_proxy<br>
0063e000-00662000 rw-p 0063e000 00:00 0<br>0c13b000-0c15c000 rw-p 0c13b000 00:00 0 [heap]<br>3ef0c00000-3ef0c1c000 r-xp 00000000 00:12 3430 /lib64/ld-2.5.so<br>
3ef0e1b000-3ef0e1c000 r--p 0001b000 00:12 3430 /lib64/ld-2.5.so<br>3ef0e1c000-3ef0e1d000 rw-p 0001c000 00:12 3430 /lib64/ld-2.5.so<br>
3ef1000000-3ef114e000 r-xp 00000000 00:12 813 /lib64/libc.so.6<br>3ef114e000-3ef134e000 ---p 0014e000 00:12 813 /lib64/libc.so.6<br>3ef134e000-3ef1352000 r--p 0014e000 00:12 813 /lib64/libc.so.6<br>
3ef1352000-3ef1353000 rw-p 00152000 00:12 813 /lib64/libc.so.6<br>3ef1353000-3ef1358000 rw-p 3ef1353000 00:00 0<br>3ef1400000-3ef1482000 r-xp 00000000 00:12 859 /lib64/libm.so.6<br>
3ef1482000-3ef1681000 ---p 00082000 00:12 859 /lib64/libm.so.6<br>3ef1681000-3ef1682000 r--p 00081000 00:12 859 /lib64/libm.so.6<br>3ef1682000-3ef1683000 rw-p 00082000 00:12 859 /lib64/libm.so.6<br>
3ef1800000-3ef1805000 r-xp 00000000 00:12 5939 /usr/lib64/libnuma.so.1<br>3ef1805000-3ef1a04000 ---p 00005000 00:12 5939 /usr/lib64/libnuma.so.1<br>3ef1a04000-3ef1a05000 rw-p 00004000 00:12 5939 /usr/lib64/libnuma.so.1<br>
3ef1c00000-3ef1c16000 r-xp 00000000 00:12 1098 /lib64/libpthread.so.0<br>3ef1c16000-3ef1e15000 ---p 00016000 00:12 1098 /lib64/libpthread.so.0<br>3ef1e15000-3ef1e16000 r--p 00015000 00:12 1098 /lib64/libpthread.so.0<br>
3ef1e16000-3ef1e17000 rw-p 00016000 00:12 1098 /lib64/libpthread.so.0<br>3ef1e17000-3ef1e1b000 rw-p 3ef1e17000 00:00 0<br>3ef2000000-3ef2014000 r-xp 00000000 00:12 6497 /usr/lib64/libz.so.1<br>
3ef2014000-3ef2213000 ---p 00014000 00:12 6497 /usr/lib64/libz.so.1<br>3ef2213000-3ef2214000 rw-p 00013000 00:12 6497 /usr/lib64/libz.so.1<br>3ef2400000-3ef2407000 r-xp 00000000 00:12 1094 /lib64/librt.so.1<br>
3ef2407000-3ef2607000 ---p 00007000 00:12 1094 /lib64/librt.so.1<br>3ef2607000-3ef2608000 r--p 00007000 00:12 1094 /lib64/librt.so.1<br>3ef2608000-3ef2609000 rw-p 00008000 00:12 1094 /lib64/librt.so.1<br>
3ef3c00000-3ef3c15000 r-xp 00000000 00:12 1106 /lib64/libnsl.so.1<br>3ef3c15000-3ef3e14000 ---p 00015000 00:12 1106 /lib64/libnsl.so.1<br>3ef3e14000-3ef3e15000 r--p 00014000 00:12 1106 /lib64/libnsl.so.1<br>
3ef3e15000-3ef3e16000 rw-p 00015000 00:12 1106 /lib64/libnsl.so.1<br>3ef3e16000-3ef3e18000 rw-p 3ef3e16000 00:00 0<br>3ef9400000-3ef9533000 r-xp 00000000 00:12 6495 /usr/lib64/libxml2.so.2<br>
3ef9533000-3ef9733000 ---p 00133000 00:12 6495 /usr/lib64/libxml2.so.2<br>3ef9733000-3ef973c000 rw-p 00133000 00:12 6495 /usr/lib64/libxml2.so.2<br>3ef973c000-3ef973d000 rw-p 3ef973c000 00:00 0<br>
3efbc00000-3efbc0d000 r-xp 00000000 00:12 4377 /lib64/libgcc_s.so.1<br>3efbc0d000-3efbe0d000 ---p 0000d000 00:12 4377 /lib64/libgcc_s.so.1<br>3efbe0d000-3efbe0e000 rw-p 0000d000 00:12 4377 /lib64/libgcc_s.so.1<br>
2abbba01e000-2abbba020000 rw-p 2abbba01e000 00:00 0<br>2abbba020000-2abbba024000 r-xp 00000000 00:16 2647315 /home/lgu/mpich2-install/lib/libmpl.so.1.0.0<br>2abbba024000-2abbba223000 ---p 00004000 00:16 2647315 /home/lgu/mpich2-install/lib/libmpl.so.1.0.0<br>
2abbba223000-2abbba224000 rw-p 00003000 00:16 2647315 /home/lgu/mpich2-install/lib/libmpl.so.1.0.0<br>2abbba243000-2abbba246000 rw-p 2abbba243000 00:00 0<br>2abbba246000-2abbba248000 r-xp 00000000 00:12 830 /lib64/libdl.so.2<br>
2abbba248000-2abbba448000 ---p 00002000 00:12 830 /lib64/libdl.so.2<br>2abbba448000-2abbba449000 r--p 00002000 00:12 830 /lib64/libdl.so.2<br>2abbba449000-2abbba44a000 rw-p 00003000 00:12 830 /lib64/libdl.so.2<br>
2abbba44a000-2abbba44c000 rw-p 2abbba44a000 00:00 0<br>2abbba44c000-2abbba481000 r--s 00000000 00:12 3970 /var/run/nscd/db52YQK7 (deleted)<br>7fff9463d000-7fff94652000 rw-p 7ffffffe9000 00:00 0 [stack]<br>7fff94693000-7fff94696000 r-xp 7fff94693000 00:00 0 [vdso]<br>
ffffffffff600000-ffffffffffe00000 ---p 00000000 00:00 0 [vsyscall]<br>======= Backtrace: =========<br>/lib64/libc.so.6(cfree+0x166)[0x3ef10729d6]<br>/home/lgu/mpich2-install/bin/hydra_pmi_proxy[0x41a52b]<br>
/home/lgu/mpich2-install/bin/hydra_pmi_proxy[0x404ad5]<br>/lib64/libc.so.6(__libc_start_main+0xf4)[0x3ef101d994]<br>/home/lgu/mpich2-install/bin/hydra_pmi_proxy[0x403ab9]<br>======= Memory map: ========<br>00400000-0043c000 r-xp 00000000 00:16 2810395 /home/lgu/mpich2-install/bin/hydra_pmi_proxy<br>
0063c000-0063e000 rw-p 0003c000 00:16 2810395 /home/lgu/mpich2-install/bin/hydra_pmi_proxy<br>0063e000-00662000 rw-p 0063e000 00:00 0<br>09275000-09296000 rw-p 09275000 00:00 0 [heap]<br>
3ef0c00000-3ef0c1c000 r-xp 00000000 00:12 3477 /lib64/ld-2.5.so<br>3ef0e1b000-3ef0e1c000 r--p 0001b000 00:12 3477 /lib64/ld-2.5.so<br>
3ef0e1c000-3ef0e1d000 rw-p 0001c000 00:12 3477 /lib64/ld-2.5.so<br>3ef1000000-3ef114e000 r-xp 00000000 00:12 813 /lib64/libc.so.6<br>3ef114e000-3ef134e000 ---p 0014e000 00:12 813 /lib64/libc.so.6<br>
3ef134e000-3ef1352000 r--p 0014e000 00:12 813 /lib64/libc.so.6<br>3ef1352000-3ef1353000 rw-p 00152000 00:12 813 /lib64/libc.so.6<br>3ef1353000-3ef1358000 rw-p 3ef1353000 00:00 0<br>
3ef1400000-3ef1482000 r-xp 00000000 00:12 859 /lib64/libm.so.6<br>3ef1482000-3ef1681000 ---p 00082000 00:12 859 /lib64/libm.so.6<br>3ef1681000-3ef1682000 r--p 00081000 00:12 859 /lib64/libm.so.6<br>
3ef1682000-3ef1683000 rw-p 00082000 00:12 859 /lib64/libm.so.6<br>3ef1800000-3ef1805000 r-xp 00000000 00:12 5898 /usr/lib64/libnuma.so.1<br>3ef1805000-3ef1a04000 ---p 00005000 00:12 5898 /usr/lib64/libnuma.so.1<br>
3ef1a04000-3ef1a05000 rw-p 00004000 00:12 5898 /usr/lib64/libnuma.so.1<br>3ef1c00000-3ef1c16000 r-xp 00000000 00:12 1115 /lib64/libpthread.so.0<br>3ef1c16000-3ef1e15000 ---p 00016000 00:12 1115 /lib64/libpthread.so.0<br>
3ef1e15000-3ef1e16000 r--p 00015000 00:12 1115 /lib64/libpthread.so.0<br>3ef1e16000-3ef1e17000 rw-p 00016000 00:12 1115 /lib64/libpthread.so.0<br>3ef1e17000-3ef1e1b000 rw-p 3ef1e17000 00:00 0<br>
3ef2000000-3ef2014000 r-xp 00000000 00:12 5896 /usr/lib64/libz.so.1<br>3ef2014000-3ef2213000 ---p 00014000 00:12 5896 /usr/lib64/libz.so.1<br>3ef2213000-3ef2214000 rw-p 00013000 00:12 5896 /usr/lib64/libz.so.1<br>
3ef2400000-3ef2407000 r-xp 00000000 00:12 1111 /lib64/librt.so.1<br>3ef2407000-3ef2607000 ---p 00007000 00:12 1111 /lib64/librt.so.1<br>3ef2607000-3ef2608000 r--p 00007000 00:12 1111 /lib64/librt.so.1<br>
3ef2608000-3ef2609000 rw-p 00008000 00:12 1111 /lib64/librt.so.1<br>3ef3c00000-3ef3c15000 r-xp 00000000 00:12 1123 /lib64/libnsl.so.1<br>3ef3c15000-3ef3e14000 ---p 00015000 00:12 1123 /lib64/libnsl.so.1<br>
3ef3e14000-3ef3e15000 r--p 00014000 00:12 1123 /lib64/libnsl.so.1<br>3ef3e15000-3ef3e16000 rw-p 00015000 00:12 1123 /lib64/libnsl.so.1<br>3ef3e16000-3ef3e18000 rw-p 3ef3e16000 00:00 0<br>
3ef9400000-3ef9533000 r-xp 00000000 00:12 5894 /usr/lib64/libxml2.so.2<br>3ef9533000-3ef9733000 ---p 00133000 00:12 5894 /usr/lib64/libxml2.so.2<br>3ef9733000-3ef973c000 rw-p 00133000 00:12 5894 /usr/lib64/libxml2.so.2<br>
3ef973c000-3ef973d000 rw-p 3ef973c000 00:00 0<br>3efbc00000-3efbc0d000 r-xp 00000000 00:12 4435 /lib64/libgcc_s.so.1<br>3efbc0d000-3efbe0d000 ---p 0000d000 00:12 4435 /lib64/libgcc_s.so.1<br>
3efbe0d000-3efbe0e000 rw-p 0000d000 00:12 4435 /lib64/libgcc_s.so.1<br>2afc5843f000-2afc58441000 rw-p 2afc5843f000 00:00 0<br>2afc58441000-2afc58445000 r-xp 00000000 00:16 2647315 /home/lgu/mpich2-install/lib/libmpl.so.1.0.0<br>
2afc58445000-2afc58644000 ---p 00004000 00:16 2647315 /home/lgu/mpich2-install/lib/libmpl.so.1.0.0<br>2afc58644000-2afc58645000 rw-p 00003000 00:16 2647315 /home/lgu/mpich2-install/lib/libmpl.so.1.0.0<br>
2afc58664000-2afc58667000 rw-p 2afc58664000 00:00 0<br>2afc58667000-2afc58669000 r-xp 00000000 00:12 830 /lib64/libdl.so.2<br>2afc58669000-2afc58869000 ---p 00002000 00:12 830 /lib64/libdl.so.2<br>
2afc58869000-2afc5886a000 r--p 00002000 00:12 830 /lib64/libdl.so.2<br>2afc5886a000-2afc5886b000 rw-p 00003000 00:12 830 /lib64/libdl.so.2<br>2afc5886b000-2afc5886d000 rw-p 2afc5886b000 00:00 0<br>
2afc5886d000-2afc588a2000 r--s 00000000 00:12 4024 /var/run/nscd/dbrOewAG (deleted)<br>7fffcf675000-7fffcf68a000 rw-p 7ffffffe9000 00:00 0 [stack]<br>7fffcf6d2000-7fffcf6d5000 r-xp 7fffcf6d2000 00:00 0 [vdso]<br>
ffffffffff600000-ffffffffffe00000 ---p 00000000 00:00 0 [vsyscall]<br>bpsh: Child process exited abnormally.<br>[mpiexec@flatline] HYDT_bscu_wait_for_completion (./tools/bootstrap/utils/bscu_wait.c:70): one of the processes terminated badly; aborting<br>
[mpiexec@flatline] HYDT_bsci_wait_for_completion (./tools/bootstrap/src/bsci_wait.c:18): launcher returned error waiting for completion<br>[mpiexec@flatline] HYD_pmci_wait_for_completion (./pm/pmiserv/pmiserv_pmci.c:216): launcher returned error waiting for completion<br>
[mpiexec@flatline] main (./ui/mpich/mpiexec.c:404): process manager error waiting for completion<br>[root@flatline examples]#<br><br><br><br><div class="gmail_quote">On Fri, Feb 18, 2011 at 11:47 AM, Pavan Balaji <span dir="ltr"><<a href="mailto:balaji@mcs.anl.gov">balaji@mcs.anl.gov</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><br>
Setting the HYDRA_BOOTSTRAP and HYDRA_BOOTSTRAP_EXEC environment variables should work. What error are you seeing?<br>
<br>
Alternatively, you can reconfigure MPICH2 with --with-hydra-bss=rsh,ssh,fork,slurm,ll,lsf,sge,pbs,none,persist<br>
<br>
This will reprioritize the launchers to give a higher priority to rsh.<br>
<br>
-- Pavan<div><div></div><div class="h5"><br>
<br>
On 02/18/2011 10:39 AM, Limin Gu wrote:<br>
</div></div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><div><div></div><div class="h5">
Hi,<br>
<br>
I have successfully built and run MPICH2 1.3.2 on our cluster. But since<br>
we would rather use bprsh (rsh-like) between nodes, I have to specify the<br>
bootstrap on the mpiexec command line, like this:<br>
<br>
mpiexec -bootstrap rsh -bootstrap-exec /usr/bin/bprsh<br>
<br>
It works, but is there a way to make rsh the default<br>
bootstrap in some config file, so I don't have to specify it on every<br>
mpiexec command?<br>
<br>
I have tried setting the HYDRA_BOOTSTRAP and HYDRA_BOOTSTRAP_EXEC environment<br>
variables, but that didn't work.<br>
<br>
Thank you!<br>
<br>
Limin<br>
<br>
<br>
<br></div></div>
_______________________________________________<br>
mpich-discuss mailing list<br>
<a href="mailto:mpich-discuss@mcs.anl.gov" target="_blank">mpich-discuss@mcs.anl.gov</a><br>
<a href="https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss" target="_blank">https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss</a><br>
</blockquote>
<br>
-- <br>
Pavan Balaji<br>
<a href="http://www.mcs.anl.gov/%7Ebalaji" target="_blank">http://www.mcs.anl.gov/~balaji</a><br>
</blockquote></div><br>