I am trying to install mpich2-1.3.2p1 on my cluster.

I configured with:

   ./configure \
       --prefix=/neem2/huangwei/apps/mpich2-1.3.2p1 \
       --with-pm=hydra \
       --without-hydra-bindlib

When I run the examples, a single process works fine, but with 2 processes I get a bad-termination message even though the answer looks correct:

mpiexec -n 1 cpi

Process 0 of 1 is on neem
pi is approximately 3.1415926544231341, Error is 0.0000000008333410
wall clock time = 0.000265

mpiexec -n 2 cpi

Process 0 of 2 is on neem
Process 1 of 2 is on neem
pi is approximately 3.1415926544231318, Error is 0.0000000008333387
wall clock time = 0.000212
[mpiexec@neem] ONE OF THE PROCESSES TERMINATED BADLY: CLEANING UP
APPLICATION TERMINATED WITH THE EXIT STRING: Terminated (signal 15)

Does anyone know what is wrong here?

Thanks,

Wei

huangwei@ucar.edu
VETS/CISL
National Center for Atmospheric Research
P.O. Box 3000 (1850 Table Mesa Dr.)
Boulder, CO 80307-3000 USA
(303) 497-8924
On May 9, 2011, at 4:27 PM, Joe Vallino wrote:

Thanks Rajeev; I'm still just learning MPI, so the MPI_IN_PLACE error was caused by my own naivete.

The noncompliant code was:

   CALL MPI_GATHER(lc_convex(mygid+1), 1, MPI_INTEGER, lc_convex(mygid+1), &
                   1, MPI_INTEGER, 0, myComm, ierr)

which I replaced with

   if (mygid == 0) then
      CALL MPI_GATHER(MPI_IN_PLACE, 1, MPI_INTEGER, lc_convex(mygid+1), &
                      1, MPI_INTEGER, 0, myComm, ierr)
   else
      CALL MPI_GATHER(lc_convex(mygid+1), 1, MPI_INTEGER, MPI_IN_PLACE, &
                      1, MPI_INTEGER, 0, myComm, ierr)
   end if

since this is the only place where gathering occurs. I assume this is the only way to fix the noncompliant code, but it is certainly not as "pretty". There were a few other areas requiring related fixes, but all is working now.

Thanks for all your help!!
-joe
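[For reference: a minimal, self-contained sketch of this in-place gather pattern, assuming one integer gathered per rank and root 0. The buffer name vals and the 10*myid payload are hypothetical stand-ins, not taken from the TOMS code. Note that per the standard the receive arguments are significant only at the root, so the non-root branch need not pass MPI_IN_PLACE anywhere.]

   ! Sketch of gathering in place; compile with, e.g., mpif90.
   program gather_in_place
      use mpi
      implicit none
      integer :: ierr, myid, nprocs
      integer, allocatable :: vals(:)

      call MPI_INIT(ierr)
      call MPI_COMM_RANK(MPI_COMM_WORLD, myid, ierr)
      call MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)

      allocate(vals(nprocs))
      vals(myid+1) = 10*myid      ! each rank's contribution, already "in place"

      if (myid == 0) then
         ! Root: sendbuf is MPI_IN_PLACE; its own contribution must already
         ! sit at vals(myid+1), because MPI_GATHER will not touch that slot.
         call MPI_GATHER(MPI_IN_PLACE, 1, MPI_INTEGER, vals, &
                         1, MPI_INTEGER, 0, MPI_COMM_WORLD, ierr)
      else
         ! Non-root: send normally; the receive arguments are ignored here.
         call MPI_GATHER(vals(myid+1), 1, MPI_INTEGER, vals, &
                         1, MPI_INTEGER, 0, MPI_COMM_WORLD, ierr)
      end if

      if (myid == 0) print *, 'gathered:', vals
      call MPI_FINALIZE(ierr)
   end program gather_in_place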
From: "Rajeev Thakur" <thakur@mcs.anl.gov>
To: mpich-discuss@mcs.anl.gov
Sent: Monday, May 9, 2011 1:52:40 PM
Subject: Re: [mpich-discuss] MPICH2 internal errors on Win 7 x64

It probably means that the data that the root sends to itself in MPI_Gather may not already be in the right location in recvbuf.

Rajeev
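[Concretely, under that reading: with MPI_IN_PLACE the root sends nothing to itself, so its contribution is whatever already sits in its slot of recvbuf. A hypothetical fragment, with buf and my_value standing in for the real variables:]

   ! With MPI_IN_PLACE, MPI_GATHER leaves the root's slot of recvbuf
   ! untouched, so it must hold the root's contribution before the call.
   if (mygid == 0) then
      buf(1) = my_value           ! omit this and buf(1) keeps a stale value
      call MPI_GATHER(MPI_IN_PLACE, 1, MPI_INTEGER, buf, &
                      1, MPI_INTEGER, 0, myComm, ierr)
   else
      call MPI_GATHER(my_value, 1, MPI_INTEGER, buf, &
                      1, MPI_INTEGER, 0, myComm, ierr)
   end if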
On May 9, 2011, at 12:41 PM, Joe Vallino wrote:

> Rajeev, et al.
>
> The use of MPI_IN_PLACE did allow MPICH2 to run w/o errors (thanks). Interestingly, the test problem now generates a different (and incorrect) answer from what it should. Intel MPI also produces the same incorrect answer when using MPI_IN_PLACE, but produces the correct result when violating the MPI 2.2 standard regarding identical sbuf and rbuf.
>
> Do any ideas pop to mind for this situation? Is anything else magical about MPI_IN_PLACE?
>
> cheers
> -joe
>
> From: "Rajeev Thakur" <thakur@mcs.anl.gov>
> To: mpich-discuss@mcs.anl.gov
> Sent: Monday, May 9, 2011 10:58:33 AM
> Subject: Re: [mpich-discuss] MPICH2 internal errors on Win 7 x64
>
> The error check was added in a recent version of MPICH2.
>
> Rajeev
>
> On May 9, 2011, at 9:50 AM, Joe Vallino wrote:
>
> > Thanks Rajeev. I'll take a look at that, but I wonder why the code runs fine on Intel MPI, which is based on MPICH2.
> >
> > cheers,
> > -joe
> >
> > From: "Rajeev Thakur" <thakur@mcs.anl.gov>
> > To: mpich-discuss@mcs.anl.gov
> > Sent: Monday, May 9, 2011 9:55:11 AM
> > Subject: Re: [mpich-discuss] MPICH2 internal errors on Win 7 x64
> >
> > The code is passing the same buffer as sendbuf and recvbuf to MPI_Gather, which is not allowed. You need to use MPI_IN_PLACE as described in the MPI standard (see MPI 2.2 for easy reference).
> >
> > Rajeev
> >
> > On May 8, 2011, at 6:46 PM, Joe Vallino wrote:
> >
> > > Hi,
> > >
> > > I've installed MPICH2 (1.3.2p1, Windows EM64T binaries) on a Windows 7 x64 machine (2 sockets, 4 cores each). MPICH2 works fine for simple tests, but when I attempt a more complex use of MPI, I get various internal MPI errors, such as:
> > >
> > > Fatal error in PMPI_Gather: Invalid buffer pointer, error stack:
> > > PMPI_Gather(863): MPI_Gather(sbuf=0000000000BC8040, scount=1, MPI_INTEGER, rbuf=0000000000BC8040, rcount=1, MPI_INTEGER, root=0, comm=0x84000004) failed
> > > PMPI_Gather(806): Buffers must not be aliased
> > >
> > > job aborted:
> > > rank: node: exit code[: error message]
> > > 0: ECO37: 1: process 0 exited without calling finalize
> > > 1: ECO37: 123
> > >
> > > The errors occur regardless of whether I use the x32 or x64 build.
> > >
> > > The code I'm trying to run is pVTDIRECT (see TOMS package 897 on netlib.org), and the above errors are produced by running the simple test routine that comes with the package. Since the package can be easily compiled and run, this should let others confirm the problem, if anyone is feeling so motivated :)
> > >
> > > To check whether the problem lies in the MPICH2 build, I installed a commercial MPI build (csWMPI II), which works fine with the TOMS package, so this would indicate the problem is with MPICH2.
> > >
> > > Since the TOMS package uses Fortran 95, and I'm using the latest Intel ifort compiler with VS2008, I tried to build MPICH2 from the 1.3.2p1 source, but after banging my head on that for a day w/o success, I decided to see if anyone has any suggestions here (or if anyone can confirm the problem with the TOMS package under the Windows MPICH2 release).
> > >
> > > - Can anyone point me to a Win x64 build that used newer versions of Intel Fortran (v11 or 12) and/or more recent releases of the Windows SDK, which seem to be the main wild cards in the build process?
> > >
> > > - I will continue trying to build MPICH2 for Windows, but I suspect I will not succeed given my *cough* skills.
> > >
> > > Thanks!
> > > -joe
href="https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss">https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss</a><br></div></div>_______________________________________________<br>mpich-discuss mailing list<br><a href="mailto:mpich-discuss@mcs.anl.gov">mpich-discuss@mcs.anl.gov</a><br><a href="https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss">https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss</a><br></div></span></blockquote></div><br></body></html>