[mpich-discuss] assertion failed in socksm.c it_plfd->revents when trying mpptest
Gregory Alan Hildstrom
ghildstrom at trustedcs.com
Thu Jun 11 08:18:02 CDT 2009
Aha! No, it does not. cpi fails with the same error, so this is not an
mpptest problem.
Process 0 of 2 is on servernode
Process 1 of 2 is on computenode
Assertion failed in file socksm.c at line 1663: (it_plfd->revents & 0x008) == 0
internal ABORT - process 0
rank 0 in job 2 servernode_54275 caused collective abort of all ranks
exit status of rank 0: return code 1
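A note on reading that assertion: on Linux, 0x008 is the value of POLLERR in
poll.h, so the check presumably trips when poll() reports an error condition
on one of the sockets. This assumes it_plfd points at a struct pollfd, which
is not confirmed here against the MPICH2 source. A minimal Python sketch that
decodes such a mask (decode_revents is a hypothetical helper name, and the
table holds the Linux values only):

```python
# Poll event bits as defined in Linux's <poll.h>; other platforms may differ.
POLL_FLAGS = {
    0x001: "POLLIN",
    0x002: "POLLPRI",
    0x004: "POLLOUT",
    0x008: "POLLERR",
    0x010: "POLLHUP",
    0x020: "POLLNVAL",
}


def decode_revents(revents):
    """Return the names of the poll flags set in a revents bitmask."""
    return [name for bit, name in sorted(POLL_FLAGS.items()) if revents & bit]


# The mask tested by the assertion in the output above.
print(decode_revents(0x008))
```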
I discovered that my /etc/hosts file was incorrect on the computenode. I
changed it to match the servernode's /etc/hosts file and things seem to
be working correctly now. I was able to run cpi and mpptest.
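For anyone who hits the same assertion, here is a minimal Python sketch of how
two /etc/hosts files could be compared for names that map to different
addresses. The helper names, host entries, and IP addresses are made up for
illustration; they are not the actual contents of either node's file.

```python
def parse_hosts(text):
    """Map each hostname or alias in hosts-file text to its IP address."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            mapping.setdefault(name, ip)  # keep the first entry seen
    return mapping


def hosts_mismatches(text_a, text_b):
    """Return {name: (ip_a, ip_b)} for names the two files map differently."""
    a, b = parse_hosts(text_a), parse_hosts(text_b)
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }


# Hypothetical file contents; the addresses are assumptions for this example.
server_hosts = """
127.0.0.1    localhost
192.168.1.10 servernode
192.168.1.11 computenode
"""
compute_hosts = """
127.0.0.1    localhost computenode
192.168.1.10 servernode
"""

print(hosts_mismatches(server_hosts, compute_hosts))
```

In this made-up case the second file maps computenode to the loopback address,
which is one way a node can end up advertising an address that its peer
cannot reach.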
Thank you. -Greg
Pavan Balaji wrote:
>
> The problem with the test suite seems quite weird to me. We run the
> MPICH2 test suite several times in the nightly scripts as well as for
> our local testing every day, and we never saw any such problem.
>
> Does the "cpi" test in your build/examples directory work fine?
>
> $ mpiexec -n 4 ./examples/cpi
>
> -- Pavan
>
> On 06/10/2009 06:07 PM, Gregory Alan Hildstrom wrote:
>> Hello. I am trying to set up a simple performance test using mpptest
>> between two nodes running a stripped down version of RHEL5 x86_64. I
>> am able to mpdboot -n 2 and the subsequent mpdtrace shows the host
>> names of both nodes. I ran mpdringtest with no problems. I'm running
>> mpich2-1.1 and mpptest from perftest-1.4b.
>>
>> I ran "mpirun -np 2 mpptest -logscale" and got the following error
>> output.
>>
>> Assertion failed in file socksm.c at line 1663: (it_plfd->revents &
>> 0x008) == 0
>> internal ABORT - process 0
>> Assertion failed in file socksm.c at line 1663: (it_plfd->revents &
>> 0x008) == 0
>> internal ABORT - process 1
>> rank 0 in job 3 serverhost_37494 caused collective abort of all ranks
>> exit status of rank 0: return code 1
>>
>> I am not sure if this is an mpich issue or an mpptest issue. I tried
>> to make the test suite bundled with mpich2, but it did not build,
>> which is why I am trying mpptest. The test suite first complained
>> about a missing lib.a, which should have been libmpich.a in
>> test/mpid/ch3's makefile. Then it complained about a missing test.h in
>> test/util/timer/timers.c. I commented out the include test.h, which
>> resulted in an undefined reference to "Test_Waitforall".
>>
>> Any suggestions of where to look for problems? I would also appreciate
>> any pointers to other mpich-compatible benchmarks or test suites.
>>
>> Thank you. -Greg
>>
>
--
Gregory Alan Hildstrom
Senior Secure Systems Engineer
Trusted Computer Solutions
10010 San Pedro Ave, Suite 220
San Antonio, TX 78216
Office: 210-340-3151x101
Mobile: 210-413-6082
Fax: 210-340-3568
Email: ghildstrom at trustedcs.com
Email2: hildstrom at yahoo.com