[MPICH] Intel 10.1

Mike Colonno Mike.Colonno at spacex.com
Wed Feb 13 21:38:01 CST 2008


   Simple cases, like the test cases that come with the MPICH distributions, work fine for any n. Any more sophisticated code works for small n (8 is typical) but fails for n greater than that. This is independent of how the processes are distributed (number of processes per server). I have tried several different versions of MPICH and MPICH2, including the one you mentioned, but all give the same result. This leads me to believe the issue lies with the compiler(s). We're using x64 servers, 2 dual-core Xeons per machine, RHEL 4.3 and 4.5 (upgrading the OS had no effect). 
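
A minimal collective-communication test along these lines can help separate launcher problems from MPI-library problems (a sketch, not code from this thread; it assumes a working mpicc in the PATH):

```c
/* Each rank contributes its rank number to an Allreduce and every
 * rank checks the resulting sum against 0 + 1 + ... + (size - 1).
 * Build:  mpicc allreduce_test.c -o allreduce_test
 * Run:    mpiexec -n <N> ./allreduce_test   for increasing N */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size, sum = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    MPI_Allreduce(&rank, &sum, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD);

    if (sum == size * (size - 1) / 2)
        printf("rank %d of %d: OK\n", rank, size);
    else
        printf("rank %d of %d: BAD sum %d\n", rank, size, sum);

    MPI_Finalize();
    return 0;
}
```

If this passes at n = 8 but aborts at n = 9, the problem is in the MPI build rather than the application code.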
 
   Thanks,
   ~Mike C.


________________________________

From: Pavan Balaji [mailto:balaji at mcs.anl.gov]
Sent: Wed 2/13/2008 7:22 PM
To: Mike Colonno
Cc: mpich-discuss at mcs.anl.gov
Subject: Re: [MPICH] Intel 10.1



Mike,

We internally use Intel 9.1. We are still waiting on a site license for
Intel 10.1. Are you using mpich2-1.0.6p1? Also, what hardware platform
are you running these tests on?

Can you try a simple test and see if it works fine:

  $ mpiexec hostname

  -- Pavan

On 02/13/2008 08:55 PM, Mike Colonno wrote:
>    Hi folks ~
> 
>    Has anyone out there had any luck building MPICH and/or MPICH2 (and subsequent MPI applications) with the Intel 10.1 compilers (C++ and Fortran)? If so, please forward me the details (flags, versions, OS, etc.). All of my MPI programs suffer "collective abort" errors, killed by signal 9 or 11 (with essentially no repeatable pattern between the two), which are likely caused by segmentation faults behind the scenes. These codes worked great on older hardware / compiler versions, and despite a large number of experiments I have been unable to find the secret. Compiled in sequential mode (without MPI), they all function well.
> 
>    Thanks,
>    ~Mike C.
> 
>

--
Pavan Balaji
http://www.mcs.anl.gov/~balaji
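
For anyone hitting the same problem, a typical MPICH2 build against the Intel compilers looks roughly like this (a sketch; the install prefix and process count are illustrative assumptions, not settings confirmed in this thread):

```shell
# Point MPICH2's configure at the Intel C/C++/Fortran compilers.
# The prefix is an example path; adjust to your site layout.
./configure CC=icc CXX=icpc F77=ifort F90=ifort \
    --prefix=/opt/mpich2-intel
make
make install

# Sanity checks: the launcher first, then a real MPI program
# (cpi.c ships in the MPICH2 examples directory).
mpiexec -n 16 hostname
mpicc examples/cpi.c -o cpi && mpiexec -n 16 ./cpi
```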




