[mpich-discuss] Problems Running WRF on Ubuntu 11.10, MPICH2

Sukanta Basu sukanta.basu at gmail.com
Sat Feb 11 19:10:06 CST 2012


Hello Preeti, Gus, and Anthony,

I would like to thank you all for your help and suggestions!

I have decided to go with the GNU compilers. Now, parallel WRF (using
MPICH2 with the Hydra process manager) runs nicely on the Cray CX1.

Cheers,
Sukanta

On Thu, Feb 9, 2012 at 9:51 AM, Preeti <preeti at csa.iisc.ernet.in> wrote:
> Hi Sukanta,
>
> I am not sure what the problem may be, but you are right: 24 GB is more
> than enough for your domain size. I have been running larger WRF domains
> in dmpar mode on various architectures, including Cray, for quite some
> time. For me, it suffices to have the following in my .bashrc:
>
> unset limits
> export MP_STACK_SIZE=64000000
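>
> If your shell is bash, a rough equivalent would be the following sketch
> (the size is just what happens to work for my domains):
>
> ulimit -s unlimited             # lift the shell stack limit
> export MP_STACK_SIZE=64000000   # per-thread stack size, roughly 64 MB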
>
> Are you using FPMPI? If yes, what happens if you don't use the profiler?
>
> Regards
> Preeti
>
> On Thu, Feb 9, 2012 at 7:55 PM, Sukanta Basu <sukanta.basu at gmail.com> wrote:
>>
>> Hi Gus,
>>
>> The Cray has 4 nodes (each containing 8 cores, 24 GB RAM). The nodes
>> are connected by gigE. I need to use the dmpar option.
>>
>> Regards,
>> Sukanta
>>
>> On Thu, Feb 9, 2012 at 8:59 AM, Gustavo Correa <gus at ldeo.columbia.edu>
>> wrote:
>> > Hi Basu
>> >
>> > Sorry, I missed the 'dmpar' information.
>> > I am not familiar with it, but I guess it is the Cray trick to make the
>> > CX1 machine look like a standard distributed memory environment?
>> > [As opposed to a full shared memory environment across all nodes,
>> > which would be 'smpar', right?]
>> >
>> > If 'dmpar' is a standard distributed memory environment, I presume all
>> > that I said before still holds.  I would just try setting KMP_STACKSIZE
>> > to 16m or more on all nodes, and run WRF again.
>> >
>> > FYI, I had some issues compiling some models with Intel 12.0, and in
>> > other mailing lists I saw people who had issues with version 12.1.
>> > However, I compiled some models with Intel 11.1 correctly, but, as I said
>> > before, not WRF.
>> >
>> > BTW, we're cheap here.  No funding for fancy machines, no Cray, no IBM,
>> > no SGI.
>> > The top thing we can buy is a standard Linux cluster once in a while. :)
>> >
>> > Good luck,
>> > Gus Correa
>> >
>> > On Feb 9, 2012, at 8:25 AM, Sukanta Basu wrote:
>> >
>> >> Hi Gus,
>> >>
>> >> Thanks for your email.
>> >>
>> >> I am compiling WRF with the dmpar option (distributed memory). WRF has a
>> >> different option for hybrid OpenMP+MPI (they call it dmpar+smpar). To
>> >> the best of my knowledge, OpenMP is not invoked.
>> >>
>> >> I do understand the distinction between OpenMP and Open MPI. Yesterday,
>> >> I uninstalled MPICH2 and installed Open MPI. I compiled and ran WRF
>> >> jobs. As I mentioned before, I faced different types of problems.
>> >>
>> >> I have been using WRF on various clusters for ~6-7 years. I bought a
>> >> Cray CX1 recently and am trying to set it up myself to run WRF
>> >> locally. Now, I suspect that there are some compatibility issues
>> >> between WRF and the Intel Composer. I used to use the Intel 11.1 compiler.
>> >>
>> >> I will set KMP_STACKSIZE and re-run the simulations with
>> >> wrf+mpich2+intel.
>> >>
>> >> Best regards,
>> >> Sukanta
>> >>
>> >> On Thu, Feb 9, 2012 at 8:02 AM, Gustavo Correa <gus at ldeo.columbia.edu>
>> >> wrote:
>> >>> Hi Sukanta
>> >>>
>> >>> Did you read the final part of my previous email about KMP_STACKSIZE?
>> >>> This is what Intel calls the OpenMP thread stack size.
>> >>> I think you misspelled that environment variable [your email has
>> >>> MP_STACKSIZE, which is not it].
>> >>>
>> >>> Did you compile WRF with OpenMP turned on and with the Intel compiler?
>> >>> If you did, you certainly need to increase the threads' stack size as
>> >>> well.
>> >>>
>> >>> I had experiences similar to yours with other models compiled with
>> >>> Intel ifort and OpenMP, i.e., unexplained segmentation faults even
>> >>> though the stack size was set to unlimited.
>> >>>
>> >>> Some time ago I posted this same solution in this mailing list to
>> >>> somebody at LLNL or ANL, I think, who was having this type of problem
>> >>> as well. It is common in hybrid MPI+OpenMP programs.
>> >>>
>> >>> I would set KMP_STACKSIZE to at least 16m *on all nodes*, maybe in
>> >>> your .bashrc, or in the script that launches the job.  I don't remember
>> >>> the syntax off the top of my head, but the MPICH2 mpiexec [hydra]
>> >>> probably has a way to export environment variables to all processes.
>> >>> Check 'man mpiexec'.
>> >>> You must ensure that the environment variable is set *on all nodes*.
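>> >>>
>> >>> For what it's worth, something along these lines should do it with the
>> >>> Hydra mpiexec [just a sketch, untested with WRF; the hostfile name,
>> >>> process count, and the 16m value are only placeholders]:
>> >>>
>> >>> # push the variable to every MPI process at launch time
>> >>> mpiexec -f hostfile -np 32 -genv KMP_STACKSIZE 16m ./wrf.exe
>> >>>
>> >>> # or export everything from the launching shell's environment
>> >>> export KMP_STACKSIZE=16m
>> >>> mpiexec -f hostfile -np 32 -genvall ./wrf.exe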
>> >>>
>> >>> You may need more than 16m, depending on how fine a grid you are using.
>> >>> In another model here I had to use 512m, but this also depends on how
>> >>> much memory/RAM your nodes have available per core. You could try
>> >>> increasing it step by step, say, doubling each time: 16m, 32m, 64m, ...
>> >>>
>> >>> Anyway, this is a guess based on what happened here.
>> >>> There is no guarantee that it will work, although it may be worth
>> >>> trying it.
>> >>> The problem you see may also be a bug in WRF, or an input/forcing file
>> >>> that is missing, etc.
>> >>>
>> >>> I hope this helps,
>> >>> Gus Correa
>> >>>
>> >>> PS - Note: just to avoid confusion with names, OpenMP and OpenMPI
>> >>> [or Open MPI] are different things.
>> >>> The former is the thread-based standard for parallelization:
>> >>> http://openmp.org/wp/
>> >>> The latter is another open-source MPI implementation, like MPICH2:
>> >>> http://www.open-mpi.org/
>> >>>
>> >>>
>> >>> On Feb 8, 2012, at 10:33 PM, Sukanta Basu wrote:
>> >>>
>> >>>> Hi Gus,
>> >>>>
>> >>>> I tried setting the stack option in limits.conf. No change. I logged
>> >>>> on to each node and checked that the ulimit is indeed unlimited.
>> >>>>
>> >>>> I just installed Open MPI and recompiled WRF. It now runs with any
>> >>>> array size. However, I have a different problem: now one of the
>> >>>> processes quits suddenly during the run (with a segmentation fault
>> >>>> error). I think the MPICH2 and Open MPI problems are somewhat related.
>> >>>>
>> >>>> Best regards,
>> >>>> Sukanta
>> >>>>
>> >>>> On Wed, Feb 8, 2012 at 6:20 PM, Gustavo Correa
>> >>>> <gus at ldeo.columbia.edu> wrote:
>> >>>>> Hi Sukanta
>> >>>>>
>> >>>>> Did you set the stacksize [not only memlock] to unlimited in
>> >>>>> /etc/security/limits.conf on all nodes?
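>> >>>>>
>> >>>>> The entries would look roughly like this [a sketch only; I don't know
>> >>>>> your exact limits.conf, and you may prefer finite values]:
>> >>>>>
>> >>>>> # /etc/security/limits.conf on every node
>> >>>>> *    soft    stack      unlimited
>> >>>>> *    hard    stack      unlimited
>> >>>>> *    soft    memlock    unlimited
>> >>>>> *    hard    memlock    unlimited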
>> >>>>>
>> >>>>> Not sure this will work, but you could try running 'ulimit -s' and
>> >>>>> 'ulimit -l' via mpiexec, just to check [ulimit is a shell builtin,
>> >>>>> so it needs to go through a shell]:
>> >>>>>
>> >>>>> mpiexec -prepend-rank -f hostfile -np 32 bash -c 'ulimit -s'
>> >>>>> mpiexec -prepend-rank -f hostfile -np 32 bash -c 'ulimit -l'
>> >>>>>
>> >>>>> Or just login to each node and check.
>> >>>>>
>> >>>>> Also, if your WRF is compiled with OpenMP,
>> >>>>> I think the Intel-specific equivalent of OMP_STACKSIZE is
>> >>>>> KMP_STACKSIZE [not MP_STACKSIZE], although the Intel runtime should
>> >>>>> also accept the portable/standard OMP_STACKSIZE [but I don't know if
>> >>>>> it does].
>> >>>>> For some models here I had to make it as big as 512m [I don't run
>> >>>>> WRF, though].
>> >>>>> 'man ifort' should tell more about it [at the end of the man page].
>> >>>>>
>> >>>>> I hope this helps,
>> >>>>> Gus Correa
>> >>>>>
>> >>>>> On Feb 8, 2012, at 4:23 PM, Anthony Chan wrote:
>> >>>>>
>> >>>>>>
>> >>>>>> There is fpi, the Fortran counterpart of cpi; you can try that.
>> >>>>>> Also, there is the MPICH2 test suite, located in
>> >>>>>> mpich2-xxx/test/mpi, which can be invoked with "make testing".
>> >>>>>> It is unlikely those tests will reveal anything, though:
>> >>>>>> the test suite is meant to test the MPI implementation,
>> >>>>>> not your app.
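>> >>>>>>
>> >>>>>> Roughly (a sketch; substitute your actual build directory for the
>> >>>>>> placeholder path):
>> >>>>>>
>> >>>>>> cd /path/to/mpich2-build/test/mpi   # placeholder build path
>> >>>>>> make testing                        # runs the MPICH2 test suite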
>> >>>>>>
>> >>>>>> As you said earlier, your difficulty in running WRF
>> >>>>>> with the larger dataset is memory related.  You should contact the
>> >>>>>> WRF mailing list for more pointers.
>> >>>>>>
>> >>>>>> ----- Original Message -----
>> >>>>>>> Hi Anthony,
>> >>>>>>>
>> >>>>>>> Is there any other MPI example code (other than cpi.c) that I could
>> >>>>>>> test which will give me more information about my MPICH setup?
>> >>>>>>>
>> >>>>>>> Here is the output from cpi (using 32 cores on 4 nodes):
>> >>>>>>>
>> >>>>>>> mpiuser at crayN1-5150jo:~/Misc$ mpiexec -f mpd.hosts -n 32 ./cpi
>> >>>>>>> Process 1 on crayN1-5150jo
>> >>>>>>> Process 18 on crayN2-5150jo
>> >>>>>>> Process 2 on crayN2-5150jo
>> >>>>>>> Process 26 on crayN2-5150jo
>> >>>>>>> Process 5 on crayN1-5150jo
>> >>>>>>> Process 14 on crayN2-5150jo
>> >>>>>>> Process 21 on crayN1-5150jo
>> >>>>>>> Process 22 on crayN2-5150jo
>> >>>>>>> Process 25 on crayN1-5150jo
>> >>>>>>> Process 6 on crayN2-5150jo
>> >>>>>>> Process 9 on crayN1-5150jo
>> >>>>>>> Process 17 on crayN1-5150jo
>> >>>>>>> Process 30 on crayN2-5150jo
>> >>>>>>> Process 10 on crayN2-5150jo
>> >>>>>>> Process 29 on crayN1-5150jo
>> >>>>>>> Process 13 on crayN1-5150jo
>> >>>>>>> Process 8 on crayN3-5150jo
>> >>>>>>> Process 20 on crayN3-5150jo
>> >>>>>>> Process 4 on crayN3-5150jo
>> >>>>>>> Process 12 on crayN3-5150jo
>> >>>>>>> Process 0 on crayN3-5150jo
>> >>>>>>> Process 24 on crayN3-5150jo
>> >>>>>>> Process 16 on crayN3-5150jo
>> >>>>>>> Process 28 on crayN3-5150jo
>> >>>>>>> Process 3 on crayN4-5150jo
>> >>>>>>> Process 7 on crayN4-5150jo
>> >>>>>>> Process 11 on crayN4-5150jo
>> >>>>>>> Process 23 on crayN4-5150jo
>> >>>>>>> Process 27 on crayN4-5150jo
>> >>>>>>> Process 31 on crayN4-5150jo
>> >>>>>>> Process 19 on crayN4-5150jo
>> >>>>>>> Process 15 on crayN4-5150jo
>> >>>>>>> pi is approximately 3.1416009869231249, Error is
>> >>>>>>> 0.0000083333333318
>> >>>>>>> wall clock time = 0.009401
>> >>>>>>>
>> >>>>>>> Best regards,
>> >>>>>>> Sukanta
>> >>>>>>>
>> >>>>>>> On Wed, Feb 8, 2012 at 1:19 PM, Anthony Chan <chan at mcs.anl.gov>
>> >>>>>>> wrote:
>> >>>>>>>>
>> >>>>>>>> Hmm, not sure what is happening. I don't see anything
>> >>>>>>>> obviously wrong in your mpiexec verbose output (though
>> >>>>>>>> I am not a Hydra expert). Your code is now killed because of a
>> >>>>>>>> segmentation fault. Naively, I would recompile WRF with -g
>> >>>>>>>> and use a debugger to see where the segfault is. If you don't want
>> >>>>>>>> to mess around with the WRF source code, you may want to contact the
>> >>>>>>>> WRF developers to see if they have encountered a similar problem
>> >>>>>>>> before.
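>> >>>>>>>>
>> >>>>>>>> Something along these lines might localize it (a rough sketch; I am
>> >>>>>>>> not familiar with WRF's build system, so treat the flags and file
>> >>>>>>>> names as guesses):
>> >>>>>>>>
>> >>>>>>>> # add debug symbols (e.g. -g) to the Fortran flags in configure.wrf,
>> >>>>>>>> # recompile, then allow core dumps on every node and rerun:
>> >>>>>>>> ulimit -c unlimited
>> >>>>>>>> mpiexec -f hostfile -np 32 ./wrf.exe
>> >>>>>>>> # when a rank segfaults, inspect its core file:
>> >>>>>>>> gdb ./wrf.exe core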
>> >>>>>>>>
>> >>>>>>>> ----- Original Message -----
>> >>>>>>>>> Dear Anthony,
>> >>>>>>>>>
>> >>>>>>>>> Thanks for your response. Yes, I did try MP_STACK_SIZE and
>> >>>>>>>>> OMP_STACKSIZE. The error is still there. I have attached a log file
>> >>>>>>>>> (I ran mpiexec with the -verbose option). Maybe this will help.
>> >>>>>>>>>
>> >>>>>>>>> Best regards,
>> >>>>>>>>> Sukanta
>> >>>>>>>>>
>> >>>>>>>>> On Tue, Feb 7, 2012 at 3:28 PM, Anthony Chan <chan at mcs.anl.gov>
>> >>>>>>>>> wrote:
>> >>>>>>>>>>
>> >>>>>>>>>> I am not familiar with WRF, and not sure if WRF uses any threads
>> >>>>>>>>>> in dmpar mode. Did you try setting MP_STACK_SIZE or OMP_STACKSIZE?
>> >>>>>>>>>>
>> >>>>>>>>>> see: http://forum.wrfforum.com/viewtopic.php?f=6&t=255
>> >>>>>>>>>>
>> >>>>>>>>>> A.Chan
>> >>>>>>>>>>
>> >>>>>>>>>> ----- Original Message -----
>> >>>>>>>>>>> Hi,
>> >>>>>>>>>>>
>> >>>>>>>>>>> I am using a small cluster of 4 nodes (each with 8 cores + 24 GB
>> >>>>>>>>>>> RAM). OS: Ubuntu 11.10. The cluster uses an NFS file system and
>> >>>>>>>>>>> gigE connections.
>> >>>>>>>>>>>
>> >>>>>>>>>>> I installed mpich2 and ran the cpi.c program successfully.
>> >>>>>>>>>>>
>> >>>>>>>>>>> I installed WRF (http://www.wrf-model.org/index.php) using the
>> >>>>>>>>>>> Intel compilers (dmpar option).
>> >>>>>>>>>>> I set ulimit -l and -s to be unlimited in .bashrc (all nodes).
>> >>>>>>>>>>> I set memlock to be unlimited in limits.conf (all nodes).
>> >>>>>>>>>>> I have password-less ssh (public key sharing) on all the nodes.
>> >>>>>>>>>>> I ran parallel jobs with 40x40x40, 40x40x50, and 40x40x60 grid
>> >>>>>>>>>>> points successfully. However, when I use 40x40x80 grid points, I
>> >>>>>>>>>>> get the following MPI error:
>> >>>>>>>>>>>
>> >>>>>>>>>>> **********************************************************
>> >>>>>>>>>>> Fatal error in PMPI_Wait: Other MPI error, error stack:
>> >>>>>>>>>>> PMPI_Wait(183)............: MPI_Wait(request=0x34e83a4,
>> >>>>>>>>>>> status=0x7fff7b24c400) failed
>> >>>>>>>>>>> MPIR_Wait_impl(77)........:
>> >>>>>>>>>>> dequeue_and_set_error(596): Communication error with rank 8
>> >>>>>>>>>>> **********************************************************
>> >>>>>>>>>>> Given that I can run the exact same simulation with a slightly
>> >>>>>>>>>>> smaller number of grid points without any problem, I suspect this
>> >>>>>>>>>>> error is related to stack size. What could be the problem?
>> >>>>>>>>>>>
>> >>>>>>>>>>> Thanks,
>> >>>>>>>>>>> Sukanta
>> >>>>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>
>> >>
>> >>
>> >>
>> >
>>
>>
>>
>>
>>
>
>
>



-- 
Sukanta Basu
Associate Professor
North Carolina State University
http://www4.ncsu.edu/~sbasu5/

