[MOAB-dev] debugging mbparallelcomm_test on cosmea

Alvaro Caceres acaceres at mcs.anl.gov
Tue Jan 19 11:13:18 CST 2010


Hi Dmitry,

For what it's worth, I've been able to run parallel jobs on cosmea with fewer than 4 procs. You just need to edit one line in the script you give to qsub: in "#PBS -l nodes=X:ppn=4", set ppn to 1 or 2.
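
For illustration, here is a minimal sketch of such a job script, assuming a standard PBS/Torque setup. The job name, walltime, process count, and the mpiexec line are assumptions rather than cosmea's actual configuration, and the test's arguments are left out:

```shell
# Write a hypothetical PBS job script requesting 2 cores on a single node.
cat > job.pbs <<'EOF'
#!/bin/bash
#PBS -N mbpc_debug
#PBS -l nodes=1:ppn=2
#PBS -l walltime=00:10:00

# PBS starts jobs in the home directory; return to the submission directory.
cd "$PBS_O_WORKDIR"

# Assumed launcher and process count; keep -n equal to nodes * ppn.
mpiexec -n 2 ./mbparallelcomm_test
EOF

# Submit with: qsub job.pbs
```

With ppn=1 and "-n 1" the same script gives a single-process run, which is usually the easiest case to step through in a debugger.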

Alvaro
----- Original Message -----
From: "Dima Karpeev" <karpeev at gmail.com>
To: "Jason Kraftcheck" <kraftche at cae.wisc.edu>
Cc: "Dmitry Karpeev" <karpeev at mcs.anl.gov>, moab-dev at lists.mcs.anl.gov
Sent: Tuesday, January 19, 2010 10:53:22 AM GMT -06:00 US/Canada Central
Subject: Re: [MOAB-dev] debugging mbparallelcomm_test on cosmea

I understand that it's an alloc problem.
I'm running mbparallelcomm_test. I'm trying to track it down with a debugger
now, but doing that on cosmea is a bit hard: I can't submit jobs with
fewer than 4 procs, since each node has 4 cores.

I was just wondering if this was a known problem.

Dmitry

On Jan 19, 2010, at 8:38, Jason Kraftcheck <kraftche at cae.wisc.edu> wrote:

> Dmitry Karpeev wrote:
>> I'm trying to get some parallel runs out of mbparallelcomm_test on cosmea,
>> but I keep getting this error (when running in parallel):
>> terminate called after throwing an instance of 'std::bad_alloc'
>>  what():  St9bad_alloc
>>
>> The exception appears to be thrown on rank 0, which causes an abort
>> across the comm.
>> Any ideas about what's going on?
>> I'm about to dig in to see what the problem may be, but I thought someone
>> might save me the trouble :-)
>>
>
> This is the result of a failed memory allocation (new, malloc, or perhaps
> an increase in the stack size).  Are you trying to allocate a very large
> array?
>
> - jason

