[petsc-users] How to understand these error messages

Fande Kong fd.kong at siat.ac.cn
Tue Jun 25 05:09:04 CDT 2013


Hi Barry,

If I use Intel MPI, my code runs correctly and produces correct
results. Yes, you are right: the IBM MPI has some bugs.

Thank you for your help.

Regards,

On Tue, Jun 25, 2013 at 11:08 AM, Satish Balay <balay at mcs.anl.gov> wrote:

> On Tue, 25 Jun 2013, Fande Kong wrote:
>
> > Hi Barry,
> >
> > How do we use valgrind to debug a parallel program on a supercomputer
> > with many cores? If we follow the instruction "mpiexec -n NPROC valgrind
> > --tool=memcheck -q --num-callers=20 --log-file=valgrind.log.%p
> > PETSCPROGRAMNAME -malloc off PROGRAMOPTIONS", then for 10000 cores,
> > 10000 files will be printed. Maybe we need to put all the information
> > into a single file. How can we do this?
>
> For this many cores, the PIDs across nodes won't be unique - %p alone
> might map down to, say, 1000 files. So I suggest [assuming $HOSTNAME
> is set on each host]:
>
> --log-file=valgrind.log.%q{HOSTNAME}.%p
>
> You don't want to mix output from all the cores - that would be
> unreadable.
>
> But if your filesystem cannot handle this many files, you could try
> consolidating output per node as:
>
> --log-file=valgrind.log.%q{HOSTNAME}
>
> [or perhaps create a subdirectory per node and stash the files in
> these dirs]
>
> on each host: mkdir -p ${HOSTNAME}
>
> --log-file=%q{HOSTNAME}/valgrind.log.%p
>
>
> Satish
>
>
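The per-node layout Satish suggests can be sketched as a small shell
snippet; the fallback to hostname(1) and the variable names are my own
illustration, not from the thread:

```shell
# Minimal sketch of the per-node valgrind log layout (variable names
# are illustrative). $HOSTNAME may not be exported by every shell, so
# fall back to hostname(1).
host=${HOSTNAME:-$(hostname)}

# One subdirectory per node, so logs from different nodes never collide.
mkdir -p "$host"

# valgrind expands %p to each process's PID, keeping one log per rank.
logfile="$host/valgrind.log.%p"
printf '%s\n' "$logfile"
```

The resulting path would then be passed to valgrind as
`--log-file=$host/valgrind.log.%p` in the mpiexec command line quoted
above.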


-- 
Fande Kong
ShenZhen Institutes of Advanced Technology
Chinese Academy of Sciences