[petsc-users] calloc with MPI and PetSc

Matthew Knepley knepley at gmail.com
Tue Feb 3 10:32:38 CST 2015


On Tue, Feb 3, 2015 at 10:11 AM, Kharche, Sanjay <S.R.Kharche at exeter.ac.uk>
wrote:

>
> Hi Karli
>
> The OpenMPI errors may be a consequence of the corrupt memory that I
> cannot identify.
>
> I tried all combinations of memory allocation:
>
> (int *) calloc(size,sizeof(int)); // typecasting
>
> and
>
> calloc(size, sizeof(int))
>
> and also tried replacing it with malloc. None of them work. In
> addition, I have now added a very simple non-PETSc part to my code - a
> for loop with some additions and subtractions. This loop does not use the
> arrays that I am trying to allocate, nor does it use PETSc. Now, even the
> first of the 3 callocs that I would like to use does not work! I would
> appreciate knowing the reason for this.
>

Go to an example. If the problem does not happen there, the bug is in your code. So

  cd src/snes/examples/tutorials
  make ex5
  ./ex5 -snes_monitor
  <Add a calloc to the code>
  ./ex5 -snes_monitor

If that is fine, you have a bug in your own code. Valgrind can usually find it.
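
For example, a check along these lines could be dropped into ex5 right after
PetscInitialize() (just a sketch; tmp1/tmp2/tmp3 are placeholder names and are
not part of ex5):

  #include <stdlib.h>                         /* at the top of the file, for calloc/free */

  PetscMPIInt size;
  int         *tmp1, *tmp2, *tmp3;

  MPI_Comm_size(PETSC_COMM_WORLD, &size);
  tmp1 = (int*) calloc(size, sizeof(int));    /* plain C allocations, independent of PETSc */
  tmp2 = (int*) calloc(size, sizeof(int));
  tmp3 = (int*) calloc(size, sizeof(int));
  if (!tmp1 || !tmp2 || !tmp3) printf("calloc failed\n");
  /* ... let the example run as usual ... */
  free(tmp1); free(tmp2); free(tmp3);

If ex5 behaves identically with and without these lines, the callocs themselves
are harmless and the corruption comes from somewhere else in your code.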

  Thanks,

    Matt


> thanks for your time.
> Sanjay
>
>
> ________________________________________
> From: petsc-users-bounces at mcs.anl.gov [petsc-users-bounces at mcs.anl.gov]
> on behalf of Karl Rupp [rupp at iue.tuwien.ac.at]
> Sent: 03 February 2015 15:58
> To: Sanjay Kharche; petsc-users at mcs.anl.gov
> Subject: Re: [petsc-users] calloc with MPI and PetSc
>
> Hi Sanjay,
>
> is this the full output? The errors/warnings are due to OpenMPI, so they
> may not be harmful. You may try building and running with mpich instead
> to get rid of these. If these are the only errors reported by valgrind,
> can you also try to use malloc instead of calloc?
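> For instance, the malloc equivalent of calloc(size, sizeof(int)) would be
> something like (just a sketch; a and size stand in for your own pointer and
> length, and memset needs <string.h>):
>
>    int *a = (int*) malloc(size * sizeof(int));
>    if (a) memset(a, 0, size * sizeof(int));   /* calloc zero-initializes, malloc does not */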
>
> Best regards,
> Karli
>
>
>
> On 02/03/2015 03:48 PM, Sanjay Kharche wrote:
> >
> > Hi Karl
> >
> > You are right - the code is not valgrind clean even on a single processor.
> > The Valgrind output below shows the line number of the TSSolve in my code.
> >
> > valgrind ./sk2d
> > ==7907== Memcheck, a memory error detector
> > ==7907== Copyright (C) 2002-2010, and GNU GPL'd, by Julian Seward et al.
> > ==7907== Using Valgrind-3.6.1 and LibVEX; rerun with -h for copyright info
> > ==7907== Command: ./sk2d
> > ==7907==
> > ==7907== Invalid read of size 4
> > ==7907==    at 0x55985C6: opal_os_dirpath_create (in /usr/lib/openmpi/lib/libmpi.so.1.0.2)
> > ==7907==    by 0x553A2C7: orte_session_dir (in /usr/lib/openmpi/lib/libmpi.so.1.0.2)
> > ==7907==    by 0x554DAD1: orte_ess_base_app_setup (in /usr/lib/openmpi/lib/libmpi.so.1.0.2)
> > ==7907==    by 0x545B584: ??? (in /usr/lib/openmpi/lib/openmpi/mca_ess_singleton.so)
> > ==7907==    by 0x552C213: orte_init (in /usr/lib/openmpi/lib/libmpi.so.1.0.2)
> > ==7907==    by 0x54E4FBB: ??? (in /usr/lib/openmpi/lib/libmpi.so.1.0.2)
> > ==7907==    by 0x54FE30F: PMPI_Init_thread (in /usr/lib/openmpi/lib/libmpi.so.1.0.2)
> > ==7907==    by 0x4136FAA: PetscInitialize (pinit.c:781)
> > ==7907==    by 0x8049448: main (sk2d.c:109)
> > ==7907==  Address 0x580e9f4 is 68 bytes inside a block of size 71 alloc'd
> > ==7907==    at 0x4006D69: malloc (vg_replace_malloc.c:236)
> > ==7907==    by 0x5598542: opal_os_dirpath_create (in /usr/lib/openmpi/lib/libmpi.so.1.0.2)
> > ==7907==    by 0x553A2C7: orte_session_dir (in /usr/lib/openmpi/lib/libmpi.so.1.0.2)
> > ==7907==    by 0x554DAD1: orte_ess_base_app_setup (in /usr/lib/openmpi/lib/libmpi.so.1.0.2)
> > ==7907==    by 0x545B584: ??? (in /usr/lib/openmpi/lib/openmpi/mca_ess_singleton.so)
> > ==7907==    by 0x552C213: orte_init (in /usr/lib/openmpi/lib/libmpi.so.1.0.2)
> > ==7907==    by 0x54E4FBB: ??? (in /usr/lib/openmpi/lib/libmpi.so.1.0.2)
> > ==7907==    by 0x54FE30F: PMPI_Init_thread (in /usr/lib/openmpi/lib/libmpi.so.1.0.2)
> > ==7907==    by 0x4136FAA: PetscInitialize (pinit.c:781)
> > ==7907==    by 0x8049448: main (sk2d.c:109)
> > ________________________________________
> > From: Karl Rupp [rupp at iue.tuwien.ac.at]
> > Sent: 03 February 2015 14:42
> > To: Sanjay Kharche; petsc-users at mcs.anl.gov
> > Subject: Re: [petsc-users] calloc with MPI and PetSc
> >
> > Hi Sanjay,
> >
> > this sounds a lot like a memory corruption somewhere in the code. Could
> > you please verify first that the code is valgrind-clean? Does the same
> > problem show up with one MPI rank?
> >
> > Best regards,
> > Karli
> >
> >
> > On 02/03/2015 03:21 PM, Sanjay Kharche wrote:
> >>
> >> Dear All
> >>
> >> I have a code in C that uses PETSc and MPI. My code is an extension of
> >> ex15.c in the ts tutorials.
> >>
> >> I am trying to allocate memory for 3 int arrays, for which I have
> >> already declared int pointers. These arrays are not intended for use by
> >> the PETSc functions. I am allocating the memory with calloc. One calloc
> >> call is fine; however, when I try to allocate memory for 2 or more
> >> arrays, TSSolve(ts,u) gives an error. I found this by including and
> >> excluding the TSSolve call. I have tried making the array pointers
> >> PetscInt, but with the same result. The first few lines of the error
> >> message are pasted after the relevant code snippet. Can you let me know
> >> how I can allocate memory for 3 arrays? These arrays are not relevant to
> >> any PETSc functions.
> >>
> >> thanks
> >> Sanjay
> >>
> >> Relevant code in main():
> >>
> >>     PetscInt    size = 0;                   /* Petsc/MPI                     */
> >>     PetscInt    rank = 0;
> >>
> >>     int *usr_mybase;      // mybase, myend, myblocksize are to be used in the non-petsc part of the code.
> >>     int *usr_myend;
> >>     int *usr_myblocksize;
> >>     int R_base, transit;
> >>     MPI_Status status;
> >>     MPI_Request request;
> >> /*********************************end of declarations in main************************/
> >>     PetscInitialize(&argc,&argv,(char*)0,help);
> >>     /* Initialize user application context                                   */
> >>     user.da           = NULL;
> >>     user.boundary     = 1;  /* 0: Dirichlet BC; 1: Neumann BC                */
> >>     user.viewJacobian = PETSC_FALSE;
> >>
> >>     MPI_Comm_size(PETSC_COMM_WORLD, &size);
> >>     MPI_Comm_rank(PETSC_COMM_WORLD, &rank);
> >>
> >>     printf("my size is %d, and rank is %d\n", size, rank);
> >>
> >>     usr_mybase      = (int*) calloc (size,sizeof(int)); // 1st call to calloc is ok.
> >> //  usr_myend       = (int*) calloc (size,sizeof(int)); // when I uncomment this call to calloc, TSSolve fails; error below.
> >> //  usr_myblocksize = (int*) calloc (size,sizeof(int));
> >> .
> >> .
> >> .
> >>     TSSolve(ts,u); // has a problem when I use 2 callocs.
> >>
> >>
> >> The output and error message:
> >>
> >> mpiexec -n 4 ./sk2d -draw_pause .1 -ts_monitor_draw_solution
> >> my size is 4, and rank is 2
> >> my size is 4, and rank is 0
> >> my size is 4, and rank is 3
> >> my size is 4, and rank is 1
> >> [0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
> >> [0]PETSC ERROR: Floating point exception
> >> [1]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
> >> [1]PETSC ERROR: Floating point exception
> >> [1]PETSC ERROR: Vec entry at local location 320 is not-a-number or infinite at beginning of function: Parameter number 2
> >> [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> >> [2]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
> >> [2]PETSC ERROR: Floating point exception
> >> [2]PETSC ERROR: Vec entry at local location 10 is not-a-number or infinite at beginning of function: Parameter number 2
> >> [2]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> >> [2]PETSC ERROR: Petsc Release Version 3.5.2, unknown
> >> [3]PETSC ERROR: [0]PETSC ERROR: Vec entry at local location 293 is not-a-number or infinite at beginning of function: Parameter number 2
> >>
> >
>
>


-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener