[petsc-users] calloc with MPI and PetSc
Sanjay Kharche
Sanjay.Kharche at manchester.ac.uk
Tue Feb 3 08:48:08 CST 2015
Hi Karl
You are right - the code is not valgrind-clean, even on a single processor. The Valgrind output below shows the line number of the TSSolve call in my code.
valgrind ./sk2d
==7907== Memcheck, a memory error detector
==7907== Copyright (C) 2002-2010, and GNU GPL'd, by Julian Seward et al.
==7907== Using Valgrind-3.6.1 and LibVEX; rerun with -h for copyright info
==7907== Command: ./sk2d
==7907==
==7907== Invalid read of size 4
==7907== at 0x55985C6: opal_os_dirpath_create (in /usr/lib/openmpi/lib/libmpi.so.1.0.2)
==7907== by 0x553A2C7: orte_session_dir (in /usr/lib/openmpi/lib/libmpi.so.1.0.2)
==7907== by 0x554DAD1: orte_ess_base_app_setup (in /usr/lib/openmpi/lib/libmpi.so.1.0.2)
==7907== by 0x545B584: ??? (in /usr/lib/openmpi/lib/openmpi/mca_ess_singleton.so)
==7907== by 0x552C213: orte_init (in /usr/lib/openmpi/lib/libmpi.so.1.0.2)
==7907== by 0x54E4FBB: ??? (in /usr/lib/openmpi/lib/libmpi.so.1.0.2)
==7907== by 0x54FE30F: PMPI_Init_thread (in /usr/lib/openmpi/lib/libmpi.so.1.0.2)
==7907== by 0x4136FAA: PetscInitialize (pinit.c:781)
==7907== by 0x8049448: main (sk2d.c:109)
==7907== Address 0x580e9f4 is 68 bytes inside a block of size 71 alloc'd
==7907== at 0x4006D69: malloc (vg_replace_malloc.c:236)
==7907== by 0x5598542: opal_os_dirpath_create (in /usr/lib/openmpi/lib/libmpi.so.1.0.2)
==7907== by 0x553A2C7: orte_session_dir (in /usr/lib/openmpi/lib/libmpi.so.1.0.2)
==7907== by 0x554DAD1: orte_ess_base_app_setup (in /usr/lib/openmpi/lib/libmpi.so.1.0.2)
==7907== by 0x545B584: ??? (in /usr/lib/openmpi/lib/openmpi/mca_ess_singleton.so)
==7907== by 0x552C213: orte_init (in /usr/lib/openmpi/lib/libmpi.so.1.0.2)
==7907== by 0x54E4FBB: ??? (in /usr/lib/openmpi/lib/libmpi.so.1.0.2)
==7907== by 0x54FE30F: PMPI_Init_thread (in /usr/lib/openmpi/lib/libmpi.so.1.0.2)
==7907== by 0x4136FAA: PetscInitialize (pinit.c:781)
==7907== by 0x8049448: main (sk2d.c:109)
________________________________________
From: Karl Rupp [rupp at iue.tuwien.ac.at]
Sent: 03 February 2015 14:42
To: Sanjay Kharche; petsc-users at mcs.anl.gov
Subject: Re: [petsc-users] calloc with MPI and PetSc
Hi Sanjay,
this sounds a lot like memory corruption somewhere in the code. Could
you please verify first that the code is valgrind-clean? Does the same
problem show up with one MPI rank?
Best regards,
Karli
On 02/03/2015 03:21 PM, Sanjay Kharche wrote:
>
> Dear All
>
> I have a code in C that uses PETSc and MPI. My code is an extension of ex15.c in the ts tutorials.
>
> I am trying to allocate memory for 3 int arrays, for which I have already declared int pointers. These arrays are not intended for use by the PETSc functions. I am allocating memory using calloc. A single calloc call works; however, when I try to allocate memory for 2 or more arrays, TSSolve(ts,u) gives an error. I found this by including and excluding the TSSolve call. I have tried making the array pointers PetscInt, but with the same result. The first few lines of the error message are pasted after the relevant code snippet. Can you let me know how I can allocate memory for 3 arrays? These arrays are not relevant to any PETSc functions.
>
> thanks
> Sanjay
>
> Relevant code in main():
>
> PetscInt size = 0; /* Petsc/MPI */
> PetscInt rank = 0;
>
> int *usr_mybase; // mybase, myend, myblocksize are to be used in non-petsc part of code.
> int *usr_myend;
> int *usr_myblocksize;
> int R_base, transit;
> MPI_Status status;
> MPI_Request request;
> /*********************************end of declarations in main************************/
> PetscInitialize(&argc,&argv,(char*)0,help);
> /* Initialize user application context */
> user.da = NULL;
> user.boundary = 1; /* 0: Dirichlet BC; 1: Neumann BC */
> user.viewJacobian = PETSC_FALSE;
>
> MPI_Comm_size(PETSC_COMM_WORLD, &size);
> MPI_Comm_rank(PETSC_COMM_WORLD, &rank);
>
> printf("my size is %d, and rank is %d\n",size, rank);
>
> usr_mybase = (int*) calloc (size,sizeof(int)); // 1st call to calloc is ok.
> // usr_myend = (int*) calloc (size,sizeof(int)); // when I uncomment this call to calloc, TSSolve fails. error below.
> // usr_myblocksize = (int*) calloc (size,sizeof(int));
> .
> .
> .
> TSSolve(ts,u); // has a problem when I use 2 callocs.
>
>
> The output and error message:
>
> mpiexec -n 4 ./sk2d -draw_pause .1 -ts_monitor_draw_solution
> my size is 4, and rank is 2
> my size is 4, and rank is 0
> my size is 4, and rank is 3
> my size is 4, and rank is 1
> [0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
> [0]PETSC ERROR: Floating point exception
> [1]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
> [1]PETSC ERROR: Floating point exception
> [1]PETSC ERROR: Vec entry at local location 320 is not-a-number or infinite at beginning of function: Parameter number 2
> [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> [2]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
> [2]PETSC ERROR: Floating point exception
> [2]PETSC ERROR: Vec entry at local location 10 is not-a-number or infinite at beginning of function: Parameter number 2
> [2]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> [2]PETSC ERROR: Petsc Release Version 3.5.2, unknown
> [3]PETSC ERROR: [0]PETSC ERROR: Vec entry at local location 293 is not-a-number or infinite at beginning of function: Parameter number 2
>
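For reference, a minimal sketch of allocating the three arrays from the question with calloc and checking each result. It uses only the C standard library, with no PETSc or MPI calls, so it does not reproduce or diagnose the TSSolve failure. It only illustrates that three checked calloc calls are ordinary C, which fits Karl's suggestion that the fault is memory corruption elsewhere. The struct UsrArrays and the functions usr_arrays_alloc and usr_arrays_free are hypothetical names introduced here; they are not part of the original code or of PETSc.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical container (not from the original post) for the three
 * bookkeeping arrays named in the question. */
typedef struct {
    int *mybase;
    int *myend;
    int *myblocksize;
    int  n;            /* number of entries in each array */
} UsrArrays;

/* Allocate all three arrays, zero-initialised by calloc.
 * Returns 0 on success and -1 on failure; on failure nothing stays allocated. */
int usr_arrays_alloc(UsrArrays *a, int n)
{
    assert(a != NULL);
    a->mybase = a->myend = a->myblocksize = NULL;
    a->n = 0;
    if (n <= 0)
        return -1;

    a->mybase      = calloc((size_t)n, sizeof *a->mybase);
    a->myend       = calloc((size_t)n, sizeof *a->myend);
    a->myblocksize = calloc((size_t)n, sizeof *a->myblocksize);

    if (!a->mybase || !a->myend || !a->myblocksize) {
        /* free(NULL) is a no-op, so partial failures are handled too */
        free(a->mybase);
        free(a->myend);
        free(a->myblocksize);
        a->mybase = a->myend = a->myblocksize = NULL;
        return -1;
    }
    a->n = n;
    return 0;
}

/* Release the arrays and reset the pointers so stale use is easier to spot. */
void usr_arrays_free(UsrArrays *a)
{
    assert(a != NULL);
    free(a->mybase);
    free(a->myend);
    free(a->myblocksize);
    a->mybase = a->myend = a->myblocksize = NULL;
    a->n = 0;
}
```

In the posted code, n would be the communicator size. Note also that the snippet declares size and rank as PetscInt but passes their addresses to MPI_Comm_size and MPI_Comm_rank, which expect int*; PETSc provides PetscMPIInt for values passed to MPI calls. Whether that mismatch matters for this particular build is not established in the thread.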