[petsc-users] calloc with MPI and PetSc
Sanjay Kharche
Sanjay.Kharche at manchester.ac.uk
Tue Feb 3 10:20:08 CST 2015
Hi Karli, Barry
The valgrind error I got was only that one.
In any case, the calloc errors have now completely vanished. I will work on reproducing them in a stripped-down version of my program so that I can understand what was causing them. But that is for another time - as of now, my program is working as I want it to.
thanks
Sanjay
________________________________________
From: Barry Smith [bsmith at mcs.anl.gov]
Sent: 03 February 2015 16:11
To: Sanjay Kharche
Cc: Karl Rupp; petsc-users at mcs.anl.gov
Subject: Re: [petsc-users] calloc with MPI and PetSc
Do you get more valgrind errors or is that the only one? That one is likely harmless.
Barry
> On Feb 3, 2015, at 8:48 AM, Sanjay Kharche <Sanjay.Kharche at manchester.ac.uk> wrote:
>
>
> Hi Karl
>
> You are right - the code is not valgrind clean even on a single processor. The Valgrind output below shows the line number in my code where the error is reported (the PetscInitialize call at sk2d.c:109).
>
> valgrind ./sk2d
> ==7907== Memcheck, a memory error detector
> ==7907== Copyright (C) 2002-2010, and GNU GPL'd, by Julian Seward et al.
> ==7907== Using Valgrind-3.6.1 and LibVEX; rerun with -h for copyright info
> ==7907== Command: ./sk2d
> ==7907==
> ==7907== Invalid read of size 4
> ==7907== at 0x55985C6: opal_os_dirpath_create (in /usr/lib/openmpi/lib/libmpi.so.1.0.2)
> ==7907== by 0x553A2C7: orte_session_dir (in /usr/lib/openmpi/lib/libmpi.so.1.0.2)
> ==7907== by 0x554DAD1: orte_ess_base_app_setup (in /usr/lib/openmpi/lib/libmpi.so.1.0.2)
> ==7907== by 0x545B584: ??? (in /usr/lib/openmpi/lib/openmpi/mca_ess_singleton.so)
> ==7907== by 0x552C213: orte_init (in /usr/lib/openmpi/lib/libmpi.so.1.0.2)
> ==7907== by 0x54E4FBB: ??? (in /usr/lib/openmpi/lib/libmpi.so.1.0.2)
> ==7907== by 0x54FE30F: PMPI_Init_thread (in /usr/lib/openmpi/lib/libmpi.so.1.0.2)
> ==7907== by 0x4136FAA: PetscInitialize (pinit.c:781)
> ==7907== by 0x8049448: main (sk2d.c:109)
> ==7907== Address 0x580e9f4 is 68 bytes inside a block of size 71 alloc'd
> ==7907== at 0x4006D69: malloc (vg_replace_malloc.c:236)
> ==7907== by 0x5598542: opal_os_dirpath_create (in /usr/lib/openmpi/lib/libmpi.so.1.0.2)
> ==7907== by 0x553A2C7: orte_session_dir (in /usr/lib/openmpi/lib/libmpi.so.1.0.2)
> ==7907== by 0x554DAD1: orte_ess_base_app_setup (in /usr/lib/openmpi/lib/libmpi.so.1.0.2)
> ==7907== by 0x545B584: ??? (in /usr/lib/openmpi/lib/openmpi/mca_ess_singleton.so)
> ==7907== by 0x552C213: orte_init (in /usr/lib/openmpi/lib/libmpi.so.1.0.2)
> ==7907== by 0x54E4FBB: ??? (in /usr/lib/openmpi/lib/libmpi.so.1.0.2)
> ==7907== by 0x54FE30F: PMPI_Init_thread (in /usr/lib/openmpi/lib/libmpi.so.1.0.2)
> ==7907== by 0x4136FAA: PetscInitialize (pinit.c:781)
> ==7907== by 0x8049448: main (sk2d.c:109)
> ________________________________________
> From: Karl Rupp [rupp at iue.tuwien.ac.at]
> Sent: 03 February 2015 14:42
> To: Sanjay Kharche; petsc-users at mcs.anl.gov
> Subject: Re: [petsc-users] calloc with MPI and PetSc
>
> Hi Sanjay,
>
> this sounds a lot like a memory corruption somewhere in the code. Could
> you please verify first that the code is valgrind-clean? Does the same
> problem show up with one MPI rank?
>
> Best regards,
> Karli
>
>
> On 02/03/2015 03:21 PM, Sanjay Kharche wrote:
>>
>> Dear All
>>
>> I have a code in C that uses Petsc and MPI. My code is an extension of ex15.c in the ts tutorials.
>>
>> I am trying to allocate memory for 3 int arrays, for which I have already declared int pointers. These arrays are not intended for use by the petsc functions. I am allocating memory using calloc. A single calloc call works fine; however, when I try to allocate memory for 2 or more arrays, TSSolve(ts,u) gives an error. I found this by including and excluding the TSSolve call. I have tried making the array pointers PetscInt, but with the same result. The first few lines of the error message are also pasted after the relevant code snippet. Can you let me know how I can allocate memory for the 3 arrays? These arrays are not relevant to any petsc functions.
>>
>> thanks
>> Sanjay
>>
>> Relevant code in main():
>>
>> PetscInt size = 0; /* Petsc/MPI */
>> PetscInt rank = 0;
>>
>> int *usr_mybase; // mybase, myend, myblocksize are to be used in non-petsc part of code.
>> int *usr_myend;
>> int *usr_myblocksize;
>> int R_base, transit;
>> MPI_Status status;
>> MPI_Request request;
>> /*********************************end of declarations in main************************/
>> PetscInitialize(&argc,&argv,(char*)0,help);
>> /* Initialize user application context */
>> user.da = NULL;
>> user.boundary = 1; /* 0: Dirichlet BC; 1: Neumann BC */
>> user.viewJacobian = PETSC_FALSE;
>>
>> MPI_Comm_size(PETSC_COMM_WORLD, &size);
>> MPI_Comm_rank(PETSC_COMM_WORLD, &rank);
>>
>> printf("my size is %d, and rank is %d\n",size, rank);
>>
>> usr_mybase = (int*) calloc (size,sizeof(int)); // 1st call to calloc is ok.
>> // usr_myend = (int*) calloc (size,sizeof(int)); // when I uncomment this call to calloc, TSSolve fails. error below.
>> // usr_myblocksize = (int*) calloc (size,sizeof(int));
>> .
>> .
>> .
>> TSSolve(ts,u); // has a problem when I use 2 callocs.
>>
>>
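For reference, the three allocations described above can be sketched in plain C as follows. This is only an illustration under stated assumptions, not the code from sk2d.c: the UsrBlocks struct and the usr_blocks_* helpers are hypothetical names, and the even block split is an assumed use for the mybase/myend/myblocksize arrays. In the real program the rank count would come from MPI_Comm_size, which takes an int * (PetscMPIInt in PETSc), so it is treated as an int here. Checking every calloc result does not by itself explain the TSSolve failure, but it rules out one source of bad pointers.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical container for the three per-rank arrays used outside PETSc. */
typedef struct {
    int *base;      /* first global index owned by each rank */
    int *end;       /* one past the last global index owned by each rank */
    int *blocksize; /* number of entries owned by each rank */
} UsrBlocks;

/* Allocate the three zero-initialised arrays for nranks ranks.
   Returns 0 on success, -1 on failure; on failure nothing stays allocated. */
int usr_blocks_alloc(UsrBlocks *b, int nranks)
{
    b->base = b->end = b->blocksize = NULL;
    if (nranks <= 0) return -1;

    b->base      = calloc((size_t)nranks, sizeof *b->base);
    b->end       = calloc((size_t)nranks, sizeof *b->end);
    b->blocksize = calloc((size_t)nranks, sizeof *b->blocksize);

    if (!b->base || !b->end || !b->blocksize) {
        free(b->base);
        free(b->end);
        free(b->blocksize);
        b->base = b->end = b->blocksize = NULL;
        return -1;
    }
    return 0;
}

/* Assumed decomposition: split n items over nranks ranks as evenly as possible,
   giving the first (n % nranks) ranks one extra item. */
void usr_blocks_fill(UsrBlocks *b, int nranks, int n)
{
    assert(nranks > 0 && n >= 0);
    int q = n / nranks, r = n % nranks, start = 0;
    for (int i = 0; i < nranks; ++i) {
        b->blocksize[i] = q + (i < r ? 1 : 0);
        b->base[i]      = start;
        b->end[i]       = start + b->blocksize[i];
        start           = b->end[i];
    }
}

/* Release all three arrays and reset the pointers. */
void usr_blocks_free(UsrBlocks *b)
{
    free(b->base);
    free(b->end);
    free(b->blocksize);
    b->base = b->end = b->blocksize = NULL;
}
```

Every index written by usr_blocks_fill stays below nranks, the length passed to calloc, so the arrays cannot overrun into neighbouring heap blocks.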
>> The output and error message:
>>
>> mpiexec -n 4 ./sk2d -draw_pause .1 -ts_monitor_draw_solution
>> my size is 4, and rank is 2
>> my size is 4, and rank is 0
>> my size is 4, and rank is 3
>> my size is 4, and rank is 1
>> [0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
>> [0]PETSC ERROR: Floating point exception
>> [1]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
>> [1]PETSC ERROR: Floating point exception
>> [1]PETSC ERROR: Vec entry at local location 320 is not-a-number or infinite at beginning of function: Parameter number 2
>> [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
>> [2]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
>> [2]PETSC ERROR: Floating point exception
>> [2]PETSC ERROR: Vec entry at local location 10 is not-a-number or infinite at beginning of function: Parameter number 2
>> [2]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
>> [2]PETSC ERROR: Petsc Release Version 3.5.2, unknown
>> [3]PETSC ERROR: [0]PETSC ERROR: Vec entry at local location 293 is not-a-number or infinite at beginning of function: Parameter number 2
>>
>
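The "Vec entry at local location N is not-a-number or infinite" messages above mean that the vector already held a non-finite value when it was passed in. One way to narrow down where such a value first appears is to scan the raw array at a few points before TSSolve. The helper below is a minimal sketch in plain C; first_nonfinite is a hypothetical name, and it assumes a real-valued build where the entries are doubles (for example, an array obtained with VecGetArray).

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Return the index of the first entry that is NaN or infinite,
   or -1 if every entry is finite. */
long first_nonfinite(const double *x, size_t n)
{
    assert(x != NULL || n == 0);
    for (size_t i = 0; i < n; ++i) {
        if (!isfinite(x[i])) {
            return (long)i;
        }
    }
    return -1;
}
```

Calling this on the initial condition right after it is filled, and again just before TSSolve, would show whether the bad entry comes from the initialisation code or from something that overwrites the vector afterwards.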