[petsc-users] calloc with MPI and PetSc
Kharche, Sanjay
S.R.Kharche at exeter.ac.uk
Tue Feb 3 10:11:32 CST 2015
Hi Karli
The OpenMPI errors may be a consequence of the memory corruption that I cannot identify.
I tried all combinations of memory allocation:
(int *) calloc(size,sizeof(int)); // typecasting
and
calloc(size, sizeof(int))
and also tried replacing calloc with malloc. None of them work. In addition, I have now added a very simple non-PETSc part to my code: a for loop with some additions and subtractions. This loop does not use the arrays I am trying to allocate, nor does it use PETSc. Now even the first of the 3 callocs that I would like to use does not work! I would appreciate knowing the reason for this.
Thanks for your time.
Sanjay
________________________________________
From: petsc-users-bounces at mcs.anl.gov [petsc-users-bounces at mcs.anl.gov] on behalf of Karl Rupp [rupp at iue.tuwien.ac.at]
Sent: 03 February 2015 15:58
To: Sanjay Kharche; petsc-users at mcs.anl.gov
Subject: Re: [petsc-users] calloc with MPI and PetSc
Hi Sanjay,
is this the full output? The errors/warnings are due to OpenMPI, so they
may not be harmful. You may try building and running with MPICH instead
to get rid of them. If these are the only errors reported by valgrind,
can you also try using malloc instead of calloc?
Best regards,
Karli
On 02/03/2015 03:48 PM, Sanjay Kharche wrote:
>
> Hi Karl
>
> You are right - the code is not valgrind-clean even on a single processor. The valgrind output below shows the line number of the TSSolve call in my code.
>
> valgrind ./sk2d
> ==7907== Memcheck, a memory error detector
> ==7907== Copyright (C) 2002-2010, and GNU GPL'd, by Julian Seward et al.
> ==7907== Using Valgrind-3.6.1 and LibVEX; rerun with -h for copyright info
> ==7907== Command: ./sk2d
> ==7907==
> ==7907== Invalid read of size 4
> ==7907== at 0x55985C6: opal_os_dirpath_create (in /usr/lib/openmpi/lib/libmpi.so.1.0.2)
> ==7907== by 0x553A2C7: orte_session_dir (in /usr/lib/openmpi/lib/libmpi.so.1.0.2)
> ==7907== by 0x554DAD1: orte_ess_base_app_setup (in /usr/lib/openmpi/lib/libmpi.so.1.0.2)
> ==7907== by 0x545B584: ??? (in /usr/lib/openmpi/lib/openmpi/mca_ess_singleton.so)
> ==7907== by 0x552C213: orte_init (in /usr/lib/openmpi/lib/libmpi.so.1.0.2)
> ==7907== by 0x54E4FBB: ??? (in /usr/lib/openmpi/lib/libmpi.so.1.0.2)
> ==7907== by 0x54FE30F: PMPI_Init_thread (in /usr/lib/openmpi/lib/libmpi.so.1.0.2)
> ==7907== by 0x4136FAA: PetscInitialize (pinit.c:781)
> ==7907== by 0x8049448: main (sk2d.c:109)
> ==7907== Address 0x580e9f4 is 68 bytes inside a block of size 71 alloc'd
> ==7907== at 0x4006D69: malloc (vg_replace_malloc.c:236)
> ==7907== by 0x5598542: opal_os_dirpath_create (in /usr/lib/openmpi/lib/libmpi.so.1.0.2)
> ==7907== by 0x553A2C7: orte_session_dir (in /usr/lib/openmpi/lib/libmpi.so.1.0.2)
> ==7907== by 0x554DAD1: orte_ess_base_app_setup (in /usr/lib/openmpi/lib/libmpi.so.1.0.2)
> ==7907== by 0x545B584: ??? (in /usr/lib/openmpi/lib/openmpi/mca_ess_singleton.so)
> ==7907== by 0x552C213: orte_init (in /usr/lib/openmpi/lib/libmpi.so.1.0.2)
> ==7907== by 0x54E4FBB: ??? (in /usr/lib/openmpi/lib/libmpi.so.1.0.2)
> ==7907== by 0x54FE30F: PMPI_Init_thread (in /usr/lib/openmpi/lib/libmpi.so.1.0.2)
> ==7907== by 0x4136FAA: PetscInitialize (pinit.c:781)
> ==7907== by 0x8049448: main (sk2d.c:109)
> ________________________________________
> From: Karl Rupp [rupp at iue.tuwien.ac.at]
> Sent: 03 February 2015 14:42
> To: Sanjay Kharche; petsc-users at mcs.anl.gov
> Subject: Re: [petsc-users] calloc with MPI and PetSc
>
> Hi Sanjay,
>
> this sounds a lot like a memory corruption somewhere in the code. Could
> you please verify first that the code is valgrind-clean? Does the same
> problem show up with one MPI rank?
>
> Best regards,
> Karli
>
>
> On 02/03/2015 03:21 PM, Sanjay Kharche wrote:
>>
>> Dear All
>>
>> I have a code in C that uses Petsc and MPI. My code is an extension of ex15.c in the ts tutorials.
>>
>> I am trying to allocate memory for 3 int arrays, for which I have already declared int pointers. These arrays are not intended for use by the PETSc functions. I am allocating memory using calloc. One calloc call works fine, but when I try to allocate memory for 2 or more arrays, TSSolve(ts,u) gives an error. I found this by including and excluding the TSSolve call. I have tried making the array pointers PetscInt, but with the same result. The first few lines of the error message are pasted after the relevant code snippet. Can you let me know how I can allocate memory for 3 arrays? These arrays are not relevant to any PETSc functions.
>>
>> thanks
>> Sanjay
>>
>> Relevant code in main():
>>
>> PetscInt size = 0; /* Petsc/MPI */
>> PetscInt rank = 0;
>>
>> int *usr_mybase; // mybase, myend, myblocksize are to be used in non-petsc part of code.
>> int *usr_myend;
>> int *usr_myblocksize;
>> int R_base, transit;
>> MPI_Status status;
>> MPI_Request request;
>> /*********************************end of declarations in main************************/
>> PetscInitialize(&argc,&argv,(char*)0,help);
>> /* Initialize user application context */
>> user.da = NULL;
>> user.boundary = 1; /* 0: Dirichlet BC; 1: Neumann BC */
>> user.viewJacobian = PETSC_FALSE;
>>
>> MPI_Comm_size(PETSC_COMM_WORLD, &size);
>> MPI_Comm_rank(PETSC_COMM_WORLD, &rank);
>>
>> printf("my size is %d, and rank is %d\n",size, rank);
>>
>> usr_mybase = (int*) calloc (size,sizeof(int)); // 1st call to calloc is ok.
>> // usr_myend = (int*) calloc (size,sizeof(int)); // when I uncomment this call to calloc, TSSolve fails. error below.
>> // usr_myblocksize = (int*) calloc (size,sizeof(int));
>> .
>> .
>> .
>> TSSolve(ts,u); // has a problem when I use 2 callocs.
>>
>>
>> The output and error message:
>>
>> mpiexec -n 4 ./sk2d -draw_pause .1 -ts_monitor_draw_solution
>> my size is 4, and rank is 2
>> my size is 4, and rank is 0
>> my size is 4, and rank is 3
>> my size is 4, and rank is 1
>> [0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
>> [0]PETSC ERROR: Floating point exception
>> [1]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
>> [1]PETSC ERROR: Floating point exception
>> [1]PETSC ERROR: Vec entry at local location 320 is not-a-number or infinite at beginning of function: Parameter number 2
>> [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
>> [2]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
>> [2]PETSC ERROR: Floating point exception
>> [2]PETSC ERROR: Vec entry at local location 10 is not-a-number or infinite at beginning of function: Parameter number 2
>> [2]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
>> [2]PETSC ERROR: Petsc Release Version 3.5.2, unknown
>> [3]PETSC ERROR: [0]PETSC ERROR: Vec entry at local location 293 is not-a-number or infinite at beginning of function: Parameter number 2
>>
>