[mpich-discuss] Problem using mpi_type_create_f90_integer

Thomas Jahns jahns at dkrz.de
Mon Dec 19 04:45:57 CST 2011


On 12/17/2011 12:40 AM, Dave Goodell wrote:
> Hi Thomas,
> 
> Will you send us a copy of the "src/binding/f90/mpif90model.h" file from your build directory?  That should help us figure out what's happening with "WITH_BUG1".

See the first attachment.
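
For context on the "range=14" message under WITH_BUG1: the decimal range a
signed integer of n bytes can offer is the largest r such that 10**r does not
exceed its maximum value. The sketch below is plain arithmetic in Python, not
MPICH code. The helper names are made up for illustration, and it assumes
two's-complement integers of 1, 2, 4 and 8 bytes, which is what I would expect
from both compilers here.

```python
# Illustrative arithmetic only. decimal_range and smallest_size_for are
# hypothetical helper names, not part of MPICH or any Fortran runtime.

def decimal_range(nbytes):
    """Largest r such that every integer i with |i| < 10**r fits in a
    signed two's-complement integer of nbytes bytes."""
    max_value = 2 ** (8 * nbytes - 1) - 1
    r = 0
    while 10 ** (r + 1) <= max_value:
        r += 1
    return r


def smallest_size_for(r, sizes=(1, 2, 4, 8)):
    """Smallest byte size (from the assumed list) offering at least r
    decimal digits of range, or None if no size is large enough."""
    for nbytes in sizes:
        if decimal_range(nbytes) >= r:
            return nbytes
    return None


for nbytes in (1, 2, 4, 8):
    print(nbytes, "bytes ->", decimal_range(nbytes), "digits of range")
print("range 14 needs", smallest_size_for(14), "bytes")
```

Under these assumptions a request for 14 digits of range would be satisfied by
an 8-byte integer, so I would not expect it to fail on a platform that has one.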

> As for "WITH_BUG2", this looks like a bug in your test program.  You are using the literal array "(/ 1, 1 /)" as your array_of_displacements argument to MPI_Type_create_struct.  In my hazy understanding of Fortran, the type of this literal defaults to an INTEGER array.  It should be an array of type INTEGER(kind=MPI_ADDRESS_KIND).  When I made this change I was able to entirely eliminate the Valgrind warnings that had been coming out of the program.

If you meant the literal array (/ 0, 8 /), you're right: I got this wrong when
shortening the original program, thanks for pointing that out. The original
program uses a computed array of offsets of the correct type, but
unfortunately that didn't make it into the test program. After correcting the
code accordingly (bug_revised.f90), I still get the same error:

$ ~/opt/mpich2-1.4.1p1-nag52-x64-linux/bin/mpif90 -fpp -DWITH_BUG2 bug_revised.f90
NAG Fortran Compiler Release 5.2(747)
Extension: /home/tjahns/opt/mpich2-1.4.1p1-nag52-x64-linux/include/mpif.h, line
479: Byte count on numeric data type
           detected at *@8
Extension: /home/tjahns/opt/mpich2-1.4.1p1-nag52-x64-linux/include/mpif.h, line
492: Byte count on numeric data type
           detected at *@8
Extension: /home/tjahns/opt/mpich2-1.4.1p1-nag52-x64-linux/include/mpif.h, line
493: Byte count on numeric data type
           detected at *@8
Warning: bug_revised.f90, line 45: Unused external reference MPI_WTICK
         detected at TEST_F90INT@<end-of-statement>
Warning: bug_revised.f90, line 45: Unused external reference PMPI_WTICK
         detected at TEST_F90INT@<end-of-statement>
Warning: bug_revised.f90, line 45: Unused external reference PMPI_WTIME
         detected at TEST_F90INT@<end-of-statement>
Warning: bug_revised.f90, line 45: Unused external reference MPI_WTIME
         detected at TEST_F90INT@<end-of-statement>
[NAG Fortran Compiler normal termination, 7 warnings]
$ ~/opt/mpich2-1.4.1p1-nag52-x64-linux/bin/mpiexec -n 2 ./a.out

=====================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   EXIT CODE: 11
=   CLEANING UP REMAINING PROCESSES
=   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
=====================================================================================
APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault (signal 11)

$  ~/opt/mpich2-1.4.1p1-intel-x64-linux/bin/mpif90 -fpp -DWITH_BUG2 bug_revised.f90
$  ./a.out
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image              PC                Routine            Line        Source

libmpich.so.3      00007FDB1DED2CF8  Unknown               Unknown  Unknown
libmpich.so.3      00007FDB1DE51C4A  Unknown               Unknown  Unknown
libmpich.so.3      00007FDB1DE4EBF6  Unknown               Unknown  Unknown
libmpich.so.3      00007FDB1DE91749  Unknown               Unknown  Unknown
libmpich.so.3      00007FDB1DEF5A75  Unknown               Unknown  Unknown
libmpich.so.3      00007FDB1DEF5955  Unknown               Unknown  Unknown
libmpich.so.3      00007FDB1DEF5ABA  Unknown               Unknown  Unknown
a.out              0000000000404977  Unknown               Unknown  Unknown
a.out              000000000040483C  Unknown               Unknown  Unknown
libc.so.6          00007FDB1CFA0C4D  Unknown               Unknown  Unknown
a.out              0000000000404739  Unknown               Unknown  Unknown
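
To illustrate why the kind of the displacement array matters (the mistake in
my original test program, not in bug_revised.f90): if
INTEGER(kind=MPI_ADDRESS_KIND) is 8 bytes and default INTEGER is 4 bytes,
which I assume is the case on x86-64 Linux, a library that expects 8-byte
displacements will misread a default-integer array. The following Python
sketch only mimics that reinterpretation with the struct module; it does not
call MPI.

```python
import struct

# Assumptions: default INTEGER is 4 bytes, INTEGER(kind=MPI_ADDRESS_KIND) is
# 8 bytes, and the byte order is little-endian (x86-64).
default_int_array = struct.pack("<2i", 0, 8)   # storage of (/ 0, 8 /)
address_array = struct.pack("<2q", 0, 8)       # storage the library expects

# The default-integer array is only 8 bytes long, so reading it as 8-byte
# integers yields a single value, and not one of the intended displacements.
misread = struct.unpack("<1q", default_int_array)
correct = struct.unpack("<2q", address_array)

print(len(default_int_array), len(address_array))
print(misread, correct)
```

A second 8-byte displacement would have to be read from beyond the end of the
default-integer array, which would be consistent with the Valgrind warnings
Dave mentioned.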

Greetings, Thomas

> 
> -Dave
> 
> On Dec 14, 2011, at 7:33 AM CST, Thomas Jahns wrote:
> 
>> Hello,
>>
>> I'm having some problems when using mpi_type_create_f90_integer. For 8-byte
>> integers (i.e. requesting 18 decimal digits of range), I get an immediate
>> error, and for 4-byte integers I get a later segmentation fault when
>> committing a struct type containing the result of
>> mpi_type_create_f90_integer. Both bugs happen with the current Intel Fortran
>> compiler 12.1.0 20110811 and with NAG Fortran Compiler Release 5.2(747)
>> (with gcc 4.6.1 as host compiler for NAG).
>>
>> The example below exhibits both problems when compiled with either
>> WITH_BUG1 or WITH_BUG2 defined and then run.
>>
>> In the case of WITH_BUG1 I get:
>>
>> $  ~/opt/mpich2-1.4.1p1-nag52-x64-linux/bin/mpif90 -fpp -DWITH_BUG1 bug.f90
>> NAG Fortran Compiler Release 5.2(747)
>> Extension: /home/tjahns/opt/mpich2-1.4.1p1-nag52-x64-linux/include/mpif.h, line
>> 479: Byte count on numeric data type
>>           detected at *@8
>> Extension: /home/tjahns/opt/mpich2-1.4.1p1-nag52-x64-linux/include/mpif.h, line
>> 492: Byte count on numeric data type
>>           detected at *@8
>> Extension: /home/tjahns/opt/mpich2-1.4.1p1-nag52-x64-linux/include/mpif.h, line
>> 493: Byte count on numeric data type
>>           detected at *@8
>> Warning: bug.f90, line 44: Unused external reference MPI_WTICK
>>         detected at TEST_F90INT@<end-of-statement>
>> Warning: bug.f90, line 44: Unused external reference PMPI_WTICK
>>         detected at TEST_F90INT@<end-of-statement>
>> Warning: bug.f90, line 44: Unused external reference PMPI_WTIME
>>         detected at TEST_F90INT@<end-of-statement>
>> Warning: bug.f90, line 44: Unused external reference MPI_WTIME
>>         detected at TEST_F90INT@<end-of-statement>
>> [NAG Fortran Compiler normal termination, 7 warnings]
>> $ ./a.out
>> Fatal error in MPI_Type_create_f90_integer: Other MPI error, error stack:
>> MPI_Type_create_f90_integer(117):  MPI_Type_create_f90_int (range=14) failed
>> MPI_Type_create_f90_integer(97).:  No integer type with 14 digits of range is
>> avaiable
>>
>> The second bug manifests itself in a segmentation fault I've traced to dlp being
>> invalid in MPID_Segment_init.
>>
>> $ ~/opt/mpich2-1.4.1p1-nag52-x64-linux/bin/mpif90 -fpp -DWITH_BUG2 bug.f90
>> NAG Fortran Compiler Release 5.2(747)
>> Extension: /home/tjahns/opt/mpich2-1.4.1p1-nag52-x64-linux/include/mpif.h, line
>> 479: Byte count on numeric data type
>>           detected at *@8
>> Extension: /home/tjahns/opt/mpich2-1.4.1p1-nag52-x64-linux/include/mpif.h, line
>> 492: Byte count on numeric data type
>>           detected at *@8
>> Extension: /home/tjahns/opt/mpich2-1.4.1p1-nag52-x64-linux/include/mpif.h, line
>> 493: Byte count on numeric data type
>>           detected at *@8
>> Warning: bug.f90, line 44: Unused external reference MPI_WTICK
>>         detected at TEST_F90INT@<end-of-statement>
>> Warning: bug.f90, line 44: Unused external reference PMPI_WTICK
>>         detected at TEST_F90INT@<end-of-statement>
>> Warning: bug.f90, line 44: Unused external reference PMPI_WTIME
>>         detected at TEST_F90INT@<end-of-statement>
>> Warning: bug.f90, line 44: Unused external reference MPI_WTIME
>>         detected at TEST_F90INT@<end-of-statement>
>> [NAG Fortran Compiler normal termination, 7 warnings]
>> $ gdb ./a.out
>> GNU gdb (GDB) 7.0.1-debian
>> Copyright (C) 2009 Free Software Foundation, Inc.
>> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
>> This is free software: you are free to change and redistribute it.
>> There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
>> and "show warranty" for details.
>> This GDB was configured as "x86_64-linux-gnu".
>> For bug reporting instructions, please see:
>> <http://www.gnu.org/software/gdb/bugs/>...
>> Reading symbols from
>> /home/tjahns/bug-reporting/mpich2-mpi_type_create_f90_integer_problem/a.out...done.
>> (gdb)  r
>> Starting program:
>> /home/tjahns/bug-reporting/mpich2-mpi_type_create_f90_integer_problem/a.out
>> [Thread debugging using libthread_db enabled]
>>
>> Program received signal SIGSEGV, Segmentation fault.
>> MPID_Segment_init (buf=0x0, count=<value optimized out>, handle=-1946157051,
>>    segp=0x703da0, flag=<value optimized out>)
>>    at
>> ../../../../../../mpich2-1.4.1p1/src/mpid/common/datatype/dataloop/segment.c:173
>> 173	    DLOOP_Stackelm_load(elmp, dlp, 0);
>> (gdb) bt
>> #0  MPID_Segment_init (buf=0x0, count=<value optimized out>,
>>    handle=-1946157051, segp=0x703da0, flag=<value optimized out>)
>>    at
>> ../../../../../../mpich2-1.4.1p1/src/mpid/common/datatype/dataloop/segment.c:173
>> #1  0x000000000045e5ed in DLOOP_Dataloop_create_flattened_struct (count=2,
>>    blklens=0x703d7c, disps=0x703d88, oldtypes=0x703d70, dlp_p=0x6d63b0,
>>    dlsz_p=0x6d63b8, dldepth_p=0x6d63bc, flag=0)
>>    at
>> ../../../../../../mpich2-1.4.1p1/src/mpid/common/datatype/dataloop/dataloop_create_struct.c:581
>> #2  MPID_Dataloop_create_struct (count=2, blklens=0x703d7c, disps=0x703d88,
>>    oldtypes=0x703d70, dlp_p=0x6d63b0, dlsz_p=0x6d63b8, dldepth_p=0x6d63bc,
>>    flag=0)
>>    at
>> ../../../../../../mpich2-1.4.1p1/src/mpid/common/datatype/dataloop/dataloop_create_struct.c:244
>> #3  0x000000000045b91b in MPID_Dataloop_create (type=-1946157050,
>>    dlp_p=0x6d63b0, dlsz_p=0x6d63b8, dldepth_p=0x6d63bc, flag=0)
>>    at
>> ../../../../../../mpich2-1.4.1p1/src/mpid/common/datatype/dataloop/dataloop_create.c:279
>> #4  0x0000000000416b72 in MPID_Type_commit (datatype_p=0x7fffffffc5e4)
>>    at ../../../../../mpich2-1.4.1p1/src/mpid/common/datatype/mpid_type_commit.c:43
>> #5  0x000000000040dd7a in MPIR_Type_commit_impl (
>>    datatype=<value optimized out>)
>>    at ../../../../mpich2-1.4.1p1/src/mpi/datatype/type_commit.c:43
>> #6  0x000000000040df4b in PMPI_Type_commit (datatype=0x7fffffffc5e4)
>>    at ../../../../mpich2-1.4.1p1/src/mpi/datatype/type_commit.c:114
>> #7  0x000000000040d849 in pmpi_type_commit_ (v1=<value optimized out>,
>>    ierr=0x7fffffffc5e8)
>>    at ../../../../mpich2-1.4.1p1/src/binding/f77/type_commitf.c:190
>> #8  0x000000000040d609 in main (argc=1, argv=0x7fffffffc6d8) at bug.f90:24
>> (gdb)
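
A reading aid for the backtrace above: gdb prints the handle= and type=
arguments as signed ints, which hides their bit fields. The Python sketch
below converts them to unsigned hex and splits them up. The field layout is
an assumption recalled from MPICH's internal handle encoding and is not
confirmed anywhere in this thread; decode_handle is a hypothetical helper
name.

```python
# Assumed layout of a 32-bit MPICH handle (not confirmed in this thread):
# bits 31-30 handle kind, bits 29-26 object kind, bits 25-0 index.
HANDLE_KINDS = {0: "invalid", 1: "builtin", 2: "direct", 3: "indirect"}


def decode_handle(signed_value):
    """Split a handle printed by gdb as a signed int into its bit fields."""
    raw = signed_value & 0xFFFFFFFF           # view as unsigned 32-bit
    return {
        "hex": "0x%08X" % raw,
        "handle_kind": HANDLE_KINDS[raw >> 30],
        "object_kind": (raw >> 26) & 0xF,      # 3 is assumed to mean datatype
        "index": raw & 0x03FFFFFF,
    }


# handle= in frame #0 and type= in frame #3 of the backtrace
for value in (-1946157051, -1946157050):
    print(value, decode_handle(value))
```

Under that assumed layout both values decode to directly allocated datatype
objects with indices 5 and 6, so the handle values themselves look plausible.
This is only meant to make the trace easier to read, not as a diagnosis.
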
>>
>> If any extra information might be useful, I'll be happy to provide.
>>
>> Regards,
>> Thomas
>> -- 
>> Thomas Jahns
>> DKRZ GmbH, Department: Application software
>>
>> Deutsches Klimarechenzentrum
>> Bundesstraße 45a
>> D-20146 Hamburg
>>
>> Phone: +49-40-460094-151
>> Fax: +49-40-460094-270
>> Email: Thomas Jahns <jahns at dkrz.de>
>> <mpich2-1.4.1p1-nag52.log.bz2><bug.f90>
>> _______________________________________________
>> mpich-discuss mailing list     mpich-discuss at mcs.anl.gov
>> To manage subscription options or unsubscribe:
>> https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss
> 


-- 
Thomas Jahns
DKRZ GmbH, Department: Application software

Deutsches Klimarechenzentrum
Bundesstraße 45a
D-20146 Hamburg

Phone: +49-40-460094-151
Fax: +49-40-460094-270
Email: Thomas Jahns <jahns at dkrz.de>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mpif90model.h
Type: text/x-chdr
Size: 1562 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/mpich-discuss/attachments/20111219/add2f36f/attachment.h>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: bug_revised.f90
Type: text/x-fortran
Size: 1365 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/mpich-discuss/attachments/20111219/add2f36f/attachment.bin>

