[MPICH] debug flag
Wei-keng Liao
wkliao at ece.northwestern.edu
Wed May 30 13:28:45 CDT 2007
The code file is attached. The command-line arguments are
"filename npx npy npz", where filename is the name of the output file and
npx, npy, npz are the numbers of processes along the X, Y, and Z dimensions.
The X-Y-Z dimensions of each subarray are fixed at 50 x 50 x 50. Each array
element is of type double. The global array size is hence proportional to
the number of processes. There is a fourth dimension of size 11, but it is
not partitioned.
To repeat my experiment on 4000 processes, please use
npx=20, npy=20, npz=10
For 2000 processes, change npy to 10.
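As a sanity check on the numbers: each process writes 50 x 50 x 50 x 11
doubles = 1,375,000 elements = 11,000,000 bytes, which matches the "count"
and the expected fd_size values quoted below. Also, if my arithmetic is
right, the extent of the subarray filetype is the whole global 3D array,
8 x 50^3 x npx x npy x npz bytes: 4,000,000,000 bytes for 4000 processes
but only 2,000,000,000 bytes for 2000 processes, so only the 4000-process
runs exceed the 2,147,483,647 maximum of a signed 32-bit int.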
Wei-keng
On Wed, 30 May 2007, Rajeev Thakur wrote:
>> I have written a short C code for this I/O pattern. ...
>> Let me know if you would like a copy of it.
>
> Of course!
>
>
>> -----Original Message-----
>> From: Wei-keng Liao [mailto:wkliao at ece.northwestern.edu]
>> Sent: Wednesday, May 30, 2007 11:49 AM
>> To: Rajeev Thakur
>> Cc: mpich-discuss at mcs.anl.gov
>> Subject: RE: [MPICH] debug flag
>>
>>
>> I just got the results by disabling aggregation. The coredump was
>> generated by rank 784 (out of 4000) and indicates the following info.
>>
>> ad_aggregate.c:242
>> proc = -603978814 <-- !?
>> off = -166212992 <-- !?
>> min_st_offset = 0
>> fd_len = 400
>> fd_size = 262582 <-- should be 11000000
>>
>> Going up one level to ad_write_coll.c:170, below are some of the
>> variables set by ADIOI_Calc_my_off_len() at line 101:
>> count = 1375000
>> offset = 0
>> start_offset = 407601600
>> end_offset = -2149678961 <-- should be 839993999
>> contig_access_count = 27500
>>
>> I suspect the file type is not flattened correctly.
>>
>> I have written a short C code for this I/O pattern. I ran it on 4000
>> processes and it produced the same error. On 2000 processes, it ran
>> fine, just like my program. Let me know if you would like a
>> copy of it.
>>
>> Wei-keng
>>
>>
>> On Tue, 29 May 2007, Rajeev Thakur wrote:
>>
>>> Can you try disabling aggregation and see if the error still remains?
>>> You can disable it by creating an info object as follows and passing
>>> it to File_set_view:
>>> MPI_Info_set(info, "cb_config_list", "*:*");
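>>> For example, a minimal sketch (assuming fh and ftype are already set
>>> up as in your code; error checking omitted):
>>>
>>>     MPI_Info info;
>>>     MPI_Info_create(&info);
>>>     MPI_Info_set(info, "cb_config_list", "*:*");
>>>     MPI_File_set_view(fh, 0, MPI_DOUBLE, ftype, "native", info);
>>>     MPI_Info_free(&info);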
>>>
>>> Rajeev
>>>
>>>> -----Original Message-----
>>>> From: owner-mpich-discuss at mcs.anl.gov
>>>> [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Wei-keng Liao
>>>> Sent: Tuesday, May 29, 2007 1:33 AM
>>>> To: Howard Pritchard
>>>> Cc: mpich-discuss at mcs.anl.gov
>>>> Subject: Re: [MPICH] debug flag
>>>>
>>>> Howard,
>>>>
>>>> Thanks for this information. It is very helpful. I was able to find
>>>> more details by using the debug build of MPICH. Below is what I found
>>>> from the coredump that may help in debugging the ROMIO source.
>>>>
>>>> 1. The coredump is from MPI rank 2919. I allocated 4000 MPI processes
>>>> (2000 nodes; each node has 2 CPUs). I am checking the mpich2-1.0.2
>>>> source.
>>>>
>>>> 2. MPI_Abort() is called at line 97 by function ADIOI_Calc_aggregator()
>>>> in file ad_aggregate.c, where
>>>> rank_index = 5335, fd->hints->cb_nodes = 2000, off = 2802007600,
>>>> min_off = 0, fd_size = 525164 (fd_size should be 11000000)
>>>>
>>>> 3. It is the function ADIOI_Calc_my_req() that called
>>>> ADIOI_Calc_aggregator() at line 240 of file ad_aggregate.c, where
>>>> i = 0 in the loop for (i=0; i < contig_access_count; i++)
>>>> off = 2802007600, min_st_offset = 0, fd_len = 400, fd_size = 525164
>>>> (fd_size should be 11000000)
>>>>
>>>> 4. It is the function ADIOI_GEN_WriteStridedColl() that called
>>>> ADIOI_Calc_my_req() at line 170 of file ad_write_coll.c.
>>>> I wanted to see what went wrong with fd_size in function
>>>> ADIOI_Calc_file_domains(), where fd_size is set, and saw that fd_size
>>>> is determined by st_offsets[] and end_offsets[], which depend on the
>>>> variables start_offset and end_offset.
>>>>
>>>> So I went a few lines up and checked the values of the variables
>>>> start_offset and end_offset. They were set by ADIOI_Calc_my_off_len()
>>>> at line 101, and I found that the value of end_offset must be wrong!
>>>> end_offset should always be >= start_offset, but the core shows
>>>> start_offset = 2802007600, end_offset = 244727039
>>>>
>>>> So I looked into ADIOI_Calc_my_off_len() in ad_read_coll.c and checked
>>>> the variable end_offset_ptr, which was set from the variable end_offset
>>>> at line 453, since filetype_size > 0 and filetype_is_contig == 0.
>>>> Hence, the only place end_offset is set is at line 420:
>>>> end_offset = off + frd_size - 1;
>>>> so end_offset is determined by off and frd_size. However, frd_size is
>>>> declared as an int, while end_offset is an ADIO_Offset. Maybe it is a
>>>> type overflow! At line 351, I can see a type cast:
>>>> frd_size = (int) (disp + flat_file->indices[i] + ...
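>>>> To see the suspected truncation in isolation, here is a standalone
>>>> sketch (not the ROMIO code, just the same int narrowing):
>>>>
>>>>     #include <stdio.h>
>>>>     int main(void) {
>>>>         long long off = 4000000000LL;  /* a 64-bit offset > INT_MAX */
>>>>         int frd_size = (int) off;      /* on typical twos-complement
>>>>                                           systems: -294967296 */
>>>>         long long end_offset = frd_size - 1; /* propagates as a
>>>>                                                 negative offset */
>>>>         printf("frd_size = %d, end_offset = %lld\n",
>>>>                frd_size, end_offset);
>>>>         return 0;
>>>>     }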
>>>>
>>>> Something is fishy here. Unfortunately, the coredump does not cover
>>>> this area. It looks like interactive debugging with a breakpoint
>>>> cannot be avoided.
>>>>
>>>> Wei-keng
>>>>
>>>>
>>>> On Mon, 28 May 2007, Howard Pritchard wrote:
>>>>
>>>>> Hello Wei-keng,
>>>>>
>>>>> Here is a way on xt/qk systems to compile with the debug mpich2
>>>>> library:
>>>>>
>>>>> 1) do
>>>>> module show xt-mpt
>>>>>
>>>>> to see which mpich2 the system manager has made the default.
>>>>>
>>>>> For instance, on an internal system here at Cray this command shows:
>>>>>
>>>>> -------------------------------------------------------------------
>>>>> /opt/modulefiles/xt-mpt/1.5.49:
>>>>>
>>>>> setenv       MPT_DIR                  /opt/xt-mpt/1.5.49
>>>>> setenv       MPICHBASEDIR             /opt/xt-mpt/1.5.49/mpich2-64
>>>>> setenv       MPICH_DIR                /opt/xt-mpt/1.5.49/mpich2-64/P2
>>>>> setenv       MPICH_DIR_FTN_DEFAULT64  /opt/xt-mpt/1.5.49/mpich2-64/P2W
>>>>> prepend-path LD_LIBRARY_PATH          /opt/xt-mpt/1.5.49/mpich2-64/P2/lib
>>>>> prepend-path PATH                     /opt/xt-mpt/1.5.49/mpich2-64/P2/bin
>>>>> prepend-path MANPATH                  /opt/xt-mpt/1.5.49/mpich2-64/man
>>>>> prepend-path MANPATH                  /opt/xt-mpt/1.5.49/romio/man
>>>>> prepend-path PE_PRODUCT_LIST          MPT
>>>>> -------------------------------------------------------------------
>>>>>
>>>>> The debug library you want to use is thus going to be picked up by
>>>>> the mpicc installed at:
>>>>>
>>>>> /opt/xt-mpt/1.5.49/mpich2-64/P2DB
>>>>>
>>>>> 2) Now with the Cray compiler scripts like cc, ftn, etc., you specify
>>>>> the alternate location to use for compiling/linking by
>>>>>
>>>>> cc -driverpath=/opt/xt-mpt/1.5.49/mpich2-64/P2DB/bin -o a.out.debug ......
>>>>>
>>>>> or whichever path is appropriate for the xt-mpt installed on your
>>>>> system.
>>>>>
>>>>> 3) When you rerun the binary, you may want to set the MPICH_DBMASK
>>>>> environment variable to 0x200.
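>>>>> For example, under csh:
>>>>>
>>>>>     setenv MPICH_DBMASK 0x200
>>>>>
>>>>> or under sh/bash:
>>>>>
>>>>>     export MPICH_DBMASK=0x200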
>>>>>
>>>>> I am pretty sure you are running out of memory, based on the area in
>>>>> ADIOI_Calc_my_req where the error arises. Clearly this is not a very
>>>>> good way to report an out-of-memory condition. I'll investigate.
>>>>>
>>>>> You may be able to save some memory by tweaking the environment
>>>>> variables controlling MPI buffer space. Refer to the intro_mpi man
>>>>> page on your xt/qk system.
>>>>>
>>>>> Hope this helps,
>>>>>
>>>>> Howard
>>>>>
>>>>> Wei-keng Liao wrote:
>>>>>
>>>>>>
>>>>>> Well, I am aware of mpich2version, but unfortunately that command is
>>>>>> not available to users on that machine. The only commands available
>>>>>> to me are mpicc, mpif77, mpif90, and mpicxx.
>>>>>>
>>>>>> Wei-keng
>>>>>>
>>>>>>
>>>>>> On Fri, 25 May 2007, Anthony Chan wrote:
>>>>>>
>>>>>>>
>>>>>>> <mpich2-install-dir>/bin/mpich2version may show if --enable-g is set.
>>>>>>>
>>>>>>> A.Chan
>>>>>>>
>>>>>>> On Fri, 25 May 2007, Wei-keng Liao wrote:
>>>>>>>
>>>>>>>>
>>>>>>>> The problem is that I cannot run my own MPICH on the machine. I
>>>>>>>> can see that the MPICH I am using is version 2-1.0.2 from peeking
>>>>>>>> at the mpif90 script. Is there a way to know whether it was built
>>>>>>>> using the --enable-g=dbg option from the mpif90 script?
>>>>>>>>
>>>>>>>> I don't know if this helps, but below is the whole error message:
>>>>>>>>
>>>>>>>> aborting job:
>>>>>>>> application called MPI_Abort(MPI_COMM_WORLD, 1) - process <id>
>>>>>>>> (there are 4000 lines, each with a distinct id number)
>>>>>>>>
>>>>>>>> ----- DEBUG: PCB, CONTEXT, STACK TRACE ---------------------
>>>>>>>>
>>>>>>>> PROCESSOR [ 0]
>>>>>>>> log_nid = 15  phys_nid = 0x98  host_id = 7691  host_pid = 18545
>>>>>>>> group_id = 12003  num_procs = 4000  rank = 15  local_pid = 3
>>>>>>>> base_node_index = 0 last_node_index = 1999
>>>>>>>>
>>>>>>>> text_base = 0x00000000200000 text_len = 0x00000000400000
>>>>>>>> data_base = 0x00000000600000 data_len = 0x00000000a00000
>>>>>>>> stack_base = 0x000000fec00000 stack_len = 0x00000001000000
>>>>>>>> heap_base = 0x00000001200000 heap_len = 0x0000007b000000
>>>>>>>>
>>>>>>>> ss  = 0x000000000000001f  fs  = 000000000000000000  gs  = 0x0000000000000017
>>>>>>>> rip = 0x00000000002d46fe
>>>>>>>> rdi = 0x0000000006133a90  rsi = 0xffffffffdc0003c2  rbp = 0x00000000ffbf9d40
>>>>>>>> rsp = 0x00000000ffbf9cc0  rbx = 0x0000000000000190  rdx = 0x000000003eb08c39
>>>>>>>> rcx = 0x0000000008ea18b0  rax = 0x0000000008ecff30  cs  = 0x000000000000001f
>>>>>>>> R8  = 0x0000000007ad2ab0  R9  = 0xfffffffffffffe0c  R10 = 0x0000000008e6bd30
>>>>>>>> R11 = 0x0000000000000262  R12 = 0x0000000000000a8c  R13 = 0xfffffffff0538770
>>>>>>>> R14 = 0x00000000fffffe0c  R15 = 0x0000000008ed3dc0
>>>>>>>> rflg = 0x0000000000010206  prev_sp = 0x00000000ffbf9cc0
>>>>>>>> error_code = 6
>>>>>>>>
>>>>>>>> SIGNAL #[11][Segmentation fault] fault_address = 0xffffffff78ed4cc8
>>>>>>>> 0xffbf9cc0  0x ffbf9cf0  0x fa0  0x a00006b6c  0x a8c3e9ab7ff
>>>>>>>> 0xffbf9ce0  0x 8ed7c50  0x 7d0  0x 0  0x 6b6c002d455b
>>>>>>>> 0xffbf9d00  0x 8ea18b0  0x 8e6bd30  0x 61338a0  0x fa0
>>>>>>>> 0xffbf9d20  0x 0  0x 61338a0  0x 8036c  0x 8ec4390
>>>>>>>> 0xffbf9d40  0x ffbf9e80  0x 2d2280  0x 8ecff30  0x 8036c
>>>>>>>> 0xffbf9d60  0x fa0  0x ffbf9de4  0x ffbf9de8  0x ffbf9df0
>>>>>>>> 0xffbf9d80  0x ffbf9df8  0x 0  0x 0  0x 8ebc680
>>>>>>>> 0xffbf9da0  0x 1770  0x 7d000a39f88  0x 0  0x 650048174f
>>>>>>>> 0xffbf9dc0  0x 14fb184c000829  0x 6a93500  0x ffbf9e30  0x 292e54
>>>>>>>> 0xffbf9de0  0x 0  0x 8ed3dc0  0x 7af  0x 0
>>>>>>>> 0xffbf9e00  0x 0  0x 8ecc0a0  0x 8ecff30  0x 8036c
>>>>>>>> 0xffbf9e20  0x 100000014  0x 8e6bd30  0x 8ea18b0  0x 1770
>>>>>>>> 0xffbf9e40  0xffffffff6793163f  0x 6b6c00a39fa8  0x fa00000000f  0x 61338a0
>>>>>>>> 0xffbf9e60  0x 4c000829  0x 14fb18  0x 65  0x 0
>>>>>>>> 0xffbf9e80  0x ffbf9ee0  0x 2a397c  0x 866b60  0x ffbf9eb0
>>>>>>>>
>>>>>>>>
>>>>>>>> Stack Trace: ------------------------------
>>>>>>>> #0 0x00000000002d46fe in ADIOI_Calc_my_req()
>>>>>>>> #1 0x00000000002d2280 in ADIOI_GEN_WriteStridedColl()
>>>>>>>> #2 0x00000000002a397c in MPIOI_File_write_all()
>>>>>>>> #3 0x00000000002a3a4a in PMPI_File_write_all()
>>>>>>>> #4 0x00000000002913a8 in pmpi_file_write_all_()
>>>>>>>> could not find symbol for addr 0x73696e6966204f49
>>>>>>>> --------------------------------------------
>>>>>>>>
>>>>>>>> On Fri, 25 May 2007, Robert Latham wrote:
>>>>>>>>
>>>>>>>>> On Fri, May 25, 2007 at 03:56:16PM -0500, Wei-keng Liao wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I have an MPI I/O application that runs fine up to 1000 processes,
>>>>>>>>>> but fails when using 4000 processes. Parts of the error message are
>>>>>>>>>> ...
>>>>>>>>>> Stack Trace: ------------------------------
>>>>>>>>>> #0 0x00000000002d46fe in ADIOI_Calc_my_req()
>>>>>>>>>> #1 0x00000000002d2280 in ADIOI_GEN_WriteStridedColl()
>>>>>>>>>> #2 0x00000000002a397c in MPIOI_File_write_all()
>>>>>>>>>> #3 0x00000000002a3a4a in PMPI_File_write_all()
>>>>>>>>>> #4 0x00000000002913a8 in pmpi_file_write_all_()
>>>>>>>>>> could not find symbol for addr 0x73696e6966204f49
>>>>>>>>>> aborting job:
>>>>>>>>>> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 1456
>>>>>>>>>> ...
>>>>>>>>>>
>>>>>>>>>> My question is: what debug flags should I use for compiling and
>>>>>>>>>> running in order to find the exact location in the function
>>>>>>>>>> ADIOI_Calc_my_req() that causes this error?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Hi Wei-keng
>>>>>>>>>
>>>>>>>>> If you build MPICH2 with --enable-g=dbg, then all of MPI will be
>>>>>>>>> built with debugging symbols. Be sure to 'make clean' first: the
>>>>>>>>> ROMIO objects might not rebuild otherwise.
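>>>>>>>>> For example (a sketch; substitute your usual configure options
>>>>>>>>> for the bracketed placeholder):
>>>>>>>>>
>>>>>>>>>     ./configure --enable-g=dbg [other options]
>>>>>>>>>     make clean && make && make install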
>>>>>>>>>
>>>>>>>>> I wonder what caused the abort? Maybe ADIOI_Malloc failed to
>>>>>>>>> allocate memory? Well, a stack trace with debugging symbols should
>>>>>>>>> be interesting.
>>>>>>>>>
>>>>>>>>> ==rob
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Rob Latham
>>>>>>>>> Mathematics and Computer Science Division    A215 0178 EA2D B059 8CDF
>>>>>>>>> Argonne National Lab, IL USA                 B29D F333 664A 4280 315B
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>>
>
-------------- next part --------------
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
/*----< main() >------------------------------------------------------------*/
int main(int argc, char **argv) {
    int i, err, rank, np, buf_size, debug;
    double *buf;
    int np_dim[3], rank_dim[3], array_of_sizes[3];
    int array_of_subsizes[3], array_of_starts[3];
    MPI_File fh;
    MPI_Datatype ftype;
    MPI_Status status;
    MPI_Info info;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &np);

    if (argc != 5) {
        fprintf(stderr, "Usage: %s filename npx npy npz\n", argv[0]);
        MPI_Finalize();
        exit(1);
    }

    debug = 0;
    for (i=0; i<3; i++) {
        np_dim[i]            = atoi(argv[2+i]); /* no. processes in each dim */
        array_of_sizes[i]    = 50 * np_dim[i];  /* global 3D array size */
        array_of_subsizes[i] = 50;              /* subarray size is fixed */
    }
    if (debug) {
        printf("%d: np_dim = %d %d %d\n", rank, np_dim[0], np_dim[1], np_dim[2]);
        printf("%d: array_of_sizes = %d %d %d\n", rank, array_of_sizes[0],
               array_of_sizes[1], array_of_sizes[2]);
        printf("%d: array_of_subsizes = %d %d %d\n", rank, array_of_subsizes[0],
               array_of_subsizes[1], array_of_subsizes[2]);
    }

    /* check if the number of processes matches the decomposition */
    if (np != np_dim[0]*np_dim[1]*np_dim[2]) {
        fprintf(stderr, "Error: process number mismatch ");
        fprintf(stderr, "npx(%d) npy(%d) npz(%d) total(%d)\n",
                np_dim[0], np_dim[1], np_dim[2], np);
        MPI_Finalize();
        exit(1);
    }

    /* process rank in each dimension */
    rank_dim[0] =  rank %  np_dim[0];
    rank_dim[1] = (rank /  np_dim[0]) % np_dim[1];
    rank_dim[2] =  rank / (np_dim[0]  * np_dim[1]);

    /* starting coordinates of the subarray in each dimension */
    for (i=0; i<3; i++)
        array_of_starts[i] = 50 * rank_dim[i];

    if (debug) {
        printf("%d: rank_dim = %d %d %d\n", rank, rank_dim[0], rank_dim[1],
               rank_dim[2]);
        printf("%d: array_of_starts = %d %d %d\n", rank, array_of_starts[0],
               array_of_starts[1], array_of_starts[2]);
    }

    /* create the file type: this rank's 50^3 block of the global array */
    MPI_Type_create_subarray(3, array_of_sizes, array_of_subsizes,
                             array_of_starts, MPI_ORDER_FORTRAN,
                             MPI_DOUBLE, &ftype);
    MPI_Type_commit(&ftype);

    /* create MPI I/O hint */
    MPI_Info_create(&info);
    MPI_Info_set(info, "romio_no_indep_rw", "true");

    /* disable I/O aggregation */
    /*
    MPI_Info_set(info, "cb_config_list", "*:*");
    */

    /* open the file */
    err = MPI_File_open(MPI_COMM_WORLD, argv[1],
                        MPI_MODE_CREATE | MPI_MODE_WRONLY,
                        info, &fh);
    if (err != MPI_SUCCESS) {
        printf("Error: MPI_File_open() filename %s\n", argv[1]);
        MPI_Abort(MPI_COMM_WORLD, -1);
        exit(1);
    }

    /* set the file view */
    MPI_File_set_view(fh, 0, MPI_DOUBLE, ftype, "native", MPI_INFO_NULL);

    /* prepare the write buffer (contents are left uninitialized: only the
       I/O pattern matters for this test) */
    buf_size = 11;                   /* fourth dimension, not partitioned */
    for (i=0; i<3; i++)
        buf_size *= array_of_subsizes[i];
    buf = (double*) malloc(buf_size * sizeof(double));
    if (debug)
        printf("%d: buf_size = %lu bytes\n", rank,
               (unsigned long)(buf_size * sizeof(double)));

    /* MPI collective write */
    MPI_File_write_all(fh, buf, buf_size, MPI_DOUBLE, &status);

    MPI_File_close(&fh);
    free(buf);
    MPI_Type_free(&ftype);
    MPI_Info_free(&info);
    MPI_Finalize();
    return 0;
}
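
A possible way to build and run the attached test (a sketch; the wrapper
and launcher names, and the file names, are placeholders that will vary
by system):

    mpicc -o subarray_test subarray_test.c
    mpiexec -n 4000 ./subarray_test output.dat 20 20 10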