[MPICH] ROMIO. Two-phased run-time error

Rajeev Thakur thakur at mcs.anl.gov
Wed Sep 19 13:59:54 CDT 2007


There is a bug in ROMIO that causes a segmentation fault for the particular
set of I/O hints used in BTIO. BTIO sets the collective buffering hints
cb_nodes to 4 and cb_buffer_size to 1000000. If you do not set these hints
and pass MPI_INFO_NULL instead, it works. To do that, add the line

    collbuf_nodes = 0

at line 29 of NPB3.2.1/NPB3.2-MPI/BT/full_mpiio.f
(just above the if statement).
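
For readers working in C rather than Fortran, the choice between the two
cases can be sketched roughly as follows. This is only an illustration under
assumptions: the io_hints struct, choose_hints, make_info, and the USE_MPI
macro are hypothetical names, not BTIO or ROMIO code, and the sketch assumes
that BTIO's if statement selects MPI_INFO_NULL when collbuf_nodes is 0, as
the workaround above implies. The MPI part is compiled only when USE_MPI is
defined, so the selection logic can be built without an MPI installation.

```c
#include <assert.h>
#include <stdio.h>

#ifdef USE_MPI              /* define when building against an MPI library */
#include <mpi.h>
#endif

/* Hypothetical container for the two hints named in this thread. */
typedef struct {
    int  use_hints;             /* 0 means: pass MPI_INFO_NULL */
    char cb_nodes[32];          /* value for the "cb_nodes" hint */
    char cb_buffer_size[32];    /* value for the "cb_buffer_size" hint */
} io_hints;

/* Assumed to mirror BTIO's test: collbuf_nodes == 0 disables the hints. */
io_hints choose_hints(int collbuf_nodes, int collbuf_size)
{
    io_hints h = {0, "", ""};
    assert(collbuf_nodes >= 0 && collbuf_size >= 0);
    if (collbuf_nodes != 0) {
        h.use_hints = 1;
        snprintf(h.cb_nodes, sizeof h.cb_nodes, "%d", collbuf_nodes);
        snprintf(h.cb_buffer_size, sizeof h.cb_buffer_size, "%d", collbuf_size);
    }
    return h;
}

#ifdef USE_MPI
/* Build the info argument that would be handed to MPI_File_open. */
MPI_Info make_info(io_hints *h)
{
    MPI_Info info = MPI_INFO_NULL;
    if (h->use_hints) {
        MPI_Info_create(&info);
        MPI_Info_set(info, "cb_nodes", h->cb_nodes);
        MPI_Info_set(info, "cb_buffer_size", h->cb_buffer_size);
    }
    return info;
}
#endif
```

In this sketch, choose_hints(0, 1000000) corresponds to the workaround (no
hints, MPI_INFO_NULL), while choose_hints(4, 1000000) reproduces the hint
values that trigger the fault. A caller that receives a non-null info object
from make_info should release it with MPI_Info_free after opening the file.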

Rajeev

> -----Original Message-----
> From: owner-mpich-discuss at mcs.anl.gov 
> [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of 
> Francisco Javier García Blas
> Sent: Wednesday, September 19, 2007 6:31 AM
> To: mpich-discuss at mcs.anl.gov
> Subject: [MPICH] ROMIO. Two-phased run-time error 
> 
> Hi all,
> 
> We did some measurements with BTIO over PVFS and PVFS2 on lonestar at 
> TACC (http://www.tacc.utexas.edu/services/userguides/lonestar/).
> 
> We launched BTIO with 4 processes and it ran, but when we launch BTIO with
> 9 or more processes the benchmark fails. We tried running BTIO on PVFS,
> PVFS2, and a Unix file system, and it failed on all of them.
> 
> From debugging, the error seems to be in collective writes: two-phase I/O
> does not invoke the file primitives.
> 
> We use mpich2-1.0.5p4 and compile it with gcc version 3.4.6.
> 
> Any clue why this happens? 
> 
> Thanks a lot
> 
> -- 
> --------------------------------------------------
> Francisco Javier García Blas
> Computer Architecture, Communications and Systems Area.
> Computer Science Department. UNIVERSIDAD CARLOS III DE MADRID
> Avda. de la Universidad, 30
> 28911 Leganés (Madrid), SPAIN
> e-mail: fjblas at arcos.inf.uc3m.es
>         fjblas at inf.uc3m.es
> Phone:(+34) 916249104
> FAX: (+34) 916249129
> --------------------------------------------------
> 
> 



