[mpich-discuss] Facing problem while using MPI_File_set_view

Christina Patrick christina.subscribes at gmail.com
Wed May 27 14:04:04 CDT 2009


I suspect there is a bug in the MPI code when
MPI_Type_create_subarray() is used to create a row file view as shown
below. If you change the buffer size in my program to something
smaller than the entire view, you will get an error. I also checked
MPI_Type_create_subarray() with column and block file views, and they
work fine.

If somebody could please take a look at this problem, it would be
really helpful.

Thanks and Regards,
Christina.

On Mon, May 25, 2009 at 5:19 PM, Christina Patrick
<christina.subscribes at gmail.com> wrote:
> Hi Everybody,
>
> I am writing a program to read a 16384 x 16384 array of type double
> (a file of about 2 GB).
> I am using the following file view:
>
> ------------------------------
> |                             |    P0
> ------------------------------
> |                             |    P1
> ------------------------------
> |                             |    P2
> ------------------------------
> |                             |    P3
> ------------------------------
>
> I create the file view using MPI_Type_create_subarray() and read it
> with the collective call MPI_File_read_all().
>
> When I read the entire file view of P0 (P1, P2, P3) into a 512 MB
> buffer in a single call, the program works fine.
> However, I do not want to use a buffer as big as 512 MB. When I use a
> smaller buffer (such as 8 MB) and iterate over the file view of P0
> (P1, P2, P3), my program starts throwing errors or segfaulting:
> [E 16:49:28.377861] Error: payload_progress: Bad address
> OR
> segmentation fault in ADIOI_Calc_my_off_len() at line: while
> (flat_file->type != fd->filetype) flat_file = flat_file->next;
> because flat_file becomes 0x0 in one of the processes.
>
> If I use a column view and logically do the same thing, I do not face
> this problem. (I hope that I have been able to explain the problem. In
> case of doubt, please let me know.)
>
> Could somebody please help me,
>
> Thanks and Regards,
> Christina.
>
> PS: I am pasting the program below:
>
> #include "mpi.h"
> #include <stdio.h>
> #include <string.h>
> #include <stdlib.h>
> #include <math.h>
> #include <errno.h>
>
> #define ROWS                (16384)
> #define COLS                (16384)
> #define MPI_DATATYPE        (MPI_DOUBLE)
> #define C_DATATYPE          double
> #define DIMS                (2)
> #define COLL_BUFSIZE        (536870912)
>
> int main(int argc, char **argv) {
>  char          fname[] = "pvfs2:/home/mdl/patrick/pvfs2/testfile";
>  int           i = 0, nprocs = 0, mynod = 0, provided = 0, c_size = 0,
>                mpi_size = 0, iterations = 0, array_size[] = {0, 0},
>                array_subsize[] = {0, 0}, array_start[] = {0, 0};
>  long          rows = 0l, cols = 0l, coll_bufsize = 0l, rows_view = 0l,
>                cols_view = 0l, rows_collbuf = 0l, cols_collbuf = 0l,
>                elts_collbuf = 0l;
>  unsigned long filesize = 0l;
>  double        *buffer = NULL;
>  MPI_File      fhandle;
>  MPI_Status    status;
>  MPI_Datatype  subarray;
>
>  MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
>  MPI_Comm_rank(MPI_COMM_WORLD, &mynod);
>  MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
>
>  MPI_Type_size(MPI_DATATYPE, &mpi_size);
>  c_size     = sizeof(C_DATATYPE);
>  if(c_size != mpi_size) {
>    fprintf(stderr, "Datatypes in MPI and C do not match\n");
>    MPI_Abort(MPI_COMM_WORLD, EIO);
>  }
>
>  rows             = ROWS;
>  cols             = COLS;
>  coll_bufsize     = COLL_BUFSIZE;
>  elts_collbuf     = coll_bufsize / mpi_size;
>  rows_view        = rows / nprocs;
>  cols_view        = cols;
>  cols_collbuf     = cols_view;
>  rows_collbuf     = elts_collbuf / cols_collbuf;
>  filesize         = rows * cols * mpi_size;
>  array_size[0]    = rows;
>  array_size[1]    = cols;
>  array_subsize[0] = rows_view;
>  array_subsize[1] = cols_view;
>  array_start[0]   = rows_view * mynod;
>  array_start[1]   = 0;
>
>  buffer = (C_DATATYPE *)malloc(coll_bufsize);
>  if(!buffer) {
>    fprintf(stderr, "malloc error\n");
>    MPI_Abort(MPI_COMM_WORLD, ENOMEM);
>  }
>
>  MPI_File_open(MPI_COMM_WORLD, fname, MPI_MODE_RDONLY, MPI_INFO_NULL,
>                &fhandle);
>
>  MPI_Type_create_subarray(DIMS, array_size, array_subsize, array_start,
>                           MPI_ORDER_C, MPI_DATATYPE, &subarray);
>  MPI_Type_commit(&subarray);
>  MPI_File_set_view(fhandle, 0, MPI_DATATYPE, subarray, "native",
>                    MPI_INFO_NULL);
>
>  iterations = rows_view / rows_collbuf;
>
>  for(i = 0; i < iterations; i++)
>    MPI_File_read_all(fhandle, buffer, elts_collbuf, MPI_DATATYPE, &status);
>
>  MPI_File_close(&fhandle);
>  MPI_Type_free(&subarray);
>  free(buffer);
>  MPI_Finalize();
>
>  return 0;
> }
>

