[MPICH] Help With I/O

Matthew Chambers matthew.chambers at vanderbilt.edu
Wed Apr 18 09:39:50 CDT 2007


Hi Erich,

 

Frankly the code leading up to that for loop is a bit scary.  I don't really
know anything about MPI I/O, but I can give you a few tips on your C++:

-          Unless you're using a horribly out of date compiler like MSVC 6,
you should use the standard header names <iostream>, <vector>, <ctime>, etc.

-          If you are using a horribly out of date compiler like MSVC 6, you
should upgrade to the free MSVC++ 2005 Express Edition.

-          In this case it's a cosmetic fix, but you should probably pass
the vector<bool> parameter by reference instead of by value.

-          You seem to be doing some mind boggling casting in order to
determine if num_elements_per_rank is too big to fit in an int (but only on
your last process?).  You might get rid of that voodoo by using size_t
(usually at least an unsigned 4 byte int) for your position indexes instead
(vector<T>::size() returns vector<T>::size_type which usually boils down to
size_t).
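
As a rough sketch of the last two tips (not a drop-in replacement for QueryData), the split can be done entirely in size_t arithmetic, with the vector passed by const reference. The names local_range and count_selected are made up for illustration, and the scheme hands the leftover elements to the lowest ranks, which is a different choice from giving the remainder to the last process:

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Half-open range [first, last) of indexes owned by `rank`.
// The first (n % num_ranks) ranks each take one extra element, so no
// floating-point arithmetic or casting is needed.
std::pair<std::size_t, std::size_t>
local_range(std::size_t n, std::size_t num_ranks, std::size_t rank)
{
    assert(num_ranks > 0 && rank < num_ranks);
    const std::size_t base  = n / num_ranks;   // elements every rank gets
    const std::size_t extra = n % num_ranks;   // leftovers to spread out
    const std::size_t first = rank * base + (rank < extra ? rank : extra);
    const std::size_t count = base + (rank < extra ? 1 : 0);
    return std::make_pair(first, first + count);
}

// Hypothetical helper: count the "true" entries in this rank's slice.
// The vector is taken by const reference, so it is not copied.
std::size_t count_selected(const std::vector<bool>& flags,
                           std::size_t num_ranks, std::size_t rank)
{
    const std::pair<std::size_t, std::size_t> r =
        local_range(flags.size(), num_ranks, rank);
    std::size_t selected = 0;
    for (std::size_t i = r.first; i < r.second; ++i)
    {
        if (flags[i])
            ++selected;
    }
    return selected;
}
```

With 7 elements over 3 processes this gives rank 0 the indexes 0 to 2, rank 1 the indexes 3 to 4, and rank 2 the indexes 5 to 6. The loop then runs from first up to (but not including) last, so there is no "+ 1" or "- 1" bookkeeping.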

 

Beyond that, I would need to see some debug output from your for loop.  For
example, what indexes are actually being passed to the I/O calls by each
process?  Does MPI::File::Read_at() allocate memory for the "buf" variable
you pass it?  If not, you haven't allocated any memory for it and that would
lead to a crash before you could say "new char."
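
For what it's worth, the MPI read calls take a buffer that the caller provides and fill it in; they do not allocate one. Below is a minimal sketch of that pattern without MPI, so it can be tried on any machine. The function read_record_at is a stand-in written for this example (it is not an MPI call), fetch_record is likewise hypothetical, and an in-memory stream plays the part of the data file:

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

const std::size_t kRecordSize = 20;  // record length from the original post

// Stand-in for a positional read. Like read-style calls in general, it
// writes into storage the caller supplies and allocates nothing itself.
bool read_record_at(std::istream& in, std::size_t index, char* buf)
{
    in.clear();  // drop any error flags left by an earlier short read
    in.seekg(static_cast<std::streamoff>(index * kRecordSize));
    in.read(buf, static_cast<std::streamsize>(kRecordSize));
    return in.gcount() == static_cast<std::streamsize>(kRecordSize);
}

// Hypothetical helper: returns the record, or an empty string if a full
// record could not be read at that index.
std::string fetch_record(std::istream& in, std::size_t index)
{
    std::vector<char> buf(kRecordSize);  // storage exists before the read
    if (!read_record_at(in, index, &buf[0]))
        return std::string();
    return std::string(buf.begin(), buf.end());
}
```

In the posted code the equivalent change would be to give buf real storage (for example a 20-element std::vector<char>, or a char array) before the loop and pass a pointer to its first element to the read and write calls.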

 

Good luck,

Matt Chambers

 

  _____  

From: owner-mpich-discuss at mcs.anl.gov
[mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Erich Peterson
Sent: Wednesday, April 18, 2007 3:02 AM
To: mpich-discuss at mcs.anl.gov
Subject: [MPICH] Help With I/O

 

Hi all, I'm trying to write this little routine which is part of my graduate
project. What I'm trying to do is pass a vector of bools in to the QueryData
method. In that method I split the vector up into equal parts among the
number of processes, have each process open a datafile which has 20-byte
records (no new lines), read that record from the file if the vector they
are checking has "true" (basically if vector[0] = true, it means grab the
first record of the file), and lastly, it should output that record into a
new file.
 
I have been able to determine it is messing up on the for-loop. The error
is:
 
[Student at cluster1 erich_test_area]$ mpiexec -n 3
/mnt/pvfs2/acxiom/erich_test_area/RecordRetrieval
terminate called after throwing an instance of 'MPI::Exception'
terminate called after throwing an instance of 'MPI::Exception'
terminate called after throwing an instance of 'MPI::Exception'
rank 0 in job 1  cluster1_33602   caused collective abort of all ranks
  exit status of rank 0: killed by signal 6 

Could someone please tell me what is wrong, or edit the code if they see
the problem? Thanks!
 
Main.cpp:

#include "RecordRetrieval.h"
#include <vector.h>
int main()
{
   vector<bool> vec;
   vec.push_back(true);
   vec.push_back(false);
   vec.push_back(true);
   vec.push_back(false);
   vec.push_back(true);
   vec.push_back(false);
   RecordRetrieval rec;
   rec.QueryData(vec, "test.dat");
   return 0;
}

RecordRetrieval.cpp:
 
#include "RecordRetrieval.h"
#include "mpi.h"
#include "time.h"
#include "iostream.h"
void RecordRetrieval::QueryData(vector<bool> unencoded_vector, char *
filename)
{
    int num_processes;
    int num_vector_elements;
    float num_elements_per_rank;
    int local_start_position;
    int local_end_position;
    char * buf;
    int my_rank;
    MPI::File input_file;
    MPI::Status input_file_status;
    MPI::File output_file;
    MPI::Status output_file_status;
    //MPI::Offset filesize;
    char output_filename[30];
    size_t i;
    struct tm tim;
    time_t now;
    now = time(NULL);
    tim = *(localtime(&now));
    i = strftime(output_filename, 30, "%m_%d_%Y_%H_%M_%S", &tim);

    /* Let the system do what it needs to start up MPI */
    MPI::Init();
    /* Get my process rank */
    my_rank = MPI::COMM_WORLD.Get_rank();
    /* Find out how many processes are being used */
    num_processes = MPI::COMM_WORLD.Get_size();
    num_vector_elements = unencoded_vector.size();

    num_elements_per_rank = num_vector_elements / num_processes;
    local_start_position = my_rank * (int)num_elements_per_rank;
    if(my_rank == num_processes - 1)
    {
        if(num_elements_per_rank * num_processes ==
(int)num_elements_per_rank * num_processes)
        {
            local_end_position = local_start_position +
((int)num_elements_per_rank - 1);
        }
        else
        {
            local_end_position = (local_start_position +
(int)num_elements_per_rank - 1) +
                (((int)num_elements_per_rank * num_processes) -
((int)num_elements_per_rank * num_processes));
        }
    }
    else
    {
        local_end_position = local_start_position +
((int)num_elements_per_rank - 1);
    }
    input_file = MPI::File::Open(MPI::COMM_WORLD, filename,
MPI::MODE_RDONLY,
                    MPI::INFO_NULL);
    output_file = MPI::File::Open(MPI::COMM_WORLD, output_filename,
MPI::MODE_CREATE | MPI::MODE_WRONLY, MPI::INFO_NULL);
    // filesize = input_file.Get_size();
    for(int i = local_start_position; i < local_end_position + 1; i++)
    {
        if(unencoded_vector[i])
        {
            input_file.Read_at(i * 20, buf, 20, MPI_CHAR,
input_file_status);
            output_file.Write_shared(buf, 20, MPI_CHAR, output_file_status);
        }
    }
    cout << "Error";
    input_file.Close();
    output_file.Close();
                                           
    MPI::Finalize();
}
