[MPICH] Help With I/O
Matthew Chambers
matthew.chambers at vanderbilt.edu
Wed Apr 18 09:39:50 CDT 2007
Hi Erich,
Frankly the code leading up to that for loop is a bit scary. I don't really
know anything about MPI I/O, but I can give you a few tips on your C++:
- Unless you're using a horribly out of date compiler like MSVC 6,
you should use the standard header names <iostream>, <vector>, <ctime>, etc.
- If you are using a horribly out of date compiler like MSVC 6, you
should upgrade to the free MSVC++ 2005 Express Edition.
- In this case it's a cosmetic fix, but you should probably pass
the vector<bool> parameter by reference instead of by value.
- You seem to be doing some mind boggling casting in order to
determine if num_elements_per_rank is too big to fit in an int (but only on
your last process?). You might get rid of that voodoo by using size_t
(usually at least an unsigned 4 byte int) for your position indexes instead
(vector<T>::size() returns vector<T>::size_type which usually boils down to
size_t).
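To make the last two tips concrete, here is a minimal sketch of splitting the vector's indexes across ranks using size_t and plain integer arithmetic, with the vector taken by const reference. The names block_range and count_selected are hypothetical and not from the posted code, and giving the first (n % num_ranks) ranks one extra element is only an assumption about how leftovers should be handled. Note that in the posted code the division is done on two ints before the result is stored in the float, so any remainder is already lost by the time it is checked.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (not from the original post): returns the half-open
// index range [begin, end) that `rank` should handle when `n` elements are
// split across `num_ranks` processes. Assumption: the first (n % num_ranks)
// ranks each take one extra element.
std::pair<std::size_t, std::size_t>
block_range(std::size_t n, std::size_t num_ranks, std::size_t rank)
{
    assert(num_ranks > 0 && rank < num_ranks);
    const std::size_t base = n / num_ranks;   // elements every rank gets
    const std::size_t extra = n % num_ranks;  // leftovers to spread out
    const std::size_t begin = rank * base + (rank < extra ? rank : extra);
    const std::size_t end = begin + base + (rank < extra ? 1 : 0);
    return std::make_pair(begin, end);
}

// Taking the vector by const reference avoids copying it on each call.
// Counts the "true" entries in [begin, end), clamped to the vector's size.
std::size_t count_selected(const std::vector<bool>& flags,
                           std::size_t begin, std::size_t end)
{
    std::size_t count = 0;
    for (std::size_t i = begin; i < end && i < flags.size(); ++i)
    {
        if (flags[i])
            ++count;
    }
    return count;
}
```

In the MPI program, n would be unencoded_vector.size(), and rank and num_ranks would come from Get_rank() and Get_size(); each process would then loop over its own [begin, end) range.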
Beyond that, I would need to see some debug output from your for loop. For
example, what indexes are actually being passed to the I/O calls by each
process? Does MPI::File::Read_at() allocate memory for the "buf" variable
you pass it? If not, you haven't allocated any memory for it and that would
lead to a crash before you could say "new char."
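On the buffer question: as far as I know, the MPI read routines fill caller-supplied memory rather than allocating it, so "buf" has to point at real storage before the call. The sketch below shows that pattern, plus the kind of debug line each process could print. A standard C++ stream stands in for the MPI file so the example runs without MPI; read_record, describe_read, and kRecordSize are hypothetical names, and the 20-byte record size is taken from the original post.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

const std::size_t kRecordSize = 20;  // record length stated in the original post

// Hypothetical stand-in for the MPI read: copies record `index` from `in`
// into caller-owned storage. Returns false if a full record was not read.
bool read_record(std::istream& in, std::size_t index, std::vector<char>& buf)
{
    buf.resize(kRecordSize);  // the storage exists before the read happens
    in.clear();               // reset any EOF/fail state from an earlier call
    in.seekg(static_cast<std::streamoff>(index * kRecordSize), std::ios::beg);
    in.read(&buf[0], static_cast<std::streamsize>(kRecordSize));
    return in.gcount() == static_cast<std::streamsize>(kRecordSize);
}

// Hypothetical debug helper: the line a process could print before each read
// to show which index and byte offset it is about to use.
std::string describe_read(int rank, std::size_t index)
{
    std::ostringstream out;
    out << "rank " << rank << ": record " << index
        << " at byte offset " << index * kRecordSize;
    return out.str();
}
```

In the MPI version, the same pre-sized storage (for example &buf[0] of a 20-element std::vector<char>) is what would be passed to Read_at() and Write_shared(), and each rank could print the describe_read() line just before its read.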
Good luck,
Matt Chambers
_____
From: owner-mpich-discuss at mcs.anl.gov
[mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Erich Peterson
Sent: Wednesday, April 18, 2007 3:02 AM
To: mpich-discuss at mcs.anl.gov
Subject: [MPICH] Help With I/O
Hi all, I'm trying to write this little routine which is part of my graduate
project. What I'm trying to do is pass a vector of bools in to the QueryData
method. In that method I split the vector up into equal parts among the
number of processes, have each process open a datafile which has 20 byte
records (no new lines), read that record from the file if the vector they
are checking has "true" (basically if vector[0] = true, it means grab the
first record of the file), and lastly, it should output that record into a
new file.
I have been able to determine it is messing up on the for-loop. The error
is:
[Student at cluster1 erich_test_area]$ mpiexec -n 3
/mnt/pvfs2/acxiom/erich_test_area/RecordRetrieval
terminate called after throwing an instance of 'MPI::Exception'
terminate called after throwing an instance of 'MPI::Exception'
terminate called after throwing an instance of 'MPI::Exception'
rank 0 in job 1 cluster1_33602 caused collective abort of all ranks
exit status of rank 0: killed by signal 6
Could someone please tell me, or edit the code, if they see what is wrong?
Thanks!
Main.cpp:
#include "RecordRetrieval.h"
#include <vector.h>

int main()
{
    vector<bool> vec;
    vec.push_back(true);
    vec.push_back(false);
    vec.push_back(true);
    vec.push_back(false);
    vec.push_back(true);
    vec.push_back(false);

    RecordRetrieval rec;
    rec.QueryData(vec, "test.dat");
    return 0;
}
RecordRetrieval.cpp:
#include "RecordRetrieval.h"
#include "mpi.h"
#include "time.h"
#include "iostream.h"

void RecordRetrieval::QueryData(vector<bool> unencoded_vector, char * filename)
{
    int num_processes;
    int num_vector_elements;
    float num_elements_per_rank;
    int local_start_position;
    int local_end_position;
    char * buf;
    int my_rank;
    MPI::File input_file;
    MPI::Status input_file_status;
    MPI::File output_file;
    MPI::Status output_file_status;
    //MPI::Offset filesize;
    char output_filename[30];
    size_t i;
    struct tm tim;
    time_t now;

    now = time(NULL);
    tim = *(localtime(&now));
    i = strftime(output_filename, 30, "%m_%d_%Y_%H_%M_%S", &tim);

    /* Let the system do what it needs to start up MPI */
    MPI::Init();

    /* Get my process rank */
    my_rank = MPI::COMM_WORLD.Get_rank();

    /* Find out how many processes are being used */
    num_processes = MPI::COMM_WORLD.Get_size();

    num_vector_elements = unencoded_vector.size();
    num_elements_per_rank = num_vector_elements / num_processes;
    local_start_position = my_rank * (int)num_elements_per_rank;

    if(my_rank == num_processes - 1)
    {
        if(num_elements_per_rank * num_processes ==
           (int)num_elements_per_rank * num_processes)
        {
            local_end_position = local_start_position +
                ((int)num_elements_per_rank - 1);
        }
        else
        {
            local_end_position = (local_start_position +
                (int)num_elements_per_rank - 1) +
                (((int)num_elements_per_rank * num_processes) -
                 ((int)num_elements_per_rank * num_processes));
        }
    }
    else
    {
        local_end_position = local_start_position +
            ((int)num_elements_per_rank - 1);
    }

    input_file = MPI::File::Open(MPI::COMM_WORLD, filename,
        MPI::MODE_RDONLY, MPI::INFO_NULL);
    output_file = MPI::File::Open(MPI::COMM_WORLD, output_filename,
        MPI::MODE_CREATE | MPI::MODE_WRONLY, MPI::INFO_NULL);

    // filesize = input_file.Get_size();

    for(int i = local_start_position; i < local_end_position + 1; i++)
    {
        if(unencoded_vector[i])
        {
            input_file.Read_at(i * 20, buf, 20, MPI_CHAR, input_file_status);
            output_file.Write_shared(buf, 20, MPI_CHAR, output_file_status);
        }
    }

    cout << "Error";

    input_file.Close();
    output_file.Close();
    MPI::Finalize();
}