[MPICH] MPI-IO, vector datatype
Rob Ross
rross at mcs.anl.gov
Fri May 4 08:08:44 CDT 2007
Glad Rajeev figured it out.
Regards,
Rob
Russell L. Carter wrote:
> That appears to be it. That's not conventional
> C++ semantics. Damn. It all works perfectly now!
>
> Now I know why there's that effort over at
> www.boost.org to sanify the MPI API.
>
> Thank you all, especially Rajeev and Rob for all
> your help.
>
> Best,
> Russell
>
>
> Rajeev Thakur wrote:
>> OK, I am no C++ expert, but if I define a datatype
>>     MPI::Datatype newtype;
>> and do
>>     newtype = filetype.Create_vector(...);
>>     newtype.Commit();
>> and then pass newtype as the filetype to Set_view in both places, your
>> code works.
>>
>> Rajeev
>>
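The root cause Rajeev's fix points at is that in the MPI-2 C++ bindings,
MPI::Datatype::Create_vector() is a const member function that returns a new
datatype rather than modifying the datatype it is called on, so calling it
without capturing the result leaves the original type (here, a copy of
MPI::INT) unchanged. A minimal sketch of the corrected pattern, with
illustrative function and variable names that are not taken from the program
quoted further down:

    #include "mpi.h"

    // Build the interleaving filetype and attach it to an already opened
    // file.  Create_vector() RETURNS the new datatype; it does not modify
    // the MPI::INT it is called on.
    void set_interleaved_view(MPI::File &f, int nblocks, int blocksize,
                              int nprocs, int myrank)
    {
        MPI::Datatype filetype =
            MPI::INT.Create_vector(nblocks, blocksize, nprocs * blocksize);
        filetype.Commit();

        // Each rank starts at its own block inside the first stride.
        MPI::Offset disp = (MPI::Offset)(blocksize * sizeof(int) * myrank);
        f.Set_view(disp, MPI::INT, filetype, "native", MPI::INFO_NULL);
    }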
>>
>>> -----Original Message-----
>>> From: Russell L. Carter [mailto:rcarter at esturion.net]
>>> Sent: Thursday, May 03, 2007 11:57 PM
>>> To: Rajeev Thakur
>>> Cc: 'Rob Ross'; mpich-discuss at mcs.anl.gov
>>> Subject: Re: [MPICH] MPI-IO, vector datatype
>>>
>>> All right, I've been thinking: OK, I'm an old fart,
>>> I can revert to assembly code. :-)
>>>
>>> I will provide you with the direct analogue in C tomorrow.
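For reference, a sketch of what that C analogue would look like (this is not
the program Russell later sent; parameter names simply mirror the C++ code
quoted further down). Note that the C API hands the new datatype back through
an output argument, so there is no return value to forget:

    #include "mpi.h"

    /* Sketch of the C-API write path, assuming obuf, nints, nblocks,
     * blocksize, nprocs, and myrank are set up as in the C++ program
     * quoted below. */
    void write_interleaved(const char *fname, const int *obuf, int nints,
                           int nblocks, int blocksize, int nprocs, int myrank)
    {
        MPI_Datatype filetype;
        MPI_File fh;
        MPI_Offset disp = (MPI_Offset)(blocksize * sizeof(int) * myrank);

        /* The new vector type comes back through &filetype. */
        MPI_Type_vector(nblocks, blocksize, nprocs * blocksize, MPI_INT,
                        &filetype);
        MPI_Type_commit(&filetype);

        MPI_File_open(MPI_COMM_WORLD, (char *)fname,
                      MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
        MPI_File_set_view(fh, disp, MPI_INT, filetype, "native",
                          MPI_INFO_NULL);
        MPI_File_write_all(fh, (void *)obuf, nints, MPI_INT,
                           MPI_STATUS_IGNORE);
        MPI_File_close(&fh);
        MPI_Type_free(&filetype);
    }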
>>>
>>> Now that I think about it, it's quite possible that I was wrong
>>> about the C++ API not being a problem, as there may be a
>>> missing & (reference) operator somewhere.
>>>
>>> Best,
>>> Russell
>>>
>>> Rajeev Thakur wrote:
>>>> I don't see any bug in the program, so I am guessing it has to do
>>>> with C++, maybe even the C++ binding in MPICH2. Can you run the C
>>>> version of the program you downloaded from the book?
>>>> Rajeev
>>>>> -----Original Message-----
>>>>> From: Russell L. Carter [mailto:rcarter at esturion.net]
>>>>> Sent: Thursday, May 03, 2007 11:35 PM
>>>>> To: Rajeev Thakur
>>>>> Cc: 'Rob Ross'; mpich-discuss at mcs.anl.gov
>>>>> Subject: Re: [MPICH] MPI-IO, vector datatype
>>>>>
>>>>> Rajeev Thakur wrote:
>>>>>> Can you try writing to /tmp in case /home/rcarter is NFS.
>>>>> Yes indeed, NFS is problematic, no? Generally it fails, as
>>>>> I discovered today. Judging from the error messages, I probably
>>>>> need to enforce sync semantics. But after running into these
>>>>> problems I settled back on testing with multiple processes on a
>>>>> single filesystem, using either the local Unix fs or the
>>>>> multi-node global PVFS2 filesystems I have.
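In MPI-IO terms, "enforcing sync semantics" usually means either turning on
atomic mode or using the sync-barrier-sync sequence from the MPI-2
consistency rules; whether that is enough on NFS also depends on the mount
options, since ROMIO has traditionally asked for attribute caching to be
disabled on NFS volumes. A sketch of the latter recipe, assuming each process
keeps the same file handle open for both the write and the read (the program
quoted below instead closes and reopens the file around a barrier, which is
another valid ordering point):

    #include "mpi.h"

    // Sync-barrier-sync: make one process's writes visible to the others
    // on a shared, still-open file handle.
    void write_then_read(MPI::File &f, const int *obuf, int *ibuf, int nints)
    {
        MPI::Status status;
        f.Write_all(obuf, nints, MPI::INT, status);
        f.Sync();                   // flush this process's writes
        MPI::COMM_WORLD.Barrier();  // order all writes before any reads
        f.Sync();                   // pick up the other processes' writes
        f.Read_all(ibuf, nints, MPI::INT, status);
    }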
>>>>>
>>>>> So yes, those last od dumps are from a single system, single
>>>>> filesystem: specifically a Linux 2.6 machine with 2 CPUs and
>>>>> a lot of fast disk.
>>>>>
>>>>> I might add that I admin all these systems and have been doing
>>>>> this sort of stuff for 17 years, so any underlying (re)configuration
>>>>> that might help is not out of the question to try out.
>>>>>
>>>>> But I don't think that's the problem.
>>>>>
>>>>> Best,
>>>>> Russell
>>>>>
>>>>>> Rajeev
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Russell L. Carter [mailto:rcarter at esturion.net]
>>>>>>> Sent: Thursday, May 03, 2007 9:03 PM
>>>>>>> To: Rob Ross
>>>>>>> Cc: Rajeev Thakur; mpich-discuss at mcs.anl.gov
>>>>>>> Subject: Re: [MPICH] MPI-IO, vector datatype
>>>>>>>
>>>>>>> Hi Rob,
>>>>>>>
>>>>>>> Rob Ross wrote:
>>>>>>>> Hi Russell,
>>>>>>>>
>>>>>>>> The "nblocks(1)" sets that variable to 1, yes? Sorry, C++ isn't
>>>>>>>> my thing.
>>>>>>>
>>>>>>> Well, I mentioned that I tried multiple values for nblocks: 1, 2,
>>>>>>> and 4, for instance. It would only increase the lines of code to
>>>>>>> add a command-line argument, and I wanted to keep the code as
>>>>>>> small as possible, which it surely is.
>>>>>>>
>>>>>>> To get the wrong result, set nblocks to 2: nblocks(2).
>>>>>>>
>>>>>>> I'd like to emphasize that I have tried to change nothing about
>>>>>>> the algorithm in the read_all.c program featured on p. 65 of Using
>>>>>>> MPI-2. Using that algorithm, I can't write a file and then
>>>>>>> read it with the same view. My C++ code is written to make
>>>>>>> that especially clear. The C++ code in mpicxx.h is just dead
>>>>>>> simple inline calls to the C API, so it's not a C++ problem.
>>>>>>>
>>>>>>> Maybe I'm wrong (cool, problem solved), and there's a working
>>>>>>> example somewhere? That would be great.
>>>>>>>
>>>>>>> Best,
>>>>>>> Russell
>>>>>>>
>>>>>>>
>>>>>>>> A vector with a count of 1 is the same as a contig with a count
>>>>>>>> equal to the blocksize of the vector. This would explain what
>>>>>>>> you're seeing. The stride is only used if the count is greater
>>>>>>>> than 1.
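To illustrate that point with the C++ bindings (a sketch with made-up values,
not code from this thread): with a count of 1 the stride drops out of the
typemap entirely, so the first two datatypes below describe the same layout
of `blocksize` contiguous ints; only with a count of 2 or more do gaps open
up between the blocks.

    #include "mpi.h"

    void vector_count_one_demo()
    {
        const int blocksize = 4;             // illustrative values only
        const int nblocks = 2, nprocs = 2;

        // Same typemap: blocksize contiguous ints; the stride is
        // irrelevant because there is only one block.
        MPI::Datatype v1 = MPI::INT.Create_vector(1, blocksize, 1000);
        MPI::Datatype c  = MPI::INT.Create_contiguous(blocksize);

        // With count >= 2 the stride finally matters: nblocks blocks of
        // blocksize ints, starting nprocs * blocksize ints apart.
        MPI::Datatype v2 =
            MPI::INT.Create_vector(nblocks, blocksize, nprocs * blocksize);

        v1.Commit(); c.Commit(); v2.Commit();
    }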
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>>
>>>>>>>> Rob
>>>>>>>>
>>>>>>>> Russell L. Carter wrote:
>>>>>>>>>> It is easy to run on a single machine. With MPD, all you need
>>>>>>>>>> to do is
>>>>>>>>>> % mpd &
>>>>>>>>>> % mpiexec -n 2 a.out
>>>>>>>>> Works great. No difference between pvfs2 and unix.
>>>>>>>>>
>>>>>>>>>> blocks of 4 ints each because you have defined INTS_PER_BLK=4.
>>>>>>>>> I'm guilty of a transcription error, crap. Sorry about that,
>>>>>>>>> that's a stupid waste of time. It should have been
>>>>>>>>> INTS_PER_BLK=8. With INTS_PER_BLK=4, I agree with your values,
>>>>>>>>> but the problem is still there. I have found what appears to be
>>>>>>>>> the problem: the stride arg in the Create_vector method appears
>>>>>>>>> to be ignored. It doesn't matter what I set it to, 0 on up to
>>>>>>>>> nprocs*blocksize, the block data for each proc is written
>>>>>>>>> out contiguously.
>>>>>>>>>
>>>>>>>>> If I set the view displacement to be myrank*nints,
>>>>>>>>> the file always looks like this, without
>>>>>>>>> any holes, for any number of blocks and stride I set
>>>>>>>>> (nprocs is 2, neg is rank 0, pos is rank 1):
>>>>>>>>>
>>>>>>>>> 0000000 0 -1 -2 -3
>>>>>>>>> 0000020 -4 -5 -6 -7
>>>>>>>>> 0000040 -8 -9 -10 -11
>>>>>>>>> 0000060 -12 -13 -14 -15
>>>>>>>>> 0000100 -16 -17 -18 -19
>>>>>>>>> 0000120 -20 -21 -22 -23
>>>>>>>>> 0000140 -24 -25 -26 -27
>>>>>>>>> 0000160 -28 -29 -30 -31
>>>>>>>>> 0000200 0 1 2 3
>>>>>>>>> 0000220 4 5 6 7
>>>>>>>>> 0000240 8 9 10 11
>>>>>>>>> 0000260 12 13 14 15
>>>>>>>>> 0000300 16 17 18 19
>>>>>>>>> 0000320 20 21 22 23
>>>>>>>>> 0000340 24 25 26 27
>>>>>>>>> 0000360 28 29 30 31
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> If I set the view displacements to
>>>>>>>>> blocksize*sizeof(int)*myrank, the file looks like this,
>>>>>>>>> for any stride (nblocks/proc is 2 here):
>>>>>>>>>
>>>>>>>>> 0000000 0 -1 -2 -3
>>>>>>>>> 0000020 -4 -5 -6 -7
>>>>>>>>> 0000040 -8 -9 -10 -11
>>>>>>>>> 0000060 -12 -13 -14 -15
>>>>>>>>> 0000100 0 1 2 3
>>>>>>>>> 0000120 4 5 6 7
>>>>>>>>> 0000140 8 9 10 11
>>>>>>>>> 0000160 12 13 14 15
>>>>>>>>> 0000200 16 17 18 19
>>>>>>>>> 0000220 20 21 22 23
>>>>>>>>> 0000240 24 25 26 27
>>>>>>>>> 0000260 28 29 30 31
>>>>>>>>>
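For comparison, if the stride did take effect (keeping the values implied
above: nprocs = 2, nblocks = 2, blocksize = 16 ints, and the displacement
blocksize*sizeof(int)*myrank), the two ranks' blocks would be expected to
interleave, roughly:

    0000000     0    -1   ...   -15     (rank 0, block 0)
    0000100     0     1   ...    15     (rank 1, block 0)
    0000200   -16   -17   ...   -31     (rank 0, block 1)
    0000300    16    17   ...    31     (rank 1, block 1)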
>>>>>>>>> The further reduced code is appended. As far as I can tell
>>>>>>>>> it should produce datatypes and views identical to those of
>>>>>>>>> the program on p. 65 of Using MPI-2. It was my impression that
>>>>>>>>> that program was intended to read interleaved data; maybe it's
>>>>>>>>> not?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Russell
>>>>>>>>>
>>>>>>>>> #include "mpi.h"
>>>>>>>>> #include <iostream>
>>>>>>>>> using namespace std;
>>>>>>>>>
>>>>>>>>> struct tester
>>>>>>>>> {
>>>>>>>>> tester()
>>>>>>>>> : myrank(MPI::COMM_WORLD.Get_rank()),
>>>>>>>>> nprocs(MPI::COMM_WORLD.Get_size()),
>>>>>>>>> bufsize(FILESIZE/nprocs), nints(bufsize/sizeof(int)),
>>>>>>>>> nblocks(1), blocksize(nints/nblocks),
>>>>>>>>> filetype(MPI::INT),
>>>>>>>>> //fname("pvfs2:/mnt/pvfs/tst/testfile")
>>>>>>>>> fname("/home/rcarter/mpibin/testfile")
>>>>>>>>> {
>>>>>>>>> std::ios::sync_with_stdio(false);
>>>>>>>>> filetype.Create_vector(nblocks, blocksize, nprocs * blocksize);
>>>>>>>>> filetype.Commit();
>>>>>>>>> obuf = new int[bufsize];
>>>>>>>>> ibuf = new int[bufsize];
>>>>>>>>> }
>>>>>>>>> ~tester() {
>>>>>>>>> delete[] obuf;
>>>>>>>>> delete[] ibuf;
>>>>>>>>> }
>>>>>>>>> void write()
>>>>>>>>> {
>>>>>>>>> for (int i = 0; i < nints; ++i) {
>>>>>>>>> if (myrank)
>>>>>>>>> obuf[i] = i;
>>>>>>>>> else
>>>>>>>>> obuf[i] = -i;
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> MPI::File f = open_set_view(MPI_MODE_CREATE | MPI_MODE_WRONLY);
>>>>>>>>> f.Write_all(obuf, nints, MPI_INT, status);
>>>>>>>>> f.Close();
>>>>>>>>> }
>>>>>>>>> void read()
>>>>>>>>> {
>>>>>>>>> MPI::File f = open_set_view(MPI_MODE_RDONLY);
>>>>>>>>> f.Read_all(ibuf, nints, MPI_INT, status);
>>>>>>>>> f.Close();
>>>>>>>>> for (int i = 0; i < nints; ++i) {
>>>>>>>>> if (obuf[i] != ibuf[i]) {
>>>>>>>>> cerr << "myrank, i, obuf[i], ibuf[i]: " << myrank << " "
>>>>>>>>> << i << " " << obuf[i] << " " << ibuf[i] << endl;
>>>>>>>>> }
>>>>>>>>> }
>>>>>>>>> }
>>>>>>>>> private:
>>>>>>>>> static const int FILESIZE = 256;
>>>>>>>>> int myrank, nprocs, bufsize, nints, nblocks, blocksize, *obuf, *ibuf;
>>>>>>>>> MPI::Datatype filetype;
>>>>>>>>> string fname;
>>>>>>>>> MPI::Status status;
>>>>>>>>>
>>>>>>>>> MPI::File open_set_view(int mode)
>>>>>>>>> {
>>>>>>>>> MPI::File f = MPI::File::Open(MPI::COMM_WORLD, fname.c_str(),
>>>>>>>>> mode, MPI::INFO_NULL);
>>>>>>>>> MPI::Offset disp = blocksize * sizeof(int) * myrank;
>>>>>>>>> f.Set_view(disp, MPI_INT, filetype, "native", MPI_INFO_NULL);
>>>>>>>>> return f;
>>>>>>>>> }
>>>>>>>>> };
>>>>>>>>> int main()
>>>>>>>>> {
>>>>>>>>> cerr << "Starting rwall.\n";
>>>>>>>>> try {
>>>>>>>>> MPI::Init();
>>>>>>>>> tester t;
>>>>>>>>> t.write();
>>>>>>>>> MPI::COMM_WORLD.Barrier();
>>>>>>>>> t.read();
>>>>>>>>> MPI::Finalize();
>>>>>>>>> } catch (exception &e) {
>>>>>>>>> cerr << "\nCaught exception: " << e.what() << endl;
>>>>>>>>> return -1;
>>>>>>>>> } catch (MPI::Exception& e) {
>>>>>>>>> cerr << "\nError:\n" << e.Get_error_string();
>>>>>>>>> return -2;
>>>>>>>>> }
>>>>>>>>> cerr << "rwall end.\n";
>>>>>>>>> return 0;
>>>>>>>>> }
>>>>>>>>>
>>>
>>> --
>>> Russell L. Carter
>>> Esturion, LLC
>>> 2285 Sandia Drive
>>> Prescott, Arizona 86301
>>>
>>> rcarter at esturion.net
>>> 928 308-4154
>>>
>>>
>>
>
>