[MPICH] MPI-IO, vector datatype

Russell L. Carter rcarter at esturion.net
Thu May 3 23:57:25 CDT 2007


All right, I've been thinking, ok, I'm an old fart,
I can revert to assembly code. :-)

I will provide you with the direct analogue in C tomorrow.

Now that I think about it, it's quite possible that I was wrong
about the C++ API not being a problem, as there may be a
missing & (reference) operator somewhere.
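
In the meantime, here is roughly the shape the C analogue will take
(untested sketch; same filename, FILESIZE, and view setup as the C++
test quoted below, with nblocks hard-coded to 2 to show the problem):

#include <stdio.h>
#include <stdlib.h>
#include "mpi.h"

#define FILESIZE 256

int main(int argc, char *argv[])
{
    int myrank, nprocs, bufsize, nints, nblocks, blocksize, i;
    int *obuf, *ibuf;
    MPI_Datatype filetype;
    MPI_File fh;
    MPI_Offset disp;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    bufsize   = FILESIZE / nprocs;
    nints     = bufsize / sizeof(int);
    nblocks   = 2;
    blocksize = nints / nblocks;

    obuf = (int *) malloc(nints * sizeof(int));
    ibuf = (int *) malloc(nints * sizeof(int));
    for (i = 0; i < nints; i++)
        obuf[i] = myrank ? i : -i;

    /* nblocks blocks of blocksize ints, stride of nprocs*blocksize ints */
    MPI_Type_vector(nblocks, blocksize, nprocs * blocksize, MPI_INT,
                    &filetype);
    MPI_Type_commit(&filetype);

    disp = (MPI_Offset) blocksize * sizeof(int) * myrank;

    /* write with the interleaved view */
    MPI_File_open(MPI_COMM_WORLD, "/home/rcarter/mpibin/testfile",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
    MPI_File_set_view(fh, disp, MPI_INT, filetype, "native", MPI_INFO_NULL);
    MPI_File_write_all(fh, obuf, nints, MPI_INT, &status);
    MPI_File_close(&fh);

    MPI_Barrier(MPI_COMM_WORLD);

    /* read back with the same view and compare */
    MPI_File_open(MPI_COMM_WORLD, "/home/rcarter/mpibin/testfile",
                  MPI_MODE_RDONLY, MPI_INFO_NULL, &fh);
    MPI_File_set_view(fh, disp, MPI_INT, filetype, "native", MPI_INFO_NULL);
    MPI_File_read_all(fh, ibuf, nints, MPI_INT, &status);
    MPI_File_close(&fh);

    for (i = 0; i < nints; i++)
        if (obuf[i] != ibuf[i])
            fprintf(stderr, "myrank %d, i %d: obuf %d ibuf %d\n",
                    myrank, i, obuf[i], ibuf[i]);

    free(obuf);
    free(ibuf);
    MPI_Type_free(&filetype);
    MPI_Finalize();
    return 0;
}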

Best,
Russell

Rajeev Thakur wrote:
> I don't see any bug in the program, so I am guessing it has to do with C++,
> maybe even the C++ binding in MPICH2. Can you run the C version of the
> program you downloaded from the book?
> 
> Rajeev 
> 
>> -----Original Message-----
>> From: Russell L. Carter [mailto:rcarter at esturion.net] 
>> Sent: Thursday, May 03, 2007 11:35 PM
>> To: Rajeev Thakur
>> Cc: 'Rob Ross'; mpich-discuss at mcs.anl.gov
>> Subject: Re: [MPICH] MPI-IO, vector datatype
>>
>> Rajeev Thakur wrote:
>>> Can you try writing to /tmp in case /home/rcarter is NFS.
>> Yes indeed, NFS is problematic, no?  Generally it fails, as
>> I discovered today.  Judging from the error messages, I need
>> to enforce sync semantics.  But after running into these problems
>> I settled back on testing with multiple processes on a single
>> filesystem, using either the local Unix fs or the multiple-node
>> global PVFS2 filesystems I have.
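>>
>> (For what it's worth, my understanding is that ROMIO's documentation
>> recommends mounting NFS with attribute caching disabled, e.g. the
>> noac mount option, for MPI-IO to behave correctly.)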
>>
>> So yes, those last od dumps are from a single system, single
>> filesystem: specifically, a Linux 2.6 kernel with 2 CPUs and
>> a lot of fast disk.
>>
>> I might add that I admin all these systems and have been doing
>> this sort of stuff for 17 years, so any underlying (re)configuration
>> that might help is not out of the question to try out.
>>
>> But I don't think that's the problem.
>>
>> Best,
>> Russell
>>
>>> Rajeev
>>>
>>>> -----Original Message-----
>>>> From: Russell L. Carter [mailto:rcarter at esturion.net] 
>>>> Sent: Thursday, May 03, 2007 9:03 PM
>>>> To: Rob Ross
>>>> Cc: Rajeev Thakur; mpich-discuss at mcs.anl.gov
>>>> Subject: Re: [MPICH] MPI-IO, vector datatype
>>>>
>>>> Hi Rob,
>>>>
>>>> Rob Ross wrote:
>>>>> Hi Russell,
>>>>>
>>>>> The "nblocks(1)" sets that variable to 1, yes? Sorry, C++
>>>>> isn't my thing.
>>>>
>>>> Well, I mentioned that I tried multiple values for nblocks:
>>>> 1, 2, and 4, for instance.  Adding a command-line argument
>>>> would only increase the line count, and I wanted to keep the
>>>> code as small as possible, which it surely is.
>>>>
>>>> To get the wrong result, set nblocks to 2: nblocks(2).
>>>>
>>>> I'd like to emphasize that I have tried to change nothing about
>>>> the algorithm in the read_all.c program featured on p. 65 of Using
>>>> MPI-2.   Using that algorithm, I can't write a file and then
>>>> read it with the same view.  My C++ code is written to make
>>>> that especially clear.  The C++ code in mpicxx.h is just dead
>>>> simple inline calls to the C API, so it's not a C++ problem.
>>>>
>>>> Maybe I'm wrong (cool, problem solved), and there's a working
>>>> example somewhere?  That would be great.
>>>>
>>>> Best,
>>>> Russell
>>>>
>>>>
>>>>> A vector with a count of 1 is the same as a contig with a
>>>>> count equal to the blocksize of the vector. This would explain
>>>>> what you're seeing. The stride is only used if the count is
>>>>> greater than 1.
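>>>>>
>>>>> For example (illustrative only):
>>>>>
>>>>>     MPI_Datatype t;
>>>>>     MPI_Type_vector(1, 8, 16, MPI_INT, &t);  /* 1 block of 8 ints */
>>>>>
>>>>> describes exactly the same typemap as
>>>>> MPI_Type_contiguous(8, MPI_INT, &t); the stride of 16 is never
>>>>> applied, because there is no second block to offset.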
>>>>>
>>>>> Regards,
>>>>>
>>>>> Rob
>>>>>
>>>>> Russell L. Carter wrote:
>>>>>>> It is easy to run on a single machine. With MPD, all you
>>>>>>> need to do is
>>>>>>> % mpd &
>>>>>>> % mpiexec -n 2 a.out 
>>>>>> Works great.  No difference between pvfs2 and unix.
>>>>>>
>>>>>>> blocks of 4 ints each because you have defined INTS_PER_BLK=4.
>>>>>> I'm guilty of a transcription error, crap.  Sorry about that,
>>>>>> that's a stupid waste of time.  It should have been INTS_PER_BLK=8.
>>>>>> With INTS_PER_BLK=4, I agree with your values, but the problem
>>>>>> is still there.  I have found what appears to be the problem:
>>>>>> the stride arg in the Create_vector method appears to be
>>>>>> ignored.  No matter what I set it to, anywhere from 0 up to
>>>>>> nprocs*blocksize, the block data for each proc is written
>>>>>> out contiguously.
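>>>>>>
>>>>>> One way to check whether the stride is really being dropped is
>>>>>> to query the committed type directly (untested C sketch; for a
>>>>>> vector of 2 blocks of 4 ints with stride 8, the size should be
>>>>>> 32 bytes and the extent 48 bytes, whereas a dropped stride would
>>>>>> give an extent equal to the size):
>>>>>>
>>>>>>     MPI_Datatype vt;
>>>>>>     MPI_Aint lb, extent;
>>>>>>     int size;
>>>>>>     MPI_Type_vector(2, 4, 8, MPI_INT, &vt);
>>>>>>     MPI_Type_commit(&vt);
>>>>>>     MPI_Type_size(vt, &size);              /* expect 32 */
>>>>>>     MPI_Type_get_extent(vt, &lb, &extent); /* expect lb 0, extent 48 */
>>>>>>     printf("size %d lb %ld extent %ld\n", size, (long) lb, (long) extent);
>>>>>>     MPI_Type_free(&vt);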
>>>>>>
>>>>>> If I set the view displacement to be myrank*nints,
>>>>>> the file always looks like this, with no holes,
>>>>>> for any number of blocks and any stride I set
>>>>>> (nprocs is 2; negative values are rank 0, positive are rank 1):
>>>>>>
>>>>>> 0000000           0          -1          -2          -3
>>>>>> 0000020          -4          -5          -6          -7
>>>>>> 0000040          -8          -9         -10         -11
>>>>>> 0000060         -12         -13         -14         -15
>>>>>> 0000100         -16         -17         -18         -19
>>>>>> 0000120         -20         -21         -22         -23
>>>>>> 0000140         -24         -25         -26         -27
>>>>>> 0000160         -28         -29         -30         -31
>>>>>> 0000200           0           1           2           3
>>>>>> 0000220           4           5           6           7
>>>>>> 0000240           8           9          10          11
>>>>>> 0000260          12          13          14          15
>>>>>> 0000300          16          17          18          19
>>>>>> 0000320          20          21          22          23
>>>>>> 0000340          24          25          26          27
>>>>>> 0000360          28          29          30          31
>>>>>>
>>>>>>
>>>>>> If I set the view displacement to
>>>>>> blocksize*sizeof(int)*myrank, the file looks like this,
>>>>>> for any stride (nblocks per proc is 2 here):
>>>>>>
>>>>>> 0000000           0          -1          -2          -3
>>>>>> 0000020          -4          -5          -6          -7
>>>>>> 0000040          -8          -9         -10         -11
>>>>>> 0000060         -12         -13         -14         -15
>>>>>> 0000100           0           1           2           3
>>>>>> 0000120           4           5           6           7
>>>>>> 0000140           8           9          10          11
>>>>>> 0000160          12          13          14          15
>>>>>> 0000200          16          17          18          19
>>>>>> 0000220          20          21          22          23
>>>>>> 0000240          24          25          26          27
>>>>>> 0000260          28          29          30          31
>>>>>>
>>>>>> The further reduced code is appended.  As far as I can tell
>>>>>> it should produce datatypes and views identical to those of
>>>>>> the program on p. 65 of Using MPI-2.  It was my impression
>>>>>> that that program was intended to read interleaved data;
>>>>>> maybe it's not?
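>>>>>>
>>>>>> For concreteness: with nprocs=2 and nblocks=2 the test has
>>>>>> nints=32, blocksize=16 ints, a stride of 32 ints, and
>>>>>> disp = 64*myrank bytes.  If the stride were honored, rank r's
>>>>>> k-th block would land at byte offset 64*r + 128*k, giving a
>>>>>> 256-byte file interleaved as rank 0's first block (0..-15),
>>>>>> rank 1's first block (0..15), rank 0's second block (-16..-31),
>>>>>> rank 1's second block (16..31).  Instead the dump above holds
>>>>>> only 192 bytes, with rank 1's contiguous data overwriting
>>>>>> rank 0's second block.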
>>>>>>
>>>>>> Thanks,
>>>>>> Russell
>>>>>>
>>>>>> #include "mpi.h"
>>>>>> #include <iostream>
>>>>>> #include <string>
>>>>>> using namespace std;
>>>>>>
>>>>>> struct tester
>>>>>> {
>>>>>>     tester()
>>>>>>         : myrank(MPI::COMM_WORLD.Get_rank()),
>>>>>>           nprocs(MPI::COMM_WORLD.Get_size()),
>>>>>>           bufsize(FILESIZE/nprocs), nints(bufsize/sizeof(int)),
>>>>>>           nblocks(1), blocksize(nints/nblocks),
>>>>>>           filetype(MPI::INT),
>>>>>>           //fname("pvfs2:/mnt/pvfs/tst/testfile")
>>>>>>           fname("/home/rcarter/mpibin/testfile")
>>>>>>     {
>>>>>>         std::ios::sync_with_stdio(false);
>>>>>>         filetype.Create_vector(nblocks, blocksize, nprocs * blocksize);
>>>>>>         filetype.Commit();
>>>>>>         obuf = new int[bufsize];
>>>>>>         ibuf = new int[bufsize];
>>>>>>     }
>>>>>>     ~tester() {
>>>>>>         delete[] obuf;
>>>>>>         delete[] ibuf;
>>>>>>     }
>>>>>>     void write()
>>>>>>     {
>>>>>>         for (int i = 0; i < nints; ++i) {
>>>>>>             if (myrank)
>>>>>>                 obuf[i] = i;
>>>>>>             else
>>>>>>                 obuf[i] = -i;
>>>>>>         }
>>>>>>
>>>>>>         MPI::File f = open_set_view(MPI_MODE_CREATE | MPI_MODE_WRONLY);
>>>>>>         f.Write_all(obuf, nints, MPI_INT, status);
>>>>>>         f.Close();
>>>>>>     }
>>>>>>     void read()
>>>>>>     {
>>>>>>         MPI::File f = open_set_view(MPI_MODE_RDONLY);
>>>>>>         f.Read_all(ibuf, nints, MPI_INT, status);
>>>>>>         f.Close();
>>>>>>         for (int i = 0; i < nints; ++i) {
>>>>>>             if (obuf[i] != ibuf[i]) {
>>>>>>                 cerr << "myrank, i, obuf[i], ibuf[i]: " << myrank << " "
>>>>>>                      << i << " " << obuf[i] << " " << ibuf[i] << endl;
>>>>>>             }
>>>>>>         }
>>>>>>     }
>>>>>> private:
>>>>>>     static const int FILESIZE = 256;
>>>>>>     int myrank, nprocs, bufsize, nints, nblocks, blocksize, *obuf, *ibuf;
>>>>>>     MPI::Datatype filetype;
>>>>>>     string fname;
>>>>>>     MPI::Status status;
>>>>>>
>>>>>>     MPI::File open_set_view(int mode)
>>>>>>     {
>>>>>>         MPI::File f = MPI::File::Open(MPI::COMM_WORLD, fname.c_str(),
>>>>>>                                       mode, MPI::INFO_NULL);
>>>>>>         MPI::Offset disp = blocksize * sizeof(int) * myrank;
>>>>>>         f.Set_view(disp, MPI_INT, filetype, "native", MPI_INFO_NULL);
>>>>>>         return f;
>>>>>>     }
>>>>>> };
>>>>>> int main()
>>>>>> {
>>>>>>     cerr << "Starting rwall.\n";
>>>>>>     try {
>>>>>>         MPI::Init();
>>>>>>         tester t;
>>>>>>         t.write();
>>>>>>         MPI::COMM_WORLD.Barrier();
>>>>>>         t.read();
>>>>>>         MPI::Finalize();
>>>>>>     } catch (exception &e) {
>>>>>>         cerr << "\nCaught exception: " << e.what() << endl;
>>>>>>         return -1;
>>>>>>     } catch (MPI::Exception& e) {
>>>>>>         cerr << "\nError:\n" << e.Get_error_string();
>>>>>>         return -2;
>>>>>>     }
>>>>>>     cerr << "rwall end.\n";
>>>>>>     return 0;
>>>>>> }
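>>>>>>
>>>>>> (Built and run along the lines of "mpicxx rwall.cc -o rwall"
>>>>>> and "mpiexec -n 2 ./rwall", with mpd already running; the
>>>>>> testfile path above is hard-coded.)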
>>>>>>
>>>> -- 
>>>> Russell L. Carter
>>>> Esturion, LLC
>>>> 2285 Sandia Drive
>>>> Prescott, Arizona 86301
>>>>
>>>> rcarter at esturion.net
>>>> 928 308-4154
>>>>
>>>>
>>
>> -- 
>> Russell L. Carter
>> Esturion, LLC
>> 2285 Sandia Drive
>> Prescott, Arizona 86301
>>
>> rcarter at esturion.net
>> 928 308-4154
>>
>>
> 


-- 
Russell L. Carter
Esturion, LLC
2285 Sandia Drive
Prescott, Arizona 86301

rcarter at esturion.net
928 308-4154



