[petsc-users] Unable to create >4GB sized HDF5 files on Cray XC30
Juha Jäykkä
juhaj at iki.fi
Sat Oct 5 02:50:14 CDT 2013
> What range of chunk sizes are you using? For each fixed number of
> ranks, how does the performance vary when varying chunk size from, say,
> 5MB to 500MB?
I didn't write down the results, as they were just a byproduct of getting
usable performance, but I will do that when I have the time.
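For reference, the only knob I was turning is the chunk size on the
dataset creation property list. A minimal sketch, assuming a 1-D layout
(the helper name and the sizes are made up, not from PETSc):

#include <hdf5.h>

/* Build a dataset creation property list with a given 1-D chunk size,
 * e.g. 5 MB of doubles = 655360 elements, 500 MB = 65536000. */
hid_t dcpl_with_chunk(hsize_t chunk_elems)
{
    hid_t dcpl = H5Pcreate(H5P_DATASET_CREATE);
    H5Pset_chunk(dcpl, 1, &chunk_elems);  /* 1-D chunking */
    return dcpl;
}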
> > Why not use the size of the local part of the DA/Vec? That would guarantee
> That's fine, but the chunk size needs to be *collective* so we need to
> do a reduction or otherwise compute the "average size".
I guess I was lucky to have every rank's local Vec precisely the same
size (no wonder: my test was a 1024^3 lattice on 4096 ranks, so each
rank got exactly 64^3 points).
What happens when, say, one has three ranks with Vec lengths 120, 120,
and 128 bytes (that is, 15, 15, and 16 doubles)? The average becomes
122.667, which is not even an integer. What should the chunk dimensions
look like here? Ceil(122.667), perhaps? But then the sum of the chunk
sizes exceeds the data size, so there will be extra space in the file
(which I presume HDF5 takes care of when accessing the file); is that
right?
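For concreteness, here is the kind of reduction I imagine you mean,
assuming a 1-D layout (the helper is hypothetical):

#include <mpi.h>
#include <hdf5.h>

/* Sketch: collective chunk size as the ceiling of the mean local
 * length.  With local lengths 15, 15 and 16 this gives ceil(46/3) =
 * 16, so only the last, partial edge chunk carries unused space. */
hsize_t collective_chunk(hsize_t nlocal, MPI_Comm comm)
{
    unsigned long long loc = (unsigned long long)nlocal, tot;
    int size;
    MPI_Comm_size(comm, &size);
    MPI_Allreduce(&loc, &tot, 1, MPI_UNSIGNED_LONG_LONG, MPI_SUM, comm);
    return (hsize_t)((tot + size - 1) / size);  /* ceil(tot/size) */
}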
I'm just asking all these stupid questions since you asked for a patch. ;)
> > Is the granularity (number of ranks actually doing disc IO) settable on
> > HDF5 side or does that need to be set in MPI-IO?
> I'm not sure what you mean. On a system like BG, the compute nodes are
> not connected to disks and instead have to send the data to IO nodes.
> The distribution of IO nodes is part of the machine design. The ranks
> participating in IO are just rearranging data before sending it to the
> IO nodes.
Sorry, I forgot about BG and its kin. I meant the ranks participating
in IO; obviously the number of IO nodes is determined by the hardware.
I just had the Cray XC30 in mind, where the ranks participating in IO
are the IO nodes, too, so I didn't think to make the distinction
between "ranks doing MPI-IO" and "IO nodes", which of course I should
have.
> Send a patch (or submit a pull request) against 'maint' and we'll
> consider it. As long as the change doesn't break any existing uses, it
> could be merged to 'maint' (thus v3.4.k for k>=3) after testing.
I'll try to get something useful in. What's the timetable?
Cheers,
Juha
--
-----------------------------------------------
| Juha Jäykkä, juhaj at iki.fi |
| http://koti.kapsi.fi/~juhaj/ |
-----------------------------------------------