[Nek5000-users] Re: Handling large data

nek5000-users at lists.mcs.anl.gov
Thu Apr 8 13:04:36 CDT 2010


Hi,

     Thanks for the info. Those are really mind-boggling numbers!

Mani chandra

On 04/08/2010 02:55 AM, nek5000-users at lists.mcs.anl.gov wrote:
> Dear Mani,
>
> I performed a single simulation producing 100 TB of raw data, which is so far the most data-intensive simulation we have done with NEK. I am not sure what exactly you have in mind, but I am happy to share some of my experience. For large-scale computations your main concerns are I/O performance, data analysis, and data transfer.
>
>
> I/O Performance
> ----------------------
> The write throughput depends on many parameters:
>
> - data layout
> You'll get the best write performance by dumping large contiguous blocks.
>
> - dump data in parallel
> Here you have multiple options. It turns out that a two-phase approach, where only a subset of the MPI tasks actually performs the I/O, works best (see the sketch after this list). If all MPI tasks write into one shared file, or each into a separate file, you run into filesystem and network contention effects. In addition, the overhead of handling thousands of files slows down the metadata server, which will become your bottleneck at some point.
> The optimal number of I/O nodes depends on the parallel filesystem configuration. On Intrepid (ANL's BG/P) you have to use around 512 I/O nodes to get close to the peak write throughput.
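>
> A minimal sketch of the two-phase idea in C with MPI-IO (hypothetical and simplified -- equal block sizes, a hard-coded group size, int byte counts -- not the actual NEK5000 implementation):
>
>   /* Two-phase write: only every "stride"-th rank touches the file.
>    * Assumes every rank contributes the same number of bytes. */
>   #include <mpi.h>
>   #include <stdlib.h>
>
>   void two_phase_write(const char *fname, const char *buf, int nbytes)
>   {
>       int rank, stride = 64;              /* e.g. 1 writer per 64 tasks */
>       MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>
>       /* Phase 1: gather the blocks of each group onto its writer rank. */
>       MPI_Comm group;
>       MPI_Comm_split(MPI_COMM_WORLD, rank / stride, rank, &group);
>       int grank, gsize;
>       MPI_Comm_rank(group, &grank);
>       MPI_Comm_size(group, &gsize);
>       char *big = NULL;
>       if (grank == 0) big = malloc((size_t)gsize * nbytes);
>       MPI_Gather(buf, nbytes, MPI_BYTE, big, nbytes, MPI_BYTE, 0, group);
>
>       /* Phase 2: each writer dumps one large contiguous block. */
>       if (grank == 0) {
>           MPI_File fh;
>           MPI_File_open(MPI_COMM_SELF, fname,
>                         MPI_MODE_CREATE | MPI_MODE_WRONLY,
>                         MPI_INFO_NULL, &fh);
>           MPI_Offset off = (MPI_Offset)(rank / stride) * stride * nbytes;
>           MPI_File_write_at(fh, off, big, gsize * nbytes, MPI_BYTE,
>                             MPI_STATUS_IGNORE);
>           MPI_File_close(&fh);
>           free(big);
>       }
>       MPI_Comm_free(&group);
>   }
>
> With stride tasks per writer, each writer handles stride times more data, which helps to reach the "several hundred MB per task" regime mentioned below.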
>
> In most simulations the amount of data to dump per MPI task is only a few MB. However, to get peak throughput you need several hundred MB per MPI task. The peak write throughput is ~80 GB/s on Intrepid, but in a typical large-scale simulation you'll get only a few GB/s. If I recall correctly, I got around 6 GB/s using 32 racks (a rack has 4096 cores) on Intrepid.
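>
> A quick back-of-the-envelope check with the numbers above: 32 racks are 32 x 4096 = 131,072 cores; at, say, 2 MB per task (an illustrative figure, not from the run above) a single dump is roughly 260 GB, which takes ~45 s at 6 GB/s but only ~3 s at the 80 GB/s peak.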
>
>
> Data Analysis
> -------------------
> Similar story here. Reading the data is often the most time-consuming part, even if you do some expensive volume rendering of big datasets. The performance problems are very similar, but the data access pattern can be even more irregular, and read-ahead caching algorithms will not work well in this case. In the past we have used VisIt on a few hundred processors to do the data analysis. The read throughput rates are rather low (around 1 GB/s), but this depends on your access pattern. Slicing and contouring operations can be sped up by some additional metadata (see the sketch below). Together with some VisIt developers we were able to improve the NEK5000 reader performance in VisIt significantly; a slice operation takes no more than a few seconds for a 50 GB file, even on a few processors.
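>
> One cheap form of such metadata is a per-block min/max table stored next to the dump, so a post-processor can skip every block whose value range cannot intersect a requested isosurface or slice. A hypothetical sketch (not the actual NEK5000/VisIt format):
>
>   /* Write one (min,max) record per block of a scalar field. */
>   #include <stdio.h>
>   #include <float.h>
>
>   typedef struct { float vmin, vmax; } range_t;
>
>   void write_ranges(const char *fname, const float *field,
>                     int nblocks, int npts_per_block)
>   {
>       FILE *fp = fopen(fname, "wb");
>       for (int b = 0; b < nblocks; b++) {
>           range_t r = { FLT_MAX, -FLT_MAX };
>           const float *blk = field + (size_t)b * npts_per_block;
>           for (int i = 0; i < npts_per_block; i++) {
>               if (blk[i] < r.vmin) r.vmin = blk[i];
>               if (blk[i] > r.vmax) r.vmax = blk[i];
>           }
>           fwrite(&r, sizeof r, 1, fp);
>       }
>       fclose(fp);
>   }
>
>   /* Reader side: touch a block only if its range brackets the isovalue. */
>   int block_needed(range_t r, float isoval)
>   {
>       return r.vmin <= isoval && isoval <= r.vmax;
>   }
>
> The table is tiny (8 bytes per block), but for contouring it can let the reader skip most of a 50 GB file.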
>
> Data Transfer
> ------------------
> This can become a real pain. You definitely need a high-speed connection between your home institution and the supercomputer where you computed the data. Using GridFTP we have been able to transfer the data with an average throughput of ~40 MB/s. So you still need about 1 month to transfer 100TB ;)
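>
> In case it helps: with globus-url-copy the usual knobs are parallel TCP streams (-p) and the TCP buffer size (-tcp-bs); the host, paths, and values below are illustrative only and should be tuned for your network:
>
>   globus-url-copy -vb -p 8 -tcp-bs 16M \
>       gsiftp://hpc.example.org/scratch/run/case0.f00001 \
>       file:///local/scratch/run/case0.f00001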
>
>
>
> Stefan
>
>
>
> On Apr 7, 2010, at 9:29 PM, nek5000-users at lists.mcs.anl.gov wrote:
>
>> Dear Nek devs,
>>
>>     This is not a Nek5k-specific question, but given your experience performing some massive simulations (https://nek5000.mcs.anl.gov/index.php/Visualization_Gallery), which must have generated a huge amount of data, could you maybe give some tips on how best to handle and organize data? Right now I do the most naive thing possible: I put separate cases in different folders, with the folder name encoding the parameters (case_X_para1_para2..), and I constantly transfer the data by hand (not even a script!) to avoid filling up our quota on the cluster's disk.
>>
>> I'd just like to know how things are done on the really big machines. Thanks!
>>
>> Mani chandra
>> _______________________________________________
>> Nek5000-users mailing list
>> Nek5000-users at lists.mcs.anl.gov
>> https://lists.mcs.anl.gov/mailman/listinfo/nek5000-users
> _______________________________________________
> Nek5000-users mailing list
> Nek5000-users at lists.mcs.anl.gov
> https://lists.mcs.anl.gov/mailman/listinfo/nek5000-users
>



