Problem on Blue Gene/P

Wei-keng Liao wkliao at ece.northwestern.edu
Mon Jun 15 10:37:37 CDT 2009


This error message indicates that the processes used different values
when defining variable metadata. Please make sure the values passed to
the define calls are the same across all processes.
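
One way to track down such a mismatch is to compare what each process is about to define before the call to ncmpi_enddef. The following is only a minimal sketch of that idea, not the check PnetCDF performs internally. The function name find_conflicts, the "dim:"/"var:" key layout, and the sample values are all hypothetical; in a real run the per-rank list would have to be collected from the processes of the communicator that opened the file.

```python
from typing import Dict, List, Tuple

# Hypothetical per-rank record of what a process is about to define:
# "dim:<name>" -> length, "var:<name>" -> (type name, dimension names).
Definition = Dict[str, object]


def find_conflicts(
    per_rank: List[Definition],
) -> List[Tuple[int, str, object, object]]:
    """Compare every rank's definitions against those of rank 0.

    Returns one (rank, key, rank0_value, rank_value) tuple per mismatch.
    A key missing on one side is reported with None for that side.
    """
    root = per_rank[0]
    conflicts = []
    for rank, defs in enumerate(per_rank[1:], start=1):
        # Walk the union of keys so that extra or missing entries show up too.
        for key in sorted(set(root) | set(defs)):
            root_value = root.get(key)
            rank_value = defs.get(key)
            if root_value != rank_value:
                conflicts.append((rank, key, root_value, rank_value))
    return conflicts


if __name__ == "__main__":
    # Made-up sample data: rank 2 computed a different length for "y".
    defs = [
        {"dim:x": 1024, "dim:y": 512, "var:u": ("NC_DOUBLE", ("x", "y"))},
        {"dim:x": 1024, "dim:y": 512, "var:u": ("NC_DOUBLE", ("x", "y"))},
        {"dim:x": 1024, "dim:y": 511, "var:u": ("NC_DOUBLE", ("x", "y"))},
    ]
    for rank, key, expected, got in find_conflicts(defs):
        print(f"rank {rank}: {key} is {got}, rank 0 has {expected}")
```

With the sample data above this reports that rank 2 holds 511 for dim:y while rank 0 holds 512. When several files are written through subcommunicators, the comparison would need to be done separately within each subcommunicator, since each file has its own set of definitions.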

Wei-keng

On Jun 14, 2009, at 1:50 PM, Julien Bodart wrote:

> Hi Wei-keng,
>
> Actually, if I were writing a single file, the array size in this  
> case would be around 2^31. But as I am writing to several files  
> (64), the size decreases to 2^25, which, I guess, should not be a  
> problem.
> Do you have any information on the kind of check that is performed  
> before this error message is given:
>
> "NC definations on multiprocesses conflict."
>
> Thanks for your help!
>
>
>
> 2009/6/13 Wei-keng Liao <wkliao at ece.northwestern.edu>
> Hi, Julien,
>
> The current release of parallel netcdf 1.0.3 does not support
> an array size that has more than 2^32 elements. This is because
> the netCDF header format uses only 32-bit integers. I think
> this is most likely your case.
>
> The next release of pnetCDF will support large-size arrays,
> i.e. > 2^32 elements. The file format will also have slight
> changes (in order to use 64-bit integers for metadata.)
>
> Wei-keng
>
>
>
> On Jun 12, 2009, at 7:19 AM, Julien Bodart wrote:
>
> Hi everybody,
>
> I am doing some CFD on a Blue Gene/P computer (40 thousand cores). I  
> am trying to use parallel netCDF, as I originally used the netCDF-3  
> format. Everything works fine on small cases (10 million grid  
> points), but on bigger cases (1000 million grid points)  
> problems arise. I tried the 64-bit flag, which does not  
> improve matters.
> Actually, I am trying to write 64 files, each smaller than 1 GB,  
> using some subcommunicators.
> While this does not cause any problems on small cases, bigger cases  
> stop at the ncmpi_enddef call on some files (randomly, even with  
> synchronisation in between), reporting a mismatch between  
> dimensions. After many checks, nothing seems to be  
> wrong with the dimensions. I have no idea how to solve  
> the problem. Has anyone had a similar problem? Thanks for your help.
>
> Julien
>
>


