[mpich-discuss] Overflow in MPI_Aint

Christina Patrick christina.subscribes at gmail.com
Tue Jul 21 14:42:40 CDT 2009


Hi Pavan,

Thank you very much for letting me know. It seems that I have no other
option but to upgrade to 1.1.

Thanks and Regards,
Christina.

On Tue, Jul 21, 2009 at 2:55 PM, Pavan Balaji<balaji at mcs.anl.gov> wrote:
>
> Joe: I believe all the Aint related patches have already gone into 1.1. Is
> something still missing?
>
> Christina: If you update to mpich2-1.1, you can try the --with-aint-size
> configure option. Note that this has not been tested on anything other than
> BG/P, but it might be worth a shot.
>
>  -- Pavan
>
> On 07/21/2009 01:49 PM, Christina Patrick wrote:
>>
>> Hi Joe,
>>
>> I am attaching my test case in this email. If you run it with any
>> number of processes except one, it will give you the SIGFPE error.
>> Similarly if you change the write in this program to a read, you will
>> get the same problem.
>>
>> I would sure appreciate a patch for this problem. If it is not too
>> much trouble, could you please give me the patch? I could try making
>> the corresponding changes to my setup.
>>
>> Thanks and Regards,
>> Christina.
>>
>> On Tue, Jul 21, 2009 at 2:06 PM, Joe Ratterman<jratt at us.ibm.com> wrote:
>>>
>>> Christina,
>>> Blue Gene/P is a 32-bit platform where we have hit similar problems.
>>> To get around this, we increased the size of MPI_Aint in MPICH2 to be
>>> larger than void*, to 64 bits.  I suspect that your test case would
>>> work on our system, and I would like to see your test code if that is
>>> possible.  It should run on our system, and I would like to make sure
>>> we have it correct.
>>> If you are interested, we have patches against 1.0.7 and 1.1.0 that
>>> you can use (we skipped 1.0.8).  If you can build MPICH2 using those
>>> patches, you may be able to run your application.  On the other hand,
>>> they may be too specific to our platform.  We have been working with
>>> ANL to incorporate our changes into the standard MPICH2 releases, but
>>> there isn't a lot of demand for 64-bit MPI-IO on 32-bit machines.
>>>
>>> Thanks,
>>> Joe Ratterman
>>> IBM Blue Gene/P Messaging
>>> jratt at us.ibm.com
>>>
>>>
>>> On Fri, Jul 17, 2009 at 7:12 PM, Christina Patrick
>>> <christina.subscribes at gmail.com> wrote:
>>>>
>>>> Hi Pavan,
>>>>
>>>> I ran the command
>>>>
>>>> $ getconf | grep -i WORD
>>>> WORD_BIT=32
>>>>
>>>> So I guess it is a 32 bit system.
>>>>
>>>> Thanks and Regards,
>>>> Christina.
>>>>
>>>> On Fri, Jul 17, 2009 at 8:06 PM, Pavan Balaji<balaji at mcs.anl.gov> wrote:
>>>>>
>>>>> Is it a 32-bit system? MPI_Aint is the size of a (void *), so on 32-bit
>>>>> systems it's restricted to 2GB.
>>>>>
>>>>>  -- Pavan
>>>>>
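
A minimal sketch of the limit Pavan is pointing at, assuming a 32-bit MPICH2
build: the 8 GB byte count of a 32768 x 32768 array of doubles (the case
described below) does not fit in a 4-byte MPI_Aint and wraps to zero when
narrowed to it.

    /* sketch: print the MPI_Aint width and what happens to an 8 GB count */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        long long bytes = 32768LL * 32768LL * sizeof(double);   /* 8 GB */
        printf("sizeof(void *)   = %zu\n", sizeof(void *));
        printf("sizeof(MPI_Aint) = %zu\n", sizeof(MPI_Aint));
        /* on a 32-bit build the narrowing cast truncates 2^33 to 0 */
        printf("8 GB as MPI_Aint = %lld\n", (long long)(MPI_Aint)bytes);
        MPI_Finalize();
        return 0;
    }
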
>>>>> On 07/17/2009 07:04 PM, Christina Patrick wrote:
>>>>>>
>>>>>> Hi Everybody,
>>>>>>
>>>>>> I am trying to create an 8 GB file (a 32768 x 32768 array of 8-byte
>>>>>> doubles) using 16 MPI processes. However, every time I try doing
>>>>>> that, MPI aborts. The backtrace is showing me that there is a problem
>>>>>> in the
>>>>>> ADIOI_Calc_my_off_len() function. There is a variable there:
>>>>>> MPI_Aint filetype_extent;
>>>>>>
>>>>>> and the value of the variable is filetype_extent = 0 whenever it
>>>>>> executes
>>>>>> MPI_Type_extent(fd->filetype, &filetype_extent);
>>>>>> Hence, when it reaches the statement:
>>>>>>     n_filetypes = (offset - flat_file->indices[0]) / filetype_extent;
>>>>>> I always get SIGFPE. Is there a solution to this problem? Can I create
>>>>>> such a big file?
>>>>>> I checked the value of the variable while creating a file of up to
>>>>>> 2 GB and it is NOT zero, which makes me conclude that there is an
>>>>>> overflow when I am specifying 8 GB.
>>>>>>
>>>>>> Thanks and Regards,
>>>>>> Christina.
>>>>>>
>>>>>> PS: I am using the PVFS2 filesystem with mpich2-1.0.8 and pvfs-2.8.0.
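
For context, a self-contained sketch of the failing pattern (an illustration,
not the attached test case; the PVFS2 file name and the block-row
decomposition are assumptions): each of the 16 ranks writes its rows of the
32768 x 32768 double array collectively through a subarray file view, so the
filetype extent is the full 8 GB; with a 32-bit MPI_Aint that extent wraps to
0 and the division quoted above raises SIGFPE.

    /* illustrative reproduction: each rank writes N/nprocs rows of an
     * N x N array of doubles through a subarray file view (assumes
     * nprocs divides N).  The filetype extent is N*N*8 bytes = 8 GB,
     * which overflows a 32-bit MPI_Aint, so ADIO sees
     * filetype_extent == 0 and ADIOI_Calc_my_off_len faults. */
    #include <mpi.h>
    #include <stdlib.h>

    #define N 32768

    int main(int argc, char **argv)
    {
        int rank, nprocs;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

        int sizes[2]    = { N, N };
        int subsizes[2] = { N / nprocs, N };
        int starts[2]   = { rank * (N / nprocs), 0 };

        MPI_Datatype filetype;
        MPI_Type_create_subarray(2, sizes, subsizes, starts,
                                 MPI_ORDER_C, MPI_DOUBLE, &filetype);
        MPI_Type_commit(&filetype);

        double *buf = calloc((size_t)subsizes[0] * N, sizeof(double));

        MPI_File fh;
        MPI_File_open(MPI_COMM_WORLD, "pvfs2:testfile",  /* placeholder name */
                      MPI_MODE_CREATE | MPI_MODE_WRONLY,
                      MPI_INFO_NULL, &fh);
        MPI_File_set_view(fh, 0, MPI_DOUBLE, filetype, "native",
                          MPI_INFO_NULL);
        MPI_File_write_all(fh, buf, subsizes[0] * N, MPI_DOUBLE,
                           MPI_STATUS_IGNORE);

        MPI_File_close(&fh);
        MPI_Type_free(&filetype);
        free(buf);
        MPI_Finalize();
        return 0;
    }

With a 64-bit MPI_Aint (as in the BG/P patches discussed above), the same
program has a well-defined 8 GB filetype extent and the division is safe.
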
>>>>>
>>>>> --
>>>>> Pavan Balaji
>>>>> http://www.mcs.anl.gov/~balaji
>>>>>
>>>
>
> --
> Pavan Balaji
> http://www.mcs.anl.gov/~balaji
>

