[mpich-discuss] Overflow in MPI_Aint

Pavan Balaji balaji at mcs.anl.gov
Tue Jul 21 15:21:13 CDT 2009


Christina,

If you are upgrading, mpich2-1.1.1 should be out today. You might as well 
upgrade to that.

  -- Pavan

On 07/21/2009 02:42 PM, Christina Patrick wrote:
> Hi Pavan,
> 
> Thank you very much for letting me know. It seems that I have no other
> option but to upgrade to 1.1.
> 
> Thanks and Regards,
> Christina.
> 
> On Tue, Jul 21, 2009 at 2:55 PM, Pavan Balaji<balaji at mcs.anl.gov> wrote:
>> Joe: I believe all the Aint related patches have already gone into 1.1. Is
>> something still missing?
>>
>> Christina: If you update to mpich2-1.1, you can try the --with-aint-size
>> configure option. Note that this has not been tested on anything other than
>> BG/P, but it might be worth a shot.
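>>
>> Once it is rebuilt, a quick sanity check (just a sketch using only
>> standard MPI calls, nothing specific to your setup) is to query the
>> extent of a nominally 8 GB datatype and see whether it comes back as
>> 8589934592 rather than 0 or garbage:
>>
>> #include <mpi.h>
>> #include <stdio.h>
>>
>> int main(int argc, char **argv)
>> {
>>     MPI_Datatype big;
>>     MPI_Aint extent;
>>
>>     MPI_Init(&argc, &argv);
>>     /* 32768 * 32768 doubles = 2^33 bytes; needs a 64-bit MPI_Aint */
>>     MPI_Type_contiguous(32768 * 32768, MPI_DOUBLE, &big);
>>     MPI_Type_commit(&big);
>>     MPI_Type_extent(big, &extent);
>>     printf("sizeof(MPI_Aint) = %d, extent = %lld (want 8589934592)\n",
>>            (int)sizeof(MPI_Aint), (long long)extent);
>>     MPI_Type_free(&big);
>>     MPI_Finalize();
>>     return 0;
>> }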
>>
>>  -- Pavan
>>
>> On 07/21/2009 01:49 PM, Christina Patrick wrote:
>>> Hi Joe,
>>>
>>> I am attaching my test case in this email. If you run it with any
>>> number of processes except one, it will give you the SIGFPE error.
>>> Similarly, if you change the write in this program to a read, you will
>>> get the same problem.
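>>>
>>> In outline, the pattern is just a collective write through a file
>>> view whose filetype spans the whole 8 GB array. A minimal sketch of
>>> that pattern (not the exact attached program; the file path and the
>>> row-block split across 16 ranks are placeholders) looks roughly like
>>> this:
>>>
>>> #include <mpi.h>
>>> #include <stdlib.h>
>>>
>>> #define N 32768
>>>
>>> int main(int argc, char **argv)
>>> {
>>>     int rank, nprocs, count;
>>>     int sizes[2], subsizes[2], starts[2];
>>>     MPI_Datatype filetype;
>>>     MPI_File fh;
>>>     double *buf;
>>>
>>>     MPI_Init(&argc, &argv);
>>>     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>>>     MPI_Comm_size(MPI_COMM_WORLD, &nprocs); /* assumes nprocs divides N */
>>>
>>>     /* whole 8 GB array, split into blocks of rows, one block per rank */
>>>     sizes[0]    = N;          sizes[1]    = N;
>>>     subsizes[0] = N / nprocs; subsizes[1] = N;
>>>     starts[0]   = rank * subsizes[0]; starts[1] = 0;
>>>
>>>     /* this filetype's extent is the full 2^33 bytes */
>>>     MPI_Type_create_subarray(2, sizes, subsizes, starts, MPI_ORDER_C,
>>>                              MPI_DOUBLE, &filetype);
>>>     MPI_Type_commit(&filetype);
>>>
>>>     count = subsizes[0] * N;               /* doubles owned by this rank */
>>>     buf   = calloc((size_t)count, sizeof(double));
>>>
>>>     MPI_File_open(MPI_COMM_WORLD, "pvfs2:/path/to/bigfile",
>>>                   MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
>>>     MPI_File_set_view(fh, 0, MPI_DOUBLE, filetype, "native",
>>>                       MPI_INFO_NULL);
>>>     /* the SIGFPE shows up in here with more than one process; switching
>>>        to MPI_File_read_all fails the same way */
>>>     MPI_File_write_all(fh, buf, count, MPI_DOUBLE, MPI_STATUS_IGNORE);
>>>
>>>     MPI_File_close(&fh);
>>>     MPI_Type_free(&filetype);
>>>     free(buf);
>>>     MPI_Finalize();
>>>     return 0;
>>> }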
>>>
>>> I would certainly appreciate a patch for this problem. If it is not
>>> too much trouble, could you please send it to me? I could then try
>>> making the corresponding changes to my setup.
>>>
>>> Thanks and Regards,
>>> Christina.
>>>
>>> On Tue, Jul 21, 2009 at 2:06 PM, Joe Ratterman<jratt at us.ibm.com> wrote:
>>>> Christina,
>>>> Blue Gene/P is a 32-bit platform where we have hit similar problems.
>>>> To get around this, we increased the size of MPI_Aint in MPICH2 to 64
>>>> bits, larger than void*.  I suspect that your test case would work on
>>>> our system, and I would like to see your test code if that is
>>>> possible, so we can make sure we have it correct.
>>>> If you are interested, we have patches against 1.0.7 and 1.1.0 that
>>>> you can use (we skipped 1.0.8).  If you can build MPICH2 using those
>>>> patches, you may be able to run your application.  On the other hand,
>>>> they may be too specific to our platform.  We have been working with
>>>> ANL to incorporate our changes into the standard MPICH2 releases, but
>>>> there isn't a lot of demand for 64-bit MPI-IO on 32-bit machines.
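>>>>
>>>> In the meantime, one way to sidestep the 32-bit MPI_Aint entirely (a
>>>> sketch only, not part of our patches: it assumes a contiguous
>>>> row-block chunk per rank and a placeholder path) is to skip the big
>>>> file view and write at explicit offsets, since MPI_Offset is normally
>>>> 64 bits even on 32-bit builds:
>>>>
>>>> #include <mpi.h>
>>>> #include <stdlib.h>
>>>>
>>>> #define N 32768
>>>>
>>>> int main(int argc, char **argv)
>>>> {
>>>>     int rank, nprocs, count;
>>>>     MPI_Offset offset;
>>>>     MPI_File fh;
>>>>     double *buf;
>>>>
>>>>     MPI_Init(&argc, &argv);
>>>>     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>>>>     MPI_Comm_size(MPI_COMM_WORLD, &nprocs); /* assumes nprocs divides N */
>>>>
>>>>     count  = (N / nprocs) * N;             /* doubles owned by this rank */
>>>>     /* 64-bit byte offset of this rank's block of rows */
>>>>     offset = (MPI_Offset)rank * count * (MPI_Offset)sizeof(double);
>>>>     buf    = calloc((size_t)count, sizeof(double));
>>>>
>>>>     MPI_File_open(MPI_COMM_WORLD, "pvfs2:/path/to/bigfile",
>>>>                   MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
>>>>     /* default file view (etype/filetype are MPI_BYTE), so ROMIO never
>>>>        sees a filetype whose extent overflows MPI_Aint */
>>>>     MPI_File_write_at_all(fh, offset, buf, count, MPI_DOUBLE,
>>>>                           MPI_STATUS_IGNORE);
>>>>     MPI_File_close(&fh);
>>>>
>>>>     free(buf);
>>>>     MPI_Finalize();
>>>>     return 0;
>>>> }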
>>>>
>>>> Thanks,
>>>> Joe Ratterman
>>>> IBM Blue Gene/P Messaging
>>>> jratt at us.ibm.com
>>>>
>>>>
>>>> On Fri, Jul 17, 2009 at 7:12 PM, Christina Patrick
>>>> <christina.subscribes at gmail.com> wrote:
>>>>> Hi Pavan,
>>>>>
>>>>> I ran the command
>>>>>
>>>>> $ getconf | grep -i WORD
>>>>> WORD_BIT=32
>>>>>
>>>>> So I guess it is a 32 bit system.
>>>>>
>>>>> Thanks and Regards,
>>>>> Christina.
>>>>>
>>>>> On Fri, Jul 17, 2009 at 8:06 PM, Pavan Balaji<balaji at mcs.anl.gov> wrote:
>>>>>> Is it a 32-bit system? MPI_Aint is the size of a (void *), so on 32-bit
>>>>>> systems it's restricted to 2GB.
>>>>>>
>>>>>>  -- Pavan
>>>>>>
>>>>>> On 07/17/2009 07:04 PM, Christina Patrick wrote:
>>>>>>> Hi Everybody,
>>>>>>>
>>>>>>> I am trying to create an 8 GB file (a 32768 x 32768 array of
>>>>>>> 8-byte doubles) using 16 MPI processes. However, every time I try
>>>>>>> doing that, MPI aborts. The backtrace shows me that there is a
>>>>>>> problem in the ADIOI_Calc_my_off_len() function. There is a
>>>>>>> variable there:
>>>>>>> MPI_Aint filetype_extent;
>>>>>>>
>>>>>>> and the value of filetype_extent is 0 whenever it executes
>>>>>>> MPI_Type_extent(fd->filetype, &filetype_extent);
>>>>>>> Hence, when it reaches the statement:
>>>>>>>   335    n_filetypes = (offset - flat_file->indices[0]) / filetype_extent;
>>>>>>> I always get SIGFPE. Is there a solution to this problem? Can I create
>>>>>>> such a big file?
>>>>>>> I checked the value of the variable while creating a file of up to
>>>>>>> 2G, and it is NOT zero, which makes me conclude that there is an
>>>>>>> overflow when I specify 8G.
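>>>>>>>
>>>>>>> (As a tiny sketch of what I suspect is happening, assuming
>>>>>>> MPI_Aint is a 4-byte signed integer here: 8 GB is exactly 2^33
>>>>>>> bytes, which truncates to 0 in 32 bits, matching the
>>>>>>> filetype_extent value I see.)
>>>>>>>
>>>>>>> #include <stdio.h>
>>>>>>> #include <mpi.h>   /* only needed for the MPI_Aint typedef */
>>>>>>>
>>>>>>> int main(void)
>>>>>>> {
>>>>>>>     long long true_extent = 32768LL * 32768LL * 8LL; /* 2^33 = 8 GB */
>>>>>>>     MPI_Aint  as_aint = (MPI_Aint)true_extent; /* truncates on 32-bit */
>>>>>>>
>>>>>>>     printf("sizeof(MPI_Aint)=%d true=%lld stored=%lld\n",
>>>>>>>            (int)sizeof(MPI_Aint), true_extent, (long long)as_aint);
>>>>>>>     /* on a 32-bit build this prints stored=0, the same zero that
>>>>>>>        ends up in filetype_extent and causes the SIGFPE */
>>>>>>>     return 0;
>>>>>>> }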
>>>>>>>
>>>>>>> Thanks and Regards,
>>>>>>> Christina.
>>>>>>>
>>>>>>> PS: I am using the PVFS2 filesystem with mpich2-1.0.8 and pvfs-2.8.0.
>>>>>> --
>>>>>> Pavan Balaji
>>>>>> http://www.mcs.anl.gov/~balaji
>>>>>>
>> --
>> Pavan Balaji
>> http://www.mcs.anl.gov/~balaji
>>

-- 
Pavan Balaji
http://www.mcs.anl.gov/~balaji

