[petsc-users] Unable to read in values thru namelist in Fortran after using PETSc 64bit in linux
Smith, Barry F.
bsmith at mcs.anl.gov
Wed Apr 3 01:05:27 CDT 2019
Based on the data below, I am guessing you are passing PetscInt arguments for all the arguments to MPI_ALLGATHERV. This won't work if PetscInt variables are 64 bits in size. Let's look at the arguments from https://www.mpich.org/static/docs/v3.1.x/www3/MPI_Allgather.html
int MPI_Allgather(const void *sendbuf, int sendcount, MPI_Datatype sendtype,
void *recvbuf, int recvcount, MPI_Datatype recvtype,
MPI_Comm comm)
All the arguments labeled void* can be passed PetscInt-declared variables, but the ones labeled int must be passed regular 32-bit integers. (To help our own development, we declare all such variables as PetscMPIInt in PETSc.)
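As a rough illustration of why the int arguments matter, here is a sketch that uses neither PETSc nor MPI. The helper count_seen_by_library is a hypothetical stand-in for a library that reads a counts argument as 32-bit integers, and narrow_counts is an assumed helper that copies 64-bit values into a 32-bit array and refuses values that do not fit. Both names are made up for this example. As a side note, the 8952 bytes in the error message equals 1119 x 8, which is consistent with 8-byte elements being sent.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the library side of a call such as
   MPI_Allgatherv: it treats the memory behind a counts argument as an
   array of 32-bit ints and returns what it would see at `index`. */
int32_t count_seen_by_library(const void *counts, int index) {
    int32_t seen;
    assert(index >= 0);
    /* memcpy reinterprets the raw bytes without aliasing problems */
    memcpy(&seen, (const unsigned char *)counts + (size_t)index * sizeof seen,
           sizeof seen);
    return seen;
}

/* Assumed helper: copies n 64-bit counts into a 32-bit array that is
   safe to pass where an int* is expected.
   Returns 0 on success, -1 if a value does not fit in 32 bits. */
int narrow_counts(const int64_t *in, int32_t *out, int n) {
    assert(n >= 0);
    for (int i = 0; i < n; i++) {
        if (in[i] < INT32_MIN || in[i] > INT32_MAX) return -1;
        out[i] = (int32_t)in[i];
    }
    return 0;
}
```

If a two-element int64_t array is handed to count_seen_by_library, position 1 yields one half of the first element rather than the second count. That kind of mismatch can plausibly end in a "Message truncated" error. In Fortran the analogous fix would be to declare the count and displacement arrays as PetscMPIInt.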
Good luck. If this does not resolve your problem, please send us a small standalone example we can run that reproduces it.
Barry
> On Apr 3, 2019, at 12:48 AM, TAY wee-beng <zonexo at gmail.com> wrote:
>
> Hi,
>
> I just encountered an mpi_allgatherv problem when using 64bit PETSc:
>
> call MPI_ALLGATHERV(tmp_mpi_data,counter,MPIU_INTEGER,tmp_mpi_data2,counter_global,idisp,MPIU_INTEGER,MPI_COMM_WORLD,ierr)
>
> The error is:
>
> Fatal error in PMPI_Allgatherv: Message truncated, error stack:
> PMPI_Allgatherv(1452).....: MPI_Allgatherv(sbuf=0x75cfdc0, scount=1119, dtype=0x4c000831, rbuf=0x75d8600, rcounts=0x7ffffffdb880, displs=0x7ffffffdb860, dtype=0x4c000831, MPI_COMM_WORLD) failed
> MPIR_Allgatherv_impl(1013): fail failed
> MPIR_Allgatherv(967)......: fail failed
>
> MPIR_Allgatherv_intra(222): fail failed
> MPIR_Localcopy(107).......: Message truncated; 8952 bytes received but buffer size is 8864
>
> The variables are all defined as PetscInt. The strange thing is that I made 2-3 MPI_ALLGATHERV calls which are exactly the same, but only the last one has a problem. Should I change all these variables to PetscMPIInt?
>
> Also, can I change all PetscInt in the code to PetscMPIInt, except if it's labeled void *?
>
>
> Thank you very much.
>
> Yours sincerely,
>
> ================================================
> TAY Wee-Beng (Zheng Weiming) 郑伟明
> Personal research webpage: http://tayweebeng.wixsite.com/website
> Youtube research showcase: https://www.youtube.com/channel/UC72ZHtvQNMpNs2uRTSToiLA
> linkedin: www.linkedin.com/in/tay-weebeng
> ================================================
>
> On 21/9/2018 2:57 AM, Smith, Barry F. wrote:
>> Yes, you need to go through your code and check each MPI call and make sure you use PetscMPIInt for integer arguments and PetscInt for the void* arguments and also make sure that the data type you use in the MPI calls (when communicating PetscInt) is MPIU_INT.
>>
>> You should not need a fancy debugger to find out the crash point. Just a basic debugger like gdb, lldb, or dbx will
>>
>> Barry
>>
>>
>>> On Sep 20, 2018, at 8:57 AM, TAY wee-beng <zonexo at gmail.com> wrote:
>>>
>>> Hi,
>>>
>>> Sorry I'm still a bit confused. My 64bit code still doesn't work once I use more than 1 proc. It just aborts at some point. I've been trying to use the ARM Forge MPI debugging tool to find the error but it's a bit difficult to back trace.
>>>
>>> So I should carefully inspect each mpi subroutine or function, is that correct?
>>>
>>> If it's INT, then I should use PetscMPIInt. If it's labeled void *, I should use PetscInt. Is that so?
>>>
>>> Thank you very much
>>>
>>> Yours sincerely,
>>>
>>> TAY Wee-Beng 郑伟明 (Zheng Weiming)
>>>
>>> On 19/9/2018 12:49 AM, Smith, Barry F. wrote:
>>>> PetscMPIInt (or integer) is for all lengths passed to MPI functions; simply look at the prototype for the MPI function you care about and it will tell you which arguments are integer. The DATA you are passing into the MPI arrays (which are labeled void * in the manual pages) should be PetscInt.
>>>>
>>>>
>>>>
>>>>> On Sep 18, 2018, at 1:39 AM, TAY wee-beng <zonexo at gmail.com> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> In that case, does it apply to all MPI subroutines such as MPI_ALLGATHER?
>>>>>
>>>>> In other words, must I assign local_array_length etc as PetscMPIInt?
>>>>>
>>>>> call MPI_ALLGATHER(local_array_length,1,MPIU_INTEGER,array_length,1,MPIU_INTEGER,MPI_COMM_WORLD,ierr)
>>>>>
>>>>> Or is it ok to change all integers from PetscInt to PetscMPIInt?
>>>>>
>>>>> With the exception of ierr - PetscErrorCode
>>>>>
>>>>>
>>>>> Thank you very much.
>>>>>
>>>>> Yours sincerely,
>>>>>
>>>>> TAY Wee-Beng (Zheng Weiming) 郑伟明
>>>>>
>>>>> On 18/9/2018 1:39 PM, Balay, Satish wrote:
>>>>>> https://www.mpich.org/static/docs/v3.1/www3/MPI_Comm_size.html
>>>>>>
>>>>>> int MPI_Comm_size( MPI_Comm comm, int *size )
>>>>>>
>>>>>> i.e. there is no PetscInt here. [MPI does not know about PETSc datatypes]
>>>>>>
>>>>>> For convenience we provide PetscMPIInt to keep track of such variables
>>>>>> [similarly PetscBLASInt]. E.g., check src/vec/vec/examples/tests/ex2f.F
>>>>>>
>>>>>> Satish
>>>>>>
>>>>>> On Tue, 18 Sep 2018, TAY wee-beng wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I managed to find the error appearing after using PETSc 64bit in linux -
>>>>>>>
>>>>>>> call MPI_COMM_SIZE(MPI_COMM_WORLD, num_procs, ierr)
>>>>>>>
>>>>>>> I have declared num_procs as PetscInt and I got 0 instead of 1 (for 1 proc)
>>>>>>>
>>>>>>> Declaring num_procs as integer solves the problem.
>>>>>>>
>>>>>>> Is this supposed to be the case? Or is it a bug?
>>>>>>>
>>>>>>> Thank you very much.
>>>>>>>
>>>>>>> Yours sincerely,
>>>>>>>
>>>>>>> TAY Wee-Beng (Zheng Weiming) 郑伟明
>>>>>>>
>>>>>>> On 8/9/2018 1:14 AM, Smith, Barry F. wrote:
>>>>>>>> You can try valgrind
>>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
>>>>>>>>
>>>>>>>> Barry
>>>>>>>>
>>>>>>>>
>>>>>>>>> On Sep 7, 2018, at 1:44 AM, TAY wee-beng <zonexo at gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I found that I am unable to read in values thru namelist in Fortran after
>>>>>>>>> using PETSc 64bit in linux.
>>>>>>>>>
>>>>>>>>> I have a parameter txt file which is read in using namelist in Fortran:
>>>>>>>>>
>>>>>>>>> namelist /body_input/ no_body, convex_body, motion_type, hover, wing_config
>>>>>>>>> ...
>>>>>>>>>
>>>>>>>>> open (unit = 44, FILE = "ibm3d_input.txt", status = "old", iostat = openstatus(4))
>>>>>>>>>
>>>>>>>>> if (openstatus(4) > 0) then
>>>>>>>>>
>>>>>>>>> print *, "ibm3d_input file not present or wrong filename."
>>>>>>>>>
>>>>>>>>> stop
>>>>>>>>>
>>>>>>>>> end if
>>>>>>>>>
>>>>>>>>> read (44,nml = solver_input)
>>>>>>>>>
>>>>>>>>> read (44,nml = grid_input)
>>>>>>>>>
>>>>>>>>> read (44,nml = body_input)...
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> After using PETSc 64bit, my code aborts and I realise that it is because
>>>>>>>>> the values have become NaN. Strangely, it does not occur in windows with
>>>>>>>>> VS2008.
>>>>>>>>>
>>>>>>>>> I wonder if it's a bug with the Intel Fortran compiler 2018.
>>>>>>>>>
>>>>>>>>> Has anyone had a similar experience?
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Thank you very much.
>>>>>>>>>
>>>>>>>>> Yours sincerely,
>>>>>>>>>
>>>>>>>>> TAY Wee-Beng (Zheng Weiming) 郑伟明
>>>>>>>>>