[petsc-dev] Cannot locate file: share/petsc/datafiles/matrices/small

Junchao Zhang junchao.zhang at gmail.com
Sun Sep 12 14:38:45 CDT 2021


An old issue with SF_Window is at
https://gitlab.com/petsc/petsc/-/issues/555, though it reports a different
error.

--Junchao Zhang


On Sun, Sep 12, 2021 at 2:20 PM Junchao Zhang <junchao.zhang at gmail.com>
wrote:

> We have met SF + MPI window errors before.  Stefano wrote the code, which
> I don't think was worth doing: SF with MPI one-sided is hard to get
> correct (because of the shared-memory programming model), performs badly,
> and has no users. I would suggest we just disable the test and the
> feature. Stefano, what do you think?
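
For context, PetscSF's window type is built on MPI one-sided (RMA)
communication, and the failure reported further down in this thread happens
inside MPI_Accumulate during a fence-synchronized epoch. The following is
only a minimal sketch of that general pattern in plain MPI, not PETSc's
actual PetscSF window code; the buffer names and the MPI_SUM reduction are
illustrative:

  /* Fence-synchronized MPI_Accumulate: each non-root rank adds 1 into
     rank 0's exposed integer.
     Build: mpicc rma_fence.c -o rma_fence
     Run:   mpiexec -n 4 ./rma_fence */
  #include <mpi.h>
  #include <stdio.h>

  int main(int argc, char **argv)
  {
    int     rank, size, rootdata = 0, one = 1;
    MPI_Win win;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* every rank exposes one int; only rank 0's is actually targeted */
    MPI_Win_create(&rootdata, (MPI_Aint)sizeof(int), sizeof(int),
                   MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    MPI_Win_fence(0, win);                      /* open the access epoch */
    if (rank > 0) {
      MPI_Accumulate(&one, 1, MPI_INT, /* target rank */ 0,
                     /* target displacement */ 0, 1, MPI_INT, MPI_SUM, win);
    }
    MPI_Win_fence(0, win);                      /* close the access epoch */

    if (rank == 0) printf("rootdata = %d (expected %d)\n", rootdata, size - 1);
    MPI_Win_free(&win);
    MPI_Finalize();
    return 0;
  }

Getting the exposure/access epochs and the target displacements right for an
arbitrary star-forest graph is what makes this backend fragile compared with
the default two-sided -sf_type basic, which is part of why disabling it is
being suggested here.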
>
> --Junchao Zhang
>
>
> On Sun, Sep 12, 2021 at 2:10 PM Pierre Jolivet <pierre at joliv.et> wrote:
>
>>
>> On 12 Sep 2021, at 8:56 PM, Matthew Knepley <knepley at gmail.com> wrote:
>>
>> On Sun, Sep 12, 2021 at 2:49 PM Antonio T. sagitter <
>> sagitter at fedoraproject.org> wrote:
>>
>>> Attached are the configure.log/make.log from an MPI build on Fedora 34
>>> x86_64 where the error below occurred.
>>>
>>
>> This is OpenMPI 4.1.0. Is that the only MPI you build? My first
>> inclination is that this is an MPI implementation bug.
>>
>> Junchao, do we have an OpenMPI build in the CI?
>>
>>
>> config/examples/arch-ci-linux-cuda-double-64idx.py:  '--download-openmpi=1',
>> config/examples/arch-ci-linux-pkgs-dbg-ftn-interfaces.py:  '--download-openmpi=1',
>> config/examples/arch-ci-linux-pkgs-opt.py:  '--download-openmpi=1',
>>
>> config/BuildSystem/config/packages/OpenMPI.py uses version 4.1.0 as well.
>> I’m not sure PETSc is to blame here, Antonio. You may want to ditch the
>> OpenMPI shipped by your package manager and try --download-openmpi
>> instead, just for a quick sanity check.
>>
>> Thanks,
>> Pierre
>>
>>   Thanks,
>>
>>      Matt
>>
>>
>>> On 9/12/21 19:18, Antonio T. sagitter wrote:
>>> > Okay. I will try to set the DATAFILESPATH option correctly.
>>> >
>>> > I also see this error:
>>> >
>>> > not ok vec_is_sf_tutorials-ex1_4+sf_window_sync-fence_sf_window_flavor-create # Error code: 68
>>> > #    PetscSF Object: 4 MPI processes
>>> > #      type: window
>>> > #      [0] Number of roots=3, leaves=2, remote ranks=2
>>> > #      [0] 0 <- (3,1)
>>> > #      [0] 1 <- (1,0)
>>> > #      [1] Number of roots=2, leaves=3, remote ranks=2
>>> > #      [1] 0 <- (0,1)
>>> > #      [1] 1 <- (2,0)
>>> > #      [1] 2 <- (0,2)
>>> > #      [2] Number of roots=2, leaves=3, remote ranks=3
>>> > #      [2] 0 <- (1,1)
>>> > #      [2] 1 <- (3,0)
>>> > #      [2] 2 <- (0,2)
>>> > #      [3] Number of roots=2, leaves=3, remote ranks=2
>>> > #      [3] 0 <- (2,1)
>>> > #      [3] 1 <- (0,0)
>>> > #      [3] 2 <- (0,2)
>>> > #      [0] Roots referenced by my leaves, by rank
>>> > #      [0] 1: 1 edges
>>> > #      [0]    1 <- 0
>>> > #      [0] 3: 1 edges
>>> > #      [0]    0 <- 1
>>> > #      [1] Roots referenced by my leaves, by rank
>>> > #      [1] 0: 2 edges
>>> > #      [1]    0 <- 1
>>> > #      [1]    2 <- 2
>>> > #      [1] 2: 1 edges
>>> > #      [1]    1 <- 0
>>> > #      [2] Roots referenced by my leaves, by rank
>>> > #      [2] 0: 1 edges
>>> > #      [2]    2 <- 2
>>> > #      [2] 1: 1 edges
>>> > #      [2]    0 <- 1
>>> > #      [2] 3: 1 edges
>>> > #      [2]    1 <- 0
>>> > #      [3] Roots referenced by my leaves, by rank
>>> > #      [3] 0: 2 edges
>>> > #      [3]    1 <- 0
>>> > #      [3]    2 <- 2
>>> > #      [3] 2: 1 edges
>>> > #      [3]    0 <- 1
>>> > #      current flavor=CREATE synchronization=FENCE MultiSF sort=rank-order
>>> > #        current info=MPI_INFO_NULL
>>> > #    [buildhw-x86-09:1135574] *** An error occurred in MPI_Accumulate
>>> > #    [buildhw-x86-09:1135574] *** reported by process [3562602497,3]
>>> > #    [buildhw-x86-09:1135574] *** on win rdma window 4
>>> > #    [buildhw-x86-09:1135574] *** MPI_ERR_RMA_RANGE: invalid RMA address range
>>> > #    [buildhw-x86-09:1135574] *** MPI_ERRORS_ARE_FATAL (processes in this win will now abort,
>>> > #    [buildhw-x86-09:1135574] ***    and potentially your MPI job)
>>> > #    [buildhw-x86-09.iad2.fedoraproject.org:1135567] 3 more processes have sent help message help-mpi-errors.txt / mpi_errors_are_fatal
>>> > #    [buildhw-x86-09.iad2.fedoraproject.org:1135567] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
>>> >
>>> > Looks like an error related to OpenMPI-4*:
>>> > https://github.com/open-mpi/ompi/issues/6374
>>> >
>>>
>>> --
>>> ---
>>> Antonio Trande
>>> Fedora Project
>>> mailto: sagitter at fedoraproject.org
>>> GPG key: 0x29FBC85D7A51CC2F
>>> GPG key server: https://keyserver1.pgp.com/
>>
>>
>>
>> --
>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which their
>> experiments lead.
>> -- Norbert Wiener
>>
>> https://www.cse.buffalo.edu/~knepley/
>>
>>
>>