[petsc-dev] Cannot locate file: share/petsc/datafiles/matrices/small

Junchao Zhang junchao.zhang at gmail.com
Sun Sep 12 14:20:18 CDT 2021


We have hit SF window errors before.  Stefano wrote that code, which I don't
think was worth doing. SF with MPI one-sided is hard to get right (due to
shared memory programming), performs badly, and has no users.
I suggest we just disable the test and the feature.  Stefano, what do
you think?

--Junchao Zhang


On Sun, Sep 12, 2021 at 2:10 PM Pierre Jolivet <pierre at joliv.et> wrote:

>
> On 12 Sep 2021, at 8:56 PM, Matthew Knepley <knepley at gmail.com> wrote:
>
> On Sun, Sep 12, 2021 at 2:49 PM Antonio T. sagitter <
> sagitter at fedoraproject.org> wrote:
>
>> Attached are configure.log/make.log from an MPI build on Fedora 34
>> x86_64 where the error below occurred.
>>
>
> This is OpenMPI 4.1.0. Is that the only MPI you build? My first
> inclination is that this is an MPI implementation bug.
>
> Junchao, do we have an OpenMPI build in the CI?
>
>
> config/examples/arch-ci-linux-cuda-double-64idx.py:
>  '--download-openmpi=1',
> config/examples/arch-ci-linux-pkgs-dbg-ftn-interfaces.py:
>  '--download-openmpi=1',
> config/examples/arch-ci-linux-pkgs-opt.py:  '--download-openmpi=1',
>
> config/BuildSystem/config/packages/OpenMPI.py uses version 4.1.0 as well.
> I’m not sure PETSc is to blame here, Antonio. You may want to ditch
> the OpenMPI shipped by your package manager and try --download-openmpi as
> well, just for a quick sanity check.
>
> Thanks,
> Pierre
>
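The sanity check suggested above could be set up along these lines. This is a minimal sketch, not a tested recipe: the only option taken from the thread is '--download-openmpi=1' (as in the CI files listed above); the PETSC_ARCH name and the helper function are hypothetical, and the script only prints the command line rather than running configure.

```python
import shlex

# Options for a throwaway build that lets PETSc download and build its own
# OpenMPI instead of using the distribution package.
configure_options = [
    "PETSC_ARCH=arch-sanity-openmpi",  # hypothetical arch name, pick your own
    "--download-openmpi=1",            # same flag as in the CI example files
]


def configure_command(options):
    """Return the shell command line that passes these options to ./configure."""
    return " ".join(["./configure"] + [shlex.quote(o) for o in options])


if __name__ == "__main__":
    # Print the command so it can be reviewed before running it by hand
    # from the top of the PETSc source tree.
    print(configure_command(configure_options))
```

If the failing SF window test passes with that build, the packaged OpenMPI becomes the more likely suspect.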
>   Thanks,
>
>      Matt
>
>
>> On 9/12/21 19:18, Antonio T. sagitter wrote:
>> > Okay. I will try to set the DATAFILESPATH options correctly.
>> >
>> > I also see this error:
>> >
>> > not ok vec_is_sf_tutorials-ex1_4+sf_window_sync-fence_sf_window_flavor-create # Error code: 68
>> >
>> > #    PetscSF Object: 4 MPI processes
>> >
>> > #      type: window
>> >
>> > #      [0] Number of roots=3, leaves=2, remote ranks=2
>> >
>> > #      [0] 0 <- (3,1)
>> >
>> > #      [0] 1 <- (1,0)
>> >
>> > #      [1] Number of roots=2, leaves=3, remote ranks=2
>> >
>> > #      [1] 0 <- (0,1)
>> >
>> > #      [1] 1 <- (2,0)
>> >
>> > #      [1] 2 <- (0,2)
>> >
>> > #      [2] Number of roots=2, leaves=3, remote ranks=3
>> >
>> > #      [2] 0 <- (1,1)
>> >
>> > #      [2] 1 <- (3,0)
>> >
>> > #      [2] 2 <- (0,2)
>> >
>> > #      [3] Number of roots=2, leaves=3, remote ranks=2
>> >
>> > #      [3] 0 <- (2,1)
>> >
>> > #      [3] 1 <- (0,0)
>> >
>> > #      [3] 2 <- (0,2)
>> >
>> > #      [0] Roots referenced by my leaves, by rank
>> >
>> > #      [0] 1: 1 edges
>> >
>> > #      [0]    1 <- 0
>> >
>> > #      [0] 3: 1 edges
>> >
>> > #      [0]    0 <- 1
>> >
>> > #      [1] Roots referenced by my leaves, by rank
>> >
>> > #      [1] 0: 2 edges
>> >
>> > #      [1]    0 <- 1
>> >
>> > #      [1]    2 <- 2
>> >
>> > #      [1] 2: 1 edges
>> >
>> > #      [1]    1 <- 0
>> >
>> > #      [2] Roots referenced by my leaves, by rank
>> >
>> > #      [2] 0: 1 edges
>> >
>> > #      [2]    2 <- 2
>> >
>> > #      [2] 1: 1 edges
>> >
>> > #      [2]    0 <- 1
>> >
>> > #      [2] 3: 1 edges
>> >
>> > #      [2]    1 <- 0
>> >
>> > #      [3] Roots referenced by my leaves, by rank
>> >
>> > #      [3] 0: 2 edges
>> >
>> > #      [3]    1 <- 0
>> >
>> > #      [3]    2 <- 2
>> >
>> > #      [3] 2: 1 edges
>> >
>> > #      [3]    0 <- 1
>> >
>> > #      current flavor=CREATE synchronization=FENCE MultiSF sort=rank-order
>> >
>> > #        current info=MPI_INFO_NULL
>> >
>> > #    [buildhw-x86-09:1135574] *** An error occurred in MPI_Accumulate
>> >
>> > #    [buildhw-x86-09:1135574] *** reported by process [3562602497,3]
>> >
>> > #    [buildhw-x86-09:1135574] *** on win rdma window 4
>> >
>> > #    [buildhw-x86-09:1135574] *** MPI_ERR_RMA_RANGE: invalid RMA address range
>> >
>> > #    [buildhw-x86-09:1135574] *** MPI_ERRORS_ARE_FATAL (processes in this win will now abort,
>> >
>> > #    [buildhw-x86-09:1135574] ***    and potentially your MPI job)
>> >
>> > #    [buildhw-x86-09.iad2.fedoraproject.org:1135567] 3 more processes have sent help message help-mpi-errors.txt / mpi_errors_are_fatal
>> >
>> > #    [buildhw-x86-09.iad2.fedoraproject.org:1135567] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
>> >
>> > Looks like an error related to OpenMPI-4*:
>> > https://github.com/open-mpi/ompi/issues/6374
>> >
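One quick cross-check on the log above is whether the printed graph itself ever points at a root index that the target rank does not own. A minimal sketch, with the root counts and edges copied by hand from the PetscSF view; the helper name is hypothetical, and this inspects only the printed graph, not the displacements actually passed to MPI_Accumulate.

```python
# Root counts per rank, from the "Number of roots=" lines in the log.
ROOTS = {0: 3, 1: 2, 2: 2, 3: 2}

# For each rank, its leaves in order, each pointing at (root rank, root index),
# from the "[r] leaf <- (rank,index)" lines in the log.
EDGES = {
    0: [(3, 1), (1, 0)],
    1: [(0, 1), (2, 0), (0, 2)],
    2: [(1, 1), (3, 0), (0, 2)],
    3: [(2, 1), (0, 0), (0, 2)],
}


def out_of_range_edges(roots, edges):
    """Return (leaf rank, leaf, root rank, root index) for every edge whose
    root index falls outside the root count of the rank it targets."""
    bad = []
    for leaf_rank, targets in edges.items():
        for leaf, (root_rank, root_index) in enumerate(targets):
            owned = roots.get(root_rank, 0)
            if not 0 <= root_index < owned:
                bad.append((leaf_rank, leaf, root_rank, root_index))
    return bad


if __name__ == "__main__":
    print(out_of_range_edges(ROOTS, EDGES))
```

An empty list means every edge in the printed graph is in range, which is consistent with (though does not prove) the MPI_ERR_RMA_RANGE coming from the MPI layer rather than from the graph.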
>>
>> --
>> ---
>> Antonio Trande
>> Fedora Project
>> mailto: sagitter at fedoraproject.org
>> GPG key: 0x29FBC85D7A51CC2F
>> GPG key server: https://keyserver1.pgp.com/
>
>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
> <http://www.cse.buffalo.edu/~knepley/>
>
>
>