[petsc-users] UCX ERROR KNEM inline copy failed

Matthew Knepley knepley at gmail.com
Tue Oct 2 10:21:01 CDT 2018


On Tue, Oct 2, 2018 at 11:16 AM Y. Yang <yangyiwei.yang at mfm.tu-darmstadt.de>
wrote:

> Dear PETSc team
>
> Recently I have been using MOOSE (http://www.mooseframework.org/), which is
> built with PETSc, and unfortunately I have run into some problems with the
> following PETSc options:
>

I do not know what problem you are reporting. I don't know what package
knem_ep.c is part of, but it's not PETSc.

  Thanks,

     Matt
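
One way to see which component is emitting the messages is to tally the log
lines by their component tag and source file. The sketch below is only an
illustration: it assumes the line layout seen in the quoted output
([timestamp] [host:pid:thread] file:line COMPONENT LEVEL), and the names
LOG_LINE and summarize are hypothetical, not part of PETSc, MOOSE, or UCX.

```python
import re
from collections import Counter

# Assumed layout, taken from the quoted log:
# [1538482623.976180] [hpb0085:22501:0]        knem_ep.c:84   UCX ERROR ...
LOG_LINE = re.compile(
    r"\[(?P<time>\d+\.\d+)\]\s+"
    r"\[(?P<host>[^:\]]+):(?P<pid>\d+):(?P<tid>\d+)\]\s+"
    r"(?P<source>[\w.]+):(?P<line>\d+)\s+"
    r"(?P<component>\w+)\s+(?P<level>ERROR|WARN)\b"
)


def summarize(log_text):
    """Count matching log lines per (component, source file) and per PID."""
    by_origin = Counter()
    by_pid = Counter()
    for m in LOG_LINE.finditer(log_text):
        by_origin[(m["component"], m["source"])] += 1
        by_pid[int(m["pid"])] += 1
    return by_origin, by_pid


if __name__ == "__main__":
    sample = (
        "[1538482623.976180] [hpb0085:22501:0]        knem_ep.c:84   UCX ERROR "
        "KNEM inline copy failed, err = -1 Invalid argument\n"
        "[1538482605.111342] [hpb0085:22502:0]        knem_ep.c:84   UCX ERROR "
        "KNEM inline copy failed, err = -1 Invalid argument\n"
    )
    origins, pids = summarize(sample)
    for (component, source), count in origins.items():
        print(component, source, count)
    for pid, count in sorted(pids.items()):
        print(pid, count)
```

In the output quoted below, every error line carries the UCX tag and the
knem_ep.c source file, so the tally would point at the communication layer
rather than at the solver options.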


> petsc_options_iname = '-pc_type -ksp_gmres_restart -sub_ksp_type
> -sub_pc_type -pc_asm_overlap -pc_factor_mat_solver_package'
>
> petsc_options_value = 'asm          1201  preonly             ilu
>             4    superlu_dist'
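
To make the option/value pairing in the two strings above easier to read, here
is a small sketch that zips them together. It assumes the names and values are
matched purely by position after splitting on whitespace; pair_options is a
hypothetical helper for illustration, not a MOOSE or PETSc function.

```python
def pair_options(names, values):
    """Pair whitespace-separated option names with values by position."""
    name_list = names.split()
    value_list = values.split()
    if len(name_list) != len(value_list):
        raise ValueError(
            f"{len(name_list)} option names but {len(value_list)} values"
        )
    return dict(zip(name_list, value_list))


if __name__ == "__main__":
    # Strings copied from the quoted input, joined onto single lines.
    names = (
        "-pc_type -ksp_gmres_restart -sub_ksp_type "
        "-sub_pc_type -pc_asm_overlap -pc_factor_mat_solver_package"
    )
    values = "asm          1201  preonly             ilu             4    superlu_dist"
    for name, value in pair_options(names, values).items():
        print(name, value)
```

Under that assumption the options read as -pc_type asm, -ksp_gmres_restart
1201, -sub_ksp_type preonly, -sub_pc_type ilu, -pc_asm_overlap 4, and
-pc_factor_mat_solver_package superlu_dist.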
>
>
> the error message is:
>
> Time Step 1, time = 1
>                  dt = 1
>
>      |residual|_2 of individual variables:
>                          c:   779.034
>                          w:   0
>                          T:   6.57948e+07
>                          gr0: 211.617
>                          gr1: 206.973
>                          gr2: 209.382
>                          gr3: 191.089
>                          gr4: 185.242
>                          gr5: 157.361
>                          gr6: 128.473
>                          gr7: 87.6029
>
>   0 Nonlinear |R| = 6.579482e+07
> [1538482623.976180] [hpb0085:22501:0]        knem_ep.c:84   UCX ERROR
> KNEM inline copy failed, err = -1 Invalid argument
> [1538482605.111342] [hpb0085:22502:0]        knem_ep.c:84   UCX ERROR
> KNEM inline copy failed, err = -1 Invalid argument
> [1538482606.761138] [hpb0085:22502:0]        knem_ep.c:84   UCX ERROR
> KNEM inline copy failed, err = -1 Invalid argument
> [1538482607.107478] [hpb0085:22502:0]        knem_ep.c:84   UCX ERROR
> KNEM inline copy failed, err = -1 Invalid argument
> [1538482605.882817] [hpb0085:22503:0]        knem_ep.c:84   UCX ERROR
> KNEM inline copy failed, err = -1 Invalid argument
> [1538482607.133543] [hpb0085:22503:0]        knem_ep.c:84   UCX ERROR
> KNEM inline copy failed, err = -1 Invalid argument
> [1538482621.905475] [hpb0085:22510:0]        knem_ep.c:84   UCX ERROR
> KNEM inline copy failed, err = -1 Invalid argument
> [1538482626.531234] [hpb0085:22510:0]        knem_ep.c:84   UCX ERROR
> KNEM inline copy failed, err = -1 Invalid argument
> [1538482627.613343] [hpb0085:22515:0]        knem_ep.c:84   UCX ERROR
> KNEM inline copy failed, err = -1 Invalid argument
> [1538482627.830489] [hpb0085:22515:0]        knem_ep.c:84   UCX ERROR
> KNEM inline copy failed, err = -1 Invalid argument
> [1538482629.852351] [hpb0085:22515:0]        knem_ep.c:84   UCX ERROR
> KNEM inline copy failed, err = -1 Invalid argument
> [1538482630.194620] [hpb0085:22515:0]        knem_ep.c:84   UCX ERROR
> KNEM inline copy failed, err = -1 Invalid argument
> [1538482630.280636] [hpb0085:22515:0]        knem_ep.c:84   UCX ERROR
> KNEM inline copy failed, err = -1 Invalid argument
> [1538482600.219314] [hpb0085:22516:0]        knem_ep.c:84   UCX ERROR
> KNEM inline copy failed, err = -1 Invalid argument
> [1538482658.960350] [hpb0085:22516:0]        knem_ep.c:84   UCX ERROR
> KNEM inline copy failed, err = -1 Invalid argument
> [1538482622.949471] [hpb0085:22517:0]        knem_ep.c:84   UCX ERROR
> KNEM inline copy failed, err = -1 Invalid argument
> [1538482612.502017] [hpb0085:22500:0]        knem_ep.c:84   UCX ERROR
> KNEM inline copy failed, err = -1 Invalid argument
> [1538482613.231970] [hpb0085:22500:0]        knem_ep.c:84   UCX ERROR
> KNEM inline copy failed, err = -1 Invalid argument
> [1538482621.417530] [hpb0085:22520:0]        knem_ep.c:84   UCX ERROR
> KNEM inline copy failed, err = -1 Invalid argument
> [1538482622.020998] [hpb0085:22520:0]        knem_ep.c:84   UCX ERROR
> KNEM inline copy failed, err = -1 Invalid argument
> [1538482606.221292] [hpb0085:22521:0]        knem_ep.c:84   UCX ERROR
> KNEM inline copy failed, err = -1 Invalid argument
> [1538482606.676987] [hpb0085:22521:0]        knem_ep.c:84   UCX ERROR
> KNEM inline copy failed, err = -1 Invalid argument
> [1538482606.896865] [hpb0085:22521:0]        knem_ep.c:84   UCX ERROR
> KNEM inline copy failed, err = -1 Invalid argument
> [1538482639.611427] [hpb0085:22522:0]        knem_ep.c:84   UCX ERROR
> KNEM inline copy failed, err = -1 Invalid argument
> [1538482631.435277] [hpb0085:22523:0]        knem_ep.c:84   UCX ERROR
> KNEM inline copy failed, err = -1 Invalid argument
> [1538482658.278343] [hpb0085:22512:0]        knem_ep.c:84   UCX ERROR
> KNEM inline copy failed, err = -1 Invalid argument
> [1538482658.396945] [hpb0085:22512:0]        knem_ep.c:84   UCX ERROR
> KNEM inline copy failed, err = -1 Invalid argument
> [1538482659.917476] [hpb0085:22512:0]        knem_ep.c:84   UCX ERROR
> KNEM inline copy failed, err = -1 Invalid argument
> [1538482660.162064] [hpb0085:22512:0]        knem_ep.c:84   UCX ERROR
> KNEM inline copy failed, err = -1 Invalid argument
> 2 total processes killed (some possibly by mpirun during cleanup)
>
>
> Here's the status of the simulation
>
> Parallelism:
>    Num Processors:          100
>    Num Threads:             1
>
> Mesh:
>    Parallel Type:           distributed
>    Mesh Dimension:          3
>    Spatial Dimension:       3
>    Nodes:
>      Total:                 2065551
>      Local:                 22774
>    Elems:
>      Total:                 2000000
>      Local:                 20006
>    Num Subdomains:          1
>    Num Partitions:          100
>    Partitioner:             parmetis
>
> Nonlinear System:
>    Num DOFs:                18589959
>    Num Local DOFs:          204966
>    Variables:               { "c" "w" "T" "gr0" "gr1" "gr2" "gr3" "gr4"
> "gr5" }
>    Finite Element Types:    "LAGRANGE"
>    Approximation Orders:    "FIRST"
>
> Auxiliary System:
>    Num DOFs:                10065551
>    Num Local DOFs:          102798
>    Variables:               "bnds" { "var_indices" "unique_grains" } {
> "M" "dM/dT" }
>    Finite Element Types:    "LAGRANGE" "MONOMIAL" "MONOMIAL"
>    Approximation Orders:    "FIRST" "CONSTANT" "CONSTANT"
>
> Relationship Managers:
>    Geometric                : GrainTrackerHaloRM (2 layers)
>
> Execution Information:
>    Executioner:             Transient
>    TimeStepper:             IterationAdaptiveDT
>    Solver Mode:             Preconditioned JFNK
>
>
> I tried modifying the parameters and other preconditioning options, but the
> problem is much the same. I don't know whether I did something wrong or
> whether there is a more suitable PETSc option for a problem with such a
> large mesh. I would like to hear your response.
>
> Sincerely,
> Yang
>
> --
> ______________________________________________________
>
> Yangyiwei Yang
> Wissenschaftliche Hilfskraft
>
> TU Darmstadt
> Fachbereich 11 - Material- und Geowissenschaften
> Fachgebiet Mechanik funktionaler Materialien
>
> L1 | 08 402
> Otto Berndt Straße 3
> D-64287 Darmstadt
>
> Tel: +49 (0)6151-16-22923
> Email: yangyiwei.yang at mfm.tu-darmstadt.de
> Homepage: http://www.mawi.tu-darmstadt.de/mfm
> ORCID: 0000-0001-5505-7117
>
> ______________________________________________________
>
>

-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/