[petsc-users] UCX ERROR KNEM inline copy failed
Fande Kong
fdkong.jd at gmail.com
Tue Oct 2 10:52:58 CDT 2018
The error messages may have nothing to do with PETSc or MOOSE.
They appear to come from UCX (https://github.com/openucx/ucx), a
communication package used by MPI. I have no experience with it, so it
may be helpful to contact your HPC administrator.
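
Before contacting the administrator, it may help to tally the UCX lines in the job output by host and process, so the report can say which nodes are affected. The following is only a minimal Python sketch, not part of PETSc, MOOSE, or UCX. The function name summarize_ucx_errors and the regular expression are assumptions based solely on the line format seen in the log quoted below, and the sample text is a short excerpt of that log.

```python
import re
from collections import Counter

# Assumed line format, taken from the quoted log:
# [<timestamp>] [<host>:<pid>:<thread>] <source file>:<line> UCX ERROR
UCX_LINE = re.compile(
    r"\[(?P<ts>\d+\.\d+)\]\s+"
    r"\[(?P<host>[^:\]]+):(?P<pid>\d+):(?P<tid>\d+)\]\s+"
    r"(?P<src>\S+):(?P<line>\d+)\s+UCX\s+ERROR"
)


def summarize_ucx_errors(log_text):
    """Count UCX ERROR lines per host, per process ID, and per source location."""
    hosts = Counter()
    pids = Counter()
    sources = Counter()
    for match in UCX_LINE.finditer(log_text):
        hosts[match.group("host")] += 1
        pids[match.group("pid")] += 1
        sources[match.group("src") + ":" + match.group("line")] += 1
    return hosts, pids, sources


# Three entries copied from the log in the original report.
sample = """\
[1538482623.976180] [hpb0085:22501:0] knem_ep.c:84 UCX ERROR
KNEM inline copy failed, err = -1 Invalid argument
[1538482605.111342] [hpb0085:22502:0] knem_ep.c:84 UCX ERROR
KNEM inline copy failed, err = -1 Invalid argument
[1538482606.761138] [hpb0085:22502:0] knem_ep.c:84 UCX ERROR
KNEM inline copy failed, err = -1 Invalid argument
"""

hosts, pids, sources = summarize_ucx_errors(sample)
print("errors per host:  ", dict(hosts))
print("errors per pid:   ", dict(pids))
print("errors per source:", dict(sources))
```

If every entry points at the same source location and the hosts cluster on particular nodes, that is worth passing along with the report.
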
Thanks,
Fande,
On Tue, Oct 2, 2018 at 9:24 AM Matthew Knepley <knepley at gmail.com> wrote:
> On Tue, Oct 2, 2018 at 11:16 AM Y. Yang <
> yangyiwei.yang at mfm.tu-darmstadt.de> wrote:
>
>> Dear PETSc team
>>
>> Recently I have been using MOOSE (http://www.mooseframework.org/), which is built
>> with PETSc, and unfortunately I encountered some problems with the
>> following PETSc options:
>>
>
> I do not know what problem you are reporting. I don't know what package
> knem_ep.c is part of, but it's not PETSc.
>
> Thanks,
>
> Matt
>
>
>> petsc_options_iname = '-pc_type -ksp_gmres_restart -sub_ksp_type
>> -sub_pc_type -pc_asm_overlap -pc_factor_mat_solver_package'
>>
>> petsc_options_value = 'asm 1201 preonly ilu
>> 4 superlu_dist'
>>
>>
>> the error message is:
>>
>> Time Step 1, time = 1
>> dt = 1
>>
>> |residual|_2 of individual variables:
>> c: 779.034
>> w: 0
>> T: 6.57948e+07
>> gr0: 211.617
>> gr1: 206.973
>> gr2: 209.382
>> gr3: 191.089
>> gr4: 185.242
>> gr5: 157.361
>> gr6: 128.473
>> gr7: 87.6029
>>
>> 0 Nonlinear |R| = 6.579482e+07
>> [1538482623.976180] [hpb0085:22501:0] knem_ep.c:84 UCX ERROR
>> KNEM inline copy failed, err = -1 Invalid argument
>> [1538482605.111342] [hpb0085:22502:0] knem_ep.c:84 UCX ERROR
>> KNEM inline copy failed, err = -1 Invalid argument
>> [1538482606.761138] [hpb0085:22502:0] knem_ep.c:84 UCX ERROR
>> KNEM inline copy failed, err = -1 Invalid argument
>> [1538482607.107478] [hpb0085:22502:0] knem_ep.c:84 UCX ERROR
>> KNEM inline copy failed, err = -1 Invalid argument
>> [1538482605.882817] [hpb0085:22503:0] knem_ep.c:84 UCX ERROR
>> KNEM inline copy failed, err = -1 Invalid argument
>> [1538482607.133543] [hpb0085:22503:0] knem_ep.c:84 UCX ERROR
>> KNEM inline copy failed, err = -1 Invalid argument
>> [1538482621.905475] [hpb0085:22510:0] knem_ep.c:84 UCX ERROR
>> KNEM inline copy failed, err = -1 Invalid argument
>> [1538482626.531234] [hpb0085:22510:0] knem_ep.c:84 UCX ERROR
>> KNEM inline copy failed, err = -1 Invalid argument
>> [1538482627.613343] [hpb0085:22515:0] knem_ep.c:84 UCX ERROR
>> KNEM inline copy failed, err = -1 Invalid argument
>> [1538482627.830489] [hpb0085:22515:0] knem_ep.c:84 UCX ERROR
>> KNEM inline copy failed, err = -1 Invalid argument
>> [1538482629.852351] [hpb0085:22515:0] knem_ep.c:84 UCX ERROR
>> KNEM inline copy failed, err = -1 Invalid argument
>> [1538482630.194620] [hpb0085:22515:0] knem_ep.c:84 UCX ERROR
>> KNEM inline copy failed, err = -1 Invalid argument
>> [1538482630.280636] [hpb0085:22515:0] knem_ep.c:84 UCX ERROR
>> KNEM inline copy failed, err = -1 Invalid argument
>> [1538482600.219314] [hpb0085:22516:0] knem_ep.c:84 UCX ERROR
>> KNEM inline copy failed, err = -1 Invalid argument
>> [1538482658.960350] [hpb0085:22516:0] knem_ep.c:84 UCX ERROR
>> KNEM inline copy failed, err = -1 Invalid argument
>> [1538482622.949471] [hpb0085:22517:0] knem_ep.c:84 UCX ERROR
>> KNEM inline copy failed, err = -1 Invalid argument
>> [1538482612.502017] [hpb0085:22500:0] knem_ep.c:84 UCX ERROR
>> KNEM inline copy failed, err = -1 Invalid argument
>> [1538482613.231970] [hpb0085:22500:0] knem_ep.c:84 UCX ERROR
>> KNEM inline copy failed, err = -1 Invalid argument
>> [1538482621.417530] [hpb0085:22520:0] knem_ep.c:84 UCX ERROR
>> KNEM inline copy failed, err = -1 Invalid argument
>> [1538482622.020998] [hpb0085:22520:0] knem_ep.c:84 UCX ERROR
>> KNEM inline copy failed, err = -1 Invalid argument
>> [1538482606.221292] [hpb0085:22521:0] knem_ep.c:84 UCX ERROR
>> KNEM inline copy failed, err = -1 Invalid argument
>> [1538482606.676987] [hpb0085:22521:0] knem_ep.c:84 UCX ERROR
>> KNEM inline copy failed, err = -1 Invalid argument
>> [1538482606.896865] [hpb0085:22521:0] knem_ep.c:84 UCX ERROR
>> KNEM inline copy failed, err = -1 Invalid argument
>> [1538482639.611427] [hpb0085:22522:0] knem_ep.c:84 UCX ERROR
>> KNEM inline copy failed, err = -1 Invalid argument
>> [1538482631.435277] [hpb0085:22523:0] knem_ep.c:84 UCX ERROR
>> KNEM inline copy failed, err = -1 Invalid argument
>> [1538482658.278343] [hpb0085:22512:0] knem_ep.c:84 UCX ERROR
>> KNEM inline copy failed, err = -1 Invalid argument
>> [1538482658.396945] [hpb0085:22512:0] knem_ep.c:84 UCX ERROR
>> KNEM inline copy failed, err = -1 Invalid argument
>> [1538482659.917476] [hpb0085:22512:0] knem_ep.c:84 UCX ERROR
>> KNEM inline copy failed, err = -1 Invalid argument
>> [1538482660.162064] [hpb0085:22512:0] knem_ep.c:84 UCX ERROR
>> KNEM inline copy failed, err = -1 Invalid argument
>> 2 total processes killed (some possibly by mpirun during cleanup)
>>
>>
>> Here's the status of the simulation
>>
>> Parallelism:
>> Num Processors: 100
>> Num Threads: 1
>>
>> Mesh:
>> Parallel Type: distributed
>> Mesh Dimension: 3
>> Spatial Dimension: 3
>> Nodes:
>> Total: 2065551
>> Local: 22774
>> Elems:
>> Total: 2000000
>> Local: 20006
>> Num Subdomains: 1
>> Num Partitions: 100
>> Partitioner: parmetis
>>
>> Nonlinear System:
>> Num DOFs: 18589959
>> Num Local DOFs: 204966
>> Variables: { "c" "w" "T" "gr0" "gr1" "gr2" "gr3" "gr4"
>> "gr5" }
>> Finite Element Types: "LAGRANGE"
>> Approximation Orders: "FIRST"
>>
>> Auxiliary System:
>> Num DOFs: 10065551
>> Num Local DOFs: 102798
>> Variables: "bnds" { "var_indices" "unique_grains" } {
>> "M" "dM/dT" }
>> Finite Element Types: "LAGRANGE" "MONOMIAL" "MONOMIAL"
>> Approximation Orders: "FIRST" "CONSTANT" "CONSTANT"
>>
>> Relationship Managers:
>> Geometric : GrainTrackerHaloRM (2 layers)
>>
>> Execution Information:
>> Executioner: Transient
>> TimeStepper: IterationAdaptiveDT
>> Solver Mode: Preconditioned JFNK
>>
>>
>> I tried modifying the parameters and other preconditioning options, but the
>> problem is much the same. So I don't know whether I did something wrong or
>> whether there is actually a suitable PETSc option for dealing with such a
>> problem on a large mesh. I would like to hear your response.
>>
>> Sincerely,
>> Yang
>>
>> --
>> ______________________________________________________
>>
>> Yangyiwei Yang
>> Wissenschaftliche Hilfskraft
>>
>> TU Darmstadt
>> Fachbereich 11 - Material- und Geowissenschaften
>> Fachgebiet Mechanik funktionaler Materialien
>>
>> L1 | 08 402
>> Otto Berndt Straße 3
>> D-64287 Darmstadt
>>
>> Tel: +49 (0)6151-16-22923
>> Email: yangyiwei.yang at mfm.tu-darmstadt.de
>> Homepage: http://www.mawi.tu-darmstadt.de/mfm
>> ORCID: 0000-0001-5505-7117
>>
>> ______________________________________________________
>>
>>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
> <http://www.cse.buffalo.edu/~knepley/>
>