[petsc-users] UCX ERROR KNEM inline copy failed

Y. Yang yangyiwei.yang at mfm.tu-darmstadt.de
Tue Oct 2 10:10:15 CDT 2018


Dear PETSc team

I have recently been using MOOSE (http://www.mooseframework.org/), which
is built on PETSc. Unfortunately, I encountered a problem with the
following PETSc options:

petsc_options_iname = '-pc_type -ksp_gmres_restart -sub_ksp_type -sub_pc_type -pc_asm_overlap -pc_factor_mat_solver_package'

petsc_options_value = 'asm 1201 preonly ilu 4 superlu_dist'
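
For readers unfamiliar with this MOOSE syntax, the sketch below shows how the two option strings above line up, assuming the names and values are paired by position. The helper name pair_options is only illustrative and is not part of MOOSE or PETSc.

```python
# Option names and values copied from the MOOSE input above.
names = ("-pc_type -ksp_gmres_restart -sub_ksp_type -sub_pc_type "
         "-pc_asm_overlap -pc_factor_mat_solver_package").split()
values = "asm 1201 preonly ilu 4 superlu_dist".split()


def pair_options(names, values):
    """Pair each option name with the value at the same position.

    Hypothetical helper for illustration only; it assumes one value per name.
    """
    if len(names) != len(values):
        raise ValueError("each option name needs exactly one value")
    return list(zip(names, values))


pairs = pair_options(names, values)

# The equivalent flat option string, as it would appear on a command line.
command_line = " ".join(f"{name} {value}" for name, value in pairs)

for name, value in pairs:
    print(f"{name:32s} {value}")
print(command_line)
```

Read this way, the setup is an additive Schwarz preconditioner (asm) with overlap 4, an ILU sub-preconditioner applied without inner Krylov iterations (preonly), a GMRES restart length of 1201, and superlu_dist named as the factorization package.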


The error message is:

Time Step 1, time = 1
                 dt = 1

     |residual|_2 of individual variables:
                         c:   779.034
                         w:   0
                         T:   6.57948e+07
                         gr0: 211.617
                         gr1: 206.973
                         gr2: 209.382
                         gr3: 191.089
                         gr4: 185.242
                         gr5: 157.361
                         gr6: 128.473
                         gr7: 87.6029

  0 Nonlinear |R| = 6.579482e+07
[1538482623.976180] [hpb0085:22501:0]        knem_ep.c:84   UCX ERROR 
KNEM inline copy failed, err = -1 Invalid argument
[1538482605.111342] [hpb0085:22502:0]        knem_ep.c:84   UCX ERROR 
KNEM inline copy failed, err = -1 Invalid argument
[1538482606.761138] [hpb0085:22502:0]        knem_ep.c:84   UCX ERROR 
KNEM inline copy failed, err = -1 Invalid argument
[1538482607.107478] [hpb0085:22502:0]        knem_ep.c:84   UCX ERROR 
KNEM inline copy failed, err = -1 Invalid argument
[1538482605.882817] [hpb0085:22503:0]        knem_ep.c:84   UCX ERROR 
KNEM inline copy failed, err = -1 Invalid argument
[1538482607.133543] [hpb0085:22503:0]        knem_ep.c:84   UCX ERROR 
KNEM inline copy failed, err = -1 Invalid argument
[1538482621.905475] [hpb0085:22510:0]        knem_ep.c:84   UCX ERROR 
KNEM inline copy failed, err = -1 Invalid argument
[1538482626.531234] [hpb0085:22510:0]        knem_ep.c:84   UCX ERROR 
KNEM inline copy failed, err = -1 Invalid argument
[1538482627.613343] [hpb0085:22515:0]        knem_ep.c:84   UCX ERROR 
KNEM inline copy failed, err = -1 Invalid argument
[1538482627.830489] [hpb0085:22515:0]        knem_ep.c:84   UCX ERROR 
KNEM inline copy failed, err = -1 Invalid argument
[1538482629.852351] [hpb0085:22515:0]        knem_ep.c:84   UCX ERROR 
KNEM inline copy failed, err = -1 Invalid argument
[1538482630.194620] [hpb0085:22515:0]        knem_ep.c:84   UCX ERROR 
KNEM inline copy failed, err = -1 Invalid argument
[1538482630.280636] [hpb0085:22515:0]        knem_ep.c:84   UCX ERROR 
KNEM inline copy failed, err = -1 Invalid argument
[1538482600.219314] [hpb0085:22516:0]        knem_ep.c:84   UCX ERROR 
KNEM inline copy failed, err = -1 Invalid argument
[1538482658.960350] [hpb0085:22516:0]        knem_ep.c:84   UCX ERROR 
KNEM inline copy failed, err = -1 Invalid argument
[1538482622.949471] [hpb0085:22517:0]        knem_ep.c:84   UCX ERROR 
KNEM inline copy failed, err = -1 Invalid argument
[1538482612.502017] [hpb0085:22500:0]        knem_ep.c:84   UCX ERROR 
KNEM inline copy failed, err = -1 Invalid argument
[1538482613.231970] [hpb0085:22500:0]        knem_ep.c:84   UCX ERROR 
KNEM inline copy failed, err = -1 Invalid argument
[1538482621.417530] [hpb0085:22520:0]        knem_ep.c:84   UCX ERROR 
KNEM inline copy failed, err = -1 Invalid argument
[1538482622.020998] [hpb0085:22520:0]        knem_ep.c:84   UCX ERROR 
KNEM inline copy failed, err = -1 Invalid argument
[1538482606.221292] [hpb0085:22521:0]        knem_ep.c:84   UCX ERROR 
KNEM inline copy failed, err = -1 Invalid argument
[1538482606.676987] [hpb0085:22521:0]        knem_ep.c:84   UCX ERROR 
KNEM inline copy failed, err = -1 Invalid argument
[1538482606.896865] [hpb0085:22521:0]        knem_ep.c:84   UCX ERROR 
KNEM inline copy failed, err = -1 Invalid argument
[1538482639.611427] [hpb0085:22522:0]        knem_ep.c:84   UCX ERROR 
KNEM inline copy failed, err = -1 Invalid argument
[1538482631.435277] [hpb0085:22523:0]        knem_ep.c:84   UCX ERROR 
KNEM inline copy failed, err = -1 Invalid argument
[1538482658.278343] [hpb0085:22512:0]        knem_ep.c:84   UCX ERROR 
KNEM inline copy failed, err = -1 Invalid argument
[1538482658.396945] [hpb0085:22512:0]        knem_ep.c:84   UCX ERROR 
KNEM inline copy failed, err = -1 Invalid argument
[1538482659.917476] [hpb0085:22512:0]        knem_ep.c:84   UCX ERROR 
KNEM inline copy failed, err = -1 Invalid argument
[1538482660.162064] [hpb0085:22512:0]        knem_ep.c:84   UCX ERROR 
KNEM inline copy failed, err = -1 Invalid argument
2 total processes killed (some possibly by mpirun during cleanup)
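
To see which processes reported the error, the UCX lines can be tallied per process. The sketch below is a rough aid, assuming the second bracketed field reads host:pid:thread and that each wrapped log line has been rejoined onto a single line. The four sample lines are copied from the output above.

```python
import re
from collections import Counter

# Four of the log lines above, with the mail-wrapped halves rejoined.
log_text = """\
[1538482623.976180] [hpb0085:22501:0]        knem_ep.c:84   UCX ERROR KNEM inline copy failed, err = -1 Invalid argument
[1538482605.111342] [hpb0085:22502:0]        knem_ep.c:84   UCX ERROR KNEM inline copy failed, err = -1 Invalid argument
[1538482606.761138] [hpb0085:22502:0]        knem_ep.c:84   UCX ERROR KNEM inline copy failed, err = -1 Invalid argument
[1538482607.107478] [hpb0085:22502:0]        knem_ep.c:84   UCX ERROR KNEM inline copy failed, err = -1 Invalid argument
"""

# Assumed layout: [timestamp] [host:pid:thread] source_file:line UCX ERROR message
LINE_RE = re.compile(
    r"\[(?P<time>\d+\.\d+)\]\s+"
    r"\[(?P<host>[^:\]]+):(?P<pid>\d+):(?P<thread>\d+)\]\s+"
    r"(?P<source>\S+):(?P<line>\d+)\s+UCX ERROR\s+(?P<message>.*)"
)


def count_errors_per_process(text):
    """Count UCX ERROR lines per (host, pid); lines that do not match are skipped."""
    counts = Counter()
    for raw_line in text.splitlines():
        match = LINE_RE.match(raw_line.strip())
        if match:
            counts[(match["host"], match["pid"])] += 1
    return counts


counts = count_errors_per_process(log_text)
for (host, pid), n in sorted(counts.items()):
    print(f"{host} pid {pid}: {n} error line(s)")
```

In the full output above, every error line comes from the same host (hpb0085) and the same source location (knem_ep.c:84).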


Here is the status of the simulation:

Parallelism:
   Num Processors:          100
   Num Threads:             1

Mesh:
   Parallel Type:           distributed
   Mesh Dimension:          3
   Spatial Dimension:       3
   Nodes:
     Total:                 2065551
     Local:                 22774
   Elems:
     Total:                 2000000
     Local:                 20006
   Num Subdomains:          1
   Num Partitions:          100
   Partitioner:             parmetis

Nonlinear System:
   Num DOFs:                18589959
   Num Local DOFs:          204966
   Variables:               { "c" "w" "T" "gr0" "gr1" "gr2" "gr3" "gr4" "gr5" }
   Finite Element Types:    "LAGRANGE"
   Approximation Orders:    "FIRST"

Auxiliary System:
   Num DOFs:                10065551
   Num Local DOFs:          102798
   Variables:               "bnds" { "var_indices" "unique_grains" } { "M" "dM/dT" }
   Finite Element Types:    "LAGRANGE" "MONOMIAL" "MONOMIAL"
   Approximation Orders:    "FIRST" "CONSTANT" "CONSTANT"

Relationship Managers:
   Geometric                : GrainTrackerHaloRM (2 layers)

Execution Information:
   Executioner:             Transient
   TimeStepper:             IterationAdaptiveDT
   Solver Mode:             Preconditioned JFNK
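
As a sanity check on the problem size, the DOF counts in the status printout can be reproduced from the mesh figures. This is a small sketch, assuming one DOF per node for each first-order LAGRANGE variable and one DOF per element for each constant MONOMIAL variable. The function name dof_count is illustrative only.

```python
# Figures copied from the status printout above.
total_nodes, local_nodes = 2065551, 22774
total_elems, local_elems = 2000000, 20006
num_procs = 100

# Assumption: first-order LAGRANGE variables carry one DOF per node,
# constant MONOMIAL variables carry one DOF per element.
nonlinear_nodal_vars = 9   # names listed under "Variables" in the Nonlinear System block
aux_nodal_vars = 1         # "bnds"
aux_elemental_vars = 4     # "var_indices", "unique_grains", "M", "dM/dT"


def dof_count(nodes, elems, nodal_vars, elemental_vars):
    """DOFs contributed by nodal and element-wise variables (illustrative helper)."""
    return nodal_vars * nodes + elemental_vars * elems


nonlinear_total = dof_count(total_nodes, total_elems, nonlinear_nodal_vars, 0)
nonlinear_local = dof_count(local_nodes, local_elems, nonlinear_nodal_vars, 0)
aux_total = dof_count(total_nodes, total_elems, aux_nodal_vars, aux_elemental_vars)
aux_local = dof_count(local_nodes, local_elems, aux_nodal_vars, aux_elemental_vars)

# Average nonlinear DOFs per process if the load were spread evenly.
average_per_proc = nonlinear_total / num_procs

print("nonlinear DOFs:", nonlinear_total, "local:", nonlinear_local)
print("auxiliary DOFs:", aux_total, "local:", aux_local)
print("average nonlinear DOFs per process:", round(average_per_proc))
```

Under these assumptions the totals match the reported 18589959 nonlinear and 10065551 auxiliary DOFs, and the local counts match 204966 and 102798, which puts the reporting process roughly 10% above the even-split average of about 185900 nonlinear DOFs.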


I tried modifying the parameters and using other preconditioning
options, but the problem stays much the same. I don't know whether I did
something wrong or whether there is a more suitable PETSc option for a
problem with such a large mesh. I would appreciate your advice.

Sincerely,
Yang

-- 
______________________________________________________
  
Yangyiwei Yang
Wissenschaftliche Hilfskraft

TU Darmstadt
Fachbereich 11 - Material- und Geowissenschaften
Fachgebiet Mechanik funktionaler Materialien

L1 | 08 402
Otto Berndt Straße 3
D-64287 Darmstadt

Tel: +49 (0)6151-16-22923
Email: yangyiwei.yang at mfm.tu-darmstadt.de
Homepage: http://www.mawi.tu-darmstadt.de/mfm
ORCID: 0000-0001-5505-7117

______________________________________________________


