I still don't fully understand the underlying process of the whole PCNN
solution procedure, but by trial and error I replaced

MatCreateIS(commw, ind_length, ind_length, PETSC_DECIDE, PETSC_DECIDE,
gridmapping, &A);

with

MatCreateIS(commw, PETSC_DECIDE, PETSC_DECIDE, actdof, actdof, gridmapping, &A);

and got the results I needed.
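
As far as I understand it, passing PETSC_DECIDE for the local sizes lets PETSc split the global size (actdof here) across the processes by itself. The following is only a plain-C sketch of that split, assuming it matches what PetscSplitOwnership does. The function names split_local_size and sum_local_sizes are made up for illustration and are not PETSc calls.

```c
#include <assert.h>

/* Sketch (assumption): how a global size N is divided when the local size
 * is left as PETSC_DECIDE. Every rank gets N / size entries and the first
 * N % size ranks get one extra entry. Plain C, not a PETSc routine. */
int split_local_size(int N, int size, int rank)
{
    assert(size > 0 && rank >= 0 && rank < size && N >= 0);
    return N / size + ((N % size) > rank ? 1 : 0);
}

/* Sum of the local sizes over all ranks; this should give back N. */
int sum_local_sizes(int N, int size)
{
    int total = 0;
    for (int rank = 0; rank < size; ++rank)
        total += split_local_size(N, size, rank);
    return total;
}
```

For a hypothetical actdof of 10 on 4 processes this gives local sizes 3, 3, 2, 2, which add up to 10 again.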

Furthermore, the load balance now seems to be better, although I still
don't reach the expected values, e.g.
ilu-cg 320 iterations, condition 4601
cg only 1662 iterations, condition 84919

nn-cg on 2 nodes 229 iterations, condition 6285
nn-cg on 4 nodes 331 iterations, condition 13312

Or should I not expect nn-cg to be faster than ilu-cg?



