[petsc-users] solve problem with pastix
hg
hgbk2008 at gmail.com
Wed Nov 6 03:12:53 CST 2019
The "sched_setaffinity: Invalid argument" error only occurs when I launch the
job with sbatch. Running without the scheduler works fine. I think this has
something to do with PaStiX.
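
One possible explanation, not confirmed anywhere in this thread: a scheduler
can confine a job to a subset of the node's cores, and the Linux
sched_setaffinity call fails with EINVAL ("Invalid argument") when the
requested mask contains no CPU the caller is permitted to use. If the solver
binds thread i to core i by absolute number (an assumption about PaStiX, not
something verified here), that would fail under sbatch and succeed in an
unrestricted shell. A minimal Python sketch for checking this from inside and
outside a job, using only os.sched_getaffinity and os.sched_setaffinity from
the standard library (Linux only); the helper names allowed_cpus and try_pin
are made up for illustration:

```python
import errno
import os


def allowed_cpus():
    """Return the sorted CPU ids the caller is currently allowed to run on."""
    return sorted(os.sched_getaffinity(0))


def try_pin(cpu):
    """Try to pin the caller to a single CPU id.

    Returns None on success (the previous mask is restored afterwards),
    or the symbolic errno name (for example "EINVAL") on failure.
    """
    previous = os.sched_getaffinity(0)
    try:
        os.sched_setaffinity(0, {cpu})
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    os.sched_setaffinity(0, previous)
    return None


if __name__ == "__main__":
    print("allowed CPUs:", allowed_cpus())
    # Hypothetical "bind thread i to core i" policy, probed for the first few ids.
    for cpu in range(4):
        outcome = try_pin(cpu)
        print("pin to CPU", cpu, "->", "ok" if outcome is None else outcome)
```

Running the script once through sbatch and once directly on the node would
show whether the allowed CPU list differs between the two cases.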
Giang
On Wed, Nov 6, 2019 at 4:37 AM Smith, Barry F. <bsmith at mcs.anl.gov> wrote:
>
> Google finds this
> https://gforge.inria.fr/forum/forum.php?thread_id=32824&forum_id=599&group_id=186
>
>
>
> > On Nov 5, 2019, at 7:01 PM, Matthew Knepley via petsc-users <
> petsc-users at mcs.anl.gov> wrote:
> >
> > I have no idea. That is a good question for the PaStiX list.
> >
> > Thanks,
> >
> > Matt
> >
> > On Tue, Nov 5, 2019 at 5:32 PM hg <hgbk2008 at gmail.com> wrote:
> > Should thread affinity be invoked at all? I set -mat_pastix_threadnbr 1 and
> also set OMP_NUM_THREADS to 1.
> >
> > Giang
> >
> >
> > On Tue, Nov 5, 2019 at 10:50 PM Matthew Knepley <knepley at gmail.com>
> wrote:
> > On Tue, Nov 5, 2019 at 4:11 PM hg via petsc-users <
> petsc-users at mcs.anl.gov> wrote:
> > Hello
> >
> > My run crashes when using PaStiX as the solver for KSP. The error message
> looks like:
> >
> > ....
> > NUMBER of BUBBLE 1
> > COEFMAX 1735566 CPFTMAX 0 BPFTMAX 0 NBFTMAX 0 ARFTMAX 0
> > ** End of Partition & Distribution phase **
> > Time to analyze 0.225 s
> > Number of nonzeros in factorized matrix 708784076
> > Fill-in 12.2337
> > Number of operations (LU) 2.80185e+12
> > Prediction Time to factorize (AMD 6180 MKL) 394 s
> > 0 : SolverMatrix size (without coefficients) 32.4 MB
> > 0 : Number of nonzeros (local block structure) 365309391
> > Numerical Factorization (LU) :
> > 0 : Internal CSC size 1.08 GB
> > Time to fill internal csc 6.66 s
> > --- Sopalin : Allocation de la structure globale ---
> > --- Fin Sopalin Init ---
> > --- Initialisation des tableaux globaux ---
> > sched_setaffinity: Invalid argument
> > [node083:165071] *** Process received signal ***
> > [node083:165071] Signal: Aborted (6)
> > [node083:165071] Signal code: (-6)
> > [node083:165071] [ 0] /lib64/libpthread.so.0(+0xf680)[0x2b8081845680]
> > [node083:165071] [ 1] /lib64/libc.so.6(gsignal+0x37)[0x2b8082191207]
> > [node083:165071] [ 2] /lib64/libc.so.6(abort+0x148)[0x2b80821928f8]
> > [node083:165071] [ 3]
> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(sopalin_launch_comm+0x0)[0x2b80a4124c9d]
> > [node083:165071] [ 4] Launching 1 threads (1 commputation, 0
> communication, 0 out-of-core)
> > --- Sopalin : Local structure allocation ---
> >
> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(D_sopalin_init_smp+0x29b)[0x2b80a40c39d2]
> > [node083:165071] [ 5]
> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(D_ge_sopalin_smp+0x68)[0x2b80a40cf4c2]
> > [node083:165071] [ 6]
> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(sopalin_launch_thread+0x4ba)[0x2b80a4124a31]
> > [node083:165071] [ 7]
> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(D_ge_sopalin_thread+0x94)[0x2b80a40d6170]
> > [node083:165071] [ 8]
> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(D_pastix_task_sopalin+0x5ad)[0x2b80a40b09a2]
> > [node083:165071] [ 9]
> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(d_pastix+0xa8a)[0x2b80a40b2325]
> > [node083:165071] [10]
> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(+0x63927b)[0x2b80a35bf27b]
> > [node083:165071] [11]
> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(MatLUFactorNumeric+0x19a)[0x2b80a32c7552]
> > [node083:165071] [12]
> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(+0xa46c09)[0x2b80a39ccc09]
> > [node083:165071] [13]
> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(PCSetUp+0x311)[0x2b80a3a8f1a9]
> > [node083:165071] [14]
> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(KSPSetUp+0xbf7)[0x2b80a3b46e81]
> > [node083:165071] [15]
> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(KSPSolve+0x210)[0x2b80a3b4746e]
> >
> > Does anyone know what the problem is and how to fix it? The
> PETSc parameters I used are listed below:
> >
> > It looks like PaStiX is having trouble setting the thread affinity:
> >
> > sched_setaffinity: Invalid argument
> >
> > so it may be your build of PaStiX.
> >
> > Thanks,
> >
> > Matt
> >
> > -pc_type lu
> > -pc_factor_mat_solver_package pastix
> > -mat_pastix_verbose 2
> > -mat_pastix_threadnbr 1
> >
> > Giang
> >
> >
> >
> > --
> > What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> > -- Norbert Wiener
> >
> > https://www.cse.buffalo.edu/~knepley/
> >
> >
>
>