[petsc-dev] error with --download-amgx

Mark Adams mfadams at lbl.gov
Fri Dec 13 14:03:14 CST 2019


Well, it has some problems but it is good that we have it so we can check
that box off. And if anyone ever picks AMGx up again PETSc might be the
best place to do that.

On Fri, Dec 13, 2019 at 2:28 PM Fande Kong <fdkong.jd at gmail.com> wrote:

> Thanks, Mark,
>
> it looks really cool!
>
> Fande,
>
> On Fri, Dec 13, 2019 at 11:29 AM Mark Adams <mfadams at lbl.gov> wrote:
>
>> There is an MR winding its way in. I think it is ready to go. There is
>> confusion with branches but it is something like
>> barry/12-01-19-pc-feature-amgx.
>>
>> It is a PC, but it is so fragile. I could only get AMGx to work with
>> classical AMG (preferred) by also using its own Krylov solver. So the test
>> sets -ksp_type preonly.
>>
>> There is an example in snes/ex13.
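
A minimal sketch of the setup described above: PETSc's own Krylov method is set to `preonly` so the preconditioner (here AMGx, running its own Krylov solver internally) does the whole solve. The executable path `./ex13` and the PC name `amgx` are assumptions for illustration; the thread does not state the option name used on the branch.

```python
import shlex


def amgx_solver_args(executable="./ex13", pc_name="amgx"):
    """Build an argv list that hands the whole solve to the preconditioner.

    Both defaults are hypothetical: the executable location and the PC
    name should be checked against the branch that adds the interface.
    """
    return [
        executable,
        "-ksp_type", "preonly",  # apply the PC once, no outer Krylov iterations
        "-pc_type", pc_name,     # assumed name of the AMGx preconditioner
    ]


if __name__ == "__main__":
    # Print the command line instead of running it, since the binary
    # may not exist on this machine.
    print(shlex.join(amgx_solver_args()))
```

The list can be passed to `subprocess.run` once the example has actually been built.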
>>
>> On Fri, Dec 13, 2019 at 12:09 PM Fande Kong <fdkong.jd at gmail.com> wrote:
>>
>>> I was wondering how we use AMGx. Is it a PC? Do we have a native
>>> interface to this package? Any example?
>>>
>>> Fande,
>>>
>>> On Tue, Dec 3, 2019 at 9:10 PM Smith, Barry F. <bsmith at mcs.anl.gov>
>>> wrote:
>>>
>>>>
>>>>   Ok, since we are using a fork of the repository, if it is causing
>>>> grief we can remove it, but otherwise it is best just to leave it since it
>>>> was put in by the AMGX developers.
>>>>
>>>>
>>>>
>>>> > On Dec 3, 2019, at 10:00 PM, Balay, Satish <balay at mcs.anl.gov> wrote:
>>>> >
>>>> > It's from the attached configure.log [from a prior e-mail on this
>>>> > thread] - and this flag is not passed in from petsc configure to amgx
>>>> > cmake - so it must be somehow set internally in this package.
>>>> >
>>>> > Satish
>>>> >
>>>> > On Wed, 4 Dec 2019, Smith, Barry F. wrote:
>>>> >
>>>> >>
>>>> >>> Also - it's best to avoid -Werror in externalpackage builds.
>>>> >>
>>>> >>  Hmm, my branch uses the CMake package to do the install  (it has no
>>>> custom code) so is this a bug in package.py that it doesn't strip out the
>>>> -Werror when passing stuff to cmake?
>>>> >>
>>>> >>  There is not enough information in your email for me to determine
>>>> where this message you printed below came from. Did it come from my branch?
>>>> >>
>>>> >>   Barry
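
For reference, stripping `-Werror` from a flag string before handing it to CMake could look roughly like the sketch below. This is a hypothetical helper, not the code in PETSc's package.py. It assumes the GCC-style forms `-Werror` and `-Werror=<name>` plus the nvcc long form `--Werror <kind>` that appears in the log quoted in this thread. It would not help when the flag is set inside the package's own CMake files.

```python
import shlex


def strip_werror(flags: str) -> str:
    """Return `flags` with warnings-as-errors options removed.

    Hypothetical helper for illustration only. Handles:
      -Werror, -Werror=<name>      (GCC/Clang style)
      --Werror <kind>, --Werror=<kind>  (nvcc long form)
    """
    kept = []
    skip_next = False
    for tok in shlex.split(flags):
        if skip_next:
            # This token is the <kind> argument of a preceding --Werror.
            skip_next = False
            continue
        if tok == "--Werror":
            skip_next = True
            continue
        if tok == "-Werror" or tok.startswith(("-Werror=", "--Werror=")):
            continue
        kept.append(tok)
    return shlex.join(kept)


if __name__ == "__main__":
    print(strip_werror("-O3 -DNDEBUG -std=c++11 --Werror cross-execution-space-call -DNVCC"))
```

The short nvcc form `-Werror <kind>` is deliberately not handled, because it cannot be told apart from GCC's argument-free `-Werror` without knowing which compiler the flags are for.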
>>>> >>
>>>> >>
>>>> >>> On Dec 3, 2019, at 9:29 PM, Balay, Satish <balay at mcs.anl.gov>
>>>> wrote:
>>>> >>>
>>>> >>>>>>>
>>>> >>>
>>>> autofs/nccs-svm1_sw/summit/.swci/0-core/opt/spack/20180914/linux-rhel7-ppc64le/gcc-4.8.5/cmake-3.15.2-xit2o3iepxvqbyku77lwcugufilztu7t/bin/cmake
>>>> -E remove
>>>> /autofs/nccs-svm1_home1/adams/petsc/arch-summit-opt64-gnu-cuda/externalpackages/git.amgx/petsc-build/core/CMakeFiles/amgx_core.dir/src/classical/interpolators/./
>>>> amgx_core_generated_common.cu.o
>>>> >>> /sw/summit/cuda/10.1.168/bin/nvcc -M -D__CUDACC__
>>>> /autofs/nccs-svm1_home1/adams/petsc/arch-summit-opt64-gnu-cuda/externalpackages/git.amgx/
>>>> plugin_config.cu -o
>>>> /autofs/nccs-svm1_home1/adams/petsc/arch-summit-opt64-gnu-cuda/externalpackages/git.amgx/petsc-build/base/CMakeFiles/amgx_base.dir/__/amgx_base_generated_plugin_config.cu.o.NVCC-depend
>>>> -m64 -Xcompiler
>>>> ,\"-fstack-protector\",\"-g\",\"-O2\",\"-fPIC\",\"-Wno-terminate\",\"-static-libgcc\",\"-fopenmp\",\"-DRAPIDJSON_DEFINED\",\"-DAMGX_WITH_MPI\",\"-fstack-protector\",\"-g\",\"-O2\",\"-fPIC\"
>>>> -ccbin mpicxx -gencode=arch=compute_35,code=\"sm_35,compute_35\"
>>>> -gencode=arch=compute_52,code=\"sm_52,compute_52\"
>>>> -gencode=arch=compute_60,code=\"sm_60,compute_60\"
>>>> -gencode=arch=compute_70,code=\"sm_70,compute_70\" -O3 -DNDEBUG -std=c++11
>>>> --Werror cross-execution-space-call -DNVCC
>>>> -I/autofs/nccs-svm1_home1/adams/petsc/arch-summit-opt64-gnu-cuda/externalpackages/git.amgx/../../thrust
>>>> -I/autofs/nccs-svm1_home1/adams/petsc/arch-summit-opt64-gnu-cuda/externalpackages/git..amgx/base/include
>>>> -I/sw/summit/cuda/10.1.168/include
>>>> -I/autofs/nccs-svm1_home1/adams/petsc/arch-summit-opt64-gnu-cuda/externalpackages/git.amgx/external/rapidjson/include
>>>> >>> -- Removing
>>>> /autofs/nccs-svm1_home1/adams/petsc/arch-summit-opt64-gnu-cuda/externalpackages/git.amgx/petsc-build/base/CMakeFiles/amgx_base.dir/src/./amgx_base_generated_device_properties.cu.o
>>>> >>> <<<<<<
>>>> >>>
>>>> >>> Also - it's best to avoid -Werror in externalpackage builds.
>>>> >>>
>>>> >>> Satish
>>>> >>>
>>>> >>> On Tue, 3 Dec 2019, Mark Adams wrote:
>>>> >>>
>>>> >>>> Barry,
>>>> >>>>
>>>> >>>> First, there is a fix that we need in the AMGx repo (appended).
>>>> >>>>
>>>> >>>> I've never had problems like this not being able to build. I will
>>>> try to
>>>> >>>> build my AMGx manually to check.
>>>> >>>>
>>>> >>>> 21:31 master *= ~/AMGX$ git diff
>>>> >>>> diff --git a/base/include/amgx_c_wrappers.inl
>>>> >>>> b/base/include/amgx_c_wrappers.inl
>>>> >>>> index 42496f0..ed9f0e1 100644
>>>> >>>> --- a/base/include/amgx_c_wrappers.inl
>>>> >>>> +++ b/base/include/amgx_c_wrappers.inl
>>>> >>>> @@ -643,7 +643,7 @@ inline AMGX_Mode get_mode_from(const Envelope
>>>> &envl)
>>>> >>>>    {
>>>> >>>>        //throws...
>>>> >>>>        //
>>>> >>>> -        FatalError("Mode not found.\n", AMGX_ERR_BAD_MODE);
>>>> >>>> +        // FatalError("Mode not found.\n", AMGX_ERR_BAD_MODE);
>>>> >>>>    }
>>>> >>>>
>>>> >>>>    AMGX_Mode mode = static_cast<AMGX_Mode>(itFound->second);
>>>> >>>> @@ -1125,4 +1125,4 @@ inline bool
>>>> remove_managed_matrix(AMGX_matrix_handle
>>>> >>>> envl)
>>>> >>>> } //namespace unnamed
>>>> >>>>
>>>> >>>> On Tue, Dec 3, 2019 at 8:50 PM Smith, Barry F. <bsmith at mcs.anl.gov>
>>>> wrote:
>>>> >>>>
>>>> >>>>>
>>>> >>>>> The first error is
>>>> >>>>>
>>>> >>>>> nvcc error   : 'cicc' died due to signal 9 (Kill signal)
>>>> >>>>> nvcc error   : 'cicc' died due to signal 9 (Kill signal)
>>>> >>>>>
>>>> >>>>> later
>>>> >>>>>
>>>> >>>>>
>>>> >>>>>
>>>> /autofs/nccs-svm1_home1/adams/petsc/arch-summit-opt64-gnu-cuda/externalpackages/git.amgx/base/src/
>>>> >>>>> amgx_c_common.cu(77): catastrophic error: error while writing
>>>> generated
>>>> >>>>> C++ file: Cannot allocate memory
>>>> >>>>>
>>>> >>>>> 1 catastrophic error detected in the compilation of
>>>> >>>>> "/tmp/tmpxft_000038d3_00000000-4_amgx_c_common.cpp4.ii".
>>>> >>>>>
>>>> >>>>> Whatever machine you are compiling on seems to be overloaded.
>>>> What does
>>>> >>>>> top show? Can you log into "different" compiler servers on Summit?
>>>> >>>>>
>>>> >>>>> I run the install on the utk system with no problems like this.
>>>> >>>>>
>>>> >>>>> PETSc configure is deciding
>>>> >>>>>
>>>> >>>>> TEST configureMakeNP from
>>>> >>>>>
>>>> config.packages.make(/autofs/nccs-svm1_home1/adams/petsc/config/BuildSystem/config/packages/make.py:168)
>>>> >>>>> TESTING: configureMakeNP from
>>>> >>>>>
>>>> config.packages.make(config/BuildSystem/config/packages/make.py:168)
>>>> >>>>> check no of cores on the build machine [perhaps to do make '-j
>>>> ncores']
>>>> >>>>>         module multiprocessing found 128 cores: using make_np = 59
>>>> >>>>>           Defined make macro "MAKE_NP" to "59"
>>>> >>>>>           Defined make macro "MAKE_TEST_NP" to "49"
>>>> >>>>>           Defined make macro "MAKE_LOAD" to "166.4"
>>>> >>>>>           Defined make macro "NPMAX" to "128"
>>>> >>>>>
>>>> >>>>> then somehow this gets passed down to the AMGX cmake. Perhaps you
>>>> could
>>>> >>>>> try less greedy numbers.
>>>> >>>>>
>>>> >>>>> Executing: /usr/bin/gmake -j59 -l166.4
>>>> >>>>>
>>>> >>>>> Perhaps try
>>>> >>>>>
>>>> >>>>> --with-make-np=20
>>>> >>>>>
>>>> >>>>> --with-make-load=20
>>>> >>>>>
>>>> >>>>> Barry
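
The "signal 9" and "Cannot allocate memory" errors above are consistent with too many parallel compiler jobs for the memory available. One rough way to pick less greedy values is sketched below. The memory-per-job figure and the function name are assumptions for illustration; this is not the formula PETSc's make.py uses. Only the two flag names come from the message above.

```python
def capped_make_flags(cores, mem_gb, gb_per_job=4.0, max_np=None):
    """Suggest --with-make-np / --with-make-load values bounded by memory.

    cores      : number of cores on the build node
    mem_gb     : memory you expect to be free for the build, in GB
    gb_per_job : assumed peak memory of one compile job (a guess; heavy
                 CUDA translation units may need more)
    max_np     : optional hard upper bound on parallel jobs
    """
    by_memory = max(1, int(mem_gb // gb_per_job))
    jobs = max(1, min(cores, by_memory))
    if max_np is not None:
        jobs = max(1, min(jobs, max_np))
    # Use the same number for the load limit, as in the suggestion above.
    return [f"--with-make-np={jobs}", f"--with-make-load={jobs}"]


if __name__ == "__main__":
    # Hypothetical numbers: 128 cores but only about 80 GB usable.
    print(" ".join(capped_make_flags(cores=128, mem_gb=80)))
```

With these hypothetical inputs the memory bound, not the core count, decides the job count, which is the situation the errors above suggest.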
>>>> >>>>>
>>>> >>>>>
>>>> >>>>>
>>>> >>>>>> On Dec 3, 2019, at 4:17 PM, Mark Adams <mfadams at lbl.gov> wrote:
>>>> >>>>>>
>>>> >>>>>> That might have been from an old .nsf file. Here is a new one.
>>>> I'm sure
>>>> >>>>> I have the disk space.
>>>> >>>>>>
>>>> >>>>>> On Tue, Dec 3, 2019 at 4:51 PM Mark Adams <mfadams at lbl.gov>
>>>> wrote:
>>>> >>>>>> Hmm, I cleaned up and seem to have a different error:
>>>> >>>>>>
>>>> >>>>>>
>>>> >>>>>> On Tue, Dec 3, 2019 at 3:45 PM Matthew Knepley <
>>>> knepley at gmail.com>
>>>> >>>>> wrote:
>>>> >>>>>> Could you have run out of disk space?
>>>> >>>>>>
>>>> >>>>>> /usr/bin/ranlib: libamgx_base.a: Input/output error
>>>> >>>>>> gmake[2]: *** [base/libamgx_base.a] Error 1
>>>> >>>>>> gmake[2]: *** Deleting file `base/libamgx_base.a'
>>>> >>>>>> gmake[1]: *** [base/CMakeFiles/amgx_base.dir/all] Error 2
>>>> >>>>>> gmake: *** [all] Error 2
>>>> >>>>>>
>>>> >>>>>>  Matt
>>>> >>>>>>
>>>> >>>>>> On Tue, Dec 3, 2019 at 1:44 PM Mark Adams <mfadams at lbl.gov>
>>>> wrote:
>>>> >>>>>>
>>>> >>>>>>
>>>> >>>>>>
>>>> >>>>>> --
>>>> >>>>>> What most experimenters take for granted before they begin their
>>>> >>>>> experiments is infinitely more interesting than any results to
>>>> which their
>>>> >>>>> experiments lead.
>>>> >>>>>> -- Norbert Wiener
>>>> >>>>>>
>>>> >>>>>> https://www.cse.buffalo.edu/~knepley/
>>>> >>>>>> <configure.log>
>>>> >>>>>
>>>> >>>>>
>>>> >>>>
>>>> >>
>>>>
>>>>