From Moritz.Huck at rwth-aachen.de Mon Sep 2 04:51:11 2019
From: Moritz.Huck at rwth-aachen.de (Huck, Moritz)
Date: Mon, 2 Sep 2019 09:51:11 +0000
Subject: [petsc-users] is TS_EQ_DAE_SEMI_EXPLICIT_INDEX functional
Message-ID:

Hi,
TS_EQ_DAE_SEMI_EXPLICIT_INDEX(?) are defined in TSEquationType but not mentioned in the manual.
Is this feature functional?
If yes, how do I have to define the RHSFunction?
(I am asking since the ODE variant has it defined as G = M^-1 g, which cannot work for a DAE.)

Best Regards,
Moritz

From stevenbenbow at quintessa.org Mon Sep 2 11:22:39 2019
From: stevenbenbow at quintessa.org (Steve)
Date: Mon, 2 Sep 2019 17:22:39 +0100
Subject: [petsc-users] Handling infeasible solution iterates in TS/SNES
Message-ID: <3d28e4c5-70fa-5baf-3fd3-a9346385974d@quintessa.org>

Hello,

I have another beginner's PETSc question. Apologies if the solution is obvious, but I've looked around the manual and the API and haven't yet spotted a solution.

I'm solving a nonlinear problem using the BDF TS (although the same issue arises if I use BEULER and other TS types - it's not specific to BDF). The issue that I have is that during the SNES iterations for a timestep it's possible for the solution iterates to wander into an infeasible region when evaluating the TS IFunction. In the particular instance I have, this results in an exp overflow, so it is both physically and computationally infeasible.

The problem arises because of the highly nonlinear nature of the problem. I have a hand-coded DAE solver that also suffers from the same issue, but which spots that the situation has arisen in the evaluation of the residual, and then rejects the timestep and takes a smaller one, which is usually sufficient for the Newton iterates to remain feasible and for timestepping to proceed. I would like to take the same approach with PETSc.

Currently I return a non-zero PetscErrorCode from my IFunction to indicate that the solution iterate is infeasible, but this results in an immediate (but graceful) exit from PETSc.

Ideally I guess I would like to call a function like TS/SNESSetIterateIsInfeasible(...) from within my IFunction and then return zero, to indicate to PETSc that the Newton iterations have gone awry but that nothing fatal has happened, or (better?) would return a specific non-zero error code from my IFunction and handle the particular error code by reducing the timestep. The crucial thing is that I can't return a new residual from my IFunction when this happens, due to the infeasibility of the input, and so PETSc should not use the value of the residual itself to infer divergence.

Are either of these approaches possible? Or is there an alternative/better approach that I can use with PETSc to handle such situations? (I've seen SETERRQ in the API but this only appears to allow tailored error messages to be set rather than providing a method to handle them - but perhaps I have misunderstood.)

Again, apologies if this is a naive question, and thanks in advance for any suggestions.

Steve

From bsmith at mcs.anl.gov Mon Sep 2 13:03:13 2019
From: bsmith at mcs.anl.gov (Smith, Barry F.)
Date: Mon, 2 Sep 2019 18:03:13 +0000
Subject: [petsc-users] Handling infeasible solution iterates in TS/SNES
In-Reply-To: <3d28e4c5-70fa-5baf-3fd3-a9346385974d@quintessa.org>
References: <3d28e4c5-70fa-5baf-3fd3-a9346385974d@quintessa.org>
Message-ID:

Steve,

There are two levels at which "out of domain" (infeasible) errors can/should be indicated:

1) First call TSSetFunctionDomainError().
TS calls your provided function when candidate solutions are generated, if the new solution is not feasible, as indicated by this function, it will cut the time-step. 2) Next, in your function evaluations (provided with TSSetIFunction() or TSSetRHSFunction()) call SNESSetFunctionDomainError() whenever an infeasible state is provided. Use TSGetSNES() to access the SNES object from the TS. If you call this function the SNES solver will try cutting the length of the search direction in the nonlinear solver, still striving to solve the nonlinear system. If this fails the TS will cut the time-step. Not a naive question at all, our documentation on this is not as extensive or as clear as it should be. I have tried to improve it a bit https://gitlab.com/petsc/petsc/merge_requests/2001 Good luck, and please let us know if this support can be improved in any way. Barry Yes, the names of the the two functions TSSetFunctionDomainError() and SNESSetFunctionDomainError() are a bit inconsistent since the first one takes a user provided function while the second one indicates the current solution point is not feasible. The PETSc error codes are all "hard" errors that end the program. We manage "exception" type errors that can be recovered from through additional APIs such as SNESSetFunctionDomainError(). I tried once to add a recoverable exception type mechanism for the PETSc error codes but found it was too cumbersome in C. > On Sep 2, 2019, at 11:22 AM, Steve via petsc-users wrote: > > Hello, > > I have another beginner's PETSc question. Apologies if the solution is obvious, but I've looked around the manual and the API and haven't yet spotted a solution. > > I'm solving a nonlinear problem using the BDF TSP (although the same issue arises if I use BEULER and other TS - it's not specific to BDF). The issue that I have is that during the SNES iterations for a timestep it's possible for the solution iterates to wander into an infeasible region when evaluating the TS IFunction. In the particular instance I have this is resulting in an exp overflow, so it is both physically and computationally infeasible. > > The problem arises because of the highly nonlinear nature of the problem. I have a hand-coded DAE solver that also suffers with the same issue, but which spots that the situation had arisen in the evaluation of the residual, and then rejects the timestep and takes a smaller one, which is usually sufficient for the Newton iterates to remain feasible and for timestepping to proceed. I would like to take the same approach with PETSc. > > Currently I return a non-zero PetscErrorCode from my IFunction to indicate that the solution iterate is infeasible, but this results in an immediate (but graceful) exit from PETSc. > > Ideally I guess I would like to call a function like TS/SNESSetIterateIsInfeasible(...) from within my IFunction and then return zero, to indicate to PETSc that the Newton iterations have gone awry but that nothing fatal has happened, or (better?) would return a specific non-zero error code from my IFunction and handle the particular error code by reducing the timestep. The crucial thing is that I can't return a new residual from my IFunction when this happens, due to the infeasibility of the input, and so PETSc should not use the value of the residual itself to infer divergence. > > Are either of these approaches possible? Or is there an alternative/better approach that I can use with PETSc to handle such situations? 
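A minimal C sketch of the two calls Barry describes above; this uses the 2019-era ierr/CHKERRQ style, and XMAX, MyDomainCheck and MyIFunction are illustrative placeholders rather than anything from Steve's code:

    #include <petscts.h>

    #define XMAX 700.0   /* illustrative bound, e.g. to avoid exp() overflow */

    /* 1) callback registered with TSSetFunctionDomainError(): reject infeasible candidate solutions
       (in parallel the test should be made collectively consistent across ranks) */
    static PetscErrorCode MyDomainCheck(TS ts, PetscReal t, Vec U, PetscBool *accept)
    {
      const PetscScalar *u;
      PetscInt           i, n;
      PetscErrorCode     ierr;

      PetscFunctionBeginUser;
      *accept = PETSC_TRUE;
      ierr = VecGetLocalSize(U, &n);CHKERRQ(ierr);
      ierr = VecGetArrayRead(U, &u);CHKERRQ(ierr);
      for (i = 0; i < n; i++) if (PetscRealPart(u[i]) > XMAX) *accept = PETSC_FALSE;
      ierr = VecRestoreArrayRead(U, &u);CHKERRQ(ierr);
      PetscFunctionReturn(0);
    }

    /* 2) inside the IFunction: flag the infeasible input and return 0, instead of returning a
       nonzero PetscErrorCode (which ends the program) */
    static PetscErrorCode MyIFunction(TS ts, PetscReal t, Vec U, Vec Udot, Vec F, void *ctx)
    {
      PetscBool      ok;
      SNES           snes;
      PetscErrorCode ierr;

      PetscFunctionBeginUser;
      ierr = MyDomainCheck(ts, t, U, &ok);CHKERRQ(ierr);
      if (!ok) {
        ierr = TSGetSNES(ts, &snes);CHKERRQ(ierr);
        ierr = SNESSetFunctionDomainError(snes);CHKERRQ(ierr);
        PetscFunctionReturn(0);   /* do not touch F; SNES/TS will shorten the step instead */
      }
      /* ... assemble F(t, U, Udot) here as usual ... */
      PetscFunctionReturn(0);
    }

    /* registration, after TSSetIFunction(ts, NULL, MyIFunction, NULL):
         ierr = TSSetFunctionDomainError(ts, MyDomainCheck);CHKERRQ(ierr);  */

With both pieces in place the solver first tries shorter Newton steps and then shorter time steps, which mirrors the reject-and-retry behaviour of the hand-coded solver described above.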
(I've seen SETERRQ in the API but this only appears to allow tailored error messages to be set rather than providing a method to handle them - but perhaps I have misunderstood.) > > Again, apologies if this is a naive question, and thanks in advance for any suggestions. > > Steve > > From danyang.su at gmail.com Mon Sep 2 14:24:17 2019 From: danyang.su at gmail.com (Danyang Su) Date: Mon, 2 Sep 2019 12:24:17 -0700 Subject: [petsc-users] Error in creating compressed data using HDF5 Message-ID: Dear All, Not sure if this is the right place to ask hdf5 question. I installed hdf5 through PETSc configuration --download-hdf5=yes. The code runs without problem except the function to create compressed data (red part shown below). ??? !c create local memory space and hyperslab ??? call h5screate_simple_f(hdf5_ndim, hdf5_dsize, memspace,?????????? & ??????????????????????????? hdf5_ierr) ??? call h5sselect_hyperslab_f(memspace, H5S_SELECT_SET_F,???????????? & ?????????????????????????????? hdf5_offset, hdf5_count, hdf5_ierr,???? & ?????????????????????????????? hdf5_stride, hdf5_block) ??? !c create the global file space and hyperslab ??? call h5screate_simple_f(hdf5_ndim,hdf5_gdsize,filespace, & ??????????????????????????? hdf5_ierr) ??? call h5sselect_hyperslab_f(filespace, H5S_SELECT_SET_F,??????????? & ?????????????????????????????? hdf5_goffset, hdf5_count, hdf5_ierr,??? & ?????????????????????????????? hdf5_stride, hdf5_block) ??? !c create a data chunking property ??? call h5pcreate_f(H5P_DATASET_CREATE_F, chunk_id, hdf5_ierr) ??? call h5pset_chunk_f(chunk_id, hdf5_ndim, hdf5_csize, hdf5_ierr) ??? !c create compressed data, dataset must be chunked for compression ??? !c the following cause crash in hdf5 library, check when new ??? !c hdf5 version is available ??? ! Set ZLIB / DEFLATE Compression using compression level 6. ??? ! To use SZIP Compression comment out these lines. ??? !call h5pset_deflate_f(chunk_id, 6, hdf5_ierr) ??? ! Uncomment these lines to set SZIP Compression ??? !szip_options_mask = H5_SZIP_NN_OM_F ??? !szip_pixels_per_block = 16 ??? !call H5Pset_szip_f(chunk_id, szip_options_mask,??????????????????? & ??? !?????????????????? szip_pixels_per_block, hdf5_ierr) ??? !c create the dataset id ??? call h5dcreate_f(group_id, dataname, H5T_NATIVE_INTEGER,?????????? & ???????????????????? filespace, dset_id, hdf5_ierr,??????????????????? & ???????????????????? dcpl_id=chunk_id) ??? !c create a data transfer property ??? call h5pcreate_f(H5P_DATASET_XFER_F, xlist_id, hdf5_ierr) ??? call h5pset_dxpl_mpio_f(xlist_id, H5FD_MPIO_COLLECTIVE_F,????????? & ??????????????????????????? hdf5_ierr) ??? !c write the dataset collectively ??? call h5dwrite_f(dset_id, H5T_NATIVE_INTEGER, dataset, hdf5_dsize,? & ??????????????????? hdf5_ierr, file_space_id=filespace,??????????????? & ??????????????????? mem_space_id=memspace, xfer_prp = xlist_id) ??? call h5dclose_f(dset_id, hdf5_ierr) ??? !c close resources ??? call h5sclose_f(filespace, hdf5_ierr) ??? call h5sclose_f(memspace, hdf5_ierr) ??? call h5pclose_f(chunk_id, hdf5_ierr) ??? call h5pclose_f(xlist_id, hdf5_ierr) Both h5pset_deflate_f and H5Pset_szip_f crashes the code with error information as shown below. If I comment out h5pset_deflate_f and H5Pset_szip_f, then everything works fine. HDF5-DIAG: Error detected in HDF5 (1.8.18) MPI-process 0: ? #000: H5D.c line 194 in H5Dcreate2(): unable to create dataset ??? major: Dataset ??? minor: Unable to initialize object ? 
#001: H5Dint.c line 455 in H5D__create_named(): unable to create and link to dataset ??? major: Dataset ??? minor: Unable to initialize object ? #002: H5L.c line 1638 in H5L_link_object(): unable to create new link to object ??? major: Links ??? minor: Unable to initialize object ? #003: H5L.c line 1882 in H5L_create_real(): can't insert link ??? major: Symbol table ??? minor: Unable to insert object ? #004: H5Gtraverse.c line 861 in H5G_traverse(): internal path traversal failed ??? major: Symbol table ??? minor: Object not found Does anyone encounter this kind of error before? Kind regards, Danyang -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Sep 2 14:45:23 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Mon, 2 Sep 2019 19:45:23 +0000 Subject: [petsc-users] Error in creating compressed data using HDF5 In-Reply-To: References: Message-ID: <27553ABD-57FC-45EB-BE3D-A26EE0E60E2E@anl.gov> You could try the master branch of PETSc that uses a much more recent branch of hdf5 When you did the --download-hdf5 did you also do --download-zlib and --download-szlib (though I would hope hdf5 would give you a very useful error message that they need to be installed instead of the vague error message they do provide.) Barry > On Sep 2, 2019, at 2:24 PM, Danyang Su via petsc-users wrote: > > Dear All, > > Not sure if this is the right place to ask hdf5 question. I installed hdf5 through PETSc configuration --download-hdf5=yes. The code runs without problem except the function to create compressed data (red part shown below). > > !c create local memory space and hyperslab > call h5screate_simple_f(hdf5_ndim, hdf5_dsize, memspace, & > hdf5_ierr) > call h5sselect_hyperslab_f(memspace, H5S_SELECT_SET_F, & > hdf5_offset, hdf5_count, hdf5_ierr, & > hdf5_stride, hdf5_block) > > !c create the global file space and hyperslab > call h5screate_simple_f(hdf5_ndim,hdf5_gdsize,filespace, & > hdf5_ierr) > call h5sselect_hyperslab_f(filespace, H5S_SELECT_SET_F, & > hdf5_goffset, hdf5_count, hdf5_ierr, & > hdf5_stride, hdf5_block) > > !c create a data chunking property > call h5pcreate_f(H5P_DATASET_CREATE_F, chunk_id, hdf5_ierr) > call h5pset_chunk_f(chunk_id, hdf5_ndim, hdf5_csize, hdf5_ierr) > > !c create compressed data, dataset must be chunked for compression > !c the following cause crash in hdf5 library, check when new > !c hdf5 version is available > > ! Set ZLIB / DEFLATE Compression using compression level 6. > ! To use SZIP Compression comment out these lines. > !call h5pset_deflate_f(chunk_id, 6, hdf5_ierr) > > ! Uncomment these lines to set SZIP Compression > !szip_options_mask = H5_SZIP_NN_OM_F > !szip_pixels_per_block = 16 > !call H5Pset_szip_f(chunk_id, szip_options_mask, & > ! 
szip_pixels_per_block, hdf5_ierr) > > !c create the dataset id > call h5dcreate_f(group_id, dataname, H5T_NATIVE_INTEGER, & > filespace, dset_id, hdf5_ierr, & > dcpl_id=chunk_id) > > !c create a data transfer property > call h5pcreate_f(H5P_DATASET_XFER_F, xlist_id, hdf5_ierr) > call h5pset_dxpl_mpio_f(xlist_id, H5FD_MPIO_COLLECTIVE_F, & > hdf5_ierr) > > !c write the dataset collectively > call h5dwrite_f(dset_id, H5T_NATIVE_INTEGER, dataset, hdf5_dsize, & > hdf5_ierr, file_space_id=filespace, & > mem_space_id=memspace, xfer_prp = xlist_id) > > call h5dclose_f(dset_id, hdf5_ierr) > > !c close resources > call h5sclose_f(filespace, hdf5_ierr) > call h5sclose_f(memspace, hdf5_ierr) > call h5pclose_f(chunk_id, hdf5_ierr) > call h5pclose_f(xlist_id, hdf5_ierr) > > > > Both h5pset_deflate_f and H5Pset_szip_f crashes the code with error information as shown below. If I comment out h5pset_deflate_f and H5Pset_szip_f, then everything works fine. > > HDF5-DIAG: Error detected in HDF5 (1.8.18) MPI-process 0: > #000: H5D.c line 194 in H5Dcreate2(): unable to create dataset > major: Dataset > minor: Unable to initialize object > #001: H5Dint.c line 455 in H5D__create_named(): unable to create and link to dataset > major: Dataset > minor: Unable to initialize object > #002: H5L.c line 1638 in H5L_link_object(): unable to create new link to object > major: Links > minor: Unable to initialize object > #003: H5L.c line 1882 in H5L_create_real(): can't insert link > major: Symbol table > minor: Unable to insert object > #004: H5Gtraverse.c line 861 in H5G_traverse(): internal path traversal failed > major: Symbol table > minor: Object not found > > Does anyone encounter this kind of error before? > > Kind regards, > > Danyang > From jed at jedbrown.org Mon Sep 2 16:44:31 2019 From: jed at jedbrown.org (Jed Brown) Date: Mon, 02 Sep 2019 15:44:31 -0600 Subject: [petsc-users] is TS_EQ_DAE_SEMI_EXPLICIT_INDEX functional In-Reply-To: References: Message-ID: <87mufml6f4.fsf@jedbrown.org> I believe this is intended to work with most any implicit solver, *provided* the initial conditions are compatible. It was added by Emil, but I don't see it explicitly tested in PETSc. "Huck, Moritz via petsc-users" writes: > Hi, > TS_EQ_DAE_SEMI_EXPLICIT_INDEX(?) are defined in TSEquationType but not mentioned in the manual. > Is this feature functional ? > If yes how do I have to define the RHSFunction? > (I am asking since the ODE variant has it defined as G= M^-1 g, which cannot work for a DAE) > > Best Regards, > Moritz From emconsta at anl.gov Mon Sep 2 19:27:40 2019 From: emconsta at anl.gov (Constantinescu, Emil M.) Date: Tue, 3 Sep 2019 00:27:40 +0000 Subject: [petsc-users] is TS_EQ_DAE_SEMI_EXPLICIT_INDEX functional In-Reply-To: <87mufml6f4.fsf@jedbrown.org> References: <87mufml6f4.fsf@jedbrown.org> Message-ID: Indeed, various time steppers can take advantage of the differential form provided and also can serve as a sanity check (e.g., warn users before they use an explicit solver on an index-2 DAE). To my knowledge, we do not have solvers that take advantage of semi-explicit DAEs, but it's good practice to annotate applications for when solvers are available. In ARKIMEX, we check only if the problem is TS_EQ_IMPLICIT or not (this includes DAEs and ODEs with mass matrix on the left M\dot{u}). If you are solving a DAE, you should use TS_EQ_IMPLICIT. G=M^-1 g is either for nonstiff ODE or partitioned ODE with mass matrix, which can be used if you have a fast way to compute M^-1 g. 
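A minimal C sketch of that annotation, assuming an application that already provides its own IFunction/IJacobian; the names MyIFunction, MyIJacobian and SetupDAE are illustrative, and the fully-implicit setting is one possible choice rather than a requirement:

    #include <petscts.h>

    /* prototypes assumed to exist elsewhere in the application */
    extern PetscErrorCode MyIFunction(TS, PetscReal, Vec, Vec, Vec, void*);
    extern PetscErrorCode MyIJacobian(TS, PetscReal, Vec, Vec, PetscReal, Mat, Mat, void*);

    static PetscErrorCode SetupDAE(MPI_Comm comm, Mat J, Vec u, void *user, TS *ts)
    {
      PetscErrorCode ierr;

      PetscFunctionBeginUser;
      ierr = TSCreate(comm, ts);CHKERRQ(ierr);
      ierr = TSSetType(*ts, TSARKIMEX);CHKERRQ(ierr);
      ierr = TSSetIFunction(*ts, NULL, MyIFunction, user);CHKERRQ(ierr); /* F(t,u,udot)=0, algebraic rows included */
      ierr = TSSetIJacobian(*ts, J, J, MyIJacobian, user);CHKERRQ(ierr);
      /* annotate the problem type; for a DAE do NOT use the G = M^-1 g RHS form */
      ierr = TSSetEquationType(*ts, TS_EQ_DAE_SEMI_EXPLICIT_INDEX1);CHKERRQ(ierr); /* or TS_EQ_IMPLICIT */
      ierr = TSARKIMEXSetFullyImplicit(*ts, PETSC_TRUE);CHKERRQ(ierr);             /* treat all terms implicitly */
      ierr = TSSetSolution(*ts, u);CHKERRQ(ierr);
      ierr = TSSetFromOptions(*ts);CHKERRQ(ierr);
      PetscFunctionReturn(0);
    }

Run-time options such as -ts_type arkimex -ts_arkimex_type 3 can still override these choices through TSSetFromOptions().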
Emil On 9/2/19 4:44 PM, Jed Brown wrote: > I believe this is intended to work with most any implicit solver, > *provided* the initial conditions are compatible. It was added by Emil, > but I don't see it explicitly tested in PETSc. > > "Huck, Moritz via petsc-users" writes: > >> Hi, >> TS_EQ_DAE_SEMI_EXPLICIT_INDEX(?) are defined in TSEquationType but not mentioned in the manual. >> Is this feature functional ? >> If yes how do I have to define the RHSFunction? >> (I am asking since the ODE variant has it defined as G= M^-1 g, which cannot work for a DAE) >> >> Best Regards, >> Moritz From danyang.su at gmail.com Mon Sep 2 19:31:30 2019 From: danyang.su at gmail.com (Danyang Su) Date: Mon, 2 Sep 2019 17:31:30 -0700 Subject: [petsc-users] Error in creating compressed data using HDF5 In-Reply-To: <27553ABD-57FC-45EB-BE3D-A26EE0E60E2E@anl.gov> References: <27553ABD-57FC-45EB-BE3D-A26EE0E60E2E@anl.gov> Message-ID: <67137fbd-3373-408d-892e-2b68eef143fe@gmail.com> Hi Barry, Yes, I have already included zlib and szlib during the configuration. I will try the dev version to see if it works. Thanks, Danyang On 2019-09-02 12:45 p.m., Smith, Barry F. wrote: > You could try the master branch of PETSc that uses a much more recent branch of hdf5 > > When you did the --download-hdf5 did you also do --download-zlib and --download-szlib (though I would hope hdf5 would give you a very useful error message that they need to be installed instead of the vague error message they do provide.) > > Barry > > >> On Sep 2, 2019, at 2:24 PM, Danyang Su via petsc-users wrote: >> >> Dear All, >> >> Not sure if this is the right place to ask hdf5 question. I installed hdf5 through PETSc configuration --download-hdf5=yes. The code runs without problem except the function to create compressed data (red part shown below). >> >> !c create local memory space and hyperslab >> call h5screate_simple_f(hdf5_ndim, hdf5_dsize, memspace, & >> hdf5_ierr) >> call h5sselect_hyperslab_f(memspace, H5S_SELECT_SET_F, & >> hdf5_offset, hdf5_count, hdf5_ierr, & >> hdf5_stride, hdf5_block) >> >> !c create the global file space and hyperslab >> call h5screate_simple_f(hdf5_ndim,hdf5_gdsize,filespace, & >> hdf5_ierr) >> call h5sselect_hyperslab_f(filespace, H5S_SELECT_SET_F, & >> hdf5_goffset, hdf5_count, hdf5_ierr, & >> hdf5_stride, hdf5_block) >> >> !c create a data chunking property >> call h5pcreate_f(H5P_DATASET_CREATE_F, chunk_id, hdf5_ierr) >> call h5pset_chunk_f(chunk_id, hdf5_ndim, hdf5_csize, hdf5_ierr) >> >> !c create compressed data, dataset must be chunked for compression >> !c the following cause crash in hdf5 library, check when new >> !c hdf5 version is available >> >> ! Set ZLIB / DEFLATE Compression using compression level 6. >> ! To use SZIP Compression comment out these lines. >> !call h5pset_deflate_f(chunk_id, 6, hdf5_ierr) >> >> ! Uncomment these lines to set SZIP Compression >> !szip_options_mask = H5_SZIP_NN_OM_F >> !szip_pixels_per_block = 16 >> !call H5Pset_szip_f(chunk_id, szip_options_mask, & >> ! 
szip_pixels_per_block, hdf5_ierr) >> >> !c create the dataset id >> call h5dcreate_f(group_id, dataname, H5T_NATIVE_INTEGER, & >> filespace, dset_id, hdf5_ierr, & >> dcpl_id=chunk_id) >> >> !c create a data transfer property >> call h5pcreate_f(H5P_DATASET_XFER_F, xlist_id, hdf5_ierr) >> call h5pset_dxpl_mpio_f(xlist_id, H5FD_MPIO_COLLECTIVE_F, & >> hdf5_ierr) >> >> !c write the dataset collectively >> call h5dwrite_f(dset_id, H5T_NATIVE_INTEGER, dataset, hdf5_dsize, & >> hdf5_ierr, file_space_id=filespace, & >> mem_space_id=memspace, xfer_prp = xlist_id) >> >> call h5dclose_f(dset_id, hdf5_ierr) >> >> !c close resources >> call h5sclose_f(filespace, hdf5_ierr) >> call h5sclose_f(memspace, hdf5_ierr) >> call h5pclose_f(chunk_id, hdf5_ierr) >> call h5pclose_f(xlist_id, hdf5_ierr) >> >> >> >> Both h5pset_deflate_f and H5Pset_szip_f crashes the code with error information as shown below. If I comment out h5pset_deflate_f and H5Pset_szip_f, then everything works fine. >> >> HDF5-DIAG: Error detected in HDF5 (1.8.18) MPI-process 0: >> #000: H5D.c line 194 in H5Dcreate2(): unable to create dataset >> major: Dataset >> minor: Unable to initialize object >> #001: H5Dint.c line 455 in H5D__create_named(): unable to create and link to dataset >> major: Dataset >> minor: Unable to initialize object >> #002: H5L.c line 1638 in H5L_link_object(): unable to create new link to object >> major: Links >> minor: Unable to initialize object >> #003: H5L.c line 1882 in H5L_create_real(): can't insert link >> major: Symbol table >> minor: Unable to insert object >> #004: H5Gtraverse.c line 861 in H5G_traverse(): internal path traversal failed >> major: Symbol table >> minor: Object not found >> >> Does anyone encounter this kind of error before? >> >> Kind regards, >> >> Danyang >> From emconsta at anl.gov Mon Sep 2 19:33:56 2019 From: emconsta at anl.gov (Constantinescu, Emil M.) Date: Tue, 3 Sep 2019 00:33:56 +0000 Subject: [petsc-users] Problem with TS and SNES VI In-Reply-To: <3633CF86-71B0-4C91-A3EB-A14F22B5098A@mcs.anl.gov> References: <8dbfce1e921b4ae282a7539dfbf5370b@rwth-aachen.de> <3ac7d73f24074aec8b8c288b45193ecb@rwth-aachen.de> <5A251CF0-D34E-4823-B0CA-695CC21AC1B5@pnnl.gov> <002a46ca0d73467aa4a7a4f9dfb503ea@rwth-aachen.de> <5643e5d1bb8b4dada80ec8e76c98d5cd@rwth-aachen.de> <3633CF86-71B0-4C91-A3EB-A14F22B5098A@mcs.anl.gov> Message-ID: <42bbbb3f-b3ed-8794-838f-295004ec228e@anl.gov> Moritz, If you use ARKIMEX(3,4,5) and force the time step to be small [try setting -ts_dt_max] - do you still get negatives? If you are not, it may be due to the time steps being too large relative to the dynamics of your system, forcing it in the negative territory. This is typical when one deals with modeling chemical reactions as an example. If the non negativity disappears when using smaller steps, then the events can be used to keep it on the positive side. Emil On 8/28/19 11:01 AM, Smith, Barry F. via petsc-users wrote: > Without more detail it is impossible to understand why things are going wrong. > > Please attempt to do your debugging with gdb and send all the output, we may have suggestions on how to get it working. With that working you will be able to zoom in immediately at exactly where the problem is. Print statements are not the way to go. > > Do you have your snes line search prestep code working? Are the values out of bounds when you get in and do you fix them here to not be out of bounds? Do they they get out of bounds later inside SNES? 
If the prestep is working properly TS should never see out of bounds variables (since your code in SNES detects and removes them). > > Stick to bt for now. > > Barry > > > > >> On Aug 28, 2019, at 2:26 AM, Huck, Moritz wrote: >> >> Hi, >> (since I'm using petsc4py and wasnot able to hook gdb correctly up, I am "debbuging" with prints) >> >> I am using TS_EQ_DAE_IMPLICIT_INDEX1 as equation type. >> The out of bounds values occur inside the SNES as well after a step has finished. >> The occur first after a timestep and then "propgate" into the SNES. >> The problem arises with bt, l2 or basic as linesearch. >> It seems to occur with ARKIMEX(3,4,5) but not with ARKIMEX(L2,A2) or BDF (but these have to use much lower time steps), for the later SNESVI also works for bounding . >> >> @Shri the event solution seems not work for me, if an lower bound crossing is detected the solver reduces the time step to a small value and doesnt reach the crossing in a reasonable time frame. >> >> Best Regards, >> Moritz >> >> >> >> >> ________________________________________ >> Von: Smith, Barry F. >> Gesendet: Dienstag, 13. August 2019 05:58:29 >> An: Huck, Moritz >> Cc: Abhyankar, Shrirang G; petsc-users at mcs.anl.gov >> Betreff: Re: [petsc-users] Problem with TS and SNES VI >> >>> On Aug 12, 2019, at 10:25 AM, Huck, Moritz wrote: >>> >>> Hi, >>> at the moment I am trying the Precheck version (I will try the event one afterwards). >>> My precheckfunction is (pseudo code): >>> precheckfunction(Vec X,Vec Y,PetscBool *changed){ >>> if any(X+Y>> *changed=True >>> Y[where(X+Y>> } >>> Inside precheck around 10-20 occurences of X+Y> In what IFunction calls are you getting all these occurrences? >> >>> In my understanding this should not happen, since the precheck should be called before the IFunction call. >> For the nonlinear system solve once the precheck is done the new nonlinear solution approximation is computed via a line search >> >> X = X + lamba Y where lambda > 0 and for most line searches lambda <=1 (for example the SNESLINESEARCHBT will always result in lambda <=1, I am not sure about the other line searchs) >> >> X_i = X_i + lambda (lowerbound - X_i) = X_i - lambda X_i + lambda lowerbound = (1 - lambda) X_i + lambda lowerbound => (1 - lambda) lowerbound + lambda lowerbound = lowerbound >> >> Thus it seems you are correct, each step that the line search tries should satisfy the bounds. >> >> Possible issues: >> 1) the line search produces lambda > 1. Make sure you use SNESLINESEARCHBT >> >> ??? Here you would need to determine exactly when in the algorithm the IFunction is having as input X < lower bound. Somewhere in the ARKIMEX integrator? Are you using fully implicit? You might need to use fully implicit in order to enforce the bound? >> >> What I would do is run in the debugger and have it stop inside IFunction when the lower bound is not satisfied. Then do bt to see where the code is, in what part of the algorithms. If inside the line search you'll need to poke around at the values to see why the step could produce something below the bound which in theory it shouldn't >> >> Good luck >> >> Barry >> >> >> >>> ________________________________________ >>> Von: Abhyankar, Shrirang G >>> Gesendet: Donnerstag, 8. August 2019 19:16:12 >>> An: Huck, Moritz; Smith, Barry F. >>> Cc: petsc-users at mcs.anl.gov >>> Betreff: Re: [petsc-users] Problem with TS and SNES VI >>> >>> Moritz, >>> I think your case will also work with using TSEvent. 
I think your problem is similar, correct me if I am wrong, to my application where I need to constrain the states within some limits, lb \le x. I use events to handle this, where I use two event functions: >>> (i) x ? lb = 0. if x > lb & >>> (ii) \dot{x} = 0 x = lb >>> >>> The first event function is used to detect when x hits the limit lb. Once it hits the limit, the differential equation for x is changed to (x-lb = 0) in the model to hold x at limit lb. For releasing x, there is an event function on the derivative of x, \dot{x}, and x is released on detection of the condition \dot{x} > 0. This is done through the event function \dot{x} = 0 with a positive zero crossing. >>> >>> An example of how the above works is in the example src/ts/examples/tutorials/power_grid/stability_9bus/ex9bus.c. In this example, there is an event function that first checks whether the state VR has hit the upper limit VRMAX. Once it does so, the flag VRatmax is set by the post-event function. The event function is then switched to the \dot{VR} >>> if (!VRatmax[i])) >>> fvalue[2+2*i] = VRMAX[i] - VR; >>> } else { >>> fvalue[2+2*i] = (VR - KA[i]*RF + KA[i]*KF[i]*Efd/TF[i] - KA[i]*(Vref[i] - Vm))/TA[i]; >>> } >>> >>> You can either try TSEvent or what Barry suggested SNESLineSearchSetPreCheck(), or both. >>> >>> Thanks, >>> Shri >>> >>> >>> From: "Huck, Moritz" >>> Date: Wednesday, August 7, 2019 at 8:46 AM >>> To: "Smith, Barry F." >>> Cc: "Abhyankar, Shrirang G" , "petsc-users at mcs.anl.gov" >>> Subject: AW: [petsc-users] Problem with TS and SNES VI >>> >>> Thank you for your response. >>> The sizes are only allowed to go down to a certain value. >>> The non-physical values do also occur during the function evaluations (IFunction). >>> >>> I will try to implment your suggestions with SNESLineSearchSetPreCheck. This would mean I dont have to use SNESVISetVariableBounds at all, right? >>> ________________________________________ >>> Von: Smith, Barry F. > >>> Gesendet: Dienstag, 6. August 2019 17:47:13 >>> An: Huck, Moritz >>> Cc: Abhyankar, Shrirang G; petsc-users at mcs.anl.gov >>> Betreff: Re: [petsc-users] Problem with TS and SNES VI >>> >>> Thanks, very useful. >>> >>> Are the non-physical values appearing in the nonlinear solver ? Or just at the time-step? >>> >>> Do you check for non-physical values each time you do a function evaluation needed by SNES/TS? >>> >>> If the non-physical values are an artifact of the steps taken in the nonlinear solver in SNES then the correct solution is to use >>> SNESLineSearchSetPreCheck() what you do is change the step so the resulting solutions are physical. >>> >>> For you case where the sizes go negative I am not sure what to do. Are the sizes allowed to go to zero? If so then adjust the step so that the sizes that go to negative values just go to zero. If they are suppose to be always positive then you need to pick some tolerance (say epsilon) and adjust the step so they are of size epsilon. Note you don't scale the entire step vector by a small number to satisfy the constraint you change each entry in the step as needed to satisfy the constraints. >>> >>> Good luck and let us know how it goes >>> >>> Barry >>> >>> >>> >>> On Aug 6, 2019, at 9:24 AM, Huck, Moritz > wrote: >>> >>> At the moment I output only the values at the actual time-step (with the poststep functionality), I dont know the values during the stages. >>> Unphysical values are e.g. particle sizes below zero. 
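For reference, Barry's step-clipping suggestion can be written as a SNESLineSearch pre-check. The sketch below reconstructs the intent of the pseudocode quoted earlier in this thread (whose "<" comparisons were eaten by the mail archiver), assuming the line search forms candidate solutions as X - lambda*Y with 0 < lambda <= 1, which is the bt convention; flip the test if your update is written as X + lambda*Y, as in the discussion above:

    #include <petscsnes.h>

    /* clip the Newton step componentwise so the candidate X - lambda*Y stays at or above lb */
    static PetscErrorCode ClipStepPreCheck(SNESLineSearch ls, Vec X, Vec Y, PetscBool *changed, void *ctx)
    {
      const PetscReal    lb = 0.0;              /* illustrative lower bound */
      const PetscScalar *x;
      PetscScalar       *y;
      PetscInt           i, n;
      PetscErrorCode     ierr;

      PetscFunctionBeginUser;
      *changed = PETSC_FALSE;
      ierr = VecGetLocalSize(X, &n);CHKERRQ(ierr);
      ierr = VecGetArrayRead(X, &x);CHKERRQ(ierr);
      ierr = VecGetArray(Y, &y);CHKERRQ(ierr);
      for (i = 0; i < n; i++) {
        if (PetscRealPart(x[i] - y[i]) < lb) {  /* the full step would violate the bound */
          y[i]     = x[i] - lb;                 /* shrink this component so that X - Y == lb */
          *changed = PETSC_TRUE;
        }
      }
      ierr = VecRestoreArray(Y, &y);CHKERRQ(ierr);
      ierr = VecRestoreArrayRead(X, &x);CHKERRQ(ierr);
      PetscFunctionReturn(0);
    }

    /* registration (ts is the TS object):
         TSGetSNES(ts, &snes);
         SNESGetLineSearch(snes, &ls);
         SNESLineSearchSetPreCheck(ls, ClipStepPreCheck, NULL);   */

Because each partial step X - lambda*Y is then a convex combination of X and the clipped full step, every trial iterate stays at or above lb, matching the algebra Barry walks through above.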
>>> >>> My model as no explicit inequalities, the only handling of the constraints is done by setting SNES VI. >>> >>> The model does not change in the senes that there are new equations. If have put in an conditional that xdot is calculated to be positive of x is on or below the lower bound. >>> ________________________________________ >>> Von: Smith, Barry F. > >>> Gesendet: Dienstag, 6. August 2019 15:51:16 >>> An: Huck, Moritz >>> Cc: Abhyankar, Shrirang G; petsc-users at mcs.anl.gov >>> Betreff: Re: [petsc-users] Problem with TS and SNES VI >>> >>> Could you explain in a bit more detail what you mean by "some states go to unphysical values" ? >>> >>> Is this within a stage or at the actual time-step after the stage? >>> >>> Does you model explicitly have these bounds on the solution; i.e. it is imposed as a variational inequality or does the model not explicitly have the constraints because its "analytic" solution just naturally stays in the physical region anyways? But numerical it can go out? >>> >>> Or, is your model suppose to "change" at a certain time, which you don't know in advance when the solution goes out of some predefined bounds" (this is where the event is designed for). >>> >>> This information can help us determine what approach you should take. >>> >>> Thanks >>> >>> Barry >>> >>> >>> On Aug 6, 2019, at 2:12 AM, Huck, Moritz via petsc-users > wrote: >>> >>> Hi, >>> I think I am missing something here. >>> How would events help to constrain the states. >>> Do you mean to use the event to "pause" to integration an adjust the state manually? >>> Or are the events to enforce smaller timesteps when the state come close to the constraints? >>> >>> Thank you, >>> Moritz >>> ________________________________________ >>> Von: Abhyankar, Shrirang G > >>> Gesendet: Montag, 5. August 2019 17:21:41 >>> An: Huck, Moritz; petsc-users at mcs.anl.gov >>> Betreff: Re: [petsc-users] Problem with TS and SNES VI >>> >>> For problems with constraints on the states, I would recommend trying the event functionality, TSEvent, that allows detection and location of discrete events, such as one that you have in your problem. >>> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/TS/TSSetEventHandler.html. >>> >>> An example using TSEvent functionality: https://www.mcs.anl.gov/petsc/petsc-current/src/ts/examples/tutorials/ex40.c.html >>> >>> A brief intro to TSEvent can be found here. >>> >>> Thanks, >>> Shri >>> >>> >>> From: petsc-users > on behalf of "Huck, Moritz via petsc-users" > >>> Reply-To: "Huck, Moritz" > >>> Date: Monday, August 5, 2019 at 5:18 AM >>> To: "petsc-users at mcs.anl.gov" > >>> Subject: [petsc-users] Problem with TS and SNES VI >>> >>> Hi, >>> I am trying to solve a DAE with the ARKIMEX solver, which works mostly fine. >>> The problem arises when some states go to unphysical values. I try to constrain my states with SNESVISetVariableBounds (through the petsc4py interface). >>> But TS seems not respect this e.g. I have a state with is usually between 1 and 1e3 for which I set a lower bound of 1, but the state goes t0 -0.8 at some points. >>> Are there some tolerances I have to set for VI or something like this? 
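To illustrate the event-based handling Shri describes earlier in the thread, a minimal C sketch; LB, the choice of component 0, and the at_bound flag are illustrative placeholders, and the model's IFunction would check the flag to hold the state at the bound:

    #include <petscts.h>

    #define LB 1.0   /* illustrative lower bound for the constrained state */

    /* event indicator: crosses zero when the state reaches the lower bound */
    static PetscErrorCode EventIndicator(TS ts, PetscReal t, Vec U, PetscScalar fvalue[], void *ctx)
    {
      const PetscScalar *u;
      PetscErrorCode     ierr;

      PetscFunctionBeginUser;
      ierr = VecGetArrayRead(U, &u);CHKERRQ(ierr);
      fvalue[0] = u[0] - LB;
      ierr = VecRestoreArrayRead(U, &u);CHKERRQ(ierr);
      PetscFunctionReturn(0);
    }

    /* post-event: switch the model, e.g. set a flag that the IFunction uses to hold x at LB */
    static PetscErrorCode PostEvent(TS ts, PetscInt nevents, PetscInt events[], PetscReal t, Vec U, PetscBool forwardsolve, void *ctx)
    {
      PetscBool *at_bound = (PetscBool*)ctx;

      PetscFunctionBeginUser;
      *at_bound = PETSC_TRUE;
      PetscFunctionReturn(0);
    }

    /* registration:
         PetscInt  direction = -1;          detect the bound being approached from above
         PetscBool terminate = PETSC_FALSE; keep integrating after the event
         TSSetEventHandler(ts, 1, &direction, &terminate, EventIndicator, PostEvent, &at_bound);  */

A second indicator on the derivative, as in the ex9bus.c snippet above, can then release the state once \dot{x} turns positive.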
>>> >>> Best Regards, >>> Moritz From danyang.su at gmail.com Tue Sep 3 00:56:01 2019 From: danyang.su at gmail.com (Danyang Su) Date: Mon, 2 Sep 2019 22:56:01 -0700 Subject: [petsc-users] Error in creating compressed data using HDF5 In-Reply-To: <67137fbd-3373-408d-892e-2b68eef143fe@gmail.com> References: <27553ABD-57FC-45EB-BE3D-A26EE0E60E2E@anl.gov> <67137fbd-3373-408d-892e-2b68eef143fe@gmail.com> Message-ID: <60466674-cfc7-8c69-ee7f-e0de9e219aff@gmail.com> Unfortunately, the master branch with hdf-1.10.5 returns similar error. Danyang On 2019-09-02 5:31 p.m., Danyang Su wrote: > Hi Barry, > > Yes, I have already included zlib and szlib during the configuration. > I will try the dev version to see if it works. > > Thanks, > > Danyang > > On 2019-09-02 12:45 p.m., Smith, Barry F. wrote: >> ?? You could try the master branch of PETSc that uses a much more >> recent branch of hdf5 >> >> ?? When you did the --download-hdf5 did you also do --download-zlib >> and --download-szlib (though I would hope hdf5 would give you a very >> useful error message that they need to be installed instead of the >> vague error message they do provide.) >> >> ??? Barry >> >> >>> On Sep 2, 2019, at 2:24 PM, Danyang Su via petsc-users >>> wrote: >>> >>> Dear All, >>> >>> Not sure if this is the right place to ask hdf5 question. I >>> installed hdf5 through PETSc configuration --download-hdf5=yes. The >>> code runs without problem except the function to create compressed >>> data (red part shown below). >>> >>> ???? !c create local memory space and hyperslab >>> ???? call h5screate_simple_f(hdf5_ndim, hdf5_dsize, >>> memspace,?????????? & >>> ???????????????????????????? hdf5_ierr) >>> ???? call h5sselect_hyperslab_f(memspace, >>> H5S_SELECT_SET_F,???????????? & >>> ??????????????????????????????? hdf5_offset, hdf5_count, >>> hdf5_ierr,???? & >>> ??????????????????????????????? hdf5_stride, hdf5_block) >>> >>> ???? !c create the global file space and hyperslab >>> ???? call h5screate_simple_f(hdf5_ndim,hdf5_gdsize,filespace, & >>> ???????????????????????????? hdf5_ierr) >>> ???? call h5sselect_hyperslab_f(filespace, >>> H5S_SELECT_SET_F,??????????? & >>> ??????????????????????????????? hdf5_goffset, hdf5_count, >>> hdf5_ierr,??? & >>> ??????????????????????????????? hdf5_stride, hdf5_block) >>> >>> ???? !c create a data chunking property >>> ???? call h5pcreate_f(H5P_DATASET_CREATE_F, chunk_id, hdf5_ierr) >>> ???? call h5pset_chunk_f(chunk_id, hdf5_ndim, hdf5_csize, hdf5_ierr) >>> >>> ???? !c create compressed data, dataset must be chunked for compression >>> ???? !c the following cause crash in hdf5 library, check when new >>> ???? !c hdf5 version is available >>> >>> ???? ! Set ZLIB / DEFLATE Compression using compression level 6. >>> ???? ! To use SZIP Compression comment out these lines. >>> ???? !call h5pset_deflate_f(chunk_id, 6, hdf5_ierr) >>> >>> ???? ! Uncomment these lines to set SZIP Compression >>> ???? !szip_options_mask = H5_SZIP_NN_OM_F >>> ???? !szip_pixels_per_block = 16 >>> ???? !call H5Pset_szip_f(chunk_id, >>> szip_options_mask,??????????????????? & >>> ???? !?????????????????? szip_pixels_per_block, hdf5_ierr) >>> >>> ???? !c create the dataset id >>> ???? call h5dcreate_f(group_id, dataname, >>> H5T_NATIVE_INTEGER,?????????? & >>> ????????????????????? filespace, dset_id, >>> hdf5_ierr,??????????????????? & >>> ????????????????????? dcpl_id=chunk_id) >>> >>> ???? !c create a data transfer property >>> ???? call h5pcreate_f(H5P_DATASET_XFER_F, xlist_id, hdf5_ierr) >>> ???? 
call h5pset_dxpl_mpio_f(xlist_id, >>> H5FD_MPIO_COLLECTIVE_F,????????? & >>> ???????????????????????????? hdf5_ierr) >>> >>> ???? !c write the dataset collectively >>> ???? call h5dwrite_f(dset_id, H5T_NATIVE_INTEGER, dataset, >>> hdf5_dsize,? & >>> ???????????????????? hdf5_ierr, >>> file_space_id=filespace,??????????????? & >>> ???????????????????? mem_space_id=memspace, xfer_prp = xlist_id) >>> >>> ???? call h5dclose_f(dset_id, hdf5_ierr) >>> >>> ???? !c close resources >>> ???? call h5sclose_f(filespace, hdf5_ierr) >>> ???? call h5sclose_f(memspace, hdf5_ierr) >>> ???? call h5pclose_f(chunk_id, hdf5_ierr) >>> ???? call h5pclose_f(xlist_id, hdf5_ierr) >>> >>> >>> >>> Both h5pset_deflate_f and H5Pset_szip_f crashes the code with error >>> information as shown below. If I comment out h5pset_deflate_f and >>> H5Pset_szip_f, then everything works fine. >>> >>> HDF5-DIAG: Error detected in HDF5 (1.8.18) MPI-process 0: >>> ?? #000: H5D.c line 194 in H5Dcreate2(): unable to create dataset >>> ???? major: Dataset >>> ???? minor: Unable to initialize object >>> ?? #001: H5Dint.c line 455 in H5D__create_named(): unable to create >>> and link to dataset >>> ???? major: Dataset >>> ???? minor: Unable to initialize object >>> ?? #002: H5L.c line 1638 in H5L_link_object(): unable to create new >>> link to object >>> ???? major: Links >>> ???? minor: Unable to initialize object >>> ?? #003: H5L.c line 1882 in H5L_create_real(): can't insert link >>> ???? major: Symbol table >>> ???? minor: Unable to insert object >>> ?? #004: H5Gtraverse.c line 861 in H5G_traverse(): internal path >>> traversal failed >>> ???? major: Symbol table >>> ???? minor: Object not found >>> >>> Does anyone encounter this kind of error before? >>> >>> Kind regards, >>> >>> Danyang >>> From knepley at gmail.com Tue Sep 3 03:58:36 2019 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 3 Sep 2019 04:58:36 -0400 Subject: [petsc-users] Error in creating compressed data using HDF5 In-Reply-To: <60466674-cfc7-8c69-ee7f-e0de9e219aff@gmail.com> References: <27553ABD-57FC-45EB-BE3D-A26EE0E60E2E@anl.gov> <67137fbd-3373-408d-892e-2b68eef143fe@gmail.com> <60466674-cfc7-8c69-ee7f-e0de9e219aff@gmail.com> Message-ID: On Tue, Sep 3, 2019 at 1:57 AM Danyang Su via petsc-users < petsc-users at mcs.anl.gov> wrote: > Unfortunately, the master branch with hdf-1.10.5 returns similar error. > It looks like its complaining about the group_id. Are you sure its correct? Matt > Danyang > > On 2019-09-02 5:31 p.m., Danyang Su wrote: > > Hi Barry, > > > > Yes, I have already included zlib and szlib during the configuration. > > I will try the dev version to see if it works. > > > > Thanks, > > > > Danyang > > > > On 2019-09-02 12:45 p.m., Smith, Barry F. wrote: > >> You could try the master branch of PETSc that uses a much more > >> recent branch of hdf5 > >> > >> When you did the --download-hdf5 did you also do --download-zlib > >> and --download-szlib (though I would hope hdf5 would give you a very > >> useful error message that they need to be installed instead of the > >> vague error message they do provide.) > >> > >> Barry > >> > >> > >>> On Sep 2, 2019, at 2:24 PM, Danyang Su via petsc-users > >>> wrote: > >>> > >>> Dear All, > >>> > >>> Not sure if this is the right place to ask hdf5 question. I > >>> installed hdf5 through PETSc configuration --download-hdf5=yes. The > >>> code runs without problem except the function to create compressed > >>> data (red part shown below). 
> >>> > >>> !c create local memory space and hyperslab > >>> call h5screate_simple_f(hdf5_ndim, hdf5_dsize, > >>> memspace, & > >>> hdf5_ierr) > >>> call h5sselect_hyperslab_f(memspace, > >>> H5S_SELECT_SET_F, & > >>> hdf5_offset, hdf5_count, > >>> hdf5_ierr, & > >>> hdf5_stride, hdf5_block) > >>> > >>> !c create the global file space and hyperslab > >>> call h5screate_simple_f(hdf5_ndim,hdf5_gdsize,filespace, & > >>> hdf5_ierr) > >>> call h5sselect_hyperslab_f(filespace, > >>> H5S_SELECT_SET_F, & > >>> hdf5_goffset, hdf5_count, > >>> hdf5_ierr, & > >>> hdf5_stride, hdf5_block) > >>> > >>> !c create a data chunking property > >>> call h5pcreate_f(H5P_DATASET_CREATE_F, chunk_id, hdf5_ierr) > >>> call h5pset_chunk_f(chunk_id, hdf5_ndim, hdf5_csize, hdf5_ierr) > >>> > >>> !c create compressed data, dataset must be chunked for compression > >>> !c the following cause crash in hdf5 library, check when new > >>> !c hdf5 version is available > >>> > >>> ! Set ZLIB / DEFLATE Compression using compression level 6. > >>> ! To use SZIP Compression comment out these lines. > >>> !call h5pset_deflate_f(chunk_id, 6, hdf5_ierr) > >>> > >>> ! Uncomment these lines to set SZIP Compression > >>> !szip_options_mask = H5_SZIP_NN_OM_F > >>> !szip_pixels_per_block = 16 > >>> !call H5Pset_szip_f(chunk_id, > >>> szip_options_mask, & > >>> ! szip_pixels_per_block, hdf5_ierr) > >>> > >>> !c create the dataset id > >>> call h5dcreate_f(group_id, dataname, > >>> H5T_NATIVE_INTEGER, & > >>> filespace, dset_id, > >>> hdf5_ierr, & > >>> dcpl_id=chunk_id) > >>> > >>> !c create a data transfer property > >>> call h5pcreate_f(H5P_DATASET_XFER_F, xlist_id, hdf5_ierr) > >>> call h5pset_dxpl_mpio_f(xlist_id, > >>> H5FD_MPIO_COLLECTIVE_F, & > >>> hdf5_ierr) > >>> > >>> !c write the dataset collectively > >>> call h5dwrite_f(dset_id, H5T_NATIVE_INTEGER, dataset, > >>> hdf5_dsize, & > >>> hdf5_ierr, > >>> file_space_id=filespace, & > >>> mem_space_id=memspace, xfer_prp = xlist_id) > >>> > >>> call h5dclose_f(dset_id, hdf5_ierr) > >>> > >>> !c close resources > >>> call h5sclose_f(filespace, hdf5_ierr) > >>> call h5sclose_f(memspace, hdf5_ierr) > >>> call h5pclose_f(chunk_id, hdf5_ierr) > >>> call h5pclose_f(xlist_id, hdf5_ierr) > >>> > >>> > >>> > >>> Both h5pset_deflate_f and H5Pset_szip_f crashes the code with error > >>> information as shown below. If I comment out h5pset_deflate_f and > >>> H5Pset_szip_f, then everything works fine. > >>> > >>> HDF5-DIAG: Error detected in HDF5 (1.8.18) MPI-process 0: > >>> #000: H5D.c line 194 in H5Dcreate2(): unable to create dataset > >>> major: Dataset > >>> minor: Unable to initialize object > >>> #001: H5Dint.c line 455 in H5D__create_named(): unable to create > >>> and link to dataset > >>> major: Dataset > >>> minor: Unable to initialize object > >>> #002: H5L.c line 1638 in H5L_link_object(): unable to create new > >>> link to object > >>> major: Links > >>> minor: Unable to initialize object > >>> #003: H5L.c line 1882 in H5L_create_real(): can't insert link > >>> major: Symbol table > >>> minor: Unable to insert object > >>> #004: H5Gtraverse.c line 861 in H5G_traverse(): internal path > >>> traversal failed > >>> major: Symbol table > >>> minor: Object not found > >>> > >>> Does anyone encounter this kind of error before? > >>> > >>> Kind regards, > >>> > >>> Danyang > >>> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From stevenbenbow at quintessa.org Tue Sep 3 06:32:57 2019 From: stevenbenbow at quintessa.org (Steve) Date: Tue, 3 Sep 2019 12:32:57 +0100 Subject: [petsc-users] Handling infeasible solution iterates in TS/SNES In-Reply-To: References: <3d28e4c5-70fa-5baf-3fd3-a9346385974d@quintessa.org> Message-ID: <784bf130-6679-ced5-4cae-9a37d734d9f6@quintessa.org> Thanks for this Barry, it's exactly what I was looking for. Sorry for not spotting this myself - I was looking for concepts such as 'feasibility', 'range' etc. in the API method names, so should probably have realised that 'DomainError' could be interpreted similarly. The suggested documentation update is a good one for us less experienced users I think - thanks for this. In case it is of interest, the approach that I take when trying to find information on PETSc concepts that are new to me is to first search the PDF manual, since that includes a lot of high-level information in one easily searchable place.? I don't think that either of these ...DomainError() methods are mentioned there (SNESSetJacobianDomainError() does appear in a code snippet, but without any explanation).? It might be helpful for the likes of me to add a quick mention to these methods in the PDF manual in a future update, and also slip in the 'feasible' word if you think that others might search on that.? I appreciate that these big documents are the hardest to keep up-to-date in a fast moving software project though, whereas the API docs can be tweaked relatively quickly. Best regards, Steve On 02/09/2019 19:03, Smith, Barry F. wrote: > Steve, > > There are two levels at which "out of domain" (infeasible) errors can/should be indicated: > > 1) First call TSSetFunctionDomainError(). TS calls your provided function when candidate solutions are generated, if the new solution is not feasible, as indicated by this function, it will cut the time-step. > > 2) Next, in your function evaluations (provided with TSSetIFunction() or TSSetRHSFunction()) call SNESSetFunctionDomainError() whenever an infeasible state is provided. Use TSGetSNES() to access the SNES object from the TS. If you call this function the SNES solver will try cutting the length of the search direction in the nonlinear solver, still striving to solve the nonlinear system. If this fails the TS will cut the time-step. > > Not a naive question at all, our documentation on this is not as extensive or as clear as it should be. I have tried to improve it a bit https://gitlab.com/petsc/petsc/merge_requests/2001 > > Good luck, and please let us know if this support can be improved in any way. > > Barry > > Yes, the names of the the two functions TSSetFunctionDomainError() and SNESSetFunctionDomainError() are a bit inconsistent since the first one takes a user provided function while the second one indicates the current solution point is not feasible. > > The PETSc error codes are all "hard" errors that end the program. We manage "exception" type errors that can be recovered from through additional APIs such as SNESSetFunctionDomainError(). I tried once to add a recoverable exception type mechanism for the PETSc error codes but found it was too cumbersome in C. > > > >> On Sep 2, 2019, at 11:22 AM, Steve via petsc-users wrote: >> >> Hello, >> >> I have another beginner's PETSc question. 
Apologies if the solution is obvious, but I've looked around the manual and the API and haven't yet spotted a solution. >> >> I'm solving a nonlinear problem using the BDF TSP (although the same issue arises if I use BEULER and other TS - it's not specific to BDF). The issue that I have is that during the SNES iterations for a timestep it's possible for the solution iterates to wander into an infeasible region when evaluating the TS IFunction. In the particular instance I have this is resulting in an exp overflow, so it is both physically and computationally infeasible. >> >> The problem arises because of the highly nonlinear nature of the problem. I have a hand-coded DAE solver that also suffers with the same issue, but which spots that the situation had arisen in the evaluation of the residual, and then rejects the timestep and takes a smaller one, which is usually sufficient for the Newton iterates to remain feasible and for timestepping to proceed. I would like to take the same approach with PETSc. >> >> Currently I return a non-zero PetscErrorCode from my IFunction to indicate that the solution iterate is infeasible, but this results in an immediate (but graceful) exit from PETSc. >> >> Ideally I guess I would like to call a function like TS/SNESSetIterateIsInfeasible(...) from within my IFunction and then return zero, to indicate to PETSc that the Newton iterations have gone awry but that nothing fatal has happened, or (better?) would return a specific non-zero error code from my IFunction and handle the particular error code by reducing the timestep. The crucial thing is that I can't return a new residual from my IFunction when this happens, due to the infeasibility of the input, and so PETSc should not use the value of the residual itself to infer divergence. >> >> Are either of these approaches possible? Or is there an alternative/better approach that I can use with PETSc to handle such situations? (I've seen SETERRQ in the API but this only appears to allow tailored error messages to be set rather than providing a method to handle them - but perhaps I have misunderstood.) >> >> Again, apologies if this is a naive question, and thanks in advance for any suggestions. >> >> Steve >> >> -- Dr Steven J Benbow Quintessa Ltd, First Floor, West Wing, Videcom House, Newtown Road, Henley-on-Thames, Oxfordshire RG9 1HG, UK Tel: 01491 636246 DD: 01491 630051 Web: http://www.quintessa.org Quintessa Limited is an employee-owned company registered in England, Number 3716623. Registered office: Quintessa Ltd, First Floor, West Wing, Videcom House, Newtown Road, Henley-on-Thames, Oxfordshire RG9 1HG, UK If you have received this e-mail in error, please notify privacy at quintessa.org and delete it from your system From danyang.su at gmail.com Tue Sep 3 14:22:43 2019 From: danyang.su at gmail.com (Danyang Su) Date: Tue, 3 Sep 2019 12:22:43 -0700 Subject: [petsc-users] Error in creating compressed data using HDF5 In-Reply-To: References: <27553ABD-57FC-45EB-BE3D-A26EE0E60E2E@anl.gov> <67137fbd-3373-408d-892e-2b68eef143fe@gmail.com> <60466674-cfc7-8c69-ee7f-e0de9e219aff@gmail.com> Message-ID: <86b4c7f4-29fe-207e-1695-527973d8e8e1@gmail.com> Hi Barry and Matt, It turns out to be my stupid error in testing hdf5 chunk size. Different chunk sizes have been passed which is not allowed. After setting the same chunk size, the code now works. 
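For reference, the constraint Danyang ran into, shown with the HDF5 C API; this is a sketch with placeholder arguments (file_id, nrows_global, ncols), assuming the failure came from dataset-creation properties, in particular chunk dimensions, that were not identical in every call/on every rank:

    #include <hdf5.h>

    /* create a chunked, deflate-compressed integer dataset; the dcpl (chunk dims, filters)
       must be identical on every MPI rank, or the collective H5Dcreate2 fails as shown above */
    static hid_t create_compressed_dataset(hid_t file_id, hsize_t nrows_global, hsize_t ncols)
    {
      hsize_t dims[2]  = {nrows_global, ncols};   /* global dataset extent */
      hsize_t chunk[2] = {256, ncols};            /* SAME values on all ranks, no larger than dims */
      hid_t   space, dcpl, dset;

      space = H5Screate_simple(2, dims, NULL);
      dcpl  = H5Pcreate(H5P_DATASET_CREATE);
      H5Pset_chunk(dcpl, 2, chunk);               /* chunking is required for compression */
      H5Pset_deflate(dcpl, 6);                    /* zlib level 6, as in the Fortran code above */
      dset  = H5Dcreate2(file_id, "dataname", H5T_NATIVE_INT, space,
                         H5P_DEFAULT, dcpl, H5P_DEFAULT);
      H5Pclose(dcpl);
      H5Sclose(space);
      return dset;                                /* caller closes with H5Dclose */
    }

The same consistency requirement applies to the Fortran wrappers (h5pset_chunk_f / h5pset_deflate_f) used in the code earlier in this thread.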
Thanks, Danyang On 2019-09-03 1:58 a.m., Matthew Knepley wrote: > On Tue, Sep 3, 2019 at 1:57 AM Danyang Su via petsc-users > > wrote: > > Unfortunately, the master branch with hdf-1.10.5 returns similar > error. > > > It looks like its complaining about the group_id. Are you sure its > correct? > > ? ?Matt > > Danyang > > On 2019-09-02 5:31 p.m., Danyang Su wrote: > > Hi Barry, > > > > Yes, I have already included zlib and szlib during the > configuration. > > I will try the dev version to see if it works. > > > > Thanks, > > > > Danyang > > > > On 2019-09-02 12:45 p.m., Smith, Barry F. wrote: > >> ?? You could try the master branch of PETSc that uses a much more > >> recent branch of hdf5 > >> > >> ?? When you did the --download-hdf5 did you also do > --download-zlib > >> and --download-szlib (though I would hope hdf5 would give you a > very > >> useful error message that they need to be installed instead of the > >> vague error message they do provide.) > >> > >> ??? Barry > >> > >> > >>> On Sep 2, 2019, at 2:24 PM, Danyang Su via petsc-users > >>> > wrote: > >>> > >>> Dear All, > >>> > >>> Not sure if this is the right place to ask hdf5 question. I > >>> installed hdf5 through PETSc configuration > --download-hdf5=yes. The > >>> code runs without problem except the function to create > compressed > >>> data (red part shown below). > >>> > >>> ???? !c create local memory space and hyperslab > >>> ???? call h5screate_simple_f(hdf5_ndim, hdf5_dsize, > >>> memspace,?????????? & > >>> ???????????????????????????? hdf5_ierr) > >>> ???? call h5sselect_hyperslab_f(memspace, > >>> H5S_SELECT_SET_F,???????????? & > >>> ??????????????????????????????? hdf5_offset, hdf5_count, > >>> hdf5_ierr,???? & > >>> ??????????????????????????????? hdf5_stride, hdf5_block) > >>> > >>> ???? !c create the global file space and hyperslab > >>> ???? call h5screate_simple_f(hdf5_ndim,hdf5_gdsize,filespace, & > >>> ???????????????????????????? hdf5_ierr) > >>> ???? call h5sselect_hyperslab_f(filespace, > >>> H5S_SELECT_SET_F,??????????? & > >>> ??????????????????????????????? hdf5_goffset, hdf5_count, > >>> hdf5_ierr,??? & > >>> ??????????????????????????????? hdf5_stride, hdf5_block) > >>> > >>> ???? !c create a data chunking property > >>> ???? call h5pcreate_f(H5P_DATASET_CREATE_F, chunk_id, hdf5_ierr) > >>> ???? call h5pset_chunk_f(chunk_id, hdf5_ndim, hdf5_csize, > hdf5_ierr) > >>> > >>> ???? !c create compressed data, dataset must be chunked for > compression > >>> ???? !c the following cause crash in hdf5 library, check when new > >>> ???? !c hdf5 version is available > >>> > >>> ???? ! Set ZLIB / DEFLATE Compression using compression level 6. > >>> ???? ! To use SZIP Compression comment out these lines. > >>> ???? !call h5pset_deflate_f(chunk_id, 6, hdf5_ierr) > >>> > >>> ???? ! Uncomment these lines to set SZIP Compression > >>> ???? !szip_options_mask = H5_SZIP_NN_OM_F > >>> ???? !szip_pixels_per_block = 16 > >>> ???? !call H5Pset_szip_f(chunk_id, > >>> szip_options_mask,??????????????????? & > >>> ???? !?????????????????? szip_pixels_per_block, hdf5_ierr) > >>> > >>> ???? !c create the dataset id > >>> ???? call h5dcreate_f(group_id, dataname, > >>> H5T_NATIVE_INTEGER,?????????? & > >>> ????????????????????? filespace, dset_id, > >>> hdf5_ierr,??????????????????? & > >>> ????????????????????? dcpl_id=chunk_id) > >>> > >>> ???? !c create a data transfer property > >>> ???? call h5pcreate_f(H5P_DATASET_XFER_F, xlist_id, hdf5_ierr) > >>> ???? 
call h5pset_dxpl_mpio_f(xlist_id, > >>> H5FD_MPIO_COLLECTIVE_F,????????? & > >>> ???????????????????????????? hdf5_ierr) > >>> > >>> ???? !c write the dataset collectively > >>> ???? call h5dwrite_f(dset_id, H5T_NATIVE_INTEGER, dataset, > >>> hdf5_dsize,? & > >>> ???????????????????? hdf5_ierr, > >>> file_space_id=filespace,??????????????? & > >>> ???????????????????? mem_space_id=memspace, xfer_prp = xlist_id) > >>> > >>> ???? call h5dclose_f(dset_id, hdf5_ierr) > >>> > >>> ???? !c close resources > >>> ???? call h5sclose_f(filespace, hdf5_ierr) > >>> ???? call h5sclose_f(memspace, hdf5_ierr) > >>> ???? call h5pclose_f(chunk_id, hdf5_ierr) > >>> ???? call h5pclose_f(xlist_id, hdf5_ierr) > >>> > >>> > >>> > >>> Both h5pset_deflate_f and H5Pset_szip_f crashes the code with > error > >>> information as shown below. If I comment out h5pset_deflate_f and > >>> H5Pset_szip_f, then everything works fine. > >>> > >>> HDF5-DIAG: Error detected in HDF5 (1.8.18) MPI-process 0: > >>> ?? #000: H5D.c line 194 in H5Dcreate2(): unable to create dataset > >>> ???? major: Dataset > >>> ???? minor: Unable to initialize object > >>> ?? #001: H5Dint.c line 455 in H5D__create_named(): unable to > create > >>> and link to dataset > >>> ???? major: Dataset > >>> ???? minor: Unable to initialize object > >>> ?? #002: H5L.c line 1638 in H5L_link_object(): unable to > create new > >>> link to object > >>> ???? major: Links > >>> ???? minor: Unable to initialize object > >>> ?? #003: H5L.c line 1882 in H5L_create_real(): can't insert link > >>> ???? major: Symbol table > >>> ???? minor: Unable to insert object > >>> ?? #004: H5Gtraverse.c line 861 in H5G_traverse(): internal path > >>> traversal failed > >>> ???? major: Symbol table > >>> ???? minor: Object not found > >>> > >>> Does anyone encounter this kind of error before? > >>> > >>> Kind regards, > >>> > >>> Danyang > >>> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bvstraalen at lbl.gov Tue Sep 3 19:00:04 2019 From: bvstraalen at lbl.gov (Brian Van Straalen) Date: Tue, 3 Sep 2019 17:00:04 -0700 Subject: [petsc-users] configuring on OSX Message-ID: pulling from git PETSC and on master branch. 
./configure CPP=/usr/bin/cpp =============================================================================== Configuring PETSc to compile on your system =============================================================================== TESTING: checkCPreprocessor from config.setCompilers(config/BuildSystem/config/setCompilers.py:592) ******************************************************************************* UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): ------------------------------------------------------------------------------- Cannot find a C preprocessor ******************************************************************************* Brian my configure script configure_options = [ # '--with-mpi-dir=/usr/local/opt/open-mpi', '--with-cc=/usr/bin/clang', '--with-cpp=/usr/bin/cpp', '--with-cxx=/usr/bin/clang++', '--with-fc=0', 'COPTFLAGS=-g -framework Accelerate', 'CXXOPTFLAGS=-g -framework Accelerate', 'FOPTFLAGS=-g', # '--with-memalign=64', '--download-hypre=1', '--download-metis=1', '--download-parmetis=1', '--download-c2html=1', '--download-ctetgen', # '--download-viennacl', # '--download-ml=1', '--download-p4est=1', '--download-superlu_dist', '--download-superlu', '--with-cxx-dialect=C++11', '--download-mumps=1', '--download-scalapack=1', # '--download-exodus=1', # '--download-ctetgen=1', '--download-triangle=1', # '--download-pragmatic=1', # '--download-eigen=1', '--download-zlib', '--with-x=1', '--with-sowing=0', '--with-debugging=1', '--with-precision=double', 'PETSC_ARCH=arch-macosx-gnu-g', '--download-chaco' ] if __name__ == '__main__': import sys,os sys.path.insert(0,os.path.abspath('config')) import configure configure.petsc_configure(configure_options) -- Brian Van Straalen Lawrence Berkeley Lab BVStraalen at lbl.gov Computational Research (510) 486-4976 Division (crd.lbl.gov) -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Tue Sep 3 23:08:34 2019 From: balay at mcs.anl.gov (Balay, Satish) Date: Wed, 4 Sep 2019 04:08:34 +0000 Subject: [petsc-users] configuring on OSX In-Reply-To: References: Message-ID: On Tue, 3 Sep 2019, Brian Van Straalen via petsc-users wrote: > pulling from git PETSC and on master branch. > > ./configure CPP=/usr/bin/cpp > =============================================================================== > Configuring PETSc to compile on your system > > =============================================================================== > TESTING: checkCPreprocessor from > config.setCompilers(config/BuildSystem/config/setCompilers.py:592) > > ******************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for > details): > ------------------------------------------------------------------------------- > Cannot find a C preprocessor > ******************************************************************************* Its best to send configure.log > > > > Brian > > my configure script > > configure_options = [ > # '--with-mpi-dir=/usr/local/opt/open-mpi', > '--with-cc=/usr/bin/clang', > '--with-cpp=/usr/bin/cpp', > '--with-cxx=/usr/bin/clang++', The above 3 options are redundant - when --with-mpi-dir is provided. PETSc configure will pick up mpicc etc from the specified location. 
> '--with-fc=0', Hm - this conflicts with --download-mumps etc that require fortran Satish > 'COPTFLAGS=-g -framework Accelerate', > 'CXXOPTFLAGS=-g -framework Accelerate', > 'FOPTFLAGS=-g', > # '--with-memalign=64', > '--download-hypre=1', > '--download-metis=1', > '--download-parmetis=1', > '--download-c2html=1', > '--download-ctetgen', > # '--download-viennacl', > # '--download-ml=1', > '--download-p4est=1', > '--download-superlu_dist', > '--download-superlu', > '--with-cxx-dialect=C++11', > '--download-mumps=1', > '--download-scalapack=1', > # '--download-exodus=1', > # '--download-ctetgen=1', > '--download-triangle=1', > # '--download-pragmatic=1', > # '--download-eigen=1', > '--download-zlib', > '--with-x=1', > '--with-sowing=0', > '--with-debugging=1', > '--with-precision=double', > 'PETSC_ARCH=arch-macosx-gnu-g', > '--download-chaco' > ] > > if __name__ == '__main__': > import sys,os > sys.path.insert(0,os.path.abspath('config')) > import configure > configure.petsc_configure(configure_options) > > > From jed at jedbrown.org Tue Sep 3 23:09:41 2019 From: jed at jedbrown.org (Jed Brown) Date: Tue, 03 Sep 2019 22:09:41 -0600 Subject: [petsc-users] configuring on OSX In-Reply-To: References: Message-ID: <87ef0wya62.fsf@jedbrown.org> We always need configure.log, please. You shouldn't have to set CPP; it's usually preferable to use $(CC) -E anyway. Brian Van Straalen via petsc-users writes: > pulling from git PETSC and on master branch. > > ./configure CPP=/usr/bin/cpp > =============================================================================== > Configuring PETSc to compile on your system > > =============================================================================== > TESTING: checkCPreprocessor from > config.setCompilers(config/BuildSystem/config/setCompilers.py:592) > > ******************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for > details): > ------------------------------------------------------------------------------- > Cannot find a C preprocessor > ******************************************************************************* > > > > Brian > > my configure script > > configure_options = [ > # '--with-mpi-dir=/usr/local/opt/open-mpi', > '--with-cc=/usr/bin/clang', > '--with-cpp=/usr/bin/cpp', > '--with-cxx=/usr/bin/clang++', > '--with-fc=0', > 'COPTFLAGS=-g -framework Accelerate', > 'CXXOPTFLAGS=-g -framework Accelerate', > 'FOPTFLAGS=-g', > # '--with-memalign=64', > '--download-hypre=1', > '--download-metis=1', > '--download-parmetis=1', > '--download-c2html=1', > '--download-ctetgen', > # '--download-viennacl', > # '--download-ml=1', > '--download-p4est=1', > '--download-superlu_dist', > '--download-superlu', > '--with-cxx-dialect=C++11', > '--download-mumps=1', > '--download-scalapack=1', > # '--download-exodus=1', > # '--download-ctetgen=1', > '--download-triangle=1', > # '--download-pragmatic=1', > # '--download-eigen=1', > '--download-zlib', > '--with-x=1', > '--with-sowing=0', > '--with-debugging=1', > '--with-precision=double', > 'PETSC_ARCH=arch-macosx-gnu-g', > '--download-chaco' > ] > > if __name__ == '__main__': > import sys,os > sys.path.insert(0,os.path.abspath('config')) > import configure > configure.petsc_configure(configure_options) > > > -- > Brian Van Straalen Lawrence Berkeley Lab > BVStraalen at lbl.gov Computational Research > (510) 486-4976 Division (crd.lbl.gov) From balay at mcs.anl.gov Tue Sep 3 23:11:55 2019 From: balay at mcs.anl.gov (Balay, Satish) Date: 
Wed, 4 Sep 2019 04:11:55 +0000 Subject: [petsc-users] configuring on OSX In-Reply-To: References: Message-ID: On Wed, 4 Sep 2019, Balay, Satish via petsc-users wrote: > On Tue, 3 Sep 2019, Brian Van Straalen via petsc-users wrote: > > > pulling from git PETSC and on master branch. > > > > ./configure CPP=/usr/bin/cpp > > =============================================================================== > > Configuring PETSc to compile on your system > > > > =============================================================================== > > TESTING: checkCPreprocessor from > > config.setCompilers(config/BuildSystem/config/setCompilers.py:592) > > > > ******************************************************************************* > > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for > > details): > > ------------------------------------------------------------------------------- > > Cannot find a C preprocessor > > ******************************************************************************* > > Its best to send configure.log > > > > > > > > > Brian > > > > my configure script > > > > configure_options = [ > > # '--with-mpi-dir=/usr/local/opt/open-mpi', Hm - this is commented out anyway. So there is no mpi specified for this build (its a mandatory dependency)? > > '--with-cc=/usr/bin/clang', > > '--with-cpp=/usr/bin/cpp', > > '--with-cxx=/usr/bin/clang++', > > The above 3 options are redundant - when --with-mpi-dir is provided. PETSc configure will pick up mpicc etc from the specified location. > > > '--with-fc=0', > > Hm - this conflicts with --download-mumps etc that require fortran > > Satish > > > 'COPTFLAGS=-g -framework Accelerate', > > 'CXXOPTFLAGS=-g -framework Accelerate', > > 'FOPTFLAGS=-g', > > # '--with-memalign=64', > > '--download-hypre=1', > > '--download-metis=1', > > '--download-parmetis=1', > > '--download-c2html=1', > > '--download-ctetgen', > > # '--download-viennacl', > > # '--download-ml=1', > > '--download-p4est=1', > > '--download-superlu_dist', > > '--download-superlu', > > '--with-cxx-dialect=C++11', > > '--download-mumps=1', > > '--download-scalapack=1', > > # '--download-exodus=1', > > # '--download-ctetgen=1', > > '--download-triangle=1', > > # '--download-pragmatic=1', > > # '--download-eigen=1', > > '--download-zlib', > > '--with-x=1', > > '--with-sowing=0', > > '--with-debugging=1', > > '--with-precision=double', > > 'PETSC_ARCH=arch-macosx-gnu-g', > > '--download-chaco' > > ] > > > > if __name__ == '__main__': > > import sys,os > > sys.path.insert(0,os.path.abspath('config')) > > import configure > > configure.petsc_configure(configure_options) > > > > > > > From bvstraalen at lbl.gov Tue Sep 3 23:24:44 2019 From: bvstraalen at lbl.gov (Brian Van Straalen) Date: Tue, 3 Sep 2019 21:24:44 -0700 Subject: [petsc-users] configuring on OSX In-Reply-To: References: Message-ID: I can attach the config log. mpi breaks for yet other reasons so I was trying to simplify things sufficiently. if I put mpi back in and take the compiler choices out, I still end up with failure on CPP. It seems determined to look for "/lib/cpp" which is only correct for linux. On Tue, Sep 3, 2019 at 9:11 PM Balay, Satish wrote: > On Wed, 4 Sep 2019, Balay, Satish via petsc-users wrote: > > > On Tue, 3 Sep 2019, Brian Van Straalen via petsc-users wrote: > > > > > pulling from git PETSC and on master branch. 
> > > > > > ./configure CPP=/usr/bin/cpp > > > > =============================================================================== > > > Configuring PETSc to compile on your system > > > > > > > =============================================================================== > > > TESTING: checkCPreprocessor from > > > config.setCompilers(config/BuildSystem/config/setCompilers.py:592) > > > > > > > ******************************************************************************* > > > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log > for > > > details): > > > > ------------------------------------------------------------------------------- > > > Cannot find a C preprocessor > > > > ******************************************************************************* > > > > Its best to send configure.log > > > > > > > > > > > > > > Brian > > > > > > my configure script > > > > > > configure_options = [ > > > # '--with-mpi-dir=/usr/local/opt/open-mpi', > > Hm - this is commented out anyway. So there is no mpi specified for this > build (its a mandatory dependency)? > > > > '--with-cc=/usr/bin/clang', > > > '--with-cpp=/usr/bin/cpp', > > > '--with-cxx=/usr/bin/clang++', > > > > The above 3 options are redundant - when --with-mpi-dir is provided. > PETSc configure will pick up mpicc etc from the specified location. > > > > > '--with-fc=0', > > > > Hm - this conflicts with --download-mumps etc that require fortran > > > > Satish > > > > > 'COPTFLAGS=-g -framework Accelerate', > > > 'CXXOPTFLAGS=-g -framework Accelerate', > > > 'FOPTFLAGS=-g', > > > # '--with-memalign=64', > > > '--download-hypre=1', > > > '--download-metis=1', > > > '--download-parmetis=1', > > > '--download-c2html=1', > > > '--download-ctetgen', > > > # '--download-viennacl', > > > # '--download-ml=1', > > > '--download-p4est=1', > > > '--download-superlu_dist', > > > '--download-superlu', > > > '--with-cxx-dialect=C++11', > > > '--download-mumps=1', > > > '--download-scalapack=1', > > > # '--download-exodus=1', > > > # '--download-ctetgen=1', > > > '--download-triangle=1', > > > # '--download-pragmatic=1', > > > # '--download-eigen=1', > > > '--download-zlib', > > > '--with-x=1', > > > '--with-sowing=0', > > > '--with-debugging=1', > > > '--with-precision=double', > > > 'PETSC_ARCH=arch-macosx-gnu-g', > > > '--download-chaco' > > > ] > > > > > > if __name__ == '__main__': > > > import sys,os > > > sys.path.insert(0,os.path.abspath('config')) > > > import configure > > > configure.petsc_configure(configure_options) > > > > > > > > > > > > > -- Brian Van Straalen Lawrence Berkeley Lab BVStraalen at lbl.gov Computational Research (510) 486-4976 Division (crd.lbl.gov) -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: application/octet-stream Size: 845641 bytes Desc: not available URL: From bsmith at mcs.anl.gov Tue Sep 3 23:35:30 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Wed, 4 Sep 2019 04:35:30 +0000 Subject: [petsc-users] configuring on OSX In-Reply-To: References: Message-ID: Ahh, the --download-c2html is not needed; it is used only when building all of the PETSc documentation including html versions of the source. For some reason it's configure is not doing a good job of selecting cpp. Over the years it has been amazingly portable, I guess Apple just went a bit too far. 
Anyways just run without that option, Barry > On Sep 3, 2019, at 11:24 PM, Brian Van Straalen via petsc-users wrote: > > I can attach the config log. mpi breaks for yet other reasons so I was trying to simplify things sufficiently. > > if I put mpi back in and take the compiler choices out, I still end up with failure on CPP. It seems determined to look for "/lib/cpp" which is only correct for linux. > > > > > On Tue, Sep 3, 2019 at 9:11 PM Balay, Satish wrote: > On Wed, 4 Sep 2019, Balay, Satish via petsc-users wrote: > > > On Tue, 3 Sep 2019, Brian Van Straalen via petsc-users wrote: > > > > > pulling from git PETSC and on master branch. > > > > > > ./configure CPP=/usr/bin/cpp > > > =============================================================================== > > > Configuring PETSc to compile on your system > > > > > > =============================================================================== > > > TESTING: checkCPreprocessor from > > > config.setCompilers(config/BuildSystem/config/setCompilers.py:592) > > > > > > ******************************************************************************* > > > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for > > > details): > > > ------------------------------------------------------------------------------- > > > Cannot find a C preprocessor > > > ******************************************************************************* > > > > Its best to send configure.log > > > > > > > > > > > > > > Brian > > > > > > my configure script > > > > > > configure_options = [ > > > # '--with-mpi-dir=/usr/local/opt/open-mpi', > > Hm - this is commented out anyway. So there is no mpi specified for this build (its a mandatory dependency)? > > > > '--with-cc=/usr/bin/clang', > > > '--with-cpp=/usr/bin/cpp', > > > '--with-cxx=/usr/bin/clang++', > > > > The above 3 options are redundant - when --with-mpi-dir is provided. PETSc configure will pick up mpicc etc from the specified location. > > > > > '--with-fc=0', > > > > Hm - this conflicts with --download-mumps etc that require fortran > > > > Satish > > > > > 'COPTFLAGS=-g -framework Accelerate', > > > 'CXXOPTFLAGS=-g -framework Accelerate', > > > 'FOPTFLAGS=-g', > > > # '--with-memalign=64', > > > '--download-hypre=1', > > > '--download-metis=1', > > > '--download-parmetis=1', > > > '--download-c2html=1', > > > '--download-ctetgen', > > > # '--download-viennacl', > > > # '--download-ml=1', > > > '--download-p4est=1', > > > '--download-superlu_dist', > > > '--download-superlu', > > > '--with-cxx-dialect=C++11', > > > '--download-mumps=1', > > > '--download-scalapack=1', > > > # '--download-exodus=1', > > > # '--download-ctetgen=1', > > > '--download-triangle=1', > > > # '--download-pragmatic=1', > > > # '--download-eigen=1', > > > '--download-zlib', > > > '--with-x=1', > > > '--with-sowing=0', > > > '--with-debugging=1', > > > '--with-precision=double', > > > 'PETSC_ARCH=arch-macosx-gnu-g', > > > '--download-chaco' > > > ] > > > > > > if __name__ == '__main__': > > > import sys,os > > > sys.path.insert(0,os.path.abspath('config')) > > > import configure > > > configure.petsc_configure(configure_options) > > > > > > > > > > > > > > > -- > Brian Van Straalen Lawrence Berkeley Lab > BVStraalen at lbl.gov Computational Research > (510) 486-4976 Division (crd.lbl.gov) > From bvstraalen at lbl.gov Wed Sep 4 00:41:38 2019 From: bvstraalen at lbl.gov (Brian Van Straalen) Date: Tue, 3 Sep 2019 22:41:38 -0700 Subject: [petsc-users] configuring on OSX In-Reply-To: References: Message-ID: Thanks Barry. 
that gets me further. now it seems the COPTFLAGS are not being propagated from PETSC to HYPRE. I have -framework Accelerate in my COPTFLAGS, but HYPRE still fails looking for BLAS and LAPACK routines. _dgetri, _dgetrs, and so on. Or does PETSc, and Hypre, and SUPERLU all pull in their own blas and lapack source code and this is from me disabling Fortran? I can't use Fortran in PETSc since sowing does not work on OSX as PETSc configures it (claims "configure: error: cannot run C compiled programs." but I know that is not true. I suspect sowing has linux-isms see attached log). is there some way to have PETSc configure just run through and *configure *things, and have a make command *make* things? I have to keep re-issuing the configure command, which prevents me from debugging a build errors. I'm about two days into this port to OSX.... Brian On Tue, Sep 3, 2019 at 9:35 PM Smith, Barry F. wrote: > > Ahh, the --download-c2html is not needed; it is used only when > building all of the PETSc documentation including html versions of the > source. > > For some reason it's configure is not doing a good job of selecting > cpp. Over the years it has been amazingly portable, I guess Apple just went > a bit too far. > > Anyways just run without that option, > > Barry > > > > > On Sep 3, 2019, at 11:24 PM, Brian Van Straalen via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > > > I can attach the config log. mpi breaks for yet other reasons so I was > trying to simplify things sufficiently. > > > > if I put mpi back in and take the compiler choices out, I still end up > with failure on CPP. It seems determined to look for "/lib/cpp" which is > only correct for linux. > > > > > > > > > > On Tue, Sep 3, 2019 at 9:11 PM Balay, Satish wrote: > > On Wed, 4 Sep 2019, Balay, Satish via petsc-users wrote: > > > > > On Tue, 3 Sep 2019, Brian Van Straalen via petsc-users wrote: > > > > > > > pulling from git PETSC and on master branch. > > > > > > > > ./configure CPP=/usr/bin/cpp > > > > > =============================================================================== > > > > Configuring PETSc to compile on your system > > > > > > > > > =============================================================================== > > > > TESTING: checkCPreprocessor from > > > > config.setCompilers(config/BuildSystem/config/setCompilers.py:592) > > > > > > > > > ******************************************************************************* > > > > UNABLE to CONFIGURE with GIVEN OPTIONS (see > configure.log for > > > > details): > > > > > ------------------------------------------------------------------------------- > > > > Cannot find a C preprocessor > > > > > ******************************************************************************* > > > > > > Its best to send configure.log > > > > > > > > > > > > > > > > > > > Brian > > > > > > > > my configure script > > > > > > > > configure_options = [ > > > > # '--with-mpi-dir=/usr/local/opt/open-mpi', > > > > Hm - this is commented out anyway. So there is no mpi specified for this > build (its a mandatory dependency)? > > > > > > '--with-cc=/usr/bin/clang', > > > > '--with-cpp=/usr/bin/cpp', > > > > '--with-cxx=/usr/bin/clang++', > > > > > > The above 3 options are redundant - when --with-mpi-dir is provided. > PETSc configure will pick up mpicc etc from the specified location. 
> > > > > > > '--with-fc=0', > > > > > > Hm - this conflicts with --download-mumps etc that require fortran > > > > > > Satish > > > > > > > 'COPTFLAGS=-g -framework Accelerate', > > > > 'CXXOPTFLAGS=-g -framework Accelerate', > > > > 'FOPTFLAGS=-g', > > > > # '--with-memalign=64', > > > > '--download-hypre=1', > > > > '--download-metis=1', > > > > '--download-parmetis=1', > > > > '--download-c2html=1', > > > > '--download-ctetgen', > > > > # '--download-viennacl', > > > > # '--download-ml=1', > > > > '--download-p4est=1', > > > > '--download-superlu_dist', > > > > '--download-superlu', > > > > '--with-cxx-dialect=C++11', > > > > '--download-mumps=1', > > > > '--download-scalapack=1', > > > > # '--download-exodus=1', > > > > # '--download-ctetgen=1', > > > > '--download-triangle=1', > > > > # '--download-pragmatic=1', > > > > # '--download-eigen=1', > > > > '--download-zlib', > > > > '--with-x=1', > > > > '--with-sowing=0', > > > > '--with-debugging=1', > > > > '--with-precision=double', > > > > 'PETSC_ARCH=arch-macosx-gnu-g', > > > > '--download-chaco' > > > > ] > > > > > > > > if __name__ == '__main__': > > > > import sys,os > > > > sys.path.insert(0,os.path.abspath('config')) > > > > import configure > > > > configure.petsc_configure(configure_options) > > > > > > > > > > > > > > > > > > > > > > > -- > > Brian Van Straalen Lawrence Berkeley Lab > > BVStraalen at lbl.gov Computational Research > > (510) 486-4976 Division (crd.lbl.gov) > > > > -- Brian Van Straalen Lawrence Berkeley Lab BVStraalen at lbl.gov Computational Research (510) 486-4976 Division (crd.lbl.gov) -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: application/octet-stream Size: 957269 bytes Desc: not available URL: From bsmith at mcs.anl.gov Wed Sep 4 00:44:04 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Wed, 4 Sep 2019 05:44:04 +0000 Subject: [petsc-users] configuring on OSX In-Reply-To: References: Message-ID: Brian, Just for kicks could you send arch-macosx-gnu-g/externalpackages/c2html-0.9.4/config.log ? On my Mac I get checking for gcc... gcc checking for C compiler default output... a.out checking whether the C compiler works... yes checking whether we are cross compiling... no checking for executable suffix... checking for object suffix... o checking whether we are using the GNU C compiler... yes checking whether gcc accepts -g... yes checking for a BSD compatible install... /usr/bin/install -c checking for flex... flex checking for yywrap in -lfl... no checking for yywrap in -ll... yes checking lex output file root... lex.yy checking whether yytext is a pointer... yes checking whether make sets ${MAKE}... yes checking for yylex in -lfl... no checking how to run the C preprocessor... gcc -E checking for ANSI C header files... yes checking for unistd.h... yes .... while you get checking for gcc... gcc checking for C compiler default output... a.out checking whether the C compiler works... yes checking whether we are cross compiling... no checking for executable suffix... checking for object suffix... o checking whether we are using the GNU C compiler... yes checking whether gcc accepts -g... yes checking for a BSD compatible install... /usr/bin/install -c checking for flex... flex checking for yywrap in -lfl... no checking for yywrap in -ll... no checking lex output file root... lex.yy checking whether yytext is a pointer... no checking whether make sets ${MAKE}... 
yes checking for yylex in -lfl... no checking how to run the C preprocessor... /lib/cpp ... > On Sep 3, 2019, at 11:35 PM, Smith, Barry F. wrote: > > > Ahh, the --download-c2html is not needed; it is used only when building all of the PETSc documentation including html versions of the source. > > For some reason it's configure is not doing a good job of selecting cpp. Over the years it has been amazingly portable, I guess Apple just went a bit too far. > > Anyways just run without that option, > > Barry > > > >> On Sep 3, 2019, at 11:24 PM, Brian Van Straalen via petsc-users wrote: >> >> I can attach the config log. mpi breaks for yet other reasons so I was trying to simplify things sufficiently. >> >> if I put mpi back in and take the compiler choices out, I still end up with failure on CPP. It seems determined to look for "/lib/cpp" which is only correct for linux. >> >> >> >> >> On Tue, Sep 3, 2019 at 9:11 PM Balay, Satish wrote: >> On Wed, 4 Sep 2019, Balay, Satish via petsc-users wrote: >> >>> On Tue, 3 Sep 2019, Brian Van Straalen via petsc-users wrote: >>> >>>> pulling from git PETSC and on master branch. >>>> >>>> ./configure CPP=/usr/bin/cpp >>>> =============================================================================== >>>> Configuring PETSc to compile on your system >>>> >>>> =============================================================================== >>>> TESTING: checkCPreprocessor from >>>> config.setCompilers(config/BuildSystem/config/setCompilers.py:592) >>>> >>>> ******************************************************************************* >>>> UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for >>>> details): >>>> ------------------------------------------------------------------------------- >>>> Cannot find a C preprocessor >>>> ******************************************************************************* >>> >>> Its best to send configure.log >>> >>>> >>>> >>>> >>>> Brian >>>> >>>> my configure script >>>> >>>> configure_options = [ >>>> # '--with-mpi-dir=/usr/local/opt/open-mpi', >> >> Hm - this is commented out anyway. So there is no mpi specified for this build (its a mandatory dependency)? >> >>>> '--with-cc=/usr/bin/clang', >>>> '--with-cpp=/usr/bin/cpp', >>>> '--with-cxx=/usr/bin/clang++', >>> >>> The above 3 options are redundant - when --with-mpi-dir is provided. PETSc configure will pick up mpicc etc from the specified location. 
>>> >>>> '--with-fc=0', >>> >>> Hm - this conflicts with --download-mumps etc that require fortran >>> >>> Satish >>> >>>> 'COPTFLAGS=-g -framework Accelerate', >>>> 'CXXOPTFLAGS=-g -framework Accelerate', >>>> 'FOPTFLAGS=-g', >>>> # '--with-memalign=64', >>>> '--download-hypre=1', >>>> '--download-metis=1', >>>> '--download-parmetis=1', >>>> '--download-c2html=1', >>>> '--download-ctetgen', >>>> # '--download-viennacl', >>>> # '--download-ml=1', >>>> '--download-p4est=1', >>>> '--download-superlu_dist', >>>> '--download-superlu', >>>> '--with-cxx-dialect=C++11', >>>> '--download-mumps=1', >>>> '--download-scalapack=1', >>>> # '--download-exodus=1', >>>> # '--download-ctetgen=1', >>>> '--download-triangle=1', >>>> # '--download-pragmatic=1', >>>> # '--download-eigen=1', >>>> '--download-zlib', >>>> '--with-x=1', >>>> '--with-sowing=0', >>>> '--with-debugging=1', >>>> '--with-precision=double', >>>> 'PETSC_ARCH=arch-macosx-gnu-g', >>>> '--download-chaco' >>>> ] >>>> >>>> if __name__ == '__main__': >>>> import sys,os >>>> sys.path.insert(0,os.path.abspath('config')) >>>> import configure >>>> configure.petsc_configure(configure_options) >>>> >>>> >>>> >>> >> >> >> >> -- >> Brian Van Straalen Lawrence Berkeley Lab >> BVStraalen at lbl.gov Computational Research >> (510) 486-4976 Division (crd.lbl.gov) >> > From bvstraalen at lbl.gov Wed Sep 4 00:46:28 2019 From: bvstraalen at lbl.gov (Brian Van Straalen) Date: Tue, 3 Sep 2019 22:46:28 -0700 Subject: [petsc-users] configuring on OSX In-Reply-To: References: Message-ID: Attached. c2html config.log On Tue, Sep 3, 2019 at 10:44 PM Smith, Barry F. wrote: > > Brian, > > Just for kicks could you send > arch-macosx-gnu-g/externalpackages/c2html-0.9.4/config.log ? > > On my Mac I get > > checking for gcc... gcc > checking for C compiler default output... a.out > checking whether the C compiler works... yes > checking whether we are cross compiling... no > checking for executable suffix... > checking for object suffix... o > checking whether we are using the GNU C compiler... yes > checking whether gcc accepts -g... yes > checking for a BSD compatible install... /usr/bin/install -c > checking for flex... flex > checking for yywrap in -lfl... no > checking for yywrap in -ll... yes > checking lex output file root... lex.yy > checking whether yytext is a pointer... yes > checking whether make sets ${MAKE}... yes > checking for yylex in -lfl... no > checking how to run the C preprocessor... gcc -E > checking for ANSI C header files... yes > checking for unistd.h... yes > .... > > while you get > > checking for gcc... gcc > checking for C compiler default output... a.out > checking whether the C compiler works... yes > checking whether we are cross compiling... no > checking for executable suffix... > checking for object suffix... o > checking whether we are using the GNU C compiler... yes > checking whether gcc accepts -g... yes > checking for a BSD compatible install... /usr/bin/install -c > checking for flex... flex > checking for yywrap in -lfl... no > checking for yywrap in -ll... no > checking lex output file root... lex.yy > checking whether yytext is a pointer... no > checking whether make sets ${MAKE}... yes > checking for yylex in -lfl... no > checking how to run the C preprocessor... /lib/cpp > ... > > > On Sep 3, 2019, at 11:35 PM, Smith, Barry F. wrote: > > > > > > Ahh, the --download-c2html is not needed; it is used only when > building all of the PETSc documentation including html versions of the > source. 
> > > > For some reason it's configure is not doing a good job of selecting > cpp. Over the years it has been amazingly portable, I guess Apple just went > a bit too far. > > > > Anyways just run without that option, > > > > Barry > > > > > > > >> On Sep 3, 2019, at 11:24 PM, Brian Van Straalen via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >> > >> I can attach the config log. mpi breaks for yet other reasons so I was > trying to simplify things sufficiently. > >> > >> if I put mpi back in and take the compiler choices out, I still end up > with failure on CPP. It seems determined to look for "/lib/cpp" which is > only correct for linux. > >> > >> > >> > >> > >> On Tue, Sep 3, 2019 at 9:11 PM Balay, Satish wrote: > >> On Wed, 4 Sep 2019, Balay, Satish via petsc-users wrote: > >> > >>> On Tue, 3 Sep 2019, Brian Van Straalen via petsc-users wrote: > >>> > >>>> pulling from git PETSC and on master branch. > >>>> > >>>> ./configure CPP=/usr/bin/cpp > >>>> > =============================================================================== > >>>> Configuring PETSc to compile on your system > >>>> > >>>> > =============================================================================== > >>>> TESTING: checkCPreprocessor from > >>>> config.setCompilers(config/BuildSystem/config/setCompilers.py:592) > >>>> > >>>> > ******************************************************************************* > >>>> UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log > for > >>>> details): > >>>> > ------------------------------------------------------------------------------- > >>>> Cannot find a C preprocessor > >>>> > ******************************************************************************* > >>> > >>> Its best to send configure.log > >>> > >>>> > >>>> > >>>> > >>>> Brian > >>>> > >>>> my configure script > >>>> > >>>> configure_options = [ > >>>> # '--with-mpi-dir=/usr/local/opt/open-mpi', > >> > >> Hm - this is commented out anyway. So there is no mpi specified for > this build (its a mandatory dependency)? > >> > >>>> '--with-cc=/usr/bin/clang', > >>>> '--with-cpp=/usr/bin/cpp', > >>>> '--with-cxx=/usr/bin/clang++', > >>> > >>> The above 3 options are redundant - when --with-mpi-dir is provided. > PETSc configure will pick up mpicc etc from the specified location. 
> >>> > >>>> '--with-fc=0', > >>> > >>> Hm - this conflicts with --download-mumps etc that require fortran > >>> > >>> Satish > >>> > >>>> 'COPTFLAGS=-g -framework Accelerate', > >>>> 'CXXOPTFLAGS=-g -framework Accelerate', > >>>> 'FOPTFLAGS=-g', > >>>> # '--with-memalign=64', > >>>> '--download-hypre=1', > >>>> '--download-metis=1', > >>>> '--download-parmetis=1', > >>>> '--download-c2html=1', > >>>> '--download-ctetgen', > >>>> # '--download-viennacl', > >>>> # '--download-ml=1', > >>>> '--download-p4est=1', > >>>> '--download-superlu_dist', > >>>> '--download-superlu', > >>>> '--with-cxx-dialect=C++11', > >>>> '--download-mumps=1', > >>>> '--download-scalapack=1', > >>>> # '--download-exodus=1', > >>>> # '--download-ctetgen=1', > >>>> '--download-triangle=1', > >>>> # '--download-pragmatic=1', > >>>> # '--download-eigen=1', > >>>> '--download-zlib', > >>>> '--with-x=1', > >>>> '--with-sowing=0', > >>>> '--with-debugging=1', > >>>> '--with-precision=double', > >>>> 'PETSC_ARCH=arch-macosx-gnu-g', > >>>> '--download-chaco' > >>>> ] > >>>> > >>>> if __name__ == '__main__': > >>>> import sys,os > >>>> sys.path.insert(0,os.path.abspath('config')) > >>>> import configure > >>>> configure.petsc_configure(configure_options) > >>>> > >>>> > >>>> > >>> > >> > >> > >> > >> -- > >> Brian Van Straalen Lawrence Berkeley Lab > >> BVStraalen at lbl.gov Computational Research > >> (510) 486-4976 Division (crd.lbl.gov) > >> > > > > -- Brian Van Straalen Lawrence Berkeley Lab BVStraalen at lbl.gov Computational Research (510) 486-4976 Division (crd.lbl.gov) -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: config.log Type: application/octet-stream Size: 54373 bytes Desc: not available URL: From bsmith at mcs.anl.gov Wed Sep 4 01:09:04 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Wed, 4 Sep 2019 06:09:04 +0000 Subject: [petsc-users] configuring on OSX In-Reply-To: References: Message-ID: <2C9672A3-D0BE-45AD-A27A-4F8C517DB904@mcs.anl.gov> Based on the config.log you just sent from c2html I see the problem: You have an ancient gcc in your path installed by brew before the Apple "gcc wrapper" that is broken. This is picked up the c2html and sowing configures and thus ends up in failure. configure:1039: checking for gcc configure:1054: found /usr/local/bin/gcc configure:1062: result: gcc configure:1290: checking for C compiler version configure:1293: gcc --version &5 gcc (Homebrew GCC 5.5.0_2) 5.5.0 .... configure:1978: gcc -o conftest -g -O2 conftest.c >&5 lex.yy.c:19:19: fatal error: stdio.h: No such file or directory compilation terminated. configure:1981: $? = 1 You can brew uninstall it, or if you like having broken compilers in your path you can leave it and add the options -download-sowing-cc=/usr/bin/gcc -download-sowing-cxx=/usr/bin/g++ to ./configure see more comments below. (As always please send configure.log immediately on failure; we don't expect users to debug our configure which is why we provide the service of helping decipher its behavior). > On Sep 4, 2019, at 12:41 AM, Brian Van Straalen wrote: > > Thanks Barry. that gets me further. now it seems the COPTFLAGS are not being propagated from PETSC to HYPRE. I have -framework Accelerate in my COPTFLAGS, but HYPRE still fails looking for BLAS and LAPACK routines. _dgetri, _dgetrs, and so on. 
Don't bother with the -framework stuff, on Apple PETSc just picks up the -lblas and -llapack automatically that point to the accelerate libraries. I suspect hypre's configure cannot handle the non-standard -framework stuff. > Or does PETSc, and Hypre, and SUPERLU all pull in their own blas and lapack source code No, PETSc's configure tells all its child packages what blas/lapack to use to make sure they don't select different ones. > and this is from me disabling Fortran? I can't use Fortran in PETSc since sowing does not work on OSX as PETSc configures it (claims "configure: error: cannot run C compiled programs." but I know that is not true. I suspect sowing has linux-isms see attached log). I was going to ask you to send arch-macosx-gnu-g/externalpackages/git.sowing/config.log but I don't need it since the config.log for c2html provided the needed information. > > is there some way to have PETSc configure just run through and configure things, and have a make command make things? Configure has to build the external packages as it runs because it needs information from the built packages to continue on to the next configure task; for example it cannot configure PETSc for hypre until hypre is built. > I have to keep re-issuing the configure command, which prevents me from debugging a build errors. I'm about two days into this port to OSX.... > > Brian > > > On Tue, Sep 3, 2019 at 9:35 PM Smith, Barry F. wrote: > > Ahh, the --download-c2html is not needed; it is used only when building all of the PETSc documentation including html versions of the source. > > For some reason it's configure is not doing a good job of selecting cpp. Over the years it has been amazingly portable, I guess Apple just went a bit too far. > > Anyways just run without that option, > > Barry > > > > > On Sep 3, 2019, at 11:24 PM, Brian Van Straalen via petsc-users wrote: > > > > I can attach the config log. mpi breaks for yet other reasons so I was trying to simplify things sufficiently. > > > > if I put mpi back in and take the compiler choices out, I still end up with failure on CPP. It seems determined to look for "/lib/cpp" which is only correct for linux. > > > > > > > > > > On Tue, Sep 3, 2019 at 9:11 PM Balay, Satish wrote: > > On Wed, 4 Sep 2019, Balay, Satish via petsc-users wrote: > > > > > On Tue, 3 Sep 2019, Brian Van Straalen via petsc-users wrote: > > > > > > > pulling from git PETSC and on master branch. > > > > > > > > ./configure CPP=/usr/bin/cpp > > > > =============================================================================== > > > > Configuring PETSc to compile on your system > > > > > > > > =============================================================================== > > > > TESTING: checkCPreprocessor from > > > > config.setCompilers(config/BuildSystem/config/setCompilers.py:592) > > > > > > > > ******************************************************************************* > > > > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for > > > > details): > > > > ------------------------------------------------------------------------------- > > > > Cannot find a C preprocessor > > > > ******************************************************************************* > > > > > > Its best to send configure.log > > > > > > > > > > > > > > > > > > > Brian > > > > > > > > my configure script > > > > > > > > configure_options = [ > > > > # '--with-mpi-dir=/usr/local/opt/open-mpi', > > > > Hm - this is commented out anyway. 
So there is no mpi specified for this build (its a mandatory dependency)? > > > > > > '--with-cc=/usr/bin/clang', > > > > '--with-cpp=/usr/bin/cpp', > > > > '--with-cxx=/usr/bin/clang++', > > > > > > The above 3 options are redundant - when --with-mpi-dir is provided. PETSc configure will pick up mpicc etc from the specified location. > > > > > > > '--with-fc=0', > > > > > > Hm - this conflicts with --download-mumps etc that require fortran > > > > > > Satish > > > > > > > 'COPTFLAGS=-g -framework Accelerate', > > > > 'CXXOPTFLAGS=-g -framework Accelerate', > > > > 'FOPTFLAGS=-g', > > > > # '--with-memalign=64', > > > > '--download-hypre=1', > > > > '--download-metis=1', > > > > '--download-parmetis=1', > > > > '--download-c2html=1', > > > > '--download-ctetgen', > > > > # '--download-viennacl', > > > > # '--download-ml=1', > > > > '--download-p4est=1', > > > > '--download-superlu_dist', > > > > '--download-superlu', > > > > '--with-cxx-dialect=C++11', > > > > '--download-mumps=1', > > > > '--download-scalapack=1', > > > > # '--download-exodus=1', > > > > # '--download-ctetgen=1', > > > > '--download-triangle=1', > > > > # '--download-pragmatic=1', > > > > # '--download-eigen=1', > > > > '--download-zlib', > > > > '--with-x=1', > > > > '--with-sowing=0', > > > > '--with-debugging=1', > > > > '--with-precision=double', > > > > 'PETSC_ARCH=arch-macosx-gnu-g', > > > > '--download-chaco' > > > > ] > > > > > > > > if __name__ == '__main__': > > > > import sys,os > > > > sys.path.insert(0,os.path.abspath('config')) > > > > import configure > > > > configure.petsc_configure(configure_options) > > > > > > > > > > > > > > > > > > > > > > > -- > > Brian Van Straalen Lawrence Berkeley Lab > > BVStraalen at lbl.gov Computational Research > > (510) 486-4976 Division (crd.lbl.gov) > > > > > > -- > Brian Van Straalen Lawrence Berkeley Lab > BVStraalen at lbl.gov Computational Research > (510) 486-4976 Division (crd.lbl.gov) > From bsmith at mcs.anl.gov Wed Sep 4 01:47:00 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Wed, 4 Sep 2019 06:47:00 +0000 Subject: [petsc-users] configuring on OSX In-Reply-To: References: Message-ID: <60629179-F4AA-451C-9365-84D7F88126E0@mcs.anl.gov> --with-blaslapack-libs="-framework Accelerate" also works (though is not needed). Using COPTFLAGS="-framework Accelerate" doesn't work because hypre doesn't use the C flags when linking the shared libraries hence the -framework Accelerate is not passed to the linker. Barry > On Sep 4, 2019, at 12:41 AM, Brian Van Straalen wrote: > > Thanks Barry. that gets me further. now it seems the COPTFLAGS are not being propagated from PETSC to HYPRE. I have -framework Accelerate in my COPTFLAGS, but HYPRE still fails looking for BLAS and LAPACK routines. _dgetri, _dgetrs, and so on. Or does PETSc, and Hypre, and SUPERLU all pull in their own blas and lapack source code and this is from me disabling Fortran? I can't use Fortran in PETSc since sowing does not work on OSX as PETSc configures it (claims "configure: error: cannot run C compiled programs." but I know that is not true. I suspect sowing has linux-isms see attached log). > > is there some way to have PETSc configure just run through and configure things, and have a make command make things? I have to keep re-issuing the configure command, which prevents me from debugging a build errors. I'm about two days into this port to OSX.... > > Brian > > > On Tue, Sep 3, 2019 at 9:35 PM Smith, Barry F. 
wrote: > > Ahh, the --download-c2html is not needed; it is used only when building all of the PETSc documentation including html versions of the source. > > For some reason it's configure is not doing a good job of selecting cpp. Over the years it has been amazingly portable, I guess Apple just went a bit too far. > > Anyways just run without that option, > > Barry > > > > > On Sep 3, 2019, at 11:24 PM, Brian Van Straalen via petsc-users wrote: > > > > I can attach the config log. mpi breaks for yet other reasons so I was trying to simplify things sufficiently. > > > > if I put mpi back in and take the compiler choices out, I still end up with failure on CPP. It seems determined to look for "/lib/cpp" which is only correct for linux. > > > > > > > > > > On Tue, Sep 3, 2019 at 9:11 PM Balay, Satish wrote: > > On Wed, 4 Sep 2019, Balay, Satish via petsc-users wrote: > > > > > On Tue, 3 Sep 2019, Brian Van Straalen via petsc-users wrote: > > > > > > > pulling from git PETSC and on master branch. > > > > > > > > ./configure CPP=/usr/bin/cpp > > > > =============================================================================== > > > > Configuring PETSc to compile on your system > > > > > > > > =============================================================================== > > > > TESTING: checkCPreprocessor from > > > > config.setCompilers(config/BuildSystem/config/setCompilers.py:592) > > > > > > > > ******************************************************************************* > > > > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for > > > > details): > > > > ------------------------------------------------------------------------------- > > > > Cannot find a C preprocessor > > > > ******************************************************************************* > > > > > > Its best to send configure.log > > > > > > > > > > > > > > > > > > > Brian > > > > > > > > my configure script > > > > > > > > configure_options = [ > > > > # '--with-mpi-dir=/usr/local/opt/open-mpi', > > > > Hm - this is commented out anyway. So there is no mpi specified for this build (its a mandatory dependency)? > > > > > > '--with-cc=/usr/bin/clang', > > > > '--with-cpp=/usr/bin/cpp', > > > > '--with-cxx=/usr/bin/clang++', > > > > > > The above 3 options are redundant - when --with-mpi-dir is provided. PETSc configure will pick up mpicc etc from the specified location. 
> > > > > > > '--with-fc=0', > > > > > > Hm - this conflicts with --download-mumps etc that require fortran > > > > > > Satish > > > > > > > 'COPTFLAGS=-g -framework Accelerate', > > > > 'CXXOPTFLAGS=-g -framework Accelerate', > > > > 'FOPTFLAGS=-g', > > > > # '--with-memalign=64', > > > > '--download-hypre=1', > > > > '--download-metis=1', > > > > '--download-parmetis=1', > > > > '--download-c2html=1', > > > > '--download-ctetgen', > > > > # '--download-viennacl', > > > > # '--download-ml=1', > > > > '--download-p4est=1', > > > > '--download-superlu_dist', > > > > '--download-superlu', > > > > '--with-cxx-dialect=C++11', > > > > '--download-mumps=1', > > > > '--download-scalapack=1', > > > > # '--download-exodus=1', > > > > # '--download-ctetgen=1', > > > > '--download-triangle=1', > > > > # '--download-pragmatic=1', > > > > # '--download-eigen=1', > > > > '--download-zlib', > > > > '--with-x=1', > > > > '--with-sowing=0', > > > > '--with-debugging=1', > > > > '--with-precision=double', > > > > 'PETSC_ARCH=arch-macosx-gnu-g', > > > > '--download-chaco' > > > > ] > > > > > > > > if __name__ == '__main__': > > > > import sys,os > > > > sys.path.insert(0,os.path.abspath('config')) > > > > import configure > > > > configure.petsc_configure(configure_options) > > > > > > > > > > > > > > > > > > > > > > > -- > > Brian Van Straalen Lawrence Berkeley Lab > > BVStraalen at lbl.gov Computational Research > > (510) 486-4976 Division (crd.lbl.gov) > > > > > > -- > Brian Van Straalen Lawrence Berkeley Lab > BVStraalen at lbl.gov Computational Research > (510) 486-4976 Division (crd.lbl.gov) > From bvstraalen at lbl.gov Wed Sep 4 03:19:56 2019 From: bvstraalen at lbl.gov (Brian Van Straalen) Date: Wed, 4 Sep 2019 01:19:56 -0700 Subject: [petsc-users] configuring on OSX In-Reply-To: <60629179-F4AA-451C-9365-84D7F88126E0@mcs.anl.gov> References: <60629179-F4AA-451C-9365-84D7F88126E0@mcs.anl.gov> Message-ID: Thanks, I'll check it out. Glad you are unbound by the rising and setting of the sun :-) Brian On Tue, Sep 3, 2019 at 11:47 PM Smith, Barry F. wrote: > > --with-blaslapack-libs="-framework Accelerate" also works (though is > not needed). > > Using COPTFLAGS="-framework Accelerate" doesn't work because hypre > doesn't use the C flags when linking the shared libraries hence the > -framework Accelerate is not passed to the linker. > > Barry > > > > On Sep 4, 2019, at 12:41 AM, Brian Van Straalen > wrote: > > > > Thanks Barry. that gets me further. now it seems the COPTFLAGS are > not being propagated from PETSC to HYPRE. I have -framework Accelerate in > my COPTFLAGS, but HYPRE still fails looking for BLAS and LAPACK routines. > _dgetri, _dgetrs, and so on. Or does PETSc, and Hypre, and SUPERLU all > pull in their own blas and lapack source code and this is from me disabling > Fortran? I can't use Fortran in PETSc since sowing does not work on OSX as > PETSc configures it (claims "configure: error: cannot run C compiled > programs." but I know that is not true. I suspect sowing has linux-isms > see attached log). > > > > is there some way to have PETSc configure just run through and configure > things, and have a make command make things? I have to keep re-issuing > the configure command, which prevents me from debugging a build errors. > I'm about two days into this port to OSX.... > > > > Brian > > > > > > On Tue, Sep 3, 2019 at 9:35 PM Smith, Barry F. 
> wrote: > > > > Ahh, the --download-c2html is not needed; it is used only when > building all of the PETSc documentation including html versions of the > source. > > > > For some reason it's configure is not doing a good job of selecting > cpp. Over the years it has been amazingly portable, I guess Apple just went > a bit too far. > > > > Anyways just run without that option, > > > > Barry > > > > > > > > > On Sep 3, 2019, at 11:24 PM, Brian Van Straalen via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > > > > > I can attach the config log. mpi breaks for yet other reasons so I > was trying to simplify things sufficiently. > > > > > > if I put mpi back in and take the compiler choices out, I still end up > with failure on CPP. It seems determined to look for "/lib/cpp" which is > only correct for linux. > > > > > > > > > > > > > > > On Tue, Sep 3, 2019 at 9:11 PM Balay, Satish > wrote: > > > On Wed, 4 Sep 2019, Balay, Satish via petsc-users wrote: > > > > > > > On Tue, 3 Sep 2019, Brian Van Straalen via petsc-users wrote: > > > > > > > > > pulling from git PETSC and on master branch. > > > > > > > > > > ./configure CPP=/usr/bin/cpp > > > > > > =============================================================================== > > > > > Configuring PETSc to compile on your system > > > > > > > > > > > =============================================================================== > > > > > TESTING: checkCPreprocessor from > > > > > config.setCompilers(config/BuildSystem/config/setCompilers.py:592) > > > > > > > > > > > ******************************************************************************* > > > > > UNABLE to CONFIGURE with GIVEN OPTIONS (see > configure.log for > > > > > details): > > > > > > ------------------------------------------------------------------------------- > > > > > Cannot find a C preprocessor > > > > > > ******************************************************************************* > > > > > > > > Its best to send configure.log > > > > > > > > > > > > > > > > > > > > > > > > Brian > > > > > > > > > > my configure script > > > > > > > > > > configure_options = [ > > > > > # '--with-mpi-dir=/usr/local/opt/open-mpi', > > > > > > Hm - this is commented out anyway. So there is no mpi specified for > this build (its a mandatory dependency)? > > > > > > > > '--with-cc=/usr/bin/clang', > > > > > '--with-cpp=/usr/bin/cpp', > > > > > '--with-cxx=/usr/bin/clang++', > > > > > > > > The above 3 options are redundant - when --with-mpi-dir is provided. > PETSc configure will pick up mpicc etc from the specified location. 
> > > > > > > > > '--with-fc=0', > > > > > > > > Hm - this conflicts with --download-mumps etc that require fortran > > > > > > > > Satish > > > > > > > > > 'COPTFLAGS=-g -framework Accelerate', > > > > > 'CXXOPTFLAGS=-g -framework Accelerate', > > > > > 'FOPTFLAGS=-g', > > > > > # '--with-memalign=64', > > > > > '--download-hypre=1', > > > > > '--download-metis=1', > > > > > '--download-parmetis=1', > > > > > '--download-c2html=1', > > > > > '--download-ctetgen', > > > > > # '--download-viennacl', > > > > > # '--download-ml=1', > > > > > '--download-p4est=1', > > > > > '--download-superlu_dist', > > > > > '--download-superlu', > > > > > '--with-cxx-dialect=C++11', > > > > > '--download-mumps=1', > > > > > '--download-scalapack=1', > > > > > # '--download-exodus=1', > > > > > # '--download-ctetgen=1', > > > > > '--download-triangle=1', > > > > > # '--download-pragmatic=1', > > > > > # '--download-eigen=1', > > > > > '--download-zlib', > > > > > '--with-x=1', > > > > > '--with-sowing=0', > > > > > '--with-debugging=1', > > > > > '--with-precision=double', > > > > > 'PETSC_ARCH=arch-macosx-gnu-g', > > > > > '--download-chaco' > > > > > ] > > > > > > > > > > if __name__ == '__main__': > > > > > import sys,os > > > > > sys.path.insert(0,os.path.abspath('config')) > > > > > import configure > > > > > configure.petsc_configure(configure_options) > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > Brian Van Straalen Lawrence Berkeley Lab > > > BVStraalen at lbl.gov Computational Research > > > (510) 486-4976 Division (crd.lbl.gov) > > > > > > > > > > > -- > > Brian Van Straalen Lawrence Berkeley Lab > > BVStraalen at lbl.gov Computational Research > > (510) 486-4976 Division (crd.lbl.gov) > > > > -- Brian Van Straalen Lawrence Berkeley Lab BVStraalen at lbl.gov Computational Research (510) 486-4976 Division (crd.lbl.gov) -------------- next part -------------- An HTML attachment was scrubbed... URL: From kostas.kontzialis at gmail.com Wed Sep 4 19:06:42 2019 From: kostas.kontzialis at gmail.com (Konstantinos Kontzialis) Date: Wed, 4 Sep 2019 20:06:42 -0400 Subject: [petsc-users] =?utf-8?q?Problem_running_perc=C3=A9?= Message-ID: <739686E6-22D4-43C6-9714-36EA9342E86A@gmail.com> Hi all, I try to run my code with petsc on Fedora. It?s a parallel CFD code and I write on a command line the following: mpirun -np 4 ./executable input-file but when I do that I get my code run serially but four different processes and not in parallel. I use MPI_Comm_rank after petscinitialize and I get 0 for every process. Why is this happening? Kostas From knepley at gmail.com Wed Sep 4 19:11:55 2019 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 4 Sep 2019 20:11:55 -0400 Subject: [petsc-users] =?utf-8?q?Problem_running_perc=C3=A9?= In-Reply-To: <739686E6-22D4-43C6-9714-36EA9342E86A@gmail.com> References: <739686E6-22D4-43C6-9714-36EA9342E86A@gmail.com> Message-ID: On Wed, Sep 4, 2019 at 8:07 PM Konstantinos Kontzialis via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hi all, > > I try to run my code with petsc on Fedora. It?s a parallel CFD code and I > write on a command line the following: > > mpirun -np 4 ./executable input-file > > but when I do that I get my code run serially but four different processes > and not in parallel. I use MPI_Comm_rank after petscinitialize and I get 0 > for every process. > > Why is this happening? > It is likely that 'mpirun' in the your path does not match the MPI you built PETSc with. 
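As an illustration (not part of the original reply), a small program along these lines can confirm which MPI the executable actually runs under. MPI_Get_library_version assumes an MPI-3 implementation; the file name and setup are placeholders. If PETSc was configured with --download-mpich or --download-openmpi, the matching mpiexec is typically installed under $PETSC_DIR/$PETSC_ARCH/bin.

  #include <petsc.h>

  int main(int argc, char **argv)
  {
    PetscErrorCode ierr;
    PetscMPIInt    rank, size;
    char           mpiver[MPI_MAX_LIBRARY_VERSION_STRING];
    int            len;

    ierr = PetscInitialize(&argc,&argv,NULL,NULL); if (ierr) return ierr;
    ierr = MPI_Comm_rank(PETSC_COMM_WORLD,&rank);CHKERRQ(ierr);
    ierr = MPI_Comm_size(PETSC_COMM_WORLD,&size);CHKERRQ(ierr);
    ierr = MPI_Get_library_version(mpiver,&len);CHKERRQ(ierr);
    /* Under a mismatched mpirun, each launched copy reports "rank 0 of 1". */
    ierr = PetscSynchronizedPrintf(PETSC_COMM_WORLD,"rank %d of %d, MPI library: %s\n",rank,size,mpiver);CHKERRQ(ierr);
    ierr = PetscSynchronizedFlush(PETSC_COMM_WORLD,PETSC_STDOUT);CHKERRQ(ierr);
    ierr = PetscFinalize();
    return ierr;
  }

If all four copies report a communicator size of 1, the launcher in the path does not belong to the MPI that PETSc was built against.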
Thanks, Matt > Kostas -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Wed Sep 4 23:53:23 2019 From: balay at mcs.anl.gov (Balay, Satish) Date: Thu, 5 Sep 2019 04:53:23 +0000 Subject: [petsc-users] [petsc-dev] IMPORTANT PETSc repository changed from Bitbucket to GitLab In-Reply-To: References: <87D5FE0A-00FB-4A5D-82D1-9D8BAA31F258@mcs.anl.gov> <59054920-7BD3-48BD-B9F6-E9724429155F@mcs.anl.gov> Message-ID: On Thu, 22 Aug 2019, Balay, Satish via petsc-users wrote: > > > > Please do not make pull requests to the Gitlab site yet; we will be manually processing the PRs from the BitBucket site over the next couple of > > days as we implement the testing. > > > > Please be patient, this is all new to us and it may take a few days to get out all the glitches. > > Just an update: > > We are still in the process of setting up the CI at gitlab. So we are > not yet ready to process PRs [or Merge Requests (MRs) in gitlab terminology] > > As of now - we have the old jenkins equivalent [and a few additional] > tests working with the gitlab setup, i.e. > > https://gitlab.com/petsc/petsc/pipelines/77669506 > > But we are yet to migrate all the regular [aka next] tests to this > infrastructure. All, We now have a preliminary CI in place that can process merge requests (MRs) - so please go ahead and submit them. Satish From hongzhang at anl.gov Thu Sep 5 16:04:18 2019 From: hongzhang at anl.gov (Zhang, Hong) Date: Thu, 5 Sep 2019 21:04:18 +0000 Subject: [petsc-users] is TS_EQ_DAE_SEMI_EXPLICIT_INDEX functional In-Reply-To: References: Message-ID: You do not need to worry about these equation flags in the beginning. To solve a DAE, you need to define IFunction and IJacobian instead of RHSFunction. ts/examples/tutorials/ex19.c is the simplest DAE example that you can refer to. Hong (Mr.) > On Sep 2, 2019, at 4:51 AM, Huck, Moritz via petsc-users wrote: > > Hi, > TS_EQ_DAE_SEMI_EXPLICIT_INDEX(?) are defined in TSEquationType but not mentioned in the manual. > Is this feature functional ? > If yes how do I have to define the RHSFunction? > (I am asking since the ODE variant has it defined as G= M^-1 g, which cannot work for a DAE) > > Best Regards, > Moritz From hongzhang at anl.gov Thu Sep 5 16:16:17 2019 From: hongzhang at anl.gov (Zhang, Hong) Date: Thu, 5 Sep 2019 21:16:17 +0000 Subject: [petsc-users] Problem with TS and SNES VI In-Reply-To: References: Message-ID: Where does your DAE come from? If the DAE does not come from PDE discretization, there are a few things that you might want to try. 1. Use direct solvers (-pc_type lu, some third-party solvers can also be used by specifying -pc_factor_mat_solver_type xxx); 2. Adjust SNES tolerances; 3. Adjust TS tolerances if you use adaptive time stepping or reduce the stepsize with -ts_dt if fixed stepsize is used. Hong (Mr.) > On Aug 5, 2019, at 4:16 AM, Huck, Moritz via petsc-users wrote: > > Hi, > I am trying to solve a DAE with the ARKIMEX solver, which works mostly fine. > The problem arises when some states go to unphysical values. I try to constrain my states with SNESVISetVariableBounds (through the petsc4py interface). > But TS seems not to respect this, e.g. I have a state which is usually between 1 and 1e3 for which I set a lower bound of 1, but the state goes to -0.8 at some points. > Are there some tolerances I have to set for VI or something like this? > > Best Regards, > Moritz
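To make the IFunction/IJacobian advice in the reply above concrete, here is a minimal sketch (an editorial illustration, not taken from ex19.c) for a toy index-1 DAE with one differential equation, u0' = -u0 + u1, and one algebraic constraint, u1 = u0*u0, written for a two-unknown serial problem. The residual combines both rows, and the Jacobian supplied to TS is shift*dF/dUdot + dF/dU:

  #include <petscts.h>

  /* Residual F(t,U,Udot) = 0 for the toy index-1 DAE:
       udot0 + u0 - u1 = 0   (differential equation)
       u1 - u0*u0      = 0   (algebraic constraint, no Udot dependence) */
  static PetscErrorCode IFunction(TS ts, PetscReal t, Vec U, Vec Udot, Vec F, void *ctx)
  {
    const PetscScalar *u, *udot;
    PetscScalar       *f;
    PetscErrorCode    ierr;

    PetscFunctionBeginUser;
    ierr = VecGetArrayRead(U,&u);CHKERRQ(ierr);
    ierr = VecGetArrayRead(Udot,&udot);CHKERRQ(ierr);
    ierr = VecGetArray(F,&f);CHKERRQ(ierr);
    f[0] = udot[0] + u[0] - u[1];
    f[1] = u[1] - u[0]*u[0];
    ierr = VecRestoreArray(F,&f);CHKERRQ(ierr);
    ierr = VecRestoreArrayRead(Udot,&udot);CHKERRQ(ierr);
    ierr = VecRestoreArrayRead(U,&u);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }

  /* Jacobian requested by TS: J = shift*dF/dUdot + dF/dU */
  static PetscErrorCode IJacobian(TS ts, PetscReal t, Vec U, Vec Udot, PetscReal shift, Mat J, Mat P, void *ctx)
  {
    const PetscScalar *u;
    PetscInt          idx[2] = {0,1};
    PetscScalar       v[4];
    PetscErrorCode    ierr;

    PetscFunctionBeginUser;
    ierr = VecGetArrayRead(U,&u);CHKERRQ(ierr);
    v[0] = shift + 1.0; v[1] = -1.0;      /* row 0: differential equation    */
    v[2] = -2.0*u[0];   v[3] = 1.0;       /* row 1: algebraic, no shift term */
    ierr = VecRestoreArrayRead(U,&u);CHKERRQ(ierr);
    ierr = MatSetValues(P,2,idx,2,idx,v,INSERT_VALUES);CHKERRQ(ierr);
    ierr = MatAssemblyBegin(P,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    ierr = MatAssemblyEnd(P,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    if (J != P) {
      ierr = MatAssemblyBegin(J,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
      ierr = MatAssemblyEnd(J,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    }
    PetscFunctionReturn(0);
  }

  /* Registration (F is a residual work vector, J the Jacobian matrix):
       TSSetIFunction(ts,F,IFunction,NULL);
       TSSetIJacobian(ts,J,J,IJacobian,NULL);  */

The algebraic row simply contributes no Udot term, which is what distinguishes the DAE residual from the G = M^-1 g right-hand-side form used for ODEs.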
From balay at mcs.anl.gov Fri Sep 6 08:38:29 2019 From: balay at mcs.anl.gov (Balay, Satish) Date: Fri, 6 Sep 2019 13:38:29 +0000 Subject: [petsc-users] [petsc-dev] IMPORTANT PETSc repository changed from Bitbucket to GitLab In-Reply-To: References: <87D5FE0A-00FB-4A5D-82D1-9D8BAA31F258@mcs.anl.gov> <59054920-7BD3-48BD-B9F6-E9724429155F@mcs.anl.gov> Message-ID: On Thu, 5 Sep 2019, Balay, Satish via petsc-dev wrote: > On Thu, 22 Aug 2019, Balay, Satish via petsc-users wrote: > All, > > We now have a preliminary CI in place that can process merge requests (MRs) - so please go ahead and submit them. We have a notice on 'Data Center Outage, starting 5PM CST on Sep 6 - to sometime Sep 9'. So PETSc CI will likely not be working during this time [as (some of) the test machines go down], i.e. MRs won't get tested during this outage. Satish From mpovolot at purdue.edu Fri Sep 6 15:11:57 2019 From: mpovolot at purdue.edu (Povolotskyi, Mykhailo) Date: Fri, 6 Sep 2019 20:11:57 +0000 Subject: [petsc-users] question about CISS Message-ID: Hello, I have been experimenting with CISS by computing part of the spectrum of a complex matrix of rather small size (774). I have compared two spectral regions: the circle and the ring. The results were compared against LAPACK. The attached figure shows that the accuracy with the ring is low compared to the circle. It looks like if the ring is used, the eigenvalues are found, but not accurately. The circle region works quite well. This is what I'm doing:

      EPS     eps;
      EPSType type;
      RG      rg;
      EPSCreate(MPI_COMM_SELF,&eps);
      EPSSetOperators(eps,matrix_petsc,NULL);
      EPSSetType(eps,EPSCISS);
      EPSSetProblemType(eps, EPS_NHEP);
      EPSSetFromOptions(eps);
      EPSGetRG(eps,&rg);
      RGSetType(rg,RGRING);
      double vscale(1.0);
      double start_ang(0);
      double end_ang(1.0);
      RGRingSetParameters(rg,center,radius,vscale,start_ang,end_ang,width);
      EPSSolve(eps);

Could you, please, advise me on this problem? Thank you, Michael. -------------- next part -------------- A non-text attachment was scrubbed... Name: circle_vs_ring2.png Type: image/png Size: 236840 bytes Desc: circle_vs_ring2.png URL: From jpapp at craft-tech.com Fri Sep 6 16:11:30 2019 From: jpapp at craft-tech.com (John L. Papp) Date: Fri, 6 Sep 2019 17:11:30 -0400 Subject: [petsc-users] Block Tridiagonal Solver Message-ID: <5bc1a718-c6a3-3293-2a58-41d8f7fbfc68@craft-tech.com> Hello, I need a parallel block tridiagonal solver and thought PETSc would be perfect. However, there seems to be no specific example showing exactly which VecCreate and MatCreate functions to use. I searched the archive and the web and there are no explicit block tridiagonal examples (although the ex23.c example solves a tridiagonal matrix) and the manual is vague on the subject. So a couple of questions: 1. Is it better to create a monolithic matrix (MatCreateAIJ) and vector (VecCreate)? 2. Is it better to create a block matrix (MatCreateBAIJ) and vector (VecCreate and then VecSetBlockSize or is there an equivalent block vector create)? 3. What is the best parallel solver(s) to invert Dx=b when D is a block tridiagonal matrix? If this helps, each row will be owned by the same process. In other words, the data used to fill the [A] [B] [C] block matrices in a row of the D block tridiagonal matrix will reside on the same process. Hence, I don't need to store the individual [A], [B], and [C] block matrices in parallel, just the overall block tridiagonal matrix on a row-by-row basis. Thanks in advance, John -- ************************************************************** Dr. John Papp Senior Research Scientist CRAFT Tech. 6210 Kellers Church Road Pipersville, PA 18947 Email: jpapp at craft-tech.com Phone: (215) 766-1520 Fax : (215) 766-1524 Web : http://www.craft-tech.com ************************************************************** -------------- next part -------------- An HTML attachment was scrubbed... URL:
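Regarding question 2 above, a hedged sketch (not an official PETSc example) of assembling a distributed block-tridiagonal matrix with MatCreateBAIJ() and MatSetValuesBlocked() and then solving with KSP. The placeholder blocks form a block analogue of the [-1 2 -1] tridiagonal so the sketch actually runs; d_nz=3 and o_nz=1 are per-block-row preallocation counts, and MatCreateVecs() should return vectors with the matching blocked layout, so a separate VecSetBlockSize() call should not be needed here.

  #include <petscksp.h>

  /* Assemble and solve D x = b where D is block tridiagonal with nb block rows
     of block size bs.  diagblk/offblk stand in for the user's [B] and [A],[C]. */
  static PetscErrorCode SolveBlockTridiagonal(MPI_Comm comm, PetscInt nb, PetscInt bs)
  {
    Mat            D;
    Vec            x, b;
    KSP            ksp;
    PetscInt       i, k, row, col, Istart, Iend;
    PetscScalar    *diagblk, *offblk;
    PetscErrorCode ierr;

    PetscFunctionBeginUser;
    ierr = PetscCalloc2(bs*bs,&diagblk,bs*bs,&offblk);CHKERRQ(ierr);
    for (k = 0; k < bs; k++) diagblk[k*bs+k] = 2.0;   /* placeholder [B] = 2*I  */
    for (k = 0; k < bs; k++) offblk[k*bs+k]  = -1.0;  /* placeholder [A]=[C]=-I */
    /* at most 3 blocks per block row in the diagonal portion, 1 in the off-process portion */
    ierr = MatCreateBAIJ(comm,bs,PETSC_DECIDE,PETSC_DECIDE,nb*bs,nb*bs,3,NULL,1,NULL,&D);CHKERRQ(ierr);
    ierr = MatGetOwnershipRange(D,&Istart,&Iend);CHKERRQ(ierr);
    for (i = Istart/bs; i < Iend/bs; i++) {           /* locally owned block rows */
      row = i;
      if (i > 0)    { col = i-1; ierr = MatSetValuesBlocked(D,1,&row,1,&col,offblk,INSERT_VALUES);CHKERRQ(ierr); }
      col = i;                   ierr = MatSetValuesBlocked(D,1,&row,1,&col,diagblk,INSERT_VALUES);CHKERRQ(ierr);
      if (i < nb-1) { col = i+1; ierr = MatSetValuesBlocked(D,1,&row,1,&col,offblk,INSERT_VALUES);CHKERRQ(ierr); }
    }
    ierr = MatAssemblyBegin(D,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    ierr = MatAssemblyEnd(D,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    ierr = MatCreateVecs(D,&x,&b);CHKERRQ(ierr);      /* vectors with the matrix's blocked layout */
    ierr = VecSet(b,1.0);CHKERRQ(ierr);
    ierr = KSPCreate(comm,&ksp);CHKERRQ(ierr);
    ierr = KSPSetOperators(ksp,D,D);CHKERRQ(ierr);
    ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);      /* e.g. -ksp_type gmres -pc_type bjacobi */
    ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);
    ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
    ierr = VecDestroy(&x);CHKERRQ(ierr);
    ierr = VecDestroy(&b);CHKERRQ(ierr);
    ierr = MatDestroy(&D);CHKERRQ(ierr);
    ierr = PetscFree2(diagblk,offblk);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }

For question 3, the block structure mainly helps the preconditioner (BAIJ factorizations work block-wise), and the best KSP/PC combination depends on where the blocks come from, so it is worth experimenting with -ksp_type and -pc_type at run time.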
Hence, I don't need to store the individual [A], [B], and [C] block matrices in parallel, just the over all block tridiagonal matrix on a row by row basis. Thanks in advance, John -- ************************************************************** Dr. John Papp Senior Research Scientist CRAFT Tech. 6210 Kellers Church Road Pipersville, PA 18947 Email: jpapp at craft-tech.com Phone: (215) 766-1520 Fax : (215) 766-1524 Web : http://www.craft-tech.com ************************************************************** -------------- next part -------------- An HTML attachment was scrubbed... URL: From kostas.kontzialis at gmail.com Fri Sep 6 16:21:04 2019 From: kostas.kontzialis at gmail.com (Konstantinos Kontzialis) Date: Fri, 6 Sep 2019 17:21:04 -0400 Subject: [petsc-users] error at run time Message-ID: Hi all, I run my code as follows: mpiexec -np 2 ./hoac test1.cgns and I get the following error: ./hoac: error while loading shared libraries: libpetsc.so.3.11: cannot open shared object file: No such file or directory ./hoac: error while loading shared libraries: libpetsc.so.3.11: cannot open shared object file: No such file or directory but the libpetsc.so.3.11 exists. I ls my local petsc installation at: ls -l $PETSC_DIR/$PETSC_ARCH/lib and I get -rw-r--r--. 1 ** ** 8955614 Sep 4 23:15 libcmumps.a -rw-r--r--. 1 ** ** 8900950 Sep 4 23:15 libdmumps.a -rw-r--r--. 1 ** ** 1655112 Sep 4 23:03 libfblas.a -rw-r--r--. 1 ** ** 29819054 Sep 4 23:03 libflapack.a -rw-r--r--. 1 ** ** 110570758 Sep 4 23:16 libHYPRE.a -rwxr-xr-x. 1 ** ** 1996784 Sep 4 23:05 libmetis.so -rw-r--r--. 1 ** ** 3268440 Sep 4 23:15 libmumps_common.a -rwxr-xr-x. 1 ** ** 1358144 Sep 4 23:05 libparmetis.so lrwxrwxrwx. 1 ** ** 18 Sep 4 23:20 libpetsc.so -> libpetsc.so.3.11.3 lrwxrwxrwx. 1 ** ** 18 Sep 4 23:20 libpetsc.so.3.11 -> libpetsc.so.3.11.3 -rwxrwxr-x. 1 ** ** 111015280 Sep 4 23:20 libpetsc.so.3.11.3 -rw-r--r--. 1 ** ** 2026784 Sep 4 23:15 libpord.a -rw-r--r--. 1 ** ** 80737926 Sep 4 23:12 libscalapack.a -rw-r--r--. 1 ** ** 8902046 Sep 4 23:15 libsmumps.a -rw-r--r--. 1 ** ** 8973822 Sep 4 23:15 libzmumps.a drwxr-xr-x. 3 ** ** 4096 Sep 4 22:55 petsc drwxr-xr-x. 2 ** ** 4096 Sep 4 23:03 pkgconfig and I see that the libepetsc.so.3.11 exists as a link. Why do I get this error then? Regards, Kostas -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Fri Sep 6 16:32:48 2019 From: jed at jedbrown.org (Jed Brown) Date: Fri, 06 Sep 2019 15:32:48 -0600 Subject: [petsc-users] Block Tridiagonal Solver In-Reply-To: <5bc1a718-c6a3-3293-2a58-41d8f7fbfc68@craft-tech.com> References: <5bc1a718-c6a3-3293-2a58-41d8f7fbfc68@craft-tech.com> Message-ID: <87ftl9hzzz.fsf@jedbrown.org> Where do your tridiagonal systems come from? Do you need to solve one at a time, or batches of tridiagonal problems? Although it is not in PETSc, we have some work on solving the sort of tridiagonal systems that arise in compact discretizations, which it turns out can be solved much faster than generic tridiagonal problems. https://tridiaglu.github.io/index.html "John L. Papp via petsc-users" writes: > Hello, > > I need a parallel block tridiagonal solver and thought PETSc would be > perfect.? However, there seems to be no specific example showing exactly > which VecCreate and MatCreate functions to use.? I searched the archive > and the web and there is no explicit block tridiagonal examples > (although ex23.c example solves a tridiagonal matrix) and the manual is > vague on the subject.? So a couple of questions: > > 1. 
Is it better to create a monolithic matrix (MatCreateAIJ) and vector > (VecCreate)? > 2. Is it better to create a block matrix (MatCreateBAIJ) and vector > (VecCreate and then VecSetBlockSize or is there an equivalent block > vector create)? > 3. What is the best parallel solver(s) to invert the Dx=b when D is a > block tridiagonal matrix? > > If this helps, each row will be owned by the same process.? In other > words, the data used to fill the [A] [B] [C] block matrices in a row of > the D block tridiagonal matrix will reside on the same process.? Hence, > I don't need to store the individual [A], [B], and [C] block matrices in > parallel, just the over all block tridiagonal matrix on a row by row basis. > > Thanks in advance, > > John > > -- > ************************************************************** > Dr. John Papp > Senior Research Scientist > CRAFT Tech. > 6210 Kellers Church Road > Pipersville, PA 18947 > > Email: jpapp at craft-tech.com > Phone: (215) 766-1520 > Fax : (215) 766-1524 > Web : http://www.craft-tech.com > > ************************************************************** From swarnava89 at gmail.com Fri Sep 6 17:05:29 2019 From: swarnava89 at gmail.com (Swarnava Ghosh) Date: Fri, 6 Sep 2019 15:05:29 -0700 Subject: [petsc-users] DMPlex cell number containing a point in space Message-ID: Dear Petsc developers and users, I have a DMPlex mesh in 3D. Given a point with (x,y,z) coordinates, I am trying the find the cell number in which this point lies, and the vertices of the cell. Is there any DMPlex function that will give me the cell number? Thank you, SG -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sat Sep 7 01:40:48 2019 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 7 Sep 2019 02:40:48 -0400 Subject: [petsc-users] error at run time In-Reply-To: References: Message-ID: On Fri, Sep 6, 2019 at 5:22 PM Konstantinos Kontzialis via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hi all, > > I run my code as follows: > > mpiexec -np 2 ./hoac test1.cgns > > and I get the following error: > > ./hoac: error while loading shared libraries: libpetsc.so.3.11: cannot > open shared object file: No such file or directory > ./hoac: error while loading shared libraries: libpetsc.so.3.11: cannot > open shared object file: No such file or directory > I assume this is Linux. You can fix this at least two ways: 1) You can fix your link by using -rpath to point to the PETSc library 2) You can put the PETSc library location in your LD_LIBRARY_PATH env variable Thanks, Matt > but the libpetsc.so.3.11 exists. > > I ls my local petsc installation at: > > ls -l $PETSC_DIR/$PETSC_ARCH/lib > > and I get > > -rw-r--r--. 1 ** ** 8955614 Sep 4 23:15 libcmumps.a > -rw-r--r--. 1 ** ** 8900950 Sep 4 23:15 libdmumps.a > -rw-r--r--. 1 ** ** 1655112 Sep 4 23:03 libfblas.a > -rw-r--r--. 1 ** ** 29819054 Sep 4 23:03 libflapack.a > -rw-r--r--. 1 ** ** 110570758 Sep 4 23:16 libHYPRE.a > -rwxr-xr-x. 1 ** ** 1996784 Sep 4 23:05 libmetis.so > -rw-r--r--. 1 ** ** 3268440 Sep 4 23:15 libmumps_common.a > -rwxr-xr-x. 1 ** ** 1358144 Sep 4 23:05 libparmetis.so > lrwxrwxrwx. 1 ** ** 18 Sep 4 23:20 libpetsc.so -> > libpetsc.so.3.11.3 > lrwxrwxrwx. 1 ** ** 18 Sep 4 23:20 libpetsc.so.3.11 -> > libpetsc.so.3.11.3 > -rwxrwxr-x. 1 ** ** 111015280 Sep 4 23:20 libpetsc.so.3.11.3 > -rw-r--r--. 1 ** ** 2026784 Sep 4 23:15 libpord.a > -rw-r--r--. 1 ** ** 80737926 Sep 4 23:12 libscalapack.a > -rw-r--r--. 1 ** ** 8902046 Sep 4 23:15 libsmumps.a > -rw-r--r--. 
1 ** ** 8973822 Sep 4 23:15 libzmumps.a > drwxr-xr-x. 3 ** ** 4096 Sep 4 22:55 petsc > drwxr-xr-x. 2 ** ** 4096 Sep 4 23:03 pkgconfig > > and I see that the libepetsc.so.3.11 exists as a link. > > Why do I get this error then? > > Regards, > > Kostas > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Sat Sep 7 04:15:42 2019 From: jroman at dsic.upv.es (Jose E. Roman) Date: Sat, 7 Sep 2019 11:15:42 +0200 Subject: [petsc-users] question about CISS In-Reply-To: References: Message-ID: <9871C5C2-0F30-4B87-BAB9-CBD5A30EB6B1@dsic.upv.es> In the ring region, you should choose appropriate start and end angles to narrow down the region, so that it fits closer to the wanted eigenvalues, otherwise the region spans the whole circumference. Note that there is a fixed number of integration points around the contour, so if the area is not focused around the wanted eigenvalues then the approximation capacity is smaller. As I said, elliptic regions are mostly recommended. The ring region is used for special situations, such as the one in our joint paper https://doi.org/10.1007/978-3-319-62426-6_2 where the eigenvalues lie on the unit circle but we want to avoid eigenvalues close to the origin. Jose > El 6 sept 2019, a las 22:11, Povolotskyi, Mykhailo via petsc-users escribi?: > > Hello, > > I have been experimenting with CISS by computing part of the spectrum of > a complex matrix of rather small size (774). > > I have compared two spectral regions: the circle and the ring. > > The results were compared against LAPACK. > > The attached figure shows that the accuracy with the ring is low, > comparatively to a circle. > > It looks like if the ring is used, the eigenvalues are found, but not > accurately. > > The circle area works quite okay. > > This is what I'm doing: > > EPS eps; > EPSType type; > RG rg; > > EPSCreate(MPI_COMM_SELF,&eps); > EPSSetOperators( eps,matrix_petsc,NULL); > EPSSetType(eps,EPSCISS); > EPSSetProblemType(eps, EPS_NHEP); > > EPSSetFromOptions(eps); > > > EPSGetRG(eps,&rg); > RGSetType(rg,RGRING); > > double vscale(1.0); > > double start_ang(0); > double end_ang(1.0); > RGRingSetParameters(rg,center,radius,vscale,start_ang,end_ang,width); > > EPSSolve(eps); > > > Could you, please, advise me on this problem? > > Thank you, > > Michael. > > From mpovolot at purdue.edu Sat Sep 7 20:25:10 2019 From: mpovolot at purdue.edu (Povolotskyi, Mykhailo) Date: Sun, 8 Sep 2019 01:25:10 +0000 Subject: [petsc-users] question about CISS In-Reply-To: <9871C5C2-0F30-4B87-BAB9-CBD5A30EB6B1@dsic.upv.es> References: <9871C5C2-0F30-4B87-BAB9-CBD5A30EB6B1@dsic.upv.es> Message-ID: Thank you. I will try to divide the ring into fragments and see what happens. Michael. On 9/7/2019 5:15 AM, Jose E. Roman wrote: > In the ring region, you should choose appropriate start and end angles to narrow down the region, so that it fits closer to the wanted eigenvalues, otherwise the region spans the whole circumference. Note that there is a fixed number of integration points around the contour, so if the area is not focused around the wanted eigenvalues then the approximation capacity is smaller. > > As I said, elliptic regions are mostly recommended. 
The ring region is used for special situations, such as the one in our joint paper https://doi.org/10.1007/978-3-319-62426-6_2 where the eigenvalues lie on the unit circle but we want to avoid eigenvalues close to the origin. > > Jose > > >> El 6 sept 2019, a las 22:11, Povolotskyi, Mykhailo via petsc-users escribi?: >> >> Hello, >> >> I have been experimenting with CISS by computing part of the spectrum of >> a complex matrix of rather small size (774). >> >> I have compared two spectral regions: the circle and the ring. >> >> The results were compared against LAPACK. >> >> The attached figure shows that the accuracy with the ring is low, >> comparatively to a circle. >> >> It looks like if the ring is used, the eigenvalues are found, but not >> accurately. >> >> The circle area works quite okay. >> >> This is what I'm doing: >> >> EPS eps; >> EPSType type; >> RG rg; >> >> EPSCreate(MPI_COMM_SELF,&eps); >> EPSSetOperators( eps,matrix_petsc,NULL); >> EPSSetType(eps,EPSCISS); >> EPSSetProblemType(eps, EPS_NHEP); >> >> EPSSetFromOptions(eps); >> >> >> EPSGetRG(eps,&rg); >> RGSetType(rg,RGRING); >> >> double vscale(1.0); >> >> double start_ang(0); >> double end_ang(1.0); >> RGRingSetParameters(rg,center,radius,vscale,start_ang,end_ang,width); >> >> EPSSolve(eps); >> >> >> Could you, please, advise me on this problem? >> >> Thank you, >> >> Michael. >> >> From bsmith at mcs.anl.gov Sat Sep 7 23:51:07 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Sun, 8 Sep 2019 04:51:07 +0000 Subject: [petsc-users] Block Tridiagonal Solver In-Reply-To: <87ftl9hzzz.fsf@jedbrown.org> References: <5bc1a718-c6a3-3293-2a58-41d8f7fbfc68@craft-tech.com> <87ftl9hzzz.fsf@jedbrown.org> Message-ID: <90AE8224-8EAB-47FA-8BAC-0DB3CA411FC9@anl.gov> John, How large are your blocks and are they dense? Also generally how many blocks do you have? The BAIJ formats are for when the blocks are dense. As Jed notes we don't have specific parallel block tridiagonal solvers. You can use the parallel direct solvers such as MUMPS, SuperLU_DIST, or PastiX from PETSc or standard iterative methods such as block Jacobi or the overlapping additive Schwarz method. Depending on your needs any of these may be suitable. For MPI parallelism -pc_type lu -pc_factor_mat_solver_type mumps superlu_dist or pastix mkl_cpardiso (you need to ./configure PETSc with --download-mumps --download-scalapack or --download-superlu_dist or --download-pastix or --with-mkl_cpardiso) -pc_type bjacobi or -pc_type asm For OpenMP parallelism of the linear solver you can use --with-mkl_pardiso or --download-mumps --with-mumps-serial Barry > On Sep 6, 2019, at 4:32 PM, Jed Brown via petsc-users wrote: > > Where do your tridiagonal systems come from? Do you need to solve one > at a time, or batches of tridiagonal problems? > > Although it is not in PETSc, we have some work on solving the sort of > tridiagonal systems that arise in compact discretizations, which it > turns out can be solved much faster than generic tridiagonal problems. > > https://tridiaglu.github.io/index.html > > "John L. Papp via petsc-users" writes: > >> Hello, >> >> I need a parallel block tridiagonal solver and thought PETSc would be >> perfect. However, there seems to be no specific example showing exactly >> which VecCreate and MatCreate functions to use. I searched the archive >> and the web and there is no explicit block tridiagonal examples >> (although ex23.c example solves a tridiagonal matrix) and the manual is >> vague on the subject. So a couple of questions: >> >> 1. 
Is it better to create a monolithic matrix (MatCreateAIJ) and vector >> (VecCreate)? >> 2. Is it better to create a block matrix (MatCreateBAIJ) and vector >> (VecCreate and then VecSetBlockSize or is there an equivalent block >> vector create)? >> 3. What is the best parallel solver(s) to invert the Dx=b when D is a >> block tridiagonal matrix? >> >> If this helps, each row will be owned by the same process. In other >> words, the data used to fill the [A] [B] [C] block matrices in a row of >> the D block tridiagonal matrix will reside on the same process. Hence, >> I don't need to store the individual [A], [B], and [C] block matrices in >> parallel, just the over all block tridiagonal matrix on a row by row basis. >> >> Thanks in advance, >> >> John >> >> -- >> ************************************************************** >> Dr. John Papp >> Senior Research Scientist >> CRAFT Tech. >> 6210 Kellers Church Road >> Pipersville, PA 18947 >> >> Email: jpapp at craft-tech.com >> Phone: (215) 766-1520 >> Fax : (215) 766-1524 >> Web : http://www.craft-tech.com >> >> ************************************************************** From jpapp at craft-tech.com Mon Sep 9 08:22:00 2019 From: jpapp at craft-tech.com (John L. Papp) Date: Mon, 9 Sep 2019 09:22:00 -0400 Subject: [petsc-users] Block Tridiagonal Solver In-Reply-To: <87ftl9hzzz.fsf@jedbrown.org> References: <5bc1a718-c6a3-3293-2a58-41d8f7fbfc68@craft-tech.com> <87ftl9hzzz.fsf@jedbrown.org> Message-ID: <02b6fd77-4207-b50d-f492-670825f4e380@craft-tech.com> Thanks for the help. I didn't want to get too far into the weeds about the numerical method, just that I have a block tridiagonal system that needs to be solved.? If it helps any, the system comes from an ADI scheme on the Navier-Stokes equations.? The [A], [B], and [C] block matrices correspond to the [Q]_i-1, [Q]_i, and [Q]_i+1 vector unknowns (density, momentum, energy, species, etc.) for each I, J, K sweep through the solution grid.?? So, technically, I do need to solve batches of tridiagonal problems.? I'll take a look at your solvers as it seems to be less heavy than PETSc. Thanks, John On 9/6/2019 5:32 PM, Jed Brown wrote: > Where do your tridiagonal systems come from? Do you need to solve one > at a time, or batches of tridiagonal problems? > > Although it is not in PETSc, we have some work on solving the sort of > tridiagonal systems that arise in compact discretizations, which it > turns out can be solved much faster than generic tridiagonal problems. > > https://tridiaglu.github.io/index.html > > "John L. Papp via petsc-users" writes: > >> Hello, >> >> I need a parallel block tridiagonal solver and thought PETSc would be >> perfect.? However, there seems to be no specific example showing exactly >> which VecCreate and MatCreate functions to use.? I searched the archive >> and the web and there is no explicit block tridiagonal examples >> (although ex23.c example solves a tridiagonal matrix) and the manual is >> vague on the subject.? So a couple of questions: >> >> 1. Is it better to create a monolithic matrix (MatCreateAIJ) and vector >> (VecCreate)? >> 2. Is it better to create a block matrix (MatCreateBAIJ) and vector >> (VecCreate and then VecSetBlockSize or is there an equivalent block >> vector create)? >> 3. What is the best parallel solver(s) to invert the Dx=b when D is a >> block tridiagonal matrix? >> >> If this helps, each row will be owned by the same process.? 
In other >> words, the data used to fill the [A] [B] [C] block matrices in a row of >> the D block tridiagonal matrix will reside on the same process.? Hence, >> I don't need to store the individual [A], [B], and [C] block matrices in >> parallel, just the over all block tridiagonal matrix on a row by row basis. >> >> Thanks in advance, >> >> John >> >> -- >> ************************************************************** >> Dr. John Papp >> Senior Research Scientist >> CRAFT Tech. >> 6210 Kellers Church Road >> Pipersville, PA 18947 >> >> Email: jpapp at craft-tech.com >> Phone: (215) 766-1520 >> Fax : (215) 766-1524 >> Web : http://www.craft-tech.com >> >> ************************************************************** -- ************************************************************** Dr. John Papp Senior Research Scientist CRAFT Tech. 6210 Kellers Church Road Pipersville, PA 18947 Email: jpapp at craft-tech.com Phone: (215) 766-1520 Fax : (215) 766-1524 Web : http://www.craft-tech.com ************************************************************** From jpapp at craft-tech.com Mon Sep 9 08:30:43 2019 From: jpapp at craft-tech.com (John L. Papp) Date: Mon, 9 Sep 2019 09:30:43 -0400 Subject: [petsc-users] Block Tridiagonal Solver In-Reply-To: <90AE8224-8EAB-47FA-8BAC-0DB3CA411FC9@anl.gov> References: <5bc1a718-c6a3-3293-2a58-41d8f7fbfc68@craft-tech.com> <87ftl9hzzz.fsf@jedbrown.org> <90AE8224-8EAB-47FA-8BAC-0DB3CA411FC9@anl.gov> Message-ID: <01716a84-f3ad-51e4-713d-4657a499e734@craft-tech.com> Hello, The block matrices tend to be dense and can be large depending on the amount of unknowns.? The overall block tridiagonal can be large as well as the size depends on the number of grid points in a given index direction.? It would not be unheard of to have greater 100 rows in the global block tridiagonal matrix with greater than 25 unknowns in each block matrix element.? Based on this, you would suggest BAIJ or would I be pushing the limits of parallel matrix solve? Thanks, John On 9/8/2019 12:51 AM, Smith, Barry F. wrote: > John, > > How large are your blocks and are they dense? Also generally how many blocks do you have? The BAIJ formats are for when the blocks are dense. > > As Jed notes we don't have specific parallel block tridiagonal solvers. You can use the parallel direct solvers such as MUMPS, SuperLU_DIST, or PastiX from PETSc or standard iterative methods such as block Jacobi or the overlapping additive Schwarz method. Depending on your needs any of these may be suitable. > > For MPI parallelism > > -pc_type lu -pc_factor_mat_solver_type mumps superlu_dist or pastix mkl_cpardiso (you need to ./configure PETSc with --download-mumps --download-scalapack or --download-superlu_dist or --download-pastix or --with-mkl_cpardiso) > > -pc_type bjacobi or -pc_type asm > > For OpenMP parallelism of the linear solver you can use --with-mkl_pardiso or --download-mumps --with-mumps-serial > > Barry > > >> On Sep 6, 2019, at 4:32 PM, Jed Brown via petsc-users wrote: >> >> Where do your tridiagonal systems come from? Do you need to solve one >> at a time, or batches of tridiagonal problems? >> >> Although it is not in PETSc, we have some work on solving the sort of >> tridiagonal systems that arise in compact discretizations, which it >> turns out can be solved much faster than generic tridiagonal problems. >> >> https://tridiaglu.github.io/index.html >> >> "John L. Papp via petsc-users" writes: >> >>> Hello, >>> >>> I need a parallel block tridiagonal solver and thought PETSc would be >>> perfect. 
However, there seems to be no specific example showing exactly >>> which VecCreate and MatCreate functions to use. I searched the archive >>> and the web and there is no explicit block tridiagonal examples >>> (although ex23.c example solves a tridiagonal matrix) and the manual is >>> vague on the subject. So a couple of questions: >>> >>> 1. Is it better to create a monolithic matrix (MatCreateAIJ) and vector >>> (VecCreate)? >>> 2. Is it better to create a block matrix (MatCreateBAIJ) and vector >>> (VecCreate and then VecSetBlockSize or is there an equivalent block >>> vector create)? >>> 3. What is the best parallel solver(s) to invert the Dx=b when D is a >>> block tridiagonal matrix? >>> >>> If this helps, each row will be owned by the same process. In other >>> words, the data used to fill the [A] [B] [C] block matrices in a row of >>> the D block tridiagonal matrix will reside on the same process. Hence, >>> I don't need to store the individual [A], [B], and [C] block matrices in >>> parallel, just the over all block tridiagonal matrix on a row by row basis. >>> >>> Thanks in advance, >>> >>> John >>> >>> -- >>> ************************************************************** >>> Dr. John Papp >>> Senior Research Scientist >>> CRAFT Tech. >>> 6210 Kellers Church Road >>> Pipersville, PA 18947 >>> >>> Email: jpapp at craft-tech.com >>> Phone: (215) 766-1520 >>> Fax : (215) 766-1524 >>> Web : http://www.craft-tech.com >>> >>> ************************************************************** -- ************************************************************** Dr. John Papp Senior Research Scientist CRAFT Tech. 6210 Kellers Church Road Pipersville, PA 18947 Email: jpapp at craft-tech.com Phone: (215) 766-1520 Fax : (215) 766-1524 Web : http://www.craft-tech.com ************************************************************** From bsmith at mcs.anl.gov Mon Sep 9 09:00:37 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Mon, 9 Sep 2019 14:00:37 +0000 Subject: [petsc-users] Block Tridiagonal Solver In-Reply-To: <02b6fd77-4207-b50d-f492-670825f4e380@craft-tech.com> References: <5bc1a718-c6a3-3293-2a58-41d8f7fbfc68@craft-tech.com> <87ftl9hzzz.fsf@jedbrown.org> <02b6fd77-4207-b50d-f492-670825f4e380@craft-tech.com> Message-ID: <456A5B1E-B8FF-47AF-BCD6-36114511E996@anl.gov> John, The individual block tridiagonal systems are pretty small for solving in parallel with MPI, you are unlikely to get much improvement from focusing on these. Some general comments/suggestions: 1) ADI methods generally are difficult to parallelize with MPI 2) There are perhaps more modern methods that converge much faster than ADI methods For your problem I would consider using (what in PETSc we call) a field split preconditioner and then appropriate preconditioners for the fields (usually some sort of multigrid for the elliptic terms and a simpler iterative method for other terms, for reactions you can use a block preconditioner that solves on each cell all the reactions together). The exact details depend on your discretization and equations. From the discussion so far I am speculating you are using a single structured grid? Are you using a staggered grid (for example cell-centered pressure) and edge or vertex centered for the other variables, or are all variables collocated? Is the discretization more or less traditional finite differences or more finite volume based? 
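As a rough sketch of that route (the block size of 5 and the inner solver choices below are placeholders, not taken from your code), once PCFIELDSPLIT is told the block size the splits and their solvers can be picked entirely from the options database:

     -pc_type fieldsplit -pc_fieldsplit_block_size 5
     -pc_fieldsplit_0_fields 0 -pc_fieldsplit_1_fields 1,2,3,4
     -pc_fieldsplit_type multiplicative
     -fieldsplit_0_pc_type gamg        (multigrid on the field with elliptic character)
     -fieldsplit_1_pc_type sor         (cheap relaxation on the remaining fields)

Running with -ksp_view afterwards shows how the splits were actually assembled.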
This is likely more than what you asking for but if you have collocated variables you might consider using the PETSc DMCreate3d() construct to manage the parallel decomposition of your problem. This also makes it straightforward to use the PCFIELDSPLIT preconditioner. If you have a staggered grid you could use DMStagCreate3d() to similarly manage the parallel decomposition. In both cases the hope is that you can reuse much of your current code that applies the operators and computes the (approximate) Jacobian. I realize this may involve far more of a refactorization of your code than you are able or will to do so I just float them in case you are looking for possible major parallel speedups for your simulations. Barry > On Sep 9, 2019, at 8:22 AM, John L. Papp via petsc-users wrote: > > Thanks for the help. > > I didn't want to get too far into the weeds about the numerical method, just that I have a block tridiagonal system that needs to be solved. If it helps any, the system comes from an ADI scheme on the Navier-Stokes equations. The [A], [B], and [C] block matrices correspond to the [Q]_i-1, [Q]_i, and [Q]_i+1 vector unknowns (density, momentum, energy, species, etc.) for each I, J, K sweep through the solution grid. So, technically, I do need to solve batches of tridiagonal problems. I'll take a look at your solvers as it seems to be less heavy than PETSc. > > Thanks, > > John > > On 9/6/2019 5:32 PM, Jed Brown wrote: >> Where do your tridiagonal systems come from? Do you need to solve one >> at a time, or batches of tridiagonal problems? >> >> Although it is not in PETSc, we have some work on solving the sort of >> tridiagonal systems that arise in compact discretizations, which it >> turns out can be solved much faster than generic tridiagonal problems. >> >> https://tridiaglu.github.io/index.html >> >> "John L. Papp via petsc-users" writes: >> >>> Hello, >>> >>> I need a parallel block tridiagonal solver and thought PETSc would be >>> perfect. However, there seems to be no specific example showing exactly >>> which VecCreate and MatCreate functions to use. I searched the archive >>> and the web and there is no explicit block tridiagonal examples >>> (although ex23.c example solves a tridiagonal matrix) and the manual is >>> vague on the subject. So a couple of questions: >>> >>> 1. Is it better to create a monolithic matrix (MatCreateAIJ) and vector >>> (VecCreate)? >>> 2. Is it better to create a block matrix (MatCreateBAIJ) and vector >>> (VecCreate and then VecSetBlockSize or is there an equivalent block >>> vector create)? >>> 3. What is the best parallel solver(s) to invert the Dx=b when D is a >>> block tridiagonal matrix? >>> >>> If this helps, each row will be owned by the same process. In other >>> words, the data used to fill the [A] [B] [C] block matrices in a row of >>> the D block tridiagonal matrix will reside on the same process. Hence, >>> I don't need to store the individual [A], [B], and [C] block matrices in >>> parallel, just the over all block tridiagonal matrix on a row by row basis. >>> >>> Thanks in advance, >>> >>> John >>> >>> -- >>> ************************************************************** >>> Dr. John Papp >>> Senior Research Scientist >>> CRAFT Tech. 
>>> 6210 Kellers Church Road >>> Pipersville, PA 18947 >>> >>> Email: jpapp at craft-tech.com >>> Phone: (215) 766-1520 >>> Fax : (215) 766-1524 >>> Web : http://www.craft-tech.com >>> >>> ************************************************************** > > -- > ************************************************************** > Dr. John Papp > Senior Research Scientist > CRAFT Tech. > 6210 Kellers Church Road > Pipersville, PA 18947 > > Email: jpapp at craft-tech.com > Phone: (215) 766-1520 > Fax : (215) 766-1524 > Web : http://www.craft-tech.com > > ************************************************************** > From mhbaghaei at mail.sjtu.edu.cn Tue Sep 10 03:20:48 2019 From: mhbaghaei at mail.sjtu.edu.cn (Amir) Date: Tue, 10 Sep 2019 16:20:48 +0800 Subject: [petsc-users] View 3D DMPlex Message-ID: <3993675D-2D80-4241-A81D-1B1EA512A78A@getmailspring.com> Hi I am trying to view a cubic mesh constructed by DMPlex. I noticed that the interior point is not seen in the output VTU file. It means that I do not see an edge inside the cube. I tried to check some detail of DM using --in_dm_view. The detail does not show any problem. Do you think there is a problem in my vtk output or dm setup. Thanks Amir DM_0x84000000_0 in 3 dimensions: 0-cells: 27 1-cells: 54 2-cells: 36 3-cells: 8 Labels: depth: 4 strata with value/size (0 (27), 1 (54), 2 (36), 3 (8)) -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Sep 10 08:08:33 2019 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 10 Sep 2019 09:08:33 -0400 Subject: [petsc-users] View 3D DMPlex In-Reply-To: <3993675D-2D80-4241-A81D-1B1EA512A78A@getmailspring.com> References: <3993675D-2D80-4241-A81D-1B1EA512A78A@getmailspring.com> Message-ID: On Tue, Sep 10, 2019 at 9:00 AM Amir via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hi > I am trying to view a cubic mesh constructed by DMPlex. I noticed that the > interior point is not seen in the output VTU file. > It means that I do not see an edge inside the cube. I tried to check some > detail of DM using --in_dm_view. The detail does not show any problem. Do > you think there is a problem in my vtk output or dm setup. > By default, Paraview does not show interior edges. You have to use a filter, like "Extract Edges". Thanks, Matt > Thanks > Amir > DM_0x84000000_0 in 3 dimensions: > 0-cells: 27 > 1-cells: 54 > 2-cells: 36 > 3-cells: 8 > Labels: > depth: 4 strata with value/size (0 (27), 1 (54), 2 (36), 3 (8)) > > [image: Sent from Mailspring] -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mhbaghaei at mail.sjtu.edu.cn Tue Sep 10 11:22:46 2019 From: mhbaghaei at mail.sjtu.edu.cn (Amir) Date: Wed, 11 Sep 2019 00:22:46 +0800 Subject: [petsc-users] View 3D DMPlex In-Reply-To: References: Message-ID: The mesh contains a cube. I tried to change the ordering in cone of dm. In some ordering, in Paraview, I also noticed too many interior edges and saw the interior nodes. I have not yet been able to see the interior edge correctly placed. Do you suggest to output in other format? I do not really know where this misplacing of edges comes from. 
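A quick consistency check on the hand-built DAG listed below (a minimal sketch assuming the 3.11-era routine names, run right after DMPlexCreateFromDAG() and DMPlexInterpolate()) is:

    DMPlexCheckSymmetry(dm);      /* cone/support adjacency is consistent            */
    DMPlexCheckSkeleton(dm, 0);   /* each cell closure has the expected vertex count */
    DMPlexCheckFaces(dm, 0);      /* cone orientations produce consistent faces      */

If these pass, the odd picture is more likely a rendering issue than a topology error. The cone data is: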
PetscInt numPoints[2] = {27, 8}; PetscInt coneSize[35] = {8,8,8,8,8,8,8,8, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0}; PetscInt cones[64] = { 8, 9, 16, 15, 17, 24, 25, 18, 9, 10,11, 16, 18, 25, 20, 19, 16,11, 12,13, 25, 22, 21, 20, 15,16, 13,14, 24, 23, 22, 25, /////////// Not see the interior node and edge 17,18,25, 24, 26, 33, 34, 27, 18,19,20, 25, 27, 34, 29, 28, 25,20,21, 22, 34, 31, 30, 29, 24,25,22, 23, 33, 32, 31, 34} ; PetscInt cones2[64] = { 8, 15, 16, 9, 17, 18, 25, 24, 9, 16,11, 10, 18, 19, 20, 25, /////////// See the interior node and edge 16,13, 12,11, 25, 20, 21, 22, 15,14, 13,16, 24, 25, 22, 23, 17,18,25, 24, 26, 27, 34, 33, 18,19,20, 25, 27, 28, 29, 34, 25,20,21, 22, 34, 29, 30, 31, 24,25,22, 23, 33, 34, 31, 32} ; PetscInt coneOrientations[64] = {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0}; PetscScalar vertexCoords[81] = {0.0,0.0,0.0, 1.0,0.0, 0.0, 2.0,0.0,0.0, 2.0,1.0,0.0, 2.0,2.0, 0.0, 1.0,2.0,0.0, 0.0,2.0,0.0, 0.0,1.0, 0.0, 1.0,1.0,0.0, 0.0,0.0,1.0, 1.0,0.0, 1.0, 2.0,0.0,1.0, 2.0,1.0,1.0, 2.0,2.0, 1.0, 1.0,2.0,1.0, 0.0,2.0,1.0, 0.0,1.0, 1.0, 1.0,1.0,1.0, 0.0,0.0,2.0, 1.0,0.0, 2.0, 2.0,0.0,2.0, 2.0,1.0,2.0, 2.0,2.0, 2.0, 1.0,2.0,2.0, 0.0,2.0,2.0, 0.0,1.0, 2.0, 1.0,1.0,2.0}; Thanks Amir ---------- Forwarded Message --------- From: Matthew Knepley Subject: Re: [petsc-users] View 3D DMPlex Date: Sep 10 2019, at 9:08 pm To: Amir Cc: PETSc On Tue, Sep 10, 2019 at 9:00 AM Amir via petsc-users wrote: > Hi > I am trying to view a cubic mesh constructed by DMPlex. I noticed that the interior point is not seen in the output VTU file. > It means that I do not see an edge inside the cube. I tried to check some detail of DM using --in_dm_view. The detail does not show any problem. Do you think there is a problem in my vtk output or dm setup. By default, Paraview does not show interior edges. You have to use a filter, like "Extract Edges". Thanks, Matt > Thanks > Amir > DM_0x84000000_0 in 3 dimensions: > 0-cells: 27 > 1-cells: 54 > 2-cells: 36 > 3-cells: 8 > Labels: > depth: 4 strata with value/size (0 (27), 1 (54), 2 (36), 3 (8)) > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ (http://www.cse.buffalo.edu/~knepley/) On Sep 10 2019, at 9:08 pm, Matthew Knepley wrote: > On Tue, Sep 10, 2019 at 9:00 AM Amir via petsc-users wrote: > > > Hi > > I am trying to view a cubic mesh constructed by DMPlex. I noticed that the interior point is not seen in the output VTU file. > > It means that I do not see an edge inside the cube. I tried to check some detail of DM using --in_dm_view. The detail does not show any problem. Do you think there is a problem in my vtk output or dm setup. > > > By default, Paraview does not show interior edges. You have to use a filter, like "Extract Edges". > > Thanks, > > Matt > > > Thanks > > Amir > > DM_0x84000000_0 in 3 dimensions: > > 0-cells: 27 > > 1-cells: 54 > > 2-cells: 36 > > 3-cells: 8 > > Labels: > > depth: 4 strata with value/size (0 (27), 1 (54), 2 (36), 3 (8)) > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
> -- Norbert Wiener > > > https://www.cse.buffalo.edu/~knepley/ (https://link.getmailspring.com/link/CE107FD7-8BE8-492B-98A7-90F4EEAA18F2 at getmailspring.com/1?redirect=http%3A%2F%2Fwww.cse.buffalo.edu%2F~knepley%2F&recipient=cGV0c2MtdXNlcnNAbWNzLmFubC5nb3Y%3D) -------------- next part -------------- An HTML attachment was scrubbed... URL: From danyang.su at gmail.com Tue Sep 10 12:43:56 2019 From: danyang.su at gmail.com (Danyang Su) Date: Tue, 10 Sep 2019 10:43:56 -0700 Subject: [petsc-users] Error running configure on SOWING Message-ID: Dear All, I am trying to install petsc-dev on a cluster with intel compiler. However, the configuration get stuck on SOWING. Error running configure on SOWING: Could not execute "['./configure --prefix=/home/m/min3p/danyangs/soft/petsc/petsc-dev/linux-intel-opt']": checking for ranlib... ranlib checking for a BSD-compatible install... /usr/bin/install -c checking whether install works... yes checking for ar... ar checking for gcc... no checking for cc... no checking for cl.exe... noconfigure: error: in `/gpfs/fs1/home/m/min3p/danyangs/soft/petsc/petsc-dev/linux-intel-opt/externalpackages/git.sowing': configure: error: no acceptable C compiler found in $PATH See `config.log' for more details Actually the C compiler is there. If I use GNU compiler, there is no problem. I also tried to use different sowing configuration as discussed on https://lists.mcs.anl.gov/pipermail/petsc-dev/2018-June/023070.html, but without success. The configuration is ./configure COPTFLAGS="-march=native -O2" CXXOPTFLAGS="-march=native -O2" FOPTFLAGS="-march=native -O2" --with-scalapack=1 --with-scalapack-lib="[/scinet/niagara/intel/2019.1/compilers_and_libraries_2019.1.144/linux/mkl/lib/intel64/libmkl_scalapack_lp64.so,/scinet/niagara/intel/2019.1/compilers_and_libraries_2019.1.144/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so]" --download-parmetis=1 --download-metis=1 --download-ptscotch=1 --download-fblaslapack=1 --download-hypre=1 --download-superlu_dist=1 --with-hdf5=1 --with-hdf5-dir=/scinet/niagara/software/2019a/opt/intel-2019.1-intelmpi-2019.1/hdf5-mpi/1.10.4 --download-zlib=1 --download-szlib=1 --download-ctetgen=1 --with-debugging=0 --with-cxx-dialect=C++11 --with-mpi-dir=/scinet/niagara/intel/2019.1/compilers_and_libraries_2019.1.144/linux/mpi/intel64 -download-sowing Any suggestion on this? Thanks and regards, danyang From bsmith at mcs.anl.gov Tue Sep 10 13:03:31 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Tue, 10 Sep 2019 18:03:31 +0000 Subject: [petsc-users] Error running configure on SOWING In-Reply-To: References: Message-ID: Please send the configure.log file when run with --download-sowing-cc=yourCcompiler and also $PETSC_ARCH/externalpackages/git.sowing/config.log this will tell us why it is rejecting the C compiler. Barry > On Sep 10, 2019, at 12:43 PM, Danyang Su via petsc-users wrote: > > Dear All, > > I am trying to install petsc-dev on a cluster with intel compiler. However, the configuration get stuck on SOWING. > > Error running configure on SOWING: Could not execute "['./configure --prefix=/home/m/min3p/danyangs/soft/petsc/petsc-dev/linux-intel-opt']": > checking for ranlib... ranlib > checking for a BSD-compatible install... /usr/bin/install -c > checking whether install works... yes > checking for ar... ar > checking for gcc... no > checking for cc... no > checking for cl.exe... 
noconfigure: error: in `/gpfs/fs1/home/m/min3p/danyangs/soft/petsc/petsc-dev/linux-intel-opt/externalpackages/git.sowing': > configure: error: no acceptable C compiler found in $PATH > See `config.log' for more details > > Actually the C compiler is there. > > If I use GNU compiler, there is no problem. I also tried to use different sowing configuration as discussed on https://lists.mcs.anl.gov/pipermail/petsc-dev/2018-June/023070.html, but without success. > > The configuration is > > ./configure COPTFLAGS="-march=native -O2" CXXOPTFLAGS="-march=native -O2" FOPTFLAGS="-march=native -O2" --with-scalapack=1 --with-scalapack-lib="[/scinet/niagara/intel/2019.1/compilers_and_libraries_2019.1.144/linux/mkl/lib/intel64/libmkl_scalapack_lp64.so,/scinet/niagara/intel/2019.1/compilers_and_libraries_2019.1.144/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so]" --download-parmetis=1 --download-metis=1 --download-ptscotch=1 --download-fblaslapack=1 --download-hypre=1 --download-superlu_dist=1 --with-hdf5=1 --with-hdf5-dir=/scinet/niagara/software/2019a/opt/intel-2019.1-intelmpi-2019.1/hdf5-mpi/1.10.4 --download-zlib=1 --download-szlib=1 --download-ctetgen=1 --with-debugging=0 --with-cxx-dialect=C++11 --with-mpi-dir=/scinet/niagara/intel/2019.1/compilers_and_libraries_2019.1.144/linux/mpi/intel64 -download-sowing > > Any suggestion on this? > > Thanks and regards, > > danyang > > > From danyang.su at gmail.com Tue Sep 10 13:11:08 2019 From: danyang.su at gmail.com (Danyang Su) Date: Tue, 10 Sep 2019 11:11:08 -0700 Subject: [petsc-users] Error running configure on SOWING In-Reply-To: References: Message-ID: <1f78e55f-7a6e-c848-ec8c-5139091050b7@gmail.com> Sorry I forgot to attached the log file. Attached are the log files using the following configuration: ./configure COPTFLAGS="-march=native -O2" CXXOPTFLAGS="-march=native -O2" FOPTFLAGS="-march=native -O2" --with-scalapack=1 --with-scalapack-lib="[/scinet/niagara/intel/2019.1/compilers_and_libraries_2019.1.144/linux/mkl/lib/intel64/libmkl_scalapack_lp64.so,/scinet/niagara/intel/2019.1/compilers_and_libraries_2019.1.144/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so]" --download-parmetis=1 --download-metis=1 --download-ptscotch=1 --download-fblaslapack=1 --download-hypre=1 --download-superlu_dist=1 --with-hdf5=1 --with-hdf5-dir=/scinet/niagara/software/2019a/opt/intel-2019.1-intelmpi-2019.1/hdf5-mpi/1.10.4 --download-zlib=1 --download-szlib=1 --download-ctetgen=1 --with-debugging=0 --with-cxx-dialect=C++11 --with-mpi-dir=/scinet/niagara/intel/2019.1/compilers_and_libraries_2019.1.144/linux/mpi/intel64 --download-sowing-cc=/scinet/niagara/intel/2019.1/compilers_and_libraries_2019.1.144/linux/bin/intel64/icc Thanks, Danyang On 2019-09-10 11:03 a.m., Smith, Barry F. wrote: > Please send the configure.log file when run with --download-sowing-cc=yourCcompiler and also $PETSC_ARCH/externalpackages/git.sowing/config.log this will tell us why it is rejecting the C compiler. > > Barry > > >> On Sep 10, 2019, at 12:43 PM, Danyang Su via petsc-users wrote: >> >> Dear All, >> >> I am trying to install petsc-dev on a cluster with intel compiler. However, the configuration get stuck on SOWING. >> >> Error running configure on SOWING: Could not execute "['./configure --prefix=/home/m/min3p/danyangs/soft/petsc/petsc-dev/linux-intel-opt']": >> checking for ranlib... ranlib >> checking for a BSD-compatible install... /usr/bin/install -c >> checking whether install works... yes >> checking for ar... ar >> checking for gcc... no >> checking for cc... 
no >> checking for cl.exe... noconfigure: error: in `/gpfs/fs1/home/m/min3p/danyangs/soft/petsc/petsc-dev/linux-intel-opt/externalpackages/git.sowing': >> configure: error: no acceptable C compiler found in $PATH >> See `config.log' for more details >> >> Actually the C compiler is there. >> >> If I use GNU compiler, there is no problem. I also tried to use different sowing configuration as discussed on https://lists.mcs.anl.gov/pipermail/petsc-dev/2018-June/023070.html, but without success. >> >> The configuration is >> >> ./configure COPTFLAGS="-march=native -O2" CXXOPTFLAGS="-march=native -O2" FOPTFLAGS="-march=native -O2" --with-scalapack=1 --with-scalapack-lib="[/scinet/niagara/intel/2019.1/compilers_and_libraries_2019.1.144/linux/mkl/lib/intel64/libmkl_scalapack_lp64.so,/scinet/niagara/intel/2019.1/compilers_and_libraries_2019.1.144/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so]" --download-parmetis=1 --download-metis=1 --download-ptscotch=1 --download-fblaslapack=1 --download-hypre=1 --download-superlu_dist=1 --with-hdf5=1 --with-hdf5-dir=/scinet/niagara/software/2019a/opt/intel-2019.1-intelmpi-2019.1/hdf5-mpi/1.10.4 --download-zlib=1 --download-szlib=1 --download-ctetgen=1 --with-debugging=0 --with-cxx-dialect=C++11 --with-mpi-dir=/scinet/niagara/intel/2019.1/compilers_and_libraries_2019.1.144/linux/mpi/intel64 -download-sowing >> >> Any suggestion on this? >> >> Thanks and regards, >> >> danyang >> >> >> -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: text/x-log Size: 762582 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: sowing-config.log Type: text/x-log Size: 37810 bytes Desc: not available URL: From bsmith at mcs.anl.gov Tue Sep 10 13:19:28 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Tue, 10 Sep 2019 18:19:28 +0000 Subject: [petsc-users] Error running configure on SOWING In-Reply-To: <1f78e55f-7a6e-c848-ec8c-5139091050b7@gmail.com> References: <1f78e55f-7a6e-c848-ec8c-5139091050b7@gmail.com> Message-ID: <06F78897-B6BF-4551-8E9B-0C6CF87DB84D@mcs.anl.gov> Ahh, sorry it also needs the C++ compiler provided with --download-sowing-cxx= something > On Sep 10, 2019, at 1:11 PM, Danyang Su wrote: > > Sorry I forgot to attached the log file. > > Attached are the log files using the following configuration: > > ./configure COPTFLAGS="-march=native -O2" CXXOPTFLAGS="-march=native -O2" FOPTFLAGS="-march=native -O2" --with-scalapack=1 --with-scalapack-lib="[/scinet/niagara/intel/2019.1/compilers_and_libraries_2019.1.144/linux/mkl/lib/intel64/libmkl_scalapack_lp64.so,/scinet/niagara/intel/2019.1/compilers_and_libraries_2019.1.144/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so]" --download-parmetis=1 --download-metis=1 --download-ptscotch=1 --download-fblaslapack=1 --download-hypre=1 --download-superlu_dist=1 --with-hdf5=1 --with-hdf5-dir=/scinet/niagara/software/2019a/opt/intel-2019.1-intelmpi-2019.1/hdf5-mpi/1.10.4 --download-zlib=1 --download-szlib=1 --download-ctetgen=1 --with-debugging=0 --with-cxx-dialect=C++11 --with-mpi-dir=/scinet/niagara/intel/2019.1/compilers_and_libraries_2019.1.144/linux/mpi/intel64 --download-sowing-cc=/scinet/niagara/intel/2019.1/compilers_and_libraries_2019.1.144/linux/bin/intel64/icc > > Thanks, > > Danyang > > On 2019-09-10 11:03 a.m., Smith, Barry F. 
wrote: >> Please send the configure.log file when run with --download-sowing-cc=yourCcompiler and also $PETSC_ARCH/externalpackages/git.sowing/config.log this will tell us why it is rejecting the C compiler. >> >> Barry >> >> >>> On Sep 10, 2019, at 12:43 PM, Danyang Su via petsc-users wrote: >>> >>> Dear All, >>> >>> I am trying to install petsc-dev on a cluster with intel compiler. However, the configuration get stuck on SOWING. >>> >>> Error running configure on SOWING: Could not execute "['./configure --prefix=/home/m/min3p/danyangs/soft/petsc/petsc-dev/linux-intel-opt']": >>> checking for ranlib... ranlib >>> checking for a BSD-compatible install... /usr/bin/install -c >>> checking whether install works... yes >>> checking for ar... ar >>> checking for gcc... no >>> checking for cc... no >>> checking for cl.exe... noconfigure: error: in `/gpfs/fs1/home/m/min3p/danyangs/soft/petsc/petsc-dev/linux-intel-opt/externalpackages/git.sowing': >>> configure: error: no acceptable C compiler found in $PATH >>> See `config.log' for more details >>> >>> Actually the C compiler is there. >>> >>> If I use GNU compiler, there is no problem. I also tried to use different sowing configuration as discussed on https://lists.mcs.anl.gov/pipermail/petsc-dev/2018-June/023070.html, but without success. >>> >>> The configuration is >>> >>> ./configure COPTFLAGS="-march=native -O2" CXXOPTFLAGS="-march=native -O2" FOPTFLAGS="-march=native -O2" --with-scalapack=1 --with-scalapack-lib="[/scinet/niagara/intel/2019.1/compilers_and_libraries_2019.1.144/linux/mkl/lib/intel64/libmkl_scalapack_lp64.so,/scinet/niagara/intel/2019.1/compilers_and_libraries_2019.1.144/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so]" --download-parmetis=1 --download-metis=1 --download-ptscotch=1 --download-fblaslapack=1 --download-hypre=1 --download-superlu_dist=1 --with-hdf5=1 --with-hdf5-dir=/scinet/niagara/software/2019a/opt/intel-2019.1-intelmpi-2019.1/hdf5-mpi/1.10.4 --download-zlib=1 --download-szlib=1 --download-ctetgen=1 --with-debugging=0 --with-cxx-dialect=C++11 --with-mpi-dir=/scinet/niagara/intel/2019.1/compilers_and_libraries_2019.1.144/linux/mpi/intel64 -download-sowing >>> >>> Any suggestion on this? >>> >>> Thanks and regards, >>> >>> danyang >>> >>> >>> > From knepley at gmail.com Tue Sep 10 13:27:08 2019 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 10 Sep 2019 14:27:08 -0400 Subject: [petsc-users] View 3D DMPlex In-Reply-To: References: Message-ID: On Tue, Sep 10, 2019 at 12:20 PM Amir wrote: > > > The mesh contains a cube. I tried to change the ordering in cone of dm. In > some ordering, in Paraview, I also noticed too many interior edges and saw > the interior nodes. I have not yet been able to see the interior edge > correctly placed. Do you suggest to output in other format? I do not really > know where this misplacing of edges comes from. > How about first using DMPlexCreateBoxMesh(), and seeing if you can visualize it. Then if that works, we can talk about inputting your mesh from scratch. 
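A minimal sketch of that experiment (the argument list assumes the 3.11/3.12-era DMPlexCreateBoxMesh(), and the -box_dm_view prefix is just a placeholder) would be:

    /* inside a program that has called PetscInitialize(); needs petscdmplex.h */
    DM             dm;
    PetscInt       faces[3] = {2, 2, 2};   /* 2x2x2 hexahedral cells on [0,1]^3 */
    PetscErrorCode ierr;

    ierr = DMPlexCreateBoxMesh(PETSC_COMM_WORLD, 3, PETSC_FALSE /* hexes */, faces,
                               NULL, NULL, NULL, PETSC_TRUE /* interpolate */, &dm);CHKERRQ(ierr);
    ierr = DMViewFromOptions(dm, NULL, "-box_dm_view");CHKERRQ(ierr);
    ierr = DMDestroy(&dm);CHKERRQ(ierr);

Run with something like -box_dm_view vtk:box.vtu and open the file in Paraview.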
Thanks, Matt > PetscInt numPoints[2] = {27, 8}; > PetscInt coneSize[35] = {8,8,8,8,8,8,8,8, > 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0}; > PetscInt cones[64] = { 8, 9, 16, 15, 17, 24, 25, 18, > 9, 10,11, 16, 18, 25, 20, 19, > 16,11, 12,13, 25, 22, 21, 20, > 15,16, 13,14, 24, 23, 22, 25, /////////// Not see the interior node > and edge > 17,18,25, 24, 26, 33, 34, 27, > 18,19,20, 25, 27, 34, 29, 28, > 25,20,21, 22, 34, 31, 30, 29, > 24,25,22, 23, 33, 32, 31, 34} ; > PetscInt cones2[64] = { 8, 15, 16, 9, 17, 18, 25, 24, > 9, 16,11, 10, 18, 19, 20, 25, /////////// See the > interior node and edge > 16,13, 12,11, 25, 20, 21, 22, > 15,14, 13,16, 24, 25, 22, 23, > 17,18,25, 24, 26, 27, 34, 33, > 18,19,20, 25, 27, 28, 29, 34, > 25,20,21, 22, 34, 29, 30, 31, > 24,25,22, 23, 33, 34, 31, 32} ; > PetscInt coneOrientations[64] = > {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0}; > PetscScalar vertexCoords[81] = {0.0,0.0,0.0, 1.0,0.0, 0.0, > 2.0,0.0,0.0, 2.0,1.0,0.0, 2.0,2.0, 0.0, 1.0,2.0,0.0, > 0.0,2.0,0.0, 0.0,1.0, 0.0, 1.0,1.0,0.0, > 0.0,0.0,1.0, 1.0,0.0, 1.0, 2.0,0.0,1.0, 2.0,1.0,1.0, 2.0,2.0, > 1.0, 1.0,2.0,1.0, > 0.0,2.0,1.0, 0.0,1.0, 1.0, 1.0,1.0,1.0, > 0.0,0.0,2.0, 1.0,0.0, 2.0, 2.0,0.0,2.0, 2.0,1.0,2.0, 2.0,2.0, > 2.0, 1.0,2.0,2.0, > 0.0,2.0,2.0, 0.0,1.0, 2.0, 1.0,1.0,2.0}; > Thanks > Amir > [image: Sent from Mailspring] > > ---------- Forwarded Message --------- > > From: Matthew Knepley > Subject: Re: [petsc-users] View 3D DMPlex > Date: Sep 10 2019, at 9:08 pm > To: Amir > Cc: PETSc > > On Tue, Sep 10, 2019 at 9:00 AM Amir via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > Hi > I am trying to view a cubic mesh constructed by DMPlex. I noticed that the > interior point is not seen in the output VTU file. > It means that I do not see an edge inside the cube. I tried to check some > detail of DM using --in_dm_view. The detail does not show any problem. Do > you think there is a problem in my vtk output or dm setup. > > > By default, Paraview does not show interior edges. You have to use a > filter, like "Extract Edges". > > Thanks, > > Matt > > > Thanks > Amir > DM_0x84000000_0 in 3 dimensions: > 0-cells: 27 > 1-cells: 54 > 2-cells: 36 > 3-cells: 8 > Labels: > depth: 4 strata with value/size (0 (27), 1 (54), 2 (36), 3 (8)) > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > On Sep 10 2019, at 9:08 pm, Matthew Knepley wrote: > > On Tue, Sep 10, 2019 at 9:00 AM Amir via petsc-users < > petsc-users at mcs.anl.gov > > > wrote: > > Hi > I am trying to view a cubic mesh constructed by DMPlex. I noticed that the > interior point is not seen in the output VTU file. > It means that I do not see an edge inside the cube. I tried to check some > detail of DM using --in_dm_view. The detail does not show any problem. Do > you think there is a problem in my vtk output or dm setup. > > > By default, Paraview does not show interior edges. You have to use a > filter, like "Extract Edges". 
> > Thanks, > > Matt > > > Thanks > Amir > DM_0x84000000_0 in 3 dimensions: > 0-cells: 27 > 1-cells: 54 > 2-cells: 36 > 3-cells: 8 > Labels: > depth: 4 strata with value/size (0 (27), 1 (54), 2 (36), 3 (8)) > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From danyang.su at gmail.com Tue Sep 10 13:45:26 2019 From: danyang.su at gmail.com (Danyang Su) Date: Tue, 10 Sep 2019 11:45:26 -0700 Subject: [petsc-users] Error running configure on SOWING In-Reply-To: <06F78897-B6BF-4551-8E9B-0C6CF87DB84D@mcs.anl.gov> References: <1f78e55f-7a6e-c848-ec8c-5139091050b7@gmail.com> <06F78897-B6BF-4551-8E9B-0C6CF87DB84D@mcs.anl.gov> Message-ID: <2473a0f2-33e3-fdb5-b2ec-7e94ecb05afc@gmail.com> Hi Barry, With both --download-sowing-cc= and --download-sowing-cxx= specified, it can be configured now. Thanks as always for all your help, Danyang On 2019-09-10 11:19 a.m., Smith, Barry F. wrote: > Ahh, sorry it also needs the C++ compiler provided with --download-sowing-cxx= something > > >> On Sep 10, 2019, at 1:11 PM, Danyang Su wrote: >> >> Sorry I forgot to attached the log file. >> >> Attached are the log files using the following configuration: >> >> ./configure COPTFLAGS="-march=native -O2" CXXOPTFLAGS="-march=native -O2" FOPTFLAGS="-march=native -O2" --with-scalapack=1 --with-scalapack-lib="[/scinet/niagara/intel/2019.1/compilers_and_libraries_2019.1.144/linux/mkl/lib/intel64/libmkl_scalapack_lp64.so,/scinet/niagara/intel/2019.1/compilers_and_libraries_2019.1.144/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so]" --download-parmetis=1 --download-metis=1 --download-ptscotch=1 --download-fblaslapack=1 --download-hypre=1 --download-superlu_dist=1 --with-hdf5=1 --with-hdf5-dir=/scinet/niagara/software/2019a/opt/intel-2019.1-intelmpi-2019.1/hdf5-mpi/1.10.4 --download-zlib=1 --download-szlib=1 --download-ctetgen=1 --with-debugging=0 --with-cxx-dialect=C++11 --with-mpi-dir=/scinet/niagara/intel/2019.1/compilers_and_libraries_2019.1.144/linux/mpi/intel64 --download-sowing-cc=/scinet/niagara/intel/2019.1/compilers_and_libraries_2019.1.144/linux/bin/intel64/icc >> >> Thanks, >> >> Danyang >> >> On 2019-09-10 11:03 a.m., Smith, Barry F. wrote: >>> Please send the configure.log file when run with --download-sowing-cc=yourCcompiler and also $PETSC_ARCH/externalpackages/git.sowing/config.log this will tell us why it is rejecting the C compiler. >>> >>> Barry >>> >>> >>>> On Sep 10, 2019, at 12:43 PM, Danyang Su via petsc-users wrote: >>>> >>>> Dear All, >>>> >>>> I am trying to install petsc-dev on a cluster with intel compiler. However, the configuration get stuck on SOWING. >>>> >>>> Error running configure on SOWING: Could not execute "['./configure --prefix=/home/m/min3p/danyangs/soft/petsc/petsc-dev/linux-intel-opt']": >>>> checking for ranlib... ranlib >>>> checking for a BSD-compatible install... /usr/bin/install -c >>>> checking whether install works... yes >>>> checking for ar... ar >>>> checking for gcc... no >>>> checking for cc... no >>>> checking for cl.exe... 
noconfigure: error: in `/gpfs/fs1/home/m/min3p/danyangs/soft/petsc/petsc-dev/linux-intel-opt/externalpackages/git.sowing': >>>> configure: error: no acceptable C compiler found in $PATH >>>> See `config.log' for more details >>>> >>>> Actually the C compiler is there. >>>> >>>> If I use GNU compiler, there is no problem. I also tried to use different sowing configuration as discussed on https://lists.mcs.anl.gov/pipermail/petsc-dev/2018-June/023070.html, but without success. >>>> >>>> The configuration is >>>> >>>> ./configure COPTFLAGS="-march=native -O2" CXXOPTFLAGS="-march=native -O2" FOPTFLAGS="-march=native -O2" --with-scalapack=1 --with-scalapack-lib="[/scinet/niagara/intel/2019.1/compilers_and_libraries_2019.1.144/linux/mkl/lib/intel64/libmkl_scalapack_lp64.so,/scinet/niagara/intel/2019.1/compilers_and_libraries_2019.1.144/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so]" --download-parmetis=1 --download-metis=1 --download-ptscotch=1 --download-fblaslapack=1 --download-hypre=1 --download-superlu_dist=1 --with-hdf5=1 --with-hdf5-dir=/scinet/niagara/software/2019a/opt/intel-2019.1-intelmpi-2019.1/hdf5-mpi/1.10.4 --download-zlib=1 --download-szlib=1 --download-ctetgen=1 --with-debugging=0 --with-cxx-dialect=C++11 --with-mpi-dir=/scinet/niagara/intel/2019.1/compilers_and_libraries_2019.1.144/linux/mpi/intel64 -download-sowing >>>> >>>> Any suggestion on this? >>>> >>>> Thanks and regards, >>>> >>>> danyang >>>> >>>> >>>> >> From mpovolot at purdue.edu Tue Sep 10 22:46:59 2019 From: mpovolot at purdue.edu (Povolotskyi, Mykhailo) Date: Wed, 11 Sep 2019 03:46:59 +0000 Subject: [petsc-users] SLEPC: partitioning for CISS Message-ID: <3e31f1fb-5218-fd97-9eec-3f5ec8a53a19@purdue.edu> Hello, I'm currently using CISS via SLEPc. I would like to use parallelization by partitioning over quadrature points, using the option -eps_ciss_partitions. I have done the following: 1. created a matrix on each MPI rank MatCreateDense(MPI_COMM_SELF, matrix_size, matrix_size, PETSC_DECIDE,PETSC_DECIDE,NULL,&matrix_petsc); 2. Created EPS object EPSCreate(MPI_COMM_WORLD,&eps); EPSSetOperators( eps,matrix_petsc,NULL); EPSSetType(eps,EPSCISS); EPSSetProblemType(eps, EPS_NHEP); EPSSetFromOptions(eps); EPSGetRG(eps,&rg); RGSetType(rg,RGRING); RGRingSetParameters(rg,center,radius,vscale,start_ang,end_ang,width); EPSSolve(eps); 3. Then I run the code: mpiexec -n 2 ./a.out -eps_ciss_partitions 2 The code gives an error: [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [1]PETSC ERROR: Arguments are incompatible [1]PETSC ERROR: MatMatMultSymbolic requires A, seqdense, to be compatible with B, mpidense [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[1]PETSC ERROR: Petsc Release Version 3.8.4, Mar, 24, 2018 [1]PETSC ERROR: ../lib/a.out on a linux-complex named brown-a337.rcac.purdue.edu by mpovolot Tue Sep 10 23:43:44 2019 [1]PETSC ERROR: Configure options --with-scalar-type=complex --with-x=0 --with-hdf5 --download-hdf5=1 --with-single-library=1 --with-pic=1 --with-shared-libraries=0 --with-log=0 --with-clanguage=C++ --CXXFLAGS="-fopenmp -fPIC" --CFLAGS="-fopenmp -fPIC" --with-fortran=0 --FFLAGS="-fopenmp -fPIC" --with-debugging=0 --with-cc=mpicc --with-fc=mpif90 --with-cxx=mpicxx COPTFLAGS= CXXOPTFLAGS= FOPTFLAGS= --download-metis=1 --download-parmetis=1 --with-valgrind-dir=/apps/brown/valgrind/3.13.0_gcc-4.8.5 --download-mumps=1 --with-fortran-kernels=0 --download-superlu_dist=1 --with-blaslapack-lib="-L/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_gnu_thread -lmkl_core " --with-blacs-lib=/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so --with-blacs-include=/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/include --with-scalapack-lib="-Wl,-rpath,/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/lib/intel64 -L/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/lib/intel64 -lmkl_gf_lp64 -lmkl_gnu_thread -lmkl_core -lpthread -L/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/lib/intel64 -lmkl_scalapack_lp64 -lmkl_blacs_intelmpi_lp64" --with-scalapack-include=/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/include [1]PETSC ERROR: #463 MatMatMultSymbolic() line 9692 in /depot/kildisha/apps/brown/nemo5/libs/petsc/build-cplx/src/mat/interface/matrix.c [1]PETSC ERROR: #464 BVMatMult_Svec() line 229 in /depot/kildisha/apps/brown/nemo5/libs/slepc/build-cplx/src/sys/classes/bv/impls/svec/svec.c [1]PETSC ERROR: #465 BVMatMult() line 589 in /depot/kildisha/apps/brown/nemo5/libs/slepc/build-cplx/src/sys/classes/bv/interface/bvops.c [1]PETSC ERROR: #466 BVMatProject_MatMult() line 903 in /depot/kildisha/apps/brown/nemo5/libs/slepc/build-cplx/src/sys/classes/bv/interface/bvglobal.c [1]PETSC ERROR: #467 BVMatProject() line 1151 in /depot/kildisha/apps/brown/nemo5/libs/slepc/build-cplx/src/sys/classes/bv/interface/bvglobal.c [1]PETSC ERROR: #468 EPSSolve_CISS() line 1066 in /depot/kildisha/apps/brown/nemo5/libs/slepc/build-cplx/src/eps/impls/ciss/ciss.c [1]PETSC ERROR: #469 EPSSolve() line 147 in /depot/kildisha/apps/brown/nemo5/libs/slepc/build-cplx/src/eps/interface/epssolve.c Could you, please, tell me what am I doing wrong? -------------- next part -------------- An HTML attachment was scrubbed... URL: From mhbaghaei at mail.sjtu.edu.cn Wed Sep 11 02:05:39 2019 From: mhbaghaei at mail.sjtu.edu.cn (Amir) Date: Wed, 11 Sep 2019 15:05:39 +0800 Subject: [petsc-users] View 3D DMPlex In-Reply-To: References: Message-ID: I tried DMPlexCreateBoxMesh(). It shows points and nodes on the boundary surface. Plus, I do see easily and correctly the interior edge/node inside by clipping the volume mesh. Do you think my dm setup is wrong that I cant see the same using my DAG settings in Paraview. Thanks Amir On Sep 11 2019, at 2:27 am, Matthew Knepley wrote: > On Tue, Sep 10, 2019 at 12:20 PM Amir wrote: > > > > > > > The mesh contains a cube. I tried to change the ordering in cone of dm. In some ordering, in Paraview, I also noticed too many interior edges and saw the interior nodes. I have not yet been able to see the interior edge correctly placed. 
Do you suggest to output in other format? I do not really know where this misplacing of edges comes from. > > How about first using DMPlexCreateBoxMesh(), and seeing if you can visualize it. Then if that works, we can talk about > inputting your mesh from scratch. > > Thanks, > > Matt > > > PetscInt numPoints[2] = {27, 8}; > > PetscInt coneSize[35] = {8,8,8,8,8,8,8,8, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0}; > > PetscInt cones[64] = { 8, 9, 16, 15, 17, 24, 25, 18, > > 9, 10,11, 16, 18, 25, 20, 19, > > 16,11, 12,13, 25, 22, 21, 20, > > 15,16, 13,14, 24, 23, 22, 25, /////////// Not see the interior node and edge > > 17,18,25, 24, 26, 33, 34, 27, > > 18,19,20, 25, 27, 34, 29, 28, > > 25,20,21, 22, 34, 31, 30, 29, > > 24,25,22, 23, 33, 32, 31, 34} ; > > PetscInt cones2[64] = { 8, 15, 16, 9, 17, 18, 25, 24, > > 9, 16,11, 10, 18, 19, 20, 25, /////////// See the interior node and edge > > 16,13, 12,11, 25, 20, 21, 22, > > 15,14, 13,16, 24, 25, 22, 23, > > 17,18,25, 24, 26, 27, 34, 33, > > 18,19,20, 25, 27, 28, 29, 34, > > 25,20,21, 22, 34, 29, 30, 31, > > 24,25,22, 23, 33, 34, 31, 32} ; > > PetscInt coneOrientations[64] = {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0}; > > PetscScalar vertexCoords[81] = {0.0,0.0,0.0, 1.0,0.0, 0.0, 2.0,0.0,0.0, 2.0,1.0,0.0, 2.0,2.0, 0.0, 1.0,2.0,0.0, > > 0.0,2.0,0.0, 0.0,1.0, 0.0, 1.0,1.0,0.0, > > 0.0,0.0,1.0, 1.0,0.0, 1.0, 2.0,0.0,1.0, 2.0,1.0,1.0, 2.0,2.0, 1.0, 1.0,2.0,1.0, > > 0.0,2.0,1.0, 0.0,1.0, 1.0, 1.0,1.0,1.0, > > 0.0,0.0,2.0, 1.0,0.0, 2.0, 2.0,0.0,2.0, 2.0,1.0,2.0, 2.0,2.0, 2.0, 1.0,2.0,2.0, > > 0.0,2.0,2.0, 0.0,1.0, 2.0, 1.0,1.0,2.0}; > > Thanks > > Amir > > > > ---------- Forwarded Message --------- > > From: Matthew Knepley > > Subject: Re: [petsc-users] View 3D DMPlex > > Date: Sep 10 2019, at 9:08 pm > > To: Amir > > Cc: PETSc > > > > On Tue, Sep 10, 2019 at 9:00 AM Amir via petsc-users wrote: > > > Hi > > > I am trying to view a cubic mesh constructed by DMPlex. I noticed that the interior point is not seen in the output VTU file. > > > It means that I do not see an edge inside the cube. I tried to check some detail of DM using --in_dm_view. The detail does not show any problem. Do you think there is a problem in my vtk output or dm setup. > > > > > > By default, Paraview does not show interior edges. You have to use a filter, like "Extract Edges". > > > > Thanks, > > > > Matt > > > > > Thanks > > > Amir > > > DM_0x84000000_0 in 3 dimensions: > > > 0-cells: 27 > > > 1-cells: 54 > > > 2-cells: 36 > > > 3-cells: 8 > > > Labels: > > > depth: 4 strata with value/size (0 (27), 1 (54), 2 (36), 3 (8)) > > > > > > > > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > -- Norbert Wiener > > > > > > https://www.cse.buffalo.edu/~knepley/ (http://www.cse.buffalo.edu/~knepley/) > > On Sep 10 2019, at 9:08 pm, Matthew Knepley wrote: > > > On Tue, Sep 10, 2019 at 9:00 AM Amir via petsc-users wrote: > > > > > > > Hi > > > > I am trying to view a cubic mesh constructed by DMPlex. I noticed that the interior point is not seen in the output VTU file. > > > > It means that I do not see an edge inside the cube. I tried to check some detail of DM using --in_dm_view. The detail does not show any problem. Do you think there is a problem in my vtk output or dm setup. > > > > > > > > > By default, Paraview does not show interior edges. 
You have to use a filter, like "Extract Edges". > > > > > > Thanks, > > > > > > Matt > > > > > > > Thanks > > > > Amir > > > > DM_0x84000000_0 in 3 dimensions: > > > > 0-cells: 27 > > > > 1-cells: 54 > > > > 2-cells: 36 > > > > 3-cells: 8 > > > > Labels: > > > > depth: 4 strata with value/size (0 (27), 1 (54), 2 (36), 3 (8)) > > > > > > > > > > > > > -- > > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > > -- Norbert Wiener > > > > > > > > > https://www.cse.buffalo.edu/~knepley/ (https://link.getmailspring.com/link/CE107FD7-8BE8-492B-98A7-90F4EEAA18F2 at getmailspring.com/1?redirect=http%3A%2F%2Fwww.cse.buffalo.edu%2F~knepley%2F&recipient=a25lcGxleUBnbWFpbC5jb20%3D) > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > > https://www.cse.buffalo.edu/~knepley/ (https://link.getmailspring.com/link/D5CB7218-13F0-46E6-9D32-1BEE866C43CA at getmailspring.com/1?redirect=http%3A%2F%2Fwww.cse.buffalo.edu%2F~knepley%2F&recipient=cGV0c2MtdXNlcnNAbWNzLmFubC5nb3Y%3D) -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Wed Sep 11 02:05:48 2019 From: jroman at dsic.upv.es (Jose E. Roman) Date: Wed, 11 Sep 2019 09:05:48 +0200 Subject: [petsc-users] SLEPC: partitioning for CISS In-Reply-To: <3e31f1fb-5218-fd97-9eec-3f5ec8a53a19@purdue.edu> References: <3e31f1fb-5218-fd97-9eec-3f5ec8a53a19@purdue.edu> Message-ID: <3FDCA9FA-9747-4C7D-BA7B-A9322D9EF273@dsic.upv.es> If you run with debugging enabled you will get an error at EPSSetOperators() saying that "eps" and "matrix_petsc" have different communicators. You cannot pass a sequential matrix to a parallel solver. Jose > El 11 sept 2019, a las 5:46, Povolotskyi, Mykhailo via petsc-users escribi?: > > Hello, > I'm currently using CISS via SLEPc. > I would like to use parallelization by partitioning over quadrature points, using the option -eps_ciss_partitions. > > I have done the following: > > 1. created a matrix on each MPI rank > > MatCreateDense(MPI_COMM_SELF, matrix_size, matrix_size, PETSC_DECIDE,PETSC_DECIDE,NULL,&matrix_petsc); > > 2. Created EPS object > > EPSCreate(MPI_COMM_WORLD,&eps); > EPSSetOperators( eps,matrix_petsc,NULL); > EPSSetType(eps,EPSCISS); > EPSSetProblemType(eps, EPS_NHEP); > > EPSSetFromOptions(eps); > > > EPSGetRG(eps,&rg); > RGSetType(rg,RGRING); > > RGRingSetParameters(rg,center,radius,vscale,start_ang,end_ang,width); > > EPSSolve(eps); > > 3. Then I run the code: > > mpiexec -n 2 ./a.out -eps_ciss_partitions 2 > > The code gives an error: > > [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [1]PETSC ERROR: Arguments are incompatible > [1]PETSC ERROR: MatMatMultSymbolic requires A, seqdense, to be compatible with B, mpidense > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> [1]PETSC ERROR: Petsc Release Version 3.8.4, Mar, 24, 2018 > [1]PETSC ERROR: ../lib/a.out on a linux-complex named brown-a337.rcac.purdue.edu by mpovolot Tue Sep 10 23:43:44 2019 > [1]PETSC ERROR: Configure options --with-scalar-type=complex --with-x=0 --with-hdf5 --download-hdf5=1 --with-single-library=1 --with-pic=1 --with-shared-libraries=0 --with-log=0 --with-clanguage=C++ --CXXFLAGS="-fopenmp -fPIC" --CFLAGS="-fopenmp -fPIC" --with-fortran=0 --FFLAGS="-fopenmp -fPIC" --with-debugging=0 --with-cc=mpicc --with-fc=mpif90 --with-cxx=mpicxx COPTFLAGS= CXXOPTFLAGS= FOPTFLAGS= --download-metis=1 --download-parmetis=1 --with-valgrind-dir=/apps/brown/valgrind/3.13.0_gcc-4.8.5 --download-mumps=1 --with-fortran-kernels=0 --download-superlu_dist=1 --with-blaslapack-lib="-L/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_gnu_thread -lmkl_core " --with-blacs-lib=/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so --with-blacs-include=/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/include --with-scalapack-lib="-Wl,-rpath,/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/lib/intel64 -L/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/lib/intel64 -lmkl_gf_lp64 -lmkl_gnu_thread -lmkl_core -lpthread -L/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/lib/intel64 -lmkl_scalapack_lp64 -lmkl_blacs_intelmpi_lp64" --with-scalapack-include=/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/include > [1]PETSC ERROR: #463 MatMatMultSymbolic() line 9692 in /depot/kildisha/apps/brown/nemo5/libs/petsc/build-cplx/src/mat/interface/matrix.c > [1]PETSC ERROR: #464 BVMatMult_Svec() line 229 in /depot/kildisha/apps/brown/nemo5/libs/slepc/build-cplx/src/sys/classes/bv/impls/svec/svec.c > [1]PETSC ERROR: #465 BVMatMult() line 589 in /depot/kildisha/apps/brown/nemo5/libs/slepc/build-cplx/src/sys/classes/bv/interface/bvops.c > [1]PETSC ERROR: #466 BVMatProject_MatMult() line 903 in /depot/kildisha/apps/brown/nemo5/libs/slepc/build-cplx/src/sys/classes/bv/interface/bvglobal.c > [1]PETSC ERROR: #467 BVMatProject() line 1151 in /depot/kildisha/apps/brown/nemo5/libs/slepc/build-cplx/src/sys/classes/bv/interface/bvglobal.c > [1]PETSC ERROR: #468 EPSSolve_CISS() line 1066 in /depot/kildisha/apps/brown/nemo5/libs/slepc/build-cplx/src/eps/impls/ciss/ciss.c > [1]PETSC ERROR: #469 EPSSolve() line 147 in /depot/kildisha/apps/brown/nemo5/libs/slepc/build-cplx/src/eps/interface/epssolve.c > > Could you, please, tell me what am I doing wrong? > > > From knepley at gmail.com Wed Sep 11 06:33:44 2019 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 11 Sep 2019 07:33:44 -0400 Subject: [petsc-users] View 3D DMPlex In-Reply-To: References: Message-ID: On Wed, Sep 11, 2019 at 3:05 AM Amir wrote: > I tried DMPlexCreateBoxMesh(). It shows points and nodes on the boundary > surface. Plus, I do see easily and correctly the interior edge/node inside > by clipping the volume mesh. Do you think my dm setup is wrong that I cant > see the same using my DAG settings in Paraview. > Yes. You can see the order of the BoxMesh cells by using -dm_view ::ascii_info_detail Thanks, Matt > Thanks > Amir > > On Sep 11 2019, at 2:27 am, Matthew Knepley wrote: > > On Tue, Sep 10, 2019 at 12:20 PM Amir > > wrote: > [image: Sent from Mailspring] > > > > The mesh contains a cube. I tried to change the ordering in cone of dm. 
In > some ordering, in Paraview, I also noticed too many interior edges and saw > the interior nodes. I have not yet been able to see the interior edge > correctly placed. Do you suggest to output in other format? I do not really > know where this misplacing of edges comes from. > > > How about first using DMPlexCreateBoxMesh(), and seeing if you can > visualize it. Then if that works, we can talk about > inputting your mesh from scratch. > > Thanks, > > Matt > > > PetscInt numPoints[2] = {27, 8}; > PetscInt coneSize[35] = {8,8,8,8,8,8,8,8, > 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0}; > PetscInt cones[64] = { 8, 9, 16, 15, 17, 24, 25, 18, > 9, 10,11, 16, 18, 25, 20, 19, > 16,11, 12,13, 25, 22, 21, 20, > 15,16, 13,14, 24, 23, 22, 25, /////////// Not see the interior node > and edge > 17,18,25, 24, 26, 33, 34, 27, > 18,19,20, 25, 27, 34, 29, 28, > 25,20,21, 22, 34, 31, 30, 29, > 24,25,22, 23, 33, 32, 31, 34} ; > PetscInt cones2[64] = { 8, 15, 16, 9, 17, 18, 25, 24, > 9, 16,11, 10, 18, 19, 20, 25, /////////// See the > interior node and edge > 16,13, 12,11, 25, 20, 21, 22, > 15,14, 13,16, 24, 25, 22, 23, > 17,18,25, 24, 26, 27, 34, 33, > 18,19,20, 25, 27, 28, 29, 34, > 25,20,21, 22, 34, 29, 30, 31, > 24,25,22, 23, 33, 34, 31, 32} ; > PetscInt coneOrientations[64] = > {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0}; > PetscScalar vertexCoords[81] = {0.0,0.0,0.0, 1.0,0.0, 0.0, > 2.0,0.0,0.0, 2.0,1.0,0.0, 2.0,2.0, 0.0, 1.0,2.0,0.0, > 0.0,2.0,0.0, 0.0,1.0, 0.0, 1.0,1.0,0.0, > 0.0,0.0,1.0, 1.0,0.0, 1.0, 2.0,0.0,1.0, 2.0,1.0,1.0, 2.0,2.0, > 1.0, 1.0,2.0,1.0, > 0.0,2.0,1.0, 0.0,1.0, 1.0, 1.0,1.0,1.0, > 0.0,0.0,2.0, 1.0,0.0, 2.0, 2.0,0.0,2.0, 2.0,1.0,2.0, 2.0,2.0, > 2.0, 1.0,2.0,2.0, > 0.0,2.0,2.0, 0.0,1.0, 2.0, 1.0,1.0,2.0}; > Thanks > Amir > > ---------- Forwarded Message --------- > > From: Matthew Knepley > Subject: Re: [petsc-users] View 3D DMPlex > Date: Sep 10 2019, at 9:08 pm > To: Amir > Cc: PETSc > > On Tue, Sep 10, 2019 at 9:00 AM Amir via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > Hi > I am trying to view a cubic mesh constructed by DMPlex. I noticed that the > interior point is not seen in the output VTU file. > It means that I do not see an edge inside the cube. I tried to check some > detail of DM using --in_dm_view. The detail does not show any problem. Do > you think there is a problem in my vtk output or dm setup. > > > By default, Paraview does not show interior edges. You have to use a > filter, like "Extract Edges". > > Thanks, > > Matt > > > Thanks > Amir > DM_0x84000000_0 in 3 dimensions: > 0-cells: 27 > 1-cells: 54 > 2-cells: 36 > 3-cells: 8 > Labels: > depth: 4 strata with value/size (0 (27), 1 (54), 2 (36), 3 (8)) > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > On Sep 10 2019, at 9:08 pm, Matthew Knepley wrote: > > On Tue, Sep 10, 2019 at 9:00 AM Amir via petsc-users < > petsc-users at mcs.anl.gov > > > wrote: > > Hi > I am trying to view a cubic mesh constructed by DMPlex. I noticed that the > interior point is not seen in the output VTU file. > It means that I do not see an edge inside the cube. I tried to check some > detail of DM using --in_dm_view. The detail does not show any problem. Do > you think there is a problem in my vtk output or dm setup. 
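For reference, a minimal sketch of the check suggested in this thread: build a small hexahedral cube with DMPlexCreateBoxMesh(), print its cone ordering with -dm_view ::ascii_info_detail, and compare it against the hand-built DAG above. This is only a sketch: the DMPlexCreateBoxMesh() argument list below is the PETSc 3.11/3.12-era one and may differ in other releases, and the 2x2x2 cell count is an illustrative assumption.

  #include <petscdmplex.h>

  int main(int argc, char **argv)
  {
    DM             dm;
    PetscInt       faces[3] = {2, 2, 2};   /* 2x2x2 hexahedra, like the hand-built cube above */
    PetscErrorCode ierr;

    ierr = PetscInitialize(&argc, &argv, NULL, NULL); if (ierr) return ierr;
    /* simplex = PETSC_FALSE gives hexes; interpolate = PETSC_TRUE creates the edges and faces */
    ierr = DMPlexCreateBoxMesh(PETSC_COMM_WORLD, 3, PETSC_FALSE, faces, NULL, NULL, NULL, PETSC_TRUE, &dm);CHKERRQ(ierr);
    ierr = DMSetFromOptions(dm);CHKERRQ(ierr);
    /* run with -dm_view ::ascii_info_detail to print the cone of every cell */
    ierr = DMViewFromOptions(dm, NULL, "-dm_view");CHKERRQ(ierr);
    ierr = DMDestroy(&dm);CHKERRQ(ierr);
    ierr = PetscFinalize();
    return ierr;
  }

Running this with -dm_view ::ascii_info_detail lists each cell's cone, which can be compared directly with the cones[] array given earlier in the thread.
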
> > > By default, Paraview does not show interior edges. You have to use a > filter, like "Extract Edges". > > Thanks, > > Matt > > > Thanks > Amir > DM_0x84000000_0 in 3 dimensions: > 0-cells: 27 > 1-cells: 54 > 2-cells: 36 > 3-cells: 8 > Labels: > depth: 4 strata with value/size (0 (27), 1 (54), 2 (36), 3 (8)) > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From juaneah at gmail.com Thu Sep 12 14:19:48 2019 From: juaneah at gmail.com (Emmanuel Ayala) Date: Thu, 12 Sep 2019 14:19:48 -0500 Subject: [petsc-users] DMDAGetElements and global/local element number Message-ID: Hi everyone, it would be great if someone can give me a hint for this issue, i have been trying to figure out how to solve it, but i did not succeed I'm using DMDA to generate a 3D mesh (DMDA_ELEMENT_Q1). I'm trying to fill a MPI matrix with some values wich are related to the dofs of each element node, moreover i need to set this values based on the element number. Something like: mpi_A(total_elements X total_dofs) total_dofs row_0 (element_0) a_0 a_1 a_2 ... a_23 row_1 (element_1) a_0 a_1 a_2 ... a_23 row_2 (element_2) a_0 a_1 a_2 ... a_23 . . . row_n (element_n) a_0 a_1 a_2 ... a_23 The element number is related to the row index. And the matrix values are set depending of the DOFs related to the element. With DMDAGetElements i can read the LOCAL nodes connected to the element and then the DOFs associated to the element. I can handle the local and global relations with DMGetLocalToGlobalMapping, MatSetLocalToGlobalMapping and MatSetValuesLocal. BUT i CAN NOT understand how to know the element number in LOCAL or GLOBAL contex. DMDAGetElements gives the NUMBER OF ELEMENTS owned in the local process, but there is not any information about the local or global ELEMENT NUMBER. How to know the local or global element number related to the data provided by DMDAGetElements? Thank you. -------------- next part -------------- An HTML attachment was scrubbed... URL: From dave.mayhem23 at gmail.com Thu Sep 12 15:21:33 2019 From: dave.mayhem23 at gmail.com (Dave May) Date: Thu, 12 Sep 2019 21:21:33 +0100 Subject: [petsc-users] DMDAGetElements and global/local element number In-Reply-To: References: Message-ID: On Thu, 12 Sep 2019 at 20:21, Emmanuel Ayala via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hi everyone, it would be great if someone can give me a hint for this > issue, i have been trying to figure out how to solve it, but i did not > succeed > > I'm using DMDA to generate a 3D mesh (DMDA_ELEMENT_Q1). I'm trying to fill > a MPI matrix with some values wich are related to the dofs of each element > node, moreover i need to set this values based on the element number. > Something like: > > mpi_A(total_elements X total_dofs) > > total_dofs > row_0 (element_0) a_0 a_1 a_2 ... 
a_23 > row_1 (element_1) a_0 a_1 a_2 ... a_23 > row_2 (element_2) > a_0 a_1 a_2 ... a_23 > . > . > . > row_n (element_n) a_0 a_1 a_2 ... a_23 > > The element number is related to the row index. And the matrix values are > set depending of the DOFs related to the element. > > With DMDAGetElements i can read the LOCAL nodes connected to the element > and then the DOFs associated to the element. I can handle the local and > global relations with DMGetLocalToGlobalMapping, MatSetLocalToGlobalMapping > and MatSetValuesLocal. BUT i CAN NOT understand how to know the element > number in LOCAL or GLOBAL contex. DMDAGetElements gives the NUMBER OF > ELEMENTS owned in the local process, but there is not any information about > the local or global ELEMENT NUMBER. > > How to know the local or global element number related to the data > provided by DMDAGetElements? > The DMDA defines cells of the same type (quads (2D) or hex (3D), hence every cell defines the same number of vertices. DMDAGetElements(DM dm,PetscInt *nel,PetscInt *nen,const PetscInt *e[]) nel - number of local elements nen - number of element nodes e - the local indices of the elements' vertices e[] defines the ordering of the elements. e[] is an array containing all of the element-vertex maps. Since each element in the DMDA has the same number of vertices, the first nen values in e[] correspond to the vertices (local index) associated with the first element. The next nen values in e[] correspond to the vertices of the second element. The vertices for any (local) element with the index "cid" can be sought via e[nen*cid + i] where i would range from 0 to nen-1. Why would you ever want, or need, the global element number? What is the use case? Thanks, Dave > > Thank you. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dave.mayhem23 at gmail.com Thu Sep 12 17:01:06 2019 From: dave.mayhem23 at gmail.com (Dave May) Date: Thu, 12 Sep 2019 23:01:06 +0100 Subject: [petsc-users] DMDAGetElements and global/local element number In-Reply-To: References: Message-ID: Please always use "reply-all" so that your messages go to the list. This is standard mailing list etiquette. It is important to preserve threading for people who find this discussion later and so that we do not waste our time re-answering the same questions that have already been answered in private side-conversations. You'll likely get an answer faster that way too. On Thu, 12 Sep 2019 at 22:26, Emmanuel Ayala wrote: > Thank you for the answer. > > El jue., 12 de sep. de 2019 a la(s) 15:21, Dave May ( > dave.mayhem23 at gmail.com) escribi?: > >> >> >> On Thu, 12 Sep 2019 at 20:21, Emmanuel Ayala via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> >>> Hi everyone, it would be great if someone can give me a hint for this >>> issue, i have been trying to figure out how to solve it, but i did not >>> succeed >>> >>> I'm using DMDA to generate a 3D mesh (DMDA_ELEMENT_Q1). I'm trying to >>> fill a MPI matrix with some values wich are related to the dofs of each >>> element node, moreover i need to set this values based on the element >>> number. Something like: >>> >>> mpi_A(total_elements X total_dofs) >>> >>> total_dofs >>> row_0 (element_0) a_0 a_1 a_2 ... a_23 >>> row_1 (element_1) a_0 a_1 a_2 ... >>> a_23 >>> row_2 (element_2) >>> a_0 a_1 a_2 ... a_23 >>> . >>> . >>> . >>> row_n (element_n) a_0 a_1 a_2 ... a_23 >>> >>> The element number is related to the row index. 
And the matrix values >>> are set depending of the DOFs related to the element. >>> >>> With DMDAGetElements i can read the LOCAL nodes connected to the element >>> and then the DOFs associated to the element. I can handle the local and >>> global relations with DMGetLocalToGlobalMapping, MatSetLocalToGlobalMapping >>> and MatSetValuesLocal. BUT i CAN NOT understand how to know the element >>> number in LOCAL or GLOBAL contex. DMDAGetElements gives the NUMBER OF >>> ELEMENTS owned in the local process, but there is not any information about >>> the local or global ELEMENT NUMBER. >>> >>> How to know the local or global element number related to the data >>> provided by DMDAGetElements? >>> >> >> The DMDA defines cells of the same type (quads (2D) or hex (3D), hence >> every cell defines the same number of vertices. >> > >> DMDAGetElements(DM dm,PetscInt *nel,PetscInt *nen,const PetscInt *e[]) >> nel - number of local elements >> nen - number of element nodes >> e - the local indices of the elements' vertices >> >> e[] defines the ordering of the elements. e[] is an array containing all >> of the element-vertex maps. Since each element in the DMDA has the same >> number of vertices, the first nen values in e[] correspond to the vertices >> (local index) associated with the first element. The next nen values in e[] >> correspond to the vertices of the second element. The vertices for any >> (local) element with the index "cid" can be sought via e[nen*cid + i] where >> i would range from 0 to nen-1. >> >> > You are right. I can handle the local information, i think the idea is: > > for ( PetscInt i = 0; i < nel; i++ ) > for (PetscInt j = 0; j < nen; j++) > PetscSynchronizedPrintf(PETSC_COMM_WORLD,"local element %d : > e[%d] = %d\n", i, j, e[i*nen+j]); > > BUT, it does not give information regarding to the ELEMENT identifier > (number). I need the element number to ordering the elements inside of a > MPI matrix. I want to access to each element data by means of the matrix > row . I mean, in the row_0 there is the information (spreading through the > columns) of the element_0. > I think this is a mis-understanding. The element number is not related to a row in the matrix. The element is associated with vertices (basis functions), and each vertex (basis) in the DMDA is given a unique index. The index of that basis corresponds to a row (column) if it's a test (trial) function. So if you have any element defined by the array e[], you know how to insert values into a matrix by using the vertex indices. > > The element vertices are numbered starting from 0, for each process. It > does not give information about the element number. > > Why would you ever want, or need, the global element number? What is the >> use case? >> > > I'm performing topology optimization, and it is part of gradient > computation. I'm already have the analytic gradient. > I need the global element number to link nodal displacements with the > element. > Any nodal displacement, except those at the corners of your physical domain are associated with multiple elements. There isn't a one-to-one map between nodes and elements. > I can use a local element number just if I have a equivalence between this > local number and the global element number. > I obviously don't understand what you want to do. However, here is one way to achieve what you are asking for: specifically relating local element indices to global indices. 
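For reference, a minimal sketch combining the e[] traversal described above with the centroid-based "natural" element number worked out in the recipe spelled out just below (the 2D formula given there is extended to 3D here). It is only a sketch under stated assumptions: a non-periodic 3D DMDA with Q1 elements, coordinates set over the unit box with DMDASetUniformCoordinates(da,0,1,0,1,0,1) so that there is one fewer element than grid point per direction, and illustrative variable names.

  #include <petscdmda.h>

  /* Hedged sketch: da is a 3D DMDA set up for Q1 elements, with coordinates
     already set over the unit box, e.g. DMDASetUniformCoordinates(da,0,1,0,1,0,1). */
  static PetscErrorCode ListNaturalElementNumbers(DM da)
  {
    PetscInt           nel, nen, c, j, M, N, P, mx, my, mz;
    const PetscInt    *e;
    Vec                lcoor;
    const PetscScalar *coords;
    PetscErrorCode     ierr;

    PetscFunctionBeginUser;
    ierr = DMDAGetInfo(da, NULL, &M, &N, &P, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL);CHKERRQ(ierr);
    mx = M - 1; my = N - 1; mz = P - 1;            /* elements per direction = grid points minus one */
    ierr = DMGetCoordinatesLocal(da, &lcoor);CHKERRQ(ierr);
    ierr = VecGetArrayRead(lcoor, &coords);CHKERRQ(ierr);
    ierr = DMDAGetElements(da, &nel, &nen, &e);CHKERRQ(ierr);
    for (c = 0; c < nel; c++) {                    /* c is the local element index */
      PetscReal cx = 0.0, cy = 0.0, cz = 0.0;
      PetscInt  ei, ej, ek, natural_id;
      for (j = 0; j < nen; j++) {                  /* centroid = average of the element's vertex coordinates */
        const PetscInt v = e[c*nen + j];           /* local vertex index */
        cx += PetscRealPart(coords[3*v + 0]);
        cy += PetscRealPart(coords[3*v + 1]);
        cz += PetscRealPart(coords[3*v + 2]);
      }
      cx /= nen; cy /= nen; cz /= nen;
      ei = (PetscInt)(cx*mx);                      /* cx/dx with dx = 1/mx on the unit box */
      ej = (PetscInt)(cy*my);
      ek = (PetscInt)(cz*mz);
      natural_id = ei + ej*mx + ek*mx*my;          /* natural ("global") element number */
      ierr = PetscSynchronizedPrintf(PetscObjectComm((PetscObject)da), "local element %D -> natural element %D\n", c, natural_id);CHKERRQ(ierr);
    }
    ierr = PetscSynchronizedFlush(PetscObjectComm((PetscObject)da), PETSC_STDOUT);CHKERRQ(ierr);
    ierr = DMDARestoreElements(da, &nel, &nen, &e);CHKERRQ(ierr);
    ierr = VecRestoreArrayRead(lcoor, &coords);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }

Whether the natural element number is actually the right key for the optimization data is a separate question, as noted at the end of the recipe.
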
Assuming by "global element number" you are referring to what PETSc calls the "natural" ordering, then the dumbest way to convert from the local element index to the natural element index is the following: (i) upon creation, use DMSetUniformCoordinates to define a unit 1 box; (ii) compute the cell dimensions dx, dy associated with your uniform grid layout; (iii) traverse through the e[] array return by DMDAGetElements. For each element, get the vertices and compute the centroid cx, cy and then compute PetscInt J = (PetscInt)(cy/dy); PetscInt I = (PetscInt)(cx/dx); PetscInt natural_id = I + J * mx; where mx is the number of elements in the i direction in your domain (not the sub-domain). You can determine mx by calling DMDAGetInfo(dm,NULL,&mx,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL); (iv) after you have computed natural_id for every element, layout the global coordinates of the DMDA however you want. Following that, you must call the following DM dm, cdm; Vec coor,lcoor; DMGetCoordinateDM(dm,&cdm); DMGetCoordinates(dm,&coor); DMGetCoordinatesLocal(dm,&lcoor); DMGlobalToLocal(cmd,coor,INSERT_VALUES,lcoor); The above must be executed to ensure the new coordinates values are propagated to coords associated with your local sub-domain. I doubt this will solve your _actual_ problem. Thanks, Dave > >> Thanks, >> Dave >> >> >> >> >>> >>> Thank you. >>> >> > Thanks! > -------------- next part -------------- An HTML attachment was scrubbed... URL: From a.croucher at auckland.ac.nz Thu Sep 12 21:31:25 2019 From: a.croucher at auckland.ac.nz (Adrian Croucher) Date: Fri, 13 Sep 2019 14:31:25 +1200 Subject: [petsc-users] Segfault in DMPlexDistributeFieldIS with -log_view Message-ID: <64bfa373-f58e-5ecf-51f1-4cc79d98360b@auckland.ac.nz> hi My code is using DMPlexDistributeFieldIS() to distribute an index set, and it seems to work ok, except if I run with -log_view. In that case I get the error below. The code (Fortran) looks like this: ??? call PetscSectionCreate(PETSC_COMM_WORLD, dist_section, ierr) ??? CHKERRQ(ierr) ??? call ISCreate(PETSC_COMM_WORLD, dist_index_set, ierr) ??? CHKERRQ(ierr) ??? call DMPlexDistributeFieldIS(self%dm, sf, section, & ???????? index_set, dist_section, & ???????? dist_index_set, ierr); CHKERRQ(ierr) ??? call PetscSectionDestroy(dist_section, ierr); CHKERRQ(ierr) ??? call ISDestroy(index_set, ierr); CHKERRQ(ierr) ??? index_set = dist_index_set I'm running the master branch. Any clues? - Adrian -- [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: likely location of problem given in stack below [0]PETSC ERROR: ---------------------? 
Stack Frames ------------------------------------ [1]PETSC ERROR: ------------------------------------------------------------------------ [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [1]PETSC ERROR: or see https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [1]PETSC ERROR: likely location of problem given in stack below [1]PETSC ERROR: ---------------------? Stack Frames ------------------------------------ [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [1]PETSC ERROR:?????? INSTEAD the line number of the start of the function [1]PETSC ERROR:?????? is given. [1]PETSC ERROR: [1] PetscLogEventBeginDefault line 642 /home/acro018/software/PETSc/code/src/sys/logging/utils/eventlog.c [1]PETSC ERROR: [1] DMPlexDistributeFieldIS line 831 /home/acro018/software/PETSc/code/src/dm/impls/plex/plexdistribute.c [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [0]PETSC ERROR:?????? INSTEAD the line number of the start of the function [0]PETSC ERROR:?????? is given. [0]PETSC ERROR: [0] PetscLogEventBeginDefault line 642 /home/acro018/software/PETSc/code/src/sys/logging/utils/eventlog.c [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [1]PETSC ERROR: Signal received [1]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [1]PETSC ERROR: Petsc Development GIT revision: v3.11.3-1582-gcb66735359? GIT Date: 2019-08-04 16:01:27 -0500 [1]PETSC ERROR: waiwera on a linux-gnu-c-debug named en-354401 by acro018 Fri Sep 13 14:20:41 2019 [1]PETSC ERROR: Configure options --with-x --download-hdf5 --download-zlib --download-netcdf --download-pnetcdf --download-exodusii --download-triangle --download-ptscotch --download-chaco --download-hypre [1]PETSC ERROR: #1 User provided function() line 0 in? unknown file [0] DMPlexDistributeFieldIS line 831 /home/acro018/software/PETSc/code/src/dm/impls/plex/plexdistribute.c [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Signal received [0]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Development GIT revision: v3.11.3-1582-gcb66735359? GIT Date: 2019-08-04 16:01:27 -0500 [0]PETSC ERROR: waiwera on a linux-gnu-c-debug named en-354401 by acro018 Fri Sep 13 14:20:41 2019 [0]PETSC ERROR: Configure options --with-x --download-hdf5 --download-zlib --download-netcdf --download-pnetcdf --download-exodusii --download-triangle --download-ptscotch --download-chaco --download-hypre [0]PETSC ERROR: #1 User provided function() line 0 in? 
unknown file -- Dr Adrian Croucher Senior Research Fellow Department of Engineering Science University of Auckland, New Zealand email: a.croucher at auckland.ac.nz tel: +64 (0)9 923 4611 From a.croucher at auckland.ac.nz Thu Sep 12 21:41:53 2019 From: a.croucher at auckland.ac.nz (Adrian Croucher) Date: Fri, 13 Sep 2019 14:41:53 +1200 Subject: [petsc-users] Segfault in DMPlexDistributeFieldIS with -log_view In-Reply-To: <64bfa373-f58e-5ecf-51f1-4cc79d98360b@auckland.ac.nz> References: <64bfa373-f58e-5ecf-51f1-4cc79d98360b@auckland.ac.nz> Message-ID: <6792186c-5db2-49f9-038f-0b62527a068f@auckland.ac.nz> PS if I run with -start_in_debugger and do a backtrace I get the following: Thread 1 "waiwera" received signal SIGSEGV, Segmentation fault. 0x00007fe12231b761 in PetscObjectComm (obj=0xfffffffffffffffe) ??? at /home/acro018/software/PETSc/code/src/sys/objects/gcomm.c:33 33??????? return obj->comm; (gdb) bt #0? 0x00007fe12231b761 in PetscObjectComm (obj=0xfffffffffffffffe) ??? at /home/acro018/software/PETSc/code/src/sys/objects/gcomm.c:33 #1? 0x00007fe12231488c in PetscLogEventBeginDefault (event=133, t=0, ??? o1=0xfffffffffffffffe, o2=0x0, o3=0x0, o4=0x0) ??? at /home/acro018/software/PETSc/code/src/sys/logging/utils/eventlog.c:647 #2? 0x00007fe1231bcbbd in DMPlexDistributeFieldIS (dm=0xfffffffffffffffe, ??? pointSF=0x55857b89eb40, originalSection=0x55857b7bdaf0, ??? originalIS=0x55857b793270, newSection=0x55857b8a26d0, newIS=0x7ffc52486d58) ??? at /home/acro018/software/PETSc/code/src/dm/impls/plex/plexdistribute.c:832 #3? 0x00007fe1233e5c9a in dmplexdistributefieldis_ ( ??? dm=0x55857a4cdea8 , ??? pointSF=0x55857a4cdf20 , originalSection=0x7ffc52486d88, ??? originalIS=0x55857a4cdf10 , newSection=0x7ffc52486d50, ??? newIS=0x7ffc52486d58, __ierr=0x7ffc52486d4c) ??? at /home/acro018/software/PETSc/code/src/dm/impls/plex/ftn-auto/plexdistributef.c:135 #4? 0x000055857a202cc8 in mesh_module::mesh_distribute_index_set (self=..., ??? sf=..., section=..., index_set=...) at ../src/mesh.F90:3393 #5? 0x000055857a22b60c in mesh_module::mesh_distribute (self=...) ??? at ../src/mesh.F90:152 #6? 0x000055857a224626 in mesh_module::mesh_configure (self=..., gravity=..., ??? json=0x55857b574180, logfile=..., err=0) at ../src/mesh.F90:826 #7? 0x000055857a1baf14 in flow_simulation_module::flow_simulation_init ( ---Type to continue, or q to quit---Quit On 13/09/19 2:31 PM, Adrian Croucher wrote: > hi > > My code is using DMPlexDistributeFieldIS() to distribute an index set, > and it seems to work ok, except if I run with -log_view. > > In that case I get the error below. > > The code (Fortran) looks like this: > > ??? call PetscSectionCreate(PETSC_COMM_WORLD, dist_section, ierr) > ??? CHKERRQ(ierr) > ??? call ISCreate(PETSC_COMM_WORLD, dist_index_set, ierr) > ??? CHKERRQ(ierr) > ??? call DMPlexDistributeFieldIS(self%dm, sf, section, & > ???????? index_set, dist_section, & > ???????? dist_index_set, ierr); CHKERRQ(ierr) > ??? call PetscSectionDestroy(dist_section, ierr); CHKERRQ(ierr) > ??? call ISDestroy(index_set, ierr); CHKERRQ(ierr) > ??? index_set = dist_index_set > > I'm running the master branch. > > Any clues? 
> > - Adrian > > -- > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > [0]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > [0]PETSC ERROR: or see > https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac > OS X to find memory corruption errors > [0]PETSC ERROR: likely location of problem given in stack below > [0]PETSC ERROR: ---------------------? Stack Frames > ------------------------------------ > [1]PETSC ERROR: > ------------------------------------------------------------------------ > [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > [1]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > [1]PETSC ERROR: or see > https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac > OS X to find memory corruption errors > [1]PETSC ERROR: likely location of problem given in stack below > [1]PETSC ERROR: ---------------------? Stack Frames > ------------------------------------ > [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > [1]PETSC ERROR:?????? INSTEAD the line number of the start of the > function > [1]PETSC ERROR:?????? is given. > [1]PETSC ERROR: [1] PetscLogEventBeginDefault line 642 > /home/acro018/software/PETSc/code/src/sys/logging/utils/eventlog.c > [1]PETSC ERROR: [1] DMPlexDistributeFieldIS line 831 > /home/acro018/software/PETSc/code/src/dm/impls/plex/plexdistribute.c > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > [0]PETSC ERROR:?????? INSTEAD the line number of the start of the > function > [0]PETSC ERROR:?????? is given. > [0]PETSC ERROR: [0] PetscLogEventBeginDefault line 642 > /home/acro018/software/PETSc/code/src/sys/logging/utils/eventlog.c > [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [1]PETSC ERROR: Signal received > [1]PETSC ERROR: See > https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble > shooting. > [1]PETSC ERROR: Petsc Development GIT revision: > v3.11.3-1582-gcb66735359? GIT Date: 2019-08-04 16:01:27 -0500 > [1]PETSC ERROR: waiwera on a linux-gnu-c-debug named en-354401 by > acro018 Fri Sep 13 14:20:41 2019 > [1]PETSC ERROR: Configure options --with-x --download-hdf5 > --download-zlib --download-netcdf --download-pnetcdf > --download-exodusii --download-triangle --download-ptscotch > --download-chaco --download-hypre > [1]PETSC ERROR: #1 User provided function() line 0 in? unknown file > [0] DMPlexDistributeFieldIS line 831 > /home/acro018/software/PETSc/code/src/dm/impls/plex/plexdistribute.c > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: Signal received > [0]PETSC ERROR: See > https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble > shooting. > [0]PETSC ERROR: Petsc Development GIT revision: > v3.11.3-1582-gcb66735359? 
GIT Date: 2019-08-04 16:01:27 -0500 > [0]PETSC ERROR: waiwera on a linux-gnu-c-debug named en-354401 by > acro018 Fri Sep 13 14:20:41 2019 > [0]PETSC ERROR: Configure options --with-x --download-hdf5 > --download-zlib --download-netcdf --download-pnetcdf > --download-exodusii --download-triangle --download-ptscotch > --download-chaco --download-hypre > [0]PETSC ERROR: #1 User provided function() line 0 in? unknown file > -- Dr Adrian Croucher Senior Research Fellow Department of Engineering Science University of Auckland, New Zealand email: a.croucher at auckland.ac.nz tel: +64 (0)9 923 4611 From a.croucher at auckland.ac.nz Thu Sep 12 21:59:06 2019 From: a.croucher at auckland.ac.nz (Adrian Croucher) Date: Fri, 13 Sep 2019 14:59:06 +1200 Subject: [petsc-users] Segfault in DMPlexDistributeFieldIS with -log_view In-Reply-To: <64bfa373-f58e-5ecf-51f1-4cc79d98360b@auckland.ac.nz> References: <64bfa373-f58e-5ecf-51f1-4cc79d98360b@auckland.ac.nz> Message-ID: <2d5f8a40-c198-b804-4934-b9139a9fb54e@auckland.ac.nz> It's OK, I found the problem. I was accidentally passing in a DM which hadn't been created yet. This didn't matter when not using -log_view, because it looks like the DM isn't actually used for anything else inside DMPlexDistributeFieldIS(). But when you run with -log_view it tries to get its communicator. - Adrian On 13/09/19 2:31 PM, Adrian Croucher wrote: > hi > > My code is using DMPlexDistributeFieldIS() to distribute an index set, > and it seems to work ok, except if I run with -log_view. > > In that case I get the error below. > > The code (Fortran) looks like this: > > ??? call PetscSectionCreate(PETSC_COMM_WORLD, dist_section, ierr) > ??? CHKERRQ(ierr) > ??? call ISCreate(PETSC_COMM_WORLD, dist_index_set, ierr) > ??? CHKERRQ(ierr) > ??? call DMPlexDistributeFieldIS(self%dm, sf, section, & > ???????? index_set, dist_section, & > ???????? dist_index_set, ierr); CHKERRQ(ierr) > ??? call PetscSectionDestroy(dist_section, ierr); CHKERRQ(ierr) > ??? call ISDestroy(index_set, ierr); CHKERRQ(ierr) > ??? index_set = dist_index_set > > -- Dr Adrian Croucher Senior Research Fellow Department of Engineering Science University of Auckland, New Zealand email: a.croucher at auckland.ac.nz tel: +64 (0)9 923 4611 From juaneah at gmail.com Thu Sep 12 23:53:02 2019 From: juaneah at gmail.com (Emmanuel Ayala) Date: Thu, 12 Sep 2019 23:53:02 -0500 Subject: [petsc-users] DMDAGetElements and global/local element number In-Reply-To: References: Message-ID: El jue., 12 de sep. de 2019 a la(s) 17:01, Dave May (dave.mayhem23 at gmail.com) escribi?: > > Please always use "reply-all" so that your messages go to the list. > This is standard mailing list etiquette. It is important to preserve > threading for people who find this discussion later and so that we do > not waste our time re-answering the same questions that have already > been answered in private side-conversations. You'll likely get an > answer faster that way too. > > On Thu, 12 Sep 2019 at 22:26, Emmanuel Ayala wrote: > >> Thank you for the answer. >> >> El jue., 12 de sep. de 2019 a la(s) 15:21, Dave May ( >> dave.mayhem23 at gmail.com) escribi?: >> >>> >>> >>> On Thu, 12 Sep 2019 at 20:21, Emmanuel Ayala via petsc-users < >>> petsc-users at mcs.anl.gov> wrote: >>> >>>> Hi everyone, it would be great if someone can give me a hint for this >>>> issue, i have been trying to figure out how to solve it, but i did not >>>> succeed >>>> >>>> I'm using DMDA to generate a 3D mesh (DMDA_ELEMENT_Q1). 
I'm trying to >>>> fill a MPI matrix with some values wich are related to the dofs of each >>>> element node, moreover i need to set this values based on the element >>>> number. Something like: >>>> >>>> mpi_A(total_elements X total_dofs) >>>> >>>> total_dofs >>>> row_0 (element_0) a_0 a_1 a_2 ... a_23 >>>> row_1 (element_1) a_0 a_1 a_2 ... >>>> a_23 >>>> row_2 (element_2) >>>> a_0 a_1 a_2 ... a_23 >>>> . >>>> . >>>> . >>>> row_n (element_n) a_0 a_1 a_2 ... a_23 >>>> >>>> The element number is related to the row index. And the matrix values >>>> are set depending of the DOFs related to the element. >>>> >>>> With DMDAGetElements i can read the LOCAL nodes connected to the >>>> element and then the DOFs associated to the element. I can handle the local >>>> and global relations with DMGetLocalToGlobalMapping, >>>> MatSetLocalToGlobalMapping and MatSetValuesLocal. BUT i CAN NOT understand >>>> how to know the element number in LOCAL or GLOBAL contex. DMDAGetElements >>>> gives the NUMBER OF ELEMENTS owned in the local process, but there is not >>>> any information about the local or global ELEMENT NUMBER. >>>> >>>> How to know the local or global element number related to the data >>>> provided by DMDAGetElements? >>>> >>> >>> The DMDA defines cells of the same type (quads (2D) or hex (3D), hence >>> every cell defines the same number of vertices. >>> >> >>> DMDAGetElements(DM dm,PetscInt *nel,PetscInt *nen,const PetscInt *e[]) >>> nel - number of local elements >>> nen - number of element nodes >>> e - the local indices of the elements' vertices >>> >>> e[] defines the ordering of the elements. e[] is an array containing all >>> of the element-vertex maps. Since each element in the DMDA has the same >>> number of vertices, the first nen values in e[] correspond to the vertices >>> (local index) associated with the first element. The next nen values in e[] >>> correspond to the vertices of the second element. The vertices for any >>> (local) element with the index "cid" can be sought via e[nen*cid + i] where >>> i would range from 0 to nen-1. >>> >>> >> You are right. I can handle the local information, i think the idea is: >> >> for ( PetscInt i = 0; i < nel; i++ ) >> for (PetscInt j = 0; j < nen; j++) >> PetscSynchronizedPrintf(PETSC_COMM_WORLD,"local element >> %d : e[%d] = %d\n", i, j, e[i*nen+j]); >> >> BUT, it does not give information regarding to the ELEMENT identifier >> (number). I need the element number to ordering the elements inside of a >> MPI matrix. I want to access to each element data by means of the matrix >> row . I mean, in the row_0 there is the information (spreading through the >> columns) of the element_0. >> > > I think this is a mis-understanding. The element number is not related to > a row in the matrix. > Sorry, I did not express myself very well: I WANT TO CREATE a matrix which let me "access to each element data by means of the matrix row". > The element is associated with vertices (basis functions), and each vertex > (basis) in the DMDA is given a unique index. The index of that basis > corresponds to a row (column) if it's a test (trial) function. So if you > have any element defined by the array e[], you know how to insert values > into a matrix by using the vertex indices. > > OK, i got it. > >> >> The element vertices are numbered starting from 0, for each process. It >> does not give information about the element number. >> >> Why would you ever want, or need, the global element number? What is the >>> use case? 
>>> >> >> I'm performing topology optimization, and it is part of gradient >> computation. I'm already have the analytic gradient. >> I need the global element number to link nodal displacements with the >> element. >> > > Any nodal displacement, except those at the corners of your physical > domain are associated with multiple elements. There isn't a one-to-one map > between nodes and elements. > > > >> I can use a local element number just if I have a equivalence between >> this local number and the global element number. >> > > I obviously don't understand what you want to do. > > Actually this is (below) the information that i need, thanks! Let me try it! :) > However, here is one way to achieve what you are asking for: specifically > relating local element indices to global indices. > Assuming by "global element number" you are referring to what PETSc calls > the "natural" ordering, then the dumbest way to convert from the local > element index to the natural element index is the following: > (i) upon creation, use DMSetUniformCoordinates to define a unit 1 box; > (ii) compute the cell dimensions dx, dy associated with your uniform grid > layout; > (iii) traverse through the e[] array return by DMDAGetElements. For each > element, get the vertices and compute the centroid cx, cy and then compute > PetscInt J = (PetscInt)(cy/dy); > PetscInt I = (PetscInt)(cx/dx); > PetscInt natural_id = I + J * mx; > where mx is the number of elements in the i direction in your domain (not > the sub-domain). > You can determine mx by calling > > DMDAGetInfo(dm,NULL,&mx,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL); > (iv) after you have computed natural_id for every element, layout the > global coordinates of the DMDA however you want. Following that, you must > call the following > > DM dm, cdm; > Vec coor,lcoor; > > DMGetCoordinateDM(dm,&cdm); > DMGetCoordinates(dm,&coor); > DMGetCoordinatesLocal(dm,&lcoor); > DMGlobalToLocal(cmd,coor,INSERT_VALUES,lcoor); > > The above must be executed to ensure the new coordinates values are > propagated to coords associated with your local sub-domain. > > I doubt this will solve your _actual_ problem. > Thanks for your time! Best regards. > Thanks, > Dave > > >> >>> Thanks, >>> Dave >>> >>> >>> >>> >>>> >>>> Thank you. >>>> >>> >> Thanks! >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From joslorgom at gmail.com Fri Sep 13 11:30:30 2019 From: joslorgom at gmail.com (=?UTF-8?Q?Jos=C3=A9_Lorenzo?=) Date: Fri, 13 Sep 2019 18:30:30 +0200 Subject: [petsc-users] VecAssembly gets stuck Message-ID: Hello, I am solving a finite element problem with Dirichlet boundary conditions using PETSC. In the boundary conditions there are two terms: a first one that is known before hand (normally zero) and a second term that depends linearly on the unknown variable itself in the whole domain. Therefore, at every time step I need to iterate as the boundary condition depends on the field and the latter depends on the BC. Moreover, the problem is nonlinear and I use a ghosted vector to represent the field. Every processor manages a portion of the domain and a portion of the boundary (if not interior). 
At every Newton iteration within the time loop the way I set the boundary conditions is as follows: First, each processor computes the known term of the BC (first term) and inserts the values into the vector call VecSetValues(H, nedge_own, edglocglo(diredg_loc) - 1, Hdir, INSERT_VALUES, ierr) call VecAssemblyBegin(H, ierr) call VecAssemblyEnd(H, ierr) As far as I understand, at this stage VecAssembly will not need to communicate to other processors as each processor only sets values to components that belong to it. Then, each processor computes its own contribution to the field-dependent term of the BC for the whole domain boundary as call VecSetValues(H, nedge_all, edgappglo(diredg_app) - 1, Hself, ADD_VALUES, ierr) call VecAssemblyBegin(H, ierr) call VecAssemblyEnd(H, ierr) In this case communication will be needed as each processor will add values to vector components that are not stored by it, and I guess it might get very busy as all the processors will need to communicate with each other. When using this strategy I don't find any issue for problems using a small amount of processors, but recently I've been solving using 90 processors and the simulation always hangs at the second VecSetValues at some random time step. It works fine for some time steps but at some point it just gets stuck and I have to cancel the simulation. I have managed to overcome this by making each processor contribute to its own components using first MPI_Reduce and then doing call VecSetValues(H, nedge_own, edgappglo(diredg_app_loc), Hself_own, ADD_VALUES, ierr) call VecAssemblyBegin(H, ierr) call VecAssemblyEnd(H, ierr) However I would like to understand whether there is something wrong in the code above. Thank you. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Sep 13 13:14:06 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Fri, 13 Sep 2019 18:14:06 +0000 Subject: [petsc-users] VecAssembly gets stuck In-Reply-To: References: Message-ID: <2AA710CA-DD4C-4652-965A-56A368C5D322@anl.gov> What version of PETSc is this? The master branch? > call VecSetValues(H, nedge_all, edgappglo(diredg_app) - 1, Hself, ADD_VALUES, ierr) So each process is providing all data for all the boundary entries in the vector? I don't think there is anything wrong with what you are doing but the mechanism that does the communication inside the VecAssembly cannot know about the structure of the communication and so will do it inefficiently. It would be useful to know where it is "stuck" that might help us improve the assembly process for your case. But I think it would better for you to just use MPI directly to put the data where it is needed. > On Sep 13, 2019, at 11:30 AM, Jos? Lorenzo via petsc-users wrote: > > Hello, > > I am solving a finite element problem with Dirichlet boundary conditions using PETSC. In the boundary conditions there are two terms: a first one that is known before hand (normally zero) and a second term that depends linearly on the unknown variable itself in the whole domain. Therefore, at every time step I need to iterate as the boundary condition depends on the field and the latter depends on the BC. Moreover, the problem is nonlinear and I use a ghosted vector to represent the field. > > Every processor manages a portion of the domain and a portion of the boundary (if not interior). 
At every Newton iteration within the time loop the way I set the boundary conditions is as follows: > > First, each processor computes the known term of the BC (first term) and inserts the values into the vector > > call VecSetValues(H, nedge_own, edglocglo(diredg_loc) - 1, Hdir, INSERT_VALUES, ierr) > call VecAssemblyBegin(H, ierr) > call VecAssemblyEnd(H, ierr) > > As far as I understand, at this stage VecAssembly will not need to communicate to other processors as each processor only sets values to components that belong to it. > > Then, each processor computes its own contribution to the field-dependent term of the BC for the whole domain boundary as > > call VecSetValues(H, nedge_all, edgappglo(diredg_app) - 1, Hself, ADD_VALUES, ierr) > call VecAssemblyBegin(H, ierr) > call VecAssemblyEnd(H, ierr) > > In this case communication will be needed as each processor will add values to vector components that are not stored by it, and I guess it might get very busy as all the processors will need to communicate with each other. > > When using this strategy I don't find any issue for problems using a small amount of processors, but recently I've been solving using 90 processors and the simulation always hangs at the second VecSetValues at some random time step. It works fine for some time steps but at some point it just gets stuck and I have to cancel the simulation. > > I have managed to overcome this by making each processor contribute to its own components using first MPI_Reduce and then doing > > call VecSetValues(H, nedge_own, edgappglo(diredg_app_loc), Hself_own, ADD_VALUES, ierr) > call VecAssemblyBegin(H, ierr) > call VecAssemblyEnd(H, ierr) > > However I would like to understand whether there is something wrong in the code above. > > Thank you. > From joslorgom at gmail.com Fri Sep 13 15:37:49 2019 From: joslorgom at gmail.com (=?UTF-8?Q?Jos=C3=A9_Lorenzo?=) Date: Fri, 13 Sep 2019 22:37:49 +0200 Subject: [petsc-users] VecAssembly gets stuck In-Reply-To: <2AA710CA-DD4C-4652-965A-56A368C5D322@anl.gov> References: <2AA710CA-DD4C-4652-965A-56A368C5D322@anl.gov> Message-ID: I'm using PETSc 3.10.2, I guess it is the master branch but I do not know for sure as I didn't install it myself. You are right, each processor provides data for all the boundary entries. I have carried out a few more tests and apparently it gets stuck during VecAssemblyBegin. I don't know whether I can be more preciseabout this. El vie., 13 sept. 2019 a las 20:14, Smith, Barry F. () escribi?: > > What version of PETSc is this? The master branch? > > > call VecSetValues(H, nedge_all, edgappglo(diredg_app) - 1, Hself, > ADD_VALUES, ierr) > > So each process is providing all data for all the boundary entries in the > vector? > > I don't think there is anything wrong with what you are doing but the > mechanism that does the communication inside the VecAssembly cannot know > about the structure of the communication and so will do it inefficiently. > It would be useful to know where it is "stuck" that might help us improve > the assembly process for your case. > > But I think it would better for you to just use MPI directly to put the > data where it is needed. > > > > On Sep 13, 2019, at 11:30 AM, Jos? Lorenzo via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > > > Hello, > > > > I am solving a finite element problem with Dirichlet boundary conditions > using PETSC. 
In the boundary conditions there are two terms: a first one > that is known before hand (normally zero) and a second term that depends > linearly on the unknown variable itself in the whole domain. Therefore, at > every time step I need to iterate as the boundary condition depends on the > field and the latter depends on the BC. Moreover, the problem is nonlinear > and I use a ghosted vector to represent the field. > > > > Every processor manages a portion of the domain and a portion of the > boundary (if not interior). At every Newton iteration within the time loop > the way I set the boundary conditions is as follows: > > > > First, each processor computes the known term of the BC (first term) and > inserts the values into the vector > > > > call VecSetValues(H, nedge_own, edglocglo(diredg_loc) - 1, Hdir, > INSERT_VALUES, ierr) > > call VecAssemblyBegin(H, ierr) > > call VecAssemblyEnd(H, ierr) > > > > As far as I understand, at this stage VecAssembly will not need to > communicate to other processors as each processor only sets values to > components that belong to it. > > > > Then, each processor computes its own contribution to the > field-dependent term of the BC for the whole domain boundary as > > > > call VecSetValues(H, nedge_all, edgappglo(diredg_app) - 1, Hself, > ADD_VALUES, ierr) > > call VecAssemblyBegin(H, ierr) > > call VecAssemblyEnd(H, ierr) > > > > In this case communication will be needed as each processor will add > values to vector components that are not stored by it, and I guess it might > get very busy as all the processors will need to communicate with each > other. > > > > When using this strategy I don't find any issue for problems using a > small amount of processors, but recently I've been solving using 90 > processors and the simulation always hangs at the second VecSetValues at > some random time step. It works fine for some time steps but at some point > it just gets stuck and I have to cancel the simulation. > > > > I have managed to overcome this by making each processor contribute to > its own components using first MPI_Reduce and then doing > > > > call VecSetValues(H, nedge_own, edgappglo(diredg_app_loc), Hself_own, > ADD_VALUES, ierr) > > call VecAssemblyBegin(H, ierr) > > call VecAssemblyEnd(H, ierr) > > > > However I would like to understand whether there is something wrong in > the code above. > > > > Thank you. > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefano.zampini at gmail.com Fri Sep 13 15:44:59 2019 From: stefano.zampini at gmail.com (Stefano Zampini) Date: Fri, 13 Sep 2019 23:44:59 +0300 Subject: [petsc-users] VecAssembly gets stuck In-Reply-To: References: <2AA710CA-DD4C-4652-965A-56A368C5D322@anl.gov> Message-ID: <3D94048A-AB9A-4182-B9F1-F64673255759@gmail.com> > On Sep 13, 2019, at 11:37 PM, Jos? Lorenzo via petsc-users wrote: > > I'm using PETSc 3.10.2, I guess it is the master branch but I do not know for sure as I didn't install it myself. > > You are right, each processor provides data for all the boundary entries. > > I have carried out a few more tests and apparently it gets stuck during VecAssemblyBegin. I don't know whether I can be more preciseabout this. > > El vie., 13 sept. 2019 a las 20:14, Smith, Barry F. (>) escribi?: > > What version of PETSc is this? The master branch? > > > call VecSetValues(H, nedge_all, edgappglo(diredg_app) - 1, Hself, ADD_VALUES, ierr) > > So each process is providing all data for all the boundary entries in the vector? 
> > I don't think there is anything wrong with what you are doing but the mechanism that does the communication inside the VecAssembly cannot know about the structure of the communication and so will do it inefficiently. It would be useful to know where it is "stuck" that might help us improve the assembly process for your case. > > But I think it would better for you to just use MPI directly to put the data where it is needed. > > > > On Sep 13, 2019, at 11:30 AM, Jos? Lorenzo via petsc-users > wrote: > > > > Hello, > > > > I am solving a finite element problem with Dirichlet boundary conditions using PETSC. In the boundary conditions there are two terms: a first one that is known before hand (normally zero) and a second term that depends linearly on the unknown variable itself in the whole domain. Therefore, at every time step I need to iterate as the boundary condition depends on the field and the latter depends on the BC. Moreover, the problem is nonlinear and I use a ghosted vector to represent the field. > > > > Every processor manages a portion of the domain and a portion of the boundary (if not interior). At every Newton iteration within the time loop the way I set the boundary conditions is as follows: > > > > First, each processor computes the known term of the BC (first term) and inserts the values into the vector > > > > call VecSetValues(H, nedge_own, edglocglo(diredg_loc) - 1, Hdir, INSERT_VALUES, ierr) > > call VecAssemblyBegin(H, ierr) > > call VecAssemblyEnd(H, ierr) > > > > As far as I understand, at this stage VecAssembly will not need to communicate to other processors as each processor only sets values to components that belong to it. > > > > Then, each processor computes its own contribution to the field-dependent term of the BC for the whole domain boundary as > > > > call VecSetValues(H, nedge_all, edgappglo(diredg_app) - 1, Hself, ADD_VALUES, ierr) > > call VecAssemblyBegin(H, ierr) > > call VecAssemblyEnd(H, ierr) > > > > In this case communication will be needed as each processor will add values to vector components that are not stored by it, and I guess it might get very busy as all the processors will need to communicate with each other. > > > > When using this strategy I don't find any issue for problems using a small amount of processors, but recently I've been solving using 90 processors and the simulation always hangs at the second VecSetValues at some random time step. It works fine for some time steps but at some point it just gets stuck and I have to cancel the simulation. > > The words ?hangs? and ?gets stuck? 99% percent of the time indicates some memory issue with your code. First; are all processes calling VecAssemblyBegin/End or only a subset of them? VecAssemblyBegin/End must be called by all processes Second: run with valgrind, even a smaller case http://www.valgrind.org/ > > I have managed to overcome this by making each processor contribute to its own components using first MPI_Reduce and then doing > > > > call VecSetValues(H, nedge_own, edgappglo(diredg_app_loc), Hself_own, ADD_VALUES, ierr) > > call VecAssemblyBegin(H, ierr) > > call VecAssemblyEnd(H, ierr) > > > > However I would like to understand whether there is something wrong in the code above. > > > > Thank you. > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From joslorgom at gmail.com Fri Sep 13 15:51:59 2019 From: joslorgom at gmail.com (=?UTF-8?Q?Jos=C3=A9_Lorenzo?=) Date: Fri, 13 Sep 2019 22:51:59 +0200 Subject: [petsc-users] VecAssembly gets stuck In-Reply-To: <3D94048A-AB9A-4182-B9F1-F64673255759@gmail.com> References: <2AA710CA-DD4C-4652-965A-56A368C5D322@anl.gov> <3D94048A-AB9A-4182-B9F1-F64673255759@gmail.com> Message-ID: I thought so and ran the code with the option -malloc_dump but everything looked fine, no warnings were displayed in a smaller case. Perhaps this is not enough? Yes, all the processes call both VecAssemblyBegin/End. I will try valgrind. El vie., 13 sept. 2019 a las 22:45, Stefano Zampini (< stefano.zampini at gmail.com>) escribi?: > > > On Sep 13, 2019, at 11:37 PM, Jos? Lorenzo via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > I'm using PETSc 3.10.2, I guess it is the master branch but I do not know > for sure as I didn't install it myself. > > You are right, each processor provides data for all the boundary entries. > > I have carried out a few more tests and apparently it gets stuck during > VecAssemblyBegin. I don't know whether I can be more preciseabout this. > > El vie., 13 sept. 2019 a las 20:14, Smith, Barry F. () > escribi?: > >> >> What version of PETSc is this? The master branch? >> >> > call VecSetValues(H, nedge_all, edgappglo(diredg_app) - 1, Hself, >> ADD_VALUES, ierr) >> >> So each process is providing all data for all the boundary entries in the >> vector? >> >> I don't think there is anything wrong with what you are doing but the >> mechanism that does the communication inside the VecAssembly cannot know >> about the structure of the communication and so will do it inefficiently. >> It would be useful to know where it is "stuck" that might help us improve >> the assembly process for your case. >> >> But I think it would better for you to just use MPI directly to put the >> data where it is needed. >> >> >> > On Sep 13, 2019, at 11:30 AM, Jos? Lorenzo via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> > >> > Hello, >> > >> > I am solving a finite element problem with Dirichlet boundary >> conditions using PETSC. In the boundary conditions there are two terms: a >> first one that is known before hand (normally zero) and a second term that >> depends linearly on the unknown variable itself in the whole domain. >> Therefore, at every time step I need to iterate as the boundary condition >> depends on the field and the latter depends on the BC. Moreover, the >> problem is nonlinear and I use a ghosted vector to represent the field. >> > >> > Every processor manages a portion of the domain and a portion of the >> boundary (if not interior). At every Newton iteration within the time loop >> the way I set the boundary conditions is as follows: >> > >> > First, each processor computes the known term of the BC (first term) >> and inserts the values into the vector >> > >> > call VecSetValues(H, nedge_own, edglocglo(diredg_loc) - 1, Hdir, >> INSERT_VALUES, ierr) >> > call VecAssemblyBegin(H, ierr) >> > call VecAssemblyEnd(H, ierr) >> > >> > As far as I understand, at this stage VecAssembly will not need to >> communicate to other processors as each processor only sets values to >> components that belong to it. 
>> > >> > Then, each processor computes its own contribution to the >> field-dependent term of the BC for the whole domain boundary as >> > >> > call VecSetValues(H, nedge_all, edgappglo(diredg_app) - 1, Hself, >> ADD_VALUES, ierr) >> > call VecAssemblyBegin(H, ierr) >> > call VecAssemblyEnd(H, ierr) >> > >> > In this case communication will be needed as each processor will add >> values to vector components that are not stored by it, and I guess it might >> get very busy as all the processors will need to communicate with each >> other. >> > >> > When using this strategy I don't find any issue for problems using a >> small amount of processors, but recently I've been solving using 90 >> processors and the simulation always hangs at the second VecSetValues at >> some random time step. It works fine for some time steps but at some point >> it just gets stuck and I have to cancel the simulation. >> > >> > > The words ?hangs? and ?gets stuck? 99% percent of the time indicates some > memory issue with your code. > First; are all processes calling VecAssemblyBegin/End or only a subset of > them? VecAssemblyBegin/End must be called by all processes > Second: run with valgrind, even a smaller case http://www.valgrind.org/ > > > I have managed to overcome this by making each processor contribute to >> its own components using first MPI_Reduce and then doing >> > >> > call VecSetValues(H, nedge_own, edgappglo(diredg_app_loc), Hself_own, >> ADD_VALUES, ierr) >> > call VecAssemblyBegin(H, ierr) >> > call VecAssemblyEnd(H, ierr) >> > >> > However I would like to understand whether there is something wrong in >> the code above. >> > >> > Thank you. >> > >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jczhang at mcs.anl.gov Fri Sep 13 22:38:35 2019 From: jczhang at mcs.anl.gov (Zhang, Junchao) Date: Sat, 14 Sep 2019 03:38:35 +0000 Subject: [petsc-users] VecAssembly gets stuck In-Reply-To: References: Message-ID: When processes get stuck, you can attach gdb to one process and back trace its call stack to see what it is doing, so we can have better understanding. --Junchao Zhang On Fri, Sep 13, 2019 at 11:31 AM Jos? Lorenzo via petsc-users > wrote: Hello, I am solving a finite element problem with Dirichlet boundary conditions using PETSC. In the boundary conditions there are two terms: a first one that is known before hand (normally zero) and a second term that depends linearly on the unknown variable itself in the whole domain. Therefore, at every time step I need to iterate as the boundary condition depends on the field and the latter depends on the BC. Moreover, the problem is nonlinear and I use a ghosted vector to represent the field. Every processor manages a portion of the domain and a portion of the boundary (if not interior). At every Newton iteration within the time loop the way I set the boundary conditions is as follows: First, each processor computes the known term of the BC (first term) and inserts the values into the vector call VecSetValues(H, nedge_own, edglocglo(diredg_loc) - 1, Hdir, INSERT_VALUES, ierr) call VecAssemblyBegin(H, ierr) call VecAssemblyEnd(H, ierr) As far as I understand, at this stage VecAssembly will not need to communicate to other processors as each processor only sets values to components that belong to it. 
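To act on the gdb suggestion above, a generic pause-and-attach helper can be dropped right before the call that hangs; nothing below comes from the original code, and the 30 second window is arbitrary. PETSc also provides PetscStopForDebugger() (and, if I recall correctly, a -stop_for_debugger option) for the same purpose.

    #include <petscsys.h>
    #include <unistd.h>

    /* Each rank prints its hostname and PID, then waits, so that
       "gdb -p <pid>" followed by "bt" can show where it is stuck. */
    PetscErrorCode PauseForAttach(MPI_Comm comm)
    {
      PetscErrorCode ierr;
      PetscMPIInt    rank;
      char           host[256];

      PetscFunctionBegin;
      ierr = MPI_Comm_rank(comm, &rank);CHKERRQ(ierr);
      ierr = PetscGetHostName(host, sizeof(host));CHKERRQ(ierr);
      ierr = PetscSynchronizedPrintf(comm, "[%d] %s pid %d\n", rank, host, (int)getpid());CHKERRQ(ierr);
      ierr = PetscSynchronizedFlush(comm, PETSC_STDOUT);CHKERRQ(ierr);
      ierr = PetscSleep(30);CHKERRQ(ierr); /* attach the debugger within this window */
      PetscFunctionReturn(0);
    }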
Then, each processor computes its own contribution to the field-dependent term of the BC for the whole domain boundary as call VecSetValues(H, nedge_all, edgappglo(diredg_app) - 1, Hself, ADD_VALUES, ierr) call VecAssemblyBegin(H, ierr) call VecAssemblyEnd(H, ierr) In this case communication will be needed as each processor will add values to vector components that are not stored by it, and I guess it might get very busy as all the processors will need to communicate with each other. When using this strategy I don't find any issue for problems using a small amount of processors, but recently I've been solving using 90 processors and the simulation always hangs at the second VecSetValues at some random time step. It works fine for some time steps but at some point it just gets stuck and I have to cancel the simulation. I have managed to overcome this by making each processor contribute to its own components using first MPI_Reduce and then doing call VecSetValues(H, nedge_own, edgappglo(diredg_app_loc), Hself_own, ADD_VALUES, ierr) call VecAssemblyBegin(H, ierr) call VecAssemblyEnd(H, ierr) However I would like to understand whether there is something wrong in the code above. Thank you. -------------- next part -------------- An HTML attachment was scrubbed... URL: From danyang.su at gmail.com Sun Sep 15 16:17:49 2019 From: danyang.su at gmail.com (Danyang Su) Date: Sun, 15 Sep 2019 14:17:49 -0700 Subject: [petsc-users] Strange Partition in PETSc 3.11 version on some computers Message-ID: <9c36fe1c-e6c2-0278-864e-f1453687d3f9@gmail.com> Dear All, I have a question regarding strange partition problem in PETSc 3.11 version. The problem does not exist on my local workstation. However, on a cluster with different PETSc versions, the partition seems quite different, as you can find in the figure below, which is tested with 160 processors. The color means the processor owns that subdomain. In this layered prism mesh, there are 40 layers from bottom to top and each layer has around 20k nodes. The natural order of nodes is also layered from bottom to top. The left partition (PETSc 3.10 and earlier) looks good with minimum number of ghost nodes while the right one (PETSc 3.11) looks weired with huge number of ghost nodes. Looks like the right one uses partition layer by layer. This problem exists on a a cluster but not on my local workstation for the same PETSc version (with different compiler and MPI). Other than the difference in partition and efficiency, the simulation results are the same. partition difference Below is PETSc configuration on three machine: Local workstation (works fine):? 
./configure --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-mpich --download-scalapack --download-parmetis --download-metis --download-ptscotch --download-fblaslapack --download-hypre --download-superlu_dist --download-hdf5=yes --download-ctetgen --with-debugging=0 COPTFLAGS=-O3 CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 --with-cxx-dialect=C++11 Cluster with PETSc 3.9.3 (works fine): --prefix=/scinet/niagara/software/2018a/opt/intel-2018.2-intelmpi-2018.2/petsc/3.9.3 CC=mpicc CXX=mpicxx F77=mpif77 F90=mpif90 FC=mpifc COPTFLAGS="-march=native -O2" CXXOPTFLAGS="-march=native -O2" FOPTFLAGS="-march=native -O2" --download-chaco=1 --download-hypre=1 --download-metis=1 --download-ml=1 --download-mumps=1 --download-parmetis=1 --download-plapack=1 --download-prometheus=1 --download-ptscotch=1 --download-scotch=1 --download-sprng=1 --download-superlu=1 --download-superlu_dist=1 --download-triangle=1 --with-avx512-kernels=1 --with-blaslapack-dir=/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl --with-debugging=0 --with-hdf5=1 --with-mkl_pardiso-dir=/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl --with-scalapack=1 --with-scalapack-lib="[/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_scalapack_lp64.so,/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so]" --with-x=0 Cluster with PETSc 3.11.3 (looks weired): --prefix=/scinet/niagara/software/2019b/opt/intel-2019u4-intelmpi-2019u4/petsc/3.11.3 CC=mpicc CXX=mpicxx F77=mpif77 F90=mpif90 FC=mpifc COPTFLAGS="-march=native -O2" CXXOPTFLAGS="-march=native -O2" FOPTFLAGS="-march=native -O2" --download-chaco=1 --download-hdf5=1 --download-hypre=1 --download-metis=1 --download-ml=1 --download-mumps=1 --download-parmetis=1 --download-plapack=1 --download-prometheus=1 --download-ptscotch=1 --download-scotch=1 --download-sprng=1 --download-superlu=1 --download-superlu_dist=1 --download-triangle=1 --with-avx512-kernels=1 --with-blaslapack-dir=/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl --with-cxx-dialect=C++11 --with-debugging=0 --with-mkl_pardiso-dir=/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl --with-scalapack=1 --with-scalapack-lib="[/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl/lib/intel64/libmkl_scalapack_lp64.so,/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so]" --with-x=0 And the partition is used by default dmplex distribution. ????? !c distribute mesh over processes ????? call DMPlexDistribute(dmda_flow%da,stencil_width,??????????????? & PETSC_NULL_SF,???????????????????????????? & PETSC_NULL_OBJECT,???????????????????????? & ??????????????????????????? distributedMesh,ierr) ????? CHKERRQ(ierr) Any idea on this strange problem? Thanks, Danyang -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: partition-petsc-3.11.3-vs-old.png Type: image/png Size: 381572 bytes Desc: not available URL: From knepley at gmail.com Sun Sep 15 17:20:18 2019 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 15 Sep 2019 18:20:18 -0400 Subject: [petsc-users] Strange Partition in PETSc 3.11 version on some computers In-Reply-To: <9c36fe1c-e6c2-0278-864e-f1453687d3f9@gmail.com> References: <9c36fe1c-e6c2-0278-864e-f1453687d3f9@gmail.com> Message-ID: On Sun, Sep 15, 2019 at 5:19 PM Danyang Su via petsc-users < petsc-users at mcs.anl.gov> wrote: > Dear All, > > I have a question regarding strange partition problem in PETSc 3.11 > version. The problem does not exist on my local workstation. However, on a > cluster with different PETSc versions, the partition seems quite different, > as you can find in the figure below, which is tested with 160 processors. > The color means the processor owns that subdomain. In this layered prism > mesh, there are 40 layers from bottom to top and each layer has around 20k > nodes. The natural order of nodes is also layered from bottom to top. > > The left partition (PETSc 3.10 and earlier) looks good with minimum number > of ghost nodes while the right one (PETSc 3.11) looks weired with huge > number of ghost nodes. Looks like the right one uses partition layer by > layer. This problem exists on a a cluster but not on my local workstation > for the same PETSc version (with different compiler and MPI). Other than > the difference in partition and efficiency, the simulation results are the > same. > > [image: partition difference] > > Below is PETSc configuration on three machine: > > Local workstation (works fine): ./configure --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --download-mpich --download-scalapack > --download-parmetis --download-metis --download-ptscotch > --download-fblaslapack --download-hypre --download-superlu_dist > --download-hdf5=yes --download-ctetgen --with-debugging=0 COPTFLAGS=-O3 > CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 --with-cxx-dialect=C++11 > > Cluster with PETSc 3.9.3 (works fine): > --prefix=/scinet/niagara/software/2018a/opt/intel-2018.2-intelmpi-2018.2/petsc/3.9.3 > CC=mpicc CXX=mpicxx F77=mpif77 F90=mpif90 FC=mpifc COPTFLAGS="-march=native > -O2" CXXOPTFLAGS="-march=native -O2" FOPTFLAGS="-march=native -O2" > --download-chaco=1 --download-hypre=1 --download-metis=1 --download-ml=1 > --download-mumps=1 --download-parmetis=1 --download-plapack=1 > --download-prometheus=1 --download-ptscotch=1 --download-scotch=1 > --download-sprng=1 --download-superlu=1 --download-superlu_dist=1 > --download-triangle=1 --with-avx512-kernels=1 > --with-blaslapack-dir=/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl > --with-debugging=0 --with-hdf5=1 > --with-mkl_pardiso-dir=/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl > --with-scalapack=1 > --with-scalapack-lib="[/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_scalapack_lp64.so,/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so]" > --with-x=0 > > Cluster with PETSc 3.11.3 (looks weired): > --prefix=/scinet/niagara/software/2019b/opt/intel-2019u4-intelmpi-2019u4/petsc/3.11.3 > CC=mpicc CXX=mpicxx F77=mpif77 F90=mpif90 FC=mpifc COPTFLAGS="-march=native > -O2" CXXOPTFLAGS="-march=native -O2" FOPTFLAGS="-march=native -O2" > --download-chaco=1 --download-hdf5=1 --download-hypre=1 --download-metis=1 > --download-ml=1 --download-mumps=1 
--download-parmetis=1 > --download-plapack=1 --download-prometheus=1 --download-ptscotch=1 > --download-scotch=1 --download-sprng=1 --download-superlu=1 > --download-superlu_dist=1 --download-triangle=1 --with-avx512-kernels=1 > --with-blaslapack-dir=/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl > --with-cxx-dialect=C++11 --with-debugging=0 > --with-mkl_pardiso-dir=/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl > --with-scalapack=1 > --with-scalapack-lib="[/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl/lib/intel64/libmkl_scalapack_lp64.so,/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so]" > --with-x=0 > > And the partition is used by default dmplex distribution. > > !c distribute mesh over processes > call DMPlexDistribute(dmda_flow%da,stencil_width, & > PETSC_NULL_SF, & > PETSC_NULL_OBJECT, & > distributedMesh,ierr) > CHKERRQ(ierr) > > Any idea on this strange problem? > > I just looked at the code. Your mesh should be partitioned by k-way partitioning using Metis since its on 1 proc for partitioning. This code is the same for 3.9 and 3.11, and you get the same result on your machine. I cannot understand what might be happening on your cluster (MPI plays no role). Is it possible that you changed the adjacency specification in that version? Thanks, Matt > Thanks, > > Danyang > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: partition-petsc-3.11.3-vs-old.png Type: image/png Size: 381572 bytes Desc: not available URL: From pedro.gonzalez at u-bordeaux.fr Sun Sep 15 17:35:15 2019 From: pedro.gonzalez at u-bordeaux.fr (Pedro Gonzalez) Date: Mon, 16 Sep 2019 00:35:15 +0200 (CEST) Subject: [petsc-users] I find slow performance of SNES Message-ID: <1415422715.5264647.1568586915625.JavaMail.zimbra@u-bordeaux.fr> Dear all, I am working on a code that solves a nonlinear system of equations G(x)=0 with Gauss-Seidel method. I managed to parallelize it by using DMDA with very good results. The previous week I changed my Gauss-Seidel solver by SNES. The code using SNES gives the same result as before, but I do not obtain the performance that I expected: 1) When using the Gauss-Seidel method (-snes_type ngs) the residual G(x) seems not be scallable to the amplitude of x and I have to add the option -snes_secant_h in order to make SNES converge. However, I varied the step from 1.E-1 to 1.E50 and obtained the same result within the same computation time. Is it normal that snes_secant_h can vary so many orders of magnitude? 2) Compared to my Gauss-Seidel algorithm, SNES does (approximately) the same number of iterations (with the same convergence criterium) but it is about 100 times slower. What can be the reason(s) of this slow performance of SNES solver? I do not use preconditioner with my algorithm so I did not add one to SNES. The main PETSc subroutines that I have included (in this order) are the following: call DMDACreate3D call DMSetUp call DMCreateLocalVector call DMCreateGlobalVector call SNESCreate call SNESSetConvergenceTest call SNESSetDM call DMDASNESSetFunctionLocal call SNESSetFromOptions call SNESSolve Thanks in advance for you help. 
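For reference, the call sequence listed above looks roughly like this in C; the grid sizes, the single degree of freedom and the empty residual body are placeholders, not the actual application, and the convergence test and local vector from the list are omitted for brevity.

    #include <petscsnes.h>
    #include <petscdmda.h>

    /* Placeholder local residual: the real physics would fill f from the
       ghosted local array x here. */
    static PetscErrorCode FormFunctionLocal(DMDALocalInfo *info, void *x, void *f, void *ctx)
    {
      PetscFunctionBegin;
      PetscFunctionReturn(0);
    }

    int main(int argc, char **argv)
    {
      PetscErrorCode ierr;
      DM             da;
      SNES           snes;
      Vec            x;

      ierr = PetscInitialize(&argc, &argv, NULL, NULL); if (ierr) return ierr;
      /* 16x16x16 grid, 1 dof, stencil width 1: all placeholder values */
      ierr = DMDACreate3d(PETSC_COMM_WORLD, DM_BOUNDARY_NONE, DM_BOUNDARY_NONE, DM_BOUNDARY_NONE,
                          DMDA_STENCIL_STAR, 16, 16, 16, PETSC_DECIDE, PETSC_DECIDE, PETSC_DECIDE,
                          1, 1, NULL, NULL, NULL, &da);CHKERRQ(ierr);
      ierr = DMSetUp(da);CHKERRQ(ierr);
      ierr = DMCreateGlobalVector(da, &x);CHKERRQ(ierr);

      ierr = SNESCreate(PETSC_COMM_WORLD, &snes);CHKERRQ(ierr);
      ierr = SNESSetDM(snes, da);CHKERRQ(ierr);
      ierr = DMDASNESSetFunctionLocal(da, INSERT_VALUES, FormFunctionLocal, NULL);CHKERRQ(ierr);
      ierr = SNESSetFromOptions(snes);CHKERRQ(ierr); /* picks up -snes_type ngs etc. */
      ierr = SNESSolve(snes, NULL, x);CHKERRQ(ierr);

      ierr = VecDestroy(&x);CHKERRQ(ierr);
      ierr = SNESDestroy(&snes);CHKERRQ(ierr);
      ierr = DMDestroy(&da);CHKERRQ(ierr);
      ierr = PetscFinalize();
      return ierr;
    }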
Best regards, Pedro -------------- next part -------------- An HTML attachment was scrubbed... URL: From danyang.su at gmail.com Sun Sep 15 17:58:59 2019 From: danyang.su at gmail.com (Danyang Su) Date: Sun, 15 Sep 2019 15:58:59 -0700 Subject: [petsc-users] Strange Partition in PETSc 3.11 version on some computers In-Reply-To: References: <9c36fe1c-e6c2-0278-864e-f1453687d3f9@gmail.com> Message-ID: Hi Matt, Thanks for the quick reply. I have no change in the adjacency. The source code and the simulation input files are all the same. I also tried to use GNU compiler and mpich with petsc 3.11.3 and it works fine. It looks like the problem is caused by the difference in configuration. However, the configuration is pretty the same as petsc 3.9.3 except the compiler and mpi used. I will contact scinet staff to check if they have any idea on this. Thanks, Danyang On September 15, 2019 3:20:18 p.m. PDT, Matthew Knepley wrote: >On Sun, Sep 15, 2019 at 5:19 PM Danyang Su via petsc-users < >petsc-users at mcs.anl.gov> wrote: > >> Dear All, >> >> I have a question regarding strange partition problem in PETSc 3.11 >> version. The problem does not exist on my local workstation. However, >on a >> cluster with different PETSc versions, the partition seems quite >different, >> as you can find in the figure below, which is tested with 160 >processors. >> The color means the processor owns that subdomain. In this layered >prism >> mesh, there are 40 layers from bottom to top and each layer has >around 20k >> nodes. The natural order of nodes is also layered from bottom to top. >> >> The left partition (PETSc 3.10 and earlier) looks good with minimum >number >> of ghost nodes while the right one (PETSc 3.11) looks weired with >huge >> number of ghost nodes. Looks like the right one uses partition layer >by >> layer. This problem exists on a a cluster but not on my local >workstation >> for the same PETSc version (with different compiler and MPI). Other >than >> the difference in partition and efficiency, the simulation results >are the >> same. 
>> >> [image: partition difference] >> >> Below is PETSc configuration on three machine: >> >> Local workstation (works fine): ./configure --with-cc=gcc >--with-cxx=g++ >> --with-fc=gfortran --download-mpich --download-scalapack >> --download-parmetis --download-metis --download-ptscotch >> --download-fblaslapack --download-hypre --download-superlu_dist >> --download-hdf5=yes --download-ctetgen --with-debugging=0 >COPTFLAGS=-O3 >> CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 --with-cxx-dialect=C++11 >> >> Cluster with PETSc 3.9.3 (works fine): >> >--prefix=/scinet/niagara/software/2018a/opt/intel-2018.2-intelmpi-2018.2/petsc/3.9.3 >> CC=mpicc CXX=mpicxx F77=mpif77 F90=mpif90 FC=mpifc >COPTFLAGS="-march=native >> -O2" CXXOPTFLAGS="-march=native -O2" FOPTFLAGS="-march=native -O2" >> --download-chaco=1 --download-hypre=1 --download-metis=1 >--download-ml=1 >> --download-mumps=1 --download-parmetis=1 --download-plapack=1 >> --download-prometheus=1 --download-ptscotch=1 --download-scotch=1 >> --download-sprng=1 --download-superlu=1 --download-superlu_dist=1 >> --download-triangle=1 --with-avx512-kernels=1 >> >--with-blaslapack-dir=/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl >> --with-debugging=0 --with-hdf5=1 >> >--with-mkl_pardiso-dir=/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl >> --with-scalapack=1 >> >--with-scalapack-lib="[/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_scalapack_lp64.so,/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so]" >> --with-x=0 >> >> Cluster with PETSc 3.11.3 (looks weired): >> >--prefix=/scinet/niagara/software/2019b/opt/intel-2019u4-intelmpi-2019u4/petsc/3.11.3 >> CC=mpicc CXX=mpicxx F77=mpif77 F90=mpif90 FC=mpifc >COPTFLAGS="-march=native >> -O2" CXXOPTFLAGS="-march=native -O2" FOPTFLAGS="-march=native -O2" >> --download-chaco=1 --download-hdf5=1 --download-hypre=1 >--download-metis=1 >> --download-ml=1 --download-mumps=1 --download-parmetis=1 >> --download-plapack=1 --download-prometheus=1 --download-ptscotch=1 >> --download-scotch=1 --download-sprng=1 --download-superlu=1 >> --download-superlu_dist=1 --download-triangle=1 >--with-avx512-kernels=1 >> >--with-blaslapack-dir=/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl >> --with-cxx-dialect=C++11 --with-debugging=0 >> >--with-mkl_pardiso-dir=/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl >> --with-scalapack=1 >> >--with-scalapack-lib="[/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl/lib/intel64/libmkl_scalapack_lp64.so,/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so]" >> --with-x=0 >> >> And the partition is used by default dmplex distribution. >> >> !c distribute mesh over processes >> call DMPlexDistribute(dmda_flow%da,stencil_width, > & >> PETSC_NULL_SF, > & >> PETSC_NULL_OBJECT, > & >> distributedMesh,ierr) >> CHKERRQ(ierr) >> >> Any idea on this strange problem? >> >> I just looked at the code. Your mesh should be partitioned by k-way >partitioning using Metis since its on 1 proc for partitioning. This >code >is the same for 3.9 and 3.11, and you get the same result on your >machine. >I cannot understand what might be happening on your cluster >(MPI plays no role). Is it possible that you changed the adjacency >specification in that version? 
> > Thanks, > > Matt > >> Thanks, >> >> Danyang >> > > >-- >What most experimenters take for granted before they begin their >experiments is infinitely more interesting than any results to which >their >experiments lead. >-- Norbert Wiener > >https://www.cse.buffalo.edu/~knepley/ > -- Sent from my Android device with K-9 Mail. Please excuse my brevity. -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sun Sep 15 18:07:15 2019 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 15 Sep 2019 19:07:15 -0400 Subject: [petsc-users] Strange Partition in PETSc 3.11 version on some computers In-Reply-To: References: <9c36fe1c-e6c2-0278-864e-f1453687d3f9@gmail.com> Message-ID: On Sun, Sep 15, 2019 at 6:59 PM Danyang Su wrote: > Hi Matt, > > Thanks for the quick reply. I have no change in the adjacency. The source > code and the simulation input files are all the same. I also tried to use > GNU compiler and mpich with petsc 3.11.3 and it works fine. > > It looks like the problem is caused by the difference in configuration. > However, the configuration is pretty the same as petsc 3.9.3 except the > compiler and mpi used. I will contact scinet staff to check if they have > any idea on this. > Very very strange since the partition is handled completely by Metis, and does not use MPI. Thanks, Matt > Thanks, > > Danyang > > On September 15, 2019 3:20:18 p.m. PDT, Matthew Knepley > wrote: >> >> On Sun, Sep 15, 2019 at 5:19 PM Danyang Su via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> >>> Dear All, >>> >>> I have a question regarding strange partition problem in PETSc 3.11 >>> version. The problem does not exist on my local workstation. However, on a >>> cluster with different PETSc versions, the partition seems quite different, >>> as you can find in the figure below, which is tested with 160 processors. >>> The color means the processor owns that subdomain. In this layered prism >>> mesh, there are 40 layers from bottom to top and each layer has around 20k >>> nodes. The natural order of nodes is also layered from bottom to top. >>> >>> The left partition (PETSc 3.10 and earlier) looks good with minimum >>> number of ghost nodes while the right one (PETSc 3.11) looks weired with >>> huge number of ghost nodes. Looks like the right one uses partition layer >>> by layer. This problem exists on a a cluster but not on my local >>> workstation for the same PETSc version (with different compiler and MPI). >>> Other than the difference in partition and efficiency, the simulation >>> results are the same. 
>>> >>> [image: partition difference] >>> >>> Below is PETSc configuration on three machine: >>> >>> Local workstation (works fine): ./configure --with-cc=gcc >>> --with-cxx=g++ --with-fc=gfortran --download-mpich --download-scalapack >>> --download-parmetis --download-metis --download-ptscotch >>> --download-fblaslapack --download-hypre --download-superlu_dist >>> --download-hdf5=yes --download-ctetgen --with-debugging=0 COPTFLAGS=-O3 >>> CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 --with-cxx-dialect=C++11 >>> >>> Cluster with PETSc 3.9.3 (works fine): >>> --prefix=/scinet/niagara/software/2018a/opt/intel-2018.2-intelmpi-2018.2/petsc/3.9.3 >>> CC=mpicc CXX=mpicxx F77=mpif77 F90=mpif90 FC=mpifc COPTFLAGS="-march=native >>> -O2" CXXOPTFLAGS="-march=native -O2" FOPTFLAGS="-march=native -O2" >>> --download-chaco=1 --download-hypre=1 --download-metis=1 --download-ml=1 >>> --download-mumps=1 --download-parmetis=1 --download-plapack=1 >>> --download-prometheus=1 --download-ptscotch=1 --download-scotch=1 >>> --download-sprng=1 --download-superlu=1 --download-superlu_dist=1 >>> --download-triangle=1 --with-avx512-kernels=1 >>> --with-blaslapack-dir=/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl >>> --with-debugging=0 --with-hdf5=1 >>> --with-mkl_pardiso-dir=/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl >>> --with-scalapack=1 >>> --with-scalapack-lib="[/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_scalapack_lp64.so,/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so]" >>> --with-x=0 >>> >>> Cluster with PETSc 3.11.3 (looks weired): >>> --prefix=/scinet/niagara/software/2019b/opt/intel-2019u4-intelmpi-2019u4/petsc/3.11.3 >>> CC=mpicc CXX=mpicxx F77=mpif77 F90=mpif90 FC=mpifc COPTFLAGS="-march=native >>> -O2" CXXOPTFLAGS="-march=native -O2" FOPTFLAGS="-march=native -O2" >>> --download-chaco=1 --download-hdf5=1 --download-hypre=1 --download-metis=1 >>> --download-ml=1 --download-mumps=1 --download-parmetis=1 >>> --download-plapack=1 --download-prometheus=1 --download-ptscotch=1 >>> --download-scotch=1 --download-sprng=1 --download-superlu=1 >>> --download-superlu_dist=1 --download-triangle=1 --with-avx512-kernels=1 >>> --with-blaslapack-dir=/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl >>> --with-cxx-dialect=C++11 --with-debugging=0 >>> --with-mkl_pardiso-dir=/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl >>> --with-scalapack=1 >>> --with-scalapack-lib="[/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl/lib/intel64/libmkl_scalapack_lp64.so,/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so]" >>> --with-x=0 >>> >>> And the partition is used by default dmplex distribution. >>> >>> !c distribute mesh over processes >>> call DMPlexDistribute(dmda_flow%da,stencil_width, & >>> PETSC_NULL_SF, & >>> PETSC_NULL_OBJECT, & >>> distributedMesh,ierr) >>> CHKERRQ(ierr) >>> >>> Any idea on this strange problem? >>> >>> I just looked at the code. Your mesh should be partitioned by k-way >> partitioning using Metis since its on 1 proc for partitioning. This code >> is the same for 3.9 and 3.11, and you get the same result on your >> machine. I cannot understand what might be happening on your cluster >> (MPI plays no role). Is it possible that you changed the adjacency >> specification in that version? 
>> >> Thanks, >> >> Matt >> >>> Thanks, >>> >>> Danyang >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > > -- > Sent from my Android device with K-9 Mail. Please excuse my brevity. > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sun Sep 15 18:28:15 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Sun, 15 Sep 2019 23:28:15 +0000 Subject: [petsc-users] I find slow performance of SNES In-Reply-To: <1415422715.5264647.1568586915625.JavaMail.zimbra@u-bordeaux.fr> References: <1415422715.5264647.1568586915625.JavaMail.zimbra@u-bordeaux.fr> Message-ID: <9249F6D6-0F16-4FCB-A820-5F8F3F04E03C@anl.gov> > On Sep 15, 2019, at 5:35 PM, Pedro Gonzalez via petsc-users wrote: > > Dear all, > > I am working on a code that solves a nonlinear system of equations G(x)=0 with Gauss-Seidel method. I managed to parallelize it by using DMDA with very good results. The previous week I changed my Gauss-Seidel solver by SNES. The code using SNES gives the same result as before, but I do not obtain the performance that I expected: > 1) When using the Gauss-Seidel method (-snes_type ngs) the residual G(x) seems not be scallable to the amplitude of x Do you simply mean that -snes_type ngs is not converging to the solution? Does it seem to converge to something else or nothing at all? The SNES GS code just calls the user provided routine set with SNESSetNGS(). This means that it is expect to behave the same way as if the user simply called the user provided routine themselves. I assume you have a routine that implements the GS on the DMDA since you write " I managed to parallelize it by using DMDA with very good results. " Thus you should get the same iterations for both calling your Gauss-Seidel code yourself and calling SNESSetNGS() and then calling SNESSolve() with -snes_type ngs. You can check if they are behaving in the same (for simplicity) by using the debugger or by putting VecView() into code to see if they are generating the same values. Just have your GS code call VecView() on the input vectors at the top and the output vectors at the bottom. > and I have to add the option -snes_secant_h in order to make SNES converge. Do you mean both -snes_ngs_secant and -snes_ngs_secant_h ? The second option by itself will do nothing. Cut and paste the exact options you use. > However, I varied the step from 1.E-1 to 1.E50 and obtained the same result within the same computation time. This would happen if you did not use -snes_ngs_secant but do use -snes_ngs_secant_h but this doesn't change the algorithm > Is it normal that snes_secant_h can vary so many orders of magnitude? That certainly does seem odd. 
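As a sketch of the SNESSetNGS() route mentioned above, so that -snes_type ngs runs the hand-coded sweep instead of the coloring/secant default; the sweep body is left as a placeholder and is not taken from the original code.

    #include <petscsnes.h>

    /* User-supplied nonlinear Gauss-Seidel sweep: update X in place;
       B is the right-hand side and may be NULL. */
    static PetscErrorCode MyNGS(SNES snes, Vec X, Vec B, void *ctx)
    {
      PetscFunctionBegin;
      /* ... one or more Gauss-Seidel sweeps over the local unknowns,
         using ghost values as needed ... */
      PetscFunctionReturn(0);
    }

    /* Registration, once after SNESCreate()/SNESSetDM():
         ierr = SNESSetNGS(snes, MyNGS, NULL);CHKERRQ(ierr);
       With a user routine registered, the -snes_ngs_secant* options are
       not needed, since the secant differencing is only the fallback
       used when no user sweep is provided. */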
Looking at the code SNESComputeNGSDefaultSecant() we see it is perturbing the input vector (by color) with h - for (j=0;j atol) { /* This is equivalent to d = x - (h*f) / PetscRealPart(g-f) */ d = (x*g-w*f) / PetscRealPart(g-f); } else { d = x; } In PetscErrorCode SNESSetFromOptions_NGS(PetscOptionItems *PetscOptionsObject,SNES snes) one can see that the user provided h is accepted ierr = PetscOptionsReal("-snes_ngs_secant_h","Differencing parameter for secant search","",gs->h,&gs->h,NULL);CHKERRQ(ierr); You could run in the debugger with a break point in SNESComputeNGSDefaultSecant() to see if it is truly using the h you provided. > 2) Compared to my Gauss-Seidel algorithm, SNES does (approximately) the same number of iterations (with the same convergence criterium) but it is about 100 times slower. I don't fully understand what you are running so cannot completely answer this. -snes_ngs_secant will be lots slower than using a user provided GS By default SNES may be doing some norm computations at each iteration which are expensive and will slow things down. You can use SNESSetNormSchedule() or the command line form -snes_norm_schedule to turn these norms off. > What can be the reason(s) of this slow performance of SNES solver? I do not use preconditioner with my algorithm so I did not add one to SNES. > > The main PETSc subroutines that I have included (in this order) are the following: > call DMDACreate3D > call DMSetUp > call DMCreateLocalVector > call DMCreateGlobalVector > call SNESCreate > call SNESSetConvergenceTest > call SNESSetDM > call DMDASNESSetFunctionLocal > call SNESSetFromOptions > call SNESSolve > > Thanks in advance for you help. > > Best regards, > Pedro > > From bsmith at mcs.anl.gov Sun Sep 15 18:43:45 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Sun, 15 Sep 2019 23:43:45 +0000 Subject: [petsc-users] Strange Partition in PETSc 3.11 version on some computers In-Reply-To: References: <9c36fe1c-e6c2-0278-864e-f1453687d3f9@gmail.com> Message-ID: <5217DF9F-E42D-4E1F-AE9B-1088954548BB@anl.gov> Send the configure.log and make.log for the two system configurations that produce very different results as well as the output running with -dm_view -info for both runs. The cause is likely not subtle, one is likely using metis and the other is likely just not using any partitioner. > On Sep 15, 2019, at 6:07 PM, Matthew Knepley via petsc-users wrote: > > On Sun, Sep 15, 2019 at 6:59 PM Danyang Su wrote: > Hi Matt, > > Thanks for the quick reply. I have no change in the adjacency. The source code and the simulation input files are all the same. I also tried to use GNU compiler and mpich with petsc 3.11.3 and it works fine. > > It looks like the problem is caused by the difference in configuration. However, the configuration is pretty the same as petsc 3.9.3 except the compiler and mpi used. I will contact scinet staff to check if they have any idea on this. > > Very very strange since the partition is handled completely by Metis, and does not use MPI. > > Thanks, > > Matt > > Thanks, > > Danyang > > On September 15, 2019 3:20:18 p.m. PDT, Matthew Knepley wrote: > On Sun, Sep 15, 2019 at 5:19 PM Danyang Su via petsc-users wrote: > Dear All, > > I have a question regarding strange partition problem in PETSc 3.11 version. The problem does not exist on my local workstation. However, on a cluster with different PETSc versions, the partition seems quite different, as you can find in the figure below, which is tested with 160 processors. 
The color means the processor owns that subdomain. In this layered prism mesh, there are 40 layers from bottom to top and each layer has around 20k nodes. The natural order of nodes is also layered from bottom to top. > > The left partition (PETSc 3.10 and earlier) looks good with minimum number of ghost nodes while the right one (PETSc 3.11) looks weired with huge number of ghost nodes. Looks like the right one uses partition layer by layer. This problem exists on a a cluster but not on my local workstation for the same PETSc version (with different compiler and MPI). Other than the difference in partition and efficiency, the simulation results are the same. > > > > > Below is PETSc configuration on three machine: > > Local workstation (works fine): ./configure --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-mpich --download-scalapack --download-parmetis --download-metis --download-ptscotch --download-fblaslapack --download-hypre --download-superlu_dist --download-hdf5=yes --download-ctetgen --with-debugging=0 COPTFLAGS=-O3 CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 --with-cxx-dialect=C++11 > > Cluster with PETSc 3.9.3 (works fine): --prefix=/scinet/niagara/software/2018a/opt/intel-2018.2-intelmpi-2018.2/petsc/3.9.3 CC=mpicc CXX=mpicxx F77=mpif77 F90=mpif90 FC=mpifc COPTFLAGS="-march=native -O2" CXXOPTFLAGS="-march=native -O2" FOPTFLAGS="-march=native -O2" --download-chaco=1 --download-hypre=1 --download-metis=1 --download-ml=1 --download-mumps=1 --download-parmetis=1 --download-plapack=1 --download-prometheus=1 --download-ptscotch=1 --download-scotch=1 --download-sprng=1 --download-superlu=1 --download-superlu_dist=1 --download-triangle=1 --with-avx512-kernels=1 --with-blaslapack-dir=/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl --with-debugging=0 --with-hdf5=1 --with-mkl_pardiso-dir=/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl --with-scalapack=1 --with-scalapack-lib="[/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_scalapack_lp64.so,/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so]" --with-x=0 > > Cluster with PETSc 3.11.3 (looks weired): --prefix=/scinet/niagara/software/2019b/opt/intel-2019u4-intelmpi-2019u4/petsc/3.11.3 CC=mpicc CXX=mpicxx F77=mpif77 F90=mpif90 FC=mpifc COPTFLAGS="-march=native -O2" CXXOPTFLAGS="-march=native -O2" FOPTFLAGS="-march=native -O2" --download-chaco=1 --download-hdf5=1 --download-hypre=1 --download-metis=1 --download-ml=1 --download-mumps=1 --download-parmetis=1 --download-plapack=1 --download-prometheus=1 --download-ptscotch=1 --download-scotch=1 --download-sprng=1 --download-superlu=1 --download-superlu_dist=1 --download-triangle=1 --with-avx512-kernels=1 --with-blaslapack-dir=/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl --with-cxx-dialect=C++11 --with-debugging=0 --with-mkl_pardiso-dir=/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl --with-scalapack=1 --with-scalapack-lib="[/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl/lib/intel64/libmkl_scalapack_lp64.so,/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so]" --with-x=0 > > And the partition is used by default dmplex distribution. 
> > !c distribute mesh over processes > call DMPlexDistribute(dmda_flow%da,stencil_width, & > PETSC_NULL_SF, & > PETSC_NULL_OBJECT, & > distributedMesh,ierr) > CHKERRQ(ierr) > > Any idea on this strange problem? > > > I just looked at the code. Your mesh should be partitioned by k-way partitioning using Metis since its on 1 proc for partitioning. This code > is the same for 3.9 and 3.11, and you get the same result on your machine. I cannot understand what might be happening on your cluster > (MPI plays no role). Is it possible that you changed the adjacency specification in that version? > > Thanks, > > Matt > Thanks, > > Danyang > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -- > Sent from my Android device with K-9 Mail. Please excuse my brevity. > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ From danyang.su at gmail.com Sun Sep 15 23:26:15 2019 From: danyang.su at gmail.com (Danyang Su) Date: Sun, 15 Sep 2019 21:26:15 -0700 Subject: [petsc-users] [WARNING: UNSCANNABLE EXTRACTION FAILED]Re: Strange Partition in PETSc 3.11 version on some computers In-Reply-To: <5217DF9F-E42D-4E1F-AE9B-1088954548BB@anl.gov> References: <9c36fe1c-e6c2-0278-864e-f1453687d3f9@gmail.com> <5217DF9F-E42D-4E1F-AE9B-1088954548BB@anl.gov> Message-ID: On 2019-09-15 4:43 p.m., Smith, Barry F. wrote: > Send the configure.log and make.log for the two system configurations that produce very different results as well as the output running with -dm_view -info for both runs. The cause is likely not subtle, one is likely using metis and the other is likely just not using any partitioner. Hi Barry, Attached is the output information with -dm_view -info for both runs. The configure.log and make.log will be mailed to you once I get it from the SCINET staff who installed PETSc on this cluster. Thanks, Danyang > > > >> On Sep 15, 2019, at 6:07 PM, Matthew Knepley via petsc-users wrote: >> >> On Sun, Sep 15, 2019 at 6:59 PM Danyang Su wrote: >> Hi Matt, >> >> Thanks for the quick reply. I have no change in the adjacency. The source code and the simulation input files are all the same. I also tried to use GNU compiler and mpich with petsc 3.11.3 and it works fine. >> >> It looks like the problem is caused by the difference in configuration. However, the configuration is pretty the same as petsc 3.9.3 except the compiler and mpi used. I will contact scinet staff to check if they have any idea on this. >> >> Very very strange since the partition is handled completely by Metis, and does not use MPI. >> >> Thanks, >> >> Matt >> >> Thanks, >> >> Danyang >> >> On September 15, 2019 3:20:18 p.m. PDT, Matthew Knepley wrote: >> On Sun, Sep 15, 2019 at 5:19 PM Danyang Su via petsc-users wrote: >> Dear All, >> >> I have a question regarding strange partition problem in PETSc 3.11 version. The problem does not exist on my local workstation. However, on a cluster with different PETSc versions, the partition seems quite different, as you can find in the figure below, which is tested with 160 processors. The color means the processor owns that subdomain. In this layered prism mesh, there are 40 layers from bottom to top and each layer has around 20k nodes. 
The natural order of nodes is also layered from bottom to top. >> >> The left partition (PETSc 3.10 and earlier) looks good with minimum number of ghost nodes while the right one (PETSc 3.11) looks weired with huge number of ghost nodes. Looks like the right one uses partition layer by layer. This problem exists on a a cluster but not on my local workstation for the same PETSc version (with different compiler and MPI). Other than the difference in partition and efficiency, the simulation results are the same. >> >> >> >> >> Below is PETSc configuration on three machine: >> >> Local workstation (works fine): ./configure --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-mpich --download-scalapack --download-parmetis --download-metis --download-ptscotch --download-fblaslapack --download-hypre --download-superlu_dist --download-hdf5=yes --download-ctetgen --with-debugging=0 COPTFLAGS=-O3 CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 --with-cxx-dialect=C++11 >> >> Cluster with PETSc 3.9.3 (works fine): --prefix=/scinet/niagara/software/2018a/opt/intel-2018.2-intelmpi-2018.2/petsc/3.9.3 CC=mpicc CXX=mpicxx F77=mpif77 F90=mpif90 FC=mpifc COPTFLAGS="-march=native -O2" CXXOPTFLAGS="-march=native -O2" FOPTFLAGS="-march=native -O2" --download-chaco=1 --download-hypre=1 --download-metis=1 --download-ml=1 --download-mumps=1 --download-parmetis=1 --download-plapack=1 --download-prometheus=1 --download-ptscotch=1 --download-scotch=1 --download-sprng=1 --download-superlu=1 --download-superlu_dist=1 --download-triangle=1 --with-avx512-kernels=1 --with-blaslapack-dir=/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl --with-debugging=0 --with-hdf5=1 --with-mkl_pardiso-dir=/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl --with-scalapack=1 --with-scalapack-lib="[/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_scalapack_lp64.so,/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so]" --with-x=0 >> >> Cluster with PETSc 3.11.3 (looks weired): --prefix=/scinet/niagara/software/2019b/opt/intel-2019u4-intelmpi-2019u4/petsc/3.11.3 CC=mpicc CXX=mpicxx F77=mpif77 F90=mpif90 FC=mpifc COPTFLAGS="-march=native -O2" CXXOPTFLAGS="-march=native -O2" FOPTFLAGS="-march=native -O2" --download-chaco=1 --download-hdf5=1 --download-hypre=1 --download-metis=1 --download-ml=1 --download-mumps=1 --download-parmetis=1 --download-plapack=1 --download-prometheus=1 --download-ptscotch=1 --download-scotch=1 --download-sprng=1 --download-superlu=1 --download-superlu_dist=1 --download-triangle=1 --with-avx512-kernels=1 --with-blaslapack-dir=/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl --with-cxx-dialect=C++11 --with-debugging=0 --with-mkl_pardiso-dir=/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl --with-scalapack=1 --with-scalapack-lib="[/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl/lib/intel64/libmkl_scalapack_lp64.so,/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so]" --with-x=0 >> >> And the partition is used by default dmplex distribution. >> >> !c distribute mesh over processes >> call DMPlexDistribute(dmda_flow%da,stencil_width, & >> PETSC_NULL_SF, & >> PETSC_NULL_OBJECT, & >> distributedMesh,ierr) >> CHKERRQ(ierr) >> >> Any idea on this strange problem? >> >> >> I just looked at the code. 
Your mesh should be partitioned by k-way partitioning using Metis since its on 1 proc for partitioning. This code >> is the same for 3.9 and 3.11, and you get the same result on your machine. I cannot understand what might be happening on your cluster >> (MPI plays no role). Is it possible that you changed the adjacency specification in that version? >> >> Thanks, >> >> Matt >> Thanks, >> >> Danyang >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> -- >> Sent from my Android device with K-9 Mail. Please excuse my brevity. >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- A non-text attachment was scrubbed... Name: basin-3d-petsc3.3.9.log.tar.gz Type: application/gzip Size: 226582 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: basin-3d-petsc3.11.3.log.tar.gz Type: application/gzip Size: 751782 bytes Desc: not available URL: From bsmith at mcs.anl.gov Sun Sep 15 23:39:10 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Mon, 16 Sep 2019 04:39:10 +0000 Subject: [petsc-users] [WARNING: UNSCANNABLE EXTRACTION FAILED]Re: Strange Partition in PETSc 3.11 version on some computers In-Reply-To: References: <9c36fe1c-e6c2-0278-864e-f1453687d3f9@gmail.com> <5217DF9F-E42D-4E1F-AE9B-1088954548BB@anl.gov> Message-ID: <3B61258A-6BCC-4905-A691-2CFFB0E5A76E@mcs.anl.gov> Please don't run with -info the file is infinitely large and the mailer rejected it. It doesn't provide any useful information. Barry > On Sep 15, 2019, at 11:26 PM, Danyang Su wrote: > > On 2019-09-15 4:43 p.m., Smith, Barry F. wrote: >> Send the configure.log and make.log for the two system configurations that produce very different results as well as the output running with -dm_view -info for both runs. The cause is likely not subtle, one is likely using metis and the other is likely just not using any partitioner. > > Hi Barry, > > Attached is the output information with -dm_view -info for both runs. The configure.log and make.log will be mailed to you once I get it from the SCINET staff who installed PETSc on this cluster. > > Thanks, > > Danyang > >> >> >> >>> On Sep 15, 2019, at 6:07 PM, Matthew Knepley via petsc-users wrote: >>> >>> On Sun, Sep 15, 2019 at 6:59 PM Danyang Su wrote: >>> Hi Matt, >>> >>> Thanks for the quick reply. I have no change in the adjacency. The source code and the simulation input files are all the same. I also tried to use GNU compiler and mpich with petsc 3.11.3 and it works fine. >>> >>> It looks like the problem is caused by the difference in configuration. However, the configuration is pretty the same as petsc 3.9.3 except the compiler and mpi used. I will contact scinet staff to check if they have any idea on this. >>> >>> Very very strange since the partition is handled completely by Metis, and does not use MPI. >>> >>> Thanks, >>> >>> Matt >>> Thanks, >>> >>> Danyang >>> >>> On September 15, 2019 3:20:18 p.m. PDT, Matthew Knepley wrote: >>> On Sun, Sep 15, 2019 at 5:19 PM Danyang Su via petsc-users wrote: >>> Dear All, >>> >>> I have a question regarding strange partition problem in PETSc 3.11 version. 
The problem does not exist on my local workstation. However, on a cluster with different PETSc versions, the partition seems quite different, as you can find in the figure below, which is tested with 160 processors. The color means the processor owns that subdomain. In this layered prism mesh, there are 40 layers from bottom to top and each layer has around 20k nodes. The natural order of nodes is also layered from bottom to top. >>> >>> The left partition (PETSc 3.10 and earlier) looks good with minimum number of ghost nodes while the right one (PETSc 3.11) looks weired with huge number of ghost nodes. Looks like the right one uses partition layer by layer. This problem exists on a a cluster but not on my local workstation for the same PETSc version (with different compiler and MPI). Other than the difference in partition and efficiency, the simulation results are the same. >>> >>> >>> >>> >>> Below is PETSc configuration on three machine: >>> >>> Local workstation (works fine): ./configure --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-mpich --download-scalapack --download-parmetis --download-metis --download-ptscotch --download-fblaslapack --download-hypre --download-superlu_dist --download-hdf5=yes --download-ctetgen --with-debugging=0 COPTFLAGS=-O3 CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 --with-cxx-dialect=C++11 >>> >>> Cluster with PETSc 3.9.3 (works fine): --prefix=/scinet/niagara/software/2018a/opt/intel-2018.2-intelmpi-2018.2/petsc/3.9.3 CC=mpicc CXX=mpicxx F77=mpif77 F90=mpif90 FC=mpifc COPTFLAGS="-march=native -O2" CXXOPTFLAGS="-march=native -O2" FOPTFLAGS="-march=native -O2" --download-chaco=1 --download-hypre=1 --download-metis=1 --download-ml=1 --download-mumps=1 --download-parmetis=1 --download-plapack=1 --download-prometheus=1 --download-ptscotch=1 --download-scotch=1 --download-sprng=1 --download-superlu=1 --download-superlu_dist=1 --download-triangle=1 --with-avx512-kernels=1 --with-blaslapack-dir=/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl --with-debugging=0 --with-hdf5=1 --with-mkl_pardiso-dir=/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl --with-scalapack=1 --with-scalapack-lib="[/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_scalapack_lp64.so,/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so]" --with-x=0 >>> >>> Cluster with PETSc 3.11.3 (looks weired): --prefix=/scinet/niagara/software/2019b/opt/intel-2019u4-intelmpi-2019u4/petsc/3.11.3 CC=mpicc CXX=mpicxx F77=mpif77 F90=mpif90 FC=mpifc COPTFLAGS="-march=native -O2" CXXOPTFLAGS="-march=native -O2" FOPTFLAGS="-march=native -O2" --download-chaco=1 --download-hdf5=1 --download-hypre=1 --download-metis=1 --download-ml=1 --download-mumps=1 --download-parmetis=1 --download-plapack=1 --download-prometheus=1 --download-ptscotch=1 --download-scotch=1 --download-sprng=1 --download-superlu=1 --download-superlu_dist=1 --download-triangle=1 --with-avx512-kernels=1 --with-blaslapack-dir=/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl --with-cxx-dialect=C++11 --with-debugging=0 --with-mkl_pardiso-dir=/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl --with-scalapack=1 --with-scalapack-lib="[/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl/lib/intel64/libmkl_scalapack_lp64.so,/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so]" --with-x=0 >>> >>> And the 
partition is used by default dmplex distribution. >>> >>> !c distribute mesh over processes >>> call DMPlexDistribute(dmda_flow%da,stencil_width, & >>> PETSC_NULL_SF, & >>> PETSC_NULL_OBJECT, & >>> distributedMesh,ierr) >>> CHKERRQ(ierr) >>> >>> Any idea on this strange problem? >>> >>> >>> I just looked at the code. Your mesh should be partitioned by k-way partitioning using Metis since its on 1 proc for partitioning. This code >>> is the same for 3.9 and 3.11, and you get the same result on your machine. I cannot understand what might be happening on your cluster >>> (MPI plays no role). Is it possible that you changed the adjacency specification in that version? >>> >>> Thanks, >>> >>> Matt >>> Thanks, >>> >>> Danyang >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> -- >>> Sent from my Android device with K-9 Mail. Please excuse my brevity. >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ > <1_Warning.txt> From knepley at gmail.com Mon Sep 16 05:36:49 2019 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 16 Sep 2019 06:36:49 -0400 Subject: [petsc-users] [WARNING: UNSCANNABLE EXTRACTION FAILED]Re: Strange Partition in PETSc 3.11 version on some computers In-Reply-To: <3B61258A-6BCC-4905-A691-2CFFB0E5A76E@mcs.anl.gov> References: <9c36fe1c-e6c2-0278-864e-f1453687d3f9@gmail.com> <5217DF9F-E42D-4E1F-AE9B-1088954548BB@anl.gov> <3B61258A-6BCC-4905-A691-2CFFB0E5A76E@mcs.anl.gov> Message-ID: When you rerun for Barry, could you use -dm_view -petscpartitioner_view. In the latest release, this should tell us what the partitioner is if you do SetFromOptions() on it, say like https://gitlab.com/petsc/petsc/blob/master/src/snes/examples/tutorials/ex12.c#L543 I guess we might output this information in the default DM view for the next release. Thanks, Matt On Mon, Sep 16, 2019 at 12:39 AM Smith, Barry F. wrote: > > Please don't run with -info the file is infinitely large and the mailer > rejected it. It doesn't provide any useful information. > > Barry > > > > On Sep 15, 2019, at 11:26 PM, Danyang Su wrote: > > > > On 2019-09-15 4:43 p.m., Smith, Barry F. wrote: > >> Send the configure.log and make.log for the two system configurations > that produce very different results as well as the output running with > -dm_view -info for both runs. The cause is likely not subtle, one is likely > using metis and the other is likely just not using any partitioner. > > > > Hi Barry, > > > > Attached is the output information with -dm_view -info for both runs. > The configure.log and make.log will be mailed to you once I get it from the > SCINET staff who installed PETSc on this cluster. > > > > Thanks, > > > > Danyang > > > >> > >> > >> > >>> On Sep 15, 2019, at 6:07 PM, Matthew Knepley via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >>> > >>> On Sun, Sep 15, 2019 at 6:59 PM Danyang Su > wrote: > >>> Hi Matt, > >>> > >>> Thanks for the quick reply. I have no change in the adjacency. The > source code and the simulation input files are all the same. I also tried > to use GNU compiler and mpich with petsc 3.11.3 and it works fine. 
> >>> > >>> It looks like the problem is caused by the difference in > configuration. However, the configuration is pretty the same as petsc 3.9.3 > except the compiler and mpi used. I will contact scinet staff to check if > they have any idea on this. > >>> > >>> Very very strange since the partition is handled completely by Metis, > and does not use MPI. > >>> > >>> Thanks, > >>> > >>> Matt > >>> Thanks, > >>> > >>> Danyang > >>> > >>> On September 15, 2019 3:20:18 p.m. PDT, Matthew Knepley < > knepley at gmail.com> wrote: > >>> On Sun, Sep 15, 2019 at 5:19 PM Danyang Su via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >>> Dear All, > >>> > >>> I have a question regarding strange partition problem in PETSc 3.11 > version. The problem does not exist on my local workstation. However, on a > cluster with different PETSc versions, the partition seems quite different, > as you can find in the figure below, which is tested with 160 processors. > The color means the processor owns that subdomain. In this layered prism > mesh, there are 40 layers from bottom to top and each layer has around 20k > nodes. The natural order of nodes is also layered from bottom to top. > >>> > >>> The left partition (PETSc 3.10 and earlier) looks good with minimum > number of ghost nodes while the right one (PETSc 3.11) looks weired with > huge number of ghost nodes. Looks like the right one uses partition layer > by layer. This problem exists on a a cluster but not on my local > workstation for the same PETSc version (with different compiler and MPI). > Other than the difference in partition and efficiency, the simulation > results are the same. > >>> > >>> > >>> > >>> > >>> Below is PETSc configuration on three machine: > >>> > >>> Local workstation (works fine): ./configure --with-cc=gcc > --with-cxx=g++ --with-fc=gfortran --download-mpich --download-scalapack > --download-parmetis --download-metis --download-ptscotch > --download-fblaslapack --download-hypre --download-superlu_dist > --download-hdf5=yes --download-ctetgen --with-debugging=0 COPTFLAGS=-O3 > CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 --with-cxx-dialect=C++11 > >>> > >>> Cluster with PETSc 3.9.3 (works fine): > --prefix=/scinet/niagara/software/2018a/opt/intel-2018.2-intelmpi-2018.2/petsc/3.9.3 > CC=mpicc CXX=mpicxx F77=mpif77 F90=mpif90 FC=mpifc COPTFLAGS="-march=native > -O2" CXXOPTFLAGS="-march=native -O2" FOPTFLAGS="-march=native -O2" > --download-chaco=1 --download-hypre=1 --download-metis=1 --download-ml=1 > --download-mumps=1 --download-parmetis=1 --download-plapack=1 > --download-prometheus=1 --download-ptscotch=1 --download-scotch=1 > --download-sprng=1 --download-superlu=1 --download-superlu_dist=1 > --download-triangle=1 --with-avx512-kernels=1 > --with-blaslapack-dir=/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl > --with-debugging=0 --with-hdf5=1 > --with-mkl_pardiso-dir=/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl > --with-scalapack=1 > --with-scalapack-lib="[/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_scalapack_lp64.so,/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so]" > --with-x=0 > >>> > >>> Cluster with PETSc 3.11.3 (looks weired): > --prefix=/scinet/niagara/software/2019b/opt/intel-2019u4-intelmpi-2019u4/petsc/3.11.3 > CC=mpicc CXX=mpicxx F77=mpif77 F90=mpif90 FC=mpifc COPTFLAGS="-march=native > -O2" CXXOPTFLAGS="-march=native -O2" FOPTFLAGS="-march=native -O2" > 
--download-chaco=1 --download-hdf5=1 --download-hypre=1 --download-metis=1 > --download-ml=1 --download-mumps=1 --download-parmetis=1 > --download-plapack=1 --download-prometheus=1 --download-ptscotch=1 > --download-scotch=1 --download-sprng=1 --download-superlu=1 > --download-superlu_dist=1 --download-triangle=1 --with-avx512-kernels=1 > --with-blaslapack-dir=/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl > --with-cxx-dialect=C++11 --with-debugging=0 > --with-mkl_pardiso-dir=/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl > --with-scalapack=1 > --with-scalapack-lib="[/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl/lib/intel64/libmkl_scalapack_lp64.so,/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so]" > --with-x=0 > >>> > >>> And the partition is used by default dmplex distribution. > >>> > >>> !c distribute mesh over processes > >>> call DMPlexDistribute(dmda_flow%da,stencil_width, > & > >>> PETSC_NULL_SF, > & > >>> PETSC_NULL_OBJECT, > & > >>> distributedMesh,ierr) > >>> CHKERRQ(ierr) > >>> > >>> Any idea on this strange problem? > >>> > >>> > >>> I just looked at the code. Your mesh should be partitioned by k-way > partitioning using Metis since its on 1 proc for partitioning. This code > >>> is the same for 3.9 and 3.11, and you get the same result on your > machine. I cannot understand what might be happening on your cluster > >>> (MPI plays no role). Is it possible that you changed the adjacency > specification in that version? > >>> > >>> Thanks, > >>> > >>> Matt > >>> Thanks, > >>> > >>> Danyang > >>> > >>> > >>> > >>> -- > >>> What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > >>> -- Norbert Wiener > >>> > >>> https://www.cse.buffalo.edu/~knepley/ > >>> > >>> -- > >>> Sent from my Android device with K-9 Mail. Please excuse my brevity. > >>> > >>> > >>> -- > >>> What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > >>> -- Norbert Wiener > >>> > >>> https://www.cse.buffalo.edu/~knepley/ > > <1_Warning.txt> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Sep 16 08:37:37 2019 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 16 Sep 2019 09:37:37 -0400 Subject: [petsc-users] DMPlex cell number containing a point in space In-Reply-To: References: Message-ID: On Fri, Sep 6, 2019 at 6:07 PM Swarnava Ghosh via petsc-users < petsc-users at mcs.anl.gov> wrote: > Dear Petsc developers and users, > > I have a DMPlex mesh in 3D. Given a point with (x,y,z) coordinates, I am > trying the find the cell number in which this point lies, and the vertices > of the cell. Is there any DMPlex function that will give me the cell number? > Sorry, I lost this mail. In serial, you can just use DMLocatePoint(). If you have some points and you are not sure which process they might be located on, then you need a DMInterpolation context. 
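For a single point in serial that might look roughly like the sketch below (untested; "dm" stands for your existing DMPlex, x, y, z for the query coordinates, and petscdmplex.h is assumed to be included -- the not-found convention and argument lists can differ a bit between versions):

  PetscErrorCode     ierr;
  Vec                coords;
  PetscSF            cellSF = NULL;
  const PetscSFNode *cells;
  PetscInt           nFound, cell = -1, vStart, vEnd, clSize, cl, *closure = NULL;

  /* Put the query point into a Vec, as DMLocatePoint() expects */
  ierr = VecCreateSeq(PETSC_COMM_SELF, 3, &coords);CHKERRQ(ierr);
  ierr = VecSetValue(coords, 0, x, INSERT_VALUES);CHKERRQ(ierr);
  ierr = VecSetValue(coords, 1, y, INSERT_VALUES);CHKERRQ(ierr);
  ierr = VecSetValue(coords, 2, z, INSERT_VALUES);CHKERRQ(ierr);
  ierr = VecAssemblyBegin(coords);CHKERRQ(ierr);
  ierr = VecAssemblyEnd(coords);CHKERRQ(ierr);

  /* The result comes back as a PetscSF: one leaf per query point,
     whose remote index is the containing cell */
  ierr = DMLocatePoint(dm, coords, DM_POINTLOCATION_NONE, &cellSF);CHKERRQ(ierr);
  ierr = PetscSFGetGraph(cellSF, NULL, &nFound, NULL, &cells);CHKERRQ(ierr);
  if (nFound > 0) cell = cells[0].index;   /* negative index: point not located */

  if (cell >= 0) {
    /* Vertices of that cell: transitive closure, keeping only depth-0 points */
    ierr = DMPlexGetDepthStratum(dm, 0, &vStart, &vEnd);CHKERRQ(ierr);
    ierr = DMPlexGetTransitiveClosure(dm, cell, PETSC_TRUE, &clSize, &closure);CHKERRQ(ierr);
    for (cl = 0; cl < 2*clSize; cl += 2) {   /* closure stores (point, orientation) pairs */
      if (closure[cl] >= vStart && closure[cl] < vEnd) {
        ierr = PetscPrintf(PETSC_COMM_SELF, "cell %D contains vertex %D\n", cell, closure[cl]);CHKERRQ(ierr);
      }
    }
    ierr = DMPlexRestoreTransitiveClosure(dm, cell, PETSC_TRUE, &clSize, &closure);CHKERRQ(ierr);
  }
  ierr = PetscSFDestroy(&cellSF);CHKERRQ(ierr);
  ierr = VecDestroy(&coords);CHKERRQ(ierr);

The closure loop keeps only the depth-0 points, i.e. the vertex numbers of the cell that was located.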
Thanks, Matt > Thank you, > SG > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mlohry at gmail.com Mon Sep 16 09:07:34 2019 From: mlohry at gmail.com (Mark Lohry) Date: Mon, 16 Sep 2019 10:07:34 -0400 Subject: [petsc-users] Programmatically get TS Error estimates without adaptivity Message-ID: I'm trying to assess time accuracy for a couple different integrators and timesteps in a setup where I need constant time steps. TSGetTimeError seems to only work for GLEE methods (it returns a 0 vector otherwise); is there an equivalent for others? I can run -ts_adapt_type basic, and just set min=max=constant, and it clearly computes some integrator errors for printf, but is there a way I can get that programmatically? Thanks, Mark -------------- next part -------------- An HTML attachment was scrubbed... URL: From emconsta at anl.gov Mon Sep 16 09:56:33 2019 From: emconsta at anl.gov (Constantinescu, Emil M.) Date: Mon, 16 Sep 2019 14:56:33 +0000 Subject: [petsc-users] Programmatically get TS Error estimates without adaptivity In-Reply-To: References: Message-ID: Mark, most integrators estimate the error occurring every step for adapting the step size. These are called local errors. GLEE are specialized integrator that estimate the actual error in time; this is called the global error. GLEE returns TimeError because it computes it internally, however none of the other solvers does it. To assess the accuracy for a specific problem you have two options, and both involve you estimating the error yourself: 1. Use an integrator with very small step, store or save the solution (call it reference) and do the same for the others and then compute the error with respect to the reference. A strategy similar to this is implemented in src/ts/examples/tutorials/ex31.c 2. If you have an exact or manufactured solution, you can compare against that. Emil On 9/16/19 9:07 AM, Mark Lohry via petsc-users wrote: > I'm trying to assess time accuracy for a couple different integrators > and timesteps in a setup where I need constant time steps. > > TSGetTimeError seems to only work for GLEE methods (it returns a 0 > vector otherwise); is there an equivalent for others? > > I can run -ts_adapt_type basic, and just set min=max=constant, and it > clearly computes some integrator errors for printf, but is there a way > I can get that programmatically? > > Thanks, > Mark > > > From mlohry at gmail.com Mon Sep 16 11:38:09 2019 From: mlohry at gmail.com (Mark Lohry) Date: Mon, 16 Sep 2019 12:38:09 -0400 Subject: [petsc-users] Programmatically get TS Error estimates without adaptivity In-Reply-To: References: Message-ID: Hi Emil, maybe I was unclear, let me try to clarify, I only want the local truncation error at a given step. Most integrators outside of GLEE also support ts_adapt basic (I'm specifically testing arkimex and bdf), and they compute the WLTE weighted error term, e.g. for arkimex: TSAdapt basic arkimex 0:3 step 0 accepted t=0 + 1.896e-02 dt=1.967e-02 wlte=0.653 wltea= -1 wlter= -1 or BDF2: TSAdapt basic bdf 0:2 step 0 rejected t=0 + 3.673e-02 dt=3.673e-03 wlte= 36.6 wltea= -1 wlter= -1 I'd like to have access to that wlte value. I can call TSEvaluateWLTE which works for BDF2 but crashes on arkimex for not being implemented. 
Is there a different avenue to getting this value through arkimex? On Mon, Sep 16, 2019 at 10:56 AM Constantinescu, Emil M. wrote: > Mark, most integrators estimate the error occurring every step for > adapting the step size. These are called local errors. GLEE are > specialized integrator that estimate the actual error in time; this is > called the global error. GLEE returns TimeError because it computes it > internally, however none of the other solvers does it. > > To assess the accuracy for a specific problem you have two options, and > both involve you estimating the error yourself: > > 1. Use an integrator with very small step, store or save the solution > (call it reference) and do the same for the others and then compute the > error with respect to the reference. A strategy similar to this is > implemented in src/ts/examples/tutorials/ex31.c > > 2. If you have an exact or manufactured solution, you can compare > against that. > > Emil > > On 9/16/19 9:07 AM, Mark Lohry via petsc-users wrote: > > I'm trying to assess time accuracy for a couple different integrators > > and timesteps in a setup where I need constant time steps. > > > > TSGetTimeError seems to only work for GLEE methods (it returns a 0 > > vector otherwise); is there an equivalent for others? > > > > I can run -ts_adapt_type basic, and just set min=max=constant, and it > > clearly computes some integrator errors for printf, but is there a way > > I can get that programmatically? > > > > Thanks, > > Mark > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From danyang.su at gmail.com Mon Sep 16 11:50:54 2019 From: danyang.su at gmail.com (Danyang Su) Date: Mon, 16 Sep 2019 09:50:54 -0700 Subject: [petsc-users] Strange Partition in PETSc 3.11 version on some computers In-Reply-To: <5217DF9F-E42D-4E1F-AE9B-1088954548BB@anl.gov> References: <9c36fe1c-e6c2-0278-864e-f1453687d3f9@gmail.com> <5217DF9F-E42D-4E1F-AE9B-1088954548BB@anl.gov> Message-ID: <4ffe88c2-354c-07d7-ab1e-0f1edd8ec3c3@gmail.com> Hi Barry and Matt, Attached is the output of both runs with -dm_view -log_view included. I am now coordinating with staff to install PETSc 3.9.3 version using intel2019u4 to narrow down the problem. Will get back to you later after the test. Thanks, Danyang On 2019-09-15 4:43 p.m., Smith, Barry F. wrote: > Send the configure.log and make.log for the two system configurations that produce very different results as well as the output running with -dm_view -info for both runs. The cause is likely not subtle, one is likely using metis and the other is likely just not using any partitioner. > > > >> On Sep 15, 2019, at 6:07 PM, Matthew Knepley via petsc-users wrote: >> >> On Sun, Sep 15, 2019 at 6:59 PM Danyang Su wrote: >> Hi Matt, >> >> Thanks for the quick reply. I have no change in the adjacency. The source code and the simulation input files are all the same. I also tried to use GNU compiler and mpich with petsc 3.11.3 and it works fine. >> >> It looks like the problem is caused by the difference in configuration. However, the configuration is pretty the same as petsc 3.9.3 except the compiler and mpi used. I will contact scinet staff to check if they have any idea on this. >> >> Very very strange since the partition is handled completely by Metis, and does not use MPI. >> >> Thanks, >> >> Matt >> >> Thanks, >> >> Danyang >> >> On September 15, 2019 3:20:18 p.m. 
PDT, Matthew Knepley wrote: >> On Sun, Sep 15, 2019 at 5:19 PM Danyang Su via petsc-users wrote: >> Dear All, >> >> I have a question regarding strange partition problem in PETSc 3.11 version. The problem does not exist on my local workstation. However, on a cluster with different PETSc versions, the partition seems quite different, as you can find in the figure below, which is tested with 160 processors. The color means the processor owns that subdomain. In this layered prism mesh, there are 40 layers from bottom to top and each layer has around 20k nodes. The natural order of nodes is also layered from bottom to top. >> >> The left partition (PETSc 3.10 and earlier) looks good with minimum number of ghost nodes while the right one (PETSc 3.11) looks weired with huge number of ghost nodes. Looks like the right one uses partition layer by layer. This problem exists on a a cluster but not on my local workstation for the same PETSc version (with different compiler and MPI). Other than the difference in partition and efficiency, the simulation results are the same. >> >> >> >> >> Below is PETSc configuration on three machine: >> >> Local workstation (works fine): ./configure --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-mpich --download-scalapack --download-parmetis --download-metis --download-ptscotch --download-fblaslapack --download-hypre --download-superlu_dist --download-hdf5=yes --download-ctetgen --with-debugging=0 COPTFLAGS=-O3 CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 --with-cxx-dialect=C++11 >> >> Cluster with PETSc 3.9.3 (works fine): --prefix=/scinet/niagara/software/2018a/opt/intel-2018.2-intelmpi-2018.2/petsc/3.9.3 CC=mpicc CXX=mpicxx F77=mpif77 F90=mpif90 FC=mpifc COPTFLAGS="-march=native -O2" CXXOPTFLAGS="-march=native -O2" FOPTFLAGS="-march=native -O2" --download-chaco=1 --download-hypre=1 --download-metis=1 --download-ml=1 --download-mumps=1 --download-parmetis=1 --download-plapack=1 --download-prometheus=1 --download-ptscotch=1 --download-scotch=1 --download-sprng=1 --download-superlu=1 --download-superlu_dist=1 --download-triangle=1 --with-avx512-kernels=1 --with-blaslapack-dir=/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl --with-debugging=0 --with-hdf5=1 --with-mkl_pardiso-dir=/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl --with-scalapack=1 --with-scalapack-lib="[/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_scalapack_lp64.so,/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so]" --with-x=0 >> >> Cluster with PETSc 3.11.3 (looks weired): --prefix=/scinet/niagara/software/2019b/opt/intel-2019u4-intelmpi-2019u4/petsc/3.11.3 CC=mpicc CXX=mpicxx F77=mpif77 F90=mpif90 FC=mpifc COPTFLAGS="-march=native -O2" CXXOPTFLAGS="-march=native -O2" FOPTFLAGS="-march=native -O2" --download-chaco=1 --download-hdf5=1 --download-hypre=1 --download-metis=1 --download-ml=1 --download-mumps=1 --download-parmetis=1 --download-plapack=1 --download-prometheus=1 --download-ptscotch=1 --download-scotch=1 --download-sprng=1 --download-superlu=1 --download-superlu_dist=1 --download-triangle=1 --with-avx512-kernels=1 --with-blaslapack-dir=/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl --with-cxx-dialect=C++11 --with-debugging=0 --with-mkl_pardiso-dir=/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl --with-scalapack=1 
--with-scalapack-lib="[/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl/lib/intel64/libmkl_scalapack_lp64.so,/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so]" --with-x=0 >> >> And the partition is used by default dmplex distribution. >> >> !c distribute mesh over processes >> call DMPlexDistribute(dmda_flow%da,stencil_width, & >> PETSC_NULL_SF, & >> PETSC_NULL_OBJECT, & >> distributedMesh,ierr) >> CHKERRQ(ierr) >> >> Any idea on this strange problem? >> >> >> I just looked at the code. Your mesh should be partitioned by k-way partitioning using Metis since its on 1 proc for partitioning. This code >> is the same for 3.9 and 3.11, and you get the same result on your machine. I cannot understand what might be happening on your cluster >> (MPI plays no role). Is it possible that you changed the adjacency specification in that version? >> >> Thanks, >> >> Matt >> Thanks, >> >> Danyang >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> -- >> Sent from my Android device with K-9 Mail. Please excuse my brevity. >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- A non-text attachment was scrubbed... Name: basin-petsc-3.9.3.log Type: text/x-log Size: 16417 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: basin-petsc-3.11.3.log Type: text/x-log Size: 15910 bytes Desc: not available URL: From emconsta at anl.gov Mon Sep 16 12:15:47 2019 From: emconsta at anl.gov (Constantinescu, Emil M.) Date: Mon, 16 Sep 2019 17:15:47 +0000 Subject: [petsc-users] Programmatically get TS Error estimates without adaptivity In-Reply-To: References: Message-ID: On 9/16/19 11:38 AM, Mark Lohry wrote: Hi Emil, maybe I was unclear, let me try to clarify, I only want the local truncation error at a given step. Most integrators outside of GLEE also support ts_adapt basic (I'm specifically testing arkimex and bdf), and they compute the WLTE weighted error term, e.g. for arkimex: TSAdapt basic arkimex 0:3 step 0 accepted t=0 + 1.896e-02 dt=1.967e-02 wlte=0.653 wltea= -1 wlter= -1 or BDF2: TSAdapt basic bdf 0:2 step 0 rejected t=0 + 3.673e-02 dt=3.673e-03 wlte= 36.6 wltea= -1 wlter= -1 I'd like to have access to that wlte value. I can call TSEvaluateWLTE which works for BDF2 but crashes on arkimex for not being implemented. Is there a different avenue to getting this value through arkimex? I see, wlte is estimated in the adapter and logically separated from the time stepper. For ARKIMEX wlte is computed in TSAdaptChoose_Basic: https://www.mcs.anl.gov/petsc/petsc-current/src/ts/adapt/impls/basic/adaptbasic.c.html#TSAdaptChoose_Basic That is used internally in ARKIMEX to adapt the step size, but you would have to call it after every step. TSEvaluateWLTE will also provide wlte after each step. Emil On Mon, Sep 16, 2019 at 10:56 AM Constantinescu, Emil M. > wrote: Mark, most integrators estimate the error occurring every step for adapting the step size. These are called local errors. 
GLEE are specialized integrator that estimate the actual error in time; this is called the global error. GLEE returns TimeError because it computes it internally, however none of the other solvers does it. To assess the accuracy for a specific problem you have two options, and both involve you estimating the error yourself: 1. Use an integrator with very small step, store or save the solution (call it reference) and do the same for the others and then compute the error with respect to the reference. A strategy similar to this is implemented in src/ts/examples/tutorials/ex31.c 2. If you have an exact or manufactured solution, you can compare against that. Emil On 9/16/19 9:07 AM, Mark Lohry via petsc-users wrote: > I'm trying to assess time accuracy for a couple different integrators > and timesteps in a setup where I need constant time steps. > > TSGetTimeError seems to only work for GLEE methods (it returns a 0 > vector otherwise); is there an equivalent for others? > > I can run -ts_adapt_type basic, and just set min=max=constant, and it > clearly computes some integrator errors for printf, but is there a way > I can get that programmatically? > > Thanks, > Mark > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Sep 16 12:46:25 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Mon, 16 Sep 2019 17:46:25 +0000 Subject: [petsc-users] Strange Partition in PETSc 3.11 version on some computers In-Reply-To: <4ffe88c2-354c-07d7-ab1e-0f1edd8ec3c3@gmail.com> References: <9c36fe1c-e6c2-0278-864e-f1453687d3f9@gmail.com> <5217DF9F-E42D-4E1F-AE9B-1088954548BB@anl.gov> <4ffe88c2-354c-07d7-ab1e-0f1edd8ec3c3@gmail.com> Message-ID: Very different stuff going on in the two cases, different objects being created, different number of different types of operations. Clearly a major refactorization of the code was done. Presumably a regression was introduced that changed the behavior dramatically, possible by mistake. You can attempt to use git bisect to determine what changed caused the dramatic change in behavior. Then it can be decided if the changed that triggered the change in the results was a bug or a planned feature. Barry > On Sep 16, 2019, at 11:50 AM, Danyang Su wrote: > > Hi Barry and Matt, > > Attached is the output of both runs with -dm_view -log_view included. > > I am now coordinating with staff to install PETSc 3.9.3 version using intel2019u4 to narrow down the problem. Will get back to you later after the test. > > Thanks, > > Danyang > > On 2019-09-15 4:43 p.m., Smith, Barry F. wrote: >> Send the configure.log and make.log for the two system configurations that produce very different results as well as the output running with -dm_view -info for both runs. The cause is likely not subtle, one is likely using metis and the other is likely just not using any partitioner. >> >> >> >>> On Sep 15, 2019, at 6:07 PM, Matthew Knepley via petsc-users wrote: >>> >>> On Sun, Sep 15, 2019 at 6:59 PM Danyang Su wrote: >>> Hi Matt, >>> >>> Thanks for the quick reply. I have no change in the adjacency. The source code and the simulation input files are all the same. I also tried to use GNU compiler and mpich with petsc 3.11.3 and it works fine. >>> >>> It looks like the problem is caused by the difference in configuration. However, the configuration is pretty the same as petsc 3.9.3 except the compiler and mpi used. I will contact scinet staff to check if they have any idea on this. 
>>> >>> Very very strange since the partition is handled completely by Metis, and does not use MPI. >>> >>> Thanks, >>> >>> Matt >>> Thanks, >>> >>> Danyang >>> >>> On September 15, 2019 3:20:18 p.m. PDT, Matthew Knepley wrote: >>> On Sun, Sep 15, 2019 at 5:19 PM Danyang Su via petsc-users wrote: >>> Dear All, >>> >>> I have a question regarding strange partition problem in PETSc 3.11 version. The problem does not exist on my local workstation. However, on a cluster with different PETSc versions, the partition seems quite different, as you can find in the figure below, which is tested with 160 processors. The color means the processor owns that subdomain. In this layered prism mesh, there are 40 layers from bottom to top and each layer has around 20k nodes. The natural order of nodes is also layered from bottom to top. >>> >>> The left partition (PETSc 3.10 and earlier) looks good with minimum number of ghost nodes while the right one (PETSc 3.11) looks weired with huge number of ghost nodes. Looks like the right one uses partition layer by layer. This problem exists on a a cluster but not on my local workstation for the same PETSc version (with different compiler and MPI). Other than the difference in partition and efficiency, the simulation results are the same. >>> >>> >>> >>> >>> Below is PETSc configuration on three machine: >>> >>> Local workstation (works fine): ./configure --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-mpich --download-scalapack --download-parmetis --download-metis --download-ptscotch --download-fblaslapack --download-hypre --download-superlu_dist --download-hdf5=yes --download-ctetgen --with-debugging=0 COPTFLAGS=-O3 CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 --with-cxx-dialect=C++11 >>> >>> Cluster with PETSc 3.9.3 (works fine): --prefix=/scinet/niagara/software/2018a/opt/intel-2018.2-intelmpi-2018.2/petsc/3.9.3 CC=mpicc CXX=mpicxx F77=mpif77 F90=mpif90 FC=mpifc COPTFLAGS="-march=native -O2" CXXOPTFLAGS="-march=native -O2" FOPTFLAGS="-march=native -O2" --download-chaco=1 --download-hypre=1 --download-metis=1 --download-ml=1 --download-mumps=1 --download-parmetis=1 --download-plapack=1 --download-prometheus=1 --download-ptscotch=1 --download-scotch=1 --download-sprng=1 --download-superlu=1 --download-superlu_dist=1 --download-triangle=1 --with-avx512-kernels=1 --with-blaslapack-dir=/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl --with-debugging=0 --with-hdf5=1 --with-mkl_pardiso-dir=/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl --with-scalapack=1 --with-scalapack-lib="[/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_scalapack_lp64.so,/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so]" --with-x=0 >>> >>> Cluster with PETSc 3.11.3 (looks weired): --prefix=/scinet/niagara/software/2019b/opt/intel-2019u4-intelmpi-2019u4/petsc/3.11.3 CC=mpicc CXX=mpicxx F77=mpif77 F90=mpif90 FC=mpifc COPTFLAGS="-march=native -O2" CXXOPTFLAGS="-march=native -O2" FOPTFLAGS="-march=native -O2" --download-chaco=1 --download-hdf5=1 --download-hypre=1 --download-metis=1 --download-ml=1 --download-mumps=1 --download-parmetis=1 --download-plapack=1 --download-prometheus=1 --download-ptscotch=1 --download-scotch=1 --download-sprng=1 --download-superlu=1 --download-superlu_dist=1 --download-triangle=1 --with-avx512-kernels=1 --with-blaslapack-dir=/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl 
--with-cxx-dialect=C++11 --with-debugging=0 --with-mkl_pardiso-dir=/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl --with-scalapack=1 --with-scalapack-lib="[/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl/lib/intel64/libmkl_scalapack_lp64.so,/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so]" --with-x=0 >>> >>> And the partition is used by default dmplex distribution. >>> >>> !c distribute mesh over processes >>> call DMPlexDistribute(dmda_flow%da,stencil_width, & >>> PETSC_NULL_SF, & >>> PETSC_NULL_OBJECT, & >>> distributedMesh,ierr) >>> CHKERRQ(ierr) >>> >>> Any idea on this strange problem? >>> >>> >>> I just looked at the code. Your mesh should be partitioned by k-way partitioning using Metis since its on 1 proc for partitioning. This code >>> is the same for 3.9 and 3.11, and you get the same result on your machine. I cannot understand what might be happening on your cluster >>> (MPI plays no role). Is it possible that you changed the adjacency specification in that version? >>> >>> Thanks, >>> >>> Matt >>> Thanks, >>> >>> Danyang >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> -- >>> Sent from my Android device with K-9 Mail. Please excuse my brevity. >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ > From knepley at gmail.com Mon Sep 16 14:02:55 2019 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 16 Sep 2019 15:02:55 -0400 Subject: [petsc-users] Strange Partition in PETSc 3.11 version on some computers In-Reply-To: References: <9c36fe1c-e6c2-0278-864e-f1453687d3f9@gmail.com> <5217DF9F-E42D-4E1F-AE9B-1088954548BB@anl.gov> <4ffe88c2-354c-07d7-ab1e-0f1edd8ec3c3@gmail.com> Message-ID: On Mon, Sep 16, 2019 at 1:46 PM Smith, Barry F. wrote: > > Very different stuff going on in the two cases, different objects being > created, different number of different types of operations. Clearly a major > refactorization of the code was done. Presumably a regression was > introduced that changed the behavior dramatically, possible by mistake. > > You can attempt to use git bisect to determine what changed caused the > dramatic change in behavior. Then it can be decided if the changed that > triggered the change in the results was a bug or a planned feature. > Danyang, Can you send me the smallest mesh you care about, and I will look at the partitioning? We can at least get quality metrics between these two releases. Thanks, Matt > Barry > > > > On Sep 16, 2019, at 11:50 AM, Danyang Su wrote: > > > > Hi Barry and Matt, > > > > Attached is the output of both runs with -dm_view -log_view included. > > > > I am now coordinating with staff to install PETSc 3.9.3 version using > intel2019u4 to narrow down the problem. Will get back to you later after > the test. > > > > Thanks, > > > > Danyang > > > > On 2019-09-15 4:43 p.m., Smith, Barry F. wrote: > >> Send the configure.log and make.log for the two system configurations > that produce very different results as well as the output running with > -dm_view -info for both runs. 
The cause is likely not subtle, one is likely > using metis and the other is likely just not using any partitioner. > >> > >> > >> > >>> On Sep 15, 2019, at 6:07 PM, Matthew Knepley via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >>> > >>> On Sun, Sep 15, 2019 at 6:59 PM Danyang Su > wrote: > >>> Hi Matt, > >>> > >>> Thanks for the quick reply. I have no change in the adjacency. The > source code and the simulation input files are all the same. I also tried > to use GNU compiler and mpich with petsc 3.11.3 and it works fine. > >>> > >>> It looks like the problem is caused by the difference in > configuration. However, the configuration is pretty the same as petsc 3.9.3 > except the compiler and mpi used. I will contact scinet staff to check if > they have any idea on this. > >>> > >>> Very very strange since the partition is handled completely by Metis, > and does not use MPI. > >>> > >>> Thanks, > >>> > >>> Matt > >>> Thanks, > >>> > >>> Danyang > >>> > >>> On September 15, 2019 3:20:18 p.m. PDT, Matthew Knepley < > knepley at gmail.com> wrote: > >>> On Sun, Sep 15, 2019 at 5:19 PM Danyang Su via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >>> Dear All, > >>> > >>> I have a question regarding strange partition problem in PETSc 3.11 > version. The problem does not exist on my local workstation. However, on a > cluster with different PETSc versions, the partition seems quite different, > as you can find in the figure below, which is tested with 160 processors. > The color means the processor owns that subdomain. In this layered prism > mesh, there are 40 layers from bottom to top and each layer has around 20k > nodes. The natural order of nodes is also layered from bottom to top. > >>> > >>> The left partition (PETSc 3.10 and earlier) looks good with minimum > number of ghost nodes while the right one (PETSc 3.11) looks weired with > huge number of ghost nodes. Looks like the right one uses partition layer > by layer. This problem exists on a a cluster but not on my local > workstation for the same PETSc version (with different compiler and MPI). > Other than the difference in partition and efficiency, the simulation > results are the same. 
> >>> > >>> > >>> > >>> > >>> Below is PETSc configuration on three machine: > >>> > >>> Local workstation (works fine): ./configure --with-cc=gcc > --with-cxx=g++ --with-fc=gfortran --download-mpich --download-scalapack > --download-parmetis --download-metis --download-ptscotch > --download-fblaslapack --download-hypre --download-superlu_dist > --download-hdf5=yes --download-ctetgen --with-debugging=0 COPTFLAGS=-O3 > CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 --with-cxx-dialect=C++11 > >>> > >>> Cluster with PETSc 3.9.3 (works fine): > --prefix=/scinet/niagara/software/2018a/opt/intel-2018.2-intelmpi-2018.2/petsc/3.9.3 > CC=mpicc CXX=mpicxx F77=mpif77 F90=mpif90 FC=mpifc COPTFLAGS="-march=native > -O2" CXXOPTFLAGS="-march=native -O2" FOPTFLAGS="-march=native -O2" > --download-chaco=1 --download-hypre=1 --download-metis=1 --download-ml=1 > --download-mumps=1 --download-parmetis=1 --download-plapack=1 > --download-prometheus=1 --download-ptscotch=1 --download-scotch=1 > --download-sprng=1 --download-superlu=1 --download-superlu_dist=1 > --download-triangle=1 --with-avx512-kernels=1 > --with-blaslapack-dir=/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl > --with-debugging=0 --with-hdf5=1 > --with-mkl_pardiso-dir=/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl > --with-scalapack=1 > --with-scalapack-lib="[/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_scalapack_lp64.so,/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so]" > --with-x=0 > >>> > >>> Cluster with PETSc 3.11.3 (looks weired): > --prefix=/scinet/niagara/software/2019b/opt/intel-2019u4-intelmpi-2019u4/petsc/3.11.3 > CC=mpicc CXX=mpicxx F77=mpif77 F90=mpif90 FC=mpifc COPTFLAGS="-march=native > -O2" CXXOPTFLAGS="-march=native -O2" FOPTFLAGS="-march=native -O2" > --download-chaco=1 --download-hdf5=1 --download-hypre=1 --download-metis=1 > --download-ml=1 --download-mumps=1 --download-parmetis=1 > --download-plapack=1 --download-prometheus=1 --download-ptscotch=1 > --download-scotch=1 --download-sprng=1 --download-superlu=1 > --download-superlu_dist=1 --download-triangle=1 --with-avx512-kernels=1 > --with-blaslapack-dir=/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl > --with-cxx-dialect=C++11 --with-debugging=0 > --with-mkl_pardiso-dir=/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl > --with-scalapack=1 > --with-scalapack-lib="[/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl/lib/intel64/libmkl_scalapack_lp64.so,/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so]" > --with-x=0 > >>> > >>> And the partition is used by default dmplex distribution. > >>> > >>> !c distribute mesh over processes > >>> call DMPlexDistribute(dmda_flow%da,stencil_width, > & > >>> PETSC_NULL_SF, > & > >>> PETSC_NULL_OBJECT, > & > >>> distributedMesh,ierr) > >>> CHKERRQ(ierr) > >>> > >>> Any idea on this strange problem? > >>> > >>> > >>> I just looked at the code. Your mesh should be partitioned by k-way > partitioning using Metis since its on 1 proc for partitioning. This code > >>> is the same for 3.9 and 3.11, and you get the same result on your > machine. I cannot understand what might be happening on your cluster > >>> (MPI plays no role). Is it possible that you changed the adjacency > specification in that version? 
> >>> > >>> Thanks, > >>> > >>> Matt > >>> Thanks, > >>> > >>> Danyang > >>> > >>> > >>> > >>> -- > >>> What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > >>> -- Norbert Wiener > >>> > >>> https://www.cse.buffalo.edu/~knepley/ > >>> > >>> -- > >>> Sent from my Android device with K-9 Mail. Please excuse my brevity. > >>> > >>> > >>> -- > >>> What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > >>> -- Norbert Wiener > >>> > >>> https://www.cse.buffalo.edu/~knepley/ > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From juaneah at gmail.com Mon Sep 16 23:13:59 2019 From: juaneah at gmail.com (Emmanuel Ayala) Date: Mon, 16 Sep 2019 23:13:59 -0500 Subject: [petsc-users] Optimized mode Message-ID: Hi everyone, I have a code for elastic linear analysis which involves DMDA, SuperLu and matrix operations. I have a MPI vector which contains the nodal displacements, when I run the code in debug mode i get the same value for the norm (2 or Inf), for different number of process. But when I use the optimized mode, I have small variations for the same norm depending on the number of process. -Debug mode: the same norm value for any number of processes. -Optimized mode: the norm value changes with the number of processes. The variations are around 2.0e-4 units. * This is normal?* For my optimized mode I used the next configuration ./configure --with-debugging=0 COPTFLAGS='-O3 -march=native -mtune=native' CXXOPTFLAGS='-O3 -march=native -mtune=native' FOPTFLAGS='-O3 -march=native -mtune=native' --download-mpich --download-superlu_dist --download-metis --download-parmetis --download-cmake --download-fblaslapack=1 --with-cxx-dialect=C++11 Best regards. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Sep 16 23:40:49 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Tue, 17 Sep 2019 04:40:49 +0000 Subject: [petsc-users] Optimized mode In-Reply-To: References: Message-ID: <88D14AC3-F5CA-4A72-82F4-F1D74EF96FF5@anl.gov> What do you mean by 2.0e-4 units ? If you mean the last 4 digits may differ in the two solutions, yes that is completely normal. How many digits you lose depends on the order of the operations and the condition number of the matrix and and for elasticity that will very easily be greater than 10^4 Barry From: https://pdfs.semanticscholar.org/dccf/d6daa35fc9d585de1f927c58cc29c4cd0bab.pdf We conclude this section by noting the need for care in interpreting the forward error. Experiments in [24] show that simply changing the order of evaluation of an inner product in the substitution algorithm for solution of a triangular system can change the forward error in the computed solution by orders of magnitude. This means, for example, that it is dangerous to compare different codes or algorithms solely in terms of observed forward errors. > On Sep 16, 2019, at 11:13 PM, Emmanuel Ayala via petsc-users wrote: > > Hi everyone, > > I have a code for elastic linear analysis which involves DMDA, SuperLu and matrix operations. 
I have a MPI vector which contains the nodal displacements, when I run the code in debug mode i get the same value for the norm (2 or Inf), for different number of process. But when I use the optimized mode, I have small variations for the same norm depending on the number of process. > > -Debug mode: the same norm value for any number of processes. > -Optimized mode: the norm value changes with the number of processes. The variations are around 2.0e-4 units. > > This is normal? > > For my optimized mode I used the next configuration > > ./configure --with-debugging=0 COPTFLAGS='-O3 -march=native -mtune=native' CXXOPTFLAGS='-O3 -march=native -mtune=native' FOPTFLAGS='-O3 -march=native -mtune=native' --download-mpich --download-superlu_dist --download-metis --download-parmetis --download-cmake --download-fblaslapack=1 --with-cxx-dialect=C++11 > > Best regards. From danyang.su at gmail.com Mon Sep 16 23:41:45 2019 From: danyang.su at gmail.com (Danyang Su) Date: Mon, 16 Sep 2019 21:41:45 -0700 Subject: [petsc-users] Strange Partition in PETSc 3.11 version on some computers In-Reply-To: References: <9c36fe1c-e6c2-0278-864e-f1453687d3f9@gmail.com> <5217DF9F-E42D-4E1F-AE9B-1088954548BB@anl.gov> <4ffe88c2-354c-07d7-ab1e-0f1edd8ec3c3@gmail.com> Message-ID: On 2019-09-16 12:02 p.m., Matthew Knepley wrote: > On Mon, Sep 16, 2019 at 1:46 PM Smith, Barry F. > wrote: > > > ? Very different stuff going on in the two cases, different > objects being created, different number of different types of > operations. Clearly a major refactorization of the code was done. > Presumably a regression was introduced that changed the behavior > dramatically, possible by mistake. > > ? ?You can attempt to use git bisect to determine what changed > caused the dramatic change in behavior. Then it can be decided if > the changed that triggered the change in the results was a bug or > a planned feature. > > > Danyang, > > Can you send me the smallest mesh you care about, and I will look at > the partitioning? We can at least get quality metrics > between these two releases. > > ? Thanks, > > ? ? ?Matt Hi Matt, This is the smallest mesh for the regional scale simulation that has strange partition problem. It can be download via the link below. https://www.dropbox.com/s/tu34jgqqhkz8pwj/basin-3d.vtk?dl=0 I am trying to reproduce the similar problem using smaller 2D mesh, however, there is no such problem in 2D, even though the partitions using PETSc 3.9.3 and 3.11.3 are a bit different, they both look reasonable. As shown below, both rectangular mesh and triangular mesh use DMPlex. 2D rectangular and triangle mesh I will keep on testing using PETSc3.11.3 but with different compiler and MPI to check if I can reproduce the problem. Thanks, Danyang > > ? ?Barry > > > > On Sep 16, 2019, at 11:50 AM, Danyang Su > wrote: > > > > Hi Barry and Matt, > > > > Attached is the output of both runs with -dm_view -log_view > included. > > > > I am now coordinating with staff to install PETSc 3.9.3 version > using intel2019u4 to narrow down the problem. Will get back to you > later after the test. > > > > Thanks, > > > > Danyang > > > > On 2019-09-15 4:43 p.m., Smith, Barry F. wrote: > >>? ?Send the configure.log and make.log for the two system > configurations that produce very different results as well as the > output running with -dm_view -info for both runs. The cause is > likely not subtle, one is likely using metis and the other is > likely just not using any partitioner. 
> >> > >> > >> > >>> On Sep 15, 2019, at 6:07 PM, Matthew Knepley via petsc-users > > wrote: > >>> > >>> On Sun, Sep 15, 2019 at 6:59 PM Danyang Su > > wrote: > >>> Hi Matt, > >>> > >>> Thanks for the quick reply. I have no change in the adjacency. > The source code and the simulation input files are all the same. I > also tried to use GNU compiler and mpich with petsc 3.11.3 and it > works fine. > >>> > >>> It looks like the problem is caused by the difference in > configuration. However, the configuration is pretty the same as > petsc 3.9.3 except the compiler and mpi used. I will contact > scinet staff to check if they have any idea on this. > >>> > >>> Very very strange since the partition is handled completely by > Metis, and does not use MPI. > >>> > >>>? ?Thanks, > >>> > >>>? ? ?Matt > >>>? Thanks, > >>> > >>> Danyang > >>> > >>> On September 15, 2019 3:20:18 p.m. PDT, Matthew Knepley > > wrote: > >>> On Sun, Sep 15, 2019 at 5:19 PM Danyang Su via petsc-users > > wrote: > >>> Dear All, > >>> > >>> I have a question regarding strange partition problem in PETSc > 3.11 version. The problem does not exist on my local workstation. > However, on a cluster with different PETSc versions, the partition > seems quite different, as you can find in the figure below, which > is tested with 160 processors. The color means the processor owns > that subdomain. In this layered prism mesh, there are 40 layers > from bottom to top and each layer has around 20k nodes. The > natural order of nodes is also layered from bottom to top. > >>> > >>> The left partition (PETSc 3.10 and earlier) looks good with > minimum number of ghost nodes while the right one (PETSc 3.11) > looks weired with huge number of ghost nodes. Looks like the right > one uses partition layer by layer. This problem exists on a a > cluster but not on my local workstation for the same PETSc version > (with different compiler and MPI). Other than the difference in > partition and efficiency, the simulation results are the same. > >>> > >>> > >>> > >>> > >>> Below is PETSc configuration on three machine: > >>> > >>> Local workstation (works fine):? 
./configure --with-cc=gcc > --with-cxx=g++ --with-fc=gfortran --download-mpich > --download-scalapack --download-parmetis --download-metis > --download-ptscotch --download-fblaslapack --download-hypre > --download-superlu_dist --download-hdf5=yes --download-ctetgen > --with-debugging=0 COPTFLAGS=-O3 CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 > --with-cxx-dialect=C++11 > >>> > >>> Cluster with PETSc 3.9.3 (works fine): > --prefix=/scinet/niagara/software/2018a/opt/intel-2018.2-intelmpi-2018.2/petsc/3.9.3 > CC=mpicc CXX=mpicxx F77=mpif77 F90=mpif90 FC=mpifc > COPTFLAGS="-march=native -O2" CXXOPTFLAGS="-march=native -O2" > FOPTFLAGS="-march=native -O2" --download-chaco=1 > --download-hypre=1 --download-metis=1 --download-ml=1 > --download-mumps=1 --download-parmetis=1 --download-plapack=1 > --download-prometheus=1 --download-ptscotch=1 --download-scotch=1 > --download-sprng=1 --download-superlu=1 --download-superlu_dist=1 > --download-triangle=1 --with-avx512-kernels=1 > --with-blaslapack-dir=/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl > --with-debugging=0 --with-hdf5=1 > --with-mkl_pardiso-dir=/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl > --with-scalapack=1 > --with-scalapack-lib="[/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_scalapack_lp64.so,/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so]" > --with-x=0 > >>> > >>> Cluster with PETSc 3.11.3 (looks weired): > --prefix=/scinet/niagara/software/2019b/opt/intel-2019u4-intelmpi-2019u4/petsc/3.11.3 > CC=mpicc CXX=mpicxx F77=mpif77 F90=mpif90 FC=mpifc > COPTFLAGS="-march=native -O2" CXXOPTFLAGS="-march=native -O2" > FOPTFLAGS="-march=native -O2" --download-chaco=1 --download-hdf5=1 > --download-hypre=1 --download-metis=1 --download-ml=1 > --download-mumps=1 --download-parmetis=1 --download-plapack=1 > --download-prometheus=1 --download-ptscotch=1 --download-scotch=1 > --download-sprng=1 --download-superlu=1 --download-superlu_dist=1 > --download-triangle=1 --with-avx512-kernels=1 > --with-blaslapack-dir=/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl > --with-cxx-dialect=C++11 --with-debugging=0 > --with-mkl_pardiso-dir=/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl > --with-scalapack=1 > --with-scalapack-lib="[/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl/lib/intel64/libmkl_scalapack_lp64.so,/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so]" > --with-x=0 > >>> > >>> And the partition is used by default dmplex distribution. > >>> > >>>? ? ? ?!c distribute mesh over processes > >>>? ? ? ?call DMPlexDistribute(dmda_flow%da,stencil_width, & > >>>? ? ? ? ? ? ? ? ? ? ? ? ? ? ?PETSC_NULL_SF, ? ? ? ? ? ? ? ? ? ? > ? ?& > >>>? ? ? ? ? ? ? ? ? ? ? ? ? ? ?PETSC_NULL_OBJECT, ? ? ? ? ? ? ? ? > ? ? ? ?& > >>> ?distributedMesh,ierr) > >>>? ? ? ?CHKERRQ(ierr) > >>> > >>> Any idea on this strange problem? > >>> > >>> > >>> I just looked at the code. Your mesh should be partitioned by > k-way partitioning using Metis since its on 1 proc for > partitioning. This code > >>> is the same for 3.9 and 3.11, and you get the same result on > your machine. I cannot understand what might be happening on your > cluster > >>> (MPI plays no role). Is it possible that you changed the > adjacency specification in that version? > >>> > >>>? ?Thanks, > >>> > >>>? ? ? 
Matt > >>> Thanks, > >>> > >>> Danyang > >>> > >>> > >>> > >>> -- > >>> What most experimenters take for granted before they begin > their experiments is infinitely more interesting than any results > to which their experiments lead. > >>> -- Norbert Wiener > >>> > >>> https://www.cse.buffalo.edu/~knepley/ > >>> > >>> -- > >>> Sent from my Android device with K-9 Mail. Please excuse my > brevity. > >>> > >>> > >>> -- > >>> What most experimenters take for granted before they begin > their experiments is infinitely more interesting than any results > to which their experiments lead. > >>> -- Norbert Wiener > >>> > >>> https://www.cse.buffalo.edu/~knepley/ > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: petsc-partition-compare.png Type: image/png Size: 69346 bytes Desc: not available URL: From mhbaghaei at mail.sjtu.edu.cn Tue Sep 17 08:27:44 2019 From: mhbaghaei at mail.sjtu.edu.cn (Mohammad Hassan) Date: Tue, 17 Sep 2019 21:27:44 +0800 Subject: [petsc-users] DMPlex Distribution Message-ID: <001001d56d5b$b0a40b50$11ec21f0$@mail.sjtu.edu.cn> Hi I am using DMPlexCreateFromDAG() to construct my DM. Is it possible to set the distribution across processors manually. I mean, how can I set the share of dm on each rank (local)? Thanks Amir -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Sep 17 10:04:04 2019 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 17 Sep 2019 11:04:04 -0400 Subject: [petsc-users] DMPlex Distribution In-Reply-To: <001001d56d5b$b0a40b50$11ec21f0$@mail.sjtu.edu.cn> References: <001001d56d5b$b0a40b50$11ec21f0$@mail.sjtu.edu.cn> Message-ID: On Tue, Sep 17, 2019 at 9:27 AM Mohammad Hassan via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hi > > I am using DMPlexCreateFromDAG() to construct my DM. Is it possible to set > the distribution across processors manually. I mean, how can I set the > share of dm on each rank (local)? > > You could make a Shell partitioner and tell it the entire partition: https://www.mcs.anl.gov/petsc/petsc-master/docs/manualpages/DMPLEX/PetscPartitionerShellSetPartition.html However, I would be surprised if you could do this. It is likely that you just want to mess with the weights in ParMetis. Thanks, Matt > Thanks > > Amir > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Tue Sep 17 11:02:20 2019 From: mfadams at lbl.gov (Mark Adams) Date: Tue, 17 Sep 2019 12:02:20 -0400 Subject: [petsc-users] Strange Partition in PETSc 3.11 version on some computers In-Reply-To: References: <9c36fe1c-e6c2-0278-864e-f1453687d3f9@gmail.com> <5217DF9F-E42D-4E1F-AE9B-1088954548BB@anl.gov> <4ffe88c2-354c-07d7-ab1e-0f1edd8ec3c3@gmail.com> Message-ID: Danyang, Excuse me if I missed something in this thread but just a few ideas. First, I trust that you have verified that you are getting a good solution with these bad meshes. 
Ideally you would check that the solver convergence rates are similar. You might verify that your mesh is inside of DMPLex correctly. You can visualize a Plex mesh very easily. (let us know if you need instructions). This striping on the 2D meshes look something like what you are getting with your 3D PRISM mesh. DMPLex just calls Parmetis with a flat graph. It is odd to me that your rectangular grids have so much structure and are non-isotropic. I assume that these rectangular meshes are isotropic (eg, squares). Anyway, just some thoughts, Mark On Tue, Sep 17, 2019 at 12:43 AM Danyang Su via petsc-users < petsc-users at mcs.anl.gov> wrote: > > On 2019-09-16 12:02 p.m., Matthew Knepley wrote: > > On Mon, Sep 16, 2019 at 1:46 PM Smith, Barry F. > wrote: > >> >> Very different stuff going on in the two cases, different objects being >> created, different number of different types of operations. Clearly a major >> refactorization of the code was done. Presumably a regression was >> introduced that changed the behavior dramatically, possible by mistake. >> >> You can attempt to use git bisect to determine what changed caused the >> dramatic change in behavior. Then it can be decided if the changed that >> triggered the change in the results was a bug or a planned feature. >> > > Danyang, > > Can you send me the smallest mesh you care about, and I will look at the > partitioning? We can at least get quality metrics > between these two releases. > > Thanks, > > Matt > > Hi Matt, > > This is the smallest mesh for the regional scale simulation that has > strange partition problem. It can be download via the link below. > > https://www.dropbox.com/s/tu34jgqqhkz8pwj/basin-3d.vtk?dl=0 > > I am trying to reproduce the similar problem using smaller 2D mesh, > however, there is no such problem in 2D, even though the partitions using > PETSc 3.9.3 and 3.11.3 are a bit different, they both look reasonable. As > shown below, both rectangular mesh and triangular mesh use DMPlex. > > [image: 2D rectangular and triangle mesh] > > I will keep on testing using PETSc3.11.3 but with different compiler and > MPI to check if I can reproduce the problem. > > Thanks, > > Danyang > > > >> Barry >> >> >> > On Sep 16, 2019, at 11:50 AM, Danyang Su wrote: >> > >> > Hi Barry and Matt, >> > >> > Attached is the output of both runs with -dm_view -log_view included. >> > >> > I am now coordinating with staff to install PETSc 3.9.3 version using >> intel2019u4 to narrow down the problem. Will get back to you later after >> the test. >> > >> > Thanks, >> > >> > Danyang >> > >> > On 2019-09-15 4:43 p.m., Smith, Barry F. wrote: >> >> Send the configure.log and make.log for the two system >> configurations that produce very different results as well as the output >> running with -dm_view -info for both runs. The cause is likely not subtle, >> one is likely using metis and the other is likely just not using any >> partitioner. >> >> >> >> >> >> >> >>> On Sep 15, 2019, at 6:07 PM, Matthew Knepley via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> >>> >> >>> On Sun, Sep 15, 2019 at 6:59 PM Danyang Su >> wrote: >> >>> Hi Matt, >> >>> >> >>> Thanks for the quick reply. I have no change in the adjacency. The >> source code and the simulation input files are all the same. I also tried >> to use GNU compiler and mpich with petsc 3.11.3 and it works fine. >> >>> >> >>> It looks like the problem is caused by the difference in >> configuration. 
However, the configuration is pretty the same as petsc 3.9.3 >> except the compiler and mpi used. I will contact scinet staff to check if >> they have any idea on this. >> >>> >> >>> Very very strange since the partition is handled completely by Metis, >> and does not use MPI. >> >>> >> >>> Thanks, >> >>> >> >>> Matt >> >>> Thanks, >> >>> >> >>> Danyang >> >>> >> >>> On September 15, 2019 3:20:18 p.m. PDT, Matthew Knepley < >> knepley at gmail.com> wrote: >> >>> On Sun, Sep 15, 2019 at 5:19 PM Danyang Su via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> >>> Dear All, >> >>> >> >>> I have a question regarding strange partition problem in PETSc 3.11 >> version. The problem does not exist on my local workstation. However, on a >> cluster with different PETSc versions, the partition seems quite different, >> as you can find in the figure below, which is tested with 160 processors. >> The color means the processor owns that subdomain. In this layered prism >> mesh, there are 40 layers from bottom to top and each layer has around 20k >> nodes. The natural order of nodes is also layered from bottom to top. >> >>> >> >>> The left partition (PETSc 3.10 and earlier) looks good with minimum >> number of ghost nodes while the right one (PETSc 3.11) looks weired with >> huge number of ghost nodes. Looks like the right one uses partition layer >> by layer. This problem exists on a a cluster but not on my local >> workstation for the same PETSc version (with different compiler and MPI). >> Other than the difference in partition and efficiency, the simulation >> results are the same. >> >>> >> >>> >> >>> >> >>> >> >>> Below is PETSc configuration on three machine: >> >>> >> >>> Local workstation (works fine): ./configure --with-cc=gcc >> --with-cxx=g++ --with-fc=gfortran --download-mpich --download-scalapack >> --download-parmetis --download-metis --download-ptscotch >> --download-fblaslapack --download-hypre --download-superlu_dist >> --download-hdf5=yes --download-ctetgen --with-debugging=0 COPTFLAGS=-O3 >> CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 --with-cxx-dialect=C++11 >> >>> >> >>> Cluster with PETSc 3.9.3 (works fine): >> --prefix=/scinet/niagara/software/2018a/opt/intel-2018.2-intelmpi-2018.2/petsc/3.9.3 >> CC=mpicc CXX=mpicxx F77=mpif77 F90=mpif90 FC=mpifc COPTFLAGS="-march=native >> -O2" CXXOPTFLAGS="-march=native -O2" FOPTFLAGS="-march=native -O2" >> --download-chaco=1 --download-hypre=1 --download-metis=1 --download-ml=1 >> --download-mumps=1 --download-parmetis=1 --download-plapack=1 >> --download-prometheus=1 --download-ptscotch=1 --download-scotch=1 >> --download-sprng=1 --download-superlu=1 --download-superlu_dist=1 >> --download-triangle=1 --with-avx512-kernels=1 >> --with-blaslapack-dir=/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl >> --with-debugging=0 --with-hdf5=1 >> --with-mkl_pardiso-dir=/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl >> --with-scalapack=1 >> --with-scalapack-lib="[/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_scalapack_lp64.so,/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so]" >> --with-x=0 >> >>> >> >>> Cluster with PETSc 3.11.3 (looks weired): >> --prefix=/scinet/niagara/software/2019b/opt/intel-2019u4-intelmpi-2019u4/petsc/3.11.3 >> CC=mpicc CXX=mpicxx F77=mpif77 F90=mpif90 FC=mpifc COPTFLAGS="-march=native >> -O2" CXXOPTFLAGS="-march=native -O2" FOPTFLAGS="-march=native -O2" >> --download-chaco=1 
--download-hdf5=1 --download-hypre=1 --download-metis=1 >> --download-ml=1 --download-mumps=1 --download-parmetis=1 >> --download-plapack=1 --download-prometheus=1 --download-ptscotch=1 >> --download-scotch=1 --download-sprng=1 --download-superlu=1 >> --download-superlu_dist=1 --download-triangle=1 --with-avx512-kernels=1 >> --with-blaslapack-dir=/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl >> --with-cxx-dialect=C++11 --with-debugging=0 >> --with-mkl_pardiso-dir=/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl >> --with-scalapack=1 >> --with-scalapack-lib="[/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl/lib/intel64/libmkl_scalapack_lp64.so,/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so]" >> --with-x=0 >> >>> >> >>> And the partition is used by default dmplex distribution. >> >>> >> >>> !c distribute mesh over processes >> >>> call DMPlexDistribute(dmda_flow%da,stencil_width, >> & >> >>> PETSC_NULL_SF, >> & >> >>> PETSC_NULL_OBJECT, >> & >> >>> distributedMesh,ierr) >> >>> CHKERRQ(ierr) >> >>> >> >>> Any idea on this strange problem? >> >>> >> >>> >> >>> I just looked at the code. Your mesh should be partitioned by k-way >> partitioning using Metis since its on 1 proc for partitioning. This code >> >>> is the same for 3.9 and 3.11, and you get the same result on your >> machine. I cannot understand what might be happening on your cluster >> >>> (MPI plays no role). Is it possible that you changed the adjacency >> specification in that version? >> >>> >> >>> Thanks, >> >>> >> >>> Matt >> >>> Thanks, >> >>> >> >>> Danyang >> >>> >> >>> >> >>> >> >>> -- >> >>> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> >>> -- Norbert Wiener >> >>> >> >>> https://www.cse.buffalo.edu/~knepley/ >> >>> >> >>> -- >> >>> Sent from my Android device with K-9 Mail. Please excuse my brevity. >> >>> >> >>> >> >>> -- >> >>> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> >>> -- Norbert Wiener >> >>> >> >>> https://www.cse.buffalo.edu/~knepley/ >> > >> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: petsc-partition-compare.png Type: image/png Size: 69346 bytes Desc: not available URL: From mhbaghaei at mail.sjtu.edu.cn Tue Sep 17 11:04:56 2019 From: mhbaghaei at mail.sjtu.edu.cn (Mohammad Hassan) Date: Wed, 18 Sep 2019 00:04:56 +0800 Subject: [petsc-users] DMPlex Distribution In-Reply-To: References: <001001d56d5b$b0a40b50$11ec21f0$@mail.sjtu.edu.cn> Message-ID: <004401d56d71$a6b41750$f41c45f0$@mail.sjtu.edu.cn> Thanks for suggestion. I am going to use a block-based amr. I think I need to know exactly the mesh distribution of blocks across different processors for implementation of amr. And as a general question, can we set block size of vector on each rank? 
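For reference, a minimal sketch (in C) of how the Shell partitioner mentioned earlier in this thread could be handed an explicit, user-chosen partition before distributing the DMPlex; the function name and the arrays nparts, partSizes and partPoints are illustrative placeholders for whatever the AMR block layout provides, not part of any existing code:

#include <petscdmplex.h>

/* Sketch: prescribe the partition used by DMPlexDistribute().
   partSizes[r] is the number of cells assigned to rank r and partPoints[]
   lists those cell numbers, concatenated rank by rank; both are assumed to
   be supplied on the rank(s) that currently hold the serial mesh. */
static PetscErrorCode DistributeWithPrescribedPartition(DM dm, PetscInt nparts,
                                                        const PetscInt partSizes[],
                                                        const PetscInt partPoints[],
                                                        DM *dmDist)
{
  PetscPartitioner part;
  PetscErrorCode   ierr;

  ierr = DMPlexGetPartitioner(dm, &part);CHKERRQ(ierr);
  ierr = PetscPartitionerSetType(part, PETSCPARTITIONERSHELL);CHKERRQ(ierr);
  ierr = PetscPartitionerShellSetPartition(part, nparts, partSizes, partPoints);CHKERRQ(ierr);
  ierr = DMPlexDistribute(dm, 0, NULL, dmDist);CHKERRQ(ierr); /* overlap = 0 here */
  return 0;
}

Whether prescribing the partition this way is preferable to simply weighting ParMetis, as suggested above, is exactly the trade-off being discussed in this thread.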
Thanks Amir From: Matthew Knepley [mailto:knepley at gmail.com] Sent: Tuesday, September 17, 2019 11:04 PM To: Mohammad Hassan Cc: PETSc Subject: Re: [petsc-users] DMPlex Distribution On Tue, Sep 17, 2019 at 9:27 AM Mohammad Hassan via petsc-users > wrote: Hi I am using DMPlexCreateFromDAG() to construct my DM. Is it possible to set the distribution across processors manually. I mean, how can I set the share of dm on each rank (local)? You could make a Shell partitioner and tell it the entire partition: https://www.mcs.anl.gov/petsc/petsc-master/docs/manualpages/DMPLEX/PetscPartitionerShellSetPartition.html However, I would be surprised if you could do this. It is likely that you just want to mess with the weights in ParMetis. Thanks, Matt Thanks Amir -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Tue Sep 17 11:43:05 2019 From: mfadams at lbl.gov (Mark Adams) Date: Tue, 17 Sep 2019 12:43:05 -0400 Subject: [petsc-users] DMPlex Distribution In-Reply-To: <004401d56d71$a6b41750$f41c45f0$@mail.sjtu.edu.cn> References: <001001d56d5b$b0a40b50$11ec21f0$@mail.sjtu.edu.cn> <004401d56d71$a6b41750$f41c45f0$@mail.sjtu.edu.cn> Message-ID: On Tue, Sep 17, 2019 at 12:07 PM Mohammad Hassan via petsc-users < petsc-users at mcs.anl.gov> wrote: > Thanks for suggestion. I am going to use a block-based amr. I think I need > to know exactly the mesh distribution of blocks across different processors > for implementation of amr. > > And as a general question, can we set block size of vector on each rank? > I don't understand what you mean by AMR in this context exactly. And I'm not sure what you mean by blocks size. Block size is the number of dof per vertex (eg, 3) and it is a constant for a vector. > Thanks > > Amir > > > > *From:* Matthew Knepley [mailto:knepley at gmail.com] > *Sent:* Tuesday, September 17, 2019 11:04 PM > *To:* Mohammad Hassan > *Cc:* PETSc > *Subject:* Re: [petsc-users] DMPlex Distribution > > > > On Tue, Sep 17, 2019 at 9:27 AM Mohammad Hassan via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > Hi > > I am using DMPlexCreateFromDAG() to construct my DM. Is it possible to set > the distribution across processors manually. I mean, how can I set the > share of dm on each rank (local)? > > > > You could make a Shell partitioner and tell it the entire partition: > > > > > https://www.mcs.anl.gov/petsc/petsc-master/docs/manualpages/DMPLEX/PetscPartitionerShellSetPartition.html > > > > However, I would be surprised if you could do this. It is likely that you > just want to mess with the weights in ParMetis. > > > > Thanks, > > > > Matt > > > > Thanks > > Amir > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From danyang.su at gmail.com Tue Sep 17 11:53:37 2019 From: danyang.su at gmail.com (Danyang Su) Date: Tue, 17 Sep 2019 09:53:37 -0700 Subject: [petsc-users] Strange Partition in PETSc 3.11 version on some computers In-Reply-To: References: <9c36fe1c-e6c2-0278-864e-f1453687d3f9@gmail.com> <5217DF9F-E42D-4E1F-AE9B-1088954548BB@anl.gov> <4ffe88c2-354c-07d7-ab1e-0f1edd8ec3c3@gmail.com> Message-ID: <89d9f65e-c185-0f6f-cd27-d372303cecc3@gmail.com> Hi Mark, Thanks for your follow-up. The unstructured grid code has been verified and there is no problem in the results. The convergence rate is also good. The 3D mesh is not good, it is based on the original stratum which I haven't refined, but good for initial test as it is relative small and the results obtained from this mesh still makes sense. The 2D meshes are just for testing purpose as I want to reproduce the partition problem on a cluster using PETSc3.11.3 and Intel2019. Unfortunately, I didn't find problem using this example. The code has no problem in using different PETSc versions (PETSc V3.4 to V3.11) and MPI distribution (MPICH, OpenMPI, IntelMPI), except for one simulation case (the mesh I attached) on a cluster with PETSc3.11.3 and Intel2019u4 due to the very different partition compared to PETSc3.9.3. Yet the simulation results are the same except for the efficiency problem because the strange partition results into much more communication (ghost nodes). I am still trying different compiler and mpi with PETSc3.11.3 on that cluster to trace the problem. Will get back to you guys when there is update. Thanks, danyang On 2019-09-17 9:02 a.m., Mark Adams wrote: > Danyang, > > Excuse me if I missed?something in this thread but just a few ideas. > > First, I trust that you have verified that you are getting a good > solution with these bad meshes. Ideally you would check that the > solver convergence rates are similar. > > You might verify that your mesh is inside of DMPLex correctly. You can > visualize?a Plex mesh very?easily. (let us know if you need instructions). > > This striping on the 2D meshes look something like what you are > getting with your 3D PRISM mesh. DMPLex just calls Parmetis with a > flat graph. It is odd to me that your rectangular grids have so much > structure and are non-isotropic. I assume that these > rectangular?meshes are isotropic?(eg, squares). > > Anyway, just some thoughts, > Mark > > On Tue, Sep 17, 2019 at 12:43 AM Danyang Su via petsc-users > > wrote: > > > On 2019-09-16 12:02 p.m., Matthew Knepley wrote: >> On Mon, Sep 16, 2019 at 1:46 PM Smith, Barry F. >> > wrote: >> >> >> ? Very different stuff going on in the two cases, different >> objects being created, different number of different types of >> operations. Clearly a major refactorization of the code was >> done. Presumably a regression was introduced that changed the >> behavior dramatically, possible by mistake. >> >> ? ?You can attempt to use git bisect to determine what >> changed caused the dramatic change in behavior. Then it can >> be decided if the changed that triggered the change in the >> results was a bug or a planned feature. >> >> >> Danyang, >> >> Can you send me the smallest mesh you care about, and I will look >> at the partitioning? We can at least get quality metrics >> between these two releases. >> >> ? Thanks, >> >> ? ? ?Matt > > Hi Matt, > > This is the smallest mesh for the regional scale simulation that > has strange partition problem. It can be download via the link below. 
> > https://www.dropbox.com/s/tu34jgqqhkz8pwj/basin-3d.vtk?dl=0 > > I am trying to reproduce the similar problem using smaller 2D > mesh, however, there is no such problem in 2D, even though the > partitions using PETSc 3.9.3 and 3.11.3 are a bit different, they > both look reasonable. As shown below, both rectangular mesh and > triangular mesh use DMPlex. > > 2D rectangular and triangle mesh > > I will keep on testing using PETSc3.11.3 but with different > compiler and MPI to check if I can reproduce the problem. > > Thanks, > > Danyang > >> >> ? ?Barry >> >> >> > On Sep 16, 2019, at 11:50 AM, Danyang Su >> > wrote: >> > >> > Hi Barry and Matt, >> > >> > Attached is the output of both runs with -dm_view -log_view >> included. >> > >> > I am now coordinating with staff to install PETSc 3.9.3 >> version using intel2019u4 to narrow down the problem. Will >> get back to you later after the test. >> > >> > Thanks, >> > >> > Danyang >> > >> > On 2019-09-15 4:43 p.m., Smith, Barry F. wrote: >> >>? ?Send the configure.log and make.log for the two system >> configurations that produce very different results as well as >> the output running with -dm_view -info for both runs. The >> cause is likely not subtle, one is likely using metis and the >> other is likely just not using any partitioner. >> >> >> >> >> >> >> >>> On Sep 15, 2019, at 6:07 PM, Matthew Knepley via >> petsc-users > > wrote: >> >>> >> >>> On Sun, Sep 15, 2019 at 6:59 PM Danyang Su >> > wrote: >> >>> Hi Matt, >> >>> >> >>> Thanks for the quick reply. I have no change in the >> adjacency. The source code and the simulation input files are >> all the same. I also tried to use GNU compiler and mpich with >> petsc 3.11.3 and it works fine. >> >>> >> >>> It looks like the problem is caused by the difference in >> configuration. However, the configuration is pretty the same >> as petsc 3.9.3 except the compiler and mpi used. I will >> contact scinet staff to check if they have any idea on this. >> >>> >> >>> Very very strange since the partition is handled >> completely by Metis, and does not use MPI. >> >>> >> >>>? ?Thanks, >> >>> >> >>>? ? ?Matt >> >>>? Thanks, >> >>> >> >>> Danyang >> >>> >> >>> On September 15, 2019 3:20:18 p.m. PDT, Matthew Knepley >> > wrote: >> >>> On Sun, Sep 15, 2019 at 5:19 PM Danyang Su via >> petsc-users > > wrote: >> >>> Dear All, >> >>> >> >>> I have a question regarding strange partition problem in >> PETSc 3.11 version. The problem does not exist on my local >> workstation. However, on a cluster with different PETSc >> versions, the partition seems quite different, as you can >> find in the figure below, which is tested with 160 >> processors. The color means the processor owns that >> subdomain. In this layered prism mesh, there are 40 layers >> from bottom to top and each layer has around 20k nodes. The >> natural order of nodes is also layered from bottom to top. >> >>> >> >>> The left partition (PETSc 3.10 and earlier) looks good >> with minimum number of ghost nodes while the right one (PETSc >> 3.11) looks weired with huge number of ghost nodes. Looks >> like the right one uses partition layer by layer. This >> problem exists on a a cluster but not on my local workstation >> for the same PETSc version (with different compiler and MPI). >> Other than the difference in partition and efficiency, the >> simulation results are the same. 
>> >>> >> >>> >> >>> >> >>> >> >>> Below is PETSc configuration on three machine: >> >>> >> >>> Local workstation (works fine): ./configure --with-cc=gcc >> --with-cxx=g++ --with-fc=gfortran --download-mpich >> --download-scalapack --download-parmetis --download-metis >> --download-ptscotch --download-fblaslapack --download-hypre >> --download-superlu_dist --download-hdf5=yes >> --download-ctetgen --with-debugging=0 COPTFLAGS=-O3 >> CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 --with-cxx-dialect=C++11 >> >>> >> >>> Cluster with PETSc 3.9.3 (works fine): >> --prefix=/scinet/niagara/software/2018a/opt/intel-2018.2-intelmpi-2018.2/petsc/3.9.3 >> CC=mpicc CXX=mpicxx F77=mpif77 F90=mpif90 FC=mpifc >> COPTFLAGS="-march=native -O2" CXXOPTFLAGS="-march=native -O2" >> FOPTFLAGS="-march=native -O2" --download-chaco=1 >> --download-hypre=1 --download-metis=1 --download-ml=1 >> --download-mumps=1 --download-parmetis=1 --download-plapack=1 >> --download-prometheus=1 --download-ptscotch=1 >> --download-scotch=1 --download-sprng=1 --download-superlu=1 >> --download-superlu_dist=1 --download-triangle=1 >> --with-avx512-kernels=1 >> --with-blaslapack-dir=/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl >> --with-debugging=0 --with-hdf5=1 >> --with-mkl_pardiso-dir=/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl >> --with-scalapack=1 >> --with-scalapack-lib="[/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_scalapack_lp64.so,/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so]" >> --with-x=0 >> >>> >> >>> Cluster with PETSc 3.11.3 (looks weired): >> --prefix=/scinet/niagara/software/2019b/opt/intel-2019u4-intelmpi-2019u4/petsc/3.11.3 >> CC=mpicc CXX=mpicxx F77=mpif77 F90=mpif90 FC=mpifc >> COPTFLAGS="-march=native -O2" CXXOPTFLAGS="-march=native -O2" >> FOPTFLAGS="-march=native -O2" --download-chaco=1 >> --download-hdf5=1 --download-hypre=1 --download-metis=1 >> --download-ml=1 --download-mumps=1 --download-parmetis=1 >> --download-plapack=1 --download-prometheus=1 >> --download-ptscotch=1 --download-scotch=1 --download-sprng=1 >> --download-superlu=1 --download-superlu_dist=1 >> --download-triangle=1 --with-avx512-kernels=1 >> --with-blaslapack-dir=/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl >> --with-cxx-dialect=C++11 --with-debugging=0 >> --with-mkl_pardiso-dir=/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl >> --with-scalapack=1 >> --with-scalapack-lib="[/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl/lib/intel64/libmkl_scalapack_lp64.so,/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so]" >> --with-x=0 >> >>> >> >>> And the partition is used by default dmplex distribution. >> >>> >> >>>? ? ? ?!c distribute mesh over processes >> >>>? ? ? ?call DMPlexDistribute(dmda_flow%da,stencil_width, ? >> ? ? ? & >> >>> ?PETSC_NULL_SF,? ? ? ? ? ? ? ? ? ? ? ? ? ? ?& >> >>> ?PETSC_NULL_OBJECT,? ? ? ? ? ? ? ? ? ? ? ? ?& >> >>> ?distributedMesh,ierr) >> >>>? ? ? ?CHKERRQ(ierr) >> >>> >> >>> Any idea on this strange problem? >> >>> >> >>> >> >>> I just looked at the code. Your mesh should be >> partitioned by k-way partitioning using Metis since its on 1 >> proc for partitioning. This code >> >>> is the same for 3.9 and 3.11, and you get the same result >> on your machine. I cannot understand what might be happening >> on your cluster >> >>> (MPI plays no role). 
Is it possible that you changed the >> adjacency specification in that version? >> >>> >> >>>? ?Thanks, >> >>> >> >>>? ? ? Matt >> >>> Thanks, >> >>> >> >>> Danyang >> >>> >> >>> >> >>> >> >>> -- >> >>> What most experimenters take for granted before they >> begin their experiments is infinitely more interesting than >> any results to which their experiments lead. >> >>> -- Norbert Wiener >> >>> >> >>> https://www.cse.buffalo.edu/~knepley/ >> >>> >> >>> -- >> >>> Sent from my Android device with K-9 Mail. Please excuse >> my brevity. >> >>> >> >>> >> >>> -- >> >>> What most experimenters take for granted before they >> begin their experiments is infinitely more interesting than >> any results to which their experiments lead. >> >>> -- Norbert Wiener >> >>> >> >>> https://www.cse.buffalo.edu/~knepley/ >> > >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to >> which their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: petsc-partition-compare.png Type: image/png Size: 69346 bytes Desc: not available URL: From pedro.gonzalez at u-bordeaux.fr Tue Sep 17 11:57:11 2019 From: pedro.gonzalez at u-bordeaux.fr (Pedro Gonzalez) Date: Tue, 17 Sep 2019 18:57:11 +0200 (CEST) Subject: [petsc-users] I find slow performance of SNES In-Reply-To: <9249F6D6-0F16-4FCB-A820-5F8F3F04E03C@anl.gov> References: <1415422715.5264647.1568586915625.JavaMail.zimbra@u-bordeaux.fr> <9249F6D6-0F16-4FCB-A820-5F8F3F04E03C@anl.gov> Message-ID: <366122035.18684472.1568739431284.JavaMail.zimbra@u-bordeaux.fr> Dear Barry, Thanks a lot for your quick reply. > Do you simply mean that -snes_type ngs is not converging to the solution? Does it seem to converge to something else or nothing at all? > The SNES GS code just calls the user provided routine set with SNESSetNGS(). This means that it is expect to behave the same way as if the user simply called the user provided routine themselves. I assume you have a routine that implements the GS on the DMDA since you write " I managed to parallelize it by using DMDA with very good results. " Thus you should get the same iterations for both calling your Gauss-Seidel code yourself and calling SNESSetNGS() and then calling SNESSolve() with -snes_type ngs. You can check if they are behaving in the same (for simplicity) by using the debugger or by putting VecView() into code to see if they are generating the same values. Just have your GS code call VecView() on the input vectors at the top and the output vectors at the bottom. Before adding SNES to my code, I had programmed a nonlinear Gauss-Seidel algorithm which I parallelized via DMDA. I use it as reference solution. The SNES solver converges towards this solution. The problem is that it is being considerably slower than my reference algorithm. For instance, "-snes_type ngs" takes about 200 times more time, "-snes_type nrichardson" employs approximately 50 times more time, while "-snes_type ngmres" is about 20 times slower. I expect that SNES be as fast as my reference algorithm. This slow-performance problem may be due to a not optimal configuration of SNES (I am not an expert user of PETSc) or, as you suggest (see below), to the fact that SNES is calculating more than the necessary at each iteration (e.g., norms). 
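As a point of reference for the norm overhead mentioned just above, here is a minimal sketch (in C, with the helper name DisableDefaultNorms purely illustrative) of switching the automatic norm computation off through SNESSetNormSchedule(), the routine referred to further down in this message:

#include <petscsnes.h>

/* Sketch: ask SNES not to compute the per-iteration norms it would
   otherwise evaluate; a user-supplied convergence test then has to
   provide its own stopping criterion.  "snes" is the solver created
   elsewhere in the application. */
static PetscErrorCode DisableDefaultNorms(SNES snes)
{
  PetscErrorCode ierr;

  ierr = SNESSetNormSchedule(snes, SNES_NORM_NONE);CHKERRQ(ierr);
  return 0;
}

Depending on the solver type and line search in use, some norm may still be computed internally, so this is a sketch of the API call rather than a guarantee that every norm disappears.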
Moreover, I have checked that the number of iterations needed by "ngs" to reach convergence is almost the same as for my reference GS algorithm. The small differences are due to the way the grid points are swept during the iterations.

> Do you mean both -snes_ngs_secant and -snes_ngs_secant_h ? The second option by itself will do nothing.
> Cut and paste the exact options you use.

In order to make SNES converge, I use the following options:
- Nonlinear Gauss-Seidel: -snes_type ngs -snes_ngs_secant_h 1.E-1
- Nonlinear Richardson: -snes_type nrichardson
- Nonlinear Generalized Minimal RESidual: -snes_type ngmres -snes_linesearch_type cp

> This would happen if you did not use -snes_ngs_secant but do use -snes_ngs_secant_h but this doesn't change the algorithm

With respect to the "ngs" solver, it seems that there is a problem with the residual. Physically, the solution of the nonlinear system is proportional to an input amplitude, and the convergence criterion which I set is normalized to this input amplitude. Hence, if I just vary the amplitude in a given problem, the solver should take the same number of iterations because the residual scales with that input amplitude. For physically small input amplitudes from 1.E0 to 1.E8 I observed that the option "-snes_type ngs" makes SNES converge (within the same time and the same number of iterations) and the absolute residual was proportional to the input amplitude, as expected. However, my input amplitude is 1.E10 and I observed that "-snes_type ngs" could not make the absolute residual decrease. That is the reason why I added the option "-snes_ngs_secant_h 1.E-1" to "ngs" in order to converge when the input amplitude is 1.E10. However, I realize that I can vary this parameter from 1.E-1 to 1.E50 and I obtain the same solution within the same computation time. For very small values (I tested -snes_ngs_secant_h 1.E-8), it does not converge. What is happening here? Nonlinear Richardson does not have this problem and converges at any input amplitude.

> By default SNES may be doing some norm computations at each iteration which are expensive and will slow things down.
> You can use SNESSetNormSchedule() or the command line form -snes_norm_schedule to turn these norms off.

Thanks for this pertinent suggestion. Actually I have my own convergence criterion. In the subroutine MySNESConverged(snes,nc,xnorm,snorm,fnorm,reason,ctx,ierr) called by SNESSetConvergenceTest(snes,MySNESConverged,0,PETSC_NULL_FUNCTION,ierr), I do not need the norms "xnorm", "snorm" and "fnorm" which SNES calculates by default, as you say. However, I have printed these values and I see that "xnorm" and "snorm" are always 0 (so I deduce that they are not calculated) and that "fnorm" is calculated even when I add the option "-snes_norm_schedule none". If "-snes_norm_schedule none" is not working, how can I switch off the calculation of "fnorm"?

Below you can see how I integrated SNES into my code. Moreover, I have several degrees of freedom at each point of the DMDA grid (dof > 1). The norm which I need to calculate for my convergence criterion only concerns a subset of the degrees of freedom: start = [start1, start2, ..., startn], with len(start) = n < dof. I am using VecStrideNorm() or VecStrideMax() with a do-loop:

   call VecStrideNorm(solution,start(1),NORM_INFINITY,mynorm,ierr)
   do i = 2, n
      call VecStrideNorm(solution,start(i),NORM_INFINITY,mynorm_i,ierr)
      mynorm = MAX(mynorm, mynorm_i)
   end do

Does there exist a more efficient way of calculating the norm I need?
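One possible alternative, sketched below in C (the helper name SubsetInfNorm and the arguments ncomp and comp[], which play the role of n and start(:), are illustrative and not part of the original code), is VecStrideNormAll(), which returns the requested norm of every component of a blocked vector in a single call:

#include <petscvec.h>

/* Sketch: infinity norm over a subset comp[0..ncomp-1] of the dof
   components of a blocked (DMDA) vector. */
static PetscErrorCode SubsetInfNorm(Vec solution, PetscInt ncomp, const PetscInt comp[], PetscReal *mynorm)
{
  PetscReal      *norms;
  PetscInt       bs, i;
  PetscErrorCode ierr;

  ierr = VecGetBlockSize(solution, &bs);CHKERRQ(ierr);                   /* bs = dof */
  ierr = PetscMalloc1(bs, &norms);CHKERRQ(ierr);
  ierr = VecStrideNormAll(solution, NORM_INFINITY, norms);CHKERRQ(ierr); /* all components at once */
  *mynorm = norms[comp[0]];
  for (i = 1; i < ncomp; i++) *mynorm = PetscMax(*mynorm, norms[comp[i]]);
  ierr = PetscFree(norms);CHKERRQ(ierr);
  return 0;
}

Compared with one VecStrideNorm() call per component, this should traverse the vector and perform the parallel reduction only once, so the saving is mainly in communication.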
The PETSc part which I integrated in my code is the following (pseudo-code):

!-----------------------------------------------------------------------------!
      DM                   :: dmda
      Vec                  :: lv, gv
      SNES                 :: snes
      PetscScalar, pointer :: p(:,:,:,:)
!-----------------------------------------------------------------------------!
      external FormFunctionLocal
      external MySNESConverged
!-----------------------------------------------------------------------------!
      call PetscInitialize(...)
      call MPI_Comm_rank(...)
      call MPI_Comm_size(...)

      ! distributed grid and work vectors
      call DMDACreate3d(... DM_BOUNDARY_GHOSTED ... DMDA_STENCIL_STAR ... dof ...)
      call DMSetUp(dmda,ierr)
      call DMDAGetCorners(...)      ! ===> I obtain xs, xe, ys, ye, zs, ze in Fortran 1-based indexing
      call DMDAGetGhostCorners(...) ! ===> I obtain xsg, xeg, ysg, yeg, zsg, zeg in Fortran 1-based indexing
      call DMCreateLocalVector(dmda,lv,ierr)
      call DMCreateGlobalVector(dmda,gv,ierr)

      ! nonlinear solver setup
      call SNESCreate(comm,snes,ierr)
      call SNESSetConvergenceTest(snes,MySNESConverged,0,PETSC_NULL_FUNCTION,ierr)
      call SNESSetDM(snes,dmda,ierr)
      call DMDASNESSetFunctionLocal(dmda,INSERT_VALUES,FormFunctionLocal,0,ierr)
      call SNESSetFromOptions(snes,ierr)

      ! initial guess, set on the ghosted local vector and scattered to the global one
      nullify(p)
      call DMDAVecGetArrayF90(dmda,lv,p,ierr)
      do n = zsg, zeg
        do m = ysg, yeg
          do l = xsg, xeg
            do i = 1, dof
              p(i-1,l-1,m-1,n-1) = INITIAL VALUE
            end do
          end do
        end do
      end do
      call DMDAVecRestoreArrayF90(dmda,lv,p,ierr)
      nullify(p)
      call DMLocalToGlobalBegin(dmda,lv,INSERT_VALUES,gv,ierr)
      call DMLocalToGlobalEnd(dmda,lv,INSERT_VALUES,gv,ierr)

      call SNESSolve(snes,PETSC_NULL_VEC,gv,ierr)

      call VecDestroy(lv,ierr)
      call VecDestroy(gv,ierr)
      call SNESDestroy(snes,ierr)
      call DMDestroy(dmda,ierr)
      call PetscFinalize(ierr)
!-----------------------------------------------------------------------------!
      subroutine FormFunctionLocal(info,x,G,da,ierr)
        DMDALocalInfo, intent(in)     :: info(DMDA_LOCAL_INFO_SIZE)
        PetscScalar, intent(in)       :: x(0:petscdof-1,xs-stw:xe+stw,ys-stw:ye+stw,zs-stw:ze+stw)
        PetscScalar, intent(inout)    :: G(0:petscdof-1,xs:xe,ys:ye,zs:ze)
        DM, intent(in)                :: da
        PetscErrorCode, intent(inout) :: ierr
        do n = zs, ze
          do m = ys, ye
            do l = xs, xe
              do i = 1, dof
                ! component index shifted to match the 0-based first dimension of G
                G(i-1,l,m,n) = VALUE OF THE FUNCTION G(x)
              end do
            end do
          end do
        end do
      end subroutine FormFunctionLocal
!-----------------------------------------------------------------------------!
      subroutine MySNESConverged(snes,nc,xnorm,snorm,fnorm,reason,ctx,ierr)
        SNES, intent(in)                   :: snes
        PetscInt, intent(in)               :: nc
        PetscReal, intent(in)              :: xnorm, snorm, fnorm
        SNESConvergedReason, intent(inout) :: reason
        PetscInt, intent(in)               :: ctx
        PetscErrorCode, intent(inout)      :: ierr
        (...)
      end subroutine MySNESConverged
!-----------------------------------------------------------------------------!

Thanks a lot in advance for your reply.

Best regards,
Pedro

----- Original Message -----
From: "Smith, Barry F."
To: "Pedro Gonzalez"
Cc: "petsc-users"
Sent: Monday, 16 September 2019 01:28:15
Subject: Re: [petsc-users] I find slow performance of SNES

> On Sep 15, 2019, at 5:35 PM, Pedro Gonzalez via petsc-users wrote:
> 
> Dear all,
> 
> I am working on a code that solves a nonlinear system of equations G(x)=0 with Gauss-Seidel method. I managed to parallelize it by using DMDA with very good results. The previous week I changed my Gauss-Seidel solver by SNES. The code using SNES gives the same result as before, but I do not obtain the performance that I expected:
> 1) When using the Gauss-Seidel method (-snes_type ngs) the residual G(x) seems not be scallable to the amplitude of x

Do you simply mean that -snes_type ngs is not converging to the solution?
Does it seem to converge to something else or nothing at all?

The SNES GS code just calls the user provided routine set with SNESSetNGS(). This means that it is expected to behave the same way as if the user simply called the user provided routine themselves. I assume you have a routine that implements the GS on the DMDA since you write "I managed to parallelize it by using DMDA with very good results." Thus you should get the same iterations for both calling your Gauss-Seidel code yourself and calling SNESSetNGS() and then calling SNESSolve() with -snes_type ngs.

You can check if they are behaving the same (for simplicity) by using the debugger or by putting VecView() into code to see if they are generating the same values. Just have your GS code call VecView() on the input vectors at the top and the output vectors at the bottom.

> and I have to add the option -snes_secant_h in order to make SNES converge.

Do you mean both -snes_ngs_secant and -snes_ngs_secant_h ? The second option by itself will do nothing. Cut and paste the exact options you use.

> However, I varied the step from 1.E-1 to 1.E50 and obtained the same result within the same computation time.

This would happen if you did not use -snes_ngs_secant but do use -snes_ngs_secant_h but this doesn't change the algorithm

> Is it normal that snes_secant_h can vary so many orders of magnitude?

That certainly does seem odd. Looking at the code SNESComputeNGSDefaultSecant() we see it is perturbing the input vector (by color) with h:

  for (j=0;j<...;j++) {
    ...
    if (PetscAbsScalar(g-f) > atol) {
      /* This is equivalent to d = x - (h*f) / PetscRealPart(g-f) */
      d = (x*g-w*f) / PetscRealPart(g-f);
    } else {
      d = x;
    }

In PetscErrorCode SNESSetFromOptions_NGS(PetscOptionItems *PetscOptionsObject,SNES snes) one can see that the user provided h is accepted:

  ierr = PetscOptionsReal("-snes_ngs_secant_h","Differencing parameter for secant search","",gs->h,&gs->h,NULL);CHKERRQ(ierr);

You could run in the debugger with a break point in SNESComputeNGSDefaultSecant() to see if it is truly using the h you provided.

> 2) Compared to my Gauss-Seidel algorithm, SNES does (approximately) the same number of iterations (with the same convergence criterium) but it is about 100 times slower.

I don't fully understand what you are running so cannot completely answer this.

-snes_ngs_secant will be lots slower than using a user provided GS

By default SNES may be doing some norm computations at each iteration which are expensive and will slow things down. You can use SNESSetNormSchedule() or the command line form -snes_norm_schedule to turn these norms off.

> What can be the reason(s) of this slow performance of SNES solver? I do not use preconditioner with my algorithm so I did not add one to SNES.
> 
> The main PETSc subroutines that I have included (in this order) are the following:
> call DMDACreate3D
> call DMSetUp
> call DMCreateLocalVector
> call DMCreateGlobalVector
> call SNESCreate
> call SNESSetConvergenceTest
> call SNESSetDM
> call DMDASNESSetFunctionLocal
> call SNESSetFromOptions
> call SNESSolve
> 
> Thanks in advance for you help.
> > Best regards, > Pedro > > From knepley at gmail.com Tue Sep 17 12:02:33 2019 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 17 Sep 2019 13:02:33 -0400 Subject: [petsc-users] Strange Partition in PETSc 3.11 version on some computers In-Reply-To: <89d9f65e-c185-0f6f-cd27-d372303cecc3@gmail.com> References: <9c36fe1c-e6c2-0278-864e-f1453687d3f9@gmail.com> <5217DF9F-E42D-4E1F-AE9B-1088954548BB@anl.gov> <4ffe88c2-354c-07d7-ab1e-0f1edd8ec3c3@gmail.com> <89d9f65e-c185-0f6f-cd27-d372303cecc3@gmail.com> Message-ID: On Tue, Sep 17, 2019 at 12:53 PM Danyang Su wrote: > Hi Mark, > > Thanks for your follow-up. > > The unstructured grid code has been verified and there is no problem in > the results. The convergence rate is also good. The 3D mesh is not good, it > is based on the original stratum which I haven't refined, but good for > initial test as it is relative small and the results obtained from this > mesh still makes sense. > > The 2D meshes are just for testing purpose as I want to reproduce the > partition problem on a cluster using PETSc3.11.3 and Intel2019. > Unfortunately, I didn't find problem using this example. > > The code has no problem in using different PETSc versions (PETSc V3.4 to > V3.11) and MPI distribution (MPICH, OpenMPI, IntelMPI), except for one > simulation case (the mesh I attached) on a cluster with PETSc3.11.3 and > Intel2019u4 due to the very different partition compared to PETSc3.9.3. Yet > the simulation results are the same except for the efficiency problem > because the strange partition results into much more communication (ghost > nodes). > > I am still trying different compiler and mpi with PETSc3.11.3 on that > cluster to trace the problem. Will get back to you guys when there is > update. > > You had --download-parmetis in your configure command, but I wonder if it is possible that it actually was not downloaded and already present. The type of the ParMetis weights can be changed, and if the type that PETSc thinks it is does not match the actual library type, then the weights could all be crazy numbers. I seem to recall someone changing the weight type in a release, which might mean that the built ParMetis was fine with one version and not the other. Thanks, Matt > Thanks, > > danyang > On 2019-09-17 9:02 a.m., Mark Adams wrote: > > Danyang, > > Excuse me if I missed something in this thread but just a few ideas. > > First, I trust that you have verified that you are getting a good solution > with these bad meshes. Ideally you would check that the solver convergence > rates are similar. > > You might verify that your mesh is inside of DMPLex correctly. You can > visualize a Plex mesh very easily. (let us know if you need instructions). > > This striping on the 2D meshes look something like what you are getting > with your 3D PRISM mesh. DMPLex just calls Parmetis with a flat graph. It > is odd to me that your rectangular grids have so much structure and are > non-isotropic. I assume that these rectangular meshes are isotropic (eg, > squares). > > Anyway, just some thoughts, > Mark > > On Tue, Sep 17, 2019 at 12:43 AM Danyang Su via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >> >> On 2019-09-16 12:02 p.m., Matthew Knepley wrote: >> >> On Mon, Sep 16, 2019 at 1:46 PM Smith, Barry F. >> wrote: >> >>> >>> Very different stuff going on in the two cases, different objects >>> being created, different number of different types of operations. Clearly a >>> major refactorization of the code was done. 
Presumably a regression was >>> introduced that changed the behavior dramatically, possible by mistake. >>> >>> You can attempt to use git bisect to determine what changed caused >>> the dramatic change in behavior. Then it can be decided if the changed that >>> triggered the change in the results was a bug or a planned feature. >>> >> >> Danyang, >> >> Can you send me the smallest mesh you care about, and I will look at the >> partitioning? We can at least get quality metrics >> between these two releases. >> >> Thanks, >> >> Matt >> >> Hi Matt, >> >> This is the smallest mesh for the regional scale simulation that has >> strange partition problem. It can be download via the link below. >> >> https://www.dropbox.com/s/tu34jgqqhkz8pwj/basin-3d.vtk?dl=0 >> >> I am trying to reproduce the similar problem using smaller 2D mesh, >> however, there is no such problem in 2D, even though the partitions using >> PETSc 3.9.3 and 3.11.3 are a bit different, they both look reasonable. As >> shown below, both rectangular mesh and triangular mesh use DMPlex. >> >> [image: 2D rectangular and triangle mesh] >> >> I will keep on testing using PETSc3.11.3 but with different compiler and >> MPI to check if I can reproduce the problem. >> >> Thanks, >> >> Danyang >> >> >> >>> Barry >>> >>> >>> > On Sep 16, 2019, at 11:50 AM, Danyang Su wrote: >>> > >>> > Hi Barry and Matt, >>> > >>> > Attached is the output of both runs with -dm_view -log_view included. >>> > >>> > I am now coordinating with staff to install PETSc 3.9.3 version using >>> intel2019u4 to narrow down the problem. Will get back to you later after >>> the test. >>> > >>> > Thanks, >>> > >>> > Danyang >>> > >>> > On 2019-09-15 4:43 p.m., Smith, Barry F. wrote: >>> >> Send the configure.log and make.log for the two system >>> configurations that produce very different results as well as the output >>> running with -dm_view -info for both runs. The cause is likely not subtle, >>> one is likely using metis and the other is likely just not using any >>> partitioner. >>> >> >>> >> >>> >> >>> >>> On Sep 15, 2019, at 6:07 PM, Matthew Knepley via petsc-users < >>> petsc-users at mcs.anl.gov> wrote: >>> >>> >>> >>> On Sun, Sep 15, 2019 at 6:59 PM Danyang Su >>> wrote: >>> >>> Hi Matt, >>> >>> >>> >>> Thanks for the quick reply. I have no change in the adjacency. The >>> source code and the simulation input files are all the same. I also tried >>> to use GNU compiler and mpich with petsc 3.11.3 and it works fine. >>> >>> >>> >>> It looks like the problem is caused by the difference in >>> configuration. However, the configuration is pretty the same as petsc 3.9.3 >>> except the compiler and mpi used. I will contact scinet staff to check if >>> they have any idea on this. >>> >>> >>> >>> Very very strange since the partition is handled completely by >>> Metis, and does not use MPI. >>> >>> >>> >>> Thanks, >>> >>> >>> >>> Matt >>> >>> Thanks, >>> >>> >>> >>> Danyang >>> >>> >>> >>> On September 15, 2019 3:20:18 p.m. PDT, Matthew Knepley < >>> knepley at gmail.com> wrote: >>> >>> On Sun, Sep 15, 2019 at 5:19 PM Danyang Su via petsc-users < >>> petsc-users at mcs.anl.gov> wrote: >>> >>> Dear All, >>> >>> >>> >>> I have a question regarding strange partition problem in PETSc 3.11 >>> version. The problem does not exist on my local workstation. However, on a >>> cluster with different PETSc versions, the partition seems quite different, >>> as you can find in the figure below, which is tested with 160 processors. 
>>> The color means the processor owns that subdomain. In this layered prism >>> mesh, there are 40 layers from bottom to top and each layer has around 20k >>> nodes. The natural order of nodes is also layered from bottom to top. >>> >>> >>> >>> The left partition (PETSc 3.10 and earlier) looks good with minimum >>> number of ghost nodes while the right one (PETSc 3.11) looks weired with >>> huge number of ghost nodes. Looks like the right one uses partition layer >>> by layer. This problem exists on a a cluster but not on my local >>> workstation for the same PETSc version (with different compiler and MPI). >>> Other than the difference in partition and efficiency, the simulation >>> results are the same. >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> Below is PETSc configuration on three machine: >>> >>> >>> >>> Local workstation (works fine): ./configure --with-cc=gcc >>> --with-cxx=g++ --with-fc=gfortran --download-mpich --download-scalapack >>> --download-parmetis --download-metis --download-ptscotch >>> --download-fblaslapack --download-hypre --download-superlu_dist >>> --download-hdf5=yes --download-ctetgen --with-debugging=0 COPTFLAGS=-O3 >>> CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 --with-cxx-dialect=C++11 >>> >>> >>> >>> Cluster with PETSc 3.9.3 (works fine): >>> --prefix=/scinet/niagara/software/2018a/opt/intel-2018.2-intelmpi-2018.2/petsc/3.9.3 >>> CC=mpicc CXX=mpicxx F77=mpif77 F90=mpif90 FC=mpifc COPTFLAGS="-march=native >>> -O2" CXXOPTFLAGS="-march=native -O2" FOPTFLAGS="-march=native -O2" >>> --download-chaco=1 --download-hypre=1 --download-metis=1 --download-ml=1 >>> --download-mumps=1 --download-parmetis=1 --download-plapack=1 >>> --download-prometheus=1 --download-ptscotch=1 --download-scotch=1 >>> --download-sprng=1 --download-superlu=1 --download-superlu_dist=1 >>> --download-triangle=1 --with-avx512-kernels=1 >>> --with-blaslapack-dir=/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl >>> --with-debugging=0 --with-hdf5=1 >>> --with-mkl_pardiso-dir=/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl >>> --with-scalapack=1 >>> --with-scalapack-lib="[/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_scalapack_lp64.so,/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so]" >>> --with-x=0 >>> >>> >>> >>> Cluster with PETSc 3.11.3 (looks weired): >>> --prefix=/scinet/niagara/software/2019b/opt/intel-2019u4-intelmpi-2019u4/petsc/3.11.3 >>> CC=mpicc CXX=mpicxx F77=mpif77 F90=mpif90 FC=mpifc COPTFLAGS="-march=native >>> -O2" CXXOPTFLAGS="-march=native -O2" FOPTFLAGS="-march=native -O2" >>> --download-chaco=1 --download-hdf5=1 --download-hypre=1 --download-metis=1 >>> --download-ml=1 --download-mumps=1 --download-parmetis=1 >>> --download-plapack=1 --download-prometheus=1 --download-ptscotch=1 >>> --download-scotch=1 --download-sprng=1 --download-superlu=1 >>> --download-superlu_dist=1 --download-triangle=1 --with-avx512-kernels=1 >>> --with-blaslapack-dir=/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl >>> --with-cxx-dialect=C++11 --with-debugging=0 >>> --with-mkl_pardiso-dir=/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl >>> --with-scalapack=1 >>> --with-scalapack-lib="[/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl/lib/intel64/libmkl_scalapack_lp64.so,/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so]" >>> --with-x=0 >>> >>> >>> >>> 
And the partition is used by default dmplex distribution. >>> >>> >>> >>> !c distribute mesh over processes >>> >>> call DMPlexDistribute(dmda_flow%da,stencil_width, >>> & >>> >>> PETSC_NULL_SF, >>> & >>> >>> PETSC_NULL_OBJECT, >>> & >>> >>> distributedMesh,ierr) >>> >>> CHKERRQ(ierr) >>> >>> >>> >>> Any idea on this strange problem? >>> >>> >>> >>> >>> >>> I just looked at the code. Your mesh should be partitioned by k-way >>> partitioning using Metis since its on 1 proc for partitioning. This code >>> >>> is the same for 3.9 and 3.11, and you get the same result on your >>> machine. I cannot understand what might be happening on your cluster >>> >>> (MPI plays no role). Is it possible that you changed the adjacency >>> specification in that version? >>> >>> >>> >>> Thanks, >>> >>> >>> >>> Matt >>> >>> Thanks, >>> >>> >>> >>> Danyang >>> >>> >>> >>> >>> >>> >>> >>> -- >>> >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> >>> -- Norbert Wiener >>> >>> >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >>> >>> -- >>> >>> Sent from my Android device with K-9 Mail. Please excuse my brevity. >>> >>> >>> >>> >>> >>> -- >>> >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> >>> -- Norbert Wiener >>> >>> >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> > >>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> >> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: petsc-partition-compare.png Type: image/png Size: 69346 bytes Desc: not available URL: From mhbaghaei at mail.sjtu.edu.cn Tue Sep 17 12:05:35 2019 From: mhbaghaei at mail.sjtu.edu.cn (Mohammad Hassan) Date: Wed, 18 Sep 2019 01:05:35 +0800 Subject: [petsc-users] DMPlex Distribution In-Reply-To: References: <001001d56d5b$b0a40b50$11ec21f0$@mail.sjtu.edu.cn> <004401d56d71$a6b41750$f41c45f0$@mail.sjtu.edu.cn> Message-ID: <006d01d56d7a$20346500$609d2f00$@mail.sjtu.edu.cn> Sorry if I confused you. In fact, I want to use grid adaptively using block-based AMR technique. By that, I mean I will have the same stencil for all points inside the block. For better functionality of AMR and its parallelization, it is needed to know the location of points for both working vectors and also vectors obtained from DMPlex. That?s why I think I need to specify the AMR block across processors. Thanks Amir From: Mark Adams [mailto:mfadams at lbl.gov] Sent: Wednesday, September 18, 2019 12:43 AM To: Mohammad Hassan Cc: Matthew Knepley ; PETSc users list Subject: Re: [petsc-users] DMPlex Distribution On Tue, Sep 17, 2019 at 12:07 PM Mohammad Hassan via petsc-users > wrote: Thanks for suggestion. I am going to use a block-based amr. I think I need to know exactly the mesh distribution of blocks across different processors for implementation of amr. 
And as a general question, can we set block size of vector on each rank? I don't understand what you mean by AMR in this context exactly. And I'm not sure what you mean by blocks size. Block size is the number of dof per vertex (eg, 3) and it is a constant for a vector. Thanks Amir From: Matthew Knepley [mailto:knepley at gmail.com ] Sent: Tuesday, September 17, 2019 11:04 PM To: Mohammad Hassan > Cc: PETSc > Subject: Re: [petsc-users] DMPlex Distribution On Tue, Sep 17, 2019 at 9:27 AM Mohammad Hassan via petsc-users > wrote: Hi I am using DMPlexCreateFromDAG() to construct my DM. Is it possible to set the distribution across processors manually. I mean, how can I set the share of dm on each rank (local)? You could make a Shell partitioner and tell it the entire partition: https://www.mcs.anl.gov/petsc/petsc-master/docs/manualpages/DMPLEX/PetscPartitionerShellSetPartition.html However, I would be surprised if you could do this. It is likely that you just want to mess with the weights in ParMetis. Thanks, Matt Thanks Amir -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Tue Sep 17 12:05:06 2019 From: mfadams at lbl.gov (Mark Adams) Date: Tue, 17 Sep 2019 13:05:06 -0400 Subject: [petsc-users] Strange Partition in PETSc 3.11 version on some computers In-Reply-To: <89d9f65e-c185-0f6f-cd27-d372303cecc3@gmail.com> References: <9c36fe1c-e6c2-0278-864e-f1453687d3f9@gmail.com> <5217DF9F-E42D-4E1F-AE9B-1088954548BB@anl.gov> <4ffe88c2-354c-07d7-ab1e-0f1edd8ec3c3@gmail.com> <89d9f65e-c185-0f6f-cd27-d372303cecc3@gmail.com> Message-ID: On Tue, Sep 17, 2019 at 12:53 PM Danyang Su wrote: > Hi Mark, > > Thanks for your follow-up. > > The unstructured grid code has been verified and there is no problem in > the results. The convergence rate is also good. The 3D mesh is not good, it > is based on the original stratum which I haven't refined, but good for > initial test as it is relative small and the results obtained from this > mesh still makes sense. > > The 2D meshes are just for testing purpose as I want to reproduce the > partition problem on a cluster using PETSc3.11.3 and Intel2019. > Unfortunately, I didn't find problem using this example. > > The code has no problem in using different PETSc versions (PETSc V3.4 to > V3.11) > OK, it is the same code. I thought I saw something about your code changing. Just to be clear, v3.11 never gives you good partitions. It is not just a problem on this Intel cluster. The machine, compiler and MPI version should not matter. > and MPI distribution (MPICH, OpenMPI, IntelMPI), except for one simulation > case (the mesh I attached) on a cluster with PETSc3.11.3 and Intel2019u4 > due to the very different partition compared to PETSc3.9.3. Yet the > simulation results are the same except for the efficiency problem because > the strange partition results into much more communication (ghost nodes). > > I am still trying different compiler and mpi with PETSc3.11.3 on that > cluster to trace the problem. Will get back to you guys when there is > update. > This is very strange. You might want to use 'git bisect'. You set a good and a bad SHA1 (we can give you this for 3.9 and 3.11 and the exact commands). The git will go to a version in the middle. 
You then reconfigure, remake, rebuild your code, run your test. Git will ask you, as I recall, if the version is good or bad. Once you get this workflow going it is not too bad, depending on how hard this loop is of course. > Thanks, > > danyang > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: petsc-partition-compare.png Type: image/png Size: 69346 bytes Desc: not available URL: From mfadams at lbl.gov Tue Sep 17 12:07:29 2019 From: mfadams at lbl.gov (Mark Adams) Date: Tue, 17 Sep 2019 13:07:29 -0400 Subject: [petsc-users] Strange Partition in PETSc 3.11 version on some computers In-Reply-To: References: <9c36fe1c-e6c2-0278-864e-f1453687d3f9@gmail.com> <5217DF9F-E42D-4E1F-AE9B-1088954548BB@anl.gov> <4ffe88c2-354c-07d7-ab1e-0f1edd8ec3c3@gmail.com> <89d9f65e-c185-0f6f-cd27-d372303cecc3@gmail.com> Message-ID: Matt that sound like it. danyang, just in case its not clear, you need to delete your architecture directory and reconfigure from scratch. You should be able to just delete the arch-dir/externalpackages/git.parmetis[metis] directories but I'd simply delete the whole arch-dir. On Tue, Sep 17, 2019 at 1:03 PM Matthew Knepley wrote: > On Tue, Sep 17, 2019 at 12:53 PM Danyang Su wrote: > >> Hi Mark, >> >> Thanks for your follow-up. >> >> The unstructured grid code has been verified and there is no problem in >> the results. The convergence rate is also good. The 3D mesh is not good, it >> is based on the original stratum which I haven't refined, but good for >> initial test as it is relative small and the results obtained from this >> mesh still makes sense. >> >> The 2D meshes are just for testing purpose as I want to reproduce the >> partition problem on a cluster using PETSc3.11.3 and Intel2019. >> Unfortunately, I didn't find problem using this example. >> >> The code has no problem in using different PETSc versions (PETSc V3.4 to >> V3.11) and MPI distribution (MPICH, OpenMPI, IntelMPI), except for one >> simulation case (the mesh I attached) on a cluster with PETSc3.11.3 and >> Intel2019u4 due to the very different partition compared to PETSc3.9.3. Yet >> the simulation results are the same except for the efficiency problem >> because the strange partition results into much more communication (ghost >> nodes). >> >> I am still trying different compiler and mpi with PETSc3.11.3 on that >> cluster to trace the problem. Will get back to you guys when there is >> update. >> >> You had --download-parmetis in your configure command, but I wonder if it > is possible that it actually was not downloaded and > already present. The type of the ParMetis weights can be changed, and if > the type that PETSc thinks it is does not match the > actual library type, then the weights could all be crazy numbers. I seem > to recall someone changing the weight type in a release, > which might mean that the built ParMetis was fine with one version and not > the other. > > Thanks, > > Matt > >> Thanks, >> >> danyang >> On 2019-09-17 9:02 a.m., Mark Adams wrote: >> >> Danyang, >> >> Excuse me if I missed something in this thread but just a few ideas. >> >> First, I trust that you have verified that you are getting a good >> solution with these bad meshes. Ideally you would check that the solver >> convergence rates are similar. >> >> You might verify that your mesh is inside of DMPLex correctly. You can >> visualize a Plex mesh very easily. (let us know if you need instructions). 
>> >> This striping on the 2D meshes look something like what you are getting >> with your 3D PRISM mesh. DMPLex just calls Parmetis with a flat graph. It >> is odd to me that your rectangular grids have so much structure and are >> non-isotropic. I assume that these rectangular meshes are isotropic (eg, >> squares). >> >> Anyway, just some thoughts, >> Mark >> >> On Tue, Sep 17, 2019 at 12:43 AM Danyang Su via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> >>> >>> On 2019-09-16 12:02 p.m., Matthew Knepley wrote: >>> >>> On Mon, Sep 16, 2019 at 1:46 PM Smith, Barry F. >>> wrote: >>> >>>> >>>> Very different stuff going on in the two cases, different objects >>>> being created, different number of different types of operations. Clearly a >>>> major refactorization of the code was done. Presumably a regression was >>>> introduced that changed the behavior dramatically, possible by mistake. >>>> >>>> You can attempt to use git bisect to determine what changed caused >>>> the dramatic change in behavior. Then it can be decided if the changed that >>>> triggered the change in the results was a bug or a planned feature. >>>> >>> >>> Danyang, >>> >>> Can you send me the smallest mesh you care about, and I will look at the >>> partitioning? We can at least get quality metrics >>> between these two releases. >>> >>> Thanks, >>> >>> Matt >>> >>> Hi Matt, >>> >>> This is the smallest mesh for the regional scale simulation that has >>> strange partition problem. It can be download via the link below. >>> >>> https://www.dropbox.com/s/tu34jgqqhkz8pwj/basin-3d.vtk?dl=0 >>> >>> I am trying to reproduce the similar problem using smaller 2D mesh, >>> however, there is no such problem in 2D, even though the partitions using >>> PETSc 3.9.3 and 3.11.3 are a bit different, they both look reasonable. As >>> shown below, both rectangular mesh and triangular mesh use DMPlex. >>> >>> [image: 2D rectangular and triangle mesh] >>> >>> I will keep on testing using PETSc3.11.3 but with different compiler and >>> MPI to check if I can reproduce the problem. >>> >>> Thanks, >>> >>> Danyang >>> >>> >>> >>>> Barry >>>> >>>> >>>> > On Sep 16, 2019, at 11:50 AM, Danyang Su >>>> wrote: >>>> > >>>> > Hi Barry and Matt, >>>> > >>>> > Attached is the output of both runs with -dm_view -log_view included. >>>> > >>>> > I am now coordinating with staff to install PETSc 3.9.3 version using >>>> intel2019u4 to narrow down the problem. Will get back to you later after >>>> the test. >>>> > >>>> > Thanks, >>>> > >>>> > Danyang >>>> > >>>> > On 2019-09-15 4:43 p.m., Smith, Barry F. wrote: >>>> >> Send the configure.log and make.log for the two system >>>> configurations that produce very different results as well as the output >>>> running with -dm_view -info for both runs. The cause is likely not subtle, >>>> one is likely using metis and the other is likely just not using any >>>> partitioner. >>>> >> >>>> >> >>>> >> >>>> >>> On Sep 15, 2019, at 6:07 PM, Matthew Knepley via petsc-users < >>>> petsc-users at mcs.anl.gov> wrote: >>>> >>> >>>> >>> On Sun, Sep 15, 2019 at 6:59 PM Danyang Su >>>> wrote: >>>> >>> Hi Matt, >>>> >>> >>>> >>> Thanks for the quick reply. I have no change in the adjacency. The >>>> source code and the simulation input files are all the same. I also tried >>>> to use GNU compiler and mpich with petsc 3.11.3 and it works fine. >>>> >>> >>>> >>> It looks like the problem is caused by the difference in >>>> configuration. 
However, the configuration is pretty the same as petsc 3.9.3 >>>> except the compiler and mpi used. I will contact scinet staff to check if >>>> they have any idea on this. >>>> >>> >>>> >>> Very very strange since the partition is handled completely by >>>> Metis, and does not use MPI. >>>> >>> >>>> >>> Thanks, >>>> >>> >>>> >>> Matt >>>> >>> Thanks, >>>> >>> >>>> >>> Danyang >>>> >>> >>>> >>> On September 15, 2019 3:20:18 p.m. PDT, Matthew Knepley < >>>> knepley at gmail.com> wrote: >>>> >>> On Sun, Sep 15, 2019 at 5:19 PM Danyang Su via petsc-users < >>>> petsc-users at mcs.anl.gov> wrote: >>>> >>> Dear All, >>>> >>> >>>> >>> I have a question regarding strange partition problem in PETSc 3.11 >>>> version. The problem does not exist on my local workstation. However, on a >>>> cluster with different PETSc versions, the partition seems quite different, >>>> as you can find in the figure below, which is tested with 160 processors. >>>> The color means the processor owns that subdomain. In this layered prism >>>> mesh, there are 40 layers from bottom to top and each layer has around 20k >>>> nodes. The natural order of nodes is also layered from bottom to top. >>>> >>> >>>> >>> The left partition (PETSc 3.10 and earlier) looks good with minimum >>>> number of ghost nodes while the right one (PETSc 3.11) looks weired with >>>> huge number of ghost nodes. Looks like the right one uses partition layer >>>> by layer. This problem exists on a a cluster but not on my local >>>> workstation for the same PETSc version (with different compiler and MPI). >>>> Other than the difference in partition and efficiency, the simulation >>>> results are the same. >>>> >>> >>>> >>> >>>> >>> >>>> >>> >>>> >>> Below is PETSc configuration on three machine: >>>> >>> >>>> >>> Local workstation (works fine): ./configure --with-cc=gcc >>>> --with-cxx=g++ --with-fc=gfortran --download-mpich --download-scalapack >>>> --download-parmetis --download-metis --download-ptscotch >>>> --download-fblaslapack --download-hypre --download-superlu_dist >>>> --download-hdf5=yes --download-ctetgen --with-debugging=0 COPTFLAGS=-O3 >>>> CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 --with-cxx-dialect=C++11 >>>> >>> >>>> >>> Cluster with PETSc 3.9.3 (works fine): >>>> --prefix=/scinet/niagara/software/2018a/opt/intel-2018.2-intelmpi-2018.2/petsc/3.9.3 >>>> CC=mpicc CXX=mpicxx F77=mpif77 F90=mpif90 FC=mpifc COPTFLAGS="-march=native >>>> -O2" CXXOPTFLAGS="-march=native -O2" FOPTFLAGS="-march=native -O2" >>>> --download-chaco=1 --download-hypre=1 --download-metis=1 --download-ml=1 >>>> --download-mumps=1 --download-parmetis=1 --download-plapack=1 >>>> --download-prometheus=1 --download-ptscotch=1 --download-scotch=1 >>>> --download-sprng=1 --download-superlu=1 --download-superlu_dist=1 >>>> --download-triangle=1 --with-avx512-kernels=1 >>>> --with-blaslapack-dir=/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl >>>> --with-debugging=0 --with-hdf5=1 >>>> --with-mkl_pardiso-dir=/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl >>>> --with-scalapack=1 >>>> --with-scalapack-lib="[/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_scalapack_lp64.so,/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so]" >>>> --with-x=0 >>>> >>> >>>> >>> Cluster with PETSc 3.11.3 (looks weired): >>>> --prefix=/scinet/niagara/software/2019b/opt/intel-2019u4-intelmpi-2019u4/petsc/3.11.3 >>>> CC=mpicc CXX=mpicxx F77=mpif77 
F90=mpif90 FC=mpifc COPTFLAGS="-march=native >>>> -O2" CXXOPTFLAGS="-march=native -O2" FOPTFLAGS="-march=native -O2" >>>> --download-chaco=1 --download-hdf5=1 --download-hypre=1 --download-metis=1 >>>> --download-ml=1 --download-mumps=1 --download-parmetis=1 >>>> --download-plapack=1 --download-prometheus=1 --download-ptscotch=1 >>>> --download-scotch=1 --download-sprng=1 --download-superlu=1 >>>> --download-superlu_dist=1 --download-triangle=1 --with-avx512-kernels=1 >>>> --with-blaslapack-dir=/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl >>>> --with-cxx-dialect=C++11 --with-debugging=0 >>>> --with-mkl_pardiso-dir=/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl >>>> --with-scalapack=1 >>>> --with-scalapack-lib="[/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl/lib/intel64/libmkl_scalapack_lp64.so,/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so]" >>>> --with-x=0 >>>> >>> >>>> >>> And the partition is used by default dmplex distribution. >>>> >>> >>>> >>> !c distribute mesh over processes >>>> >>> call DMPlexDistribute(dmda_flow%da,stencil_width, >>>> & >>>> >>> PETSC_NULL_SF, >>>> & >>>> >>> PETSC_NULL_OBJECT, >>>> & >>>> >>> distributedMesh,ierr) >>>> >>> CHKERRQ(ierr) >>>> >>> >>>> >>> Any idea on this strange problem? >>>> >>> >>>> >>> >>>> >>> I just looked at the code. Your mesh should be partitioned by k-way >>>> partitioning using Metis since its on 1 proc for partitioning. This code >>>> >>> is the same for 3.9 and 3.11, and you get the same result on your >>>> machine. I cannot understand what might be happening on your cluster >>>> >>> (MPI plays no role). Is it possible that you changed the adjacency >>>> specification in that version? >>>> >>> >>>> >>> Thanks, >>>> >>> >>>> >>> Matt >>>> >>> Thanks, >>>> >>> >>>> >>> Danyang >>>> >>> >>>> >>> >>>> >>> >>>> >>> -- >>>> >>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> >>> -- Norbert Wiener >>>> >>> >>>> >>> https://www.cse.buffalo.edu/~knepley/ >>>> >>> >>>> >>> -- >>>> >>> Sent from my Android device with K-9 Mail. Please excuse my brevity. >>>> >>> >>>> >>> >>>> >>> -- >>>> >>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> >>> -- Norbert Wiener >>>> >>> >>>> >>> https://www.cse.buffalo.edu/~knepley/ >>>> > >>>> >>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >>> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: petsc-partition-compare.png Type: image/png Size: 69346 bytes Desc: not available URL: From danyang.su at gmail.com Tue Sep 17 12:15:04 2019 From: danyang.su at gmail.com (Danyang Su) Date: Tue, 17 Sep 2019 10:15:04 -0700 Subject: [petsc-users] Strange Partition in PETSc 3.11 version on some computers In-Reply-To: References: <9c36fe1c-e6c2-0278-864e-f1453687d3f9@gmail.com> <5217DF9F-E42D-4E1F-AE9B-1088954548BB@anl.gov> <4ffe88c2-354c-07d7-ab1e-0f1edd8ec3c3@gmail.com> <89d9f65e-c185-0f6f-cd27-d372303cecc3@gmail.com> Message-ID: <7117849b-6b4b-abd3-a7f8-141fcb8a420e@gmail.com> On 2019-09-17 10:07 a.m., Mark Adams wrote: > Matt that sound like it. > > danyang, just in case its not clear, you need to delete your > architecture directory and reconfigure from scratch. You should be > able to just delete the arch-dir/externalpackages/git.parmetis[metis] > directories but I'd simply delete the whole arch-dir. Many thanks to you all for the suggestions. I will try this first and keep you updated. Danyang > > On Tue, Sep 17, 2019 at 1:03 PM Matthew Knepley > wrote: > > On Tue, Sep 17, 2019 at 12:53 PM Danyang Su > wrote: > > Hi Mark, > > Thanks for your follow-up. > > The unstructured grid code has been verified and there is no > problem in the results. The convergence rate is also good. The > 3D mesh is not good, it is based on the original stratum which > I haven't refined, but good for initial test as it is relative > small and the results obtained from this mesh still makes sense. > > The 2D meshes are just for testing purpose as I want to > reproduce the partition problem on a cluster using PETSc3.11.3 > and Intel2019. Unfortunately, I didn't find problem using this > example. > > The code has no problem in using different PETSc versions > (PETSc V3.4 to V3.11) and MPI distribution (MPICH, OpenMPI, > IntelMPI), except for one simulation case (the mesh I > attached) on a cluster with PETSc3.11.3 and Intel2019u4 due to > the very different partition compared to PETSc3.9.3. Yet the > simulation results are the same except for the efficiency > problem because the strange partition results into much more > communication (ghost nodes). > > I am still trying different compiler and mpi with PETSc3.11.3 > on that cluster to trace the problem. Will get back to you > guys when there is update. > > You had --download-parmetis in your configure command, but I > wonder if it is possible that it actually was not downloaded and > already present. The type of the ParMetis weights can be changed, > and if the type that PETSc thinks it is does not match the > actual library type, then the weights could all be crazy numbers. > I seem to recall someone changing the weight type in a release, > which might mean that the built ParMetis was fine with one version > and not the other. > > ? Thanks, > > ? ? Matt > > Thanks, > > danyang > > On 2019-09-17 9:02 a.m., Mark Adams wrote: >> Danyang, >> >> Excuse me if I missed?something in this thread but just a few >> ideas. >> >> First, I trust that you have verified that you are getting a >> good solution with these bad meshes. Ideally you would check >> that the solver convergence rates are similar. >> >> You might verify that your mesh is inside of DMPLex >> correctly. You can visualize?a Plex mesh very?easily. (let us >> know if you need instructions). >> >> This striping on the 2D meshes look something like what you >> are getting with your 3D PRISM mesh. DMPLex just calls >> Parmetis with a flat graph. 
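If it helps to pin down which partitioner each install is actually using, here is a hedged sketch in C (it assumes the serial DMPlex is dm and that Metis/ParMetis were configured into the build) of making the choice explicit and printing it before distribution:

    /* Sketch only: select the partitioner explicitly instead of relying on the
       release default, view it, then distribute. The type can also be chosen
       at run time with -petscpartitioner_type metis|parmetis|simple.         */
    #include <petscdmplex.h>

    PetscErrorCode DistributeWithExplicitPartitioner(DM dm, PetscInt overlap, DM *dmDist)
    {
      PetscPartitioner part;
      PetscErrorCode   ierr;

      PetscFunctionBeginUser;
      ierr = DMPlexGetPartitioner(dm, &part);CHKERRQ(ierr);
      ierr = PetscPartitionerSetType(part, PETSCPARTITIONERPARMETIS);CHKERRQ(ierr);
      ierr = PetscPartitionerSetFromOptions(part);CHKERRQ(ierr);
      ierr = PetscPartitionerView(part, PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);
      ierr = DMPlexDistribute(dm, overlap, NULL, dmDist);CHKERRQ(ierr);
      PetscFunctionReturn(0);
    }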
It is odd to me that your >> rectangular grids have so much structure and are >> non-isotropic. I assume that these rectangular?meshes are >> isotropic?(eg, squares). >> >> Anyway, just some thoughts, >> Mark >> >> On Tue, Sep 17, 2019 at 12:43 AM Danyang Su via petsc-users >> > wrote: >> >> >> On 2019-09-16 12:02 p.m., Matthew Knepley wrote: >>> On Mon, Sep 16, 2019 at 1:46 PM Smith, Barry F. >>> > wrote: >>> >>> >>> ? Very different stuff going on in the two cases, >>> different objects being created, different number of >>> different types of operations. Clearly a major >>> refactorization of the code was done. Presumably a >>> regression was introduced that changed the behavior >>> dramatically, possible by mistake. >>> >>> ? ?You can attempt to use git bisect to determine >>> what changed caused the dramatic change in behavior. >>> Then it can be decided if the changed that triggered >>> the change in the results was a bug or a planned >>> feature. >>> >>> >>> Danyang, >>> >>> Can you send me the smallest mesh you care about, and I >>> will look at the partitioning? We can at least get >>> quality metrics >>> between these two releases. >>> >>> ? Thanks, >>> >>> ? ? ?Matt >> >> Hi Matt, >> >> This is the smallest mesh for the regional scale >> simulation that has strange partition problem. It can be >> download via the link below. >> >> https://www.dropbox.com/s/tu34jgqqhkz8pwj/basin-3d.vtk?dl=0 >> >> I am trying to reproduce the similar problem using >> smaller 2D mesh, however, there is no such problem in 2D, >> even though the partitions using PETSc 3.9.3 and 3.11.3 >> are a bit different, they both look reasonable. As shown >> below, both rectangular mesh and triangular mesh use DMPlex. >> >> 2D rectangular and triangle mesh >> >> I will keep on testing using PETSc3.11.3 but with >> different compiler and MPI to check if I can reproduce >> the problem. >> >> Thanks, >> >> Danyang >> >>> >>> ?Barry >>> >>> >>> > On Sep 16, 2019, at 11:50 AM, Danyang Su >>> > >>> wrote: >>> > >>> > Hi Barry and Matt, >>> > >>> > Attached is the output of both runs with -dm_view >>> -log_view included. >>> > >>> > I am now coordinating with staff to install PETSc >>> 3.9.3 version using intel2019u4 to narrow down the >>> problem. Will get back to you later after the test. >>> > >>> > Thanks, >>> > >>> > Danyang >>> > >>> > On 2019-09-15 4:43 p.m., Smith, Barry F. wrote: >>> >>? ?Send the configure.log and make.log for the two >>> system configurations that produce very different >>> results as well as the output running with -dm_view >>> -info for both runs. The cause is likely not subtle, >>> one is likely using metis and the other is likely >>> just not using any partitioner. >>> >> >>> >> >>> >> >>> >>> On Sep 15, 2019, at 6:07 PM, Matthew Knepley via >>> petsc-users >> > wrote: >>> >>> >>> >>> On Sun, Sep 15, 2019 at 6:59 PM Danyang Su >>> > >>> wrote: >>> >>> Hi Matt, >>> >>> >>> >>> Thanks for the quick reply. I have no change in >>> the adjacency. The source code and the simulation >>> input files are all the same. I also tried to use >>> GNU compiler and mpich with petsc 3.11.3 and it >>> works fine. >>> >>> >>> >>> It looks like the problem is caused by the >>> difference in configuration. However, the >>> configuration is pretty the same as petsc 3.9.3 >>> except the compiler and mpi used. I will contact >>> scinet staff to check if they have any idea on this. >>> >>> >>> >>> Very very strange since the partition is handled >>> completely by Metis, and does not use MPI. >>> >>> >>> >>>? 
?Thanks, >>> >>> >>> >>>? ? ?Matt >>> >>>? Thanks, >>> >>> >>> >>> Danyang >>> >>> >>> >>> On September 15, 2019 3:20:18 p.m. PDT, Matthew >>> Knepley >> > wrote: >>> >>> On Sun, Sep 15, 2019 at 5:19 PM Danyang Su via >>> petsc-users >> > wrote: >>> >>> Dear All, >>> >>> >>> >>> I have a question regarding strange partition >>> problem in PETSc 3.11 version. The problem does not >>> exist on my local workstation. However, on a cluster >>> with different PETSc versions, the partition seems >>> quite different, as you can find in the figure >>> below, which is tested with 160 processors. The >>> color means the processor owns that subdomain. In >>> this layered prism mesh, there are 40 layers from >>> bottom to top and each layer has around 20k nodes. >>> The natural order of nodes is also layered from >>> bottom to top. >>> >>> >>> >>> The left partition (PETSc 3.10 and earlier) >>> looks good with minimum number of ghost nodes while >>> the right one (PETSc 3.11) looks weired with huge >>> number of ghost nodes. Looks like the right one uses >>> partition layer by layer. This problem exists on a a >>> cluster but not on my local workstation for the same >>> PETSc version (with different compiler and MPI). >>> Other than the difference in partition and >>> efficiency, the simulation results are the same. >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> Below is PETSc configuration on three machine: >>> >>> >>> >>> Local workstation (works fine):? ./configure >>> --with-cc=gcc --with-cxx=g++ --with-fc=gfortran >>> --download-mpich --download-scalapack >>> --download-parmetis --download-metis >>> --download-ptscotch --download-fblaslapack >>> --download-hypre --download-superlu_dist >>> --download-hdf5=yes --download-ctetgen >>> --with-debugging=0 COPTFLAGS=-O3 CXXOPTFLAGS=-O3 >>> FOPTFLAGS=-O3 --with-cxx-dialect=C++11 >>> >>> >>> >>> Cluster with PETSc 3.9.3 (works fine): >>> --prefix=/scinet/niagara/software/2018a/opt/intel-2018.2-intelmpi-2018.2/petsc/3.9.3 >>> CC=mpicc CXX=mpicxx F77=mpif77 F90=mpif90 FC=mpifc >>> COPTFLAGS="-march=native -O2" >>> CXXOPTFLAGS="-march=native -O2" >>> FOPTFLAGS="-march=native -O2" --download-chaco=1 >>> --download-hypre=1 --download-metis=1 >>> --download-ml=1 --download-mumps=1 >>> --download-parmetis=1 --download-plapack=1 >>> --download-prometheus=1 --download-ptscotch=1 >>> --download-scotch=1 --download-sprng=1 >>> --download-superlu=1 --download-superlu_dist=1 >>> --download-triangle=1 --with-avx512-kernels=1 >>> --with-blaslapack-dir=/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl >>> --with-debugging=0 --with-hdf5=1 >>> --with-mkl_pardiso-dir=/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl >>> --with-scalapack=1 >>> --with-scalapack-lib="[/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_scalapack_lp64.so,/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so]" >>> --with-x=0 >>> >>> >>> >>> Cluster with PETSc 3.11.3 (looks weired): >>> --prefix=/scinet/niagara/software/2019b/opt/intel-2019u4-intelmpi-2019u4/petsc/3.11.3 >>> CC=mpicc CXX=mpicxx F77=mpif77 F90=mpif90 FC=mpifc >>> COPTFLAGS="-march=native -O2" >>> CXXOPTFLAGS="-march=native -O2" >>> FOPTFLAGS="-march=native -O2" --download-chaco=1 >>> --download-hdf5=1 --download-hypre=1 >>> --download-metis=1 --download-ml=1 >>> --download-mumps=1 --download-parmetis=1 >>> --download-plapack=1 --download-prometheus=1 >>> --download-ptscotch=1 
--download-scotch=1 >>> --download-sprng=1 --download-superlu=1 >>> --download-superlu_dist=1 --download-triangle=1 >>> --with-avx512-kernels=1 >>> --with-blaslapack-dir=/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl >>> --with-cxx-dialect=C++11 --with-debugging=0 >>> --with-mkl_pardiso-dir=/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl >>> --with-scalapack=1 >>> --with-scalapack-lib="[/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl/lib/intel64/libmkl_scalapack_lp64.so,/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so]" >>> --with-x=0 >>> >>> >>> >>> And the partition is used by default dmplex >>> distribution. >>> >>> >>> >>>? ? ? ?!c distribute mesh over processes >>> >>>? ? ? ?call >>> DMPlexDistribute(dmda_flow%da,stencil_width, ? ? ? ? >>> ? ? ? & >>> >>> ? ?PETSC_NULL_SF, ? ? ? ?& >>> >>> ? ?PETSC_NULL_OBJECT, ? ? ? ?& >>> >>> ? ?distributedMesh,ierr) >>> >>>? ? ? ?CHKERRQ(ierr) >>> >>> >>> >>> Any idea on this strange problem? >>> >>> >>> >>> >>> >>> I just looked at the code. Your mesh should be >>> partitioned by k-way partitioning using Metis since >>> its on 1 proc for partitioning. This code >>> >>> is the same for 3.9 and 3.11, and you get the >>> same result on your machine. I cannot understand >>> what might be happening on your cluster >>> >>> (MPI plays no role). Is it possible that you >>> changed the adjacency specification in that version? >>> >>> >>> >>>? ?Thanks, >>> >>> >>> >>>? ? ? Matt >>> >>> Thanks, >>> >>> >>> >>> Danyang >>> >>> >>> >>> >>> >>> >>> >>> -- >>> >>> What most experimenters take for granted before >>> they begin their experiments is infinitely more >>> interesting than any results to which their >>> experiments lead. >>> >>> -- Norbert Wiener >>> >>> >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >>> >>> -- >>> >>> Sent from my Android device with K-9 Mail. >>> Please excuse my brevity. >>> >>> >>> >>> >>> >>> -- >>> >>> What most experimenters take for granted before >>> they begin their experiments is infinitely more >>> interesting than any results to which their >>> experiments lead. >>> >>> -- Norbert Wiener >>> >>> >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> > >>> >>> >>> >>> -- >>> What most experimenters take for granted before they >>> begin their experiments is infinitely more interesting >>> than any results to which their experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to > which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: petsc-partition-compare.png Type: image/png Size: 69346 bytes Desc: not available URL: From juaneah at gmail.com Tue Sep 17 12:17:37 2019 From: juaneah at gmail.com (Emmanuel Ayala) Date: Tue, 17 Sep 2019 12:17:37 -0500 Subject: [petsc-users] Optimized mode In-Reply-To: <88D14AC3-F5CA-4A72-82F4-F1D74EF96FF5@anl.gov> References: <88D14AC3-F5CA-4A72-82F4-F1D74EF96FF5@anl.gov> Message-ID: Hi, thanks for the quick reply. El lun., 16 de sep. de 2019 a la(s) 23:40, Smith, Barry F. ( bsmith at mcs.anl.gov) escribi?: > > What do you mean by 2.0e-4 units ? 
If you mean the last 4 digits may > differ in the two solutions, yes, that is the meaning yes that is completely normal. How many digits you lose depends on the > order of the operations and the condition number of the matrix and and for > elasticity that will very easily be greater than 10^4 > OK. I understand that two solution can produce slightly different results. So, I guess, the optimized mode produce slightly different results because internally something change (order of some operations, maybe) when the number of process change?. > > Barry > > From: > https://pdfs.semanticscholar.org/dccf/d6daa35fc9d585de1f927c58cc29c4cd0bab.pdf > > We conclude this section by noting the need for care in interpreting the > forward error. Experiments in [24] show that simply changing the order of > evaluation of an inner product in the substitution algorithm for solution > of a triangular system can change the forward error in the computed > solution by orders of magnitude. This means, for example, that it is > dangerous to compare different codes or algorithms solely in terms of > observed forward errors. > > Thank you very much! Best regards. > > > On Sep 16, 2019, at 11:13 PM, Emmanuel Ayala via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > > > Hi everyone, > > > > I have a code for elastic linear analysis which involves DMDA, SuperLu > and matrix operations. I have a MPI vector which contains the nodal > displacements, when I run the code in debug mode i get the same value for > the norm (2 or Inf), for different number of process. But when I use the > optimized mode, I have small variations for the same norm depending on the > number of process. > > > > -Debug mode: the same norm value for any number of processes. > > -Optimized mode: the norm value changes with the number of processes. > The variations are around 2.0e-4 units. > > > > This is normal? > > > > For my optimized mode I used the next configuration > > > > ./configure --with-debugging=0 COPTFLAGS='-O3 -march=native > -mtune=native' CXXOPTFLAGS='-O3 -march=native -mtune=native' FOPTFLAGS='-O3 > -march=native -mtune=native' --download-mpich --download-superlu_dist > --download-metis --download-parmetis --download-cmake > --download-fblaslapack=1 --with-cxx-dialect=C++11 > > > > Best regards. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Sep 17 13:16:56 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Tue, 17 Sep 2019 18:16:56 +0000 Subject: [petsc-users] Optimized mode In-Reply-To: References: <88D14AC3-F5CA-4A72-82F4-F1D74EF96FF5@anl.gov> Message-ID: <0D85CF2B-E526-41F4-9024-6E5A593E8A17@mcs.anl.gov> In parallel the order of operations will always change with different number of processes; even with the same number the orders will be changed based on order of arrival of data from other processes; so identical runs can produce different results. Barry > On Sep 17, 2019, at 12:17 PM, Emmanuel Ayala wrote: > > Hi, thanks for the quick reply. > > > El lun., 16 de sep. de 2019 a la(s) 23:40, Smith, Barry F. (bsmith at mcs.anl.gov) escribi?: > > What do you mean by 2.0e-4 units ? If you mean the last 4 digits may differ in the two solutions, > > yes, that is the meaning > > yes that is completely normal. How many digits you lose depends on the order of the operations and the condition number of the matrix and and for elasticity that will very easily be greater than 10^4 > > OK. I understand that two solution can produce slightly different results. 
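For what it's worth, a tiny standalone C illustration of the point about ordering (nothing PETSc-specific; the numbers are chosen only to make the effect visible in double precision):

    /* Summing the same three numbers in two orders gives two different answers;
       a parallel reduction that receives contributions in a different order is
       doing exactly this, which is why the last digits of a norm can move.    */
    #include <stdio.h>

    int main(void)
    {
      double a = 1.0e20, b = -1.0e20, c = 1.0;
      double left  = (a + b) + c;   /* 0 + 1                -> 1 */
      double right = a + (b + c);   /* (b + c) rounds to b  -> 0 */

      printf("(a+b)+c = %g\n", left);
      printf("a+(b+c) = %g\n", right);
      return 0;
    }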
So, I guess, the optimized mode produce slightly different results because internally something change (order of some operations, maybe) when the number of process change?. > > Barry > > From: https://pdfs.semanticscholar.org/dccf/d6daa35fc9d585de1f927c58cc29c4cd0bab.pdf > > We conclude this section by noting the need for care in interpreting the forward error. Experiments in [24] show that simply changing the order of evaluation of an inner product in the substitution algorithm for solution of a triangular system can change the forward error in the computed solution by orders of magnitude. This means, for example, that it is dangerous to compare different codes or algorithms solely in terms of observed forward errors. > > Thank you very much! > Best regards. > > > On Sep 16, 2019, at 11:13 PM, Emmanuel Ayala via petsc-users wrote: > > > > Hi everyone, > > > > I have a code for elastic linear analysis which involves DMDA, SuperLu and matrix operations. I have a MPI vector which contains the nodal displacements, when I run the code in debug mode i get the same value for the norm (2 or Inf), for different number of process. But when I use the optimized mode, I have small variations for the same norm depending on the number of process. > > > > -Debug mode: the same norm value for any number of processes. > > -Optimized mode: the norm value changes with the number of processes. The variations are around 2.0e-4 units. > > > > This is normal? > > > > For my optimized mode I used the next configuration > > > > ./configure --with-debugging=0 COPTFLAGS='-O3 -march=native -mtune=native' CXXOPTFLAGS='-O3 -march=native -mtune=native' FOPTFLAGS='-O3 -march=native -mtune=native' --download-mpich --download-superlu_dist --download-metis --download-parmetis --download-cmake --download-fblaslapack=1 --with-cxx-dialect=C++11 > > > > Best regards. > From juaneah at gmail.com Tue Sep 17 13:20:26 2019 From: juaneah at gmail.com (Emmanuel Ayala) Date: Tue, 17 Sep 2019 13:20:26 -0500 Subject: [petsc-users] Optimized mode In-Reply-To: <0D85CF2B-E526-41F4-9024-6E5A593E8A17@mcs.anl.gov> References: <88D14AC3-F5CA-4A72-82F4-F1D74EF96FF5@anl.gov> <0D85CF2B-E526-41F4-9024-6E5A593E8A17@mcs.anl.gov> Message-ID: OK, thanks for the clarification! :D Regards. El mar., 17 de sep. de 2019 a la(s) 13:16, Smith, Barry F. ( bsmith at mcs.anl.gov) escribi?: > > In parallel the order of operations will always change with different > number of processes; even with the same number the orders will be changed > based on order of arrival of data from other processes; so identical runs > can produce different results. > > Barry > > > > On Sep 17, 2019, at 12:17 PM, Emmanuel Ayala wrote: > > > > Hi, thanks for the quick reply. > > > > > > El lun., 16 de sep. de 2019 a la(s) 23:40, Smith, Barry F. ( > bsmith at mcs.anl.gov) escribi?: > > > > What do you mean by 2.0e-4 units ? If you mean the last 4 digits may > differ in the two solutions, > > > > yes, that is the meaning > > > > yes that is completely normal. How many digits you lose depends on the > order of the operations and the condition number of the matrix and and for > elasticity that will very easily be greater than 10^4 > > > > OK. I understand that two solution can produce slightly different > results. So, I guess, the optimized mode produce slightly different > results because internally something change (order of some operations, > maybe) when the number of process change?. 
> > > > Barry > > > > From: > https://pdfs.semanticscholar.org/dccf/d6daa35fc9d585de1f927c58cc29c4cd0bab.pdf > > > > We conclude this section by noting the need for care in interpreting the > forward error. Experiments in [24] show that simply changing the order of > evaluation of an inner product in the substitution algorithm for solution > of a triangular system can change the forward error in the computed > solution by orders of magnitude. This means, for example, that it is > dangerous to compare different codes or algorithms solely in terms of > observed forward errors. > > > > Thank you very much! > > Best regards. > > > > > On Sep 16, 2019, at 11:13 PM, Emmanuel Ayala via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > > > > > Hi everyone, > > > > > > I have a code for elastic linear analysis which involves DMDA, SuperLu > and matrix operations. I have a MPI vector which contains the nodal > displacements, when I run the code in debug mode i get the same value for > the norm (2 or Inf), for different number of process. But when I use the > optimized mode, I have small variations for the same norm depending on the > number of process. > > > > > > -Debug mode: the same norm value for any number of processes. > > > -Optimized mode: the norm value changes with the number of processes. > The variations are around 2.0e-4 units. > > > > > > This is normal? > > > > > > For my optimized mode I used the next configuration > > > > > > ./configure --with-debugging=0 COPTFLAGS='-O3 -march=native > -mtune=native' CXXOPTFLAGS='-O3 -march=native -mtune=native' FOPTFLAGS='-O3 > -march=native -mtune=native' --download-mpich --download-superlu_dist > --download-metis --download-parmetis --download-cmake > --download-fblaslapack=1 --with-cxx-dialect=C++11 > > > > > > Best regards. > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Tue Sep 17 15:39:37 2019 From: mfadams at lbl.gov (Mark Adams) Date: Tue, 17 Sep 2019 16:39:37 -0400 Subject: [petsc-users] Optimized mode In-Reply-To: References: <88D14AC3-F5CA-4A72-82F4-F1D74EF96FF5@anl.gov> <0D85CF2B-E526-41F4-9024-6E5A593E8A17@mcs.anl.gov> Message-ID: I am suspicious that he gets the exact same answer with a debug build. You might try -O2 (and -O1, and -O0, which should be the same as your debug build). On Tue, Sep 17, 2019 at 2:21 PM Emmanuel Ayala via petsc-users < petsc-users at mcs.anl.gov> wrote: > OK, thanks for the clarification! :D > > Regards. > > El mar., 17 de sep. de 2019 a la(s) 13:16, Smith, Barry F. ( > bsmith at mcs.anl.gov) escribi?: > >> >> In parallel the order of operations will always change with different >> number of processes; even with the same number the orders will be changed >> based on order of arrival of data from other processes; so identical runs >> can produce different results. >> >> Barry >> >> >> > On Sep 17, 2019, at 12:17 PM, Emmanuel Ayala wrote: >> > >> > Hi, thanks for the quick reply. >> > >> > >> > El lun., 16 de sep. de 2019 a la(s) 23:40, Smith, Barry F. ( >> bsmith at mcs.anl.gov) escribi?: >> > >> > What do you mean by 2.0e-4 units ? If you mean the last 4 digits >> may differ in the two solutions, >> > >> > yes, that is the meaning >> > >> > yes that is completely normal. How many digits you lose depends on the >> order of the operations and the condition number of the matrix and and for >> elasticity that will very easily be greater than 10^4 >> > >> > OK. I understand that two solution can produce slightly different >> results. 
So, I guess, the optimized mode produce slightly different >> results because internally something change (order of some operations, >> maybe) when the number of process change?. >> > >> > Barry >> > >> > From: >> https://pdfs.semanticscholar.org/dccf/d6daa35fc9d585de1f927c58cc29c4cd0bab.pdf >> > >> > We conclude this section by noting the need for care in interpreting >> the forward error. Experiments in [24] show that simply changing the order >> of evaluation of an inner product in the substitution algorithm for >> solution of a triangular system can change the forward error in the >> computed solution by orders of magnitude. This means, for example, that it >> is dangerous to compare different codes or algorithms solely in terms of >> observed forward errors. >> > >> > Thank you very much! >> > Best regards. >> > >> > > On Sep 16, 2019, at 11:13 PM, Emmanuel Ayala via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> > > >> > > Hi everyone, >> > > >> > > I have a code for elastic linear analysis which involves DMDA, >> SuperLu and matrix operations. I have a MPI vector which contains the nodal >> displacements, when I run the code in debug mode i get the same value for >> the norm (2 or Inf), for different number of process. But when I use the >> optimized mode, I have small variations for the same norm depending on the >> number of process. >> > > >> > > -Debug mode: the same norm value for any number of processes. >> > > -Optimized mode: the norm value changes with the number of processes. >> The variations are around 2.0e-4 units. >> > > >> > > This is normal? >> > > >> > > For my optimized mode I used the next configuration >> > > >> > > ./configure --with-debugging=0 COPTFLAGS='-O3 -march=native >> -mtune=native' CXXOPTFLAGS='-O3 -march=native -mtune=native' FOPTFLAGS='-O3 >> -march=native -mtune=native' --download-mpich --download-superlu_dist >> --download-metis --download-parmetis --download-cmake >> --download-fblaslapack=1 --with-cxx-dialect=C++11 >> > > >> > > Best regards. >> > >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From juaneah at gmail.com Tue Sep 17 15:54:25 2019 From: juaneah at gmail.com (Emmanuel Ayala) Date: Tue, 17 Sep 2019 15:54:25 -0500 Subject: [petsc-users] Optimized mode In-Reply-To: References: <88D14AC3-F5CA-4A72-82F4-F1D74EF96FF5@anl.gov> <0D85CF2B-E526-41F4-9024-6E5A593E8A17@mcs.anl.gov> Message-ID: Thanks for the tip! Regards. El mar., 17 de sep. de 2019 a la(s) 15:40, Mark Adams (mfadams at lbl.gov) escribi?: > I am suspicious that he gets the exact same answer with a debug build. > > You might try -O2 (and -O1, and -O0, which should be the same as your > debug build). > > On Tue, Sep 17, 2019 at 2:21 PM Emmanuel Ayala via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >> OK, thanks for the clarification! :D >> >> Regards. >> >> El mar., 17 de sep. de 2019 a la(s) 13:16, Smith, Barry F. ( >> bsmith at mcs.anl.gov) escribi?: >> >>> >>> In parallel the order of operations will always change with different >>> number of processes; even with the same number the orders will be changed >>> based on order of arrival of data from other processes; so identical runs >>> can produce different results. >>> >>> Barry >>> >>> >>> > On Sep 17, 2019, at 12:17 PM, Emmanuel Ayala >>> wrote: >>> > >>> > Hi, thanks for the quick reply. >>> > >>> > >>> > El lun., 16 de sep. de 2019 a la(s) 23:40, Smith, Barry F. ( >>> bsmith at mcs.anl.gov) escribi?: >>> > >>> > What do you mean by 2.0e-4 units ? 
If you mean the last 4 digits >>> may differ in the two solutions, >>> > >>> > yes, that is the meaning >>> > >>> > yes that is completely normal. How many digits you lose depends on the >>> order of the operations and the condition number of the matrix and and for >>> elasticity that will very easily be greater than 10^4 >>> > >>> > OK. I understand that two solution can produce slightly different >>> results. So, I guess, the optimized mode produce slightly different >>> results because internally something change (order of some operations, >>> maybe) when the number of process change?. >>> > >>> > Barry >>> > >>> > From: >>> https://pdfs.semanticscholar.org/dccf/d6daa35fc9d585de1f927c58cc29c4cd0bab.pdf >>> > >>> > We conclude this section by noting the need for care in interpreting >>> the forward error. Experiments in [24] show that simply changing the order >>> of evaluation of an inner product in the substitution algorithm for >>> solution of a triangular system can change the forward error in the >>> computed solution by orders of magnitude. This means, for example, that it >>> is dangerous to compare different codes or algorithms solely in terms of >>> observed forward errors. >>> > >>> > Thank you very much! >>> > Best regards. >>> > >>> > > On Sep 16, 2019, at 11:13 PM, Emmanuel Ayala via petsc-users < >>> petsc-users at mcs.anl.gov> wrote: >>> > > >>> > > Hi everyone, >>> > > >>> > > I have a code for elastic linear analysis which involves DMDA, >>> SuperLu and matrix operations. I have a MPI vector which contains the nodal >>> displacements, when I run the code in debug mode i get the same value for >>> the norm (2 or Inf), for different number of process. But when I use the >>> optimized mode, I have small variations for the same norm depending on the >>> number of process. >>> > > >>> > > -Debug mode: the same norm value for any number of processes. >>> > > -Optimized mode: the norm value changes with the number of >>> processes. The variations are around 2.0e-4 units. >>> > > >>> > > This is normal? >>> > > >>> > > For my optimized mode I used the next configuration >>> > > >>> > > ./configure --with-debugging=0 COPTFLAGS='-O3 -march=native >>> -mtune=native' CXXOPTFLAGS='-O3 -march=native -mtune=native' FOPTFLAGS='-O3 >>> -march=native -mtune=native' --download-mpich --download-superlu_dist >>> --download-metis --download-parmetis --download-cmake >>> --download-fblaslapack=1 --with-cxx-dialect=C++11 >>> > > >>> > > Best regards. >>> > >>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From mvalera-w at sdsu.edu Tue Sep 17 17:13:46 2019 From: mvalera-w at sdsu.edu (Manuel Valera) Date: Tue, 17 Sep 2019 15:13:46 -0700 Subject: [petsc-users] TS scheme with different DAs Message-ID: Hello, petsc users, I have integrated the TS routines in my code, but i just noticed i didn't do it optimally. I was using 3 different TS objects to integrate velocities, temperature and salinity, and it works but only for small DTs. I suspect the intermediate Runge-Kutta states are unphased and this creates the discrepancy for broader time steps, so I need to integrate the 3 quantities in the same routine. I tried to do this by using a 5 DOF distributed array for the RHS, where I store the velocities in the first 3 and then Temperature and Salinity in the rest. The problem is that I use a staggered grid and T,S are located in a different DA layout than the velocities. 
This is creating problems for me since I can't find a way to communicate the information from the result of the TS integration back to the respective DAs of each variable. Is there a way to communicate across DAs? or can you suggest an alternative solution to this problem? Thanks, -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Sep 17 17:28:44 2019 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 17 Sep 2019 18:28:44 -0400 Subject: [petsc-users] TS scheme with different DAs In-Reply-To: References: Message-ID: On Tue, Sep 17, 2019 at 6:15 PM Manuel Valera via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hello, petsc users, > > I have integrated the TS routines in my code, but i just noticed i didn't > do it optimally. I was using 3 different TS objects to integrate > velocities, temperature and salinity, and it works but only for small DTs. > I suspect the intermediate Runge-Kutta states are unphased and this creates > the discrepancy for broader time steps, so I need to integrate the 3 > quantities in the same routine. > > I tried to do this by using a 5 DOF distributed array for the RHS, where I > store the velocities in the first 3 and then Temperature and Salinity in > the rest. The problem is that I use a staggered grid and T,S are located in > a different DA layout than the velocities. This is creating problems for me > since I can't find a way to communicate the information from the result of > the TS integration back to the respective DAs of each variable. > > Is there a way to communicate across DAs? or can you suggest > an alternative solution to this problem? > If you have a staggered discretization on a structured grid, I would recommend checking out DMStag. Thanks, MAtt > Thanks, > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mvalera-w at sdsu.edu Tue Sep 17 19:04:04 2019 From: mvalera-w at sdsu.edu (Manuel Valera) Date: Tue, 17 Sep 2019 17:04:04 -0700 Subject: [petsc-users] TS scheme with different DAs In-Reply-To: References: Message-ID: Thanks Matthew, but my code is too complicated to be redone on DMStag now after spending a long time using DMDAs, Is there a way to ensure PETSc distributes several DAs in the same way? besides manually distributing the points, Thanks, On Tue, Sep 17, 2019 at 3:28 PM Matthew Knepley wrote: > On Tue, Sep 17, 2019 at 6:15 PM Manuel Valera via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >> Hello, petsc users, >> >> I have integrated the TS routines in my code, but i just noticed i didn't >> do it optimally. I was using 3 different TS objects to integrate >> velocities, temperature and salinity, and it works but only for small DTs. >> I suspect the intermediate Runge-Kutta states are unphased and this creates >> the discrepancy for broader time steps, so I need to integrate the 3 >> quantities in the same routine. >> >> I tried to do this by using a 5 DOF distributed array for the RHS, where >> I store the velocities in the first 3 and then Temperature and Salinity in >> the rest. The problem is that I use a staggered grid and T,S are located in >> a different DA layout than the velocities. 
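On the question of making several DMDAs share one decomposition, one common approach is to read the ownership ranges off the existing DA and pass them when creating the next one. The sketch below (in C) assumes the grids have the same global dimensions and processor grid; the boundary and stencil arguments are placeholders that should match whatever the existing DA was created with:

    /* Sketch: create a second 3D DMDA decomposed exactly like an existing one.
       daVel is the DA that already exists; dofTS/stencilWidth are whatever the
       new field needs. If staggering means one direction has an extra plane of
       points, the global size and the last entry of that ranges array must be
       adjusted by hand (not shown).                                           */
    #include <petscdmda.h>

    PetscErrorCode CreateMatchingDMDA(DM daVel, PetscInt dofTS, PetscInt stencilWidth, DM *daTS)
    {
      PetscInt        M, N, P, m, n, p;
      const PetscInt *lx, *ly, *lz;
      PetscErrorCode  ierr;

      PetscFunctionBeginUser;
      ierr = DMDAGetInfo(daVel, NULL, &M, &N, &P, &m, &n, &p, NULL, NULL, NULL, NULL, NULL, NULL);CHKERRQ(ierr);
      ierr = DMDAGetOwnershipRanges(daVel, &lx, &ly, &lz);CHKERRQ(ierr);
      ierr = DMDACreate3d(PetscObjectComm((PetscObject)daVel),
                          DM_BOUNDARY_GHOSTED, DM_BOUNDARY_GHOSTED, DM_BOUNDARY_GHOSTED,
                          DMDA_STENCIL_STAR, M, N, P, m, n, p,
                          dofTS, stencilWidth, lx, ly, lz, daTS);CHKERRQ(ierr);
      ierr = DMSetUp(*daTS);CHKERRQ(ierr);
      PetscFunctionReturn(0);
    }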
This is creating problems for me >> since I can't find a way to communicate the information from the result of >> the TS integration back to the respective DAs of each variable. >> >> Is there a way to communicate across DAs? or can you suggest >> an alternative solution to this problem? >> > > If you have a staggered discretization on a structured grid, I would > recommend checking out DMStag. > > Thanks, > > MAtt > > >> Thanks, >> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Sep 17 19:15:11 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Wed, 18 Sep 2019 00:15:11 +0000 Subject: [petsc-users] I find slow performance of SNES In-Reply-To: <366122035.18684472.1568739431284.JavaMail.zimbra@u-bordeaux.fr> References: <1415422715.5264647.1568586915625.JavaMail.zimbra@u-bordeaux.fr> <9249F6D6-0F16-4FCB-A820-5F8F3F04E03C@anl.gov> <366122035.18684472.1568739431284.JavaMail.zimbra@u-bordeaux.fr> Message-ID: <4E75EEA0-7F70-40B5-9EB1-B3DD883C63F0@mcs.anl.gov> > On Sep 17, 2019, at 11:57 AM, Pedro Gonzalez wrote: > > Dear Barry, > > > Thanks a lot for your quick reply. > > > >> Do you simply mean that -snes_type ngs is not converging to the solution? Does it seem to converge to something else or nothing at all? >> The SNES GS code just calls the user provided routine set with SNESSetNGS(). This means that it is expect to behave the same way as if the user simply called the user provided routine themselves. I assume you have a routine that implements the GS on the DMDA since you write " I managed to parallelize it by using DMDA with very good results. " Thus you should get the same iterations for both calling your Gauss-Seidel code yourself and calling SNESSetNGS() and then calling SNESSolve() with -snes_type ngs. You can check if they are behaving in the same (for simplicity) by using the debugger or by putting VecView() into code to see if they are generating the same values. Just have your GS code call VecView() on the input vectors at the top and the output vectors at the bottom. > > Before adding SNES to my code, I had programmed a nonlinear Gauss-Seidel algorithm which I parallelized via DMDA. I use it as reference solution. The SNES solver converges towards this solution. The problem is that it is being considerably slower than my reference algorithm. For instance, "-snes_type ngs" takes about 200 times more time, "-snes_type nrichardson" employs approximately 50 times more time, while "-snes_type ngmres" is about 20 times slower. > > I expect that SNES be as fast as my reference algorithm. Why? It is using different algorithms that are not going to necessarily converge any faster in iterations than your GS nor take less time per iteration. > This slow-performance problem may be due to a not optimal configuration of SNES (I am not an expert user of PETSc) or, as you suggest (see below), to the fact that SNES is calculating more than the necessary at each iteration (e.g., norms). > > Moreover, I have checked that the number iterations needed by "ngs" to reach convergence is almost the same as my reference GS algorithm. The small differences are due to the way the grid points are sweept during the iterations. > > > >> Do you mean both -snes_ngs_secant and -snes_ngs_secant_h ? 
The second option by itself will do nothing. Cut and paste the exact options you use. > > In order to make SNES converge, I use the following options: > - Nonlinear Gauss-Seidel: -snes_type ngs -snes_ngs_secant_h 1.E-1 > - Nonlinear Richardson: -snes_type nrichardson > - Nolinear Generalized Minimal RESidual: -snes_type ngmres -snes_linesearch_type cp > > > >> This would happen if you did not use -snes_ngs_secant but do use -snes_ngs_secant_h but this doesn't change the algorithm I see. You don't provide your GS solver to SNES so it uses the secant method by default. > > With respect to "ngs" solver, it seems that there is a problem with the residual. Physically, the solution of the nonlinear system is proportional to an input amplitude and the convergence criterium which I set is normalized to this input amplitude. Hence, if I just vary the amplitude in a given problem, the solver should make the same number of iterations because the residual scales to that input amplitude. For physically small input amplitudes from 1.E0 to 1.E8 I observed that the option "-snes_type ngs" makes SNES converge (within the same time and the same number of iterations) and the absolute residual were proportional to the input amplitude, as expected. However, my imput amplitude is 1.E10 and I observed that "-snes_type ngs" could not make the absolute residual decrease. That is the reason why I added the option "-snes_ngs_secant_h 1.E-1" to "ngs" in order to converge when the input amplitude is 1.E10. However, I realize that I can vary this parameter from 1.E-1 to 1.E50 and I obtain the same solution within the same computation time. For very small values (I tested -snes_ngs_secant_h 1.E-8), it does not converge. I cannot explain why using huge values would work for h. I am not sure what you mean by "For physically small input amplitudes from 1.E0 to 1.E8 I", do you mean your solution is in that range? And does "my imput amplitude is 1.E10" mean the solution is about 10^10. If so then that makes sense, because h is the perturbation to x and if the perturbation is too small then it doesn't change the solution so you get no secant computed. Thus you need a bigger h > What is happening here? Nonlinear Richardson does not have this problem and converges at any input amplitude. Nonlinear Richardson does not use any differencing scheme. It has no h. > > > >> By default SNES may be doing some norm computations at each iteration which are expensive and will slow things down. You can use SNESSetNormSchedule() or the command line form -snes_norm_schedule to turn these norms off. > > Thanks for this pertinent suggestion. Actually I have my own convergence criterium. In the subroutine MySNESConverged(snes,nc,xnorm,snorm,fnorm,reason,ctx,ierr) called by SNESSetConvergenceTest(snes,MySNESConverged,0,PETSC_NULL_FUNCTION,ierr), I do not need the norms "xnorm", "snorm" and "fnorm" which SNES calculated by default, as you say. However, I have done a print of these values and I see that "xnorm" and "snorm" are always 0 (so I deduce that they are not calculated) and that "fnorm" is calculated even when I add the option "-snes_norm_schedule none". Some algorithms such as NGMRES require the norms. Basic nonlinear GS should not. Here is the code: Note the norm is only called for all iterations depending on norm schedule. If you are sure it is calling the norm every iteration you could run in the debugger with a break point in VecNorm to see where it is being called. 
for (i = 0; i < snes->max_its; i++) { ierr = SNESComputeNGS(snes, B, X);CHKERRQ(ierr); /* only compute norms if requested or about to exit due to maximum iterations */ if (normschedule == SNES_NORM_ALWAYS || ((i == snes->max_its - 1) && (normschedule == SNES_NORM_INITIAL_FINAL_ONLY || normschedule == SNES_NORM_FINAL_ONLY))) { ierr = SNESComputeFunction(snes,X,F);CHKERRQ(ierr); ierr = VecNorm(F, NORM_2, &fnorm);CHKERRQ(ierr); /* fnorm <- ||F|| */ SNESCheckFunctionNorm(snes,fnorm); /* Monitor convergence */ ierr = PetscObjectSAWsTakeAccess((PetscObject)snes);CHKERRQ(ierr); snes->iter = i+1; snes->norm = fnorm; ierr = PetscObjectSAWsGrantAccess((PetscObject)snes);CHKERRQ(ierr); ierr = SNESLogConvergenceHistory(snes,snes->norm,0);CHKERRQ(ierr); ierr = SNESMonitor(snes,snes->iter,snes->norm);CHKERRQ(ierr); } /* Test for convergence */ if (normschedule == SNES_NORM_ALWAYS) ierr = (*snes->ops->converged)(snes,snes->iter,0.0,0.0,fnorm,&snes->reason,snes->cnvP);CHKERRQ(ierr); if (snes->reason) PetscFunctionReturn(0); /* Call general purpose update function */ if (snes->ops->update) { ierr = (*snes->ops->update)(snes, snes->iter);CHKERRQ(ierr); } } > If "-snes_norm_schedule none" is not working, how can I switch off the calculation of "fnorm"? Below you can see how I integrated SNES into my code. > > Moreover, I have several degrees of freedom at each point of DMDA grid (dof > 1). The norm which I need to calculate for my convergence criterium only concerns a subset of degrees of freedom: start = [start1, start2, ..., startn], with len(start) = n < dof. I am using VecStrideNorm() or VecStrideMax() with a do-loop: > > call VecStrideNorm(solution,start(1),NORM_INFINITY,mynorm,ierr) > do i = 2, n > call VecStrideNorm(solution,start(i),NORM_INFINITY,mynorm_i,ierr) > mynorm = MAX(mynorm, mynorm_i) > end do > > Does there exist a more efficient way of calculating the norm I need? Yes, the above will requires a n global reductions and n loops of the data; so pretty inefficient. Better you write the loops yourself and use MPI_Allreduce() directly to accumulate the parallel maximum. Below you are not providing a Gauss-Seidel method so it if you pick ngs the only thing it can do is the default coloring. But you can pass in your GS with SNESSetNGS() and SNES will use it, I thought that was what you were doing. This is what I think you should do. 1) use SNESSetNGS() to provide PETSc with your GS. Make sure it converges at the same rate as your standalone solver and roughly the same time. 2) use NGMRES to accelerate the convergence of the GS. This can also be phrased as use your Gauss-Seidel as a nonlinear preconditioner for nonlinear GMRES. Basically you create the SNES and then call SNESGetNPC() on it then provide to this "inner" SNES your GS smoother > > > > The PETSc part which I integrated in my code is the following (pseudo-code): > > !-----------------------------------------------------------------------------! > DM :: dmda > Vec :: lv, gv > SNES :: snes > PetscReal, pointer :: p > !-----------------------------------------------------------------------------! > external FormFunctionLocal > external MySNESConverged > !-----------------------------------------------------------------------------! > call PetscInitialize(...) > call MPI_Comm_rank(...) > call MPI_Comm_size(...) > > call DMDACreate3d(... DM_BOUNDARY_GHOSTED ... DMDA_STENCIL_STAR ... dof ...) > call DMSetUp(dmda,ierr) > call DMDAGetCorners(...) ===> I obtain xs, xe, ys, ye, zs, ze in Fortan 1-based indexing > call DMDAGetGhostCorners(...) 
===> I obtain xsg, xeg, ysg, yeg, zsg, zeg in Fortan 1-based indexing > call DMCreateLocalVector(dmda,lv,ierr) > call DMCreateGlobalVector(dmda,gv,ierr) > > call SNESCreate(comm,snes,ierr) > call SNESSetConvergenceTest(snes,MySNESConverged,0,PETSC_NULL_FUNCTION,ierr) > call SNESSetDM(snes,dmda,ierr) > call DMDASNESSetFunctionLocal(dmda,INSERT_VALUES,FormFunctionLocal,0,ierr) > call SNESSetFromOptions(snes,ierr) > > nullify(p) > call DMDAVecGetArrayF90(dmda,lv,p,ierr) > do n = zsg, zeg > do m = ysg, yeg > do l = xsg, xeg > do i = 1, dof > p(i-1,l-1,m-1,n-1) = INITIAL VALUE > end do > end do > end do > end do > call DMDAVecRestoreArrayF90(dmda,lv,p,ierr) > nullify(p) > call DMLocalToGlobalBegin(dmda,lv,INSERT_VALUES,gv,ierr) > call DMLocalToGlobalEnd(dmda,lv,INSERT_VALUES,gv,ierr) > > call SNESSolve(snes,PETSC_NULL_VEC,gv,ierr) > > call VecDestroy(lv,ierr) > call VecDestroy(gv,ierr) > call SNESDestroy(snes,ierr) > call DMDestroy(dmda,ierr) > call PetscFinalize(ierr) > !-----------------------------------------------------------------------------! > subroutine FormFunctionLocal(info,x,G,da,ierr) > DMDALocalInfo, intent(in) :: info(DMDA_LOCAL_INFO_SIZE) > PetscScalar, intent(in) :: x(0:petscdof-1,xs-stw:xe+stw,ys-stw:ye+stw,zs-stw:ze+stw) > PetscScalar, intent(inout) :: G(0:petscdof-1,xs:xe,ys:ye,zs:ze) > DM, intent(in) :: da > PetscErrorCode, intent(inout) :: ierr > do n = zs, ze > do m = ys, ye > do l = xs, xe > do i = 1, dof > G(i,l,m,n) = VALUE OF THE FUNCTION G(x) > end do > end do > end do > end do > end subroutine FormFunctionLocal > !-----------------------------------------------------------------------------! > subroutine MySNESConverged(snes,nc,xnorm,snorm,fnorm,reason,ctx,ierr) > SNES, intent(in) :: snes > PetscInt, intent(in) :: nc > PetscReal, intent(in) :: xnorm, snorm, fnorm > SNESConvergedReason, intent(inout) :: reason > PetscInt, intent(in) :: ctx > PetscErrorCode, intent(inout) :: ierr > (...) > end subroutine MySNESConverged > !-----------------------------------------------------------------------------! > > > > > > > Thanks a lot in advance for your reply. > > > Best regards, > Pedro > > > > > > > > ----- Mail original ----- > De: "Smith, Barry F." > ?: "Pedro Gonzalez" > Cc: "petsc-users" > Envoy?: Lundi 16 Septembre 2019 01:28:15 > Objet: Re: [petsc-users] I find slow performance of SNES > >> On Sep 15, 2019, at 5:35 PM, Pedro Gonzalez via petsc-users wrote: >> >> Dear all, >> >> I am working on a code that solves a nonlinear system of equations G(x)=0 with Gauss-Seidel method. I managed to parallelize it by using DMDA with very good results. The previous week I changed my Gauss-Seidel solver by SNES. The code using SNES gives the same result as before, but I do not obtain the performance that I expected: >> 1) When using the Gauss-Seidel method (-snes_type ngs) the residual G(x) seems not be scallable to the amplitude of x > > Do you simply mean that -snes_type ngs is not converging to the solution? Does it seem to converge to something else or nothing at all? > > The SNES GS code just calls the user provided routine set with SNESSetNGS(). This means that it is expect to behave the same way as if the user simply called the user provided routine themselves. I assume you have a routine that implements the GS on the DMDA since you write " I managed to parallelize it by using DMDA with very good results. " Thus you should get the same iterations for both calling your Gauss-Seidel code yourself and calling SNESSetNGS() and then calling SNESSolve() with -snes_type ngs. 
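To make the suggested combination concrete (provide the hand-written Gauss-Seidel via SNESSetNGS() and accelerate it with NGMRES through SNESGetNPC()), here is a compressed sketch. MyUserNGS and ctx are placeholders for the user's own Gauss-Seidel sweep and its context, not PETSc built-ins; shown in C for brevity:

    /* Sketch: register the hand-written Gauss-Seidel as the nonlinear smoother
       of an inner SNES and accelerate it with NGMRES on the outer SNES.
       MyUserNGS(snes,b,x,ctx) is assumed to update x in place with one or more
       GS sweeps, exactly like the standalone solver.                          */
    #include <petscsnes.h>

    extern PetscErrorCode MyUserNGS(SNES, Vec, Vec, void *);

    PetscErrorCode SetupAcceleratedGS(SNES snes, DM da, void *ctx)
    {
      SNES           npc;
      PetscErrorCode ierr;

      PetscFunctionBeginUser;
      ierr = SNESSetDM(snes, da);CHKERRQ(ierr);
      ierr = SNESSetType(snes, SNESNGMRES);CHKERRQ(ierr);    /* outer accelerator      */
      ierr = SNESGetNPC(snes, &npc);CHKERRQ(ierr);           /* inner nonlinear solver */
      ierr = SNESSetType(npc, SNESNGS);CHKERRQ(ierr);
      ierr = SNESSetNGS(npc, MyUserNGS, ctx);CHKERRQ(ierr);  /* user GS, not the secant default */
      ierr = SNESSetFromOptions(snes);CHKERRQ(ierr);
      PetscFunctionReturn(0);
    }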
You can check if they are behaving in the same (for simplicity) by using the debugger or by putting VecView() into code to see if they are generating the same values. Just have your GS code call VecView() on the input vectors at the top and the output vectors at the bottom. > > >> and I have to add the option -snes_secant_h in order to make SNES converge. > > Do you mean both -snes_ngs_secant and -snes_ngs_secant_h ? The second option by itself will do nothing. Cut and paste the exact options you use. > >> However, I varied the step from 1.E-1 to 1.E50 and obtained the same result within the same computation time. > > This would happen if you did not use -snes_ngs_secant but do use -snes_ngs_secant_h but this doesn't change the algorithm > >> Is it normal that snes_secant_h can vary so many orders of magnitude? > > That certainly does seem odd. Looking at the code SNESComputeNGSDefaultSecant() we see it is perturbing the input vector (by color) with h > > - for (j=0;j - wa[idx[j]-s] += h; > - } > > It then does the secant step with > > if (PetscAbsScalar(g-f) > atol) { > /* This is equivalent to d = x - (h*f) / PetscRealPart(g-f) */ > d = (x*g-w*f) / PetscRealPart(g-f); > } else { > d = x; > } > > In PetscErrorCode SNESSetFromOptions_NGS(PetscOptionItems *PetscOptionsObject,SNES snes) one can see that the user provided h is accepted > > ierr = PetscOptionsReal("-snes_ngs_secant_h","Differencing parameter for secant search","",gs->h,&gs->h,NULL);CHKERRQ(ierr); > > You could run in the debugger with a break point in SNESComputeNGSDefaultSecant() to see if it is truly using the h you provided. > > > > >> 2) Compared to my Gauss-Seidel algorithm, SNES does (approximately) the same number of iterations (with the same convergence criterium) but it is about 100 times slower. > > I don't fully understand what you are running so cannot completely answer this. > > -snes_ngs_secant will be lots slower than using a user provided GS > > By default SNES may be doing some norm computations at each iteration which are expensive and will slow things down. You can use SNESSetNormSchedule() or the command line form -snes_norm_schedule to turn these norms off. > > >> What can be the reason(s) of this slow performance of SNES solver? I do not use preconditioner with my algorithm so I did not add one to SNES. >> >> The main PETSc subroutines that I have included (in this order) are the following: >> call DMDACreate3D >> call DMSetUp >> call DMCreateLocalVector >> call DMCreateGlobalVector >> call SNESCreate >> call SNESSetConvergenceTest >> call SNESSetDM >> call DMDASNESSetFunctionLocal >> call SNESSetFromOptions >> call SNESSolve >> >> Thanks in advance for you help. >> >> Best regards, >> Pedro >> >> From bsmith at mcs.anl.gov Tue Sep 17 19:27:47 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Wed, 18 Sep 2019 00:27:47 +0000 Subject: [petsc-users] TS scheme with different DAs In-Reply-To: References: Message-ID: Don't be too quick to dismiss switching to the DMStag you may find that it actually takes little time to convert and then you have a much less cumbersome process to manage the staggered grid. 
Take a look at src/dm/impls/stag/examples/tutorials/ex2.c where const PetscInt dof0 = 0, dof1 = 1,dof2 = 1; /* 1 dof on each edge and element center */ const PetscInt stencilWidth = 1; ierr = DMStagCreate2d(PETSC_COMM_WORLD,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,7,9,PETSC_DECIDE,PETSC_DECIDE,dof0,dof1,dof2,DMSTAG_STENCIL_BOX,stencilWidth,NULL,NULL,&dmSol);CHKERRQ(ierr); BOOM, it has set up a staggered grid with 1 cell centered variable and 1 on each edge. Adding more the cell centers, vertices, or edges is trivial. If you want to stick to DMDA you "cheat". Depending on exactly what staggering you have you make the DMDA for the "smaller problem" as large as the other ones and just track zeros in those locations. For example if velocities are "edges" and T, S are on cells, make your "cells" DMDA one extra grid width wide in all three dimensions. You may need to be careful on the boundaries deepening on the types of boundary conditions. > On Sep 17, 2019, at 7:04 PM, Manuel Valera via petsc-users wrote: > > Thanks Matthew, but my code is too complicated to be redone on DMStag now after spending a long time using DMDAs, > > Is there a way to ensure PETSc distributes several DAs in the same way? besides manually distributing the points, > > Thanks, > > On Tue, Sep 17, 2019 at 3:28 PM Matthew Knepley wrote: > On Tue, Sep 17, 2019 at 6:15 PM Manuel Valera via petsc-users wrote: > Hello, petsc users, > > I have integrated the TS routines in my code, but i just noticed i didn't do it optimally. I was using 3 different TS objects to integrate velocities, temperature and salinity, and it works but only for small DTs. I suspect the intermediate Runge-Kutta states are unphased and this creates the discrepancy for broader time steps, so I need to integrate the 3 quantities in the same routine. > > I tried to do this by using a 5 DOF distributed array for the RHS, where I store the velocities in the first 3 and then Temperature and Salinity in the rest. The problem is that I use a staggered grid and T,S are located in a different DA layout than the velocities. This is creating problems for me since I can't find a way to communicate the information from the result of the TS integration back to the respective DAs of each variable. > > Is there a way to communicate across DAs? or can you suggest an alternative solution to this problem? > > If you have a staggered discretization on a structured grid, I would recommend checking out DMStag. > > Thanks, > > MAtt > > Thanks, > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ From mhbaghaei at mail.sjtu.edu.cn Tue Sep 17 22:41:33 2019 From: mhbaghaei at mail.sjtu.edu.cn (Mohammad Hassan) Date: Wed, 18 Sep 2019 11:41:33 +0800 Subject: [petsc-users] DMPlex Distribution In-Reply-To: References: <001001d56d5b$b0a40b50$11ec21f0$@mail.sjtu.edu.cn> <003701d56d71$7a13d5f0$6e3b81d0$@mail.sjtu.edu.cn> Message-ID: <000801d56dd2$f81ae6d0$e850b470$@mail.sjtu.edu.cn> Mark is right. The functionality of AMR does not relate to parallelization of that. The vector size (global or local) does not conflict with AMR functions. Thanks Amir From: Matthew Knepley [mailto:knepley at gmail.com] Sent: Wednesday, September 18, 2019 12:59 AM To: Mohammad Hassan Cc: PETSc Subject: Re: [petsc-users] DMPlex Distribution On Tue, Sep 17, 2019 at 12:03 PM Mohammad Hassan > wrote: Thanks for suggestion. 
I am going to use a block-based amr. I think I need to know exactly the mesh distribution of blocks across different processors for implementation of amr. Hi Amir, How are you using Plex if the block-AMR is coming from somewhere else? This will help me tell you what would be best. And as a general question, can we set block size of vector on each rank? I think as Mark says that you are using "blocksize" is a different way than PETSc. Thanks, Matt Thanks Amir From: Matthew Knepley [mailto: knepley at gmail.com] Sent: Tuesday, September 17, 2019 11:04 PM To: Mohammad Hassan < mhbaghaei at mail.sjtu.edu.cn> Cc: PETSc < petsc-users at mcs.anl.gov> Subject: Re: [petsc-users] DMPlex Distribution On Tue, Sep 17, 2019 at 9:27 AM Mohammad Hassan via petsc-users > wrote: Hi I am using DMPlexCreateFromDAG() to construct my DM. Is it possible to set the distribution across processors manually. I mean, how can I set the share of dm on each rank (local)? You could make a Shell partitioner and tell it the entire partition: https://www.mcs.anl.gov/petsc/petsc-master/docs/manualpages/DMPLEX/PetscPartitionerShellSetPartition.html However, I would be surprised if you could do this. It is likely that you just want to mess with the weights in ParMetis. Thanks, Matt Thanks Amir -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Sep 18 06:35:49 2019 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 18 Sep 2019 07:35:49 -0400 Subject: [petsc-users] TS scheme with different DAs In-Reply-To: References: Message-ID: On Tue, Sep 17, 2019 at 8:27 PM Smith, Barry F. wrote: > > Don't be too quick to dismiss switching to the DMStag you may find that > it actually takes little time to convert and then you have a much less > cumbersome process to manage the staggered grid. Take a look at > src/dm/impls/stag/examples/tutorials/ex2.c where > > const PetscInt dof0 = 0, dof1 = 1,dof2 = 1; /* 1 dof on each edge and > element center */ > const PetscInt stencilWidth = 1; > ierr = > DMStagCreate2d(PETSC_COMM_WORLD,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,7,9,PETSC_DECIDE,PETSC_DECIDE,dof0,dof1,dof2,DMSTAG_STENCIL_BOX,stencilWidth,NULL,NULL,&dmSol);CHKERRQ(ierr); > > BOOM, it has set up a staggered grid with 1 cell centered variable and 1 > on each edge. Adding more the cell centers, vertices, or edges is trivial. > > If you want to stick to DMDA you > > "cheat". Depending on exactly what staggering you have you make the DMDA > for the "smaller problem" as large as the other ones and just track zeros > in those locations. For example if velocities are "edges" and T, S are on > cells, make your "cells" DMDA one extra grid width wide in all three > dimensions. You may need to be careful on the boundaries deepening on the > types of boundary conditions. > Yes, SNES ex30 does exactly this. However, I still recommend looking at DMStag. Patrick created it because managing the DMDA became such as headache. 
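If you do stay with several DMDAs, a sketch of one way to keep them on the same parallel decomposition is to reuse the ownership ranges of the first DMDA when creating the others; the helper below is illustrative only (boundary and stencil types are placeholders, not taken from this thread).

   #include <petscdmda.h>

   /* Create a second 3d DMDA with the same global size and the same per-rank
      ownership ranges as da1, but with its own dof and stencil width. */
   PetscErrorCode CreateMatchingDMDA(DM da1, PetscInt dof2, PetscInt sw2, DM *da2)
   {
     PetscErrorCode  ierr;
     PetscInt        M, N, P, m, n, p;
     const PetscInt *lx, *ly, *lz;

     PetscFunctionBeginUser;
     ierr = DMDAGetInfo(da1,NULL,&M,&N,&P,&m,&n,&p,NULL,NULL,NULL,NULL,NULL,NULL);CHKERRQ(ierr);
     ierr = DMDAGetOwnershipRanges(da1,&lx,&ly,&lz);CHKERRQ(ierr);
     ierr = DMDACreate3d(PetscObjectComm((PetscObject)da1),DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,
                         DMDA_STENCIL_STAR,M,N,P,m,n,p,dof2,sw2,lx,ly,lz,da2);CHKERRQ(ierr);
     ierr = DMSetUp(*da2);CHKERRQ(ierr);
     PetscFunctionReturn(0);
   }

Both DMDAs then place the same grid points on the same ranks, so their vectors line up process by process.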
Thanks, Matt > > On Sep 17, 2019, at 7:04 PM, Manuel Valera via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > > > Thanks Matthew, but my code is too complicated to be redone on DMStag > now after spending a long time using DMDAs, > > > > Is there a way to ensure PETSc distributes several DAs in the same way? > besides manually distributing the points, > > > > Thanks, > > > > On Tue, Sep 17, 2019 at 3:28 PM Matthew Knepley > wrote: > > On Tue, Sep 17, 2019 at 6:15 PM Manuel Valera via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > Hello, petsc users, > > > > I have integrated the TS routines in my code, but i just noticed i > didn't do it optimally. I was using 3 different TS objects to integrate > velocities, temperature and salinity, and it works but only for small DTs. > I suspect the intermediate Runge-Kutta states are unphased and this creates > the discrepancy for broader time steps, so I need to integrate the 3 > quantities in the same routine. > > > > I tried to do this by using a 5 DOF distributed array for the RHS, where > I store the velocities in the first 3 and then Temperature and Salinity in > the rest. The problem is that I use a staggered grid and T,S are located in > a different DA layout than the velocities. This is creating problems for me > since I can't find a way to communicate the information from the result of > the TS integration back to the respective DAs of each variable. > > > > Is there a way to communicate across DAs? or can you suggest an > alternative solution to this problem? > > > > If you have a staggered discretization on a structured grid, I would > recommend checking out DMStag. > > > > Thanks, > > > > MAtt > > > > Thanks, > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Wed Sep 18 08:22:47 2019 From: mfadams at lbl.gov (Mark Adams) Date: Wed, 18 Sep 2019 09:22:47 -0400 Subject: [petsc-users] DMPlex Distribution In-Reply-To: <000801d56dd2$f81ae6d0$e850b470$@mail.sjtu.edu.cn> References: <001001d56d5b$b0a40b50$11ec21f0$@mail.sjtu.edu.cn> <003701d56d71$7a13d5f0$6e3b81d0$@mail.sjtu.edu.cn> <000801d56dd2$f81ae6d0$e850b470$@mail.sjtu.edu.cn> Message-ID: I'm puzzled. It sounds like you are doing non-conforming AMR (structured block AMR), but Plex does not support that. On Tue, Sep 17, 2019 at 11:41 PM Mohammad Hassan via petsc-users < petsc-users at mcs.anl.gov> wrote: > Mark is right. The functionality of AMR does not relate to > parallelization of that. The vector size (global or local) does not > conflict with AMR functions. > > Thanks > > > > Amir > > > > *From:* Matthew Knepley [mailto:knepley at gmail.com] > *Sent:* Wednesday, September 18, 2019 12:59 AM > *To:* Mohammad Hassan > *Cc:* PETSc > *Subject:* Re: [petsc-users] DMPlex Distribution > > > > On Tue, Sep 17, 2019 at 12:03 PM Mohammad Hassan < > mhbaghaei at mail.sjtu.edu.cn> wrote: > > Thanks for suggestion. I am going to use a block-based amr. 
I think I need > to know exactly the mesh distribution of blocks across different processors > for implementation of amr. > > > > Hi Amir, > > > > How are you using Plex if the block-AMR is coming from somewhere else? > This will help > > me tell you what would be best. > > > > And as a general question, can we set block size of vector on each rank? > > > > I think as Mark says that you are using "blocksize" is a different way > than PETSc. > > > > Thanks, > > > > Matt > > > > Thanks > > Amir > > > > *From:* Matthew Knepley [mailto:knepley at gmail.com] > *Sent:* Tuesday, September 17, 2019 11:04 PM > *To:* Mohammad Hassan > *Cc:* PETSc > *Subject:* Re: [petsc-users] DMPlex Distribution > > > > On Tue, Sep 17, 2019 at 9:27 AM Mohammad Hassan via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > Hi > > I am using DMPlexCreateFromDAG() to construct my DM. Is it possible to set > the distribution across processors manually. I mean, how can I set the > share of dm on each rank (local)? > > > > You could make a Shell partitioner and tell it the entire partition: > > > > > https://www.mcs.anl.gov/petsc/petsc-master/docs/manualpages/DMPLEX/PetscPartitionerShellSetPartition.html > > > > However, I would be surprised if you could do this. It is likely that you > just want to mess with the weights in ParMetis. > > > > Thanks, > > > > Matt > > > > Thanks > > Amir > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mhbaghaei at mail.sjtu.edu.cn Wed Sep 18 08:35:36 2019 From: mhbaghaei at mail.sjtu.edu.cn (Mohammad Hassan) Date: Wed, 18 Sep 2019 21:35:36 +0800 Subject: [petsc-users] DMPlex Distribution In-Reply-To: References: <001001d56d5b$b0a40b50$11ec21f0$@mail.sjtu.edu.cn> <003701d56d71$7a13d5f0$6e3b81d0$@mail.sjtu.edu.cn> <000801d56dd2$f81ae6d0$e850b470$@mail.sjtu.edu.cn> Message-ID: <001f01d56e25$f55b14d0$e0113e70$@mail.sjtu.edu.cn> If DMPlex does not support, I may need to use PARAMESH or CHOMBO. Is there any way that we can construct non-conformal layout for DM in petsc? From: Mark Adams [mailto:mfadams at lbl.gov] Sent: Wednesday, September 18, 2019 9:23 PM To: Mohammad Hassan Cc: Matthew Knepley ; PETSc users list Subject: Re: [petsc-users] DMPlex Distribution I'm puzzled. It sounds like you are doing non-conforming AMR (structured block AMR), but Plex does not support that. On Tue, Sep 17, 2019 at 11:41 PM Mohammad Hassan via petsc-users > wrote: Mark is right. The functionality of AMR does not relate to parallelization of that. The vector size (global or local) does not conflict with AMR functions. Thanks Amir From: Matthew Knepley [mailto:knepley at gmail.com ] Sent: Wednesday, September 18, 2019 12:59 AM To: Mohammad Hassan > Cc: PETSc > Subject: Re: [petsc-users] DMPlex Distribution On Tue, Sep 17, 2019 at 12:03 PM Mohammad Hassan > wrote: Thanks for suggestion. I am going to use a block-based amr. I think I need to know exactly the mesh distribution of blocks across different processors for implementation of amr. 
Hi Amir, How are you using Plex if the block-AMR is coming from somewhere else? This will help me tell you what would be best. And as a general question, can we set block size of vector on each rank? I think as Mark says that you are using "blocksize" is a different way than PETSc. Thanks, Matt Thanks Amir From: Matthew Knepley [mailto: knepley at gmail.com] Sent: Tuesday, September 17, 2019 11:04 PM To: Mohammad Hassan < mhbaghaei at mail.sjtu.edu.cn> Cc: PETSc < petsc-users at mcs.anl.gov> Subject: Re: [petsc-users] DMPlex Distribution On Tue, Sep 17, 2019 at 9:27 AM Mohammad Hassan via petsc-users > wrote: Hi I am using DMPlexCreateFromDAG() to construct my DM. Is it possible to set the distribution across processors manually. I mean, how can I set the share of dm on each rank (local)? You could make a Shell partitioner and tell it the entire partition: https://www.mcs.anl.gov/petsc/petsc-master/docs/manualpages/DMPLEX/PetscPartitionerShellSetPartition.html However, I would be surprised if you could do this. It is likely that you just want to mess with the weights in ParMetis. Thanks, Matt Thanks Amir -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Sep 18 08:50:26 2019 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 18 Sep 2019 09:50:26 -0400 Subject: [petsc-users] DMPlex Distribution In-Reply-To: <001f01d56e25$f55b14d0$e0113e70$@mail.sjtu.edu.cn> References: <001001d56d5b$b0a40b50$11ec21f0$@mail.sjtu.edu.cn> <003701d56d71$7a13d5f0$6e3b81d0$@mail.sjtu.edu.cn> <000801d56dd2$f81ae6d0$e850b470$@mail.sjtu.edu.cn> <001f01d56e25$f55b14d0$e0113e70$@mail.sjtu.edu.cn> Message-ID: On Wed, Sep 18, 2019 at 9:35 AM Mohammad Hassan via petsc-users < petsc-users at mcs.anl.gov> wrote: > If DMPlex does not support, I may need to use PARAMESH or CHOMBO. Is there > any way that we can construct non-conformal layout for DM in petsc? > Lets see. Plex does support geometrically non-conforming meshes. This is how we support p4est. However, if you want that, you can just use DMForest I think. So you jsut want structured AMR? Thanks, Matt > > > *From:* Mark Adams [mailto:mfadams at lbl.gov] > *Sent:* Wednesday, September 18, 2019 9:23 PM > *To:* Mohammad Hassan > *Cc:* Matthew Knepley ; PETSc users list < > petsc-users at mcs.anl.gov> > *Subject:* Re: [petsc-users] DMPlex Distribution > > > > I'm puzzled. It sounds like you are doing non-conforming AMR (structured > block AMR), but Plex does not support that. > > > > On Tue, Sep 17, 2019 at 11:41 PM Mohammad Hassan via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > Mark is right. The functionality of AMR does not relate to > parallelization of that. The vector size (global or local) does not > conflict with AMR functions. 
> > Thanks > > > > Amir > > > > *From:* Matthew Knepley [mailto:knepley at gmail.com] > *Sent:* Wednesday, September 18, 2019 12:59 AM > *To:* Mohammad Hassan > *Cc:* PETSc > *Subject:* Re: [petsc-users] DMPlex Distribution > > > > On Tue, Sep 17, 2019 at 12:03 PM Mohammad Hassan < > mhbaghaei at mail.sjtu.edu.cn> wrote: > > Thanks for suggestion. I am going to use a block-based amr. I think I need > to know exactly the mesh distribution of blocks across different processors > for implementation of amr. > > > > Hi Amir, > > > > How are you using Plex if the block-AMR is coming from somewhere else? > This will help > > me tell you what would be best. > > > > And as a general question, can we set block size of vector on each rank? > > > > I think as Mark says that you are using "blocksize" is a different way > than PETSc. > > > > Thanks, > > > > Matt > > > > Thanks > > Amir > > > > *From:* Matthew Knepley [mailto:knepley at gmail.com] > *Sent:* Tuesday, September 17, 2019 11:04 PM > *To:* Mohammad Hassan > *Cc:* PETSc > *Subject:* Re: [petsc-users] DMPlex Distribution > > > > On Tue, Sep 17, 2019 at 9:27 AM Mohammad Hassan via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > Hi > > I am using DMPlexCreateFromDAG() to construct my DM. Is it possible to set > the distribution across processors manually. I mean, how can I set the > share of dm on each rank (local)? > > > > You could make a Shell partitioner and tell it the entire partition: > > > > > https://www.mcs.anl.gov/petsc/petsc-master/docs/manualpages/DMPLEX/PetscPartitionerShellSetPartition.html > > > > However, I would be surprised if you could do this. It is likely that you > just want to mess with the weights in ParMetis. > > > > Thanks, > > > > Matt > > > > Thanks > > Amir > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From epscodes at gmail.com Wed Sep 18 09:15:58 2019 From: epscodes at gmail.com (Xiangdong) Date: Wed, 18 Sep 2019 10:15:58 -0400 Subject: [petsc-users] MKL_PARDISO question Message-ID: Hello everyone, >From here, https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MATSOLVERMKL_PARDISO.html It seems that MKL_PARDISO only works for seqaij. I am curious that whether one can use mkl_pardiso in petsc with multi-thread. Is there any reason that MKL_PARDISO is not listed in the linear solver table? https://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html Thank you. Best, Xiangdong -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mhbaghaei at mail.sjtu.edu.cn Wed Sep 18 09:26:52 2019 From: mhbaghaei at mail.sjtu.edu.cn (Mohammad Hassan) Date: Wed, 18 Sep 2019 22:26:52 +0800 Subject: [petsc-users] DMPlex Distribution In-Reply-To: References: <001001d56d5b$b0a40b50$11ec21f0$@mail.sjtu.edu.cn> <003701d56d71$7a13d5f0$6e3b81d0$@mail.sjtu.edu.cn> <000801d56dd2$f81ae6d0$e850b470$@mail.sjtu.edu.cn> <001f01d56e25$f55b14d0$e0113e70$@mail.sjtu.edu.cn> Message-ID: <001001d56e2d$1eab8ac0$5c02a040$@mail.sjtu.edu.cn> I want to implement block-based AMR, which turns my base conformal mesh to non-conformal. My question is how DMPlex renders a mesh that it cannot support non-conformal meshes. If DMPlex does not work, I will try to use DMForest. From: Matthew Knepley [mailto:knepley at gmail.com] Sent: Wednesday, September 18, 2019 9:50 PM To: Mohammad Hassan Cc: Mark Adams ; PETSc Subject: Re: [petsc-users] DMPlex Distribution On Wed, Sep 18, 2019 at 9:35 AM Mohammad Hassan via petsc-users > wrote: If DMPlex does not support, I may need to use PARAMESH or CHOMBO. Is there any way that we can construct non-conformal layout for DM in petsc? Lets see. Plex does support geometrically non-conforming meshes. This is how we support p4est. However, if you want that, you can just use DMForest I think. So you jsut want structured AMR? Thanks, Matt From: Mark Adams [mailto: mfadams at lbl.gov] Sent: Wednesday, September 18, 2019 9:23 PM To: Mohammad Hassan < mhbaghaei at mail.sjtu.edu.cn> Cc: Matthew Knepley < knepley at gmail.com>; PETSc users list < petsc-users at mcs.anl.gov> Subject: Re: [petsc-users] DMPlex Distribution I'm puzzled. It sounds like you are doing non-conforming AMR (structured block AMR), but Plex does not support that. On Tue, Sep 17, 2019 at 11:41 PM Mohammad Hassan via petsc-users > wrote: Mark is right. The functionality of AMR does not relate to parallelization of that. The vector size (global or local) does not conflict with AMR functions. Thanks Amir From: Matthew Knepley [mailto: knepley at gmail.com] Sent: Wednesday, September 18, 2019 12:59 AM To: Mohammad Hassan < mhbaghaei at mail.sjtu.edu.cn> Cc: PETSc < petsc-maint at mcs.anl.gov> Subject: Re: [petsc-users] DMPlex Distribution On Tue, Sep 17, 2019 at 12:03 PM Mohammad Hassan > wrote: Thanks for suggestion. I am going to use a block-based amr. I think I need to know exactly the mesh distribution of blocks across different processors for implementation of amr. Hi Amir, How are you using Plex if the block-AMR is coming from somewhere else? This will help me tell you what would be best. And as a general question, can we set block size of vector on each rank? I think as Mark says that you are using "blocksize" is a different way than PETSc. Thanks, Matt Thanks Amir From: Matthew Knepley [mailto: knepley at gmail.com] Sent: Tuesday, September 17, 2019 11:04 PM To: Mohammad Hassan < mhbaghaei at mail.sjtu.edu.cn> Cc: PETSc < petsc-users at mcs.anl.gov> Subject: Re: [petsc-users] DMPlex Distribution On Tue, Sep 17, 2019 at 9:27 AM Mohammad Hassan via petsc-users > wrote: Hi I am using DMPlexCreateFromDAG() to construct my DM. Is it possible to set the distribution across processors manually. I mean, how can I set the share of dm on each rank (local)? You could make a Shell partitioner and tell it the entire partition: https://www.mcs.anl.gov/petsc/petsc-master/docs/manualpages/DMPLEX/PetscPartitionerShellSetPartition.html However, I would be surprised if you could do this. It is likely that you just want to mess with the weights in ParMetis. 
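For completeness, a rough sketch of the Shell partitioner route follows; the sizes and points arrays are made-up example data for 2 ranks and 8 cells, and the whole thing assumes the not-yet-distributed mesh lives on rank 0.

   #include <petscdmplex.h>

   /* Prescribe an explicit partition of a DMPlex before distributing it. */
   PetscErrorCode DistributeWithManualPartition(DM dm, DM *dmDist)
   {
     PetscPartitioner part;
     const PetscInt   sizes[2]  = {4,4};              /* number of cells given to rank 0 and rank 1 */
     const PetscInt   points[8] = {0,1,2,3,4,5,6,7};  /* cell numbers listed rank by rank */
     PetscErrorCode   ierr;

     PetscFunctionBeginUser;
     ierr = DMPlexGetPartitioner(dm,&part);CHKERRQ(ierr);
     ierr = PetscPartitionerSetType(part,PETSCPARTITIONERSHELL);CHKERRQ(ierr);
     ierr = PetscPartitionerShellSetPartition(part,2,sizes,points);CHKERRQ(ierr);
     ierr = DMPlexDistribute(dm,0,NULL,dmDist);CHKERRQ(ierr);
     PetscFunctionReturn(0);
   }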
Thanks, Matt Thanks Amir -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Sep 18 09:35:12 2019 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 18 Sep 2019 10:35:12 -0400 Subject: [petsc-users] DMPlex Distribution In-Reply-To: <001001d56e2d$1eab8ac0$5c02a040$@mail.sjtu.edu.cn> References: <001001d56d5b$b0a40b50$11ec21f0$@mail.sjtu.edu.cn> <003701d56d71$7a13d5f0$6e3b81d0$@mail.sjtu.edu.cn> <000801d56dd2$f81ae6d0$e850b470$@mail.sjtu.edu.cn> <001f01d56e25$f55b14d0$e0113e70$@mail.sjtu.edu.cn> <001001d56e2d$1eab8ac0$5c02a040$@mail.sjtu.edu.cn> Message-ID: On Wed, Sep 18, 2019 at 10:27 AM Mohammad Hassan wrote: > I want to implement block-based AMR, which turns my base conformal mesh to > non-conformal. My question is how DMPlex renders a mesh that it cannot > support non-conformal meshes. > Mark misspoke. Plex _does_ support geometrically non-conforming meshing, e.g. "hanging nodes". The easiest way to use Plex this way is to use DMForest, which uses Plex underneath. There are excellent p4est tutorials. What you would do is create your conformal mesh, using Plex if you want, and use that for the p4est base mesh (you would have the base mesh be the forest roots). Thanks, Matt > If DMPlex does not work, I will try to use DMForest. > > > > *From:* Matthew Knepley [mailto:knepley at gmail.com] > *Sent:* Wednesday, September 18, 2019 9:50 PM > *To:* Mohammad Hassan > *Cc:* Mark Adams ; PETSc > *Subject:* Re: [petsc-users] DMPlex Distribution > > > > On Wed, Sep 18, 2019 at 9:35 AM Mohammad Hassan via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > If DMPlex does not support, I may need to use PARAMESH or CHOMBO. Is there > any way that we can construct non-conformal layout for DM in petsc? > > > > Lets see. Plex does support geometrically non-conforming meshes. This is > how we support p4est. However, if > > you want that, you can just use DMForest I think. So you jsut want > structured AMR? > > > > Thanks, > > > > Matt > > > > > > *From:* Mark Adams [mailto:mfadams at lbl.gov] > *Sent:* Wednesday, September 18, 2019 9:23 PM > *To:* Mohammad Hassan > *Cc:* Matthew Knepley ; PETSc users list < > petsc-users at mcs.anl.gov> > *Subject:* Re: [petsc-users] DMPlex Distribution > > > > I'm puzzled. It sounds like you are doing non-conforming AMR (structured > block AMR), but Plex does not support that. > > > > On Tue, Sep 17, 2019 at 11:41 PM Mohammad Hassan via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > Mark is right. The functionality of AMR does not relate to > parallelization of that. The vector size (global or local) does not > conflict with AMR functions. 
> > Thanks > > > > Amir > > > > *From:* Matthew Knepley [mailto:knepley at gmail.com] > *Sent:* Wednesday, September 18, 2019 12:59 AM > *To:* Mohammad Hassan > *Cc:* PETSc > *Subject:* Re: [petsc-users] DMPlex Distribution > > > > On Tue, Sep 17, 2019 at 12:03 PM Mohammad Hassan < > mhbaghaei at mail.sjtu.edu.cn> wrote: > > Thanks for suggestion. I am going to use a block-based amr. I think I need > to know exactly the mesh distribution of blocks across different processors > for implementation of amr. > > > > Hi Amir, > > > > How are you using Plex if the block-AMR is coming from somewhere else? > This will help > > me tell you what would be best. > > > > And as a general question, can we set block size of vector on each rank? > > > > I think as Mark says that you are using "blocksize" is a different way > than PETSc. > > > > Thanks, > > > > Matt > > > > Thanks > > Amir > > > > *From:* Matthew Knepley [mailto:knepley at gmail.com] > *Sent:* Tuesday, September 17, 2019 11:04 PM > *To:* Mohammad Hassan > *Cc:* PETSc > *Subject:* Re: [petsc-users] DMPlex Distribution > > > > On Tue, Sep 17, 2019 at 9:27 AM Mohammad Hassan via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > Hi > > I am using DMPlexCreateFromDAG() to construct my DM. Is it possible to set > the distribution across processors manually. I mean, how can I set the > share of dm on each rank (local)? > > > > You could make a Shell partitioner and tell it the entire partition: > > > > > https://www.mcs.anl.gov/petsc/petsc-master/docs/manualpages/DMPLEX/PetscPartitionerShellSetPartition.html > > > > However, I would be surprised if you could do this. It is likely that you > just want to mess with the weights in ParMetis. > > > > Thanks, > > > > Matt > > > > Thanks > > Amir > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mhbaghaei at mail.sjtu.edu.cn Wed Sep 18 10:01:57 2019 From: mhbaghaei at mail.sjtu.edu.cn (Mohammad Hassan) Date: Wed, 18 Sep 2019 23:01:57 +0800 Subject: [petsc-users] DMPlex Distribution In-Reply-To: References: <001001d56d5b$b0a40b50$11ec21f0$@mail.sjtu.edu.cn> <003701d56d71$7a13d5f0$6e3b81d0$@mail.sjtu.edu.cn> <000801d56dd2$f81ae6d0$e850b470$@mail.sjtu.edu.cn> <001f01d56e25$f55b14d0$e0113e70$@mail.sjtu.edu.cn> <001001d56e2d$1eab8ac0$5c02a040$@mail.sjtu.edu.cn> Message-ID: <002701d56e32$04985550$0dc8fff0$@mail.sjtu.edu.cn> Thanks for your suggestion, Matthew. I will certainly look into DMForest for refining of my base DMPlex dm. 
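As a starting point for that, a minimal sketch of building a DMForest on top of an existing conformal DMPlex might look like the following (it assumes a 3d mesh and a PETSc build configured with p4est; the function name is a placeholder).

   #include <petscdmforest.h>

   /* Wrap an existing conformal DMPlex as the base (root) mesh of a p4est forest. */
   PetscErrorCode ForestFromPlexBase(DM plexBase, PetscInt nrefine, DM *forest)
   {
     PetscErrorCode ierr;

     PetscFunctionBeginUser;
     ierr = DMCreate(PetscObjectComm((PetscObject)plexBase),forest);CHKERRQ(ierr);
     ierr = DMSetType(*forest,DMP8EST);CHKERRQ(ierr);              /* DMP4EST for a 2d mesh */
     ierr = DMForestSetBaseDM(*forest,plexBase);CHKERRQ(ierr);
     ierr = DMForestSetInitialRefinement(*forest,nrefine);CHKERRQ(ierr);
     ierr = DMSetFromOptions(*forest);CHKERRQ(ierr);
     ierr = DMSetUp(*forest);CHKERRQ(ierr);
     PetscFunctionReturn(0);
   }

If Plex-style access to the refined, now non-conformal mesh is needed afterwards, the forest can be converted back with DMConvert() to a DMPLEX.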
From: Matthew Knepley [mailto:knepley at gmail.com] Sent: Wednesday, September 18, 2019 10:35 PM To: Mohammad Hassan Cc: PETSc Subject: Re: [petsc-users] DMPlex Distribution On Wed, Sep 18, 2019 at 10:27 AM Mohammad Hassan > wrote: I want to implement block-based AMR, which turns my base conformal mesh to non-conformal. My question is how DMPlex renders a mesh that it cannot support non-conformal meshes. Mark misspoke. Plex _does_ support geometrically non-conforming meshing, e.g. "hanging nodes". The easiest way to use Plex this way is to use DMForest, which uses Plex underneath. There are excellent p4est tutorials. What you would do is create your conformal mesh, using Plex if you want, and use that for the p4est base mesh (you would have the base mesh be the forest roots). Thanks, Matt If DMPlex does not work, I will try to use DMForest. From: Matthew Knepley [mailto:knepley at gmail.com ] Sent: Wednesday, September 18, 2019 9:50 PM To: Mohammad Hassan > Cc: Mark Adams >; PETSc > Subject: Re: [petsc-users] DMPlex Distribution On Wed, Sep 18, 2019 at 9:35 AM Mohammad Hassan via petsc-users > wrote: If DMPlex does not support, I may need to use PARAMESH or CHOMBO. Is there any way that we can construct non-conformal layout for DM in petsc? Lets see. Plex does support geometrically non-conforming meshes. This is how we support p4est. However, if you want that, you can just use DMForest I think. So you jsut want structured AMR? Thanks, Matt From: Mark Adams [mailto: mfadams at lbl.gov] Sent: Wednesday, September 18, 2019 9:23 PM To: Mohammad Hassan < mhbaghaei at mail.sjtu.edu.cn> Cc: Matthew Knepley < knepley at gmail.com>; PETSc users list < petsc-users at mcs.anl.gov> Subject: Re: [petsc-users] DMPlex Distribution I'm puzzled. It sounds like you are doing non-conforming AMR (structured block AMR), but Plex does not support that. On Tue, Sep 17, 2019 at 11:41 PM Mohammad Hassan via petsc-users > wrote: Mark is right. The functionality of AMR does not relate to parallelization of that. The vector size (global or local) does not conflict with AMR functions. Thanks Amir From: Matthew Knepley [mailto: knepley at gmail.com] Sent: Wednesday, September 18, 2019 12:59 AM To: Mohammad Hassan < mhbaghaei at mail.sjtu.edu.cn> Cc: PETSc < petsc-maint at mcs.anl.gov> Subject: Re: [petsc-users] DMPlex Distribution On Tue, Sep 17, 2019 at 12:03 PM Mohammad Hassan > wrote: Thanks for suggestion. I am going to use a block-based amr. I think I need to know exactly the mesh distribution of blocks across different processors for implementation of amr. Hi Amir, How are you using Plex if the block-AMR is coming from somewhere else? This will help me tell you what would be best. And as a general question, can we set block size of vector on each rank? I think as Mark says that you are using "blocksize" is a different way than PETSc. Thanks, Matt Thanks Amir From: Matthew Knepley [mailto: knepley at gmail.com] Sent: Tuesday, September 17, 2019 11:04 PM To: Mohammad Hassan < mhbaghaei at mail.sjtu.edu.cn> Cc: PETSc < petsc-users at mcs.anl.gov> Subject: Re: [petsc-users] DMPlex Distribution On Tue, Sep 17, 2019 at 9:27 AM Mohammad Hassan via petsc-users > wrote: Hi I am using DMPlexCreateFromDAG() to construct my DM. Is it possible to set the distribution across processors manually. I mean, how can I set the share of dm on each rank (local)? 
You could make a Shell partitioner and tell it the entire partition: https://www.mcs.anl.gov/petsc/petsc-master/docs/manualpages/DMPLEX/PetscPartitionerShellSetPartition.html However, I would be surprised if you could do this. It is likely that you just want to mess with the weights in ParMetis. Thanks, Matt Thanks Amir -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mlohry at gmail.com Wed Sep 18 12:25:47 2019 From: mlohry at gmail.com (Mark Lohry) Date: Wed, 18 Sep 2019 13:25:47 -0400 Subject: [petsc-users] Strange Partition in PETSc 3.11 version on some computers In-Reply-To: References: <9c36fe1c-e6c2-0278-864e-f1453687d3f9@gmail.com> <5217DF9F-E42D-4E1F-AE9B-1088954548BB@anl.gov> <4ffe88c2-354c-07d7-ab1e-0f1edd8ec3c3@gmail.com> <89d9f65e-c185-0f6f-cd27-d372303cecc3@gmail.com> Message-ID: Mark, > The machine, compiler and MPI version should not matter. I might have missed something earlier in the thread, but parmetis has a dependency on the machine's glibc srand, and it can (and does) create different partitions with different srand versions. The same mesh on the same code on the same process count can and will give different partitions (possibly bad ones) on different machines. On Tue, Sep 17, 2019 at 1:05 PM Mark Adams via petsc-users < petsc-users at mcs.anl.gov> wrote: > > > On Tue, Sep 17, 2019 at 12:53 PM Danyang Su wrote: > >> Hi Mark, >> >> Thanks for your follow-up. >> >> The unstructured grid code has been verified and there is no problem in >> the results. The convergence rate is also good. The 3D mesh is not good, it >> is based on the original stratum which I haven't refined, but good for >> initial test as it is relative small and the results obtained from this >> mesh still makes sense. >> >> The 2D meshes are just for testing purpose as I want to reproduce the >> partition problem on a cluster using PETSc3.11.3 and Intel2019. >> Unfortunately, I didn't find problem using this example. >> >> The code has no problem in using different PETSc versions (PETSc V3.4 to >> V3.11) >> > OK, it is the same code. I thought I saw something about your code > changing. > > Just to be clear, v3.11 never gives you good partitions. It is not just a > problem on this Intel cluster. > > The machine, compiler and MPI version should not matter. > > >> and MPI distribution (MPICH, OpenMPI, IntelMPI), except for one >> simulation case (the mesh I attached) on a cluster with PETSc3.11.3 and >> Intel2019u4 due to the very different partition compared to PETSc3.9.3. 
Yet >> the simulation results are the same except for the efficiency problem >> because the strange partition results into much more communication (ghost >> nodes). >> >> I am still trying different compiler and mpi with PETSc3.11.3 on that >> cluster to trace the problem. Will get back to you guys when there is >> update. >> > > This is very strange. You might want to use 'git bisect'. You set a good > and a bad SHA1 (we can give you this for 3.9 and 3.11 and the exact > commands). The git will go to a version in the middle. You then > reconfigure, remake, rebuild your code, run your test. Git will ask you, as > I recall, if the version is good or bad. Once you get this workflow going > it is not too bad, depending on how hard this loop is of course. > > >> Thanks, >> >> danyang >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Sep 18 12:56:50 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Wed, 18 Sep 2019 17:56:50 +0000 Subject: [petsc-users] Strange Partition in PETSc 3.11 version on some computers In-Reply-To: References: <9c36fe1c-e6c2-0278-864e-f1453687d3f9@gmail.com> <5217DF9F-E42D-4E1F-AE9B-1088954548BB@anl.gov> <4ffe88c2-354c-07d7-ab1e-0f1edd8ec3c3@gmail.com> <89d9f65e-c185-0f6f-cd27-d372303cecc3@gmail.com> Message-ID: <389E5FDB-FD86-4B39-B18C-67332CDE3A76@anl.gov> > On Sep 18, 2019, at 12:25 PM, Mark Lohry via petsc-users wrote: > > Mark, > Mark, Good point. This has been a big headache forever Note that this has been "fixed" in the master version of PETSc and will be in its next release. If you use --download-parmetis in the future it will use the same random numbers on all machines and thus should produce the same partitions on all machines. I think that metis has aways used the same random numbers and all machines and thus always produced the same results. Barry > The machine, compiler and MPI version should not matter. > > I might have missed something earlier in the thread, but parmetis has a dependency on the machine's glibc srand, and it can (and does) create different partitions with different srand versions. The same mesh on the same code on the same process count can and will give different partitions (possibly bad ones) on different machines. > > On Tue, Sep 17, 2019 at 1:05 PM Mark Adams via petsc-users wrote: > > > On Tue, Sep 17, 2019 at 12:53 PM Danyang Su wrote: > Hi Mark, > > Thanks for your follow-up. > > The unstructured grid code has been verified and there is no problem in the results. The convergence rate is also good. The 3D mesh is not good, it is based on the original stratum which I haven't refined, but good for initial test as it is relative small and the results obtained from this mesh still makes sense. > > The 2D meshes are just for testing purpose as I want to reproduce the partition problem on a cluster using PETSc3.11.3 and Intel2019. Unfortunately, I didn't find problem using this example. > > The code has no problem in using different PETSc versions (PETSc V3.4 to V3.11) > > OK, it is the same code. I thought I saw something about your code changing. > > Just to be clear, v3.11 never gives you good partitions. It is not just a problem on this Intel cluster. > > The machine, compiler and MPI version should not matter. > > and MPI distribution (MPICH, OpenMPI, IntelMPI), except for one simulation case (the mesh I attached) on a cluster with PETSc3.11.3 and Intel2019u4 due to the very different partition compared to PETSc3.9.3. 
Yet the simulation results are the same except for the efficiency problem because the strange partition results into much more communication (ghost nodes). > > I am still trying different compiler and mpi with PETSc3.11.3 on that cluster to trace the problem. Will get back to you guys when there is update. > > > This is very strange. You might want to use 'git bisect'. You set a good and a bad SHA1 (we can give you this for 3.9 and 3.11 and the exact commands). The git will go to a version in the middle. You then reconfigure, remake, rebuild your code, run your test. Git will ask you, as I recall, if the version is good or bad. Once you get this workflow going it is not too bad, depending on how hard this loop is of course. > > Thanks, > > danyang > From bsmith at mcs.anl.gov Wed Sep 18 13:38:55 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Wed, 18 Sep 2019 18:38:55 +0000 Subject: [petsc-users] MKL_PARDISO question In-Reply-To: References: Message-ID: <37B64A51-5A23-4897-B766-C6460D32B156@anl.gov> > On Sep 18, 2019, at 9:15 AM, Xiangdong via petsc-users wrote: > > Hello everyone, > > From here, > https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MATSOLVERMKL_PARDISO.html > > It seems thatMKL_PARDISO only works for seqaij. I am curious that whether one can use mkl_pardiso in petsc with multi-thread. You can use mkl_pardiso for multi-threaded and mkl_cpardiso for MPI parallelism. In both cases you must use the master branch of PETSc (or the next release of PETSc) to do this this easily. Note that when you use mkl_pardiso with multiple threads the matrix is coming from a single MPI process (or the single program if not running with MPI). So it is not MPI parallel that matches the rest of the parallelism with PETSc. So one much be a little careful: for example if one has 4 cores and uses them all with mpiexec -n 4 and then uses mkl_pardiso with 4 threads (each) then you have 16 threads fighting over 4 cores. So you need to select the number of MPI processes and number of threads wisely. > > Is there any reason that MKL_PARDISO is not listed in the linear solver table? > https://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html > Just an oversight, thanks for letting us know, I have added it. > Thank you. > > Best, > Xiangdong From danyang.su at gmail.com Wed Sep 18 13:44:38 2019 From: danyang.su at gmail.com (Danyang Su) Date: Wed, 18 Sep 2019 11:44:38 -0700 Subject: [petsc-users] Strange Partition in PETSc 3.11 version on some computers In-Reply-To: <389E5FDB-FD86-4B39-B18C-67332CDE3A76@anl.gov> References: <9c36fe1c-e6c2-0278-864e-f1453687d3f9@gmail.com> <5217DF9F-E42D-4E1F-AE9B-1088954548BB@anl.gov> <4ffe88c2-354c-07d7-ab1e-0f1edd8ec3c3@gmail.com> <89d9f65e-c185-0f6f-cd27-d372303cecc3@gmail.com> <389E5FDB-FD86-4B39-B18C-67332CDE3A76@anl.gov> Message-ID: On 2019-09-18 10:56 a.m., Smith, Barry F. via petsc-users wrote: > >> On Sep 18, 2019, at 12:25 PM, Mark Lohry via petsc-users wrote: >> >> Mark, >> > Mark, > > Good point. This has been a big headache forever > > Note that this has been "fixed" in the master version of PETSc and will be in its next release. If you use --download-parmetis in the future it will use the same random numbers on all machines and thus should produce the same partitions on all machines. > > I think that metis has aways used the same random numbers and all machines and thus always produced the same results. > > Barry Good to know this. I will the same configuration that causes strange partition problem to test the next version. 
Thanks, Danyang > > >> The machine, compiler and MPI version should not matter. >> >> I might have missed something earlier in the thread, but parmetis has a dependency on the machine's glibc srand, and it can (and does) create different partitions with different srand versions. The same mesh on the same code on the same process count can and will give different partitions (possibly bad ones) on different machines. >> >> On Tue, Sep 17, 2019 at 1:05 PM Mark Adams via petsc-users wrote: >> >> >> On Tue, Sep 17, 2019 at 12:53 PM Danyang Su wrote: >> Hi Mark, >> >> Thanks for your follow-up. >> >> The unstructured grid code has been verified and there is no problem in the results. The convergence rate is also good. The 3D mesh is not good, it is based on the original stratum which I haven't refined, but good for initial test as it is relative small and the results obtained from this mesh still makes sense. >> >> The 2D meshes are just for testing purpose as I want to reproduce the partition problem on a cluster using PETSc3.11.3 and Intel2019. Unfortunately, I didn't find problem using this example. >> >> The code has no problem in using different PETSc versions (PETSc V3.4 to V3.11) >> >> OK, it is the same code. I thought I saw something about your code changing. >> >> Just to be clear, v3.11 never gives you good partitions. It is not just a problem on this Intel cluster. >> >> The machine, compiler and MPI version should not matter. >> >> and MPI distribution (MPICH, OpenMPI, IntelMPI), except for one simulation case (the mesh I attached) on a cluster with PETSc3.11.3 and Intel2019u4 due to the very different partition compared to PETSc3.9.3. Yet the simulation results are the same except for the efficiency problem because the strange partition results into much more communication (ghost nodes). >> >> I am still trying different compiler and mpi with PETSc3.11.3 on that cluster to trace the problem. Will get back to you guys when there is update. >> >> >> This is very strange. You might want to use 'git bisect'. You set a good and a bad SHA1 (we can give you this for 3.9 and 3.11 and the exact commands). The git will go to a version in the middle. You then reconfigure, remake, rebuild your code, run your test. Git will ask you, as I recall, if the version is good or bad. Once you get this workflow going it is not too bad, depending on how hard this loop is of course. >> >> Thanks, >> >> danyang >> From mpovolot at gmail.com Wed Sep 18 16:31:39 2019 From: mpovolot at gmail.com (Michael Povolotskyi) Date: Wed, 18 Sep 2019 17:31:39 -0400 Subject: [petsc-users] question MatCreateRedundantMatrix Message-ID: <82c15290-a8c9-84d9-bec5-e910574575a7@gmail.com> Dear Petsc developers, I found that MatCreateRedundantMatrix does not support dense matrices. This causes the following problem: I cannot use CISS eigensolver from SLEPC with dense matrices with parallelization over quadrature points. Is it possible for you to add this support? Thank you, Michael. From mpovolot at purdue.edu Wed Sep 18 17:40:20 2019 From: mpovolot at purdue.edu (Povolotskyi, Mykhailo) Date: Wed, 18 Sep 2019 22:40:20 +0000 Subject: [petsc-users] question about MatCreateRedundantMatrix Message-ID: <52131a99-8f0e-6e94-88a7-cb637d5053e2@purdue.edu> Dear Petsc developers, I found that MatCreateRedundantMatrix does not support dense matrices. This causes the following problem: I cannot use CISS eigensolver from SLEPC with dense matrices with parallelization over quadrature points. Is it possible for you to add this support? 
Thank you, Michael. p.s. I apologize if you received this e-mail twice, I sent if first from a different address. From epscodes at gmail.com Wed Sep 18 21:40:18 2019 From: epscodes at gmail.com (Xiangdong) Date: Wed, 18 Sep 2019 22:40:18 -0400 Subject: [petsc-users] MKL_PARDISO question In-Reply-To: <37B64A51-5A23-4897-B766-C6460D32B156@anl.gov> References: <37B64A51-5A23-4897-B766-C6460D32B156@anl.gov> Message-ID: Thank you very much for your information. I pulled the master branch but got the error when configuring it. When I run configure without mkl_cpardiso (configure.log_nocpardiso): ./configure PETSC_ARCH=arch-debug --with-debugging=1 --with-mpi-dir=$MPI_ROOT --with-blaslapack-dir=${MKL_ROOT} , it works fine. However, when I add mkl_cpardiso (configure.log_withcpardiso): ./configure PETSC_ARCH=arch-debug --with-debugging=1 --with-mpi-dir=$MPI_ROOT -with-blaslapack-dir=${MKL_ROOT} --with-mkl_cpardiso-dir=${MKL_ROOT} , it complains about "Could not find a functional BLAS.", but the blas was provided through mkl as same as previous configuration. Can you help me on the configuration? Thank you. Xiangdong On Wed, Sep 18, 2019 at 2:39 PM Smith, Barry F. wrote: > > > > On Sep 18, 2019, at 9:15 AM, Xiangdong via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > > > Hello everyone, > > > > From here, > > > https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MATSOLVERMKL_PARDISO.html > > > > It seems thatMKL_PARDISO only works for seqaij. I am curious that > whether one can use mkl_pardiso in petsc with multi-thread. > > You can use mkl_pardiso for multi-threaded and mkl_cpardiso for MPI > parallelism. > > In both cases you must use the master branch of PETSc (or the next > release of PETSc) to do this this easily. > > Note that when you use mkl_pardiso with multiple threads the matrix is > coming from a single MPI process (or the single program if not running with > MPI). So it is not MPI parallel that matches the rest of the parallelism > with PETSc. So one much be a little careful: for example if one has 4 cores > and uses them all with mpiexec -n 4 and then uses mkl_pardiso with 4 > threads (each) then you have 16 threads fighting over 4 cores. So you need > to select the number of MPI processes and number of threads wisely. > > > > > Is there any reason that MKL_PARDISO is not listed in the linear solver > table? > > https://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html > > > > Just an oversight, thanks for letting us know, I have added it. > > > > Thank you. > > > > Best, > > Xiangdong > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log_nocpardiso Type: application/octet-stream Size: 1021259 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log_withcpardiso Type: application/octet-stream Size: 1790213 bytes Desc: not available URL: From bsmith at mcs.anl.gov Wed Sep 18 22:46:33 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Thu, 19 Sep 2019 03:46:33 +0000 Subject: [petsc-users] MKL_PARDISO question In-Reply-To: References: <37B64A51-5A23-4897-B766-C6460D32B156@anl.gov> Message-ID: <9804B796-C7E0-4D19-BB9F-9B289D27A58F@mcs.anl.gov> This is easy thanks to the additional debugging I added recently. Your install of MKL does not have CPardiso support. 
When you install MKL you have to make sure you select the "extra" cluster option, otherwise it doesn't install some of the library. I only learned this myself recently from another PETSc user. So please try again after you install the full MKL and send configure.log if it fails (by the way,. just use --with-mkl_cpardiso not --with-mkl_cpardiso-dir since it always has to find the CPardiso in the MKL BLAS/Lapack directory). After you install the full MKL you will see that directory also has files with *blacs* in them. Barry Executing: ls /home/epscodes/MyLocal/intel/mkl/lib/intel64 stdout: libmkl_avx2.so libmkl_avx512_mic.so libmkl_avx512.so libmkl_avx.so libmkl_blas95_ilp64.a libmkl_blas95_lp64.a libmkl_core.a libmkl_core.so libmkl_def.so libmkl_gf_ilp64.a libmkl_gf_ilp64.so libmkl_gf_lp64.a libmkl_gf_lp64.so libmkl_gnu_thread.a libmkl_gnu_thread.so libmkl_intel_ilp64.a libmkl_intel_ilp64.so libmkl_intel_lp64.a libmkl_intel_lp64.so libmkl_intel_thread.a libmkl_intel_thread.so libmkl_lapack95_ilp64.a libmkl_lapack95_lp64.a libmkl_mc3.so libmkl_mc.so libmkl_rt.so libmkl_sequential.a libmkl_sequential.so libmkl_tbb_thread.a libmkl_tbb_thread.so libmkl_vml_avx2.so libmkl_vml_avx512_mic.so libmkl_vml_avx512.so libmkl_vml_avx.so libmkl_vml_cmpt.so libmkl_vml_def.so libmkl_vml_mc2.so libmkl_vml_mc3.so libmkl_vml_mc.so > On Sep 18, 2019, at 9:40 PM, Xiangdong wrote: > > Thank you very much for your information. I pulled the master branch but got the error when configuring it. > > When I run configure without mkl_cpardiso (configure.log_nocpardiso): ./configure PETSC_ARCH=arch-debug --with-debugging=1 --with-mpi-dir=$MPI_ROOT --with-blaslapack-dir=${MKL_ROOT} , it works fine. > > However, when I add mkl_cpardiso (configure.log_withcpardiso): ./configure PETSC_ARCH=arch-debug --with-debugging=1 --with-mpi-dir=$MPI_ROOT -with-blaslapack-dir=${MKL_ROOT} --with-mkl_cpardiso-dir=${MKL_ROOT} , it complains about "Could not find a functional BLAS.", but the blas was provided through mkl as same as previous configuration. > > Can you help me on the configuration? Thank you. > > Xiangdong > > On Wed, Sep 18, 2019 at 2:39 PM Smith, Barry F. wrote: > > > > On Sep 18, 2019, at 9:15 AM, Xiangdong via petsc-users wrote: > > > > Hello everyone, > > > > From here, > > https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MATSOLVERMKL_PARDISO.html > > > > It seems thatMKL_PARDISO only works for seqaij. I am curious that whether one can use mkl_pardiso in petsc with multi-thread. > > You can use mkl_pardiso for multi-threaded and mkl_cpardiso for MPI parallelism. > > In both cases you must use the master branch of PETSc (or the next release of PETSc) to do this this easily. > > Note that when you use mkl_pardiso with multiple threads the matrix is coming from a single MPI process (or the single program if not running with MPI). So it is not MPI parallel that matches the rest of the parallelism with PETSc. So one much be a little careful: for example if one has 4 cores and uses them all with mpiexec -n 4 and then uses mkl_pardiso with 4 threads (each) then you have 16 threads fighting over 4 cores. So you need to select the number of MPI processes and number of threads wisely. > > > > > Is there any reason that MKL_PARDISO is not listed in the linear solver table? > > https://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html > > > > Just an oversight, thanks for letting us know, I have added it. > > > > Thank you. 
> > > > Best, > > Xiangdong > > From hong at aspiritech.org Wed Sep 18 23:20:32 2019 From: hong at aspiritech.org (hong at aspiritech.org) Date: Wed, 18 Sep 2019 23:20:32 -0500 Subject: [petsc-users] question about MatCreateRedundantMatrix In-Reply-To: <52131a99-8f0e-6e94-88a7-cb637d5053e2@purdue.edu> References: <52131a99-8f0e-6e94-88a7-cb637d5053e2@purdue.edu> Message-ID: Michael, We have support of MatCreateRedundantMatrix for dense matrices. For example, petsc/src/mat/examples/tests/ex9.c: mpiexec -n 4 ./ex9 -mat_type dense -view_mat -nsubcomms 2 Hong On Wed, Sep 18, 2019 at 5:40 PM Povolotskyi, Mykhailo via petsc-users < petsc-users at mcs.anl.gov> wrote: > Dear Petsc developers, > > I found that MatCreateRedundantMatrix does not support dense matrices. > > This causes the following problem: I cannot use CISS eigensolver from > SLEPC with dense matrices with parallelization over quadrature points. > > Is it possible for you to add this support? > > Thank you, > > Michael. > > > p.s. I apologize if you received this e-mail twice, I sent if first from > a different address. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Thu Sep 19 03:55:41 2019 From: jroman at dsic.upv.es (Jose E. Roman) Date: Thu, 19 Sep 2019 10:55:41 +0200 Subject: [petsc-users] question about MatCreateRedundantMatrix In-Reply-To: References: <52131a99-8f0e-6e94-88a7-cb637d5053e2@purdue.edu> Message-ID: <57ED352E-14B3-416B-8574-C04EBB09D797@dsic.upv.es> Michael, In my previous email I should have checked it better. The CISS solver works indeed with dense matrices: $ mpiexec -n 2 ./ex2 -n 30 -eps_type ciss -terse -rg_type ellipse -rg_ellipse_center 1.175 -rg_ellipse_radius 0.075 -eps_ciss_partitions 2 -mat_type dense 2-D Laplacian Eigenproblem, N=900 (30x30 grid) Solution method: ciss Number of requested eigenvalues: 1 Found 15 eigenvalues, all of them computed up to the required tolerance: 1.10416, 1.10416, 1.10455, 1.10455, 1.12947, 1.12947, 1.13426, 1.13426, 1.16015, 1.16015, 1.19338, 1.19338, 1.21093, 1.21093, 1.24413 There might be something different in the way matrices are initialized in your code. Send me a simple example that reproduces the problem and I will track it down. Sorry for the confusion. Jose > El 19 sept 2019, a las 6:20, hong--- via petsc-users escribi?: > > Michael, > We have support of MatCreateRedundantMatrix for dense matrices. For example, petsc/src/mat/examples/tests/ex9.c: > mpiexec -n 4 ./ex9 -mat_type dense -view_mat -nsubcomms 2 > > Hong > > On Wed, Sep 18, 2019 at 5:40 PM Povolotskyi, Mykhailo via petsc-users wrote: > Dear Petsc developers, > > I found that MatCreateRedundantMatrix does not support dense matrices. > > This causes the following problem: I cannot use CISS eigensolver from > SLEPC with dense matrices with parallelization over quadrature points. > > Is it possible for you to add this support? > > Thank you, > > Michael. > > > p.s. I apologize if you received this e-mail twice, I sent if first from > a different address. 
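(An aside for readers following this thread in the archive: the CISS option -eps_ciss_partitions relies on MatCreateRedundantMatrix() underneath, and a minimal sketch of that call, in C, looks like the following. It assumes a parallel matrix A already exists; the function name, nsubcomm, and the surrounding error handling are illustrative only, not taken from the thread.

   #include <petscmat.h>

   /* Sketch: copy a parallel (possibly dense) matrix A onto nsubcomm
      sub-communicators, which is what the CISS partitioning needs. */
   PetscErrorCode duplicate_on_subcomms(Mat A, PetscInt nsubcomm)
   {
     Mat            Ared;   /* redundant copy living on one sub-communicator */
     PetscErrorCode ierr;

     /* MPI_COMM_NULL asks PETSc to split the communicator of A into
        nsubcomm sub-communicators itself. */
     ierr = MatCreateRedundantMatrix(A, nsubcomm, MPI_COMM_NULL, MAT_INITIAL_MATRIX, &Ared);CHKERRQ(ierr);

     /* ... each sub-communicator now holds a full copy of A ... */

     ierr = MatDestroy(&Ared);CHKERRQ(ierr);
     return 0;
   }

As noted in the exchange above, dense input to this call only works with a sufficiently recent PETSc/SLEPc.)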
> From mfadams at lbl.gov Thu Sep 19 06:23:28 2019 From: mfadams at lbl.gov (Mark Adams) Date: Thu, 19 Sep 2019 07:23:28 -0400 Subject: [petsc-users] DMPlex Distribution In-Reply-To: <002701d56e32$04985550$0dc8fff0$@mail.sjtu.edu.cn> References: <001001d56d5b$b0a40b50$11ec21f0$@mail.sjtu.edu.cn> <003701d56d71$7a13d5f0$6e3b81d0$@mail.sjtu.edu.cn> <000801d56dd2$f81ae6d0$e850b470$@mail.sjtu.edu.cn> <001f01d56e25$f55b14d0$e0113e70$@mail.sjtu.edu.cn> <001001d56e2d$1eab8ac0$5c02a040$@mail.sjtu.edu.cn> <002701d56e32$04985550$0dc8fff0$@mail.sjtu.edu.cn> Message-ID: Note, Forest gives you individual elements at the leaves. Donna Calhoun, a former Chombo user, has developed a block structured solver on p4est ( https://math.boisestate.edu/~calhoun/ForestClaw/index.html), but I would imagine that you could just take the Plex that DMForest creates and just call DMRefine(...) on it to get a block structured AMR mesh. On Wed, Sep 18, 2019 at 11:02 AM Mohammad Hassan via petsc-users < petsc-users at mcs.anl.gov> wrote: > Thanks for your suggestion, Matthew. I will certainly look into DMForest > for refining of my base DMPlex dm. > > > > *From:* Matthew Knepley [mailto:knepley at gmail.com] > *Sent:* Wednesday, September 18, 2019 10:35 PM > *To:* Mohammad Hassan > *Cc:* PETSc > *Subject:* Re: [petsc-users] DMPlex Distribution > > > > On Wed, Sep 18, 2019 at 10:27 AM Mohammad Hassan < > mhbaghaei at mail.sjtu.edu.cn> wrote: > > I want to implement block-based AMR, which turns my base conformal mesh to > non-conformal. My question is how DMPlex renders a mesh that it cannot > support non-conformal meshes. > > > > Mark misspoke. Plex _does_ support geometrically non-conforming meshing, > e.g. "hanging nodes". The easiest way to > > use Plex this way is to use DMForest, which uses Plex underneath. > > > > There are excellent p4est tutorials. What you would do is create your > conformal mesh, using Plex if you want, and > > use that for the p4est base mesh (you would have the base mesh be the > forest roots). > > > > Thanks, > > > > Matt > > > > If DMPlex does not work, I will try to use DMForest. > > > > *From:* Matthew Knepley [mailto:knepley at gmail.com] > *Sent:* Wednesday, September 18, 2019 9:50 PM > *To:* Mohammad Hassan > *Cc:* Mark Adams ; PETSc > *Subject:* Re: [petsc-users] DMPlex Distribution > > > > On Wed, Sep 18, 2019 at 9:35 AM Mohammad Hassan via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > If DMPlex does not support, I may need to use PARAMESH or CHOMBO. Is there > any way that we can construct non-conformal layout for DM in petsc? > > > > Lets see. Plex does support geometrically non-conforming meshes. This is > how we support p4est. However, if > > you want that, you can just use DMForest I think. So you jsut want > structured AMR? > > > > Thanks, > > > > Matt > > > > > > *From:* Mark Adams [mailto:mfadams at lbl.gov] > *Sent:* Wednesday, September 18, 2019 9:23 PM > *To:* Mohammad Hassan > *Cc:* Matthew Knepley ; PETSc users list < > petsc-users at mcs.anl.gov> > *Subject:* Re: [petsc-users] DMPlex Distribution > > > > I'm puzzled. It sounds like you are doing non-conforming AMR (structured > block AMR), but Plex does not support that. > > > > On Tue, Sep 17, 2019 at 11:41 PM Mohammad Hassan via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > Mark is right. The functionality of AMR does not relate to > parallelization of that. The vector size (global or local) does not > conflict with AMR functions. 
> > Thanks > > > > Amir > > > > *From:* Matthew Knepley [mailto:knepley at gmail.com] > *Sent:* Wednesday, September 18, 2019 12:59 AM > *To:* Mohammad Hassan > *Cc:* PETSc > *Subject:* Re: [petsc-users] DMPlex Distribution > > > > On Tue, Sep 17, 2019 at 12:03 PM Mohammad Hassan < > mhbaghaei at mail.sjtu.edu.cn> wrote: > > Thanks for suggestion. I am going to use a block-based amr. I think I need > to know exactly the mesh distribution of blocks across different processors > for implementation of amr. > > > > Hi Amir, > > > > How are you using Plex if the block-AMR is coming from somewhere else? > This will help > > me tell you what would be best. > > > > And as a general question, can we set block size of vector on each rank? > > > > I think as Mark says that you are using "blocksize" is a different way > than PETSc. > > > > Thanks, > > > > Matt > > > > Thanks > > Amir > > > > *From:* Matthew Knepley [mailto:knepley at gmail.com] > *Sent:* Tuesday, September 17, 2019 11:04 PM > *To:* Mohammad Hassan > *Cc:* PETSc > *Subject:* Re: [petsc-users] DMPlex Distribution > > > > On Tue, Sep 17, 2019 at 9:27 AM Mohammad Hassan via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > Hi > > I am using DMPlexCreateFromDAG() to construct my DM. Is it possible to set > the distribution across processors manually. I mean, how can I set the > share of dm on each rank (local)? > > > > You could make a Shell partitioner and tell it the entire partition: > > > > > https://www.mcs.anl.gov/petsc/petsc-master/docs/manualpages/DMPLEX/PetscPartitionerShellSetPartition.html > > > > However, I would be surprised if you could do this. It is likely that you > just want to mess with the weights in ParMetis. > > > > Thanks, > > > > Matt > > > > Thanks > > Amir > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mhbaghaei at mail.sjtu.edu.cn Thu Sep 19 07:15:55 2019 From: mhbaghaei at mail.sjtu.edu.cn (Mohammad Hassan) Date: Thu, 19 Sep 2019 20:15:55 +0800 Subject: [petsc-users] DMPlex Distribution In-Reply-To: References: <001001d56d5b$b0a40b50$11ec21f0$@mail.sjtu.edu.cn> <003701d56d71$7a13d5f0$6e3b81d0$@mail.sjtu.edu.cn> <000801d56dd2$f81ae6d0$e850b470$@mail.sjtu.edu.cn> <001f01d56e25$f55b14d0$e0113e70$@mail.sjtu.edu.cn> <001001d56e2d$1eab8ac0$5c02a040$@mail.sjtu.edu.cn> <002701d56e32$04985550$0dc8fff0$@mail.sjtu.edu.cn> Message-ID: <006b01d56ee3$fdf91a10$f9eb4e30$@mail.sjtu.edu.cn> In fact, I would create my base mesh in DMPlex and use DMForest to construct the non-conformal meshing obtained by my own block-based AMR functions. ForestClaw is based on p4est. However, I may need to implement the AMR algorithm on DM in p4est library and then convert it to DMPlex. Do you think DMForest alone will not allow me to create the AMR for DMPlex? Thanks Amir From: Mark Adams [mailto:mfadams at lbl.gov] Sent: Thursday, September 19, 2019 7:23 PM To: Mohammad Hassan Cc: Matthew Knepley ; PETSc users list Subject: Re: [petsc-users] DMPlex Distribution Note, Forest gives you individual elements at the leaves. Donna Calhoun, a former Chombo user, has developed a block structured solver on p4est (https://math.boisestate.edu/~calhoun/ForestClaw/index.html), but I would imagine that you could just take the Plex that DMForest creates and just call DMRefine(...) on it to get a block structured AMR mesh. On Wed, Sep 18, 2019 at 11:02 AM Mohammad Hassan via petsc-users > wrote: Thanks for your suggestion, Matthew. I will certainly look into DMForest for refining of my base DMPlex dm. From: Matthew Knepley [mailto:knepley at gmail.com ] Sent: Wednesday, September 18, 2019 10:35 PM To: Mohammad Hassan > Cc: PETSc > Subject: Re: [petsc-users] DMPlex Distribution On Wed, Sep 18, 2019 at 10:27 AM Mohammad Hassan > wrote: I want to implement block-based AMR, which turns my base conformal mesh to non-conformal. My question is how DMPlex renders a mesh that it cannot support non-conformal meshes. Mark misspoke. Plex _does_ support geometrically non-conforming meshing, e.g. "hanging nodes". The easiest way to use Plex this way is to use DMForest, which uses Plex underneath. There are excellent p4est tutorials. What you would do is create your conformal mesh, using Plex if you want, and use that for the p4est base mesh (you would have the base mesh be the forest roots). Thanks, Matt If DMPlex does not work, I will try to use DMForest. From: Matthew Knepley [mailto:knepley at gmail.com ] Sent: Wednesday, September 18, 2019 9:50 PM To: Mohammad Hassan > Cc: Mark Adams >; PETSc > Subject: Re: [petsc-users] DMPlex Distribution On Wed, Sep 18, 2019 at 9:35 AM Mohammad Hassan via petsc-users > wrote: If DMPlex does not support, I may need to use PARAMESH or CHOMBO. Is there any way that we can construct non-conformal layout for DM in petsc? Lets see. Plex does support geometrically non-conforming meshes. This is how we support p4est. However, if you want that, you can just use DMForest I think. So you jsut want structured AMR? Thanks, Matt From: Mark Adams [mailto: mfadams at lbl.gov] Sent: Wednesday, September 18, 2019 9:23 PM To: Mohammad Hassan < mhbaghaei at mail.sjtu.edu.cn> Cc: Matthew Knepley < knepley at gmail.com>; PETSc users list < petsc-users at mcs.anl.gov> Subject: Re: [petsc-users] DMPlex Distribution I'm puzzled. 
It sounds like you are doing non-conforming AMR (structured block AMR), but Plex does not support that. On Tue, Sep 17, 2019 at 11:41 PM Mohammad Hassan via petsc-users > wrote: Mark is right. The functionality of AMR does not relate to parallelization of that. The vector size (global or local) does not conflict with AMR functions. Thanks Amir From: Matthew Knepley [mailto: knepley at gmail.com] Sent: Wednesday, September 18, 2019 12:59 AM To: Mohammad Hassan < mhbaghaei at mail.sjtu.edu.cn> Cc: PETSc < petsc-maint at mcs.anl.gov> Subject: Re: [petsc-users] DMPlex Distribution On Tue, Sep 17, 2019 at 12:03 PM Mohammad Hassan > wrote: Thanks for suggestion. I am going to use a block-based amr. I think I need to know exactly the mesh distribution of blocks across different processors for implementation of amr. Hi Amir, How are you using Plex if the block-AMR is coming from somewhere else? This will help me tell you what would be best. And as a general question, can we set block size of vector on each rank? I think as Mark says that you are using "blocksize" is a different way than PETSc. Thanks, Matt Thanks Amir From: Matthew Knepley [mailto: knepley at gmail.com] Sent: Tuesday, September 17, 2019 11:04 PM To: Mohammad Hassan < mhbaghaei at mail.sjtu.edu.cn> Cc: PETSc < petsc-users at mcs.anl.gov> Subject: Re: [petsc-users] DMPlex Distribution On Tue, Sep 17, 2019 at 9:27 AM Mohammad Hassan via petsc-users > wrote: Hi I am using DMPlexCreateFromDAG() to construct my DM. Is it possible to set the distribution across processors manually. I mean, how can I set the share of dm on each rank (local)? You could make a Shell partitioner and tell it the entire partition: https://www.mcs.anl.gov/petsc/petsc-master/docs/manualpages/DMPLEX/PetscPartitionerShellSetPartition.html However, I would be surprised if you could do this. It is likely that you just want to mess with the weights in ParMetis. Thanks, Matt Thanks Amir -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mpovolot at purdue.edu Thu Sep 19 13:03:15 2019 From: mpovolot at purdue.edu (Povolotskyi, Mykhailo) Date: Thu, 19 Sep 2019 18:03:15 +0000 Subject: [petsc-users] question about MatCreateRedundantMatrix In-Reply-To: <57ED352E-14B3-416B-8574-C04EBB09D797@dsic.upv.es> References: <52131a99-8f0e-6e94-88a7-cb637d5053e2@purdue.edu> <57ED352E-14B3-416B-8574-C04EBB09D797@dsic.upv.es> Message-ID: <8540f02b-52bd-229e-7124-23f033e31ef6@purdue.edu> Hello Jose, I have done the test case to reproduce my error. 1. You will need to download a file "matrix.bin"? from the following link https://www.dropbox.com/s/6y7ro99ou4qr8uy/matrix.bin?dl=0 2. 
Here is the C++ code I use:

#include <string>
#include <vector>
#include <complex>
#include <fstream>
#include <iostream>
#include <stdexcept>
#include <cmath>
#include <sys/stat.h>
#include "mpi.h"
#include "petscmat.h"
#include "slepcsys.h"
#include "slepceps.h"
#include "slepcrg.h"

using namespace std;

/* read the full dense complex matrix from a raw binary file */
void read_matrix(const std::string& filename,
                 int& matrix_size,
                 std::vector<std::complex<double> >& data)
{
  int file_size;
  struct stat results;
  if (stat(filename.c_str(), &results) == 0)
  {
    file_size = results.st_size;
  }
  else
  {
    throw runtime_error("Wrong file\n");
  }
  int data_size = file_size / sizeof(std::complex<double>);
  int n1 = (int) sqrt(data_size);
  if (n1 * n1 == data_size)
  {
    matrix_size = n1;
  }
  else
  {
    throw runtime_error("Wrong file size\n");
  }
  data.resize(matrix_size*matrix_size);
  ifstream myFile (filename.c_str(), ios::in | ios::binary);
  myFile.read ((char*) data.data(), file_size);
}

int main(int argc, char* argv[])
{
  MPI_Init(NULL, NULL);
  PetscInitialize(&argc, &argv, (char*)0, (char*)0);
  SlepcInitialize(&argc, &argv, (char*)0, (char*)0);

  int rank;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  string filename("matrix.bin");
  int matrix_size;
  std::vector<std::complex<double> > data;
  read_matrix(filename, matrix_size, data);
  if (rank == 0)
  {
    cout << "matrix size " << matrix_size << "\n";
  }

  /* fill the local rows of a parallel dense matrix */
  Mat mat;
  MatCreateDense(MPI_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, matrix_size, matrix_size, NULL, &mat);

  int local_row_begin;
  int local_row_end;
  MatGetOwnershipRange(mat, &local_row_begin, &local_row_end);

  for (int i = local_row_begin; i < local_row_end; i++)
  {
    vector<complex<double> > v(matrix_size);
    vector<int> index(matrix_size);
    for (int j = 0; j < matrix_size; j++)
    {
      v[j] = data[j*matrix_size + i];
      index[j] = j;
    }
    MatSetValues(mat, 1, &i, matrix_size, index.data(), v.data(), INSERT_VALUES);
  }
  MatAssemblyBegin(mat, MAT_FINAL_ASSEMBLY);
  MatAssemblyEnd(mat, MAT_FINAL_ASSEMBLY);

  /* CISS eigensolver with an elliptic region centered at the origin */
  complex<double> center(0, 0);
  double radius(100);
  double vscale(1.0);

  EPS     eps;
  EPSType type;
  RG      rg;
  EPSCreate(MPI_COMM_WORLD, &eps);
  EPSSetOperators(eps, mat, NULL);
  EPSSetType(eps, EPSCISS);
  EPSSetProblemType(eps, EPS_NHEP);
  EPSSetFromOptions(eps);
  EPSGetRG(eps, &rg);
  RGSetType(rg, RGELLIPSE);
  RGEllipseSetParameters(rg, center, radius, vscale);
  EPSSolve(eps);

  if (rank == 0)
  {
    int nconv;
    EPSGetConverged(eps, &nconv);
    for (int i = 0; i < nconv; i++)
    {
      complex<double> a, b;
      EPSGetEigenvalue(eps, i, &a, &b);
      cout << a << "\n";
    }
  }

  PetscFinalize();
  SlepcFinalize();
  MPI_Finalize();
}

3. If I run it as

mpiexec -n 1 a.out -eps_ciss_partitions 1

it works well. If I run it as

mpiexec -n 2 a.out -eps_ciss_partitions 2

I get an error message

[0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
[0]PETSC ERROR: No support for this operation for this object type
[0]PETSC ERROR: Mat type seqdense
[0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
[0]PETSC ERROR: Petsc Release Version 3.8.4, Mar, 24, 2018 [0]PETSC ERROR: a.out on a linux-complex named brown-fe03.rcac.purdue.edu by mpovolot Thu Sep 19 14:02:06 2019 [0]PETSC ERROR: Configure options --with-scalar-type=complex --with-x=0 --with-hdf5 --download-hdf5=1 --with-single-library=1 --with-pic=1 --with-shared-libraries=0 --with-log=0 --with-clanguage=C++ --CXXFLAGS="-fopenmp -fPIC" --CFLAGS="-fopenmp -fPIC" --with-fortran=0 --FFLAGS="-fopenmp -fPIC" --with-debugging=0 --with-cc=mpicc --with-fc=mpif90 --with-cxx=mpicxx COPTFLAGS= CXXOPTFLAGS= FOPTFLAGS= --download-metis=1 --download-parmetis=1 --with-valgrind-dir=/apps/brown/valgrind/3.13.0_gcc-4.8.5 --download-mumps=1 --with-fortran-kernels=0 --download-superlu_dist=1 --with-blaslapack-lib="-L/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_gnu_thread -lmkl_core " --with-blacs-lib=/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so --with-blacs-include=/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/include --with-scalapack-lib="-Wl,-rpath,/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/lib/intel64 -L/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/lib/intel64 -lmkl_gf_lp64 -lmkl_gnu_thread -lmkl_core? -lpthread -L/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/lib/intel64 -lmkl_scalapack_lp64 -lmkl_blacs_intelmpi_lp64" --with-scalapack-include=/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/include [0]PETSC ERROR: #1 MatCreateMPIMatConcatenateSeqMat() line 10547 in /depot/kildisha/apps/brown/nemo5/libs/petsc/build-cplx/src/mat/interface/matrix.c [0]PETSC ERROR: #2 MatCreateRedundantMatrix() line 10080 in /depot/kildisha/apps/brown/nemo5/libs/petsc/build-cplx/src/mat/interface/matrix.c [0]PETSC ERROR: #3 CISSRedundantMat() line 105 in /depot/kildisha/apps/brown/nemo5/libs/slepc/build-cplx/src/eps/impls/ciss/ciss.c [0]PETSC ERROR: #4 EPSSetUp_CISS() line 862 in /depot/kildisha/apps/brown/nemo5/libs/slepc/build-cplx/src/eps/impls/ciss/ciss.c [0]PETSC ERROR: #5 EPSSetUp() line 165 in /depot/kildisha/apps/brown/nemo5/libs/slepc/build-cplx/src/eps/interface/epssetup.c [0]PETSC ERROR: #6 EPSSolve() line 135 in /depot/kildisha/apps/brown/nemo5/libs/slepc/build-cplx/src/eps/interface/epssolve.c Thank you, Michael. On 09/19/2019 04:55 AM, Jose E. Roman wrote: > Michael, > > In my previous email I should have checked it better. The CISS solver works indeed with dense matrices: > > $ mpiexec -n 2 ./ex2 -n 30 -eps_type ciss -terse -rg_type ellipse -rg_ellipse_center 1.175 -rg_ellipse_radius 0.075 -eps_ciss_partitions 2 -mat_type dense > > 2-D Laplacian Eigenproblem, N=900 (30x30 grid) > > Solution method: ciss > > Number of requested eigenvalues: 1 > Found 15 eigenvalues, all of them computed up to the required tolerance: > 1.10416, 1.10416, 1.10455, 1.10455, 1.12947, 1.12947, 1.13426, 1.13426, > 1.16015, 1.16015, 1.19338, 1.19338, 1.21093, 1.21093, 1.24413 > > > There might be something different in the way matrices are initialized in your code. Send me a simple example that reproduces the problem and I will track it down. > > Sorry for the confusion. > Jose > > > >> El 19 sept 2019, a las 6:20, hong--- via petsc-users escribi?: >> >> Michael, >> We have support of MatCreateRedundantMatrix for dense matrices. 
For example, petsc/src/mat/examples/tests/ex9.c: >> mpiexec -n 4 ./ex9 -mat_type dense -view_mat -nsubcomms 2 >> >> Hong >> >> On Wed, Sep 18, 2019 at 5:40 PM Povolotskyi, Mykhailo via petsc-users wrote: >> Dear Petsc developers, >> >> I found that MatCreateRedundantMatrix does not support dense matrices. >> >> This causes the following problem: I cannot use CISS eigensolver from >> SLEPC with dense matrices with parallelization over quadrature points. >> >> Is it possible for you to add this support? >> >> Thank you, >> >> Michael. >> >> >> p.s. I apologize if you received this e-mail twice, I sent if first from >> a different address. >> From hzhang at mcs.anl.gov Thu Sep 19 13:22:34 2019 From: hzhang at mcs.anl.gov (Zhang, Hong) Date: Thu, 19 Sep 2019 18:22:34 +0000 Subject: [petsc-users] question about MatCreateRedundantMatrix In-Reply-To: <8540f02b-52bd-229e-7124-23f033e31ef6@purdue.edu> References: <52131a99-8f0e-6e94-88a7-cb637d5053e2@purdue.edu> <57ED352E-14B3-416B-8574-C04EBB09D797@dsic.upv.es> <8540f02b-52bd-229e-7124-23f033e31ef6@purdue.edu> Message-ID: Michael, -------------------------------------------------------------- [0]PETSC ERROR: No support for this operation for this object type [0]PETSC ERROR: Mat type seqdense [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.8.4, Mar, 24, 2018 This is an old version of Petsc. Can you update to the latest Petsc release? Hong On 09/19/2019 04:55 AM, Jose E. Roman wrote: > Michael, > > In my previous email I should have checked it better. The CISS solver works indeed with dense matrices: > > $ mpiexec -n 2 ./ex2 -n 30 -eps_type ciss -terse -rg_type ellipse -rg_ellipse_center 1.175 -rg_ellipse_radius 0.075 -eps_ciss_partitions 2 -mat_type dense > > 2-D Laplacian Eigenproblem, N=900 (30x30 grid) > > Solution method: ciss > > Number of requested eigenvalues: 1 > Found 15 eigenvalues, all of them computed up to the required tolerance: > 1.10416, 1.10416, 1.10455, 1.10455, 1.12947, 1.12947, 1.13426, 1.13426, > 1.16015, 1.16015, 1.19338, 1.19338, 1.21093, 1.21093, 1.24413 > > > There might be something different in the way matrices are initialized in your code. Send me a simple example that reproduces the problem and I will track it down. > > Sorry for the confusion. > Jose > > > >> El 19 sept 2019, a las 6:20, hong--- via petsc-users > escribi?: >> >> Michael, >> We have support of MatCreateRedundantMatrix for dense matrices. For example, petsc/src/mat/examples/tests/ex9.c: >> mpiexec -n 4 ./ex9 -mat_type dense -view_mat -nsubcomms 2 >> >> Hong >> >> On Wed, Sep 18, 2019 at 5:40 PM Povolotskyi, Mykhailo via petsc-users > wrote: >> Dear Petsc developers, >> >> I found that MatCreateRedundantMatrix does not support dense matrices. >> >> This causes the following problem: I cannot use CISS eigensolver from >> SLEPC with dense matrices with parallelization over quadrature points. >> >> Is it possible for you to add this support? >> >> Thank you, >> >> Michael. >> >> >> p.s. I apologize if you received this e-mail twice, I sent if first from >> a different address. >> -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mpovolot at purdue.edu Thu Sep 19 13:33:26 2019 From: mpovolot at purdue.edu (Povolotskyi, Mykhailo) Date: Thu, 19 Sep 2019 18:33:26 +0000 Subject: [petsc-users] question about MatCreateRedundantMatrix In-Reply-To: References: <52131a99-8f0e-6e94-88a7-cb637d5053e2@purdue.edu> <57ED352E-14B3-416B-8574-C04EBB09D797@dsic.upv.es> <8540f02b-52bd-229e-7124-23f033e31ef6@purdue.edu> Message-ID: <29818415-5864-08bb-80cc-b0e27c4643d7@purdue.edu> Hong, do you have in mind a reason why the newer version should work or is it a general recommendation? Which stable version would you recommend to upgrade to? Thank you, Michael. On 09/19/2019 02:22 PM, Zhang, Hong wrote: Michael, -------------------------------------------------------------- [0]PETSC ERROR: No support for this operation for this object type [0]PETSC ERROR: Mat type seqdense [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.8.4, Mar, 24, 2018 This is an old version of Petsc. Can you update to the latest Petsc release? Hong On 09/19/2019 04:55 AM, Jose E. Roman wrote: > Michael, > > In my previous email I should have checked it better. The CISS solver works indeed with dense matrices: > > $ mpiexec -n 2 ./ex2 -n 30 -eps_type ciss -terse -rg_type ellipse -rg_ellipse_center 1.175 -rg_ellipse_radius 0.075 -eps_ciss_partitions 2 -mat_type dense > > 2-D Laplacian Eigenproblem, N=900 (30x30 grid) > > Solution method: ciss > > Number of requested eigenvalues: 1 > Found 15 eigenvalues, all of them computed up to the required tolerance: > 1.10416, 1.10416, 1.10455, 1.10455, 1.12947, 1.12947, 1.13426, 1.13426, > 1.16015, 1.16015, 1.19338, 1.19338, 1.21093, 1.21093, 1.24413 > > > There might be something different in the way matrices are initialized in your code. Send me a simple example that reproduces the problem and I will track it down. > > Sorry for the confusion. > Jose > > > >> El 19 sept 2019, a las 6:20, hong--- via petsc-users > escribi?: >> >> Michael, >> We have support of MatCreateRedundantMatrix for dense matrices. For example, petsc/src/mat/examples/tests/ex9.c: >> mpiexec -n 4 ./ex9 -mat_type dense -view_mat -nsubcomms 2 >> >> Hong >> >> On Wed, Sep 18, 2019 at 5:40 PM Povolotskyi, Mykhailo via petsc-users > wrote: >> Dear Petsc developers, >> >> I found that MatCreateRedundantMatrix does not support dense matrices. >> >> This causes the following problem: I cannot use CISS eigensolver from >> SLEPC with dense matrices with parallelization over quadrature points. >> >> Is it possible for you to add this support? >> >> Thank you, >> >> Michael. >> >> >> p.s. I apologize if you received this e-mail twice, I sent if first from >> a different address. >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Thu Sep 19 15:59:55 2019 From: mfadams at lbl.gov (Mark Adams) Date: Thu, 19 Sep 2019 16:59:55 -0400 Subject: [petsc-users] DMPlex Distribution In-Reply-To: <006b01d56ee3$fdf91a10$f9eb4e30$@mail.sjtu.edu.cn> References: <001001d56d5b$b0a40b50$11ec21f0$@mail.sjtu.edu.cn> <003701d56d71$7a13d5f0$6e3b81d0$@mail.sjtu.edu.cn> <000801d56dd2$f81ae6d0$e850b470$@mail.sjtu.edu.cn> <001f01d56e25$f55b14d0$e0113e70$@mail.sjtu.edu.cn> <001001d56e2d$1eab8ac0$5c02a040$@mail.sjtu.edu.cn> <002701d56e32$04985550$0dc8fff0$@mail.sjtu.edu.cn> <006b01d56ee3$fdf91a10$f9eb4e30$@mail.sjtu.edu.cn> Message-ID: I think you are fine with DMForest. I just mentioned ForestClaw for background. 
It has a bunch of hyperbolic stuff in there that is specialized. On Thu, Sep 19, 2019 at 8:16 AM Mohammad Hassan wrote: > In fact, I would create my base mesh in DMPlex and use DMForest to > construct the non-conformal meshing obtained by my own block-based AMR > functions. ForestClaw is based on p4est. However, I may need to implement > the AMR algorithm on DM in p4est library and then convert it to DMPlex. Do > you think DMForest alone will not allow me to create the AMR for DMPlex? > > > > Thanks > > Amir > > > > *From:* Mark Adams [mailto:mfadams at lbl.gov] > *Sent:* Thursday, September 19, 2019 7:23 PM > *To:* Mohammad Hassan > *Cc:* Matthew Knepley ; PETSc users list < > petsc-users at mcs.anl.gov> > *Subject:* Re: [petsc-users] DMPlex Distribution > > > > Note, Forest gives you individual elements at the leaves. Donna Calhoun, a > former Chombo user, has developed a block structured solver on p4est ( > https://math.boisestate.edu/~calhoun/ForestClaw/index.html), but I would > imagine that you could just take the Plex that DMForest creates and just > call DMRefine(...) on it to get a block structured AMR mesh. > > > > On Wed, Sep 18, 2019 at 11:02 AM Mohammad Hassan via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > Thanks for your suggestion, Matthew. I will certainly look into DMForest > for refining of my base DMPlex dm. > > > > *From:* Matthew Knepley [mailto:knepley at gmail.com] > *Sent:* Wednesday, September 18, 2019 10:35 PM > *To:* Mohammad Hassan > *Cc:* PETSc > *Subject:* Re: [petsc-users] DMPlex Distribution > > > > On Wed, Sep 18, 2019 at 10:27 AM Mohammad Hassan < > mhbaghaei at mail.sjtu.edu.cn> wrote: > > I want to implement block-based AMR, which turns my base conformal mesh to > non-conformal. My question is how DMPlex renders a mesh that it cannot > support non-conformal meshes. > > > > Mark misspoke. Plex _does_ support geometrically non-conforming meshing, > e.g. "hanging nodes". The easiest way to > > use Plex this way is to use DMForest, which uses Plex underneath. > > > > There are excellent p4est tutorials. What you would do is create your > conformal mesh, using Plex if you want, and > > use that for the p4est base mesh (you would have the base mesh be the > forest roots). > > > > Thanks, > > > > Matt > > > > If DMPlex does not work, I will try to use DMForest. > > > > *From:* Matthew Knepley [mailto:knepley at gmail.com] > *Sent:* Wednesday, September 18, 2019 9:50 PM > *To:* Mohammad Hassan > *Cc:* Mark Adams ; PETSc > *Subject:* Re: [petsc-users] DMPlex Distribution > > > > On Wed, Sep 18, 2019 at 9:35 AM Mohammad Hassan via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > If DMPlex does not support, I may need to use PARAMESH or CHOMBO. Is there > any way that we can construct non-conformal layout for DM in petsc? > > > > Lets see. Plex does support geometrically non-conforming meshes. This is > how we support p4est. However, if > > you want that, you can just use DMForest I think. So you jsut want > structured AMR? > > > > Thanks, > > > > Matt > > > > > > *From:* Mark Adams [mailto:mfadams at lbl.gov] > *Sent:* Wednesday, September 18, 2019 9:23 PM > *To:* Mohammad Hassan > *Cc:* Matthew Knepley ; PETSc users list < > petsc-users at mcs.anl.gov> > *Subject:* Re: [petsc-users] DMPlex Distribution > > > > I'm puzzled. It sounds like you are doing non-conforming AMR (structured > block AMR), but Plex does not support that. 
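(Stepping outside the quoted thread for a moment: a minimal sketch of the DMForest route that Matt and Mark describe above, written in C. It assumes a conformal base mesh already exists as a DMPlex called base, that PETSc was built with p4est, and it has not been tested against any particular PETSc version; the refinement criteria are left to the application.

   #include <petscdmforest.h>
   #include <petscdmplex.h>

   /* Sketch: use an existing conformal DMPlex as the root cells of a
      p4est forest, then view the adapted forest again as a DMPlex.  */
   PetscErrorCode plex_to_forest_to_plex(DM base, DM *plexOut)
   {
     DM             forest;
     PetscErrorCode ierr;

     ierr = DMCreate(PetscObjectComm((PetscObject)base), &forest);CHKERRQ(ierr);
     ierr = DMSetType(forest, DMP4EST);CHKERRQ(ierr);      /* DMP8EST for a 3D base mesh */
     ierr = DMForestSetBaseDM(forest, base);CHKERRQ(ierr); /* base cells become the forest roots */
     ierr = DMSetFromOptions(forest);CHKERRQ(ierr);        /* refinement options picked up here */
     ierr = DMSetUp(forest);CHKERRQ(ierr);

     /* A Plex view of the (possibly non-conformal) forest mesh */
     ierr = DMConvert(forest, DMPLEX, plexOut);CHKERRQ(ierr);
     ierr = DMDestroy(&forest);CHKERRQ(ierr);
     return 0;
   })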
> > > > On Tue, Sep 17, 2019 at 11:41 PM Mohammad Hassan via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > Mark is right. The functionality of AMR does not relate to > parallelization of that. The vector size (global or local) does not > conflict with AMR functions. > > Thanks > > > > Amir > > > > *From:* Matthew Knepley [mailto:knepley at gmail.com] > *Sent:* Wednesday, September 18, 2019 12:59 AM > *To:* Mohammad Hassan > *Cc:* PETSc > *Subject:* Re: [petsc-users] DMPlex Distribution > > > > On Tue, Sep 17, 2019 at 12:03 PM Mohammad Hassan < > mhbaghaei at mail.sjtu.edu.cn> wrote: > > Thanks for suggestion. I am going to use a block-based amr. I think I need > to know exactly the mesh distribution of blocks across different processors > for implementation of amr. > > > > Hi Amir, > > > > How are you using Plex if the block-AMR is coming from somewhere else? > This will help > > me tell you what would be best. > > > > And as a general question, can we set block size of vector on each rank? > > > > I think as Mark says that you are using "blocksize" is a different way > than PETSc. > > > > Thanks, > > > > Matt > > > > Thanks > > Amir > > > > *From:* Matthew Knepley [mailto:knepley at gmail.com] > *Sent:* Tuesday, September 17, 2019 11:04 PM > *To:* Mohammad Hassan > *Cc:* PETSc > *Subject:* Re: [petsc-users] DMPlex Distribution > > > > On Tue, Sep 17, 2019 at 9:27 AM Mohammad Hassan via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > Hi > > I am using DMPlexCreateFromDAG() to construct my DM. Is it possible to set > the distribution across processors manually. I mean, how can I set the > share of dm on each rank (local)? > > > > You could make a Shell partitioner and tell it the entire partition: > > > > > https://www.mcs.anl.gov/petsc/petsc-master/docs/manualpages/DMPLEX/PetscPartitionerShellSetPartition.html > > > > However, I would be surprised if you could do this. It is likely that you > just want to mess with the weights in ParMetis. > > > > Thanks, > > > > Matt > > > > Thanks > > Amir > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bcjo17 at gmail.com Thu Sep 19 22:48:29 2019 From: bcjo17 at gmail.com (Young Hyun Jo) Date: Fri, 20 Sep 2019 12:48:29 +0900 Subject: [petsc-users] Questions about multigrid preconditioner and multigrid level Message-ID: Hello, I'm Young Hyun Jo, and I study plasma physics and particle-in-cell simulation. Currently, I'm using PETSc to solve a 3D Poisson's equation in the FDM scheme. 
I have one question about the multigrid preconditioner. When I use PCG(KSPCG) with the multigrid preconditioner(PCMG), I get an error if I don't use the appropriate multigrid level for the grid number. For example, If I use 129 grids, I can use 7 multigrid levels. However, If I use 130 grids, I can't use any multigrid levels but one. So, It seems that the grid number is better to be (2*n + 1) to use multigrid preconditioner. Is this correct that the multigrid conditioner has some restrictions for the grid number to use? It will be really helpful for me to use PETSc properly. Thanks in advance. Sincerely, Young Hyun Jo -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu Sep 19 23:08:50 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Fri, 20 Sep 2019 04:08:50 +0000 Subject: [petsc-users] Questions about multigrid preconditioner and multigrid level In-Reply-To: References: Message-ID: <72B59503-B020-4258-917C-481F6A792467@anl.gov> You didn't indicate "why" you can't use multiple levels with "130 grids", is there some error message? Nor do you mention if you have periodic boundary conditions or are using cell or vertex centered unknowns. All of these things affect when you can coarsen for multigrain or not. Consider one simple case, Dirichlet boundary conditions with vertex centered unknowns, I show the fine | and coarse grid * | | | | | * * * Now consider 4 points, | | | | Where am I going to put the coarse points? It is possible to do multigrid with non-nesting of degrees for freedom like | | | | * * * but that is really uncommon, nobody does it. People just use the grid sizes which have a natural hierarchy of nested coarser grids. Barry > On Sep 19, 2019, at 10:48 PM, Young Hyun Jo via petsc-users wrote: > > > Hello, I'm Young Hyun Jo, and I study plasma physics and particle-in-cell simulation. > Currently, I'm using PETSc to solve a 3D Poisson's equation in the FDM scheme. > I have one question about the multigrid preconditioner. > When I use PCG(KSPCG) with the multigrid preconditioner(PCMG), I get an error if I don't use the appropriate multigrid level for the grid number. > For example, If I use 129 grids, I can use 7 multigrid levels. > However, If I use 130 grids, I can't use any multigrid levels but one. > So, It seems that the grid number is better to be (2*n + 1) to use multigrid preconditioner. > Is this correct that the multigrid conditioner has some restrictions for the grid number to use? > It will be really helpful for me to use PETSc properly. > Thanks in advance. > > Sincerely, > Young Hyun Jo From bsmith at mcs.anl.gov Fri Sep 20 00:19:27 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Fri, 20 Sep 2019 05:19:27 +0000 Subject: [petsc-users] Questions about multigrid preconditioner and multigrid level In-Reply-To: References: <72B59503-B020-4258-917C-481F6A792467@anl.gov> Message-ID: <997C18A6-5E37-45EA-AAC6-9149BC400757@mcs.anl.gov> The DMDA structured grid management in PETSc does not provide the needed interpolations for doing non-nested multigrid. The algebraic portions of PETSc's geometric multigrid would work fine for that case if you have your own way to provide the needed interpolation. I'll note that mathematically the interpolation is completely straightforward but the practical issues of computing such interpolations and managing the non-nested nature of the grids in MPI are nontrivial, not impossible or even particularly difficult but require careful thought and coding. 
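(A concrete way to see the nesting requirement in the 127 x 127 x 62 case quoted below: the DMDA interpolation check requires (M_fine - 1) to be an integer multiple of (M_coarse - 1) in each direction. 129 coarsens as 129 -> 65 -> 33 -> 17 -> 9 -> 5 -> 3, which is why 7 levels work with 129 grid points; 127 coarsens to 64 and 126/63 = 2, so the x and y directions are fine; but the z direction has 62 points, its coarse grid gets 31, and 61/30 is not an integer, which is exactly the "Ratio between levels" message in the error output.)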
The PETSc team doesn't have the resources or the need to develop this ability. I can only suggest sticking to the grid sizes where there is a natural nesting of the mesh points I don't think the coding effort is worth the benefit. Barry > On Sep 19, 2019, at 11:50 PM, Young Hyun Jo wrote: > > > Oh, I'm sorry. You're right. > I use Dirichlet boundary conditions with a central difference scheme. > I mentioned 130 grids, but I have an actual case below, and I get the errors : > > DM Object: 8 MPI processes > type: da > Processor [0] M 127 N 127 P 62 m 2 n 2 p 2 w 1 s 1 > X range of indices: 0 64, Y range of indices: 0 64, Z range of indices: 0 31 > Processor [1] M 127 N 127 P 62 m 2 n 2 p 2 w 1 s 1 > X range of indices: 64 127, Y range of indices: 0 64, Z range of indices: 0 31 > Processor [2] M 127 N 127 P 62 m 2 n 2 p 2 w 1 s 1 > X range of indices: 0 64, Y range of indices: 64 127, Z range of indices: 0 31 > Processor [3] M 127 N 127 P 62 m 2 n 2 p 2 w 1 s 1 > X range of indices: 64 127, Y range of indices: 64 127, Z range of indices: 0 31 > Processor [4] M 127 N 127 P 62 m 2 n 2 p 2 w 1 s 1 > X range of indices: 0 64, Y range of indices: 0 64, Z range of indices: 31 62 > Processor [5] M 127 N 127 P 62 m 2 n 2 p 2 w 1 s 1 > X range of indices: 64 127, Y range of indices: 0 64, Z range of indices: 31 62 > Processor [6] M 127 N 127 P 62 m 2 n 2 p 2 w 1 s 1 > X range of indices: 0 64, Y range of indices: 64 127, Z range of indices: 31 62 > Processor [7] M 127 N 127 P 62 m 2 n 2 p 2 w 1 s 1 > X range of indices: 64 127, Y range of indices: 64 127, Z range of indices: 31 62 > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Arguments are incompatible > [0]PETSC ERROR: Ratio between levels: (mz - 1)/(Mz - 1) must be integer: mz 62 Mz 31 > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.11.3, Jun, 26, 2019 > [0]PETSC ERROR: ../../../eclipse-workspace/PIC3DXYZ/PIC3DXYZ_MPI/Debug/PIC3DXYZ_MPI on a named mn0 by bcjo17 Fri Sep 20 13:44:29 2019 > [0]PETSC ERROR: Configure options --with-cc=mpiicc --with-cxx=mpiicpc --with-fc=mpiifort --prefix=/home/bcjo17/petsc_mpiicc --with-mpi=1 --with-blaslapack-dir=/home/bcjo17/intel/compilers_and_libraries_2019.3.199/linux/mkl > [0]PETSC ERROR: #1 DMCreateInterpolation_DA_3D_Q1() line 773 in /home/bcjo17/Downloads/petsc-3.11.3/src/dm/impls/da/dainterp.c > [0]PETSC ERROR: #2 DMCreateInterpolation_DA() line 1039 in /home/bcjo17/Downloads/petsc-3.11.3/src/dm/impls/da/dainterp.c > [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [1]PETSC ERROR: Arguments are incompatible > [1]PETSC ERROR: Ratio between levels: (mz - 1)/(Mz - 1) must be integer: mz 62 Mz 31 > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> [1]PETSC ERROR: Petsc Release Version 3.11.3, Jun, 26, 2019 > [1]PETSC ERROR: ../../../eclipse-workspace/PIC3DXYZ/PIC3DXYZ_MPI/Debug/PIC3DXYZ_MPI on a named mn0 by bcjo17 Fri Sep 20 13:44:29 2019 > [1]PETSC ERROR: Configure options --with-cc=mpiicc --with-cxx=mpiicpc --with-fc=mpiifort --prefix=/home/bcjo17/petsc_mpiicc --with-mpi=1 --with-blaslapack-dir=/home/bcjo17/intel/compilers_and_libraries_2019.3.199/linux/mkl > [1]PETSC ERROR: #1 DMCreateInterpolation_DA_3D_Q1() line 773 in /home/bcjo17/Downloads/petsc-3.11.3/src/dm/impls/da/dainterp.c > [1]PETSC ERROR: #2 DMCreateInterpolation_DA() line 1039 in /home/bcjo17/Downloads/petsc-3.11.3/src/dm/impls/da/dainterp.c > [1]PETSC ERROR: #3 DMCreateInterpolation() line 1114 in /home/bcjo17/Downloads/petsc-3.11.3/src/dm/interface/dm.c > [1]PETSC ERROR: #4 PCSetUp_MG() line 684 in /home/bcjo17/Downloads/petsc-3.11.3/src/ksp/pc/impls/mg/mg.c > [1]PETSC ERROR: #5 PCSetUp() line 932 in /home/bcjo17/Downloads/petsc-3.11.3/src/ksp/pc/interface/precon.c > [1]PETSC ERROR: #6 KSPSetUp() line 391 in /home/bcjo17/Downloads/petsc-3.11.3/src/ksp/ksp/interface/itfunc.c > [1]PETSC ERROR: #7 KSPSolve() line 725 in /home/bcjo17/Downloads/petsc-3.11.3/src/ksp/ksp/interface/itfunc.c > ... same messages for other processors ... > > The message 'Ratio between levels: (mz - 1)/(Mz - 1) must be integer: mz 62 Mz 31' made me think that there are some restrictions to use the multigrid. > I agree that people usually use a natural hierarchy for the multigrid method, but I just want to know whether it is possible or not. > So, could you please let me know how I can make it possible? > > > > 2019? 9? 20? (?) ?? 1:08, Smith, Barry F. ?? ??: > > > You didn't indicate "why" you can't use multiple levels with "130 grids", is there some error message? Nor do you mention if you have periodic boundary conditions or are using cell or vertex centered unknowns. All of these things affect when you can coarsen for multigrain or not. > > Consider one simple case, Dirichlet boundary conditions with vertex centered unknowns, I show the fine | and coarse grid * > > > | | | | | > > * * * > > > Now consider 4 points, > > | | | | > > Where am I going to put the coarse points? > > It is possible to do multigrid with non-nesting of degrees for freedom like > > | | | | > * * * > > but that is really uncommon, nobody does it. People just use the grid sizes which have a natural hierarchy of > nested coarser grids. > > Barry > > > > > On Sep 19, 2019, at 10:48 PM, Young Hyun Jo via petsc-users wrote: > > > > > > Hello, I'm Young Hyun Jo, and I study plasma physics and particle-in-cell simulation. > > Currently, I'm using PETSc to solve a 3D Poisson's equation in the FDM scheme. > > I have one question about the multigrid preconditioner. > > When I use PCG(KSPCG) with the multigrid preconditioner(PCMG), I get an error if I don't use the appropriate multigrid level for the grid number. > > For example, If I use 129 grids, I can use 7 multigrid levels. > > However, If I use 130 grids, I can't use any multigrid levels but one. > > So, It seems that the grid number is better to be (2*n + 1) to use multigrid preconditioner. > > Is this correct that the multigrid conditioner has some restrictions for the grid number to use? > > It will be really helpful for me to use PETSc properly. > > Thanks in advance. > > > > Sincerely, > > Young Hyun Jo > From bsmith at mcs.anl.gov Fri Sep 20 00:53:47 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) 
Date: Fri, 20 Sep 2019 05:53:47 +0000 Subject: [petsc-users] Questions about multigrid preconditioner and multigrid level In-Reply-To: References: <72B59503-B020-4258-917C-481F6A792467@anl.gov> <997C18A6-5E37-45EA-AAC6-9149BC400757@mcs.anl.gov> Message-ID: > On Sep 20, 2019, at 12:35 AM, Young Hyun Jo wrote: > > Thanks for the answer. > It's really helpful to understand the PETSc library. > By the way, I just want to ask two more questions not related to the multigrid. > > 1. Is there a method known as the fastest solver for Poisson's equation with the central difference scheme in the PETSc library? > I want to clarify that I need the fastest method, not the least iteration method. Geometric multigrid. For larger problems it will be significantly faster than anything else. But for a fixed size "moderate" sized problem it could be something like a relatively simply preconditioned CG. Determining the sizes for one method is the fastest is based on experiments, running the problem size you need with various methods. NEVER run on a different size problem to make a selection for an different size, this can lead to bad choices since the size of the problem has a large effect on which method is fastest. > I use the PCG(KSPCG) method without any preconditioner now, and I'm trying other methods too, but I couldn't find any other faster method yet. > > 2. What kind of preconditioner is used in 'KSPCG'? You can run with -ksp_view to see exactly what solver and parameters are used. By default PETSc uses block Jacobi with ILU(0) on each block. > I have made my own solver using PCG with a preconditioner 'incomplete Cholesky factorization' which is shown in https://en.wikipedia.org/wiki/Conjugate_gradient_method, and my solver takes more iterations than KSPCG. > So, I'm wondering what is the default preconditioner for KSPCG, and whether it's usually the fastest one. We made this one the default because it is fairly good for a range of problems. I can't explain why your incomplete Cholesky should require more iterations than our default. I would expect them to be pretty similar. Barry > > > > > 2019? 9? 20? (?) ?? 2:19, Smith, Barry F. ?? ??: > > The DMDA structured grid management in PETSc does not provide the needed interpolations for doing non-nested multigrid. The algebraic portions of PETSc's geometric multigrid would work fine for that case if you have your own way to provide the needed interpolation. I'll note that mathematically the interpolation is completely straightforward but the practical issues of computing such interpolations and managing the non-nested nature of the grids in MPI are nontrivial, not impossible or even particularly difficult but require careful thought and coding. The PETSc team doesn't have the resources or the need to develop this ability. I can only suggest sticking to the grid sizes where there is a natural nesting of the mesh points I don't think the coding effort is worth the benefit. > > Barry > > > > On Sep 19, 2019, at 11:50 PM, Young Hyun Jo wrote: > > > > > > Oh, I'm sorry. You're right. > > I use Dirichlet boundary conditions with a central difference scheme. 
> > I mentioned 130 grids, but I have an actual case below, and I get the errors : > > > > DM Object: 8 MPI processes > > type: da > > Processor [0] M 127 N 127 P 62 m 2 n 2 p 2 w 1 s 1 > > X range of indices: 0 64, Y range of indices: 0 64, Z range of indices: 0 31 > > Processor [1] M 127 N 127 P 62 m 2 n 2 p 2 w 1 s 1 > > X range of indices: 64 127, Y range of indices: 0 64, Z range of indices: 0 31 > > Processor [2] M 127 N 127 P 62 m 2 n 2 p 2 w 1 s 1 > > X range of indices: 0 64, Y range of indices: 64 127, Z range of indices: 0 31 > > Processor [3] M 127 N 127 P 62 m 2 n 2 p 2 w 1 s 1 > > X range of indices: 64 127, Y range of indices: 64 127, Z range of indices: 0 31 > > Processor [4] M 127 N 127 P 62 m 2 n 2 p 2 w 1 s 1 > > X range of indices: 0 64, Y range of indices: 0 64, Z range of indices: 31 62 > > Processor [5] M 127 N 127 P 62 m 2 n 2 p 2 w 1 s 1 > > X range of indices: 64 127, Y range of indices: 0 64, Z range of indices: 31 62 > > Processor [6] M 127 N 127 P 62 m 2 n 2 p 2 w 1 s 1 > > X range of indices: 0 64, Y range of indices: 64 127, Z range of indices: 31 62 > > Processor [7] M 127 N 127 P 62 m 2 n 2 p 2 w 1 s 1 > > X range of indices: 64 127, Y range of indices: 64 127, Z range of indices: 31 62 > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [0]PETSC ERROR: Arguments are incompatible > > [0]PETSC ERROR: Ratio between levels: (mz - 1)/(Mz - 1) must be integer: mz 62 Mz 31 > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.11.3, Jun, 26, 2019 > > [0]PETSC ERROR: ../../../eclipse-workspace/PIC3DXYZ/PIC3DXYZ_MPI/Debug/PIC3DXYZ_MPI on a named mn0 by bcjo17 Fri Sep 20 13:44:29 2019 > > [0]PETSC ERROR: Configure options --with-cc=mpiicc --with-cxx=mpiicpc --with-fc=mpiifort --prefix=/home/bcjo17/petsc_mpiicc --with-mpi=1 --with-blaslapack-dir=/home/bcjo17/intel/compilers_and_libraries_2019.3.199/linux/mkl > > [0]PETSC ERROR: #1 DMCreateInterpolation_DA_3D_Q1() line 773 in /home/bcjo17/Downloads/petsc-3.11.3/src/dm/impls/da/dainterp.c > > [0]PETSC ERROR: #2 DMCreateInterpolation_DA() line 1039 in /home/bcjo17/Downloads/petsc-3.11.3/src/dm/impls/da/dainterp.c > > [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [1]PETSC ERROR: Arguments are incompatible > > [1]PETSC ERROR: Ratio between levels: (mz - 1)/(Mz - 1) must be integer: mz 62 Mz 31 > > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > [1]PETSC ERROR: Petsc Release Version 3.11.3, Jun, 26, 2019 > > [1]PETSC ERROR: ../../../eclipse-workspace/PIC3DXYZ/PIC3DXYZ_MPI/Debug/PIC3DXYZ_MPI on a named mn0 by bcjo17 Fri Sep 20 13:44:29 2019 > > [1]PETSC ERROR: Configure options --with-cc=mpiicc --with-cxx=mpiicpc --with-fc=mpiifort --prefix=/home/bcjo17/petsc_mpiicc --with-mpi=1 --with-blaslapack-dir=/home/bcjo17/intel/compilers_and_libraries_2019.3.199/linux/mkl > > [1]PETSC ERROR: #1 DMCreateInterpolation_DA_3D_Q1() line 773 in /home/bcjo17/Downloads/petsc-3.11.3/src/dm/impls/da/dainterp.c > > [1]PETSC ERROR: #2 DMCreateInterpolation_DA() line 1039 in /home/bcjo17/Downloads/petsc-3.11.3/src/dm/impls/da/dainterp.c > > [1]PETSC ERROR: #3 DMCreateInterpolation() line 1114 in /home/bcjo17/Downloads/petsc-3.11.3/src/dm/interface/dm.c > > [1]PETSC ERROR: #4 PCSetUp_MG() line 684 in /home/bcjo17/Downloads/petsc-3.11.3/src/ksp/pc/impls/mg/mg.c > > [1]PETSC ERROR: #5 PCSetUp() line 932 in /home/bcjo17/Downloads/petsc-3.11.3/src/ksp/pc/interface/precon.c > > [1]PETSC ERROR: #6 KSPSetUp() line 391 in /home/bcjo17/Downloads/petsc-3.11.3/src/ksp/ksp/interface/itfunc.c > > [1]PETSC ERROR: #7 KSPSolve() line 725 in /home/bcjo17/Downloads/petsc-3.11.3/src/ksp/ksp/interface/itfunc.c > > ... same messages for other processors ... > > > > The message 'Ratio between levels: (mz - 1)/(Mz - 1) must be integer: mz 62 Mz 31' made me think that there are some restrictions to use the multigrid. > > I agree that people usually use a natural hierarchy for the multigrid method, but I just want to know whether it is possible or not. > > So, could you please let me know how I can make it possible? > > > > > > > > 2019? 9? 20? (?) ?? 1:08, Smith, Barry F. ?? ??: > > > > > > You didn't indicate "why" you can't use multiple levels with "130 grids", is there some error message? Nor do you mention if you have periodic boundary conditions or are using cell or vertex centered unknowns. All of these things affect when you can coarsen for multigrain or not. > > > > Consider one simple case, Dirichlet boundary conditions with vertex centered unknowns, I show the fine | and coarse grid * > > > > > > | | | | | > > > > * * * > > > > > > Now consider 4 points, > > > > | | | | > > > > Where am I going to put the coarse points? > > > > It is possible to do multigrid with non-nesting of degrees for freedom like > > > > | | | | > > * * * > > > > but that is really uncommon, nobody does it. People just use the grid sizes which have a natural hierarchy of > > nested coarser grids. > > > > Barry > > > > > > > > > On Sep 19, 2019, at 10:48 PM, Young Hyun Jo via petsc-users wrote: > > > > > > > > > Hello, I'm Young Hyun Jo, and I study plasma physics and particle-in-cell simulation. > > > Currently, I'm using PETSc to solve a 3D Poisson's equation in the FDM scheme. > > > I have one question about the multigrid preconditioner. > > > When I use PCG(KSPCG) with the multigrid preconditioner(PCMG), I get an error if I don't use the appropriate multigrid level for the grid number. > > > For example, If I use 129 grids, I can use 7 multigrid levels. > > > However, If I use 130 grids, I can't use any multigrid levels but one. > > > So, It seems that the grid number is better to be (2*n + 1) to use multigrid preconditioner. > > > Is this correct that the multigrid conditioner has some restrictions for the grid number to use? > > > It will be really helpful for me to use PETSc properly. > > > Thanks in advance. 
> > > > > > Sincerely, > > > Young Hyun Jo > > > From jroman at dsic.upv.es Fri Sep 20 06:22:30 2019 From: jroman at dsic.upv.es (Jose E. Roman) Date: Fri, 20 Sep 2019 13:22:30 +0200 Subject: [petsc-users] question about MatCreateRedundantMatrix In-Reply-To: <29818415-5864-08bb-80cc-b0e27c4643d7@purdue.edu> References: <52131a99-8f0e-6e94-88a7-cb637d5053e2@purdue.edu> <57ED352E-14B3-416B-8574-C04EBB09D797@dsic.upv.es> <8540f02b-52bd-229e-7124-23f033e31ef6@purdue.edu> <29818415-5864-08bb-80cc-b0e27c4643d7@purdue.edu> Message-ID: I have tried with slepc-master and it works: $ mpiexec -n 2 ./ex1 -eps_ciss_partitions 2 matrix size 774 (-78.7875,8.8022) (-73.9569,-42.2401) (-66.9942,-7.50907) (-62.262,-2.71603) (-58.9716,0.601111) (-57.9883,0.298729) (-57.8323,1.06041) (-56.5317,1.10758) (-56.0234,45.2405) (-54.4058,2.88373) (-25.946,26.0317) (-23.5383,-16.9096) (-19.0999,0.194467) (-18.795,1.15113) (-15.3051,0.915914) (-14.803,-0.00475538) (-8.52467,10.6032) (-4.36051,2.29996) (-0.525758,0.796658) (1.41227,0.112858) (1.53801,0.446984) (9.43357,0.505277) slepc-master will become version 3.12 in a few days. I have not tried with 3.11 but I think it should work. It is always recommended to use the latest version. Version 3.8 is two years old. Jose > El 19 sept 2019, a las 20:33, Povolotskyi, Mykhailo escribi?: > > Hong, > > do you have in mind a reason why the newer version should work or is it a general recommendation? > > Which stable version would you recommend to upgrade to? > > Thank you, > > Michael. > > > On 09/19/2019 02:22 PM, Zhang, Hong wrote: >> Michael, >> >> -------------------------------------------------------------- >> [0]PETSC ERROR: No support for this operation for this object type >> [0]PETSC ERROR: Mat type seqdense >> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html >> for trouble shooting. >> [0]PETSC ERROR: Petsc Release Version 3.8.4, Mar, 24, 2018 >> >> This is an old version of Petsc. Can you update to the latest Petsc release? >> Hong >> >> >> On 09/19/2019 04:55 AM, Jose E. Roman wrote: >> > Michael, >> > >> > In my previous email I should have checked it better. The CISS solver works indeed with dense matrices: >> > >> > $ mpiexec -n 2 ./ex2 -n 30 -eps_type ciss -terse -rg_type ellipse -rg_ellipse_center 1.175 -rg_ellipse_radius 0.075 -eps_ciss_partitions 2 -mat_type dense >> > >> > 2-D Laplacian Eigenproblem, N=900 (30x30 grid) >> > >> > Solution method: ciss >> > >> > Number of requested eigenvalues: 1 >> > Found 15 eigenvalues, all of them computed up to the required tolerance: >> > 1.10416, 1.10416, 1.10455, 1.10455, 1.12947, 1.12947, 1.13426, 1.13426, >> > 1.16015, 1.16015, 1.19338, 1.19338, 1.21093, 1.21093, 1.24413 >> > >> > >> > There might be something different in the way matrices are initialized in your code. Send me a simple example that reproduces the problem and I will track it down. >> > >> > Sorry for the confusion. >> > Jose >> > >> > >> > >> >> El 19 sept 2019, a las 6:20, hong--- via petsc-users escribi?: >> >> >> >> Michael, >> >> We have support of MatCreateRedundantMatrix for dense matrices. For example, petsc/src/mat/examples/tests/ex9.c: >> >> mpiexec -n 4 ./ex9 -mat_type dense -view_mat -nsubcomms 2 >> >> >> >> Hong >> >> >> >> On Wed, Sep 18, 2019 at 5:40 PM Povolotskyi, Mykhailo via petsc-users wrote: >> >> Dear Petsc developers, >> >> >> >> I found that MatCreateRedundantMatrix does not support dense matrices. 
>> >> >> >> This causes the following problem: I cannot use CISS eigensolver from >> >> SLEPC with dense matrices with parallelization over quadrature points. >> >> >> >> Is it possible for you to add this support? >> >> >> >> Thank you, >> >> >> >> Michael. >> >> >> >> >> >> p.s. I apologize if you received this e-mail twice, I sent if first from >> >> a different address. >> >> >> > From paeanball at gmail.com Fri Sep 20 06:53:02 2019 From: paeanball at gmail.com (Bao Kai) Date: Fri, 20 Sep 2019 13:53:02 +0200 Subject: [petsc-users] Reading in the full matrix in one process and then trying to solve in parallel with PETSc In-Reply-To: References: Message-ID: Hi, I understand that PETSc is not designed to be used this way, while I am wondering if someone have done something similar to this. We have the full matrix from a simulation and rhs vector. We would like to read in through PETSc in one process, then we use some partition functions to partition the matrix. Based on the partition information, we redistribute the matrix among the processes. Then we solve it in parallel. It is for testing the performance of some parallel linear solver and preconditions. We are not in the position to develop a full parallel implementation of the simulator yet. Thanks. Cheers, Kai Bao From knepley at gmail.com Fri Sep 20 07:08:28 2019 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 20 Sep 2019 08:08:28 -0400 Subject: [petsc-users] Reading in the full matrix in one process and then trying to solve in parallel with PETSc In-Reply-To: References: Message-ID: On Fri, Sep 20, 2019 at 7:54 AM Bao Kai via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hi, > > I understand that PETSc is not designed to be used this way, while I > am wondering if someone have done something similar to this. > > We have the full matrix from a simulation and rhs vector. We would > like to read in through PETSc in one process, then we use some > partition functions to partition the matrix. > > Based on the partition information, we redistribute the matrix among > the processes. Then we solve it in parallel. It is for testing the > performance of some parallel linear solver and preconditions. > > We are not in the position to develop a full parallel implementation > of the simulator yet. > This is not hard to do. 1) Write a simple serial converter that reads in your matrix in whatever format you have it, and output it in PETSc Binary format using MatView() with a binary viewer. Same for the vector, and they can be in the same file. 2) Run your parallel code and use MatLoad/VecLoad for the system and it will automatically partition it. Thanks, Matt > Thanks. > > Cheers, > Kai Bao > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From paeanball at gmail.com Fri Sep 20 07:43:36 2019 From: paeanball at gmail.com (Bao Kai) Date: Fri, 20 Sep 2019 14:43:36 +0200 Subject: [petsc-users] Reading in the full matrix in one process and then trying to solve in parallel with PETSc In-Reply-To: References: Message-ID: Thanks a lot, Matt. I will do that. 
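(To illustrate the two-step workflow suggested above, a minimal sketch; the file name "system.bin" is an assumption, the two fragments belong to two separate programs, and error checking plus PetscInitialize/PetscFinalize are omitted for brevity.)

/* Step 1: serial converter, run on one process.  A and b are assumed to have
   been assembled already as sequential PETSc objects from the native format. */
PetscViewer vout;
PetscViewerBinaryOpen(PETSC_COMM_SELF, "system.bin", FILE_MODE_WRITE, &vout);
MatView(A, vout);                  /* matrix first ...            */
VecView(b, vout);                  /* ... rhs into the same file  */
PetscViewerDestroy(&vout);

/* Step 2: parallel driver, run with mpiexec -n <np>.  MatLoad/VecLoad read the
   binary file and distribute the rows over the processes automatically. */
Mat         A;
Vec         b;
PetscViewer vin;
PetscViewerBinaryOpen(PETSC_COMM_WORLD, "system.bin", FILE_MODE_READ, &vin);
MatCreate(PETSC_COMM_WORLD, &A);
MatSetFromOptions(A);
MatLoad(A, vin);
VecCreate(PETSC_COMM_WORLD, &b);
VecSetFromOptions(b);
VecLoad(b, vin);
PetscViewerDestroy(&vin);

(After the load, the distributed matrix and vector can be handed straight to a KSP, or repartitioned first if a different distribution is wanted.)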
Cheers, Kai On Fri, Sep 20, 2019 at 2:08 PM Matthew Knepley wrote: > > On Fri, Sep 20, 2019 at 7:54 AM Bao Kai via petsc-users wrote: >> >> Hi, >> >> I understand that PETSc is not designed to be used this way, while I >> am wondering if someone have done something similar to this. >> >> We have the full matrix from a simulation and rhs vector. We would >> like to read in through PETSc in one process, then we use some >> partition functions to partition the matrix. >> >> Based on the partition information, we redistribute the matrix among >> the processes. Then we solve it in parallel. It is for testing the >> performance of some parallel linear solver and preconditions. >> >> We are not in the position to develop a full parallel implementation >> of the simulator yet. > > > This is not hard to do. > > 1) Write a simple serial converter that reads in your matrix in whatever format you have it, and output it in PETSc Binary format > using MatView() with a binary viewer. Same for the vector, and they can be in the same file. > > 2) Run your parallel code and use MatLoad/VecLoad for the system and it will automatically partition it. > > Thanks, > > Matt > >> >> Thanks. >> >> Cheers, >> Kai Bao > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ From paeanball at gmail.com Fri Sep 20 06:46:31 2019 From: paeanball at gmail.com (Bao Kai) Date: Fri, 20 Sep 2019 13:46:31 +0200 Subject: [petsc-users] Reading in the full matrix in one process and then trying to solve in parallel with PETSc Message-ID: Hi, I understand that PETSc is not designed to be used this way, while I am wondering if someone have done something similar to this. We have the full matrix from a simulation and rhs vector. We would like to read in through PETSc in one process, then we use some partition functions to partition the matrix. Based on the partition information, we redistribute the matrix among the processes. Then we solve it in parallel. It is for testing the performance of some parallel linear solver and preconditions. We are not in the position to develop a full parallel implementation of the simulator yet. Thanks. Cheers, Kai Bao From mpovolot at purdue.edu Fri Sep 20 12:55:00 2019 From: mpovolot at purdue.edu (Povolotskyi, Mykhailo) Date: Fri, 20 Sep 2019 17:55:00 +0000 Subject: [petsc-users] question about installing petsc3.11 Message-ID: <8e82203a-d694-f3d3-8f10-8e36401822b6@purdue.edu> Hello, I'm upgrading petsc from 3.8 to 3.11. In doing so, I see an error message: ?UNABLE to CONFIGURE with GIVEN OPTIONS??? (see configure.log for details): ------------------------------------------------------------------------------- Cannot use SuperLU_DIST with 64 bit BLAS/Lapack indices ******************************************************************************* I wonder why this configuration step worked well for 3.8?? I did not change anything else but version of petsc. Thank you, Michael. 
From knepley at gmail.com Fri Sep 20 14:41:01 2019 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 20 Sep 2019 15:41:01 -0400 Subject: [petsc-users] question about installing petsc3.11 In-Reply-To: <8e82203a-d694-f3d3-8f10-8e36401822b6@purdue.edu> References: <8e82203a-d694-f3d3-8f10-8e36401822b6@purdue.edu> Message-ID: On Fri, Sep 20, 2019 at 1:55 PM Povolotskyi, Mykhailo via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hello, > > I'm upgrading petsc from 3.8 to 3.11. > > In doing so, I see an error message: > > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for > details): > > ------------------------------------------------------------------------------- > Cannot use SuperLU_DIST with 64 bit BLAS/Lapack indices > > ******************************************************************************* > > I wonder why this configuration step worked well for 3.8? I did not > change anything else but version of petsc. > This never worked. We are just checking now. Thanks, Matt > Thank you, > > Michael. > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mpovolot at purdue.edu Fri Sep 20 14:43:59 2019 From: mpovolot at purdue.edu (Povolotskyi, Mykhailo) Date: Fri, 20 Sep 2019 19:43:59 +0000 Subject: [petsc-users] question about installing petsc3.11 In-Reply-To: References: <8e82203a-d694-f3d3-8f10-8e36401822b6@purdue.edu> Message-ID: Does it mean I have to configure petsc with --with-64-bit-indices=1 ? On 09/20/2019 03:41 PM, Matthew Knepley wrote: On Fri, Sep 20, 2019 at 1:55 PM Povolotskyi, Mykhailo via petsc-users > wrote: Hello, I'm upgrading petsc from 3.8 to 3.11. In doing so, I see an error message: UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): ------------------------------------------------------------------------------- Cannot use SuperLU_DIST with 64 bit BLAS/Lapack indices ******************************************************************************* I wonder why this configuration step worked well for 3.8? I did not change anything else but version of petsc. This never worked. We are just checking now. Thanks, Matt Thank you, Michael. -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Fri Sep 20 14:53:34 2019 From: balay at mcs.anl.gov (Balay, Satish) Date: Fri, 20 Sep 2019 19:53:34 +0000 Subject: [petsc-users] question about installing petsc3.11 In-Reply-To: References: <8e82203a-d694-f3d3-8f10-8e36401822b6@purdue.edu> Message-ID: --with-64-bit-indices=1 => PetscInt = int64_t --known-64-bit-blas-indices=1 => blas specified uses 64bit indices. What is your requirement (use case)? Satish On Fri, 20 Sep 2019, Povolotskyi, Mykhailo via petsc-users wrote: > Does it mean I have to configure petsc with --with-64-bit-indices=1 ? > > On 09/20/2019 03:41 PM, Matthew Knepley wrote: > On Fri, Sep 20, 2019 at 1:55 PM Povolotskyi, Mykhailo via petsc-users > wrote: > Hello, > > I'm upgrading petsc from 3.8 to 3.11. 
> > In doing so, I see an error message: > > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): > ------------------------------------------------------------------------------- > Cannot use SuperLU_DIST with 64 bit BLAS/Lapack indices > ******************************************************************************* > > I wonder why this configuration step worked well for 3.8? I did not > change anything else but version of petsc. > > This never worked. We are just checking now. > > Thanks, > > Matt > > Thank you, > > Michael. > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > From knepley at gmail.com Fri Sep 20 14:52:43 2019 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 20 Sep 2019 15:52:43 -0400 Subject: [petsc-users] question about installing petsc3.11 In-Reply-To: References: <8e82203a-d694-f3d3-8f10-8e36401822b6@purdue.edu> Message-ID: On Fri, Sep 20, 2019 at 3:44 PM Povolotskyi, Mykhailo wrote: > Does it mean I have to configure petsc with --with-64-bit-indices=1 ? > If you do not send configure.log, we have no way of figuring out what is going on. Thanks, Matt > On 09/20/2019 03:41 PM, Matthew Knepley wrote: > > On Fri, Sep 20, 2019 at 1:55 PM Povolotskyi, Mykhailo via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >> Hello, >> >> I'm upgrading petsc from 3.8 to 3.11. >> >> In doing so, I see an error message: >> >> UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for >> details): >> >> ------------------------------------------------------------------------------- >> Cannot use SuperLU_DIST with 64 bit BLAS/Lapack indices >> >> ******************************************************************************* >> >> I wonder why this configuration step worked well for 3.8? I did not >> change anything else but version of petsc. >> > > This never worked. We are just checking now. > > Thanks, > > Matt > > >> Thank you, >> >> Michael. >> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mpovolot at purdue.edu Fri Sep 20 15:06:24 2019 From: mpovolot at purdue.edu (Povolotskyi, Mykhailo) Date: Fri, 20 Sep 2019 20:06:24 +0000 Subject: [petsc-users] question about installing petsc3.11 In-Reply-To: References: <8e82203a-d694-f3d3-8f10-8e36401822b6@purdue.edu> Message-ID: I have to apologize. By mistake I was installing the new version in the directory where the old version already existed. After I cleaned everything, I do not see that error message anymore. Yes, the error message was somewhat misleading, but I will not be able to reproduce it. Michael. On 09/20/2019 03:53 PM, Balay, Satish wrote: > --with-64-bit-indices=1 => PetscInt = int64_t > --known-64-bit-blas-indices=1 => blas specified uses 64bit indices. > > What is your requirement (use case)? 
> > Satish > > On Fri, 20 Sep 2019, Povolotskyi, Mykhailo via petsc-users wrote: > >> Does it mean I have to configure petsc with --with-64-bit-indices=1 ? >> >> On 09/20/2019 03:41 PM, Matthew Knepley wrote: >> On Fri, Sep 20, 2019 at 1:55 PM Povolotskyi, Mykhailo via petsc-users > wrote: >> Hello, >> >> I'm upgrading petsc from 3.8 to 3.11. >> >> In doing so, I see an error message: >> >> UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): >> ------------------------------------------------------------------------------- >> Cannot use SuperLU_DIST with 64 bit BLAS/Lapack indices >> ******************************************************************************* >> >> I wonder why this configuration step worked well for 3.8? I did not >> change anything else but version of petsc. >> >> This never worked. We are just checking now. >> >> Thanks, >> >> Matt >> >> Thank you, >> >> Michael. >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> From mpovolot at purdue.edu Fri Sep 20 15:18:00 2019 From: mpovolot at purdue.edu (Povolotskyi, Mykhailo) Date: Fri, 20 Sep 2019 20:18:00 +0000 Subject: [petsc-users] reproduced the problem In-Reply-To: References: <8e82203a-d694-f3d3-8f10-8e36401822b6@purdue.edu> Message-ID: <10956ef1-9fb0-d495-6d91-779e80a63624@purdue.edu> Dear Matthew and Satish, I just wrote that the error disappeared, but it still exists (I had to wait longer). The configuration log can be accessed here: https://www.dropbox.com/s/tmkksemu294j719/configure.log?dl=0 Sorry for the last e-mail. Michael. On 09/20/2019 03:53 PM, Balay, Satish wrote: > --with-64-bit-indices=1 => PetscInt = int64_t > --known-64-bit-blas-indices=1 => blas specified uses 64bit indices. > > What is your requirement (use case)? > > Satish > > On Fri, 20 Sep 2019, Povolotskyi, Mykhailo via petsc-users wrote: > >> Does it mean I have to configure petsc with --with-64-bit-indices=1 ? >> >> On 09/20/2019 03:41 PM, Matthew Knepley wrote: >> On Fri, Sep 20, 2019 at 1:55 PM Povolotskyi, Mykhailo via petsc-users > wrote: >> Hello, >> >> I'm upgrading petsc from 3.8 to 3.11. >> >> In doing so, I see an error message: >> >> UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): >> ------------------------------------------------------------------------------- >> Cannot use SuperLU_DIST with 64 bit BLAS/Lapack indices >> ******************************************************************************* >> >> I wonder why this configuration step worked well for 3.8? I did not >> change anything else but version of petsc. >> >> This never worked. We are just checking now. >> >> Thanks, >> >> Matt >> >> Thank you, >> >> Michael. >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
>> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> From balay at mcs.anl.gov Fri Sep 20 15:32:01 2019 From: balay at mcs.anl.gov (Balay, Satish) Date: Fri, 20 Sep 2019 20:32:01 +0000 Subject: [petsc-users] reproduced the problem In-Reply-To: <10956ef1-9fb0-d495-6d91-779e80a63624@purdue.edu> References: <8e82203a-d694-f3d3-8f10-8e36401822b6@purdue.edu> <10956ef1-9fb0-d495-6d91-779e80a63624@purdue.edu> Message-ID: >>>>>>>>> ================================================================================ TEST checkRuntimeIssues from config.packages.BlasLapack(/depot/kildisha/apps/brown/nemo5/libs/petsc/build-real3.11/config/BuildSystem/config/packages/BlasLapack.py:579) TESTING: checkRuntimeIssues from config.packages.BlasLapack(config/BuildSystem/config/packages/BlasLapack.py:579) Determines if BLAS/LAPACK routines use 32 or 64 bit integers Checking if BLAS/LAPACK routines use 32 or 64 bit integersExecuting: mpicc -c -o /tmp/petsc-wf99X2/config.packages.BlasLapack/conftest.o -I/tmp/petsc-wf99X2/config.setCompilers -I/tmp/petsc-wf99X2/config.compilers -I/tmp/petsc-wf99X2/config.utilities.closure -I/tmp/petsc-wf99X2/config.headers -I/tmp/petsc-wf99X2/config.utilities.cacheDetails -I/tmp/petsc-wf99X2/config.atomics -I/tmp/petsc-wf99X2/config.libraries -I/tmp/petsc-wf99X2/config.functions -I/tmp/petsc-wf99X2/config.utilities.featureTestMacros -I/tmp/petsc-wf99X2/config.utilities.missing -I/tmp/petsc-wf99X2/config.types -I/tmp/petsc-wf99X2/config.packages.MPI -I/tmp/petsc-wf99X2/config.packages.valgrind -I/tmp/petsc-wf99X2/config.packages.pthread -I/tmp/petsc-wf99X2/config.packages.metis -I/tmp/petsc-wf99X2/config.packages.hdf5 -I/tmp/petsc-wf99X2/config.packages.BlasLapack -fopenmp -fPIC /tmp/petsc-wf99X2/config.packages.BlasLapack/conftest.c Successful compile: Source: #include "confdefs.h" #include "conffix.h" #include #if STDC_HEADERS #include #include #include #endif int main() { FILE *output = fopen("runtimetestoutput","w"); extern double ddot_(const int*,const double*,const int *,const double*,const int*); double x1mkl[4] = {3.0,5.0,7.0,9.0}; int one1mkl = 1,nmkl = 2; double dotresultmkl = 0; dotresultmkl = ddot_(&nmkl,x1mkl,&one1mkl,x1mkl,&one1mkl); fprintf(output, "-known-64-bit-blas-indices=%d",dotresultmkl != 34);; return 0; } Executing: mpicc -o /tmp/petsc-wf99X2/config.packages.BlasLapack/conftest -fopenmp -fPIC /tmp/petsc-wf99X2/config.packages.BlasLapack/conftest.o -L/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_gnu_thread -lmkl_core -lm -lstdc++ -ldl -L/apps/brown/openmpi.20190215/2.1.6_gcc-5.2.0/lib -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lgfortran -lm -L/apps/cent7/gcc/5.2.0/lib/gcc/x86_64-unknown-linux-gnu/5.2.0 -L/apps/cent7/gcc/5.2.0/lib64 -L/apps/cent7/gcc/5.2.0/lib -Wl,-rpath,/apps/brown/openmpi.20190215/2.1.6_gcc-5.2.0/lib -lgfortran -lm -lgomp -lgcc_s -lquadmath -lpthread -lstdc++ -ldl Testing executable /tmp/petsc-wf99X2/config.packages.BlasLapack/conftest to see if it can be run Executing: /tmp/petsc-wf99X2/config.packages.BlasLapack/conftest Executing: /tmp/petsc-wf99X2/config.packages.BlasLapack/conftest ERROR while running executable: Could not execute "['/tmp/petsc-wf99X2/config.packages.BlasLapack/conftest']": /tmp/petsc-wf99X2/config.packages.BlasLapack/conftest: error while loading shared libraries: libmkl_intel_lp64.so: cannot open shared object file: No such file or directory Defined "HAVE_64BIT_BLAS_INDICES" to "1" Checking for 64 bit blas indices: program did not return 
therefor assuming 64 bit blas indices Defined "HAVE_LIBMKL_INTEL_ILP64" to "1" <<<<<<<< So this test has an error but yet the flag HAVE_64BIT_BLAS_INDICES is set. Is your compiler not returning correct error codes? Does it make a difference if you also specify -Wl,-rpath along with -L in --with-blaslapack-lib option? Satish On Fri, 20 Sep 2019, Povolotskyi, Mykhailo wrote: > Dear Matthew and Satish, > > I just wrote that the error disappeared, but it still exists (I had to > wait longer). > > The configuration log can be accessed here: > > https://www.dropbox.com/s/tmkksemu294j719/configure.log?dl=0 > > Sorry for the last e-mail. > > Michael. > > > On 09/20/2019 03:53 PM, Balay, Satish wrote: > > --with-64-bit-indices=1 => PetscInt = int64_t > > --known-64-bit-blas-indices=1 => blas specified uses 64bit indices. > > > > What is your requirement (use case)? > > > > Satish > > > > On Fri, 20 Sep 2019, Povolotskyi, Mykhailo via petsc-users wrote: > > > >> Does it mean I have to configure petsc with --with-64-bit-indices=1 ? > >> > >> On 09/20/2019 03:41 PM, Matthew Knepley wrote: > >> On Fri, Sep 20, 2019 at 1:55 PM Povolotskyi, Mykhailo via petsc-users > wrote: > >> Hello, > >> > >> I'm upgrading petsc from 3.8 to 3.11. > >> > >> In doing so, I see an error message: > >> > >> UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): > >> ------------------------------------------------------------------------------- > >> Cannot use SuperLU_DIST with 64 bit BLAS/Lapack indices > >> ******************************************************************************* > >> > >> I wonder why this configuration step worked well for 3.8? I did not > >> change anything else but version of petsc. > >> > >> This never worked. We are just checking now. > >> > >> Thanks, > >> > >> Matt > >> > >> Thank you, > >> > >> Michael. > >> > >> > >> > >> -- > >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > >> -- Norbert Wiener > >> > >> https://www.cse.buffalo.edu/~knepley/ > >> > >> > > From jed at jedbrown.org Fri Sep 20 15:44:31 2019 From: jed at jedbrown.org (Jed Brown) Date: Fri, 20 Sep 2019 14:44:31 -0600 Subject: [petsc-users] Reading in the full matrix in one process and then trying to solve in parallel with PETSc In-Reply-To: References: Message-ID: <87lfuivgsw.fsf@jedbrown.org> Matthew Knepley via petsc-users writes: > On Fri, Sep 20, 2019 at 7:54 AM Bao Kai via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >> Hi, >> >> I understand that PETSc is not designed to be used this way, while I >> am wondering if someone have done something similar to this. >> >> We have the full matrix from a simulation and rhs vector. We would >> like to read in through PETSc in one process, then we use some >> partition functions to partition the matrix. >> >> Based on the partition information, we redistribute the matrix among >> the processes. Then we solve it in parallel. It is for testing the >> performance of some parallel linear solver and preconditions. >> >> We are not in the position to develop a full parallel implementation >> of the simulator yet. An alternative is to assemble a Mat living on a parallel communicator, but with all entries on rank 0 (so just call your serial code to build the matrix). You can do the same for your vector, then KSPSolve. 
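(For concreteness, a sketch of that layout; it is not from the original message, the global size N and the diagonal-only fill are placeholders standing in for the existing serial assembly, and error checking is omitted.)

Mat         A;
Vec         b, x;
KSP         ksp;
PetscMPIInt rank;
PetscInt    i, N = 1000, nloc;                    /* N: placeholder global size */

MPI_Comm_rank(PETSC_COMM_WORLD, &rank);
nloc = (rank == 0) ? N : 0;                       /* every row is owned by rank 0 */

MatCreate(PETSC_COMM_WORLD, &A);
MatSetSizes(A, nloc, nloc, N, N);
MatSetFromOptions(A);
MatSetUp(A);
if (rank == 0) {
  for (i = 0; i < N; i++) MatSetValue(A, i, i, 2.0, INSERT_VALUES);  /* serial assembly goes here */
}
MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);

VecCreate(PETSC_COMM_WORLD, &b);
VecSetSizes(b, nloc, N);
VecSetFromOptions(b);
/* fill b on rank 0 with VecSetValues, then VecAssemblyBegin/End */
VecDuplicate(b, &x);

KSPCreate(PETSC_COMM_WORLD, &ksp);
KSPSetOperators(ksp, A, A);
KSPSetFromOptions(ksp);       /* picks up the run-time options described next */
KSPSolve(ksp, b, x);          /* x comes back in the same layout, i.e. on rank 0 */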
To make the solver parallel, just use run-time options: -ksp_type preonly -pc_type redistribute will redistribute automatically inside the solver and return the solution vector to you on rank 0. You can control the inner solver via prefix -redistribute_ksp_type gmres -redistribute_pc_type gamg -redistibute_ksp_monitor From mpovolot at purdue.edu Fri Sep 20 17:38:24 2019 From: mpovolot at purdue.edu (Povolotskyi, Mykhailo) Date: Fri, 20 Sep 2019 22:38:24 +0000 Subject: [petsc-users] reproduced the problem In-Reply-To: References: <8e82203a-d694-f3d3-8f10-8e36401822b6@purdue.edu> <10956ef1-9fb0-d495-6d91-779e80a63624@purdue.edu> Message-ID: <13db4493-5113-5667-2b34-b93d7a9090de@purdue.edu> Hello Satish, I did what you suggested, now the error is different: ??? UNABLE to CONFIGURE with GIVEN OPTIONS??? (see configure.log for details): ------------------------------------------------------------------------------- Cannot use SuperLU_DIST without enabling C++11, see --with-cxx-dialect=C++11 ******************************************************************************* The updated configure.log is here: https://www.dropbox.com/s/tmkksemu294j719/configure.log?dl=0 On 9/20/2019 4:32 PM, Balay, Satish wrote: > ================================================================================ > TEST checkRuntimeIssues from config.packages.BlasLapack(/depot/kildisha/apps/brown/nemo5/libs/petsc/build-real3.11/config/BuildSystem/config/packages/BlasLapack.py:579) > TESTING: checkRuntimeIssues from config.packages.BlasLapack(config/BuildSystem/config/packages/BlasLapack.py:579) > Determines if BLAS/LAPACK routines use 32 or 64 bit integers > Checking if BLAS/LAPACK routines use 32 or 64 bit integersExecuting: mpicc -c -o /tmp/petsc-wf99X2/config.packages.BlasLapack/conftest.o -I/tmp/petsc-wf99X2/config.setCompilers -I/tmp/petsc-wf99X2/config.compilers -I/tmp/petsc-wf99X2/config.utilities.closure -I/tmp/petsc-wf99X2/config.headers -I/tmp/petsc-wf99X2/config.utilities.cacheDetails -I/tmp/petsc-wf99X2/config.atomics -I/tmp/petsc-wf99X2/config.libraries -I/tmp/petsc-wf99X2/config.functions -I/tmp/petsc-wf99X2/config.utilities.featureTestMacros -I/tmp/petsc-wf99X2/config.utilities.missing -I/tmp/petsc-wf99X2/config.types -I/tmp/petsc-wf99X2/config.packages.MPI -I/tmp/petsc-wf99X2/config.packages.valgrind -I/tmp/petsc-wf99X2/config.packages.pthread -I/tmp/petsc-wf99X2/config.packages.metis -I/tmp/petsc-wf99X2/config.packages.hdf5 -I/tmp/petsc-wf99X2/config.packages.BlasLapack -fopenmp -fPIC /tmp/petsc-wf99X2/config.packages.BlasLapack/conftest.c > Successful compile: > Source: > #include "confdefs.h" > #include "conffix.h" > #include > #if STDC_HEADERS > #include > #include > #include > #endif > > int main() { > FILE *output = fopen("runtimetestoutput","w"); > extern double ddot_(const int*,const double*,const int *,const double*,const int*); > double x1mkl[4] = {3.0,5.0,7.0,9.0}; > int one1mkl = 1,nmkl = 2; > double dotresultmkl = 0; > dotresultmkl = ddot_(&nmkl,x1mkl,&one1mkl,x1mkl,&one1mkl); > fprintf(output, "-known-64-bit-blas-indices=%d",dotresultmkl != 34);; > return 0; > } > Executing: mpicc -o /tmp/petsc-wf99X2/config.packages.BlasLapack/conftest -fopenmp -fPIC /tmp/petsc-wf99X2/config.packages.BlasLapack/conftest.o -L/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_gnu_thread -lmkl_core -lm -lstdc++ -ldl -L/apps/brown/openmpi.20190215/2.1.6_gcc-5.2.0/lib -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lgfortran -lm 
-L/apps/cent7/gcc/5.2.0/lib/gcc/x86_64-unknown-linux-gnu/5.2.0 -L/apps/cent7/gcc/5.2.0/lib64 -L/apps/cent7/gcc/5.2.0/lib -Wl,-rpath,/apps/brown/openmpi.20190215/2.1.6_gcc-5.2.0/lib -lgfortran -lm -lgomp -lgcc_s -lquadmath -lpthread -lstdc++ -ldl > Testing executable /tmp/petsc-wf99X2/config.packages.BlasLapack/conftest to see if it can be run > Executing: /tmp/petsc-wf99X2/config.packages.BlasLapack/conftest > Executing: /tmp/petsc-wf99X2/config.packages.BlasLapack/conftest > ERROR while running executable: Could not execute "['/tmp/petsc-wf99X2/config.packages.BlasLapack/conftest']": > /tmp/petsc-wf99X2/config.packages.BlasLapack/conftest: error while loading shared libraries: libmkl_intel_lp64.so: cannot open shared object file: No such file or directory > > Defined "HAVE_64BIT_BLAS_INDICES" to "1" > Checking for 64 bit blas indices: program did not return therefor assuming 64 bit blas indices > Defined "HAVE_LIBMKL_INTEL_ILP64" to "1" > > <<<<<<<< > > So this test has an error but yet the flag HAVE_64BIT_BLAS_INDICES is set. > > Is your compiler not returning correct error codes? > > Does it make a difference if you also specify -Wl,-rpath along with -L in --with-blaslapack-lib option? > > > Satish > > On Fri, 20 Sep 2019, Povolotskyi, Mykhailo wrote: > >> Dear Matthew and Satish, >> >> I just wrote that the error disappeared, but it still exists (I had to >> wait longer). >> >> The configuration log can be accessed here: >> >> https://www.dropbox.com/s/tmkksemu294j719/configure.log?dl=0 >> >> Sorry for the last e-mail. >> >> Michael. >> >> >> On 09/20/2019 03:53 PM, Balay, Satish wrote: >>> --with-64-bit-indices=1 => PetscInt = int64_t >>> --known-64-bit-blas-indices=1 => blas specified uses 64bit indices. >>> >>> What is your requirement (use case)? >>> >>> Satish >>> >>> On Fri, 20 Sep 2019, Povolotskyi, Mykhailo via petsc-users wrote: >>> >>>> Does it mean I have to configure petsc with --with-64-bit-indices=1 ? >>>> >>>> On 09/20/2019 03:41 PM, Matthew Knepley wrote: >>>> On Fri, Sep 20, 2019 at 1:55 PM Povolotskyi, Mykhailo via petsc-users > wrote: >>>> Hello, >>>> >>>> I'm upgrading petsc from 3.8 to 3.11. >>>> >>>> In doing so, I see an error message: >>>> >>>> UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): >>>> ------------------------------------------------------------------------------- >>>> Cannot use SuperLU_DIST with 64 bit BLAS/Lapack indices >>>> ******************************************************************************* >>>> >>>> I wonder why this configuration step worked well for 3.8? I did not >>>> change anything else but version of petsc. >>>> >>>> This never worked. We are just checking now. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> Thank you, >>>> >>>> Michael. >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >> From mfadams at lbl.gov Fri Sep 20 18:16:17 2019 From: mfadams at lbl.gov (Mark Adams) Date: Fri, 20 Sep 2019 19:16:17 -0400 Subject: [petsc-users] Undefined symbols for architecture x86_64: "_dmviewfromoptions_", Message-ID: DMViewFromOptions does not seem to have Fortran bindings and I don't see it on the web page for DM methods. 
I was able to get it to compile using PetscObjectViewFromOptions FYI, It seems to be an inlined thing, thus missing the web page and Fortran bindings: include/petscdm.h:PETSC_STATIC_INLINE PetscErrorCode DMViewFromOptions(DM A,PetscObject obj,const char name[]) {return PetscObjectViewFromOptions((PetscObject)A,obj,name);} 18:53 2 mark/feature-xgc-interface *+ ~/Codes/petsc/src/dm/impls/plex/examples/tutorials$ make ex6f90 /Users/markadams/homebrew/Cellar/mpich/3.3.1/bin/mpif90 -Wl,-multiply_defined,suppress -Wl,-multiply_defined -Wl,suppress -Wl,-commons,use_dylibs -Wl,-search_paths_first -Wl,-no_compact_unwind -Wall -ffree-line-length-0 -Wno-unused-dummy-argument -g -I/Users/markadams/Codes/petsc/include -I/Users/markadams/Codes/petsc/arch-macosx-gnu-g/include -I/opt/X11/include -I/Users/markadams/homebrew/Cellar/mpich/3.3.1/include ex6f90.F90 -Wl,-rpath,/Users/markadams/Codes/petsc/arch-macosx-gnu-g/lib -L/Users/markadams/Codes/petsc/arch-macosx-gnu-g/lib -Wl,-rpath,/Users/markadams/Codes/petsc/arch-macosx-gnu-g/lib -L/Users/markadams/Codes/petsc/arch-macosx-gnu-g/lib -Wl,-rpath,/opt/X11/lib -L/opt/X11/lib -Wl,-rpath,/Users/markadams/homebrew/Cellar/mpich/3.3.1/lib -L/Users/markadams/homebrew/Cellar/mpich/3.3.1/lib -Wl,-rpath,/Users/markadams/homebrew/Cellar/gcc/9.1.0/lib/gcc/9/gcc/x86_64-apple-darwin18/9.1.0 -L/Users/markadams/homebrew/Cellar/gcc/9.1.0/lib/gcc/9/gcc/x86_64-apple-darwin18/9.1.0 -Wl,-rpath,/Users/markadams/homebrew/Cellar/gcc/9.1.0/lib/gcc/9 -L/Users/markadams/homebrew/Cellar/gcc/9.1.0/lib/gcc/9 -lpetsc -lHYPRE -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lscalapack -lsuperlu -lsuperlu_dist -lfftw3_mpi -lfftw3 -lp4est -lsc -llapack -lblas -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -lchaco -lparmetis -lmetis -ltriangle -lz -lX11 -lctetgen -lc++ -ldl -lmpifort -lmpi -lpmpi -lgfortran -lquadmath -lm -lc++ -ldl -o ex6f90 Undefined symbols for architecture x86_64: "_dmviewfromoptions_", referenced from: _MAIN__ in ccALMXJ2.o ld: symbol(s) not found for architecture x86_64 collect2: error: ld returned 1 exit status make: *** [ex6f90] Error 1 -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Fri Sep 20 18:24:08 2019 From: balay at mcs.anl.gov (Balay, Satish) Date: Fri, 20 Sep 2019 23:24:08 +0000 Subject: [petsc-users] reproduced the problem In-Reply-To: <13db4493-5113-5667-2b34-b93d7a9090de@purdue.edu> References: <8e82203a-d694-f3d3-8f10-8e36401822b6@purdue.edu> <10956ef1-9fb0-d495-6d91-779e80a63624@purdue.edu> <13db4493-5113-5667-2b34-b93d7a9090de@purdue.edu> Message-ID: As the message says - you need to use configure option --with-cxx-dialect=C++11 with --download-superlu_dist [this requirement is automated in petsc/master so extra configure option is no longer required] Satish On Fri, 20 Sep 2019, Povolotskyi, Mykhailo via petsc-users wrote: > Hello Satish, > > I did what you suggested, now the error is different: > > > ??? UNABLE to CONFIGURE with GIVEN OPTIONS??? 
(see configure.log for > details): > ------------------------------------------------------------------------------- > Cannot use SuperLU_DIST without enabling C++11, see --with-cxx-dialect=C++11 > ******************************************************************************* > > The updated configure.log is here: > > https://www.dropbox.com/s/tmkksemu294j719/configure.log?dl=0 > > On 9/20/2019 4:32 PM, Balay, Satish wrote: > > ================================================================================ > > TEST checkRuntimeIssues from config.packages.BlasLapack(/depot/kildisha/apps/brown/nemo5/libs/petsc/build-real3.11/config/BuildSystem/config/packages/BlasLapack.py:579) > > TESTING: checkRuntimeIssues from config.packages.BlasLapack(config/BuildSystem/config/packages/BlasLapack.py:579) > > Determines if BLAS/LAPACK routines use 32 or 64 bit integers > > Checking if BLAS/LAPACK routines use 32 or 64 bit integersExecuting: mpicc -c -o /tmp/petsc-wf99X2/config.packages.BlasLapack/conftest.o -I/tmp/petsc-wf99X2/config.setCompilers -I/tmp/petsc-wf99X2/config.compilers -I/tmp/petsc-wf99X2/config.utilities.closure -I/tmp/petsc-wf99X2/config.headers -I/tmp/petsc-wf99X2/config.utilities.cacheDetails -I/tmp/petsc-wf99X2/config.atomics -I/tmp/petsc-wf99X2/config.libraries -I/tmp/petsc-wf99X2/config.functions -I/tmp/petsc-wf99X2/config.utilities.featureTestMacros -I/tmp/petsc-wf99X2/config.utilities.missing -I/tmp/petsc-wf99X2/config.types -I/tmp/petsc-wf99X2/config.packages.MPI -I/tmp/petsc-wf99X2/config.packages.valgrind -I/tmp/petsc-wf99X2/config.packages.pthread -I/tmp/petsc-wf99X2/config.packages.metis -I/tmp/petsc-wf99X2/config.packages.hdf5 -I/tmp/petsc-wf99X2/config.packages.BlasLapack -fopenmp -fPIC /tmp/petsc-wf99X2/config.packages.BlasLapack/conftest.c > > Successful compile: > > Source: > > #include "confdefs.h" > > #include "conffix.h" > > #include > > #if STDC_HEADERS > > #include > > #include > > #include > > #endif > > > > int main() { > > FILE *output = fopen("runtimetestoutput","w"); > > extern double ddot_(const int*,const double*,const int *,const double*,const int*); > > double x1mkl[4] = {3.0,5.0,7.0,9.0}; > > int one1mkl = 1,nmkl = 2; > > double dotresultmkl = 0; > > dotresultmkl = ddot_(&nmkl,x1mkl,&one1mkl,x1mkl,&one1mkl); > > fprintf(output, "-known-64-bit-blas-indices=%d",dotresultmkl != 34);; > > return 0; > > } > > Executing: mpicc -o /tmp/petsc-wf99X2/config.packages.BlasLapack/conftest -fopenmp -fPIC /tmp/petsc-wf99X2/config.packages.BlasLapack/conftest.o -L/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_gnu_thread -lmkl_core -lm -lstdc++ -ldl -L/apps/brown/openmpi.20190215/2.1.6_gcc-5.2.0/lib -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lgfortran -lm -L/apps/cent7/gcc/5.2.0/lib/gcc/x86_64-unknown-linux-gnu/5.2.0 -L/apps/cent7/gcc/5.2.0/lib64 -L/apps/cent7/gcc/5.2.0/lib -Wl,-rpath,/apps/brown/openmpi.20190215/2.1.6_gcc-5.2.0/lib -lgfortran -lm -lgomp -lgcc_s -lquadmath -lpthread -lstdc++ -ldl > > Testing executable /tmp/petsc-wf99X2/config.packages.BlasLapack/conftest to see if it can be run > > Executing: /tmp/petsc-wf99X2/config.packages.BlasLapack/conftest > > Executing: /tmp/petsc-wf99X2/config.packages.BlasLapack/conftest > > ERROR while running executable: Could not execute "['/tmp/petsc-wf99X2/config.packages.BlasLapack/conftest']": > > /tmp/petsc-wf99X2/config.packages.BlasLapack/conftest: error while loading shared libraries: libmkl_intel_lp64.so: cannot open shared object file: No such file 
or directory > > > > Defined "HAVE_64BIT_BLAS_INDICES" to "1" > > Checking for 64 bit blas indices: program did not return therefor assuming 64 bit blas indices > > Defined "HAVE_LIBMKL_INTEL_ILP64" to "1" > > > > <<<<<<<< > > > > So this test has an error but yet the flag HAVE_64BIT_BLAS_INDICES is set. > > > > Is your compiler not returning correct error codes? > > > > Does it make a difference if you also specify -Wl,-rpath along with -L in --with-blaslapack-lib option? > > > > > > Satish > > > > On Fri, 20 Sep 2019, Povolotskyi, Mykhailo wrote: > > > >> Dear Matthew and Satish, > >> > >> I just wrote that the error disappeared, but it still exists (I had to > >> wait longer). > >> > >> The configuration log can be accessed here: > >> > >> https://www.dropbox.com/s/tmkksemu294j719/configure.log?dl=0 > >> > >> Sorry for the last e-mail. > >> > >> Michael. > >> > >> > >> On 09/20/2019 03:53 PM, Balay, Satish wrote: > >>> --with-64-bit-indices=1 => PetscInt = int64_t > >>> --known-64-bit-blas-indices=1 => blas specified uses 64bit indices. > >>> > >>> What is your requirement (use case)? > >>> > >>> Satish > >>> > >>> On Fri, 20 Sep 2019, Povolotskyi, Mykhailo via petsc-users wrote: > >>> > >>>> Does it mean I have to configure petsc with --with-64-bit-indices=1 ? > >>>> > >>>> On 09/20/2019 03:41 PM, Matthew Knepley wrote: > >>>> On Fri, Sep 20, 2019 at 1:55 PM Povolotskyi, Mykhailo via petsc-users > wrote: > >>>> Hello, > >>>> > >>>> I'm upgrading petsc from 3.8 to 3.11. > >>>> > >>>> In doing so, I see an error message: > >>>> > >>>> UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): > >>>> ------------------------------------------------------------------------------- > >>>> Cannot use SuperLU_DIST with 64 bit BLAS/Lapack indices > >>>> ******************************************************************************* > >>>> > >>>> I wonder why this configuration step worked well for 3.8? I did not > >>>> change anything else but version of petsc. > >>>> > >>>> This never worked. We are just checking now. > >>>> > >>>> Thanks, > >>>> > >>>> Matt > >>>> > >>>> Thank you, > >>>> > >>>> Michael. > >>>> > >>>> > >>>> > >>>> -- > >>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > >>>> -- Norbert Wiener > >>>> > >>>> https://www.cse.buffalo.edu/~knepley/ > >>>> > >>>> > >> > From knepley at gmail.com Fri Sep 20 20:14:05 2019 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 20 Sep 2019 21:14:05 -0400 Subject: [petsc-users] reproduced the problem In-Reply-To: References: <8e82203a-d694-f3d3-8f10-8e36401822b6@purdue.edu> <10956ef1-9fb0-d495-6d91-779e80a63624@purdue.edu> <13db4493-5113-5667-2b34-b93d7a9090de@purdue.edu> Message-ID: On Fri, Sep 20, 2019 at 7:24 PM Balay, Satish via petsc-users < petsc-users at mcs.anl.gov> wrote: > As the message says - you need to use configure option > --with-cxx-dialect=C++11 with --download-superlu_dist > SuperLU_DIst recently put some C++ code in it that needs C++-11. IT was not there last time you used it. 
Thanks, Matt > [this requirement is automated in petsc/master so extra configure option > is no longer required] > > Satish > > On Fri, 20 Sep 2019, Povolotskyi, Mykhailo via petsc-users wrote: > > > Hello Satish, > > > > I did what you suggested, now the error is different: > > > > > > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for > > details): > > > ------------------------------------------------------------------------------- > > Cannot use SuperLU_DIST without enabling C++11, see > --with-cxx-dialect=C++11 > > > ******************************************************************************* > > > > The updated configure.log is here: > > > > https://www.dropbox.com/s/tmkksemu294j719/configure.log?dl=0 > > > > On 9/20/2019 4:32 PM, Balay, Satish wrote: > > > > ================================================================================ > > > TEST checkRuntimeIssues from > config.packages.BlasLapack(/depot/kildisha/apps/brown/nemo5/libs/petsc/build-real3.11/config/BuildSystem/config/packages/BlasLapack.py:579) > > > TESTING: checkRuntimeIssues from > config.packages.BlasLapack(config/BuildSystem/config/packages/BlasLapack.py:579) > > > Determines if BLAS/LAPACK routines use 32 or 64 bit integers > > > Checking if BLAS/LAPACK routines use 32 or 64 bit integersExecuting: > mpicc -c -o /tmp/petsc-wf99X2/config.packages.BlasLapack/conftest.o > -I/tmp/petsc-wf99X2/config.setCompilers > -I/tmp/petsc-wf99X2/config.compilers > -I/tmp/petsc-wf99X2/config.utilities.closure > -I/tmp/petsc-wf99X2/config.headers > -I/tmp/petsc-wf99X2/config.utilities.cacheDetails > -I/tmp/petsc-wf99X2/config.atomics -I/tmp/petsc-wf99X2/config.libraries > -I/tmp/petsc-wf99X2/config.functions > -I/tmp/petsc-wf99X2/config.utilities.featureTestMacros > -I/tmp/petsc-wf99X2/config.utilities.missing > -I/tmp/petsc-wf99X2/config.types -I/tmp/petsc-wf99X2/config.packages.MPI > -I/tmp/petsc-wf99X2/config.packages.valgrind > -I/tmp/petsc-wf99X2/config.packages.pthread > -I/tmp/petsc-wf99X2/config.packages.metis > -I/tmp/petsc-wf99X2/config.packages.hdf5 > -I/tmp/petsc-wf99X2/config.packages.BlasLapack -fopenmp -fPIC > /tmp/petsc-wf99X2/config.packages.BlasLapack/conftest.c > > > Successful compile: > > > Source: > > > #include "confdefs.h" > > > #include "conffix.h" > > > #include > > > #if STDC_HEADERS > > > #include > > > #include > > > #include > > > #endif > > > > > > int main() { > > > FILE *output = fopen("runtimetestoutput","w"); > > > extern double ddot_(const int*,const double*,const int *,const > double*,const int*); > > > double x1mkl[4] = {3.0,5.0,7.0,9.0}; > > > int one1mkl = 1,nmkl = 2; > > > double dotresultmkl = 0; > > > dotresultmkl = > ddot_(&nmkl,x1mkl,&one1mkl,x1mkl,&one1mkl); > > > fprintf(output, > "-known-64-bit-blas-indices=%d",dotresultmkl != 34);; > > > return 0; > > > } > > > Executing: mpicc -o > /tmp/petsc-wf99X2/config.packages.BlasLapack/conftest -fopenmp -fPIC > /tmp/petsc-wf99X2/config.packages.BlasLapack/conftest.o > -L/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/lib/intel64 > -lmkl_intel_lp64 -lmkl_gnu_thread -lmkl_core -lm -lstdc++ -ldl > -L/apps/brown/openmpi.20190215/2.1.6_gcc-5.2.0/lib -lmpi_usempif08 > -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lgfortran -lm > -L/apps/cent7/gcc/5.2.0/lib/gcc/x86_64-unknown-linux-gnu/5.2.0 > -L/apps/cent7/gcc/5.2.0/lib64 -L/apps/cent7/gcc/5.2.0/lib > -Wl,-rpath,/apps/brown/openmpi.20190215/2.1.6_gcc-5.2.0/lib -lgfortran -lm > -lgomp -lgcc_s -lquadmath -lpthread -lstdc++ -ldl > > > Testing executable > 
/tmp/petsc-wf99X2/config.packages.BlasLapack/conftest to see if it can be > run > > > Executing: /tmp/petsc-wf99X2/config.packages.BlasLapack/conftest > > > Executing: /tmp/petsc-wf99X2/config.packages.BlasLapack/conftest > > > ERROR while running executable: Could not execute > "['/tmp/petsc-wf99X2/config.packages.BlasLapack/conftest']": > > > /tmp/petsc-wf99X2/config.packages.BlasLapack/conftest: error while > loading shared libraries: libmkl_intel_lp64.so: cannot open shared object > file: No such file or directory > > > > > > Defined "HAVE_64BIT_BLAS_INDICES" to "1" > > > Checking for 64 bit blas indices: program did not return therefor > assuming 64 bit blas indices > > > Defined "HAVE_LIBMKL_INTEL_ILP64" to "1" > > > > > > <<<<<<<< > > > > > > So this test has an error but yet the flag HAVE_64BIT_BLAS_INDICES is > set. > > > > > > Is your compiler not returning correct error codes? > > > > > > Does it make a difference if you also specify -Wl,-rpath along with -L > in --with-blaslapack-lib option? > > > > > > > > > Satish > > > > > > On Fri, 20 Sep 2019, Povolotskyi, Mykhailo wrote: > > > > > >> Dear Matthew and Satish, > > >> > > >> I just wrote that the error disappeared, but it still exists (I had to > > >> wait longer). > > >> > > >> The configuration log can be accessed here: > > >> > > >> https://www.dropbox.com/s/tmkksemu294j719/configure.log?dl=0 > > >> > > >> Sorry for the last e-mail. > > >> > > >> Michael. > > >> > > >> > > >> On 09/20/2019 03:53 PM, Balay, Satish wrote: > > >>> --with-64-bit-indices=1 => PetscInt = int64_t > > >>> --known-64-bit-blas-indices=1 => blas specified uses 64bit indices. > > >>> > > >>> What is your requirement (use case)? > > >>> > > >>> Satish > > >>> > > >>> On Fri, 20 Sep 2019, Povolotskyi, Mykhailo via petsc-users wrote: > > >>> > > >>>> Does it mean I have to configure petsc with --with-64-bit-indices=1 > ? > > >>>> > > >>>> On 09/20/2019 03:41 PM, Matthew Knepley wrote: > > >>>> On Fri, Sep 20, 2019 at 1:55 PM Povolotskyi, Mykhailo via > petsc-users > > wrote: > > >>>> Hello, > > >>>> > > >>>> I'm upgrading petsc from 3.8 to 3.11. > > >>>> > > >>>> In doing so, I see an error message: > > >>>> > > >>>> UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log > for details): > > >>>> > ------------------------------------------------------------------------------- > > >>>> Cannot use SuperLU_DIST with 64 bit BLAS/Lapack indices > > >>>> > ******************************************************************************* > > >>>> > > >>>> I wonder why this configuration step worked well for 3.8? I did not > > >>>> change anything else but version of petsc. > > >>>> > > >>>> This never worked. We are just checking now. > > >>>> > > >>>> Thanks, > > >>>> > > >>>> Matt > > >>>> > > >>>> Thank you, > > >>>> > > >>>> Michael. > > >>>> > > >>>> > > >>>> > > >>>> -- > > >>>> What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > >>>> -- Norbert Wiener > > >>>> > > >>>> https://www.cse.buffalo.edu/~knepley/< > http://www.cse.buffalo.edu/%7Eknepley/> > > >>>> > > >>>> > > >> > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at mcs.anl.gov Fri Sep 20 20:20:10 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Sat, 21 Sep 2019 01:20:10 +0000 Subject: [petsc-users] Undefined symbols for architecture x86_64: "_dmviewfromoptions_", In-Reply-To: References: Message-ID: <0E59E2AA-2B85-4320-A3D9-A091C1BD36FA@anl.gov> Currently none of the XXXViewFromOptions() have manual pages or Fortran stubs/interfaces. It is probably easier to remove them as inline functions and instead write them as full functions which just call PetscObjectViewFromOptions() with manual pages then the Fortran stubs/interfaces will be built automatically. Barry > On Sep 20, 2019, at 6:16 PM, Mark Adams via petsc-users wrote: > > DMViewFromOptions does not seem to have Fortran bindings and I don't see it on the web page for DM methods. > > I was able to get it to compile using PetscObjectViewFromOptions > > FYI, > It seems to be an inlined thing, thus missing the web page and Fortran bindings: > > include/petscdm.h:PETSC_STATIC_INLINE PetscErrorCode DMViewFromOptions(DM A,PetscObject obj,const char name[]) {return PetscObjectViewFromOptions((PetscObject)A,obj,name);} > > > > 18:53 2 mark/feature-xgc-interface *+ ~/Codes/petsc/src/dm/impls/plex/examples/tutorials$ make ex6f90 > /Users/markadams/homebrew/Cellar/mpich/3.3.1/bin/mpif90 -Wl,-multiply_defined,suppress -Wl,-multiply_defined -Wl,suppress -Wl,-commons,use_dylibs -Wl,-search_paths_first -Wl,-no_compact_unwind -Wall -ffree-line-length-0 -Wno-unused-dummy-argument -g -I/Users/markadams/Codes/petsc/include -I/Users/markadams/Codes/petsc/arch-macosx-gnu-g/include -I/opt/X11/include -I/Users/markadams/homebrew/Cellar/mpich/3.3.1/include ex6f90.F90 -Wl,-rpath,/Users/markadams/Codes/petsc/arch-macosx-gnu-g/lib -L/Users/markadams/Codes/petsc/arch-macosx-gnu-g/lib -Wl,-rpath,/Users/markadams/Codes/petsc/arch-macosx-gnu-g/lib -L/Users/markadams/Codes/petsc/arch-macosx-gnu-g/lib -Wl,-rpath,/opt/X11/lib -L/opt/X11/lib -Wl,-rpath,/Users/markadams/homebrew/Cellar/mpich/3.3.1/lib -L/Users/markadams/homebrew/Cellar/mpich/3.3.1/lib -Wl,-rpath,/Users/markadams/homebrew/Cellar/gcc/9.1.0/lib/gcc/9/gcc/x86_64-apple-darwin18/9.1.0 -L/Users/markadams/homebrew/Cellar/gcc/9.1.0/lib/gcc/9/gcc/x86_64-apple-darwin18/9.1.0 -Wl,-rpath,/Users/markadams/homebrew/Cellar/gcc/9.1.0/lib/gcc/9 -L/Users/markadams/homebrew/Cellar/gcc/9.1.0/lib/gcc/9 -lpetsc -lHYPRE -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lscalapack -lsuperlu -lsuperlu_dist -lfftw3_mpi -lfftw3 -lp4est -lsc -llapack -lblas -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -lchaco -lparmetis -lmetis -ltriangle -lz -lX11 -lctetgen -lc++ -ldl -lmpifort -lmpi -lpmpi -lgfortran -lquadmath -lm -lc++ -ldl -o ex6f90 > Undefined symbols for architecture x86_64: > "_dmviewfromoptions_", referenced from: > _MAIN__ in ccALMXJ2.o > ld: symbol(s) not found for architecture x86_64 > collect2: error: ld returned 1 exit status > make: *** [ex6f90] Error 1 From jed at jedbrown.org Fri Sep 20 21:35:13 2019 From: jed at jedbrown.org (Jed Brown) Date: Fri, 20 Sep 2019 20:35:13 -0600 Subject: [petsc-users] Undefined symbols for architecture x86_64: "_dmviewfromoptions_", In-Reply-To: <0E59E2AA-2B85-4320-A3D9-A091C1BD36FA@anl.gov> References: <0E59E2AA-2B85-4320-A3D9-A091C1BD36FA@anl.gov> Message-ID: <8736gqv0ke.fsf@jedbrown.org> "Smith, Barry F. via petsc-users" writes: > Currently none of the XXXViewFromOptions() have manual pages or Fortran stubs/interfaces. 
It is probably easier to remove them as inline functions and instead write them as full functions which just call PetscObjectViewFromOptions() with manual pages then the Fortran stubs/interfaces will be built automatically. PetscObjectViewFromOptions has a custom interface because it takes a string. Fortran users could call that today, rather than wait for stubs to be written. From bsmith at mcs.anl.gov Fri Sep 20 21:38:29 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Sat, 21 Sep 2019 02:38:29 +0000 Subject: [petsc-users] Undefined symbols for architecture x86_64: "_dmviewfromoptions_", In-Reply-To: <8736gqv0ke.fsf@jedbrown.org> References: <0E59E2AA-2B85-4320-A3D9-A091C1BD36FA@anl.gov> <8736gqv0ke.fsf@jedbrown.org> Message-ID: <612AB019-D647-4327-9026-DFAE0DAEAD49@mcs.anl.gov> Oh yes, I didn't notice that. The stubs and interfaces cannot be generated automatically, but cut, paste, and make a mistake will work. > On Sep 20, 2019, at 9:35 PM, Jed Brown wrote: > > "Smith, Barry F. via petsc-users" writes: > >> Currently none of the XXXViewFromOptions() have manual pages or Fortran stubs/interfaces. It is probably easier to remove them as inline functions and instead write them as full functions which just call PetscObjectViewFromOptions() with manual pages then the Fortran stubs/interfaces will be built automatically. > > PetscObjectViewFromOptions has a custom interface because it takes a string. > > Fortran users could call that today, rather than wait for stubs to be > written. From swarnava89 at gmail.com Mon Sep 23 19:45:44 2019 From: swarnava89 at gmail.com (Swarnava Ghosh) Date: Mon, 23 Sep 2019 17:45:44 -0700 Subject: [petsc-users] DMPlex cell number containing a point in space In-Reply-To: References: Message-ID: Hi Matt, I am trying to get this working. However, It seems that this does not work for 3D. I have tets as elements, and the dmplex is serial. I get the following error: 0]PETSC ERROR: No support for this operation for this object type [0]PETSC ERROR: I have only coded this for 2D [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. Is there a work around to this? would you please let me know? Also here is my code PetscSF cellSF=NULL; Vec v; PetscErrorCode ierr; // create v; VecCreate(PETSC_COMM_SELF,&v); VecSetSizes(v,PETSC_DECIDE,3); VecSetBlockSize(v,3); VecSetFromOptions(v); VecSetValue(v,0,0.0,INSERT_VALUES); VecSetValue(v,1,0.1,INSERT_VALUES); VecSetValue(v,2,0.12,INSERT_VALUES); VecAssemblyBegin(v); VecAssemblyEnd(v); PetscInt bs; VecGetBlockSize(v,&bs); printf("Block size of v=%d \n",bs); ierr=DMLocatePoints(pCgdft->dmatom,v,DM_POINTLOCATION_NEAREST,&cellSF); // print vector VecView(v,PETSC_VIEWER_STDOUT_SELF); Sincerely, SG On Mon, Sep 16, 2019 at 6:37 AM Matthew Knepley wrote: > On Fri, Sep 6, 2019 at 6:07 PM Swarnava Ghosh via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >> Dear Petsc developers and users, >> >> I have a DMPlex mesh in 3D. Given a point with (x,y,z) coordinates, I am >> trying the find the cell number in which this point lies, and the vertices >> of the cell. Is there any DMPlex function that will give me the cell number? >> > > Sorry, I lost this mail. > > In serial, you can just use DMLocatePoint(). If you have some points and > you are not > sure which process they might be located on, then you need a > DMInterpolation context. 
> > Thanks, > > Matt > > >> Thank you, >> SG >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Sep 24 04:17:00 2019 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 24 Sep 2019 05:17:00 -0400 Subject: [petsc-users] DMPlex cell number containing a point in space In-Reply-To: References: Message-ID: On Mon, Sep 23, 2019 at 8:46 PM Swarnava Ghosh wrote: > Hi Matt, > > I am trying to get this working. However, It seems that this does not work > for 3D. I have tets as elements, and the dmplex is serial. I get the > following error: > > 0]PETSC ERROR: No support for this operation for this object type > [0]PETSC ERROR: I have only coded this for 2D > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > Is there a work around to this? would you please let me know? > Did you give ' -dm_plex_hash_location'? It does not seem possible that you got to this code without giving that option. Thanks, Matt > Also here is my code > > PetscSF cellSF=NULL; > Vec v; > PetscErrorCode ierr; > > // create v; > VecCreate(PETSC_COMM_SELF,&v); > VecSetSizes(v,PETSC_DECIDE,3); > VecSetBlockSize(v,3); > VecSetFromOptions(v); > > VecSetValue(v,0,0.0,INSERT_VALUES); > VecSetValue(v,1,0.1,INSERT_VALUES); > VecSetValue(v,2,0.12,INSERT_VALUES); > > VecAssemblyBegin(v); > VecAssemblyEnd(v); > > PetscInt bs; > VecGetBlockSize(v,&bs); > > printf("Block size of v=%d \n",bs); > > ierr=DMLocatePoints(pCgdft->dmatom,v,DM_POINTLOCATION_NEAREST,&cellSF); > > // print vector > VecView(v,PETSC_VIEWER_STDOUT_SELF); > > > Sincerely, > SG > > On Mon, Sep 16, 2019 at 6:37 AM Matthew Knepley wrote: > >> On Fri, Sep 6, 2019 at 6:07 PM Swarnava Ghosh via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> >>> Dear Petsc developers and users, >>> >>> I have a DMPlex mesh in 3D. Given a point with (x,y,z) coordinates, I am >>> trying the find the cell number in which this point lies, and the vertices >>> of the cell. Is there any DMPlex function that will give me the cell number? >>> >> >> Sorry, I lost this mail. >> >> In serial, you can just use DMLocatePoint(). If you have some points and >> you are not >> sure which process they might be located on, then you need a >> DMInterpolation context. >> >> Thanks, >> >> Matt >> >> >>> Thank you, >>> SG >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From marco.cisternino at optimad.it Tue Sep 24 06:06:26 2019 From: marco.cisternino at optimad.it (Marco Cisternino) Date: Tue, 24 Sep 2019 11:06:26 +0000 Subject: [petsc-users] Multiple linear solver defined at command line Message-ID: Good morning, in my code I need to solve 2 linear systems. 
I would like to use different solvers for the 2 systems and most of all I would like to choose the single solver by flags from command line, is it possible? I can call PetscInitialize/PetscFinalize multiple times passing PetscInitialize different argc and argv. What happens if I call the second PetscInitiliaze before the first PetscFinalize with different argc and argv? Thanks. Bests, Marco Cisternino From wencel at gmail.com Tue Sep 24 06:59:18 2019 From: wencel at gmail.com (Lawrence Mitchell) Date: Tue, 24 Sep 2019 12:59:18 +0100 Subject: [petsc-users] Multiple linear solver defined at command line In-Reply-To: References: Message-ID: Dear Marco, > On 24 Sep 2019, at 12:06, Marco Cisternino via petsc-users wrote: > > Good morning, > in my code I need to solve 2 linear systems. I would like to use different solvers for the 2 systems and most of all I would like to choose the single solver by flags from command line, is it possible? > I can call PetscInitialize/PetscFinalize multiple times passing PetscInitialize different argc and argv. What happens if I call the second PetscInitiliaze before the first PetscFinalize with different argc and argv? The way you should do this is by giving your two different solvers two different options prefixes: Assuming they are KSP objects call: KSPSetOptionsPrefix(ksp1, "solver1_"); KSPSetOptionsPrefix(ksp2, "solver2_"); Now you can configure ksp1 with: -solver1_ksp_type ... -solver1_pc_type ... And ksp2 with: -solver2_ksp_type ... -solver2_pc_type ... In general, all PETSc objects can be given such an options prefix so that they may be controlled separately. Thanks, Lawrence From marco.cisternino at optimad.it Tue Sep 24 07:32:44 2019 From: marco.cisternino at optimad.it (Marco Cisternino) Date: Tue, 24 Sep 2019 12:32:44 +0000 Subject: [petsc-users] R: Multiple linear solver defined at command line In-Reply-To: References: , Message-ID: Thank you, Lawrence. Cool! That's perfect! Bests, Marco Cisternino ________________________________________ Da: Lawrence Mitchell Inviato: marted? 24 settembre 2019 13:59 A: Marco Cisternino Cc: petsc-users Oggetto: Re: [petsc-users] Multiple linear solver defined at command line Dear Marco, > On 24 Sep 2019, at 12:06, Marco Cisternino via petsc-users wrote: > > Good morning, > in my code I need to solve 2 linear systems. I would like to use different solvers for the 2 systems and most of all I would like to choose the single solver by flags from command line, is it possible? > I can call PetscInitialize/PetscFinalize multiple times passing PetscInitialize different argc and argv. What happens if I call the second PetscInitiliaze before the first PetscFinalize with different argc and argv? The way you should do this is by giving your two different solvers two different options prefixes: Assuming they are KSP objects call: KSPSetOptionsPrefix(ksp1, "solver1_"); KSPSetOptionsPrefix(ksp2, "solver2_"); Now you can configure ksp1 with: -solver1_ksp_type ... -solver1_pc_type ... And ksp2 with: -solver2_ksp_type ... -solver2_pc_type ... In general, all PETSc objects can be given such an options prefix so that they may be controlled separately. Thanks, Lawrence From yjwu16 at gmail.com Tue Sep 24 10:16:58 2019 From: yjwu16 at gmail.com (Yingjie Wu) Date: Tue, 24 Sep 2019 23:16:58 +0800 Subject: [petsc-users] Problem about Scaling Message-ID: Respected Petsc developers Hi, I am currently using SNES to solve some non-linear PDEs. The model is a two-dimensional X-Y geometry. 
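Stepping back to the options-prefix exchange above for a moment, a minimal sketch of the two-solver setup Lawrence describes; the names ksp1, ksp2, A1, A2 and the executable name are placeholders, and KSPSetFromOptions() must be called after the prefix is set so that the prefixed options are picked up.

KSP            ksp1, ksp2;
PetscErrorCode ierr;

ierr = KSPCreate(PETSC_COMM_WORLD, &ksp1);CHKERRQ(ierr);
ierr = KSPSetOptionsPrefix(ksp1, "solver1_");CHKERRQ(ierr);
ierr = KSPSetOperators(ksp1, A1, A1);CHKERRQ(ierr);       /* A1: matrix of the first system  */
ierr = KSPSetFromOptions(ksp1);CHKERRQ(ierr);             /* reads -solver1_ksp_type, -solver1_pc_type, ... */

ierr = KSPCreate(PETSC_COMM_WORLD, &ksp2);CHKERRQ(ierr);
ierr = KSPSetOptionsPrefix(ksp2, "solver2_");CHKERRQ(ierr);
ierr = KSPSetOperators(ksp2, A2, A2);CHKERRQ(ierr);       /* A2: matrix of the second system */
ierr = KSPSetFromOptions(ksp2);CHKERRQ(ierr);             /* reads -solver2_ksp_type, -solver2_pc_type, ... */

A corresponding command line could then look like

mpirun -n 2 ./app -solver1_ksp_type gmres -solver1_pc_type bjacobi -solver2_ksp_type cg -solver2_pc_type jacobi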
Because the magnitude of different physical variables is too large, it is difficult to find the direction in Krylov subspace, and the residual descends very slowly or even does not converge. I think my PDEs need scaling. I need some help to solve the following quentions. 1. I use - snes_mf_operator, so instead of providing Jacobian matrix, I only set up an approximate Jacobian matrix for precondition. For my model, do I just need to magnify the residuals to the same level? Is there any need to modify the precondition matrix? 2. I have seen some articles referring to the non-dimensional method. I don't know how to implement this method in the program and how difficult it is to implement. Thanks, Yingjie -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Sep 24 10:27:10 2019 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 24 Sep 2019 11:27:10 -0400 Subject: [petsc-users] Problem about Scaling In-Reply-To: References: Message-ID: On Tue, Sep 24, 2019 at 11:17 AM Yingjie Wu via petsc-users < petsc-users at mcs.anl.gov> wrote: > Respected Petsc developers > Hi, > I am currently using SNES to solve some non-linear PDEs. The model is a > two-dimensional X-Y geometry. Because the magnitude of different physical > variables is too large, it is difficult to find the direction in Krylov > subspace, and the residual descends very slowly or even does not converge. > I think my PDEs need scaling. I need some help to solve the following > quentions. > > 1. I use - snes_mf_operator, so instead of providing Jacobian matrix, I > only set up an approximate Jacobian matrix for precondition. For my model, > do I just need to magnify the residuals to the same level? Is there any > need to modify the precondition matrix? > 2. I have seen some articles referring to the non-dimensional method. I > don't know how to implement this method in the program and how difficult it > is to implement. > That answer to 1 and 2 is the same. Nondimensionalize your system, and in the the process scale your unknowns so that they are about the same magnitude. Here is a great article on this process https://epubs.siam.org/doi/pdf/10.1137/16M1107127 Thanks, Matt > Thanks, > Yingjie > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mvalera-w at sdsu.edu Tue Sep 24 14:58:07 2019 From: mvalera-w at sdsu.edu (Manuel Valera) Date: Tue, 24 Sep 2019 12:58:07 -0700 Subject: [petsc-users] TS scheme with different DAs In-Reply-To: References: Message-ID: Hello all, I finally implemented the TS routine operating in several DAs at the same time, hacking it as you suggested. I still have a problem with my algorithm though. It is not DMDA related so there's that. My algorithm needs to update u,v,w with information from the updated T,S,rho. My problem, or what I don't understand yet, is how to operate in the intermediate runge-kutta time integration states inside the RHSFunction. If I can be more clear, I would need the intermediate T,S states to obtain an updated rho (density) to in turn, obtain the correct intermediate velocities, and keep the loop going. As I understand right now, the RHS vector is different from this intermediate state, and it would be only the RHS input to the loop, so operating on this would be incorrect. 
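One point that may help with the question above: the state vector that TS hands to the callback registered with TSSetRHSFunction() is already the current Runge-Kutta stage value, not the solution at the previous step, so the intermediate T and S can be read from it and the density rebuilt inside the callback before the velocity tendencies are formed. A minimal sketch, assuming a single 5-DOF 3D DMDA ordered (u,v,w,T,S) kept in a user context; the equation of state and the tendency lines are placeholders, not the model discussed above.

typedef struct { DM da; } AppCtx;                         /* hypothetical user context */

PetscErrorCode RHSFunction(TS ts, PetscReal t, Vec Ustage, Vec F, void *ctx)
{
  AppCtx         *user = (AppCtx*)ctx;
  Vec            Uloc;
  PetscScalar    ****u, ****f;
  PetscInt       i, j, k, xs, ys, zs, xm, ym, zm;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = DMGetLocalVector(user->da, &Uloc);CHKERRQ(ierr);
  ierr = DMGlobalToLocalBegin(user->da, Ustage, INSERT_VALUES, Uloc);CHKERRQ(ierr);
  ierr = DMGlobalToLocalEnd(user->da, Ustage, INSERT_VALUES, Uloc);CHKERRQ(ierr);
  ierr = DMDAVecGetArrayDOFRead(user->da, Uloc, &u);CHKERRQ(ierr);
  ierr = DMDAVecGetArrayDOF(user->da, F, &f);CHKERRQ(ierr);
  ierr = DMDAGetCorners(user->da, &xs, &ys, &zs, &xm, &ym, &zm);CHKERRQ(ierr);
  for (k = zs; k < zs + zm; k++) {
    for (j = ys; j < ys + ym; j++) {
      for (i = xs; i < xs + xm; i++) {
        const PetscScalar T   = u[k][j][i][3];             /* stage temperature */
        const PetscScalar S   = u[k][j][i][4];             /* stage salinity    */
        const PetscScalar rho = 1000.0 - 0.2*T + 0.8*S;    /* placeholder equation of state */
        f[k][j][i][0] = -rho*u[k][j][i][0];                /* placeholder tendencies; the real  */
        f[k][j][i][1] = -rho*u[k][j][i][1];                /* terms are built from the stage    */
        f[k][j][i][2] = -rho*u[k][j][i][2];                /* values u and the recomputed rho   */
        f[k][j][i][3] = 0.0;
        f[k][j][i][4] = 0.0;
      }
    }
  }
  ierr = DMDAVecRestoreArrayDOFRead(user->da, Uloc, &u);CHKERRQ(ierr);
  ierr = DMDAVecRestoreArrayDOF(user->da, F, &f);CHKERRQ(ierr);
  ierr = DMRestoreLocalVector(user->da, &Uloc);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}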
As of now, my algorithm still creates artifacts because of this lack of information to accurately update all of the variables at the same time. The problem happens as well in serial. Thanks for your help, On Wed, Sep 18, 2019 at 4:36 AM Matthew Knepley wrote: > On Tue, Sep 17, 2019 at 8:27 PM Smith, Barry F. > wrote: > >> >> Don't be too quick to dismiss switching to the DMStag you may find that >> it actually takes little time to convert and then you have a much less >> cumbersome process to manage the staggered grid. Take a look at >> src/dm/impls/stag/examples/tutorials/ex2.c where >> >> const PetscInt dof0 = 0, dof1 = 1,dof2 = 1; /* 1 dof on each edge and >> element center */ >> const PetscInt stencilWidth = 1; >> ierr = >> DMStagCreate2d(PETSC_COMM_WORLD,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,7,9,PETSC_DECIDE,PETSC_DECIDE,dof0,dof1,dof2,DMSTAG_STENCIL_BOX,stencilWidth,NULL,NULL,&dmSol);CHKERRQ(ierr); >> >> BOOM, it has set up a staggered grid with 1 cell centered variable and 1 >> on each edge. Adding more the cell centers, vertices, or edges is trivial. >> >> If you want to stick to DMDA you >> >> "cheat". Depending on exactly what staggering you have you make the DMDA >> for the "smaller problem" as large as the other ones and just track zeros >> in those locations. For example if velocities are "edges" and T, S are on >> cells, make your "cells" DMDA one extra grid width wide in all three >> dimensions. You may need to be careful on the boundaries deepening on the >> types of boundary conditions. >> > > Yes, SNES ex30 does exactly this. However, I still recommend looking at > DMStag. Patrick created it because managing the DMDA > became such as headache. > > Thanks, > > Matt > > >> > On Sep 17, 2019, at 7:04 PM, Manuel Valera via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> > >> > Thanks Matthew, but my code is too complicated to be redone on DMStag >> now after spending a long time using DMDAs, >> > >> > Is there a way to ensure PETSc distributes several DAs in the same way? >> besides manually distributing the points, >> > >> > Thanks, >> > >> > On Tue, Sep 17, 2019 at 3:28 PM Matthew Knepley >> wrote: >> > On Tue, Sep 17, 2019 at 6:15 PM Manuel Valera via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> > Hello, petsc users, >> > >> > I have integrated the TS routines in my code, but i just noticed i >> didn't do it optimally. I was using 3 different TS objects to integrate >> velocities, temperature and salinity, and it works but only for small DTs. >> I suspect the intermediate Runge-Kutta states are unphased and this creates >> the discrepancy for broader time steps, so I need to integrate the 3 >> quantities in the same routine. >> > >> > I tried to do this by using a 5 DOF distributed array for the RHS, >> where I store the velocities in the first 3 and then Temperature and >> Salinity in the rest. The problem is that I use a staggered grid and T,S >> are located in a different DA layout than the velocities. This is creating >> problems for me since I can't find a way to communicate the information >> from the result of the TS integration back to the respective DAs of each >> variable. >> > >> > Is there a way to communicate across DAs? or can you suggest an >> alternative solution to this problem? >> > >> > If you have a staggered discretization on a structured grid, I would >> recommend checking out DMStag. 
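For a flavor of what the DMStag recommendation looks like in practice, a small 2D sketch mirroring the ex2.c tutorial quoted just above: one value per edge and one per element center live in a single vector, and "location slots" translate the physical position into the last array index. All names and sizes are placeholders.

DM             dmSol;
Vec            sol, solLocal;
PetscScalar    ***arr;
PetscInt       eSlot, lSlot, dSlot, x, y, m, n, nExtrax, nExtray, i, j;
PetscErrorCode ierr;

ierr = DMStagCreate2d(PETSC_COMM_WORLD, DM_BOUNDARY_NONE, DM_BOUNDARY_NONE, 7, 9,
                      PETSC_DECIDE, PETSC_DECIDE, 0, 1, 1,
                      DMSTAG_STENCIL_BOX, 1, NULL, NULL, &dmSol);CHKERRQ(ierr);
ierr = DMSetUp(dmSol);CHKERRQ(ierr);
ierr = DMCreateGlobalVector(dmSol, &sol);CHKERRQ(ierr);
ierr = DMGetLocalVector(dmSol, &solLocal);CHKERRQ(ierr);
ierr = DMGlobalToLocal(dmSol, sol, INSERT_VALUES, solLocal);CHKERRQ(ierr);

ierr = DMStagGetLocationSlot(dmSol, DMSTAG_ELEMENT, 0, &eSlot);CHKERRQ(ierr);  /* cell center */
ierr = DMStagGetLocationSlot(dmSol, DMSTAG_LEFT,    0, &lSlot);CHKERRQ(ierr);  /* left edge   */
ierr = DMStagGetLocationSlot(dmSol, DMSTAG_DOWN,    0, &dSlot);CHKERRQ(ierr);  /* bottom edge */
ierr = DMStagVecGetArrayRead(dmSol, solLocal, &arr);CHKERRQ(ierr);
ierr = DMStagGetCorners(dmSol, &x, &y, NULL, &m, &n, NULL, &nExtrax, &nExtray, NULL);CHKERRQ(ierr);
for (j = y; j < y + n; j++) {
  for (i = x; i < x + m; i++) {
    const PetscScalar Tc = arr[j][i][eSlot];   /* cell-centered value, e.g. T or S    */
    const PetscScalar ul = arr[j][i][lSlot];   /* staggered value on the left edge    */
    const PetscScalar vd = arr[j][i][dSlot];   /* staggered value on the bottom edge  */
    (void)Tc; (void)ul; (void)vd;              /* use them in the stencil as needed   */
  }
}
ierr = DMStagVecRestoreArrayRead(dmSol, solLocal, &arr);CHKERRQ(ierr);
ierr = DMRestoreLocalVector(dmSol, &solLocal);CHKERRQ(ierr);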
>> > >> > Thanks, >> > >> > MAtt >> > >> > Thanks, >> > >> > >> > >> > -- >> > What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> > -- Norbert Wiener >> > >> > https://www.cse.buffalo.edu/~knepley/ >> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mpovolot at purdue.edu Wed Sep 25 00:27:35 2019 From: mpovolot at purdue.edu (Povolotskyi, Mykhailo) Date: Wed, 25 Sep 2019 05:27:35 +0000 Subject: [petsc-users] question about small matrices Message-ID: Dear Petsc developers, in my application I have to solve millions of linear and non-linear systems with small matrices (2x2, 3x3,..., 10x10). I consider them as dense, and use SNES with KSP method PREONLY, and LU preconditioner. I found that when KSPSolve is called only 25% of time is spend in lapack, the rest is PETSc overhead. I know how to call lapack directly to solve a linear system. Question: is it possible to call lapack directly in the SNES solver to avoid the KSPSolve overhead? Thank you, Michael. From knepley at gmail.com Wed Sep 25 02:12:11 2019 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 25 Sep 2019 03:12:11 -0400 Subject: [petsc-users] question about small matrices In-Reply-To: References: Message-ID: On Wed, Sep 25, 2019 at 1:27 AM Povolotskyi, Mykhailo via petsc-users < petsc-users at mcs.anl.gov> wrote: > Dear Petsc developers, > > in my application I have to solve millions of linear and non-linear > systems with small matrices (2x2, 3x3,..., 10x10). > > I consider them as dense, and use SNES with KSP method PREONLY, and LU > preconditioner. > > I found that when KSPSolve is called only 25% of time is spend in > lapack, the rest is PETSc overhead. > > I know how to call lapack directly to solve a linear system. > > Question: is it possible to call lapack directly in the SNES solver to > avoid the KSPSolve overhead? > Question: Do you solve a bunch of them simultaneously? Thanks, Matt > Thank you, > > Michael. > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From Eugenio.Aulisa at ttu.edu Wed Sep 25 09:04:48 2019 From: Eugenio.Aulisa at ttu.edu (Aulisa, Eugenio) Date: Wed, 25 Sep 2019 14:04:48 +0000 Subject: [petsc-users] Clarification of INSERT_VALUES for vec with ghost nodes Message-ID: Hi, I have a vector with ghost nodes where each process may or may not change the value of a specific ghost node (using INSERT_VALUES). At the end I would like for each process, that see a particular ghost node, to have the smallest of the set values. I do not think there is a straightforward way to achieve this, but I would like to be wrong. Any suggestion? %%%%%%%%%%%%%%%% To build a work around I need to understand better the behavior of VecGhostUpdateBegin(...); VecGhostUpdateEnd(...). 
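For reference while reading the rest of this question and the answers further down, these are the two standard combinations shown on the VecGhostUpdateBegin() manual page; the "smallest value wins" behaviour asked about here is not one of them, which is what the later replies address. The vector name v is a placeholder.

/* owner values pushed out to the ghost copies */
ierr = VecGhostUpdateBegin(v, INSERT_VALUES, SCATTER_FORWARD);CHKERRQ(ierr);
ierr = VecGhostUpdateEnd(v, INSERT_VALUES, SCATTER_FORWARD);CHKERRQ(ierr);

/* ghost contributions accumulated back onto the owner, e.g. after local assembly */
ierr = VecGhostUpdateBegin(v, ADD_VALUES, SCATTER_REVERSE);CHKERRQ(ierr);
ierr = VecGhostUpdateEnd(v, ADD_VALUES, SCATTER_REVERSE);CHKERRQ(ierr);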
In particular in the documentation I do not see the option VecGhostUpdateBegin(v, INSERT_VALUES, SCATTER_REVERSE); VecGhostUpdateEnd(v, INSERT_VALUES, SCATTER_REVERSE); In case this is possible to be used, what is the behavior of this call in the following two cases? 1) Assume that node-i belongs to proc0, and is ghosted in proc1 and proc2, also assume that the current value of node-i is value0 and proc0 does not modify it, but proc1 and proc2 do. start with: proc0 -> value0 proc1 -> value0 proc2 -> value0 change to: proc0 -> value0 proc1 -> value1 proc2 -> value2 I assume that calling VecGhostUpdateBegin(v, INSERT_VALUES, SCATTER_REVERSE); VecGhostUpdateEnd(v, INSERT_VALUES, SCATTER_REVERSE); will have an unpredictable behavior as proc0 -> either value1 or value2 proc1 -> value1 proc2 -> value2 2) Assume now that node-i belongs to proc0, and is ghosted in proc1 and proc2, also assume that the current value of node-i is value0 and proc0 and proc1 do not modify it, but proc2 does. start with: proc0 -> value0 proc1 -> value0 proc2 -> value0 change to: proc0 -> value0 proc1 -> value0 proc2 -> value2 Is the call VecGhostUpdateBegin(v, INSERT_VALUES, SCATTER_REVERSE); VecGhostUpdateEnd(v, INSERT_VALUES, SCATTER_REVERSE); still unpredictable? proc0 -> either value0 or value2 proc1 -> value0 proc2 -> value2 or proc0 -> value2 (since proc1 did not modify the original value, so it did not reverse scatter) proc1 -> value0 proc2 -> value2 Thanks a lot for your help Eugenio From michael.werner at dlr.de Wed Sep 25 09:18:45 2019 From: michael.werner at dlr.de (Michael Werner) Date: Wed, 25 Sep 2019 16:18:45 +0200 Subject: [petsc-users] SLEPc - st_type cayley choice of shift and antishift Message-ID: <816e6835-c273-178b-e532-5dec0fe0710c@dlr.de> Hello, I'm looking for advice on how to set shift and antishift for the cayley spectral transformation. So far I've been using sinvert to find the eigenvalues with the smallest real part (but possibly large imaginary part). For this, I use the following options: -st_type sinvert -eps_target -0.05 -eps_target_real With sinvert, it is easy to understand how to chose the target, but for Cayley I'm not sure how to set shift and antishift. What is the mathematical meaning of the antishift? Best regards, Michael Werner From jroman at dsic.upv.es Wed Sep 25 10:21:48 2019 From: jroman at dsic.upv.es (Jose E. Roman) Date: Wed, 25 Sep 2019 17:21:48 +0200 Subject: [petsc-users] SLEPc - st_type cayley choice of shift and antishift In-Reply-To: <816e6835-c273-178b-e532-5dec0fe0710c@dlr.de> References: <816e6835-c273-178b-e532-5dec0fe0710c@dlr.de> Message-ID: <2E0C9AFB-9B3B-427D-AF15-66FEC9DB8115@dsic.upv.es> > El 25 sept 2019, a las 16:18, Michael Werner via petsc-users escribi?: > > Hello, > > I'm looking for advice on how to set shift and antishift for the cayley > spectral transformation. So far I've been using sinvert to find the > eigenvalues with the smallest real part (but possibly large imaginary > part). For this, I use the following options: > -st_type sinvert > -eps_target -0.05 > -eps_target_real > > With sinvert, it is easy to understand how to chose the target, but for > Cayley I'm not sure how to set shift and antishift. What is the > mathematical meaning of the antishift? > > Best regards, > Michael Werner In exact arithmetic, both shift-and-invert and Cayley build the same Krylov subspace, so no difference. If the linear solves are computed "inexactly" (iterative solver) then Cayley may have some advantage, but it depends on the application. 
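For context on the "inexact" variant mentioned above: the linear solves behind shift-and-invert or Cayley are controlled through the ST object's inner KSP, which has the option prefix st_, so an iterative inner solve would be selected with options along these lines (the solver choices and tolerance are only an example):

-st_type sinvert -eps_target -0.05 -eps_target_real
-st_ksp_type gmres -st_pc_type bjacobi -st_ksp_rtol 1e-9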
Also, iterative solvers usually are not robust enough in this context. You can see the discussion here https://doi.org/10.1108/09615530410544328 Jose From mpovolot at purdue.edu Wed Sep 25 10:49:01 2019 From: mpovolot at purdue.edu (Povolotskyi, Mykhailo) Date: Wed, 25 Sep 2019 15:49:01 +0000 Subject: [petsc-users] question about small matrices In-Reply-To: References: Message-ID: <414a9b00-0961-e228-aaa4-6e3fdcfd9b97@purdue.edu> Hi Matthew, is it possible to do in principle what I would like to do? On 9/25/2019 3:12 AM, Matthew Knepley wrote: On Wed, Sep 25, 2019 at 1:27 AM Povolotskyi, Mykhailo via petsc-users > wrote: Dear Petsc developers, in my application I have to solve millions of linear and non-linear systems with small matrices (2x2, 3x3,..., 10x10). I consider them as dense, and use SNES with KSP method PREONLY, and LU preconditioner. I found that when KSPSolve is called only 25% of time is spend in lapack, the rest is PETSc overhead. I know how to call lapack directly to solve a linear system. Question: is it possible to call lapack directly in the SNES solver to avoid the KSPSolve overhead? Question: Do you solve a bunch of them simultaneously? Thanks, Matt Thank you, Michael. -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From asitav at gmail.com Wed Sep 25 11:55:32 2019 From: asitav at gmail.com (Asitav Mishra) Date: Wed, 25 Sep 2019 12:55:32 -0400 Subject: [petsc-users] DMPlexCreateFromDAG in parallel Message-ID: Hi, I have a native distributed mesh graph across multiple processors, using which I would want to create DMPlex mesh using DMPlexCreateFromDAG. I see in Petsc plex/examples that DMPlexCreateFromDAG creates DM only from master processor and then the DM is distributed across multiple (one-to-many) processors. My question is: is it possible to create DAG locally in each processor and then build the global DM? If yes, are there any such examples? Best, Asitav -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Wed Sep 25 12:18:02 2019 From: jed at jedbrown.org (Jed Brown) Date: Wed, 25 Sep 2019 11:18:02 -0600 Subject: [petsc-users] question about small matrices In-Reply-To: <414a9b00-0961-e228-aaa4-6e3fdcfd9b97@purdue.edu> References: <414a9b00-0961-e228-aaa4-6e3fdcfd9b97@purdue.edu> Message-ID: <87pnjos3at.fsf@jedbrown.org> "Povolotskyi, Mykhailo via petsc-users" writes: > Hi Matthew, > > is it possible to do in principle what I would like to do? SNES isn't meant to solve tiny independent systems. (It's just high overhead for that purpose.) You can solve many such instances together by creating a residual function that evaluates them all. The linear algebra will be more efficient with that granularity, though all sub-problems will take the same number of Newton iterations. From jczhang at mcs.anl.gov Wed Sep 25 22:15:09 2019 From: jczhang at mcs.anl.gov (Zhang, Junchao) Date: Thu, 26 Sep 2019 03:15:09 +0000 Subject: [petsc-users] Clarification of INSERT_VALUES for vec with ghost nodes In-Reply-To: References: Message-ID: On Wed, Sep 25, 2019 at 9:11 AM Aulisa, Eugenio via petsc-users > wrote: Hi, I have a vector with ghost nodes where each process may or may not change the value of a specific ghost node (using INSERT_VALUES). 
At the end I would like for each process, that see a particular ghost node, to have the smallest of the set values. Do you mean owner of ghost nodes gets the smallest values. That is, in your example below, proc0 gets Min(value0, value1, value2)? I do not think there is a straightforward way to achieve this, but I would like to be wrong. Any suggestion? %%%%%%%%%%%%%%%% To build a work around I need to understand better the behavior of VecGhostUpdateBegin(...); VecGhostUpdateEnd(...). In particular in the documentation I do not see the option VecGhostUpdateBegin(v, INSERT_VALUES, SCATTER_REVERSE); VecGhostUpdateEnd(v, INSERT_VALUES, SCATTER_REVERSE); In case this is possible to be used, what is the behavior of this call in the following two cases? 1) Assume that node-i belongs to proc0, and is ghosted in proc1 and proc2, also assume that the current value of node-i is value0 and proc0 does not modify it, but proc1 and proc2 do. start with: proc0 -> value0 proc1 -> value0 proc2 -> value0 change to: proc0 -> value0 proc1 -> value1 proc2 -> value2 I assume that calling VecGhostUpdateBegin(v, INSERT_VALUES, SCATTER_REVERSE); VecGhostUpdateEnd(v, INSERT_VALUES, SCATTER_REVERSE); will have an unpredictable behavior as proc0 -> either value1 or value2 proc1 -> value1 proc2 -> value2 2) Assume now that node-i belongs to proc0, and is ghosted in proc1 and proc2, also assume that the current value of node-i is value0 and proc0 and proc1 do not modify it, but proc2 does. start with: proc0 -> value0 proc1 -> value0 proc2 -> value0 change to: proc0 -> value0 proc1 -> value0 proc2 -> value2 Is the call VecGhostUpdateBegin(v, INSERT_VALUES, SCATTER_REVERSE); VecGhostUpdateEnd(v, INSERT_VALUES, SCATTER_REVERSE); still unpredictable? proc0 -> either value0 or value2 proc1 -> value0 proc2 -> value2 or proc0 -> value2 (since proc1 did not modify the original value, so it did not reverse scatter) proc1 -> value0 proc2 -> value2 Thanks a lot for your help Eugenio -------------- next part -------------- An HTML attachment was scrubbed... URL: From Eugenio.Aulisa at ttu.edu Thu Sep 26 03:28:19 2019 From: Eugenio.Aulisa at ttu.edu (Aulisa, Eugenio) Date: Thu, 26 Sep 2019 08:28:19 +0000 Subject: [petsc-users] Clarification of INSERT_VALUES for vec with ghost nodes In-Reply-To: References: , Message-ID: On Wed, Sep 25, 2019 at 9:11 AM Aulisa, Eugenio via petsc-users > wrote: Hi, I have a vector with ghost nodes where each process may or may not change the value of a specific ghost node (using INSERT_VALUES). At the end I would like for each process, that see a particular ghost node, to have the smallest of the set values. Do you mean owner of ghost nodes gets the smallest values. That is, in your example below, proc0 gets Min(value0, value1, value2)? If I can get the Min(value0, value1, value2) on the owner then I can scatter it forward with INSERT_VALUES to all processes that ghost it. And if there is a easy way to get Min(value0, value1, value2) on the owner (or on all processes) I would like to know. Since I do not think there is a straightforward way to achieve that, I was looking at a workaround, and to do that I need to know the behavior of scatter reverse in the cases described below. Notice that I used the option INSERT_VALUES which I am not even sure is allowed. I do not think there is a straightforward way to achieve this, but I would like to be wrong. Any suggestion? %%%%%%%%%%%%%%%% To build a work around I need to understand better the behavior of VecGhostUpdateBegin(...); VecGhostUpdateEnd(...). 
In particular in the documentation I do not see the option VecGhostUpdateBegin(v, INSERT_VALUES, SCATTER_REVERSE); VecGhostUpdateEnd(v, INSERT_VALUES, SCATTER_REVERSE); In case this is possible to be used, what is the behavior of this call in the following two cases? 1) Assume that node-i belongs to proc0, and is ghosted in proc1 and proc2, also assume that the current value of node-i is value0 and proc0 does not modify it, but proc1 and proc2 do. start with: proc0 -> value0 proc1 -> value0 proc2 -> value0 change to: proc0 -> value0 proc1 -> value1 proc2 -> value2 I assume that calling VecGhostUpdateBegin(v, INSERT_VALUES, SCATTER_REVERSE); VecGhostUpdateEnd(v, INSERT_VALUES, SCATTER_REVERSE); will have an unpredictable behavior as proc0 -> either value1 or value2 proc1 -> value1 proc2 -> value2 2) Assume now that node-i belongs to proc0, and is ghosted in proc1 and proc2, also assume that the current value of node-i is value0 and proc0 and proc1 do not modify it, but proc2 does. start with: proc0 -> value0 proc1 -> value0 proc2 -> value0 change to: proc0 -> value0 proc1 -> value0 proc2 -> value2 Is the call VecGhostUpdateBegin(v, INSERT_VALUES, SCATTER_REVERSE); VecGhostUpdateEnd(v, INSERT_VALUES, SCATTER_REVERSE); still unpredictable? proc0 -> either value0 or value2 proc1 -> value0 proc2 -> value2 or proc0 -> value2 (since proc1 did not modify the original value, so it did not reverse scatter) proc1 -> value0 proc2 -> value2 Thanks a lot for your help Eugenio -------------- next part -------------- An HTML attachment was scrubbed... URL: From li.luo at kaust.edu.sa Thu Sep 26 05:14:06 2019 From: li.luo at kaust.edu.sa (Li Luo) Date: Thu, 26 Sep 2019 13:14:06 +0300 Subject: [petsc-users] DIVERGED_LINEAR_SOLVE in SNES Message-ID: Dear developer, I am using SNES for solving a nonlinear system. For some cases, SNES diverged -3 with "DIVERGED_LINEAR_SOLVE" when the linear solver reached its maximum iteration count (i.e -ksp_max_it 10000). Is that possible to let SNES continue even though the linear solver reaches the maximum number of iterations? Just take the result at 10000 for the Jacobian solution and then update the Newton step? Best, Li -- This message and its contents, including attachments are intended solely for the original recipient. If you are not the intended recipient or have received this message in error, please notify me immediately and delete this message from your computer system. Any unauthorized use or distribution is prohibited. Please consider the environment before printing this email. -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Thu Sep 26 10:58:12 2019 From: hzhang at mcs.anl.gov (Zhang, Hong) Date: Thu, 26 Sep 2019 15:58:12 +0000 Subject: [petsc-users] DIVERGED_LINEAR_SOLVE in SNES In-Reply-To: References: Message-ID: Li : You can use '-ksp_max_it 20000' to change maximum iteration count. However, it does not make sense to continue after it fails at 10000 iterations. You should figure out why linear solver diverges. Run your code with '-ksp_monitor' or '-ksp_monitor_true_residual'. Hong Dear developer, I am using SNES for solving a nonlinear system. For some cases, SNES diverged -3 with "DIVERGED_LINEAR_SOLVE" when the linear solver reached its maximum iteration count (i.e -ksp_max_it 10000). Is that possible to let SNES continue even though the linear solver reaches the maximum number of iterations? Just take the result at 10000 for the Jacobian solution and then update the Newton step? 
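A typical option set for the diagnosis Hong suggests above, so that the reason for the linear-solver stagnation is visible before deciding how to react; all of these are standard monitoring options:

-snes_monitor -snes_converged_reason
-ksp_monitor_true_residual -ksp_converged_reason
-ksp_view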
Best, Li ________________________________ This message and its contents, including attachments are intended solely for the original recipient. If you are not the intended recipient or have received this message in error, please notify me immediately and delete this message from your computer system. Any unauthorized use or distribution is prohibited. Please consider the environment before printing this email. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jczhang at mcs.anl.gov Thu Sep 26 11:02:50 2019 From: jczhang at mcs.anl.gov (Zhang, Junchao) Date: Thu, 26 Sep 2019 16:02:50 +0000 Subject: [petsc-users] Clarification of INSERT_VALUES for vec with ghost nodes In-Reply-To: References: Message-ID: With VecGhostUpdateBegin(v, INSERT_VALUES, SCATTER_REVERSE), the owner will get updated by ghost values. So in your case 1, proc0 gets either value1 or value2 from proc1/2; in case 2; proc0 gets either value0 or value2 from proc1/2. In short, you could not achieve your goal with INSERT_VALUES. Though you can do it with other interfaces in PETSc, e.g., PetscSFReduceBegin/End, I believe it is better to extend VecGhostUpdate to support MAX/MIN_VALUES, because it is a simpler interface for you and it is very easy to add. Could you try branch jczhang/feature-vscat-min-values to see if it works for you? See the end of src/vec/vec/examples/tutorials/ex9.c for an example of the new functionality. Use mpirun -n 2 ./ex9 -minvalues to test it and its expected output is output/ex9_2.out Petsc will have a new release this weekend. Let's see whether I can put it in the new release. Thanks. --Junchao Zhang On Thu, Sep 26, 2019 at 3:28 AM Aulisa, Eugenio > wrote: On Wed, Sep 25, 2019 at 9:11 AM Aulisa, Eugenio via petsc-users > wrote: Hi, I have a vector with ghost nodes where each process may or may not change the value of a specific ghost node (using INSERT_VALUES). At the end I would like for each process, that see a particular ghost node, to have the smallest of the set values. Do you mean owner of ghost nodes gets the smallest values. That is, in your example below, proc0 gets Min(value0, value1, value2)? If I can get the Min(value0, value1, value2) on the owner then I can scatter it forward with INSERT_VALUES to all processes that ghost it. And if there is a easy way to get Min(value0, value1, value2) on the owner (or on all processes) I would like to know. Since I do not think there is a straightforward way to achieve that, I was looking at a workaround, and to do that I need to know the behavior of scatter reverse in the cases described below. Notice that I used the option INSERT_VALUES which I am not even sure is allowed. I do not think there is a straightforward way to achieve this, but I would like to be wrong. Any suggestion? %%%%%%%%%%%%%%%% To build a work around I need to understand better the behavior of VecGhostUpdateBegin(...); VecGhostUpdateEnd(...). In particular in the documentation I do not see the option VecGhostUpdateBegin(v, INSERT_VALUES, SCATTER_REVERSE); VecGhostUpdateEnd(v, INSERT_VALUES, SCATTER_REVERSE); In case this is possible to be used, what is the behavior of this call in the following two cases? 1) Assume that node-i belongs to proc0, and is ghosted in proc1 and proc2, also assume that the current value of node-i is value0 and proc0 does not modify it, but proc1 and proc2 do. 
start with: proc0 -> value0 proc1 -> value0 proc2 -> value0 change to: proc0 -> value0 proc1 -> value1 proc2 -> value2 I assume that calling VecGhostUpdateBegin(v, INSERT_VALUES, SCATTER_REVERSE); VecGhostUpdateEnd(v, INSERT_VALUES, SCATTER_REVERSE); will have an unpredictable behavior as proc0 -> either value1 or value2 proc1 -> value1 proc2 -> value2 2) Assume now that node-i belongs to proc0, and is ghosted in proc1 and proc2, also assume that the current value of node-i is value0 and proc0 and proc1 do not modify it, but proc2 does. start with: proc0 -> value0 proc1 -> value0 proc2 -> value0 change to: proc0 -> value0 proc1 -> value0 proc2 -> value2 Is the call VecGhostUpdateBegin(v, INSERT_VALUES, SCATTER_REVERSE); VecGhostUpdateEnd(v, INSERT_VALUES, SCATTER_REVERSE); still unpredictable? proc0 -> either value0 or value2 proc1 -> value0 proc2 -> value2 or proc0 -> value2 (since proc1 did not modify the original value, so it did not reverse scatter) proc1 -> value0 proc2 -> value2 Thanks a lot for your help Eugenio -------------- next part -------------- An HTML attachment was scrubbed... URL: From mvalera-w at sdsu.edu Thu Sep 26 16:27:22 2019 From: mvalera-w at sdsu.edu (Manuel Valera) Date: Thu, 26 Sep 2019 14:27:22 -0700 Subject: [petsc-users] TS scheme with different DAs In-Reply-To: References: Message-ID: Hi all, Just retouching in case my mail was lost, does my problem make enough sense? I could try to be clearer if you like. I tried using TSGetStages() in the middle of the RHS function but that didn't work. Maybe we can find a different approach for this ? Thanks, On Tue, Sep 24, 2019 at 12:58 PM Manuel Valera wrote: > Hello all, > > I finally implemented the TS routine operating in several DAs at the same > time, hacking it as you suggested. I still have a problem with my algorithm > though. It is not DMDA related so there's that. > > My algorithm needs to update u,v,w with information from the updated > T,S,rho. My problem, or what I don't understand yet, is how to operate in > the intermediate runge-kutta time integration states inside the > RHSFunction. > > If I can be more clear, I would need the intermediate T,S states to obtain > an updated rho (density) to in turn, obtain the correct intermediate > velocities, and keep the loop going. As I understand right now, the RHS > vector is different from this intermediate state, and it would be only the > RHS input to the loop, so operating on this would be incorrect. > > As of now, my algorithm still creates artifacts because of this lack of > information to accurately update all of the variables at the same time. The > problem happens as well in serial. > > Thanks for your help, > > > > > > > > On Wed, Sep 18, 2019 at 4:36 AM Matthew Knepley wrote: > >> On Tue, Sep 17, 2019 at 8:27 PM Smith, Barry F. >> wrote: >> >>> >>> Don't be too quick to dismiss switching to the DMStag you may find >>> that it actually takes little time to convert and then you have a much less >>> cumbersome process to manage the staggered grid. Take a look at >>> src/dm/impls/stag/examples/tutorials/ex2.c where >>> >>> const PetscInt dof0 = 0, dof1 = 1,dof2 = 1; /* 1 dof on each edge and >>> element center */ >>> const PetscInt stencilWidth = 1; >>> ierr = >>> DMStagCreate2d(PETSC_COMM_WORLD,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,7,9,PETSC_DECIDE,PETSC_DECIDE,dof0,dof1,dof2,DMSTAG_STENCIL_BOX,stencilWidth,NULL,NULL,&dmSol);CHKERRQ(ierr); >>> >>> BOOM, it has set up a staggered grid with 1 cell centered variable and 1 >>> on each edge. 
Adding more the cell centers, vertices, or edges is trivial. >>> >>> If you want to stick to DMDA you >>> >>> "cheat". Depending on exactly what staggering you have you make the DMDA >>> for the "smaller problem" as large as the other ones and just track zeros >>> in those locations. For example if velocities are "edges" and T, S are on >>> cells, make your "cells" DMDA one extra grid width wide in all three >>> dimensions. You may need to be careful on the boundaries deepening on the >>> types of boundary conditions. >>> >> >> Yes, SNES ex30 does exactly this. However, I still recommend looking at >> DMStag. Patrick created it because managing the DMDA >> became such as headache. >> >> Thanks, >> >> Matt >> >> >>> > On Sep 17, 2019, at 7:04 PM, Manuel Valera via petsc-users < >>> petsc-users at mcs.anl.gov> wrote: >>> > >>> > Thanks Matthew, but my code is too complicated to be redone on DMStag >>> now after spending a long time using DMDAs, >>> > >>> > Is there a way to ensure PETSc distributes several DAs in the same >>> way? besides manually distributing the points, >>> > >>> > Thanks, >>> > >>> > On Tue, Sep 17, 2019 at 3:28 PM Matthew Knepley >>> wrote: >>> > On Tue, Sep 17, 2019 at 6:15 PM Manuel Valera via petsc-users < >>> petsc-users at mcs.anl.gov> wrote: >>> > Hello, petsc users, >>> > >>> > I have integrated the TS routines in my code, but i just noticed i >>> didn't do it optimally. I was using 3 different TS objects to integrate >>> velocities, temperature and salinity, and it works but only for small DTs. >>> I suspect the intermediate Runge-Kutta states are unphased and this creates >>> the discrepancy for broader time steps, so I need to integrate the 3 >>> quantities in the same routine. >>> > >>> > I tried to do this by using a 5 DOF distributed array for the RHS, >>> where I store the velocities in the first 3 and then Temperature and >>> Salinity in the rest. The problem is that I use a staggered grid and T,S >>> are located in a different DA layout than the velocities. This is creating >>> problems for me since I can't find a way to communicate the information >>> from the result of the TS integration back to the respective DAs of each >>> variable. >>> > >>> > Is there a way to communicate across DAs? or can you suggest an >>> alternative solution to this problem? >>> > >>> > If you have a staggered discretization on a structured grid, I would >>> recommend checking out DMStag. >>> > >>> > Thanks, >>> > >>> > MAtt >>> > >>> > Thanks, >>> > >>> > >>> > >>> > -- >>> > What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> > -- Norbert Wiener >>> > >>> > https://www.cse.buffalo.edu/~knepley/ >>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Eugenio.Aulisa at ttu.edu Thu Sep 26 16:35:47 2019 From: Eugenio.Aulisa at ttu.edu (Aulisa, Eugenio) Date: Thu, 26 Sep 2019 21:35:47 +0000 Subject: [petsc-users] Clarification of INSERT_VALUES for vec with ghost nodes In-Reply-To: References: , Message-ID: Yes it worked. I get exactly the same vector as reported in the output file I will test it now with my problem and if I see anything strange I will let you know. 
Thank you so much Eugenio Eugenio Aulisa Department of Mathematics and Statistics, Texas Tech University Lubbock TX, 79409-1042 room: 226 http://www.math.ttu.edu/~eaulisa/ phone: (806) 834-6684 fax: (806) 742-1112 ________________________________________ From: Zhang, Junchao Sent: Thursday, September 26, 2019 11:02 AM To: Aulisa, Eugenio Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Clarification of INSERT_VALUES for vec with ghost nodes With VecGhostUpdateBegin(v, INSERT_VALUES, SCATTER_REVERSE), the owner will get updated by ghost values. So in your case 1, proc0 gets either value1 or value2 from proc1/2; in case 2; proc0 gets either value0 or value2 from proc1/2. In short, you could not achieve your goal with INSERT_VALUES. Though you can do it with other interfaces in PETSc, e.g., PetscSFReduceBegin/End, I believe it is better to extend VecGhostUpdate to support MAX/MIN_VALUES, because it is a simpler interface for you and it is very easy to add. Could you try branch jczhang/feature-vscat-min-values to see if it works for you? See the end of src/vec/vec/examples/tutorials/ex9.c for an example of the new functionality. Use mpirun -n 2 ./ex9 -minvalues to test it and its expected output is output/ex9_2.out Petsc will have a new release this weekend. Let's see whether I can put it in the new release. Thanks. --Junchao Zhang On Thu, Sep 26, 2019 at 3:28 AM Aulisa, Eugenio > wrote: On Wed, Sep 25, 2019 at 9:11 AM Aulisa, Eugenio via petsc-users > wrote: Hi, I have a vector with ghost nodes where each process may or may not change the value of a specific ghost node (using INSERT_VALUES). At the end I would like for each process, that see a particular ghost node, to have the smallest of the set values. Do you mean owner of ghost nodes gets the smallest values. That is, in your example below, proc0 gets Min(value0, value1, value2)? If I can get the Min(value0, value1, value2) on the owner then I can scatter it forward with INSERT_VALUES to all processes that ghost it. And if there is a easy way to get Min(value0, value1, value2) on the owner (or on all processes) I would like to know. Since I do not think there is a straightforward way to achieve that, I was looking at a workaround, and to do that I need to know the behavior of scatter reverse in the cases described below. Notice that I used the option INSERT_VALUES which I am not even sure is allowed. I do not think there is a straightforward way to achieve this, but I would like to be wrong. Any suggestion? %%%%%%%%%%%%%%%% To build a work around I need to understand better the behavior of VecGhostUpdateBegin(...); VecGhostUpdateEnd(...). In particular in the documentation I do not see the option VecGhostUpdateBegin(v, INSERT_VALUES, SCATTER_REVERSE); VecGhostUpdateEnd(v, INSERT_VALUES, SCATTER_REVERSE); In case this is possible to be used, what is the behavior of this call in the following two cases? 1) Assume that node-i belongs to proc0, and is ghosted in proc1 and proc2, also assume that the current value of node-i is value0 and proc0 does not modify it, but proc1 and proc2 do. 
start with: proc0 -> value0 proc1 -> value0 proc2 -> value0 change to: proc0 -> value0 proc1 -> value1 proc2 -> value2 I assume that calling VecGhostUpdateBegin(v, INSERT_VALUES, SCATTER_REVERSE); VecGhostUpdateEnd(v, INSERT_VALUES, SCATTER_REVERSE); will have an unpredictable behavior as proc0 -> either value1 or value2 proc1 -> value1 proc2 -> value2 2) Assume now that node-i belongs to proc0, and is ghosted in proc1 and proc2, also assume that the current value of node-i is value0 and proc0 and proc1 do not modify it, but proc2 does. start with: proc0 -> value0 proc1 -> value0 proc2 -> value0 change to: proc0 -> value0 proc1 -> value0 proc2 -> value2 Is the call VecGhostUpdateBegin(v, INSERT_VALUES, SCATTER_REVERSE); VecGhostUpdateEnd(v, INSERT_VALUES, SCATTER_REVERSE); still unpredictable? proc0 -> either value0 or value2 proc1 -> value0 proc2 -> value2 or proc0 -> value2 (since proc1 did not modify the original value, so it did not reverse scatter) proc1 -> value0 proc2 -> value2 Thanks a lot for your help Eugenio From bsmith at mcs.anl.gov Thu Sep 26 21:27:39 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Fri, 27 Sep 2019 02:27:39 +0000 Subject: [petsc-users] DIVERGED_LINEAR_SOLVE in SNES In-Reply-To: References: Message-ID: <5D0BAE9F-7BC7-4888-9F29-BEE65F4EDD79@anl.gov> Li, It is possible, but as Hong said probably never appropriate. Especially if KSP has iterated for 10,000 iterations. If you want SNES to "try" the direction given by a failed solve you should use a much smaller maximum number of iterations for KSP. Anways, to do what you desire #include call KSPSetPostSolve() and in your post solve() function simply do ksp->reason = KSP_CONVERGED_ITS. Barry > On Sep 26, 2019, at 10:58 AM, Zhang, Hong via petsc-users wrote: > > Li : > You can use '-ksp_max_it 20000' to change maximum iteration count. However, it does not make sense to continue after it fails at 10000 iterations. You should figure out why linear solver diverges. Run your code with '-ksp_monitor' or '-ksp_monitor_true_residual'. > Hong > > Dear developer, > > I am using SNES for solving a nonlinear system. For some cases, SNES diverged -3 with "DIVERGED_LINEAR_SOLVE" when the linear solver reached its maximum iteration count (i.e -ksp_max_it 10000). > Is that possible to let SNES continue even though the linear solver reaches the maximum number of iterations? Just take the result at 10000 for the Jacobian solution and then update the Newton step? > > Best, > Li > > > > This message and its contents, including attachments are intended solely for the original recipient. If you are not the intended recipient or have received this message in error, please notify me immediately and delete this message from your computer system. Any unauthorized use or distribution is prohibited. Please consider the environment before printing this email. From li.luo at kaust.edu.sa Fri Sep 27 03:17:53 2019 From: li.luo at kaust.edu.sa (Li Luo) Date: Fri, 27 Sep 2019 11:17:53 +0300 Subject: [petsc-users] DIVERGED_LINEAR_SOLVE in SNES In-Reply-To: <5D0BAE9F-7BC7-4888-9F29-BEE65F4EDD79@anl.gov> References: <5D0BAE9F-7BC7-4888-9F29-BEE65F4EDD79@anl.gov> Message-ID: Thank you for the suggestions, I'll try it. Best, Li On Fri, Sep 27, 2019 at 5:27 AM Smith, Barry F. wrote: > > Li, > > It is possible, but as Hong said probably never appropriate. Especially > if KSP has iterated for 10,000 iterations. If you want SNES to "try" the > direction given by a failed solve you should use a much smaller maximum > number of iterations for KSP. 
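A minimal sketch of the post-solve hook Barry describes above. The callback signature and the header are assumptions to be checked against the KSPSetPostSolve() manual page: setting ksp->reason directly needs a private PETSc header (assumed here to be petsc/private/kspimpl.h), and only the iteration-limit failure is remapped so that genuine breakdowns still stop the solve.

#include <petsc/private/kspimpl.h>     /* assumption: private header exposing the KSP struct */

static PetscErrorCode AcceptMaxItSolve(KSP ksp, Vec rhs, Vec x, void *ctx)
{
  PetscFunctionBeginUser;
  if (ksp->reason == KSP_DIVERGED_ITS) ksp->reason = KSP_CONVERGED_ITS;
  PetscFunctionReturn(0);
}

/* registration, e.g. right after SNESGetKSP(snes, &ksp): */
ierr = KSPSetPostSolve(ksp, AcceptMaxItSolve, NULL);CHKERRQ(ierr);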
> > Anways, to do what you desire > > #include > > call KSPSetPostSolve() and in your post solve() function simply do > ksp->reason = KSP_CONVERGED_ITS. > > Barry > > > > > > On Sep 26, 2019, at 10:58 AM, Zhang, Hong via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > > > Li : > > You can use '-ksp_max_it 20000' to change maximum iteration count. > However, it does not make sense to continue after it fails at 10000 > iterations. You should figure out why linear solver diverges. Run your code > with '-ksp_monitor' or '-ksp_monitor_true_residual'. > > Hong > > > > Dear developer, > > > > I am using SNES for solving a nonlinear system. For some cases, SNES > diverged -3 with "DIVERGED_LINEAR_SOLVE" when the linear solver reached its > maximum iteration count (i.e -ksp_max_it 10000). > > Is that possible to let SNES continue even though the linear solver > reaches the maximum number of iterations? Just take the result at 10000 for > the Jacobian solution and then update the Newton step? > > > > Best, > > Li > > > > > > > > This message and its contents, including attachments are intended solely > for the original recipient. If you are not the intended recipient or have > received this message in error, please notify me immediately and delete > this message from your computer system. Any unauthorized use or > distribution is prohibited. Please consider the environment before printing > this email. > > -- Postdoctoral Fellow Extreme Computing Research Center King Abdullah University of Science & Technology https://sites.google.com/site/rolyliluo/ -- This message and its contents, including attachments are intended solely for the original recipient. If you are not the intended recipient or have received this message in error, please notify me immediately and delete this message from your computer system. Any unauthorized use or distribution is prohibited. Please consider the environment before printing this email. -------------- next part -------------- An HTML attachment was scrubbed... URL: From griesser.jan at googlemail.com Fri Sep 27 03:34:41 2019 From: griesser.jan at googlemail.com (=?UTF-8?B?SmFuIEdyaWXDn2Vy?=) Date: Fri, 27 Sep 2019 10:34:41 +0200 Subject: [petsc-users] Computing part of the inverse of a large matrix Message-ID: Hi all, i am using petsc4py. I am dealing with rather large sparse matrices up to 600kx600k and i am interested in calculating a part of the inverse of the matrix(I know it will be a dense matrix). Due to the nature of my problem, I am only interested in approximately the first 1000 rows and 1000 columns (i.e. a large block in the upper left ofthe matrix). Before I start to play around now, I wanted to ask if there is a clever way to tackle this kind of problem in PETSc in principle. For any input I would be very grateful! Greetings Jan -------------- next part -------------- An HTML attachment was scrubbed... URL: From michael.werner at dlr.de Fri Sep 27 06:11:56 2019 From: michael.werner at dlr.de (Michael Werner) Date: Fri, 27 Sep 2019 13:11:56 +0200 Subject: [petsc-users] SLEPc - st_type cayley choice of shift and antishift In-Reply-To: <2E0C9AFB-9B3B-427D-AF15-66FEC9DB8115@dsic.upv.es> References: <816e6835-c273-178b-e532-5dec0fe0710c@dlr.de> <2E0C9AFB-9B3B-427D-AF15-66FEC9DB8115@dsic.upv.es> Message-ID: <5baec17c-7ebb-114b-2fc7-5281f2285c41@dlr.de> Thank you for the link to the paper, it's quite interesting and pretty close to what I'm doing. 
I'm currently also using the "inexact" approach for my application, and in general it works, as long as the ksp tolerance is low enough. However, I was hoping to speed up convergence towards the "interesting" eigenvalues by using Cayley. Now as a test I tried to follow the approach from your paper, choosing mu = -sigma, and mu in the order of magnitude of the imaginary part of the most amplified eigenvalue. I know the most amplified eigenvalue for my problem is -0.0398+0.724i, so I tried running SLEPc with the following settings: -st_type cayley -st_shift -1 -st_cayley_antishift 1 But I never get the correct eigenvalue, instead SLEPc returns only the value of st_shift: [0]????? Number of iterations of the method: 1 [0]????? Solution method: krylovschur [0]????? Number of requested eigenvalues: 1 [0]????? Stopping condition: tol=1e-08, maxit=19382 [0]????? Number of converged eigenpairs: 16 [0]????? [0]????????????? k????????? ||Ax-kx||/||kx|| [0]????? ----------------- ------------------ [0]????????? -1.000000????????? 0.0281754 [0]????????? -1.000000????????? 0.0286815 [0]????????? -1.000000????????? 0.0109186 [0]????????? -1.000000?????????? 0.140883 [0]????????? -1.000000?????????? 0.203036 [0]????????? -1.000000???????? 0.00801616 [0]????????? -1.000000????????? 0.0526871 [0]????????? -1.000000?????????? 0.022244 [0]????????? -1.000000????????? 0.0182197 [0]????????? -1.000000????????? 0.0107924 [0]????????? -1.000000???????? 0.00963378 [0]????????? -1.000000????????? 0.0239422 [0]????????? -1.000000???????? 0.00472435 [0]????????? -1.000000???????? 0.00607732 [0]????????? -1.000000????????? 0.0124056 [0]????????? -1.000000???????? 0.00557715 Also, it doesn't matter if I'm using exact or inexact solves. Changing the values of shift and antishift also doesn't change the behaviour. Do I need to make additional adjustments to get cayley to work? Best regards, Michael Am 25.09.19 um 17:21 schrieb Jose E. Roman: > >> El 25 sept 2019, a las 16:18, Michael Werner via petsc-users escribi?: >> >> Hello, >> >> I'm looking for advice on how to set shift and antishift for the cayley >> spectral transformation. So far I've been using sinvert to find the >> eigenvalues with the smallest real part (but possibly large imaginary >> part). For this, I use the following options: >> -st_type sinvert >> -eps_target -0.05 >> -eps_target_real >> >> With sinvert, it is easy to understand how to chose the target, but for >> Cayley I'm not sure how to set shift and antishift. What is the >> mathematical meaning of the antishift? >> >> Best regards, >> Michael Werner > In exact arithmetic, both shift-and-invert and Cayley build the same Krylov subspace, so no difference. If the linear solves are computed "inexactly" (iterative solver) then Cayley may have some advantage, but it depends on the application. Also, iterative solvers usually are not robust enough in this context. You can see the discussion here https://doi.org/10.1108/09615530410544328 > > Jose > From jroman at dsic.upv.es Fri Sep 27 06:32:05 2019 From: jroman at dsic.upv.es (Jose E. 
Roman) Date: Fri, 27 Sep 2019 13:32:05 +0200 Subject: [petsc-users] SLEPc - st_type cayley choice of shift and antishift In-Reply-To: <5baec17c-7ebb-114b-2fc7-5281f2285c41@dlr.de> References: <816e6835-c273-178b-e532-5dec0fe0710c@dlr.de> <2E0C9AFB-9B3B-427D-AF15-66FEC9DB8115@dsic.upv.es> <5baec17c-7ebb-114b-2fc7-5281f2285c41@dlr.de> Message-ID: <6A0885FA-B484-4DC9-8920-70BC463C25C0@dsic.upv.es> Try setting -eps_target -1 instead of -st_shift -1 Does sinvert work with target -1? Can you send me the matrices so that I can reproduce the issue? Jose > El 27 sept 2019, a las 13:11, Michael Werner escribi?: > > Thank you for the link to the paper, it's quite interesting and pretty > close to what I'm doing. I'm currently also using the "inexact" approach > for my application, and in general it works, as long as the ksp > tolerance is low enough. However, I was hoping to speed up convergence > towards the "interesting" eigenvalues by using Cayley. > > Now as a test I tried to follow the approach from your paper, choosing > mu = -sigma, and mu in the order of magnitude of the imaginary part of > the most amplified eigenvalue. I know the most amplified eigenvalue for > my problem is -0.0398+0.724i, so I tried running SLEPc with the > following settings: > -st_type cayley > -st_shift -1 > -st_cayley_antishift 1 > > But I never get the correct eigenvalue, instead SLEPc returns only the > value of st_shift: > [0] Number of iterations of the method: 1 > [0] Solution method: krylovschur > [0] Number of requested eigenvalues: 1 > [0] Stopping condition: tol=1e-08, maxit=19382 > [0] Number of converged eigenpairs: 16 > [0] > [0] k ||Ax-kx||/||kx|| > [0] ----------------- ------------------ > [0] -1.000000 0.0281754 > [0] -1.000000 0.0286815 > [0] -1.000000 0.0109186 > [0] -1.000000 0.140883 > [0] -1.000000 0.203036 > [0] -1.000000 0.00801616 > [0] -1.000000 0.0526871 > [0] -1.000000 0.022244 > [0] -1.000000 0.0182197 > [0] -1.000000 0.0107924 > [0] -1.000000 0.00963378 > [0] -1.000000 0.0239422 > [0] -1.000000 0.00472435 > [0] -1.000000 0.00607732 > [0] -1.000000 0.0124056 > [0] -1.000000 0.00557715 > > Also, it doesn't matter if I'm using exact or inexact solves. Changing > the values of shift and antishift also doesn't change the behaviour. Do > I need to make additional adjustments to get cayley to work? > > Best regards, > Michael > > > > Am 25.09.19 um 17:21 schrieb Jose E. Roman: >> >>> El 25 sept 2019, a las 16:18, Michael Werner via petsc-users escribi?: >>> >>> Hello, >>> >>> I'm looking for advice on how to set shift and antishift for the cayley >>> spectral transformation. So far I've been using sinvert to find the >>> eigenvalues with the smallest real part (but possibly large imaginary >>> part). For this, I use the following options: >>> -st_type sinvert >>> -eps_target -0.05 >>> -eps_target_real >>> >>> With sinvert, it is easy to understand how to chose the target, but for >>> Cayley I'm not sure how to set shift and antishift. What is the >>> mathematical meaning of the antishift? >>> >>> Best regards, >>> Michael Werner >> In exact arithmetic, both shift-and-invert and Cayley build the same Krylov subspace, so no difference. If the linear solves are computed "inexactly" (iterative solver) then Cayley may have some advantage, but it depends on the application. Also, iterative solvers usually are not robust enough in this context. 
You can see the discussion here https://doi.org/10.1108/09615530410544328 >> >> Jose >> From michael.werner at dlr.de Fri Sep 27 06:54:48 2019 From: michael.werner at dlr.de (Michael Werner) Date: Fri, 27 Sep 2019 13:54:48 +0200 Subject: [petsc-users] SLEPc - st_type cayley choice of shift and antishift In-Reply-To: <6A0885FA-B484-4DC9-8920-70BC463C25C0@dsic.upv.es> References: <816e6835-c273-178b-e532-5dec0fe0710c@dlr.de> <2E0C9AFB-9B3B-427D-AF15-66FEC9DB8115@dsic.upv.es> <5baec17c-7ebb-114b-2fc7-5281f2285c41@dlr.de> <6A0885FA-B484-4DC9-8920-70BC463C25C0@dsic.upv.es> Message-ID: <84721713-83fd-8884-5e8c-0d6db748b6d1@dlr.de> Yes, with sinvert its working. And using -eps_target instead of -st_shift didn't change anything. I also just sent you the matrices for reproduction of the issue. Michael Am 27.09.19 um 13:32 schrieb Jose E. Roman: > Try setting -eps_target -1 instead of -st_shift -1 > Does sinvert work with target -1? > Can you send me the matrices so that I can reproduce the issue? > > Jose > > >> El 27 sept 2019, a las 13:11, Michael Werner escribi?: >> >> Thank you for the link to the paper, it's quite interesting and pretty >> close to what I'm doing. I'm currently also using the "inexact" approach >> for my application, and in general it works, as long as the ksp >> tolerance is low enough. However, I was hoping to speed up convergence >> towards the "interesting" eigenvalues by using Cayley. >> >> Now as a test I tried to follow the approach from your paper, choosing >> mu = -sigma, and mu in the order of magnitude of the imaginary part of >> the most amplified eigenvalue. I know the most amplified eigenvalue for >> my problem is -0.0398+0.724i, so I tried running SLEPc with the >> following settings: >> -st_type cayley >> -st_shift -1 >> -st_cayley_antishift 1 >> >> But I never get the correct eigenvalue, instead SLEPc returns only the >> value of st_shift: >> [0] Number of iterations of the method: 1 >> [0] Solution method: krylovschur >> [0] Number of requested eigenvalues: 1 >> [0] Stopping condition: tol=1e-08, maxit=19382 >> [0] Number of converged eigenpairs: 16 >> [0] >> [0] k ||Ax-kx||/||kx|| >> [0] ----------------- ------------------ >> [0] -1.000000 0.0281754 >> [0] -1.000000 0.0286815 >> [0] -1.000000 0.0109186 >> [0] -1.000000 0.140883 >> [0] -1.000000 0.203036 >> [0] -1.000000 0.00801616 >> [0] -1.000000 0.0526871 >> [0] -1.000000 0.022244 >> [0] -1.000000 0.0182197 >> [0] -1.000000 0.0107924 >> [0] -1.000000 0.00963378 >> [0] -1.000000 0.0239422 >> [0] -1.000000 0.00472435 >> [0] -1.000000 0.00607732 >> [0] -1.000000 0.0124056 >> [0] -1.000000 0.00557715 >> >> Also, it doesn't matter if I'm using exact or inexact solves. Changing >> the values of shift and antishift also doesn't change the behaviour. Do >> I need to make additional adjustments to get cayley to work? >> >> Best regards, >> Michael >> >> >> >> Am 25.09.19 um 17:21 schrieb Jose E. Roman: >>>> El 25 sept 2019, a las 16:18, Michael Werner via petsc-users escribi?: >>>> >>>> Hello, >>>> >>>> I'm looking for advice on how to set shift and antishift for the cayley >>>> spectral transformation. So far I've been using sinvert to find the >>>> eigenvalues with the smallest real part (but possibly large imaginary >>>> part). For this, I use the following options: >>>> -st_type sinvert >>>> -eps_target -0.05 >>>> -eps_target_real >>>> >>>> With sinvert, it is easy to understand how to chose the target, but for >>>> Cayley I'm not sure how to set shift and antishift. 
What is the >>>> mathematical meaning of the antishift? >>>> >>>> Best regards, >>>> Michael Werner >>> In exact arithmetic, both shift-and-invert and Cayley build the same Krylov subspace, so no difference. If the linear solves are computed "inexactly" (iterative solver) then Cayley may have some advantage, but it depends on the application. Also, iterative solvers usually are not robust enough in this context. You can see the discussion here https://doi.org/10.1108/09615530410544328 >>> >>> Jose >>> From bsmith at mcs.anl.gov Fri Sep 27 08:03:57 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Fri, 27 Sep 2019 13:03:57 +0000 Subject: [petsc-users] Computing part of the inverse of a large matrix In-Reply-To: References: Message-ID: <1D24C212-DEB9-438F-B1E9-8D64494D1CA2@anl.gov> MatMumpsGetInverse() maybe useful. Also simply using MatMatSolve() with the first 1000 columns of the identity and "throwing away" the part you don't need may be most effective. Barry > On Sep 27, 2019, at 3:34 AM, Jan Grie?er via petsc-users wrote: > > Hi all, > i am using petsc4py. I am dealing with rather large sparse matrices up to 600kx600k and i am interested in calculating a part of the inverse of the matrix(I know it will be a dense matrix). Due to the nature of my problem, I am only interested in approximately the first 1000 rows and 1000 columns (i.e. a large block in the upper left ofthe matrix). Before I start to play around now, I wanted to ask if there is a clever way to tackle this kind of problem in PETSc in principle. For any input I would be very grateful! > Greetings Jan From jroman at dsic.upv.es Fri Sep 27 08:54:42 2019 From: jroman at dsic.upv.es (Jose E. Roman) Date: Fri, 27 Sep 2019 15:54:42 +0200 Subject: [petsc-users] SLEPc - st_type cayley choice of shift and antishift In-Reply-To: <84721713-83fd-8884-5e8c-0d6db748b6d1@dlr.de> References: <816e6835-c273-178b-e532-5dec0fe0710c@dlr.de> <2E0C9AFB-9B3B-427D-AF15-66FEC9DB8115@dsic.upv.es> <5baec17c-7ebb-114b-2fc7-5281f2285c41@dlr.de> <6A0885FA-B484-4DC9-8920-70BC463C25C0@dsic.upv.es> <84721713-83fd-8884-5e8c-0d6db748b6d1@dlr.de> Message-ID: <160372D7-892C-443B-8E88-7576897C2AAB@dsic.upv.es> I now see what is happening. In the expression of the paper the antishift has different sign compared to the expression used in SLEPc (see the users manual): (A-sigma*B)^{-1}*(A+nu*B)x = \theta x So nu=-sigma is a forbidden value, otherwise both factors cancel out (I will fix the interface so that this is catched). In your case you should do -eps_target -1 -st_cayley_antishift -1 Jose > El 27 sept 2019, a las 13:54, Michael Werner escribi?: > > Yes, with sinvert its working. And using -eps_target instead of > -st_shift didn't change anything. > > I also just sent you the matrices for reproduction of the issue. > > Michael > > Am 27.09.19 um 13:32 schrieb Jose E. Roman: >> Try setting -eps_target -1 instead of -st_shift -1 >> Does sinvert work with target -1? >> Can you send me the matrices so that I can reproduce the issue? >> >> Jose >> >> >>> El 27 sept 2019, a las 13:11, Michael Werner escribi?: >>> >>> Thank you for the link to the paper, it's quite interesting and pretty >>> close to what I'm doing. I'm currently also using the "inexact" approach >>> for my application, and in general it works, as long as the ksp >>> tolerance is low enough. However, I was hoping to speed up convergence >>> towards the "interesting" eigenvalues by using Cayley. 
>>> >>> Now as a test I tried to follow the approach from your paper, choosing >>> mu = -sigma, and mu in the order of magnitude of the imaginary part of >>> the most amplified eigenvalue. I know the most amplified eigenvalue for >>> my problem is -0.0398+0.724i, so I tried running SLEPc with the >>> following settings: >>> -st_type cayley >>> -st_shift -1 >>> -st_cayley_antishift 1 >>> >>> But I never get the correct eigenvalue, instead SLEPc returns only the >>> value of st_shift: >>> [0] Number of iterations of the method: 1 >>> [0] Solution method: krylovschur >>> [0] Number of requested eigenvalues: 1 >>> [0] Stopping condition: tol=1e-08, maxit=19382 >>> [0] Number of converged eigenpairs: 16 >>> [0] >>> [0] k ||Ax-kx||/||kx|| >>> [0] ----------------- ------------------ >>> [0] -1.000000 0.0281754 >>> [0] -1.000000 0.0286815 >>> [0] -1.000000 0.0109186 >>> [0] -1.000000 0.140883 >>> [0] -1.000000 0.203036 >>> [0] -1.000000 0.00801616 >>> [0] -1.000000 0.0526871 >>> [0] -1.000000 0.022244 >>> [0] -1.000000 0.0182197 >>> [0] -1.000000 0.0107924 >>> [0] -1.000000 0.00963378 >>> [0] -1.000000 0.0239422 >>> [0] -1.000000 0.00472435 >>> [0] -1.000000 0.00607732 >>> [0] -1.000000 0.0124056 >>> [0] -1.000000 0.00557715 >>> >>> Also, it doesn't matter if I'm using exact or inexact solves. Changing >>> the values of shift and antishift also doesn't change the behaviour. Do >>> I need to make additional adjustments to get cayley to work? >>> >>> Best regards, >>> Michael >>> >>> >>> >>> Am 25.09.19 um 17:21 schrieb Jose E. Roman: >>>>> El 25 sept 2019, a las 16:18, Michael Werner via petsc-users escribi?: >>>>> >>>>> Hello, >>>>> >>>>> I'm looking for advice on how to set shift and antishift for the cayley >>>>> spectral transformation. So far I've been using sinvert to find the >>>>> eigenvalues with the smallest real part (but possibly large imaginary >>>>> part). For this, I use the following options: >>>>> -st_type sinvert >>>>> -eps_target -0.05 >>>>> -eps_target_real >>>>> >>>>> With sinvert, it is easy to understand how to chose the target, but for >>>>> Cayley I'm not sure how to set shift and antishift. What is the >>>>> mathematical meaning of the antishift? >>>>> >>>>> Best regards, >>>>> Michael Werner >>>> In exact arithmetic, both shift-and-invert and Cayley build the same Krylov subspace, so no difference. If the linear solves are computed "inexactly" (iterative solver) then Cayley may have some advantage, but it depends on the application. Also, iterative solvers usually are not robust enough in this context. You can see the discussion here https://doi.org/10.1108/09615530410544328 >>>> >>>> Jose >>>> > > > > From michael.werner at dlr.de Fri Sep 27 09:09:26 2019 From: michael.werner at dlr.de (Michael Werner) Date: Fri, 27 Sep 2019 16:09:26 +0200 Subject: [petsc-users] SLEPc - st_type cayley choice of shift and antishift In-Reply-To: <160372D7-892C-443B-8E88-7576897C2AAB@dsic.upv.es> References: <816e6835-c273-178b-e532-5dec0fe0710c@dlr.de> <2E0C9AFB-9B3B-427D-AF15-66FEC9DB8115@dsic.upv.es> <5baec17c-7ebb-114b-2fc7-5281f2285c41@dlr.de> <6A0885FA-B484-4DC9-8920-70BC463C25C0@dsic.upv.es> <84721713-83fd-8884-5e8c-0d6db748b6d1@dlr.de> <160372D7-892C-443B-8E88-7576897C2AAB@dsic.upv.es> Message-ID: Ah, yes, I didn't notice the difference. Now its working. Thank you! Michael Am 27.09.19 um 15:54 schrieb Jose E. Roman: > I now see what is happening. 
In the expression of the paper the antishift has different sign compared to the expression used in SLEPc (see the users manual): > > (A-sigma*B)^{-1}*(A+nu*B)x = \theta x > > So nu=-sigma is a forbidden value, otherwise both factors cancel out (I will fix the interface so that this is catched). > > In your case you should do -eps_target -1 -st_cayley_antishift -1 > > Jose > > >> El 27 sept 2019, a las 13:54, Michael Werner escribi?: >> >> Yes, with sinvert its working. And using -eps_target instead of >> -st_shift didn't change anything. >> >> I also just sent you the matrices for reproduction of the issue. >> >> Michael >> >> Am 27.09.19 um 13:32 schrieb Jose E. Roman: >>> Try setting -eps_target -1 instead of -st_shift -1 >>> Does sinvert work with target -1? >>> Can you send me the matrices so that I can reproduce the issue? >>> >>> Jose >>> >>> >>>> El 27 sept 2019, a las 13:11, Michael Werner escribi?: >>>> >>>> Thank you for the link to the paper, it's quite interesting and pretty >>>> close to what I'm doing. I'm currently also using the "inexact" approach >>>> for my application, and in general it works, as long as the ksp >>>> tolerance is low enough. However, I was hoping to speed up convergence >>>> towards the "interesting" eigenvalues by using Cayley. >>>> >>>> Now as a test I tried to follow the approach from your paper, choosing >>>> mu = -sigma, and mu in the order of magnitude of the imaginary part of >>>> the most amplified eigenvalue. I know the most amplified eigenvalue for >>>> my problem is -0.0398+0.724i, so I tried running SLEPc with the >>>> following settings: >>>> -st_type cayley >>>> -st_shift -1 >>>> -st_cayley_antishift 1 >>>> >>>> But I never get the correct eigenvalue, instead SLEPc returns only the >>>> value of st_shift: >>>> [0] Number of iterations of the method: 1 >>>> [0] Solution method: krylovschur >>>> [0] Number of requested eigenvalues: 1 >>>> [0] Stopping condition: tol=1e-08, maxit=19382 >>>> [0] Number of converged eigenpairs: 16 >>>> [0] >>>> [0] k ||Ax-kx||/||kx|| >>>> [0] ----------------- ------------------ >>>> [0] -1.000000 0.0281754 >>>> [0] -1.000000 0.0286815 >>>> [0] -1.000000 0.0109186 >>>> [0] -1.000000 0.140883 >>>> [0] -1.000000 0.203036 >>>> [0] -1.000000 0.00801616 >>>> [0] -1.000000 0.0526871 >>>> [0] -1.000000 0.022244 >>>> [0] -1.000000 0.0182197 >>>> [0] -1.000000 0.0107924 >>>> [0] -1.000000 0.00963378 >>>> [0] -1.000000 0.0239422 >>>> [0] -1.000000 0.00472435 >>>> [0] -1.000000 0.00607732 >>>> [0] -1.000000 0.0124056 >>>> [0] -1.000000 0.00557715 >>>> >>>> Also, it doesn't matter if I'm using exact or inexact solves. Changing >>>> the values of shift and antishift also doesn't change the behaviour. Do >>>> I need to make additional adjustments to get cayley to work? >>>> >>>> Best regards, >>>> Michael >>>> >>>> >>>> >>>> Am 25.09.19 um 17:21 schrieb Jose E. Roman: >>>>>> El 25 sept 2019, a las 16:18, Michael Werner via petsc-users escribi?: >>>>>> >>>>>> Hello, >>>>>> >>>>>> I'm looking for advice on how to set shift and antishift for the cayley >>>>>> spectral transformation. So far I've been using sinvert to find the >>>>>> eigenvalues with the smallest real part (but possibly large imaginary >>>>>> part). For this, I use the following options: >>>>>> -st_type sinvert >>>>>> -eps_target -0.05 >>>>>> -eps_target_real >>>>>> >>>>>> With sinvert, it is easy to understand how to chose the target, but for >>>>>> Cayley I'm not sure how to set shift and antishift. What is the >>>>>> mathematical meaning of the antishift? 
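(For reference, the corrected Cayley setup that resolves this thread, sigma = -1 with antishift nu = -1, since nu = -sigma would make the two factors cancel, can be written from a slepc4py script purely through the options database. This is a sketch under assumptions: the matrix name A is not from the thread, and the rest of the problem setup is assumed to happen elsewhere.)

import sys
import slepc4py
slepc4py.init(sys.argv)
from petsc4py import PETSc
from slepc4py import SLEPc

opts = PETSc.Options()
opts['st_type'] = 'cayley'
opts['eps_target'] = -1.0            # sigma = -1 (use -eps_target, not -st_shift)
opts['st_cayley_antishift'] = -1.0   # nu = -1; nu = +1 = -sigma is the forbidden value

E = SLEPc.EPS().create()
E.setOperators(A)                    # A assembled elsewhere (assumption)
E.setFromOptions()                   # picks up the Cayley options set above
E.solve()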
>>>>>> >>>>>> Best regards, >>>>>> Michael Werner >>>>> In exact arithmetic, both shift-and-invert and Cayley build the same Krylov subspace, so no difference. If the linear solves are computed "inexactly" (iterative solver) then Cayley may have some advantage, but it depends on the application. Also, iterative solvers usually are not robust enough in this context. You can see the discussion here https://doi.org/10.1108/09615530410544328 >>>>> >>>>> Jose >>>>> >> >> >> -- ____________________________________________________ Deutsches Zentrum f?r Luft- und Raumfahrt e.V. (DLR) Institut f?r Aerodynamik und Str?mungstechnik | Bunsenstr. 10 | 37073 G?ttingen Michael Werner Telefon 0551 709-2627 | Telefax 0551 709-2811 | Michael.Werner at dlr.de DLR.de From hzhang at mcs.anl.gov Fri Sep 27 09:26:28 2019 From: hzhang at mcs.anl.gov (Zhang, Hong) Date: Fri, 27 Sep 2019 14:26:28 +0000 Subject: [petsc-users] Computing part of the inverse of a large matrix In-Reply-To: <1D24C212-DEB9-438F-B1E9-8D64494D1CA2@anl.gov> References: <1D24C212-DEB9-438F-B1E9-8D64494D1CA2@anl.gov> Message-ID: See ~petsc/src/mat/examples/tests/ex214.c on how to compute selected entries of inv(A) using mumps. Hong On Fri, Sep 27, 2019 at 8:04 AM Smith, Barry F. via petsc-users > wrote: MatMumpsGetInverse() maybe useful. Also simply using MatMatSolve() with the first 1000 columns of the identity and "throwing away" the part you don't need may be most effective. Barry > On Sep 27, 2019, at 3:34 AM, Jan Grie?er via petsc-users > wrote: > > Hi all, > i am using petsc4py. I am dealing with rather large sparse matrices up to 600kx600k and i am interested in calculating a part of the inverse of the matrix(I know it will be a dense matrix). Due to the nature of my problem, I am only interested in approximately the first 1000 rows and 1000 columns (i.e. a large block in the upper left ofthe matrix). Before I start to play around now, I wanted to ask if there is a clever way to tackle this kind of problem in PETSc in principle. For any input I would be very grateful! > Greetings Jan -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sat Sep 28 00:11:54 2019 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 28 Sep 2019 06:11:54 +0100 Subject: [petsc-users] DIVERGED_LINEAR_SOLVE in SNES In-Reply-To: References: <5D0BAE9F-7BC7-4888-9F29-BEE65F4EDD79@anl.gov> Message-ID: On Fri, Sep 27, 2019 at 9:19 AM Li Luo via petsc-users < petsc-users at mcs.anl.gov> wrote: > Thank you for the suggestions, I'll try it. > I think an easier way to do what you want is -snes_max_linear_solve_fail 10 from https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNESSetMaxLinearSolveFailures.html Thanks, Matt > Best, > Li > > On Fri, Sep 27, 2019 at 5:27 AM Smith, Barry F. > wrote: > >> >> Li, >> >> It is possible, but as Hong said probably never appropriate. >> Especially if KSP has iterated for 10,000 iterations. If you want SNES to >> "try" the direction given by a failed solve you should use a much smaller >> maximum number of iterations for KSP. >> >> Anways, to do what you desire >> >> #include >> >> call KSPSetPostSolve() and in your post solve() function simply do >> ksp->reason = KSP_CONVERGED_ITS. >> >> Barry >> >> >> >> >> > On Sep 26, 2019, at 10:58 AM, Zhang, Hong via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> > >> > Li : >> > You can use '-ksp_max_it 20000' to change maximum iteration count. 
>> However, it does not make sense to continue after it fails at 10000 >> iterations. You should figure out why linear solver diverges. Run your code >> with '-ksp_monitor' or '-ksp_monitor_true_residual'. >> > Hong >> > >> > Dear developer, >> > >> > I am using SNES for solving a nonlinear system. For some cases, SNES >> diverged -3 with "DIVERGED_LINEAR_SOLVE" when the linear solver reached its >> maximum iteration count (i.e -ksp_max_it 10000). >> > Is that possible to let SNES continue even though the linear solver >> reaches the maximum number of iterations? Just take the result at 10000 for >> the Jacobian solution and then update the Newton step? >> > >> > Best, >> > Li >> > >> > >> > >> > This message and its contents, including attachments are intended >> solely for the original recipient. If you are not the intended recipient or >> have received this message in error, please notify me immediately and >> delete this message from your computer system. Any unauthorized use or >> distribution is prohibited. Please consider the environment before printing >> this email. >> >> > > -- > > Postdoctoral Fellow > Extreme Computing Research Center > King Abdullah University of Science & Technology > https://sites.google.com/site/rolyliluo/ > > ------------------------------ > This message and its contents, including attachments are intended solely > for the original recipient. If you are not the intended recipient or have > received this message in error, please notify me immediately and delete > this message from your computer system. Any unauthorized use or > distribution is prohibited. Please consider the environment before printing > this email. -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sat Sep 28 00:20:04 2019 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 28 Sep 2019 06:20:04 +0100 Subject: [petsc-users] DMPlexCreateFromDAG in parallel In-Reply-To: References: Message-ID: On Wed, Sep 25, 2019 at 5:56 PM Asitav Mishra via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hi, > > I have a native distributed mesh graph across multiple processors, using > which I would want to create DMPlex mesh using DMPlexCreateFromDAG. I see > in Petsc plex/examples that DMPlexCreateFromDAG creates DM only from master > processor and then the DM is distributed across multiple (one-to-many) > processors. My question is: is it possible to create DAG locally in each > processor and then build the global DM? If yes, are there any such examples? > 1) If you do not mind us redistributing the mesh on input, then you can probably do what you want using https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DMPLEX/DMPlexCreateFromCellListParallel.html Note that the input to this function wants a unique set of vertices from each process, so each vertex must come from only one processes. 2) If that does not work, you can certainly call CreateFromDAG() on multiple processes. However, then you must manually create the PetscSF which describes how the mesh is connected in parallel. If this is what you need to do, I can give you instructions but at that point we should probably make an example that does it. 
Thanks, Matt > Best, > Asitav > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sat Sep 28 00:21:32 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Sat, 28 Sep 2019 05:21:32 +0000 Subject: [petsc-users] DIVERGED_LINEAR_SOLVE in SNES In-Reply-To: References: <5D0BAE9F-7BC7-4888-9F29-BEE65F4EDD79@anl.gov> Message-ID: <56CE8794-7550-4B4A-8DC3-31F8B53DAD49@mcs.anl.gov> Matt is right. I had completely forgotten about this option and gave you an overly complicated solution. Barry > On Sep 28, 2019, at 12:11 AM, Matthew Knepley wrote: > > On Fri, Sep 27, 2019 at 9:19 AM Li Luo via petsc-users wrote: > Thank you for the suggestions, I'll try it. > > I think an easier way to do what you want is > > -snes_max_linear_solve_fail 10 > > from > > https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNESSetMaxLinearSolveFailures.html > > Thanks, > > Matt > > Best, > Li > > On Fri, Sep 27, 2019 at 5:27 AM Smith, Barry F. wrote: > > Li, > > It is possible, but as Hong said probably never appropriate. Especially if KSP has iterated for 10,000 iterations. If you want SNES to "try" the direction given by a failed solve you should use a much smaller maximum number of iterations for KSP. > > Anways, to do what you desire > > #include > > call KSPSetPostSolve() and in your post solve() function simply do ksp->reason = KSP_CONVERGED_ITS. > > Barry > > > > > > On Sep 26, 2019, at 10:58 AM, Zhang, Hong via petsc-users wrote: > > > > Li : > > You can use '-ksp_max_it 20000' to change maximum iteration count. However, it does not make sense to continue after it fails at 10000 iterations. You should figure out why linear solver diverges. Run your code with '-ksp_monitor' or '-ksp_monitor_true_residual'. > > Hong > > > > Dear developer, > > > > I am using SNES for solving a nonlinear system. For some cases, SNES diverged -3 with "DIVERGED_LINEAR_SOLVE" when the linear solver reached its maximum iteration count (i.e -ksp_max_it 10000). > > Is that possible to let SNES continue even though the linear solver reaches the maximum number of iterations? Just take the result at 10000 for the Jacobian solution and then update the Newton step? > > > > Best, > > Li > > > > > > > > This message and its contents, including attachments are intended solely for the original recipient. If you are not the intended recipient or have received this message in error, please notify me immediately and delete this message from your computer system. Any unauthorized use or distribution is prohibited. Please consider the environment before printing this email. > > > > -- > Postdoctoral Fellow > Extreme Computing Research Center > King Abdullah University of Science & Technology > https://sites.google.com/site/rolyliluo/ > > This message and its contents, including attachments are intended solely for the original recipient. If you are not the intended recipient or have received this message in error, please notify me immediately and delete this message from your computer system. Any unauthorized use or distribution is prohibited. Please consider the environment before printing this email. > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
> -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ From li.luo at kaust.edu.sa Sat Sep 28 05:22:44 2019 From: li.luo at kaust.edu.sa (Li Luo) Date: Sat, 28 Sep 2019 13:22:44 +0300 Subject: [petsc-users] DIVERGED_LINEAR_SOLVE in SNES In-Reply-To: References: <5D0BAE9F-7BC7-4888-9F29-BEE65F4EDD79@anl.gov> Message-ID: Thank you! It works. Best, Li On Sat, Sep 28, 2019 at 8:12 AM Matthew Knepley wrote: > On Fri, Sep 27, 2019 at 9:19 AM Li Luo via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >> Thank you for the suggestions, I'll try it. >> > > I think an easier way to do what you want is > > -snes_max_linear_solve_fail 10 > > from > > > https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNESSetMaxLinearSolveFailures.html > > Thanks, > > Matt > > >> Best, >> Li >> >> On Fri, Sep 27, 2019 at 5:27 AM Smith, Barry F. >> wrote: >> >>> >>> Li, >>> >>> It is possible, but as Hong said probably never appropriate. >>> Especially if KSP has iterated for 10,000 iterations. If you want SNES to >>> "try" the direction given by a failed solve you should use a much smaller >>> maximum number of iterations for KSP. >>> >>> Anways, to do what you desire >>> >>> #include >>> >>> call KSPSetPostSolve() and in your post solve() function simply do >>> ksp->reason = KSP_CONVERGED_ITS. >>> >>> Barry >>> >>> >>> >>> >>> > On Sep 26, 2019, at 10:58 AM, Zhang, Hong via petsc-users < >>> petsc-users at mcs.anl.gov> wrote: >>> > >>> > Li : >>> > You can use '-ksp_max_it 20000' to change maximum iteration count. >>> However, it does not make sense to continue after it fails at 10000 >>> iterations. You should figure out why linear solver diverges. Run your code >>> with '-ksp_monitor' or '-ksp_monitor_true_residual'. >>> > Hong >>> > >>> > Dear developer, >>> > >>> > I am using SNES for solving a nonlinear system. For some cases, SNES >>> diverged -3 with "DIVERGED_LINEAR_SOLVE" when the linear solver reached its >>> maximum iteration count (i.e -ksp_max_it 10000). >>> > Is that possible to let SNES continue even though the linear solver >>> reaches the maximum number of iterations? Just take the result at 10000 for >>> the Jacobian solution and then update the Newton step? >>> > >>> > Best, >>> > Li >>> > >>> > >>> > >>> > This message and its contents, including attachments are intended >>> solely for the original recipient. If you are not the intended recipient or >>> have received this message in error, please notify me immediately and >>> delete this message from your computer system. Any unauthorized use or >>> distribution is prohibited. Please consider the environment before printing >>> this email. >>> >>> >> >> -- >> >> Postdoctoral Fellow >> Extreme Computing Research Center >> King Abdullah University of Science & Technology >> https://sites.google.com/site/rolyliluo/ >> >> ------------------------------ >> This message and its contents, including attachments are intended solely >> for the original recipient. If you are not the intended recipient or have >> received this message in error, please notify me immediately and delete >> this message from your computer system. Any unauthorized use or >> distribution is prohibited. Please consider the environment before printing >> this email. > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. 
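(To close the loop on this thread in code: the options-database route is the simplest, shown here as a petsc4py sketch for brevity; the same two options work unchanged on the command line of a C or Fortran code. The SNES object name, the solution vector x, and the smaller KSP iteration cap are illustrative assumptions.)

import sys
import petsc4py
petsc4py.init(sys.argv)
from petsc4py import PETSc

opts = PETSc.Options()
opts['snes_max_linear_solve_fail'] = 10   # tolerate up to 10 failed linear solves before SNES gives up
opts['ksp_max_it'] = 1000                 # optionally cap the inner KSP at fewer iterations

snes = PETSc.SNES().create()
# ... set the residual function and Jacobian on snes as usual (omitted) ...
snes.setFromOptions()                     # picks up the two options above
snes.solve(None, x)                       # x is the solution vector (assumption)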
> -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -- Postdoctoral Fellow Extreme Computing Research Center King Abdullah University of Science & Technology https://sites.google.com/site/rolyliluo/ -- This message and its contents, including attachments are intended solely for the original recipient. If you are not the intended recipient or have received this message in error, please notify me immediately and delete this message from your computer system. Any unauthorized use or distribution is prohibited. Please consider the environment before printing this email. -------------- next part -------------- An HTML attachment was scrubbed... URL: From michael.wick.1980 at gmail.com Sat Sep 28 05:32:30 2019 From: michael.wick.1980 at gmail.com (Michael Wick) Date: Sat, 28 Sep 2019 03:32:30 -0700 Subject: [petsc-users] [petsc-maint] petsc ksp solver hangs In-Reply-To: References: Message-ID: I attached a debugger to my run. The code just hangs without throwing an error message, interestingly. I uses 72 processors. I turned on the ksp monitor. And I can see it hangs either at the beginning or the end of KSP iteration. I also uses valgrind to debug my code on my local machine, which does not detect any issue. I uses fgmres + fieldsplit, which is really a standard option. Do you have any suggestions to do? On Fri, Sep 27, 2019 at 8:17 PM Zhang, Junchao wrote: > How many MPI ranks did you use? If it is done on your desktop, you can > just attach a debugger to a MPI process to see what is going on. > > --Junchao Zhang > > > On Fri, Sep 27, 2019 at 4:24 PM Michael Wick via petsc-maint < > petsc-maint at mcs.anl.gov> wrote: > >> Hi PETSc: >> >> I have been experiencing a code stagnation at certain KSP iterations. >> This happens rather randomly, which means the code may stop at the middle >> of a KSP solve and hangs there. >> >> I have used valgrind and detect nothing. I just wonder if you have any >> suggestions. >> >> Thanks!!! >> M >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sat Sep 28 06:17:46 2019 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 28 Sep 2019 07:17:46 -0400 Subject: [petsc-users] [petsc-maint] petsc ksp solver hangs In-Reply-To: References: Message-ID: What we want to see is the stack trace. Thanks Matt On Sat, Sep 28, 2019, 06:33 Michael Wick via petsc-maint < petsc-maint at mcs.anl.gov> wrote: > I attached a debugger to my run. The code just hangs without throwing an > error message, interestingly. I uses 72 processors. I turned on the ksp > monitor. And I can see it hangs either at the beginning or the end of KSP > iteration. I also uses valgrind to debug my code on my local machine, which > does not detect any issue. I uses fgmres + fieldsplit, which is really a > standard option. > > Do you have any suggestions to do? > > On Fri, Sep 27, 2019 at 8:17 PM Zhang, Junchao > wrote: > >> How many MPI ranks did you use? If it is done on your desktop, you can >> just attach a debugger to a MPI process to see what is going on. >> >> --Junchao Zhang >> >> >> On Fri, Sep 27, 2019 at 4:24 PM Michael Wick via petsc-maint < >> petsc-maint at mcs.anl.gov> wrote: >> >>> Hi PETSc: >>> >>> I have been experiencing a code stagnation at certain KSP iterations. >>> This happens rather randomly, which means the code may stop at the middle >>> of a KSP solve and hangs there. >>> >>> I have used valgrind and detect nothing. I just wonder if you have any >>> suggestions. >>> >>> Thanks!!! 
>>> M >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From jczhang at mcs.anl.gov Sat Sep 28 08:22:43 2019 From: jczhang at mcs.anl.gov (Zhang, Junchao) Date: Sat, 28 Sep 2019 13:22:43 +0000 Subject: [petsc-users] [petsc-maint] petsc ksp solver hangs In-Reply-To: References: Message-ID: Does it hang with 2 or 4 processes? Which PETSc version do you use (using the latest is easier for us to debug)? Did you configure PETSc with --with-debugging=yes COPTFLAGS="-O0 -g" CXXOPTFLAGS="-O0 -g" After attaching gdb to one process, you can use bt to see its stack trace. --Junchao Zhang On Sat, Sep 28, 2019 at 5:33 AM Michael Wick > wrote: I attached a debugger to my run. The code just hangs without throwing an error message, interestingly. I uses 72 processors. I turned on the ksp monitor. And I can see it hangs either at the beginning or the end of KSP iteration. I also uses valgrind to debug my code on my local machine, which does not detect any issue. I uses fgmres + fieldsplit, which is really a standard option. Do you have any suggestions to do? On Fri, Sep 27, 2019 at 8:17 PM Zhang, Junchao > wrote: How many MPI ranks did you use? If it is done on your desktop, you can just attach a debugger to a MPI process to see what is going on. --Junchao Zhang On Fri, Sep 27, 2019 at 4:24 PM Michael Wick via petsc-maint > wrote: Hi PETSc: I have been experiencing a code stagnation at certain KSP iterations. This happens rather randomly, which means the code may stop at the middle of a KSP solve and hangs there. I have used valgrind and detect nothing. I just wonder if you have any suggestions. Thanks!!! M -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefano.zampini at gmail.com Sat Sep 28 08:28:08 2019 From: stefano.zampini at gmail.com (Stefano Zampini) Date: Sat, 28 Sep 2019 16:28:08 +0300 Subject: [petsc-users] [petsc-maint] petsc ksp solver hangs In-Reply-To: References: Message-ID: In my experience, an hanging execution may results from seterrq being called with the wrong communicator. Anyway, it would be useful to get the output of -log_trace . Also, does it hang when -pc_type none is specified? Il Sab 28 Set 2019, 16:22 Zhang, Junchao via petsc-users < petsc-users at mcs.anl.gov> ha scritto: > Does it hang with 2 or 4 processes? Which PETSc version do you use (using > the latest is easier for us to debug)? Did you configure PETSc with > --with-debugging=yes COPTFLAGS="-O0 -g" CXXOPTFLAGS="-O0 -g" > After attaching gdb to one process, you can use bt to see its stack trace. > > --Junchao Zhang > > > On Sat, Sep 28, 2019 at 5:33 AM Michael Wick > wrote: > >> I attached a debugger to my run. The code just hangs without throwing an >> error message, interestingly. I uses 72 processors. I turned on the ksp >> monitor. And I can see it hangs either at the beginning or the end of KSP >> iteration. I also uses valgrind to debug my code on my local machine, which >> does not detect any issue. I uses fgmres + fieldsplit, which is really a >> standard option. >> >> Do you have any suggestions to do? >> >> On Fri, Sep 27, 2019 at 8:17 PM Zhang, Junchao >> wrote: >> >>> How many MPI ranks did you use? If it is done on your desktop, you can >>> just attach a debugger to a MPI process to see what is going on. 
>>> >>> --Junchao Zhang >>> >>> >>> On Fri, Sep 27, 2019 at 4:24 PM Michael Wick via petsc-maint < >>> petsc-maint at mcs.anl.gov> wrote: >>> >>>> Hi PETSc: >>>> >>>> I have been experiencing a code stagnation at certain KSP iterations. >>>> This happens rather randomly, which means the code may stop at the middle >>>> of a KSP solve and hangs there. >>>> >>>> I have used valgrind and detect nothing. I just wonder if you have any >>>> suggestions. >>>> >>>> Thanks!!! >>>> M >>>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From michael.wick.1980 at gmail.com Sun Sep 29 00:28:49 2019 From: michael.wick.1980 at gmail.com (Michael Wick) Date: Sat, 28 Sep 2019 22:28:49 -0700 Subject: [petsc-users] [petsc-maint] petsc ksp solver hangs In-Reply-To: References: Message-ID: Thank you all for the reply. I am trying to get the backtrace. However, the code hangs totally randomly, and it hangs only when I run large simulations (e.g. 72 CPUs for this one). I am trying very hard to get the error message. So far, I can pin-point that the issue is related with hypre, and a static build of the petsc library. Switching to a dynamic build works fine so far. Also, using a naked gmres works. Does anyone have similar issues before? On Sat, Sep 28, 2019 at 6:28 AM Stefano Zampini wrote: > In my experience, an hanging execution may results from seterrq being > called with the wrong communicator. Anyway, it would be useful to get the > output of -log_trace . > > Also, does it hang when -pc_type none is specified? > > Il Sab 28 Set 2019, 16:22 Zhang, Junchao via petsc-users < > petsc-users at mcs.anl.gov> ha scritto: > >> Does it hang with 2 or 4 processes? Which PETSc version do you use >> (using the latest is easier for us to debug)? Did you configure PETSc with >> --with-debugging=yes COPTFLAGS="-O0 -g" CXXOPTFLAGS="-O0 -g" >> After attaching gdb to one process, you can use bt to see its stack >> trace. >> >> --Junchao Zhang >> >> >> On Sat, Sep 28, 2019 at 5:33 AM Michael Wick >> wrote: >> >>> I attached a debugger to my run. The code just hangs without throwing an >>> error message, interestingly. I uses 72 processors. I turned on the ksp >>> monitor. And I can see it hangs either at the beginning or the end of KSP >>> iteration. I also uses valgrind to debug my code on my local machine, which >>> does not detect any issue. I uses fgmres + fieldsplit, which is really a >>> standard option. >>> >>> Do you have any suggestions to do? >>> >>> On Fri, Sep 27, 2019 at 8:17 PM Zhang, Junchao >>> wrote: >>> >>>> How many MPI ranks did you use? If it is done on your desktop, you can >>>> just attach a debugger to a MPI process to see what is going on. >>>> >>>> --Junchao Zhang >>>> >>>> >>>> On Fri, Sep 27, 2019 at 4:24 PM Michael Wick via petsc-maint < >>>> petsc-maint at mcs.anl.gov> wrote: >>>> >>>>> Hi PETSc: >>>>> >>>>> I have been experiencing a code stagnation at certain KSP iterations. >>>>> This happens rather randomly, which means the code may stop at the middle >>>>> of a KSP solve and hangs there. >>>>> >>>>> I have used valgrind and detect nothing. I just wonder if you have any >>>>> suggestions. >>>>> >>>>> Thanks!!! >>>>> M >>>>> >>>> -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mfadams at lbl.gov Sun Sep 29 08:02:10 2019 From: mfadams at lbl.gov (Mark Adams) Date: Sun, 29 Sep 2019 09:02:10 -0400 Subject: [petsc-users] [petsc-maint] petsc ksp solver hangs In-Reply-To: References: Message-ID: On Sun, Sep 29, 2019 at 1:30 AM Michael Wick via petsc-maint < petsc-maint at mcs.anl.gov> wrote: > Thank you all for the reply. > > I am trying to get the backtrace. However, the code hangs totally > randomly, and it hangs only when I run large simulations (e.g. 72 CPUs for > this one). I am trying very hard to get the error message. > > So far, I can pin-point that the issue is related with hypre, and a static > build of the petsc library. Switching to a dynamic build works fine so far. > Also, using a naked gmres works. Does anyone have similar issues before? > I've never heard of a problem like this. You might try deleting your architectured directory (a make clean essentially) and reconfigure. If dynamic builds work is there any reason not to just do that and move on? > > On Sat, Sep 28, 2019 at 6:28 AM Stefano Zampini > wrote: > >> In my experience, an hanging execution may results from seterrq being >> called with the wrong communicator. Anyway, it would be useful to get the >> output of -log_trace . >> >> Also, does it hang when -pc_type none is specified? >> >> Il Sab 28 Set 2019, 16:22 Zhang, Junchao via petsc-users < >> petsc-users at mcs.anl.gov> ha scritto: >> >>> Does it hang with 2 or 4 processes? Which PETSc version do you use >>> (using the latest is easier for us to debug)? Did you configure PETSc with >>> --with-debugging=yes COPTFLAGS="-O0 -g" CXXOPTFLAGS="-O0 -g" >>> After attaching gdb to one process, you can use bt to see its stack >>> trace. >>> >>> --Junchao Zhang >>> >>> >>> On Sat, Sep 28, 2019 at 5:33 AM Michael Wick < >>> michael.wick.1980 at gmail.com> wrote: >>> >>>> I attached a debugger to my run. The code just hangs without throwing >>>> an error message, interestingly. I uses 72 processors. I turned on the ksp >>>> monitor. And I can see it hangs either at the beginning or the end of KSP >>>> iteration. I also uses valgrind to debug my code on my local machine, which >>>> does not detect any issue. I uses fgmres + fieldsplit, which is really a >>>> standard option. >>>> >>>> Do you have any suggestions to do? >>>> >>>> On Fri, Sep 27, 2019 at 8:17 PM Zhang, Junchao >>>> wrote: >>>> >>>>> How many MPI ranks did you use? If it is done on your desktop, you can >>>>> just attach a debugger to a MPI process to see what is going on. >>>>> >>>>> --Junchao Zhang >>>>> >>>>> >>>>> On Fri, Sep 27, 2019 at 4:24 PM Michael Wick via petsc-maint < >>>>> petsc-maint at mcs.anl.gov> wrote: >>>>> >>>>>> Hi PETSc: >>>>>> >>>>>> I have been experiencing a code stagnation at certain KSP iterations. >>>>>> This happens rather randomly, which means the code may stop at the middle >>>>>> of a KSP solve and hangs there. >>>>>> >>>>>> I have used valgrind and detect nothing. I just wonder if you have >>>>>> any suggestions. >>>>>> >>>>>> Thanks!!! >>>>>> M >>>>>> >>>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sun Sep 29 11:24:33 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) 
Date: Sun, 29 Sep 2019 16:24:33 +0000 Subject: [petsc-users] [petsc-maint] petsc ksp solver hangs In-Reply-To: References: Message-ID: <9A5F4C32-7D74-4202-99E4-BF1542015B50@anl.gov> If you have TotalView or DDT or some other parallel debugger you can wait until it is "hanging" and then send a single to one or more of the processes to stop in and from this get the stack trace. You'll have to figure out for your debugger how that is done. If you can start your 72 rank job in "interactive" mode you can launch it with the option -start_in_debugger noxterm -debugger_nodes 0 then it will only start the debugger on the first rank. Now wait until it hangs and do a control c and then you can type bt to get the traceback. Barry Note it is possible to run 72 rank jobs even on a laptop/workstations/non-cluster (so long as they don't use too much memory and take too long to get to the hang point) and the you can use the debugger as I indicated above. > On Sep 28, 2019, at 5:32 AM, Michael Wick via petsc-maint wrote: > > I attached a debugger to my run. The code just hangs without throwing an error message, interestingly. I uses 72 processors. I turned on the ksp monitor. And I can see it hangs either at the beginning or the end of KSP iteration. I also uses valgrind to debug my code on my local machine, which does not detect any issue. I uses fgmres + fieldsplit, which is really a standard option. > > Do you have any suggestions to do? > > On Fri, Sep 27, 2019 at 8:17 PM Zhang, Junchao wrote: > How many MPI ranks did you use? If it is done on your desktop, you can just attach a debugger to a MPI process to see what is going on. > > --Junchao Zhang > > > On Fri, Sep 27, 2019 at 4:24 PM Michael Wick via petsc-maint wrote: > Hi PETSc: > > I have been experiencing a code stagnation at certain KSP iterations. This happens rather randomly, which means the code may stop at the middle of a KSP solve and hangs there. > > I have used valgrind and detect nothing. I just wonder if you have any suggestions. > > Thanks!!! > M From michael.wick.1980 at gmail.com Mon Sep 30 00:32:01 2019 From: michael.wick.1980 at gmail.com (Michael Wick) Date: Sun, 29 Sep 2019 22:32:01 -0700 Subject: [petsc-users] [petsc-maint] petsc ksp solver hangs In-Reply-To: <9A5F4C32-7D74-4202-99E4-BF1542015B50@anl.gov> References: <9A5F4C32-7D74-4202-99E4-BF1542015B50@anl.gov> Message-ID: Hi Barry: Thanks! I can capture an issue from my local run, although I am not 100% sure this is the reason causing the code hanging. When I run with -pc_hypre_boomeramg_relax_type_all Chebyshev, valgrind captures a memory leak: ==4410== 192 bytes in 8 blocks are indirectly lost in loss record 1 of 5 ==4410== at 0x4C2FB55: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==4410== by 0x73FED84: hypre_HostMalloc (hypre_memory.c:192) ==4410== by 0x73FEE53: hypre_MAllocWithInit (hypre_memory.c:301) ==4410== by 0x73FEF1A: hypre_CAlloc (hypre_memory.c:338) ==4410== by 0x726E4C4: hypre_ParCSRRelax_Cheby_Setup (par_cheby.c:70) ==4410== by 0x7265A4C: hypre_BoomerAMGSetup (par_amg_setup.c:2738) ==4410== by 0x7240F96: HYPRE_BoomerAMGSetup (HYPRE_parcsr_amg.c:52) ==4410== by 0x694FFC2: PCSetUp_HYPRE (hypre.c:322) ==4410== by 0x69DBE0F: PCSetUp (precon.c:923) ==4410== by 0x6B2BDDC: KSPSetUp (itfunc.c:381) ==4410== by 0x6B2DABF: KSPSolve (itfunc.c:612) Best, Mike On Sun, Sep 29, 2019 at 9:24 AM Smith, Barry F. 
wrote: > > If you have TotalView or DDT or some other parallel debugger you can > wait until it is "hanging" and then send a single to one or more of the > processes to stop in and from this get the stack trace. You'll have to > figure out for your debugger how that is done. > > If you can start your 72 rank job in "interactive" mode you can launch > it with the option -start_in_debugger noxterm -debugger_nodes 0 then it > will only start the debugger on the first rank. Now wait until it hangs and > do a control c and then you can type bt to get the traceback. > > Barry > > Note it is possible to run 72 rank jobs even on a > laptop/workstations/non-cluster (so long as they don't use too much memory > and take too long to get to the hang point) and the you can use the > debugger as I indicated above. > > > > On Sep 28, 2019, at 5:32 AM, Michael Wick via petsc-maint < > petsc-maint at mcs.anl.gov> wrote: > > > > I attached a debugger to my run. The code just hangs without throwing an > error message, interestingly. I uses 72 processors. I turned on the ksp > monitor. And I can see it hangs either at the beginning or the end of KSP > iteration. I also uses valgrind to debug my code on my local machine, which > does not detect any issue. I uses fgmres + fieldsplit, which is really a > standard option. > > > > Do you have any suggestions to do? > > > > On Fri, Sep 27, 2019 at 8:17 PM Zhang, Junchao > wrote: > > How many MPI ranks did you use? If it is done on your desktop, you can > just attach a debugger to a MPI process to see what is going on. > > > > --Junchao Zhang > > > > > > On Fri, Sep 27, 2019 at 4:24 PM Michael Wick via petsc-maint < > petsc-maint at mcs.anl.gov> wrote: > > Hi PETSc: > > > > I have been experiencing a code stagnation at certain KSP iterations. > This happens rather randomly, which means the code may stop at the middle > of a KSP solve and hangs there. > > > > I have used valgrind and detect nothing. I just wonder if you have any > suggestions. > > > > Thanks!!! > > M > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Sep 30 01:50:46 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Mon, 30 Sep 2019 06:50:46 +0000 Subject: [petsc-users] [petsc-maint] petsc ksp solver hangs In-Reply-To: References: <9A5F4C32-7D74-4202-99E4-BF1542015B50@anl.gov> Message-ID: <282A2B81-49B4-4240-ADE0-DF11904FF9D9@mcs.anl.gov> This is just a memory leak in hypre; you might report it to them. Memory leaks don't cause hangs Barry > On Sep 30, 2019, at 12:32 AM, Michael Wick wrote: > > Hi Barry: > > Thanks! I can capture an issue from my local run, although I am not 100% sure this is the reason causing the code hanging. 
> > When I run with -pc_hypre_boomeramg_relax_type_all Chebyshev, valgrind captures a memory leak: > > ==4410== 192 bytes in 8 blocks are indirectly lost in loss record 1 of 5 > ==4410== at 0x4C2FB55: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) > ==4410== by 0x73FED84: hypre_HostMalloc (hypre_memory.c:192) > ==4410== by 0x73FEE53: hypre_MAllocWithInit (hypre_memory.c:301) > ==4410== by 0x73FEF1A: hypre_CAlloc (hypre_memory.c:338) > ==4410== by 0x726E4C4: hypre_ParCSRRelax_Cheby_Setup (par_cheby.c:70) > ==4410== by 0x7265A4C: hypre_BoomerAMGSetup (par_amg_setup.c:2738) > ==4410== by 0x7240F96: HYPRE_BoomerAMGSetup (HYPRE_parcsr_amg.c:52) > ==4410== by 0x694FFC2: PCSetUp_HYPRE (hypre.c:322) > ==4410== by 0x69DBE0F: PCSetUp (precon.c:923) > ==4410== by 0x6B2BDDC: KSPSetUp (itfunc.c:381) > ==4410== by 0x6B2DABF: KSPSolve (itfunc.c:612) > > Best, > > Mike > > > On Sun, Sep 29, 2019 at 9:24 AM Smith, Barry F. wrote: > > If you have TotalView or DDT or some other parallel debugger you can wait until it is "hanging" and then send a single to one or more of the processes to stop in and from this get the stack trace. You'll have to figure out for your debugger how that is done. > > If you can start your 72 rank job in "interactive" mode you can launch it with the option -start_in_debugger noxterm -debugger_nodes 0 then it will only start the debugger on the first rank. Now wait until it hangs and do a control c and then you can type bt to get the traceback. > > Barry > > Note it is possible to run 72 rank jobs even on a laptop/workstations/non-cluster (so long as they don't use too much memory and take too long to get to the hang point) and the you can use the debugger as I indicated above. > > > > On Sep 28, 2019, at 5:32 AM, Michael Wick via petsc-maint wrote: > > > > I attached a debugger to my run. The code just hangs without throwing an error message, interestingly. I uses 72 processors. I turned on the ksp monitor. And I can see it hangs either at the beginning or the end of KSP iteration. I also uses valgrind to debug my code on my local machine, which does not detect any issue. I uses fgmres + fieldsplit, which is really a standard option. > > > > Do you have any suggestions to do? > > > > On Fri, Sep 27, 2019 at 8:17 PM Zhang, Junchao wrote: > > How many MPI ranks did you use? If it is done on your desktop, you can just attach a debugger to a MPI process to see what is going on. > > > > --Junchao Zhang > > > > > > On Fri, Sep 27, 2019 at 4:24 PM Michael Wick via petsc-maint wrote: > > Hi PETSc: > > > > I have been experiencing a code stagnation at certain KSP iterations. This happens rather randomly, which means the code may stop at the middle of a KSP solve and hangs there. > > > > I have used valgrind and detect nothing. I just wonder if you have any suggestions. > > > > Thanks!!! > > M > From griesser.jan at googlemail.com Mon Sep 30 09:44:46 2019 From: griesser.jan at googlemail.com (=?UTF-8?B?SmFuIEdyaWXDn2Vy?=) Date: Mon, 30 Sep 2019 16:44:46 +0200 Subject: [petsc-users] Computing part of the inverse of a large matrix In-Reply-To: References: <1D24C212-DEB9-438F-B1E9-8D64494D1CA2@anl.gov> Message-ID: Is the MatMumpsGetInverse also wrapped to the python version in PETSc4py ? If yes is there any example for using it ? My other question is related to the LU factoriation ( https://www.mcs.anl.gov/petsc/documentation/faq.html#invertmatrix). Is the LU factorization only possible for sequential Aij matrices ? I read in the docs that this is the case for ordering. 
After setting up my matrix A, B and x i tried: r, c = dynamical_matrix_nn.getOrdering("nd") fac_dyn_matrix = dynamical_matrix_nn.factorLU(r,c) resulting in an error: [0] No support for this operation for this object type [0] Mat type mpiaij Am Fr., 27. Sept. 2019 um 16:26 Uhr schrieb Zhang, Hong : > See ~petsc/src/mat/examples/tests/ex214.c on how to compute selected > entries of inv(A) using mumps. > Hong > > On Fri, Sep 27, 2019 at 8:04 AM Smith, Barry F. via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >> >> MatMumpsGetInverse() maybe useful. Also simply using MatMatSolve() with >> the first 1000 columns of the identity and "throwing away" the part you >> don't need may be most effective. >> >> Barry >> >> >> >> > On Sep 27, 2019, at 3:34 AM, Jan Grie?er via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> > >> > Hi all, >> > i am using petsc4py. I am dealing with rather large sparse matrices up >> to 600kx600k and i am interested in calculating a part of the inverse of >> the matrix(I know it will be a dense matrix). Due to the nature of my >> problem, I am only interested in approximately the first 1000 rows and 1000 >> columns (i.e. a large block in the upper left ofthe matrix). Before I >> start to play around now, I wanted to ask if there is a clever way to >> tackle this kind of problem in PETSc in principle. For any input I would be >> very grateful! >> > Greetings Jan >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Sep 30 09:57:37 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Mon, 30 Sep 2019 14:57:37 +0000 Subject: [petsc-users] Computing part of the inverse of a large matrix In-Reply-To: References: <1D24C212-DEB9-438F-B1E9-8D64494D1CA2@anl.gov> Message-ID: <631C0418-DB21-40AE-B46B-2E39D56EC462@mcs.anl.gov> If you want a parallal LU (and hence the ability to build the inverse in parallel) you need to configure PETSc with --download-mumps --download-scalapack Barry > On Sep 30, 2019, at 9:44 AM, Jan Grie?er wrote: > > Is the MatMumpsGetInverse also wrapped to the python version in PETSc4py ? If yes is there any example for using it ? > My other question is related to the LU factoriation (https://www.mcs.anl.gov/petsc/documentation/faq.html#invertmatrix). > Is the LU factorization only possible for sequential Aij matrices ? I read in the docs that this is the case for ordering. > After setting up my matrix A, B and x i tried: > r, c = dynamical_matrix_nn.getOrdering("nd") > fac_dyn_matrix = dynamical_matrix_nn.factorLU(r,c) > > resulting in an error: > [0] No support for this operation for this object type > [0] Mat type mpiaij > > Am Fr., 27. Sept. 2019 um 16:26 Uhr schrieb Zhang, Hong : > See ~petsc/src/mat/examples/tests/ex214.c on how to compute selected entries of inv(A) using mumps. > Hong > > On Fri, Sep 27, 2019 at 8:04 AM Smith, Barry F. via petsc-users wrote: > > MatMumpsGetInverse() maybe useful. Also simply using MatMatSolve() with the first 1000 columns of the identity and "throwing away" the part you don't need may be most effective. > > Barry > > > > > On Sep 27, 2019, at 3:34 AM, Jan Grie?er via petsc-users wrote: > > > > Hi all, > > i am using petsc4py. I am dealing with rather large sparse matrices up to 600kx600k and i am interested in calculating a part of the inverse of the matrix(I know it will be a dense matrix). Due to the nature of my problem, I am only interested in approximately the first 1000 rows and 1000 columns (i.e. 
a large block in the upper left ofthe matrix). Before I start to play around now, I wanted to ask if there is a clever way to tackle this kind of problem in PETSc in principle. For any input I would be very grateful! > > Greetings Jan > From griesser.jan at googlemail.com Mon Sep 30 10:13:36 2019 From: griesser.jan at googlemail.com (=?UTF-8?B?SmFuIEdyaWXDn2Vy?=) Date: Mon, 30 Sep 2019 17:13:36 +0200 Subject: [petsc-users] Computing part of the inverse of a large matrix In-Reply-To: <631C0418-DB21-40AE-B46B-2E39D56EC462@mcs.anl.gov> References: <1D24C212-DEB9-438F-B1E9-8D64494D1CA2@anl.gov> <631C0418-DB21-40AE-B46B-2E39D56EC462@mcs.anl.gov> Message-ID: I configured PETSc with MUMPS and tested it already for the spectrum slicing method in Slepc4py but i have problems in setting up the LU factorization in the PETSc4py. Since i do not find the corresponding methods and commands in the source code. Thats why is was wondering if this is even possible in the python version. Am Mo., 30. Sept. 2019 um 16:57 Uhr schrieb Smith, Barry F. < bsmith at mcs.anl.gov>: > > If you want a parallal LU (and hence the ability to build the inverse in > parallel) you need to configure PETSc with --download-mumps > --download-scalapack > > Barry > > > > On Sep 30, 2019, at 9:44 AM, Jan Grie?er > wrote: > > > > Is the MatMumpsGetInverse also wrapped to the python version in PETSc4py > ? If yes is there any example for using it ? > > My other question is related to the LU factoriation ( > https://www.mcs.anl.gov/petsc/documentation/faq.html#invertmatrix). > > Is the LU factorization only possible for sequential Aij matrices ? I > read in the docs that this is the case for ordering. > > After setting up my matrix A, B and x i tried: > > r, c = dynamical_matrix_nn.getOrdering("nd") > > fac_dyn_matrix = dynamical_matrix_nn.factorLU(r,c) > > > > resulting in an error: > > [0] No support for this operation for this object type > > [0] Mat type mpiaij > > > > Am Fr., 27. Sept. 2019 um 16:26 Uhr schrieb Zhang, Hong < > hzhang at mcs.anl.gov>: > > See ~petsc/src/mat/examples/tests/ex214.c on how to compute selected > entries of inv(A) using mumps. > > Hong > > > > On Fri, Sep 27, 2019 at 8:04 AM Smith, Barry F. via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > > > MatMumpsGetInverse() maybe useful. Also simply using MatMatSolve() with > the first 1000 columns of the identity and "throwing away" the part you > don't need may be most effective. > > > > Barry > > > > > > > > > On Sep 27, 2019, at 3:34 AM, Jan Grie?er via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > > > > > Hi all, > > > i am using petsc4py. I am dealing with rather large sparse matrices up > to 600kx600k and i am interested in calculating a part of the inverse of > the matrix(I know it will be a dense matrix). Due to the nature of my > problem, I am only interested in approximately the first 1000 rows and 1000 > columns (i.e. a large block in the upper left ofthe matrix). Before I > start to play around now, I wanted to ask if there is a clever way to > tackle this kind of problem in PETSc in principle. For any input I would be > very grateful! > > > Greetings Jan > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Sep 30 10:47:19 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) 
Date: Mon, 30 Sep 2019 15:47:19 +0000 Subject: [petsc-users] Computing part of the inverse of a large matrix In-Reply-To: References: <1D24C212-DEB9-438F-B1E9-8D64494D1CA2@anl.gov> <631C0418-DB21-40AE-B46B-2E39D56EC462@mcs.anl.gov> Message-ID: <580485AA-7FBA-454E-9C23-CCF9618E569F@mcs.anl.gov> The Python wrapper for PETSc may be missing some functionality; there is a manual process involved in creating new ones. You could poke around the petsc4py source and see how easy it would be to add more functionality that you need. > On Sep 30, 2019, at 10:13 AM, Jan Grie?er wrote: > > I configured PETSc with MUMPS and tested it already for the spectrum slicing method in Slepc4py but i have problems in setting up the LU factorization in the PETSc4py. Since i do not find the corresponding methods and commands in the source code. Thats why is was wondering if this is even possible in the python version. > > Am Mo., 30. Sept. 2019 um 16:57 Uhr schrieb Smith, Barry F. : > > If you want a parallal LU (and hence the ability to build the inverse in parallel) you need to configure PETSc with --download-mumps --download-scalapack > > Barry > > > > On Sep 30, 2019, at 9:44 AM, Jan Grie?er wrote: > > > > Is the MatMumpsGetInverse also wrapped to the python version in PETSc4py ? If yes is there any example for using it ? > > My other question is related to the LU factoriation (https://www.mcs.anl.gov/petsc/documentation/faq.html#invertmatrix). > > Is the LU factorization only possible for sequential Aij matrices ? I read in the docs that this is the case for ordering. > > After setting up my matrix A, B and x i tried: > > r, c = dynamical_matrix_nn.getOrdering("nd") > > fac_dyn_matrix = dynamical_matrix_nn.factorLU(r,c) > > > > resulting in an error: > > [0] No support for this operation for this object type > > [0] Mat type mpiaij > > > > Am Fr., 27. Sept. 2019 um 16:26 Uhr schrieb Zhang, Hong : > > See ~petsc/src/mat/examples/tests/ex214.c on how to compute selected entries of inv(A) using mumps. > > Hong > > > > On Fri, Sep 27, 2019 at 8:04 AM Smith, Barry F. via petsc-users wrote: > > > > MatMumpsGetInverse() maybe useful. Also simply using MatMatSolve() with the first 1000 columns of the identity and "throwing away" the part you don't need may be most effective. > > > > Barry > > > > > > > > > On Sep 27, 2019, at 3:34 AM, Jan Grie?er via petsc-users wrote: > > > > > > Hi all, > > > i am using petsc4py. I am dealing with rather large sparse matrices up to 600kx600k and i am interested in calculating a part of the inverse of the matrix(I know it will be a dense matrix). Due to the nature of my problem, I am only interested in approximately the first 1000 rows and 1000 columns (i.e. a large block in the upper left ofthe matrix). Before I start to play around now, I wanted to ask if there is a clever way to tackle this kind of problem in PETSc in principle. For any input I would be very grateful! > > > Greetings Jan > > > From knepley at gmail.com Mon Sep 30 10:50:20 2019 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 30 Sep 2019 11:50:20 -0400 Subject: [petsc-users] Computing part of the inverse of a large matrix In-Reply-To: <580485AA-7FBA-454E-9C23-CCF9618E569F@mcs.anl.gov> References: <1D24C212-DEB9-438F-B1E9-8D64494D1CA2@anl.gov> <631C0418-DB21-40AE-B46B-2E39D56EC462@mcs.anl.gov> <580485AA-7FBA-454E-9C23-CCF9618E569F@mcs.anl.gov> Message-ID: I think the easier way to do it is to use a KSP which is configured to do preonly and LU. That will do the right thing in parallel. 
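(Following up on the KSP suggestion with a concrete sketch: a preonly KSP with an LU preconditioner factors the matrix once, and each subsequent solve against a column of the identity yields one column of the inverse. This is a hedged petsc4py sketch, not a definitive recipe; the matrix name A, the block size k = 1000, and the choice of MUMPS as the parallel LU backend, which requires a PETSc built with --download-mumps --download-scalapack, are assumptions based on this thread.)

import sys
import petsc4py
petsc4py.init(sys.argv)
from petsc4py import PETSc

k = 1000                                   # leading columns of inv(A) wanted (assumption)
# A is the assembled MPIAIJ matrix from the thread (assumption)

ksp = PETSc.KSP().create(A.getComm())
ksp.setOperators(A)
ksp.setType(PETSc.KSP.Type.PREONLY)        # no Krylov iterations, just apply the factorization
pc = ksp.getPC()
pc.setType(PETSc.PC.Type.LU)
pc.setFactorSolverType('mumps')            # parallel LU (older petsc4py: setFactorSolverPackage)
ksp.setFromOptions()
ksp.setUp()                                # the LU factorization happens once, here

b = A.createVecLeft()
x = A.createVecRight()
rstart, rend = b.getOwnershipRange()
for j in range(k):
    b.set(0.0)                             # b = j-th column of the identity
    if rstart <= j < rend:
        b.setValue(j, 1.0)
    b.assemble()
    ksp.solve(b, x)                        # x is now the j-th column of inv(A)
    # keep only the first k entries of x if just the top-left k-by-k block is needed

(MatMatSolve() with a dense right-hand-side matrix holding the first k identity columns, or MatMumpsGetInverse()/ex214.c mentioned earlier in the thread, are alternatives if and when they are exposed in petsc4py.)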
    Matt

On Mon, Sep 30, 2019 at 11:47 AM Smith, Barry F. via petsc-users <petsc-users at mcs.anl.gov> wrote:
>
> The Python wrapper for PETSc may be missing some functionality; there is a manual process involved in creating new ones. You could poke around the petsc4py source and see how easy it would be to add more functionality that you need.
>
> > On Sep 30, 2019, at 10:13 AM, Jan Grießer wrote:
> >
> > I configured PETSc with MUMPS and already tested it for the spectrum slicing method in slepc4py, but I have problems setting up the LU factorization in petsc4py, since I do not find the corresponding methods and commands in the source code. That's why I was wondering whether this is even possible in the Python version.
> >
> > On Mon, 30 Sept 2019 at 16:57, Smith, Barry F. <bsmith at mcs.anl.gov> wrote:
> >
> > If you want a parallel LU (and hence the ability to build the inverse in parallel) you need to configure PETSc with --download-mumps --download-scalapack
> >
> > Barry
> >
> > > On Sep 30, 2019, at 9:44 AM, Jan Grießer wrote:
> > >
> > > Is the MatMumpsGetInverse also wrapped to the Python version in petsc4py? If yes, is there any example for using it?
> > > My other question is related to the LU factorization (https://www.mcs.anl.gov/petsc/documentation/faq.html#invertmatrix).
> > > Is the LU factorization only possible for sequential AIJ matrices? I read in the docs that this is the case for the ordering.
> > > After setting up my matrix A, B and x, I tried:
> > > r, c = dynamical_matrix_nn.getOrdering("nd")
> > > fac_dyn_matrix = dynamical_matrix_nn.factorLU(r,c)
> > >
> > > resulting in an error:
> > > [0] No support for this operation for this object type
> > > [0] Mat type mpiaij
> > >
> > > On Fri, 27 Sept 2019 at 16:26, Zhang, Hong <hzhang at mcs.anl.gov> wrote:
> > > See ~petsc/src/mat/examples/tests/ex214.c on how to compute selected entries of inv(A) using MUMPS.
> > > Hong
> > >
> > > On Fri, Sep 27, 2019 at 8:04 AM Smith, Barry F. via petsc-users <petsc-users at mcs.anl.gov> wrote:
> > >
> > > MatMumpsGetInverse() may be useful. Also, simply using MatMatSolve() with the first 1000 columns of the identity and "throwing away" the part you don't need may be most effective.
> > >
> > > Barry
> > >
> > > > On Sep 27, 2019, at 3:34 AM, Jan Grießer via petsc-users <petsc-users at mcs.anl.gov> wrote:
> > > >
> > > > Hi all,
> > > > I am using petsc4py. I am dealing with rather large sparse matrices up to 600kx600k and I am interested in calculating a part of the inverse of the matrix (I know it will be a dense matrix). Due to the nature of my problem, I am only interested in approximately the first 1000 rows and 1000 columns (i.e. a large block in the upper left of the matrix). Before I start to play around now, I wanted to ask if there is a clever way to tackle this kind of problem in PETSc in principle. For any input I would be very grateful!
> > > > Greetings Jan

-- 
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/

From bsmith at mcs.anl.gov Mon Sep 30 13:18:43 2019
From: bsmith at mcs.anl.gov (Smith, Barry F.)
Date: Mon, 30 Sep 2019 18:18:43 +0000
Subject: [petsc-users] PETSc 3.12 release
Message-ID: 

We are pleased to announce the release of PETSc version 3.12 at http://www.mcs.anl.gov/petsc

The major changes and updates can be found at http://www.mcs.anl.gov/petsc/documentation/changes/312.html

We recommend upgrading to PETSc 3.12 soon.

As always, please report problems to petsc-maint at mcs.anl.gov and ask questions at petsc-users at mcs.anl.gov

This release includes contributions from

Alex Lindsay Alp Dener Barry Smith Boris Boutkov Chris Eldred Dave May David Wells Debojyoti Ghosh deepblu2718 Emil Constantinescu Fande Kong Florian Wechsung Glenn Hammond Hannah Morgan Hansol Suh Hendrik Ranocha Hong Zhang Hong Zhang Jacob Faibussowitsch Jakub Kružík Jed Brown Jeffrey Larson Joe Wallwork Jose E. Roman Julian Giordani Junchao Zhang Karl Rupp Lawrence Mitchell Lisandro Dalcin Lu luzhanghpp Mark Adams Martin Diehl Matthew G. Knepley Patrick Farrell Patrick Sanan Pierre Jolivet Richard Tran Mills Sajid Ali Satish Balay Scott Kruger Siegfried Cools Stefano Zampini Thibaut Appel Tristan Konolige Václav Hapla valeriabarr Volker Jacht William Gropp

and bug reports/patches/proposed improvements received from

??? Adina Püsök Andrea Gallegati "Appel, Thibaut" Åsmund Ervik "Aulisa, Eugenio" Barry Smith "Betrie, Getnet" Brad Aagaard Carl Steefel Clausen Pascal Danyang Su Dave A. May David Dang Veselin Dobrev Dylan P. Brennan Ed Bueler Elias Karabelas Fabian Jakub Fande Kong Glenn E. Hammond "Guenther, Stefanie" Hannah Mairs Morgan Hapla Vaclav Heeho Park Hong Zhang Hubert Weissmann Ian Lin Jacob Faibussowitsch Jakub Kruzik Jaysaval, Piyoosh Jed Brown Joe Wallwork Jose E. Roman Junchao Zhang Karl Lin Karl Rupp Daniel Kokron Lardelli Nicolo Jeffery Larson Leon Avery Lisandro Dalcin Magne Rudshaug Manuel Colera Rico Mark Adams Matthew Knepley Matthias Beaupere Mohammad Asghar Mouralidaran Arrutselvi "Nair, Nirmal Jayaprasad" Nick Papior Oana Marin Paolo Orsini Patrick Sanan Paul Seibert Pierre Jolivet Piyoosh Jaysaval Prince Joseph Richard Katz Richard Tran Mills Robert Nr Nourgaliev Sajid Ali Satish Balay "Schanen, Michel" Sergi Molins Rafa Sophie Blondel Stefano Zampini Steve Thibaut Appel Tim Steinhoff Todd Munson Tom Goffrey Vaclav Hapla "Wells, David" Xiangdong Xiang Huang Yuyun Yang "Zhang, Hong" "Zhang, Junchao"

From sajidsyed2021 at u.northwestern.edu Mon Sep 30 16:24:07 2019
From: sajidsyed2021 at u.northwestern.edu (Sajid Ali)
Date: Mon, 30 Sep 2019 16:24:07 -0500
Subject: [petsc-users] Question about TSComputeRHSJacobianConstant
In-Reply-To: <0D532C0E-1D9A-41A8-9B37-E286DF08B22B@anl.gov>
References: <61B21078-9146-4FE2-8967-95D64DB583C6@anl.gov>
 <400504D5-9319-4A96-B0C0-C871284EB989@anl.gov>
 <1A99BD32-723F-4A76-98A4-2AFFA790802B@anl.gov>
 <6280A5E9-9DA5-485D-96F0-12FB944ACC4C@anl.gov>
 <0D532C0E-1D9A-41A8-9B37-E286DF08B22B@anl.gov>
Message-ID: 

Hi PETSc-developers,

Has this bug been fixed in the new 3.12 release?

Thank You,
Sajid Ali
Applied Physics
Northwestern University
s-sajid-ali.github.io

From bsmith at mcs.anl.gov Mon Sep 30 18:39:17 2019
From: bsmith at mcs.anl.gov (Smith, Barry F.)
Date: Mon, 30 Sep 2019 23:39:17 +0000
Subject: [petsc-users] Question about TSComputeRHSJacobianConstant
In-Reply-To: 
References: <61B21078-9146-4FE2-8967-95D64DB583C6@anl.gov>
 <400504D5-9319-4A96-B0C0-C871284EB989@anl.gov>
 <1A99BD32-723F-4A76-98A4-2AFFA790802B@anl.gov>
 <6280A5E9-9DA5-485D-96F0-12FB944ACC4C@anl.gov>
 <0D532C0E-1D9A-41A8-9B37-E286DF08B22B@anl.gov>
Message-ID: 

   Sorry, this code has not been changed.

   Barry

> On Sep 30, 2019, at 4:24 PM, Sajid Ali wrote:
>
> Hi PETSc-developers,
>
> Has this bug been fixed in the new 3.12 release?
>
> Thank You,
> Sajid Ali
> Applied Physics
> Northwestern University
> s-sajid-ali.github.io