<div dir="ltr"><div>I thought so and ran the code with the option -malloc_dump but everything looked fine, no warnings were displayed in a smaller case. Perhaps this is not enough?<br></div><div><br></div><div>Yes, all the processes call both VecAssemblyBegin/End.</div><div><br></div><div>I will try valgrind.<br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">El vie., 13 sept. 2019 a las 22:45, Stefano Zampini (<<a href="mailto:stefano.zampini@gmail.com">stefano.zampini@gmail.com</a>>) escribió:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div style="overflow-wrap: break-word;"><br><div><br><blockquote type="cite"><div>On Sep 13, 2019, at 11:37 PM, José Lorenzo via petsc-users <<a href="mailto:petsc-users@mcs.anl.gov" target="_blank">petsc-users@mcs.anl.gov</a>> wrote:</div><br class="gmail-m_232176222364807433Apple-interchange-newline"><div><div dir="ltr"><div>I'm using PETSc 3.10.2, I guess it is the master branch but I do not know for sure as I didn't install it myself.</div><div><br></div><div>You are right, each processor provides data for all the boundary entries.</div><div><br></div><div>I have carried out a few more tests and apparently it gets stuck during VecAssemblyBegin. I don't know whether I can be more preciseabout this.<br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">El vie., 13 sept. 2019 a las 20:14, Smith, Barry F. (<<a href="mailto:bsmith@mcs.anl.gov" target="_blank">bsmith@mcs.anl.gov</a>>) escribió:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><br>
What version of PETSc is this? The master branch?

> call VecSetValues(H, nedge_all, edgappglo(diredg_app) - 1, Hself, ADD_VALUES, ierr)

So each process is providing all data for all the boundary entries in the vector?

I don't think there is anything wrong with what you are doing, but the mechanism that does the communication inside the VecAssembly cannot know about the structure of the communication and so will do it inefficiently. It would be useful to know where it is "stuck"; that might help us improve the assembly process for your case.
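(As a rough, generic suggestion for locating the hang: when the run gets stuck, attach a debugger to a few of the MPI processes, for example gdb -p <pid> followed by bt, and compare the stack traces. If some ranks are inside VecAssemblyBegin/End while others are elsewhere, that usually means the processes disagree about who participates in the assembly.)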
But I think it would be better for you to just use MPI directly to put the data where it is needed.
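For illustration only, a minimal Fortran sketch of that idea in the same style as the snippets above (MPI_Allreduce is just one possible way to do the reduction, real PETSc scalars are assumed for the MPI datatype, and idx_own/Hself_own stand for the owned index list and value slice rather than variables from your code):

  ! sum every rank's contribution to the full boundary array up front
  call MPI_Allreduce(MPI_IN_PLACE, Hself, nedge_all, MPI_DOUBLE_PRECISION, &
                     MPI_SUM, PETSC_COMM_WORLD, ierr)
  ! each rank then adds only the entries it owns, so the assembly below
  ! has no off-process values to move
  call VecSetValues(H, nedge_own, idx_own, Hself_own, ADD_VALUES, ierr)
  call VecAssemblyBegin(H, ierr)
  call VecAssemblyEnd(H, ierr)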
> On Sep 13, 2019, at 11:30 AM, José Lorenzo via petsc-users <petsc-users@mcs.anl.gov> wrote:
>
> Hello,
>
> I am solving a finite element problem with Dirichlet boundary conditions using PETSc. The boundary conditions have two terms: a first one that is known beforehand (normally zero), and a second one that depends linearly on the unknown variable itself over the whole domain. Therefore, at every time step I need to iterate, as the boundary condition depends on the field and the latter depends on the BC. Moreover, the problem is nonlinear, and I use a ghosted vector to represent the field.
>
> Every processor manages a portion of the domain and a portion of the boundary (if not interior). At every Newton iteration within the time loop, I set the boundary conditions as follows:
>
> First, each processor computes the known term of the BC (the first term) and inserts the values into the vector:
>
> call VecSetValues(H, nedge_own, edglocglo(diredg_loc) - 1, Hdir, INSERT_VALUES, ierr)
> call VecAssemblyBegin(H, ierr)
> call VecAssemblyEnd(H, ierr)
>
> As far as I understand, at this stage VecAssembly will not need to communicate with other processors, as each processor only sets values in components that belong to it.
>
> Then, each processor computes its own contribution to the field-dependent term of the BC for the whole domain boundary:
>
> call VecSetValues(H, nedge_all, edgappglo(diredg_app) - 1, Hself, ADD_VALUES, ierr)
> call VecAssemblyBegin(H, ierr)
> call VecAssemblyEnd(H, ierr)
>
> In this case communication will be needed, as each processor will add values to vector components that are not stored by it, and I guess it might get very busy as all the processors will need to communicate with each other.
>
> When using this strategy I don't find any issue for problems using a small number of processors, but recently I've been solving with 90 processors, and the simulation always hangs at the second VecSetValues at some random time step. It works fine for some time steps, but at some point it just gets stuck and I have to cancel the simulation.
>

The words "hangs" and "gets stuck" 99% of the time indicate some memory issue with your code.
First: are all processes calling VecAssemblyBegin/End, or only a subset of them? VecAssemblyBegin/End must be called by all processes.
Second: run with valgrind, even on a smaller case: http://www.valgrind.org/
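As an illustration of one typical way to do that under MPI (these are standard valgrind options; adjust the launcher, process count, executable name and arguments to your setup):

mpiexec -n 4 valgrind --tool=memcheck -q --num-callers=20 --log-file=valgrind.log.%p ./your_executable <your options>

This writes one log file per process, which makes it easier to see which rank reports problems.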
> I have managed to overcome this by making each processor contribute only to its own components, using first MPI_Reduce and then doing
>
> call VecSetValues(H, nedge_own, edgappglo(diredg_app_loc), Hself_own, ADD_VALUES, ierr)
> call VecAssemblyBegin(H, ierr)
> call VecAssemblyEnd(H, ierr)
>
> However, I would like to understand whether there is something wrong in the code above.
>
> Thank you.
>