From marco.cisternino at optimad.it Mon Jan 3 03:33:31 2022 From: marco.cisternino at optimad.it (Marco Cisternino) Date: Mon, 3 Jan 2022 09:33:31 +0000 Subject: [petsc-users] Nullspaces In-Reply-To: <7AA299EF-4757-4C0C-8CC4-E4857DBD4EE1@petsc.dev> References: <7AA299EF-4757-4C0C-8CC4-E4857DBD4EE1@petsc.dev> Message-ID: My comments are between Barry?s lines Marco Cisternino From: Barry Smith Sent: venerd? 24 dicembre 2021 23:57 To: Marco Cisternino Cc: Matthew Knepley ; petsc-users Subject: Re: [petsc-users] Nullspaces I tried your code but it appears the ./gyroid_solution.txt contains a vector of all zeros. Is this intended? Yes it is, it should be the initial guess and not the solution of the linear system. It is read just to build a null space vector of the right size in case 2. However, the purpose here is not to solve the linear system but to understand why the constant null space is working (it passes the MatNullSpaceTest) in case 1 and it is not working in case 2. That was what Matthew asked me in his last message to this thread. Actually VecDuplicateVecs() does not copy the values in the vector so your nsp[0] will contain the zero vector anyways. I know, that?s why there is a VecSet on nsp Vec. Is it not correct? It should be a Vec of 1s ? Would you be able to send the data that indicates what rows of the vector are associated with each subdomain? For example a vector with all 1s on the first domain and all 2s on the second domain? I think with this one should be able to construct the 2 dimensional null space. I would, but again the case was meant to test the two ways of building the constant null space. The idea is to understand why case 1 passes the MatNullSpaceTest and case 2 doesn?t. Barry Thank you, Barry. As soon as the issue Matthew noted is understood, I will come back to the 2-dimensional null space. On Dec 16, 2021, at 11:09 AM, Marco Cisternino > wrote: Hello Matthew, as promised I prepared a minimal (112960 rows. I?m not able to produce anything smaller than this and triggering the issue) example of the behavior I was talking about some days ago. What I did is to produce matrix, right hand side and initial solution of the linear system. As I told you before, this linear system is the discretization of the pressure equation of a predictor-corrector method for NS equations in the framework of finite volume method. This case has homogeneous Neumann boundary conditions. Computational domain has two independent and separated sub-domains. I discretize the weak formulation and I divide every row of the linear system by the volume of the relative cell. The underlying mesh is not uniform, therefore cells have different volumes. The issue I?m going to explain does not show up if the mesh is uniform, same volume for all the cells. I usually build the null space sub-domain by sub-domain with MatNullSpaceCreate(getCommunicator(), PETSC_FALSE, nConstants, constants, &nullspace); Where nConstants = 2 and constants contains two normalized arrays with constant values on degrees of freedom relative to the associated sub-domain and zeros elsewhere. However, as a test I tried the constant over the whole domain using 2 alternatives that should produce the same null space: 1. MatNullSpaceCreate(getCommunicator(), PETSC_TRUE, 0, nullptr, &nullspace); 2. 
Vec* nsp; VecDuplicateVecs(solution, 1, &nsp); VecSet(nsp[0],1.0); VecNormalize(nsp[0], nullptr); MatNullSpaceCreate(getCommunicator(), PETSC_FALSE, 1, nsp, &nullspace); Once I created the null space I test it using: MatNullSpaceTest(nullspace, m_A, &isNullSpaceValid); The case 1 pass the test while case 2 don?t. I have a small code for matrix loading, null spaces creation and testing. Unfortunately I cannot implement a small code able to produce that linear system. As attachment you can find an archive containing the matrix, the initial solution (used to manually build the null space) and the rhs (not used in the test code) in binary format. You can also find the testing code in the same archive. I used petsc 3.12(gcc+openMPI) and petsc 3.15.2(intelOneAPI) same results. If the attachment is not delivered, I can share a link to it. Thanks for any help. Marco Cisternino Marco Cisternino, PhD marco.cisternino at optimad.it ______________________ Optimad Engineering Srl Via Bligny 5, Torino, Italia. +3901119719782 www.optimad.it From: Marco Cisternino > Sent: marted? 7 dicembre 2021 19:36 To: Matthew Knepley > Cc: petsc-users > Subject: Re: [petsc-users] Nullspaces I will, as soon as possible... Scarica Outlook per Android ________________________________ From: Matthew Knepley > Sent: Tuesday, December 7, 2021 7:25:43 PM To: Marco Cisternino > Cc: petsc-users > Subject: Re: [petsc-users] Nullspaces On Tue, Dec 7, 2021 at 11:19 AM Marco Cisternino > wrote: Good morning, I?m still struggling with the Poisson equation with Neumann BCs. I discretize the equation by finite volume method and I divide every line of the linear system by the volume of the cell. I could avoid this division, but I?m trying to understand. My mesh is not uniform, i.e. cells have different volumes (it is an octree mesh). Moreover, in my computational domain there are 2 separated sub-domains. I build the null space and then I use MatNullSpaceTest to check it. If I do this: MatNullSpaceCreate(getCommunicator(), PETSC_TRUE, 0, nullptr, &nullspace); It works This produces the normalized constant vector. If I do this: Vec nsp; VecDuplicate(m_rhs, &nsp); VecSet(nsp,1.0); VecNormalize(nsp, nullptr); MatNullSpaceCreate(getCommunicator(), PETSC_FALSE, 1, &nsp, &nullspace); It does not work This is also the normalized constant vector. So you are saying that these two vectors give different results with MatNullSpaceTest()? Something must be wrong in the code. Can you send a minimal example of this? I will go through and debug it. Thanks, Matt Probably, I have wrong expectations, but should not it be the same? Thanks Marco Cisternino, PhD marco.cisternino at optimad.it ______________________ Optimad Engineering Srl Via Bligny 5, Torino, Italia. +3901119719782 www.optimad.it -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From marco.cisternino at optimad.it Mon Jan 3 03:42:24 2022 From: marco.cisternino at optimad.it (Marco Cisternino) Date: Mon, 3 Jan 2022 09:42:24 +0000 Subject: [petsc-users] Nullspaces In-Reply-To: References: Message-ID: My comments are between the Mark?s lines and they starts with ?#? 
Marco Cisternino From: Mark Adams Sent: sabato 25 dicembre 2021 14:59 To: Marco Cisternino Cc: Matthew Knepley ; petsc-users Subject: Re: [petsc-users] Nullspaces If "triggering the issue" requires a substantial mesh, that makes me think there is a logic bug somewhere. Maybe use valgrind. # Are you suggesting to use valgrind on this tiny toy code or on the original one? However, considering the purpose of the tiny code, i.e. testing the constant null space, why there should be a logical bug? Case 1 passes and case 2 should be exactly the same, shouldn?t be it? Also you say you divide by the cell volume. Maybe I am not understanding this but that is basically diagonal scaling and that will change the null space (ie, not a constant anymore) # I agree on this, but it pushes a question: why the case 1 passes the test? # Thank you, Mark. On Thu, Dec 16, 2021 at 11:11 AM Marco Cisternino > wrote: Hello Matthew, as promised I prepared a minimal (112960 rows. I?m not able to produce anything smaller than this and triggering the issue) example of the behavior I was talking about some days ago. What I did is to produce matrix, right hand side and initial solution of the linear system. As I told you before, this linear system is the discretization of the pressure equation of a predictor-corrector method for NS equations in the framework of finite volume method. This case has homogeneous Neumann boundary conditions. Computational domain has two independent and separated sub-domains. I discretize the weak formulation and I divide every row of the linear system by the volume of the relative cell. The underlying mesh is not uniform, therefore cells have different volumes. The issue I?m going to explain does not show up if the mesh is uniform, same volume for all the cells. I usually build the null space sub-domain by sub-domain with MatNullSpaceCreate(getCommunicator(), PETSC_FALSE, nConstants, constants, &nullspace); Where nConstants = 2 and constants contains two normalized arrays with constant values on degrees of freedom relative to the associated sub-domain and zeros elsewhere. However, as a test I tried the constant over the whole domain using 2 alternatives that should produce the same null space: 1. MatNullSpaceCreate(getCommunicator(), PETSC_TRUE, 0, nullptr, &nullspace); 2. Vec* nsp; VecDuplicateVecs(solution, 1, &nsp); VecSet(nsp[0],1.0); VecNormalize(nsp[0], nullptr); MatNullSpaceCreate(getCommunicator(), PETSC_FALSE, 1, nsp, &nullspace); Once I created the null space I test it using: MatNullSpaceTest(nullspace, m_A, &isNullSpaceValid); The case 1 pass the test while case 2 don?t. I have a small code for matrix loading, null spaces creation and testing. Unfortunately I cannot implement a small code able to produce that linear system. As attachment you can find an archive containing the matrix, the initial solution (used to manually build the null space) and the rhs (not used in the test code) in binary format. You can also find the testing code in the same archive. I used petsc 3.12(gcc+openMPI) and petsc 3.15.2(intelOneAPI) same results. If the attachment is not delivered, I can share a link to it. Thanks for any help. Marco Cisternino Marco Cisternino, PhD marco.cisternino at optimad.it ______________________ Optimad Engineering Srl Via Bligny 5, Torino, Italia. +3901119719782 www.optimad.it From: Marco Cisternino > Sent: marted? 7 dicembre 2021 19:36 To: Matthew Knepley > Cc: petsc-users > Subject: Re: [petsc-users] Nullspaces I will, as soon as possible... 
Scarica Outlook per Android ________________________________ From: Matthew Knepley > Sent: Tuesday, December 7, 2021 7:25:43 PM To: Marco Cisternino > Cc: petsc-users > Subject: Re: [petsc-users] Nullspaces On Tue, Dec 7, 2021 at 11:19 AM Marco Cisternino > wrote: Good morning, I?m still struggling with the Poisson equation with Neumann BCs. I discretize the equation by finite volume method and I divide every line of the linear system by the volume of the cell. I could avoid this division, but I?m trying to understand. My mesh is not uniform, i.e. cells have different volumes (it is an octree mesh). Moreover, in my computational domain there are 2 separated sub-domains. I build the null space and then I use MatNullSpaceTest to check it. If I do this: MatNullSpaceCreate(getCommunicator(), PETSC_TRUE, 0, nullptr, &nullspace); It works This produces the normalized constant vector. If I do this: Vec nsp; VecDuplicate(m_rhs, &nsp); VecSet(nsp,1.0); VecNormalize(nsp, nullptr); MatNullSpaceCreate(getCommunicator(), PETSC_FALSE, 1, &nsp, &nullspace); It does not work This is also the normalized constant vector. So you are saying that these two vectors give different results with MatNullSpaceTest()? Something must be wrong in the code. Can you send a minimal example of this? I will go through and debug it. Thanks, Matt Probably, I have wrong expectations, but should not it be the same? Thanks Marco Cisternino, PhD marco.cisternino at optimad.it ______________________ Optimad Engineering Srl Via Bligny 5, Torino, Italia. +3901119719782 www.optimad.it -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Mon Jan 3 07:41:43 2022 From: mfadams at lbl.gov (Mark Adams) Date: Mon, 3 Jan 2022 08:41:43 -0500 Subject: [petsc-users] Nullspaces In-Reply-To: References: Message-ID: There could be a memory bug that does not cause a noticeable problem until it hits some vital data and valgrind might find it on a small problem. However you might have a bug like a hardwired buffer size that overflows that is in fact not a bug until you get to this large size and in that case valgrid would need to be run on the large case and would have a good chance of finding it. On Mon, Jan 3, 2022 at 4:42 AM Marco Cisternino wrote: > My comments are between the Mark?s lines and they starts with ?#? > > > > Marco Cisternino > > > > *From:* Mark Adams > *Sent:* sabato 25 dicembre 2021 14:59 > *To:* Marco Cisternino > *Cc:* Matthew Knepley ; petsc-users < > petsc-users at mcs.anl.gov> > *Subject:* Re: [petsc-users] Nullspaces > > > > If "triggering the issue" requires a substantial mesh, that makes me > think there is a logic bug somewhere. Maybe use valgrind. > > > > # Are you suggesting to use valgrind on this tiny toy code or on the > original one? However, considering the purpose of the tiny code, i.e. > testing the constant null space, why there should be a logical bug? Case 1 > passes and case 2 should be exactly the same, shouldn?t be it? > > > > Also you say you divide by the cell volume. Maybe I am not understanding > this but that is basically diagonal scaling and that will change the null > space (ie, not a constant anymore) > > > > # I agree on this, but it pushes a question: why the case 1 passes the > test? > > # Thank you, Mark. 
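On the scaling question just above, a short sketch of the linear algebra, assuming the unscaled finite-volume matrix A0 has zero row sums (A0 * 1 = 0), which is the usual situation for a conservative discretization with pure Neumann BCs:

Let V = diag(V_1, ..., V_n) collect the cell volumes, so the assembled matrix is A = V^{-1} A0. Then

    A \mathbf{1} = V^{-1} (A0 \mathbf{1}) = 0,

so the constant vector is still in the (right) null space of A, and MatNullSpaceTest, which as far as I understand applies the matrix to the supplied vectors and checks that the result is numerically zero, should accept the constant in both case 1 and case 2. What the row scaling does change is the left null space: y^T A = 0 now holds for y proportional to V \mathbf{1} (the vector of cell volumes) instead of the constant, and that is the vector that governs the compatibility of the right-hand side. So the division by the volumes does not, by itself, remove the constant from the null space; it only changes the consistency condition on the RHS.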
> > > > On Thu, Dec 16, 2021 at 11:11 AM Marco Cisternino < > marco.cisternino at optimad.it> wrote: > > Hello Matthew, > > as promised I prepared a minimal (112960 rows. I?m not able to produce > anything smaller than this and triggering the issue) example of the > behavior I was talking about some days ago. > > What I did is to produce matrix, right hand side and initial solution of > the linear system. > > > > As I told you before, this linear system is the discretization of the > pressure equation of a predictor-corrector method for NS equations in the > framework of finite volume method. > > This case has homogeneous Neumann boundary conditions. Computational > domain has two independent and separated sub-domains. > > I discretize the weak formulation and I divide every row of the linear > system by the volume of the relative cell. > > The underlying mesh is not uniform, therefore cells have different > volumes. > > The issue I?m going to explain does not show up if the mesh is uniform, > same volume for all the cells. > > > > I usually build the null space sub-domain by sub-domain with > > MatNullSpaceCreate(getCommunicator(), PETSC_FALSE, nConstants, constants, > &nullspace); > > Where nConstants = 2 and constants contains two normalized arrays with > constant values on degrees of freedom relative to the associated sub-domain > and zeros elsewhere. > > > > However, as a test I tried the constant over the whole domain using 2 > alternatives that should produce the same null space: > > 1. MatNullSpaceCreate(getCommunicator(), PETSC_TRUE, 0, nullptr, > &nullspace); > 2. Vec* nsp; > > VecDuplicateVecs(solution, 1, &nsp); > > VecSet(nsp[0],1.0); > > VecNormalize(nsp[0], nullptr); > > MatNullSpaceCreate(getCommunicator(), PETSC_FALSE, 1, nsp, &nullspace); > > > > Once I created the null space I test it using: > > MatNullSpaceTest(nullspace, m_A, &isNullSpaceValid); > > > > The case 1 pass the test while case 2 don?t. > > > > I have a small code for matrix loading, null spaces creation and testing. > > Unfortunately I cannot implement a small code able to produce that linear > system. > > > > As attachment you can find an archive containing the matrix, the initial > solution (used to manually build the null space) and the rhs (not used in > the test code) in binary format. > > You can also find the testing code in the same archive. > > I used petsc 3.12(gcc+openMPI) and petsc 3.15.2(intelOneAPI) same results. > > If the attachment is not delivered, I can share a link to it. > > > > Thanks for any help. > > > > Marco Cisternino > > > > > > Marco Cisternino, PhD > marco.cisternino at optimad.it > > ______________________ > > Optimad Engineering Srl > > Via Bligny 5, Torino, Italia. > +3901119719782 > www.optimad.it > > > > *From:* Marco Cisternino > *Sent:* marted? 7 dicembre 2021 19:36 > *To:* Matthew Knepley > *Cc:* petsc-users > *Subject:* Re: [petsc-users] Nullspaces > > > > I will, as soon as possible... > > > > Scarica Outlook per Android > ------------------------------ > > *From:* Matthew Knepley > *Sent:* Tuesday, December 7, 2021 7:25:43 PM > *To:* Marco Cisternino > *Cc:* petsc-users > *Subject:* Re: [petsc-users] Nullspaces > > > > On Tue, Dec 7, 2021 at 11:19 AM Marco Cisternino < > marco.cisternino at optimad.it> wrote: > > Good morning, > > I?m still struggling with the Poisson equation with Neumann BCs. > > I discretize the equation by finite volume method and I divide every line > of the linear system by the volume of the cell. 
I could avoid this > division, but I?m trying to understand. > > My mesh is not uniform, i.e. cells have different volumes (it is an octree > mesh). > > Moreover, in my computational domain there are 2 separated sub-domains. > > I build the null space and then I use MatNullSpaceTest to check it. > > > > If I do this: > > MatNullSpaceCreate(getCommunicator(), PETSC_TRUE, 0, nullptr, &nullspace); > > It works > > > > This produces the normalized constant vector. > > > > If I do this: > > Vec nsp; > > VecDuplicate(m_rhs, &nsp); > > VecSet(nsp,1.0); > > VecNormalize(nsp, nullptr); > > MatNullSpaceCreate(getCommunicator(), PETSC_FALSE, 1, &nsp, &nullspace); > > It does not work > > > > This is also the normalized constant vector. > > > > So you are saying that these two vectors give different results with > MatNullSpaceTest()? > > Something must be wrong in the code. Can you send a minimal example of > this? I will go > > through and debug it. > > > > Thanks, > > > > Matt > > > > Probably, I have wrong expectations, but should not it be the same? > > > > Thanks > > > > Marco Cisternino, PhD > marco.cisternino at optimad.it > > ______________________ > > Optimad Engineering Srl > > Via Bligny 5, Torino, Italia. > +3901119719782 > www.optimad.it > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From marco.cisternino at optimad.it Mon Jan 3 07:47:53 2022 From: marco.cisternino at optimad.it (Marco Cisternino) Date: Mon, 3 Jan 2022 13:47:53 +0000 Subject: [petsc-users] Nullspaces In-Reply-To: References: Message-ID: Are you talking about the code that produce the linear system or about the tiny code that test the null space? In the first case, it is absolutely possible, but I would expect no problem in the tiny code, do you agree? It is important to remark that the real code and the tiny one behave in the same way when testing the null space of the operator. I can analyze with valgrind and I will, but I would not expect great insights. Thanks, Marco Cisternino, PhD marco.cisternino at optimad.it ______________________ Optimad Engineering Srl Via Bligny 5, Torino, Italia. +3901119719782 www.optimad.it From: Mark Adams Sent: luned? 3 gennaio 2022 14:42 To: Marco Cisternino Cc: Matthew Knepley ; petsc-users Subject: Re: [petsc-users] Nullspaces There could be a memory bug that does not cause a noticeable problem until it hits some vital data and valgrind might find it on a small problem. However you might have a bug like a hardwired buffer size that overflows that is in fact not a bug until you get to this large size and in that case valgrid would need to be run on the large case and would have a good chance of finding it. On Mon, Jan 3, 2022 at 4:42 AM Marco Cisternino > wrote: My comments are between the Mark?s lines and they starts with ?#? Marco Cisternino From: Mark Adams > Sent: sabato 25 dicembre 2021 14:59 To: Marco Cisternino > Cc: Matthew Knepley >; petsc-users > Subject: Re: [petsc-users] Nullspaces If "triggering the issue" requires a substantial mesh, that makes me think there is a logic bug somewhere. Maybe use valgrind. # Are you suggesting to use valgrind on this tiny toy code or on the original one? However, considering the purpose of the tiny code, i.e. 
testing the constant null space, why there should be a logical bug? Case 1 passes and case 2 should be exactly the same, shouldn?t be it? Also you say you divide by the cell volume. Maybe I am not understanding this but that is basically diagonal scaling and that will change the null space (ie, not a constant anymore) # I agree on this, but it pushes a question: why the case 1 passes the test? # Thank you, Mark. On Thu, Dec 16, 2021 at 11:11 AM Marco Cisternino > wrote: Hello Matthew, as promised I prepared a minimal (112960 rows. I?m not able to produce anything smaller than this and triggering the issue) example of the behavior I was talking about some days ago. What I did is to produce matrix, right hand side and initial solution of the linear system. As I told you before, this linear system is the discretization of the pressure equation of a predictor-corrector method for NS equations in the framework of finite volume method. This case has homogeneous Neumann boundary conditions. Computational domain has two independent and separated sub-domains. I discretize the weak formulation and I divide every row of the linear system by the volume of the relative cell. The underlying mesh is not uniform, therefore cells have different volumes. The issue I?m going to explain does not show up if the mesh is uniform, same volume for all the cells. I usually build the null space sub-domain by sub-domain with MatNullSpaceCreate(getCommunicator(), PETSC_FALSE, nConstants, constants, &nullspace); Where nConstants = 2 and constants contains two normalized arrays with constant values on degrees of freedom relative to the associated sub-domain and zeros elsewhere. However, as a test I tried the constant over the whole domain using 2 alternatives that should produce the same null space: 1. MatNullSpaceCreate(getCommunicator(), PETSC_TRUE, 0, nullptr, &nullspace); 2. Vec* nsp; VecDuplicateVecs(solution, 1, &nsp); VecSet(nsp[0],1.0); VecNormalize(nsp[0], nullptr); MatNullSpaceCreate(getCommunicator(), PETSC_FALSE, 1, nsp, &nullspace); Once I created the null space I test it using: MatNullSpaceTest(nullspace, m_A, &isNullSpaceValid); The case 1 pass the test while case 2 don?t. I have a small code for matrix loading, null spaces creation and testing. Unfortunately I cannot implement a small code able to produce that linear system. As attachment you can find an archive containing the matrix, the initial solution (used to manually build the null space) and the rhs (not used in the test code) in binary format. You can also find the testing code in the same archive. I used petsc 3.12(gcc+openMPI) and petsc 3.15.2(intelOneAPI) same results. If the attachment is not delivered, I can share a link to it. Thanks for any help. Marco Cisternino Marco Cisternino, PhD marco.cisternino at optimad.it ______________________ Optimad Engineering Srl Via Bligny 5, Torino, Italia. +3901119719782 www.optimad.it From: Marco Cisternino > Sent: marted? 7 dicembre 2021 19:36 To: Matthew Knepley > Cc: petsc-users > Subject: Re: [petsc-users] Nullspaces I will, as soon as possible... Scarica Outlook per Android ________________________________ From: Matthew Knepley > Sent: Tuesday, December 7, 2021 7:25:43 PM To: Marco Cisternino > Cc: petsc-users > Subject: Re: [petsc-users] Nullspaces On Tue, Dec 7, 2021 at 11:19 AM Marco Cisternino > wrote: Good morning, I?m still struggling with the Poisson equation with Neumann BCs. 
I discretize the equation by finite volume method and I divide every line of the linear system by the volume of the cell. I could avoid this division, but I?m trying to understand. My mesh is not uniform, i.e. cells have different volumes (it is an octree mesh). Moreover, in my computational domain there are 2 separated sub-domains. I build the null space and then I use MatNullSpaceTest to check it. If I do this: MatNullSpaceCreate(getCommunicator(), PETSC_TRUE, 0, nullptr, &nullspace); It works This produces the normalized constant vector. If I do this: Vec nsp; VecDuplicate(m_rhs, &nsp); VecSet(nsp,1.0); VecNormalize(nsp, nullptr); MatNullSpaceCreate(getCommunicator(), PETSC_FALSE, 1, &nsp, &nullspace); It does not work This is also the normalized constant vector. So you are saying that these two vectors give different results with MatNullSpaceTest()? Something must be wrong in the code. Can you send a minimal example of this? I will go through and debug it. Thanks, Matt Probably, I have wrong expectations, but should not it be the same? Thanks Marco Cisternino, PhD marco.cisternino at optimad.it ______________________ Optimad Engineering Srl Via Bligny 5, Torino, Italia. +3901119719782 www.optimad.it -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Mon Jan 3 08:50:01 2022 From: mfadams at lbl.gov (Mark Adams) Date: Mon, 3 Jan 2022 09:50:01 -0500 Subject: [petsc-users] Nullspaces In-Reply-To: References: Message-ID: I have not looked at your code, but as a general observation you want to have some sort of memory checker, like valgrid for CPUs, in your workflow. It is the fastest way to find some classes of bugs. On Mon, Jan 3, 2022 at 8:47 AM Marco Cisternino wrote: > Are you talking about the code that produce the linear system or about the > tiny code that test the null space? > In the first case, it is absolutely possible, but I would expect no > problem in the tiny code, do you agree? > It is important to remark that the real code and the tiny one behave in > the same way when testing the null space of the operator. I can analyze > with valgrind and I will, but I would not expect great insights. > > > > Thanks, > > > > Marco Cisternino, PhD > marco.cisternino at optimad.it > > ______________________ > > Optimad Engineering Srl > > Via Bligny 5, Torino, Italia. > +3901119719782 > www.optimad.it > > > > *From:* Mark Adams > *Sent:* luned? 3 gennaio 2022 14:42 > *To:* Marco Cisternino > *Cc:* Matthew Knepley ; petsc-users < > petsc-users at mcs.anl.gov> > *Subject:* Re: [petsc-users] Nullspaces > > > > There could be a memory bug that does not cause a noticeable problem until > it hits some vital data and valgrind might find it on a small problem. > > > > However you might have a bug like a hardwired buffer size that > overflows that is in fact not a bug until you get to this large size and in > that case valgrid would need to be run on the large case and would have a > good chance of finding it. > > > > > > On Mon, Jan 3, 2022 at 4:42 AM Marco Cisternino < > marco.cisternino at optimad.it> wrote: > > My comments are between the Mark?s lines and they starts with ?#? 
> > > > Marco Cisternino > > > > *From:* Mark Adams > *Sent:* sabato 25 dicembre 2021 14:59 > *To:* Marco Cisternino > *Cc:* Matthew Knepley ; petsc-users < > petsc-users at mcs.anl.gov> > *Subject:* Re: [petsc-users] Nullspaces > > > > If "triggering the issue" requires a substantial mesh, that makes me > think there is a logic bug somewhere. Maybe use valgrind. > > > > # Are you suggesting to use valgrind on this tiny toy code or on the > original one? However, considering the purpose of the tiny code, i.e. > testing the constant null space, why there should be a logical bug? Case 1 > passes and case 2 should be exactly the same, shouldn?t be it? > > > > Also you say you divide by the cell volume. Maybe I am not understanding > this but that is basically diagonal scaling and that will change the null > space (ie, not a constant anymore) > > > > # I agree on this, but it pushes a question: why the case 1 passes the > test? > > # Thank you, Mark. > > > > On Thu, Dec 16, 2021 at 11:11 AM Marco Cisternino < > marco.cisternino at optimad.it> wrote: > > Hello Matthew, > > as promised I prepared a minimal (112960 rows. I?m not able to produce > anything smaller than this and triggering the issue) example of the > behavior I was talking about some days ago. > > What I did is to produce matrix, right hand side and initial solution of > the linear system. > > > > As I told you before, this linear system is the discretization of the > pressure equation of a predictor-corrector method for NS equations in the > framework of finite volume method. > > This case has homogeneous Neumann boundary conditions. Computational > domain has two independent and separated sub-domains. > > I discretize the weak formulation and I divide every row of the linear > system by the volume of the relative cell. > > The underlying mesh is not uniform, therefore cells have different > volumes. > > The issue I?m going to explain does not show up if the mesh is uniform, > same volume for all the cells. > > > > I usually build the null space sub-domain by sub-domain with > > MatNullSpaceCreate(getCommunicator(), PETSC_FALSE, nConstants, constants, > &nullspace); > > Where nConstants = 2 and constants contains two normalized arrays with > constant values on degrees of freedom relative to the associated sub-domain > and zeros elsewhere. > > > > However, as a test I tried the constant over the whole domain using 2 > alternatives that should produce the same null space: > > 1. MatNullSpaceCreate(getCommunicator(), PETSC_TRUE, 0, nullptr, > &nullspace); > 2. Vec* nsp; > > VecDuplicateVecs(solution, 1, &nsp); > > VecSet(nsp[0],1.0); > > VecNormalize(nsp[0], nullptr); > > MatNullSpaceCreate(getCommunicator(), PETSC_FALSE, 1, nsp, &nullspace); > > > > Once I created the null space I test it using: > > MatNullSpaceTest(nullspace, m_A, &isNullSpaceValid); > > > > The case 1 pass the test while case 2 don?t. > > > > I have a small code for matrix loading, null spaces creation and testing. > > Unfortunately I cannot implement a small code able to produce that linear > system. > > > > As attachment you can find an archive containing the matrix, the initial > solution (used to manually build the null space) and the rhs (not used in > the test code) in binary format. > > You can also find the testing code in the same archive. > > I used petsc 3.12(gcc+openMPI) and petsc 3.15.2(intelOneAPI) same results. > > If the attachment is not delivered, I can share a link to it. > > > > Thanks for any help. 
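As a possible sketch of the sub-domain-by-sub-domain construction described above: it assumes a hypothetical per-rank array cellSubdomain[] that stores, for each locally owned row, which of the two sub-domains the cell belongs to (0 or 1), and uses PETSC_COMM_WORLD, m_A and isNullSpaceValid as in the test code; error checking omitted.

    Vec constants[2];
    MatNullSpace nullspace;
    PetscBool isNullSpaceValid;
    PetscInt rStart, rEnd;
    MatCreateVecs(m_A, &constants[0], nullptr);
    VecDuplicate(constants[0], &constants[1]);
    VecSet(constants[0], 0.0);
    VecSet(constants[1], 0.0);
    VecGetOwnershipRange(constants[0], &rStart, &rEnd);
    for (PetscInt row = rStart; row < rEnd; ++row) {
        /* cellSubdomain is a hypothetical array: 0 or 1 for each local cell */
        VecSetValue(constants[cellSubdomain[row - rStart]], row, 1.0, INSERT_VALUES);
    }
    for (int d = 0; d < 2; ++d) {
        VecAssemblyBegin(constants[d]);
        VecAssemblyEnd(constants[d]);
        VecNormalize(constants[d], nullptr);
    }
    MatNullSpaceCreate(PETSC_COMM_WORLD, PETSC_FALSE, 2, constants, &nullspace);
    MatNullSpaceTest(nullspace, m_A, &isNullSpaceValid);

Because the two sub-domains are disjoint, the two normalized vectors are automatically orthonormal, which is what MatNullSpaceCreate expects for the supplied basis; the same cellSubdomain array would also give the per-row tagging Barry asked for (1s on one domain, 2s on the other).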
> > > > Marco Cisternino > > > > > > Marco Cisternino, PhD > marco.cisternino at optimad.it > > ______________________ > > Optimad Engineering Srl > > Via Bligny 5, Torino, Italia. > +3901119719782 > www.optimad.it > > > > *From:* Marco Cisternino > *Sent:* marted? 7 dicembre 2021 19:36 > *To:* Matthew Knepley > *Cc:* petsc-users > *Subject:* Re: [petsc-users] Nullspaces > > > > I will, as soon as possible... > > > > Scarica Outlook per Android > ------------------------------ > > *From:* Matthew Knepley > *Sent:* Tuesday, December 7, 2021 7:25:43 PM > *To:* Marco Cisternino > *Cc:* petsc-users > *Subject:* Re: [petsc-users] Nullspaces > > > > On Tue, Dec 7, 2021 at 11:19 AM Marco Cisternino < > marco.cisternino at optimad.it> wrote: > > Good morning, > > I?m still struggling with the Poisson equation with Neumann BCs. > > I discretize the equation by finite volume method and I divide every line > of the linear system by the volume of the cell. I could avoid this > division, but I?m trying to understand. > > My mesh is not uniform, i.e. cells have different volumes (it is an octree > mesh). > > Moreover, in my computational domain there are 2 separated sub-domains. > > I build the null space and then I use MatNullSpaceTest to check it. > > > > If I do this: > > MatNullSpaceCreate(getCommunicator(), PETSC_TRUE, 0, nullptr, &nullspace); > > It works > > > > This produces the normalized constant vector. > > > > If I do this: > > Vec nsp; > > VecDuplicate(m_rhs, &nsp); > > VecSet(nsp,1.0); > > VecNormalize(nsp, nullptr); > > MatNullSpaceCreate(getCommunicator(), PETSC_FALSE, 1, &nsp, &nullspace); > > It does not work > > > > This is also the normalized constant vector. > > > > So you are saying that these two vectors give different results with > MatNullSpaceTest()? > > Something must be wrong in the code. Can you send a minimal example of > this? I will go > > through and debug it. > > > > Thanks, > > > > Matt > > > > Probably, I have wrong expectations, but should not it be the same? > > > > Thanks > > > > Marco Cisternino, PhD > marco.cisternino at optimad.it > > ______________________ > > Optimad Engineering Srl > > Via Bligny 5, Torino, Italia. > +3901119719782 > www.optimad.it > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From marco.cisternino at optimad.it Mon Jan 3 09:07:46 2022 From: marco.cisternino at optimad.it (Marco Cisternino) Date: Mon, 3 Jan 2022 15:07:46 +0000 Subject: [petsc-users] Nullspaces In-Reply-To: References: Message-ID: We usually analyze the code with valgrind, when important changes are implemented. I have to admit that this analysis is still not automatic and the case we are talking about is not a test case for our workload. The test cases we have give no errors in valgrind analysis. However, I will analyze both the real code and the tiny one for this case with valgrind and report the results. Thank you, Marco Cisternino From: Mark Adams Sent: luned? 3 gennaio 2022 15:50 To: Marco Cisternino Cc: Matthew Knepley ; petsc-users Subject: Re: [petsc-users] Nullspaces I have not looked at your code, but as a general observation you want to have some sort of memory checker, like valgrid for CPUs, in your workflow. It is the fastest way to find some classes of bugs. 
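For reference, a typical memcheck run on the tiny test could look like the line below. This is only a sketch: the executable name testNullSpace is assumed from the source file in the archive, the flags are a reasonable starting point rather than a prescription, and OpenMPI without valgrind wrappers usually needs a suppression file to silence its internal reports.

    mpiexec -n 2 valgrind --tool=memcheck --leak-check=full --track-origins=yes --log-file=valgrind.%p.log ./testNullSpace

Each rank then writes its own log file, which makes it easier to separate PETSc allocations from MPI-internal ones.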
On Mon, Jan 3, 2022 at 8:47 AM Marco Cisternino > wrote: Are you talking about the code that produce the linear system or about the tiny code that test the null space? In the first case, it is absolutely possible, but I would expect no problem in the tiny code, do you agree? It is important to remark that the real code and the tiny one behave in the same way when testing the null space of the operator. I can analyze with valgrind and I will, but I would not expect great insights. Thanks, Marco Cisternino, PhD marco.cisternino at optimad.it ______________________ Optimad Engineering Srl Via Bligny 5, Torino, Italia. +3901119719782 www.optimad.it From: Mark Adams > Sent: luned? 3 gennaio 2022 14:42 To: Marco Cisternino > Cc: Matthew Knepley >; petsc-users > Subject: Re: [petsc-users] Nullspaces There could be a memory bug that does not cause a noticeable problem until it hits some vital data and valgrind might find it on a small problem. However you might have a bug like a hardwired buffer size that overflows that is in fact not a bug until you get to this large size and in that case valgrid would need to be run on the large case and would have a good chance of finding it. On Mon, Jan 3, 2022 at 4:42 AM Marco Cisternino > wrote: My comments are between the Mark?s lines and they starts with ?#? Marco Cisternino From: Mark Adams > Sent: sabato 25 dicembre 2021 14:59 To: Marco Cisternino > Cc: Matthew Knepley >; petsc-users > Subject: Re: [petsc-users] Nullspaces If "triggering the issue" requires a substantial mesh, that makes me think there is a logic bug somewhere. Maybe use valgrind. # Are you suggesting to use valgrind on this tiny toy code or on the original one? However, considering the purpose of the tiny code, i.e. testing the constant null space, why there should be a logical bug? Case 1 passes and case 2 should be exactly the same, shouldn?t be it? Also you say you divide by the cell volume. Maybe I am not understanding this but that is basically diagonal scaling and that will change the null space (ie, not a constant anymore) # I agree on this, but it pushes a question: why the case 1 passes the test? # Thank you, Mark. On Thu, Dec 16, 2021 at 11:11 AM Marco Cisternino > wrote: Hello Matthew, as promised I prepared a minimal (112960 rows. I?m not able to produce anything smaller than this and triggering the issue) example of the behavior I was talking about some days ago. What I did is to produce matrix, right hand side and initial solution of the linear system. As I told you before, this linear system is the discretization of the pressure equation of a predictor-corrector method for NS equations in the framework of finite volume method. This case has homogeneous Neumann boundary conditions. Computational domain has two independent and separated sub-domains. I discretize the weak formulation and I divide every row of the linear system by the volume of the relative cell. The underlying mesh is not uniform, therefore cells have different volumes. The issue I?m going to explain does not show up if the mesh is uniform, same volume for all the cells. I usually build the null space sub-domain by sub-domain with MatNullSpaceCreate(getCommunicator(), PETSC_FALSE, nConstants, constants, &nullspace); Where nConstants = 2 and constants contains two normalized arrays with constant values on degrees of freedom relative to the associated sub-domain and zeros elsewhere. However, as a test I tried the constant over the whole domain using 2 alternatives that should produce the same null space: 1. 
MatNullSpaceCreate(getCommunicator(), PETSC_TRUE, 0, nullptr, &nullspace); 2. Vec* nsp; VecDuplicateVecs(solution, 1, &nsp); VecSet(nsp[0],1.0); VecNormalize(nsp[0], nullptr); MatNullSpaceCreate(getCommunicator(), PETSC_FALSE, 1, nsp, &nullspace); Once I created the null space I test it using: MatNullSpaceTest(nullspace, m_A, &isNullSpaceValid); The case 1 pass the test while case 2 don?t. I have a small code for matrix loading, null spaces creation and testing. Unfortunately I cannot implement a small code able to produce that linear system. As attachment you can find an archive containing the matrix, the initial solution (used to manually build the null space) and the rhs (not used in the test code) in binary format. You can also find the testing code in the same archive. I used petsc 3.12(gcc+openMPI) and petsc 3.15.2(intelOneAPI) same results. If the attachment is not delivered, I can share a link to it. Thanks for any help. Marco Cisternino Marco Cisternino, PhD marco.cisternino at optimad.it ______________________ Optimad Engineering Srl Via Bligny 5, Torino, Italia. +3901119719782 www.optimad.it From: Marco Cisternino > Sent: marted? 7 dicembre 2021 19:36 To: Matthew Knepley > Cc: petsc-users > Subject: Re: [petsc-users] Nullspaces I will, as soon as possible... Scarica Outlook per Android ________________________________ From: Matthew Knepley > Sent: Tuesday, December 7, 2021 7:25:43 PM To: Marco Cisternino > Cc: petsc-users > Subject: Re: [petsc-users] Nullspaces On Tue, Dec 7, 2021 at 11:19 AM Marco Cisternino > wrote: Good morning, I?m still struggling with the Poisson equation with Neumann BCs. I discretize the equation by finite volume method and I divide every line of the linear system by the volume of the cell. I could avoid this division, but I?m trying to understand. My mesh is not uniform, i.e. cells have different volumes (it is an octree mesh). Moreover, in my computational domain there are 2 separated sub-domains. I build the null space and then I use MatNullSpaceTest to check it. If I do this: MatNullSpaceCreate(getCommunicator(), PETSC_TRUE, 0, nullptr, &nullspace); It works This produces the normalized constant vector. If I do this: Vec nsp; VecDuplicate(m_rhs, &nsp); VecSet(nsp,1.0); VecNormalize(nsp, nullptr); MatNullSpaceCreate(getCommunicator(), PETSC_FALSE, 1, &nsp, &nullspace); It does not work This is also the normalized constant vector. So you are saying that these two vectors give different results with MatNullSpaceTest()? Something must be wrong in the code. Can you send a minimal example of this? I will go through and debug it. Thanks, Matt Probably, I have wrong expectations, but should not it be the same? Thanks Marco Cisternino, PhD marco.cisternino at optimad.it ______________________ Optimad Engineering Srl Via Bligny 5, Torino, Italia. +3901119719782 www.optimad.it -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From marco.cisternino at optimad.it Tue Jan 4 09:56:26 2022 From: marco.cisternino at optimad.it (Marco Cisternino) Date: Tue, 4 Jan 2022 15:56:26 +0000 Subject: [petsc-users] Nullspaces In-Reply-To: References: Message-ID: Hello Mark, I analyzed the codes with valgrind, both the real code and the tiny one. 
I obviously used the memcheck tool, but with full leak checking, compiling both codes with debug info.
Not considering OpenMPI events (I have no wrappers on the machine I used for the analysis), the real code gave zero errors and the tiny one gave this:

==17911== 905,536 (1,552 direct, 903,984 indirect) bytes in 1 blocks are definitely lost in loss record 33 of 33
==17911==    at 0x483E340: memalign (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==17911==    by 0x49CB672: PetscMallocAlign (in /usr/lib/x86_64-linux-gnu/libpetsc_real.so.3.12.4)
==17911==    by 0x49CBE1D: PetscMallocA (in /usr/lib/x86_64-linux-gnu/libpetsc_real.so.3.12.4)
==17911==    by 0x4B26187: VecCreate (in /usr/lib/x86_64-linux-gnu/libpetsc_real.so.3.12.4)
==17911==    by 0x10940D: main (testNullSpace.cpp:30)

due to the fact that I forgot to destroy the solution Vec (adding VecDestroy(&solution) at the end of the main, the error disappears).

For both codes I analyzed the two ways of passing the constant to the null space of the operator: no memory errors, but still the same results from MatNullSpaceTest, i.e.

MatNullSpaceCreate(PETSC_COMM_WORLD, PETSC_TRUE, 0, nullptr, &nullspace);

passes the test, while

Vec* nsp;
VecDuplicateVecs(solution, 1, &nsp);
VecSet(nsp[0], 1.0);
VecNormalize(nsp[0], nullptr);
MatNullSpaceCreate(PETSC_COMM_WORLD, PETSC_FALSE, 1, nsp, &nullspace);
VecDestroyVecs(1, &nsp);
PetscFree(nsp);

does not.

I hope this settles the doubt about the memory behavior, but please do not hesitate to ask for more analysis if it does not.

As Matthew said some weeks ago, something must be wrong in the code, I would say in the matrix; that's why I provided the matrix and the way I test it.
Unfortunately, it is hard (read: impossible) for me to share the code producing the matrix. I hope the minimal code I provided is enough to understand something.

Thank you all.

Marco Cisternino

From: Marco Cisternino
Sent: lunedì 3 gennaio 2022 16:08
To: Mark Adams
Cc: Matthew Knepley; petsc-users
Subject: RE: [petsc-users] Nullspaces

We usually analyze the code with valgrind when important changes are implemented.
I have to admit that this analysis is still not automatic, and the case we are talking about is not a test case for our workload.
The test cases we have give no errors in the valgrind analysis.

However, I will analyze both the real code and the tiny one for this case with valgrind and report the results.

Thank you,

Marco Cisternino

From: Mark Adams
Sent: lunedì 3 gennaio 2022 15:50
To: Marco Cisternino
Cc: Matthew Knepley; petsc-users
Subject: Re: [petsc-users] Nullspaces

I have not looked at your code, but as a general observation you want to have some sort of memory checker, like valgrind for CPUs, in your workflow.
It is the fastest way to find some classes of bugs.

On Mon, Jan 3, 2022 at 8:47 AM Marco Cisternino wrote:
Are you talking about the code that produces the linear system or about the tiny code that tests the null space?
In the first case, it is absolutely possible, but I would expect no problem in the tiny code, do you agree?
It is important to remark that the real code and the tiny one behave in the same way when testing the null space of the operator. I can analyze with valgrind and I will, but I would not expect great insights.

Thanks,

Marco Cisternino, PhD
marco.cisternino at optimad.it
______________________
Optimad Engineering Srl
Via Bligny 5, Torino, Italia.
+3901119719782
www.optimad.it

From: Mark Adams Sent: lunedì
3 gennaio 2022 14:42 To: Marco Cisternino > Cc: Matthew Knepley >; petsc-users > Subject: Re: [petsc-users] Nullspaces There could be a memory bug that does not cause a noticeable problem until it hits some vital data and valgrind might find it on a small problem. However you might have a bug like a hardwired buffer size that overflows that is in fact not a bug until you get to this large size and in that case valgrid would need to be run on the large case and would have a good chance of finding it. On Mon, Jan 3, 2022 at 4:42 AM Marco Cisternino > wrote: My comments are between the Mark?s lines and they starts with ?#? Marco Cisternino From: Mark Adams > Sent: sabato 25 dicembre 2021 14:59 To: Marco Cisternino > Cc: Matthew Knepley >; petsc-users > Subject: Re: [petsc-users] Nullspaces If "triggering the issue" requires a substantial mesh, that makes me think there is a logic bug somewhere. Maybe use valgrind. # Are you suggesting to use valgrind on this tiny toy code or on the original one? However, considering the purpose of the tiny code, i.e. testing the constant null space, why there should be a logical bug? Case 1 passes and case 2 should be exactly the same, shouldn?t be it? Also you say you divide by the cell volume. Maybe I am not understanding this but that is basically diagonal scaling and that will change the null space (ie, not a constant anymore) # I agree on this, but it pushes a question: why the case 1 passes the test? # Thank you, Mark. On Thu, Dec 16, 2021 at 11:11 AM Marco Cisternino > wrote: Hello Matthew, as promised I prepared a minimal (112960 rows. I?m not able to produce anything smaller than this and triggering the issue) example of the behavior I was talking about some days ago. What I did is to produce matrix, right hand side and initial solution of the linear system. As I told you before, this linear system is the discretization of the pressure equation of a predictor-corrector method for NS equations in the framework of finite volume method. This case has homogeneous Neumann boundary conditions. Computational domain has two independent and separated sub-domains. I discretize the weak formulation and I divide every row of the linear system by the volume of the relative cell. The underlying mesh is not uniform, therefore cells have different volumes. The issue I?m going to explain does not show up if the mesh is uniform, same volume for all the cells. I usually build the null space sub-domain by sub-domain with MatNullSpaceCreate(getCommunicator(), PETSC_FALSE, nConstants, constants, &nullspace); Where nConstants = 2 and constants contains two normalized arrays with constant values on degrees of freedom relative to the associated sub-domain and zeros elsewhere. However, as a test I tried the constant over the whole domain using 2 alternatives that should produce the same null space: 1. MatNullSpaceCreate(getCommunicator(), PETSC_TRUE, 0, nullptr, &nullspace); 2. Vec* nsp; VecDuplicateVecs(solution, 1, &nsp); VecSet(nsp[0],1.0); VecNormalize(nsp[0], nullptr); MatNullSpaceCreate(getCommunicator(), PETSC_FALSE, 1, nsp, &nullspace); Once I created the null space I test it using: MatNullSpaceTest(nullspace, m_A, &isNullSpaceValid); The case 1 pass the test while case 2 don?t. I have a small code for matrix loading, null spaces creation and testing. Unfortunately I cannot implement a small code able to produce that linear system. 
As attachment you can find an archive containing the matrix, the initial solution (used to manually build the null space) and the rhs (not used in the test code) in binary format. You can also find the testing code in the same archive. I used petsc 3.12(gcc+openMPI) and petsc 3.15.2(intelOneAPI) same results. If the attachment is not delivered, I can share a link to it. Thanks for any help. Marco Cisternino Marco Cisternino, PhD marco.cisternino at optimad.it ______________________ Optimad Engineering Srl Via Bligny 5, Torino, Italia. +3901119719782 www.optimad.it From: Marco Cisternino > Sent: marted? 7 dicembre 2021 19:36 To: Matthew Knepley > Cc: petsc-users > Subject: Re: [petsc-users] Nullspaces I will, as soon as possible... Scarica Outlook per Android ________________________________ From: Matthew Knepley > Sent: Tuesday, December 7, 2021 7:25:43 PM To: Marco Cisternino > Cc: petsc-users > Subject: Re: [petsc-users] Nullspaces On Tue, Dec 7, 2021 at 11:19 AM Marco Cisternino > wrote: Good morning, I?m still struggling with the Poisson equation with Neumann BCs. I discretize the equation by finite volume method and I divide every line of the linear system by the volume of the cell. I could avoid this division, but I?m trying to understand. My mesh is not uniform, i.e. cells have different volumes (it is an octree mesh). Moreover, in my computational domain there are 2 separated sub-domains. I build the null space and then I use MatNullSpaceTest to check it. If I do this: MatNullSpaceCreate(getCommunicator(), PETSC_TRUE, 0, nullptr, &nullspace); It works This produces the normalized constant vector. If I do this: Vec nsp; VecDuplicate(m_rhs, &nsp); VecSet(nsp,1.0); VecNormalize(nsp, nullptr); MatNullSpaceCreate(getCommunicator(), PETSC_FALSE, 1, &nsp, &nullspace); It does not work This is also the normalized constant vector. So you are saying that these two vectors give different results with MatNullSpaceTest()? Something must be wrong in the code. Can you send a minimal example of this? I will go through and debug it. Thanks, Matt Probably, I have wrong expectations, but should not it be the same? Thanks Marco Cisternino, PhD marco.cisternino at optimad.it ______________________ Optimad Engineering Srl Via Bligny 5, Torino, Italia. +3901119719782 www.optimad.it -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From gridpack.account at pnnl.gov Tue Jan 4 10:30:04 2022 From: gridpack.account at pnnl.gov (^PNNL GridPACK Account) Date: Tue, 4 Jan 2022 16:30:04 +0000 Subject: [petsc-users] Building apps on Cori Message-ID: <58116F49-1C02-4253-8B47-7D95A2EE2032@pnnl.gov> Hi, We?ve got some users of our GridPACK package that are trying to build on the Cori machine at NERSC. GridPACK uses CMake for its build system and relies on Jed Brown?s FindPETSc.cmake module, along with the FindPackageMultipass.cmake module to identify PETSc. The tests for PETSc are currently failing with -- Checking PETSc ... 
-- petsc_lib_dir /global/u1/s/smittal/petsc/arch-linux2-c-debug/lib -- Recognized PETSc install with single library for all packages -- Performing Test MULTIPASS_TEST_1_petsc_works_minimal -- Performing Test MULTIPASS_TEST_1_petsc_works_minimal - Failed -- Performing Test MULTIPASS_TEST_2_petsc_works_allincludes -- Performing Test MULTIPASS_TEST_2_petsc_works_allincludes - Failed -- Performing Test MULTIPASS_TEST_3_petsc_works_alllibraries CMake Error: The following variables are used in this project, but they are set to NOTFOUND. Please set them or make sure they are set and tested correctly in the CMake files: MPI_LIBRARY linked by target "cmTC_664e6" in directory /global/homes/s/smittal/GridPACK/build/CMakeFiles/CMakeTmp PETSC_LIBRARY_SINGLE linked by target "cmTC_664e6" in directory /global/homes/s/smittal/GridPACK/build/CMakeFiles/CMakeTmp CMake Error at /global/common/cori_cle7/software/cmake/3.21.3/share/cmake-3.21/Modules/Internal/CheckSourceRuns.cmake:94 (try_run): Failed to generate test project build system. Call Stack (most recent call first): /global/common/cori_cle7/software/cmake/3.21.3/share/cmake-3.21/Modules/CheckCSourceRuns.cmake:76 (cmake_check_source_runs) /global/homes/s/smittal/GridPACK/cmake-jedbrown/FindPackageMultipass.cmake:97 (check_c_source_runs) /global/homes/s/smittal/GridPACK/cmake-jedbrown/FindPETSc.cmake:293 (multipass_source_runs) /global/homes/s/smittal/GridPACK/cmake-jedbrown/FindPETSc.cmake:332 (petsc_test_runs) CMakeLists.txt:280 (find_package) -- Configuring incomplete, errors occurred! We have code in the CMakeLists.txt file to identify a Cray build and set the MPI_LIBRARY variable to ?? instead of NOTFOUND but that may be failing. The PETSC_LIBRARY_SINGLE error is new and one that I haven?t seen in past attempts to build at NERSC. My recollection was that the FindMPI module was not geared towards identifying the MPI compiler wrappers on Crays and that had a tendency to mess everything else up. Have you seen these kinds of problems recently and if so, has anyone come up with a solution? Bruce -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefano.zampini at gmail.com Tue Jan 4 10:39:04 2022 From: stefano.zampini at gmail.com (Stefano Zampini) Date: Tue, 4 Jan 2022 17:39:04 +0100 Subject: [petsc-users] Nullspaces In-Reply-To: References: Message-ID: Il Mar 4 Gen 2022, 16:56 Marco Cisternino ha scritto: > Hello Mark, > > I analyzed the codes with valgrind, both the real code and the tiny one. > > I obviously used memcheck tool but with full leak check compiling the > codes with debug info. > > Not considering OpenMPI events (I have no wrappers on the machine I used > for the analysis), the real code gave zero errors and the tiny one gave this > > ==17911== 905,536 (1,552 direct, 903,984 indirect) bytes in 1 blocks are > definitely lost in loss record 33 of 33 > > ==17911== at 0x483E340: memalign (in > /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so) > > ==17911== by 0x49CB672: PetscMallocAlign (in > /usr/lib/x86_64-linux-gnu/libpetsc_real.so.3.12.4) > > ==17911== by 0x49CBE1D: PetscMallocA (in > /usr/lib/x86_64-linux-gnu/libpetsc_real.so.3.12.4) > > ==17911== by 0x4B26187: VecCreate (in > /usr/lib/x86_64-linux-gnu/libpetsc_real.so.3.12.4) > > ==17911== by 0x10940D: main (testNullSpace.cpp:30) > > > > due to the fact that I forgot to destroy the solution Vec (adding > VecDestroy(&solution) at the end of the main, the error disappear). 
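Independently of MatNullSpaceTest, one quick check on the matrix itself is to apply it to the normalized constant vector and look at the norm of the result, which is, as far as I understand, essentially what the test does internally. A minimal sketch using the objects of the tiny code (m_A loaded from the binary file), error checking omitted:

    Vec v, Av;
    PetscReal rnorm;
    MatCreateVecs(m_A, &v, &Av);   /* v: input vector, Av: result of the product */
    VecSet(v, 1.0);
    VecNormalize(v, nullptr);
    MatMult(m_A, v, Av);
    VecNorm(Av, NORM_2, &rnorm);
    PetscPrintf(PETSC_COMM_WORLD, "||A * constant|| = %g\n", (double)rnorm);
    VecDestroy(&v);
    VecDestroy(&Av);

This gives a single number to compare against what MatNullSpaceTest reports for case 1 and case 2.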
> > For both the codes, I analyzed the two ways of passing the constant to the > null space of the operator, no memory errors but still the same results > from MatNullSpaceTest, i.e. > > > > MatNullSpaceCreate(PETSC_COMM_WORLD, PETSC_TRUE, 0, nullptr, &nullspace); > > > > passes the test while > > > > Vec* nsp; > > VecDuplicateVecs(solution, 1, &nsp); > > VecSet(nsp[0],1.0); > > VecNormalize(nsp[0], nullptr); > > MatNullSpaceCreate(PETSC_COMM_WORLD, PETSC_FALSE, 1, nsp, &nullspace); > > VecDestroyVecs(1,&nsp); > > PetscFree(nsp); > > > > does not. > > > > I hope this can satisfy your doubt about the memory behavior, but please > do not hesitate to ask for more analysis if it cannot. > > > > As Matthew said some weeks ago, something should be wrong in the code, I > would say in the matrix, that?s why I provided the matrix and the way I > test it. > > Unfortunately, it is hard (read impossible) for me to share the code > producing the matrix. I hope the minimal code I provided is enough to > understand something. > Try running slepc to find the smallest eigenvalues. There should be two zero eigenvalues, then inspect the eigenvectors > > > Thank you all. > > > > Marco Cisternino > > > > *From:* Marco Cisternino > *Sent:* luned? 3 gennaio 2022 16:08 > *To:* Mark Adams > *Cc:* Matthew Knepley ; petsc-users < > petsc-users at mcs.anl.gov> > *Subject:* RE: [petsc-users] Nullspaces > > > > We usually analyze the code with valgrind, when important changes are > implemented. > > I have to admit that this analysis is still not automatic and the case we > are talking about is not a test case for our workload. > > The test cases we have give no errors in valgrind analysis. > > > > However, I will analyze both the real code and the tiny one for this case > with valgrind and report the results. > > > > Thank you, > > > > > > Marco Cisternino > > > > *From:* Mark Adams > *Sent:* luned? 3 gennaio 2022 15:50 > *To:* Marco Cisternino > *Cc:* Matthew Knepley ; petsc-users < > petsc-users at mcs.anl.gov> > *Subject:* Re: [petsc-users] Nullspaces > > > > I have not looked at your code, but as a general observation you want to > have some sort of memory checker, like valgrid for CPUs, in your workflow. > > It is the fastest way to find some classes of bugs. > > > > On Mon, Jan 3, 2022 at 8:47 AM Marco Cisternino < > marco.cisternino at optimad.it> wrote: > > Are you talking about the code that produce the linear system or about the > tiny code that test the null space? > In the first case, it is absolutely possible, but I would expect no > problem in the tiny code, do you agree? > It is important to remark that the real code and the tiny one behave in > the same way when testing the null space of the operator. I can analyze > with valgrind and I will, but I would not expect great insights. > > > > Thanks, > > > > Marco Cisternino, PhD > marco.cisternino at optimad.it > > ______________________ > > Optimad Engineering Srl > > Via Bligny 5, Torino, Italia. > +3901119719782 > www.optimad.it > > > > *From:* Mark Adams > *Sent:* luned? 3 gennaio 2022 14:42 > *To:* Marco Cisternino > *Cc:* Matthew Knepley ; petsc-users < > petsc-users at mcs.anl.gov> > *Subject:* Re: [petsc-users] Nullspaces > > > > There could be a memory bug that does not cause a noticeable problem until > it hits some vital data and valgrind might find it on a small problem. 
> > > > However you might have a bug like a hardwired buffer size that > overflows that is in fact not a bug until you get to this large size and in > that case valgrid would need to be run on the large case and would have a > good chance of finding it. > > > > > > On Mon, Jan 3, 2022 at 4:42 AM Marco Cisternino < > marco.cisternino at optimad.it> wrote: > > My comments are between the Mark?s lines and they starts with ?#? > > > > Marco Cisternino > > > > *From:* Mark Adams > *Sent:* sabato 25 dicembre 2021 14:59 > *To:* Marco Cisternino > *Cc:* Matthew Knepley ; petsc-users < > petsc-users at mcs.anl.gov> > *Subject:* Re: [petsc-users] Nullspaces > > > > If "triggering the issue" requires a substantial mesh, that makes me > think there is a logic bug somewhere. Maybe use valgrind. > > > > # Are you suggesting to use valgrind on this tiny toy code or on the > original one? However, considering the purpose of the tiny code, i.e. > testing the constant null space, why there should be a logical bug? Case 1 > passes and case 2 should be exactly the same, shouldn?t be it? > > > > Also you say you divide by the cell volume. Maybe I am not understanding > this but that is basically diagonal scaling and that will change the null > space (ie, not a constant anymore) > > > > # I agree on this, but it pushes a question: why the case 1 passes the > test? > > # Thank you, Mark. > > > > On Thu, Dec 16, 2021 at 11:11 AM Marco Cisternino < > marco.cisternino at optimad.it> wrote: > > Hello Matthew, > > as promised I prepared a minimal (112960 rows. I?m not able to produce > anything smaller than this and triggering the issue) example of the > behavior I was talking about some days ago. > > What I did is to produce matrix, right hand side and initial solution of > the linear system. > > > > As I told you before, this linear system is the discretization of the > pressure equation of a predictor-corrector method for NS equations in the > framework of finite volume method. > > This case has homogeneous Neumann boundary conditions. Computational > domain has two independent and separated sub-domains. > > I discretize the weak formulation and I divide every row of the linear > system by the volume of the relative cell. > > The underlying mesh is not uniform, therefore cells have different > volumes. > > The issue I?m going to explain does not show up if the mesh is uniform, > same volume for all the cells. > > > > I usually build the null space sub-domain by sub-domain with > > MatNullSpaceCreate(getCommunicator(), PETSC_FALSE, nConstants, constants, > &nullspace); > > Where nConstants = 2 and constants contains two normalized arrays with > constant values on degrees of freedom relative to the associated sub-domain > and zeros elsewhere. > > > > However, as a test I tried the constant over the whole domain using 2 > alternatives that should produce the same null space: > > 1. MatNullSpaceCreate(getCommunicator(), PETSC_TRUE, 0, nullptr, > &nullspace); > 2. Vec* nsp; > > VecDuplicateVecs(solution, 1, &nsp); > > VecSet(nsp[0],1.0); > > VecNormalize(nsp[0], nullptr); > > MatNullSpaceCreate(getCommunicator(), PETSC_FALSE, 1, nsp, &nullspace); > > > > Once I created the null space I test it using: > > MatNullSpaceTest(nullspace, m_A, &isNullSpaceValid); > > > > The case 1 pass the test while case 2 don?t. > > > > I have a small code for matrix loading, null spaces creation and testing. > > Unfortunately I cannot implement a small code able to produce that linear > system. 
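For the two-vector null space (one constant per sub-domain) described above, a minimal sketch of the construction, assuming a hypothetical helper rowIsInFirstSubdomain() that tells which sub-domain a row belongs to (that mapping is not part of the shared test code). Because the two vectors are supported on disjoint sets of rows they are already orthogonal, so normalizing each one gives an orthonormal basis:

Vec          *nsp;
PetscInt     row, rStart, rEnd;
MatNullSpace nullspace;
PetscBool    isNullSpaceValid;
VecDuplicateVecs(solution, 2, &nsp);       /* two vectors with the layout of the loaded vector */
VecSet(nsp[0], 0.0);
VecSet(nsp[1], 0.0);
VecGetOwnershipRange(nsp[0], &rStart, &rEnd);
for (row = rStart; row < rEnd; ++row) {
  if (rowIsInFirstSubdomain(row)) VecSetValue(nsp[0], row, 1.0, INSERT_VALUES); /* hypothetical sub-domain test */
  else                            VecSetValue(nsp[1], row, 1.0, INSERT_VALUES);
}
VecAssemblyBegin(nsp[0]); VecAssemblyEnd(nsp[0]);
VecAssemblyBegin(nsp[1]); VecAssemblyEnd(nsp[1]);
VecNormalize(nsp[0], NULL);
VecNormalize(nsp[1], NULL);
MatNullSpaceCreate(PETSC_COMM_WORLD, PETSC_FALSE, 2, nsp, &nullspace);
MatNullSpaceTest(nullspace, m_A, &isNullSpaceValid);
VecDestroyVecs(2, &nsp);
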
> > > > As attachment you can find an archive containing the matrix, the initial > solution (used to manually build the null space) and the rhs (not used in > the test code) in binary format. > > You can also find the testing code in the same archive. > > I used petsc 3.12(gcc+openMPI) and petsc 3.15.2(intelOneAPI) same results. > > If the attachment is not delivered, I can share a link to it. > > > > Thanks for any help. > > > > Marco Cisternino > > > > > > Marco Cisternino, PhD > marco.cisternino at optimad.it > > ______________________ > > Optimad Engineering Srl > > Via Bligny 5, Torino, Italia. > +3901119719782 > www.optimad.it > > > > *From:* Marco Cisternino > *Sent:* marted? 7 dicembre 2021 19:36 > *To:* Matthew Knepley > *Cc:* petsc-users > *Subject:* Re: [petsc-users] Nullspaces > > > > I will, as soon as possible... > > > > Scarica Outlook per Android > ------------------------------ > > *From:* Matthew Knepley > *Sent:* Tuesday, December 7, 2021 7:25:43 PM > *To:* Marco Cisternino > *Cc:* petsc-users > *Subject:* Re: [petsc-users] Nullspaces > > > > On Tue, Dec 7, 2021 at 11:19 AM Marco Cisternino < > marco.cisternino at optimad.it> wrote: > > Good morning, > > I?m still struggling with the Poisson equation with Neumann BCs. > > I discretize the equation by finite volume method and I divide every line > of the linear system by the volume of the cell. I could avoid this > division, but I?m trying to understand. > > My mesh is not uniform, i.e. cells have different volumes (it is an octree > mesh). > > Moreover, in my computational domain there are 2 separated sub-domains. > > I build the null space and then I use MatNullSpaceTest to check it. > > > > If I do this: > > MatNullSpaceCreate(getCommunicator(), PETSC_TRUE, 0, nullptr, &nullspace); > > It works > > > > This produces the normalized constant vector. > > > > If I do this: > > Vec nsp; > > VecDuplicate(m_rhs, &nsp); > > VecSet(nsp,1.0); > > VecNormalize(nsp, nullptr); > > MatNullSpaceCreate(getCommunicator(), PETSC_FALSE, 1, &nsp, &nullspace); > > It does not work > > > > This is also the normalized constant vector. > > > > So you are saying that these two vectors give different results with > MatNullSpaceTest()? > > Something must be wrong in the code. Can you send a minimal example of > this? I will go > > through and debug it. > > > > Thanks, > > > > Matt > > > > Probably, I have wrong expectations, but should not it be the same? > > > > Thanks > > > > Marco Cisternino, PhD > marco.cisternino at optimad.it > > ______________________ > > Optimad Engineering Srl > > Via Bligny 5, Torino, Italia. > +3901119719782 > www.optimad.it > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From marco.cisternino at optimad.it Tue Jan 4 11:44:25 2022 From: marco.cisternino at optimad.it (Marco Cisternino) Date: Tue, 4 Jan 2022 17:44:25 +0000 Subject: [petsc-users] Nullspaces In-Reply-To: References: Message-ID: Hello Stefano and thank you for your support. 
I never used SLEPc before but I did this: right after the matrix loading from file I added the following lines to my shared tiny code MatLoad(matrix, v); EPS eps; EPSCreate(PETSC_COMM_WORLD, &eps); EPSSetOperators( eps, matrix, NULL ); EPSSetWhichEigenpairs(eps, EPS_SMALLEST_MAGNITUDE); EPSSetProblemType( eps, EPS_NHEP ); EPSSetTolerances(eps, 1.0e-12, 1000); EPSSetFromOptions( eps ); EPSSolve( eps ); Vec xr, xi; /* eigenvector, x */ PetscScalar kr, ki; /* eigenvalue, k */ PetscInt j, nconv; PetscReal error; EPSGetConverged( eps, &nconv ); for (j=0; j> wrote: Hello Matthew, as promised I prepared a minimal (112960 rows. I?m not able to produce anything smaller than this and triggering the issue) example of the behavior I was talking about some days ago. What I did is to produce matrix, right hand side and initial solution of the linear system. As I told you before, this linear system is the discretization of the pressure equation of a predictor-corrector method for NS equations in the framework of finite volume method. This case has homogeneous Neumann boundary conditions. Computational domain has two independent and separated sub-domains. I discretize the weak formulation and I divide every row of the linear system by the volume of the relative cell. The underlying mesh is not uniform, therefore cells have different volumes. The issue I?m going to explain does not show up if the mesh is uniform, same volume for all the cells. I usually build the null space sub-domain by sub-domain with MatNullSpaceCreate(getCommunicator(), PETSC_FALSE, nConstants, constants, &nullspace); Where nConstants = 2 and constants contains two normalized arrays with constant values on degrees of freedom relative to the associated sub-domain and zeros elsewhere. However, as a test I tried the constant over the whole domain using 2 alternatives that should produce the same null space: 1. MatNullSpaceCreate(getCommunicator(), PETSC_TRUE, 0, nullptr, &nullspace); 2. Vec* nsp; VecDuplicateVecs(solution, 1, &nsp); VecSet(nsp[0],1.0); VecNormalize(nsp[0], nullptr); MatNullSpaceCreate(getCommunicator(), PETSC_FALSE, 1, nsp, &nullspace); Once I created the null space I test it using: MatNullSpaceTest(nullspace, m_A, &isNullSpaceValid); The case 1 pass the test while case 2 don?t. I have a small code for matrix loading, null spaces creation and testing. Unfortunately I cannot implement a small code able to produce that linear system. As attachment you can find an archive containing the matrix, the initial solution (used to manually build the null space) and the rhs (not used in the test code) in binary format. You can also find the testing code in the same archive. I used petsc 3.12(gcc+openMPI) and petsc 3.15.2(intelOneAPI) same results. If the attachment is not delivered, I can share a link to it. Thanks for any help. Marco Cisternino Marco Cisternino, PhD marco.cisternino at optimad.it ______________________ Optimad Engineering Srl Via Bligny 5, Torino, Italia. +3901119719782 www.optimad.it From: Marco Cisternino > Sent: marted? 7 dicembre 2021 19:36 To: Matthew Knepley > Cc: petsc-users > Subject: Re: [petsc-users] Nullspaces I will, as soon as possible... Scarica Outlook per Android ________________________________ From: Matthew Knepley > Sent: Tuesday, December 7, 2021 7:25:43 PM To: Marco Cisternino > Cc: petsc-users > Subject: Re: [petsc-users] Nullspaces On Tue, Dec 7, 2021 at 11:19 AM Marco Cisternino > wrote: Good morning, I?m still struggling with the Poisson equation with Neumann BCs. 
I discretize the equation by finite volume method and I divide every line of the linear system by the volume of the cell. I could avoid this division, but I?m trying to understand. My mesh is not uniform, i.e. cells have different volumes (it is an octree mesh). Moreover, in my computational domain there are 2 separated sub-domains. I build the null space and then I use MatNullSpaceTest to check it. If I do this: MatNullSpaceCreate(getCommunicator(), PETSC_TRUE, 0, nullptr, &nullspace); It works This produces the normalized constant vector. If I do this: Vec nsp; VecDuplicate(m_rhs, &nsp); VecSet(nsp,1.0); VecNormalize(nsp, nullptr); MatNullSpaceCreate(getCommunicator(), PETSC_FALSE, 1, &nsp, &nullspace); It does not work This is also the normalized constant vector. So you are saying that these two vectors give different results with MatNullSpaceTest()? Something must be wrong in the code. Can you send a minimal example of this? I will go through and debug it. Thanks, Matt Probably, I have wrong expectations, but should not it be the same? Thanks Marco Cisternino, PhD marco.cisternino at optimad.it ______________________ Optimad Engineering Srl Via Bligny 5, Torino, Italia. +3901119719782 www.optimad.it -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Tue Jan 4 12:30:07 2022 From: jroman at dsic.upv.es (Jose E. Roman) Date: Tue, 4 Jan 2022 19:30:07 +0100 Subject: [petsc-users] Nullspaces In-Reply-To: References: Message-ID: <34A686AA-D337-484B-9EB3-A01C7565AD48@dsic.upv.es> To compute more than one eigenpair, call EPSSetDimensions(eps,nev,PETSC_DEFAULT,PETSC_DEFAULT). To compute zero eigenvalues you may want to use an absolute convergence criterion, with EPSSetConvergenceTest(eps,EPS_CONV_ABS), but then a tolerance of 1e-12 is probably too small. You can try without this, anyway. Jose > El 4 ene 2022, a las 18:44, Marco Cisternino escribi?: > > Hello Stefano and thank you for your support. 
> I never used SLEPc before but I did this: > right after the matrix loading from file I added the following lines to my shared tiny code > MatLoad(matrix, v); > > EPS eps; > EPSCreate(PETSC_COMM_WORLD, &eps); > EPSSetOperators( eps, matrix, NULL ); > EPSSetWhichEigenpairs(eps, EPS_SMALLEST_MAGNITUDE); > EPSSetProblemType( eps, EPS_NHEP ); > EPSSetTolerances(eps, 1.0e-12, 1000); > EPSSetFromOptions( eps ); > EPSSolve( eps ); > > Vec xr, xi; /* eigenvector, x */ > PetscScalar kr, ki; /* eigenvalue, k */ > PetscInt j, nconv; > PetscReal error; > EPSGetConverged( eps, &nconv ); > for (j=0; j EPSGetEigenpair( eps, j, &kr, &ki, xr, xi ); > EPSComputeError( eps, j, EPS_ERROR_ABSOLUTE, &error ); > std::cout << "The smallest eigenvalue is = (" << kr << ", " << ki << ") with error = " << error << std::endl; > } > > I launched using > mpirun -n 1 ./testnullspace -eps_monitor > > and the output is > > 1 EPS nconv=0 first unconverged value (error) -1499.29 (6.57994794e+01) > 2 EPS nconv=0 first unconverged value (error) -647.468 (5.39939262e+01) > 3 EPS nconv=0 first unconverged value (error) -177.157 (9.49337698e+01) > 4 EPS nconv=0 first unconverged value (error) 59.6771 (1.62531943e+02) > 5 EPS nconv=0 first unconverged value (error) 41.755 (1.41965990e+02) > 6 EPS nconv=0 first unconverged value (error) -11.5462 (3.60453662e+02) > 7 EPS nconv=0 first unconverged value (error) -6.04493 (4.60890030e+02) > 8 EPS nconv=0 first unconverged value (error) -22.7362 (8.67630086e+01) > 9 EPS nconv=0 first unconverged value (error) -12.9637 (1.08507821e+02) > 10 EPS nconv=0 first unconverged value (error) 7.7234 (1.53561979e+02) > ? > 111 EPS nconv=0 first unconverged value (error) -2.27e-08 (6.84762319e+00) > 112 EPS nconv=0 first unconverged value (error) -2.60619e-08 (4.45245528e+00) > 113 EPS nconv=0 first unconverged value (error) -5.49592e-09 (1.87798984e+01) > 114 EPS nconv=0 first unconverged value (error) -9.9456e-09 (7.96711076e+00) > 115 EPS nconv=0 first unconverged value (error) -1.89779e-08 (4.15471472e+00) > 116 EPS nconv=0 first unconverged value (error) -2.05288e-08 (2.52953194e+00) > 117 EPS nconv=0 first unconverged value (error) -2.02919e-08 (2.90090711e+00) > 118 EPS nconv=0 first unconverged value (error) -3.8706e-08 (8.03595736e-01) > 119 EPS nconv=1 first unconverged value (error) -61751.8 (9.58036571e-07) > Computed 1 pairs > The smallest eigenvalue is = (-3.8706e-08, 0) with error = 4.9707e-07 > > Am I using SLEPc in the right way at least for the first smallest eigenvalue? If I?m on the right way I can find out how to compute the second one. > > Thanks a lot > > Marco Cisternino From jed at jedbrown.org Tue Jan 4 23:36:42 2022 From: jed at jedbrown.org (Jed Brown) Date: Tue, 04 Jan 2022 22:36:42 -0700 Subject: [petsc-users] [EXTERNAL] Re: DM misuse causes massive memory leak? In-Reply-To: References: <87tueraunm.fsf@jedbrown.org> Message-ID: <87sfu2921h.fsf@jedbrown.org> Please "reply all" include the list in the future. "Ferrand, Jesus A." writes: > Forgot to say thanks for the reply (my bad). > Yes, I was indeed forgetting to pre-allocate the sparse matrices when doing them by hand (complacency seeded by MATLAB's zeros()). Thank you, Jed, and Jeremy, for the hints! > > I have more questions, these ones about boundary conditions (I think these are for Matt). > In my current code I set Dirichlet conditions directly on a Mat by calling MatSetZeroRows(). I profiled my code and found the part that applies them to be unnacceptably slow. 
In response, I've been trying to better pre-allocate Mats using PetscSections. I have found documentation for PetscSectionSetDof(), PetscSectionSetNumFields(), PetscSectionSetFieldName(), and PetscSectionSetFieldDof(), to set the size of my Mats and Vecs by calling DMSetLocalSection() followed by DMCreateMatrix() and DMCreateLocalVector() to get a RHS vector. This seems faster. > > In PetscSection, what is the difference between a "field" and a "component"? For example, I could have one field "Velocity" with three components ux, uy, and uz or perhaps three fields ux, uy, and uz each with a default component? It's just how you name them and how they appear in output. Usually "velocity" is better as a field with three components, but fields with other meaning (and perhaps different finite element spaces), such as pressure, would be different fields. Different components are always in the same FE space. > I am struggling now to impose boundary conditions after constraining dofs using PetscSection. My understanding is that constraining dof's reduces the size of the DM's matrix but it does not give the DM knowledge of what values the constrained dofs should have, right? > > I know that there is DMAddBoundary(), but I am unsure of how to use it. From Gmsh I have a mesh with surface boundaries flagged. I'm not sure whether DMAddBoundary()will constrain the face, edge, or vertex points when I give it the DMLabel from Gmsh. (I need specific dof on the vertices to be constrained). I did some testing and I think DMAddBoundary() attempts to constrain the Face points (see error log below). I only associated fields with the vertices but not the Faces. I can extract the vertex points from the face label using DMPlexGetConeRecursiveVertices() but the output IS has repeated entries for the vertex points (many faces share the same vertex). Is there an easier way to get the vertex points from a gmsh surface tag? How did you call DMAddBoundary()? Are you using DM_BC_ESSENTIAL and a callback function that provides the inhomogeneous boundary condition? > I'm sorry this is a mouthful. > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Argument out of range > [0]PETSC ERROR: Field number 2 must be in [0, 0) It looks like you haven't added these fields yet. > [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.16.0, unknown > [0]PETSC ERROR: ./gmsh4 on a linux-c-dbg named F86 by jesus Tue Jan 4 15:19:57 2022 > [0]PETSC ERROR: Configure options --with-32bits-pci-domain=1 --with-debugging =1 > [0]PETSC ERROR: #1 DMGetField() at /home/jesus/SAND/PETSc_install/petsc/src/dm/interface/dm.c:4803 > [0]PETSC ERROR: #2 DMCompleteBoundaryLabel_Internal() at /home/jesus/SAND/PETSc_install/petsc/src/dm/interface/dm.c:5140 > [0]PETSC ERROR: #3 DMAddBoundary() at /home/jesus/SAND/PETSc_install/petsc/src/dm/interface/dm.c:8561 > [0]PETSC ERROR: #4 main() at /home/jesus/SAND/FEA/3D/gmshBACKUP4.c:215 > [0]PETSC ERROR: No PETSc Option Table entries > [0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- > > > > > > > ________________________________ > From: Jed Brown > Sent: Wednesday, December 29, 2021 5:55 PM > To: Ferrand, Jesus A. ; petsc-users at mcs.anl.gov > Subject: [EXTERNAL] Re: [petsc-users] DM misuse causes massive memory leak? 
> > CAUTION: This email originated outside of Embry-Riddle Aeronautical University. Do not click links or open attachments unless you recognize the sender and know the content is safe. > > > "Ferrand, Jesus A." writes: > >> Dear PETSc Team: >> >> I have a question about DM and PetscSection. Say I import a mesh (for FEM purposes) and create a DMPlex for it. I then use PetscSections to set degrees of freedom per "point" (by point I mean vertices, lines, faces, and cells). I then use PetscSectionGetStorageSize() to get the size of the global stiffness matrix (K) needed for my FEM problem. One last detail, this K I populate inside a rather large loop using an element stiffness matrix function of my own. Instead of using DMCreateMatrix(), I manually created a Mat using MatCreate(), MatSetType(), MatSetSizes(), and MatSetUp(). I come to find that said loop is painfully slow when I use the manually created matrix, but 20x faster when I use the Mat coming out of DMCreateMatrix(). > > The sparse matrix hasn't been preallocated, which forces the data structure to do a lot of copies (as bad as O(n^2) complexity). DMCreateMatrix() preallocates for you. > > https://petsc.org/release/docs/manual/performance/#memory-allocation-for-sparse-matrix-assembly > https://petsc.org/release/docs/manual/mat/#sec-matsparse > >> My question is then: Is the manual Mat a noob mistake and is it somehow creating a memory leak with K? Just in case it's something else I'm attaching the code. The loop that populates K is between lines 221 and 278. Anything related to DM, DMPlex, and PetscSection is between lines 117 and 180. >> >> Machine Type: HP Laptop >> C-compiler: Gnu C >> OS: Ubuntu 20.04 >> PETSc version: 3.16.0 >> MPI Implementation: MPICH >> >> Hope you all had a Merry Christmas and that you will have a happy and productive New Year. :D >> >> >> Sincerely: >> >> J.A. Ferrand >> >> Embry-Riddle Aeronautical University - Daytona Beach FL >> >> M.Sc. Aerospace Engineering | May 2022 >> >> B.Sc. Aerospace Engineering >> >> B.Sc. Computational Mathematics >> >> >> >> Sigma Gamma Tau >> >> Tau Beta Pi >> >> Honors Program >> >> >> >> Phone: (386)-843-1829 >> >> Email(s): ferranj2 at my.erau.edu >> >> jesus.ferrand at gmail.com >> //REFERENCE: https://github.com/FreeFem/FreeFem-sources/blob/master/plugin/mpi/PETSc-code.hpp >> #include >> static char help[] = "Imports a Gmsh mesh with boundary conditions and solves the elasticity equation.\n" >> "Option prefix = opt_.\n"; >> >> struct preKE{//Preallocation before computing KE >> Mat matB, >> matBTCB; >> //matKE; >> PetscInt x_insert[3], >> y_insert[3], >> z_insert[3], >> m,//Looping variables. >> sizeKE,//size of the element stiffness matrix. >> N,//Number of nodes in element. >> x_in,y_in,z_in; //LI to index B matrix. >> PetscReal J[3][3],//Jacobian matrix. >> invJ[3][3],//Inverse of the Jacobian matrix. >> detJ,//Determinant of the Jacobian. >> dX[3], >> dY[3], >> dZ[3], >> minor00, >> minor01, >> minor02,//Determinants of minors in a 3x3 matrix. >> dPsidX, dPsidY, dPsidZ,//Shape function derivatives w.r.t global coordinates. >> weight,//Multiplier of quadrature weights. >> *dPsidXi,//Derivatives of shape functions w.r.t. Xi. >> *dPsidEta,//Derivatives of shape functions w.r.t. Eta. >> *dPsidZeta;//Derivatives of shape functions w.r.t Zeta. >> PetscErrorCode ierr; >> };//end struct. >> >> //Function declarations. 
>> extern PetscErrorCode tetra4(PetscScalar*, PetscScalar*, PetscScalar*,struct preKE*, Mat*, Mat*); >> extern PetscErrorCode ConstitutiveMatrix(Mat*,const char*,PetscInt); >> extern PetscErrorCode InitializeKEpreallocation(struct preKE*,const char*); >> >> PetscErrorCode PetscViewerVTKWriteFunction(PetscObject vec,PetscViewer viewer){ >> PetscErrorCode ierr; >> ierr = VecView((Vec)vec,viewer); CHKERRQ(ierr); >> return ierr; >> } >> >> >> >> >> int main(int argc, char **args){ >> //DEFINITIONS OF PETSC's DMPLEX LINGO: >> //POINT: A topology element (cell, face, edge, or vertex). >> //CHART: It an interval from 0 to the number of "points." (the range of admissible linear indices) >> //STRATUM: A subset of the "chart" which corresponds to all "points" at a given "level." >> //LEVEL: This is either a "depth" or a "height". >> //HEIGHT: Dimensionality of an element measured from 0D to 3D. Heights: cell = 0, face = 1, edge = 2, vertex = 3. >> //DEPTH: Dimensionality of an element measured from 3D to 0D. Depths: cell = 3, face = 2, edge = 1, vertex = 0; >> //CLOSURE: *of an element is the collection of all other elements that define it.I.e., the closure of a surface is the collection of vertices and edges that make it up. >> //STAR: >> //STANDARD LABELS: These are default tags that DMPlex has for its topology. ("depth") >> PetscErrorCode ierr;//Error tracking variable. >> DM dm;//Distributed memory object (useful for managing grids.) >> DMLabel physicalgroups;//Identifies user-specified tags in gmsh (to impose BC's). >> DMPolytopeType celltype;//When looping through cells, determines its type (tetrahedron, pyramid, hexahedron, etc.) >> PetscSection s; >> KSP ksp;//Krylov Sub-Space (linear solver object) >> Mat K,//Global stiffness matrix (Square, assume unsymmetric). >> KE,//Element stiffness matrix (Square, assume unsymmetric). >> matC;//Constitutive matrix. >> Vec XYZ,//Coordinate vector, contains spatial locations of mesh's vertices (NOTE: This vector self-destroys!). >> U,//Displacement vector. >> F;//Load Vector. >> PetscViewer XYZviewer,//Viewer object to output mesh to ASCII format. >> XYZpUviewer; //Viewer object to output displacements to ASCII format. >> PetscBool interpolate = PETSC_TRUE,//Instructs Gmsh importer whether to generate faces and edges (Needed when using P2 or higher elements). >> useCone = PETSC_TRUE,//Instructs "DMPlexGetTransitiveClosure()" whether to extract the closure or the star. >> dirichletBC = PETSC_FALSE,//For use when assembling the K matrix. >> neumannBC = PETSC_FALSE,//For use when assembling the F vector. >> saveASCII = PETSC_FALSE,//Whether to save results in ASCII format. >> saveVTK = PETSC_FALSE;//Whether to save results as VTK format. >> PetscInt nc,//number of cells. (PETSc lingo for "elements") >> nv,//number of vertices. (PETSc lingo for "nodes") >> nf,//number of faces. (PETSc lingo for "surfaces") >> ne,//number of edges. (PETSc lingo for "lines") >> pStart,//starting LI of global elements. >> pEnd,//ending LI of all elements. >> cStart,//starting LI for cells global arrangement. >> cEnd,//ending LI for cells in global arrangement. >> vStart,//starting LI for vertices in global arrangement. >> vEnd,//ending LI for vertices in global arrangement. >> fStart,//starting LI for faces in global arrangement. >> fEnd,//ending LI for faces in global arrangement. >> eStart,//starting LI for edges in global arrangement. >> eEnd,//ending LI for edges in global arrangement. >> sizeK,//Size of the element stiffness matrix. 
>> ii,jj,kk,//Dedicated looping variables. >> indexXYZ,//Variable to access the elements of XYZ vector. >> indexK,//Variable to access the elements of the U and F vectors (can reference rows and colums of K matrix.) >> *closure = PETSC_NULL,//Pointer to the closure elements of a cell. >> size_closure,//Size of the closure of a cell. >> dim,//Dimension of the mesh. >> //*edof,//Linear indices of dof's inside the K matrix. >> dof = 3,//Degrees of freedom per node. >> cells=0, edges=0, vertices=0, faces=0,//Topology counters when looping through cells. >> pinXcode=10, pinZcode=11,forceZcode=12;//Gmsh codes to extract relevant "Face Sets." >> PetscReal //*x_el,//Pointer to a vector that will store the x-coordinates of an element's vertices. >> //*y_el,//Pointer to a vector that will store the y-coordinates of an element's vertices. >> //*z_el,//Pointer to a vector that will store the z-coordinates of an element's vertices. >> *xyz_el,//Pointer to xyz array in the XYZ vector. >> traction = -10, >> *KEdata, >> t1,t2; //time keepers. >> const char *gmshfile = "TopOptmeshfine2.msh";//Name of gmsh file to import. >> >> ierr = PetscInitialize(&argc,&args,NULL,help); if(ierr) return ierr; //And the machine shall work.... >> >> //MESH IMPORT================================================================= >> //IMPORTANT NOTE: Gmsh only creates "cells" and "vertices," it does not create the "faces" or "edges." >> //Gmsh probably can generate them, must figure out how to. >> t1 = MPI_Wtime(); >> ierr = DMPlexCreateGmshFromFile(PETSC_COMM_WORLD,gmshfile,interpolate,&dm); CHKERRQ(ierr);//Read Gmsh file and generate the DMPlex. >> ierr = DMGetDimension(dm, &dim); CHKERRQ(ierr);//1-D, 2-D, or 3-D >> ierr = DMPlexGetChart(dm, &pStart, &pEnd); CHKERRQ(ierr);//Extracts linear indices of cells, vertices, faces, and edges. >> ierr = DMGetCoordinatesLocal(dm,&XYZ); CHKERRQ(ierr);//Extracts coordinates from mesh.(Contiguous storage: [x0,y0,z0,x1,y1,z1,...]) >> ierr = VecGetArray(XYZ,&xyz_el); CHKERRQ(ierr);//Get pointer to vector's data. >> t2 = MPI_Wtime(); >> PetscPrintf(PETSC_COMM_WORLD,"Mesh Import time: %10f\n",t2-t1); >> DMView(dm,PETSC_VIEWER_STDOUT_WORLD); >> >> //IMPORTANT NOTE: PETSc assumes that vertex-cell meshes are 2D even if they were 3D, so its ordering changes. >> //Cells remain at height 0, but vertices move to height 1 from height 3. To prevent this from becoming an issue >> //the "interpolate" variable is set to PETSC_TRUE to tell the mesh importer to generate faces and edges. >> //PETSc, therefore, technically does additional meshing. Gotta figure out how to get this from Gmsh directly. >> ierr = DMPlexGetDepthStratum(dm,3, &cStart, &cEnd);//Get LI of cells. >> ierr = DMPlexGetDepthStratum(dm,2, &fStart, &fEnd);//Get LI of faces >> ierr = DMPlexGetDepthStratum(dm,1, &eStart, &eEnd);//Get LI of edges. >> ierr = DMPlexGetDepthStratum(dm,0, &vStart, &vEnd);//Get LI of vertices. >> ierr = DMGetStratumSize(dm,"depth", 3, &nc);//Get number of cells. >> ierr = DMGetStratumSize(dm,"depth", 2, &nf);//Get number of faces. >> ierr = DMGetStratumSize(dm,"depth", 1, &ne);//Get number of edges. >> ierr = DMGetStratumSize(dm,"depth", 0, &nv);//Get number of vertices. 
>> /* >> PetscPrintf(PETSC_COMM_WORLD,"global start = %10d\t global end = %10d\n",pStart,pEnd); >> PetscPrintf(PETSC_COMM_WORLD,"#cells = %10d\t i = %10d\t i < %10d\n",nc,cStart,cEnd); >> PetscPrintf(PETSC_COMM_WORLD,"#faces = %10d\t i = %10d\t i < %10d\n",nf,fStart,fEnd); >> PetscPrintf(PETSC_COMM_WORLD,"#edges = %10d\t i = %10d\t i < %10d\n",ne,eStart,eEnd); >> PetscPrintf(PETSC_COMM_WORLD,"#vertices = %10d\t i = %10d\t i < %10d\n",nv,vStart,vEnd); >> */ >> //MESH IMPORT================================================================= >> >> //NOTE: This section extremely hardcoded right now. >> //Current setup would only support P1 meshes. >> //MEMORY ALLOCATION ========================================================== >> ierr = PetscSectionCreate(PETSC_COMM_WORLD, &s); CHKERRQ(ierr); >> //The chart is akin to a contiguous memory storage allocation. Each chart entry is associated >> //with a "thing," could be a vertex, face, cell, or edge, or anything else. >> ierr = PetscSectionSetChart(s, pStart, pEnd); CHKERRQ(ierr); >> //For each "thing" in the chart, additional room can be made. This is helpful for associating >> //nodes to multiple degrees of freedom. These commands help associate nodes with >> for(ii = cStart; ii < cEnd; ii++){//Cell loop. >> ierr = PetscSectionSetDof(s, ii, 0);CHKERRQ(ierr);}//NOTE: Currently no dof's associated with cells. >> for(ii = fStart; ii < fEnd; ii++){//Face loop. >> ierr = PetscSectionSetDof(s, ii, 0);CHKERRQ(ierr);}//NOTE: Currently no dof's associated with faces. >> for(ii = vStart; ii < vEnd; ii++){//Vertex loop. >> ierr = PetscSectionSetDof(s, ii, dof);CHKERRQ(ierr);}//Sets x, y, and z displacements as dofs. >> for(ii = eStart; ii < eEnd; ii++){//Edge loop >> ierr = PetscSectionSetDof(s, ii, 0);CHKERRQ(ierr);}//NOTE: Currently no dof's associated with edges. >> ierr = PetscSectionSetUp(s); CHKERRQ(ierr); >> ierr = PetscSectionGetStorageSize(s,&sizeK);CHKERRQ(ierr);//Determine the size of the global stiffness matrix. >> ierr = DMSetLocalSection(dm,s); CHKERRQ(ierr);//Associate the PetscSection with the DM object. >> //PetscErrorCode DMCreateGlobalVector(DM dm,Vec *vec) >> //ierr = DMCreateGlobalVector(dm,&U); CHKERRQ(ierr); >> PetscSectionDestroy(&s); >> //PetscPrintf(PETSC_COMM_WORLD,"sizeK = %10d\n",sizeK); >> >> //OBJECT SETUP================================================================ >> //Global stiffness matrix. >> //PetscErrorCode DMCreateMatrix(DM dm,Mat *mat) >> >> //This makes the loop fast. >> ierr = DMCreateMatrix(dm,&K); >> >> //This makes the loop uber slow. >> //ierr = MatCreate(PETSC_COMM_WORLD,&K); CHKERRQ(ierr); >> //ierr = MatSetType(K,MATAIJ); CHKERRQ(ierr);// Global stiffness matrix set to some sparse type. >> //ierr = MatSetSizes(K,PETSC_DECIDE,PETSC_DECIDE,sizeK,sizeK); CHKERRQ(ierr); >> //ierr = MatSetUp(K); CHKERRQ(ierr); >> >> //Displacement vector. >> ierr = VecCreate(PETSC_COMM_WORLD,&U); CHKERRQ(ierr); >> ierr = VecSetType(U,VECSTANDARD); CHKERRQ(ierr);// Global stiffness matrix set to some sparse type. >> ierr = VecSetSizes(U,PETSC_DECIDE,sizeK); CHKERRQ(ierr); >> >> //Load vector. >> ierr = VecCreate(PETSC_COMM_WORLD,&F); CHKERRQ(ierr); >> ierr = VecSetType(F,VECSTANDARD); CHKERRQ(ierr);// Global stiffness matrix set to some sparse type. >> ierr = VecSetSizes(F,PETSC_DECIDE,sizeK); CHKERRQ(ierr); >> //OBJECT SETUP================================================================ >> >> //WARNING: This loop is currently hardcoded for P1 elements only! 
Must Figure >> //out a clever way to modify to accomodate Pn (n>1) elements. >> >> //BEGIN GLOBAL STIFFNESS MATRIX BUILDER======================================= >> t1 = MPI_Wtime(); >> >> //PREALLOCATIONS============================================================== >> ierr = ConstitutiveMatrix(&matC,"isotropic",0); CHKERRQ(ierr); >> struct preKE preKEtetra4; >> ierr = InitializeKEpreallocation(&preKEtetra4,"tetra4"); CHKERRQ(ierr); >> ierr = MatCreate(PETSC_COMM_WORLD,&KE); CHKERRQ(ierr); //SEQUENTIAL >> ierr = MatSetSizes(KE,PETSC_DECIDE,PETSC_DECIDE,12,12); CHKERRQ(ierr); //SEQUENTIAL >> ierr = MatSetType(KE,MATDENSE); CHKERRQ(ierr); //SEQUENTIAL >> ierr = MatSetUp(KE); CHKERRQ(ierr); >> PetscReal x_tetra4[4], y_tetra4[4],z_tetra4[4], >> x_hex8[8], y_hex8[8],z_hex8[8], >> *x,*y,*z; >> PetscInt *EDOF,edof_tetra4[12],edof_hex8[24]; >> DMPolytopeType previous = DM_POLYTOPE_UNKNOWN; >> //PREALLOCATIONS============================================================== >> >> >> >> for(ii=cStart;ii> ierr = DMPlexGetTransitiveClosure(dm, ii, useCone, &size_closure, &closure); CHKERRQ(ierr); >> ierr = DMPlexGetCellType(dm, ii, &celltype); CHKERRQ(ierr); >> //IMPORTANT NOTE: MOST OF THIS LOOP SHOULD BE INCLUDED IN THE KE3D function. >> if(previous != celltype){ >> //PetscPrintf(PETSC_COMM_WORLD,"run \n"); >> if(celltype == DM_POLYTOPE_TETRAHEDRON){ >> x = x_tetra4; >> y = y_tetra4; >> z = z_tetra4; >> EDOF = edof_tetra4; >> }//end if. >> else if(celltype == DM_POLYTOPE_HEXAHEDRON){ >> x = x_hex8; >> y = y_hex8; >> z = z_hex8; >> EDOF = edof_hex8; >> }//end else if. >> } >> previous = celltype; >> >> //PetscPrintf(PETSC_COMM_WORLD,"Cell # %4i\t",ii); >> cells=0; >> edges=0; >> vertices=0; >> faces=0; >> kk = 0; >> for(jj=0;jj<(2*size_closure);jj+=2){//Scan the closure of the current cell. >> //Use information from the DM's strata to determine composition of cell_ii. >> if(vStart <= closure[jj] && closure[jj]< vEnd){//Check for vertices. >> //PetscPrintf(PETSC_COMM_WORLD,"%5i\t",closure[jj]); >> indexXYZ = dim*(closure[jj]-vStart);//Linear index of x-coordinate in the xyz_el array. >> >> *(x+vertices) = xyz_el[indexXYZ]; >> *(y+vertices) = xyz_el[indexXYZ+1];//Extract Y-coordinates of the current vertex. >> *(z+vertices) = xyz_el[indexXYZ+2];//Extract Y-coordinates of the current vertex. >> *(EDOF + kk) = indexXYZ; >> *(EDOF + kk+1) = indexXYZ+1; >> *(EDOF + kk+2) = indexXYZ+2; >> kk+=3; >> vertices++;//Update vertex counter. >> }//end if >> else if(eStart <= closure[jj] && closure[jj]< eEnd){//Check for edge ID's >> edges++; >> }//end else ifindexK >> else if(fStart <= closure[jj] && closure[jj]< fEnd){//Check for face ID's >> faces++; >> }//end else if >> else if(cStart <= closure[jj] && closure[jj]< cEnd){//Check for cell ID's >> cells++; >> }//end else if >> }//end "jj" loop. >> ierr = tetra4(x,y,z,&preKEtetra4,&matC,&KE); CHKERRQ(ierr); //Generate the element stiffness matrix for this cell. >> ierr = MatDenseGetArray(KE,&KEdata); CHKERRQ(ierr); >> ierr = MatSetValues(K,12,EDOF,12,EDOF,KEdata,ADD_VALUES); CHKERRQ(ierr);//WARNING: HARDCODED FOR TETRAHEDRAL P1 ELEMENTS ONLY !!!!!!!!!!!!!!!!!!!!!!! >> ierr = MatDenseRestoreArray(KE,&KEdata); CHKERRQ(ierr); >> ierr = DMPlexRestoreTransitiveClosure(dm, ii,useCone, &size_closure, &closure); CHKERRQ(ierr); >> }//end "ii" loop. 
>> ierr = MatAssemblyBegin(K,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >> ierr = MatAssemblyEnd(K,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >> //ierr = MatView(K,PETSC_VIEWER_DRAW_WORLD); CHKERRQ(ierr); >> //END GLOBAL STIFFNESS MATRIX BUILDER=========================================== >> t2 = MPI_Wtime(); >> PetscPrintf(PETSC_COMM_WORLD,"K build time: %10f\n",t2-t1); >> >> >> >> >> >> >> >> >> t1 = MPI_Wtime(); >> //BEGIN BOUNDARY CONDITION ENFORCEMENT========================================== >> IS TrianglesIS, physicalsurfaceID;//, VerticesIS; >> PetscInt numsurfvals, >> //numRows, >> dof_offset,numTri; >> const PetscInt *surfvals, >> //*pinZID, >> *TriangleID; >> PetscScalar diag =1; >> PetscReal area,force; >> //NOTE: Petsc can read/assign labels. Eeach label may posses multiple "values." >> //These values act as tags within a tag. >> //IMPORTANT NOTE: The below line needs a safety. If a mesh that does not feature >> //face sets is imported, the code in its current state will crash!!!. This is currently >> //hardcoded for the test mesh. >> ierr = DMGetLabel(dm, "Face Sets", &physicalgroups); CHKERRQ(ierr);//Inspects Physical surface groups defined by gmsh (if any). >> ierr = DMLabelGetValueIS(physicalgroups, &physicalsurfaceID); CHKERRQ(ierr);//Gets the physical surface ID's defined in gmsh (as specified in the .geo file). >> ierr = ISGetIndices(physicalsurfaceID,&surfvals); CHKERRQ(ierr);//Get a pointer to the actual surface values. >> ierr = DMLabelGetNumValues(physicalgroups, &numsurfvals); CHKERRQ(ierr);//Gets the number of different values that the label assigns. >> for(ii=0;ii> //PetscPrintf(PETSC_COMM_WORLD,"Values = %5i\n",surfvals[ii]); >> //PROBLEM: The surface values are hardcoded in the gmsh file. We need to adopt standard "codes" >> //that we can give to users when they make their meshes so that this code recognizes the Type >> // of boundary conditions that are to be imposed. >> if(surfvals[ii] == pinXcode){ >> dof_offset = 0; >> dirichletBC = PETSC_TRUE; >> }//end if. >> else if(surfvals[ii] == pinZcode){ >> dof_offset = 2; >> dirichletBC = PETSC_TRUE; >> }//end else if. >> else if(surfvals[ii] == forceZcode){ >> dof_offset = 2; >> neumannBC = PETSC_TRUE; >> }//end else if. >> >> ierr = DMLabelGetStratumIS(physicalgroups, surfvals[ii], &TrianglesIS); CHKERRQ(ierr);//Get the ID's (as an IS) of the surfaces belonging to value 11. >> //PROBLEM: DMPlexGetConeRecursiveVertices returns an array with repeated node ID's. For each repetition, the lines that enforce BC's unnecessarily re-run. >> ierr = ISGetSize(TrianglesIS,&numTri); CHKERRQ(ierr); >> ierr = ISGetIndices(TrianglesIS,&TriangleID); CHKERRQ(ierr);//Get a pointer to the actual surface values. >> for(kk=0;kk> ierr = DMPlexGetTransitiveClosure(dm, TriangleID[kk], useCone, &size_closure, &closure); CHKERRQ(ierr); >> if(neumannBC){ >> ierr = DMPlexComputeCellGeometryFVM(dm, TriangleID[kk], &area,PETSC_NULL,PETSC_NULL); CHKERRQ(ierr); >> force = traction*area/3;//WARNING: The 3 here is hardcoded for a purely tetrahedral mesh only!!!!!!!!!! >> } >> for(jj=0;jj<(2*size_closure);jj+=2){ >> //PetscErrorCode DMPlexComputeCellGeometryFVM(DM dm, PetscInt cell, PetscReal *vol, PetscReal centroid[], PetscReal normal[]) >> if(vStart <= closure[jj] && closure[jj]< vEnd){//Check for vertices. >> indexK = dof*(closure[jj] - vStart) + dof_offset; //Compute the dof ID's in the K matrix. >> if(dirichletBC){//Boundary conditions requiring an edit of K matrix. >> ierr = MatZeroRows(K,1,&indexK,diag,NULL,NULL); CHKERRQ(ierr); >> }//end if. 
>> else if(neumannBC){//Boundary conditions requiring an edit of RHS vector. >> ierr = VecSetValue(F,indexK,force,ADD_VALUES); CHKERRQ(ierr); >> }// end else if. >> }//end if. >> }//end "jj" loop. >> ierr = DMPlexRestoreTransitiveClosure(dm, closure[jj],useCone, &size_closure, &closure); CHKERRQ(ierr); >> }//end "kk" loop. >> ierr = ISRestoreIndices(TrianglesIS,&TriangleID); CHKERRQ(ierr); >> >> /* >> ierr = DMPlexGetConeRecursiveVertices(dm, TrianglesIS, &VerticesIS); CHKERRQ(ierr);//Get the ID's (as an IS) of the vertices that make up the surfaces of value 11. >> ierr = ISGetSize(VerticesIS,&numRows); CHKERRQ(ierr);//Get number of flagged vertices (this includes repeated indices for faces that share nodes). >> ierr = ISGetIndices(VerticesIS,&pinZID); CHKERRQ(ierr);//Get a pointer to the actual surface values. >> if(dirichletBC){//Boundary conditions requiring an edit of K matrix. >> for(kk=0;kk> indexK = 3*(pinZID[kk] - vStart) + dof_offset; //Compute the dof ID's in the K matrix. (NOTE: the 3* ishardcoded for 3 degrees of freedom, tie this to a variable in the FUTURE.) >> ierr = MatZeroRows(K,1,&indexK,diag,NULL,NULL); CHKERRQ(ierr); >> }//end "kk" loop. >> }//end if. >> else if(neumannBC){//Boundary conditions requiring an edit of RHS vector. >> for(kk=0;kk> indexK = 3*(pinZID[kk] - vStart) + dof_offset; >> ierr = VecSetValue(F,indexK,traction,INSERT_VALUES); CHKERRQ(ierr); >> }//end "kk" loop. >> }// end else if. >> ierr = ISRestoreIndices(VerticesIS,&pinZID); CHKERRQ(ierr); >> */ >> dirichletBC = PETSC_FALSE; >> neumannBC = PETSC_FALSE; >> }//end "ii" loop. >> ierr = ISRestoreIndices(physicalsurfaceID,&surfvals); CHKERRQ(ierr); >> //ierr = ISRestoreIndices(VerticesIS,&pinZID); CHKERRQ(ierr); >> ierr = ISDestroy(&physicalsurfaceID); CHKERRQ(ierr); >> //ierr = ISDestroy(&VerticesIS); CHKERRQ(ierr); >> ierr = ISDestroy(&TrianglesIS); CHKERRQ(ierr); >> //END BOUNDARY CONDITION ENFORCEMENT============================================ >> t2 = MPI_Wtime(); >> PetscPrintf(PETSC_COMM_WORLD,"BC imposition time: %10f\n",t2-t1); >> >> /* >> PetscInt kk = 0; >> for(ii=vStart;ii> kk++; >> PetscPrintf(PETSC_COMM_WORLD,"Vertex #%4i\t x = %10.9f\ty = %10.9f\tz = %10.9f\n",ii,xyz_el[3*kk],xyz_el[3*kk+1],xyz_el[3*kk+2]); >> }// end "ii" loop. >> */ >> >> t1 = MPI_Wtime(); >> //SOLVER======================================================================== >> ierr = KSPCreate(PETSC_COMM_WORLD,&ksp); CHKERRQ(ierr); >> ierr = KSPSetOperators(ksp,K,K); CHKERRQ(ierr); >> ierr = KSPSetFromOptions(ksp); CHKERRQ(ierr); >> ierr = KSPSolve(ksp,F,U); CHKERRQ(ierr); >> t2 = MPI_Wtime(); >> //ierr = KSPView(ksp,PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr); >> //SOLVER======================================================================== >> t2 = MPI_Wtime(); >> PetscPrintf(PETSC_COMM_WORLD,"Solver time: %10f\n",t2-t1); >> ierr = VecRestoreArray(XYZ,&xyz_el); CHKERRQ(ierr);//Get pointer to vector's data. 
>> >> //BEGIN MAX/MIN DISPLACEMENTS=================================================== >> IS ISux,ISuy,ISuz; >> Vec UX,UY,UZ; >> PetscReal UXmax,UYmax,UZmax,UXmin,UYmin,UZmin; >> ierr = ISCreateStride(PETSC_COMM_WORLD,nv,0,3,&ISux); CHKERRQ(ierr); >> ierr = ISCreateStride(PETSC_COMM_WORLD,nv,1,3,&ISuy); CHKERRQ(ierr); >> ierr = ISCreateStride(PETSC_COMM_WORLD,nv,2,3,&ISuz); CHKERRQ(ierr); >> >> //PetscErrorCode VecGetSubVector(Vec X,IS is,Vec *Y) >> ierr = VecGetSubVector(U,ISux,&UX); CHKERRQ(ierr); >> ierr = VecGetSubVector(U,ISuy,&UY); CHKERRQ(ierr); >> ierr = VecGetSubVector(U,ISuz,&UZ); CHKERRQ(ierr); >> >> //PetscErrorCode VecMax(Vec x,PetscInt *p,PetscReal *val) >> ierr = VecMax(UX,PETSC_NULL,&UXmax); CHKERRQ(ierr); >> ierr = VecMax(UY,PETSC_NULL,&UYmax); CHKERRQ(ierr); >> ierr = VecMax(UZ,PETSC_NULL,&UZmax); CHKERRQ(ierr); >> >> ierr = VecMin(UX,PETSC_NULL,&UXmin); CHKERRQ(ierr); >> ierr = VecMin(UY,PETSC_NULL,&UYmin); CHKERRQ(ierr); >> ierr = VecMin(UZ,PETSC_NULL,&UZmin); CHKERRQ(ierr); >> >> PetscPrintf(PETSC_COMM_WORLD,"%10f\t <= ux <= %10f\n",UXmin,UXmax); >> PetscPrintf(PETSC_COMM_WORLD,"%10f\t <= uy <= %10f\n",UYmin,UYmax); >> PetscPrintf(PETSC_COMM_WORLD,"%10f\t <= uz <= %10f\n",UZmin,UZmax); >> >> >> >> >> //BEGIN OUTPUT SOLUTION========================================================= >> if(saveASCII){ >> ierr = PetscViewerASCIIOpen(PETSC_COMM_WORLD,"XYZ.txt",&XYZviewer); >> ierr = VecView(XYZ,XYZviewer); CHKERRQ(ierr); >> ierr = PetscViewerASCIIOpen(PETSC_COMM_WORLD,"U.txt",&XYZpUviewer); >> ierr = VecView(U,XYZpUviewer); CHKERRQ(ierr); >> PetscViewerDestroy(&XYZviewer); PetscViewerDestroy(&XYZpUviewer); >> >> }//end if. >> if(saveVTK){ >> const char *meshfile = "starting_mesh.vtk", >> *deformedfile = "deformed_mesh.vtk"; >> ierr = PetscViewerVTKOpen(PETSC_COMM_WORLD,meshfile,FILE_MODE_WRITE,&XYZviewer); CHKERRQ(ierr); >> //PetscErrorCode DMSetAuxiliaryVec(DM dm, DMLabel label, PetscInt value, Vec aux) >> DMLabel UXlabel,UYlabel, UZlabel; >> //PetscErrorCode DMLabelCreate(MPI_Comm comm, const char name[], DMLabel *label) >> ierr = DMLabelCreate(PETSC_COMM_WORLD, "X-Displacement", &UXlabel); CHKERRQ(ierr); >> ierr = DMLabelCreate(PETSC_COMM_WORLD, "Y-Displacement", &UYlabel); CHKERRQ(ierr); >> ierr = DMLabelCreate(PETSC_COMM_WORLD, "Z-Displacement", &UZlabel); CHKERRQ(ierr); >> ierr = DMSetAuxiliaryVec(dm,UXlabel, 1, UX); CHKERRQ(ierr); >> ierr = DMSetAuxiliaryVec(dm,UYlabel, 1, UY); CHKERRQ(ierr); >> ierr = DMSetAuxiliaryVec(dm,UZlabel, 1, UZ); CHKERRQ(ierr); >> //PetscErrorCode PetscViewerVTKAddField(PetscViewer viewer,PetscObject dm,PetscErrorCode (*PetscViewerVTKWriteFunction)(PetscObject,PetscViewer),PetscInt fieldnum,PetscViewerVTKFieldType fieldtype,PetscBool checkdm,PetscObject vec) >> >> >> >> //ierr = PetscViewerVTKAddField(XYZviewer, dm,PetscErrorCode (*PetscViewerVTKWriteFunction)(Vec,PetscViewer),PETSC_DEFAULT,PETSC_VTK_POINT_FIELD,PETSC_FALSE,UX); >> ierr = PetscViewerVTKAddField(XYZviewer, (PetscObject)dm,&PetscViewerVTKWriteFunction,PETSC_DEFAULT,PETSC_VTK_POINT_FIELD,PETSC_FALSE,(PetscObject)UX); >> >> >> ierr = DMPlexVTKWriteAll((PetscObject)dm, XYZviewer); CHKERRQ(ierr); >> ierr = VecAXPY(XYZ,1,U); CHKERRQ(ierr);//Add displacement field to the mesh coordinates to deform. >> ierr = PetscViewerVTKOpen(PETSC_COMM_WORLD,deformedfile,FILE_MODE_WRITE,&XYZpUviewer); CHKERRQ(ierr); >> ierr = DMPlexVTKWriteAll((PetscObject)dm, XYZpUviewer); CHKERRQ(ierr);// >> PetscViewerDestroy(&XYZviewer); PetscViewerDestroy(&XYZpUviewer); >> >> }//end else if. 
>> else{ >> ierr = PetscPrintf(PETSC_COMM_WORLD,"No output format specified! Files not saved.\n"); CHKERRQ(ierr); >> }//end else. >> >> >> //END OUTPUT SOLUTION=========================================================== >> VecDestroy(&UX); ISDestroy(&ISux); >> VecDestroy(&UY); ISDestroy(&ISuy); >> VecDestroy(&UZ); ISDestroy(&ISuz); >> //END MAX/MIN DISPLACEMENTS===================================================== >> >> //CLEANUP===================================================================== >> DMDestroy(&dm); >> KSPDestroy(&ksp); >> MatDestroy(&K); MatDestroy(&KE); MatDestroy(&matC); //MatDestroy(preKEtetra4.matB); MatDestroy(preKEtetra4.matBTCB); >> VecDestroy(&U); VecDestroy(&F); >> >> //DMLabelDestroy(&physicalgroups);//Destroyig the DM destroys the label. >> //CLEANUP===================================================================== >> //PetscErrorCode PetscMallocDump(FILE *fp) >> //ierr = PetscMallocDump(NULL); >> return PetscFinalize();//And the machine shall rest.... >> }//end main. >> >> PetscErrorCode tetra4(PetscScalar* X,PetscScalar* Y, PetscScalar* Z,struct preKE *P, Mat* matC, Mat* KE){ >> //INPUTS: >> //X: Global X coordinates of the elemental nodes. >> //Y: Global Y coordinates of the elemental nodes. >> //Z: Global Z coordinates of the elemental nodes. >> //J: Jacobian matrix. >> //invJ: Inverse Jacobian matrix. >> PetscErrorCode ierr; >> //For current quadrature point, get dPsi/dXi_i Xi_i = {Xi,Eta,Zeta} >> /* >> P->dPsidXi[0] = +1.; P->dPsidEta[0] = 0.0; P->dPsidZeta[0] = 0.0; >> P->dPsidXi[1] = 0.0; P->dPsidEta[1] = +1.; P->dPsidZeta[1] = 0.0; >> P->dPsidXi[2] = 0.0; P->dPsidEta[2] = 0.0; P->dPsidZeta[2] = +1.; >> P->dPsidXi[3] = -1.; P->dPsidEta[3] = -1.; P->dPsidZeta[3] = -1.; >> */ >> //Populate the Jacobian matrix. >> P->J[0][0] = X[0] - X[3]; >> P->J[0][1] = Y[0] - Y[3]; >> P->J[0][2] = Z[0] - Z[3]; >> P->J[1][0] = X[1] - X[3]; >> P->J[1][1] = Y[1] - Y[3]; >> P->J[1][2] = Z[1] - Z[3]; >> P->J[2][0] = X[2] - X[3]; >> P->J[2][1] = Y[2] - Y[3]; >> P->J[2][2] = Z[2] - Z[3]; >> >> //Determinant of the 3x3 Jacobian. (Expansion along 1st row). >> P->minor00 = P->J[1][1]*P->J[2][2] - P->J[2][1]*P->J[1][2];//Reuse when finding InvJ. >> P->minor01 = P->J[1][0]*P->J[2][2] - P->J[2][0]*P->J[1][2];//Reuse when finding InvJ. >> P->minor02 = P->J[1][0]*P->J[2][1] - P->J[2][0]*P->J[1][1];//Reuse when finding InvJ. >> P->detJ = P->J[0][0]*P->minor00 - P->J[0][1]*P->minor01 + P->J[0][2]*P->minor02; >> //Inverse of the 3x3 Jacobian >> P->invJ[0][0] = +P->minor00/P->detJ;//Reuse precomputed minor. >> P->invJ[0][1] = -(P->J[0][1]*P->J[2][2] - P->J[0][2]*P->J[2][1])/P->detJ; >> P->invJ[0][2] = +(P->J[0][1]*P->J[1][2] - P->J[1][1]*P->J[0][2])/P->detJ; >> P->invJ[1][0] = -P->minor01/P->detJ;//Reuse precomputed minor. >> P->invJ[1][1] = +(P->J[0][0]*P->J[2][2] - P->J[0][2]*P->J[2][0])/P->detJ; >> P->invJ[1][2] = -(P->J[0][0]*P->J[1][2] - P->J[1][0]*P->J[0][2])/P->detJ; >> P->invJ[2][0] = +P->minor02/P->detJ;//Reuse precomputed minor. >> P->invJ[2][1] = -(P->J[0][0]*P->J[2][1] - P->J[0][1]*P->J[2][0])/P->detJ; >> P->invJ[2][2] = +(P->J[0][0]*P->J[1][1] - P->J[0][1]*P->J[1][0])/P->detJ; >> >> //*****************STRAIN MATRIX (B)************************************** >> for(P->m=0;P->mN;P->m++){//Scan all shape functions. 
>> >> P->x_in = 0 + P->m*3;//Every 3rd column starting at 0 >> P->y_in = P->x_in +1;//Every 3rd column starting at 1 >> P->z_in = P->y_in +1;//Every 3rd column starting at 2 >> >> P->dX[0] = P->invJ[0][0]*P->dPsidXi[P->m] + P->invJ[0][1]*P->dPsidEta[P->m] + P->invJ[0][2]*P->dPsidZeta[P->m]; >> P->dY[0] = P->invJ[1][0]*P->dPsidXi[P->m] + P->invJ[1][1]*P->dPsidEta[P->m] + P->invJ[1][2]*P->dPsidZeta[P->m]; >> P->dZ[0] = P->invJ[2][0]*P->dPsidXi[P->m] + P->invJ[2][1]*P->dPsidEta[P->m] + P->invJ[2][2]*P->dPsidZeta[P->m]; >> >> P->dX[1] = P->dZ[0]; P->dX[2] = P->dY[0]; >> P->dY[1] = P->dZ[0]; P->dY[2] = P->dX[0]; >> P->dZ[1] = P->dX[0]; P->dZ[2] = P->dY[0]; >> >> ierr = MatSetValues(P->matB,3,P->x_insert,1,&(P->x_in),P->dX,INSERT_VALUES); CHKERRQ(ierr); >> ierr = MatSetValues(P->matB,3,P->y_insert,1,&(P->y_in),P->dY,INSERT_VALUES); CHKERRQ(ierr); >> ierr = MatSetValues(P->matB,3,P->z_insert,1,&(P->z_in),P->dZ,INSERT_VALUES); CHKERRQ(ierr); >> >> }//end "m" loop. >> ierr = MatAssemblyBegin(P->matB,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >> ierr = MatAssemblyEnd(P->matB,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >> //*****************STRAIN MATRIX (B)************************************** >> >> //Compute the matrix product B^t*C*B, scale it by the quadrature weights and add to KE. >> P->weight = -P->detJ/6; >> >> ierr = MatZeroEntries(*KE); CHKERRQ(ierr); >> ierr = MatPtAP(*matC,P->matB,MAT_INITIAL_MATRIX,PETSC_DEFAULT,&(P->matBTCB));CHKERRQ(ierr); >> ierr = MatScale(P->matBTCB,P->weight); CHKERRQ(ierr); >> ierr = MatAssemblyBegin(P->matBTCB,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >> ierr = MatAssemblyEnd(P->matBTCB,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >> ierr = MatAXPY(*KE,1,P->matBTCB,DIFFERENT_NONZERO_PATTERN); CHKERRQ(ierr);//Add contribution of current quadrature point to KE. >> >> //ierr = MatPtAP(*matC,P->matB,MAT_INITIAL_MATRIX,PETSC_DEFAULT,KE);CHKERRQ(ierr); >> //ierr = MatScale(*KE,P->weight); CHKERRQ(ierr); >> >> ierr = MatAssemblyBegin(*KE,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >> ierr = MatAssemblyEnd(*KE,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >> >> //Cleanup >> return ierr; >> }//end tetra4. >> >> PetscErrorCode ConstitutiveMatrix(Mat *matC,const char* type,PetscInt materialID){ >> PetscErrorCode ierr; >> PetscBool isotropic = PETSC_FALSE, >> orthotropic = PETSC_FALSE; >> //PetscErrorCode PetscStrcmp(const char a[],const char b[],PetscBool *flg) >> ierr = PetscStrcmp(type,"isotropic",&isotropic); >> ierr = PetscStrcmp(type,"orthotropic",&orthotropic); >> ierr = MatCreate(PETSC_COMM_WORLD,matC); CHKERRQ(ierr); >> ierr = MatSetSizes(*matC,PETSC_DECIDE,PETSC_DECIDE,6,6); CHKERRQ(ierr); >> ierr = MatSetType(*matC,MATAIJ); CHKERRQ(ierr); >> ierr = MatSetUp(*matC); CHKERRQ(ierr); >> >> if(isotropic){ >> PetscReal E,nu, M,L,vals[3]; >> switch(materialID){ >> case 0://Hardcoded properties for isotropic material #0 >> E = 200; >> nu = 1./3; >> break; >> case 1://Hardcoded properties for isotropic material #1 >> E = 96; >> nu = 1./3; >> break; >> }//end switch. >> M = E/(2*(1+nu)),//Lame's constant 1 ("mu"). >> L = E*nu/((1+nu)*(1-2*nu));//Lame's constant 2 ("lambda"). 
>> //PetscErrorCode MatSetValues(Mat mat,PetscInt m,const PetscInt idxm[],PetscInt n,const PetscInt idxn[],const PetscScalar v[],InsertMode addv) >> PetscInt idxn[3] = {0,1,2}; >> vals[0] = L+2*M; vals[1] = L; vals[2] = vals[1]; >> ierr = MatSetValues(*matC,1,&idxn[0],3,idxn,vals,INSERT_VALUES); CHKERRQ(ierr); >> vals[1] = vals[0]; vals[0] = vals[2]; >> ierr = MatSetValues(*matC,1,&idxn[1],3,idxn,vals,INSERT_VALUES); CHKERRQ(ierr); >> vals[2] = vals[1]; vals[1] = vals[0]; >> ierr = MatSetValues(*matC,1,&idxn[2],3,idxn,vals,INSERT_VALUES); CHKERRQ(ierr); >> ierr = MatSetValue(*matC,3,3,M,INSERT_VALUES); CHKERRQ(ierr); >> ierr = MatSetValue(*matC,4,4,M,INSERT_VALUES); CHKERRQ(ierr); >> ierr = MatSetValue(*matC,5,5,M,INSERT_VALUES); CHKERRQ(ierr); >> }//end if. >> /* >> else if(orthotropic){ >> switch(materialID){ >> case 0: >> break; >> case 1: >> break; >> }//end switch. >> }//end else if. >> */ >> ierr = MatAssemblyBegin(*matC,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >> ierr = MatAssemblyEnd(*matC,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >> //MatView(*matC,0); >> return ierr; >> }//End ConstitutiveMatrix >> >> PetscErrorCode InitializeKEpreallocation(struct preKE *P,const char* type){ >> PetscErrorCode ierr; >> PetscBool istetra4 = PETSC_FALSE, >> ishex8 = PETSC_FALSE; >> ierr = PetscStrcmp(type,"tetra4",&istetra4); CHKERRQ(ierr); >> ierr = PetscStrcmp(type,"hex8",&ishex8); CHKERRQ(ierr); >> if(istetra4){ >> P->sizeKE = 12; >> P->N = 4; >> }//end if. >> else if(ishex8){ >> P->sizeKE = 24; >> P->N = 8; >> }//end else if. >> >> >> P->x_insert[0] = 0; P->x_insert[1] = 3; P->x_insert[2] = 5; >> P->y_insert[0] = 1; P->y_insert[1] = 4; P->y_insert[2] = 5; >> P->z_insert[0] = 2; P->z_insert[1] = 3; P->z_insert[2] = 4; >> //Allocate memory for the differentiated shape function vectors. >> ierr = PetscMalloc1(P->N,&(P->dPsidXi)); CHKERRQ(ierr); >> ierr = PetscMalloc1(P->N,&(P->dPsidEta)); CHKERRQ(ierr); >> ierr = PetscMalloc1(P->N,&(P->dPsidZeta)); CHKERRQ(ierr); >> >> P->dPsidXi[0] = +1.; P->dPsidEta[0] = 0.0; P->dPsidZeta[0] = 0.0; >> P->dPsidXi[1] = 0.0; P->dPsidEta[1] = +1.; P->dPsidZeta[1] = 0.0; >> P->dPsidXi[2] = 0.0; P->dPsidEta[2] = 0.0; P->dPsidZeta[2] = +1.; >> P->dPsidXi[3] = -1.; P->dPsidEta[3] = -1.; P->dPsidZeta[3] = -1.; >> >> >> //Strain matrix. >> ierr = MatCreate(PETSC_COMM_WORLD,&(P->matB)); CHKERRQ(ierr); >> ierr = MatSetSizes(P->matB,PETSC_DECIDE,PETSC_DECIDE,6,P->sizeKE); CHKERRQ(ierr);//Hardcoded >> ierr = MatSetType(P->matB,MATAIJ); CHKERRQ(ierr); >> ierr = MatSetUp(P->matB); CHKERRQ(ierr); >> >> //Contribution matrix. >> ierr = MatCreate(PETSC_COMM_WORLD,&(P->matBTCB)); CHKERRQ(ierr); >> ierr = MatSetSizes(P->matBTCB,PETSC_DECIDE,PETSC_DECIDE,P->sizeKE,P->sizeKE); CHKERRQ(ierr); >> ierr = MatSetType(P->matBTCB,MATAIJ); CHKERRQ(ierr); >> ierr = MatSetUp(P->matBTCB); CHKERRQ(ierr); >> >> //Element stiffness matrix. >> //ierr = MatCreateSeqDense(PETSC_COMM_SELF,12,12,NULL,&KE); CHKERRQ(ierr); //PARALLEL >> >> return ierr; >> } From FERRANJ2 at my.erau.edu Wed Jan 5 10:18:23 2022 From: FERRANJ2 at my.erau.edu (Ferrand, Jesus A.) Date: Wed, 5 Jan 2022 16:18:23 +0000 Subject: [petsc-users] [EXTERNAL] Re: DM misuse causes massive memory leak? In-Reply-To: <87sfu2921h.fsf@jedbrown.org> References: <87tueraunm.fsf@jedbrown.org> <87sfu2921h.fsf@jedbrown.org> Message-ID: Thanks for the reply (I hit reply all this time). 
So, I set 3 fields: /* ierr = PetscSectionSetNumFields(s,dof); CHKERRQ(ierr); ierr = PetscSectionSetFieldName(s,0, "X-Displacement"); CHKERRQ(ierr); //Field ID is 0 ierr = PetscSectionSetFieldName(s,1, "Y-Displacement"); CHKERRQ(ierr); //Field ID is 1 ierr = PetscSectionSetFieldName(s,2, "Z-Displacement"); CHKERRQ(ierr); //Field ID is 2 */ I then loop through the vertices of my DMPlex /* for(ii = vStart; ii < vEnd; ii++){//Vertex loop. ierr = PetscSectionSetDof(s, ii, dof); CHKERRQ(ierr); ierr = PetscSectionSetFieldDof(s,ii,0,1); CHKERRQ(ierr);//One X-displacement per vertex (1 dof) ierr = PetscSectionSetFieldDof(s,ii,1,1); CHKERRQ(ierr);//One Y-displacement per vertex (1 dof) ierr = PetscSectionSetFieldDof(s,ii,2,1); CHKERRQ(ierr);//One Z-displacement per vertex (1 dof) }//Sets x, y, and z displacements as dofs. */ I only associated fields with vertices, not with any other points in the DAG. Regarding the use of DMAddBoundary(), I mostly copied the usage shown in SNES example 77. I modified the function definition to simply set the dof to 0.0 as opposed to the coordinates. Below "physicalgroups" is the DMLabel that I got from gmsh, this flags Face points, not vertices. That is why I think the error log suggests that fields were never set. ierr = DMAddBoundary(dm, DM_BC_ESSENTIAL, "fixed", physicalgroups, 1, &surfvals[ii], fieldID, 0, NULL, (void (*)(void)) coordinates, NULL, NULL, NULL); CHKERRQ(ierr); PetscErrorCode coordinates(PetscInt dim, PetscReal time, const PetscReal x[], PetscInt Nf, PetscScalar *u, void *ctx){ const PetscInt Ncomp = dim; PetscInt comp; for (comp = 0; comp < Ncomp; ++comp) u[comp] = 0.0; return 0; } ________________________________ From: Jed Brown Sent: Wednesday, January 5, 2022 12:36 AM To: Ferrand, Jesus A. Cc: petsc-users Subject: Re: [EXTERNAL] Re: [petsc-users] DM misuse causes massive memory leak? Please "reply all" include the list in the future. "Ferrand, Jesus A." writes: > Forgot to say thanks for the reply (my bad). > Yes, I was indeed forgetting to pre-allocate the sparse matrices when doing them by hand (complacency seeded by MATLAB's zeros()). Thank you, Jed, and Jeremy, for the hints! > > I have more questions, these ones about boundary conditions (I think these are for Matt). > In my current code I set Dirichlet conditions directly on a Mat by calling MatSetZeroRows(). I profiled my code and found the part that applies them to be unnacceptably slow. In response, I've been trying to better pre-allocate Mats using PetscSections. I have found documentation for PetscSectionSetDof(), PetscSectionSetNumFields(), PetscSectionSetFieldName(), and PetscSectionSetFieldDof(), to set the size of my Mats and Vecs by calling DMSetLocalSection() followed by DMCreateMatrix() and DMCreateLocalVector() to get a RHS vector. This seems faster. > > In PetscSection, what is the difference between a "field" and a "component"? For example, I could have one field "Velocity" with three components ux, uy, and uz or perhaps three fields ux, uy, and uz each with a default component? It's just how you name them and how they appear in output. Usually "velocity" is better as a field with three components, but fields with other meaning (and perhaps different finite element spaces), such as pressure, would be different fields. Different components are always in the same FE space. > I am struggling now to impose boundary conditions after constraining dofs using PetscSection. 
My understanding is that constraining dof's reduces the size of the DM's matrix but it does not give the DM knowledge of what values the constrained dofs should have, right? > > I know that there is DMAddBoundary(), but I am unsure of how to use it. From Gmsh I have a mesh with surface boundaries flagged. I'm not sure whether DMAddBoundary()will constrain the face, edge, or vertex points when I give it the DMLabel from Gmsh. (I need specific dof on the vertices to be constrained). I did some testing and I think DMAddBoundary() attempts to constrain the Face points (see error log below). I only associated fields with the vertices but not the Faces. I can extract the vertex points from the face label using DMPlexGetConeRecursiveVertices() but the output IS has repeated entries for the vertex points (many faces share the same vertex). Is there an easier way to get the vertex points from a gmsh surface tag? How did you call DMAddBoundary()? Are you using DM_BC_ESSENTIAL and a callback function that provides the inhomogeneous boundary condition? > I'm sorry this is a mouthful. > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Argument out of range > [0]PETSC ERROR: Field number 2 must be in [0, 0) It looks like you haven't added these fields yet. > [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.16.0, unknown > [0]PETSC ERROR: ./gmsh4 on a linux-c-dbg named F86 by jesus Tue Jan 4 15:19:57 2022 > [0]PETSC ERROR: Configure options --with-32bits-pci-domain=1 --with-debugging =1 > [0]PETSC ERROR: #1 DMGetField() at /home/jesus/SAND/PETSc_install/petsc/src/dm/interface/dm.c:4803 > [0]PETSC ERROR: #2 DMCompleteBoundaryLabel_Internal() at /home/jesus/SAND/PETSc_install/petsc/src/dm/interface/dm.c:5140 > [0]PETSC ERROR: #3 DMAddBoundary() at /home/jesus/SAND/PETSc_install/petsc/src/dm/interface/dm.c:8561 > [0]PETSC ERROR: #4 main() at /home/jesus/SAND/FEA/3D/gmshBACKUP4.c:215 > [0]PETSC ERROR: No PETSc Option Table entries > [0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- > > > > > > > ________________________________ > From: Jed Brown > Sent: Wednesday, December 29, 2021 5:55 PM > To: Ferrand, Jesus A. ; petsc-users at mcs.anl.gov > Subject: [EXTERNAL] Re: [petsc-users] DM misuse causes massive memory leak? > > CAUTION: This email originated outside of Embry-Riddle Aeronautical University. Do not click links or open attachments unless you recognize the sender and know the content is safe. > > > "Ferrand, Jesus A." writes: > >> Dear PETSc Team: >> >> I have a question about DM and PetscSection. Say I import a mesh (for FEM purposes) and create a DMPlex for it. I then use PetscSections to set degrees of freedom per "point" (by point I mean vertices, lines, faces, and cells). I then use PetscSectionGetStorageSize() to get the size of the global stiffness matrix (K) needed for my FEM problem. One last detail, this K I populate inside a rather large loop using an element stiffness matrix function of my own. Instead of using DMCreateMatrix(), I manually created a Mat using MatCreate(), MatSetType(), MatSetSizes(), and MatSetUp(). I come to find that said loop is painfully slow when I use the manually created matrix, but 20x faster when I use the Mat coming out of DMCreateMatrix(). 
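As the reply just below explains, the slowdown with the hand-built Mat comes from missing preallocation. A minimal sketch of how the manual path could be preallocated follows; the nonzeros-per-row figure is only a placeholder, and in practice it would be derived from the mesh connectivity, which is what DMCreateMatrix() does automatically from the local section (sizeK and ierr as in the code quoted in this thread).

/* Sketch: hand-built AIJ matrix with explicit preallocation. */
Mat      K;
PetscInt nzPerRow = 81;   /* assumed upper bound on nonzeros per row; mesh-dependent */
ierr = MatCreate(PETSC_COMM_WORLD,&K); CHKERRQ(ierr);
ierr = MatSetSizes(K,PETSC_DECIDE,PETSC_DECIDE,sizeK,sizeK); CHKERRQ(ierr);
ierr = MatSetType(K,MATAIJ); CHKERRQ(ierr);
ierr = MatSeqAIJSetPreallocation(K,nzPerRow,NULL); CHKERRQ(ierr);                /* takes effect on one rank */
ierr = MatMPIAIJSetPreallocation(K,nzPerRow,NULL,nzPerRow,NULL); CHKERRQ(ierr);  /* takes effect in parallel */
/* Assembly then proceeds with MatSetValues(...,ADD_VALUES) and MatAssemblyBegin/End,
   with no reallocations as long as the per-row estimate is not exceeded. */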
> > The sparse matrix hasn't been preallocated, which forces the data structure to do a lot of copies (as bad as O(n^2) complexity). DMCreateMatrix() preallocates for you. > > https://petsc.org/release/docs/manual/performance/#memory-allocation-for-sparse-matrix-assembly > https://petsc.org/release/docs/manual/mat/#sec-matsparse > >> My question is then: Is the manual Mat a noob mistake and is it somehow creating a memory leak with K? Just in case it's something else I'm attaching the code. The loop that populates K is between lines 221 and 278. Anything related to DM, DMPlex, and PetscSection is between lines 117 and 180. >> >> Machine Type: HP Laptop >> C-compiler: Gnu C >> OS: Ubuntu 20.04 >> PETSc version: 3.16.0 >> MPI Implementation: MPICH >> >> Hope you all had a Merry Christmas and that you will have a happy and productive New Year. :D >> >> >> Sincerely: >> >> J.A. Ferrand >> >> Embry-Riddle Aeronautical University - Daytona Beach FL >> >> M.Sc. Aerospace Engineering | May 2022 >> >> B.Sc. Aerospace Engineering >> >> B.Sc. Computational Mathematics >> >> >> >> Sigma Gamma Tau >> >> Tau Beta Pi >> >> Honors Program >> >> >> >> Phone: (386)-843-1829 >> >> Email(s): ferranj2 at my.erau.edu >> >> jesus.ferrand at gmail.com >> //REFERENCE: https://github.com/FreeFem/FreeFem-sources/blob/master/plugin/mpi/PETSc-code.hpp >> #include >> static char help[] = "Imports a Gmsh mesh with boundary conditions and solves the elasticity equation.\n" >> "Option prefix = opt_.\n"; >> >> struct preKE{//Preallocation before computing KE >> Mat matB, >> matBTCB; >> //matKE; >> PetscInt x_insert[3], >> y_insert[3], >> z_insert[3], >> m,//Looping variables. >> sizeKE,//size of the element stiffness matrix. >> N,//Number of nodes in element. >> x_in,y_in,z_in; //LI to index B matrix. >> PetscReal J[3][3],//Jacobian matrix. >> invJ[3][3],//Inverse of the Jacobian matrix. >> detJ,//Determinant of the Jacobian. >> dX[3], >> dY[3], >> dZ[3], >> minor00, >> minor01, >> minor02,//Determinants of minors in a 3x3 matrix. >> dPsidX, dPsidY, dPsidZ,//Shape function derivatives w.r.t global coordinates. >> weight,//Multiplier of quadrature weights. >> *dPsidXi,//Derivatives of shape functions w.r.t. Xi. >> *dPsidEta,//Derivatives of shape functions w.r.t. Eta. >> *dPsidZeta;//Derivatives of shape functions w.r.t Zeta. >> PetscErrorCode ierr; >> };//end struct. >> >> //Function declarations. >> extern PetscErrorCode tetra4(PetscScalar*, PetscScalar*, PetscScalar*,struct preKE*, Mat*, Mat*); >> extern PetscErrorCode ConstitutiveMatrix(Mat*,const char*,PetscInt); >> extern PetscErrorCode InitializeKEpreallocation(struct preKE*,const char*); >> >> PetscErrorCode PetscViewerVTKWriteFunction(PetscObject vec,PetscViewer viewer){ >> PetscErrorCode ierr; >> ierr = VecView((Vec)vec,viewer); CHKERRQ(ierr); >> return ierr; >> } >> >> >> >> >> int main(int argc, char **args){ >> //DEFINITIONS OF PETSC's DMPLEX LINGO: >> //POINT: A topology element (cell, face, edge, or vertex). >> //CHART: It an interval from 0 to the number of "points." (the range of admissible linear indices) >> //STRATUM: A subset of the "chart" which corresponds to all "points" at a given "level." >> //LEVEL: This is either a "depth" or a "height". >> //HEIGHT: Dimensionality of an element measured from 0D to 3D. Heights: cell = 0, face = 1, edge = 2, vertex = 3. >> //DEPTH: Dimensionality of an element measured from 3D to 0D. 
Depths: cell = 3, face = 2, edge = 1, vertex = 0; >> //CLOSURE: *of an element is the collection of all other elements that define it.I.e., the closure of a surface is the collection of vertices and edges that make it up. >> //STAR: >> //STANDARD LABELS: These are default tags that DMPlex has for its topology. ("depth") >> PetscErrorCode ierr;//Error tracking variable. >> DM dm;//Distributed memory object (useful for managing grids.) >> DMLabel physicalgroups;//Identifies user-specified tags in gmsh (to impose BC's). >> DMPolytopeType celltype;//When looping through cells, determines its type (tetrahedron, pyramid, hexahedron, etc.) >> PetscSection s; >> KSP ksp;//Krylov Sub-Space (linear solver object) >> Mat K,//Global stiffness matrix (Square, assume unsymmetric). >> KE,//Element stiffness matrix (Square, assume unsymmetric). >> matC;//Constitutive matrix. >> Vec XYZ,//Coordinate vector, contains spatial locations of mesh's vertices (NOTE: This vector self-destroys!). >> U,//Displacement vector. >> F;//Load Vector. >> PetscViewer XYZviewer,//Viewer object to output mesh to ASCII format. >> XYZpUviewer; //Viewer object to output displacements to ASCII format. >> PetscBool interpolate = PETSC_TRUE,//Instructs Gmsh importer whether to generate faces and edges (Needed when using P2 or higher elements). >> useCone = PETSC_TRUE,//Instructs "DMPlexGetTransitiveClosure()" whether to extract the closure or the star. >> dirichletBC = PETSC_FALSE,//For use when assembling the K matrix. >> neumannBC = PETSC_FALSE,//For use when assembling the F vector. >> saveASCII = PETSC_FALSE,//Whether to save results in ASCII format. >> saveVTK = PETSC_FALSE;//Whether to save results as VTK format. >> PetscInt nc,//number of cells. (PETSc lingo for "elements") >> nv,//number of vertices. (PETSc lingo for "nodes") >> nf,//number of faces. (PETSc lingo for "surfaces") >> ne,//number of edges. (PETSc lingo for "lines") >> pStart,//starting LI of global elements. >> pEnd,//ending LI of all elements. >> cStart,//starting LI for cells global arrangement. >> cEnd,//ending LI for cells in global arrangement. >> vStart,//starting LI for vertices in global arrangement. >> vEnd,//ending LI for vertices in global arrangement. >> fStart,//starting LI for faces in global arrangement. >> fEnd,//ending LI for faces in global arrangement. >> eStart,//starting LI for edges in global arrangement. >> eEnd,//ending LI for edges in global arrangement. >> sizeK,//Size of the element stiffness matrix. >> ii,jj,kk,//Dedicated looping variables. >> indexXYZ,//Variable to access the elements of XYZ vector. >> indexK,//Variable to access the elements of the U and F vectors (can reference rows and colums of K matrix.) >> *closure = PETSC_NULL,//Pointer to the closure elements of a cell. >> size_closure,//Size of the closure of a cell. >> dim,//Dimension of the mesh. >> //*edof,//Linear indices of dof's inside the K matrix. >> dof = 3,//Degrees of freedom per node. >> cells=0, edges=0, vertices=0, faces=0,//Topology counters when looping through cells. >> pinXcode=10, pinZcode=11,forceZcode=12;//Gmsh codes to extract relevant "Face Sets." >> PetscReal //*x_el,//Pointer to a vector that will store the x-coordinates of an element's vertices. >> //*y_el,//Pointer to a vector that will store the y-coordinates of an element's vertices. >> //*z_el,//Pointer to a vector that will store the z-coordinates of an element's vertices. >> *xyz_el,//Pointer to xyz array in the XYZ vector. >> traction = -10, >> *KEdata, >> t1,t2; //time keepers. 
>> const char *gmshfile = "TopOptmeshfine2.msh";//Name of gmsh file to import. >> >> ierr = PetscInitialize(&argc,&args,NULL,help); if(ierr) return ierr; //And the machine shall work.... >> >> //MESH IMPORT================================================================= >> //IMPORTANT NOTE: Gmsh only creates "cells" and "vertices," it does not create the "faces" or "edges." >> //Gmsh probably can generate them, must figure out how to. >> t1 = MPI_Wtime(); >> ierr = DMPlexCreateGmshFromFile(PETSC_COMM_WORLD,gmshfile,interpolate,&dm); CHKERRQ(ierr);//Read Gmsh file and generate the DMPlex. >> ierr = DMGetDimension(dm, &dim); CHKERRQ(ierr);//1-D, 2-D, or 3-D >> ierr = DMPlexGetChart(dm, &pStart, &pEnd); CHKERRQ(ierr);//Extracts linear indices of cells, vertices, faces, and edges. >> ierr = DMGetCoordinatesLocal(dm,&XYZ); CHKERRQ(ierr);//Extracts coordinates from mesh.(Contiguous storage: [x0,y0,z0,x1,y1,z1,...]) >> ierr = VecGetArray(XYZ,&xyz_el); CHKERRQ(ierr);//Get pointer to vector's data. >> t2 = MPI_Wtime(); >> PetscPrintf(PETSC_COMM_WORLD,"Mesh Import time: %10f\n",t2-t1); >> DMView(dm,PETSC_VIEWER_STDOUT_WORLD); >> >> //IMPORTANT NOTE: PETSc assumes that vertex-cell meshes are 2D even if they were 3D, so its ordering changes. >> //Cells remain at height 0, but vertices move to height 1 from height 3. To prevent this from becoming an issue >> //the "interpolate" variable is set to PETSC_TRUE to tell the mesh importer to generate faces and edges. >> //PETSc, therefore, technically does additional meshing. Gotta figure out how to get this from Gmsh directly. >> ierr = DMPlexGetDepthStratum(dm,3, &cStart, &cEnd);//Get LI of cells. >> ierr = DMPlexGetDepthStratum(dm,2, &fStart, &fEnd);//Get LI of faces >> ierr = DMPlexGetDepthStratum(dm,1, &eStart, &eEnd);//Get LI of edges. >> ierr = DMPlexGetDepthStratum(dm,0, &vStart, &vEnd);//Get LI of vertices. >> ierr = DMGetStratumSize(dm,"depth", 3, &nc);//Get number of cells. >> ierr = DMGetStratumSize(dm,"depth", 2, &nf);//Get number of faces. >> ierr = DMGetStratumSize(dm,"depth", 1, &ne);//Get number of edges. >> ierr = DMGetStratumSize(dm,"depth", 0, &nv);//Get number of vertices. >> /* >> PetscPrintf(PETSC_COMM_WORLD,"global start = %10d\t global end = %10d\n",pStart,pEnd); >> PetscPrintf(PETSC_COMM_WORLD,"#cells = %10d\t i = %10d\t i < %10d\n",nc,cStart,cEnd); >> PetscPrintf(PETSC_COMM_WORLD,"#faces = %10d\t i = %10d\t i < %10d\n",nf,fStart,fEnd); >> PetscPrintf(PETSC_COMM_WORLD,"#edges = %10d\t i = %10d\t i < %10d\n",ne,eStart,eEnd); >> PetscPrintf(PETSC_COMM_WORLD,"#vertices = %10d\t i = %10d\t i < %10d\n",nv,vStart,vEnd); >> */ >> //MESH IMPORT================================================================= >> >> //NOTE: This section extremely hardcoded right now. >> //Current setup would only support P1 meshes. >> //MEMORY ALLOCATION ========================================================== >> ierr = PetscSectionCreate(PETSC_COMM_WORLD, &s); CHKERRQ(ierr); >> //The chart is akin to a contiguous memory storage allocation. Each chart entry is associated >> //with a "thing," could be a vertex, face, cell, or edge, or anything else. >> ierr = PetscSectionSetChart(s, pStart, pEnd); CHKERRQ(ierr); >> //For each "thing" in the chart, additional room can be made. This is helpful for associating >> //nodes to multiple degrees of freedom. These commands help associate nodes with >> for(ii = cStart; ii < cEnd; ii++){//Cell loop. >> ierr = PetscSectionSetDof(s, ii, 0);CHKERRQ(ierr);}//NOTE: Currently no dof's associated with cells. 
>> for(ii = fStart; ii < fEnd; ii++){//Face loop. >> ierr = PetscSectionSetDof(s, ii, 0);CHKERRQ(ierr);}//NOTE: Currently no dof's associated with faces. >> for(ii = vStart; ii < vEnd; ii++){//Vertex loop. >> ierr = PetscSectionSetDof(s, ii, dof);CHKERRQ(ierr);}//Sets x, y, and z displacements as dofs. >> for(ii = eStart; ii < eEnd; ii++){//Edge loop >> ierr = PetscSectionSetDof(s, ii, 0);CHKERRQ(ierr);}//NOTE: Currently no dof's associated with edges. >> ierr = PetscSectionSetUp(s); CHKERRQ(ierr); >> ierr = PetscSectionGetStorageSize(s,&sizeK);CHKERRQ(ierr);//Determine the size of the global stiffness matrix. >> ierr = DMSetLocalSection(dm,s); CHKERRQ(ierr);//Associate the PetscSection with the DM object. >> //PetscErrorCode DMCreateGlobalVector(DM dm,Vec *vec) >> //ierr = DMCreateGlobalVector(dm,&U); CHKERRQ(ierr); >> PetscSectionDestroy(&s); >> //PetscPrintf(PETSC_COMM_WORLD,"sizeK = %10d\n",sizeK); >> >> //OBJECT SETUP================================================================ >> //Global stiffness matrix. >> //PetscErrorCode DMCreateMatrix(DM dm,Mat *mat) >> >> //This makes the loop fast. >> ierr = DMCreateMatrix(dm,&K); >> >> //This makes the loop uber slow. >> //ierr = MatCreate(PETSC_COMM_WORLD,&K); CHKERRQ(ierr); >> //ierr = MatSetType(K,MATAIJ); CHKERRQ(ierr);// Global stiffness matrix set to some sparse type. >> //ierr = MatSetSizes(K,PETSC_DECIDE,PETSC_DECIDE,sizeK,sizeK); CHKERRQ(ierr); >> //ierr = MatSetUp(K); CHKERRQ(ierr); >> >> //Displacement vector. >> ierr = VecCreate(PETSC_COMM_WORLD,&U); CHKERRQ(ierr); >> ierr = VecSetType(U,VECSTANDARD); CHKERRQ(ierr);// Global stiffness matrix set to some sparse type. >> ierr = VecSetSizes(U,PETSC_DECIDE,sizeK); CHKERRQ(ierr); >> >> //Load vector. >> ierr = VecCreate(PETSC_COMM_WORLD,&F); CHKERRQ(ierr); >> ierr = VecSetType(F,VECSTANDARD); CHKERRQ(ierr);// Global stiffness matrix set to some sparse type. >> ierr = VecSetSizes(F,PETSC_DECIDE,sizeK); CHKERRQ(ierr); >> //OBJECT SETUP================================================================ >> >> //WARNING: This loop is currently hardcoded for P1 elements only! Must Figure >> //out a clever way to modify to accomodate Pn (n>1) elements. >> >> //BEGIN GLOBAL STIFFNESS MATRIX BUILDER======================================= >> t1 = MPI_Wtime(); >> >> //PREALLOCATIONS============================================================== >> ierr = ConstitutiveMatrix(&matC,"isotropic",0); CHKERRQ(ierr); >> struct preKE preKEtetra4; >> ierr = InitializeKEpreallocation(&preKEtetra4,"tetra4"); CHKERRQ(ierr); >> ierr = MatCreate(PETSC_COMM_WORLD,&KE); CHKERRQ(ierr); //SEQUENTIAL >> ierr = MatSetSizes(KE,PETSC_DECIDE,PETSC_DECIDE,12,12); CHKERRQ(ierr); //SEQUENTIAL >> ierr = MatSetType(KE,MATDENSE); CHKERRQ(ierr); //SEQUENTIAL >> ierr = MatSetUp(KE); CHKERRQ(ierr); >> PetscReal x_tetra4[4], y_tetra4[4],z_tetra4[4], >> x_hex8[8], y_hex8[8],z_hex8[8], >> *x,*y,*z; >> PetscInt *EDOF,edof_tetra4[12],edof_hex8[24]; >> DMPolytopeType previous = DM_POLYTOPE_UNKNOWN; >> //PREALLOCATIONS============================================================== >> >> >> >> for(ii=cStart;ii> ierr = DMPlexGetTransitiveClosure(dm, ii, useCone, &size_closure, &closure); CHKERRQ(ierr); >> ierr = DMPlexGetCellType(dm, ii, &celltype); CHKERRQ(ierr); >> //IMPORTANT NOTE: MOST OF THIS LOOP SHOULD BE INCLUDED IN THE KE3D function. 
>> if(previous != celltype){ >> //PetscPrintf(PETSC_COMM_WORLD,"run \n"); >> if(celltype == DM_POLYTOPE_TETRAHEDRON){ >> x = x_tetra4; >> y = y_tetra4; >> z = z_tetra4; >> EDOF = edof_tetra4; >> }//end if. >> else if(celltype == DM_POLYTOPE_HEXAHEDRON){ >> x = x_hex8; >> y = y_hex8; >> z = z_hex8; >> EDOF = edof_hex8; >> }//end else if. >> } >> previous = celltype; >> >> //PetscPrintf(PETSC_COMM_WORLD,"Cell # %4i\t",ii); >> cells=0; >> edges=0; >> vertices=0; >> faces=0; >> kk = 0; >> for(jj=0;jj<(2*size_closure);jj+=2){//Scan the closure of the current cell. >> //Use information from the DM's strata to determine composition of cell_ii. >> if(vStart <= closure[jj] && closure[jj]< vEnd){//Check for vertices. >> //PetscPrintf(PETSC_COMM_WORLD,"%5i\t",closure[jj]); >> indexXYZ = dim*(closure[jj]-vStart);//Linear index of x-coordinate in the xyz_el array. >> >> *(x+vertices) = xyz_el[indexXYZ]; >> *(y+vertices) = xyz_el[indexXYZ+1];//Extract Y-coordinates of the current vertex. >> *(z+vertices) = xyz_el[indexXYZ+2];//Extract Y-coordinates of the current vertex. >> *(EDOF + kk) = indexXYZ; >> *(EDOF + kk+1) = indexXYZ+1; >> *(EDOF + kk+2) = indexXYZ+2; >> kk+=3; >> vertices++;//Update vertex counter. >> }//end if >> else if(eStart <= closure[jj] && closure[jj]< eEnd){//Check for edge ID's >> edges++; >> }//end else ifindexK >> else if(fStart <= closure[jj] && closure[jj]< fEnd){//Check for face ID's >> faces++; >> }//end else if >> else if(cStart <= closure[jj] && closure[jj]< cEnd){//Check for cell ID's >> cells++; >> }//end else if >> }//end "jj" loop. >> ierr = tetra4(x,y,z,&preKEtetra4,&matC,&KE); CHKERRQ(ierr); //Generate the element stiffness matrix for this cell. >> ierr = MatDenseGetArray(KE,&KEdata); CHKERRQ(ierr); >> ierr = MatSetValues(K,12,EDOF,12,EDOF,KEdata,ADD_VALUES); CHKERRQ(ierr);//WARNING: HARDCODED FOR TETRAHEDRAL P1 ELEMENTS ONLY !!!!!!!!!!!!!!!!!!!!!!! >> ierr = MatDenseRestoreArray(KE,&KEdata); CHKERRQ(ierr); >> ierr = DMPlexRestoreTransitiveClosure(dm, ii,useCone, &size_closure, &closure); CHKERRQ(ierr); >> }//end "ii" loop. >> ierr = MatAssemblyBegin(K,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >> ierr = MatAssemblyEnd(K,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >> //ierr = MatView(K,PETSC_VIEWER_DRAW_WORLD); CHKERRQ(ierr); >> //END GLOBAL STIFFNESS MATRIX BUILDER=========================================== >> t2 = MPI_Wtime(); >> PetscPrintf(PETSC_COMM_WORLD,"K build time: %10f\n",t2-t1); >> >> >> >> >> >> >> >> >> t1 = MPI_Wtime(); >> //BEGIN BOUNDARY CONDITION ENFORCEMENT========================================== >> IS TrianglesIS, physicalsurfaceID;//, VerticesIS; >> PetscInt numsurfvals, >> //numRows, >> dof_offset,numTri; >> const PetscInt *surfvals, >> //*pinZID, >> *TriangleID; >> PetscScalar diag =1; >> PetscReal area,force; >> //NOTE: Petsc can read/assign labels. Eeach label may posses multiple "values." >> //These values act as tags within a tag. >> //IMPORTANT NOTE: The below line needs a safety. If a mesh that does not feature >> //face sets is imported, the code in its current state will crash!!!. This is currently >> //hardcoded for the test mesh. >> ierr = DMGetLabel(dm, "Face Sets", &physicalgroups); CHKERRQ(ierr);//Inspects Physical surface groups defined by gmsh (if any). >> ierr = DMLabelGetValueIS(physicalgroups, &physicalsurfaceID); CHKERRQ(ierr);//Gets the physical surface ID's defined in gmsh (as specified in the .geo file). >> ierr = ISGetIndices(physicalsurfaceID,&surfvals); CHKERRQ(ierr);//Get a pointer to the actual surface values. 
>> ierr = DMLabelGetNumValues(physicalgroups, &numsurfvals); CHKERRQ(ierr);//Gets the number of different values that the label assigns. >> for(ii=0;ii> //PetscPrintf(PETSC_COMM_WORLD,"Values = %5i\n",surfvals[ii]); >> //PROBLEM: The surface values are hardcoded in the gmsh file. We need to adopt standard "codes" >> //that we can give to users when they make their meshes so that this code recognizes the Type >> // of boundary conditions that are to be imposed. >> if(surfvals[ii] == pinXcode){ >> dof_offset = 0; >> dirichletBC = PETSC_TRUE; >> }//end if. >> else if(surfvals[ii] == pinZcode){ >> dof_offset = 2; >> dirichletBC = PETSC_TRUE; >> }//end else if. >> else if(surfvals[ii] == forceZcode){ >> dof_offset = 2; >> neumannBC = PETSC_TRUE; >> }//end else if. >> >> ierr = DMLabelGetStratumIS(physicalgroups, surfvals[ii], &TrianglesIS); CHKERRQ(ierr);//Get the ID's (as an IS) of the surfaces belonging to value 11. >> //PROBLEM: DMPlexGetConeRecursiveVertices returns an array with repeated node ID's. For each repetition, the lines that enforce BC's unnecessarily re-run. >> ierr = ISGetSize(TrianglesIS,&numTri); CHKERRQ(ierr); >> ierr = ISGetIndices(TrianglesIS,&TriangleID); CHKERRQ(ierr);//Get a pointer to the actual surface values. >> for(kk=0;kk> ierr = DMPlexGetTransitiveClosure(dm, TriangleID[kk], useCone, &size_closure, &closure); CHKERRQ(ierr); >> if(neumannBC){ >> ierr = DMPlexComputeCellGeometryFVM(dm, TriangleID[kk], &area,PETSC_NULL,PETSC_NULL); CHKERRQ(ierr); >> force = traction*area/3;//WARNING: The 3 here is hardcoded for a purely tetrahedral mesh only!!!!!!!!!! >> } >> for(jj=0;jj<(2*size_closure);jj+=2){ >> //PetscErrorCode DMPlexComputeCellGeometryFVM(DM dm, PetscInt cell, PetscReal *vol, PetscReal centroid[], PetscReal normal[]) >> if(vStart <= closure[jj] && closure[jj]< vEnd){//Check for vertices. >> indexK = dof*(closure[jj] - vStart) + dof_offset; //Compute the dof ID's in the K matrix. >> if(dirichletBC){//Boundary conditions requiring an edit of K matrix. >> ierr = MatZeroRows(K,1,&indexK,diag,NULL,NULL); CHKERRQ(ierr); >> }//end if. >> else if(neumannBC){//Boundary conditions requiring an edit of RHS vector. >> ierr = VecSetValue(F,indexK,force,ADD_VALUES); CHKERRQ(ierr); >> }// end else if. >> }//end if. >> }//end "jj" loop. >> ierr = DMPlexRestoreTransitiveClosure(dm, closure[jj],useCone, &size_closure, &closure); CHKERRQ(ierr); >> }//end "kk" loop. >> ierr = ISRestoreIndices(TrianglesIS,&TriangleID); CHKERRQ(ierr); >> >> /* >> ierr = DMPlexGetConeRecursiveVertices(dm, TrianglesIS, &VerticesIS); CHKERRQ(ierr);//Get the ID's (as an IS) of the vertices that make up the surfaces of value 11. >> ierr = ISGetSize(VerticesIS,&numRows); CHKERRQ(ierr);//Get number of flagged vertices (this includes repeated indices for faces that share nodes). >> ierr = ISGetIndices(VerticesIS,&pinZID); CHKERRQ(ierr);//Get a pointer to the actual surface values. >> if(dirichletBC){//Boundary conditions requiring an edit of K matrix. >> for(kk=0;kk> indexK = 3*(pinZID[kk] - vStart) + dof_offset; //Compute the dof ID's in the K matrix. (NOTE: the 3* ishardcoded for 3 degrees of freedom, tie this to a variable in the FUTURE.) >> ierr = MatZeroRows(K,1,&indexK,diag,NULL,NULL); CHKERRQ(ierr); >> }//end "kk" loop. >> }//end if. >> else if(neumannBC){//Boundary conditions requiring an edit of RHS vector. >> for(kk=0;kk> indexK = 3*(pinZID[kk] - vStart) + dof_offset; >> ierr = VecSetValue(F,indexK,traction,INSERT_VALUES); CHKERRQ(ierr); >> }//end "kk" loop. >> }// end else if. 
>> ierr = ISRestoreIndices(VerticesIS,&pinZID); CHKERRQ(ierr); >> */ >> dirichletBC = PETSC_FALSE; >> neumannBC = PETSC_FALSE; >> }//end "ii" loop. >> ierr = ISRestoreIndices(physicalsurfaceID,&surfvals); CHKERRQ(ierr); >> //ierr = ISRestoreIndices(VerticesIS,&pinZID); CHKERRQ(ierr); >> ierr = ISDestroy(&physicalsurfaceID); CHKERRQ(ierr); >> //ierr = ISDestroy(&VerticesIS); CHKERRQ(ierr); >> ierr = ISDestroy(&TrianglesIS); CHKERRQ(ierr); >> //END BOUNDARY CONDITION ENFORCEMENT============================================ >> t2 = MPI_Wtime(); >> PetscPrintf(PETSC_COMM_WORLD,"BC imposition time: %10f\n",t2-t1); >> >> /* >> PetscInt kk = 0; >> for(ii=vStart;ii> kk++; >> PetscPrintf(PETSC_COMM_WORLD,"Vertex #%4i\t x = %10.9f\ty = %10.9f\tz = %10.9f\n",ii,xyz_el[3*kk],xyz_el[3*kk+1],xyz_el[3*kk+2]); >> }// end "ii" loop. >> */ >> >> t1 = MPI_Wtime(); >> //SOLVER======================================================================== >> ierr = KSPCreate(PETSC_COMM_WORLD,&ksp); CHKERRQ(ierr); >> ierr = KSPSetOperators(ksp,K,K); CHKERRQ(ierr); >> ierr = KSPSetFromOptions(ksp); CHKERRQ(ierr); >> ierr = KSPSolve(ksp,F,U); CHKERRQ(ierr); >> t2 = MPI_Wtime(); >> //ierr = KSPView(ksp,PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr); >> //SOLVER======================================================================== >> t2 = MPI_Wtime(); >> PetscPrintf(PETSC_COMM_WORLD,"Solver time: %10f\n",t2-t1); >> ierr = VecRestoreArray(XYZ,&xyz_el); CHKERRQ(ierr);//Get pointer to vector's data. >> >> //BEGIN MAX/MIN DISPLACEMENTS=================================================== >> IS ISux,ISuy,ISuz; >> Vec UX,UY,UZ; >> PetscReal UXmax,UYmax,UZmax,UXmin,UYmin,UZmin; >> ierr = ISCreateStride(PETSC_COMM_WORLD,nv,0,3,&ISux); CHKERRQ(ierr); >> ierr = ISCreateStride(PETSC_COMM_WORLD,nv,1,3,&ISuy); CHKERRQ(ierr); >> ierr = ISCreateStride(PETSC_COMM_WORLD,nv,2,3,&ISuz); CHKERRQ(ierr); >> >> //PetscErrorCode VecGetSubVector(Vec X,IS is,Vec *Y) >> ierr = VecGetSubVector(U,ISux,&UX); CHKERRQ(ierr); >> ierr = VecGetSubVector(U,ISuy,&UY); CHKERRQ(ierr); >> ierr = VecGetSubVector(U,ISuz,&UZ); CHKERRQ(ierr); >> >> //PetscErrorCode VecMax(Vec x,PetscInt *p,PetscReal *val) >> ierr = VecMax(UX,PETSC_NULL,&UXmax); CHKERRQ(ierr); >> ierr = VecMax(UY,PETSC_NULL,&UYmax); CHKERRQ(ierr); >> ierr = VecMax(UZ,PETSC_NULL,&UZmax); CHKERRQ(ierr); >> >> ierr = VecMin(UX,PETSC_NULL,&UXmin); CHKERRQ(ierr); >> ierr = VecMin(UY,PETSC_NULL,&UYmin); CHKERRQ(ierr); >> ierr = VecMin(UZ,PETSC_NULL,&UZmin); CHKERRQ(ierr); >> >> PetscPrintf(PETSC_COMM_WORLD,"%10f\t <= ux <= %10f\n",UXmin,UXmax); >> PetscPrintf(PETSC_COMM_WORLD,"%10f\t <= uy <= %10f\n",UYmin,UYmax); >> PetscPrintf(PETSC_COMM_WORLD,"%10f\t <= uz <= %10f\n",UZmin,UZmax); >> >> >> >> >> //BEGIN OUTPUT SOLUTION========================================================= >> if(saveASCII){ >> ierr = PetscViewerASCIIOpen(PETSC_COMM_WORLD,"XYZ.txt",&XYZviewer); >> ierr = VecView(XYZ,XYZviewer); CHKERRQ(ierr); >> ierr = PetscViewerASCIIOpen(PETSC_COMM_WORLD,"U.txt",&XYZpUviewer); >> ierr = VecView(U,XYZpUviewer); CHKERRQ(ierr); >> PetscViewerDestroy(&XYZviewer); PetscViewerDestroy(&XYZpUviewer); >> >> }//end if. 
>> if(saveVTK){ >> const char *meshfile = "starting_mesh.vtk", >> *deformedfile = "deformed_mesh.vtk"; >> ierr = PetscViewerVTKOpen(PETSC_COMM_WORLD,meshfile,FILE_MODE_WRITE,&XYZviewer); CHKERRQ(ierr); >> //PetscErrorCode DMSetAuxiliaryVec(DM dm, DMLabel label, PetscInt value, Vec aux) >> DMLabel UXlabel,UYlabel, UZlabel; >> //PetscErrorCode DMLabelCreate(MPI_Comm comm, const char name[], DMLabel *label) >> ierr = DMLabelCreate(PETSC_COMM_WORLD, "X-Displacement", &UXlabel); CHKERRQ(ierr); >> ierr = DMLabelCreate(PETSC_COMM_WORLD, "Y-Displacement", &UYlabel); CHKERRQ(ierr); >> ierr = DMLabelCreate(PETSC_COMM_WORLD, "Z-Displacement", &UZlabel); CHKERRQ(ierr); >> ierr = DMSetAuxiliaryVec(dm,UXlabel, 1, UX); CHKERRQ(ierr); >> ierr = DMSetAuxiliaryVec(dm,UYlabel, 1, UY); CHKERRQ(ierr); >> ierr = DMSetAuxiliaryVec(dm,UZlabel, 1, UZ); CHKERRQ(ierr); >> //PetscErrorCode PetscViewerVTKAddField(PetscViewer viewer,PetscObject dm,PetscErrorCode (*PetscViewerVTKWriteFunction)(PetscObject,PetscViewer),PetscInt fieldnum,PetscViewerVTKFieldType fieldtype,PetscBool checkdm,PetscObject vec) >> >> >> >> //ierr = PetscViewerVTKAddField(XYZviewer, dm,PetscErrorCode (*PetscViewerVTKWriteFunction)(Vec,PetscViewer),PETSC_DEFAULT,PETSC_VTK_POINT_FIELD,PETSC_FALSE,UX); >> ierr = PetscViewerVTKAddField(XYZviewer, (PetscObject)dm,&PetscViewerVTKWriteFunction,PETSC_DEFAULT,PETSC_VTK_POINT_FIELD,PETSC_FALSE,(PetscObject)UX); >> >> >> ierr = DMPlexVTKWriteAll((PetscObject)dm, XYZviewer); CHKERRQ(ierr); >> ierr = VecAXPY(XYZ,1,U); CHKERRQ(ierr);//Add displacement field to the mesh coordinates to deform. >> ierr = PetscViewerVTKOpen(PETSC_COMM_WORLD,deformedfile,FILE_MODE_WRITE,&XYZpUviewer); CHKERRQ(ierr); >> ierr = DMPlexVTKWriteAll((PetscObject)dm, XYZpUviewer); CHKERRQ(ierr);// >> PetscViewerDestroy(&XYZviewer); PetscViewerDestroy(&XYZpUviewer); >> >> }//end else if. >> else{ >> ierr = PetscPrintf(PETSC_COMM_WORLD,"No output format specified! Files not saved.\n"); CHKERRQ(ierr); >> }//end else. >> >> >> //END OUTPUT SOLUTION=========================================================== >> VecDestroy(&UX); ISDestroy(&ISux); >> VecDestroy(&UY); ISDestroy(&ISuy); >> VecDestroy(&UZ); ISDestroy(&ISuz); >> //END MAX/MIN DISPLACEMENTS===================================================== >> >> //CLEANUP===================================================================== >> DMDestroy(&dm); >> KSPDestroy(&ksp); >> MatDestroy(&K); MatDestroy(&KE); MatDestroy(&matC); //MatDestroy(preKEtetra4.matB); MatDestroy(preKEtetra4.matBTCB); >> VecDestroy(&U); VecDestroy(&F); >> >> //DMLabelDestroy(&physicalgroups);//Destroyig the DM destroys the label. >> //CLEANUP===================================================================== >> //PetscErrorCode PetscMallocDump(FILE *fp) >> //ierr = PetscMallocDump(NULL); >> return PetscFinalize();//And the machine shall rest.... >> }//end main. >> >> PetscErrorCode tetra4(PetscScalar* X,PetscScalar* Y, PetscScalar* Z,struct preKE *P, Mat* matC, Mat* KE){ >> //INPUTS: >> //X: Global X coordinates of the elemental nodes. >> //Y: Global Y coordinates of the elemental nodes. >> //Z: Global Z coordinates of the elemental nodes. >> //J: Jacobian matrix. >> //invJ: Inverse Jacobian matrix. 
>> PetscErrorCode ierr; >> //For current quadrature point, get dPsi/dXi_i Xi_i = {Xi,Eta,Zeta} >> /* >> P->dPsidXi[0] = +1.; P->dPsidEta[0] = 0.0; P->dPsidZeta[0] = 0.0; >> P->dPsidXi[1] = 0.0; P->dPsidEta[1] = +1.; P->dPsidZeta[1] = 0.0; >> P->dPsidXi[2] = 0.0; P->dPsidEta[2] = 0.0; P->dPsidZeta[2] = +1.; >> P->dPsidXi[3] = -1.; P->dPsidEta[3] = -1.; P->dPsidZeta[3] = -1.; >> */ >> //Populate the Jacobian matrix. >> P->J[0][0] = X[0] - X[3]; >> P->J[0][1] = Y[0] - Y[3]; >> P->J[0][2] = Z[0] - Z[3]; >> P->J[1][0] = X[1] - X[3]; >> P->J[1][1] = Y[1] - Y[3]; >> P->J[1][2] = Z[1] - Z[3]; >> P->J[2][0] = X[2] - X[3]; >> P->J[2][1] = Y[2] - Y[3]; >> P->J[2][2] = Z[2] - Z[3]; >> >> //Determinant of the 3x3 Jacobian. (Expansion along 1st row). >> P->minor00 = P->J[1][1]*P->J[2][2] - P->J[2][1]*P->J[1][2];//Reuse when finding InvJ. >> P->minor01 = P->J[1][0]*P->J[2][2] - P->J[2][0]*P->J[1][2];//Reuse when finding InvJ. >> P->minor02 = P->J[1][0]*P->J[2][1] - P->J[2][0]*P->J[1][1];//Reuse when finding InvJ. >> P->detJ = P->J[0][0]*P->minor00 - P->J[0][1]*P->minor01 + P->J[0][2]*P->minor02; >> //Inverse of the 3x3 Jacobian >> P->invJ[0][0] = +P->minor00/P->detJ;//Reuse precomputed minor. >> P->invJ[0][1] = -(P->J[0][1]*P->J[2][2] - P->J[0][2]*P->J[2][1])/P->detJ; >> P->invJ[0][2] = +(P->J[0][1]*P->J[1][2] - P->J[1][1]*P->J[0][2])/P->detJ; >> P->invJ[1][0] = -P->minor01/P->detJ;//Reuse precomputed minor. >> P->invJ[1][1] = +(P->J[0][0]*P->J[2][2] - P->J[0][2]*P->J[2][0])/P->detJ; >> P->invJ[1][2] = -(P->J[0][0]*P->J[1][2] - P->J[1][0]*P->J[0][2])/P->detJ; >> P->invJ[2][0] = +P->minor02/P->detJ;//Reuse precomputed minor. >> P->invJ[2][1] = -(P->J[0][0]*P->J[2][1] - P->J[0][1]*P->J[2][0])/P->detJ; >> P->invJ[2][2] = +(P->J[0][0]*P->J[1][1] - P->J[0][1]*P->J[1][0])/P->detJ; >> >> //*****************STRAIN MATRIX (B)************************************** >> for(P->m=0;P->mN;P->m++){//Scan all shape functions. >> >> P->x_in = 0 + P->m*3;//Every 3rd column starting at 0 >> P->y_in = P->x_in +1;//Every 3rd column starting at 1 >> P->z_in = P->y_in +1;//Every 3rd column starting at 2 >> >> P->dX[0] = P->invJ[0][0]*P->dPsidXi[P->m] + P->invJ[0][1]*P->dPsidEta[P->m] + P->invJ[0][2]*P->dPsidZeta[P->m]; >> P->dY[0] = P->invJ[1][0]*P->dPsidXi[P->m] + P->invJ[1][1]*P->dPsidEta[P->m] + P->invJ[1][2]*P->dPsidZeta[P->m]; >> P->dZ[0] = P->invJ[2][0]*P->dPsidXi[P->m] + P->invJ[2][1]*P->dPsidEta[P->m] + P->invJ[2][2]*P->dPsidZeta[P->m]; >> >> P->dX[1] = P->dZ[0]; P->dX[2] = P->dY[0]; >> P->dY[1] = P->dZ[0]; P->dY[2] = P->dX[0]; >> P->dZ[1] = P->dX[0]; P->dZ[2] = P->dY[0]; >> >> ierr = MatSetValues(P->matB,3,P->x_insert,1,&(P->x_in),P->dX,INSERT_VALUES); CHKERRQ(ierr); >> ierr = MatSetValues(P->matB,3,P->y_insert,1,&(P->y_in),P->dY,INSERT_VALUES); CHKERRQ(ierr); >> ierr = MatSetValues(P->matB,3,P->z_insert,1,&(P->z_in),P->dZ,INSERT_VALUES); CHKERRQ(ierr); >> >> }//end "m" loop. >> ierr = MatAssemblyBegin(P->matB,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >> ierr = MatAssemblyEnd(P->matB,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >> //*****************STRAIN MATRIX (B)************************************** >> >> //Compute the matrix product B^t*C*B, scale it by the quadrature weights and add to KE. 
>> P->weight = -P->detJ/6; >> >> ierr = MatZeroEntries(*KE); CHKERRQ(ierr); >> ierr = MatPtAP(*matC,P->matB,MAT_INITIAL_MATRIX,PETSC_DEFAULT,&(P->matBTCB));CHKERRQ(ierr); >> ierr = MatScale(P->matBTCB,P->weight); CHKERRQ(ierr); >> ierr = MatAssemblyBegin(P->matBTCB,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >> ierr = MatAssemblyEnd(P->matBTCB,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >> ierr = MatAXPY(*KE,1,P->matBTCB,DIFFERENT_NONZERO_PATTERN); CHKERRQ(ierr);//Add contribution of current quadrature point to KE. >> >> //ierr = MatPtAP(*matC,P->matB,MAT_INITIAL_MATRIX,PETSC_DEFAULT,KE);CHKERRQ(ierr); >> //ierr = MatScale(*KE,P->weight); CHKERRQ(ierr); >> >> ierr = MatAssemblyBegin(*KE,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >> ierr = MatAssemblyEnd(*KE,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >> >> //Cleanup >> return ierr; >> }//end tetra4. >> >> PetscErrorCode ConstitutiveMatrix(Mat *matC,const char* type,PetscInt materialID){ >> PetscErrorCode ierr; >> PetscBool isotropic = PETSC_FALSE, >> orthotropic = PETSC_FALSE; >> //PetscErrorCode PetscStrcmp(const char a[],const char b[],PetscBool *flg) >> ierr = PetscStrcmp(type,"isotropic",&isotropic); >> ierr = PetscStrcmp(type,"orthotropic",&orthotropic); >> ierr = MatCreate(PETSC_COMM_WORLD,matC); CHKERRQ(ierr); >> ierr = MatSetSizes(*matC,PETSC_DECIDE,PETSC_DECIDE,6,6); CHKERRQ(ierr); >> ierr = MatSetType(*matC,MATAIJ); CHKERRQ(ierr); >> ierr = MatSetUp(*matC); CHKERRQ(ierr); >> >> if(isotropic){ >> PetscReal E,nu, M,L,vals[3]; >> switch(materialID){ >> case 0://Hardcoded properties for isotropic material #0 >> E = 200; >> nu = 1./3; >> break; >> case 1://Hardcoded properties for isotropic material #1 >> E = 96; >> nu = 1./3; >> break; >> }//end switch. >> M = E/(2*(1+nu)),//Lame's constant 1 ("mu"). >> L = E*nu/((1+nu)*(1-2*nu));//Lame's constant 2 ("lambda"). >> //PetscErrorCode MatSetValues(Mat mat,PetscInt m,const PetscInt idxm[],PetscInt n,const PetscInt idxn[],const PetscScalar v[],InsertMode addv) >> PetscInt idxn[3] = {0,1,2}; >> vals[0] = L+2*M; vals[1] = L; vals[2] = vals[1]; >> ierr = MatSetValues(*matC,1,&idxn[0],3,idxn,vals,INSERT_VALUES); CHKERRQ(ierr); >> vals[1] = vals[0]; vals[0] = vals[2]; >> ierr = MatSetValues(*matC,1,&idxn[1],3,idxn,vals,INSERT_VALUES); CHKERRQ(ierr); >> vals[2] = vals[1]; vals[1] = vals[0]; >> ierr = MatSetValues(*matC,1,&idxn[2],3,idxn,vals,INSERT_VALUES); CHKERRQ(ierr); >> ierr = MatSetValue(*matC,3,3,M,INSERT_VALUES); CHKERRQ(ierr); >> ierr = MatSetValue(*matC,4,4,M,INSERT_VALUES); CHKERRQ(ierr); >> ierr = MatSetValue(*matC,5,5,M,INSERT_VALUES); CHKERRQ(ierr); >> }//end if. >> /* >> else if(orthotropic){ >> switch(materialID){ >> case 0: >> break; >> case 1: >> break; >> }//end switch. >> }//end else if. >> */ >> ierr = MatAssemblyBegin(*matC,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >> ierr = MatAssemblyEnd(*matC,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >> //MatView(*matC,0); >> return ierr; >> }//End ConstitutiveMatrix >> >> PetscErrorCode InitializeKEpreallocation(struct preKE *P,const char* type){ >> PetscErrorCode ierr; >> PetscBool istetra4 = PETSC_FALSE, >> ishex8 = PETSC_FALSE; >> ierr = PetscStrcmp(type,"tetra4",&istetra4); CHKERRQ(ierr); >> ierr = PetscStrcmp(type,"hex8",&ishex8); CHKERRQ(ierr); >> if(istetra4){ >> P->sizeKE = 12; >> P->N = 4; >> }//end if. >> else if(ishex8){ >> P->sizeKE = 24; >> P->N = 8; >> }//end else if. 
>> >> >> P->x_insert[0] = 0; P->x_insert[1] = 3; P->x_insert[2] = 5; >> P->y_insert[0] = 1; P->y_insert[1] = 4; P->y_insert[2] = 5; >> P->z_insert[0] = 2; P->z_insert[1] = 3; P->z_insert[2] = 4; >> //Allocate memory for the differentiated shape function vectors. >> ierr = PetscMalloc1(P->N,&(P->dPsidXi)); CHKERRQ(ierr); >> ierr = PetscMalloc1(P->N,&(P->dPsidEta)); CHKERRQ(ierr); >> ierr = PetscMalloc1(P->N,&(P->dPsidZeta)); CHKERRQ(ierr); >> >> P->dPsidXi[0] = +1.; P->dPsidEta[0] = 0.0; P->dPsidZeta[0] = 0.0; >> P->dPsidXi[1] = 0.0; P->dPsidEta[1] = +1.; P->dPsidZeta[1] = 0.0; >> P->dPsidXi[2] = 0.0; P->dPsidEta[2] = 0.0; P->dPsidZeta[2] = +1.; >> P->dPsidXi[3] = -1.; P->dPsidEta[3] = -1.; P->dPsidZeta[3] = -1.; >> >> >> //Strain matrix. >> ierr = MatCreate(PETSC_COMM_WORLD,&(P->matB)); CHKERRQ(ierr); >> ierr = MatSetSizes(P->matB,PETSC_DECIDE,PETSC_DECIDE,6,P->sizeKE); CHKERRQ(ierr);//Hardcoded >> ierr = MatSetType(P->matB,MATAIJ); CHKERRQ(ierr); >> ierr = MatSetUp(P->matB); CHKERRQ(ierr); >> >> //Contribution matrix. >> ierr = MatCreate(PETSC_COMM_WORLD,&(P->matBTCB)); CHKERRQ(ierr); >> ierr = MatSetSizes(P->matBTCB,PETSC_DECIDE,PETSC_DECIDE,P->sizeKE,P->sizeKE); CHKERRQ(ierr); >> ierr = MatSetType(P->matBTCB,MATAIJ); CHKERRQ(ierr); >> ierr = MatSetUp(P->matBTCB); CHKERRQ(ierr); >> >> //Element stiffness matrix. >> //ierr = MatCreateSeqDense(PETSC_COMM_SELF,12,12,NULL,&KE); CHKERRQ(ierr); //PARALLEL >> >> return ierr; >> } -------------- next part -------------- An HTML attachment was scrubbed... URL: From marco.cisternino at optimad.it Wed Jan 5 10:21:20 2022 From: marco.cisternino at optimad.it (Marco Cisternino) Date: Wed, 5 Jan 2022 16:21:20 +0000 Subject: [petsc-users] Nullspaces In-Reply-To: <34A686AA-D337-484B-9EB3-A01C7565AD48@dsic.upv.es> References: <34A686AA-D337-484B-9EB3-A01C7565AD48@dsic.upv.es> Message-ID: Hello Jose and Stefano. Thank you, Jose for your hints. I computed the two smallest eigenvalues of my operator and they are tiny but not zero. The smallest 0 eigenvalue is = (4.71506e-08, 0) with abs error = 3.95575e-07 The smallest 1 eigenvalue is = (1.95628e-07, 0) with abs error = 4.048e-07 As Stefano remarked, I would have expected much tinier values, closer to zero. Probably something is wrong in what I do: EPS eps; EPSCreate(PETSC_COMM_WORLD, &eps); EPSSetOperators( eps, matrix, NULL ); EPSSetWhichEigenpairs(eps, EPS_SMALLEST_MAGNITUDE); EPSSetProblemType( eps, EPS_NHEP ); EPSSetConvergenceTest(eps,EPS_CONV_ABS); EPSSetTolerances(eps, 1.0e-10, 1000); EPSSetDimensions(eps,2,PETSC_DEFAULT,PETSC_DEFAULT); EPSSetFromOptions( eps ); EPSSolve( eps ); Even commenting " EPSSetTolerances(eps, 1.0e-10, 1000);" and use default values, the results are exactly the same. Am I correctly computing the 2 smallest eigenvalues? They should be zeros but they are not. Any suggestions about how understanding why? In a previous email Mark remarked: "Also you say you divide by the cell volume. Maybe I am not understanding this but that is basically diagonal scaling and that will change the null space (ie, not a constant anymore)", therefore why does the null space built with MatNullSpaceCreate(PETSC_COMM_WORLD, PETSC_TRUE, 0, nullptr, &nullspace); passes the MatNullSpaceTest?? Thank you all! Marco Cisternino -----Original Message----- From: Jose E. Roman Sent: marted? 4 gennaio 2022 19:30 To: Marco Cisternino Cc: Stefano Zampini ; petsc-users Subject: Re: [petsc-users] Nullspaces To compute more than one eigenpair, call EPSSetDimensions(eps,nev,PETSC_DEFAULT,PETSC_DEFAULT). 
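For reference, the check that MatNullSpaceTest performs for case 1 can be reproduced by hand on the loaded operator: apply it to the normalized constant vector and inspect the residual norm. A rough sketch follows, written against the variable name "matrix" used in the snippet above; the exact pass/fail tolerance PETSc uses internally is not reproduced here.

/* Sketch: manual residual check of the constant null-space candidate. */
Vec       e, Ae;
PetscReal rnorm, Anorm;
MatCreateVecs(matrix, &e, &Ae);
VecSet(e, 1.0);
VecNormalize(e, NULL);
MatMult(matrix, e, Ae);
VecNorm(Ae, NORM_2, &rnorm);
MatNorm(matrix, NORM_INFINITY, &Anorm);
PetscPrintf(PETSC_COMM_WORLD, "||A*e|| = %g  ||A|| = %g  ratio = %g\n",
            (double)rnorm, (double)Anorm, (double)(rnorm/Anorm));
VecDestroy(&e);
VecDestroy(&Ae);
/* Comparing rnorm with the ~1e-8 to 1e-7 magnitudes reported by the eigensolver shows how a
   residual-based test can accept a vector even when the smallest eigenvalues are not exactly zero. */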
To compute zero eigenvalues you may want to use an absolute convergence criterion, with EPSSetConvergenceTest(eps,EPS_CONV_ABS), but then a tolerance of 1e-12 is probably too small. You can try without this, anyway. Jose > El 4 ene 2022, a las 18:44, Marco Cisternino escribi?: > > Hello Stefano and thank you for your support. > I never used SLEPc before but I did this: > right after the matrix loading from file I added the following lines to my shared tiny code > MatLoad(matrix, v); > > EPS eps; > EPSCreate(PETSC_COMM_WORLD, &eps); > EPSSetOperators( eps, matrix, NULL ); > EPSSetWhichEigenpairs(eps, EPS_SMALLEST_MAGNITUDE); > EPSSetProblemType( eps, EPS_NHEP ); > EPSSetTolerances(eps, 1.0e-12, 1000); > EPSSetFromOptions( eps ); > EPSSolve( eps ); > > Vec xr, xi; /* eigenvector, x */ > PetscScalar kr, ki; /* eigenvalue, k */ > PetscInt j, nconv; > PetscReal error; > EPSGetConverged( eps, &nconv ); > for (j=0; j EPSGetEigenpair( eps, j, &kr, &ki, xr, xi ); > EPSComputeError( eps, j, EPS_ERROR_ABSOLUTE, &error ); > std::cout << "The smallest eigenvalue is = (" << kr << ", " << ki << ") with error = " << error << std::endl; > } > > I launched using > mpirun -n 1 ./testnullspace -eps_monitor > > and the output is > > 1 EPS nconv=0 first unconverged value (error) -1499.29 (6.57994794e+01) > 2 EPS nconv=0 first unconverged value (error) -647.468 (5.39939262e+01) > 3 EPS nconv=0 first unconverged value (error) -177.157 (9.49337698e+01) > 4 EPS nconv=0 first unconverged value (error) 59.6771 (1.62531943e+02) > 5 EPS nconv=0 first unconverged value (error) 41.755 (1.41965990e+02) > 6 EPS nconv=0 first unconverged value (error) -11.5462 (3.60453662e+02) > 7 EPS nconv=0 first unconverged value (error) -6.04493 (4.60890030e+02) > 8 EPS nconv=0 first unconverged value (error) -22.7362 (8.67630086e+01) > 9 EPS nconv=0 first unconverged value (error) -12.9637 > (1.08507821e+02) > 10 EPS nconv=0 first unconverged value (error) 7.7234 (1.53561979e+02) > ? > 111 EPS nconv=0 first unconverged value (error) -2.27e-08 > (6.84762319e+00) > 112 EPS nconv=0 first unconverged value (error) -2.60619e-08 > (4.45245528e+00) > 113 EPS nconv=0 first unconverged value (error) -5.49592e-09 > (1.87798984e+01) > 114 EPS nconv=0 first unconverged value (error) -9.9456e-09 > (7.96711076e+00) > 115 EPS nconv=0 first unconverged value (error) -1.89779e-08 > (4.15471472e+00) > 116 EPS nconv=0 first unconverged value (error) -2.05288e-08 > (2.52953194e+00) > 117 EPS nconv=0 first unconverged value (error) -2.02919e-08 > (2.90090711e+00) > 118 EPS nconv=0 first unconverged value (error) -3.8706e-08 > (8.03595736e-01) > 119 EPS nconv=1 first unconverged value (error) -61751.8 > (9.58036571e-07) Computed 1 pairs The smallest eigenvalue is = > (-3.8706e-08, 0) with error = 4.9707e-07 > > Am I using SLEPc in the right way at least for the first smallest eigenvalue? If I?m on the right way I can find out how to compute the second one. > > Thanks a lot > > Marco Cisternino From bsmith at petsc.dev Wed Jan 5 16:17:10 2022 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 5 Jan 2022 17:17:10 -0500 Subject: [petsc-users] Nullspaces In-Reply-To: References: <34A686AA-D337-484B-9EB3-A01C7565AD48@dsic.upv.es> Message-ID: <45645EB2-6EA9-4B0E-9F87-1C64943DE8EA@petsc.dev> What do you get for the two eigenvectors ? > On Jan 5, 2022, at 11:21 AM, Marco Cisternino wrote: > > Hello Jose and Stefano. > Thank you, Jose for your hints. > I computed the two smallest eigenvalues of my operator and they are tiny but not zero. 
> The smallest 0 eigenvalue is = (4.71506e-08, 0) with abs error = 3.95575e-07 > The smallest 1 eigenvalue is = (1.95628e-07, 0) with abs error = 4.048e-07 > As Stefano remarked, I would have expected much tinier values, closer to zero. > Probably something is wrong in what I do: > EPS eps; > EPSCreate(PETSC_COMM_WORLD, &eps); > EPSSetOperators( eps, matrix, NULL ); > EPSSetWhichEigenpairs(eps, EPS_SMALLEST_MAGNITUDE); > EPSSetProblemType( eps, EPS_NHEP ); > EPSSetConvergenceTest(eps,EPS_CONV_ABS); > EPSSetTolerances(eps, 1.0e-10, 1000); > EPSSetDimensions(eps,2,PETSC_DEFAULT,PETSC_DEFAULT); > EPSSetFromOptions( eps ); > EPSSolve( eps ); > > Even commenting " EPSSetTolerances(eps, 1.0e-10, 1000);" and use default values, the results are exactly the same. > > Am I correctly computing the 2 smallest eigenvalues? > > They should be zeros but they are not. Any suggestions about how understanding why? > > In a previous email Mark remarked: "Also you say you divide by the cell volume. Maybe I am not understanding this but that is basically diagonal scaling and that will change the null space (ie, not a constant anymore)", therefore why does the null space built with MatNullSpaceCreate(PETSC_COMM_WORLD, PETSC_TRUE, 0, nullptr, &nullspace); passes the MatNullSpaceTest?? > > Thank you all! > > Marco Cisternino > > -----Original Message----- > From: Jose E. Roman > Sent: marted? 4 gennaio 2022 19:30 > To: Marco Cisternino > Cc: Stefano Zampini ; petsc-users > Subject: Re: [petsc-users] Nullspaces > > To compute more than one eigenpair, call EPSSetDimensions(eps,nev,PETSC_DEFAULT,PETSC_DEFAULT). > > To compute zero eigenvalues you may want to use an absolute convergence criterion, with EPSSetConvergenceTest(eps,EPS_CONV_ABS), but then a tolerance of 1e-12 is probably too small. You can try without this, anyway. > > Jose > > >> El 4 ene 2022, a las 18:44, Marco Cisternino escribi?: >> >> Hello Stefano and thank you for your support. 
>> I never used SLEPc before but I did this: >> right after the matrix loading from file I added the following lines to my shared tiny code >> MatLoad(matrix, v); >> >> EPS eps; >> EPSCreate(PETSC_COMM_WORLD, &eps); >> EPSSetOperators( eps, matrix, NULL ); >> EPSSetWhichEigenpairs(eps, EPS_SMALLEST_MAGNITUDE); >> EPSSetProblemType( eps, EPS_NHEP ); >> EPSSetTolerances(eps, 1.0e-12, 1000); >> EPSSetFromOptions( eps ); >> EPSSolve( eps ); >> >> Vec xr, xi; /* eigenvector, x */ >> PetscScalar kr, ki; /* eigenvalue, k */ >> PetscInt j, nconv; >> PetscReal error; >> EPSGetConverged( eps, &nconv ); >> for (j=0; j> EPSGetEigenpair( eps, j, &kr, &ki, xr, xi ); >> EPSComputeError( eps, j, EPS_ERROR_ABSOLUTE, &error ); >> std::cout << "The smallest eigenvalue is = (" << kr << ", " << ki << ") with error = " << error << std::endl; >> } >> >> I launched using >> mpirun -n 1 ./testnullspace -eps_monitor >> >> and the output is >> >> 1 EPS nconv=0 first unconverged value (error) -1499.29 (6.57994794e+01) >> 2 EPS nconv=0 first unconverged value (error) -647.468 (5.39939262e+01) >> 3 EPS nconv=0 first unconverged value (error) -177.157 (9.49337698e+01) >> 4 EPS nconv=0 first unconverged value (error) 59.6771 (1.62531943e+02) >> 5 EPS nconv=0 first unconverged value (error) 41.755 (1.41965990e+02) >> 6 EPS nconv=0 first unconverged value (error) -11.5462 (3.60453662e+02) >> 7 EPS nconv=0 first unconverged value (error) -6.04493 (4.60890030e+02) >> 8 EPS nconv=0 first unconverged value (error) -22.7362 (8.67630086e+01) >> 9 EPS nconv=0 first unconverged value (error) -12.9637 >> (1.08507821e+02) >> 10 EPS nconv=0 first unconverged value (error) 7.7234 (1.53561979e+02) >> ? >> 111 EPS nconv=0 first unconverged value (error) -2.27e-08 >> (6.84762319e+00) >> 112 EPS nconv=0 first unconverged value (error) -2.60619e-08 >> (4.45245528e+00) >> 113 EPS nconv=0 first unconverged value (error) -5.49592e-09 >> (1.87798984e+01) >> 114 EPS nconv=0 first unconverged value (error) -9.9456e-09 >> (7.96711076e+00) >> 115 EPS nconv=0 first unconverged value (error) -1.89779e-08 >> (4.15471472e+00) >> 116 EPS nconv=0 first unconverged value (error) -2.05288e-08 >> (2.52953194e+00) >> 117 EPS nconv=0 first unconverged value (error) -2.02919e-08 >> (2.90090711e+00) >> 118 EPS nconv=0 first unconverged value (error) -3.8706e-08 >> (8.03595736e-01) >> 119 EPS nconv=1 first unconverged value (error) -61751.8 >> (9.58036571e-07) Computed 1 pairs The smallest eigenvalue is = >> (-3.8706e-08, 0) with error = 4.9707e-07 >> >> Am I using SLEPc in the right way at least for the first smallest eigenvalue? If I?m on the right way I can find out how to compute the second one. >> >> Thanks a lot >> >> Marco Cisternino > From jed at jedbrown.org Wed Jan 5 16:44:56 2022 From: jed at jedbrown.org (Jed Brown) Date: Wed, 05 Jan 2022 15:44:56 -0700 Subject: [petsc-users] [EXTERNAL] Re: DM misuse causes massive memory leak? In-Reply-To: References: <87tueraunm.fsf@jedbrown.org> <87sfu2921h.fsf@jedbrown.org> Message-ID: <87a6g99507.fsf@jedbrown.org> For something like displacement (and this sounds like elasticity), I would recommend using one field with three components. You can constrain a subset of the components to implement slip conditions. You can use DMPlexLabelComplete(dm, label) to propagate those face labels to vertices. "Ferrand, Jesus A." writes: > Thanks for the reply (I hit reply all this time). 
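A small sketch of the label-completion step mentioned above, assuming the face label is the "Face Sets" label created by the Gmsh import in the code quoted below; after completion the label also carries the edges and vertices in the closure of the marked faces, so the constrained vertex dofs can be found without DMPlexGetConeRecursiveVertices(). The value pinZcode and the vertex range are the ones defined in that code.

/* Sketch: propagate the Gmsh face label to its closure (edges and vertices). */
DMLabel faceSets;
IS      pinnedPoints;
ierr = DMGetLabel(dm,"Face Sets",&faceSets); CHKERRQ(ierr);
ierr = DMPlexLabelComplete(dm,faceSets); CHKERRQ(ierr);
ierr = DMLabelGetStratumIS(faceSets,pinZcode,&pinnedPoints); CHKERRQ(ierr); /* now faces + edges + vertices, each listed once */
/* Vertices are the points p in the IS with vStart <= p < vEnd. */
ierr = ISDestroy(&pinnedPoints); CHKERRQ(ierr);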
> > So, I set 3 fields: > /* > ierr = PetscSectionSetNumFields(s,dof); CHKERRQ(ierr); > ierr = PetscSectionSetFieldName(s,0, "X-Displacement"); CHKERRQ(ierr); //Field ID is 0 > ierr = PetscSectionSetFieldName(s,1, "Y-Displacement"); CHKERRQ(ierr); //Field ID is 1 > ierr = PetscSectionSetFieldName(s,2, "Z-Displacement"); CHKERRQ(ierr); //Field ID is 2 > */ > > I then loop through the vertices of my DMPlex > > /* > for(ii = vStart; ii < vEnd; ii++){//Vertex loop. > ierr = PetscSectionSetDof(s, ii, dof); CHKERRQ(ierr); > ierr = PetscSectionSetFieldDof(s,ii,0,1); CHKERRQ(ierr);//One X-displacement per vertex (1 dof) > ierr = PetscSectionSetFieldDof(s,ii,1,1); CHKERRQ(ierr);//One Y-displacement per vertex (1 dof) > ierr = PetscSectionSetFieldDof(s,ii,2,1); CHKERRQ(ierr);//One Z-displacement per vertex (1 dof) > }//Sets x, y, and z displacements as dofs. > */ > > I only associated fields with vertices, not with any other points in the DAG. Regarding the use of DMAddBoundary(), I mostly copied the usage shown in SNES example 77. I modified the function definition to simply set the dof to 0.0 as opposed to the coordinates. Below "physicalgroups" is the DMLabel that I got from gmsh, this flags Face points, not vertices. That is why I think the error log suggests that fields were never set. > > ierr = DMAddBoundary(dm, DM_BC_ESSENTIAL, "fixed", physicalgroups, 1, &surfvals[ii], fieldID, 0, NULL, (void (*)(void)) coordinates, NULL, NULL, NULL); CHKERRQ(ierr); > PetscErrorCode coordinates(PetscInt dim, PetscReal time, const PetscReal x[], PetscInt Nf, PetscScalar *u, void *ctx){ > const PetscInt Ncomp = dim; > PetscInt comp; > for (comp = 0; comp < Ncomp; ++comp) u[comp] = 0.0; > return 0; > } > > > ________________________________ > From: Jed Brown > Sent: Wednesday, January 5, 2022 12:36 AM > To: Ferrand, Jesus A. > Cc: petsc-users > Subject: Re: [EXTERNAL] Re: [petsc-users] DM misuse causes massive memory leak? > > Please "reply all" include the list in the future. > > "Ferrand, Jesus A." writes: > >> Forgot to say thanks for the reply (my bad). >> Yes, I was indeed forgetting to pre-allocate the sparse matrices when doing them by hand (complacency seeded by MATLAB's zeros()). Thank you, Jed, and Jeremy, for the hints! >> >> I have more questions, these ones about boundary conditions (I think these are for Matt). >> In my current code I set Dirichlet conditions directly on a Mat by calling MatSetZeroRows(). I profiled my code and found the part that applies them to be unnacceptably slow. In response, I've been trying to better pre-allocate Mats using PetscSections. I have found documentation for PetscSectionSetDof(), PetscSectionSetNumFields(), PetscSectionSetFieldName(), and PetscSectionSetFieldDof(), to set the size of my Mats and Vecs by calling DMSetLocalSection() followed by DMCreateMatrix() and DMCreateLocalVector() to get a RHS vector. This seems faster. >> >> In PetscSection, what is the difference between a "field" and a "component"? For example, I could have one field "Velocity" with three components ux, uy, and uz or perhaps three fields ux, uy, and uz each with a default component? > > It's just how you name them and how they appear in output. Usually "velocity" is better as a field with three components, but fields with other meaning (and perhaps different finite element spaces), such as pressure, would be different fields. Different components are always in the same FE space. > >> I am struggling now to impose boundary conditions after constraining dofs using PetscSection. 
My understanding is that constraining dof's reduces the size of the DM's matrix but it does not give the DM knowledge of what values the constrained dofs should have, right? >> >> I know that there is DMAddBoundary(), but I am unsure of how to use it. From Gmsh I have a mesh with surface boundaries flagged. I'm not sure whether DMAddBoundary()will constrain the face, edge, or vertex points when I give it the DMLabel from Gmsh. (I need specific dof on the vertices to be constrained). I did some testing and I think DMAddBoundary() attempts to constrain the Face points (see error log below). I only associated fields with the vertices but not the Faces. I can extract the vertex points from the face label using DMPlexGetConeRecursiveVertices() but the output IS has repeated entries for the vertex points (many faces share the same vertex). Is there an easier way to get the vertex points from a gmsh surface tag? > > How did you call DMAddBoundary()? Are you using DM_BC_ESSENTIAL and a callback function that provides the inhomogeneous boundary condition? > >> I'm sorry this is a mouthful. >> >> [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [0]PETSC ERROR: Argument out of range >> [0]PETSC ERROR: Field number 2 must be in [0, 0) > > It looks like you haven't added these fields yet. > >> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. >> [0]PETSC ERROR: Petsc Release Version 3.16.0, unknown >> [0]PETSC ERROR: ./gmsh4 on a linux-c-dbg named F86 by jesus Tue Jan 4 15:19:57 2022 >> [0]PETSC ERROR: Configure options --with-32bits-pci-domain=1 --with-debugging =1 >> [0]PETSC ERROR: #1 DMGetField() at /home/jesus/SAND/PETSc_install/petsc/src/dm/interface/dm.c:4803 >> [0]PETSC ERROR: #2 DMCompleteBoundaryLabel_Internal() at /home/jesus/SAND/PETSc_install/petsc/src/dm/interface/dm.c:5140 >> [0]PETSC ERROR: #3 DMAddBoundary() at /home/jesus/SAND/PETSc_install/petsc/src/dm/interface/dm.c:8561 >> [0]PETSC ERROR: #4 main() at /home/jesus/SAND/FEA/3D/gmshBACKUP4.c:215 >> [0]PETSC ERROR: No PETSc Option Table entries >> [0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- >> >> >> >> >> >> >> ________________________________ >> From: Jed Brown >> Sent: Wednesday, December 29, 2021 5:55 PM >> To: Ferrand, Jesus A. ; petsc-users at mcs.anl.gov >> Subject: [EXTERNAL] Re: [petsc-users] DM misuse causes massive memory leak? >> >> CAUTION: This email originated outside of Embry-Riddle Aeronautical University. Do not click links or open attachments unless you recognize the sender and know the content is safe. >> >> >> "Ferrand, Jesus A." writes: >> >>> Dear PETSc Team: >>> >>> I have a question about DM and PetscSection. Say I import a mesh (for FEM purposes) and create a DMPlex for it. I then use PetscSections to set degrees of freedom per "point" (by point I mean vertices, lines, faces, and cells). I then use PetscSectionGetStorageSize() to get the size of the global stiffness matrix (K) needed for my FEM problem. One last detail, this K I populate inside a rather large loop using an element stiffness matrix function of my own. Instead of using DMCreateMatrix(), I manually created a Mat using MatCreate(), MatSetType(), MatSetSizes(), and MatSetUp(). I come to find that said loop is painfully slow when I use the manually created matrix, but 20x faster when I use the Mat coming out of DMCreateMatrix(). 
>> >> The sparse matrix hasn't been preallocated, which forces the data structure to do a lot of copies (as bad as O(n^2) complexity). DMCreateMatrix() preallocates for you. >> >> https://petsc.org/release/docs/manual/performance/#memory-allocation-for-sparse-matrix-assembly >> https://petsc.org/release/docs/manual/mat/#sec-matsparse >> >>> My question is then: Is the manual Mat a noob mistake and is it somehow creating a memory leak with K? Just in case it's something else I'm attaching the code. The loop that populates K is between lines 221 and 278. Anything related to DM, DMPlex, and PetscSection is between lines 117 and 180. >>> >>> Machine Type: HP Laptop >>> C-compiler: Gnu C >>> OS: Ubuntu 20.04 >>> PETSc version: 3.16.0 >>> MPI Implementation: MPICH >>> >>> Hope you all had a Merry Christmas and that you will have a happy and productive New Year. :D >>> >>> >>> Sincerely: >>> >>> J.A. Ferrand >>> >>> Embry-Riddle Aeronautical University - Daytona Beach FL >>> >>> M.Sc. Aerospace Engineering | May 2022 >>> >>> B.Sc. Aerospace Engineering >>> >>> B.Sc. Computational Mathematics >>> >>> >>> >>> Sigma Gamma Tau >>> >>> Tau Beta Pi >>> >>> Honors Program >>> >>> >>> >>> Phone: (386)-843-1829 >>> >>> Email(s): ferranj2 at my.erau.edu >>> >>> jesus.ferrand at gmail.com >>> //REFERENCE: https://github.com/FreeFem/FreeFem-sources/blob/master/plugin/mpi/PETSc-code.hpp >>> #include >>> static char help[] = "Imports a Gmsh mesh with boundary conditions and solves the elasticity equation.\n" >>> "Option prefix = opt_.\n"; >>> >>> struct preKE{//Preallocation before computing KE >>> Mat matB, >>> matBTCB; >>> //matKE; >>> PetscInt x_insert[3], >>> y_insert[3], >>> z_insert[3], >>> m,//Looping variables. >>> sizeKE,//size of the element stiffness matrix. >>> N,//Number of nodes in element. >>> x_in,y_in,z_in; //LI to index B matrix. >>> PetscReal J[3][3],//Jacobian matrix. >>> invJ[3][3],//Inverse of the Jacobian matrix. >>> detJ,//Determinant of the Jacobian. >>> dX[3], >>> dY[3], >>> dZ[3], >>> minor00, >>> minor01, >>> minor02,//Determinants of minors in a 3x3 matrix. >>> dPsidX, dPsidY, dPsidZ,//Shape function derivatives w.r.t global coordinates. >>> weight,//Multiplier of quadrature weights. >>> *dPsidXi,//Derivatives of shape functions w.r.t. Xi. >>> *dPsidEta,//Derivatives of shape functions w.r.t. Eta. >>> *dPsidZeta;//Derivatives of shape functions w.r.t Zeta. >>> PetscErrorCode ierr; >>> };//end struct. >>> >>> //Function declarations. >>> extern PetscErrorCode tetra4(PetscScalar*, PetscScalar*, PetscScalar*,struct preKE*, Mat*, Mat*); >>> extern PetscErrorCode ConstitutiveMatrix(Mat*,const char*,PetscInt); >>> extern PetscErrorCode InitializeKEpreallocation(struct preKE*,const char*); >>> >>> PetscErrorCode PetscViewerVTKWriteFunction(PetscObject vec,PetscViewer viewer){ >>> PetscErrorCode ierr; >>> ierr = VecView((Vec)vec,viewer); CHKERRQ(ierr); >>> return ierr; >>> } >>> >>> >>> >>> >>> int main(int argc, char **args){ >>> //DEFINITIONS OF PETSC's DMPLEX LINGO: >>> //POINT: A topology element (cell, face, edge, or vertex). >>> //CHART: It an interval from 0 to the number of "points." (the range of admissible linear indices) >>> //STRATUM: A subset of the "chart" which corresponds to all "points" at a given "level." >>> //LEVEL: This is either a "depth" or a "height". >>> //HEIGHT: Dimensionality of an element measured from 0D to 3D. Heights: cell = 0, face = 1, edge = 2, vertex = 3. >>> //DEPTH: Dimensionality of an element measured from 3D to 0D. 
Depths: cell = 3, face = 2, edge = 1, vertex = 0; >>> //CLOSURE: *of an element is the collection of all other elements that define it.I.e., the closure of a surface is the collection of vertices and edges that make it up. >>> //STAR: >>> //STANDARD LABELS: These are default tags that DMPlex has for its topology. ("depth") >>> PetscErrorCode ierr;//Error tracking variable. >>> DM dm;//Distributed memory object (useful for managing grids.) >>> DMLabel physicalgroups;//Identifies user-specified tags in gmsh (to impose BC's). >>> DMPolytopeType celltype;//When looping through cells, determines its type (tetrahedron, pyramid, hexahedron, etc.) >>> PetscSection s; >>> KSP ksp;//Krylov Sub-Space (linear solver object) >>> Mat K,//Global stiffness matrix (Square, assume unsymmetric). >>> KE,//Element stiffness matrix (Square, assume unsymmetric). >>> matC;//Constitutive matrix. >>> Vec XYZ,//Coordinate vector, contains spatial locations of mesh's vertices (NOTE: This vector self-destroys!). >>> U,//Displacement vector. >>> F;//Load Vector. >>> PetscViewer XYZviewer,//Viewer object to output mesh to ASCII format. >>> XYZpUviewer; //Viewer object to output displacements to ASCII format. >>> PetscBool interpolate = PETSC_TRUE,//Instructs Gmsh importer whether to generate faces and edges (Needed when using P2 or higher elements). >>> useCone = PETSC_TRUE,//Instructs "DMPlexGetTransitiveClosure()" whether to extract the closure or the star. >>> dirichletBC = PETSC_FALSE,//For use when assembling the K matrix. >>> neumannBC = PETSC_FALSE,//For use when assembling the F vector. >>> saveASCII = PETSC_FALSE,//Whether to save results in ASCII format. >>> saveVTK = PETSC_FALSE;//Whether to save results as VTK format. >>> PetscInt nc,//number of cells. (PETSc lingo for "elements") >>> nv,//number of vertices. (PETSc lingo for "nodes") >>> nf,//number of faces. (PETSc lingo for "surfaces") >>> ne,//number of edges. (PETSc lingo for "lines") >>> pStart,//starting LI of global elements. >>> pEnd,//ending LI of all elements. >>> cStart,//starting LI for cells global arrangement. >>> cEnd,//ending LI for cells in global arrangement. >>> vStart,//starting LI for vertices in global arrangement. >>> vEnd,//ending LI for vertices in global arrangement. >>> fStart,//starting LI for faces in global arrangement. >>> fEnd,//ending LI for faces in global arrangement. >>> eStart,//starting LI for edges in global arrangement. >>> eEnd,//ending LI for edges in global arrangement. >>> sizeK,//Size of the element stiffness matrix. >>> ii,jj,kk,//Dedicated looping variables. >>> indexXYZ,//Variable to access the elements of XYZ vector. >>> indexK,//Variable to access the elements of the U and F vectors (can reference rows and colums of K matrix.) >>> *closure = PETSC_NULL,//Pointer to the closure elements of a cell. >>> size_closure,//Size of the closure of a cell. >>> dim,//Dimension of the mesh. >>> //*edof,//Linear indices of dof's inside the K matrix. >>> dof = 3,//Degrees of freedom per node. >>> cells=0, edges=0, vertices=0, faces=0,//Topology counters when looping through cells. >>> pinXcode=10, pinZcode=11,forceZcode=12;//Gmsh codes to extract relevant "Face Sets." >>> PetscReal //*x_el,//Pointer to a vector that will store the x-coordinates of an element's vertices. >>> //*y_el,//Pointer to a vector that will store the y-coordinates of an element's vertices. >>> //*z_el,//Pointer to a vector that will store the z-coordinates of an element's vertices. >>> *xyz_el,//Pointer to xyz array in the XYZ vector. 
>>> traction = -10, >>> *KEdata, >>> t1,t2; //time keepers. >>> const char *gmshfile = "TopOptmeshfine2.msh";//Name of gmsh file to import. >>> >>> ierr = PetscInitialize(&argc,&args,NULL,help); if(ierr) return ierr; //And the machine shall work.... >>> >>> //MESH IMPORT================================================================= >>> //IMPORTANT NOTE: Gmsh only creates "cells" and "vertices," it does not create the "faces" or "edges." >>> //Gmsh probably can generate them, must figure out how to. >>> t1 = MPI_Wtime(); >>> ierr = DMPlexCreateGmshFromFile(PETSC_COMM_WORLD,gmshfile,interpolate,&dm); CHKERRQ(ierr);//Read Gmsh file and generate the DMPlex. >>> ierr = DMGetDimension(dm, &dim); CHKERRQ(ierr);//1-D, 2-D, or 3-D >>> ierr = DMPlexGetChart(dm, &pStart, &pEnd); CHKERRQ(ierr);//Extracts linear indices of cells, vertices, faces, and edges. >>> ierr = DMGetCoordinatesLocal(dm,&XYZ); CHKERRQ(ierr);//Extracts coordinates from mesh.(Contiguous storage: [x0,y0,z0,x1,y1,z1,...]) >>> ierr = VecGetArray(XYZ,&xyz_el); CHKERRQ(ierr);//Get pointer to vector's data. >>> t2 = MPI_Wtime(); >>> PetscPrintf(PETSC_COMM_WORLD,"Mesh Import time: %10f\n",t2-t1); >>> DMView(dm,PETSC_VIEWER_STDOUT_WORLD); >>> >>> //IMPORTANT NOTE: PETSc assumes that vertex-cell meshes are 2D even if they were 3D, so its ordering changes. >>> //Cells remain at height 0, but vertices move to height 1 from height 3. To prevent this from becoming an issue >>> //the "interpolate" variable is set to PETSC_TRUE to tell the mesh importer to generate faces and edges. >>> //PETSc, therefore, technically does additional meshing. Gotta figure out how to get this from Gmsh directly. >>> ierr = DMPlexGetDepthStratum(dm,3, &cStart, &cEnd);//Get LI of cells. >>> ierr = DMPlexGetDepthStratum(dm,2, &fStart, &fEnd);//Get LI of faces >>> ierr = DMPlexGetDepthStratum(dm,1, &eStart, &eEnd);//Get LI of edges. >>> ierr = DMPlexGetDepthStratum(dm,0, &vStart, &vEnd);//Get LI of vertices. >>> ierr = DMGetStratumSize(dm,"depth", 3, &nc);//Get number of cells. >>> ierr = DMGetStratumSize(dm,"depth", 2, &nf);//Get number of faces. >>> ierr = DMGetStratumSize(dm,"depth", 1, &ne);//Get number of edges. >>> ierr = DMGetStratumSize(dm,"depth", 0, &nv);//Get number of vertices. >>> /* >>> PetscPrintf(PETSC_COMM_WORLD,"global start = %10d\t global end = %10d\n",pStart,pEnd); >>> PetscPrintf(PETSC_COMM_WORLD,"#cells = %10d\t i = %10d\t i < %10d\n",nc,cStart,cEnd); >>> PetscPrintf(PETSC_COMM_WORLD,"#faces = %10d\t i = %10d\t i < %10d\n",nf,fStart,fEnd); >>> PetscPrintf(PETSC_COMM_WORLD,"#edges = %10d\t i = %10d\t i < %10d\n",ne,eStart,eEnd); >>> PetscPrintf(PETSC_COMM_WORLD,"#vertices = %10d\t i = %10d\t i < %10d\n",nv,vStart,vEnd); >>> */ >>> //MESH IMPORT================================================================= >>> >>> //NOTE: This section extremely hardcoded right now. >>> //Current setup would only support P1 meshes. >>> //MEMORY ALLOCATION ========================================================== >>> ierr = PetscSectionCreate(PETSC_COMM_WORLD, &s); CHKERRQ(ierr); >>> //The chart is akin to a contiguous memory storage allocation. Each chart entry is associated >>> //with a "thing," could be a vertex, face, cell, or edge, or anything else. >>> ierr = PetscSectionSetChart(s, pStart, pEnd); CHKERRQ(ierr); >>> //For each "thing" in the chart, additional room can be made. This is helpful for associating >>> //nodes to multiple degrees of freedom. These commands help associate nodes with >>> for(ii = cStart; ii < cEnd; ii++){//Cell loop. 
>>> ierr = PetscSectionSetDof(s, ii, 0);CHKERRQ(ierr);}//NOTE: Currently no dof's associated with cells. >>> for(ii = fStart; ii < fEnd; ii++){//Face loop. >>> ierr = PetscSectionSetDof(s, ii, 0);CHKERRQ(ierr);}//NOTE: Currently no dof's associated with faces. >>> for(ii = vStart; ii < vEnd; ii++){//Vertex loop. >>> ierr = PetscSectionSetDof(s, ii, dof);CHKERRQ(ierr);}//Sets x, y, and z displacements as dofs. >>> for(ii = eStart; ii < eEnd; ii++){//Edge loop >>> ierr = PetscSectionSetDof(s, ii, 0);CHKERRQ(ierr);}//NOTE: Currently no dof's associated with edges. >>> ierr = PetscSectionSetUp(s); CHKERRQ(ierr); >>> ierr = PetscSectionGetStorageSize(s,&sizeK);CHKERRQ(ierr);//Determine the size of the global stiffness matrix. >>> ierr = DMSetLocalSection(dm,s); CHKERRQ(ierr);//Associate the PetscSection with the DM object. >>> //PetscErrorCode DMCreateGlobalVector(DM dm,Vec *vec) >>> //ierr = DMCreateGlobalVector(dm,&U); CHKERRQ(ierr); >>> PetscSectionDestroy(&s); >>> //PetscPrintf(PETSC_COMM_WORLD,"sizeK = %10d\n",sizeK); >>> >>> //OBJECT SETUP================================================================ >>> //Global stiffness matrix. >>> //PetscErrorCode DMCreateMatrix(DM dm,Mat *mat) >>> >>> //This makes the loop fast. >>> ierr = DMCreateMatrix(dm,&K); >>> >>> //This makes the loop uber slow. >>> //ierr = MatCreate(PETSC_COMM_WORLD,&K); CHKERRQ(ierr); >>> //ierr = MatSetType(K,MATAIJ); CHKERRQ(ierr);// Global stiffness matrix set to some sparse type. >>> //ierr = MatSetSizes(K,PETSC_DECIDE,PETSC_DECIDE,sizeK,sizeK); CHKERRQ(ierr); >>> //ierr = MatSetUp(K); CHKERRQ(ierr); >>> >>> //Displacement vector. >>> ierr = VecCreate(PETSC_COMM_WORLD,&U); CHKERRQ(ierr); >>> ierr = VecSetType(U,VECSTANDARD); CHKERRQ(ierr);// Global stiffness matrix set to some sparse type. >>> ierr = VecSetSizes(U,PETSC_DECIDE,sizeK); CHKERRQ(ierr); >>> >>> //Load vector. >>> ierr = VecCreate(PETSC_COMM_WORLD,&F); CHKERRQ(ierr); >>> ierr = VecSetType(F,VECSTANDARD); CHKERRQ(ierr);// Global stiffness matrix set to some sparse type. >>> ierr = VecSetSizes(F,PETSC_DECIDE,sizeK); CHKERRQ(ierr); >>> //OBJECT SETUP================================================================ >>> >>> //WARNING: This loop is currently hardcoded for P1 elements only! Must Figure >>> //out a clever way to modify to accomodate Pn (n>1) elements. >>> >>> //BEGIN GLOBAL STIFFNESS MATRIX BUILDER======================================= >>> t1 = MPI_Wtime(); >>> >>> //PREALLOCATIONS============================================================== >>> ierr = ConstitutiveMatrix(&matC,"isotropic",0); CHKERRQ(ierr); >>> struct preKE preKEtetra4; >>> ierr = InitializeKEpreallocation(&preKEtetra4,"tetra4"); CHKERRQ(ierr); >>> ierr = MatCreate(PETSC_COMM_WORLD,&KE); CHKERRQ(ierr); //SEQUENTIAL >>> ierr = MatSetSizes(KE,PETSC_DECIDE,PETSC_DECIDE,12,12); CHKERRQ(ierr); //SEQUENTIAL >>> ierr = MatSetType(KE,MATDENSE); CHKERRQ(ierr); //SEQUENTIAL >>> ierr = MatSetUp(KE); CHKERRQ(ierr); >>> PetscReal x_tetra4[4], y_tetra4[4],z_tetra4[4], >>> x_hex8[8], y_hex8[8],z_hex8[8], >>> *x,*y,*z; >>> PetscInt *EDOF,edof_tetra4[12],edof_hex8[24]; >>> DMPolytopeType previous = DM_POLYTOPE_UNKNOWN; >>> //PREALLOCATIONS============================================================== >>> >>> >>> >>> for(ii=cStart;ii>> ierr = DMPlexGetTransitiveClosure(dm, ii, useCone, &size_closure, &closure); CHKERRQ(ierr); >>> ierr = DMPlexGetCellType(dm, ii, &celltype); CHKERRQ(ierr); >>> //IMPORTANT NOTE: MOST OF THIS LOOP SHOULD BE INCLUDED IN THE KE3D function. 
>>> if(previous != celltype){ >>> //PetscPrintf(PETSC_COMM_WORLD,"run \n"); >>> if(celltype == DM_POLYTOPE_TETRAHEDRON){ >>> x = x_tetra4; >>> y = y_tetra4; >>> z = z_tetra4; >>> EDOF = edof_tetra4; >>> }//end if. >>> else if(celltype == DM_POLYTOPE_HEXAHEDRON){ >>> x = x_hex8; >>> y = y_hex8; >>> z = z_hex8; >>> EDOF = edof_hex8; >>> }//end else if. >>> } >>> previous = celltype; >>> >>> //PetscPrintf(PETSC_COMM_WORLD,"Cell # %4i\t",ii); >>> cells=0; >>> edges=0; >>> vertices=0; >>> faces=0; >>> kk = 0; >>> for(jj=0;jj<(2*size_closure);jj+=2){//Scan the closure of the current cell. >>> //Use information from the DM's strata to determine composition of cell_ii. >>> if(vStart <= closure[jj] && closure[jj]< vEnd){//Check for vertices. >>> //PetscPrintf(PETSC_COMM_WORLD,"%5i\t",closure[jj]); >>> indexXYZ = dim*(closure[jj]-vStart);//Linear index of x-coordinate in the xyz_el array. >>> >>> *(x+vertices) = xyz_el[indexXYZ]; >>> *(y+vertices) = xyz_el[indexXYZ+1];//Extract Y-coordinates of the current vertex. >>> *(z+vertices) = xyz_el[indexXYZ+2];//Extract Y-coordinates of the current vertex. >>> *(EDOF + kk) = indexXYZ; >>> *(EDOF + kk+1) = indexXYZ+1; >>> *(EDOF + kk+2) = indexXYZ+2; >>> kk+=3; >>> vertices++;//Update vertex counter. >>> }//end if >>> else if(eStart <= closure[jj] && closure[jj]< eEnd){//Check for edge ID's >>> edges++; >>> }//end else ifindexK >>> else if(fStart <= closure[jj] && closure[jj]< fEnd){//Check for face ID's >>> faces++; >>> }//end else if >>> else if(cStart <= closure[jj] && closure[jj]< cEnd){//Check for cell ID's >>> cells++; >>> }//end else if >>> }//end "jj" loop. >>> ierr = tetra4(x,y,z,&preKEtetra4,&matC,&KE); CHKERRQ(ierr); //Generate the element stiffness matrix for this cell. >>> ierr = MatDenseGetArray(KE,&KEdata); CHKERRQ(ierr); >>> ierr = MatSetValues(K,12,EDOF,12,EDOF,KEdata,ADD_VALUES); CHKERRQ(ierr);//WARNING: HARDCODED FOR TETRAHEDRAL P1 ELEMENTS ONLY !!!!!!!!!!!!!!!!!!!!!!! >>> ierr = MatDenseRestoreArray(KE,&KEdata); CHKERRQ(ierr); >>> ierr = DMPlexRestoreTransitiveClosure(dm, ii,useCone, &size_closure, &closure); CHKERRQ(ierr); >>> }//end "ii" loop. >>> ierr = MatAssemblyBegin(K,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >>> ierr = MatAssemblyEnd(K,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >>> //ierr = MatView(K,PETSC_VIEWER_DRAW_WORLD); CHKERRQ(ierr); >>> //END GLOBAL STIFFNESS MATRIX BUILDER=========================================== >>> t2 = MPI_Wtime(); >>> PetscPrintf(PETSC_COMM_WORLD,"K build time: %10f\n",t2-t1); >>> >>> >>> >>> >>> >>> >>> >>> >>> t1 = MPI_Wtime(); >>> //BEGIN BOUNDARY CONDITION ENFORCEMENT========================================== >>> IS TrianglesIS, physicalsurfaceID;//, VerticesIS; >>> PetscInt numsurfvals, >>> //numRows, >>> dof_offset,numTri; >>> const PetscInt *surfvals, >>> //*pinZID, >>> *TriangleID; >>> PetscScalar diag =1; >>> PetscReal area,force; >>> //NOTE: Petsc can read/assign labels. Eeach label may posses multiple "values." >>> //These values act as tags within a tag. >>> //IMPORTANT NOTE: The below line needs a safety. If a mesh that does not feature >>> //face sets is imported, the code in its current state will crash!!!. This is currently >>> //hardcoded for the test mesh. >>> ierr = DMGetLabel(dm, "Face Sets", &physicalgroups); CHKERRQ(ierr);//Inspects Physical surface groups defined by gmsh (if any). >>> ierr = DMLabelGetValueIS(physicalgroups, &physicalsurfaceID); CHKERRQ(ierr);//Gets the physical surface ID's defined in gmsh (as specified in the .geo file). 
>>> ierr = ISGetIndices(physicalsurfaceID,&surfvals); CHKERRQ(ierr);//Get a pointer to the actual surface values. >>> ierr = DMLabelGetNumValues(physicalgroups, &numsurfvals); CHKERRQ(ierr);//Gets the number of different values that the label assigns. >>> for(ii=0;ii>> //PetscPrintf(PETSC_COMM_WORLD,"Values = %5i\n",surfvals[ii]); >>> //PROBLEM: The surface values are hardcoded in the gmsh file. We need to adopt standard "codes" >>> //that we can give to users when they make their meshes so that this code recognizes the Type >>> // of boundary conditions that are to be imposed. >>> if(surfvals[ii] == pinXcode){ >>> dof_offset = 0; >>> dirichletBC = PETSC_TRUE; >>> }//end if. >>> else if(surfvals[ii] == pinZcode){ >>> dof_offset = 2; >>> dirichletBC = PETSC_TRUE; >>> }//end else if. >>> else if(surfvals[ii] == forceZcode){ >>> dof_offset = 2; >>> neumannBC = PETSC_TRUE; >>> }//end else if. >>> >>> ierr = DMLabelGetStratumIS(physicalgroups, surfvals[ii], &TrianglesIS); CHKERRQ(ierr);//Get the ID's (as an IS) of the surfaces belonging to value 11. >>> //PROBLEM: DMPlexGetConeRecursiveVertices returns an array with repeated node ID's. For each repetition, the lines that enforce BC's unnecessarily re-run. >>> ierr = ISGetSize(TrianglesIS,&numTri); CHKERRQ(ierr); >>> ierr = ISGetIndices(TrianglesIS,&TriangleID); CHKERRQ(ierr);//Get a pointer to the actual surface values. >>> for(kk=0;kk>> ierr = DMPlexGetTransitiveClosure(dm, TriangleID[kk], useCone, &size_closure, &closure); CHKERRQ(ierr); >>> if(neumannBC){ >>> ierr = DMPlexComputeCellGeometryFVM(dm, TriangleID[kk], &area,PETSC_NULL,PETSC_NULL); CHKERRQ(ierr); >>> force = traction*area/3;//WARNING: The 3 here is hardcoded for a purely tetrahedral mesh only!!!!!!!!!! >>> } >>> for(jj=0;jj<(2*size_closure);jj+=2){ >>> //PetscErrorCode DMPlexComputeCellGeometryFVM(DM dm, PetscInt cell, PetscReal *vol, PetscReal centroid[], PetscReal normal[]) >>> if(vStart <= closure[jj] && closure[jj]< vEnd){//Check for vertices. >>> indexK = dof*(closure[jj] - vStart) + dof_offset; //Compute the dof ID's in the K matrix. >>> if(dirichletBC){//Boundary conditions requiring an edit of K matrix. >>> ierr = MatZeroRows(K,1,&indexK,diag,NULL,NULL); CHKERRQ(ierr); >>> }//end if. >>> else if(neumannBC){//Boundary conditions requiring an edit of RHS vector. >>> ierr = VecSetValue(F,indexK,force,ADD_VALUES); CHKERRQ(ierr); >>> }// end else if. >>> }//end if. >>> }//end "jj" loop. >>> ierr = DMPlexRestoreTransitiveClosure(dm, closure[jj],useCone, &size_closure, &closure); CHKERRQ(ierr); >>> }//end "kk" loop. >>> ierr = ISRestoreIndices(TrianglesIS,&TriangleID); CHKERRQ(ierr); >>> >>> /* >>> ierr = DMPlexGetConeRecursiveVertices(dm, TrianglesIS, &VerticesIS); CHKERRQ(ierr);//Get the ID's (as an IS) of the vertices that make up the surfaces of value 11. >>> ierr = ISGetSize(VerticesIS,&numRows); CHKERRQ(ierr);//Get number of flagged vertices (this includes repeated indices for faces that share nodes). >>> ierr = ISGetIndices(VerticesIS,&pinZID); CHKERRQ(ierr);//Get a pointer to the actual surface values. >>> if(dirichletBC){//Boundary conditions requiring an edit of K matrix. >>> for(kk=0;kk>> indexK = 3*(pinZID[kk] - vStart) + dof_offset; //Compute the dof ID's in the K matrix. (NOTE: the 3* ishardcoded for 3 degrees of freedom, tie this to a variable in the FUTURE.) >>> ierr = MatZeroRows(K,1,&indexK,diag,NULL,NULL); CHKERRQ(ierr); >>> }//end "kk" loop. >>> }//end if. >>> else if(neumannBC){//Boundary conditions requiring an edit of RHS vector. 
>>> for(kk=0;kk>> indexK = 3*(pinZID[kk] - vStart) + dof_offset; >>> ierr = VecSetValue(F,indexK,traction,INSERT_VALUES); CHKERRQ(ierr); >>> }//end "kk" loop. >>> }// end else if. >>> ierr = ISRestoreIndices(VerticesIS,&pinZID); CHKERRQ(ierr); >>> */ >>> dirichletBC = PETSC_FALSE; >>> neumannBC = PETSC_FALSE; >>> }//end "ii" loop. >>> ierr = ISRestoreIndices(physicalsurfaceID,&surfvals); CHKERRQ(ierr); >>> //ierr = ISRestoreIndices(VerticesIS,&pinZID); CHKERRQ(ierr); >>> ierr = ISDestroy(&physicalsurfaceID); CHKERRQ(ierr); >>> //ierr = ISDestroy(&VerticesIS); CHKERRQ(ierr); >>> ierr = ISDestroy(&TrianglesIS); CHKERRQ(ierr); >>> //END BOUNDARY CONDITION ENFORCEMENT============================================ >>> t2 = MPI_Wtime(); >>> PetscPrintf(PETSC_COMM_WORLD,"BC imposition time: %10f\n",t2-t1); >>> >>> /* >>> PetscInt kk = 0; >>> for(ii=vStart;ii>> kk++; >>> PetscPrintf(PETSC_COMM_WORLD,"Vertex #%4i\t x = %10.9f\ty = %10.9f\tz = %10.9f\n",ii,xyz_el[3*kk],xyz_el[3*kk+1],xyz_el[3*kk+2]); >>> }// end "ii" loop. >>> */ >>> >>> t1 = MPI_Wtime(); >>> //SOLVER======================================================================== >>> ierr = KSPCreate(PETSC_COMM_WORLD,&ksp); CHKERRQ(ierr); >>> ierr = KSPSetOperators(ksp,K,K); CHKERRQ(ierr); >>> ierr = KSPSetFromOptions(ksp); CHKERRQ(ierr); >>> ierr = KSPSolve(ksp,F,U); CHKERRQ(ierr); >>> t2 = MPI_Wtime(); >>> //ierr = KSPView(ksp,PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr); >>> //SOLVER======================================================================== >>> t2 = MPI_Wtime(); >>> PetscPrintf(PETSC_COMM_WORLD,"Solver time: %10f\n",t2-t1); >>> ierr = VecRestoreArray(XYZ,&xyz_el); CHKERRQ(ierr);//Get pointer to vector's data. >>> >>> //BEGIN MAX/MIN DISPLACEMENTS=================================================== >>> IS ISux,ISuy,ISuz; >>> Vec UX,UY,UZ; >>> PetscReal UXmax,UYmax,UZmax,UXmin,UYmin,UZmin; >>> ierr = ISCreateStride(PETSC_COMM_WORLD,nv,0,3,&ISux); CHKERRQ(ierr); >>> ierr = ISCreateStride(PETSC_COMM_WORLD,nv,1,3,&ISuy); CHKERRQ(ierr); >>> ierr = ISCreateStride(PETSC_COMM_WORLD,nv,2,3,&ISuz); CHKERRQ(ierr); >>> >>> //PetscErrorCode VecGetSubVector(Vec X,IS is,Vec *Y) >>> ierr = VecGetSubVector(U,ISux,&UX); CHKERRQ(ierr); >>> ierr = VecGetSubVector(U,ISuy,&UY); CHKERRQ(ierr); >>> ierr = VecGetSubVector(U,ISuz,&UZ); CHKERRQ(ierr); >>> >>> //PetscErrorCode VecMax(Vec x,PetscInt *p,PetscReal *val) >>> ierr = VecMax(UX,PETSC_NULL,&UXmax); CHKERRQ(ierr); >>> ierr = VecMax(UY,PETSC_NULL,&UYmax); CHKERRQ(ierr); >>> ierr = VecMax(UZ,PETSC_NULL,&UZmax); CHKERRQ(ierr); >>> >>> ierr = VecMin(UX,PETSC_NULL,&UXmin); CHKERRQ(ierr); >>> ierr = VecMin(UY,PETSC_NULL,&UYmin); CHKERRQ(ierr); >>> ierr = VecMin(UZ,PETSC_NULL,&UZmin); CHKERRQ(ierr); >>> >>> PetscPrintf(PETSC_COMM_WORLD,"%10f\t <= ux <= %10f\n",UXmin,UXmax); >>> PetscPrintf(PETSC_COMM_WORLD,"%10f\t <= uy <= %10f\n",UYmin,UYmax); >>> PetscPrintf(PETSC_COMM_WORLD,"%10f\t <= uz <= %10f\n",UZmin,UZmax); >>> >>> >>> >>> >>> //BEGIN OUTPUT SOLUTION========================================================= >>> if(saveASCII){ >>> ierr = PetscViewerASCIIOpen(PETSC_COMM_WORLD,"XYZ.txt",&XYZviewer); >>> ierr = VecView(XYZ,XYZviewer); CHKERRQ(ierr); >>> ierr = PetscViewerASCIIOpen(PETSC_COMM_WORLD,"U.txt",&XYZpUviewer); >>> ierr = VecView(U,XYZpUviewer); CHKERRQ(ierr); >>> PetscViewerDestroy(&XYZviewer); PetscViewerDestroy(&XYZpUviewer); >>> >>> }//end if. 
>>> if(saveVTK){ >>> const char *meshfile = "starting_mesh.vtk", >>> *deformedfile = "deformed_mesh.vtk"; >>> ierr = PetscViewerVTKOpen(PETSC_COMM_WORLD,meshfile,FILE_MODE_WRITE,&XYZviewer); CHKERRQ(ierr); >>> //PetscErrorCode DMSetAuxiliaryVec(DM dm, DMLabel label, PetscInt value, Vec aux) >>> DMLabel UXlabel,UYlabel, UZlabel; >>> //PetscErrorCode DMLabelCreate(MPI_Comm comm, const char name[], DMLabel *label) >>> ierr = DMLabelCreate(PETSC_COMM_WORLD, "X-Displacement", &UXlabel); CHKERRQ(ierr); >>> ierr = DMLabelCreate(PETSC_COMM_WORLD, "Y-Displacement", &UYlabel); CHKERRQ(ierr); >>> ierr = DMLabelCreate(PETSC_COMM_WORLD, "Z-Displacement", &UZlabel); CHKERRQ(ierr); >>> ierr = DMSetAuxiliaryVec(dm,UXlabel, 1, UX); CHKERRQ(ierr); >>> ierr = DMSetAuxiliaryVec(dm,UYlabel, 1, UY); CHKERRQ(ierr); >>> ierr = DMSetAuxiliaryVec(dm,UZlabel, 1, UZ); CHKERRQ(ierr); >>> //PetscErrorCode PetscViewerVTKAddField(PetscViewer viewer,PetscObject dm,PetscErrorCode (*PetscViewerVTKWriteFunction)(PetscObject,PetscViewer),PetscInt fieldnum,PetscViewerVTKFieldType fieldtype,PetscBool checkdm,PetscObject vec) >>> >>> >>> >>> //ierr = PetscViewerVTKAddField(XYZviewer, dm,PetscErrorCode (*PetscViewerVTKWriteFunction)(Vec,PetscViewer),PETSC_DEFAULT,PETSC_VTK_POINT_FIELD,PETSC_FALSE,UX); >>> ierr = PetscViewerVTKAddField(XYZviewer, (PetscObject)dm,&PetscViewerVTKWriteFunction,PETSC_DEFAULT,PETSC_VTK_POINT_FIELD,PETSC_FALSE,(PetscObject)UX); >>> >>> >>> ierr = DMPlexVTKWriteAll((PetscObject)dm, XYZviewer); CHKERRQ(ierr); >>> ierr = VecAXPY(XYZ,1,U); CHKERRQ(ierr);//Add displacement field to the mesh coordinates to deform. >>> ierr = PetscViewerVTKOpen(PETSC_COMM_WORLD,deformedfile,FILE_MODE_WRITE,&XYZpUviewer); CHKERRQ(ierr); >>> ierr = DMPlexVTKWriteAll((PetscObject)dm, XYZpUviewer); CHKERRQ(ierr);// >>> PetscViewerDestroy(&XYZviewer); PetscViewerDestroy(&XYZpUviewer); >>> >>> }//end else if. >>> else{ >>> ierr = PetscPrintf(PETSC_COMM_WORLD,"No output format specified! Files not saved.\n"); CHKERRQ(ierr); >>> }//end else. >>> >>> >>> //END OUTPUT SOLUTION=========================================================== >>> VecDestroy(&UX); ISDestroy(&ISux); >>> VecDestroy(&UY); ISDestroy(&ISuy); >>> VecDestroy(&UZ); ISDestroy(&ISuz); >>> //END MAX/MIN DISPLACEMENTS===================================================== >>> >>> //CLEANUP===================================================================== >>> DMDestroy(&dm); >>> KSPDestroy(&ksp); >>> MatDestroy(&K); MatDestroy(&KE); MatDestroy(&matC); //MatDestroy(preKEtetra4.matB); MatDestroy(preKEtetra4.matBTCB); >>> VecDestroy(&U); VecDestroy(&F); >>> >>> //DMLabelDestroy(&physicalgroups);//Destroyig the DM destroys the label. >>> //CLEANUP===================================================================== >>> //PetscErrorCode PetscMallocDump(FILE *fp) >>> //ierr = PetscMallocDump(NULL); >>> return PetscFinalize();//And the machine shall rest.... >>> }//end main. >>> >>> PetscErrorCode tetra4(PetscScalar* X,PetscScalar* Y, PetscScalar* Z,struct preKE *P, Mat* matC, Mat* KE){ >>> //INPUTS: >>> //X: Global X coordinates of the elemental nodes. >>> //Y: Global Y coordinates of the elemental nodes. >>> //Z: Global Z coordinates of the elemental nodes. >>> //J: Jacobian matrix. >>> //invJ: Inverse Jacobian matrix. 
>>> PetscErrorCode ierr; >>> //For current quadrature point, get dPsi/dXi_i Xi_i = {Xi,Eta,Zeta} >>> /* >>> P->dPsidXi[0] = +1.; P->dPsidEta[0] = 0.0; P->dPsidZeta[0] = 0.0; >>> P->dPsidXi[1] = 0.0; P->dPsidEta[1] = +1.; P->dPsidZeta[1] = 0.0; >>> P->dPsidXi[2] = 0.0; P->dPsidEta[2] = 0.0; P->dPsidZeta[2] = +1.; >>> P->dPsidXi[3] = -1.; P->dPsidEta[3] = -1.; P->dPsidZeta[3] = -1.; >>> */ >>> //Populate the Jacobian matrix. >>> P->J[0][0] = X[0] - X[3]; >>> P->J[0][1] = Y[0] - Y[3]; >>> P->J[0][2] = Z[0] - Z[3]; >>> P->J[1][0] = X[1] - X[3]; >>> P->J[1][1] = Y[1] - Y[3]; >>> P->J[1][2] = Z[1] - Z[3]; >>> P->J[2][0] = X[2] - X[3]; >>> P->J[2][1] = Y[2] - Y[3]; >>> P->J[2][2] = Z[2] - Z[3]; >>> >>> //Determinant of the 3x3 Jacobian. (Expansion along 1st row). >>> P->minor00 = P->J[1][1]*P->J[2][2] - P->J[2][1]*P->J[1][2];//Reuse when finding InvJ. >>> P->minor01 = P->J[1][0]*P->J[2][2] - P->J[2][0]*P->J[1][2];//Reuse when finding InvJ. >>> P->minor02 = P->J[1][0]*P->J[2][1] - P->J[2][0]*P->J[1][1];//Reuse when finding InvJ. >>> P->detJ = P->J[0][0]*P->minor00 - P->J[0][1]*P->minor01 + P->J[0][2]*P->minor02; >>> //Inverse of the 3x3 Jacobian >>> P->invJ[0][0] = +P->minor00/P->detJ;//Reuse precomputed minor. >>> P->invJ[0][1] = -(P->J[0][1]*P->J[2][2] - P->J[0][2]*P->J[2][1])/P->detJ; >>> P->invJ[0][2] = +(P->J[0][1]*P->J[1][2] - P->J[1][1]*P->J[0][2])/P->detJ; >>> P->invJ[1][0] = -P->minor01/P->detJ;//Reuse precomputed minor. >>> P->invJ[1][1] = +(P->J[0][0]*P->J[2][2] - P->J[0][2]*P->J[2][0])/P->detJ; >>> P->invJ[1][2] = -(P->J[0][0]*P->J[1][2] - P->J[1][0]*P->J[0][2])/P->detJ; >>> P->invJ[2][0] = +P->minor02/P->detJ;//Reuse precomputed minor. >>> P->invJ[2][1] = -(P->J[0][0]*P->J[2][1] - P->J[0][1]*P->J[2][0])/P->detJ; >>> P->invJ[2][2] = +(P->J[0][0]*P->J[1][1] - P->J[0][1]*P->J[1][0])/P->detJ; >>> >>> //*****************STRAIN MATRIX (B)************************************** >>> for(P->m=0;P->mN;P->m++){//Scan all shape functions. >>> >>> P->x_in = 0 + P->m*3;//Every 3rd column starting at 0 >>> P->y_in = P->x_in +1;//Every 3rd column starting at 1 >>> P->z_in = P->y_in +1;//Every 3rd column starting at 2 >>> >>> P->dX[0] = P->invJ[0][0]*P->dPsidXi[P->m] + P->invJ[0][1]*P->dPsidEta[P->m] + P->invJ[0][2]*P->dPsidZeta[P->m]; >>> P->dY[0] = P->invJ[1][0]*P->dPsidXi[P->m] + P->invJ[1][1]*P->dPsidEta[P->m] + P->invJ[1][2]*P->dPsidZeta[P->m]; >>> P->dZ[0] = P->invJ[2][0]*P->dPsidXi[P->m] + P->invJ[2][1]*P->dPsidEta[P->m] + P->invJ[2][2]*P->dPsidZeta[P->m]; >>> >>> P->dX[1] = P->dZ[0]; P->dX[2] = P->dY[0]; >>> P->dY[1] = P->dZ[0]; P->dY[2] = P->dX[0]; >>> P->dZ[1] = P->dX[0]; P->dZ[2] = P->dY[0]; >>> >>> ierr = MatSetValues(P->matB,3,P->x_insert,1,&(P->x_in),P->dX,INSERT_VALUES); CHKERRQ(ierr); >>> ierr = MatSetValues(P->matB,3,P->y_insert,1,&(P->y_in),P->dY,INSERT_VALUES); CHKERRQ(ierr); >>> ierr = MatSetValues(P->matB,3,P->z_insert,1,&(P->z_in),P->dZ,INSERT_VALUES); CHKERRQ(ierr); >>> >>> }//end "m" loop. >>> ierr = MatAssemblyBegin(P->matB,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >>> ierr = MatAssemblyEnd(P->matB,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >>> //*****************STRAIN MATRIX (B)************************************** >>> >>> //Compute the matrix product B^t*C*B, scale it by the quadrature weights and add to KE. 
>>> P->weight = -P->detJ/6; >>> >>> ierr = MatZeroEntries(*KE); CHKERRQ(ierr); >>> ierr = MatPtAP(*matC,P->matB,MAT_INITIAL_MATRIX,PETSC_DEFAULT,&(P->matBTCB));CHKERRQ(ierr); >>> ierr = MatScale(P->matBTCB,P->weight); CHKERRQ(ierr); >>> ierr = MatAssemblyBegin(P->matBTCB,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >>> ierr = MatAssemblyEnd(P->matBTCB,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >>> ierr = MatAXPY(*KE,1,P->matBTCB,DIFFERENT_NONZERO_PATTERN); CHKERRQ(ierr);//Add contribution of current quadrature point to KE. >>> >>> //ierr = MatPtAP(*matC,P->matB,MAT_INITIAL_MATRIX,PETSC_DEFAULT,KE);CHKERRQ(ierr); >>> //ierr = MatScale(*KE,P->weight); CHKERRQ(ierr); >>> >>> ierr = MatAssemblyBegin(*KE,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >>> ierr = MatAssemblyEnd(*KE,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >>> >>> //Cleanup >>> return ierr; >>> }//end tetra4. >>> >>> PetscErrorCode ConstitutiveMatrix(Mat *matC,const char* type,PetscInt materialID){ >>> PetscErrorCode ierr; >>> PetscBool isotropic = PETSC_FALSE, >>> orthotropic = PETSC_FALSE; >>> //PetscErrorCode PetscStrcmp(const char a[],const char b[],PetscBool *flg) >>> ierr = PetscStrcmp(type,"isotropic",&isotropic); >>> ierr = PetscStrcmp(type,"orthotropic",&orthotropic); >>> ierr = MatCreate(PETSC_COMM_WORLD,matC); CHKERRQ(ierr); >>> ierr = MatSetSizes(*matC,PETSC_DECIDE,PETSC_DECIDE,6,6); CHKERRQ(ierr); >>> ierr = MatSetType(*matC,MATAIJ); CHKERRQ(ierr); >>> ierr = MatSetUp(*matC); CHKERRQ(ierr); >>> >>> if(isotropic){ >>> PetscReal E,nu, M,L,vals[3]; >>> switch(materialID){ >>> case 0://Hardcoded properties for isotropic material #0 >>> E = 200; >>> nu = 1./3; >>> break; >>> case 1://Hardcoded properties for isotropic material #1 >>> E = 96; >>> nu = 1./3; >>> break; >>> }//end switch. >>> M = E/(2*(1+nu)),//Lame's constant 1 ("mu"). >>> L = E*nu/((1+nu)*(1-2*nu));//Lame's constant 2 ("lambda"). >>> //PetscErrorCode MatSetValues(Mat mat,PetscInt m,const PetscInt idxm[],PetscInt n,const PetscInt idxn[],const PetscScalar v[],InsertMode addv) >>> PetscInt idxn[3] = {0,1,2}; >>> vals[0] = L+2*M; vals[1] = L; vals[2] = vals[1]; >>> ierr = MatSetValues(*matC,1,&idxn[0],3,idxn,vals,INSERT_VALUES); CHKERRQ(ierr); >>> vals[1] = vals[0]; vals[0] = vals[2]; >>> ierr = MatSetValues(*matC,1,&idxn[1],3,idxn,vals,INSERT_VALUES); CHKERRQ(ierr); >>> vals[2] = vals[1]; vals[1] = vals[0]; >>> ierr = MatSetValues(*matC,1,&idxn[2],3,idxn,vals,INSERT_VALUES); CHKERRQ(ierr); >>> ierr = MatSetValue(*matC,3,3,M,INSERT_VALUES); CHKERRQ(ierr); >>> ierr = MatSetValue(*matC,4,4,M,INSERT_VALUES); CHKERRQ(ierr); >>> ierr = MatSetValue(*matC,5,5,M,INSERT_VALUES); CHKERRQ(ierr); >>> }//end if. >>> /* >>> else if(orthotropic){ >>> switch(materialID){ >>> case 0: >>> break; >>> case 1: >>> break; >>> }//end switch. >>> }//end else if. >>> */ >>> ierr = MatAssemblyBegin(*matC,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >>> ierr = MatAssemblyEnd(*matC,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >>> //MatView(*matC,0); >>> return ierr; >>> }//End ConstitutiveMatrix >>> >>> PetscErrorCode InitializeKEpreallocation(struct preKE *P,const char* type){ >>> PetscErrorCode ierr; >>> PetscBool istetra4 = PETSC_FALSE, >>> ishex8 = PETSC_FALSE; >>> ierr = PetscStrcmp(type,"tetra4",&istetra4); CHKERRQ(ierr); >>> ierr = PetscStrcmp(type,"hex8",&ishex8); CHKERRQ(ierr); >>> if(istetra4){ >>> P->sizeKE = 12; >>> P->N = 4; >>> }//end if. >>> else if(ishex8){ >>> P->sizeKE = 24; >>> P->N = 8; >>> }//end else if. 
>>> >>> >>> P->x_insert[0] = 0; P->x_insert[1] = 3; P->x_insert[2] = 5; >>> P->y_insert[0] = 1; P->y_insert[1] = 4; P->y_insert[2] = 5; >>> P->z_insert[0] = 2; P->z_insert[1] = 3; P->z_insert[2] = 4; >>> //Allocate memory for the differentiated shape function vectors. >>> ierr = PetscMalloc1(P->N,&(P->dPsidXi)); CHKERRQ(ierr); >>> ierr = PetscMalloc1(P->N,&(P->dPsidEta)); CHKERRQ(ierr); >>> ierr = PetscMalloc1(P->N,&(P->dPsidZeta)); CHKERRQ(ierr); >>> >>> P->dPsidXi[0] = +1.; P->dPsidEta[0] = 0.0; P->dPsidZeta[0] = 0.0; >>> P->dPsidXi[1] = 0.0; P->dPsidEta[1] = +1.; P->dPsidZeta[1] = 0.0; >>> P->dPsidXi[2] = 0.0; P->dPsidEta[2] = 0.0; P->dPsidZeta[2] = +1.; >>> P->dPsidXi[3] = -1.; P->dPsidEta[3] = -1.; P->dPsidZeta[3] = -1.; >>> >>> >>> //Strain matrix. >>> ierr = MatCreate(PETSC_COMM_WORLD,&(P->matB)); CHKERRQ(ierr); >>> ierr = MatSetSizes(P->matB,PETSC_DECIDE,PETSC_DECIDE,6,P->sizeKE); CHKERRQ(ierr);//Hardcoded >>> ierr = MatSetType(P->matB,MATAIJ); CHKERRQ(ierr); >>> ierr = MatSetUp(P->matB); CHKERRQ(ierr); >>> >>> //Contribution matrix. >>> ierr = MatCreate(PETSC_COMM_WORLD,&(P->matBTCB)); CHKERRQ(ierr); >>> ierr = MatSetSizes(P->matBTCB,PETSC_DECIDE,PETSC_DECIDE,P->sizeKE,P->sizeKE); CHKERRQ(ierr); >>> ierr = MatSetType(P->matBTCB,MATAIJ); CHKERRQ(ierr); >>> ierr = MatSetUp(P->matBTCB); CHKERRQ(ierr); >>> >>> //Element stiffness matrix. >>> //ierr = MatCreateSeqDense(PETSC_COMM_SELF,12,12,NULL,&KE); CHKERRQ(ierr); //PARALLEL >>> >>> return ierr; >>> } From balay at mcs.anl.gov Wed Jan 5 16:51:55 2022 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 5 Jan 2022 16:51:55 -0600 (CST) Subject: [petsc-users] Building apps on Cori (FindPETSc.cmake) In-Reply-To: <58116F49-1C02-4253-8B47-7D95A2EE2032@pnnl.gov> References: <58116F49-1C02-4253-8B47-7D95A2EE2032@pnnl.gov> Message-ID: Don't have suggestions for FindPETSc.cmake - however our current recommendation is to use pkgconf [from cmake] to detect petsc. Satish On Tue, 4 Jan 2022, ^PNNL GridPACK Account via petsc-users wrote: > Hi, > > We?ve got some users of our GridPACK package that are trying to build on the Cori machine at NERSC. GridPACK uses CMake for its build system and relies on Jed Brown?s FindPETSc.cmake module, along with the FindPackageMultipass.cmake module to identify PETSc. The tests for PETSc are currently failing with > > -- Checking PETSc ... > -- petsc_lib_dir /global/u1/s/smittal/petsc/arch-linux2-c-debug/lib > -- Recognized PETSc install with single library for all packages > -- Performing Test MULTIPASS_TEST_1_petsc_works_minimal > -- Performing Test MULTIPASS_TEST_1_petsc_works_minimal - Failed > -- Performing Test MULTIPASS_TEST_2_petsc_works_allincludes > -- Performing Test MULTIPASS_TEST_2_petsc_works_allincludes - Failed > -- Performing Test MULTIPASS_TEST_3_petsc_works_alllibraries > CMake Error: The following variables are used in this project, but they are set to NOTFOUND. > Please set them or make sure they are set and tested correctly in the CMake files: > MPI_LIBRARY > linked by target "cmTC_664e6" in directory /global/homes/s/smittal/GridPACK/build/CMakeFiles/CMakeTmp > PETSC_LIBRARY_SINGLE > linked by target "cmTC_664e6" in directory /global/homes/s/smittal/GridPACK/build/CMakeFiles/CMakeTmp > > CMake Error at /global/common/cori_cle7/software/cmake/3.21.3/share/cmake-3.21/Modules/Internal/CheckSourceRuns.cmake:94 (try_run): > Failed to generate test project build system. 
> Call Stack (most recent call first): > /global/common/cori_cle7/software/cmake/3.21.3/share/cmake-3.21/Modules/CheckCSourceRuns.cmake:76 (cmake_check_source_runs) > /global/homes/s/smittal/GridPACK/cmake-jedbrown/FindPackageMultipass.cmake:97 (check_c_source_runs) > /global/homes/s/smittal/GridPACK/cmake-jedbrown/FindPETSc.cmake:293 (multipass_source_runs) > /global/homes/s/smittal/GridPACK/cmake-jedbrown/FindPETSc.cmake:332 (petsc_test_runs) > CMakeLists.txt:280 (find_package) > > > -- Configuring incomplete, errors occurred! > > We have code in the CMakeLists.txt file to identify a Cray build and set the MPI_LIBRARY variable to ?? instead of NOTFOUND but that may be failing. The PETSC_LIBRARY_SINGLE error is new and one that I haven?t seen in past attempts to build at NERSC. My recollection was that the FindMPI module was not geared towards identifying the MPI compiler wrappers on Crays and that had a tendency to mess everything else up. > > Have you seen these kinds of problems recently and if so, has anyone come up with a solution? > > Bruce > > From jed at jedbrown.org Wed Jan 5 17:12:56 2022 From: jed at jedbrown.org (Jed Brown) Date: Wed, 05 Jan 2022 16:12:56 -0700 Subject: [petsc-users] Building apps on Cori (FindPETSc.cmake) In-Reply-To: References: <58116F49-1C02-4253-8B47-7D95A2EE2032@pnnl.gov> Message-ID: <874k6h93pj.fsf@jedbrown.org> Yeah, recommendation is to use FindPkgConfig.cmake. https://petsc.org/release/faq/#can-i-use-cmake-to-build-my-own-project-that-depends-on-petsc If you are forced to use a very old CMake, then you need to inspect CMakeFiles/CMakeOutput.log and CMakeFiles/CMakeError.log. These files are somewhat confusing and often lack needed information. The debug-by-email experience with CMake has been so bad I've spent the last decade trying to get away from it. Satish Balay writes: > Don't have suggestions for FindPETSc.cmake - however our current recommendation is to use pkgconf [from cmake] to detect petsc. > > Satish > > On Tue, 4 Jan 2022, ^PNNL GridPACK Account via petsc-users wrote: > >> Hi, >> >> We?ve got some users of our GridPACK package that are trying to build on the Cori machine at NERSC. GridPACK uses CMake for its build system and relies on Jed Brown?s FindPETSc.cmake module, along with the FindPackageMultipass.cmake module to identify PETSc. The tests for PETSc are currently failing with >> >> -- Checking PETSc ... >> -- petsc_lib_dir /global/u1/s/smittal/petsc/arch-linux2-c-debug/lib >> -- Recognized PETSc install with single library for all packages >> -- Performing Test MULTIPASS_TEST_1_petsc_works_minimal >> -- Performing Test MULTIPASS_TEST_1_petsc_works_minimal - Failed >> -- Performing Test MULTIPASS_TEST_2_petsc_works_allincludes >> -- Performing Test MULTIPASS_TEST_2_petsc_works_allincludes - Failed >> -- Performing Test MULTIPASS_TEST_3_petsc_works_alllibraries >> CMake Error: The following variables are used in this project, but they are set to NOTFOUND. >> Please set them or make sure they are set and tested correctly in the CMake files: >> MPI_LIBRARY >> linked by target "cmTC_664e6" in directory /global/homes/s/smittal/GridPACK/build/CMakeFiles/CMakeTmp >> PETSC_LIBRARY_SINGLE >> linked by target "cmTC_664e6" in directory /global/homes/s/smittal/GridPACK/build/CMakeFiles/CMakeTmp >> >> CMake Error at /global/common/cori_cle7/software/cmake/3.21.3/share/cmake-3.21/Modules/Internal/CheckSourceRuns.cmake:94 (try_run): >> Failed to generate test project build system. 
>> Call Stack (most recent call first): >> /global/common/cori_cle7/software/cmake/3.21.3/share/cmake-3.21/Modules/CheckCSourceRuns.cmake:76 (cmake_check_source_runs) >> /global/homes/s/smittal/GridPACK/cmake-jedbrown/FindPackageMultipass.cmake:97 (check_c_source_runs) >> /global/homes/s/smittal/GridPACK/cmake-jedbrown/FindPETSc.cmake:293 (multipass_source_runs) >> /global/homes/s/smittal/GridPACK/cmake-jedbrown/FindPETSc.cmake:332 (petsc_test_runs) >> CMakeLists.txt:280 (find_package) >> >> >> -- Configuring incomplete, errors occurred! >> >> We have code in the CMakeLists.txt file to identify a Cray build and set the MPI_LIBRARY variable to ?? instead of NOTFOUND but that may be failing. The PETSC_LIBRARY_SINGLE error is new and one that I haven?t seen in past attempts to build at NERSC. My recollection was that the FindMPI module was not geared towards identifying the MPI compiler wrappers on Crays and that had a tendency to mess everything else up. >> >> Have you seen these kinds of problems recently and if so, has anyone come up with a solution? >> >> Bruce >> >> From balay at mcs.anl.gov Wed Jan 5 21:51:45 2022 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 5 Jan 2022 21:51:45 -0600 (CST) Subject: [petsc-users] petsc-3.16.3 now available Message-ID: <5de2a9ba-e5de-5a95-92db-1c48b822b33@mcs.anl.gov> Dear PETSc users, The patch release petsc-3.16.3 is now available for download. https://petsc.org/release/download/ Satish From ning.li at simtechnologyus.com Thu Jan 6 12:52:37 2022 From: ning.li at simtechnologyus.com (Ning Li) Date: Thu, 6 Jan 2022 18:52:37 +0000 Subject: [petsc-users] how do you calculate the relative tolerance Message-ID: Hello, I am trying to compare PETSc and Lis libraries. In PETSc I could use absolute or relative tolerance. I want to know what is your definition of relative tolerance? In Lis, they have 3 convergence conditions: [cid:image001.png at 01D802FC.1A405EF0] Thanks Ning Li -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 51481 bytes Desc: image001.png URL: From bsmith at petsc.dev Thu Jan 6 13:45:06 2022 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 6 Jan 2022 14:45:06 -0500 Subject: [petsc-users] how do you calculate the relative tolerance In-Reply-To: References: Message-ID: <74703F79-1C89-4746-B590-93395D9D18B6@petsc.dev> This is discussed in https://petsc.org/release/docs/manual/ksp/ you can see the exact formulas in the code at https://petsc.org/release/src/ksp/ksp/interface/iterativ.c.html#KSPConvergedDefault If one uses a nonzero initial guess then there are two choices for the initial residual definition, either the residual at the initial guess or the right hand side for the problem see KSPConvergedDefaultSetUIRNorm Barry > On Jan 6, 2022, at 1:52 PM, Ning Li wrote: > > Hello, > > I am trying to compare PETSc and Lis libraries. > > In PETSc I could use absolute or relative tolerance. I want to know what is your definition of relative tolerance? > > In Lis, they have 3 convergence conditions: > > > Thanks > > Ning Li -------------- next part -------------- An HTML attachment was scrubbed... URL: From FERRANJ2 at my.erau.edu Thu Jan 6 16:14:49 2022 From: FERRANJ2 at my.erau.edu (Ferrand, Jesus A.) Date: Thu, 6 Jan 2022 22:14:49 +0000 Subject: [petsc-users] [EXTERNAL] Re: DM misuse causes massive memory leak? 
In-Reply-To: <87a6g99507.fsf@jedbrown.org> References: <87tueraunm.fsf@jedbrown.org> <87sfu2921h.fsf@jedbrown.org> <87a6g99507.fsf@jedbrown.org> Message-ID: Jed: DMPlexLabelComplete() has allowed me to speed up my code significantly (Many thanks!). I did not use DMAddBoundary() though. I figured I could obtain Index Sets (IS's) from the DAG for different depths and then IS's for the points that were flagged in Gmsh (after calling DMPlexLabelComplete()). I then intersected both IS's using ISIntersect() to get the DAG points corresponding to just vertices (and flagged by Gmsh) for Dirichlet BC's, and DAG points that are Faces and flagged by Gmsh for Neumann BC's. I then use the intersected IS to edit a Mat and a RHS Vec manually. I did further profiling and have found the PetsSections are now the next biggest overhead. For Dirichlet BC's I make an array of vertex ID's and call MatSetZeroRows() to impose BC's on them through my K matrix. And yes, I'm solving the elasticity PDE. For Neumann BC's I use DMPlexGetRecursiveVertices() to edit my RHS vector. I want to keep the PetscSections since they preallocate my matrix rather well (the one from DMCreateMatrix()) but at the same time would like to remove them since they add overhead. Do you think DMAddboundary() with the function call will be faster than my single calls to MatSetZeroRows() and DMPlexGetRecursiveVertices() ? ________________________________ From: Jed Brown Sent: Wednesday, January 5, 2022 5:44 PM To: Ferrand, Jesus A. Cc: petsc-users Subject: Re: [EXTERNAL] Re: [petsc-users] DM misuse causes massive memory leak? For something like displacement (and this sounds like elasticity), I would recommend using one field with three components. You can constrain a subset of the components to implement slip conditions. You can use DMPlexLabelComplete(dm, label) to propagate those face labels to vertices. "Ferrand, Jesus A." writes: > Thanks for the reply (I hit reply all this time). > > So, I set 3 fields: > /* > ierr = PetscSectionSetNumFields(s,dof); CHKERRQ(ierr); > ierr = PetscSectionSetFieldName(s,0, "X-Displacement"); CHKERRQ(ierr); //Field ID is 0 > ierr = PetscSectionSetFieldName(s,1, "Y-Displacement"); CHKERRQ(ierr); //Field ID is 1 > ierr = PetscSectionSetFieldName(s,2, "Z-Displacement"); CHKERRQ(ierr); //Field ID is 2 > */ > > I then loop through the vertices of my DMPlex > > /* > for(ii = vStart; ii < vEnd; ii++){//Vertex loop. > ierr = PetscSectionSetDof(s, ii, dof); CHKERRQ(ierr); > ierr = PetscSectionSetFieldDof(s,ii,0,1); CHKERRQ(ierr);//One X-displacement per vertex (1 dof) > ierr = PetscSectionSetFieldDof(s,ii,1,1); CHKERRQ(ierr);//One Y-displacement per vertex (1 dof) > ierr = PetscSectionSetFieldDof(s,ii,2,1); CHKERRQ(ierr);//One Z-displacement per vertex (1 dof) > }//Sets x, y, and z displacements as dofs. > */ > > I only associated fields with vertices, not with any other points in the DAG. Regarding the use of DMAddBoundary(), I mostly copied the usage shown in SNES example 77. I modified the function definition to simply set the dof to 0.0 as opposed to the coordinates. Below "physicalgroups" is the DMLabel that I got from gmsh, this flags Face points, not vertices. That is why I think the error log suggests that fields were never set. 
> > ierr = DMAddBoundary(dm, DM_BC_ESSENTIAL, "fixed", physicalgroups, 1, &surfvals[ii], fieldID, 0, NULL, (void (*)(void)) coordinates, NULL, NULL, NULL); CHKERRQ(ierr); > PetscErrorCode coordinates(PetscInt dim, PetscReal time, const PetscReal x[], PetscInt Nf, PetscScalar *u, void *ctx){ > const PetscInt Ncomp = dim; > PetscInt comp; > for (comp = 0; comp < Ncomp; ++comp) u[comp] = 0.0; > return 0; > } > > > ________________________________ > From: Jed Brown > Sent: Wednesday, January 5, 2022 12:36 AM > To: Ferrand, Jesus A. > Cc: petsc-users > Subject: Re: [EXTERNAL] Re: [petsc-users] DM misuse causes massive memory leak? > > Please "reply all" include the list in the future. > > "Ferrand, Jesus A." writes: > >> Forgot to say thanks for the reply (my bad). >> Yes, I was indeed forgetting to pre-allocate the sparse matrices when doing them by hand (complacency seeded by MATLAB's zeros()). Thank you, Jed, and Jeremy, for the hints! >> >> I have more questions, these ones about boundary conditions (I think these are for Matt). >> In my current code I set Dirichlet conditions directly on a Mat by calling MatSetZeroRows(). I profiled my code and found the part that applies them to be unnacceptably slow. In response, I've been trying to better pre-allocate Mats using PetscSections. I have found documentation for PetscSectionSetDof(), PetscSectionSetNumFields(), PetscSectionSetFieldName(), and PetscSectionSetFieldDof(), to set the size of my Mats and Vecs by calling DMSetLocalSection() followed by DMCreateMatrix() and DMCreateLocalVector() to get a RHS vector. This seems faster. >> >> In PetscSection, what is the difference between a "field" and a "component"? For example, I could have one field "Velocity" with three components ux, uy, and uz or perhaps three fields ux, uy, and uz each with a default component? > > It's just how you name them and how they appear in output. Usually "velocity" is better as a field with three components, but fields with other meaning (and perhaps different finite element spaces), such as pressure, would be different fields. Different components are always in the same FE space. > >> I am struggling now to impose boundary conditions after constraining dofs using PetscSection. My understanding is that constraining dof's reduces the size of the DM's matrix but it does not give the DM knowledge of what values the constrained dofs should have, right? >> >> I know that there is DMAddBoundary(), but I am unsure of how to use it. >From Gmsh I have a mesh with surface boundaries flagged. I'm not sure whether DMAddBoundary()will constrain the face, edge, or vertex points when I give it the DMLabel from Gmsh. (I need specific dof on the vertices to be constrained). I did some testing and I think DMAddBoundary() attempts to constrain the Face points (see error log below). I only associated fields with the vertices but not the Faces. I can extract the vertex points from the face label using DMPlexGetConeRecursiveVertices() but the output IS has repeated entries for the vertex points (many faces share the same vertex). Is there an easier way to get the vertex points from a gmsh surface tag? > > How did you call DMAddBoundary()? Are you using DM_BC_ESSENTIAL and a callback function that provides the inhomogeneous boundary condition? > >> I'm sorry this is a mouthful. 
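Putting the DMAddBoundary() advice above together (a single three-component displacement field, DM_BC_ESSENTIAL with a pointwise callback, and DMPlexLabelComplete() to push the Gmsh face label down to vertices), a rough sketch might look like the following. The boundary name, the label value pinZvalue, and the helper names are placeholders, and it assumes a 3-component field 0 has already been registered on the DM; the "Field number 2 must be in [0, 0)" error quoted below is what appears when no field has been added yet.

#include <petscdmplex.h>

/* Homogeneous essential condition: zero out whichever components are constrained. */
static PetscErrorCode zero_displacement(PetscInt dim, PetscReal time, const PetscReal x[], PetscInt Nc, PetscScalar *u, void *ctx)
{
  PetscInt c;
  for (c = 0; c < Nc; ++c) u[c] = 0.0;
  return 0;
}

static PetscErrorCode ApplyPinZ(DM dm)
{
  DMLabel        faceSets;
  PetscInt       pinZvalue = 11; /* placeholder Gmsh "Face Sets" value */
  PetscInt       zComp     = 2;  /* constrain only the z-displacement component of field 0 */
  PetscErrorCode ierr;

  ierr = DMGetLabel(dm, "Face Sets", &faceSets); CHKERRQ(ierr);
  ierr = DMPlexLabelComplete(dm, faceSets); CHKERRQ(ierr); /* propagate the face label to edges and vertices */
  /* Same argument layout as the call quoted above, but constraining one named
     component instead of passing Nc = 0. */
  ierr = DMAddBoundary(dm, DM_BC_ESSENTIAL, "pinZ", faceSets, 1, &pinZvalue, 0, 1, &zComp,
                       (void (*)(void)) zero_displacement, NULL, NULL, NULL); CHKERRQ(ierr);
  return 0;
}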
>> >> [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [0]PETSC ERROR: Argument out of range >> [0]PETSC ERROR: Field number 2 must be in [0, 0) > > It looks like you haven't added these fields yet. > >> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. >> [0]PETSC ERROR: Petsc Release Version 3.16.0, unknown >> [0]PETSC ERROR: ./gmsh4 on a linux-c-dbg named F86 by jesus Tue Jan 4 15:19:57 2022 >> [0]PETSC ERROR: Configure options --with-32bits-pci-domain=1 --with-debugging =1 >> [0]PETSC ERROR: #1 DMGetField() at /home/jesus/SAND/PETSc_install/petsc/src/dm/interface/dm.c:4803 >> [0]PETSC ERROR: #2 DMCompleteBoundaryLabel_Internal() at /home/jesus/SAND/PETSc_install/petsc/src/dm/interface/dm.c:5140 >> [0]PETSC ERROR: #3 DMAddBoundary() at /home/jesus/SAND/PETSc_install/petsc/src/dm/interface/dm.c:8561 >> [0]PETSC ERROR: #4 main() at /home/jesus/SAND/FEA/3D/gmshBACKUP4.c:215 >> [0]PETSC ERROR: No PETSc Option Table entries >> [0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- >> >> >> >> >> >> >> ________________________________ >> From: Jed Brown >> Sent: Wednesday, December 29, 2021 5:55 PM >> To: Ferrand, Jesus A. ; petsc-users at mcs.anl.gov >> Subject: [EXTERNAL] Re: [petsc-users] DM misuse causes massive memory leak? >> >> CAUTION: This email originated outside of Embry-Riddle Aeronautical University. Do not click links or open attachments unless you recognize the sender and know the content is safe. >> >> >> "Ferrand, Jesus A." writes: >> >>> Dear PETSc Team: >>> >>> I have a question about DM and PetscSection. Say I import a mesh (for FEM purposes) and create a DMPlex for it. I then use PetscSections to set degrees of freedom per "point" (by point I mean vertices, lines, faces, and cells). I then use PetscSectionGetStorageSize() to get the size of the global stiffness matrix (K) needed for my FEM problem. One last detail, this K I populate inside a rather large loop using an element stiffness matrix function of my own. Instead of using DMCreateMatrix(), I manually created a Mat using MatCreate(), MatSetType(), MatSetSizes(), and MatSetUp(). I come to find that said loop is painfully slow when I use the manually created matrix, but 20x faster when I use the Mat coming out of DMCreateMatrix(). >> >> The sparse matrix hasn't been preallocated, which forces the data structure to do a lot of copies (as bad as O(n^2) complexity). DMCreateMatrix() preallocates for you. >> >> https://petsc.org/release/docs/manual/performance/#memory-allocation-for-sparse-matrix-assembly >> https://petsc.org/release/docs/manual/mat/#sec-matsparse >> >>> My question is then: Is the manual Mat a noob mistake and is it somehow creating a memory leak with K? Just in case it's something else I'm attaching the code. The loop that populates K is between lines 221 and 278. Anything related to DM, DMPlex, and PetscSection is between lines 117 and 180. >>> >>> Machine Type: HP Laptop >>> C-compiler: Gnu C >>> OS: Ubuntu 20.04 >>> PETSc version: 3.16.0 >>> MPI Implementation: MPICH >>> >>> Hope you all had a Merry Christmas and that you will have a happy and productive New Year. :D >>> >>> >>> Sincerely: >>> >>> J.A. Ferrand >>> >>> Embry-Riddle Aeronautical University - Daytona Beach FL >>> >>> M.Sc. Aerospace Engineering | May 2022 >>> >>> B.Sc. Aerospace Engineering >>> >>> B.Sc. 
Computational Mathematics >>> >>> >>> >>> Sigma Gamma Tau >>> >>> Tau Beta Pi >>> >>> Honors Program >>> >>> >>> >>> Phone: (386)-843-1829 >>> >>> Email(s): ferranj2 at my.erau.edu >>> >>> jesus.ferrand at gmail.com >>> //REFERENCE: https://github.com/FreeFem/FreeFem-sources/blob/master/plugin/mpi/PETSc-code.hpp >>> #include >>> static char help[] = "Imports a Gmsh mesh with boundary conditions and solves the elasticity equation.\n" >>> "Option prefix = opt_.\n"; >>> >>> struct preKE{//Preallocation before computing KE >>> Mat matB, >>> matBTCB; >>> //matKE; >>> PetscInt x_insert[3], >>> y_insert[3], >>> z_insert[3], >>> m,//Looping variables. >>> sizeKE,//size of the element stiffness matrix. >>> N,//Number of nodes in element. >>> x_in,y_in,z_in; //LI to index B matrix. >>> PetscReal J[3][3],//Jacobian matrix. >>> invJ[3][3],//Inverse of the Jacobian matrix. >>> detJ,//Determinant of the Jacobian. >>> dX[3], >>> dY[3], >>> dZ[3], >>> minor00, >>> minor01, >>> minor02,//Determinants of minors in a 3x3 matrix. >>> dPsidX, dPsidY, dPsidZ,//Shape function derivatives w.r.t global coordinates. >>> weight,//Multiplier of quadrature weights. >>> *dPsidXi,//Derivatives of shape functions w.r.t. Xi. >>> *dPsidEta,//Derivatives of shape functions w.r.t. Eta. >>> *dPsidZeta;//Derivatives of shape functions w.r.t Zeta. >>> PetscErrorCode ierr; >>> };//end struct. >>> >>> //Function declarations. >>> extern PetscErrorCode tetra4(PetscScalar*, PetscScalar*, PetscScalar*,struct preKE*, Mat*, Mat*); >>> extern PetscErrorCode ConstitutiveMatrix(Mat*,const char*,PetscInt); >>> extern PetscErrorCode InitializeKEpreallocation(struct preKE*,const char*); >>> >>> PetscErrorCode PetscViewerVTKWriteFunction(PetscObject vec,PetscViewer viewer){ >>> PetscErrorCode ierr; >>> ierr = VecView((Vec)vec,viewer); CHKERRQ(ierr); >>> return ierr; >>> } >>> >>> >>> >>> >>> int main(int argc, char **args){ >>> //DEFINITIONS OF PETSC's DMPLEX LINGO: >>> //POINT: A topology element (cell, face, edge, or vertex). >>> //CHART: It an interval from 0 to the number of "points." (the range of admissible linear indices) >>> //STRATUM: A subset of the "chart" which corresponds to all "points" at a given "level." >>> //LEVEL: This is either a "depth" or a "height". >>> //HEIGHT: Dimensionality of an element measured from 0D to 3D. Heights: cell = 0, face = 1, edge = 2, vertex = 3. >>> //DEPTH: Dimensionality of an element measured from 3D to 0D. Depths: cell = 3, face = 2, edge = 1, vertex = 0; >>> //CLOSURE: *of an element is the collection of all other elements that define it.I.e., the closure of a surface is the collection of vertices and edges that make it up. >>> //STAR: >>> //STANDARD LABELS: These are default tags that DMPlex has for its topology. ("depth") >>> PetscErrorCode ierr;//Error tracking variable. >>> DM dm;//Distributed memory object (useful for managing grids.) >>> DMLabel physicalgroups;//Identifies user-specified tags in gmsh (to impose BC's). >>> DMPolytopeType celltype;//When looping through cells, determines its type (tetrahedron, pyramid, hexahedron, etc.) >>> PetscSection s; >>> KSP ksp;//Krylov Sub-Space (linear solver object) >>> Mat K,//Global stiffness matrix (Square, assume unsymmetric). >>> KE,//Element stiffness matrix (Square, assume unsymmetric). >>> matC;//Constitutive matrix. >>> Vec XYZ,//Coordinate vector, contains spatial locations of mesh's vertices (NOTE: This vector self-destroys!). >>> U,//Displacement vector. >>> F;//Load Vector. 
>>> PetscViewer XYZviewer,//Viewer object to output mesh to ASCII format. >>> XYZpUviewer; //Viewer object to output displacements to ASCII format. >>> PetscBool interpolate = PETSC_TRUE,//Instructs Gmsh importer whether to generate faces and edges (Needed when using P2 or higher elements). >>> useCone = PETSC_TRUE,//Instructs "DMPlexGetTransitiveClosure()" whether to extract the closure or the star. >>> dirichletBC = PETSC_FALSE,//For use when assembling the K matrix. >>> neumannBC = PETSC_FALSE,//For use when assembling the F vector. >>> saveASCII = PETSC_FALSE,//Whether to save results in ASCII format. >>> saveVTK = PETSC_FALSE;//Whether to save results as VTK format. >>> PetscInt nc,//number of cells. (PETSc lingo for "elements") >>> nv,//number of vertices. (PETSc lingo for "nodes") >>> nf,//number of faces. (PETSc lingo for "surfaces") >>> ne,//number of edges. (PETSc lingo for "lines") >>> pStart,//starting LI of global elements. >>> pEnd,//ending LI of all elements. >>> cStart,//starting LI for cells global arrangement. >>> cEnd,//ending LI for cells in global arrangement. >>> vStart,//starting LI for vertices in global arrangement. >>> vEnd,//ending LI for vertices in global arrangement. >>> fStart,//starting LI for faces in global arrangement. >>> fEnd,//ending LI for faces in global arrangement. >>> eStart,//starting LI for edges in global arrangement. >>> eEnd,//ending LI for edges in global arrangement. >>> sizeK,//Size of the element stiffness matrix. >>> ii,jj,kk,//Dedicated looping variables. >>> indexXYZ,//Variable to access the elements of XYZ vector. >>> indexK,//Variable to access the elements of the U and F vectors (can reference rows and colums of K matrix.) >>> *closure = PETSC_NULL,//Pointer to the closure elements of a cell. >>> size_closure,//Size of the closure of a cell. >>> dim,//Dimension of the mesh. >>> //*edof,//Linear indices of dof's inside the K matrix. >>> dof = 3,//Degrees of freedom per node. >>> cells=0, edges=0, vertices=0, faces=0,//Topology counters when looping through cells. >>> pinXcode=10, pinZcode=11,forceZcode=12;//Gmsh codes to extract relevant "Face Sets." >>> PetscReal //*x_el,//Pointer to a vector that will store the x-coordinates of an element's vertices. >>> //*y_el,//Pointer to a vector that will store the y-coordinates of an element's vertices. >>> //*z_el,//Pointer to a vector that will store the z-coordinates of an element's vertices. >>> *xyz_el,//Pointer to xyz array in the XYZ vector. >>> traction = -10, >>> *KEdata, >>> t1,t2; //time keepers. >>> const char *gmshfile = "TopOptmeshfine2.msh";//Name of gmsh file to import. >>> >>> ierr = PetscInitialize(&argc,&args,NULL,help); if(ierr) return ierr; //And the machine shall work.... >>> >>> //MESH IMPORT================================================================= >>> //IMPORTANT NOTE: Gmsh only creates "cells" and "vertices," it does not create the "faces" or "edges." >>> //Gmsh probably can generate them, must figure out how to. >>> t1 = MPI_Wtime(); >>> ierr = DMPlexCreateGmshFromFile(PETSC_COMM_WORLD,gmshfile,interpolate,&dm); CHKERRQ(ierr);//Read Gmsh file and generate the DMPlex. >>> ierr = DMGetDimension(dm, &dim); CHKERRQ(ierr);//1-D, 2-D, or 3-D >>> ierr = DMPlexGetChart(dm, &pStart, &pEnd); CHKERRQ(ierr);//Extracts linear indices of cells, vertices, faces, and edges. 
>>> ierr = DMGetCoordinatesLocal(dm,&XYZ); CHKERRQ(ierr);//Extracts coordinates from mesh.(Contiguous storage: [x0,y0,z0,x1,y1,z1,...]) >>> ierr = VecGetArray(XYZ,&xyz_el); CHKERRQ(ierr);//Get pointer to vector's data. >>> t2 = MPI_Wtime(); >>> PetscPrintf(PETSC_COMM_WORLD,"Mesh Import time: %10f\n",t2-t1); >>> DMView(dm,PETSC_VIEWER_STDOUT_WORLD); >>> >>> //IMPORTANT NOTE: PETSc assumes that vertex-cell meshes are 2D even if they were 3D, so its ordering changes. >>> //Cells remain at height 0, but vertices move to height 1 from height 3. To prevent this from becoming an issue >>> //the "interpolate" variable is set to PETSC_TRUE to tell the mesh importer to generate faces and edges. >>> //PETSc, therefore, technically does additional meshing. Gotta figure out how to get this from Gmsh directly. >>> ierr = DMPlexGetDepthStratum(dm,3, &cStart, &cEnd);//Get LI of cells. >>> ierr = DMPlexGetDepthStratum(dm,2, &fStart, &fEnd);//Get LI of faces >>> ierr = DMPlexGetDepthStratum(dm,1, &eStart, &eEnd);//Get LI of edges. >>> ierr = DMPlexGetDepthStratum(dm,0, &vStart, &vEnd);//Get LI of vertices. >>> ierr = DMGetStratumSize(dm,"depth", 3, &nc);//Get number of cells. >>> ierr = DMGetStratumSize(dm,"depth", 2, &nf);//Get number of faces. >>> ierr = DMGetStratumSize(dm,"depth", 1, &ne);//Get number of edges. >>> ierr = DMGetStratumSize(dm,"depth", 0, &nv);//Get number of vertices. >>> /* >>> PetscPrintf(PETSC_COMM_WORLD,"global start = %10d\t global end = %10d\n",pStart,pEnd); >>> PetscPrintf(PETSC_COMM_WORLD,"#cells = %10d\t i = %10d\t i < %10d\n",nc,cStart,cEnd); >>> PetscPrintf(PETSC_COMM_WORLD,"#faces = %10d\t i = %10d\t i < %10d\n",nf,fStart,fEnd); >>> PetscPrintf(PETSC_COMM_WORLD,"#edges = %10d\t i = %10d\t i < %10d\n",ne,eStart,eEnd); >>> PetscPrintf(PETSC_COMM_WORLD,"#vertices = %10d\t i = %10d\t i < %10d\n",nv,vStart,vEnd); >>> */ >>> //MESH IMPORT================================================================= >>> >>> //NOTE: This section extremely hardcoded right now. >>> //Current setup would only support P1 meshes. >>> //MEMORY ALLOCATION ========================================================== >>> ierr = PetscSectionCreate(PETSC_COMM_WORLD, &s); CHKERRQ(ierr); >>> //The chart is akin to a contiguous memory storage allocation. Each chart entry is associated >>> //with a "thing," could be a vertex, face, cell, or edge, or anything else. >>> ierr = PetscSectionSetChart(s, pStart, pEnd); CHKERRQ(ierr); >>> //For each "thing" in the chart, additional room can be made. This is helpful for associating >>> //nodes to multiple degrees of freedom. These commands help associate nodes with >>> for(ii = cStart; ii < cEnd; ii++){//Cell loop. >>> ierr = PetscSectionSetDof(s, ii, 0);CHKERRQ(ierr);}//NOTE: Currently no dof's associated with cells. >>> for(ii = fStart; ii < fEnd; ii++){//Face loop. >>> ierr = PetscSectionSetDof(s, ii, 0);CHKERRQ(ierr);}//NOTE: Currently no dof's associated with faces. >>> for(ii = vStart; ii < vEnd; ii++){//Vertex loop. >>> ierr = PetscSectionSetDof(s, ii, dof);CHKERRQ(ierr);}//Sets x, y, and z displacements as dofs. >>> for(ii = eStart; ii < eEnd; ii++){//Edge loop >>> ierr = PetscSectionSetDof(s, ii, 0);CHKERRQ(ierr);}//NOTE: Currently no dof's associated with edges. >>> ierr = PetscSectionSetUp(s); CHKERRQ(ierr); >>> ierr = PetscSectionGetStorageSize(s,&sizeK);CHKERRQ(ierr);//Determine the size of the global stiffness matrix. >>> ierr = DMSetLocalSection(dm,s); CHKERRQ(ierr);//Associate the PetscSection with the DM object. 
>>> //PetscErrorCode DMCreateGlobalVector(DM dm,Vec *vec) >>> //ierr = DMCreateGlobalVector(dm,&U); CHKERRQ(ierr); >>> PetscSectionDestroy(&s); >>> //PetscPrintf(PETSC_COMM_WORLD,"sizeK = %10d\n",sizeK); >>> >>> //OBJECT SETUP================================================================ >>> //Global stiffness matrix. >>> //PetscErrorCode DMCreateMatrix(DM dm,Mat *mat) >>> >>> //This makes the loop fast. >>> ierr = DMCreateMatrix(dm,&K); >>> >>> //This makes the loop uber slow. >>> //ierr = MatCreate(PETSC_COMM_WORLD,&K); CHKERRQ(ierr); >>> //ierr = MatSetType(K,MATAIJ); CHKERRQ(ierr);// Global stiffness matrix set to some sparse type. >>> //ierr = MatSetSizes(K,PETSC_DECIDE,PETSC_DECIDE,sizeK,sizeK); CHKERRQ(ierr); >>> //ierr = MatSetUp(K); CHKERRQ(ierr); >>> >>> //Displacement vector. >>> ierr = VecCreate(PETSC_COMM_WORLD,&U); CHKERRQ(ierr); >>> ierr = VecSetType(U,VECSTANDARD); CHKERRQ(ierr);// Global stiffness matrix set to some sparse type. >>> ierr = VecSetSizes(U,PETSC_DECIDE,sizeK); CHKERRQ(ierr); >>> >>> //Load vector. >>> ierr = VecCreate(PETSC_COMM_WORLD,&F); CHKERRQ(ierr); >>> ierr = VecSetType(F,VECSTANDARD); CHKERRQ(ierr);// Global stiffness matrix set to some sparse type. >>> ierr = VecSetSizes(F,PETSC_DECIDE,sizeK); CHKERRQ(ierr); >>> //OBJECT SETUP================================================================ >>> >>> //WARNING: This loop is currently hardcoded for P1 elements only! Must Figure >>> //out a clever way to modify to accomodate Pn (n>1) elements. >>> >>> //BEGIN GLOBAL STIFFNESS MATRIX BUILDER======================================= >>> t1 = MPI_Wtime(); >>> >>> //PREALLOCATIONS============================================================== >>> ierr = ConstitutiveMatrix(&matC,"isotropic",0); CHKERRQ(ierr); >>> struct preKE preKEtetra4; >>> ierr = InitializeKEpreallocation(&preKEtetra4,"tetra4"); CHKERRQ(ierr); >>> ierr = MatCreate(PETSC_COMM_WORLD,&KE); CHKERRQ(ierr); //SEQUENTIAL >>> ierr = MatSetSizes(KE,PETSC_DECIDE,PETSC_DECIDE,12,12); CHKERRQ(ierr); //SEQUENTIAL >>> ierr = MatSetType(KE,MATDENSE); CHKERRQ(ierr); //SEQUENTIAL >>> ierr = MatSetUp(KE); CHKERRQ(ierr); >>> PetscReal x_tetra4[4], y_tetra4[4],z_tetra4[4], >>> x_hex8[8], y_hex8[8],z_hex8[8], >>> *x,*y,*z; >>> PetscInt *EDOF,edof_tetra4[12],edof_hex8[24]; >>> DMPolytopeType previous = DM_POLYTOPE_UNKNOWN; >>> //PREALLOCATIONS============================================================== >>> >>> >>> >>> for(ii=cStart;ii>> ierr = DMPlexGetTransitiveClosure(dm, ii, useCone, &size_closure, &closure); CHKERRQ(ierr); >>> ierr = DMPlexGetCellType(dm, ii, &celltype); CHKERRQ(ierr); >>> //IMPORTANT NOTE: MOST OF THIS LOOP SHOULD BE INCLUDED IN THE KE3D function. >>> if(previous != celltype){ >>> //PetscPrintf(PETSC_COMM_WORLD,"run \n"); >>> if(celltype == DM_POLYTOPE_TETRAHEDRON){ >>> x = x_tetra4; >>> y = y_tetra4; >>> z = z_tetra4; >>> EDOF = edof_tetra4; >>> }//end if. >>> else if(celltype == DM_POLYTOPE_HEXAHEDRON){ >>> x = x_hex8; >>> y = y_hex8; >>> z = z_hex8; >>> EDOF = edof_hex8; >>> }//end else if. >>> } >>> previous = celltype; >>> >>> //PetscPrintf(PETSC_COMM_WORLD,"Cell # %4i\t",ii); >>> cells=0; >>> edges=0; >>> vertices=0; >>> faces=0; >>> kk = 0; >>> for(jj=0;jj<(2*size_closure);jj+=2){//Scan the closure of the current cell. >>> //Use information from the DM's strata to determine composition of cell_ii. >>> if(vStart <= closure[jj] && closure[jj]< vEnd){//Check for vertices. 
>>> //PetscPrintf(PETSC_COMM_WORLD,"%5i\t",closure[jj]); >>> indexXYZ = dim*(closure[jj]-vStart);//Linear index of x-coordinate in the xyz_el array. >>> >>> *(x+vertices) = xyz_el[indexXYZ]; >>> *(y+vertices) = xyz_el[indexXYZ+1];//Extract Y-coordinates of the current vertex. >>> *(z+vertices) = xyz_el[indexXYZ+2];//Extract Y-coordinates of the current vertex. >>> *(EDOF + kk) = indexXYZ; >>> *(EDOF + kk+1) = indexXYZ+1; >>> *(EDOF + kk+2) = indexXYZ+2; >>> kk+=3; >>> vertices++;//Update vertex counter. >>> }//end if >>> else if(eStart <= closure[jj] && closure[jj]< eEnd){//Check for edge ID's >>> edges++; >>> }//end else ifindexK >>> else if(fStart <= closure[jj] && closure[jj]< fEnd){//Check for face ID's >>> faces++; >>> }//end else if >>> else if(cStart <= closure[jj] && closure[jj]< cEnd){//Check for cell ID's >>> cells++; >>> }//end else if >>> }//end "jj" loop. >>> ierr = tetra4(x,y,z,&preKEtetra4,&matC,&KE); CHKERRQ(ierr); //Generate the element stiffness matrix for this cell. >>> ierr = MatDenseGetArray(KE,&KEdata); CHKERRQ(ierr); >>> ierr = MatSetValues(K,12,EDOF,12,EDOF,KEdata,ADD_VALUES); CHKERRQ(ierr);//WARNING: HARDCODED FOR TETRAHEDRAL P1 ELEMENTS ONLY !!!!!!!!!!!!!!!!!!!!!!! >>> ierr = MatDenseRestoreArray(KE,&KEdata); CHKERRQ(ierr); >>> ierr = DMPlexRestoreTransitiveClosure(dm, ii,useCone, &size_closure, &closure); CHKERRQ(ierr); >>> }//end "ii" loop. >>> ierr = MatAssemblyBegin(K,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >>> ierr = MatAssemblyEnd(K,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >>> //ierr = MatView(K,PETSC_VIEWER_DRAW_WORLD); CHKERRQ(ierr); >>> //END GLOBAL STIFFNESS MATRIX BUILDER=========================================== >>> t2 = MPI_Wtime(); >>> PetscPrintf(PETSC_COMM_WORLD,"K build time: %10f\n",t2-t1); >>> >>> >>> >>> >>> >>> >>> >>> >>> t1 = MPI_Wtime(); >>> //BEGIN BOUNDARY CONDITION ENFORCEMENT========================================== >>> IS TrianglesIS, physicalsurfaceID;//, VerticesIS; >>> PetscInt numsurfvals, >>> //numRows, >>> dof_offset,numTri; >>> const PetscInt *surfvals, >>> //*pinZID, >>> *TriangleID; >>> PetscScalar diag =1; >>> PetscReal area,force; >>> //NOTE: Petsc can read/assign labels. Eeach label may posses multiple "values." >>> //These values act as tags within a tag. >>> //IMPORTANT NOTE: The below line needs a safety. If a mesh that does not feature >>> //face sets is imported, the code in its current state will crash!!!. This is currently >>> //hardcoded for the test mesh. >>> ierr = DMGetLabel(dm, "Face Sets", &physicalgroups); CHKERRQ(ierr);//Inspects Physical surface groups defined by gmsh (if any). >>> ierr = DMLabelGetValueIS(physicalgroups, &physicalsurfaceID); CHKERRQ(ierr);//Gets the physical surface ID's defined in gmsh (as specified in the .geo file). >>> ierr = ISGetIndices(physicalsurfaceID,&surfvals); CHKERRQ(ierr);//Get a pointer to the actual surface values. >>> ierr = DMLabelGetNumValues(physicalgroups, &numsurfvals); CHKERRQ(ierr);//Gets the number of different values that the label assigns. >>> for(ii=0;ii>> //PetscPrintf(PETSC_COMM_WORLD,"Values = %5i\n",surfvals[ii]); >>> //PROBLEM: The surface values are hardcoded in the gmsh file. We need to adopt standard "codes" >>> //that we can give to users when they make their meshes so that this code recognizes the Type >>> // of boundary conditions that are to be imposed. >>> if(surfvals[ii] == pinXcode){ >>> dof_offset = 0; >>> dirichletBC = PETSC_TRUE; >>> }//end if. >>> else if(surfvals[ii] == pinZcode){ >>> dof_offset = 2; >>> dirichletBC = PETSC_TRUE; >>> }//end else if. 
>>> else if(surfvals[ii] == forceZcode){ >>> dof_offset = 2; >>> neumannBC = PETSC_TRUE; >>> }//end else if. >>> >>> ierr = DMLabelGetStratumIS(physicalgroups, surfvals[ii], &TrianglesIS); CHKERRQ(ierr);//Get the ID's (as an IS) of the surfaces belonging to value 11. >>> //PROBLEM: DMPlexGetConeRecursiveVertices returns an array with repeated node ID's. For each repetition, the lines that enforce BC's unnecessarily re-run. >>> ierr = ISGetSize(TrianglesIS,&numTri); CHKERRQ(ierr); >>> ierr = ISGetIndices(TrianglesIS,&TriangleID); CHKERRQ(ierr);//Get a pointer to the actual surface values. >>> for(kk=0;kk>> ierr = DMPlexGetTransitiveClosure(dm, TriangleID[kk], useCone, &size_closure, &closure); CHKERRQ(ierr); >>> if(neumannBC){ >>> ierr = DMPlexComputeCellGeometryFVM(dm, TriangleID[kk], &area,PETSC_NULL,PETSC_NULL); CHKERRQ(ierr); >>> force = traction*area/3;//WARNING: The 3 here is hardcoded for a purely tetrahedral mesh only!!!!!!!!!! >>> } >>> for(jj=0;jj<(2*size_closure);jj+=2){ >>> //PetscErrorCode DMPlexComputeCellGeometryFVM(DM dm, PetscInt cell, PetscReal *vol, PetscReal centroid[], PetscReal normal[]) >>> if(vStart <= closure[jj] && closure[jj]< vEnd){//Check for vertices. >>> indexK = dof*(closure[jj] - vStart) + dof_offset; //Compute the dof ID's in the K matrix. >>> if(dirichletBC){//Boundary conditions requiring an edit of K matrix. >>> ierr = MatZeroRows(K,1,&indexK,diag,NULL,NULL); CHKERRQ(ierr); >>> }//end if. >>> else if(neumannBC){//Boundary conditions requiring an edit of RHS vector. >>> ierr = VecSetValue(F,indexK,force,ADD_VALUES); CHKERRQ(ierr); >>> }// end else if. >>> }//end if. >>> }//end "jj" loop. >>> ierr = DMPlexRestoreTransitiveClosure(dm, closure[jj],useCone, &size_closure, &closure); CHKERRQ(ierr); >>> }//end "kk" loop. >>> ierr = ISRestoreIndices(TrianglesIS,&TriangleID); CHKERRQ(ierr); >>> >>> /* >>> ierr = DMPlexGetConeRecursiveVertices(dm, TrianglesIS, &VerticesIS); CHKERRQ(ierr);//Get the ID's (as an IS) of the vertices that make up the surfaces of value 11. >>> ierr = ISGetSize(VerticesIS,&numRows); CHKERRQ(ierr);//Get number of flagged vertices (this includes repeated indices for faces that share nodes). >>> ierr = ISGetIndices(VerticesIS,&pinZID); CHKERRQ(ierr);//Get a pointer to the actual surface values. >>> if(dirichletBC){//Boundary conditions requiring an edit of K matrix. >>> for(kk=0;kk>> indexK = 3*(pinZID[kk] - vStart) + dof_offset; //Compute the dof ID's in the K matrix. (NOTE: the 3* ishardcoded for 3 degrees of freedom, tie this to a variable in the FUTURE.) >>> ierr = MatZeroRows(K,1,&indexK,diag,NULL,NULL); CHKERRQ(ierr); >>> }//end "kk" loop. >>> }//end if. >>> else if(neumannBC){//Boundary conditions requiring an edit of RHS vector. >>> for(kk=0;kk>> indexK = 3*(pinZID[kk] - vStart) + dof_offset; >>> ierr = VecSetValue(F,indexK,traction,INSERT_VALUES); CHKERRQ(ierr); >>> }//end "kk" loop. >>> }// end else if. >>> ierr = ISRestoreIndices(VerticesIS,&pinZID); CHKERRQ(ierr); >>> */ >>> dirichletBC = PETSC_FALSE; >>> neumannBC = PETSC_FALSE; >>> }//end "ii" loop. 
>>> ierr = ISRestoreIndices(physicalsurfaceID,&surfvals); CHKERRQ(ierr); >>> //ierr = ISRestoreIndices(VerticesIS,&pinZID); CHKERRQ(ierr); >>> ierr = ISDestroy(&physicalsurfaceID); CHKERRQ(ierr); >>> //ierr = ISDestroy(&VerticesIS); CHKERRQ(ierr); >>> ierr = ISDestroy(&TrianglesIS); CHKERRQ(ierr); >>> //END BOUNDARY CONDITION ENFORCEMENT============================================ >>> t2 = MPI_Wtime(); >>> PetscPrintf(PETSC_COMM_WORLD,"BC imposition time: %10f\n",t2-t1); >>> >>> /* >>> PetscInt kk = 0; >>> for(ii=vStart;ii>> kk++; >>> PetscPrintf(PETSC_COMM_WORLD,"Vertex #%4i\t x = %10.9f\ty = %10.9f\tz = %10.9f\n",ii,xyz_el[3*kk],xyz_el[3*kk+1],xyz_el[3*kk+2]); >>> }// end "ii" loop. >>> */ >>> >>> t1 = MPI_Wtime(); >>> //SOLVER======================================================================== >>> ierr = KSPCreate(PETSC_COMM_WORLD,&ksp); CHKERRQ(ierr); >>> ierr = KSPSetOperators(ksp,K,K); CHKERRQ(ierr); >>> ierr = KSPSetFromOptions(ksp); CHKERRQ(ierr); >>> ierr = KSPSolve(ksp,F,U); CHKERRQ(ierr); >>> t2 = MPI_Wtime(); >>> //ierr = KSPView(ksp,PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr); >>> //SOLVER======================================================================== >>> t2 = MPI_Wtime(); >>> PetscPrintf(PETSC_COMM_WORLD,"Solver time: %10f\n",t2-t1); >>> ierr = VecRestoreArray(XYZ,&xyz_el); CHKERRQ(ierr);//Get pointer to vector's data. >>> >>> //BEGIN MAX/MIN DISPLACEMENTS=================================================== >>> IS ISux,ISuy,ISuz; >>> Vec UX,UY,UZ; >>> PetscReal UXmax,UYmax,UZmax,UXmin,UYmin,UZmin; >>> ierr = ISCreateStride(PETSC_COMM_WORLD,nv,0,3,&ISux); CHKERRQ(ierr); >>> ierr = ISCreateStride(PETSC_COMM_WORLD,nv,1,3,&ISuy); CHKERRQ(ierr); >>> ierr = ISCreateStride(PETSC_COMM_WORLD,nv,2,3,&ISuz); CHKERRQ(ierr); >>> >>> //PetscErrorCode VecGetSubVector(Vec X,IS is,Vec *Y) >>> ierr = VecGetSubVector(U,ISux,&UX); CHKERRQ(ierr); >>> ierr = VecGetSubVector(U,ISuy,&UY); CHKERRQ(ierr); >>> ierr = VecGetSubVector(U,ISuz,&UZ); CHKERRQ(ierr); >>> >>> //PetscErrorCode VecMax(Vec x,PetscInt *p,PetscReal *val) >>> ierr = VecMax(UX,PETSC_NULL,&UXmax); CHKERRQ(ierr); >>> ierr = VecMax(UY,PETSC_NULL,&UYmax); CHKERRQ(ierr); >>> ierr = VecMax(UZ,PETSC_NULL,&UZmax); CHKERRQ(ierr); >>> >>> ierr = VecMin(UX,PETSC_NULL,&UXmin); CHKERRQ(ierr); >>> ierr = VecMin(UY,PETSC_NULL,&UYmin); CHKERRQ(ierr); >>> ierr = VecMin(UZ,PETSC_NULL,&UZmin); CHKERRQ(ierr); >>> >>> PetscPrintf(PETSC_COMM_WORLD,"%10f\t <= ux <= %10f\n",UXmin,UXmax); >>> PetscPrintf(PETSC_COMM_WORLD,"%10f\t <= uy <= %10f\n",UYmin,UYmax); >>> PetscPrintf(PETSC_COMM_WORLD,"%10f\t <= uz <= %10f\n",UZmin,UZmax); >>> >>> >>> >>> >>> //BEGIN OUTPUT SOLUTION========================================================= >>> if(saveASCII){ >>> ierr = PetscViewerASCIIOpen(PETSC_COMM_WORLD,"XYZ.txt",&XYZviewer); >>> ierr = VecView(XYZ,XYZviewer); CHKERRQ(ierr); >>> ierr = PetscViewerASCIIOpen(PETSC_COMM_WORLD,"U.txt",&XYZpUviewer); >>> ierr = VecView(U,XYZpUviewer); CHKERRQ(ierr); >>> PetscViewerDestroy(&XYZviewer); PetscViewerDestroy(&XYZpUviewer); >>> >>> }//end if. 
>>> if(saveVTK){ >>> const char *meshfile = "starting_mesh.vtk", >>> *deformedfile = "deformed_mesh.vtk"; >>> ierr = PetscViewerVTKOpen(PETSC_COMM_WORLD,meshfile,FILE_MODE_WRITE,&XYZviewer); CHKERRQ(ierr); >>> //PetscErrorCode DMSetAuxiliaryVec(DM dm, DMLabel label, PetscInt value, Vec aux) >>> DMLabel UXlabel,UYlabel, UZlabel; >>> //PetscErrorCode DMLabelCreate(MPI_Comm comm, const char name[], DMLabel *label) >>> ierr = DMLabelCreate(PETSC_COMM_WORLD, "X-Displacement", &UXlabel); CHKERRQ(ierr); >>> ierr = DMLabelCreate(PETSC_COMM_WORLD, "Y-Displacement", &UYlabel); CHKERRQ(ierr); >>> ierr = DMLabelCreate(PETSC_COMM_WORLD, "Z-Displacement", &UZlabel); CHKERRQ(ierr); >>> ierr = DMSetAuxiliaryVec(dm,UXlabel, 1, UX); CHKERRQ(ierr); >>> ierr = DMSetAuxiliaryVec(dm,UYlabel, 1, UY); CHKERRQ(ierr); >>> ierr = DMSetAuxiliaryVec(dm,UZlabel, 1, UZ); CHKERRQ(ierr); >>> //PetscErrorCode PetscViewerVTKAddField(PetscViewer viewer,PetscObject dm,PetscErrorCode (*PetscViewerVTKWriteFunction)(PetscObject,PetscViewer),PetscInt fieldnum,PetscViewerVTKFieldType fieldtype,PetscBool checkdm,PetscObject vec) >>> >>> >>> >>> //ierr = PetscViewerVTKAddField(XYZviewer, dm,PetscErrorCode (*PetscViewerVTKWriteFunction)(Vec,PetscViewer),PETSC_DEFAULT,PETSC_VTK_POINT_FIELD,PETSC_FALSE,UX); >>> ierr = PetscViewerVTKAddField(XYZviewer, (PetscObject)dm,&PetscViewerVTKWriteFunction,PETSC_DEFAULT,PETSC_VTK_POINT_FIELD,PETSC_FALSE,(PetscObject)UX); >>> >>> >>> ierr = DMPlexVTKWriteAll((PetscObject)dm, XYZviewer); CHKERRQ(ierr); >>> ierr = VecAXPY(XYZ,1,U); CHKERRQ(ierr);//Add displacement field to the mesh coordinates to deform. >>> ierr = PetscViewerVTKOpen(PETSC_COMM_WORLD,deformedfile,FILE_MODE_WRITE,&XYZpUviewer); CHKERRQ(ierr); >>> ierr = DMPlexVTKWriteAll((PetscObject)dm, XYZpUviewer); CHKERRQ(ierr);// >>> PetscViewerDestroy(&XYZviewer); PetscViewerDestroy(&XYZpUviewer); >>> >>> }//end else if. >>> else{ >>> ierr = PetscPrintf(PETSC_COMM_WORLD,"No output format specified! Files not saved.\n"); CHKERRQ(ierr); >>> }//end else. >>> >>> >>> //END OUTPUT SOLUTION=========================================================== >>> VecDestroy(&UX); ISDestroy(&ISux); >>> VecDestroy(&UY); ISDestroy(&ISuy); >>> VecDestroy(&UZ); ISDestroy(&ISuz); >>> //END MAX/MIN DISPLACEMENTS===================================================== >>> >>> //CLEANUP===================================================================== >>> DMDestroy(&dm); >>> KSPDestroy(&ksp); >>> MatDestroy(&K); MatDestroy(&KE); MatDestroy(&matC); //MatDestroy(preKEtetra4.matB); MatDestroy(preKEtetra4.matBTCB); >>> VecDestroy(&U); VecDestroy(&F); >>> >>> //DMLabelDestroy(&physicalgroups);//Destroyig the DM destroys the label. >>> //CLEANUP===================================================================== >>> //PetscErrorCode PetscMallocDump(FILE *fp) >>> //ierr = PetscMallocDump(NULL); >>> return PetscFinalize();//And the machine shall rest.... >>> }//end main. >>> >>> PetscErrorCode tetra4(PetscScalar* X,PetscScalar* Y, PetscScalar* Z,struct preKE *P, Mat* matC, Mat* KE){ >>> //INPUTS: >>> //X: Global X coordinates of the elemental nodes. >>> //Y: Global Y coordinates of the elemental nodes. >>> //Z: Global Z coordinates of the elemental nodes. >>> //J: Jacobian matrix. >>> //invJ: Inverse Jacobian matrix. 
>>> PetscErrorCode ierr; >>> //For current quadrature point, get dPsi/dXi_i Xi_i = {Xi,Eta,Zeta} >>> /* >>> P->dPsidXi[0] = +1.; P->dPsidEta[0] = 0.0; P->dPsidZeta[0] = 0.0; >>> P->dPsidXi[1] = 0.0; P->dPsidEta[1] = +1.; P->dPsidZeta[1] = 0.0; >>> P->dPsidXi[2] = 0.0; P->dPsidEta[2] = 0.0; P->dPsidZeta[2] = +1.; >>> P->dPsidXi[3] = -1.; P->dPsidEta[3] = -1.; P->dPsidZeta[3] = -1.; >>> */ >>> //Populate the Jacobian matrix. >>> P->J[0][0] = X[0] - X[3]; >>> P->J[0][1] = Y[0] - Y[3]; >>> P->J[0][2] = Z[0] - Z[3]; >>> P->J[1][0] = X[1] - X[3]; >>> P->J[1][1] = Y[1] - Y[3]; >>> P->J[1][2] = Z[1] - Z[3]; >>> P->J[2][0] = X[2] - X[3]; >>> P->J[2][1] = Y[2] - Y[3]; >>> P->J[2][2] = Z[2] - Z[3]; >>> >>> //Determinant of the 3x3 Jacobian. (Expansion along 1st row). >>> P->minor00 = P->J[1][1]*P->J[2][2] - P->J[2][1]*P->J[1][2];//Reuse when finding InvJ. >>> P->minor01 = P->J[1][0]*P->J[2][2] - P->J[2][0]*P->J[1][2];//Reuse when finding InvJ. >>> P->minor02 = P->J[1][0]*P->J[2][1] - P->J[2][0]*P->J[1][1];//Reuse when finding InvJ. >>> P->detJ = P->J[0][0]*P->minor00 - P->J[0][1]*P->minor01 + P->J[0][2]*P->minor02; >>> //Inverse of the 3x3 Jacobian >>> P->invJ[0][0] = +P->minor00/P->detJ;//Reuse precomputed minor. >>> P->invJ[0][1] = -(P->J[0][1]*P->J[2][2] - P->J[0][2]*P->J[2][1])/P->detJ; >>> P->invJ[0][2] = +(P->J[0][1]*P->J[1][2] - P->J[1][1]*P->J[0][2])/P->detJ; >>> P->invJ[1][0] = -P->minor01/P->detJ;//Reuse precomputed minor. >>> P->invJ[1][1] = +(P->J[0][0]*P->J[2][2] - P->J[0][2]*P->J[2][0])/P->detJ; >>> P->invJ[1][2] = -(P->J[0][0]*P->J[1][2] - P->J[1][0]*P->J[0][2])/P->detJ; >>> P->invJ[2][0] = +P->minor02/P->detJ;//Reuse precomputed minor. >>> P->invJ[2][1] = -(P->J[0][0]*P->J[2][1] - P->J[0][1]*P->J[2][0])/P->detJ; >>> P->invJ[2][2] = +(P->J[0][0]*P->J[1][1] - P->J[0][1]*P->J[1][0])/P->detJ; >>> >>> //*****************STRAIN MATRIX (B)************************************** >>> for(P->m=0;P->mN;P->m++){//Scan all shape functions. >>> >>> P->x_in = 0 + P->m*3;//Every 3rd column starting at 0 >>> P->y_in = P->x_in +1;//Every 3rd column starting at 1 >>> P->z_in = P->y_in +1;//Every 3rd column starting at 2 >>> >>> P->dX[0] = P->invJ[0][0]*P->dPsidXi[P->m] + P->invJ[0][1]*P->dPsidEta[P->m] + P->invJ[0][2]*P->dPsidZeta[P->m]; >>> P->dY[0] = P->invJ[1][0]*P->dPsidXi[P->m] + P->invJ[1][1]*P->dPsidEta[P->m] + P->invJ[1][2]*P->dPsidZeta[P->m]; >>> P->dZ[0] = P->invJ[2][0]*P->dPsidXi[P->m] + P->invJ[2][1]*P->dPsidEta[P->m] + P->invJ[2][2]*P->dPsidZeta[P->m]; >>> >>> P->dX[1] = P->dZ[0]; P->dX[2] = P->dY[0]; >>> P->dY[1] = P->dZ[0]; P->dY[2] = P->dX[0]; >>> P->dZ[1] = P->dX[0]; P->dZ[2] = P->dY[0]; >>> >>> ierr = MatSetValues(P->matB,3,P->x_insert,1,&(P->x_in),P->dX,INSERT_VALUES); CHKERRQ(ierr); >>> ierr = MatSetValues(P->matB,3,P->y_insert,1,&(P->y_in),P->dY,INSERT_VALUES); CHKERRQ(ierr); >>> ierr = MatSetValues(P->matB,3,P->z_insert,1,&(P->z_in),P->dZ,INSERT_VALUES); CHKERRQ(ierr); >>> >>> }//end "m" loop. >>> ierr = MatAssemblyBegin(P->matB,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >>> ierr = MatAssemblyEnd(P->matB,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >>> //*****************STRAIN MATRIX (B)************************************** >>> >>> //Compute the matrix product B^t*C*B, scale it by the quadrature weights and add to KE. 
>>> P->weight = -P->detJ/6; >>> >>> ierr = MatZeroEntries(*KE); CHKERRQ(ierr); >>> ierr = MatPtAP(*matC,P->matB,MAT_INITIAL_MATRIX,PETSC_DEFAULT,&(P->matBTCB));CHKERRQ(ierr); >>> ierr = MatScale(P->matBTCB,P->weight); CHKERRQ(ierr); >>> ierr = MatAssemblyBegin(P->matBTCB,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >>> ierr = MatAssemblyEnd(P->matBTCB,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >>> ierr = MatAXPY(*KE,1,P->matBTCB,DIFFERENT_NONZERO_PATTERN); CHKERRQ(ierr);//Add contribution of current quadrature point to KE. >>> >>> //ierr = MatPtAP(*matC,P->matB,MAT_INITIAL_MATRIX,PETSC_DEFAULT,KE);CHKERRQ(ierr); >>> //ierr = MatScale(*KE,P->weight); CHKERRQ(ierr); >>> >>> ierr = MatAssemblyBegin(*KE,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >>> ierr = MatAssemblyEnd(*KE,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >>> >>> //Cleanup >>> return ierr; >>> }//end tetra4. >>> >>> PetscErrorCode ConstitutiveMatrix(Mat *matC,const char* type,PetscInt materialID){ >>> PetscErrorCode ierr; >>> PetscBool isotropic = PETSC_FALSE, >>> orthotropic = PETSC_FALSE; >>> //PetscErrorCode PetscStrcmp(const char a[],const char b[],PetscBool *flg) >>> ierr = PetscStrcmp(type,"isotropic",&isotropic); >>> ierr = PetscStrcmp(type,"orthotropic",&orthotropic); >>> ierr = MatCreate(PETSC_COMM_WORLD,matC); CHKERRQ(ierr); >>> ierr = MatSetSizes(*matC,PETSC_DECIDE,PETSC_DECIDE,6,6); CHKERRQ(ierr); >>> ierr = MatSetType(*matC,MATAIJ); CHKERRQ(ierr); >>> ierr = MatSetUp(*matC); CHKERRQ(ierr); >>> >>> if(isotropic){ >>> PetscReal E,nu, M,L,vals[3]; >>> switch(materialID){ >>> case 0://Hardcoded properties for isotropic material #0 >>> E = 200; >>> nu = 1./3; >>> break; >>> case 1://Hardcoded properties for isotropic material #1 >>> E = 96; >>> nu = 1./3; >>> break; >>> }//end switch. >>> M = E/(2*(1+nu)),//Lame's constant 1 ("mu"). >>> L = E*nu/((1+nu)*(1-2*nu));//Lame's constant 2 ("lambda"). >>> //PetscErrorCode MatSetValues(Mat mat,PetscInt m,const PetscInt idxm[],PetscInt n,const PetscInt idxn[],const PetscScalar v[],InsertMode addv) >>> PetscInt idxn[3] = {0,1,2}; >>> vals[0] = L+2*M; vals[1] = L; vals[2] = vals[1]; >>> ierr = MatSetValues(*matC,1,&idxn[0],3,idxn,vals,INSERT_VALUES); CHKERRQ(ierr); >>> vals[1] = vals[0]; vals[0] = vals[2]; >>> ierr = MatSetValues(*matC,1,&idxn[1],3,idxn,vals,INSERT_VALUES); CHKERRQ(ierr); >>> vals[2] = vals[1]; vals[1] = vals[0]; >>> ierr = MatSetValues(*matC,1,&idxn[2],3,idxn,vals,INSERT_VALUES); CHKERRQ(ierr); >>> ierr = MatSetValue(*matC,3,3,M,INSERT_VALUES); CHKERRQ(ierr); >>> ierr = MatSetValue(*matC,4,4,M,INSERT_VALUES); CHKERRQ(ierr); >>> ierr = MatSetValue(*matC,5,5,M,INSERT_VALUES); CHKERRQ(ierr); >>> }//end if. >>> /* >>> else if(orthotropic){ >>> switch(materialID){ >>> case 0: >>> break; >>> case 1: >>> break; >>> }//end switch. >>> }//end else if. >>> */ >>> ierr = MatAssemblyBegin(*matC,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >>> ierr = MatAssemblyEnd(*matC,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >>> //MatView(*matC,0); >>> return ierr; >>> }//End ConstitutiveMatrix >>> >>> PetscErrorCode InitializeKEpreallocation(struct preKE *P,const char* type){ >>> PetscErrorCode ierr; >>> PetscBool istetra4 = PETSC_FALSE, >>> ishex8 = PETSC_FALSE; >>> ierr = PetscStrcmp(type,"tetra4",&istetra4); CHKERRQ(ierr); >>> ierr = PetscStrcmp(type,"hex8",&ishex8); CHKERRQ(ierr); >>> if(istetra4){ >>> P->sizeKE = 12; >>> P->N = 4; >>> }//end if. >>> else if(ishex8){ >>> P->sizeKE = 24; >>> P->N = 8; >>> }//end else if. 
>>> >>> >>> P->x_insert[0] = 0; P->x_insert[1] = 3; P->x_insert[2] = 5; >>> P->y_insert[0] = 1; P->y_insert[1] = 4; P->y_insert[2] = 5; >>> P->z_insert[0] = 2; P->z_insert[1] = 3; P->z_insert[2] = 4; >>> //Allocate memory for the differentiated shape function vectors. >>> ierr = PetscMalloc1(P->N,&(P->dPsidXi)); CHKERRQ(ierr); >>> ierr = PetscMalloc1(P->N,&(P->dPsidEta)); CHKERRQ(ierr); >>> ierr = PetscMalloc1(P->N,&(P->dPsidZeta)); CHKERRQ(ierr); >>> >>> P->dPsidXi[0] = +1.; P->dPsidEta[0] = 0.0; P->dPsidZeta[0] = 0.0; >>> P->dPsidXi[1] = 0.0; P->dPsidEta[1] = +1.; P->dPsidZeta[1] = 0.0; >>> P->dPsidXi[2] = 0.0; P->dPsidEta[2] = 0.0; P->dPsidZeta[2] = +1.; >>> P->dPsidXi[3] = -1.; P->dPsidEta[3] = -1.; P->dPsidZeta[3] = -1.; >>> >>> >>> //Strain matrix. >>> ierr = MatCreate(PETSC_COMM_WORLD,&(P->matB)); CHKERRQ(ierr); >>> ierr = MatSetSizes(P->matB,PETSC_DECIDE,PETSC_DECIDE,6,P->sizeKE); CHKERRQ(ierr);//Hardcoded >>> ierr = MatSetType(P->matB,MATAIJ); CHKERRQ(ierr); >>> ierr = MatSetUp(P->matB); CHKERRQ(ierr); >>> >>> //Contribution matrix. >>> ierr = MatCreate(PETSC_COMM_WORLD,&(P->matBTCB)); CHKERRQ(ierr); >>> ierr = MatSetSizes(P->matBTCB,PETSC_DECIDE,PETSC_DECIDE,P->sizeKE,P->sizeKE); CHKERRQ(ierr); >>> ierr = MatSetType(P->matBTCB,MATAIJ); CHKERRQ(ierr); >>> ierr = MatSetUp(P->matBTCB); CHKERRQ(ierr); >>> >>> //Element stiffness matrix. >>> //ierr = MatCreateSeqDense(PETSC_COMM_SELF,12,12,NULL,&KE); CHKERRQ(ierr); //PARALLEL >>> >>> return ierr; >>> } -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Jan 6 16:20:01 2022 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 6 Jan 2022 17:20:01 -0500 Subject: [petsc-users] [EXTERNAL] Re: DM misuse causes massive memory leak? In-Reply-To: References: <87tueraunm.fsf@jedbrown.org> <87sfu2921h.fsf@jedbrown.org> <87a6g99507.fsf@jedbrown.org> Message-ID: On Thu, Jan 6, 2022 at 5:15 PM Ferrand, Jesus A. wrote: > Jed: > > DMPlexLabelComplete() has allowed me to speed up my code significantly > (Many thanks!). > > I did not use DMAddBoundary() though. > I figured I could obtain Index Sets (IS's) from the DAG for different > depths and then IS's for the points that were flagged in Gmsh (after > calling DMPlexLabelComplete()). > I then intersected both IS's using ISIntersect() to get the DAG points > corresponding to just vertices (and flagged by Gmsh) for Dirichlet BC's, > and DAG points that are Faces and flagged by Gmsh for Neumann BC's. I then > use the intersected IS to edit a Mat and a RHS Vec manually. I did further > profiling and have found the PetsSections are now the next biggest > overhead. > > For Dirichlet BC's I make an array of vertex ID's and call > MatSetZeroRows() to impose BC's on them through my K matrix. And yes, I'm > solving the elasticity PDE. For Neumann BC's I use > DMPlexGetRecursiveVertices() to edit my RHS vector. > I cannot find a function named DMPlexGetRecursiveVertices(). > I want to keep the PetscSections since they preallocate my matrix rather > well (the one from DMCreateMatrix()) but at the same time would like to > remove them since they add overhead. Do you think DMAddboundary() with the > function call will be faster than my single calls to MatSetZeroRows() and > DMPlexGetRecursiveVertices() ? > PetscSection is really simple. Are you sure you are measuring long times there? What are you using it to do? Thanks, Matt > ------------------------------ > *From:* Jed Brown > *Sent:* Wednesday, January 5, 2022 5:44 PM > *To:* Ferrand, Jesus A. 
> *Cc:* petsc-users > *Subject:* Re: [EXTERNAL] Re: [petsc-users] DM misuse causes massive > memory leak? > > For something like displacement (and this sounds like elasticity), I would > recommend using one field with three components. You can constrain a subset > of the components to implement slip conditions. > > You can use DMPlexLabelComplete(dm, label) to propagate those face labels > to vertices. > > "Ferrand, Jesus A." writes: > > > Thanks for the reply (I hit reply all this time). > > > > So, I set 3 fields: > > /* > > ierr = PetscSectionSetNumFields(s,dof); CHKERRQ(ierr); > > ierr = PetscSectionSetFieldName(s,0, "X-Displacement"); CHKERRQ(ierr); > //Field ID is 0 > > ierr = PetscSectionSetFieldName(s,1, "Y-Displacement"); CHKERRQ(ierr); > //Field ID is 1 > > ierr = PetscSectionSetFieldName(s,2, "Z-Displacement"); CHKERRQ(ierr); > //Field ID is 2 > > */ > > > > I then loop through the vertices of my DMPlex > > > > /* > > for(ii = vStart; ii < vEnd; ii++){//Vertex loop. > > ierr = PetscSectionSetDof(s, ii, dof); CHKERRQ(ierr); > > ierr = PetscSectionSetFieldDof(s,ii,0,1); CHKERRQ(ierr);//One > X-displacement per vertex (1 dof) > > ierr = PetscSectionSetFieldDof(s,ii,1,1); CHKERRQ(ierr);//One > Y-displacement per vertex (1 dof) > > ierr = PetscSectionSetFieldDof(s,ii,2,1); CHKERRQ(ierr);//One > Z-displacement per vertex (1 dof) > > }//Sets x, y, and z displacements as dofs. > > */ > > > > I only associated fields with vertices, not with any other points in the > DAG. Regarding the use of DMAddBoundary(), I mostly copied the usage shown > in SNES example 77. I modified the function definition to simply set the > dof to 0.0 as opposed to the coordinates. Below "physicalgroups" is the > DMLabel that I got from gmsh, this flags Face points, not vertices. That is > why I think the error log suggests that fields were never set. > > > > ierr = DMAddBoundary(dm, DM_BC_ESSENTIAL, "fixed", physicalgroups, 1, > &surfvals[ii], fieldID, 0, NULL, (void (*)(void)) coordinates, NULL, NULL, > NULL); CHKERRQ(ierr); > > PetscErrorCode coordinates(PetscInt dim, PetscReal time, const PetscReal > x[], PetscInt Nf, PetscScalar *u, void *ctx){ > > const PetscInt Ncomp = dim; > > PetscInt comp; > > for (comp = 0; comp < Ncomp; ++comp) u[comp] = 0.0; > > return 0; > > } > > > > > > ________________________________ > > From: Jed Brown > > Sent: Wednesday, January 5, 2022 12:36 AM > > To: Ferrand, Jesus A. > > Cc: petsc-users > > Subject: Re: [EXTERNAL] Re: [petsc-users] DM misuse causes massive > memory leak? > > > > Please "reply all" include the list in the future. > > > > "Ferrand, Jesus A." writes: > > > >> Forgot to say thanks for the reply (my bad). > >> Yes, I was indeed forgetting to pre-allocate the sparse matrices when > doing them by hand (complacency seeded by MATLAB's zeros()). Thank you, > Jed, and Jeremy, for the hints! > >> > >> I have more questions, these ones about boundary conditions (I think > these are for Matt). > >> In my current code I set Dirichlet conditions directly on a Mat by > calling MatSetZeroRows(). I profiled my code and found the part that > applies them to be unnacceptably slow. In response, I've been trying to > better pre-allocate Mats using PetscSections. I have found documentation > for PetscSectionSetDof(), PetscSectionSetNumFields(), > PetscSectionSetFieldName(), and PetscSectionSetFieldDof(), to set the size > of my Mats and Vecs by calling DMSetLocalSection() followed by > DMCreateMatrix() and DMCreateLocalVector() to get a RHS vector. This seems > faster. 
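For reference, a minimal sketch of the one-field, three-component layout suggested above, assuming an interpolated DMPlex dm with unknowns only on the vertices, as in the attached code (dm and ierr are taken from the surrounding program; the field name is illustrative):

    PetscSection s;
    PetscInt     pStart, pEnd, vStart, vEnd, p;
    Mat          K;

    ierr = DMPlexGetChart(dm, &pStart, &pEnd); CHKERRQ(ierr);
    ierr = DMPlexGetDepthStratum(dm, 0, &vStart, &vEnd); CHKERRQ(ierr);//Vertices.
    ierr = PetscSectionCreate(PetscObjectComm((PetscObject)dm), &s); CHKERRQ(ierr);
    ierr = PetscSectionSetNumFields(s, 1); CHKERRQ(ierr);//One field...
    ierr = PetscSectionSetFieldName(s, 0, "Displacement"); CHKERRQ(ierr);
    ierr = PetscSectionSetFieldComponents(s, 0, 3); CHKERRQ(ierr);//...with ux, uy, uz as its components.
    ierr = PetscSectionSetChart(s, pStart, pEnd); CHKERRQ(ierr);
    for (p = vStart; p < vEnd; ++p) {
      ierr = PetscSectionSetDof(s, p, 3); CHKERRQ(ierr);//Three dofs per vertex...
      ierr = PetscSectionSetFieldDof(s, p, 0, 3); CHKERRQ(ierr);//...all of them in field 0.
    }
    ierr = PetscSectionSetUp(s); CHKERRQ(ierr);
    ierr = DMSetLocalSection(dm, s); CHKERRQ(ierr);
    ierr = PetscSectionDestroy(&s); CHKERRQ(ierr);
    ierr = DMCreateMatrix(dm, &K); CHKERRQ(ierr);//Preallocated from the section.

Note that DMAddBoundary() consults the fields registered on the DM itself (DMAddField()/DMSetField() before DMCreateDS()), not only the local PetscSection, which appears to be why the "Field number 2 must be in [0, 0)" message quoted further below reports an empty field range.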
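And a minimal sketch of the IS-intersection route for the Dirichlet rows described earlier in this message, assuming the serial, vertex-major row numbering used in the attached code (row = 3*(vertex - vStart) + component) and reusing dm, K, ierr, and the pinZcode label value from that listing; collecting the rows and zeroing them in a single MatZeroRows() call avoids the row-by-row calls that were reported as slow:

    DMLabel         label;
    IS              flaggedIS, vertexIS, pinnedIS;
    const PetscInt *verts;
    PetscInt       *rows, nPinned, vStart, vEnd, comp = 2, i;//comp = 2: pin the z-displacement.
    PetscScalar     diag = 1.0;

    ierr = DMGetLabel(dm, "Face Sets", &label); CHKERRQ(ierr);
    ierr = DMPlexLabelComplete(dm, label); CHKERRQ(ierr);//Propagate face values to their closures (edges, vertices).
    ierr = DMLabelGetStratumIS(label, pinZcode, &flaggedIS); CHKERRQ(ierr);//All points carrying this value.
    ierr = DMPlexGetDepthStratum(dm, 0, &vStart, &vEnd); CHKERRQ(ierr);
    ierr = ISCreateStride(PETSC_COMM_WORLD, vEnd - vStart, vStart, 1, &vertexIS); CHKERRQ(ierr);//All vertices.
    ierr = ISIntersect(flaggedIS, vertexIS, &pinnedIS); CHKERRQ(ierr);//Keep only the flagged vertices.
    ierr = ISGetLocalSize(pinnedIS, &nPinned); CHKERRQ(ierr);
    ierr = ISGetIndices(pinnedIS, &verts); CHKERRQ(ierr);
    ierr = PetscMalloc1(nPinned, &rows); CHKERRQ(ierr);
    for (i = 0; i < nPinned; ++i) rows[i] = 3*(verts[i] - vStart) + comp;//Row of the constrained dof.
    ierr = ISRestoreIndices(pinnedIS, &verts); CHKERRQ(ierr);
    ierr = MatZeroRows(K, nPinned, rows, diag, NULL, NULL); CHKERRQ(ierr);//One call for the whole set of pinned rows.
    ierr = PetscFree(rows); CHKERRQ(ierr);
    ierr = ISDestroy(&pinnedIS); CHKERRQ(ierr);
    ierr = ISDestroy(&vertexIS); CHKERRQ(ierr);
    ierr = ISDestroy(&flaggedIS); CHKERRQ(ierr);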
> >> > >> In PetscSection, what is the difference between a "field" and a > "component"? For example, I could have one field "Velocity" with three > components ux, uy, and uz or perhaps three fields ux, uy, and uz each with > a default component? > > > > It's just how you name them and how they appear in output. Usually > "velocity" is better as a field with three components, but fields with > other meaning (and perhaps different finite element spaces), such as > pressure, would be different fields. Different components are always in the > same FE space. > > > >> I am struggling now to impose boundary conditions after constraining > dofs using PetscSection. My understanding is that constraining dof's > reduces the size of the DM's matrix but it does not give the DM knowledge > of what values the constrained dofs should have, right? > >> > >> I know that there is DMAddBoundary(), but I am unsure of how to use it. > From Gmsh I have a mesh with surface boundaries flagged. I'm not sure > whether DMAddBoundary()will constrain the face, edge, or vertex points when > I give it the DMLabel from Gmsh. (I need specific dof on the vertices to be > constrained). I did some testing and I think DMAddBoundary() attempts to > constrain the Face points (see error log below). I only associated fields > with the vertices but not the Faces. I can extract the vertex points from > the face label using DMPlexGetConeRecursiveVertices() but the output IS has > repeated entries for the vertex points (many faces share the same vertex). > Is there an easier way to get the vertex points from a gmsh surface tag? > > > > How did you call DMAddBoundary()? Are you using DM_BC_ESSENTIAL and a > callback function that provides the inhomogeneous boundary condition? > > > >> I'm sorry this is a mouthful. > >> > >> [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > >> [0]PETSC ERROR: Argument out of range > >> [0]PETSC ERROR: Field number 2 must be in [0, 0) > > > > It looks like you haven't added these fields yet. > > > >> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble > shooting. > >> [0]PETSC ERROR: Petsc Release Version 3.16.0, unknown > >> [0]PETSC ERROR: ./gmsh4 on a linux-c-dbg named F86 by jesus Tue Jan 4 > 15:19:57 2022 > >> [0]PETSC ERROR: Configure options --with-32bits-pci-domain=1 > --with-debugging =1 > >> [0]PETSC ERROR: #1 DMGetField() at > /home/jesus/SAND/PETSc_install/petsc/src/dm/interface/dm.c:4803 > >> [0]PETSC ERROR: #2 DMCompleteBoundaryLabel_Internal() at > /home/jesus/SAND/PETSc_install/petsc/src/dm/interface/dm.c:5140 > >> [0]PETSC ERROR: #3 DMAddBoundary() at > /home/jesus/SAND/PETSc_install/petsc/src/dm/interface/dm.c:8561 > >> [0]PETSC ERROR: #4 main() at /home/jesus/SAND/FEA/3D/gmshBACKUP4.c:215 > >> [0]PETSC ERROR: No PETSc Option Table entries > >> [0]PETSC ERROR: ----------------End of Error Message -------send entire > error message to petsc-maint at mcs.anl.gov---------- > >> > >> > >> > >> > >> > >> > >> ________________________________ > >> From: Jed Brown > >> Sent: Wednesday, December 29, 2021 5:55 PM > >> To: Ferrand, Jesus A. ; petsc-users at mcs.anl.gov < > petsc-users at mcs.anl.gov> > >> Subject: [EXTERNAL] Re: [petsc-users] DM misuse causes massive memory > leak? > >> > >> CAUTION: This email originated outside of Embry-Riddle Aeronautical > University. Do not click links or open attachments unless you recognize the > sender and know the content is safe. > >> > >> > >> "Ferrand, Jesus A." 
writes: > >> > >>> Dear PETSc Team: > >>> > >>> I have a question about DM and PetscSection. Say I import a mesh (for > FEM purposes) and create a DMPlex for it. I then use PetscSections to set > degrees of freedom per "point" (by point I mean vertices, lines, faces, and > cells). I then use PetscSectionGetStorageSize() to get the size of the > global stiffness matrix (K) needed for my FEM problem. One last detail, > this K I populate inside a rather large loop using an element stiffness > matrix function of my own. Instead of using DMCreateMatrix(), I manually > created a Mat using MatCreate(), MatSetType(), MatSetSizes(), and > MatSetUp(). I come to find that said loop is painfully slow when I use the > manually created matrix, but 20x faster when I use the Mat coming out of > DMCreateMatrix(). > >> > >> The sparse matrix hasn't been preallocated, which forces the data > structure to do a lot of copies (as bad as O(n^2) complexity). > DMCreateMatrix() preallocates for you. > >> > >> > https://petsc.org/release/docs/manual/performance/#memory-allocation-for-sparse-matrix-assembly > >> https://petsc.org/release/docs/manual/mat/#sec-matsparse > >> > >>> My question is then: Is the manual Mat a noob mistake and is it > somehow creating a memory leak with K? Just in case it's something else I'm > attaching the code. The loop that populates K is between lines 221 and 278. > Anything related to DM, DMPlex, and PetscSection is between lines 117 and > 180. > >>> > >>> Machine Type: HP Laptop > >>> C-compiler: Gnu C > >>> OS: Ubuntu 20.04 > >>> PETSc version: 3.16.0 > >>> MPI Implementation: MPICH > >>> > >>> Hope you all had a Merry Christmas and that you will have a happy and > productive New Year. :D > >>> > >>> > >>> Sincerely: > >>> > >>> J.A. Ferrand > >>> > >>> Embry-Riddle Aeronautical University - Daytona Beach FL > >>> > >>> M.Sc. Aerospace Engineering | May 2022 > >>> > >>> B.Sc. Aerospace Engineering > >>> > >>> B.Sc. Computational Mathematics > >>> > >>> > >>> > >>> Sigma Gamma Tau > >>> > >>> Tau Beta Pi > >>> > >>> Honors Program > >>> > >>> > >>> > >>> Phone: (386)-843-1829 > >>> > >>> Email(s): ferranj2 at my.erau.edu > >>> > >>> jesus.ferrand at gmail.com > >>> //REFERENCE: > https://github.com/FreeFem/FreeFem-sources/blob/master/plugin/mpi/PETSc-code.hpp > >>> #include > >>> static char help[] = "Imports a Gmsh mesh with boundary conditions and > solves the elasticity equation.\n" > >>> "Option prefix = opt_.\n"; > >>> > >>> struct preKE{//Preallocation before computing KE > >>> Mat matB, > >>> matBTCB; > >>> //matKE; > >>> PetscInt x_insert[3], > >>> y_insert[3], > >>> z_insert[3], > >>> m,//Looping variables. > >>> sizeKE,//size of the element stiffness matrix. > >>> N,//Number of nodes in element. > >>> x_in,y_in,z_in; //LI to index B matrix. > >>> PetscReal J[3][3],//Jacobian matrix. > >>> invJ[3][3],//Inverse of the Jacobian matrix. > >>> detJ,//Determinant of the Jacobian. > >>> dX[3], > >>> dY[3], > >>> dZ[3], > >>> minor00, > >>> minor01, > >>> minor02,//Determinants of minors in a 3x3 matrix. > >>> dPsidX, dPsidY, dPsidZ,//Shape function derivatives w.r.t > global coordinates. > >>> weight,//Multiplier of quadrature weights. > >>> *dPsidXi,//Derivatives of shape functions w.r.t. Xi. > >>> *dPsidEta,//Derivatives of shape functions w.r.t. Eta. > >>> *dPsidZeta;//Derivatives of shape functions w.r.t Zeta. > >>> PetscErrorCode ierr; > >>> };//end struct. > >>> > >>> //Function declarations. 
> >>> extern PetscErrorCode tetra4(PetscScalar*, PetscScalar*, > PetscScalar*,struct preKE*, Mat*, Mat*); > >>> extern PetscErrorCode ConstitutiveMatrix(Mat*,const char*,PetscInt); > >>> extern PetscErrorCode InitializeKEpreallocation(struct preKE*,const > char*); > >>> > >>> PetscErrorCode PetscViewerVTKWriteFunction(PetscObject vec,PetscViewer > viewer){ > >>> PetscErrorCode ierr; > >>> ierr = VecView((Vec)vec,viewer); CHKERRQ(ierr); > >>> return ierr; > >>> } > >>> > >>> > >>> > >>> > >>> int main(int argc, char **args){ > >>> //DEFINITIONS OF PETSC's DMPLEX LINGO: > >>> //POINT: A topology element (cell, face, edge, or vertex). > >>> //CHART: It an interval from 0 to the number of "points." (the range > of admissible linear indices) > >>> //STRATUM: A subset of the "chart" which corresponds to all "points" > at a given "level." > >>> //LEVEL: This is either a "depth" or a "height". > >>> //HEIGHT: Dimensionality of an element measured from 0D to 3D. > Heights: cell = 0, face = 1, edge = 2, vertex = 3. > >>> //DEPTH: Dimensionality of an element measured from 3D to 0D. > Depths: cell = 3, face = 2, edge = 1, vertex = 0; > >>> //CLOSURE: *of an element is the collection of all other elements > that define it.I.e., the closure of a surface is the collection of vertices > and edges that make it up. > >>> //STAR: > >>> //STANDARD LABELS: These are default tags that DMPlex has for its > topology. ("depth") > >>> PetscErrorCode ierr;//Error tracking variable. > >>> DM dm;//Distributed memory object (useful for managing grids.) > >>> DMLabel physicalgroups;//Identifies user-specified tags in gmsh (to > impose BC's). > >>> DMPolytopeType celltype;//When looping through cells, determines its > type (tetrahedron, pyramid, hexahedron, etc.) > >>> PetscSection s; > >>> KSP ksp;//Krylov Sub-Space (linear solver object) > >>> Mat K,//Global stiffness matrix (Square, assume unsymmetric). > >>> KE,//Element stiffness matrix (Square, assume unsymmetric). > >>> matC;//Constitutive matrix. > >>> Vec XYZ,//Coordinate vector, contains spatial locations of mesh's > vertices (NOTE: This vector self-destroys!). > >>> U,//Displacement vector. > >>> F;//Load Vector. > >>> PetscViewer XYZviewer,//Viewer object to output mesh to ASCII format. > >>> XYZpUviewer; //Viewer object to output displacements to > ASCII format. > >>> PetscBool interpolate = PETSC_TRUE,//Instructs Gmsh importer whether > to generate faces and edges (Needed when using P2 or higher elements). > >>> useCone = PETSC_TRUE,//Instructs > "DMPlexGetTransitiveClosure()" whether to extract the closure or the star. > >>> dirichletBC = PETSC_FALSE,//For use when assembling the K > matrix. > >>> neumannBC = PETSC_FALSE,//For use when assembling the F > vector. > >>> saveASCII = PETSC_FALSE,//Whether to save results in ASCII > format. > >>> saveVTK = PETSC_FALSE;//Whether to save results as VTK > format. > >>> PetscInt nc,//number of cells. (PETSc lingo for "elements") > >>> nv,//number of vertices. (PETSc lingo for "nodes") > >>> nf,//number of faces. (PETSc lingo for "surfaces") > >>> ne,//number of edges. (PETSc lingo for "lines") > >>> pStart,//starting LI of global elements. > >>> pEnd,//ending LI of all elements. > >>> cStart,//starting LI for cells global arrangement. > >>> cEnd,//ending LI for cells in global arrangement. > >>> vStart,//starting LI for vertices in global arrangement. > >>> vEnd,//ending LI for vertices in global arrangement. > >>> fStart,//starting LI for faces in global arrangement. 
> >>> fEnd,//ending LI for faces in global arrangement. > >>> eStart,//starting LI for edges in global arrangement. > >>> eEnd,//ending LI for edges in global arrangement. > >>> sizeK,//Size of the element stiffness matrix. > >>> ii,jj,kk,//Dedicated looping variables. > >>> indexXYZ,//Variable to access the elements of XYZ vector. > >>> indexK,//Variable to access the elements of the U and F > vectors (can reference rows and colums of K matrix.) > >>> *closure = PETSC_NULL,//Pointer to the closure elements of > a cell. > >>> size_closure,//Size of the closure of a cell. > >>> dim,//Dimension of the mesh. > >>> //*edof,//Linear indices of dof's inside the K matrix. > >>> dof = 3,//Degrees of freedom per node. > >>> cells=0, edges=0, vertices=0, faces=0,//Topology counters > when looping through cells. > >>> pinXcode=10, pinZcode=11,forceZcode=12;//Gmsh codes to > extract relevant "Face Sets." > >>> PetscReal //*x_el,//Pointer to a vector that will store the > x-coordinates of an element's vertices. > >>> //*y_el,//Pointer to a vector that will store the > y-coordinates of an element's vertices. > >>> //*z_el,//Pointer to a vector that will store the > z-coordinates of an element's vertices. > >>> *xyz_el,//Pointer to xyz array in the XYZ vector. > >>> traction = -10, > >>> *KEdata, > >>> t1,t2; //time keepers. > >>> const char *gmshfile = "TopOptmeshfine2.msh";//Name of gmsh file to > import. > >>> > >>> ierr = PetscInitialize(&argc,&args,NULL,help); if(ierr) return ierr; > //And the machine shall work.... > >>> > >>> //MESH > IMPORT================================================================= > >>> //IMPORTANT NOTE: Gmsh only creates "cells" and "vertices," it does > not create the "faces" or "edges." > >>> //Gmsh probably can generate them, must figure out how to. > >>> t1 = MPI_Wtime(); > >>> ierr = > DMPlexCreateGmshFromFile(PETSC_COMM_WORLD,gmshfile,interpolate,&dm); > CHKERRQ(ierr);//Read Gmsh file and generate the DMPlex. > >>> ierr = DMGetDimension(dm, &dim); CHKERRQ(ierr);//1-D, 2-D, or 3-D > >>> ierr = DMPlexGetChart(dm, &pStart, &pEnd); CHKERRQ(ierr);//Extracts > linear indices of cells, vertices, faces, and edges. > >>> ierr = DMGetCoordinatesLocal(dm,&XYZ); CHKERRQ(ierr);//Extracts > coordinates from mesh.(Contiguous storage: [x0,y0,z0,x1,y1,z1,...]) > >>> ierr = VecGetArray(XYZ,&xyz_el); CHKERRQ(ierr);//Get pointer to > vector's data. > >>> t2 = MPI_Wtime(); > >>> PetscPrintf(PETSC_COMM_WORLD,"Mesh Import time: %10f\n",t2-t1); > >>> DMView(dm,PETSC_VIEWER_STDOUT_WORLD); > >>> > >>> //IMPORTANT NOTE: PETSc assumes that vertex-cell meshes are 2D even > if they were 3D, so its ordering changes. > >>> //Cells remain at height 0, but vertices move to height 1 from > height 3. To prevent this from becoming an issue > >>> //the "interpolate" variable is set to PETSC_TRUE to tell the mesh > importer to generate faces and edges. > >>> //PETSc, therefore, technically does additional meshing. Gotta > figure out how to get this from Gmsh directly. > >>> ierr = DMPlexGetDepthStratum(dm,3, &cStart, &cEnd);//Get LI of cells. > >>> ierr = DMPlexGetDepthStratum(dm,2, &fStart, &fEnd);//Get LI of faces > >>> ierr = DMPlexGetDepthStratum(dm,1, &eStart, &eEnd);//Get LI of edges. > >>> ierr = DMPlexGetDepthStratum(dm,0, &vStart, &vEnd);//Get LI of > vertices. > >>> ierr = DMGetStratumSize(dm,"depth", 3, &nc);//Get number of cells. > >>> ierr = DMGetStratumSize(dm,"depth", 2, &nf);//Get number of faces. > >>> ierr = DMGetStratumSize(dm,"depth", 1, &ne);//Get number of edges. 
> >>> ierr = DMGetStratumSize(dm,"depth", 0, &nv);//Get number of vertices. > >>> /* > >>> PetscPrintf(PETSC_COMM_WORLD,"global start = %10d\t global end = > %10d\n",pStart,pEnd); > >>> PetscPrintf(PETSC_COMM_WORLD,"#cells = %10d\t i = %10d\t i < > %10d\n",nc,cStart,cEnd); > >>> PetscPrintf(PETSC_COMM_WORLD,"#faces = %10d\t i = %10d\t i < > %10d\n",nf,fStart,fEnd); > >>> PetscPrintf(PETSC_COMM_WORLD,"#edges = %10d\t i = %10d\t i < > %10d\n",ne,eStart,eEnd); > >>> PetscPrintf(PETSC_COMM_WORLD,"#vertices = %10d\t i = %10d\t i < > %10d\n",nv,vStart,vEnd); > >>> */ > >>> //MESH > IMPORT================================================================= > >>> > >>> //NOTE: This section extremely hardcoded right now. > >>> //Current setup would only support P1 meshes. > >>> //MEMORY ALLOCATION > ========================================================== > >>> ierr = PetscSectionCreate(PETSC_COMM_WORLD, &s); CHKERRQ(ierr); > >>> //The chart is akin to a contiguous memory storage allocation. Each > chart entry is associated > >>> //with a "thing," could be a vertex, face, cell, or edge, or > anything else. > >>> ierr = PetscSectionSetChart(s, pStart, pEnd); CHKERRQ(ierr); > >>> //For each "thing" in the chart, additional room can be made. This > is helpful for associating > >>> //nodes to multiple degrees of freedom. These commands help > associate nodes with > >>> for(ii = cStart; ii < cEnd; ii++){//Cell loop. > >>> ierr = PetscSectionSetDof(s, ii, 0);CHKERRQ(ierr);}//NOTE: > Currently no dof's associated with cells. > >>> for(ii = fStart; ii < fEnd; ii++){//Face loop. > >>> ierr = PetscSectionSetDof(s, ii, 0);CHKERRQ(ierr);}//NOTE: > Currently no dof's associated with faces. > >>> for(ii = vStart; ii < vEnd; ii++){//Vertex loop. > >>> ierr = PetscSectionSetDof(s, ii, dof);CHKERRQ(ierr);}//Sets x, y, > and z displacements as dofs. > >>> for(ii = eStart; ii < eEnd; ii++){//Edge loop > >>> ierr = PetscSectionSetDof(s, ii, 0);CHKERRQ(ierr);}//NOTE: > Currently no dof's associated with edges. > >>> ierr = PetscSectionSetUp(s); CHKERRQ(ierr); > >>> ierr = > PetscSectionGetStorageSize(s,&sizeK);CHKERRQ(ierr);//Determine the size of > the global stiffness matrix. > >>> ierr = DMSetLocalSection(dm,s); CHKERRQ(ierr);//Associate the > PetscSection with the DM object. > >>> //PetscErrorCode DMCreateGlobalVector(DM dm,Vec *vec) > >>> //ierr = DMCreateGlobalVector(dm,&U); CHKERRQ(ierr); > >>> PetscSectionDestroy(&s); > >>> //PetscPrintf(PETSC_COMM_WORLD,"sizeK = %10d\n",sizeK); > >>> > >>> //OBJECT > SETUP================================================================ > >>> //Global stiffness matrix. > >>> //PetscErrorCode DMCreateMatrix(DM dm,Mat *mat) > >>> > >>> //This makes the loop fast. > >>> ierr = DMCreateMatrix(dm,&K); > >>> > >>> //This makes the loop uber slow. > >>> //ierr = MatCreate(PETSC_COMM_WORLD,&K); CHKERRQ(ierr); > >>> //ierr = MatSetType(K,MATAIJ); CHKERRQ(ierr);// Global stiffness > matrix set to some sparse type. > >>> //ierr = MatSetSizes(K,PETSC_DECIDE,PETSC_DECIDE,sizeK,sizeK); > CHKERRQ(ierr); > >>> //ierr = MatSetUp(K); CHKERRQ(ierr); > >>> > >>> //Displacement vector. > >>> ierr = VecCreate(PETSC_COMM_WORLD,&U); CHKERRQ(ierr); > >>> ierr = VecSetType(U,VECSTANDARD); CHKERRQ(ierr);// Global stiffness > matrix set to some sparse type. > >>> ierr = VecSetSizes(U,PETSC_DECIDE,sizeK); CHKERRQ(ierr); > >>> > >>> //Load vector. 
> >>> ierr = VecCreate(PETSC_COMM_WORLD,&F); CHKERRQ(ierr); > >>> ierr = VecSetType(F,VECSTANDARD); CHKERRQ(ierr);// Global stiffness > matrix set to some sparse type. > >>> ierr = VecSetSizes(F,PETSC_DECIDE,sizeK); CHKERRQ(ierr); > >>> //OBJECT > SETUP================================================================ > >>> > >>> //WARNING: This loop is currently hardcoded for P1 elements only! > Must Figure > >>> //out a clever way to modify to accomodate Pn (n>1) elements. > >>> > >>> //BEGIN GLOBAL STIFFNESS MATRIX > BUILDER======================================= > >>> t1 = MPI_Wtime(); > >>> > >>> > //PREALLOCATIONS============================================================== > >>> ierr = ConstitutiveMatrix(&matC,"isotropic",0); CHKERRQ(ierr); > >>> struct preKE preKEtetra4; > >>> ierr = InitializeKEpreallocation(&preKEtetra4,"tetra4"); > CHKERRQ(ierr); > >>> ierr = MatCreate(PETSC_COMM_WORLD,&KE); CHKERRQ(ierr); //SEQUENTIAL > >>> ierr = MatSetSizes(KE,PETSC_DECIDE,PETSC_DECIDE,12,12); > CHKERRQ(ierr); //SEQUENTIAL > >>> ierr = MatSetType(KE,MATDENSE); CHKERRQ(ierr); //SEQUENTIAL > >>> ierr = MatSetUp(KE); CHKERRQ(ierr); > >>> PetscReal x_tetra4[4], y_tetra4[4],z_tetra4[4], > >>> x_hex8[8], y_hex8[8],z_hex8[8], > >>> *x,*y,*z; > >>> PetscInt *EDOF,edof_tetra4[12],edof_hex8[24]; > >>> DMPolytopeType previous = DM_POLYTOPE_UNKNOWN; > >>> > //PREALLOCATIONS============================================================== > >>> > >>> > >>> > >>> for(ii=cStart;ii >>> ierr = DMPlexGetTransitiveClosure(dm, ii, useCone, &size_closure, > &closure); CHKERRQ(ierr); > >>> ierr = DMPlexGetCellType(dm, ii, &celltype); CHKERRQ(ierr); > >>> //IMPORTANT NOTE: MOST OF THIS LOOP SHOULD BE INCLUDED IN THE KE3D > function. > >>> if(previous != celltype){ > >>> //PetscPrintf(PETSC_COMM_WORLD,"run \n"); > >>> if(celltype == DM_POLYTOPE_TETRAHEDRON){ > >>> x = x_tetra4; > >>> y = y_tetra4; > >>> z = z_tetra4; > >>> EDOF = edof_tetra4; > >>> }//end if. > >>> else if(celltype == DM_POLYTOPE_HEXAHEDRON){ > >>> x = x_hex8; > >>> y = y_hex8; > >>> z = z_hex8; > >>> EDOF = edof_hex8; > >>> }//end else if. > >>> } > >>> previous = celltype; > >>> > >>> //PetscPrintf(PETSC_COMM_WORLD,"Cell # %4i\t",ii); > >>> cells=0; > >>> edges=0; > >>> vertices=0; > >>> faces=0; > >>> kk = 0; > >>> for(jj=0;jj<(2*size_closure);jj+=2){//Scan the closure of the > current cell. > >>> //Use information from the DM's strata to determine composition > of cell_ii. > >>> if(vStart <= closure[jj] && closure[jj]< vEnd){//Check for > vertices. > >>> //PetscPrintf(PETSC_COMM_WORLD,"%5i\t",closure[jj]); > >>> indexXYZ = dim*(closure[jj]-vStart);//Linear index of > x-coordinate in the xyz_el array. > >>> > >>> *(x+vertices) = xyz_el[indexXYZ]; > >>> *(y+vertices) = xyz_el[indexXYZ+1];//Extract Y-coordinates of > the current vertex. > >>> *(z+vertices) = xyz_el[indexXYZ+2];//Extract Y-coordinates of > the current vertex. > >>> *(EDOF + kk) = indexXYZ; > >>> *(EDOF + kk+1) = indexXYZ+1; > >>> *(EDOF + kk+2) = indexXYZ+2; > >>> kk+=3; > >>> vertices++;//Update vertex counter. > >>> }//end if > >>> else if(eStart <= closure[jj] && closure[jj]< eEnd){//Check for > edge ID's > >>> edges++; > >>> }//end else ifindexK > >>> else if(fStart <= closure[jj] && closure[jj]< fEnd){//Check for > face ID's > >>> faces++; > >>> }//end else if > >>> else if(cStart <= closure[jj] && closure[jj]< cEnd){//Check for > cell ID's > >>> cells++; > >>> }//end else if > >>> }//end "jj" loop. 
> >>> ierr = tetra4(x,y,z,&preKEtetra4,&matC,&KE); CHKERRQ(ierr); > //Generate the element stiffness matrix for this cell. > >>> ierr = MatDenseGetArray(KE,&KEdata); CHKERRQ(ierr); > >>> ierr = MatSetValues(K,12,EDOF,12,EDOF,KEdata,ADD_VALUES); > CHKERRQ(ierr);//WARNING: HARDCODED FOR TETRAHEDRAL P1 ELEMENTS ONLY > !!!!!!!!!!!!!!!!!!!!!!! > >>> ierr = MatDenseRestoreArray(KE,&KEdata); CHKERRQ(ierr); > >>> ierr = DMPlexRestoreTransitiveClosure(dm, ii,useCone, &size_closure, > &closure); CHKERRQ(ierr); > >>> }//end "ii" loop. > >>> ierr = MatAssemblyBegin(K,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); > >>> ierr = MatAssemblyEnd(K,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); > >>> //ierr = MatView(K,PETSC_VIEWER_DRAW_WORLD); CHKERRQ(ierr); > >>> //END GLOBAL STIFFNESS MATRIX > BUILDER=========================================== > >>> t2 = MPI_Wtime(); > >>> PetscPrintf(PETSC_COMM_WORLD,"K build time: %10f\n",t2-t1); > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> t1 = MPI_Wtime(); > >>> //BEGIN BOUNDARY CONDITION > ENFORCEMENT========================================== > >>> IS TrianglesIS, physicalsurfaceID;//, VerticesIS; > >>> PetscInt numsurfvals, > >>> //numRows, > >>> dof_offset,numTri; > >>> const PetscInt *surfvals, > >>> //*pinZID, > >>> *TriangleID; > >>> PetscScalar diag =1; > >>> PetscReal area,force; > >>> //NOTE: Petsc can read/assign labels. Eeach label may posses multiple > "values." > >>> //These values act as tags within a tag. > >>> //IMPORTANT NOTE: The below line needs a safety. If a mesh that does > not feature > >>> //face sets is imported, the code in its current state will crash!!!. > This is currently > >>> //hardcoded for the test mesh. > >>> ierr = DMGetLabel(dm, "Face Sets", &physicalgroups); > CHKERRQ(ierr);//Inspects Physical surface groups defined by gmsh (if any). > >>> ierr = DMLabelGetValueIS(physicalgroups, &physicalsurfaceID); > CHKERRQ(ierr);//Gets the physical surface ID's defined in gmsh (as > specified in the .geo file). > >>> ierr = ISGetIndices(physicalsurfaceID,&surfvals); CHKERRQ(ierr);//Get > a pointer to the actual surface values. > >>> ierr = DMLabelGetNumValues(physicalgroups, &numsurfvals); > CHKERRQ(ierr);//Gets the number of different values that the label assigns. > >>> for(ii=0;ii label. > >>> //PetscPrintf(PETSC_COMM_WORLD,"Values = %5i\n",surfvals[ii]); > >>> //PROBLEM: The surface values are hardcoded in the gmsh file. We > need to adopt standard "codes" > >>> //that we can give to users when they make their meshes so that this > code recognizes the Type > >>> // of boundary conditions that are to be imposed. > >>> if(surfvals[ii] == pinXcode){ > >>> dof_offset = 0; > >>> dirichletBC = PETSC_TRUE; > >>> }//end if. > >>> else if(surfvals[ii] == pinZcode){ > >>> dof_offset = 2; > >>> dirichletBC = PETSC_TRUE; > >>> }//end else if. > >>> else if(surfvals[ii] == forceZcode){ > >>> dof_offset = 2; > >>> neumannBC = PETSC_TRUE; > >>> }//end else if. > >>> > >>> ierr = DMLabelGetStratumIS(physicalgroups, surfvals[ii], > &TrianglesIS); CHKERRQ(ierr);//Get the ID's (as an IS) of the surfaces > belonging to value 11. > >>> //PROBLEM: DMPlexGetConeRecursiveVertices returns an array with > repeated node ID's. For each repetition, the lines that enforce BC's > unnecessarily re-run. > >>> ierr = ISGetSize(TrianglesIS,&numTri); CHKERRQ(ierr); > >>> ierr = ISGetIndices(TrianglesIS,&TriangleID); CHKERRQ(ierr);//Get a > pointer to the actual surface values. 
> >>> for(kk=0;kk >>> ierr = DMPlexGetTransitiveClosure(dm, TriangleID[kk], useCone, > &size_closure, &closure); CHKERRQ(ierr); > >>> if(neumannBC){ > >>> ierr = DMPlexComputeCellGeometryFVM(dm, TriangleID[kk], > &area,PETSC_NULL,PETSC_NULL); CHKERRQ(ierr); > >>> force = traction*area/3;//WARNING: The 3 here is hardcoded for a > purely tetrahedral mesh only!!!!!!!!!! > >>> } > >>> for(jj=0;jj<(2*size_closure);jj+=2){ > >>> //PetscErrorCode DMPlexComputeCellGeometryFVM(DM dm, PetscInt > cell, PetscReal *vol, PetscReal centroid[], PetscReal normal[]) > >>> if(vStart <= closure[jj] && closure[jj]< vEnd){//Check for > vertices. > >>> indexK = dof*(closure[jj] - vStart) + dof_offset; //Compute > the dof ID's in the K matrix. > >>> if(dirichletBC){//Boundary conditions requiring an edit of K > matrix. > >>> ierr = MatZeroRows(K,1,&indexK,diag,NULL,NULL); > CHKERRQ(ierr); > >>> }//end if. > >>> else if(neumannBC){//Boundary conditions requiring an edit of > RHS vector. > >>> ierr = VecSetValue(F,indexK,force,ADD_VALUES); > CHKERRQ(ierr); > >>> }// end else if. > >>> }//end if. > >>> }//end "jj" loop. > >>> ierr = DMPlexRestoreTransitiveClosure(dm, closure[jj],useCone, > &size_closure, &closure); CHKERRQ(ierr); > >>> }//end "kk" loop. > >>> ierr = ISRestoreIndices(TrianglesIS,&TriangleID); CHKERRQ(ierr); > >>> > >>> /* > >>> ierr = DMPlexGetConeRecursiveVertices(dm, TrianglesIS, &VerticesIS); > CHKERRQ(ierr);//Get the ID's (as an IS) of the vertices that make up the > surfaces of value 11. > >>> ierr = ISGetSize(VerticesIS,&numRows); CHKERRQ(ierr);//Get number of > flagged vertices (this includes repeated indices for faces that share > nodes). > >>> ierr = ISGetIndices(VerticesIS,&pinZID); CHKERRQ(ierr);//Get a > pointer to the actual surface values. > >>> if(dirichletBC){//Boundary conditions requiring an edit of K matrix. > >>> for(kk=0;kk >>> indexK = 3*(pinZID[kk] - vStart) + dof_offset; //Compute the dof > ID's in the K matrix. (NOTE: the 3* ishardcoded for 3 degrees of freedom, > tie this to a variable in the FUTURE.) > >>> ierr = MatZeroRows(K,1,&indexK,diag,NULL,NULL); CHKERRQ(ierr); > >>> }//end "kk" loop. > >>> }//end if. > >>> else if(neumannBC){//Boundary conditions requiring an edit of RHS > vector. > >>> for(kk=0;kk >>> indexK = 3*(pinZID[kk] - vStart) + dof_offset; > >>> ierr = VecSetValue(F,indexK,traction,INSERT_VALUES); > CHKERRQ(ierr); > >>> }//end "kk" loop. > >>> }// end else if. > >>> ierr = ISRestoreIndices(VerticesIS,&pinZID); CHKERRQ(ierr); > >>> */ > >>> dirichletBC = PETSC_FALSE; > >>> neumannBC = PETSC_FALSE; > >>> }//end "ii" loop. > >>> ierr = ISRestoreIndices(physicalsurfaceID,&surfvals); CHKERRQ(ierr); > >>> //ierr = ISRestoreIndices(VerticesIS,&pinZID); CHKERRQ(ierr); > >>> ierr = ISDestroy(&physicalsurfaceID); CHKERRQ(ierr); > >>> //ierr = ISDestroy(&VerticesIS); CHKERRQ(ierr); > >>> ierr = ISDestroy(&TrianglesIS); CHKERRQ(ierr); > >>> //END BOUNDARY CONDITION > ENFORCEMENT============================================ > >>> t2 = MPI_Wtime(); > >>> PetscPrintf(PETSC_COMM_WORLD,"BC imposition time: %10f\n",t2-t1); > >>> > >>> /* > >>> PetscInt kk = 0; > >>> for(ii=vStart;ii >>> kk++; > >>> PetscPrintf(PETSC_COMM_WORLD,"Vertex #%4i\t x = %10.9f\ty = > %10.9f\tz = %10.9f\n",ii,xyz_el[3*kk],xyz_el[3*kk+1],xyz_el[3*kk+2]); > >>> }// end "ii" loop. 
> >>> */ > >>> > >>> t1 = MPI_Wtime(); > >>> > //SOLVER======================================================================== > >>> ierr = KSPCreate(PETSC_COMM_WORLD,&ksp); CHKERRQ(ierr); > >>> ierr = KSPSetOperators(ksp,K,K); CHKERRQ(ierr); > >>> ierr = KSPSetFromOptions(ksp); CHKERRQ(ierr); > >>> ierr = KSPSolve(ksp,F,U); CHKERRQ(ierr); > >>> t2 = MPI_Wtime(); > >>> //ierr = KSPView(ksp,PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr); > >>> > //SOLVER======================================================================== > >>> t2 = MPI_Wtime(); > >>> PetscPrintf(PETSC_COMM_WORLD,"Solver time: %10f\n",t2-t1); > >>> ierr = VecRestoreArray(XYZ,&xyz_el); CHKERRQ(ierr);//Get pointer to > vector's data. > >>> > >>> //BEGIN MAX/MIN > DISPLACEMENTS=================================================== > >>> IS ISux,ISuy,ISuz; > >>> Vec UX,UY,UZ; > >>> PetscReal UXmax,UYmax,UZmax,UXmin,UYmin,UZmin; > >>> ierr = ISCreateStride(PETSC_COMM_WORLD,nv,0,3,&ISux); CHKERRQ(ierr); > >>> ierr = ISCreateStride(PETSC_COMM_WORLD,nv,1,3,&ISuy); CHKERRQ(ierr); > >>> ierr = ISCreateStride(PETSC_COMM_WORLD,nv,2,3,&ISuz); CHKERRQ(ierr); > >>> > >>> //PetscErrorCode VecGetSubVector(Vec X,IS is,Vec *Y) > >>> ierr = VecGetSubVector(U,ISux,&UX); CHKERRQ(ierr); > >>> ierr = VecGetSubVector(U,ISuy,&UY); CHKERRQ(ierr); > >>> ierr = VecGetSubVector(U,ISuz,&UZ); CHKERRQ(ierr); > >>> > >>> //PetscErrorCode VecMax(Vec x,PetscInt *p,PetscReal *val) > >>> ierr = VecMax(UX,PETSC_NULL,&UXmax); CHKERRQ(ierr); > >>> ierr = VecMax(UY,PETSC_NULL,&UYmax); CHKERRQ(ierr); > >>> ierr = VecMax(UZ,PETSC_NULL,&UZmax); CHKERRQ(ierr); > >>> > >>> ierr = VecMin(UX,PETSC_NULL,&UXmin); CHKERRQ(ierr); > >>> ierr = VecMin(UY,PETSC_NULL,&UYmin); CHKERRQ(ierr); > >>> ierr = VecMin(UZ,PETSC_NULL,&UZmin); CHKERRQ(ierr); > >>> > >>> PetscPrintf(PETSC_COMM_WORLD,"%10f\t <= ux <= %10f\n",UXmin,UXmax); > >>> PetscPrintf(PETSC_COMM_WORLD,"%10f\t <= uy <= %10f\n",UYmin,UYmax); > >>> PetscPrintf(PETSC_COMM_WORLD,"%10f\t <= uz <= %10f\n",UZmin,UZmax); > >>> > >>> > >>> > >>> > >>> //BEGIN OUTPUT > SOLUTION========================================================= > >>> if(saveASCII){ > >>> ierr = PetscViewerASCIIOpen(PETSC_COMM_WORLD,"XYZ.txt",&XYZviewer); > >>> ierr = VecView(XYZ,XYZviewer); CHKERRQ(ierr); > >>> ierr = PetscViewerASCIIOpen(PETSC_COMM_WORLD,"U.txt",&XYZpUviewer); > >>> ierr = VecView(U,XYZpUviewer); CHKERRQ(ierr); > >>> PetscViewerDestroy(&XYZviewer); PetscViewerDestroy(&XYZpUviewer); > >>> > >>> }//end if. 
> >>> if(saveVTK){ > >>> const char *meshfile = "starting_mesh.vtk", > >>> *deformedfile = "deformed_mesh.vtk"; > >>> ierr = > PetscViewerVTKOpen(PETSC_COMM_WORLD,meshfile,FILE_MODE_WRITE,&XYZviewer); > CHKERRQ(ierr); > >>> //PetscErrorCode DMSetAuxiliaryVec(DM dm, DMLabel label, PetscInt > value, Vec aux) > >>> DMLabel UXlabel,UYlabel, UZlabel; > >>> //PetscErrorCode DMLabelCreate(MPI_Comm comm, const char name[], > DMLabel *label) > >>> ierr = DMLabelCreate(PETSC_COMM_WORLD, "X-Displacement", &UXlabel); > CHKERRQ(ierr); > >>> ierr = DMLabelCreate(PETSC_COMM_WORLD, "Y-Displacement", &UYlabel); > CHKERRQ(ierr); > >>> ierr = DMLabelCreate(PETSC_COMM_WORLD, "Z-Displacement", &UZlabel); > CHKERRQ(ierr); > >>> ierr = DMSetAuxiliaryVec(dm,UXlabel, 1, UX); CHKERRQ(ierr); > >>> ierr = DMSetAuxiliaryVec(dm,UYlabel, 1, UY); CHKERRQ(ierr); > >>> ierr = DMSetAuxiliaryVec(dm,UZlabel, 1, UZ); CHKERRQ(ierr); > >>> //PetscErrorCode PetscViewerVTKAddField(PetscViewer > viewer,PetscObject dm,PetscErrorCode > (*PetscViewerVTKWriteFunction)(PetscObject,PetscViewer),PetscInt > fieldnum,PetscViewerVTKFieldType fieldtype,PetscBool checkdm,PetscObject > vec) > >>> > >>> > >>> > >>> //ierr = PetscViewerVTKAddField(XYZviewer, dm,PetscErrorCode > (*PetscViewerVTKWriteFunction)(Vec,PetscViewer),PETSC_DEFAULT,PETSC_VTK_POINT_FIELD,PETSC_FALSE,UX); > >>> ierr = PetscViewerVTKAddField(XYZviewer, > (PetscObject)dm,&PetscViewerVTKWriteFunction,PETSC_DEFAULT,PETSC_VTK_POINT_FIELD,PETSC_FALSE,(PetscObject)UX); > >>> > >>> > >>> ierr = DMPlexVTKWriteAll((PetscObject)dm, XYZviewer); CHKERRQ(ierr); > >>> ierr = VecAXPY(XYZ,1,U); CHKERRQ(ierr);//Add displacement field to > the mesh coordinates to deform. > >>> ierr = > PetscViewerVTKOpen(PETSC_COMM_WORLD,deformedfile,FILE_MODE_WRITE,&XYZpUviewer); > CHKERRQ(ierr); > >>> ierr = DMPlexVTKWriteAll((PetscObject)dm, XYZpUviewer); > CHKERRQ(ierr);// > >>> PetscViewerDestroy(&XYZviewer); PetscViewerDestroy(&XYZpUviewer); > >>> > >>> }//end else if. > >>> else{ > >>> ierr = PetscPrintf(PETSC_COMM_WORLD,"No output format specified! > Files not saved.\n"); CHKERRQ(ierr); > >>> }//end else. > >>> > >>> > >>> //END OUTPUT > SOLUTION=========================================================== > >>> VecDestroy(&UX); ISDestroy(&ISux); > >>> VecDestroy(&UY); ISDestroy(&ISuy); > >>> VecDestroy(&UZ); ISDestroy(&ISuz); > >>> //END MAX/MIN > DISPLACEMENTS===================================================== > >>> > >>> > //CLEANUP===================================================================== > >>> DMDestroy(&dm); > >>> KSPDestroy(&ksp); > >>> MatDestroy(&K); MatDestroy(&KE); MatDestroy(&matC); > //MatDestroy(preKEtetra4.matB); MatDestroy(preKEtetra4.matBTCB); > >>> VecDestroy(&U); VecDestroy(&F); > >>> > >>> //DMLabelDestroy(&physicalgroups);//Destroyig the DM destroys the > label. > >>> > //CLEANUP===================================================================== > >>> //PetscErrorCode PetscMallocDump(FILE *fp) > >>> //ierr = PetscMallocDump(NULL); > >>> return PetscFinalize();//And the machine shall rest.... > >>> }//end main. > >>> > >>> PetscErrorCode tetra4(PetscScalar* X,PetscScalar* Y, PetscScalar* > Z,struct preKE *P, Mat* matC, Mat* KE){ > >>> //INPUTS: > >>> //X: Global X coordinates of the elemental nodes. > >>> //Y: Global Y coordinates of the elemental nodes. > >>> //Z: Global Z coordinates of the elemental nodes. > >>> //J: Jacobian matrix. > >>> //invJ: Inverse Jacobian matrix. 
> >>> PetscErrorCode ierr; > >>> //For current quadrature point, get dPsi/dXi_i Xi_i = {Xi,Eta,Zeta} > >>> /* > >>> P->dPsidXi[0] = +1.; P->dPsidEta[0] = 0.0; P->dPsidZeta[0] = 0.0; > >>> P->dPsidXi[1] = 0.0; P->dPsidEta[1] = +1.; P->dPsidZeta[1] = 0.0; > >>> P->dPsidXi[2] = 0.0; P->dPsidEta[2] = 0.0; P->dPsidZeta[2] = +1.; > >>> P->dPsidXi[3] = -1.; P->dPsidEta[3] = -1.; P->dPsidZeta[3] = -1.; > >>> */ > >>> //Populate the Jacobian matrix. > >>> P->J[0][0] = X[0] - X[3]; > >>> P->J[0][1] = Y[0] - Y[3]; > >>> P->J[0][2] = Z[0] - Z[3]; > >>> P->J[1][0] = X[1] - X[3]; > >>> P->J[1][1] = Y[1] - Y[3]; > >>> P->J[1][2] = Z[1] - Z[3]; > >>> P->J[2][0] = X[2] - X[3]; > >>> P->J[2][1] = Y[2] - Y[3]; > >>> P->J[2][2] = Z[2] - Z[3]; > >>> > >>> //Determinant of the 3x3 Jacobian. (Expansion along 1st row). > >>> P->minor00 = P->J[1][1]*P->J[2][2] - P->J[2][1]*P->J[1][2];//Reuse > when finding InvJ. > >>> P->minor01 = P->J[1][0]*P->J[2][2] - P->J[2][0]*P->J[1][2];//Reuse > when finding InvJ. > >>> P->minor02 = P->J[1][0]*P->J[2][1] - P->J[2][0]*P->J[1][1];//Reuse > when finding InvJ. > >>> P->detJ = P->J[0][0]*P->minor00 - P->J[0][1]*P->minor01 + > P->J[0][2]*P->minor02; > >>> //Inverse of the 3x3 Jacobian > >>> P->invJ[0][0] = +P->minor00/P->detJ;//Reuse precomputed minor. > >>> P->invJ[0][1] = -(P->J[0][1]*P->J[2][2] - > P->J[0][2]*P->J[2][1])/P->detJ; > >>> P->invJ[0][2] = +(P->J[0][1]*P->J[1][2] - > P->J[1][1]*P->J[0][2])/P->detJ; > >>> P->invJ[1][0] = -P->minor01/P->detJ;//Reuse precomputed minor. > >>> P->invJ[1][1] = +(P->J[0][0]*P->J[2][2] - > P->J[0][2]*P->J[2][0])/P->detJ; > >>> P->invJ[1][2] = -(P->J[0][0]*P->J[1][2] - > P->J[1][0]*P->J[0][2])/P->detJ; > >>> P->invJ[2][0] = +P->minor02/P->detJ;//Reuse precomputed minor. > >>> P->invJ[2][1] = -(P->J[0][0]*P->J[2][1] - > P->J[0][1]*P->J[2][0])/P->detJ; > >>> P->invJ[2][2] = +(P->J[0][0]*P->J[1][1] - > P->J[0][1]*P->J[1][0])/P->detJ; > >>> > >>> //*****************STRAIN MATRIX > (B)************************************** > >>> for(P->m=0;P->mN;P->m++){//Scan all shape functions. > >>> > >>> P->x_in = 0 + P->m*3;//Every 3rd column starting at 0 > >>> P->y_in = P->x_in +1;//Every 3rd column starting at 1 > >>> P->z_in = P->y_in +1;//Every 3rd column starting at 2 > >>> > >>> P->dX[0] = P->invJ[0][0]*P->dPsidXi[P->m] + > P->invJ[0][1]*P->dPsidEta[P->m] + P->invJ[0][2]*P->dPsidZeta[P->m]; > >>> P->dY[0] = P->invJ[1][0]*P->dPsidXi[P->m] + > P->invJ[1][1]*P->dPsidEta[P->m] + P->invJ[1][2]*P->dPsidZeta[P->m]; > >>> P->dZ[0] = P->invJ[2][0]*P->dPsidXi[P->m] + > P->invJ[2][1]*P->dPsidEta[P->m] + P->invJ[2][2]*P->dPsidZeta[P->m]; > >>> > >>> P->dX[1] = P->dZ[0]; P->dX[2] = P->dY[0]; > >>> P->dY[1] = P->dZ[0]; P->dY[2] = P->dX[0]; > >>> P->dZ[1] = P->dX[0]; P->dZ[2] = P->dY[0]; > >>> > >>> ierr = > MatSetValues(P->matB,3,P->x_insert,1,&(P->x_in),P->dX,INSERT_VALUES); > CHKERRQ(ierr); > >>> ierr = > MatSetValues(P->matB,3,P->y_insert,1,&(P->y_in),P->dY,INSERT_VALUES); > CHKERRQ(ierr); > >>> ierr = > MatSetValues(P->matB,3,P->z_insert,1,&(P->z_in),P->dZ,INSERT_VALUES); > CHKERRQ(ierr); > >>> > >>> }//end "m" loop. > >>> ierr = MatAssemblyBegin(P->matB,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); > >>> ierr = MatAssemblyEnd(P->matB,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); > >>> //*****************STRAIN MATRIX > (B)************************************** > >>> > >>> //Compute the matrix product B^t*C*B, scale it by the > quadrature weights and add to KE. 
> >>> P->weight = -P->detJ/6; > >>> > >>> ierr = MatZeroEntries(*KE); CHKERRQ(ierr); > >>> ierr = > MatPtAP(*matC,P->matB,MAT_INITIAL_MATRIX,PETSC_DEFAULT,&(P->matBTCB));CHKERRQ(ierr); > >>> ierr = MatScale(P->matBTCB,P->weight); CHKERRQ(ierr); > >>> ierr = MatAssemblyBegin(P->matBTCB,MAT_FINAL_ASSEMBLY); > CHKERRQ(ierr); > >>> ierr = MatAssemblyEnd(P->matBTCB,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); > >>> ierr = MatAXPY(*KE,1,P->matBTCB,DIFFERENT_NONZERO_PATTERN); > CHKERRQ(ierr);//Add contribution of current quadrature point to KE. > >>> > >>> //ierr = > MatPtAP(*matC,P->matB,MAT_INITIAL_MATRIX,PETSC_DEFAULT,KE);CHKERRQ(ierr); > >>> //ierr = MatScale(*KE,P->weight); CHKERRQ(ierr); > >>> > >>> ierr = MatAssemblyBegin(*KE,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); > >>> ierr = MatAssemblyEnd(*KE,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); > >>> > >>> //Cleanup > >>> return ierr; > >>> }//end tetra4. > >>> > >>> PetscErrorCode ConstitutiveMatrix(Mat *matC,const char* type,PetscInt > materialID){ > >>> PetscErrorCode ierr; > >>> PetscBool isotropic = PETSC_FALSE, > >>> orthotropic = PETSC_FALSE; > >>> //PetscErrorCode PetscStrcmp(const char a[],const char > b[],PetscBool *flg) > >>> ierr = PetscStrcmp(type,"isotropic",&isotropic); > >>> ierr = PetscStrcmp(type,"orthotropic",&orthotropic); > >>> ierr = MatCreate(PETSC_COMM_WORLD,matC); CHKERRQ(ierr); > >>> ierr = MatSetSizes(*matC,PETSC_DECIDE,PETSC_DECIDE,6,6); > CHKERRQ(ierr); > >>> ierr = MatSetType(*matC,MATAIJ); CHKERRQ(ierr); > >>> ierr = MatSetUp(*matC); CHKERRQ(ierr); > >>> > >>> if(isotropic){ > >>> PetscReal E,nu, M,L,vals[3]; > >>> switch(materialID){ > >>> case 0://Hardcoded properties for isotropic material #0 > >>> E = 200; > >>> nu = 1./3; > >>> break; > >>> case 1://Hardcoded properties for isotropic material #1 > >>> E = 96; > >>> nu = 1./3; > >>> break; > >>> }//end switch. > >>> M = E/(2*(1+nu)),//Lame's constant 1 ("mu"). > >>> L = E*nu/((1+nu)*(1-2*nu));//Lame's constant 2 ("lambda"). > >>> //PetscErrorCode MatSetValues(Mat mat,PetscInt m,const PetscInt > idxm[],PetscInt n,const PetscInt idxn[],const PetscScalar v[],InsertMode > addv) > >>> PetscInt idxn[3] = {0,1,2}; > >>> vals[0] = L+2*M; vals[1] = L; vals[2] = vals[1]; > >>> ierr = MatSetValues(*matC,1,&idxn[0],3,idxn,vals,INSERT_VALUES); > CHKERRQ(ierr); > >>> vals[1] = vals[0]; vals[0] = vals[2]; > >>> ierr = MatSetValues(*matC,1,&idxn[1],3,idxn,vals,INSERT_VALUES); > CHKERRQ(ierr); > >>> vals[2] = vals[1]; vals[1] = vals[0]; > >>> ierr = MatSetValues(*matC,1,&idxn[2],3,idxn,vals,INSERT_VALUES); > CHKERRQ(ierr); > >>> ierr = MatSetValue(*matC,3,3,M,INSERT_VALUES); CHKERRQ(ierr); > >>> ierr = MatSetValue(*matC,4,4,M,INSERT_VALUES); CHKERRQ(ierr); > >>> ierr = MatSetValue(*matC,5,5,M,INSERT_VALUES); CHKERRQ(ierr); > >>> }//end if. > >>> /* > >>> else if(orthotropic){ > >>> switch(materialID){ > >>> case 0: > >>> break; > >>> case 1: > >>> break; > >>> }//end switch. > >>> }//end else if. 
> >>> */ > >>> ierr = MatAssemblyBegin(*matC,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); > >>> ierr = MatAssemblyEnd(*matC,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); > >>> //MatView(*matC,0); > >>> return ierr; > >>> }//End ConstitutiveMatrix > >>> > >>> PetscErrorCode InitializeKEpreallocation(struct preKE *P,const char* > type){ > >>> PetscErrorCode ierr; > >>> PetscBool istetra4 = PETSC_FALSE, > >>> ishex8 = PETSC_FALSE; > >>> ierr = PetscStrcmp(type,"tetra4",&istetra4); CHKERRQ(ierr); > >>> ierr = PetscStrcmp(type,"hex8",&ishex8); CHKERRQ(ierr); > >>> if(istetra4){ > >>> P->sizeKE = 12; > >>> P->N = 4; > >>> }//end if. > >>> else if(ishex8){ > >>> P->sizeKE = 24; > >>> P->N = 8; > >>> }//end else if. > >>> > >>> > >>> P->x_insert[0] = 0; P->x_insert[1] = 3; P->x_insert[2] = 5; > >>> P->y_insert[0] = 1; P->y_insert[1] = 4; P->y_insert[2] = 5; > >>> P->z_insert[0] = 2; P->z_insert[1] = 3; P->z_insert[2] = 4; > >>> //Allocate memory for the differentiated shape function vectors. > >>> ierr = PetscMalloc1(P->N,&(P->dPsidXi)); CHKERRQ(ierr); > >>> ierr = PetscMalloc1(P->N,&(P->dPsidEta)); CHKERRQ(ierr); > >>> ierr = PetscMalloc1(P->N,&(P->dPsidZeta)); CHKERRQ(ierr); > >>> > >>> P->dPsidXi[0] = +1.; P->dPsidEta[0] = 0.0; P->dPsidZeta[0] = 0.0; > >>> P->dPsidXi[1] = 0.0; P->dPsidEta[1] = +1.; P->dPsidZeta[1] = 0.0; > >>> P->dPsidXi[2] = 0.0; P->dPsidEta[2] = 0.0; P->dPsidZeta[2] = +1.; > >>> P->dPsidXi[3] = -1.; P->dPsidEta[3] = -1.; P->dPsidZeta[3] = -1.; > >>> > >>> > >>> //Strain matrix. > >>> ierr = MatCreate(PETSC_COMM_WORLD,&(P->matB)); CHKERRQ(ierr); > >>> ierr = MatSetSizes(P->matB,PETSC_DECIDE,PETSC_DECIDE,6,P->sizeKE); > CHKERRQ(ierr);//Hardcoded > >>> ierr = MatSetType(P->matB,MATAIJ); CHKERRQ(ierr); > >>> ierr = MatSetUp(P->matB); CHKERRQ(ierr); > >>> > >>> //Contribution matrix. > >>> ierr = MatCreate(PETSC_COMM_WORLD,&(P->matBTCB)); CHKERRQ(ierr); > >>> ierr = > MatSetSizes(P->matBTCB,PETSC_DECIDE,PETSC_DECIDE,P->sizeKE,P->sizeKE); > CHKERRQ(ierr); > >>> ierr = MatSetType(P->matBTCB,MATAIJ); CHKERRQ(ierr); > >>> ierr = MatSetUp(P->matBTCB); CHKERRQ(ierr); > >>> > >>> //Element stiffness matrix. > >>> //ierr = MatCreateSeqDense(PETSC_COMM_SELF,12,12,NULL,&KE); > CHKERRQ(ierr); //PARALLEL > >>> > >>> return ierr; > >>> } > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From FERRANJ2 at my.erau.edu Thu Jan 6 16:30:41 2022 From: FERRANJ2 at my.erau.edu (Ferrand, Jesus A.) Date: Thu, 6 Jan 2022 22:30:41 +0000 Subject: [petsc-users] [EXTERNAL] Re: DM misuse causes massive memory leak? In-Reply-To: References: <87tueraunm.fsf@jedbrown.org> <87sfu2921h.fsf@jedbrown.org> <87a6g99507.fsf@jedbrown.org> Message-ID: Matt: Thanks for the immediate reply. My apologies, I misspelled the function name, it should have been "DMPlexGetConeRecursiveVertices()." Regarding the PetscSection use: I am looping through every single point in the DAG of my mesh. For each point I am assigning dof using PetscSectionSetDof(). I am also assigning dof to the corresponding fields using PetscSectionSetFieldDof(). I took Jed's advice and made a single field with 3 components, I named all of them. So, I used PetscSectionSetNumFields(), PetscSectionSetFieldComponents(), PetscSectionSetFieldName(), and PetscSectionSetComponentName(). 
Finally, I proceed to PetscSectionSetUp(), and DMSetLocalSection().
In my timed code blocks I am including DMPlexGetDepthStratum(), DMGetStratumSize(), and DMPlexGetChart() because I need their output to assign the dof to the PetscSection.
For what it's worth, I have my configure set to --with-debugging = 1.
________________________________
From: Matthew Knepley
Sent: Thursday, January 6, 2022 5:20 PM
To: Ferrand, Jesus A.
Cc: Jed Brown ; petsc-users
Subject: Re: [petsc-users] [EXTERNAL] Re: DM misuse causes massive memory leak?

On Thu, Jan 6, 2022 at 5:15 PM Ferrand, Jesus A. > wrote:

Jed:
DMPlexLabelComplete() has allowed me to speed up my code significantly (Many thanks!). I did not use DMAddBoundary() though. I figured I could obtain Index Sets (IS's) from the DAG for different depths and then IS's for the points that were flagged in Gmsh (after calling DMPlexLabelComplete()). I then intersected both IS's using ISIntersect() to get the DAG points corresponding to just vertices (and flagged by Gmsh) for Dirichlet BC's, and DAG points that are faces and flagged by Gmsh for Neumann BC's. I then use the intersected IS to edit a Mat and a RHS Vec manually.

I did further profiling and have found that the PetscSections are now the next biggest overhead. For Dirichlet BC's I make an array of vertex ID's and call MatSetZeroRows() to impose BC's on them through my K matrix. And yes, I'm solving the elasticity PDE. For Neumann BC's I use DMPlexGetRecursiveVertices() to edit my RHS vector.

I cannot find a function named DMPlexGetRecursiveVertices().

I want to keep the PetscSections since they preallocate my matrix rather well (the one from DMCreateMatrix()) but at the same time would like to remove them since they add overhead. Do you think DMAddBoundary() with the function call will be faster than my single calls to MatSetZeroRows() and DMPlexGetRecursiveVertices()?

PetscSection is really simple. Are you sure you are measuring long times there? What are you using it to do?

  Thanks,

    Matt
________________________________
From: Jed Brown >
Sent: Wednesday, January 5, 2022 5:44 PM
To: Ferrand, Jesus A. >
Cc: petsc-users >
Subject: Re: [EXTERNAL] Re: [petsc-users] DM misuse causes massive memory leak?

For something like displacement (and this sounds like elasticity), I would recommend using one field with three components. You can constrain a subset of the components to implement slip conditions.

You can use DMPlexLabelComplete(dm, label) to propagate those face labels to vertices.

"Ferrand, Jesus A." > writes:

> Thanks for the reply (I hit reply all this time).
>
> So, I set 3 fields:
> /*
> ierr = PetscSectionSetNumFields(s,dof); CHKERRQ(ierr);
> ierr = PetscSectionSetFieldName(s,0, "X-Displacement"); CHKERRQ(ierr); //Field ID is 0
> ierr = PetscSectionSetFieldName(s,1, "Y-Displacement"); CHKERRQ(ierr); //Field ID is 1
> ierr = PetscSectionSetFieldName(s,2, "Z-Displacement"); CHKERRQ(ierr); //Field ID is 2
> */
>
> I then loop through the vertices of my DMPlex
>
> /*
> for(ii = vStart; ii < vEnd; ii++){//Vertex loop.
> ierr = PetscSectionSetDof(s, ii, dof); CHKERRQ(ierr);
> ierr = PetscSectionSetFieldDof(s,ii,0,1); CHKERRQ(ierr);//One X-displacement per vertex (1 dof)
> ierr = PetscSectionSetFieldDof(s,ii,1,1); CHKERRQ(ierr);//One Y-displacement per vertex (1 dof)
> ierr = PetscSectionSetFieldDof(s,ii,2,1); CHKERRQ(ierr);//One Z-displacement per vertex (1 dof)
> }//Sets x, y, and z displacements as dofs.
> */
>
> I only associated fields with vertices, not with any other points in the DAG.
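(For reference, a minimal sketch of the single-field, three-component layout recommended above, assuming P1 elements so that only vertices carry dofs; dm, ierr, and K are the variables used elsewhere in this thread, and the calls shown are an illustration rather than the poster's actual code:)

    PetscSection s;
    PetscInt     pStart, pEnd, vStart, vEnd, p;

    ierr = DMPlexGetChart(dm, &pStart, &pEnd); CHKERRQ(ierr);
    ierr = DMPlexGetDepthStratum(dm, 0, &vStart, &vEnd); CHKERRQ(ierr);   /* depth 0 = vertices */
    ierr = PetscSectionCreate(PETSC_COMM_WORLD, &s); CHKERRQ(ierr);
    ierr = PetscSectionSetNumFields(s, 1); CHKERRQ(ierr);                 /* one field...         */
    ierr = PetscSectionSetFieldName(s, 0, "Displacement"); CHKERRQ(ierr);
    ierr = PetscSectionSetFieldComponents(s, 0, 3); CHKERRQ(ierr);        /* ...with 3 components */
    ierr = PetscSectionSetComponentName(s, 0, 0, "ux"); CHKERRQ(ierr);
    ierr = PetscSectionSetComponentName(s, 0, 1, "uy"); CHKERRQ(ierr);
    ierr = PetscSectionSetComponentName(s, 0, 2, "uz"); CHKERRQ(ierr);
    ierr = PetscSectionSetChart(s, pStart, pEnd); CHKERRQ(ierr);
    for (p = vStart; p < vEnd; ++p) {                                     /* dofs live on vertices only (P1) */
      ierr = PetscSectionSetDof(s, p, 3); CHKERRQ(ierr);
      ierr = PetscSectionSetFieldDof(s, p, 0, 3); CHKERRQ(ierr);
    }
    ierr = PetscSectionSetUp(s); CHKERRQ(ierr);
    ierr = DMSetLocalSection(dm, s); CHKERRQ(ierr);
    ierr = PetscSectionDestroy(&s); CHKERRQ(ierr);
    ierr = DMCreateMatrix(dm, &K); CHKERRQ(ierr);                         /* matrix comes back preallocated from the section */

Cells, faces, and edges are simply left without dofs here; the section calls themselves are cheap, so the timed cost should be dominated by the surrounding mesh queries rather than by PetscSection.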
Regarding the use of DMAddBoundary(), I mostly copied the usage shown in SNES example 77. I modified the function definition to simply set the dof to 0.0 as opposed to the coordinates. Below "physicalgroups" is the DMLabel that I got from gmsh, this flags Face points, not vertices. That is why I think the error log suggests that fields were never set. > > ierr = DMAddBoundary(dm, DM_BC_ESSENTIAL, "fixed", physicalgroups, 1, &surfvals[ii], fieldID, 0, NULL, (void (*)(void)) coordinates, NULL, NULL, NULL); CHKERRQ(ierr); > PetscErrorCode coordinates(PetscInt dim, PetscReal time, const PetscReal x[], PetscInt Nf, PetscScalar *u, void *ctx){ > const PetscInt Ncomp = dim; > PetscInt comp; > for (comp = 0; comp < Ncomp; ++comp) u[comp] = 0.0; > return 0; > } > > > ________________________________ > From: Jed Brown > > Sent: Wednesday, January 5, 2022 12:36 AM > To: Ferrand, Jesus A. > > Cc: petsc-users > > Subject: Re: [EXTERNAL] Re: [petsc-users] DM misuse causes massive memory leak? > > Please "reply all" include the list in the future. > > "Ferrand, Jesus A." > writes: > >> Forgot to say thanks for the reply (my bad). >> Yes, I was indeed forgetting to pre-allocate the sparse matrices when doing them by hand (complacency seeded by MATLAB's zeros()). Thank you, Jed, and Jeremy, for the hints! >> >> I have more questions, these ones about boundary conditions (I think these are for Matt). >> In my current code I set Dirichlet conditions directly on a Mat by calling MatSetZeroRows(). I profiled my code and found the part that applies them to be unnacceptably slow. In response, I've been trying to better pre-allocate Mats using PetscSections. I have found documentation for PetscSectionSetDof(), PetscSectionSetNumFields(), PetscSectionSetFieldName(), and PetscSectionSetFieldDof(), to set the size of my Mats and Vecs by calling DMSetLocalSection() followed by DMCreateMatrix() and DMCreateLocalVector() to get a RHS vector. This seems faster. >> >> In PetscSection, what is the difference between a "field" and a "component"? For example, I could have one field "Velocity" with three components ux, uy, and uz or perhaps three fields ux, uy, and uz each with a default component? > > It's just how you name them and how they appear in output. Usually "velocity" is better as a field with three components, but fields with other meaning (and perhaps different finite element spaces), such as pressure, would be different fields. Different components are always in the same FE space. > >> I am struggling now to impose boundary conditions after constraining dofs using PetscSection. My understanding is that constraining dof's reduces the size of the DM's matrix but it does not give the DM knowledge of what values the constrained dofs should have, right? >> >> I know that there is DMAddBoundary(), but I am unsure of how to use it. >From Gmsh I have a mesh with surface boundaries flagged. I'm not sure whether DMAddBoundary()will constrain the face, edge, or vertex points when I give it the DMLabel from Gmsh. (I need specific dof on the vertices to be constrained). I did some testing and I think DMAddBoundary() attempts to constrain the Face points (see error log below). I only associated fields with the vertices but not the Faces. I can extract the vertex points from the face label using DMPlexGetConeRecursiveVertices() but the output IS has repeated entries for the vertex points (many faces share the same vertex). Is there an easier way to get the vertex points from a gmsh surface tag? 
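(A sketch of the DMPlexLabelComplete() + ISIntersect() route mentioned elsewhere in this thread, which yields each flagged vertex exactly once; physicalgroups, pinZcode, dof, dof_offset, diag, K, vStart, and vEnd are the variables from the code above, and the point-to-row map is the same hand-rolled one used there:)

    IS              labelIS, vertIS, dirichletVertIS;
    const PetscInt *verts;
    PetscInt        nDir, i, *rows;

    ierr = DMPlexLabelComplete(dm, physicalgroups); CHKERRQ(ierr);                             /* propagate face labels down to edges and vertices */
    ierr = DMLabelGetStratumIS(physicalgroups, pinZcode, &labelIS); CHKERRQ(ierr);             /* every point carrying this label value */
    ierr = ISCreateStride(PETSC_COMM_WORLD, vEnd - vStart, vStart, 1, &vertIS); CHKERRQ(ierr); /* all vertex (depth-0) points */
    ierr = ISIntersect(labelIS, vertIS, &dirichletVertIS); CHKERRQ(ierr);                      /* keep only the labeled vertices, no repeats */
    ierr = ISGetLocalSize(dirichletVertIS, &nDir); CHKERRQ(ierr);
    ierr = ISGetIndices(dirichletVertIS, &verts); CHKERRQ(ierr);
    ierr = PetscMalloc1(nDir, &rows); CHKERRQ(ierr);
    for (i = 0; i < nDir; ++i) rows[i] = dof*(verts[i] - vStart) + dof_offset;                 /* same point-to-row map as above */
    ierr = MatZeroRows(K, nDir, rows, diag, NULL, NULL); CHKERRQ(ierr);                        /* one call for all constrained rows */
    ierr = PetscFree(rows); CHKERRQ(ierr);
    ierr = ISRestoreIndices(dirichletVertIS, &verts); CHKERRQ(ierr);
    ierr = ISDestroy(&labelIS); CHKERRQ(ierr);
    ierr = ISDestroy(&vertIS); CHKERRQ(ierr);
    ierr = ISDestroy(&dirichletVertIS); CHKERRQ(ierr);

Zeroing all the Dirichlet rows in a single MatZeroRows() call should also be noticeably cheaper than issuing one call per row.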
> > How did you call DMAddBoundary()? Are you using DM_BC_ESSENTIAL and a callback function that provides the inhomogeneous boundary condition? > >> I'm sorry this is a mouthful. >> >> [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [0]PETSC ERROR: Argument out of range >> [0]PETSC ERROR: Field number 2 must be in [0, 0) > > It looks like you haven't added these fields yet. > >> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. >> [0]PETSC ERROR: Petsc Release Version 3.16.0, unknown >> [0]PETSC ERROR: ./gmsh4 on a linux-c-dbg named F86 by jesus Tue Jan 4 15:19:57 2022 >> [0]PETSC ERROR: Configure options --with-32bits-pci-domain=1 --with-debugging =1 >> [0]PETSC ERROR: #1 DMGetField() at /home/jesus/SAND/PETSc_install/petsc/src/dm/interface/dm.c:4803 >> [0]PETSC ERROR: #2 DMCompleteBoundaryLabel_Internal() at /home/jesus/SAND/PETSc_install/petsc/src/dm/interface/dm.c:5140 >> [0]PETSC ERROR: #3 DMAddBoundary() at /home/jesus/SAND/PETSc_install/petsc/src/dm/interface/dm.c:8561 >> [0]PETSC ERROR: #4 main() at /home/jesus/SAND/FEA/3D/gmshBACKUP4.c:215 >> [0]PETSC ERROR: No PETSc Option Table entries >> [0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- >> >> >> >> >> >> >> ________________________________ >> From: Jed Brown > >> Sent: Wednesday, December 29, 2021 5:55 PM >> To: Ferrand, Jesus A. >; petsc-users at mcs.anl.gov > >> Subject: [EXTERNAL] Re: [petsc-users] DM misuse causes massive memory leak? >> >> CAUTION: This email originated outside of Embry-Riddle Aeronautical University. Do not click links or open attachments unless you recognize the sender and know the content is safe. >> >> >> "Ferrand, Jesus A." > writes: >> >>> Dear PETSc Team: >>> >>> I have a question about DM and PetscSection. Say I import a mesh (for FEM purposes) and create a DMPlex for it. I then use PetscSections to set degrees of freedom per "point" (by point I mean vertices, lines, faces, and cells). I then use PetscSectionGetStorageSize() to get the size of the global stiffness matrix (K) needed for my FEM problem. One last detail, this K I populate inside a rather large loop using an element stiffness matrix function of my own. Instead of using DMCreateMatrix(), I manually created a Mat using MatCreate(), MatSetType(), MatSetSizes(), and MatSetUp(). I come to find that said loop is painfully slow when I use the manually created matrix, but 20x faster when I use the Mat coming out of DMCreateMatrix(). >> >> The sparse matrix hasn't been preallocated, which forces the data structure to do a lot of copies (as bad as O(n^2) complexity). DMCreateMatrix() preallocates for you. >> >> https://petsc.org/release/docs/manual/performance/#memory-allocation-for-sparse-matrix-assembly >> https://petsc.org/release/docs/manual/mat/#sec-matsparse >> >>> My question is then: Is the manual Mat a noob mistake and is it somehow creating a memory leak with K? Just in case it's something else I'm attaching the code. The loop that populates K is between lines 221 and 278. Anything related to DM, DMPlex, and PetscSection is between lines 117 and 180. >>> >>> Machine Type: HP Laptop >>> C-compiler: Gnu C >>> OS: Ubuntu 20.04 >>> PETSc version: 3.16.0 >>> MPI Implementation: MPICH >>> >>> Hope you all had a Merry Christmas and that you will have a happy and productive New Year. :D >>> >>> >>> Sincerely: >>> >>> J.A. 
Ferrand >>> >>> Embry-Riddle Aeronautical University - Daytona Beach FL >>> >>> M.Sc. Aerospace Engineering | May 2022 >>> >>> B.Sc. Aerospace Engineering >>> >>> B.Sc. Computational Mathematics >>> >>> >>> >>> Sigma Gamma Tau >>> >>> Tau Beta Pi >>> >>> Honors Program >>> >>> >>> >>> Phone: (386)-843-1829 >>> >>> Email(s): ferranj2 at my.erau.edu >>> >>> jesus.ferrand at gmail.com >>> //REFERENCE: https://github.com/FreeFem/FreeFem-sources/blob/master/plugin/mpi/PETSc-code.hpp >>> #include >>> static char help[] = "Imports a Gmsh mesh with boundary conditions and solves the elasticity equation.\n" >>> "Option prefix = opt_.\n"; >>> >>> struct preKE{//Preallocation before computing KE >>> Mat matB, >>> matBTCB; >>> //matKE; >>> PetscInt x_insert[3], >>> y_insert[3], >>> z_insert[3], >>> m,//Looping variables. >>> sizeKE,//size of the element stiffness matrix. >>> N,//Number of nodes in element. >>> x_in,y_in,z_in; //LI to index B matrix. >>> PetscReal J[3][3],//Jacobian matrix. >>> invJ[3][3],//Inverse of the Jacobian matrix. >>> detJ,//Determinant of the Jacobian. >>> dX[3], >>> dY[3], >>> dZ[3], >>> minor00, >>> minor01, >>> minor02,//Determinants of minors in a 3x3 matrix. >>> dPsidX, dPsidY, dPsidZ,//Shape function derivatives w.r.t global coordinates. >>> weight,//Multiplier of quadrature weights. >>> *dPsidXi,//Derivatives of shape functions w.r.t. Xi. >>> *dPsidEta,//Derivatives of shape functions w.r.t. Eta. >>> *dPsidZeta;//Derivatives of shape functions w.r.t Zeta. >>> PetscErrorCode ierr; >>> };//end struct. >>> >>> //Function declarations. >>> extern PetscErrorCode tetra4(PetscScalar*, PetscScalar*, PetscScalar*,struct preKE*, Mat*, Mat*); >>> extern PetscErrorCode ConstitutiveMatrix(Mat*,const char*,PetscInt); >>> extern PetscErrorCode InitializeKEpreallocation(struct preKE*,const char*); >>> >>> PetscErrorCode PetscViewerVTKWriteFunction(PetscObject vec,PetscViewer viewer){ >>> PetscErrorCode ierr; >>> ierr = VecView((Vec)vec,viewer); CHKERRQ(ierr); >>> return ierr; >>> } >>> >>> >>> >>> >>> int main(int argc, char **args){ >>> //DEFINITIONS OF PETSC's DMPLEX LINGO: >>> //POINT: A topology element (cell, face, edge, or vertex). >>> //CHART: It an interval from 0 to the number of "points." (the range of admissible linear indices) >>> //STRATUM: A subset of the "chart" which corresponds to all "points" at a given "level." >>> //LEVEL: This is either a "depth" or a "height". >>> //HEIGHT: Dimensionality of an element measured from 0D to 3D. Heights: cell = 0, face = 1, edge = 2, vertex = 3. >>> //DEPTH: Dimensionality of an element measured from 3D to 0D. Depths: cell = 3, face = 2, edge = 1, vertex = 0; >>> //CLOSURE: *of an element is the collection of all other elements that define it.I.e., the closure of a surface is the collection of vertices and edges that make it up. >>> //STAR: >>> //STANDARD LABELS: These are default tags that DMPlex has for its topology. ("depth") >>> PetscErrorCode ierr;//Error tracking variable. >>> DM dm;//Distributed memory object (useful for managing grids.) >>> DMLabel physicalgroups;//Identifies user-specified tags in gmsh (to impose BC's). >>> DMPolytopeType celltype;//When looping through cells, determines its type (tetrahedron, pyramid, hexahedron, etc.) >>> PetscSection s; >>> KSP ksp;//Krylov Sub-Space (linear solver object) >>> Mat K,//Global stiffness matrix (Square, assume unsymmetric). >>> KE,//Element stiffness matrix (Square, assume unsymmetric). >>> matC;//Constitutive matrix. 
>>> Vec XYZ,//Coordinate vector, contains spatial locations of mesh's vertices (NOTE: This vector self-destroys!). >>> U,//Displacement vector. >>> F;//Load Vector. >>> PetscViewer XYZviewer,//Viewer object to output mesh to ASCII format. >>> XYZpUviewer; //Viewer object to output displacements to ASCII format. >>> PetscBool interpolate = PETSC_TRUE,//Instructs Gmsh importer whether to generate faces and edges (Needed when using P2 or higher elements). >>> useCone = PETSC_TRUE,//Instructs "DMPlexGetTransitiveClosure()" whether to extract the closure or the star. >>> dirichletBC = PETSC_FALSE,//For use when assembling the K matrix. >>> neumannBC = PETSC_FALSE,//For use when assembling the F vector. >>> saveASCII = PETSC_FALSE,//Whether to save results in ASCII format. >>> saveVTK = PETSC_FALSE;//Whether to save results as VTK format. >>> PetscInt nc,//number of cells. (PETSc lingo for "elements") >>> nv,//number of vertices. (PETSc lingo for "nodes") >>> nf,//number of faces. (PETSc lingo for "surfaces") >>> ne,//number of edges. (PETSc lingo for "lines") >>> pStart,//starting LI of global elements. >>> pEnd,//ending LI of all elements. >>> cStart,//starting LI for cells global arrangement. >>> cEnd,//ending LI for cells in global arrangement. >>> vStart,//starting LI for vertices in global arrangement. >>> vEnd,//ending LI for vertices in global arrangement. >>> fStart,//starting LI for faces in global arrangement. >>> fEnd,//ending LI for faces in global arrangement. >>> eStart,//starting LI for edges in global arrangement. >>> eEnd,//ending LI for edges in global arrangement. >>> sizeK,//Size of the element stiffness matrix. >>> ii,jj,kk,//Dedicated looping variables. >>> indexXYZ,//Variable to access the elements of XYZ vector. >>> indexK,//Variable to access the elements of the U and F vectors (can reference rows and colums of K matrix.) >>> *closure = PETSC_NULL,//Pointer to the closure elements of a cell. >>> size_closure,//Size of the closure of a cell. >>> dim,//Dimension of the mesh. >>> //*edof,//Linear indices of dof's inside the K matrix. >>> dof = 3,//Degrees of freedom per node. >>> cells=0, edges=0, vertices=0, faces=0,//Topology counters when looping through cells. >>> pinXcode=10, pinZcode=11,forceZcode=12;//Gmsh codes to extract relevant "Face Sets." >>> PetscReal //*x_el,//Pointer to a vector that will store the x-coordinates of an element's vertices. >>> //*y_el,//Pointer to a vector that will store the y-coordinates of an element's vertices. >>> //*z_el,//Pointer to a vector that will store the z-coordinates of an element's vertices. >>> *xyz_el,//Pointer to xyz array in the XYZ vector. >>> traction = -10, >>> *KEdata, >>> t1,t2; //time keepers. >>> const char *gmshfile = "TopOptmeshfine2.msh";//Name of gmsh file to import. >>> >>> ierr = PetscInitialize(&argc,&args,NULL,help); if(ierr) return ierr; //And the machine shall work.... >>> >>> //MESH IMPORT================================================================= >>> //IMPORTANT NOTE: Gmsh only creates "cells" and "vertices," it does not create the "faces" or "edges." >>> //Gmsh probably can generate them, must figure out how to. >>> t1 = MPI_Wtime(); >>> ierr = DMPlexCreateGmshFromFile(PETSC_COMM_WORLD,gmshfile,interpolate,&dm); CHKERRQ(ierr);//Read Gmsh file and generate the DMPlex. >>> ierr = DMGetDimension(dm, &dim); CHKERRQ(ierr);//1-D, 2-D, or 3-D >>> ierr = DMPlexGetChart(dm, &pStart, &pEnd); CHKERRQ(ierr);//Extracts linear indices of cells, vertices, faces, and edges. 
>>> ierr = DMGetCoordinatesLocal(dm,&XYZ); CHKERRQ(ierr);//Extracts coordinates from mesh.(Contiguous storage: [x0,y0,z0,x1,y1,z1,...]) >>> ierr = VecGetArray(XYZ,&xyz_el); CHKERRQ(ierr);//Get pointer to vector's data. >>> t2 = MPI_Wtime(); >>> PetscPrintf(PETSC_COMM_WORLD,"Mesh Import time: %10f\n",t2-t1); >>> DMView(dm,PETSC_VIEWER_STDOUT_WORLD); >>> >>> //IMPORTANT NOTE: PETSc assumes that vertex-cell meshes are 2D even if they were 3D, so its ordering changes. >>> //Cells remain at height 0, but vertices move to height 1 from height 3. To prevent this from becoming an issue >>> //the "interpolate" variable is set to PETSC_TRUE to tell the mesh importer to generate faces and edges. >>> //PETSc, therefore, technically does additional meshing. Gotta figure out how to get this from Gmsh directly. >>> ierr = DMPlexGetDepthStratum(dm,3, &cStart, &cEnd);//Get LI of cells. >>> ierr = DMPlexGetDepthStratum(dm,2, &fStart, &fEnd);//Get LI of faces >>> ierr = DMPlexGetDepthStratum(dm,1, &eStart, &eEnd);//Get LI of edges. >>> ierr = DMPlexGetDepthStratum(dm,0, &vStart, &vEnd);//Get LI of vertices. >>> ierr = DMGetStratumSize(dm,"depth", 3, &nc);//Get number of cells. >>> ierr = DMGetStratumSize(dm,"depth", 2, &nf);//Get number of faces. >>> ierr = DMGetStratumSize(dm,"depth", 1, &ne);//Get number of edges. >>> ierr = DMGetStratumSize(dm,"depth", 0, &nv);//Get number of vertices. >>> /* >>> PetscPrintf(PETSC_COMM_WORLD,"global start = %10d\t global end = %10d\n",pStart,pEnd); >>> PetscPrintf(PETSC_COMM_WORLD,"#cells = %10d\t i = %10d\t i < %10d\n",nc,cStart,cEnd); >>> PetscPrintf(PETSC_COMM_WORLD,"#faces = %10d\t i = %10d\t i < %10d\n",nf,fStart,fEnd); >>> PetscPrintf(PETSC_COMM_WORLD,"#edges = %10d\t i = %10d\t i < %10d\n",ne,eStart,eEnd); >>> PetscPrintf(PETSC_COMM_WORLD,"#vertices = %10d\t i = %10d\t i < %10d\n",nv,vStart,vEnd); >>> */ >>> //MESH IMPORT================================================================= >>> >>> //NOTE: This section extremely hardcoded right now. >>> //Current setup would only support P1 meshes. >>> //MEMORY ALLOCATION ========================================================== >>> ierr = PetscSectionCreate(PETSC_COMM_WORLD, &s); CHKERRQ(ierr); >>> //The chart is akin to a contiguous memory storage allocation. Each chart entry is associated >>> //with a "thing," could be a vertex, face, cell, or edge, or anything else. >>> ierr = PetscSectionSetChart(s, pStart, pEnd); CHKERRQ(ierr); >>> //For each "thing" in the chart, additional room can be made. This is helpful for associating >>> //nodes to multiple degrees of freedom. These commands help associate nodes with >>> for(ii = cStart; ii < cEnd; ii++){//Cell loop. >>> ierr = PetscSectionSetDof(s, ii, 0);CHKERRQ(ierr);}//NOTE: Currently no dof's associated with cells. >>> for(ii = fStart; ii < fEnd; ii++){//Face loop. >>> ierr = PetscSectionSetDof(s, ii, 0);CHKERRQ(ierr);}//NOTE: Currently no dof's associated with faces. >>> for(ii = vStart; ii < vEnd; ii++){//Vertex loop. >>> ierr = PetscSectionSetDof(s, ii, dof);CHKERRQ(ierr);}//Sets x, y, and z displacements as dofs. >>> for(ii = eStart; ii < eEnd; ii++){//Edge loop >>> ierr = PetscSectionSetDof(s, ii, 0);CHKERRQ(ierr);}//NOTE: Currently no dof's associated with edges. >>> ierr = PetscSectionSetUp(s); CHKERRQ(ierr); >>> ierr = PetscSectionGetStorageSize(s,&sizeK);CHKERRQ(ierr);//Determine the size of the global stiffness matrix. >>> ierr = DMSetLocalSection(dm,s); CHKERRQ(ierr);//Associate the PetscSection with the DM object. 
>>> //PetscErrorCode DMCreateGlobalVector(DM dm,Vec *vec) >>> //ierr = DMCreateGlobalVector(dm,&U); CHKERRQ(ierr); >>> PetscSectionDestroy(&s); >>> //PetscPrintf(PETSC_COMM_WORLD,"sizeK = %10d\n",sizeK); >>> >>> //OBJECT SETUP================================================================ >>> //Global stiffness matrix. >>> //PetscErrorCode DMCreateMatrix(DM dm,Mat *mat) >>> >>> //This makes the loop fast. >>> ierr = DMCreateMatrix(dm,&K); >>> >>> //This makes the loop uber slow. >>> //ierr = MatCreate(PETSC_COMM_WORLD,&K); CHKERRQ(ierr); >>> //ierr = MatSetType(K,MATAIJ); CHKERRQ(ierr);// Global stiffness matrix set to some sparse type. >>> //ierr = MatSetSizes(K,PETSC_DECIDE,PETSC_DECIDE,sizeK,sizeK); CHKERRQ(ierr); >>> //ierr = MatSetUp(K); CHKERRQ(ierr); >>> >>> //Displacement vector. >>> ierr = VecCreate(PETSC_COMM_WORLD,&U); CHKERRQ(ierr); >>> ierr = VecSetType(U,VECSTANDARD); CHKERRQ(ierr);// Global stiffness matrix set to some sparse type. >>> ierr = VecSetSizes(U,PETSC_DECIDE,sizeK); CHKERRQ(ierr); >>> >>> //Load vector. >>> ierr = VecCreate(PETSC_COMM_WORLD,&F); CHKERRQ(ierr); >>> ierr = VecSetType(F,VECSTANDARD); CHKERRQ(ierr);// Global stiffness matrix set to some sparse type. >>> ierr = VecSetSizes(F,PETSC_DECIDE,sizeK); CHKERRQ(ierr); >>> //OBJECT SETUP================================================================ >>> >>> //WARNING: This loop is currently hardcoded for P1 elements only! Must Figure >>> //out a clever way to modify to accomodate Pn (n>1) elements. >>> >>> //BEGIN GLOBAL STIFFNESS MATRIX BUILDER======================================= >>> t1 = MPI_Wtime(); >>> >>> //PREALLOCATIONS============================================================== >>> ierr = ConstitutiveMatrix(&matC,"isotropic",0); CHKERRQ(ierr); >>> struct preKE preKEtetra4; >>> ierr = InitializeKEpreallocation(&preKEtetra4,"tetra4"); CHKERRQ(ierr); >>> ierr = MatCreate(PETSC_COMM_WORLD,&KE); CHKERRQ(ierr); //SEQUENTIAL >>> ierr = MatSetSizes(KE,PETSC_DECIDE,PETSC_DECIDE,12,12); CHKERRQ(ierr); //SEQUENTIAL >>> ierr = MatSetType(KE,MATDENSE); CHKERRQ(ierr); //SEQUENTIAL >>> ierr = MatSetUp(KE); CHKERRQ(ierr); >>> PetscReal x_tetra4[4], y_tetra4[4],z_tetra4[4], >>> x_hex8[8], y_hex8[8],z_hex8[8], >>> *x,*y,*z; >>> PetscInt *EDOF,edof_tetra4[12],edof_hex8[24]; >>> DMPolytopeType previous = DM_POLYTOPE_UNKNOWN; >>> //PREALLOCATIONS============================================================== >>> >>> >>> >>> for(ii=cStart;ii>> ierr = DMPlexGetTransitiveClosure(dm, ii, useCone, &size_closure, &closure); CHKERRQ(ierr); >>> ierr = DMPlexGetCellType(dm, ii, &celltype); CHKERRQ(ierr); >>> //IMPORTANT NOTE: MOST OF THIS LOOP SHOULD BE INCLUDED IN THE KE3D function. >>> if(previous != celltype){ >>> //PetscPrintf(PETSC_COMM_WORLD,"run \n"); >>> if(celltype == DM_POLYTOPE_TETRAHEDRON){ >>> x = x_tetra4; >>> y = y_tetra4; >>> z = z_tetra4; >>> EDOF = edof_tetra4; >>> }//end if. >>> else if(celltype == DM_POLYTOPE_HEXAHEDRON){ >>> x = x_hex8; >>> y = y_hex8; >>> z = z_hex8; >>> EDOF = edof_hex8; >>> }//end else if. >>> } >>> previous = celltype; >>> >>> //PetscPrintf(PETSC_COMM_WORLD,"Cell # %4i\t",ii); >>> cells=0; >>> edges=0; >>> vertices=0; >>> faces=0; >>> kk = 0; >>> for(jj=0;jj<(2*size_closure);jj+=2){//Scan the closure of the current cell. >>> //Use information from the DM's strata to determine composition of cell_ii. >>> if(vStart <= closure[jj] && closure[jj]< vEnd){//Check for vertices. 
>>> //PetscPrintf(PETSC_COMM_WORLD,"%5i\t",closure[jj]); >>> indexXYZ = dim*(closure[jj]-vStart);//Linear index of x-coordinate in the xyz_el array. >>> >>> *(x+vertices) = xyz_el[indexXYZ]; >>> *(y+vertices) = xyz_el[indexXYZ+1];//Extract Y-coordinates of the current vertex. >>> *(z+vertices) = xyz_el[indexXYZ+2];//Extract Y-coordinates of the current vertex. >>> *(EDOF + kk) = indexXYZ; >>> *(EDOF + kk+1) = indexXYZ+1; >>> *(EDOF + kk+2) = indexXYZ+2; >>> kk+=3; >>> vertices++;//Update vertex counter. >>> }//end if >>> else if(eStart <= closure[jj] && closure[jj]< eEnd){//Check for edge ID's >>> edges++; >>> }//end else ifindexK >>> else if(fStart <= closure[jj] && closure[jj]< fEnd){//Check for face ID's >>> faces++; >>> }//end else if >>> else if(cStart <= closure[jj] && closure[jj]< cEnd){//Check for cell ID's >>> cells++; >>> }//end else if >>> }//end "jj" loop. >>> ierr = tetra4(x,y,z,&preKEtetra4,&matC,&KE); CHKERRQ(ierr); //Generate the element stiffness matrix for this cell. >>> ierr = MatDenseGetArray(KE,&KEdata); CHKERRQ(ierr); >>> ierr = MatSetValues(K,12,EDOF,12,EDOF,KEdata,ADD_VALUES); CHKERRQ(ierr);//WARNING: HARDCODED FOR TETRAHEDRAL P1 ELEMENTS ONLY !!!!!!!!!!!!!!!!!!!!!!! >>> ierr = MatDenseRestoreArray(KE,&KEdata); CHKERRQ(ierr); >>> ierr = DMPlexRestoreTransitiveClosure(dm, ii,useCone, &size_closure, &closure); CHKERRQ(ierr); >>> }//end "ii" loop. >>> ierr = MatAssemblyBegin(K,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >>> ierr = MatAssemblyEnd(K,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >>> //ierr = MatView(K,PETSC_VIEWER_DRAW_WORLD); CHKERRQ(ierr); >>> //END GLOBAL STIFFNESS MATRIX BUILDER=========================================== >>> t2 = MPI_Wtime(); >>> PetscPrintf(PETSC_COMM_WORLD,"K build time: %10f\n",t2-t1); >>> >>> >>> >>> >>> >>> >>> >>> >>> t1 = MPI_Wtime(); >>> //BEGIN BOUNDARY CONDITION ENFORCEMENT========================================== >>> IS TrianglesIS, physicalsurfaceID;//, VerticesIS; >>> PetscInt numsurfvals, >>> //numRows, >>> dof_offset,numTri; >>> const PetscInt *surfvals, >>> //*pinZID, >>> *TriangleID; >>> PetscScalar diag =1; >>> PetscReal area,force; >>> //NOTE: Petsc can read/assign labels. Eeach label may posses multiple "values." >>> //These values act as tags within a tag. >>> //IMPORTANT NOTE: The below line needs a safety. If a mesh that does not feature >>> //face sets is imported, the code in its current state will crash!!!. This is currently >>> //hardcoded for the test mesh. >>> ierr = DMGetLabel(dm, "Face Sets", &physicalgroups); CHKERRQ(ierr);//Inspects Physical surface groups defined by gmsh (if any). >>> ierr = DMLabelGetValueIS(physicalgroups, &physicalsurfaceID); CHKERRQ(ierr);//Gets the physical surface ID's defined in gmsh (as specified in the .geo file). >>> ierr = ISGetIndices(physicalsurfaceID,&surfvals); CHKERRQ(ierr);//Get a pointer to the actual surface values. >>> ierr = DMLabelGetNumValues(physicalgroups, &numsurfvals); CHKERRQ(ierr);//Gets the number of different values that the label assigns. >>> for(ii=0;ii>> //PetscPrintf(PETSC_COMM_WORLD,"Values = %5i\n",surfvals[ii]); >>> //PROBLEM: The surface values are hardcoded in the gmsh file. We need to adopt standard "codes" >>> //that we can give to users when they make their meshes so that this code recognizes the Type >>> // of boundary conditions that are to be imposed. >>> if(surfvals[ii] == pinXcode){ >>> dof_offset = 0; >>> dirichletBC = PETSC_TRUE; >>> }//end if. >>> else if(surfvals[ii] == pinZcode){ >>> dof_offset = 2; >>> dirichletBC = PETSC_TRUE; >>> }//end else if. 
>>> else if(surfvals[ii] == forceZcode){ >>> dof_offset = 2; >>> neumannBC = PETSC_TRUE; >>> }//end else if. >>> >>> ierr = DMLabelGetStratumIS(physicalgroups, surfvals[ii], &TrianglesIS); CHKERRQ(ierr);//Get the ID's (as an IS) of the surfaces belonging to value 11. >>> //PROBLEM: DMPlexGetConeRecursiveVertices returns an array with repeated node ID's. For each repetition, the lines that enforce BC's unnecessarily re-run. >>> ierr = ISGetSize(TrianglesIS,&numTri); CHKERRQ(ierr); >>> ierr = ISGetIndices(TrianglesIS,&TriangleID); CHKERRQ(ierr);//Get a pointer to the actual surface values. >>> for(kk=0;kk>> ierr = DMPlexGetTransitiveClosure(dm, TriangleID[kk], useCone, &size_closure, &closure); CHKERRQ(ierr); >>> if(neumannBC){ >>> ierr = DMPlexComputeCellGeometryFVM(dm, TriangleID[kk], &area,PETSC_NULL,PETSC_NULL); CHKERRQ(ierr); >>> force = traction*area/3;//WARNING: The 3 here is hardcoded for a purely tetrahedral mesh only!!!!!!!!!! >>> } >>> for(jj=0;jj<(2*size_closure);jj+=2){ >>> //PetscErrorCode DMPlexComputeCellGeometryFVM(DM dm, PetscInt cell, PetscReal *vol, PetscReal centroid[], PetscReal normal[]) >>> if(vStart <= closure[jj] && closure[jj]< vEnd){//Check for vertices. >>> indexK = dof*(closure[jj] - vStart) + dof_offset; //Compute the dof ID's in the K matrix. >>> if(dirichletBC){//Boundary conditions requiring an edit of K matrix. >>> ierr = MatZeroRows(K,1,&indexK,diag,NULL,NULL); CHKERRQ(ierr); >>> }//end if. >>> else if(neumannBC){//Boundary conditions requiring an edit of RHS vector. >>> ierr = VecSetValue(F,indexK,force,ADD_VALUES); CHKERRQ(ierr); >>> }// end else if. >>> }//end if. >>> }//end "jj" loop. >>> ierr = DMPlexRestoreTransitiveClosure(dm, closure[jj],useCone, &size_closure, &closure); CHKERRQ(ierr); >>> }//end "kk" loop. >>> ierr = ISRestoreIndices(TrianglesIS,&TriangleID); CHKERRQ(ierr); >>> >>> /* >>> ierr = DMPlexGetConeRecursiveVertices(dm, TrianglesIS, &VerticesIS); CHKERRQ(ierr);//Get the ID's (as an IS) of the vertices that make up the surfaces of value 11. >>> ierr = ISGetSize(VerticesIS,&numRows); CHKERRQ(ierr);//Get number of flagged vertices (this includes repeated indices for faces that share nodes). >>> ierr = ISGetIndices(VerticesIS,&pinZID); CHKERRQ(ierr);//Get a pointer to the actual surface values. >>> if(dirichletBC){//Boundary conditions requiring an edit of K matrix. >>> for(kk=0;kk>> indexK = 3*(pinZID[kk] - vStart) + dof_offset; //Compute the dof ID's in the K matrix. (NOTE: the 3* ishardcoded for 3 degrees of freedom, tie this to a variable in the FUTURE.) >>> ierr = MatZeroRows(K,1,&indexK,diag,NULL,NULL); CHKERRQ(ierr); >>> }//end "kk" loop. >>> }//end if. >>> else if(neumannBC){//Boundary conditions requiring an edit of RHS vector. >>> for(kk=0;kk>> indexK = 3*(pinZID[kk] - vStart) + dof_offset; >>> ierr = VecSetValue(F,indexK,traction,INSERT_VALUES); CHKERRQ(ierr); >>> }//end "kk" loop. >>> }// end else if. >>> ierr = ISRestoreIndices(VerticesIS,&pinZID); CHKERRQ(ierr); >>> */ >>> dirichletBC = PETSC_FALSE; >>> neumannBC = PETSC_FALSE; >>> }//end "ii" loop. 
>>> ierr = ISRestoreIndices(physicalsurfaceID,&surfvals); CHKERRQ(ierr); >>> //ierr = ISRestoreIndices(VerticesIS,&pinZID); CHKERRQ(ierr); >>> ierr = ISDestroy(&physicalsurfaceID); CHKERRQ(ierr); >>> //ierr = ISDestroy(&VerticesIS); CHKERRQ(ierr); >>> ierr = ISDestroy(&TrianglesIS); CHKERRQ(ierr); >>> //END BOUNDARY CONDITION ENFORCEMENT============================================ >>> t2 = MPI_Wtime(); >>> PetscPrintf(PETSC_COMM_WORLD,"BC imposition time: %10f\n",t2-t1); >>> >>> /* >>> PetscInt kk = 0; >>> for(ii=vStart;ii>> kk++; >>> PetscPrintf(PETSC_COMM_WORLD,"Vertex #%4i\t x = %10.9f\ty = %10.9f\tz = %10.9f\n",ii,xyz_el[3*kk],xyz_el[3*kk+1],xyz_el[3*kk+2]); >>> }// end "ii" loop. >>> */ >>> >>> t1 = MPI_Wtime(); >>> //SOLVER======================================================================== >>> ierr = KSPCreate(PETSC_COMM_WORLD,&ksp); CHKERRQ(ierr); >>> ierr = KSPSetOperators(ksp,K,K); CHKERRQ(ierr); >>> ierr = KSPSetFromOptions(ksp); CHKERRQ(ierr); >>> ierr = KSPSolve(ksp,F,U); CHKERRQ(ierr); >>> t2 = MPI_Wtime(); >>> //ierr = KSPView(ksp,PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr); >>> //SOLVER======================================================================== >>> t2 = MPI_Wtime(); >>> PetscPrintf(PETSC_COMM_WORLD,"Solver time: %10f\n",t2-t1); >>> ierr = VecRestoreArray(XYZ,&xyz_el); CHKERRQ(ierr);//Get pointer to vector's data. >>> >>> //BEGIN MAX/MIN DISPLACEMENTS=================================================== >>> IS ISux,ISuy,ISuz; >>> Vec UX,UY,UZ; >>> PetscReal UXmax,UYmax,UZmax,UXmin,UYmin,UZmin; >>> ierr = ISCreateStride(PETSC_COMM_WORLD,nv,0,3,&ISux); CHKERRQ(ierr); >>> ierr = ISCreateStride(PETSC_COMM_WORLD,nv,1,3,&ISuy); CHKERRQ(ierr); >>> ierr = ISCreateStride(PETSC_COMM_WORLD,nv,2,3,&ISuz); CHKERRQ(ierr); >>> >>> //PetscErrorCode VecGetSubVector(Vec X,IS is,Vec *Y) >>> ierr = VecGetSubVector(U,ISux,&UX); CHKERRQ(ierr); >>> ierr = VecGetSubVector(U,ISuy,&UY); CHKERRQ(ierr); >>> ierr = VecGetSubVector(U,ISuz,&UZ); CHKERRQ(ierr); >>> >>> //PetscErrorCode VecMax(Vec x,PetscInt *p,PetscReal *val) >>> ierr = VecMax(UX,PETSC_NULL,&UXmax); CHKERRQ(ierr); >>> ierr = VecMax(UY,PETSC_NULL,&UYmax); CHKERRQ(ierr); >>> ierr = VecMax(UZ,PETSC_NULL,&UZmax); CHKERRQ(ierr); >>> >>> ierr = VecMin(UX,PETSC_NULL,&UXmin); CHKERRQ(ierr); >>> ierr = VecMin(UY,PETSC_NULL,&UYmin); CHKERRQ(ierr); >>> ierr = VecMin(UZ,PETSC_NULL,&UZmin); CHKERRQ(ierr); >>> >>> PetscPrintf(PETSC_COMM_WORLD,"%10f\t <= ux <= %10f\n",UXmin,UXmax); >>> PetscPrintf(PETSC_COMM_WORLD,"%10f\t <= uy <= %10f\n",UYmin,UYmax); >>> PetscPrintf(PETSC_COMM_WORLD,"%10f\t <= uz <= %10f\n",UZmin,UZmax); >>> >>> >>> >>> >>> //BEGIN OUTPUT SOLUTION========================================================= >>> if(saveASCII){ >>> ierr = PetscViewerASCIIOpen(PETSC_COMM_WORLD,"XYZ.txt",&XYZviewer); >>> ierr = VecView(XYZ,XYZviewer); CHKERRQ(ierr); >>> ierr = PetscViewerASCIIOpen(PETSC_COMM_WORLD,"U.txt",&XYZpUviewer); >>> ierr = VecView(U,XYZpUviewer); CHKERRQ(ierr); >>> PetscViewerDestroy(&XYZviewer); PetscViewerDestroy(&XYZpUviewer); >>> >>> }//end if. 
>>> if(saveVTK){ >>> const char *meshfile = "starting_mesh.vtk", >>> *deformedfile = "deformed_mesh.vtk"; >>> ierr = PetscViewerVTKOpen(PETSC_COMM_WORLD,meshfile,FILE_MODE_WRITE,&XYZviewer); CHKERRQ(ierr); >>> //PetscErrorCode DMSetAuxiliaryVec(DM dm, DMLabel label, PetscInt value, Vec aux) >>> DMLabel UXlabel,UYlabel, UZlabel; >>> //PetscErrorCode DMLabelCreate(MPI_Comm comm, const char name[], DMLabel *label) >>> ierr = DMLabelCreate(PETSC_COMM_WORLD, "X-Displacement", &UXlabel); CHKERRQ(ierr); >>> ierr = DMLabelCreate(PETSC_COMM_WORLD, "Y-Displacement", &UYlabel); CHKERRQ(ierr); >>> ierr = DMLabelCreate(PETSC_COMM_WORLD, "Z-Displacement", &UZlabel); CHKERRQ(ierr); >>> ierr = DMSetAuxiliaryVec(dm,UXlabel, 1, UX); CHKERRQ(ierr); >>> ierr = DMSetAuxiliaryVec(dm,UYlabel, 1, UY); CHKERRQ(ierr); >>> ierr = DMSetAuxiliaryVec(dm,UZlabel, 1, UZ); CHKERRQ(ierr); >>> //PetscErrorCode PetscViewerVTKAddField(PetscViewer viewer,PetscObject dm,PetscErrorCode (*PetscViewerVTKWriteFunction)(PetscObject,PetscViewer),PetscInt fieldnum,PetscViewerVTKFieldType fieldtype,PetscBool checkdm,PetscObject vec) >>> >>> >>> >>> //ierr = PetscViewerVTKAddField(XYZviewer, dm,PetscErrorCode (*PetscViewerVTKWriteFunction)(Vec,PetscViewer),PETSC_DEFAULT,PETSC_VTK_POINT_FIELD,PETSC_FALSE,UX); >>> ierr = PetscViewerVTKAddField(XYZviewer, (PetscObject)dm,&PetscViewerVTKWriteFunction,PETSC_DEFAULT,PETSC_VTK_POINT_FIELD,PETSC_FALSE,(PetscObject)UX); >>> >>> >>> ierr = DMPlexVTKWriteAll((PetscObject)dm, XYZviewer); CHKERRQ(ierr); >>> ierr = VecAXPY(XYZ,1,U); CHKERRQ(ierr);//Add displacement field to the mesh coordinates to deform. >>> ierr = PetscViewerVTKOpen(PETSC_COMM_WORLD,deformedfile,FILE_MODE_WRITE,&XYZpUviewer); CHKERRQ(ierr); >>> ierr = DMPlexVTKWriteAll((PetscObject)dm, XYZpUviewer); CHKERRQ(ierr);// >>> PetscViewerDestroy(&XYZviewer); PetscViewerDestroy(&XYZpUviewer); >>> >>> }//end else if. >>> else{ >>> ierr = PetscPrintf(PETSC_COMM_WORLD,"No output format specified! Files not saved.\n"); CHKERRQ(ierr); >>> }//end else. >>> >>> >>> //END OUTPUT SOLUTION=========================================================== >>> VecDestroy(&UX); ISDestroy(&ISux); >>> VecDestroy(&UY); ISDestroy(&ISuy); >>> VecDestroy(&UZ); ISDestroy(&ISuz); >>> //END MAX/MIN DISPLACEMENTS===================================================== >>> >>> //CLEANUP===================================================================== >>> DMDestroy(&dm); >>> KSPDestroy(&ksp); >>> MatDestroy(&K); MatDestroy(&KE); MatDestroy(&matC); //MatDestroy(preKEtetra4.matB); MatDestroy(preKEtetra4.matBTCB); >>> VecDestroy(&U); VecDestroy(&F); >>> >>> //DMLabelDestroy(&physicalgroups);//Destroyig the DM destroys the label. >>> //CLEANUP===================================================================== >>> //PetscErrorCode PetscMallocDump(FILE *fp) >>> //ierr = PetscMallocDump(NULL); >>> return PetscFinalize();//And the machine shall rest.... >>> }//end main. >>> >>> PetscErrorCode tetra4(PetscScalar* X,PetscScalar* Y, PetscScalar* Z,struct preKE *P, Mat* matC, Mat* KE){ >>> //INPUTS: >>> //X: Global X coordinates of the elemental nodes. >>> //Y: Global Y coordinates of the elemental nodes. >>> //Z: Global Z coordinates of the elemental nodes. >>> //J: Jacobian matrix. >>> //invJ: Inverse Jacobian matrix. 
>>> PetscErrorCode ierr; >>> //For current quadrature point, get dPsi/dXi_i Xi_i = {Xi,Eta,Zeta} >>> /* >>> P->dPsidXi[0] = +1.; P->dPsidEta[0] = 0.0; P->dPsidZeta[0] = 0.0; >>> P->dPsidXi[1] = 0.0; P->dPsidEta[1] = +1.; P->dPsidZeta[1] = 0.0; >>> P->dPsidXi[2] = 0.0; P->dPsidEta[2] = 0.0; P->dPsidZeta[2] = +1.; >>> P->dPsidXi[3] = -1.; P->dPsidEta[3] = -1.; P->dPsidZeta[3] = -1.; >>> */ >>> //Populate the Jacobian matrix. >>> P->J[0][0] = X[0] - X[3]; >>> P->J[0][1] = Y[0] - Y[3]; >>> P->J[0][2] = Z[0] - Z[3]; >>> P->J[1][0] = X[1] - X[3]; >>> P->J[1][1] = Y[1] - Y[3]; >>> P->J[1][2] = Z[1] - Z[3]; >>> P->J[2][0] = X[2] - X[3]; >>> P->J[2][1] = Y[2] - Y[3]; >>> P->J[2][2] = Z[2] - Z[3]; >>> >>> //Determinant of the 3x3 Jacobian. (Expansion along 1st row). >>> P->minor00 = P->J[1][1]*P->J[2][2] - P->J[2][1]*P->J[1][2];//Reuse when finding InvJ. >>> P->minor01 = P->J[1][0]*P->J[2][2] - P->J[2][0]*P->J[1][2];//Reuse when finding InvJ. >>> P->minor02 = P->J[1][0]*P->J[2][1] - P->J[2][0]*P->J[1][1];//Reuse when finding InvJ. >>> P->detJ = P->J[0][0]*P->minor00 - P->J[0][1]*P->minor01 + P->J[0][2]*P->minor02; >>> //Inverse of the 3x3 Jacobian >>> P->invJ[0][0] = +P->minor00/P->detJ;//Reuse precomputed minor. >>> P->invJ[0][1] = -(P->J[0][1]*P->J[2][2] - P->J[0][2]*P->J[2][1])/P->detJ; >>> P->invJ[0][2] = +(P->J[0][1]*P->J[1][2] - P->J[1][1]*P->J[0][2])/P->detJ; >>> P->invJ[1][0] = -P->minor01/P->detJ;//Reuse precomputed minor. >>> P->invJ[1][1] = +(P->J[0][0]*P->J[2][2] - P->J[0][2]*P->J[2][0])/P->detJ; >>> P->invJ[1][2] = -(P->J[0][0]*P->J[1][2] - P->J[1][0]*P->J[0][2])/P->detJ; >>> P->invJ[2][0] = +P->minor02/P->detJ;//Reuse precomputed minor. >>> P->invJ[2][1] = -(P->J[0][0]*P->J[2][1] - P->J[0][1]*P->J[2][0])/P->detJ; >>> P->invJ[2][2] = +(P->J[0][0]*P->J[1][1] - P->J[0][1]*P->J[1][0])/P->detJ; >>> >>> //*****************STRAIN MATRIX (B)************************************** >>> for(P->m=0;P->mN;P->m++){//Scan all shape functions. >>> >>> P->x_in = 0 + P->m*3;//Every 3rd column starting at 0 >>> P->y_in = P->x_in +1;//Every 3rd column starting at 1 >>> P->z_in = P->y_in +1;//Every 3rd column starting at 2 >>> >>> P->dX[0] = P->invJ[0][0]*P->dPsidXi[P->m] + P->invJ[0][1]*P->dPsidEta[P->m] + P->invJ[0][2]*P->dPsidZeta[P->m]; >>> P->dY[0] = P->invJ[1][0]*P->dPsidXi[P->m] + P->invJ[1][1]*P->dPsidEta[P->m] + P->invJ[1][2]*P->dPsidZeta[P->m]; >>> P->dZ[0] = P->invJ[2][0]*P->dPsidXi[P->m] + P->invJ[2][1]*P->dPsidEta[P->m] + P->invJ[2][2]*P->dPsidZeta[P->m]; >>> >>> P->dX[1] = P->dZ[0]; P->dX[2] = P->dY[0]; >>> P->dY[1] = P->dZ[0]; P->dY[2] = P->dX[0]; >>> P->dZ[1] = P->dX[0]; P->dZ[2] = P->dY[0]; >>> >>> ierr = MatSetValues(P->matB,3,P->x_insert,1,&(P->x_in),P->dX,INSERT_VALUES); CHKERRQ(ierr); >>> ierr = MatSetValues(P->matB,3,P->y_insert,1,&(P->y_in),P->dY,INSERT_VALUES); CHKERRQ(ierr); >>> ierr = MatSetValues(P->matB,3,P->z_insert,1,&(P->z_in),P->dZ,INSERT_VALUES); CHKERRQ(ierr); >>> >>> }//end "m" loop. >>> ierr = MatAssemblyBegin(P->matB,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >>> ierr = MatAssemblyEnd(P->matB,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >>> //*****************STRAIN MATRIX (B)************************************** >>> >>> //Compute the matrix product B^t*C*B, scale it by the quadrature weights and add to KE. 
>>> P->weight = -P->detJ/6; >>> >>> ierr = MatZeroEntries(*KE); CHKERRQ(ierr); >>> ierr = MatPtAP(*matC,P->matB,MAT_INITIAL_MATRIX,PETSC_DEFAULT,&(P->matBTCB));CHKERRQ(ierr); >>> ierr = MatScale(P->matBTCB,P->weight); CHKERRQ(ierr); >>> ierr = MatAssemblyBegin(P->matBTCB,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >>> ierr = MatAssemblyEnd(P->matBTCB,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >>> ierr = MatAXPY(*KE,1,P->matBTCB,DIFFERENT_NONZERO_PATTERN); CHKERRQ(ierr);//Add contribution of current quadrature point to KE. >>> >>> //ierr = MatPtAP(*matC,P->matB,MAT_INITIAL_MATRIX,PETSC_DEFAULT,KE);CHKERRQ(ierr); >>> //ierr = MatScale(*KE,P->weight); CHKERRQ(ierr); >>> >>> ierr = MatAssemblyBegin(*KE,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >>> ierr = MatAssemblyEnd(*KE,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >>> >>> //Cleanup >>> return ierr; >>> }//end tetra4. >>> >>> PetscErrorCode ConstitutiveMatrix(Mat *matC,const char* type,PetscInt materialID){ >>> PetscErrorCode ierr; >>> PetscBool isotropic = PETSC_FALSE, >>> orthotropic = PETSC_FALSE; >>> //PetscErrorCode PetscStrcmp(const char a[],const char b[],PetscBool *flg) >>> ierr = PetscStrcmp(type,"isotropic",&isotropic); >>> ierr = PetscStrcmp(type,"orthotropic",&orthotropic); >>> ierr = MatCreate(PETSC_COMM_WORLD,matC); CHKERRQ(ierr); >>> ierr = MatSetSizes(*matC,PETSC_DECIDE,PETSC_DECIDE,6,6); CHKERRQ(ierr); >>> ierr = MatSetType(*matC,MATAIJ); CHKERRQ(ierr); >>> ierr = MatSetUp(*matC); CHKERRQ(ierr); >>> >>> if(isotropic){ >>> PetscReal E,nu, M,L,vals[3]; >>> switch(materialID){ >>> case 0://Hardcoded properties for isotropic material #0 >>> E = 200; >>> nu = 1./3; >>> break; >>> case 1://Hardcoded properties for isotropic material #1 >>> E = 96; >>> nu = 1./3; >>> break; >>> }//end switch. >>> M = E/(2*(1+nu)),//Lame's constant 1 ("mu"). >>> L = E*nu/((1+nu)*(1-2*nu));//Lame's constant 2 ("lambda"). >>> //PetscErrorCode MatSetValues(Mat mat,PetscInt m,const PetscInt idxm[],PetscInt n,const PetscInt idxn[],const PetscScalar v[],InsertMode addv) >>> PetscInt idxn[3] = {0,1,2}; >>> vals[0] = L+2*M; vals[1] = L; vals[2] = vals[1]; >>> ierr = MatSetValues(*matC,1,&idxn[0],3,idxn,vals,INSERT_VALUES); CHKERRQ(ierr); >>> vals[1] = vals[0]; vals[0] = vals[2]; >>> ierr = MatSetValues(*matC,1,&idxn[1],3,idxn,vals,INSERT_VALUES); CHKERRQ(ierr); >>> vals[2] = vals[1]; vals[1] = vals[0]; >>> ierr = MatSetValues(*matC,1,&idxn[2],3,idxn,vals,INSERT_VALUES); CHKERRQ(ierr); >>> ierr = MatSetValue(*matC,3,3,M,INSERT_VALUES); CHKERRQ(ierr); >>> ierr = MatSetValue(*matC,4,4,M,INSERT_VALUES); CHKERRQ(ierr); >>> ierr = MatSetValue(*matC,5,5,M,INSERT_VALUES); CHKERRQ(ierr); >>> }//end if. >>> /* >>> else if(orthotropic){ >>> switch(materialID){ >>> case 0: >>> break; >>> case 1: >>> break; >>> }//end switch. >>> }//end else if. >>> */ >>> ierr = MatAssemblyBegin(*matC,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >>> ierr = MatAssemblyEnd(*matC,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >>> //MatView(*matC,0); >>> return ierr; >>> }//End ConstitutiveMatrix >>> >>> PetscErrorCode InitializeKEpreallocation(struct preKE *P,const char* type){ >>> PetscErrorCode ierr; >>> PetscBool istetra4 = PETSC_FALSE, >>> ishex8 = PETSC_FALSE; >>> ierr = PetscStrcmp(type,"tetra4",&istetra4); CHKERRQ(ierr); >>> ierr = PetscStrcmp(type,"hex8",&ishex8); CHKERRQ(ierr); >>> if(istetra4){ >>> P->sizeKE = 12; >>> P->N = 4; >>> }//end if. >>> else if(ishex8){ >>> P->sizeKE = 24; >>> P->N = 8; >>> }//end else if. 
>>> >>> >>> P->x_insert[0] = 0; P->x_insert[1] = 3; P->x_insert[2] = 5; >>> P->y_insert[0] = 1; P->y_insert[1] = 4; P->y_insert[2] = 5; >>> P->z_insert[0] = 2; P->z_insert[1] = 3; P->z_insert[2] = 4; >>> //Allocate memory for the differentiated shape function vectors. >>> ierr = PetscMalloc1(P->N,&(P->dPsidXi)); CHKERRQ(ierr); >>> ierr = PetscMalloc1(P->N,&(P->dPsidEta)); CHKERRQ(ierr); >>> ierr = PetscMalloc1(P->N,&(P->dPsidZeta)); CHKERRQ(ierr); >>> >>> P->dPsidXi[0] = +1.; P->dPsidEta[0] = 0.0; P->dPsidZeta[0] = 0.0; >>> P->dPsidXi[1] = 0.0; P->dPsidEta[1] = +1.; P->dPsidZeta[1] = 0.0; >>> P->dPsidXi[2] = 0.0; P->dPsidEta[2] = 0.0; P->dPsidZeta[2] = +1.; >>> P->dPsidXi[3] = -1.; P->dPsidEta[3] = -1.; P->dPsidZeta[3] = -1.; >>> >>> >>> //Strain matrix. >>> ierr = MatCreate(PETSC_COMM_WORLD,&(P->matB)); CHKERRQ(ierr); >>> ierr = MatSetSizes(P->matB,PETSC_DECIDE,PETSC_DECIDE,6,P->sizeKE); CHKERRQ(ierr);//Hardcoded >>> ierr = MatSetType(P->matB,MATAIJ); CHKERRQ(ierr); >>> ierr = MatSetUp(P->matB); CHKERRQ(ierr); >>> >>> //Contribution matrix. >>> ierr = MatCreate(PETSC_COMM_WORLD,&(P->matBTCB)); CHKERRQ(ierr); >>> ierr = MatSetSizes(P->matBTCB,PETSC_DECIDE,PETSC_DECIDE,P->sizeKE,P->sizeKE); CHKERRQ(ierr); >>> ierr = MatSetType(P->matBTCB,MATAIJ); CHKERRQ(ierr); >>> ierr = MatSetUp(P->matBTCB); CHKERRQ(ierr); >>> >>> //Element stiffness matrix. >>> //ierr = MatCreateSeqDense(PETSC_COMM_SELF,12,12,NULL,&KE); CHKERRQ(ierr); //PARALLEL >>> >>> return ierr; >>> } -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Jan 6 16:39:10 2022 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 6 Jan 2022 17:39:10 -0500 Subject: [petsc-users] [EXTERNAL] Re: DM misuse causes massive memory leak? In-Reply-To: References: <87tueraunm.fsf@jedbrown.org> <87sfu2921h.fsf@jedbrown.org> <87a6g99507.fsf@jedbrown.org> Message-ID: On Thu, Jan 6, 2022 at 5:30 PM Ferrand, Jesus A. wrote: > Matt: > > Thanks for the immediate reply. > My apologies, I misspelled the function name, it should have been " > DMPlexGetConeRecursiveVertices()." > Oh, this is a debugging function that Vaclav wrote. It should never be used in production. It allocates memory all over the place. If you want vertices, just get the closure and filter out the vertices: PetscInt *closure = NULL; PetscInt Ncl, Nclv; DMPlexGetDepthStrautm(dm, &vStart, &vEnd); DMPlexGetTransitiveClosure(dm, cell, PETSC_TRUE, &Ncl, &closure); Nclv = 0; for (cl = 0; cl < 2*Ncl; cl += 2) { const PetscInt point = closure[cl]; if ((point >= vStart) && (point < vEnd)) closure[Nclv++] = point; } DMPlexRestoreTransitiveClosure(...); > Regarding the PetscSection use: I am looping through every single point in > the DAG of my mesh. For each point I am assigning dof using > PetscSectionSetDof(). I am also assigning dof to the corresponding fields > using PetscSectionSetFieldDof(). I took Jed's advice and made a single > field with 3 components, I named all of them. So, I used > PetscSectionSetNumFields(), PetscSectionSetFieldComponents(), > PetscSectionSetFieldName(), and PetscSectionSetComponentName(). Finally, I > proceed to PetscSectionSetUp(), and DMSetLocalSection(). 
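For anyone following the closure-filtering advice given above (the inline snippet misspells DMPlexGetDepthStratum and elides the Restore call), a self-contained sketch of the same idea, assuming an interpolated DMPlex "dm" and a valid cell point "cell", would look roughly like:

    PetscInt *closure = NULL;
    PetscInt  Ncl, Nclv = 0, cl, vStart, vEnd;

    ierr = DMPlexGetDepthStratum(dm, 0, &vStart, &vEnd); CHKERRQ(ierr);            /* depth 0 = vertices */
    ierr = DMPlexGetTransitiveClosure(dm, cell, PETSC_TRUE, &Ncl, &closure); CHKERRQ(ierr);
    for (cl = 0; cl < 2*Ncl; cl += 2) {                                            /* closure stores (point, orientation) pairs */
      const PetscInt point = closure[cl];
      if ((point >= vStart) && (point < vEnd)) closure[Nclv++] = point;            /* keep vertex points only */
    }
    /* closure[0..Nclv-1] now lists the Nclv vertex points of this cell */
    ierr = DMPlexRestoreTransitiveClosure(dm, cell, PETSC_TRUE, &Ncl, &closure); CHKERRQ(ierr);

The Restore call must receive the same pointer that Get returned, which is why the vertex points are compacted in place rather than copied into a separate array.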
In my timed code > blocks I am including DMPlexGetDepthStratum(), and DMGetStratumSize(), and > DMPlexGetChart() because I need their output to assign the dof to the > PetscSection. > This should all take almost no time. There are no expensive operations there. Thanks, Matt > For what is worth, I have my configure set to --with-debugging = 1. > ------------------------------ > *From:* Matthew Knepley > *Sent:* Thursday, January 6, 2022 5:20 PM > *To:* Ferrand, Jesus A. > *Cc:* Jed Brown ; petsc-users > *Subject:* Re: [petsc-users] [EXTERNAL] Re: DM misuse causes massive > memory leak? > > On Thu, Jan 6, 2022 at 5:15 PM Ferrand, Jesus A. > wrote: > > Jed: > > DMPlexLabelComplete() has allowed me to speed up my code significantly > (Many thanks!). > > I did not use DMAddBoundary() though. > I figured I could obtain Index Sets (IS's) from the DAG for different > depths and then IS's for the points that were flagged in Gmsh (after > calling DMPlexLabelComplete()). > I then intersected both IS's using ISIntersect() to get the DAG points > corresponding to just vertices (and flagged by Gmsh) for Dirichlet BC's, > and DAG points that are Faces and flagged by Gmsh for Neumann BC's. I then > use the intersected IS to edit a Mat and a RHS Vec manually. I did further > profiling and have found the PetsSections are now the next biggest > overhead. > > For Dirichlet BC's I make an array of vertex ID's and call > MatSetZeroRows() to impose BC's on them through my K matrix. And yes, I'm > solving the elasticity PDE. For Neumann BC's I use > DMPlexGetRecursiveVertices() to edit my RHS vector. > > > I cannot find a function named DMPlexGetRecursiveVertices(). > > > I want to keep the PetscSections since they preallocate my matrix rather > well (the one from DMCreateMatrix()) but at the same time would like to > remove them since they add overhead. Do you think DMAddboundary() with the > function call will be faster than my single calls to MatSetZeroRows() and > DMPlexGetRecursiveVertices() ? > > > PetscSection is really simple. Are you sure you are measuring long times > there? What are you using it to do? > > Thanks, > > Matt > > > ------------------------------ > *From:* Jed Brown > *Sent:* Wednesday, January 5, 2022 5:44 PM > *To:* Ferrand, Jesus A. > *Cc:* petsc-users > *Subject:* Re: [EXTERNAL] Re: [petsc-users] DM misuse causes massive > memory leak? > > For something like displacement (and this sounds like elasticity), I would > recommend using one field with three components. You can constrain a subset > of the components to implement slip conditions. > > You can use DMPlexLabelComplete(dm, label) to propagate those face labels > to vertices. > > "Ferrand, Jesus A." writes: > > > Thanks for the reply (I hit reply all this time). > > > > So, I set 3 fields: > > /* > > ierr = PetscSectionSetNumFields(s,dof); CHKERRQ(ierr); > > ierr = PetscSectionSetFieldName(s,0, "X-Displacement"); CHKERRQ(ierr); > //Field ID is 0 > > ierr = PetscSectionSetFieldName(s,1, "Y-Displacement"); CHKERRQ(ierr); > //Field ID is 1 > > ierr = PetscSectionSetFieldName(s,2, "Z-Displacement"); CHKERRQ(ierr); > //Field ID is 2 > > */ > > > > I then loop through the vertices of my DMPlex > > > > /* > > for(ii = vStart; ii < vEnd; ii++){//Vertex loop. 
> > ierr = PetscSectionSetDof(s, ii, dof); CHKERRQ(ierr); > > ierr = PetscSectionSetFieldDof(s,ii,0,1); CHKERRQ(ierr);//One > X-displacement per vertex (1 dof) > > ierr = PetscSectionSetFieldDof(s,ii,1,1); CHKERRQ(ierr);//One > Y-displacement per vertex (1 dof) > > ierr = PetscSectionSetFieldDof(s,ii,2,1); CHKERRQ(ierr);//One > Z-displacement per vertex (1 dof) > > }//Sets x, y, and z displacements as dofs. > > */ > > > > I only associated fields with vertices, not with any other points in the > DAG. Regarding the use of DMAddBoundary(), I mostly copied the usage shown > in SNES example 77. I modified the function definition to simply set the > dof to 0.0 as opposed to the coordinates. Below "physicalgroups" is the > DMLabel that I got from gmsh, this flags Face points, not vertices. That is > why I think the error log suggests that fields were never set. > > > > ierr = DMAddBoundary(dm, DM_BC_ESSENTIAL, "fixed", physicalgroups, 1, > &surfvals[ii], fieldID, 0, NULL, (void (*)(void)) coordinates, NULL, NULL, > NULL); CHKERRQ(ierr); > > PetscErrorCode coordinates(PetscInt dim, PetscReal time, const PetscReal > x[], PetscInt Nf, PetscScalar *u, void *ctx){ > > const PetscInt Ncomp = dim; > > PetscInt comp; > > for (comp = 0; comp < Ncomp; ++comp) u[comp] = 0.0; > > return 0; > > } > > > > > > ________________________________ > > From: Jed Brown > > Sent: Wednesday, January 5, 2022 12:36 AM > > To: Ferrand, Jesus A. > > Cc: petsc-users > > Subject: Re: [EXTERNAL] Re: [petsc-users] DM misuse causes massive > memory leak? > > > > Please "reply all" include the list in the future. > > > > "Ferrand, Jesus A." writes: > > > >> Forgot to say thanks for the reply (my bad). > >> Yes, I was indeed forgetting to pre-allocate the sparse matrices when > doing them by hand (complacency seeded by MATLAB's zeros()). Thank you, > Jed, and Jeremy, for the hints! > >> > >> I have more questions, these ones about boundary conditions (I think > these are for Matt). > >> In my current code I set Dirichlet conditions directly on a Mat by > calling MatSetZeroRows(). I profiled my code and found the part that > applies them to be unnacceptably slow. In response, I've been trying to > better pre-allocate Mats using PetscSections. I have found documentation > for PetscSectionSetDof(), PetscSectionSetNumFields(), > PetscSectionSetFieldName(), and PetscSectionSetFieldDof(), to set the size > of my Mats and Vecs by calling DMSetLocalSection() followed by > DMCreateMatrix() and DMCreateLocalVector() to get a RHS vector. This seems > faster. > >> > >> In PetscSection, what is the difference between a "field" and a > "component"? For example, I could have one field "Velocity" with three > components ux, uy, and uz or perhaps three fields ux, uy, and uz each with > a default component? > > > > It's just how you name them and how they appear in output. Usually > "velocity" is better as a field with three components, but fields with > other meaning (and perhaps different finite element spaces), such as > pressure, would be different fields. Different components are always in the > same FE space. > > > >> I am struggling now to impose boundary conditions after constraining > dofs using PetscSection. My understanding is that constraining dof's > reduces the size of the DM's matrix but it does not give the DM knowledge > of what values the constrained dofs should have, right? > >> > >> I know that there is DMAddBoundary(), but I am unsure of how to use it. > From Gmsh I have a mesh with surface boundaries flagged. 
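For comparison with the three-field setup quoted above, a minimal sketch of the one-field, three-component layout Jed recommends could look as follows (the reuse of s, dm, ii, vStart, vEnd and the field/component names follow the snippets in this thread; the rest is an illustrative assumption, not code from the original program):

    ierr = PetscSectionSetNumFields(s, 1); CHKERRQ(ierr);
    ierr = PetscSectionSetFieldName(s, 0, "Displacement"); CHKERRQ(ierr);
    ierr = PetscSectionSetFieldComponents(s, 0, 3); CHKERRQ(ierr);
    ierr = PetscSectionSetComponentName(s, 0, 0, "X"); CHKERRQ(ierr);
    ierr = PetscSectionSetComponentName(s, 0, 1, "Y"); CHKERRQ(ierr);
    ierr = PetscSectionSetComponentName(s, 0, 2, "Z"); CHKERRQ(ierr);
    for (ii = vStart; ii < vEnd; ii++) {              /* all three displacement dofs live on vertices */
      ierr = PetscSectionSetDof(s, ii, 3); CHKERRQ(ierr);
      ierr = PetscSectionSetFieldDof(s, ii, 0, 3); CHKERRQ(ierr);
    }
    ierr = PetscSectionSetUp(s); CHKERRQ(ierr);
    ierr = DMSetLocalSection(dm, s); CHKERRQ(ierr);

Keeping the three components inside one field is what allows a boundary condition to constrain only a subset of the components (for example just the normal displacement) on a labeled face set.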
I'm not sure > whether DMAddBoundary()will constrain the face, edge, or vertex points when > I give it the DMLabel from Gmsh. (I need specific dof on the vertices to be > constrained). I did some testing and I think DMAddBoundary() attempts to > constrain the Face points (see error log below). I only associated fields > with the vertices but not the Faces. I can extract the vertex points from > the face label using DMPlexGetConeRecursiveVertices() but the output IS has > repeated entries for the vertex points (many faces share the same vertex). > Is there an easier way to get the vertex points from a gmsh surface tag? > > > > How did you call DMAddBoundary()? Are you using DM_BC_ESSENTIAL and a > callback function that provides the inhomogeneous boundary condition? > > > >> I'm sorry this is a mouthful. > >> > >> [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > >> [0]PETSC ERROR: Argument out of range > >> [0]PETSC ERROR: Field number 2 must be in [0, 0) > > > > It looks like you haven't added these fields yet. > > > >> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble > shooting. > >> [0]PETSC ERROR: Petsc Release Version 3.16.0, unknown > >> [0]PETSC ERROR: ./gmsh4 on a linux-c-dbg named F86 by jesus Tue Jan 4 > 15:19:57 2022 > >> [0]PETSC ERROR: Configure options --with-32bits-pci-domain=1 > --with-debugging =1 > >> [0]PETSC ERROR: #1 DMGetField() at > /home/jesus/SAND/PETSc_install/petsc/src/dm/interface/dm.c:4803 > >> [0]PETSC ERROR: #2 DMCompleteBoundaryLabel_Internal() at > /home/jesus/SAND/PETSc_install/petsc/src/dm/interface/dm.c:5140 > >> [0]PETSC ERROR: #3 DMAddBoundary() at > /home/jesus/SAND/PETSc_install/petsc/src/dm/interface/dm.c:8561 > >> [0]PETSC ERROR: #4 main() at /home/jesus/SAND/FEA/3D/gmshBACKUP4.c:215 > >> [0]PETSC ERROR: No PETSc Option Table entries > >> [0]PETSC ERROR: ----------------End of Error Message -------send entire > error message to petsc-maint at mcs.anl.gov---------- > >> > >> > >> > >> > >> > >> > >> ________________________________ > >> From: Jed Brown > >> Sent: Wednesday, December 29, 2021 5:55 PM > >> To: Ferrand, Jesus A. ; petsc-users at mcs.anl.gov < > petsc-users at mcs.anl.gov> > >> Subject: [EXTERNAL] Re: [petsc-users] DM misuse causes massive memory > leak? > >> > >> CAUTION: This email originated outside of Embry-Riddle Aeronautical > University. Do not click links or open attachments unless you recognize the > sender and know the content is safe. > >> > >> > >> "Ferrand, Jesus A." writes: > >> > >>> Dear PETSc Team: > >>> > >>> I have a question about DM and PetscSection. Say I import a mesh (for > FEM purposes) and create a DMPlex for it. I then use PetscSections to set > degrees of freedom per "point" (by point I mean vertices, lines, faces, and > cells). I then use PetscSectionGetStorageSize() to get the size of the > global stiffness matrix (K) needed for my FEM problem. One last detail, > this K I populate inside a rather large loop using an element stiffness > matrix function of my own. Instead of using DMCreateMatrix(), I manually > created a Mat using MatCreate(), MatSetType(), MatSetSizes(), and > MatSetUp(). I come to find that said loop is painfully slow when I use the > manually created matrix, but 20x faster when I use the Mat coming out of > DMCreateMatrix(). > >> > >> The sparse matrix hasn't been preallocated, which forces the data > structure to do a lot of copies (as bad as O(n^2) complexity). > DMCreateMatrix() preallocates for you. 
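To make the preallocation point concrete, a hand-built AIJ matrix would need something like the sketch below before any MatSetValues() calls. Here sizeK is the size computed earlier in the thread, while the per-row nonzero counts (81 in the diagonal block, 27 in the off-diagonal block) are only illustrative guesses for a P1 tetrahedral elasticity operator; producing an exact estimate is precisely the work that DMCreateMatrix() does from the section.

    ierr = MatCreate(PETSC_COMM_WORLD, &K); CHKERRQ(ierr);
    ierr = MatSetType(K, MATAIJ); CHKERRQ(ierr);
    ierr = MatSetSizes(K, PETSC_DECIDE, PETSC_DECIDE, sizeK, sizeK); CHKERRQ(ierr);
    ierr = MatSeqAIJSetPreallocation(K, 81, NULL); CHKERRQ(ierr);             /* used when K ends up as seqaij */
    ierr = MatMPIAIJSetPreallocation(K, 81, NULL, 27, NULL); CHKERRQ(ierr);   /* used when K ends up as mpiaij */

With either DMCreateMatrix() or an explicit preallocation like this, inserting the element contributions no longer triggers repeated mallocs and copies, which is where the 20x slowdown reported earlier in the thread comes from.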
> >> > >> > https://petsc.org/release/docs/manual/performance/#memory-allocation-for-sparse-matrix-assembly > >> https://petsc.org/release/docs/manual/mat/#sec-matsparse > >> > >>> My question is then: Is the manual Mat a noob mistake and is it > somehow creating a memory leak with K? Just in case it's something else I'm > attaching the code. The loop that populates K is between lines 221 and 278. > Anything related to DM, DMPlex, and PetscSection is between lines 117 and > 180. > >>> > >>> Machine Type: HP Laptop > >>> C-compiler: Gnu C > >>> OS: Ubuntu 20.04 > >>> PETSc version: 3.16.0 > >>> MPI Implementation: MPICH > >>> > >>> Hope you all had a Merry Christmas and that you will have a happy and > productive New Year. :D > >>> > >>> > >>> Sincerely: > >>> > >>> J.A. Ferrand > >>> > >>> Embry-Riddle Aeronautical University - Daytona Beach FL > >>> > >>> M.Sc. Aerospace Engineering | May 2022 > >>> > >>> B.Sc. Aerospace Engineering > >>> > >>> B.Sc. Computational Mathematics > >>> > >>> > >>> > >>> Sigma Gamma Tau > >>> > >>> Tau Beta Pi > >>> > >>> Honors Program > >>> > >>> > >>> > >>> Phone: (386)-843-1829 > >>> > >>> Email(s): ferranj2 at my.erau.edu > >>> > >>> jesus.ferrand at gmail.com > >>> //REFERENCE: > https://github.com/FreeFem/FreeFem-sources/blob/master/plugin/mpi/PETSc-code.hpp > >>> #include > >>> static char help[] = "Imports a Gmsh mesh with boundary conditions and > solves the elasticity equation.\n" > >>> "Option prefix = opt_.\n"; > >>> > >>> struct preKE{//Preallocation before computing KE > >>> Mat matB, > >>> matBTCB; > >>> //matKE; > >>> PetscInt x_insert[3], > >>> y_insert[3], > >>> z_insert[3], > >>> m,//Looping variables. > >>> sizeKE,//size of the element stiffness matrix. > >>> N,//Number of nodes in element. > >>> x_in,y_in,z_in; //LI to index B matrix. > >>> PetscReal J[3][3],//Jacobian matrix. > >>> invJ[3][3],//Inverse of the Jacobian matrix. > >>> detJ,//Determinant of the Jacobian. > >>> dX[3], > >>> dY[3], > >>> dZ[3], > >>> minor00, > >>> minor01, > >>> minor02,//Determinants of minors in a 3x3 matrix. > >>> dPsidX, dPsidY, dPsidZ,//Shape function derivatives w.r.t > global coordinates. > >>> weight,//Multiplier of quadrature weights. > >>> *dPsidXi,//Derivatives of shape functions w.r.t. Xi. > >>> *dPsidEta,//Derivatives of shape functions w.r.t. Eta. > >>> *dPsidZeta;//Derivatives of shape functions w.r.t Zeta. > >>> PetscErrorCode ierr; > >>> };//end struct. > >>> > >>> //Function declarations. > >>> extern PetscErrorCode tetra4(PetscScalar*, PetscScalar*, > PetscScalar*,struct preKE*, Mat*, Mat*); > >>> extern PetscErrorCode ConstitutiveMatrix(Mat*,const char*,PetscInt); > >>> extern PetscErrorCode InitializeKEpreallocation(struct preKE*,const > char*); > >>> > >>> PetscErrorCode PetscViewerVTKWriteFunction(PetscObject vec,PetscViewer > viewer){ > >>> PetscErrorCode ierr; > >>> ierr = VecView((Vec)vec,viewer); CHKERRQ(ierr); > >>> return ierr; > >>> } > >>> > >>> > >>> > >>> > >>> int main(int argc, char **args){ > >>> //DEFINITIONS OF PETSC's DMPLEX LINGO: > >>> //POINT: A topology element (cell, face, edge, or vertex). > >>> //CHART: It an interval from 0 to the number of "points." (the range > of admissible linear indices) > >>> //STRATUM: A subset of the "chart" which corresponds to all "points" > at a given "level." > >>> //LEVEL: This is either a "depth" or a "height". > >>> //HEIGHT: Dimensionality of an element measured from 0D to 3D. > Heights: cell = 0, face = 1, edge = 2, vertex = 3. 
> >>> //DEPTH: Dimensionality of an element measured from 3D to 0D. > Depths: cell = 3, face = 2, edge = 1, vertex = 0; > >>> //CLOSURE: *of an element is the collection of all other elements > that define it.I.e., the closure of a surface is the collection of vertices > and edges that make it up. > >>> //STAR: > >>> //STANDARD LABELS: These are default tags that DMPlex has for its > topology. ("depth") > >>> PetscErrorCode ierr;//Error tracking variable. > >>> DM dm;//Distributed memory object (useful for managing grids.) > >>> DMLabel physicalgroups;//Identifies user-specified tags in gmsh (to > impose BC's). > >>> DMPolytopeType celltype;//When looping through cells, determines its > type (tetrahedron, pyramid, hexahedron, etc.) > >>> PetscSection s; > >>> KSP ksp;//Krylov Sub-Space (linear solver object) > >>> Mat K,//Global stiffness matrix (Square, assume unsymmetric). > >>> KE,//Element stiffness matrix (Square, assume unsymmetric). > >>> matC;//Constitutive matrix. > >>> Vec XYZ,//Coordinate vector, contains spatial locations of mesh's > vertices (NOTE: This vector self-destroys!). > >>> U,//Displacement vector. > >>> F;//Load Vector. > >>> PetscViewer XYZviewer,//Viewer object to output mesh to ASCII format. > >>> XYZpUviewer; //Viewer object to output displacements to > ASCII format. > >>> PetscBool interpolate = PETSC_TRUE,//Instructs Gmsh importer whether > to generate faces and edges (Needed when using P2 or higher elements). > >>> useCone = PETSC_TRUE,//Instructs > "DMPlexGetTransitiveClosure()" whether to extract the closure or the star. > >>> dirichletBC = PETSC_FALSE,//For use when assembling the K > matrix. > >>> neumannBC = PETSC_FALSE,//For use when assembling the F > vector. > >>> saveASCII = PETSC_FALSE,//Whether to save results in ASCII > format. > >>> saveVTK = PETSC_FALSE;//Whether to save results as VTK > format. > >>> PetscInt nc,//number of cells. (PETSc lingo for "elements") > >>> nv,//number of vertices. (PETSc lingo for "nodes") > >>> nf,//number of faces. (PETSc lingo for "surfaces") > >>> ne,//number of edges. (PETSc lingo for "lines") > >>> pStart,//starting LI of global elements. > >>> pEnd,//ending LI of all elements. > >>> cStart,//starting LI for cells global arrangement. > >>> cEnd,//ending LI for cells in global arrangement. > >>> vStart,//starting LI for vertices in global arrangement. > >>> vEnd,//ending LI for vertices in global arrangement. > >>> fStart,//starting LI for faces in global arrangement. > >>> fEnd,//ending LI for faces in global arrangement. > >>> eStart,//starting LI for edges in global arrangement. > >>> eEnd,//ending LI for edges in global arrangement. > >>> sizeK,//Size of the element stiffness matrix. > >>> ii,jj,kk,//Dedicated looping variables. > >>> indexXYZ,//Variable to access the elements of XYZ vector. > >>> indexK,//Variable to access the elements of the U and F > vectors (can reference rows and colums of K matrix.) > >>> *closure = PETSC_NULL,//Pointer to the closure elements of > a cell. > >>> size_closure,//Size of the closure of a cell. > >>> dim,//Dimension of the mesh. > >>> //*edof,//Linear indices of dof's inside the K matrix. > >>> dof = 3,//Degrees of freedom per node. > >>> cells=0, edges=0, vertices=0, faces=0,//Topology counters > when looping through cells. > >>> pinXcode=10, pinZcode=11,forceZcode=12;//Gmsh codes to > extract relevant "Face Sets." > >>> PetscReal //*x_el,//Pointer to a vector that will store the > x-coordinates of an element's vertices. 
> >>> //*y_el,//Pointer to a vector that will store the > y-coordinates of an element's vertices. > >>> //*z_el,//Pointer to a vector that will store the > z-coordinates of an element's vertices. > >>> *xyz_el,//Pointer to xyz array in the XYZ vector. > >>> traction = -10, > >>> *KEdata, > >>> t1,t2; //time keepers. > >>> const char *gmshfile = "TopOptmeshfine2.msh";//Name of gmsh file to > import. > >>> > >>> ierr = PetscInitialize(&argc,&args,NULL,help); if(ierr) return ierr; > //And the machine shall work.... > >>> > >>> //MESH > IMPORT================================================================= > >>> //IMPORTANT NOTE: Gmsh only creates "cells" and "vertices," it does > not create the "faces" or "edges." > >>> //Gmsh probably can generate them, must figure out how to. > >>> t1 = MPI_Wtime(); > >>> ierr = > DMPlexCreateGmshFromFile(PETSC_COMM_WORLD,gmshfile,interpolate,&dm); > CHKERRQ(ierr);//Read Gmsh file and generate the DMPlex. > >>> ierr = DMGetDimension(dm, &dim); CHKERRQ(ierr);//1-D, 2-D, or 3-D > >>> ierr = DMPlexGetChart(dm, &pStart, &pEnd); CHKERRQ(ierr);//Extracts > linear indices of cells, vertices, faces, and edges. > >>> ierr = DMGetCoordinatesLocal(dm,&XYZ); CHKERRQ(ierr);//Extracts > coordinates from mesh.(Contiguous storage: [x0,y0,z0,x1,y1,z1,...]) > >>> ierr = VecGetArray(XYZ,&xyz_el); CHKERRQ(ierr);//Get pointer to > vector's data. > >>> t2 = MPI_Wtime(); > >>> PetscPrintf(PETSC_COMM_WORLD,"Mesh Import time: %10f\n",t2-t1); > >>> DMView(dm,PETSC_VIEWER_STDOUT_WORLD); > >>> > >>> //IMPORTANT NOTE: PETSc assumes that vertex-cell meshes are 2D even > if they were 3D, so its ordering changes. > >>> //Cells remain at height 0, but vertices move to height 1 from > height 3. To prevent this from becoming an issue > >>> //the "interpolate" variable is set to PETSC_TRUE to tell the mesh > importer to generate faces and edges. > >>> //PETSc, therefore, technically does additional meshing. Gotta > figure out how to get this from Gmsh directly. > >>> ierr = DMPlexGetDepthStratum(dm,3, &cStart, &cEnd);//Get LI of cells. > >>> ierr = DMPlexGetDepthStratum(dm,2, &fStart, &fEnd);//Get LI of faces > >>> ierr = DMPlexGetDepthStratum(dm,1, &eStart, &eEnd);//Get LI of edges. > >>> ierr = DMPlexGetDepthStratum(dm,0, &vStart, &vEnd);//Get LI of > vertices. > >>> ierr = DMGetStratumSize(dm,"depth", 3, &nc);//Get number of cells. > >>> ierr = DMGetStratumSize(dm,"depth", 2, &nf);//Get number of faces. > >>> ierr = DMGetStratumSize(dm,"depth", 1, &ne);//Get number of edges. > >>> ierr = DMGetStratumSize(dm,"depth", 0, &nv);//Get number of vertices. > >>> /* > >>> PetscPrintf(PETSC_COMM_WORLD,"global start = %10d\t global end = > %10d\n",pStart,pEnd); > >>> PetscPrintf(PETSC_COMM_WORLD,"#cells = %10d\t i = %10d\t i < > %10d\n",nc,cStart,cEnd); > >>> PetscPrintf(PETSC_COMM_WORLD,"#faces = %10d\t i = %10d\t i < > %10d\n",nf,fStart,fEnd); > >>> PetscPrintf(PETSC_COMM_WORLD,"#edges = %10d\t i = %10d\t i < > %10d\n",ne,eStart,eEnd); > >>> PetscPrintf(PETSC_COMM_WORLD,"#vertices = %10d\t i = %10d\t i < > %10d\n",nv,vStart,vEnd); > >>> */ > >>> //MESH > IMPORT================================================================= > >>> > >>> //NOTE: This section extremely hardcoded right now. > >>> //Current setup would only support P1 meshes. > >>> //MEMORY ALLOCATION > ========================================================== > >>> ierr = PetscSectionCreate(PETSC_COMM_WORLD, &s); CHKERRQ(ierr); > >>> //The chart is akin to a contiguous memory storage allocation. 
Each > chart entry is associated > >>> //with a "thing," could be a vertex, face, cell, or edge, or > anything else. > >>> ierr = PetscSectionSetChart(s, pStart, pEnd); CHKERRQ(ierr); > >>> //For each "thing" in the chart, additional room can be made. This > is helpful for associating > >>> //nodes to multiple degrees of freedom. These commands help > associate nodes with > >>> for(ii = cStart; ii < cEnd; ii++){//Cell loop. > >>> ierr = PetscSectionSetDof(s, ii, 0);CHKERRQ(ierr);}//NOTE: > Currently no dof's associated with cells. > >>> for(ii = fStart; ii < fEnd; ii++){//Face loop. > >>> ierr = PetscSectionSetDof(s, ii, 0);CHKERRQ(ierr);}//NOTE: > Currently no dof's associated with faces. > >>> for(ii = vStart; ii < vEnd; ii++){//Vertex loop. > >>> ierr = PetscSectionSetDof(s, ii, dof);CHKERRQ(ierr);}//Sets x, y, > and z displacements as dofs. > >>> for(ii = eStart; ii < eEnd; ii++){//Edge loop > >>> ierr = PetscSectionSetDof(s, ii, 0);CHKERRQ(ierr);}//NOTE: > Currently no dof's associated with edges. > >>> ierr = PetscSectionSetUp(s); CHKERRQ(ierr); > >>> ierr = > PetscSectionGetStorageSize(s,&sizeK);CHKERRQ(ierr);//Determine the size of > the global stiffness matrix. > >>> ierr = DMSetLocalSection(dm,s); CHKERRQ(ierr);//Associate the > PetscSection with the DM object. > >>> //PetscErrorCode DMCreateGlobalVector(DM dm,Vec *vec) > >>> //ierr = DMCreateGlobalVector(dm,&U); CHKERRQ(ierr); > >>> PetscSectionDestroy(&s); > >>> //PetscPrintf(PETSC_COMM_WORLD,"sizeK = %10d\n",sizeK); > >>> > >>> //OBJECT > SETUP================================================================ > >>> //Global stiffness matrix. > >>> //PetscErrorCode DMCreateMatrix(DM dm,Mat *mat) > >>> > >>> //This makes the loop fast. > >>> ierr = DMCreateMatrix(dm,&K); > >>> > >>> //This makes the loop uber slow. > >>> //ierr = MatCreate(PETSC_COMM_WORLD,&K); CHKERRQ(ierr); > >>> //ierr = MatSetType(K,MATAIJ); CHKERRQ(ierr);// Global stiffness > matrix set to some sparse type. > >>> //ierr = MatSetSizes(K,PETSC_DECIDE,PETSC_DECIDE,sizeK,sizeK); > CHKERRQ(ierr); > >>> //ierr = MatSetUp(K); CHKERRQ(ierr); > >>> > >>> //Displacement vector. > >>> ierr = VecCreate(PETSC_COMM_WORLD,&U); CHKERRQ(ierr); > >>> ierr = VecSetType(U,VECSTANDARD); CHKERRQ(ierr);// Global stiffness > matrix set to some sparse type. > >>> ierr = VecSetSizes(U,PETSC_DECIDE,sizeK); CHKERRQ(ierr); > >>> > >>> //Load vector. > >>> ierr = VecCreate(PETSC_COMM_WORLD,&F); CHKERRQ(ierr); > >>> ierr = VecSetType(F,VECSTANDARD); CHKERRQ(ierr);// Global stiffness > matrix set to some sparse type. > >>> ierr = VecSetSizes(F,PETSC_DECIDE,sizeK); CHKERRQ(ierr); > >>> //OBJECT > SETUP================================================================ > >>> > >>> //WARNING: This loop is currently hardcoded for P1 elements only! > Must Figure > >>> //out a clever way to modify to accomodate Pn (n>1) elements. 
> >>> > >>> //BEGIN GLOBAL STIFFNESS MATRIX > BUILDER======================================= > >>> t1 = MPI_Wtime(); > >>> > >>> > //PREALLOCATIONS============================================================== > >>> ierr = ConstitutiveMatrix(&matC,"isotropic",0); CHKERRQ(ierr); > >>> struct preKE preKEtetra4; > >>> ierr = InitializeKEpreallocation(&preKEtetra4,"tetra4"); > CHKERRQ(ierr); > >>> ierr = MatCreate(PETSC_COMM_WORLD,&KE); CHKERRQ(ierr); //SEQUENTIAL > >>> ierr = MatSetSizes(KE,PETSC_DECIDE,PETSC_DECIDE,12,12); > CHKERRQ(ierr); //SEQUENTIAL > >>> ierr = MatSetType(KE,MATDENSE); CHKERRQ(ierr); //SEQUENTIAL > >>> ierr = MatSetUp(KE); CHKERRQ(ierr); > >>> PetscReal x_tetra4[4], y_tetra4[4],z_tetra4[4], > >>> x_hex8[8], y_hex8[8],z_hex8[8], > >>> *x,*y,*z; > >>> PetscInt *EDOF,edof_tetra4[12],edof_hex8[24]; > >>> DMPolytopeType previous = DM_POLYTOPE_UNKNOWN; > >>> > //PREALLOCATIONS============================================================== > >>> > >>> > >>> > >>> for(ii=cStart;ii >>> ierr = DMPlexGetTransitiveClosure(dm, ii, useCone, &size_closure, > &closure); CHKERRQ(ierr); > >>> ierr = DMPlexGetCellType(dm, ii, &celltype); CHKERRQ(ierr); > >>> //IMPORTANT NOTE: MOST OF THIS LOOP SHOULD BE INCLUDED IN THE KE3D > function. > >>> if(previous != celltype){ > >>> //PetscPrintf(PETSC_COMM_WORLD,"run \n"); > >>> if(celltype == DM_POLYTOPE_TETRAHEDRON){ > >>> x = x_tetra4; > >>> y = y_tetra4; > >>> z = z_tetra4; > >>> EDOF = edof_tetra4; > >>> }//end if. > >>> else if(celltype == DM_POLYTOPE_HEXAHEDRON){ > >>> x = x_hex8; > >>> y = y_hex8; > >>> z = z_hex8; > >>> EDOF = edof_hex8; > >>> }//end else if. > >>> } > >>> previous = celltype; > >>> > >>> //PetscPrintf(PETSC_COMM_WORLD,"Cell # %4i\t",ii); > >>> cells=0; > >>> edges=0; > >>> vertices=0; > >>> faces=0; > >>> kk = 0; > >>> for(jj=0;jj<(2*size_closure);jj+=2){//Scan the closure of the > current cell. > >>> //Use information from the DM's strata to determine composition > of cell_ii. > >>> if(vStart <= closure[jj] && closure[jj]< vEnd){//Check for > vertices. > >>> //PetscPrintf(PETSC_COMM_WORLD,"%5i\t",closure[jj]); > >>> indexXYZ = dim*(closure[jj]-vStart);//Linear index of > x-coordinate in the xyz_el array. > >>> > >>> *(x+vertices) = xyz_el[indexXYZ]; > >>> *(y+vertices) = xyz_el[indexXYZ+1];//Extract Y-coordinates of > the current vertex. > >>> *(z+vertices) = xyz_el[indexXYZ+2];//Extract Y-coordinates of > the current vertex. > >>> *(EDOF + kk) = indexXYZ; > >>> *(EDOF + kk+1) = indexXYZ+1; > >>> *(EDOF + kk+2) = indexXYZ+2; > >>> kk+=3; > >>> vertices++;//Update vertex counter. > >>> }//end if > >>> else if(eStart <= closure[jj] && closure[jj]< eEnd){//Check for > edge ID's > >>> edges++; > >>> }//end else ifindexK > >>> else if(fStart <= closure[jj] && closure[jj]< fEnd){//Check for > face ID's > >>> faces++; > >>> }//end else if > >>> else if(cStart <= closure[jj] && closure[jj]< cEnd){//Check for > cell ID's > >>> cells++; > >>> }//end else if > >>> }//end "jj" loop. > >>> ierr = tetra4(x,y,z,&preKEtetra4,&matC,&KE); CHKERRQ(ierr); > //Generate the element stiffness matrix for this cell. > >>> ierr = MatDenseGetArray(KE,&KEdata); CHKERRQ(ierr); > >>> ierr = MatSetValues(K,12,EDOF,12,EDOF,KEdata,ADD_VALUES); > CHKERRQ(ierr);//WARNING: HARDCODED FOR TETRAHEDRAL P1 ELEMENTS ONLY > !!!!!!!!!!!!!!!!!!!!!!! > >>> ierr = MatDenseRestoreArray(KE,&KEdata); CHKERRQ(ierr); > >>> ierr = DMPlexRestoreTransitiveClosure(dm, ii,useCone, &size_closure, > &closure); CHKERRQ(ierr); > >>> }//end "ii" loop. 
> >>> ierr = MatAssemblyBegin(K,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); > >>> ierr = MatAssemblyEnd(K,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); > >>> //ierr = MatView(K,PETSC_VIEWER_DRAW_WORLD); CHKERRQ(ierr); > >>> //END GLOBAL STIFFNESS MATRIX > BUILDER=========================================== > >>> t2 = MPI_Wtime(); > >>> PetscPrintf(PETSC_COMM_WORLD,"K build time: %10f\n",t2-t1); > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> t1 = MPI_Wtime(); > >>> //BEGIN BOUNDARY CONDITION > ENFORCEMENT========================================== > >>> IS TrianglesIS, physicalsurfaceID;//, VerticesIS; > >>> PetscInt numsurfvals, > >>> //numRows, > >>> dof_offset,numTri; > >>> const PetscInt *surfvals, > >>> //*pinZID, > >>> *TriangleID; > >>> PetscScalar diag =1; > >>> PetscReal area,force; > >>> //NOTE: Petsc can read/assign labels. Eeach label may posses multiple > "values." > >>> //These values act as tags within a tag. > >>> //IMPORTANT NOTE: The below line needs a safety. If a mesh that does > not feature > >>> //face sets is imported, the code in its current state will crash!!!. > This is currently > >>> //hardcoded for the test mesh. > >>> ierr = DMGetLabel(dm, "Face Sets", &physicalgroups); > CHKERRQ(ierr);//Inspects Physical surface groups defined by gmsh (if any). > >>> ierr = DMLabelGetValueIS(physicalgroups, &physicalsurfaceID); > CHKERRQ(ierr);//Gets the physical surface ID's defined in gmsh (as > specified in the .geo file). > >>> ierr = ISGetIndices(physicalsurfaceID,&surfvals); CHKERRQ(ierr);//Get > a pointer to the actual surface values. > >>> ierr = DMLabelGetNumValues(physicalgroups, &numsurfvals); > CHKERRQ(ierr);//Gets the number of different values that the label assigns. > >>> for(ii=0;ii label. > >>> //PetscPrintf(PETSC_COMM_WORLD,"Values = %5i\n",surfvals[ii]); > >>> //PROBLEM: The surface values are hardcoded in the gmsh file. We > need to adopt standard "codes" > >>> //that we can give to users when they make their meshes so that this > code recognizes the Type > >>> // of boundary conditions that are to be imposed. > >>> if(surfvals[ii] == pinXcode){ > >>> dof_offset = 0; > >>> dirichletBC = PETSC_TRUE; > >>> }//end if. > >>> else if(surfvals[ii] == pinZcode){ > >>> dof_offset = 2; > >>> dirichletBC = PETSC_TRUE; > >>> }//end else if. > >>> else if(surfvals[ii] == forceZcode){ > >>> dof_offset = 2; > >>> neumannBC = PETSC_TRUE; > >>> }//end else if. > >>> > >>> ierr = DMLabelGetStratumIS(physicalgroups, surfvals[ii], > &TrianglesIS); CHKERRQ(ierr);//Get the ID's (as an IS) of the surfaces > belonging to value 11. > >>> //PROBLEM: DMPlexGetConeRecursiveVertices returns an array with > repeated node ID's. For each repetition, the lines that enforce BC's > unnecessarily re-run. > >>> ierr = ISGetSize(TrianglesIS,&numTri); CHKERRQ(ierr); > >>> ierr = ISGetIndices(TrianglesIS,&TriangleID); CHKERRQ(ierr);//Get a > pointer to the actual surface values. > >>> for(kk=0;kk >>> ierr = DMPlexGetTransitiveClosure(dm, TriangleID[kk], useCone, > &size_closure, &closure); CHKERRQ(ierr); > >>> if(neumannBC){ > >>> ierr = DMPlexComputeCellGeometryFVM(dm, TriangleID[kk], > &area,PETSC_NULL,PETSC_NULL); CHKERRQ(ierr); > >>> force = traction*area/3;//WARNING: The 3 here is hardcoded for a > purely tetrahedral mesh only!!!!!!!!!! > >>> } > >>> for(jj=0;jj<(2*size_closure);jj+=2){ > >>> //PetscErrorCode DMPlexComputeCellGeometryFVM(DM dm, PetscInt > cell, PetscReal *vol, PetscReal centroid[], PetscReal normal[]) > >>> if(vStart <= closure[jj] && closure[jj]< vEnd){//Check for > vertices. 
> >>> indexK = dof*(closure[jj] - vStart) + dof_offset; //Compute > the dof ID's in the K matrix. > >>> if(dirichletBC){//Boundary conditions requiring an edit of K > matrix. > >>> ierr = MatZeroRows(K,1,&indexK,diag,NULL,NULL); > CHKERRQ(ierr); > >>> }//end if. > >>> else if(neumannBC){//Boundary conditions requiring an edit of > RHS vector. > >>> ierr = VecSetValue(F,indexK,force,ADD_VALUES); > CHKERRQ(ierr); > >>> }// end else if. > >>> }//end if. > >>> }//end "jj" loop. > >>> ierr = DMPlexRestoreTransitiveClosure(dm, closure[jj],useCone, > &size_closure, &closure); CHKERRQ(ierr); > >>> }//end "kk" loop. > >>> ierr = ISRestoreIndices(TrianglesIS,&TriangleID); CHKERRQ(ierr); > >>> > >>> /* > >>> ierr = DMPlexGetConeRecursiveVertices(dm, TrianglesIS, &VerticesIS); > CHKERRQ(ierr);//Get the ID's (as an IS) of the vertices that make up the > surfaces of value 11. > >>> ierr = ISGetSize(VerticesIS,&numRows); CHKERRQ(ierr);//Get number of > flagged vertices (this includes repeated indices for faces that share > nodes). > >>> ierr = ISGetIndices(VerticesIS,&pinZID); CHKERRQ(ierr);//Get a > pointer to the actual surface values. > >>> if(dirichletBC){//Boundary conditions requiring an edit of K matrix. > >>> for(kk=0;kk >>> indexK = 3*(pinZID[kk] - vStart) + dof_offset; //Compute the dof > ID's in the K matrix. (NOTE: the 3* ishardcoded for 3 degrees of freedom, > tie this to a variable in the FUTURE.) > >>> ierr = MatZeroRows(K,1,&indexK,diag,NULL,NULL); CHKERRQ(ierr); > >>> }//end "kk" loop. > >>> }//end if. > >>> else if(neumannBC){//Boundary conditions requiring an edit of RHS > vector. > >>> for(kk=0;kk >>> indexK = 3*(pinZID[kk] - vStart) + dof_offset; > >>> ierr = VecSetValue(F,indexK,traction,INSERT_VALUES); > CHKERRQ(ierr); > >>> }//end "kk" loop. > >>> }// end else if. > >>> ierr = ISRestoreIndices(VerticesIS,&pinZID); CHKERRQ(ierr); > >>> */ > >>> dirichletBC = PETSC_FALSE; > >>> neumannBC = PETSC_FALSE; > >>> }//end "ii" loop. > >>> ierr = ISRestoreIndices(physicalsurfaceID,&surfvals); CHKERRQ(ierr); > >>> //ierr = ISRestoreIndices(VerticesIS,&pinZID); CHKERRQ(ierr); > >>> ierr = ISDestroy(&physicalsurfaceID); CHKERRQ(ierr); > >>> //ierr = ISDestroy(&VerticesIS); CHKERRQ(ierr); > >>> ierr = ISDestroy(&TrianglesIS); CHKERRQ(ierr); > >>> //END BOUNDARY CONDITION > ENFORCEMENT============================================ > >>> t2 = MPI_Wtime(); > >>> PetscPrintf(PETSC_COMM_WORLD,"BC imposition time: %10f\n",t2-t1); > >>> > >>> /* > >>> PetscInt kk = 0; > >>> for(ii=vStart;ii >>> kk++; > >>> PetscPrintf(PETSC_COMM_WORLD,"Vertex #%4i\t x = %10.9f\ty = > %10.9f\tz = %10.9f\n",ii,xyz_el[3*kk],xyz_el[3*kk+1],xyz_el[3*kk+2]); > >>> }// end "ii" loop. > >>> */ > >>> > >>> t1 = MPI_Wtime(); > >>> > //SOLVER======================================================================== > >>> ierr = KSPCreate(PETSC_COMM_WORLD,&ksp); CHKERRQ(ierr); > >>> ierr = KSPSetOperators(ksp,K,K); CHKERRQ(ierr); > >>> ierr = KSPSetFromOptions(ksp); CHKERRQ(ierr); > >>> ierr = KSPSolve(ksp,F,U); CHKERRQ(ierr); > >>> t2 = MPI_Wtime(); > >>> //ierr = KSPView(ksp,PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr); > >>> > //SOLVER======================================================================== > >>> t2 = MPI_Wtime(); > >>> PetscPrintf(PETSC_COMM_WORLD,"Solver time: %10f\n",t2-t1); > >>> ierr = VecRestoreArray(XYZ,&xyz_el); CHKERRQ(ierr);//Get pointer to > vector's data. 
> >>> > >>> //BEGIN MAX/MIN > DISPLACEMENTS=================================================== > >>> IS ISux,ISuy,ISuz; > >>> Vec UX,UY,UZ; > >>> PetscReal UXmax,UYmax,UZmax,UXmin,UYmin,UZmin; > >>> ierr = ISCreateStride(PETSC_COMM_WORLD,nv,0,3,&ISux); CHKERRQ(ierr); > >>> ierr = ISCreateStride(PETSC_COMM_WORLD,nv,1,3,&ISuy); CHKERRQ(ierr); > >>> ierr = ISCreateStride(PETSC_COMM_WORLD,nv,2,3,&ISuz); CHKERRQ(ierr); > >>> > >>> //PetscErrorCode VecGetSubVector(Vec X,IS is,Vec *Y) > >>> ierr = VecGetSubVector(U,ISux,&UX); CHKERRQ(ierr); > >>> ierr = VecGetSubVector(U,ISuy,&UY); CHKERRQ(ierr); > >>> ierr = VecGetSubVector(U,ISuz,&UZ); CHKERRQ(ierr); > >>> > >>> //PetscErrorCode VecMax(Vec x,PetscInt *p,PetscReal *val) > >>> ierr = VecMax(UX,PETSC_NULL,&UXmax); CHKERRQ(ierr); > >>> ierr = VecMax(UY,PETSC_NULL,&UYmax); CHKERRQ(ierr); > >>> ierr = VecMax(UZ,PETSC_NULL,&UZmax); CHKERRQ(ierr); > >>> > >>> ierr = VecMin(UX,PETSC_NULL,&UXmin); CHKERRQ(ierr); > >>> ierr = VecMin(UY,PETSC_NULL,&UYmin); CHKERRQ(ierr); > >>> ierr = VecMin(UZ,PETSC_NULL,&UZmin); CHKERRQ(ierr); > >>> > >>> PetscPrintf(PETSC_COMM_WORLD,"%10f\t <= ux <= %10f\n",UXmin,UXmax); > >>> PetscPrintf(PETSC_COMM_WORLD,"%10f\t <= uy <= %10f\n",UYmin,UYmax); > >>> PetscPrintf(PETSC_COMM_WORLD,"%10f\t <= uz <= %10f\n",UZmin,UZmax); > >>> > >>> > >>> > >>> > >>> //BEGIN OUTPUT > SOLUTION========================================================= > >>> if(saveASCII){ > >>> ierr = PetscViewerASCIIOpen(PETSC_COMM_WORLD,"XYZ.txt",&XYZviewer); > >>> ierr = VecView(XYZ,XYZviewer); CHKERRQ(ierr); > >>> ierr = PetscViewerASCIIOpen(PETSC_COMM_WORLD,"U.txt",&XYZpUviewer); > >>> ierr = VecView(U,XYZpUviewer); CHKERRQ(ierr); > >>> PetscViewerDestroy(&XYZviewer); PetscViewerDestroy(&XYZpUviewer); > >>> > >>> }//end if. > >>> if(saveVTK){ > >>> const char *meshfile = "starting_mesh.vtk", > >>> *deformedfile = "deformed_mesh.vtk"; > >>> ierr = > PetscViewerVTKOpen(PETSC_COMM_WORLD,meshfile,FILE_MODE_WRITE,&XYZviewer); > CHKERRQ(ierr); > >>> //PetscErrorCode DMSetAuxiliaryVec(DM dm, DMLabel label, PetscInt > value, Vec aux) > >>> DMLabel UXlabel,UYlabel, UZlabel; > >>> //PetscErrorCode DMLabelCreate(MPI_Comm comm, const char name[], > DMLabel *label) > >>> ierr = DMLabelCreate(PETSC_COMM_WORLD, "X-Displacement", &UXlabel); > CHKERRQ(ierr); > >>> ierr = DMLabelCreate(PETSC_COMM_WORLD, "Y-Displacement", &UYlabel); > CHKERRQ(ierr); > >>> ierr = DMLabelCreate(PETSC_COMM_WORLD, "Z-Displacement", &UZlabel); > CHKERRQ(ierr); > >>> ierr = DMSetAuxiliaryVec(dm,UXlabel, 1, UX); CHKERRQ(ierr); > >>> ierr = DMSetAuxiliaryVec(dm,UYlabel, 1, UY); CHKERRQ(ierr); > >>> ierr = DMSetAuxiliaryVec(dm,UZlabel, 1, UZ); CHKERRQ(ierr); > >>> //PetscErrorCode PetscViewerVTKAddField(PetscViewer > viewer,PetscObject dm,PetscErrorCode > (*PetscViewerVTKWriteFunction)(PetscObject,PetscViewer),PetscInt > fieldnum,PetscViewerVTKFieldType fieldtype,PetscBool checkdm,PetscObject > vec) > >>> > >>> > >>> > >>> //ierr = PetscViewerVTKAddField(XYZviewer, dm,PetscErrorCode > (*PetscViewerVTKWriteFunction)(Vec,PetscViewer),PETSC_DEFAULT,PETSC_VTK_POINT_FIELD,PETSC_FALSE,UX); > >>> ierr = PetscViewerVTKAddField(XYZviewer, > (PetscObject)dm,&PetscViewerVTKWriteFunction,PETSC_DEFAULT,PETSC_VTK_POINT_FIELD,PETSC_FALSE,(PetscObject)UX); > >>> > >>> > >>> ierr = DMPlexVTKWriteAll((PetscObject)dm, XYZviewer); CHKERRQ(ierr); > >>> ierr = VecAXPY(XYZ,1,U); CHKERRQ(ierr);//Add displacement field to > the mesh coordinates to deform. 
> >>> ierr = > PetscViewerVTKOpen(PETSC_COMM_WORLD,deformedfile,FILE_MODE_WRITE,&XYZpUviewer); > CHKERRQ(ierr); > >>> ierr = DMPlexVTKWriteAll((PetscObject)dm, XYZpUviewer); > CHKERRQ(ierr);// > >>> PetscViewerDestroy(&XYZviewer); PetscViewerDestroy(&XYZpUviewer); > >>> > >>> }//end else if. > >>> else{ > >>> ierr = PetscPrintf(PETSC_COMM_WORLD,"No output format specified! > Files not saved.\n"); CHKERRQ(ierr); > >>> }//end else. > >>> > >>> > >>> //END OUTPUT > SOLUTION=========================================================== > >>> VecDestroy(&UX); ISDestroy(&ISux); > >>> VecDestroy(&UY); ISDestroy(&ISuy); > >>> VecDestroy(&UZ); ISDestroy(&ISuz); > >>> //END MAX/MIN > DISPLACEMENTS===================================================== > >>> > >>> > //CLEANUP===================================================================== > >>> DMDestroy(&dm); > >>> KSPDestroy(&ksp); > >>> MatDestroy(&K); MatDestroy(&KE); MatDestroy(&matC); > //MatDestroy(preKEtetra4.matB); MatDestroy(preKEtetra4.matBTCB); > >>> VecDestroy(&U); VecDestroy(&F); > >>> > >>> //DMLabelDestroy(&physicalgroups);//Destroyig the DM destroys the > label. > >>> > //CLEANUP===================================================================== > >>> //PetscErrorCode PetscMallocDump(FILE *fp) > >>> //ierr = PetscMallocDump(NULL); > >>> return PetscFinalize();//And the machine shall rest.... > >>> }//end main. > >>> > >>> PetscErrorCode tetra4(PetscScalar* X,PetscScalar* Y, PetscScalar* > Z,struct preKE *P, Mat* matC, Mat* KE){ > >>> //INPUTS: > >>> //X: Global X coordinates of the elemental nodes. > >>> //Y: Global Y coordinates of the elemental nodes. > >>> //Z: Global Z coordinates of the elemental nodes. > >>> //J: Jacobian matrix. > >>> //invJ: Inverse Jacobian matrix. > >>> PetscErrorCode ierr; > >>> //For current quadrature point, get dPsi/dXi_i Xi_i = {Xi,Eta,Zeta} > >>> /* > >>> P->dPsidXi[0] = +1.; P->dPsidEta[0] = 0.0; P->dPsidZeta[0] = 0.0; > >>> P->dPsidXi[1] = 0.0; P->dPsidEta[1] = +1.; P->dPsidZeta[1] = 0.0; > >>> P->dPsidXi[2] = 0.0; P->dPsidEta[2] = 0.0; P->dPsidZeta[2] = +1.; > >>> P->dPsidXi[3] = -1.; P->dPsidEta[3] = -1.; P->dPsidZeta[3] = -1.; > >>> */ > >>> //Populate the Jacobian matrix. > >>> P->J[0][0] = X[0] - X[3]; > >>> P->J[0][1] = Y[0] - Y[3]; > >>> P->J[0][2] = Z[0] - Z[3]; > >>> P->J[1][0] = X[1] - X[3]; > >>> P->J[1][1] = Y[1] - Y[3]; > >>> P->J[1][2] = Z[1] - Z[3]; > >>> P->J[2][0] = X[2] - X[3]; > >>> P->J[2][1] = Y[2] - Y[3]; > >>> P->J[2][2] = Z[2] - Z[3]; > >>> > >>> //Determinant of the 3x3 Jacobian. (Expansion along 1st row). > >>> P->minor00 = P->J[1][1]*P->J[2][2] - P->J[2][1]*P->J[1][2];//Reuse > when finding InvJ. > >>> P->minor01 = P->J[1][0]*P->J[2][2] - P->J[2][0]*P->J[1][2];//Reuse > when finding InvJ. > >>> P->minor02 = P->J[1][0]*P->J[2][1] - P->J[2][0]*P->J[1][1];//Reuse > when finding InvJ. > >>> P->detJ = P->J[0][0]*P->minor00 - P->J[0][1]*P->minor01 + > P->J[0][2]*P->minor02; > >>> //Inverse of the 3x3 Jacobian > >>> P->invJ[0][0] = +P->minor00/P->detJ;//Reuse precomputed minor. > >>> P->invJ[0][1] = -(P->J[0][1]*P->J[2][2] - > P->J[0][2]*P->J[2][1])/P->detJ; > >>> P->invJ[0][2] = +(P->J[0][1]*P->J[1][2] - > P->J[1][1]*P->J[0][2])/P->detJ; > >>> P->invJ[1][0] = -P->minor01/P->detJ;//Reuse precomputed minor. > >>> P->invJ[1][1] = +(P->J[0][0]*P->J[2][2] - > P->J[0][2]*P->J[2][0])/P->detJ; > >>> P->invJ[1][2] = -(P->J[0][0]*P->J[1][2] - > P->J[1][0]*P->J[0][2])/P->detJ; > >>> P->invJ[2][0] = +P->minor02/P->detJ;//Reuse precomputed minor. 
> >>> P->invJ[2][1] = -(P->J[0][0]*P->J[2][1] - > P->J[0][1]*P->J[2][0])/P->detJ; > >>> P->invJ[2][2] = +(P->J[0][0]*P->J[1][1] - > P->J[0][1]*P->J[1][0])/P->detJ; > >>> > >>> //*****************STRAIN MATRIX > (B)************************************** > >>> for(P->m=0;P->mN;P->m++){//Scan all shape functions. > >>> > >>> P->x_in = 0 + P->m*3;//Every 3rd column starting at 0 > >>> P->y_in = P->x_in +1;//Every 3rd column starting at 1 > >>> P->z_in = P->y_in +1;//Every 3rd column starting at 2 > >>> > >>> P->dX[0] = P->invJ[0][0]*P->dPsidXi[P->m] + > P->invJ[0][1]*P->dPsidEta[P->m] + P->invJ[0][2]*P->dPsidZeta[P->m]; > >>> P->dY[0] = P->invJ[1][0]*P->dPsidXi[P->m] + > P->invJ[1][1]*P->dPsidEta[P->m] + P->invJ[1][2]*P->dPsidZeta[P->m]; > >>> P->dZ[0] = P->invJ[2][0]*P->dPsidXi[P->m] + > P->invJ[2][1]*P->dPsidEta[P->m] + P->invJ[2][2]*P->dPsidZeta[P->m]; > >>> > >>> P->dX[1] = P->dZ[0]; P->dX[2] = P->dY[0]; > >>> P->dY[1] = P->dZ[0]; P->dY[2] = P->dX[0]; > >>> P->dZ[1] = P->dX[0]; P->dZ[2] = P->dY[0]; > >>> > >>> ierr = > MatSetValues(P->matB,3,P->x_insert,1,&(P->x_in),P->dX,INSERT_VALUES); > CHKERRQ(ierr); > >>> ierr = > MatSetValues(P->matB,3,P->y_insert,1,&(P->y_in),P->dY,INSERT_VALUES); > CHKERRQ(ierr); > >>> ierr = > MatSetValues(P->matB,3,P->z_insert,1,&(P->z_in),P->dZ,INSERT_VALUES); > CHKERRQ(ierr); > >>> > >>> }//end "m" loop. > >>> ierr = MatAssemblyBegin(P->matB,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); > >>> ierr = MatAssemblyEnd(P->matB,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); > >>> //*****************STRAIN MATRIX > (B)************************************** > >>> > >>> //Compute the matrix product B^t*C*B, scale it by the > quadrature weights and add to KE. > >>> P->weight = -P->detJ/6; > >>> > >>> ierr = MatZeroEntries(*KE); CHKERRQ(ierr); > >>> ierr = > MatPtAP(*matC,P->matB,MAT_INITIAL_MATRIX,PETSC_DEFAULT,&(P->matBTCB));CHKERRQ(ierr); > >>> ierr = MatScale(P->matBTCB,P->weight); CHKERRQ(ierr); > >>> ierr = MatAssemblyBegin(P->matBTCB,MAT_FINAL_ASSEMBLY); > CHKERRQ(ierr); > >>> ierr = MatAssemblyEnd(P->matBTCB,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); > >>> ierr = MatAXPY(*KE,1,P->matBTCB,DIFFERENT_NONZERO_PATTERN); > CHKERRQ(ierr);//Add contribution of current quadrature point to KE. > >>> > >>> //ierr = > MatPtAP(*matC,P->matB,MAT_INITIAL_MATRIX,PETSC_DEFAULT,KE);CHKERRQ(ierr); > >>> //ierr = MatScale(*KE,P->weight); CHKERRQ(ierr); > >>> > >>> ierr = MatAssemblyBegin(*KE,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); > >>> ierr = MatAssemblyEnd(*KE,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); > >>> > >>> //Cleanup > >>> return ierr; > >>> }//end tetra4. > >>> > >>> PetscErrorCode ConstitutiveMatrix(Mat *matC,const char* type,PetscInt > materialID){ > >>> PetscErrorCode ierr; > >>> PetscBool isotropic = PETSC_FALSE, > >>> orthotropic = PETSC_FALSE; > >>> //PetscErrorCode PetscStrcmp(const char a[],const char > b[],PetscBool *flg) > >>> ierr = PetscStrcmp(type,"isotropic",&isotropic); > >>> ierr = PetscStrcmp(type,"orthotropic",&orthotropic); > >>> ierr = MatCreate(PETSC_COMM_WORLD,matC); CHKERRQ(ierr); > >>> ierr = MatSetSizes(*matC,PETSC_DECIDE,PETSC_DECIDE,6,6); > CHKERRQ(ierr); > >>> ierr = MatSetType(*matC,MATAIJ); CHKERRQ(ierr); > >>> ierr = MatSetUp(*matC); CHKERRQ(ierr); > >>> > >>> if(isotropic){ > >>> PetscReal E,nu, M,L,vals[3]; > >>> switch(materialID){ > >>> case 0://Hardcoded properties for isotropic material #0 > >>> E = 200; > >>> nu = 1./3; > >>> break; > >>> case 1://Hardcoded properties for isotropic material #1 > >>> E = 96; > >>> nu = 1./3; > >>> break; > >>> }//end switch. 
> >>> M = E/(2*(1+nu)),//Lame's constant 1 ("mu"). > >>> L = E*nu/((1+nu)*(1-2*nu));//Lame's constant 2 ("lambda"). > >>> //PetscErrorCode MatSetValues(Mat mat,PetscInt m,const PetscInt > idxm[],PetscInt n,const PetscInt idxn[],const PetscScalar v[],InsertMode > addv) > >>> PetscInt idxn[3] = {0,1,2}; > >>> vals[0] = L+2*M; vals[1] = L; vals[2] = vals[1]; > >>> ierr = MatSetValues(*matC,1,&idxn[0],3,idxn,vals,INSERT_VALUES); > CHKERRQ(ierr); > >>> vals[1] = vals[0]; vals[0] = vals[2]; > >>> ierr = MatSetValues(*matC,1,&idxn[1],3,idxn,vals,INSERT_VALUES); > CHKERRQ(ierr); > >>> vals[2] = vals[1]; vals[1] = vals[0]; > >>> ierr = MatSetValues(*matC,1,&idxn[2],3,idxn,vals,INSERT_VALUES); > CHKERRQ(ierr); > >>> ierr = MatSetValue(*matC,3,3,M,INSERT_VALUES); CHKERRQ(ierr); > >>> ierr = MatSetValue(*matC,4,4,M,INSERT_VALUES); CHKERRQ(ierr); > >>> ierr = MatSetValue(*matC,5,5,M,INSERT_VALUES); CHKERRQ(ierr); > >>> }//end if. > >>> /* > >>> else if(orthotropic){ > >>> switch(materialID){ > >>> case 0: > >>> break; > >>> case 1: > >>> break; > >>> }//end switch. > >>> }//end else if. > >>> */ > >>> ierr = MatAssemblyBegin(*matC,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); > >>> ierr = MatAssemblyEnd(*matC,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); > >>> //MatView(*matC,0); > >>> return ierr; > >>> }//End ConstitutiveMatrix > >>> > >>> PetscErrorCode InitializeKEpreallocation(struct preKE *P,const char* > type){ > >>> PetscErrorCode ierr; > >>> PetscBool istetra4 = PETSC_FALSE, > >>> ishex8 = PETSC_FALSE; > >>> ierr = PetscStrcmp(type,"tetra4",&istetra4); CHKERRQ(ierr); > >>> ierr = PetscStrcmp(type,"hex8",&ishex8); CHKERRQ(ierr); > >>> if(istetra4){ > >>> P->sizeKE = 12; > >>> P->N = 4; > >>> }//end if. > >>> else if(ishex8){ > >>> P->sizeKE = 24; > >>> P->N = 8; > >>> }//end else if. > >>> > >>> > >>> P->x_insert[0] = 0; P->x_insert[1] = 3; P->x_insert[2] = 5; > >>> P->y_insert[0] = 1; P->y_insert[1] = 4; P->y_insert[2] = 5; > >>> P->z_insert[0] = 2; P->z_insert[1] = 3; P->z_insert[2] = 4; > >>> //Allocate memory for the differentiated shape function vectors. > >>> ierr = PetscMalloc1(P->N,&(P->dPsidXi)); CHKERRQ(ierr); > >>> ierr = PetscMalloc1(P->N,&(P->dPsidEta)); CHKERRQ(ierr); > >>> ierr = PetscMalloc1(P->N,&(P->dPsidZeta)); CHKERRQ(ierr); > >>> > >>> P->dPsidXi[0] = +1.; P->dPsidEta[0] = 0.0; P->dPsidZeta[0] = 0.0; > >>> P->dPsidXi[1] = 0.0; P->dPsidEta[1] = +1.; P->dPsidZeta[1] = 0.0; > >>> P->dPsidXi[2] = 0.0; P->dPsidEta[2] = 0.0; P->dPsidZeta[2] = +1.; > >>> P->dPsidXi[3] = -1.; P->dPsidEta[3] = -1.; P->dPsidZeta[3] = -1.; > >>> > >>> > >>> //Strain matrix. > >>> ierr = MatCreate(PETSC_COMM_WORLD,&(P->matB)); CHKERRQ(ierr); > >>> ierr = MatSetSizes(P->matB,PETSC_DECIDE,PETSC_DECIDE,6,P->sizeKE); > CHKERRQ(ierr);//Hardcoded > >>> ierr = MatSetType(P->matB,MATAIJ); CHKERRQ(ierr); > >>> ierr = MatSetUp(P->matB); CHKERRQ(ierr); > >>> > >>> //Contribution matrix. > >>> ierr = MatCreate(PETSC_COMM_WORLD,&(P->matBTCB)); CHKERRQ(ierr); > >>> ierr = > MatSetSizes(P->matBTCB,PETSC_DECIDE,PETSC_DECIDE,P->sizeKE,P->sizeKE); > CHKERRQ(ierr); > >>> ierr = MatSetType(P->matBTCB,MATAIJ); CHKERRQ(ierr); > >>> ierr = MatSetUp(P->matBTCB); CHKERRQ(ierr); > >>> > >>> //Element stiffness matrix. > >>> //ierr = MatCreateSeqDense(PETSC_COMM_SELF,12,12,NULL,&KE); > CHKERRQ(ierr); //PARALLEL > >>> > >>> return ierr; > >>> } > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. 
> -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexis.marboeuf at hotmail.fr Thu Jan 6 16:51:38 2022 From: alexis.marboeuf at hotmail.fr (Alexis Marboeuf) Date: Thu, 6 Jan 2022 22:51:38 +0000 Subject: [petsc-users] TR: [petsc-dev] DMPlexCreateGlobalToNaturalSF and partitioners In-Reply-To: References: Message-ID: Hi Matt, First, I wish a Happy New Year 2022 to you and to the whole PETSc developper team. Waiting I can push my branch on the remote repository (I am going to post on petsc-dev just after), the error in DMPlexCreateGlobalToNaturalSF with parmetis or ptscotch partitionners raises with src/dm/impls/plex/tests/ex44.c on the branch knepley/fix-plex-g2n. I just defined 2 fields instead of 1 in ex44.c: const PetscInt Nf = 2; const PetscInt numComp[2] = {1, dim}; const PetscInt numDof[6] = {1, 0, 0, 0, 0, dim}; on lines 284 to 286. I get the error [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Invalid argument [0]PETSC ERROR: Input array needs to be sorted [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. [0]PETSC ERROR: Petsc Development GIT revision: v3.16.1-235-g50d0f7a46d GIT Date: 2021-12-19 19:19:22 -0500 [0]PETSC ERROR: ../ex44 on a arch-darwin-c-debug named marboeua-1.math.mcmaster.ca by alexismarboeuf Thu Jan 6 17:23:54 2022 [0]PETSC ERROR: Configure options --download-chaco=1 --download-exodusii=1 --download-fblaslapack=1 --download-hdf5=1 --download-hypre=1 --download-ml=1 --download-netcdf=1 --download-pnetcdf=1 --download-sieve=1 --download-sowing=1 --download-yaml=1 --download-zlib=1 --download-metis=1 --download-parmetis=1 --download-ptscotch=1 --with-boost-dir=/opt/homebrew/Cellar/boost/1.76.0 --with-boost=1 --with-c2html=0 --with-debugging=1 --with-fortran-datatypes=1 --with-mpi-dir=/opt/homebrew/Cellar/mpich/3.4.3 --with-ranlib=ranlib --with-x11=1 [0]PETSC ERROR: #1 PetscSortedRemoveDupsInt() at /Users/alexismarboeuf/Documents/petsc/src/sys/utils/sorti.c:308 [0]PETSC ERROR: #2 PetscSFCreateEmbeddedLeafSF() at /Users/alexismarboeuf/Documents/petsc/src/vec/is/sf/interface/sf.c:1405 [0]PETSC ERROR: #3 DMPlexCreateGlobalToNaturalSF() at /Users/alexismarboeuf/Documents/petsc/src/dm/impls/plex/plexnatural.c:174 [0]PETSC ERROR: #4 DMPlexDistribute() at /Users/alexismarboeuf/Documents/petsc/src/dm/impls/plex/plexdistribute.c:1747 [0]PETSC ERROR: #5 main() at /Users/alexismarboeuf/Documents/petsc/src/dm/impls/plex/tests/ex44.c:299 [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [1]PETSC ERROR: Invalid argument [1]PETSC ERROR: Input array needs to be sorted [1]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. 
[1]PETSC ERROR: Petsc Development GIT revision: v3.16.1-235-g50d0f7a46d GIT Date: 2021-12-19 19:19:22 -0500 [1]PETSC ERROR: ../ex44 on a arch-darwin-c-debug named marboeua-1.math.mcmaster.ca by alexismarboeuf Thu Jan 6 17:23:54 2022 [1]PETSC ERROR: Configure options --download-chaco=1 --download-exodusii=1 --download-fblaslapack=1 --download-hdf5=1 --download-hypre=1 --download-ml=1 --download-netcdf=1 --download-pnetcdf=1 --download-sieve=1 --download-sowing=1 --download-yaml=1 --download-zlib=1 --download-metis=1 --download-parmetis=1 --download-ptscotch=1 --with-boost-dir=/opt/homebrew/Cellar/boost/1.76.0 --with-boost=1 --with-c2html=0 --with-debugging=1 --with-fortran-datatypes=1 --with-mpi-dir=/opt/homebrew/Cellar/mpich/3.4.3 --with-ranlib=ranlib --with-x11=1 [1]PETSC ERROR: #1 PetscSortedRemoveDupsInt() at /Users/alexismarboeuf/Documents/petsc/src/sys/utils/sorti.c:308 [1]PETSC ERROR: #2 PetscSFCreateEmbeddedLeafSF() at /Users/alexismarboeuf/Documents/petsc/src/vec/is/sf/interface/sf.c:1405 [1]PETSC ERROR: #3 DMPlexCreateGlobalToNaturalSF() at /Users/alexismarboeuf/Documents/petsc/src/dm/impls/plex/plexnatural.c:174 [1]PETSC ERROR: #4 DMPlexDistribute() at /Users/alexismarboeuf/Documents/petsc/src/dm/impls/plex/plexdistribute.c:1747 [1]PETSC ERROR: #5 main() at /Users/alexismarboeuf/Documents/petsc/src/dm/impls/plex/tests/ex44.c:299 [1]PETSC ERROR: PETSc Option Table entries: [1]PETSC ERROR: -field [1]PETSC ERROR: [0]PETSC ERROR: PETSc Option Table entries: [0]PETSC ERROR: -field [0]PETSC ERROR: -petscpartitioner_type parmetis [0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- -petscpartitioner_type parmetis [1]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- Abort(62) on node 0 (rank 0 in comm 0): application called MPI_Abort(MPI_COMM_WORLD, 62) - process 0 Abort(62) on node 1 (rank 1 in comm 0): application called MPI_Abort(MPI_COMM_WORLD, 62) - process 1 running mpiexec -n 2 ../ex44 -field -petscpartitioner_type parmetis in $PETSC_DIR/$PETSC_ARCH/tests/dm/impls/plex/tests/runex44_0. Thanks a lot for your time and your help. ----------------?--------------------------------------------------- Alexis Marboeuf Postdoctoral fellow, Department of Mathematics & Statistics Hamilton Hall room 409B, McMaster University 1280 Main Street West, Hamilton, Ontario L8S 4K1, Canada EMail: marboeua at mcmaster.ca Tel. +1 (905) 525 9140 ext. 27031 ------------------------------------------------------------------- ________________________________ De : Alexis Marboeuf Envoy? : lundi 20 d?cembre 2021 15:34 ? : Matthew Knepley Cc : petsc-dev at mcs.anl.gov Objet : RE: [petsc-dev] DMPlexCreateGlobalToNaturalSF and partitioners Hi Matt, I created a branch marboeuf/plex-naturaldm starting from knepley/fix-plex-g2n with my minor modifications. It's incomplete now: src/dm/impls/plex/plex.c for Sub or SuperDM are not updated and my new example src/dm/impls/plex/tests/ex47.c just does one call of DMPlexDistribute. But I have the error I mentioned with partitioners. 
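For reference, a reading of the two-field arrays quoted earlier (Nf, numComp, numDof), assuming they are handed to DMPlexCreateSection() as in other Plex tests (the call itself is not shown in the messages); numDof is laid out as numDof[field*(dim+1) + depth], and dim = 2 for the 2x5 mesh of ex44.c (error checking omitted):
const PetscInt Nf         = 2;
const PetscInt numComp[2] = {1, dim};     /* field 0: 1 component, field 1: dim components */
const PetscInt numDof[6]  = {1, 0, 0,     /* field 0: one dof per vertex                   */
                             0, 0, dim};  /* field 1: dim dofs per cell                    */
PetscSection   s;

DMSetNumFields(dm, Nf);
DMPlexCreateSection(dm, NULL, numComp, numDof, 0, NULL, NULL, NULL, NULL, &s);
DMSetLocalSection(dm, s);
PetscSectionDestroy(&s);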
I am configuring PETSc with ./configure --download-chaco=1 --download-exodusii=1 --download-fblaslapack=1 --download-hdf5=1 --download-hypre=1 --download-metis=1 --download-ml=1 --download-netcdf=1 --download-parmetis=1 --download-pnetcdf=1 --download-ptscotch=1 --download-sieve=1 --download-sowing=1 --download-yaml=1 --download-zlib=1 --with-boost-dir=/opt/homebrew/Cellar/boost/1.76.0 --with-boost=1 --with-c2html=0 --with-debugging=1 --with-fortran-datatypes=1 --with-mpi-dir=/opt/homebrew/Cellar/mpich/3.4.2 --with-ranlib=ranlib --with-x11=1 on my MacBook Pro Apple M1 Pro under the arm64 architecture. I am running mpiexec -n 2 ../ex47 -petscpartitioner_type parmetis in $PETSC_DIR/$PETSC_ARCH/tests/dm/impls/plex/tests/runex47_0, which raises the error. The example with -petscpartitioner_type simple works fine. ex44 also works fine, even with the parmetis or ptscotch partitioners. The reason is that I defined 2 fields in the section of ex47, which is the only difference with the first test of ex44 for now. I am not allowed to push my branch to the remote repository under my username AlexisMarb although I defined an ssh key. As soon as I am able to push it I'll do it so you can check out and test. Thanks a lot for your time and help. ------------------------------------------------------------------- Alexis Marboeuf Postdoctoral fellow, Department of Mathematics & Statistics Hamilton Hall room 409B, McMaster University 1280 Main Street West, Hamilton, Ontario L8S 4K1, Canada EMail: marboeua at mcmaster.ca Tel. +1 (905) 525 9140 ext. 27031 ------------------------------------------------------------------- ________________________________
From: Matthew Knepley Sent: Friday, December 17, 2021 17:44 To: Alexis Marboeuf Cc: petsc-dev at mcs.anl.gov Subject: Re: [petsc-dev] DMPlexCreateGlobalToNaturalSF and partitioners
On Fri, Dec 17, 2021 at 3:04 PM Alexis Marboeuf > wrote: Dear PETSc Team, Following the merge request !4547 for fixing the Global To Natural map, I am implementing modifications to introduce a "natural" DM (i.e. the DM used for IO): see the discussion in https://gitlab.com/petsc/petsc/-/merge_requests/4547. I am writing an example for that, very similar to src/dm/impls/plex/tests/ex44.c added by the merge request, and running on 2 processors. The idea is to: (i) create a DM (the same distributed 2x5 mesh of ex44.c); (ii) call DMPlexDistribute multiple times with different partitioners and set one of the DMs as the "natural" DM; and (iii) see if we can reconstruct correctly the "natural" ordering and distribution from the last DM and the Global To Natural map (a small sketch of this workflow is included after the error log below). I am having some trouble, however, with at least the parmetis and ptscotch partitioners, which raise an error inside DMPlexCreateGlobalToNatural. Here is the error: The easiest thing to do here is to start a branch from the !4547 branch that I can checkout, and then tell me how to run what you are running. Thanks, Matt [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [1]PETSC ERROR: Invalid argument [1]PETSC ERROR: Input array needs to be sorted Invalid argument [0]PETSC ERROR: Input array needs to be sorted [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. [0]PETSC ERROR: [1]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
[1]PETSC ERROR: Petsc Development GIT revision: v3.16.1-435-g007f11b901 GIT Date: 2021-12-01 14:31:21 +0000 Petsc Development GIT revision: v3.16.1-435-g007f11b901 GIT Date: 2021-12-01 14:31:21 +0000 [0]PETSC ERROR: ../ex47 on a arch-darwin-c-debug named marboeua-1.math.mcmaster.ca by alexismarboeuf Thu Dec 16 22:39:20 2021 [1]PETSC ERROR: ../ex47 on a arch-darwin-c-debug named marboeua-1.math.mcmaster.ca by alexismarboeuf Thu Dec 16 22:39:20 2021 [1]PETSC ERROR: [0]PETSC ERROR: Configure options --force --download-fblaslapack=1 --download-exodusii=1 --download-hdf5=1 --download-chaco=1 --download-metis=1 --download-parmetis=1 -download-ptscotch=1 --download-sowing=1 --download-hypre=1 --download-ml=1 --download-netcdf=1 --download-yaml=1 --download-zlib=1 --download-pnetcdf=1 --download-sieve=1 --with-boost=1 --with-boost-dir=/opt/homebrew/Cellar/boost/1.76.0 with-clanguage=C++ --with-c2html=0 --with-fortran-datatypes=1 --with-mpi-dir=/opt/homebrew/Cellar/mpich/3.4.2 --with-debugging=1 --with-ranlib=ranlib --with-x11=1 [0]PETSC ERROR: Configure options --force --download-fblaslapack=1 --download-exodusii=1 --download-hdf5=1 --download-chaco=1 --download-metis=1 --download-parmetis=1 -download-ptscotch=1 --download-sowing=1 --download-hypre=1 --download-ml=1 --download-netcdf=1 --download-yaml=1 --download-zlib=1 --download-pnetcdf=1 --download-sieve=1 --with-boost=1 --with-boost-dir=/opt/homebrew/Cellar/boost/1.76.0 with-clanguage=C++ --with-c2html=0 --with-fortran-datatypes=1 --with-mpi-dir=/opt/homebrew/Cellar/mpich/3.4.2 --with-debugging=1 --with-ranlib=ranlib --with-x11=1 [1]PETSC ERROR: #1 PetscSortedRemoveDupsInt() at /Users/alexismarboeuf/Documents/petsc2/src/sys/utils/sorti.c:308 [0]PETSC ERROR: #2 PetscSFCreateEmbeddedLeafSF() at /Users/alexismarboeuf/Documents/petsc2/src/vec/is/sf/interface/sf.c:1409 [0]PETSC ERROR: #1 PetscSortedRemoveDupsInt() at /Users/alexismarboeuf/Documents/petsc2/src/sys/utils/sorti.c:308 [1]PETSC ERROR: #2 PetscSFCreateEmbeddedLeafSF() at /Users/alexismarboeuf/Documents/petsc2/src/vec/is/sf/interface/sf.c:1409 [1]PETSC ERROR: #3 DMPlexCreateGlobalToNaturalSF() at /Users/alexismarboeuf/Documents/petsc2/src/dm/impls/plex/plexnatural.c:173 [1]PETSC ERROR: #4 DMPlexDistribute() at /Users/alexismarboeuf/Documents/petsc2/src/dm/impls/plex/plexdistribute.c:1755 [1]PETSC ERROR: #5 main() at /Users/alexismarboeuf/Documents/petsc2/src/dm/impls/plex/tests/ex47.c:289 [1]PETSC ERROR: PETSc Option Table entries: [1]PETSC ERROR: -petscpartitioner_type ptscotch [1]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- Abort(62) on node 1 (rank 1 in comm 0): application called MPI_Abort(MPI_COMM_WORLD, 62) - process 1 #3 DMPlexCreateGlobalToNaturalSF() at /Users/alexismarboeuf/Documents/petsc2/src/dm/impls/plex/plexnatural.c:173 [0]PETSC ERROR: #4 DMPlexDistribute() at /Users/alexismarboeuf/Documents/petsc2/src/dm/impls/plex/plexdistribute.c:1755 [0]PETSC ERROR: #5 main() at /Users/alexismarboeuf/Documents/petsc2/src/dm/impls/plex/tests/ex47.c:289 [0]PETSC ERROR: PETSc Option Table entries: [0]PETSC ERROR: -petscpartitioner_type ptscotch [0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- Abort(62) on node 0 (rank 0 in comm 0): application called MPI_Abort(MPI_COMM_WORLD, 62) - process 0 Thanks all for your help. 
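A rough sketch of the workflow in points (i)-(iii) above (this is not the actual ex47.c; it only illustrates the calls involved, it assumes a local PetscSection has already been attached to the DM, and error checking is omitted):
DM dmDist = NULL;

DMSetUseNatural(dm, PETSC_TRUE);          /* must be set, with a section attached, before distribution */
DMPlexDistribute(dm, 0, NULL, &dmDist);   /* partitioner selected via -petscpartitioner_type           */
/* dmDist now carries the global-to-natural SF built by DMPlexCreateGlobalToNaturalSF();
   DMPlexGlobalToNaturalBegin()/End() can then map a global Vec back to the natural ordering. */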
------------------------------------------------------------------- Alexis Marboeuf Postdoctoral fellow, Department of Mathematics & Statistics Hamilton Hall room 409B, McMaster University 1280 Main Street West, Hamilton, Ontario L8S 4K1, Canada EMail: marboeua at mcmaster.ca Tel. +1 (905) 525 9140 ext. 27031 ------------------------------------------------------------------- -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL:
From alexis.marboeuf at hotmail.fr Thu Jan 6 18:08:23 2022 From: alexis.marboeuf at hotmail.fr (Alexis Marboeuf) Date: Fri, 7 Jan 2022 00:08:23 +0000 Subject: [petsc-users] Unable to push code on the project Message-ID:
Dear PETSc developer team, I wish you again all the best for this New Year 2022. I created a branch marboeuf/plex-naturaldm from the branch knepley/fix-plex-g2n with git checkout -b marboeuf/plex-naturaldm origin/knepley/fix-plex-g2n to implement the natural DM notion: see the discussion in https://gitlab.com/petsc/petsc/-/merge_requests/4547. I wrote code and committed. For now, it is only an example, cloned from src/dm/impls/plex/tests/ex44.c of knepley/fix-plex-g2n. I intend to use it to illustrate the natural DM notion. However, I am unable to push my branch to the remote repository with git push -u origin marboeuf/plex-naturaldm, getting the error
Username for 'https://gitlab.com': AlexisMarb
Password for 'https://AlexisMarb at gitlab.com':
remote: You are not allowed to push code to this project.
fatal: unable to access 'https://gitlab.com/petsc/petsc.git/': The requested URL returned error: 403
I created an ssh key and linked it with my GitLab account. What am I missing or doing wrong? Thank you all for your help. ------------------------------------------------------------------- Alexis Marboeuf Postdoctoral fellow, Department of Mathematics & Statistics Hamilton Hall room 409B, McMaster University 1280 Main Street West, Hamilton, Ontario L8S 4K1, Canada EMail: marboeua at mcmaster.ca Tel. +1 (905) 525 9140 ext. 27031 ------------------------------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL:
From thibault.bridelbertomeu at gmail.com Fri Jan 7 04:46:11 2022 From: thibault.bridelbertomeu at gmail.com (Thibault Bridel-Bertomeu) Date: Fri, 7 Jan 2022 11:46:11 +0100 Subject: [petsc-users] Fluid-Structure interaction with multiple DMPlex Message-ID:
Dear all, First off, happy new year everyone !! All the best ! I am starting to draft a new project that will be about fluid-structure interaction: in particular, the idea is to compute the Navier-Stokes (or Euler, never mind) flow around an object and _at the same time_ compute the heat equation inside the object. So basically, I am thinking of a mesh of the fluid and a mesh of the object, both meshes being linked at the fluid - solid interface. First (Matthew maybe ?)
do you think it is something that could be done using two DMPlex's that would somehow be spawned from reading a Gmsh mesh with two volumes ? And on one DMPlex we would have finite volume for the fluid, on the other finite elements for the heat eqn ? Second, is it something that anyone in the community has ever imagined doing with PETSc DMPlex's ? As I said it is very prospective, I just wanted to have your opinion !! Thanks very much in advance everyone !! Cheers, Thibault -------------- next part -------------- An HTML attachment was scrubbed... URL: From gongding at cn.cogenda.com Fri Jan 7 07:38:41 2022 From: gongding at cn.cogenda.com (Gong Ding) Date: Fri, 7 Jan 2022 21:38:41 +0800 Subject: [petsc-users] petsc __float128 krylov solver with mixed precision LU preconditioner Message-ID: <7202b876-01f4-fa2b-3dcc-5b519828eb06@cn.cogenda.com> Dear All, I had a question. For petsc configured with 128bit float, Is it possible to use 64bit direct solver as mixed precision preconditioner for krylov subspace iterative method? If so , such as MUMPS or Pardiso can be used as efficient parallel LU preconditioner I think. Regards Gong Ding -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Jan 7 07:44:37 2022 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 7 Jan 2022 08:44:37 -0500 Subject: [petsc-users] Fluid-Structure interaction with multiple DMPlex In-Reply-To: References: Message-ID: On Fri, Jan 7, 2022 at 5:46 AM Thibault Bridel-Bertomeu < thibault.bridelbertomeu at gmail.com> wrote: > Dear all, > > First of, happy new year everyone !! All the best ! > Happy New Year! > I am starting to draft a new project that will be about fluid-structure > interaction: in particular, the idea is to compute the Navier-Stokes (or > Euler nevermind) flow around an object and _at the same time_ compute the > heat equation inside the object. > So basically, I am thinking a mesh of the fluid and a mesh of the object, > both meshes being linked at the fluid - solid interface. > First question: Are these meshes intended to match on the interface? If not, this sounds like overset grids or immersed boundary/interface methods. In this case, more than one mesh makes sense to me. If they are intended to match, then I would advocate a single mesh with multiple problems defined on it. I have experimented with this, for example see SNES ex23 where I have a field in only part of the domain. I have a large project to do exactly this in a rocket engine now. > First (Matthew maybe ?) do you think it is something that could be done > using two DMPlex's that would somehow be spawned from reading a Gmsh mesh > with two volumes ? > You can take a mesh and filter out part of it with DMPlexFilter(). That is not used much so I may have to fix it to do what you want, but that should be easy. > And on one DMPlex we would have finite volume for the fluid, on the other > finite elements for the heat eqn ? > I have done this exact thing on a single mesh. It should be no harder on two meshes if you go that route. > Second, is it something that anyone in the community has ever imagined > doing with PETSc DMPlex's ? > Yes, I had a combined FV+FEM simulation of magma dynamics (I should make it an example), and currently we are doing FVM+FEM for simulation of a rocket engine. Thanks, Matt > As I said it is very prospective, I just wanted to have your opinion !! > > Thanks very much in advance everyone !! 
> > Cheers, > Thibault > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From thibault.bridelbertomeu at gmail.com Fri Jan 7 07:52:18 2022 From: thibault.bridelbertomeu at gmail.com (Thibault Bridel-Bertomeu) Date: Fri, 7 Jan 2022 14:52:18 +0100 Subject: [petsc-users] Fluid-Structure interaction with multiple DMPlex In-Reply-To: References: Message-ID: Hi Matthew, Le ven. 7 janv. 2022 ? 14:44, Matthew Knepley a ?crit : > On Fri, Jan 7, 2022 at 5:46 AM Thibault Bridel-Bertomeu < > thibault.bridelbertomeu at gmail.com> wrote: > >> Dear all, >> >> First of, happy new year everyone !! All the best ! >> > > Happy New Year! > > >> I am starting to draft a new project that will be about fluid-structure >> interaction: in particular, the idea is to compute the Navier-Stokes (or >> Euler nevermind) flow around an object and _at the same time_ compute the >> heat equation inside the object. >> So basically, I am thinking a mesh of the fluid and a mesh of the object, >> both meshes being linked at the fluid - solid interface. >> > > First question: Are these meshes intended to match on the interface? If > not, this sounds like overset grids or immersed boundary/interface methods. > In this case, more than one mesh makes sense to me. If they are intended to > match, then I would advocate a single mesh with multiple problems defined > on it. I have experimented with this, for example see SNES ex23 where I > have a field in only part of the domain. I have a large project to do > exactly this in a rocket engine now. > Yes the way I see it is more of a single mesh with two distinct regions to distinguish between the fluid and the solid. I was talking about two meshes to try and explain my vision but it seems like it was unclear. Imagine if you wish a rectangular box with a sphere inclusion: the sphere would be tagged as a solid and the rest of the domain as fluid. Using Gmsh volumes for instance. Ill check out the SNES example ! Thanks ! > >> First (Matthew maybe ?) do you think it is something that could be done >> using two DMPlex's that would somehow be spawned from reading a Gmsh mesh >> with two volumes ? >> > > You can take a mesh and filter out part of it with DMPlexFilter(). That is > not used much so I may have to fix it to do what you want, but that should > be easy. > > >> And on one DMPlex we would have finite volume for the fluid, on the other >> finite elements for the heat eqn ? >> > > I have done this exact thing on a single mesh. It should be no harder on > two meshes if you go that route. > > >> Second, is it something that anyone in the community has ever imagined >> doing with PETSc DMPlex's ? >> > > Yes, I had a combined FV+FEM simulation of magma dynamics (I should make > it an example), and currently we are doing FVM+FEM for simulation of a > rocket engine. > Wow so it seems like it?s the exact same thing I would like to achieve as the rocket engine example. So you have a single mesh and two regions tagged differently, and you use the DmPlexFilter to solve FVM and FEM separately ? Thanks ! Thibault > Thanks, > > Matt > > >> As I said it is very prospective, I just wanted to have your opinion !! >> >> Thanks very much in advance everyone !! 
>> >> Cheers, >> Thibault >> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -- Thibault Bridel-Bertomeu ? Eng, MSc, PhD Research Engineer CEA/CESTA 33114 LE BARP Tel.: (+33)557046924 Mob.: (+33)611025322 Mail: thibault.bridelbertomeu at gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Jan 7 07:54:14 2022 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 7 Jan 2022 08:54:14 -0500 Subject: [petsc-users] Fluid-Structure interaction with multiple DMPlex In-Reply-To: References: Message-ID: On Fri, Jan 7, 2022 at 8:52 AM Thibault Bridel-Bertomeu < thibault.bridelbertomeu at gmail.com> wrote: > Hi Matthew, > > Le ven. 7 janv. 2022 ? 14:44, Matthew Knepley a > ?crit : > >> On Fri, Jan 7, 2022 at 5:46 AM Thibault Bridel-Bertomeu < >> thibault.bridelbertomeu at gmail.com> wrote: >> >>> Dear all, >>> >>> First of, happy new year everyone !! All the best ! >>> >> >> Happy New Year! >> >> >>> I am starting to draft a new project that will be about fluid-structure >>> interaction: in particular, the idea is to compute the Navier-Stokes (or >>> Euler nevermind) flow around an object and _at the same time_ compute the >>> heat equation inside the object. >>> So basically, I am thinking a mesh of the fluid and a mesh of the >>> object, both meshes being linked at the fluid - solid interface. >>> >> >> First question: Are these meshes intended to match on the interface? If >> not, this sounds like overset grids or immersed boundary/interface methods. >> In this case, more than one mesh makes sense to me. If they are intended to >> match, then I would advocate a single mesh with multiple problems defined >> on it. I have experimented with this, for example see SNES ex23 where I >> have a field in only part of the domain. I have a large project to do >> exactly this in a rocket engine now. >> > > Yes the way I see it is more of a single mesh with two distinct regions to > distinguish between the fluid and the solid. I was talking about two meshes > to try and explain my vision but it seems like it was unclear. > Imagine if you wish a rectangular box with a sphere inclusion: the sphere > would be tagged as a solid and the rest of the domain as fluid. Using Gmsh > volumes for instance. > Ill check out the SNES example ! Thanks ! > > >> >>> First (Matthew maybe ?) do you think it is something that could be done >>> using two DMPlex's that would somehow be spawned from reading a Gmsh mesh >>> with two volumes ? >>> >> >> You can take a mesh and filter out part of it with DMPlexFilter(). That >> is not used much so I may have to fix it to do what you want, but that >> should be easy. >> >> >>> And on one DMPlex we would have finite volume for the fluid, on the >>> other finite elements for the heat eqn ? >>> >> >> I have done this exact thing on a single mesh. It should be no harder on >> two meshes if you go that route. >> >> >>> Second, is it something that anyone in the community has ever imagined >>> doing with PETSc DMPlex's ? >>> >> >> Yes, I had a combined FV+FEM simulation of magma dynamics (I should make >> it an example), and currently we are doing FVM+FEM for simulation of a >> rocket engine. >> > > Wow so it seems like it?s the exact same thing I would like to achieve as > the rocket engine example. 
> So you have a single mesh and two regions tagged differently, and you use > the DmPlexFilter to solve FVM and FEM separately ? > With a single mesh, you do not even need DMPlexFilter. You just use the labels that Gmsh gives you. I think we should be able to get it going in a straightforward way. Thanks, Matt > Thanks ! > > Thibault > > >> Thanks, >> >> Matt >> >> >>> As I said it is very prospective, I just wanted to have your opinion !! >>> >>> Thanks very much in advance everyone !! >>> >>> Cheers, >>> Thibault >>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- > Thibault Bridel-Bertomeu > ? > Eng, MSc, PhD > Research Engineer > CEA/CESTA > 33114 LE BARP > Tel.: (+33)557046924 > Mob.: (+33)611025322 > Mail: thibault.bridelbertomeu at gmail.com > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Fri Jan 7 10:16:49 2022 From: mfadams at lbl.gov (Mark Adams) Date: Fri, 7 Jan 2022 11:16:49 -0500 Subject: [petsc-users] petsc __float128 krylov solver with mixed precision LU preconditioner In-Reply-To: <7202b876-01f4-fa2b-3dcc-5b519828eb06@cn.cogenda.com> References: <7202b876-01f4-fa2b-3dcc-5b519828eb06@cn.cogenda.com> Message-ID: I don't believe so. Not without extreme measures. On Fri, Jan 7, 2022 at 8:39 AM Gong Ding wrote: > Dear All, > > I had a question. For petsc configured with 128bit float, Is it possible > to use 64bit direct solver as mixed precision preconditioner for krylov > subspace iterative method? > > If so , such as MUMPS or Pardiso can be used as efficient parallel LU preconditioner > I think. > > Regards > > Gong Ding > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From thibault.bridelbertomeu at gmail.com Fri Jan 7 11:57:59 2022 From: thibault.bridelbertomeu at gmail.com (Thibault Bridel-Bertomeu) Date: Fri, 7 Jan 2022 18:57:59 +0100 Subject: [petsc-users] Fluid-Structure interaction with multiple DMPlex In-Reply-To: References: Message-ID: Le ven. 7 janv. 2022 ? 14:54, Matthew Knepley a ?crit : > On Fri, Jan 7, 2022 at 8:52 AM Thibault Bridel-Bertomeu < > thibault.bridelbertomeu at gmail.com> wrote: > >> Hi Matthew, >> >> Le ven. 7 janv. 2022 ? 14:44, Matthew Knepley a >> ?crit : >> >>> On Fri, Jan 7, 2022 at 5:46 AM Thibault Bridel-Bertomeu < >>> thibault.bridelbertomeu at gmail.com> wrote: >>> >>>> Dear all, >>>> >>>> First of, happy new year everyone !! All the best ! >>>> >>> >>> Happy New Year! >>> >>> >>>> I am starting to draft a new project that will be about fluid-structure >>>> interaction: in particular, the idea is to compute the Navier-Stokes (or >>>> Euler nevermind) flow around an object and _at the same time_ compute the >>>> heat equation inside the object. >>>> So basically, I am thinking a mesh of the fluid and a mesh of the >>>> object, both meshes being linked at the fluid - solid interface. >>>> >>> >>> First question: Are these meshes intended to match on the interface? If >>> not, this sounds like overset grids or immersed boundary/interface methods. >>> In this case, more than one mesh makes sense to me. 
If they are intended to >>> match, then I would advocate a single mesh with multiple problems defined >>> on it. I have experimented with this, for example see SNES ex23 where I >>> have a field in only part of the domain. I have a large project to do >>> exactly this in a rocket engine now. >>> >> >> Yes the way I see it is more of a single mesh with two distinct regions >> to distinguish between the fluid and the solid. I was talking about two >> meshes to try and explain my vision but it seems like it was unclear. >> Imagine if you wish a rectangular box with a sphere inclusion: the sphere >> would be tagged as a solid and the rest of the domain as fluid. Using Gmsh >> volumes for instance. >> Ill check out the SNES example ! Thanks ! >> >> >>> >>>> First (Matthew maybe ?) do you think it is something that could be done >>>> using two DMPlex's that would somehow be spawned from reading a Gmsh mesh >>>> with two volumes ? >>>> >>> >>> You can take a mesh and filter out part of it with DMPlexFilter(). That >>> is not used much so I may have to fix it to do what you want, but that >>> should be easy. >>> >>> >>>> And on one DMPlex we would have finite volume for the fluid, on the >>>> other finite elements for the heat eqn ? >>>> >>> >>> I have done this exact thing on a single mesh. It should be no harder on >>> two meshes if you go that route. >>> >>> >>>> Second, is it something that anyone in the community has ever imagined >>>> doing with PETSc DMPlex's ? >>>> >>> >>> Yes, I had a combined FV+FEM simulation of magma dynamics (I should make >>> it an example), and currently we are doing FVM+FEM for simulation of a >>> rocket engine. >>> >> >> Wow so it seems like it?s the exact same thing I would like to achieve as >> the rocket engine example. >> So you have a single mesh and two regions tagged differently, and you use >> the DmPlexFilter to solve FVM and FEM separately ? >> > > With a single mesh, you do not even need DMPlexFilter. You just use the > labels that Gmsh gives you. I think we should be able to get it going in a > straightforward way. > Ok then ! Thanks ! I?ll give it a shot and see what happens ! Setting up the FVM and FEM discretizations will pass by DMSetField right ? With a single mesh tagged with two different regions, it should show up as two fields, is that correct ? Thanks, Thibault > Thanks, > > Matt > > >> Thanks ! >> >> Thibault >> >> >>> Thanks, >>> >>> Matt >>> >>> >>>> As I said it is very prospective, I just wanted to have your opinion !! >>>> >>>> Thanks very much in advance everyone !! >>>> >>>> Cheers, >>>> Thibault >>>> >>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> -- >> Thibault Bridel-Bertomeu >> ? >> Eng, MSc, PhD >> Research Engineer >> CEA/CESTA >> 33114 LE BARP >> Tel.: (+33)557046924 >> Mob.: (+33)611025322 >> Mail: thibault.bridelbertomeu at gmail.com >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -- Thibault Bridel-Bertomeu ? Eng, MSc, PhD Research Engineer CEA/CESTA 33114 LE BARP Tel.: (+33)557046924 Mob.: (+33)611025322 Mail: thibault.bridelbertomeu at gmail.com -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Fri Jan 7 12:22:52 2022 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 7 Jan 2022 13:22:52 -0500 Subject: [petsc-users] Fluid-Structure interaction with multiple DMPlex In-Reply-To: References: Message-ID: On Fri, Jan 7, 2022 at 12:58 PM Thibault Bridel-Bertomeu < thibault.bridelbertomeu at gmail.com> wrote: > > Le ven. 7 janv. 2022 ? 14:54, Matthew Knepley a > ?crit : > >> On Fri, Jan 7, 2022 at 8:52 AM Thibault Bridel-Bertomeu < >> thibault.bridelbertomeu at gmail.com> wrote: >> >>> Hi Matthew, >>> >>> Le ven. 7 janv. 2022 ? 14:44, Matthew Knepley a >>> ?crit : >>> >>>> On Fri, Jan 7, 2022 at 5:46 AM Thibault Bridel-Bertomeu < >>>> thibault.bridelbertomeu at gmail.com> wrote: >>>> >>>>> Dear all, >>>>> >>>>> First of, happy new year everyone !! All the best ! >>>>> >>>> >>>> Happy New Year! >>>> >>>> >>>>> I am starting to draft a new project that will be about >>>>> fluid-structure interaction: in particular, the idea is to compute the >>>>> Navier-Stokes (or Euler nevermind) flow around an object and _at the same >>>>> time_ compute the heat equation inside the object. >>>>> So basically, I am thinking a mesh of the fluid and a mesh of the >>>>> object, both meshes being linked at the fluid - solid interface. >>>>> >>>> >>>> First question: Are these meshes intended to match on the interface? If >>>> not, this sounds like overset grids or immersed boundary/interface methods. >>>> In this case, more than one mesh makes sense to me. If they are intended to >>>> match, then I would advocate a single mesh with multiple problems defined >>>> on it. I have experimented with this, for example see SNES ex23 where I >>>> have a field in only part of the domain. I have a large project to do >>>> exactly this in a rocket engine now. >>>> >>> >>> Yes the way I see it is more of a single mesh with two distinct regions >>> to distinguish between the fluid and the solid. I was talking about two >>> meshes to try and explain my vision but it seems like it was unclear. >>> Imagine if you wish a rectangular box with a sphere inclusion: the >>> sphere would be tagged as a solid and the rest of the domain as fluid. >>> Using Gmsh volumes for instance. >>> Ill check out the SNES example ! Thanks ! >>> >>> >>>> >>>>> First (Matthew maybe ?) do you think it is something that could be >>>>> done using two DMPlex's that would somehow be spawned from reading a Gmsh >>>>> mesh with two volumes ? >>>>> >>>> >>>> You can take a mesh and filter out part of it with DMPlexFilter(). That >>>> is not used much so I may have to fix it to do what you want, but that >>>> should be easy. >>>> >>>> >>>>> And on one DMPlex we would have finite volume for the fluid, on the >>>>> other finite elements for the heat eqn ? >>>>> >>>> >>>> I have done this exact thing on a single mesh. It should be no harder >>>> on two meshes if you go that route. >>>> >>>> >>>>> Second, is it something that anyone in the community has ever imagined >>>>> doing with PETSc DMPlex's ? >>>>> >>>> >>>> Yes, I had a combined FV+FEM simulation of magma dynamics (I should >>>> make it an example), and currently we are doing FVM+FEM for simulation of a >>>> rocket engine. >>>> >>> >>> Wow so it seems like it?s the exact same thing I would like to achieve >>> as the rocket engine example. >>> So you have a single mesh and two regions tagged differently, and you >>> use the DmPlexFilter to solve FVM and FEM separately ? >>> >> >> With a single mesh, you do not even need DMPlexFilter. 
You just use the >> labels that Gmsh gives you. I think we should be able to get it going in a >> straightforward way. >> > > Ok then ! Thanks ! I?ll give it a shot and see what happens ! > Setting up the FVM and FEM discretizations will pass by DMSetField right ? > With a single mesh tagged with two different regions, it should show up as > two fields, is that correct ? > Yes, the idea is as follows. Each field also has a label argument that is the support of the field in the domain. Then we create PetscDS objects for each separate set of overlapping fields. The current algorithm is not complete I think, so let me know if this step fails. Thanks, Matt > Thanks, > > Thibault > > >> Thanks, >> >> Matt >> >> >>> Thanks ! >>> >>> Thibault >>> >>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> As I said it is very prospective, I just wanted to have your opinion !! >>>>> >>>>> Thanks very much in advance everyone !! >>>>> >>>>> Cheers, >>>>> Thibault >>>>> >>>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> -- >>> Thibault Bridel-Bertomeu >>> ? >>> Eng, MSc, PhD >>> Research Engineer >>> CEA/CESTA >>> 33114 LE BARP >>> Tel.: (+33)557046924 >>> Mob.: (+33)611025322 >>> Mail: thibault.bridelbertomeu at gmail.com >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- > Thibault Bridel-Bertomeu > ? > Eng, MSc, PhD > Research Engineer > CEA/CESTA > 33114 LE BARP > Tel.: (+33)557046924 > Mob.: (+33)611025322 > Mail: thibault.bridelbertomeu at gmail.com > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From thibault.bridelbertomeu at gmail.com Fri Jan 7 12:45:11 2022 From: thibault.bridelbertomeu at gmail.com (Thibault Bridel-Bertomeu) Date: Fri, 7 Jan 2022 19:45:11 +0100 Subject: [petsc-users] Fluid-Structure interaction with multiple DMPlex In-Reply-To: References: Message-ID: Le ven. 7 janv. 2022 ? 19:23, Matthew Knepley a ?crit : > On Fri, Jan 7, 2022 at 12:58 PM Thibault Bridel-Bertomeu < > thibault.bridelbertomeu at gmail.com> wrote: > >> >> Le ven. 7 janv. 2022 ? 14:54, Matthew Knepley a >> ?crit : >> >>> On Fri, Jan 7, 2022 at 8:52 AM Thibault Bridel-Bertomeu < >>> thibault.bridelbertomeu at gmail.com> wrote: >>> >>>> Hi Matthew, >>>> >>>> Le ven. 7 janv. 2022 ? 14:44, Matthew Knepley a >>>> ?crit : >>>> >>>>> On Fri, Jan 7, 2022 at 5:46 AM Thibault Bridel-Bertomeu < >>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>> >>>>>> Dear all, >>>>>> >>>>>> First of, happy new year everyone !! All the best ! >>>>>> >>>>> >>>>> Happy New Year! >>>>> >>>>> >>>>>> I am starting to draft a new project that will be about >>>>>> fluid-structure interaction: in particular, the idea is to compute the >>>>>> Navier-Stokes (or Euler nevermind) flow around an object and _at the same >>>>>> time_ compute the heat equation inside the object. 
>>>>>> So basically, I am thinking a mesh of the fluid and a mesh of the >>>>>> object, both meshes being linked at the fluid - solid interface. >>>>>> >>>>> >>>>> First question: Are these meshes intended to match on the interface? >>>>> If not, this sounds like overset grids or immersed boundary/interface >>>>> methods. In this case, more than one mesh makes sense to me. If they are >>>>> intended to match, then I would advocate a single mesh with multiple >>>>> problems defined on it. I have experimented with this, for example see SNES >>>>> ex23 where I have a field in only part of the domain. I have a large >>>>> project to do exactly this in a rocket engine now. >>>>> >>>> >>>> Yes the way I see it is more of a single mesh with two distinct regions >>>> to distinguish between the fluid and the solid. I was talking about two >>>> meshes to try and explain my vision but it seems like it was unclear. >>>> Imagine if you wish a rectangular box with a sphere inclusion: the >>>> sphere would be tagged as a solid and the rest of the domain as fluid. >>>> Using Gmsh volumes for instance. >>>> Ill check out the SNES example ! Thanks ! >>>> >>>> >>>>> >>>>>> First (Matthew maybe ?) do you think it is something that could be >>>>>> done using two DMPlex's that would somehow be spawned from reading a Gmsh >>>>>> mesh with two volumes ? >>>>>> >>>>> >>>>> You can take a mesh and filter out part of it with DMPlexFilter(). >>>>> That is not used much so I may have to fix it to do what you want, but that >>>>> should be easy. >>>>> >>>>> >>>>>> And on one DMPlex we would have finite volume for the fluid, on the >>>>>> other finite elements for the heat eqn ? >>>>>> >>>>> >>>>> I have done this exact thing on a single mesh. It should be no harder >>>>> on two meshes if you go that route. >>>>> >>>>> >>>>>> Second, is it something that anyone in the community has ever >>>>>> imagined doing with PETSc DMPlex's ? >>>>>> >>>>> >>>>> Yes, I had a combined FV+FEM simulation of magma dynamics (I should >>>>> make it an example), and currently we are doing FVM+FEM for simulation of a >>>>> rocket engine. >>>>> >>>> >>>> Wow so it seems like it?s the exact same thing I would like to achieve >>>> as the rocket engine example. >>>> So you have a single mesh and two regions tagged differently, and you >>>> use the DmPlexFilter to solve FVM and FEM separately ? >>>> >>> >>> With a single mesh, you do not even need DMPlexFilter. You just use the >>> labels that Gmsh gives you. I think we should be able to get it going in a >>> straightforward way. >>> >> >> Ok then ! Thanks ! I?ll give it a shot and see what happens ! >> Setting up the FVM and FEM discretizations will pass by DMSetField right >> ? With a single mesh tagged with two different regions, it should show up >> as two fields, is that correct ? >> > > Yes, the idea is as follows. Each field also has a label argument that is > the support of the field in the domain. Then we create PetscDS objects for > each > separate set of overlapping fields. The current algorithm is not complete > I think, so let me know if this step fails. > Ok, thanks. I?ll let you know and share snippets when I have something started ! Talk soon ! Thanks ! Thibault > Thanks, > > Matt > > >> Thanks, >> >> Thibault >> >> >>> Thanks, >>> >>> Matt >>> >>> >>>> Thanks ! >>>> >>>> Thibault >>>> >>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> >>>>>> As I said it is very prospective, I just wanted to have your opinion >>>>>> !! >>>>>> >>>>>> Thanks very much in advance everyone !! 
>>>>>> >>>>>> Cheers, >>>>>> Thibault >>>>>> >>>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ >>>>> >>>> -- >>>> Thibault Bridel-Bertomeu >>>> ? >>>> Eng, MSc, PhD >>>> Research Engineer >>>> CEA/CESTA >>>> 33114 LE BARP >>>> Tel.: (+33)557046924 >>>> Mob.: (+33)611025322 >>>> Mail: thibault.bridelbertomeu at gmail.com >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >> -- >> Thibault Bridel-Bertomeu >> ? >> Eng, MSc, PhD >> Research Engineer >> CEA/CESTA >> 33114 LE BARP >> Tel.: (+33)557046924 >> Mob.: (+33)611025322 >> Mail: thibault.bridelbertomeu at gmail.com > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL:
From FERRANJ2 at my.erau.edu Fri Jan 7 18:10:34 2022 From: FERRANJ2 at my.erau.edu (Ferrand, Jesus A.) Date: Sat, 8 Jan 2022 00:10:34 +0000 Subject: [petsc-users] [EXTERNAL] Re: DM misuse causes massive memory leak? In-Reply-To: References: <87tueraunm.fsf@jedbrown.org> <87sfu2921h.fsf@jedbrown.org> <87a6g99507.fsf@jedbrown.org> Message-ID:
Matt: Thanks again for the prompt reply : ) You were right regarding the PetscSections. I reprofiled my code where I use them and realized that I had DMCreateMatrix() in the mix. The brunt of the overhead came from it, not the PetscSections. I looked for ways to improve the time on DMCreateMatrix() and figured I could call DMSetMatType() to work with a BAIJ + MatSetValuesBlocked() instead of AIJ + MatSetValues() to leverage my dof blocks (ux, uy, uz) during assembly (a condensed sketch of this route is included after this message). My matrix assembly times have improved, but DMCreateMatrix() seems to take just as long as before. I noticed after calling DMGetBlockSize() that my DM is working with a block size of 1. Weirdly enough, my call to KSPView() returns a block size of 3 (as it should be) for the matrix that came from DMCreateMatrix(). I further noticed after looking at the DM function index that there is no "DMSetBlockSize()"; however, there is in fact a DMGetBlockSize(). Anyways, any suggestions on how to speed up DMCreateMatrix()? I have tried calling both DMSetMatrixPreallocateOnly() and DMSetAdjacency() (with the flags for FEM) but barely got a speedup. Regarding DMPlexGetConeRecursiveVertices() vs. DMPlexGetTransitiveClosure() for BC's: I will (for now) side with the former as I am getting better times with it for the BC's at least. My gripe with DMPlexGetTransitiveClosure() is that it returns the whole closure as opposed to subsets of it (i.e., for a cell, it returns the constituent faces, edges, and vertices as opposed to, say, just the vertices). Currently, I'm working with P1 elements only so I need only vertices. To me, therefore, the additional info seems wasteful for P1. For Pn (n>1), DMPlexGetTransitiveClosure() seems like the superior option, though.
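A condensed sketch of the BAIJ route described in the message above (it only restates what is described, not the actual code; blockIdx and KEdata stand for one element's block vertex indices and its 12x12 stiffness entries, which are filled elsewhere; error checking omitted):
Mat         K;
PetscInt    bs;
PetscInt    blockIdx[4];      /* global (block) vertex numbers of one tetra4 element  */
PetscScalar *KEdata;          /* 12x12 element stiffness, 3 contiguous dofs per node  */

DMSetMatType(dm, MATBAIJ);    /* ask the DM for block AIJ instead of AIJ              */
DMCreateMatrix(dm, &K);       /* still preallocated from the DM's local section       */
MatGetBlockSize(K, &bs);      /* expect bs = 3 for the (ux,uy,uz) block               */
MatSetValuesBlocked(K, 4, blockIdx, 4, blockIdx, KEdata, ADD_VALUES);
MatAssemblyBegin(K, MAT_FINAL_ASSEMBLY);
MatAssemblyEnd(K, MAT_FINAL_ASSEMBLY);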
________________________________
From: Matthew Knepley Sent: Thursday, January 6, 2022 5:39 PM To: Ferrand, Jesus A. Cc: Jed Brown ; petsc-users Subject: Re: [petsc-users] [EXTERNAL] Re: DM misuse causes massive memory leak?
On Thu, Jan 6, 2022 at 5:30 PM Ferrand, Jesus A. > wrote: Matt: Thanks for the immediate reply. My apologies, I misspelled the function name, it should have been "DMPlexGetConeRecursiveVertices()." Oh, this is a debugging function that Vaclav wrote. It should never be used in production. It allocates memory all over the place. If you want vertices, just get the closure and filter out the vertices:
PetscInt *closure = NULL;
PetscInt  Ncl, Nclv, cl;
DMPlexGetDepthStratum(dm, 0, &vStart, &vEnd);
DMPlexGetTransitiveClosure(dm, cell, PETSC_TRUE, &Ncl, &closure);
Nclv = 0;
for (cl = 0; cl < 2*Ncl; cl += 2) {
  const PetscInt point = closure[cl];
  if ((point >= vStart) && (point < vEnd)) closure[Nclv++] = point;
}
DMPlexRestoreTransitiveClosure(dm, cell, PETSC_TRUE, &Ncl, &closure);
Regarding the PetscSection use: I am looping through every single point in the DAG of my mesh. For each point I am assigning dof using PetscSectionSetDof(). I am also assigning dof to the corresponding fields using PetscSectionSetFieldDof(). I took Jed's advice and made a single field with 3 components, and I named all of them. So, I used PetscSectionSetNumFields(), PetscSectionSetFieldComponents(), PetscSectionSetFieldName(), and PetscSectionSetComponentName(). Finally, I proceed to PetscSectionSetUp() and DMSetLocalSection(). In my timed code blocks I am including DMPlexGetDepthStratum(), DMGetStratumSize(), and DMPlexGetChart() because I need their output to assign the dof to the PetscSection. This should all take almost no time. There are no expensive operations there. Thanks, Matt For what it's worth, I have my configure set to --with-debugging=1.
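For reference, the single-field setup described above amounts to something like the following sketch (vertex dofs only; the variable names and the surrounding dm are assumptions, and error checking is omitted):
PetscSection   s;
PetscInt       pStart, pEnd, vStart, vEnd, v;
const PetscInt dof = 3;

PetscSectionCreate(PETSC_COMM_WORLD, &s);
PetscSectionSetNumFields(s, 1);
PetscSectionSetFieldName(s, 0, "Displacement");
PetscSectionSetFieldComponents(s, 0, dof);
PetscSectionSetComponentName(s, 0, 0, "ux");
PetscSectionSetComponentName(s, 0, 1, "uy");
PetscSectionSetComponentName(s, 0, 2, "uz");
DMPlexGetChart(dm, &pStart, &pEnd);
PetscSectionSetChart(s, pStart, pEnd);
DMPlexGetDepthStratum(dm, 0, &vStart, &vEnd);   /* vertices */
for (v = vStart; v < vEnd; ++v) {
  PetscSectionSetDof(s, v, dof);                /* 3 dofs per vertex      */
  PetscSectionSetFieldDof(s, v, 0, dof);        /* all of them in field 0 */
}
PetscSectionSetUp(s);
DMSetLocalSection(dm, s);
PetscSectionDestroy(&s);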
Are you sure you are measuring long times there? What are you using it to do? Thanks, Matt ________________________________ From: Jed Brown > Sent: Wednesday, January 5, 2022 5:44 PM To: Ferrand, Jesus A. > Cc: petsc-users > Subject: Re: [EXTERNAL] Re: [petsc-users] DM misuse causes massive memory leak? For something like displacement (and this sounds like elasticity), I would recommend using one field with three components. You can constrain a subset of the components to implement slip conditions. You can use DMPlexLabelComplete(dm, label) to propagate those face labels to vertices. "Ferrand, Jesus A." > writes: > Thanks for the reply (I hit reply all this time). > > So, I set 3 fields: > /* > ierr = PetscSectionSetNumFields(s,dof); CHKERRQ(ierr); > ierr = PetscSectionSetFieldName(s,0, "X-Displacement"); CHKERRQ(ierr); //Field ID is 0 > ierr = PetscSectionSetFieldName(s,1, "Y-Displacement"); CHKERRQ(ierr); //Field ID is 1 > ierr = PetscSectionSetFieldName(s,2, "Z-Displacement"); CHKERRQ(ierr); //Field ID is 2 > */ > > I then loop through the vertices of my DMPlex > > /* > for(ii = vStart; ii < vEnd; ii++){//Vertex loop. > ierr = PetscSectionSetDof(s, ii, dof); CHKERRQ(ierr); > ierr = PetscSectionSetFieldDof(s,ii,0,1); CHKERRQ(ierr);//One X-displacement per vertex (1 dof) > ierr = PetscSectionSetFieldDof(s,ii,1,1); CHKERRQ(ierr);//One Y-displacement per vertex (1 dof) > ierr = PetscSectionSetFieldDof(s,ii,2,1); CHKERRQ(ierr);//One Z-displacement per vertex (1 dof) > }//Sets x, y, and z displacements as dofs. > */ > > I only associated fields with vertices, not with any other points in the DAG. Regarding the use of DMAddBoundary(), I mostly copied the usage shown in SNES example 77. I modified the function definition to simply set the dof to 0.0 as opposed to the coordinates. Below "physicalgroups" is the DMLabel that I got from gmsh, this flags Face points, not vertices. That is why I think the error log suggests that fields were never set. > > ierr = DMAddBoundary(dm, DM_BC_ESSENTIAL, "fixed", physicalgroups, 1, &surfvals[ii], fieldID, 0, NULL, (void (*)(void)) coordinates, NULL, NULL, NULL); CHKERRQ(ierr); > PetscErrorCode coordinates(PetscInt dim, PetscReal time, const PetscReal x[], PetscInt Nf, PetscScalar *u, void *ctx){ > const PetscInt Ncomp = dim; > PetscInt comp; > for (comp = 0; comp < Ncomp; ++comp) u[comp] = 0.0; > return 0; > } > > > ________________________________ > From: Jed Brown > > Sent: Wednesday, January 5, 2022 12:36 AM > To: Ferrand, Jesus A. > > Cc: petsc-users > > Subject: Re: [EXTERNAL] Re: [petsc-users] DM misuse causes massive memory leak? > > Please "reply all" include the list in the future. > > "Ferrand, Jesus A." > writes: > >> Forgot to say thanks for the reply (my bad). >> Yes, I was indeed forgetting to pre-allocate the sparse matrices when doing them by hand (complacency seeded by MATLAB's zeros()). Thank you, Jed, and Jeremy, for the hints! >> >> I have more questions, these ones about boundary conditions (I think these are for Matt). >> In my current code I set Dirichlet conditions directly on a Mat by calling MatSetZeroRows(). I profiled my code and found the part that applies them to be unnacceptably slow. In response, I've been trying to better pre-allocate Mats using PetscSections. 
I have found documentation for PetscSectionSetDof(), PetscSectionSetNumFields(), PetscSectionSetFieldName(), and PetscSectionSetFieldDof(), to set the size of my Mats and Vecs by calling DMSetLocalSection() followed by DMCreateMatrix() and DMCreateLocalVector() to get a RHS vector. This seems faster. >> >> In PetscSection, what is the difference between a "field" and a "component"? For example, I could have one field "Velocity" with three components ux, uy, and uz or perhaps three fields ux, uy, and uz each with a default component? > > It's just how you name them and how they appear in output. Usually "velocity" is better as a field with three components, but fields with other meaning (and perhaps different finite element spaces), such as pressure, would be different fields. Different components are always in the same FE space. > >> I am struggling now to impose boundary conditions after constraining dofs using PetscSection. My understanding is that constraining dof's reduces the size of the DM's matrix but it does not give the DM knowledge of what values the constrained dofs should have, right? >> >> I know that there is DMAddBoundary(), but I am unsure of how to use it. >From Gmsh I have a mesh with surface boundaries flagged. I'm not sure whether DMAddBoundary()will constrain the face, edge, or vertex points when I give it the DMLabel from Gmsh. (I need specific dof on the vertices to be constrained). I did some testing and I think DMAddBoundary() attempts to constrain the Face points (see error log below). I only associated fields with the vertices but not the Faces. I can extract the vertex points from the face label using DMPlexGetConeRecursiveVertices() but the output IS has repeated entries for the vertex points (many faces share the same vertex). Is there an easier way to get the vertex points from a gmsh surface tag? > > How did you call DMAddBoundary()? Are you using DM_BC_ESSENTIAL and a callback function that provides the inhomogeneous boundary condition? > >> I'm sorry this is a mouthful. >> >> [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [0]PETSC ERROR: Argument out of range >> [0]PETSC ERROR: Field number 2 must be in [0, 0) > > It looks like you haven't added these fields yet. > >> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. >> [0]PETSC ERROR: Petsc Release Version 3.16.0, unknown >> [0]PETSC ERROR: ./gmsh4 on a linux-c-dbg named F86 by jesus Tue Jan 4 15:19:57 2022 >> [0]PETSC ERROR: Configure options --with-32bits-pci-domain=1 --with-debugging =1 >> [0]PETSC ERROR: #1 DMGetField() at /home/jesus/SAND/PETSc_install/petsc/src/dm/interface/dm.c:4803 >> [0]PETSC ERROR: #2 DMCompleteBoundaryLabel_Internal() at /home/jesus/SAND/PETSc_install/petsc/src/dm/interface/dm.c:5140 >> [0]PETSC ERROR: #3 DMAddBoundary() at /home/jesus/SAND/PETSc_install/petsc/src/dm/interface/dm.c:8561 >> [0]PETSC ERROR: #4 main() at /home/jesus/SAND/FEA/3D/gmshBACKUP4.c:215 >> [0]PETSC ERROR: No PETSc Option Table entries >> [0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- >> >> >> >> >> >> >> ________________________________ >> From: Jed Brown > >> Sent: Wednesday, December 29, 2021 5:55 PM >> To: Ferrand, Jesus A. >; petsc-users at mcs.anl.gov > >> Subject: [EXTERNAL] Re: [petsc-users] DM misuse causes massive memory leak? >> >> CAUTION: This email originated outside of Embry-Riddle Aeronautical University. 
Do not click links or open attachments unless you recognize the sender and know the content is safe. >> >> >> "Ferrand, Jesus A." > writes: >> >>> Dear PETSc Team: >>> >>> I have a question about DM and PetscSection. Say I import a mesh (for FEM purposes) and create a DMPlex for it. I then use PetscSections to set degrees of freedom per "point" (by point I mean vertices, lines, faces, and cells). I then use PetscSectionGetStorageSize() to get the size of the global stiffness matrix (K) needed for my FEM problem. One last detail, this K I populate inside a rather large loop using an element stiffness matrix function of my own. Instead of using DMCreateMatrix(), I manually created a Mat using MatCreate(), MatSetType(), MatSetSizes(), and MatSetUp(). I come to find that said loop is painfully slow when I use the manually created matrix, but 20x faster when I use the Mat coming out of DMCreateMatrix(). >> >> The sparse matrix hasn't been preallocated, which forces the data structure to do a lot of copies (as bad as O(n^2) complexity). DMCreateMatrix() preallocates for you. >> >> https://petsc.org/release/docs/manual/performance/#memory-allocation-for-sparse-matrix-assembly >> https://petsc.org/release/docs/manual/mat/#sec-matsparse >> >>> My question is then: Is the manual Mat a noob mistake and is it somehow creating a memory leak with K? Just in case it's something else I'm attaching the code. The loop that populates K is between lines 221 and 278. Anything related to DM, DMPlex, and PetscSection is between lines 117 and 180. >>> >>> Machine Type: HP Laptop >>> C-compiler: Gnu C >>> OS: Ubuntu 20.04 >>> PETSc version: 3.16.0 >>> MPI Implementation: MPICH >>> >>> Hope you all had a Merry Christmas and that you will have a happy and productive New Year. :D >>> >>> >>> Sincerely: >>> >>> J.A. Ferrand >>> >>> Embry-Riddle Aeronautical University - Daytona Beach FL >>> >>> M.Sc. Aerospace Engineering | May 2022 >>> >>> B.Sc. Aerospace Engineering >>> >>> B.Sc. Computational Mathematics >>> >>> >>> >>> Sigma Gamma Tau >>> >>> Tau Beta Pi >>> >>> Honors Program >>> >>> >>> >>> Phone: (386)-843-1829 >>> >>> Email(s): ferranj2 at my.erau.edu >>> >>> jesus.ferrand at gmail.com >>> //REFERENCE: https://github.com/FreeFem/FreeFem-sources/blob/master/plugin/mpi/PETSc-code.hpp >>> #include >>> static char help[] = "Imports a Gmsh mesh with boundary conditions and solves the elasticity equation.\n" >>> "Option prefix = opt_.\n"; >>> >>> struct preKE{//Preallocation before computing KE >>> Mat matB, >>> matBTCB; >>> //matKE; >>> PetscInt x_insert[3], >>> y_insert[3], >>> z_insert[3], >>> m,//Looping variables. >>> sizeKE,//size of the element stiffness matrix. >>> N,//Number of nodes in element. >>> x_in,y_in,z_in; //LI to index B matrix. >>> PetscReal J[3][3],//Jacobian matrix. >>> invJ[3][3],//Inverse of the Jacobian matrix. >>> detJ,//Determinant of the Jacobian. >>> dX[3], >>> dY[3], >>> dZ[3], >>> minor00, >>> minor01, >>> minor02,//Determinants of minors in a 3x3 matrix. >>> dPsidX, dPsidY, dPsidZ,//Shape function derivatives w.r.t global coordinates. >>> weight,//Multiplier of quadrature weights. >>> *dPsidXi,//Derivatives of shape functions w.r.t. Xi. >>> *dPsidEta,//Derivatives of shape functions w.r.t. Eta. >>> *dPsidZeta;//Derivatives of shape functions w.r.t Zeta. >>> PetscErrorCode ierr; >>> };//end struct. >>> >>> //Function declarations. 
>>> extern PetscErrorCode tetra4(PetscScalar*, PetscScalar*, PetscScalar*,struct preKE*, Mat*, Mat*); >>> extern PetscErrorCode ConstitutiveMatrix(Mat*,const char*,PetscInt); >>> extern PetscErrorCode InitializeKEpreallocation(struct preKE*,const char*); >>> >>> PetscErrorCode PetscViewerVTKWriteFunction(PetscObject vec,PetscViewer viewer){ >>> PetscErrorCode ierr; >>> ierr = VecView((Vec)vec,viewer); CHKERRQ(ierr); >>> return ierr; >>> } >>> >>> >>> >>> >>> int main(int argc, char **args){ >>> //DEFINITIONS OF PETSC's DMPLEX LINGO: >>> //POINT: A topology element (cell, face, edge, or vertex). >>> //CHART: It an interval from 0 to the number of "points." (the range of admissible linear indices) >>> //STRATUM: A subset of the "chart" which corresponds to all "points" at a given "level." >>> //LEVEL: This is either a "depth" or a "height". >>> //HEIGHT: Dimensionality of an element measured from 0D to 3D. Heights: cell = 0, face = 1, edge = 2, vertex = 3. >>> //DEPTH: Dimensionality of an element measured from 3D to 0D. Depths: cell = 3, face = 2, edge = 1, vertex = 0; >>> //CLOSURE: *of an element is the collection of all other elements that define it.I.e., the closure of a surface is the collection of vertices and edges that make it up. >>> //STAR: >>> //STANDARD LABELS: These are default tags that DMPlex has for its topology. ("depth") >>> PetscErrorCode ierr;//Error tracking variable. >>> DM dm;//Distributed memory object (useful for managing grids.) >>> DMLabel physicalgroups;//Identifies user-specified tags in gmsh (to impose BC's). >>> DMPolytopeType celltype;//When looping through cells, determines its type (tetrahedron, pyramid, hexahedron, etc.) >>> PetscSection s; >>> KSP ksp;//Krylov Sub-Space (linear solver object) >>> Mat K,//Global stiffness matrix (Square, assume unsymmetric). >>> KE,//Element stiffness matrix (Square, assume unsymmetric). >>> matC;//Constitutive matrix. >>> Vec XYZ,//Coordinate vector, contains spatial locations of mesh's vertices (NOTE: This vector self-destroys!). >>> U,//Displacement vector. >>> F;//Load Vector. >>> PetscViewer XYZviewer,//Viewer object to output mesh to ASCII format. >>> XYZpUviewer; //Viewer object to output displacements to ASCII format. >>> PetscBool interpolate = PETSC_TRUE,//Instructs Gmsh importer whether to generate faces and edges (Needed when using P2 or higher elements). >>> useCone = PETSC_TRUE,//Instructs "DMPlexGetTransitiveClosure()" whether to extract the closure or the star. >>> dirichletBC = PETSC_FALSE,//For use when assembling the K matrix. >>> neumannBC = PETSC_FALSE,//For use when assembling the F vector. >>> saveASCII = PETSC_FALSE,//Whether to save results in ASCII format. >>> saveVTK = PETSC_FALSE;//Whether to save results as VTK format. >>> PetscInt nc,//number of cells. (PETSc lingo for "elements") >>> nv,//number of vertices. (PETSc lingo for "nodes") >>> nf,//number of faces. (PETSc lingo for "surfaces") >>> ne,//number of edges. (PETSc lingo for "lines") >>> pStart,//starting LI of global elements. >>> pEnd,//ending LI of all elements. >>> cStart,//starting LI for cells global arrangement. >>> cEnd,//ending LI for cells in global arrangement. >>> vStart,//starting LI for vertices in global arrangement. >>> vEnd,//ending LI for vertices in global arrangement. >>> fStart,//starting LI for faces in global arrangement. >>> fEnd,//ending LI for faces in global arrangement. >>> eStart,//starting LI for edges in global arrangement. >>> eEnd,//ending LI for edges in global arrangement. 
>>> sizeK,//Size of the element stiffness matrix. >>> ii,jj,kk,//Dedicated looping variables. >>> indexXYZ,//Variable to access the elements of XYZ vector. >>> indexK,//Variable to access the elements of the U and F vectors (can reference rows and colums of K matrix.) >>> *closure = PETSC_NULL,//Pointer to the closure elements of a cell. >>> size_closure,//Size of the closure of a cell. >>> dim,//Dimension of the mesh. >>> //*edof,//Linear indices of dof's inside the K matrix. >>> dof = 3,//Degrees of freedom per node. >>> cells=0, edges=0, vertices=0, faces=0,//Topology counters when looping through cells. >>> pinXcode=10, pinZcode=11,forceZcode=12;//Gmsh codes to extract relevant "Face Sets." >>> PetscReal //*x_el,//Pointer to a vector that will store the x-coordinates of an element's vertices. >>> //*y_el,//Pointer to a vector that will store the y-coordinates of an element's vertices. >>> //*z_el,//Pointer to a vector that will store the z-coordinates of an element's vertices. >>> *xyz_el,//Pointer to xyz array in the XYZ vector. >>> traction = -10, >>> *KEdata, >>> t1,t2; //time keepers. >>> const char *gmshfile = "TopOptmeshfine2.msh";//Name of gmsh file to import. >>> >>> ierr = PetscInitialize(&argc,&args,NULL,help); if(ierr) return ierr; //And the machine shall work.... >>> >>> //MESH IMPORT================================================================= >>> //IMPORTANT NOTE: Gmsh only creates "cells" and "vertices," it does not create the "faces" or "edges." >>> //Gmsh probably can generate them, must figure out how to. >>> t1 = MPI_Wtime(); >>> ierr = DMPlexCreateGmshFromFile(PETSC_COMM_WORLD,gmshfile,interpolate,&dm); CHKERRQ(ierr);//Read Gmsh file and generate the DMPlex. >>> ierr = DMGetDimension(dm, &dim); CHKERRQ(ierr);//1-D, 2-D, or 3-D >>> ierr = DMPlexGetChart(dm, &pStart, &pEnd); CHKERRQ(ierr);//Extracts linear indices of cells, vertices, faces, and edges. >>> ierr = DMGetCoordinatesLocal(dm,&XYZ); CHKERRQ(ierr);//Extracts coordinates from mesh.(Contiguous storage: [x0,y0,z0,x1,y1,z1,...]) >>> ierr = VecGetArray(XYZ,&xyz_el); CHKERRQ(ierr);//Get pointer to vector's data. >>> t2 = MPI_Wtime(); >>> PetscPrintf(PETSC_COMM_WORLD,"Mesh Import time: %10f\n",t2-t1); >>> DMView(dm,PETSC_VIEWER_STDOUT_WORLD); >>> >>> //IMPORTANT NOTE: PETSc assumes that vertex-cell meshes are 2D even if they were 3D, so its ordering changes. >>> //Cells remain at height 0, but vertices move to height 1 from height 3. To prevent this from becoming an issue >>> //the "interpolate" variable is set to PETSC_TRUE to tell the mesh importer to generate faces and edges. >>> //PETSc, therefore, technically does additional meshing. Gotta figure out how to get this from Gmsh directly. >>> ierr = DMPlexGetDepthStratum(dm,3, &cStart, &cEnd);//Get LI of cells. >>> ierr = DMPlexGetDepthStratum(dm,2, &fStart, &fEnd);//Get LI of faces >>> ierr = DMPlexGetDepthStratum(dm,1, &eStart, &eEnd);//Get LI of edges. >>> ierr = DMPlexGetDepthStratum(dm,0, &vStart, &vEnd);//Get LI of vertices. >>> ierr = DMGetStratumSize(dm,"depth", 3, &nc);//Get number of cells. >>> ierr = DMGetStratumSize(dm,"depth", 2, &nf);//Get number of faces. >>> ierr = DMGetStratumSize(dm,"depth", 1, &ne);//Get number of edges. >>> ierr = DMGetStratumSize(dm,"depth", 0, &nv);//Get number of vertices. 
>>> /* >>> PetscPrintf(PETSC_COMM_WORLD,"global start = %10d\t global end = %10d\n",pStart,pEnd); >>> PetscPrintf(PETSC_COMM_WORLD,"#cells = %10d\t i = %10d\t i < %10d\n",nc,cStart,cEnd); >>> PetscPrintf(PETSC_COMM_WORLD,"#faces = %10d\t i = %10d\t i < %10d\n",nf,fStart,fEnd); >>> PetscPrintf(PETSC_COMM_WORLD,"#edges = %10d\t i = %10d\t i < %10d\n",ne,eStart,eEnd); >>> PetscPrintf(PETSC_COMM_WORLD,"#vertices = %10d\t i = %10d\t i < %10d\n",nv,vStart,vEnd); >>> */ >>> //MESH IMPORT================================================================= >>> >>> //NOTE: This section extremely hardcoded right now. >>> //Current setup would only support P1 meshes. >>> //MEMORY ALLOCATION ========================================================== >>> ierr = PetscSectionCreate(PETSC_COMM_WORLD, &s); CHKERRQ(ierr); >>> //The chart is akin to a contiguous memory storage allocation. Each chart entry is associated >>> //with a "thing," could be a vertex, face, cell, or edge, or anything else. >>> ierr = PetscSectionSetChart(s, pStart, pEnd); CHKERRQ(ierr); >>> //For each "thing" in the chart, additional room can be made. This is helpful for associating >>> //nodes to multiple degrees of freedom. These commands help associate nodes with >>> for(ii = cStart; ii < cEnd; ii++){//Cell loop. >>> ierr = PetscSectionSetDof(s, ii, 0);CHKERRQ(ierr);}//NOTE: Currently no dof's associated with cells. >>> for(ii = fStart; ii < fEnd; ii++){//Face loop. >>> ierr = PetscSectionSetDof(s, ii, 0);CHKERRQ(ierr);}//NOTE: Currently no dof's associated with faces. >>> for(ii = vStart; ii < vEnd; ii++){//Vertex loop. >>> ierr = PetscSectionSetDof(s, ii, dof);CHKERRQ(ierr);}//Sets x, y, and z displacements as dofs. >>> for(ii = eStart; ii < eEnd; ii++){//Edge loop >>> ierr = PetscSectionSetDof(s, ii, 0);CHKERRQ(ierr);}//NOTE: Currently no dof's associated with edges. >>> ierr = PetscSectionSetUp(s); CHKERRQ(ierr); >>> ierr = PetscSectionGetStorageSize(s,&sizeK);CHKERRQ(ierr);//Determine the size of the global stiffness matrix. >>> ierr = DMSetLocalSection(dm,s); CHKERRQ(ierr);//Associate the PetscSection with the DM object. >>> //PetscErrorCode DMCreateGlobalVector(DM dm,Vec *vec) >>> //ierr = DMCreateGlobalVector(dm,&U); CHKERRQ(ierr); >>> PetscSectionDestroy(&s); >>> //PetscPrintf(PETSC_COMM_WORLD,"sizeK = %10d\n",sizeK); >>> >>> //OBJECT SETUP================================================================ >>> //Global stiffness matrix. >>> //PetscErrorCode DMCreateMatrix(DM dm,Mat *mat) >>> >>> //This makes the loop fast. >>> ierr = DMCreateMatrix(dm,&K); >>> >>> //This makes the loop uber slow. >>> //ierr = MatCreate(PETSC_COMM_WORLD,&K); CHKERRQ(ierr); >>> //ierr = MatSetType(K,MATAIJ); CHKERRQ(ierr);// Global stiffness matrix set to some sparse type. >>> //ierr = MatSetSizes(K,PETSC_DECIDE,PETSC_DECIDE,sizeK,sizeK); CHKERRQ(ierr); >>> //ierr = MatSetUp(K); CHKERRQ(ierr); >>> >>> //Displacement vector. >>> ierr = VecCreate(PETSC_COMM_WORLD,&U); CHKERRQ(ierr); >>> ierr = VecSetType(U,VECSTANDARD); CHKERRQ(ierr);// Global stiffness matrix set to some sparse type. >>> ierr = VecSetSizes(U,PETSC_DECIDE,sizeK); CHKERRQ(ierr); >>> >>> //Load vector. >>> ierr = VecCreate(PETSC_COMM_WORLD,&F); CHKERRQ(ierr); >>> ierr = VecSetType(F,VECSTANDARD); CHKERRQ(ierr);// Global stiffness matrix set to some sparse type. >>> ierr = VecSetSizes(F,PETSC_DECIDE,sizeK); CHKERRQ(ierr); >>> //OBJECT SETUP================================================================ >>> >>> //WARNING: This loop is currently hardcoded for P1 elements only! 
Must Figure >>> //out a clever way to modify to accomodate Pn (n>1) elements. >>> >>> //BEGIN GLOBAL STIFFNESS MATRIX BUILDER======================================= >>> t1 = MPI_Wtime(); >>> >>> //PREALLOCATIONS============================================================== >>> ierr = ConstitutiveMatrix(&matC,"isotropic",0); CHKERRQ(ierr); >>> struct preKE preKEtetra4; >>> ierr = InitializeKEpreallocation(&preKEtetra4,"tetra4"); CHKERRQ(ierr); >>> ierr = MatCreate(PETSC_COMM_WORLD,&KE); CHKERRQ(ierr); //SEQUENTIAL >>> ierr = MatSetSizes(KE,PETSC_DECIDE,PETSC_DECIDE,12,12); CHKERRQ(ierr); //SEQUENTIAL >>> ierr = MatSetType(KE,MATDENSE); CHKERRQ(ierr); //SEQUENTIAL >>> ierr = MatSetUp(KE); CHKERRQ(ierr); >>> PetscReal x_tetra4[4], y_tetra4[4],z_tetra4[4], >>> x_hex8[8], y_hex8[8],z_hex8[8], >>> *x,*y,*z; >>> PetscInt *EDOF,edof_tetra4[12],edof_hex8[24]; >>> DMPolytopeType previous = DM_POLYTOPE_UNKNOWN; >>> //PREALLOCATIONS============================================================== >>> >>> >>> >>> for(ii=cStart;ii>> ierr = DMPlexGetTransitiveClosure(dm, ii, useCone, &size_closure, &closure); CHKERRQ(ierr); >>> ierr = DMPlexGetCellType(dm, ii, &celltype); CHKERRQ(ierr); >>> //IMPORTANT NOTE: MOST OF THIS LOOP SHOULD BE INCLUDED IN THE KE3D function. >>> if(previous != celltype){ >>> //PetscPrintf(PETSC_COMM_WORLD,"run \n"); >>> if(celltype == DM_POLYTOPE_TETRAHEDRON){ >>> x = x_tetra4; >>> y = y_tetra4; >>> z = z_tetra4; >>> EDOF = edof_tetra4; >>> }//end if. >>> else if(celltype == DM_POLYTOPE_HEXAHEDRON){ >>> x = x_hex8; >>> y = y_hex8; >>> z = z_hex8; >>> EDOF = edof_hex8; >>> }//end else if. >>> } >>> previous = celltype; >>> >>> //PetscPrintf(PETSC_COMM_WORLD,"Cell # %4i\t",ii); >>> cells=0; >>> edges=0; >>> vertices=0; >>> faces=0; >>> kk = 0; >>> for(jj=0;jj<(2*size_closure);jj+=2){//Scan the closure of the current cell. >>> //Use information from the DM's strata to determine composition of cell_ii. >>> if(vStart <= closure[jj] && closure[jj]< vEnd){//Check for vertices. >>> //PetscPrintf(PETSC_COMM_WORLD,"%5i\t",closure[jj]); >>> indexXYZ = dim*(closure[jj]-vStart);//Linear index of x-coordinate in the xyz_el array. >>> >>> *(x+vertices) = xyz_el[indexXYZ]; >>> *(y+vertices) = xyz_el[indexXYZ+1];//Extract Y-coordinates of the current vertex. >>> *(z+vertices) = xyz_el[indexXYZ+2];//Extract Y-coordinates of the current vertex. >>> *(EDOF + kk) = indexXYZ; >>> *(EDOF + kk+1) = indexXYZ+1; >>> *(EDOF + kk+2) = indexXYZ+2; >>> kk+=3; >>> vertices++;//Update vertex counter. >>> }//end if >>> else if(eStart <= closure[jj] && closure[jj]< eEnd){//Check for edge ID's >>> edges++; >>> }//end else ifindexK >>> else if(fStart <= closure[jj] && closure[jj]< fEnd){//Check for face ID's >>> faces++; >>> }//end else if >>> else if(cStart <= closure[jj] && closure[jj]< cEnd){//Check for cell ID's >>> cells++; >>> }//end else if >>> }//end "jj" loop. >>> ierr = tetra4(x,y,z,&preKEtetra4,&matC,&KE); CHKERRQ(ierr); //Generate the element stiffness matrix for this cell. >>> ierr = MatDenseGetArray(KE,&KEdata); CHKERRQ(ierr); >>> ierr = MatSetValues(K,12,EDOF,12,EDOF,KEdata,ADD_VALUES); CHKERRQ(ierr);//WARNING: HARDCODED FOR TETRAHEDRAL P1 ELEMENTS ONLY !!!!!!!!!!!!!!!!!!!!!!! >>> ierr = MatDenseRestoreArray(KE,&KEdata); CHKERRQ(ierr); >>> ierr = DMPlexRestoreTransitiveClosure(dm, ii,useCone, &size_closure, &closure); CHKERRQ(ierr); >>> }//end "ii" loop. 
>>> ierr = MatAssemblyBegin(K,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >>> ierr = MatAssemblyEnd(K,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >>> //ierr = MatView(K,PETSC_VIEWER_DRAW_WORLD); CHKERRQ(ierr); >>> //END GLOBAL STIFFNESS MATRIX BUILDER=========================================== >>> t2 = MPI_Wtime(); >>> PetscPrintf(PETSC_COMM_WORLD,"K build time: %10f\n",t2-t1); >>> >>> >>> >>> >>> >>> >>> >>> >>> t1 = MPI_Wtime(); >>> //BEGIN BOUNDARY CONDITION ENFORCEMENT========================================== >>> IS TrianglesIS, physicalsurfaceID;//, VerticesIS; >>> PetscInt numsurfvals, >>> //numRows, >>> dof_offset,numTri; >>> const PetscInt *surfvals, >>> //*pinZID, >>> *TriangleID; >>> PetscScalar diag =1; >>> PetscReal area,force; >>> //NOTE: Petsc can read/assign labels. Eeach label may posses multiple "values." >>> //These values act as tags within a tag. >>> //IMPORTANT NOTE: The below line needs a safety. If a mesh that does not feature >>> //face sets is imported, the code in its current state will crash!!!. This is currently >>> //hardcoded for the test mesh. >>> ierr = DMGetLabel(dm, "Face Sets", &physicalgroups); CHKERRQ(ierr);//Inspects Physical surface groups defined by gmsh (if any). >>> ierr = DMLabelGetValueIS(physicalgroups, &physicalsurfaceID); CHKERRQ(ierr);//Gets the physical surface ID's defined in gmsh (as specified in the .geo file). >>> ierr = ISGetIndices(physicalsurfaceID,&surfvals); CHKERRQ(ierr);//Get a pointer to the actual surface values. >>> ierr = DMLabelGetNumValues(physicalgroups, &numsurfvals); CHKERRQ(ierr);//Gets the number of different values that the label assigns. >>> for(ii=0;ii>> //PetscPrintf(PETSC_COMM_WORLD,"Values = %5i\n",surfvals[ii]); >>> //PROBLEM: The surface values are hardcoded in the gmsh file. We need to adopt standard "codes" >>> //that we can give to users when they make their meshes so that this code recognizes the Type >>> // of boundary conditions that are to be imposed. >>> if(surfvals[ii] == pinXcode){ >>> dof_offset = 0; >>> dirichletBC = PETSC_TRUE; >>> }//end if. >>> else if(surfvals[ii] == pinZcode){ >>> dof_offset = 2; >>> dirichletBC = PETSC_TRUE; >>> }//end else if. >>> else if(surfvals[ii] == forceZcode){ >>> dof_offset = 2; >>> neumannBC = PETSC_TRUE; >>> }//end else if. >>> >>> ierr = DMLabelGetStratumIS(physicalgroups, surfvals[ii], &TrianglesIS); CHKERRQ(ierr);//Get the ID's (as an IS) of the surfaces belonging to value 11. >>> //PROBLEM: DMPlexGetConeRecursiveVertices returns an array with repeated node ID's. For each repetition, the lines that enforce BC's unnecessarily re-run. >>> ierr = ISGetSize(TrianglesIS,&numTri); CHKERRQ(ierr); >>> ierr = ISGetIndices(TrianglesIS,&TriangleID); CHKERRQ(ierr);//Get a pointer to the actual surface values. >>> for(kk=0;kk>> ierr = DMPlexGetTransitiveClosure(dm, TriangleID[kk], useCone, &size_closure, &closure); CHKERRQ(ierr); >>> if(neumannBC){ >>> ierr = DMPlexComputeCellGeometryFVM(dm, TriangleID[kk], &area,PETSC_NULL,PETSC_NULL); CHKERRQ(ierr); >>> force = traction*area/3;//WARNING: The 3 here is hardcoded for a purely tetrahedral mesh only!!!!!!!!!! >>> } >>> for(jj=0;jj<(2*size_closure);jj+=2){ >>> //PetscErrorCode DMPlexComputeCellGeometryFVM(DM dm, PetscInt cell, PetscReal *vol, PetscReal centroid[], PetscReal normal[]) >>> if(vStart <= closure[jj] && closure[jj]< vEnd){//Check for vertices. >>> indexK = dof*(closure[jj] - vStart) + dof_offset; //Compute the dof ID's in the K matrix. >>> if(dirichletBC){//Boundary conditions requiring an edit of K matrix. 
>>> ierr = MatZeroRows(K,1,&indexK,diag,NULL,NULL); CHKERRQ(ierr); >>> }//end if. >>> else if(neumannBC){//Boundary conditions requiring an edit of RHS vector. >>> ierr = VecSetValue(F,indexK,force,ADD_VALUES); CHKERRQ(ierr); >>> }// end else if. >>> }//end if. >>> }//end "jj" loop. >>> ierr = DMPlexRestoreTransitiveClosure(dm, closure[jj],useCone, &size_closure, &closure); CHKERRQ(ierr); >>> }//end "kk" loop. >>> ierr = ISRestoreIndices(TrianglesIS,&TriangleID); CHKERRQ(ierr); >>> >>> /* >>> ierr = DMPlexGetConeRecursiveVertices(dm, TrianglesIS, &VerticesIS); CHKERRQ(ierr);//Get the ID's (as an IS) of the vertices that make up the surfaces of value 11. >>> ierr = ISGetSize(VerticesIS,&numRows); CHKERRQ(ierr);//Get number of flagged vertices (this includes repeated indices for faces that share nodes). >>> ierr = ISGetIndices(VerticesIS,&pinZID); CHKERRQ(ierr);//Get a pointer to the actual surface values. >>> if(dirichletBC){//Boundary conditions requiring an edit of K matrix. >>> for(kk=0;kk>> indexK = 3*(pinZID[kk] - vStart) + dof_offset; //Compute the dof ID's in the K matrix. (NOTE: the 3* ishardcoded for 3 degrees of freedom, tie this to a variable in the FUTURE.) >>> ierr = MatZeroRows(K,1,&indexK,diag,NULL,NULL); CHKERRQ(ierr); >>> }//end "kk" loop. >>> }//end if. >>> else if(neumannBC){//Boundary conditions requiring an edit of RHS vector. >>> for(kk=0;kk>> indexK = 3*(pinZID[kk] - vStart) + dof_offset; >>> ierr = VecSetValue(F,indexK,traction,INSERT_VALUES); CHKERRQ(ierr); >>> }//end "kk" loop. >>> }// end else if. >>> ierr = ISRestoreIndices(VerticesIS,&pinZID); CHKERRQ(ierr); >>> */ >>> dirichletBC = PETSC_FALSE; >>> neumannBC = PETSC_FALSE; >>> }//end "ii" loop. >>> ierr = ISRestoreIndices(physicalsurfaceID,&surfvals); CHKERRQ(ierr); >>> //ierr = ISRestoreIndices(VerticesIS,&pinZID); CHKERRQ(ierr); >>> ierr = ISDestroy(&physicalsurfaceID); CHKERRQ(ierr); >>> //ierr = ISDestroy(&VerticesIS); CHKERRQ(ierr); >>> ierr = ISDestroy(&TrianglesIS); CHKERRQ(ierr); >>> //END BOUNDARY CONDITION ENFORCEMENT============================================ >>> t2 = MPI_Wtime(); >>> PetscPrintf(PETSC_COMM_WORLD,"BC imposition time: %10f\n",t2-t1); >>> >>> /* >>> PetscInt kk = 0; >>> for(ii=vStart;ii>> kk++; >>> PetscPrintf(PETSC_COMM_WORLD,"Vertex #%4i\t x = %10.9f\ty = %10.9f\tz = %10.9f\n",ii,xyz_el[3*kk],xyz_el[3*kk+1],xyz_el[3*kk+2]); >>> }// end "ii" loop. >>> */ >>> >>> t1 = MPI_Wtime(); >>> //SOLVER======================================================================== >>> ierr = KSPCreate(PETSC_COMM_WORLD,&ksp); CHKERRQ(ierr); >>> ierr = KSPSetOperators(ksp,K,K); CHKERRQ(ierr); >>> ierr = KSPSetFromOptions(ksp); CHKERRQ(ierr); >>> ierr = KSPSolve(ksp,F,U); CHKERRQ(ierr); >>> t2 = MPI_Wtime(); >>> //ierr = KSPView(ksp,PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr); >>> //SOLVER======================================================================== >>> t2 = MPI_Wtime(); >>> PetscPrintf(PETSC_COMM_WORLD,"Solver time: %10f\n",t2-t1); >>> ierr = VecRestoreArray(XYZ,&xyz_el); CHKERRQ(ierr);//Get pointer to vector's data. 
>>> >>> //BEGIN MAX/MIN DISPLACEMENTS=================================================== >>> IS ISux,ISuy,ISuz; >>> Vec UX,UY,UZ; >>> PetscReal UXmax,UYmax,UZmax,UXmin,UYmin,UZmin; >>> ierr = ISCreateStride(PETSC_COMM_WORLD,nv,0,3,&ISux); CHKERRQ(ierr); >>> ierr = ISCreateStride(PETSC_COMM_WORLD,nv,1,3,&ISuy); CHKERRQ(ierr); >>> ierr = ISCreateStride(PETSC_COMM_WORLD,nv,2,3,&ISuz); CHKERRQ(ierr); >>> >>> //PetscErrorCode VecGetSubVector(Vec X,IS is,Vec *Y) >>> ierr = VecGetSubVector(U,ISux,&UX); CHKERRQ(ierr); >>> ierr = VecGetSubVector(U,ISuy,&UY); CHKERRQ(ierr); >>> ierr = VecGetSubVector(U,ISuz,&UZ); CHKERRQ(ierr); >>> >>> //PetscErrorCode VecMax(Vec x,PetscInt *p,PetscReal *val) >>> ierr = VecMax(UX,PETSC_NULL,&UXmax); CHKERRQ(ierr); >>> ierr = VecMax(UY,PETSC_NULL,&UYmax); CHKERRQ(ierr); >>> ierr = VecMax(UZ,PETSC_NULL,&UZmax); CHKERRQ(ierr); >>> >>> ierr = VecMin(UX,PETSC_NULL,&UXmin); CHKERRQ(ierr); >>> ierr = VecMin(UY,PETSC_NULL,&UYmin); CHKERRQ(ierr); >>> ierr = VecMin(UZ,PETSC_NULL,&UZmin); CHKERRQ(ierr); >>> >>> PetscPrintf(PETSC_COMM_WORLD,"%10f\t <= ux <= %10f\n",UXmin,UXmax); >>> PetscPrintf(PETSC_COMM_WORLD,"%10f\t <= uy <= %10f\n",UYmin,UYmax); >>> PetscPrintf(PETSC_COMM_WORLD,"%10f\t <= uz <= %10f\n",UZmin,UZmax); >>> >>> >>> >>> >>> //BEGIN OUTPUT SOLUTION========================================================= >>> if(saveASCII){ >>> ierr = PetscViewerASCIIOpen(PETSC_COMM_WORLD,"XYZ.txt",&XYZviewer); >>> ierr = VecView(XYZ,XYZviewer); CHKERRQ(ierr); >>> ierr = PetscViewerASCIIOpen(PETSC_COMM_WORLD,"U.txt",&XYZpUviewer); >>> ierr = VecView(U,XYZpUviewer); CHKERRQ(ierr); >>> PetscViewerDestroy(&XYZviewer); PetscViewerDestroy(&XYZpUviewer); >>> >>> }//end if. >>> if(saveVTK){ >>> const char *meshfile = "starting_mesh.vtk", >>> *deformedfile = "deformed_mesh.vtk"; >>> ierr = PetscViewerVTKOpen(PETSC_COMM_WORLD,meshfile,FILE_MODE_WRITE,&XYZviewer); CHKERRQ(ierr); >>> //PetscErrorCode DMSetAuxiliaryVec(DM dm, DMLabel label, PetscInt value, Vec aux) >>> DMLabel UXlabel,UYlabel, UZlabel; >>> //PetscErrorCode DMLabelCreate(MPI_Comm comm, const char name[], DMLabel *label) >>> ierr = DMLabelCreate(PETSC_COMM_WORLD, "X-Displacement", &UXlabel); CHKERRQ(ierr); >>> ierr = DMLabelCreate(PETSC_COMM_WORLD, "Y-Displacement", &UYlabel); CHKERRQ(ierr); >>> ierr = DMLabelCreate(PETSC_COMM_WORLD, "Z-Displacement", &UZlabel); CHKERRQ(ierr); >>> ierr = DMSetAuxiliaryVec(dm,UXlabel, 1, UX); CHKERRQ(ierr); >>> ierr = DMSetAuxiliaryVec(dm,UYlabel, 1, UY); CHKERRQ(ierr); >>> ierr = DMSetAuxiliaryVec(dm,UZlabel, 1, UZ); CHKERRQ(ierr); >>> //PetscErrorCode PetscViewerVTKAddField(PetscViewer viewer,PetscObject dm,PetscErrorCode (*PetscViewerVTKWriteFunction)(PetscObject,PetscViewer),PetscInt fieldnum,PetscViewerVTKFieldType fieldtype,PetscBool checkdm,PetscObject vec) >>> >>> >>> >>> //ierr = PetscViewerVTKAddField(XYZviewer, dm,PetscErrorCode (*PetscViewerVTKWriteFunction)(Vec,PetscViewer),PETSC_DEFAULT,PETSC_VTK_POINT_FIELD,PETSC_FALSE,UX); >>> ierr = PetscViewerVTKAddField(XYZviewer, (PetscObject)dm,&PetscViewerVTKWriteFunction,PETSC_DEFAULT,PETSC_VTK_POINT_FIELD,PETSC_FALSE,(PetscObject)UX); >>> >>> >>> ierr = DMPlexVTKWriteAll((PetscObject)dm, XYZviewer); CHKERRQ(ierr); >>> ierr = VecAXPY(XYZ,1,U); CHKERRQ(ierr);//Add displacement field to the mesh coordinates to deform. 
>>> ierr = PetscViewerVTKOpen(PETSC_COMM_WORLD,deformedfile,FILE_MODE_WRITE,&XYZpUviewer); CHKERRQ(ierr); >>> ierr = DMPlexVTKWriteAll((PetscObject)dm, XYZpUviewer); CHKERRQ(ierr);// >>> PetscViewerDestroy(&XYZviewer); PetscViewerDestroy(&XYZpUviewer); >>> >>> }//end else if. >>> else{ >>> ierr = PetscPrintf(PETSC_COMM_WORLD,"No output format specified! Files not saved.\n"); CHKERRQ(ierr); >>> }//end else. >>> >>> >>> //END OUTPUT SOLUTION=========================================================== >>> VecDestroy(&UX); ISDestroy(&ISux); >>> VecDestroy(&UY); ISDestroy(&ISuy); >>> VecDestroy(&UZ); ISDestroy(&ISuz); >>> //END MAX/MIN DISPLACEMENTS===================================================== >>> >>> //CLEANUP===================================================================== >>> DMDestroy(&dm); >>> KSPDestroy(&ksp); >>> MatDestroy(&K); MatDestroy(&KE); MatDestroy(&matC); //MatDestroy(preKEtetra4.matB); MatDestroy(preKEtetra4.matBTCB); >>> VecDestroy(&U); VecDestroy(&F); >>> >>> //DMLabelDestroy(&physicalgroups);//Destroyig the DM destroys the label. >>> //CLEANUP===================================================================== >>> //PetscErrorCode PetscMallocDump(FILE *fp) >>> //ierr = PetscMallocDump(NULL); >>> return PetscFinalize();//And the machine shall rest.... >>> }//end main. >>> >>> PetscErrorCode tetra4(PetscScalar* X,PetscScalar* Y, PetscScalar* Z,struct preKE *P, Mat* matC, Mat* KE){ >>> //INPUTS: >>> //X: Global X coordinates of the elemental nodes. >>> //Y: Global Y coordinates of the elemental nodes. >>> //Z: Global Z coordinates of the elemental nodes. >>> //J: Jacobian matrix. >>> //invJ: Inverse Jacobian matrix. >>> PetscErrorCode ierr; >>> //For current quadrature point, get dPsi/dXi_i Xi_i = {Xi,Eta,Zeta} >>> /* >>> P->dPsidXi[0] = +1.; P->dPsidEta[0] = 0.0; P->dPsidZeta[0] = 0.0; >>> P->dPsidXi[1] = 0.0; P->dPsidEta[1] = +1.; P->dPsidZeta[1] = 0.0; >>> P->dPsidXi[2] = 0.0; P->dPsidEta[2] = 0.0; P->dPsidZeta[2] = +1.; >>> P->dPsidXi[3] = -1.; P->dPsidEta[3] = -1.; P->dPsidZeta[3] = -1.; >>> */ >>> //Populate the Jacobian matrix. >>> P->J[0][0] = X[0] - X[3]; >>> P->J[0][1] = Y[0] - Y[3]; >>> P->J[0][2] = Z[0] - Z[3]; >>> P->J[1][0] = X[1] - X[3]; >>> P->J[1][1] = Y[1] - Y[3]; >>> P->J[1][2] = Z[1] - Z[3]; >>> P->J[2][0] = X[2] - X[3]; >>> P->J[2][1] = Y[2] - Y[3]; >>> P->J[2][2] = Z[2] - Z[3]; >>> >>> //Determinant of the 3x3 Jacobian. (Expansion along 1st row). >>> P->minor00 = P->J[1][1]*P->J[2][2] - P->J[2][1]*P->J[1][2];//Reuse when finding InvJ. >>> P->minor01 = P->J[1][0]*P->J[2][2] - P->J[2][0]*P->J[1][2];//Reuse when finding InvJ. >>> P->minor02 = P->J[1][0]*P->J[2][1] - P->J[2][0]*P->J[1][1];//Reuse when finding InvJ. >>> P->detJ = P->J[0][0]*P->minor00 - P->J[0][1]*P->minor01 + P->J[0][2]*P->minor02; >>> //Inverse of the 3x3 Jacobian >>> P->invJ[0][0] = +P->minor00/P->detJ;//Reuse precomputed minor. >>> P->invJ[0][1] = -(P->J[0][1]*P->J[2][2] - P->J[0][2]*P->J[2][1])/P->detJ; >>> P->invJ[0][2] = +(P->J[0][1]*P->J[1][2] - P->J[1][1]*P->J[0][2])/P->detJ; >>> P->invJ[1][0] = -P->minor01/P->detJ;//Reuse precomputed minor. >>> P->invJ[1][1] = +(P->J[0][0]*P->J[2][2] - P->J[0][2]*P->J[2][0])/P->detJ; >>> P->invJ[1][2] = -(P->J[0][0]*P->J[1][2] - P->J[1][0]*P->J[0][2])/P->detJ; >>> P->invJ[2][0] = +P->minor02/P->detJ;//Reuse precomputed minor. 
>>> P->invJ[2][1] = -(P->J[0][0]*P->J[2][1] - P->J[0][1]*P->J[2][0])/P->detJ; >>> P->invJ[2][2] = +(P->J[0][0]*P->J[1][1] - P->J[0][1]*P->J[1][0])/P->detJ; >>> >>> //*****************STRAIN MATRIX (B)************************************** >>> for(P->m=0;P->mN;P->m++){//Scan all shape functions. >>> >>> P->x_in = 0 + P->m*3;//Every 3rd column starting at 0 >>> P->y_in = P->x_in +1;//Every 3rd column starting at 1 >>> P->z_in = P->y_in +1;//Every 3rd column starting at 2 >>> >>> P->dX[0] = P->invJ[0][0]*P->dPsidXi[P->m] + P->invJ[0][1]*P->dPsidEta[P->m] + P->invJ[0][2]*P->dPsidZeta[P->m]; >>> P->dY[0] = P->invJ[1][0]*P->dPsidXi[P->m] + P->invJ[1][1]*P->dPsidEta[P->m] + P->invJ[1][2]*P->dPsidZeta[P->m]; >>> P->dZ[0] = P->invJ[2][0]*P->dPsidXi[P->m] + P->invJ[2][1]*P->dPsidEta[P->m] + P->invJ[2][2]*P->dPsidZeta[P->m]; >>> >>> P->dX[1] = P->dZ[0]; P->dX[2] = P->dY[0]; >>> P->dY[1] = P->dZ[0]; P->dY[2] = P->dX[0]; >>> P->dZ[1] = P->dX[0]; P->dZ[2] = P->dY[0]; >>> >>> ierr = MatSetValues(P->matB,3,P->x_insert,1,&(P->x_in),P->dX,INSERT_VALUES); CHKERRQ(ierr); >>> ierr = MatSetValues(P->matB,3,P->y_insert,1,&(P->y_in),P->dY,INSERT_VALUES); CHKERRQ(ierr); >>> ierr = MatSetValues(P->matB,3,P->z_insert,1,&(P->z_in),P->dZ,INSERT_VALUES); CHKERRQ(ierr); >>> >>> }//end "m" loop. >>> ierr = MatAssemblyBegin(P->matB,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >>> ierr = MatAssemblyEnd(P->matB,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >>> //*****************STRAIN MATRIX (B)************************************** >>> >>> //Compute the matrix product B^t*C*B, scale it by the quadrature weights and add to KE. >>> P->weight = -P->detJ/6; >>> >>> ierr = MatZeroEntries(*KE); CHKERRQ(ierr); >>> ierr = MatPtAP(*matC,P->matB,MAT_INITIAL_MATRIX,PETSC_DEFAULT,&(P->matBTCB));CHKERRQ(ierr); >>> ierr = MatScale(P->matBTCB,P->weight); CHKERRQ(ierr); >>> ierr = MatAssemblyBegin(P->matBTCB,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >>> ierr = MatAssemblyEnd(P->matBTCB,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >>> ierr = MatAXPY(*KE,1,P->matBTCB,DIFFERENT_NONZERO_PATTERN); CHKERRQ(ierr);//Add contribution of current quadrature point to KE. >>> >>> //ierr = MatPtAP(*matC,P->matB,MAT_INITIAL_MATRIX,PETSC_DEFAULT,KE);CHKERRQ(ierr); >>> //ierr = MatScale(*KE,P->weight); CHKERRQ(ierr); >>> >>> ierr = MatAssemblyBegin(*KE,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >>> ierr = MatAssemblyEnd(*KE,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >>> >>> //Cleanup >>> return ierr; >>> }//end tetra4. >>> >>> PetscErrorCode ConstitutiveMatrix(Mat *matC,const char* type,PetscInt materialID){ >>> PetscErrorCode ierr; >>> PetscBool isotropic = PETSC_FALSE, >>> orthotropic = PETSC_FALSE; >>> //PetscErrorCode PetscStrcmp(const char a[],const char b[],PetscBool *flg) >>> ierr = PetscStrcmp(type,"isotropic",&isotropic); >>> ierr = PetscStrcmp(type,"orthotropic",&orthotropic); >>> ierr = MatCreate(PETSC_COMM_WORLD,matC); CHKERRQ(ierr); >>> ierr = MatSetSizes(*matC,PETSC_DECIDE,PETSC_DECIDE,6,6); CHKERRQ(ierr); >>> ierr = MatSetType(*matC,MATAIJ); CHKERRQ(ierr); >>> ierr = MatSetUp(*matC); CHKERRQ(ierr); >>> >>> if(isotropic){ >>> PetscReal E,nu, M,L,vals[3]; >>> switch(materialID){ >>> case 0://Hardcoded properties for isotropic material #0 >>> E = 200; >>> nu = 1./3; >>> break; >>> case 1://Hardcoded properties for isotropic material #1 >>> E = 96; >>> nu = 1./3; >>> break; >>> }//end switch. >>> M = E/(2*(1+nu)),//Lame's constant 1 ("mu"). >>> L = E*nu/((1+nu)*(1-2*nu));//Lame's constant 2 ("lambda"). 
>>> //PetscErrorCode MatSetValues(Mat mat,PetscInt m,const PetscInt idxm[],PetscInt n,const PetscInt idxn[],const PetscScalar v[],InsertMode addv) >>> PetscInt idxn[3] = {0,1,2}; >>> vals[0] = L+2*M; vals[1] = L; vals[2] = vals[1]; >>> ierr = MatSetValues(*matC,1,&idxn[0],3,idxn,vals,INSERT_VALUES); CHKERRQ(ierr); >>> vals[1] = vals[0]; vals[0] = vals[2]; >>> ierr = MatSetValues(*matC,1,&idxn[1],3,idxn,vals,INSERT_VALUES); CHKERRQ(ierr); >>> vals[2] = vals[1]; vals[1] = vals[0]; >>> ierr = MatSetValues(*matC,1,&idxn[2],3,idxn,vals,INSERT_VALUES); CHKERRQ(ierr); >>> ierr = MatSetValue(*matC,3,3,M,INSERT_VALUES); CHKERRQ(ierr); >>> ierr = MatSetValue(*matC,4,4,M,INSERT_VALUES); CHKERRQ(ierr); >>> ierr = MatSetValue(*matC,5,5,M,INSERT_VALUES); CHKERRQ(ierr); >>> }//end if. >>> /* >>> else if(orthotropic){ >>> switch(materialID){ >>> case 0: >>> break; >>> case 1: >>> break; >>> }//end switch. >>> }//end else if. >>> */ >>> ierr = MatAssemblyBegin(*matC,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >>> ierr = MatAssemblyEnd(*matC,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >>> //MatView(*matC,0); >>> return ierr; >>> }//End ConstitutiveMatrix >>> >>> PetscErrorCode InitializeKEpreallocation(struct preKE *P,const char* type){ >>> PetscErrorCode ierr; >>> PetscBool istetra4 = PETSC_FALSE, >>> ishex8 = PETSC_FALSE; >>> ierr = PetscStrcmp(type,"tetra4",&istetra4); CHKERRQ(ierr); >>> ierr = PetscStrcmp(type,"hex8",&ishex8); CHKERRQ(ierr); >>> if(istetra4){ >>> P->sizeKE = 12; >>> P->N = 4; >>> }//end if. >>> else if(ishex8){ >>> P->sizeKE = 24; >>> P->N = 8; >>> }//end else if. >>> >>> >>> P->x_insert[0] = 0; P->x_insert[1] = 3; P->x_insert[2] = 5; >>> P->y_insert[0] = 1; P->y_insert[1] = 4; P->y_insert[2] = 5; >>> P->z_insert[0] = 2; P->z_insert[1] = 3; P->z_insert[2] = 4; >>> //Allocate memory for the differentiated shape function vectors. >>> ierr = PetscMalloc1(P->N,&(P->dPsidXi)); CHKERRQ(ierr); >>> ierr = PetscMalloc1(P->N,&(P->dPsidEta)); CHKERRQ(ierr); >>> ierr = PetscMalloc1(P->N,&(P->dPsidZeta)); CHKERRQ(ierr); >>> >>> P->dPsidXi[0] = +1.; P->dPsidEta[0] = 0.0; P->dPsidZeta[0] = 0.0; >>> P->dPsidXi[1] = 0.0; P->dPsidEta[1] = +1.; P->dPsidZeta[1] = 0.0; >>> P->dPsidXi[2] = 0.0; P->dPsidEta[2] = 0.0; P->dPsidZeta[2] = +1.; >>> P->dPsidXi[3] = -1.; P->dPsidEta[3] = -1.; P->dPsidZeta[3] = -1.; >>> >>> >>> //Strain matrix. >>> ierr = MatCreate(PETSC_COMM_WORLD,&(P->matB)); CHKERRQ(ierr); >>> ierr = MatSetSizes(P->matB,PETSC_DECIDE,PETSC_DECIDE,6,P->sizeKE); CHKERRQ(ierr);//Hardcoded >>> ierr = MatSetType(P->matB,MATAIJ); CHKERRQ(ierr); >>> ierr = MatSetUp(P->matB); CHKERRQ(ierr); >>> >>> //Contribution matrix. >>> ierr = MatCreate(PETSC_COMM_WORLD,&(P->matBTCB)); CHKERRQ(ierr); >>> ierr = MatSetSizes(P->matBTCB,PETSC_DECIDE,PETSC_DECIDE,P->sizeKE,P->sizeKE); CHKERRQ(ierr); >>> ierr = MatSetType(P->matBTCB,MATAIJ); CHKERRQ(ierr); >>> ierr = MatSetUp(P->matBTCB); CHKERRQ(ierr); >>> >>> //Element stiffness matrix. >>> //ierr = MatCreateSeqDense(PETSC_COMM_SELF,12,12,NULL,&KE); CHKERRQ(ierr); //PARALLEL >>> >>> return ierr; >>> } -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From knepley at gmail.com Fri Jan 7 18:23:02 2022
From: knepley at gmail.com (Matthew Knepley)
Date: Fri, 7 Jan 2022 19:23:02 -0500
Subject: [petsc-users] [EXTERNAL] Re: DM misuse causes massive memory leak?
In-Reply-To: 
References: <87tueraunm.fsf@jedbrown.org> <87sfu2921h.fsf@jedbrown.org> <87a6g99507.fsf@jedbrown.org>
Message-ID: 

On Fri, Jan 7, 2022 at 7:10 PM Ferrand, Jesus A. wrote:

> Matt:
>
> Thanks again for the prompt reply : )
>
> You were right regarding the PetscSections. I reprofiled my code where I use them and realized that I had DMCreateMatrix() in the mix. The brunt of the overhead came from it, not the PetscSections.
>
> I looked for ways to improve the time on DMCreateMatrix() and figured I could call DMSetMatType() to work with a BAIJ + MatSetValuesBlocked() instead of AIJ + MatSetValues() to leverage my dof blocks (ux, uy, uz) during assembly. My matrix assembly times have improved, but DMCreateMatrix seems to take just as long as before. I noticed after calling DMGetBlockSize() that my DM is working with a blocksize of 1. Weirdly enough, my call to KSPView() returns a block size of 3 (as it should be) for the matrix that came from DMCreateMatrix(). I further noticed after looking at the DM function index that there is no "DMSetBlockSize()"; however, there is in fact a DMGetBlockSize(). Anyway, any suggestions on how to speed up DMCreateMatrix()? I have tried calling both DMSetMatrixPreallocateOnly() and DMSetAdjacency() (with the flags for FEM) but barely got a speedup.

The time here comes from traversing the mesh to see what needs to be allocated. It is insensitive to block size. I know of no way to speed up this process without strong additional assumptions.

> Regarding DMPlexGetConeRecursiveVertices() vs. DMPlexGetTransitiveClosure() for BC's: I will (for now) side with the former, as I am getting better times with it for the BC's at least. My gripe with DMPlexGetTransitiveClosure() is that it returns the whole closure as opposed to subsets of it (i.e., for a cell, it returns the constituent faces, edges, and vertices as opposed to, say, just vertices). Currently, I'm working with P1 elements only, so I need only vertices. To me, therefore, the additional info seems wasteful for P1. For Pn (n>1) DMPlexGetTransitiveClosure() seems like the superior option though.

GetConeRecursive _is_ getting the entire closure and throwing away the parts it does not want. It is exactly the same code with worse memory management.

  Thanks,

     Matt

> ------------------------------
> *From:* Matthew Knepley
> *Sent:* Thursday, January 6, 2022 5:39 PM
> *To:* Ferrand, Jesus A.
> *Cc:* Jed Brown ; petsc-users
> *Subject:* Re: [petsc-users] [EXTERNAL] Re: DM misuse causes massive memory leak?
>
> On Thu, Jan 6, 2022 at 5:30 PM Ferrand, Jesus A. wrote:
>
> Matt:
>
> Thanks for the immediate reply.
> My apologies, I misspelled the function name; it should have been "DMPlexGetConeRecursiveVertices()."
>
> Oh, this is a debugging function that Vaclav wrote. It should never be used in production. It allocates memory all over the place.
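(A side note on the DMSetMatType()/MatSetValuesBlocked() route mentioned at the top of this message: the sketch below shows how the blocked insertion could look once DMCreateMatrix() hands back a BAIJ operator. It follows the ierr/CHKERRQ style used elsewhere in this thread; the element matrix elemMat[], the vertex block indices vtx[], and the surrounding element loop are illustrative assumptions, not code from the thread.)

    PetscErrorCode ierr;
    Mat            K;
    PetscScalar    elemMat[144];   /* 4x4 grid of 3x3 blocks, row-major (the default row-oriented layout) */
    PetscInt       vtx[4];         /* block (i.e. vertex) indices, e.g. closure vertex minus vStart */

    ierr = DMSetMatType(dm, MATBAIJ); CHKERRQ(ierr);   /* request BAIJ before creating the matrix */
    ierr = DMCreateMatrix(dm, &K); CHKERRQ(ierr);      /* preallocated from the section; block size should be 3 */
    /* ... inside the element loop: fill elemMat and vtx for the current cell, then ... */
    ierr = MatSetValuesBlocked(K, 4, vtx, 4, vtx, elemMat, ADD_VALUES); CHKERRQ(ierr);
    ierr = MatAssemblyBegin(K, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
    ierr = MatAssemblyEnd(K, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);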
> If you want vertices, just get the closure and filter out the vertices:
>
>   PetscInt *closure = NULL;
>   PetscInt  Ncl, Nclv, cl;
>
>   DMPlexGetDepthStratum(dm, 0, &vStart, &vEnd);
>   DMPlexGetTransitiveClosure(dm, cell, PETSC_TRUE, &Ncl, &closure);
>   Nclv = 0;
>   for (cl = 0; cl < 2*Ncl; cl += 2) {
>     const PetscInt point = closure[cl];
>
>     if ((point >= vStart) && (point < vEnd)) closure[Nclv++] = point;
>   }
>   DMPlexRestoreTransitiveClosure(...);
>
> Regarding the PetscSection use: I am looping through every single point in the DAG of my mesh. For each point I am assigning dof using PetscSectionSetDof(). I am also assigning dof to the corresponding fields using PetscSectionSetFieldDof(). I took Jed's advice and made a single field with 3 components, and I named all of them. So, I used PetscSectionSetNumFields(), PetscSectionSetFieldComponents(), PetscSectionSetFieldName(), and PetscSectionSetComponentName(). Finally, I proceed to PetscSectionSetUp() and DMSetLocalSection(). In my timed code blocks I am including DMPlexGetDepthStratum(), DMGetStratumSize(), and DMPlexGetChart() because I need their output to assign the dof to the PetscSection.
>
> This should all take almost no time. There are no expensive operations there.
>
> Thanks,
>
> Matt
>
> For what it's worth, I have my configure set to --with-debugging = 1.
> ------------------------------
> *From:* Matthew Knepley
> *Sent:* Thursday, January 6, 2022 5:20 PM
> *To:* Ferrand, Jesus A.
> *Cc:* Jed Brown ; petsc-users
> *Subject:* Re: [petsc-users] [EXTERNAL] Re: DM misuse causes massive memory leak?
>
> On Thu, Jan 6, 2022 at 5:15 PM Ferrand, Jesus A. wrote:
>
> Jed:
>
> DMPlexLabelComplete() has allowed me to speed up my code significantly (Many thanks!).
>
> I did not use DMAddBoundary() though.
> I figured I could obtain Index Sets (IS's) from the DAG for different depths and then IS's for the points that were flagged in Gmsh (after calling DMPlexLabelComplete()).
> I then intersected both IS's using ISIntersect() to get the DAG points corresponding to just vertices (and flagged by Gmsh) for Dirichlet BC's, and DAG points that are Faces and flagged by Gmsh for Neumann BC's. I then use the intersected IS to edit a Mat and a RHS Vec manually. I did further profiling and have found the PetscSections are now the next biggest overhead.
>
> For Dirichlet BC's I make an array of vertex ID's and call MatSetZeroRows() to impose BC's on them through my K matrix. And yes, I'm solving the elasticity PDE. For Neumann BC's I use DMPlexGetRecursiveVertices() to edit my RHS vector.
>
> I cannot find a function named DMPlexGetRecursiveVertices().
>
> I want to keep the PetscSections since they preallocate my matrix rather well (the one from DMCreateMatrix()) but at the same time would like to remove them since they add overhead. Do you think DMAddBoundary() with the function call will be faster than my single calls to MatSetZeroRows() and DMPlexGetRecursiveVertices()?
>
> PetscSection is really simple. Are you sure you are measuring long times there? What are you using it to do?
>
> Thanks,
>
> Matt
>
> ------------------------------
> *From:* Jed Brown
> *Sent:* Wednesday, January 5, 2022 5:44 PM
> *To:* Ferrand, Jesus A.
> *Cc:* petsc-users
> *Subject:* Re: [EXTERNAL] Re: [petsc-users] DM misuse causes massive memory leak?
>
> For something like displacement (and this sounds like elasticity), I would recommend using one field with three components.
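(To make that "one field, three components" layout concrete, here is a minimal sketch pieced together from the calls named in this thread; it is not anyone's actual code. dm is assumed to be the interpolated DMPlex from the Gmsh import, and the field and component names are illustrative.)

    PetscErrorCode ierr;
    PetscSection   s;
    PetscInt       pStart, pEnd, vStart, vEnd, v;
    Mat            K;

    ierr = DMPlexGetChart(dm, &pStart, &pEnd); CHKERRQ(ierr);
    ierr = DMPlexGetDepthStratum(dm, 0, &vStart, &vEnd); CHKERRQ(ierr);     /* vertices */
    ierr = PetscSectionCreate(PETSC_COMM_WORLD, &s); CHKERRQ(ierr);
    ierr = PetscSectionSetNumFields(s, 1); CHKERRQ(ierr);
    ierr = PetscSectionSetFieldName(s, 0, "Displacement"); CHKERRQ(ierr);
    ierr = PetscSectionSetFieldComponents(s, 0, 3); CHKERRQ(ierr);
    ierr = PetscSectionSetComponentName(s, 0, 0, "ux"); CHKERRQ(ierr);
    ierr = PetscSectionSetComponentName(s, 0, 1, "uy"); CHKERRQ(ierr);
    ierr = PetscSectionSetComponentName(s, 0, 2, "uz"); CHKERRQ(ierr);
    ierr = PetscSectionSetChart(s, pStart, pEnd); CHKERRQ(ierr);
    for (v = vStart; v < vEnd; ++v) {            /* dofs live only on vertices */
      ierr = PetscSectionSetDof(s, v, 3); CHKERRQ(ierr);
      ierr = PetscSectionSetFieldDof(s, v, 0, 3); CHKERRQ(ierr);
    }
    ierr = PetscSectionSetUp(s); CHKERRQ(ierr);
    ierr = DMSetLocalSection(dm, s); CHKERRQ(ierr);
    ierr = PetscSectionDestroy(&s); CHKERRQ(ierr);
    ierr = DMCreateMatrix(dm, &K); CHKERRQ(ierr); /* preallocated; the exchange above reports block size 3 in KSPView() */

Points that carry no dofs (cells, faces, edges) can simply be left out of the loop; unset points in the chart default to zero dofs.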
You can constrain a subset > of the components to implement slip conditions. > > You can use DMPlexLabelComplete(dm, label) to propagate those face labels > to vertices. > > "Ferrand, Jesus A." writes: > > > Thanks for the reply (I hit reply all this time). > > > > So, I set 3 fields: > > /* > > ierr = PetscSectionSetNumFields(s,dof); CHKERRQ(ierr); > > ierr = PetscSectionSetFieldName(s,0, "X-Displacement"); CHKERRQ(ierr); > //Field ID is 0 > > ierr = PetscSectionSetFieldName(s,1, "Y-Displacement"); CHKERRQ(ierr); > //Field ID is 1 > > ierr = PetscSectionSetFieldName(s,2, "Z-Displacement"); CHKERRQ(ierr); > //Field ID is 2 > > */ > > > > I then loop through the vertices of my DMPlex > > > > /* > > for(ii = vStart; ii < vEnd; ii++){//Vertex loop. > > ierr = PetscSectionSetDof(s, ii, dof); CHKERRQ(ierr); > > ierr = PetscSectionSetFieldDof(s,ii,0,1); CHKERRQ(ierr);//One > X-displacement per vertex (1 dof) > > ierr = PetscSectionSetFieldDof(s,ii,1,1); CHKERRQ(ierr);//One > Y-displacement per vertex (1 dof) > > ierr = PetscSectionSetFieldDof(s,ii,2,1); CHKERRQ(ierr);//One > Z-displacement per vertex (1 dof) > > }//Sets x, y, and z displacements as dofs. > > */ > > > > I only associated fields with vertices, not with any other points in the > DAG. Regarding the use of DMAddBoundary(), I mostly copied the usage shown > in SNES example 77. I modified the function definition to simply set the > dof to 0.0 as opposed to the coordinates. Below "physicalgroups" is the > DMLabel that I got from gmsh, this flags Face points, not vertices. That is > why I think the error log suggests that fields were never set. > > > > ierr = DMAddBoundary(dm, DM_BC_ESSENTIAL, "fixed", physicalgroups, 1, > &surfvals[ii], fieldID, 0, NULL, (void (*)(void)) coordinates, NULL, NULL, > NULL); CHKERRQ(ierr); > > PetscErrorCode coordinates(PetscInt dim, PetscReal time, const PetscReal > x[], PetscInt Nf, PetscScalar *u, void *ctx){ > > const PetscInt Ncomp = dim; > > PetscInt comp; > > for (comp = 0; comp < Ncomp; ++comp) u[comp] = 0.0; > > return 0; > > } > > > > > > ________________________________ > > From: Jed Brown > > Sent: Wednesday, January 5, 2022 12:36 AM > > To: Ferrand, Jesus A. > > Cc: petsc-users > > Subject: Re: [EXTERNAL] Re: [petsc-users] DM misuse causes massive > memory leak? > > > > Please "reply all" include the list in the future. > > > > "Ferrand, Jesus A." writes: > > > >> Forgot to say thanks for the reply (my bad). > >> Yes, I was indeed forgetting to pre-allocate the sparse matrices when > doing them by hand (complacency seeded by MATLAB's zeros()). Thank you, > Jed, and Jeremy, for the hints! > >> > >> I have more questions, these ones about boundary conditions (I think > these are for Matt). > >> In my current code I set Dirichlet conditions directly on a Mat by > calling MatSetZeroRows(). I profiled my code and found the part that > applies them to be unnacceptably slow. In response, I've been trying to > better pre-allocate Mats using PetscSections. I have found documentation > for PetscSectionSetDof(), PetscSectionSetNumFields(), > PetscSectionSetFieldName(), and PetscSectionSetFieldDof(), to set the size > of my Mats and Vecs by calling DMSetLocalSection() followed by > DMCreateMatrix() and DMCreateLocalVector() to get a RHS vector. This seems > faster. > >> > >> In PetscSection, what is the difference between a "field" and a > "component"? 
For example, I could have one field "Velocity" with three > components ux, uy, and uz or perhaps three fields ux, uy, and uz each with > a default component? > > > > It's just how you name them and how they appear in output. Usually > "velocity" is better as a field with three components, but fields with > other meaning (and perhaps different finite element spaces), such as > pressure, would be different fields. Different components are always in the > same FE space. > > > >> I am struggling now to impose boundary conditions after constraining > dofs using PetscSection. My understanding is that constraining dof's > reduces the size of the DM's matrix but it does not give the DM knowledge > of what values the constrained dofs should have, right? > >> > >> I know that there is DMAddBoundary(), but I am unsure of how to use it. > From Gmsh I have a mesh with surface boundaries flagged. I'm not sure > whether DMAddBoundary()will constrain the face, edge, or vertex points when > I give it the DMLabel from Gmsh. (I need specific dof on the vertices to be > constrained). I did some testing and I think DMAddBoundary() attempts to > constrain the Face points (see error log below). I only associated fields > with the vertices but not the Faces. I can extract the vertex points from > the face label using DMPlexGetConeRecursiveVertices() but the output IS has > repeated entries for the vertex points (many faces share the same vertex). > Is there an easier way to get the vertex points from a gmsh surface tag? > > > > How did you call DMAddBoundary()? Are you using DM_BC_ESSENTIAL and a > callback function that provides the inhomogeneous boundary condition? > > > >> I'm sorry this is a mouthful. > >> > >> [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > >> [0]PETSC ERROR: Argument out of range > >> [0]PETSC ERROR: Field number 2 must be in [0, 0) > > > > It looks like you haven't added these fields yet. > > > >> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble > shooting. > >> [0]PETSC ERROR: Petsc Release Version 3.16.0, unknown > >> [0]PETSC ERROR: ./gmsh4 on a linux-c-dbg named F86 by jesus Tue Jan 4 > 15:19:57 2022 > >> [0]PETSC ERROR: Configure options --with-32bits-pci-domain=1 > --with-debugging =1 > >> [0]PETSC ERROR: #1 DMGetField() at > /home/jesus/SAND/PETSc_install/petsc/src/dm/interface/dm.c:4803 > >> [0]PETSC ERROR: #2 DMCompleteBoundaryLabel_Internal() at > /home/jesus/SAND/PETSc_install/petsc/src/dm/interface/dm.c:5140 > >> [0]PETSC ERROR: #3 DMAddBoundary() at > /home/jesus/SAND/PETSc_install/petsc/src/dm/interface/dm.c:8561 > >> [0]PETSC ERROR: #4 main() at /home/jesus/SAND/FEA/3D/gmshBACKUP4.c:215 > >> [0]PETSC ERROR: No PETSc Option Table entries > >> [0]PETSC ERROR: ----------------End of Error Message -------send entire > error message to petsc-maint at mcs.anl.gov---------- > >> > >> > >> > >> > >> > >> > >> ________________________________ > >> From: Jed Brown > >> Sent: Wednesday, December 29, 2021 5:55 PM > >> To: Ferrand, Jesus A. ; petsc-users at mcs.anl.gov < > petsc-users at mcs.anl.gov> > >> Subject: [EXTERNAL] Re: [petsc-users] DM misuse causes massive memory > leak? > >> > >> CAUTION: This email originated outside of Embry-Riddle Aeronautical > University. Do not click links or open attachments unless you recognize the > sender and know the content is safe. > >> > >> > >> "Ferrand, Jesus A." writes: > >> > >>> Dear PETSc Team: > >>> > >>> I have a question about DM and PetscSection. 
Say I import a mesh (for > FEM purposes) and create a DMPlex for it. I then use PetscSections to set > degrees of freedom per "point" (by point I mean vertices, lines, faces, and > cells). I then use PetscSectionGetStorageSize() to get the size of the > global stiffness matrix (K) needed for my FEM problem. One last detail, > this K I populate inside a rather large loop using an element stiffness > matrix function of my own. Instead of using DMCreateMatrix(), I manually > created a Mat using MatCreate(), MatSetType(), MatSetSizes(), and > MatSetUp(). I come to find that said loop is painfully slow when I use the > manually created matrix, but 20x faster when I use the Mat coming out of > DMCreateMatrix(). > >> > >> The sparse matrix hasn't been preallocated, which forces the data > structure to do a lot of copies (as bad as O(n^2) complexity). > DMCreateMatrix() preallocates for you. > >> > >> > https://petsc.org/release/docs/manual/performance/#memory-allocation-for-sparse-matrix-assembly > >> https://petsc.org/release/docs/manual/mat/#sec-matsparse > >> > >>> My question is then: Is the manual Mat a noob mistake and is it > somehow creating a memory leak with K? Just in case it's something else I'm > attaching the code. The loop that populates K is between lines 221 and 278. > Anything related to DM, DMPlex, and PetscSection is between lines 117 and > 180. > >>> > >>> Machine Type: HP Laptop > >>> C-compiler: Gnu C > >>> OS: Ubuntu 20.04 > >>> PETSc version: 3.16.0 > >>> MPI Implementation: MPICH > >>> > >>> Hope you all had a Merry Christmas and that you will have a happy and > productive New Year. :D > >>> > >>> > >>> Sincerely: > >>> > >>> J.A. Ferrand > >>> > >>> Embry-Riddle Aeronautical University - Daytona Beach FL > >>> > >>> M.Sc. Aerospace Engineering | May 2022 > >>> > >>> B.Sc. Aerospace Engineering > >>> > >>> B.Sc. Computational Mathematics > >>> > >>> > >>> > >>> Sigma Gamma Tau > >>> > >>> Tau Beta Pi > >>> > >>> Honors Program > >>> > >>> > >>> > >>> Phone: (386)-843-1829 > >>> > >>> Email(s): ferranj2 at my.erau.edu > >>> > >>> jesus.ferrand at gmail.com > >>> //REFERENCE: > https://github.com/FreeFem/FreeFem-sources/blob/master/plugin/mpi/PETSc-code.hpp > >>> #include > >>> static char help[] = "Imports a Gmsh mesh with boundary conditions and > solves the elasticity equation.\n" > >>> "Option prefix = opt_.\n"; > >>> > >>> struct preKE{//Preallocation before computing KE > >>> Mat matB, > >>> matBTCB; > >>> //matKE; > >>> PetscInt x_insert[3], > >>> y_insert[3], > >>> z_insert[3], > >>> m,//Looping variables. > >>> sizeKE,//size of the element stiffness matrix. > >>> N,//Number of nodes in element. > >>> x_in,y_in,z_in; //LI to index B matrix. > >>> PetscReal J[3][3],//Jacobian matrix. > >>> invJ[3][3],//Inverse of the Jacobian matrix. > >>> detJ,//Determinant of the Jacobian. > >>> dX[3], > >>> dY[3], > >>> dZ[3], > >>> minor00, > >>> minor01, > >>> minor02,//Determinants of minors in a 3x3 matrix. > >>> dPsidX, dPsidY, dPsidZ,//Shape function derivatives w.r.t > global coordinates. > >>> weight,//Multiplier of quadrature weights. > >>> *dPsidXi,//Derivatives of shape functions w.r.t. Xi. > >>> *dPsidEta,//Derivatives of shape functions w.r.t. Eta. > >>> *dPsidZeta;//Derivatives of shape functions w.r.t Zeta. > >>> PetscErrorCode ierr; > >>> };//end struct. > >>> > >>> //Function declarations. 
> >>> extern PetscErrorCode tetra4(PetscScalar*, PetscScalar*, > PetscScalar*,struct preKE*, Mat*, Mat*); > >>> extern PetscErrorCode ConstitutiveMatrix(Mat*,const char*,PetscInt); > >>> extern PetscErrorCode InitializeKEpreallocation(struct preKE*,const > char*); > >>> > >>> PetscErrorCode PetscViewerVTKWriteFunction(PetscObject vec,PetscViewer > viewer){ > >>> PetscErrorCode ierr; > >>> ierr = VecView((Vec)vec,viewer); CHKERRQ(ierr); > >>> return ierr; > >>> } > >>> > >>> > >>> > >>> > >>> int main(int argc, char **args){ > >>> //DEFINITIONS OF PETSC's DMPLEX LINGO: > >>> //POINT: A topology element (cell, face, edge, or vertex). > >>> //CHART: It an interval from 0 to the number of "points." (the range > of admissible linear indices) > >>> //STRATUM: A subset of the "chart" which corresponds to all "points" > at a given "level." > >>> //LEVEL: This is either a "depth" or a "height". > >>> //HEIGHT: Dimensionality of an element measured from 0D to 3D. > Heights: cell = 0, face = 1, edge = 2, vertex = 3. > >>> //DEPTH: Dimensionality of an element measured from 3D to 0D. > Depths: cell = 3, face = 2, edge = 1, vertex = 0; > >>> //CLOSURE: *of an element is the collection of all other elements > that define it.I.e., the closure of a surface is the collection of vertices > and edges that make it up. > >>> //STAR: > >>> //STANDARD LABELS: These are default tags that DMPlex has for its > topology. ("depth") > >>> PetscErrorCode ierr;//Error tracking variable. > >>> DM dm;//Distributed memory object (useful for managing grids.) > >>> DMLabel physicalgroups;//Identifies user-specified tags in gmsh (to > impose BC's). > >>> DMPolytopeType celltype;//When looping through cells, determines its > type (tetrahedron, pyramid, hexahedron, etc.) > >>> PetscSection s; > >>> KSP ksp;//Krylov Sub-Space (linear solver object) > >>> Mat K,//Global stiffness matrix (Square, assume unsymmetric). > >>> KE,//Element stiffness matrix (Square, assume unsymmetric). > >>> matC;//Constitutive matrix. > >>> Vec XYZ,//Coordinate vector, contains spatial locations of mesh's > vertices (NOTE: This vector self-destroys!). > >>> U,//Displacement vector. > >>> F;//Load Vector. > >>> PetscViewer XYZviewer,//Viewer object to output mesh to ASCII format. > >>> XYZpUviewer; //Viewer object to output displacements to > ASCII format. > >>> PetscBool interpolate = PETSC_TRUE,//Instructs Gmsh importer whether > to generate faces and edges (Needed when using P2 or higher elements). > >>> useCone = PETSC_TRUE,//Instructs > "DMPlexGetTransitiveClosure()" whether to extract the closure or the star. > >>> dirichletBC = PETSC_FALSE,//For use when assembling the K > matrix. > >>> neumannBC = PETSC_FALSE,//For use when assembling the F > vector. > >>> saveASCII = PETSC_FALSE,//Whether to save results in ASCII > format. > >>> saveVTK = PETSC_FALSE;//Whether to save results as VTK > format. > >>> PetscInt nc,//number of cells. (PETSc lingo for "elements") > >>> nv,//number of vertices. (PETSc lingo for "nodes") > >>> nf,//number of faces. (PETSc lingo for "surfaces") > >>> ne,//number of edges. (PETSc lingo for "lines") > >>> pStart,//starting LI of global elements. > >>> pEnd,//ending LI of all elements. > >>> cStart,//starting LI for cells global arrangement. > >>> cEnd,//ending LI for cells in global arrangement. > >>> vStart,//starting LI for vertices in global arrangement. > >>> vEnd,//ending LI for vertices in global arrangement. > >>> fStart,//starting LI for faces in global arrangement. 
> >>> fEnd,//ending LI for faces in global arrangement. > >>> eStart,//starting LI for edges in global arrangement. > >>> eEnd,//ending LI for edges in global arrangement. > >>> sizeK,//Size of the element stiffness matrix. > >>> ii,jj,kk,//Dedicated looping variables. > >>> indexXYZ,//Variable to access the elements of XYZ vector. > >>> indexK,//Variable to access the elements of the U and F > vectors (can reference rows and colums of K matrix.) > >>> *closure = PETSC_NULL,//Pointer to the closure elements of > a cell. > >>> size_closure,//Size of the closure of a cell. > >>> dim,//Dimension of the mesh. > >>> //*edof,//Linear indices of dof's inside the K matrix. > >>> dof = 3,//Degrees of freedom per node. > >>> cells=0, edges=0, vertices=0, faces=0,//Topology counters > when looping through cells. > >>> pinXcode=10, pinZcode=11,forceZcode=12;//Gmsh codes to > extract relevant "Face Sets." > >>> PetscReal //*x_el,//Pointer to a vector that will store the > x-coordinates of an element's vertices. > >>> //*y_el,//Pointer to a vector that will store the > y-coordinates of an element's vertices. > >>> //*z_el,//Pointer to a vector that will store the > z-coordinates of an element's vertices. > >>> *xyz_el,//Pointer to xyz array in the XYZ vector. > >>> traction = -10, > >>> *KEdata, > >>> t1,t2; //time keepers. > >>> const char *gmshfile = "TopOptmeshfine2.msh";//Name of gmsh file to > import. > >>> > >>> ierr = PetscInitialize(&argc,&args,NULL,help); if(ierr) return ierr; > //And the machine shall work.... > >>> > >>> //MESH > IMPORT================================================================= > >>> //IMPORTANT NOTE: Gmsh only creates "cells" and "vertices," it does > not create the "faces" or "edges." > >>> //Gmsh probably can generate them, must figure out how to. > >>> t1 = MPI_Wtime(); > >>> ierr = > DMPlexCreateGmshFromFile(PETSC_COMM_WORLD,gmshfile,interpolate,&dm); > CHKERRQ(ierr);//Read Gmsh file and generate the DMPlex. > >>> ierr = DMGetDimension(dm, &dim); CHKERRQ(ierr);//1-D, 2-D, or 3-D > >>> ierr = DMPlexGetChart(dm, &pStart, &pEnd); CHKERRQ(ierr);//Extracts > linear indices of cells, vertices, faces, and edges. > >>> ierr = DMGetCoordinatesLocal(dm,&XYZ); CHKERRQ(ierr);//Extracts > coordinates from mesh.(Contiguous storage: [x0,y0,z0,x1,y1,z1,...]) > >>> ierr = VecGetArray(XYZ,&xyz_el); CHKERRQ(ierr);//Get pointer to > vector's data. > >>> t2 = MPI_Wtime(); > >>> PetscPrintf(PETSC_COMM_WORLD,"Mesh Import time: %10f\n",t2-t1); > >>> DMView(dm,PETSC_VIEWER_STDOUT_WORLD); > >>> > >>> //IMPORTANT NOTE: PETSc assumes that vertex-cell meshes are 2D even > if they were 3D, so its ordering changes. > >>> //Cells remain at height 0, but vertices move to height 1 from > height 3. To prevent this from becoming an issue > >>> //the "interpolate" variable is set to PETSC_TRUE to tell the mesh > importer to generate faces and edges. > >>> //PETSc, therefore, technically does additional meshing. Gotta > figure out how to get this from Gmsh directly. > >>> ierr = DMPlexGetDepthStratum(dm,3, &cStart, &cEnd);//Get LI of cells. > >>> ierr = DMPlexGetDepthStratum(dm,2, &fStart, &fEnd);//Get LI of faces > >>> ierr = DMPlexGetDepthStratum(dm,1, &eStart, &eEnd);//Get LI of edges. > >>> ierr = DMPlexGetDepthStratum(dm,0, &vStart, &vEnd);//Get LI of > vertices. > >>> ierr = DMGetStratumSize(dm,"depth", 3, &nc);//Get number of cells. > >>> ierr = DMGetStratumSize(dm,"depth", 2, &nf);//Get number of faces. > >>> ierr = DMGetStratumSize(dm,"depth", 1, &ne);//Get number of edges. 
> >>> ierr = DMGetStratumSize(dm,"depth", 0, &nv);//Get number of vertices. > >>> /* > >>> PetscPrintf(PETSC_COMM_WORLD,"global start = %10d\t global end = > %10d\n",pStart,pEnd); > >>> PetscPrintf(PETSC_COMM_WORLD,"#cells = %10d\t i = %10d\t i < > %10d\n",nc,cStart,cEnd); > >>> PetscPrintf(PETSC_COMM_WORLD,"#faces = %10d\t i = %10d\t i < > %10d\n",nf,fStart,fEnd); > >>> PetscPrintf(PETSC_COMM_WORLD,"#edges = %10d\t i = %10d\t i < > %10d\n",ne,eStart,eEnd); > >>> PetscPrintf(PETSC_COMM_WORLD,"#vertices = %10d\t i = %10d\t i < > %10d\n",nv,vStart,vEnd); > >>> */ > >>> //MESH > IMPORT================================================================= > >>> > >>> //NOTE: This section extremely hardcoded right now. > >>> //Current setup would only support P1 meshes. > >>> //MEMORY ALLOCATION > ========================================================== > >>> ierr = PetscSectionCreate(PETSC_COMM_WORLD, &s); CHKERRQ(ierr); > >>> //The chart is akin to a contiguous memory storage allocation. Each > chart entry is associated > >>> //with a "thing," could be a vertex, face, cell, or edge, or > anything else. > >>> ierr = PetscSectionSetChart(s, pStart, pEnd); CHKERRQ(ierr); > >>> //For each "thing" in the chart, additional room can be made. This > is helpful for associating > >>> //nodes to multiple degrees of freedom. These commands help > associate nodes with > >>> for(ii = cStart; ii < cEnd; ii++){//Cell loop. > >>> ierr = PetscSectionSetDof(s, ii, 0);CHKERRQ(ierr);}//NOTE: > Currently no dof's associated with cells. > >>> for(ii = fStart; ii < fEnd; ii++){//Face loop. > >>> ierr = PetscSectionSetDof(s, ii, 0);CHKERRQ(ierr);}//NOTE: > Currently no dof's associated with faces. > >>> for(ii = vStart; ii < vEnd; ii++){//Vertex loop. > >>> ierr = PetscSectionSetDof(s, ii, dof);CHKERRQ(ierr);}//Sets x, y, > and z displacements as dofs. > >>> for(ii = eStart; ii < eEnd; ii++){//Edge loop > >>> ierr = PetscSectionSetDof(s, ii, 0);CHKERRQ(ierr);}//NOTE: > Currently no dof's associated with edges. > >>> ierr = PetscSectionSetUp(s); CHKERRQ(ierr); > >>> ierr = > PetscSectionGetStorageSize(s,&sizeK);CHKERRQ(ierr);//Determine the size of > the global stiffness matrix. > >>> ierr = DMSetLocalSection(dm,s); CHKERRQ(ierr);//Associate the > PetscSection with the DM object. > >>> //PetscErrorCode DMCreateGlobalVector(DM dm,Vec *vec) > >>> //ierr = DMCreateGlobalVector(dm,&U); CHKERRQ(ierr); > >>> PetscSectionDestroy(&s); > >>> //PetscPrintf(PETSC_COMM_WORLD,"sizeK = %10d\n",sizeK); > >>> > >>> //OBJECT > SETUP================================================================ > >>> //Global stiffness matrix. > >>> //PetscErrorCode DMCreateMatrix(DM dm,Mat *mat) > >>> > >>> //This makes the loop fast. > >>> ierr = DMCreateMatrix(dm,&K); > >>> > >>> //This makes the loop uber slow. > >>> //ierr = MatCreate(PETSC_COMM_WORLD,&K); CHKERRQ(ierr); > >>> //ierr = MatSetType(K,MATAIJ); CHKERRQ(ierr);// Global stiffness > matrix set to some sparse type. > >>> //ierr = MatSetSizes(K,PETSC_DECIDE,PETSC_DECIDE,sizeK,sizeK); > CHKERRQ(ierr); > >>> //ierr = MatSetUp(K); CHKERRQ(ierr); > >>> > >>> //Displacement vector. > >>> ierr = VecCreate(PETSC_COMM_WORLD,&U); CHKERRQ(ierr); > >>> ierr = VecSetType(U,VECSTANDARD); CHKERRQ(ierr);// Global stiffness > matrix set to some sparse type. > >>> ierr = VecSetSizes(U,PETSC_DECIDE,sizeK); CHKERRQ(ierr); > >>> > >>> //Load vector. 
> >>> ierr = VecCreate(PETSC_COMM_WORLD,&F); CHKERRQ(ierr);
> >>> ierr = VecSetType(F,VECSTANDARD); CHKERRQ(ierr);// Global stiffness
> matrix set to some sparse type.
> >>> ierr = VecSetSizes(F,PETSC_DECIDE,sizeK); CHKERRQ(ierr);
> >>> //OBJECT
> SETUP================================================================
> >>>
> >>> //WARNING: This loop is currently hardcoded for P1 elements only!
> Must Figure
> >>> //out a clever way to modify to accommodate Pn (n>1) elements.
> >>>
> >>> //BEGIN GLOBAL STIFFNESS MATRIX
> BUILDER=======================================
> >>> t1 = MPI_Wtime();
> >>>
> //PREALLOCATIONS==============================================================
> >>> ierr = ConstitutiveMatrix(&matC,"isotropic",0); CHKERRQ(ierr);
> >>> struct preKE preKEtetra4;
> >>> ierr = InitializeKEpreallocation(&preKEtetra4,"tetra4");
> CHKERRQ(ierr);
> >>> ierr = MatCreate(PETSC_COMM_WORLD,&KE); CHKERRQ(ierr); //SEQUENTIAL
> >>> ierr = MatSetSizes(KE,PETSC_DECIDE,PETSC_DECIDE,12,12);
> CHKERRQ(ierr); //SEQUENTIAL
> >>> ierr = MatSetType(KE,MATDENSE); CHKERRQ(ierr); //SEQUENTIAL
> >>> ierr = MatSetUp(KE); CHKERRQ(ierr);
> >>> PetscReal x_tetra4[4], y_tetra4[4],z_tetra4[4],
> >>> x_hex8[8], y_hex8[8],z_hex8[8],
> >>> *x,*y,*z;
> >>> PetscInt *EDOF,edof_tetra4[12],edof_hex8[24];
> >>> DMPolytopeType previous = DM_POLYTOPE_UNKNOWN;
> >>>
> //PREALLOCATIONS==============================================================
> >>>
> >>>
> >>>
> >>> for(ii=cStart;ii<cEnd;ii++){
> >>> ierr = DMPlexGetTransitiveClosure(dm, ii, useCone, &size_closure,
> &closure); CHKERRQ(ierr);
> >>> ierr = DMPlexGetCellType(dm, ii, &celltype); CHKERRQ(ierr);
> >>> //IMPORTANT NOTE: MOST OF THIS LOOP SHOULD BE INCLUDED IN THE KE3D
> function.
> >>> if(previous != celltype){
> >>> //PetscPrintf(PETSC_COMM_WORLD,"run \n");
> >>> if(celltype == DM_POLYTOPE_TETRAHEDRON){
> >>> x = x_tetra4;
> >>> y = y_tetra4;
> >>> z = z_tetra4;
> >>> EDOF = edof_tetra4;
> >>> }//end if.
> >>> else if(celltype == DM_POLYTOPE_HEXAHEDRON){
> >>> x = x_hex8;
> >>> y = y_hex8;
> >>> z = z_hex8;
> >>> EDOF = edof_hex8;
> >>> }//end else if.
> >>> }
> >>> previous = celltype;
> >>>
> >>> //PetscPrintf(PETSC_COMM_WORLD,"Cell # %4i\t",ii);
> >>> cells=0;
> >>> edges=0;
> >>> vertices=0;
> >>> faces=0;
> >>> kk = 0;
> >>> for(jj=0;jj<(2*size_closure);jj+=2){//Scan the closure of the
> current cell.
> >>> //Use information from the DM's strata to determine composition
> of cell_ii.
> >>> if(vStart <= closure[jj] && closure[jj]< vEnd){//Check for
> vertices.
> >>> //PetscPrintf(PETSC_COMM_WORLD,"%5i\t",closure[jj]);
> >>> indexXYZ = dim*(closure[jj]-vStart);//Linear index of
> x-coordinate in the xyz_el array.
> >>>
> >>> *(x+vertices) = xyz_el[indexXYZ];
> >>> *(y+vertices) = xyz_el[indexXYZ+1];//Extract Y-coordinates of
> the current vertex.
> >>> *(z+vertices) = xyz_el[indexXYZ+2];//Extract Z-coordinates of
> the current vertex.
> >>> *(EDOF + kk) = indexXYZ;
> >>> *(EDOF + kk+1) = indexXYZ+1;
> >>> *(EDOF + kk+2) = indexXYZ+2;
> >>> kk+=3;
> >>> vertices++;//Update vertex counter.
> >>> }//end if
> >>> else if(eStart <= closure[jj] && closure[jj]< eEnd){//Check for
> edge ID's
> >>> edges++;
> >>> }//end else if
> >>> else if(fStart <= closure[jj] && closure[jj]< fEnd){//Check for
> face ID's
> >>> faces++;
> >>> }//end else if
> >>> else if(cStart <= closure[jj] && closure[jj]< cEnd){//Check for
> cell ID's
> >>> cells++;
> >>> }//end else if
> >>> }//end "jj" loop.
> >>> ierr = tetra4(x,y,z,&preKEtetra4,&matC,&KE); CHKERRQ(ierr);
> //Generate the element stiffness matrix for this cell.
> >>> ierr = MatDenseGetArray(KE,&KEdata); CHKERRQ(ierr);
> >>> ierr = MatSetValues(K,12,EDOF,12,EDOF,KEdata,ADD_VALUES);
> CHKERRQ(ierr);//WARNING: HARDCODED FOR TETRAHEDRAL P1 ELEMENTS ONLY
> !!!!!!!!!!!!!!!!!!!!!!!
> >>> ierr = MatDenseRestoreArray(KE,&KEdata); CHKERRQ(ierr);
> >>> ierr = DMPlexRestoreTransitiveClosure(dm, ii,useCone, &size_closure,
> &closure); CHKERRQ(ierr);
> >>> }//end "ii" loop.
> >>> ierr = MatAssemblyBegin(K,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
> >>> ierr = MatAssemblyEnd(K,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
> >>> //ierr = MatView(K,PETSC_VIEWER_DRAW_WORLD); CHKERRQ(ierr);
> >>> //END GLOBAL STIFFNESS MATRIX
> BUILDER===========================================
> >>> t2 = MPI_Wtime();
> >>> PetscPrintf(PETSC_COMM_WORLD,"K build time: %10f\n",t2-t1);
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> t1 = MPI_Wtime();
> >>> //BEGIN BOUNDARY CONDITION
> ENFORCEMENT==========================================
> >>> IS TrianglesIS, physicalsurfaceID;//, VerticesIS;
> >>> PetscInt numsurfvals,
> >>> //numRows,
> >>> dof_offset,numTri;
> >>> const PetscInt *surfvals,
> >>> //*pinZID,
> >>> *TriangleID;
> >>> PetscScalar diag =1;
> >>> PetscReal area,force;
> >>> //NOTE: Petsc can read/assign labels. Each label may possess multiple
> "values."
> >>> //These values act as tags within a tag.
> >>> //IMPORTANT NOTE: The below line needs a safety. If a mesh that does
> not feature
> >>> //face sets is imported, the code in its current state will crash!!!.
> This is currently
> >>> //hardcoded for the test mesh.
> >>> ierr = DMGetLabel(dm, "Face Sets", &physicalgroups);
> CHKERRQ(ierr);//Inspects Physical surface groups defined by gmsh (if any).
> >>> ierr = DMLabelGetValueIS(physicalgroups, &physicalsurfaceID);
> CHKERRQ(ierr);//Gets the physical surface ID's defined in gmsh (as
> specified in the .geo file).
> >>> ierr = ISGetIndices(physicalsurfaceID,&surfvals); CHKERRQ(ierr);//Get
> a pointer to the actual surface values.
> >>> ierr = DMLabelGetNumValues(physicalgroups, &numsurfvals);
> CHKERRQ(ierr);//Gets the number of different values that the label assigns.
> >>> for(ii=0;ii<numsurfvals;ii++){//Scan each value of the "Face Sets" label.
> >>> //PetscPrintf(PETSC_COMM_WORLD,"Values = %5i\n",surfvals[ii]);
> >>> //PROBLEM: The surface values are hardcoded in the gmsh file. We
> need to adopt standard "codes"
> >>> //that we can give to users when they make their meshes so that this
> code recognizes the Type
> >>> // of boundary conditions that are to be imposed.
> >>> if(surfvals[ii] == pinXcode){
> >>> dof_offset = 0;
> >>> dirichletBC = PETSC_TRUE;
> >>> }//end if.
> >>> else if(surfvals[ii] == pinZcode){
> >>> dof_offset = 2;
> >>> dirichletBC = PETSC_TRUE;
> >>> }//end else if.
> >>> else if(surfvals[ii] == forceZcode){
> >>> dof_offset = 2;
> >>> neumannBC = PETSC_TRUE;
> >>> }//end else if.
> >>>
> >>> ierr = DMLabelGetStratumIS(physicalgroups, surfvals[ii],
> &TrianglesIS); CHKERRQ(ierr);//Get the ID's (as an IS) of the surfaces
> belonging to value 11.
> >>> //PROBLEM: DMPlexGetConeRecursiveVertices returns an array with
> repeated node ID's. For each repetition, the lines that enforce BC's
> unnecessarily re-run.
> >>> ierr = ISGetSize(TrianglesIS,&numTri); CHKERRQ(ierr);
> >>> ierr = ISGetIndices(TrianglesIS,&TriangleID); CHKERRQ(ierr);//Get a
> pointer to the actual surface values.
> >>> for(kk=0;kk<numTri;kk++){
> >>> ierr = DMPlexGetTransitiveClosure(dm, TriangleID[kk], useCone,
> &size_closure, &closure); CHKERRQ(ierr);
> >>> if(neumannBC){
> >>> ierr = DMPlexComputeCellGeometryFVM(dm, TriangleID[kk],
> &area,PETSC_NULL,PETSC_NULL); CHKERRQ(ierr);
> >>> force = traction*area/3;//WARNING: The 3 here is hardcoded for a
> purely tetrahedral mesh only!!!!!!!!!!
> >>> }
> >>> for(jj=0;jj<(2*size_closure);jj+=2){
> >>> //PetscErrorCode DMPlexComputeCellGeometryFVM(DM dm, PetscInt
> cell, PetscReal *vol, PetscReal centroid[], PetscReal normal[])
> >>> if(vStart <= closure[jj] && closure[jj]< vEnd){//Check for
> vertices.
> >>> indexK = dof*(closure[jj] - vStart) + dof_offset; //Compute
> the dof ID's in the K matrix.
> >>> if(dirichletBC){//Boundary conditions requiring an edit of K
> matrix.
> >>> ierr = MatZeroRows(K,1,&indexK,diag,NULL,NULL);
> CHKERRQ(ierr);
> >>> }//end if.
> >>> else if(neumannBC){//Boundary conditions requiring an edit of
> RHS vector.
> >>> ierr = VecSetValue(F,indexK,force,ADD_VALUES);
> CHKERRQ(ierr);
> >>> }// end else if.
> >>> }//end if.
> >>> }//end "jj" loop.
> >>> ierr = DMPlexRestoreTransitiveClosure(dm, closure[jj],useCone,
> &size_closure, &closure); CHKERRQ(ierr);
> >>> }//end "kk" loop.
> >>> ierr = ISRestoreIndices(TrianglesIS,&TriangleID); CHKERRQ(ierr);
> >>>
> >>> /*
> >>> ierr = DMPlexGetConeRecursiveVertices(dm, TrianglesIS, &VerticesIS);
> CHKERRQ(ierr);//Get the ID's (as an IS) of the vertices that make up the
> surfaces of value 11.
> >>> ierr = ISGetSize(VerticesIS,&numRows); CHKERRQ(ierr);//Get number of
> flagged vertices (this includes repeated indices for faces that share
> nodes).
> >>> ierr = ISGetIndices(VerticesIS,&pinZID); CHKERRQ(ierr);//Get a
> pointer to the actual surface values.
> >>> if(dirichletBC){//Boundary conditions requiring an edit of K matrix.
> >>> for(kk=0;kk<numRows;kk++){
> >>> indexK = 3*(pinZID[kk] - vStart) + dof_offset; //Compute the dof
> ID's in the K matrix. (NOTE: the 3* is hardcoded for 3 degrees of freedom,
> tie this to a variable in the FUTURE.)
> >>> ierr = MatZeroRows(K,1,&indexK,diag,NULL,NULL); CHKERRQ(ierr);
> >>> }//end "kk" loop.
> >>> }//end if.
> >>> else if(neumannBC){//Boundary conditions requiring an edit of RHS
> vector.
> >>> for(kk=0;kk<numRows;kk++){
> >>> indexK = 3*(pinZID[kk] - vStart) + dof_offset;
> >>> ierr = VecSetValue(F,indexK,traction,INSERT_VALUES);
> CHKERRQ(ierr);
> >>> }//end "kk" loop.
> >>> }// end else if.
> >>> ierr = ISRestoreIndices(VerticesIS,&pinZID); CHKERRQ(ierr);
> >>> */
> >>> dirichletBC = PETSC_FALSE;
> >>> neumannBC = PETSC_FALSE;
> >>> }//end "ii" loop.
> >>> ierr = ISRestoreIndices(physicalsurfaceID,&surfvals); CHKERRQ(ierr);
> >>> //ierr = ISRestoreIndices(VerticesIS,&pinZID); CHKERRQ(ierr);
> >>> ierr = ISDestroy(&physicalsurfaceID); CHKERRQ(ierr);
> >>> //ierr = ISDestroy(&VerticesIS); CHKERRQ(ierr);
> >>> ierr = ISDestroy(&TrianglesIS); CHKERRQ(ierr);
> >>> //END BOUNDARY CONDITION
> ENFORCEMENT============================================
> >>> t2 = MPI_Wtime();
> >>> PetscPrintf(PETSC_COMM_WORLD,"BC imposition time: %10f\n",t2-t1);
> >>>
> >>> /*
> >>> PetscInt kk = 0;
> >>> for(ii=vStart;ii<vEnd;ii++){
> >>> kk++;
> >>> PetscPrintf(PETSC_COMM_WORLD,"Vertex #%4i\t x = %10.9f\ty =
> %10.9f\tz = %10.9f\n",ii,xyz_el[3*kk],xyz_el[3*kk+1],xyz_el[3*kk+2]);
> >>> }// end "ii" loop.
> >>> */ > >>> > >>> t1 = MPI_Wtime(); > >>> > //SOLVER======================================================================== > >>> ierr = KSPCreate(PETSC_COMM_WORLD,&ksp); CHKERRQ(ierr); > >>> ierr = KSPSetOperators(ksp,K,K); CHKERRQ(ierr); > >>> ierr = KSPSetFromOptions(ksp); CHKERRQ(ierr); > >>> ierr = KSPSolve(ksp,F,U); CHKERRQ(ierr); > >>> t2 = MPI_Wtime(); > >>> //ierr = KSPView(ksp,PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr); > >>> > //SOLVER======================================================================== > >>> t2 = MPI_Wtime(); > >>> PetscPrintf(PETSC_COMM_WORLD,"Solver time: %10f\n",t2-t1); > >>> ierr = VecRestoreArray(XYZ,&xyz_el); CHKERRQ(ierr);//Get pointer to > vector's data. > >>> > >>> //BEGIN MAX/MIN > DISPLACEMENTS=================================================== > >>> IS ISux,ISuy,ISuz; > >>> Vec UX,UY,UZ; > >>> PetscReal UXmax,UYmax,UZmax,UXmin,UYmin,UZmin; > >>> ierr = ISCreateStride(PETSC_COMM_WORLD,nv,0,3,&ISux); CHKERRQ(ierr); > >>> ierr = ISCreateStride(PETSC_COMM_WORLD,nv,1,3,&ISuy); CHKERRQ(ierr); > >>> ierr = ISCreateStride(PETSC_COMM_WORLD,nv,2,3,&ISuz); CHKERRQ(ierr); > >>> > >>> //PetscErrorCode VecGetSubVector(Vec X,IS is,Vec *Y) > >>> ierr = VecGetSubVector(U,ISux,&UX); CHKERRQ(ierr); > >>> ierr = VecGetSubVector(U,ISuy,&UY); CHKERRQ(ierr); > >>> ierr = VecGetSubVector(U,ISuz,&UZ); CHKERRQ(ierr); > >>> > >>> //PetscErrorCode VecMax(Vec x,PetscInt *p,PetscReal *val) > >>> ierr = VecMax(UX,PETSC_NULL,&UXmax); CHKERRQ(ierr); > >>> ierr = VecMax(UY,PETSC_NULL,&UYmax); CHKERRQ(ierr); > >>> ierr = VecMax(UZ,PETSC_NULL,&UZmax); CHKERRQ(ierr); > >>> > >>> ierr = VecMin(UX,PETSC_NULL,&UXmin); CHKERRQ(ierr); > >>> ierr = VecMin(UY,PETSC_NULL,&UYmin); CHKERRQ(ierr); > >>> ierr = VecMin(UZ,PETSC_NULL,&UZmin); CHKERRQ(ierr); > >>> > >>> PetscPrintf(PETSC_COMM_WORLD,"%10f\t <= ux <= %10f\n",UXmin,UXmax); > >>> PetscPrintf(PETSC_COMM_WORLD,"%10f\t <= uy <= %10f\n",UYmin,UYmax); > >>> PetscPrintf(PETSC_COMM_WORLD,"%10f\t <= uz <= %10f\n",UZmin,UZmax); > >>> > >>> > >>> > >>> > >>> //BEGIN OUTPUT > SOLUTION========================================================= > >>> if(saveASCII){ > >>> ierr = PetscViewerASCIIOpen(PETSC_COMM_WORLD,"XYZ.txt",&XYZviewer); > >>> ierr = VecView(XYZ,XYZviewer); CHKERRQ(ierr); > >>> ierr = PetscViewerASCIIOpen(PETSC_COMM_WORLD,"U.txt",&XYZpUviewer); > >>> ierr = VecView(U,XYZpUviewer); CHKERRQ(ierr); > >>> PetscViewerDestroy(&XYZviewer); PetscViewerDestroy(&XYZpUviewer); > >>> > >>> }//end if. 
> >>> if(saveVTK){ > >>> const char *meshfile = "starting_mesh.vtk", > >>> *deformedfile = "deformed_mesh.vtk"; > >>> ierr = > PetscViewerVTKOpen(PETSC_COMM_WORLD,meshfile,FILE_MODE_WRITE,&XYZviewer); > CHKERRQ(ierr); > >>> //PetscErrorCode DMSetAuxiliaryVec(DM dm, DMLabel label, PetscInt > value, Vec aux) > >>> DMLabel UXlabel,UYlabel, UZlabel; > >>> //PetscErrorCode DMLabelCreate(MPI_Comm comm, const char name[], > DMLabel *label) > >>> ierr = DMLabelCreate(PETSC_COMM_WORLD, "X-Displacement", &UXlabel); > CHKERRQ(ierr); > >>> ierr = DMLabelCreate(PETSC_COMM_WORLD, "Y-Displacement", &UYlabel); > CHKERRQ(ierr); > >>> ierr = DMLabelCreate(PETSC_COMM_WORLD, "Z-Displacement", &UZlabel); > CHKERRQ(ierr); > >>> ierr = DMSetAuxiliaryVec(dm,UXlabel, 1, UX); CHKERRQ(ierr); > >>> ierr = DMSetAuxiliaryVec(dm,UYlabel, 1, UY); CHKERRQ(ierr); > >>> ierr = DMSetAuxiliaryVec(dm,UZlabel, 1, UZ); CHKERRQ(ierr); > >>> //PetscErrorCode PetscViewerVTKAddField(PetscViewer > viewer,PetscObject dm,PetscErrorCode > (*PetscViewerVTKWriteFunction)(PetscObject,PetscViewer),PetscInt > fieldnum,PetscViewerVTKFieldType fieldtype,PetscBool checkdm,PetscObject > vec) > >>> > >>> > >>> > >>> //ierr = PetscViewerVTKAddField(XYZviewer, dm,PetscErrorCode > (*PetscViewerVTKWriteFunction)(Vec,PetscViewer),PETSC_DEFAULT,PETSC_VTK_POINT_FIELD,PETSC_FALSE,UX); > >>> ierr = PetscViewerVTKAddField(XYZviewer, > (PetscObject)dm,&PetscViewerVTKWriteFunction,PETSC_DEFAULT,PETSC_VTK_POINT_FIELD,PETSC_FALSE,(PetscObject)UX); > >>> > >>> > >>> ierr = DMPlexVTKWriteAll((PetscObject)dm, XYZviewer); CHKERRQ(ierr); > >>> ierr = VecAXPY(XYZ,1,U); CHKERRQ(ierr);//Add displacement field to > the mesh coordinates to deform. > >>> ierr = > PetscViewerVTKOpen(PETSC_COMM_WORLD,deformedfile,FILE_MODE_WRITE,&XYZpUviewer); > CHKERRQ(ierr); > >>> ierr = DMPlexVTKWriteAll((PetscObject)dm, XYZpUviewer); > CHKERRQ(ierr);// > >>> PetscViewerDestroy(&XYZviewer); PetscViewerDestroy(&XYZpUviewer); > >>> > >>> }//end else if. > >>> else{ > >>> ierr = PetscPrintf(PETSC_COMM_WORLD,"No output format specified! > Files not saved.\n"); CHKERRQ(ierr); > >>> }//end else. > >>> > >>> > >>> //END OUTPUT > SOLUTION=========================================================== > >>> VecDestroy(&UX); ISDestroy(&ISux); > >>> VecDestroy(&UY); ISDestroy(&ISuy); > >>> VecDestroy(&UZ); ISDestroy(&ISuz); > >>> //END MAX/MIN > DISPLACEMENTS===================================================== > >>> > >>> > //CLEANUP===================================================================== > >>> DMDestroy(&dm); > >>> KSPDestroy(&ksp); > >>> MatDestroy(&K); MatDestroy(&KE); MatDestroy(&matC); > //MatDestroy(preKEtetra4.matB); MatDestroy(preKEtetra4.matBTCB); > >>> VecDestroy(&U); VecDestroy(&F); > >>> > >>> //DMLabelDestroy(&physicalgroups);//Destroyig the DM destroys the > label. > >>> > //CLEANUP===================================================================== > >>> //PetscErrorCode PetscMallocDump(FILE *fp) > >>> //ierr = PetscMallocDump(NULL); > >>> return PetscFinalize();//And the machine shall rest.... > >>> }//end main. > >>> > >>> PetscErrorCode tetra4(PetscScalar* X,PetscScalar* Y, PetscScalar* > Z,struct preKE *P, Mat* matC, Mat* KE){ > >>> //INPUTS: > >>> //X: Global X coordinates of the elemental nodes. > >>> //Y: Global Y coordinates of the elemental nodes. > >>> //Z: Global Z coordinates of the elemental nodes. > >>> //J: Jacobian matrix. > >>> //invJ: Inverse Jacobian matrix. 
> >>> PetscErrorCode ierr;
> >>> //For current quadrature point, get dPsi/dXi_i Xi_i = {Xi,Eta,Zeta}
> >>> /*
> >>> P->dPsidXi[0] = +1.; P->dPsidEta[0] = 0.0; P->dPsidZeta[0] = 0.0;
> >>> P->dPsidXi[1] = 0.0; P->dPsidEta[1] = +1.; P->dPsidZeta[1] = 0.0;
> >>> P->dPsidXi[2] = 0.0; P->dPsidEta[2] = 0.0; P->dPsidZeta[2] = +1.;
> >>> P->dPsidXi[3] = -1.; P->dPsidEta[3] = -1.; P->dPsidZeta[3] = -1.;
> >>> */
> >>> //Populate the Jacobian matrix.
> >>> P->J[0][0] = X[0] - X[3];
> >>> P->J[0][1] = Y[0] - Y[3];
> >>> P->J[0][2] = Z[0] - Z[3];
> >>> P->J[1][0] = X[1] - X[3];
> >>> P->J[1][1] = Y[1] - Y[3];
> >>> P->J[1][2] = Z[1] - Z[3];
> >>> P->J[2][0] = X[2] - X[3];
> >>> P->J[2][1] = Y[2] - Y[3];
> >>> P->J[2][2] = Z[2] - Z[3];
> >>>
> >>> //Determinant of the 3x3 Jacobian. (Expansion along 1st row).
> >>> P->minor00 = P->J[1][1]*P->J[2][2] - P->J[2][1]*P->J[1][2];//Reuse
> when finding InvJ.
> >>> P->minor01 = P->J[1][0]*P->J[2][2] - P->J[2][0]*P->J[1][2];//Reuse
> when finding InvJ.
> >>> P->minor02 = P->J[1][0]*P->J[2][1] - P->J[2][0]*P->J[1][1];//Reuse
> when finding InvJ.
> >>> P->detJ = P->J[0][0]*P->minor00 - P->J[0][1]*P->minor01 +
> P->J[0][2]*P->minor02;
> >>> //Inverse of the 3x3 Jacobian
> >>> P->invJ[0][0] = +P->minor00/P->detJ;//Reuse precomputed minor.
> >>> P->invJ[0][1] = -(P->J[0][1]*P->J[2][2] -
> P->J[0][2]*P->J[2][1])/P->detJ;
> >>> P->invJ[0][2] = +(P->J[0][1]*P->J[1][2] -
> P->J[1][1]*P->J[0][2])/P->detJ;
> >>> P->invJ[1][0] = -P->minor01/P->detJ;//Reuse precomputed minor.
> >>> P->invJ[1][1] = +(P->J[0][0]*P->J[2][2] -
> P->J[0][2]*P->J[2][0])/P->detJ;
> >>> P->invJ[1][2] = -(P->J[0][0]*P->J[1][2] -
> P->J[1][0]*P->J[0][2])/P->detJ;
> >>> P->invJ[2][0] = +P->minor02/P->detJ;//Reuse precomputed minor.
> >>> P->invJ[2][1] = -(P->J[0][0]*P->J[2][1] -
> P->J[0][1]*P->J[2][0])/P->detJ;
> >>> P->invJ[2][2] = +(P->J[0][0]*P->J[1][1] -
> P->J[0][1]*P->J[1][0])/P->detJ;
> >>>
> >>> //*****************STRAIN MATRIX
> (B)**************************************
> >>> for(P->m=0;P->m<P->N;P->m++){//Scan all shape functions.
> >>>
> >>> P->x_in = 0 + P->m*3;//Every 3rd column starting at 0
> >>> P->y_in = P->x_in +1;//Every 3rd column starting at 1
> >>> P->z_in = P->y_in +1;//Every 3rd column starting at 2
> >>>
> >>> P->dX[0] = P->invJ[0][0]*P->dPsidXi[P->m] +
> P->invJ[0][1]*P->dPsidEta[P->m] + P->invJ[0][2]*P->dPsidZeta[P->m];
> >>> P->dY[0] = P->invJ[1][0]*P->dPsidXi[P->m] +
> P->invJ[1][1]*P->dPsidEta[P->m] + P->invJ[1][2]*P->dPsidZeta[P->m];
> >>> P->dZ[0] = P->invJ[2][0]*P->dPsidXi[P->m] +
> P->invJ[2][1]*P->dPsidEta[P->m] + P->invJ[2][2]*P->dPsidZeta[P->m];
> >>>
> >>> P->dX[1] = P->dZ[0]; P->dX[2] = P->dY[0];
> >>> P->dY[1] = P->dZ[0]; P->dY[2] = P->dX[0];
> >>> P->dZ[1] = P->dX[0]; P->dZ[2] = P->dY[0];
> >>>
> >>> ierr =
> MatSetValues(P->matB,3,P->x_insert,1,&(P->x_in),P->dX,INSERT_VALUES);
> CHKERRQ(ierr);
> >>> ierr =
> MatSetValues(P->matB,3,P->y_insert,1,&(P->y_in),P->dY,INSERT_VALUES);
> CHKERRQ(ierr);
> >>> ierr =
> MatSetValues(P->matB,3,P->z_insert,1,&(P->z_in),P->dZ,INSERT_VALUES);
> CHKERRQ(ierr);
> >>>
> >>> }//end "m" loop.
> >>> ierr = MatAssemblyBegin(P->matB,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
> >>> ierr = MatAssemblyEnd(P->matB,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
> >>> //*****************STRAIN MATRIX
> (B)**************************************
> >>>
> >>> //Compute the matrix product B^t*C*B, scale it by the
> quadrature weights and add to KE.
> >>> P->weight = -P->detJ/6; > >>> > >>> ierr = MatZeroEntries(*KE); CHKERRQ(ierr); > >>> ierr = > MatPtAP(*matC,P->matB,MAT_INITIAL_MATRIX,PETSC_DEFAULT,&(P->matBTCB));CHKERRQ(ierr); > >>> ierr = MatScale(P->matBTCB,P->weight); CHKERRQ(ierr); > >>> ierr = MatAssemblyBegin(P->matBTCB,MAT_FINAL_ASSEMBLY); > CHKERRQ(ierr); > >>> ierr = MatAssemblyEnd(P->matBTCB,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); > >>> ierr = MatAXPY(*KE,1,P->matBTCB,DIFFERENT_NONZERO_PATTERN); > CHKERRQ(ierr);//Add contribution of current quadrature point to KE. > >>> > >>> //ierr = > MatPtAP(*matC,P->matB,MAT_INITIAL_MATRIX,PETSC_DEFAULT,KE);CHKERRQ(ierr); > >>> //ierr = MatScale(*KE,P->weight); CHKERRQ(ierr); > >>> > >>> ierr = MatAssemblyBegin(*KE,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); > >>> ierr = MatAssemblyEnd(*KE,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); > >>> > >>> //Cleanup > >>> return ierr; > >>> }//end tetra4. > >>> > >>> PetscErrorCode ConstitutiveMatrix(Mat *matC,const char* type,PetscInt > materialID){ > >>> PetscErrorCode ierr; > >>> PetscBool isotropic = PETSC_FALSE, > >>> orthotropic = PETSC_FALSE; > >>> //PetscErrorCode PetscStrcmp(const char a[],const char > b[],PetscBool *flg) > >>> ierr = PetscStrcmp(type,"isotropic",&isotropic); > >>> ierr = PetscStrcmp(type,"orthotropic",&orthotropic); > >>> ierr = MatCreate(PETSC_COMM_WORLD,matC); CHKERRQ(ierr); > >>> ierr = MatSetSizes(*matC,PETSC_DECIDE,PETSC_DECIDE,6,6); > CHKERRQ(ierr); > >>> ierr = MatSetType(*matC,MATAIJ); CHKERRQ(ierr); > >>> ierr = MatSetUp(*matC); CHKERRQ(ierr); > >>> > >>> if(isotropic){ > >>> PetscReal E,nu, M,L,vals[3]; > >>> switch(materialID){ > >>> case 0://Hardcoded properties for isotropic material #0 > >>> E = 200; > >>> nu = 1./3; > >>> break; > >>> case 1://Hardcoded properties for isotropic material #1 > >>> E = 96; > >>> nu = 1./3; > >>> break; > >>> }//end switch. > >>> M = E/(2*(1+nu)),//Lame's constant 1 ("mu"). > >>> L = E*nu/((1+nu)*(1-2*nu));//Lame's constant 2 ("lambda"). > >>> //PetscErrorCode MatSetValues(Mat mat,PetscInt m,const PetscInt > idxm[],PetscInt n,const PetscInt idxn[],const PetscScalar v[],InsertMode > addv) > >>> PetscInt idxn[3] = {0,1,2}; > >>> vals[0] = L+2*M; vals[1] = L; vals[2] = vals[1]; > >>> ierr = MatSetValues(*matC,1,&idxn[0],3,idxn,vals,INSERT_VALUES); > CHKERRQ(ierr); > >>> vals[1] = vals[0]; vals[0] = vals[2]; > >>> ierr = MatSetValues(*matC,1,&idxn[1],3,idxn,vals,INSERT_VALUES); > CHKERRQ(ierr); > >>> vals[2] = vals[1]; vals[1] = vals[0]; > >>> ierr = MatSetValues(*matC,1,&idxn[2],3,idxn,vals,INSERT_VALUES); > CHKERRQ(ierr); > >>> ierr = MatSetValue(*matC,3,3,M,INSERT_VALUES); CHKERRQ(ierr); > >>> ierr = MatSetValue(*matC,4,4,M,INSERT_VALUES); CHKERRQ(ierr); > >>> ierr = MatSetValue(*matC,5,5,M,INSERT_VALUES); CHKERRQ(ierr); > >>> }//end if. > >>> /* > >>> else if(orthotropic){ > >>> switch(materialID){ > >>> case 0: > >>> break; > >>> case 1: > >>> break; > >>> }//end switch. > >>> }//end else if. 
> >>> */ > >>> ierr = MatAssemblyBegin(*matC,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); > >>> ierr = MatAssemblyEnd(*matC,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); > >>> //MatView(*matC,0); > >>> return ierr; > >>> }//End ConstitutiveMatrix > >>> > >>> PetscErrorCode InitializeKEpreallocation(struct preKE *P,const char* > type){ > >>> PetscErrorCode ierr; > >>> PetscBool istetra4 = PETSC_FALSE, > >>> ishex8 = PETSC_FALSE; > >>> ierr = PetscStrcmp(type,"tetra4",&istetra4); CHKERRQ(ierr); > >>> ierr = PetscStrcmp(type,"hex8",&ishex8); CHKERRQ(ierr); > >>> if(istetra4){ > >>> P->sizeKE = 12; > >>> P->N = 4; > >>> }//end if. > >>> else if(ishex8){ > >>> P->sizeKE = 24; > >>> P->N = 8; > >>> }//end else if. > >>> > >>> > >>> P->x_insert[0] = 0; P->x_insert[1] = 3; P->x_insert[2] = 5; > >>> P->y_insert[0] = 1; P->y_insert[1] = 4; P->y_insert[2] = 5; > >>> P->z_insert[0] = 2; P->z_insert[1] = 3; P->z_insert[2] = 4; > >>> //Allocate memory for the differentiated shape function vectors. > >>> ierr = PetscMalloc1(P->N,&(P->dPsidXi)); CHKERRQ(ierr); > >>> ierr = PetscMalloc1(P->N,&(P->dPsidEta)); CHKERRQ(ierr); > >>> ierr = PetscMalloc1(P->N,&(P->dPsidZeta)); CHKERRQ(ierr); > >>> > >>> P->dPsidXi[0] = +1.; P->dPsidEta[0] = 0.0; P->dPsidZeta[0] = 0.0; > >>> P->dPsidXi[1] = 0.0; P->dPsidEta[1] = +1.; P->dPsidZeta[1] = 0.0; > >>> P->dPsidXi[2] = 0.0; P->dPsidEta[2] = 0.0; P->dPsidZeta[2] = +1.; > >>> P->dPsidXi[3] = -1.; P->dPsidEta[3] = -1.; P->dPsidZeta[3] = -1.; > >>> > >>> > >>> //Strain matrix. > >>> ierr = MatCreate(PETSC_COMM_WORLD,&(P->matB)); CHKERRQ(ierr); > >>> ierr = MatSetSizes(P->matB,PETSC_DECIDE,PETSC_DECIDE,6,P->sizeKE); > CHKERRQ(ierr);//Hardcoded > >>> ierr = MatSetType(P->matB,MATAIJ); CHKERRQ(ierr); > >>> ierr = MatSetUp(P->matB); CHKERRQ(ierr); > >>> > >>> //Contribution matrix. > >>> ierr = MatCreate(PETSC_COMM_WORLD,&(P->matBTCB)); CHKERRQ(ierr); > >>> ierr = > MatSetSizes(P->matBTCB,PETSC_DECIDE,PETSC_DECIDE,P->sizeKE,P->sizeKE); > CHKERRQ(ierr); > >>> ierr = MatSetType(P->matBTCB,MATAIJ); CHKERRQ(ierr); > >>> ierr = MatSetUp(P->matBTCB); CHKERRQ(ierr); > >>> > >>> //Element stiffness matrix. > >>> //ierr = MatCreateSeqDense(PETSC_COMM_SELF,12,12,NULL,&KE); > CHKERRQ(ierr); //PARALLEL > >>> > >>> return ierr; > >>> } > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From thibault.bridelbertomeu at gmail.com Sat Jan 8 02:04:51 2022 From: thibault.bridelbertomeu at gmail.com (Thibault Bridel-Bertomeu) Date: Sat, 8 Jan 2022 09:04:51 +0100 Subject: [petsc-users] Fluid-Structure interaction with multiple DMPlex In-Reply-To: References: Message-ID: Le ven. 7 janv. 2022 ? 19:45, Thibault Bridel-Bertomeu < thibault.bridelbertomeu at gmail.com> a ?crit : > > > Le ven. 7 janv. 2022 ? 
19:23, Matthew Knepley a > ?crit : > >> On Fri, Jan 7, 2022 at 12:58 PM Thibault Bridel-Bertomeu < >> thibault.bridelbertomeu at gmail.com> wrote: >> >>> >>> Le ven. 7 janv. 2022 ? 14:54, Matthew Knepley a >>> ?crit : >>> >>>> On Fri, Jan 7, 2022 at 8:52 AM Thibault Bridel-Bertomeu < >>>> thibault.bridelbertomeu at gmail.com> wrote: >>>> >>>>> Hi Matthew, >>>>> >>>>> Le ven. 7 janv. 2022 ? 14:44, Matthew Knepley a >>>>> ?crit : >>>>> >>>>>> On Fri, Jan 7, 2022 at 5:46 AM Thibault Bridel-Bertomeu < >>>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>>> >>>>>>> Dear all, >>>>>>> >>>>>>> First of, happy new year everyone !! All the best ! >>>>>>> >>>>>> >>>>>> Happy New Year! >>>>>> >>>>>> >>>>>>> I am starting to draft a new project that will be about >>>>>>> fluid-structure interaction: in particular, the idea is to compute the >>>>>>> Navier-Stokes (or Euler nevermind) flow around an object and _at the same >>>>>>> time_ compute the heat equation inside the object. >>>>>>> So basically, I am thinking a mesh of the fluid and a mesh of the >>>>>>> object, both meshes being linked at the fluid - solid interface. >>>>>>> >>>>>> >>>>>> First question: Are these meshes intended to match on the interface? >>>>>> If not, this sounds like overset grids or immersed boundary/interface >>>>>> methods. In this case, more than one mesh makes sense to me. If they are >>>>>> intended to match, then I would advocate a single mesh with multiple >>>>>> problems defined on it. I have experimented with this, for example see SNES >>>>>> ex23 where I have a field in only part of the domain. I have a large >>>>>> project to do exactly this in a rocket engine now. >>>>>> >>>>> >>>>> Yes the way I see it is more of a single mesh with two distinct >>>>> regions to distinguish between the fluid and the solid. I was talking about >>>>> two meshes to try and explain my vision but it seems like it was unclear. >>>>> Imagine if you wish a rectangular box with a sphere inclusion: the >>>>> sphere would be tagged as a solid and the rest of the domain as fluid. >>>>> Using Gmsh volumes for instance. >>>>> Ill check out the SNES example ! Thanks ! >>>>> >>>>> >>>>>> >>>>>>> First (Matthew maybe ?) do you think it is something that could be >>>>>>> done using two DMPlex's that would somehow be spawned from reading a Gmsh >>>>>>> mesh with two volumes ? >>>>>>> >>>>>> >>>>>> You can take a mesh and filter out part of it with DMPlexFilter(). >>>>>> That is not used much so I may have to fix it to do what you want, but that >>>>>> should be easy. >>>>>> >>>>>> >>>>>>> And on one DMPlex we would have finite volume for the fluid, on the >>>>>>> other finite elements for the heat eqn ? >>>>>>> >>>>>> >>>>>> I have done this exact thing on a single mesh. It should be no harder >>>>>> on two meshes if you go that route. >>>>>> >>>>>> >>>>>>> Second, is it something that anyone in the community has ever >>>>>>> imagined doing with PETSc DMPlex's ? >>>>>>> >>>>>> >>>>>> Yes, I had a combined FV+FEM simulation of magma dynamics (I should >>>>>> make it an example), and currently we are doing FVM+FEM for simulation of a >>>>>> rocket engine. >>>>>> >>>>> >>>>> Wow so it seems like it?s the exact same thing I would like to achieve >>>>> as the rocket engine example. >>>>> So you have a single mesh and two regions tagged differently, and you >>>>> use the DmPlexFilter to solve FVM and FEM separately ? >>>>> >>>> >>>> With a single mesh, you do not even need DMPlexFilter. You just use the >>>> labels that Gmsh gives you. 
I think we should be able to get it going in a >>>> straightforward way. >>>> >>> >>> Ok then ! Thanks ! I?ll give it a shot and see what happens ! >>> Setting up the FVM and FEM discretizations will pass by DMSetField right >>> ? With a single mesh tagged with two different regions, it should show up >>> as two fields, is that correct ? >>> >> >> Yes, the idea is as follows. Each field also has a label argument that is >> the support of the field in the domain. Then we create PetscDS objects for >> each >> separate set of overlapping fields. The current algorithm is not complete >> I think, so let me know if this step fails. >> > > Ok, thanks. > I?ll let you know and share snippets when I have something started ! > > Talk soon ! Thanks ! > Hi Matthew, I thought about a little something else : what about setting two different TS, one for each field of the DM ? Most probably the fluid part would be solved with an explicit time stepping whereas the solid part with the heat equation would benefit from implicit time stepping. TSSetDM does not allow a field specification, is there a way to hack that so that each field has its own TS ? Thanks Thibault > Thibault > > >> Thanks, >> >> Matt >> >> >>> Thanks, >>> >>> Thibault >>> >>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> Thanks ! >>>>> >>>>> Thibault >>>>> >>>>> >>>>>> Thanks, >>>>>> >>>>>> Matt >>>>>> >>>>>> >>>>>>> As I said it is very prospective, I just wanted to have your opinion >>>>>>> !! >>>>>>> >>>>>>> Thanks very much in advance everyone !! >>>>>>> >>>>>>> Cheers, >>>>>>> Thibault >>>>>>> >>>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their >>>>>> experiments is infinitely more interesting than any results to which their >>>>>> experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>> >>>>>> >>>>> -- >>>>> Thibault Bridel-Bertomeu >>>>> ? >>>>> Eng, MSc, PhD >>>>> Research Engineer >>>>> CEA/CESTA >>>>> 33114 LE BARP >>>>> Tel.: (+33)557046924 >>>>> Mob.: (+33)611025322 >>>>> Mail: thibault.bridelbertomeu at gmail.com >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> -- >>> Thibault Bridel-Bertomeu >>> ? >>> Eng, MSc, PhD >>> Research Engineer >>> CEA/CESTA >>> 33114 LE BARP >>> Tel.: (+33)557046924 >>> Mob.: (+33)611025322 >>> Mail: thibault.bridelbertomeu at gmail.com >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- > Thibault Bridel-Bertomeu > ? > Eng, MSc, PhD > Research Engineer > CEA/CESTA > 33114 LE BARP > Tel.: (+33)557046924 > Mob.: (+33)611025322 > Mail: thibault.bridelbertomeu at gmail.com > -- Thibault Bridel-Bertomeu ? Eng, MSc, PhD Research Engineer CEA/CESTA 33114 LE BARP Tel.: (+33)557046924 Mob.: (+33)611025322 Mail: thibault.bridelbertomeu at gmail.com -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Sat Jan 8 09:58:07 2022 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 8 Jan 2022 10:58:07 -0500 Subject: [petsc-users] Fluid-Structure interaction with multiple DMPlex In-Reply-To: References: Message-ID: On Sat, Jan 8, 2022 at 3:05 AM Thibault Bridel-Bertomeu < thibault.bridelbertomeu at gmail.com> wrote: > Le ven. 7 janv. 2022 ? 19:45, Thibault Bridel-Bertomeu < > thibault.bridelbertomeu at gmail.com> a ?crit : > >> Le ven. 7 janv. 2022 ? 19:23, Matthew Knepley a >> ?crit : >> >>> On Fri, Jan 7, 2022 at 12:58 PM Thibault Bridel-Bertomeu < >>> thibault.bridelbertomeu at gmail.com> wrote: >>> >>>> >>>> Le ven. 7 janv. 2022 ? 14:54, Matthew Knepley a >>>> ?crit : >>>> >>>>> On Fri, Jan 7, 2022 at 8:52 AM Thibault Bridel-Bertomeu < >>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>> >>>>>> Hi Matthew, >>>>>> >>>>>> Le ven. 7 janv. 2022 ? 14:44, Matthew Knepley a >>>>>> ?crit : >>>>>> >>>>>>> On Fri, Jan 7, 2022 at 5:46 AM Thibault Bridel-Bertomeu < >>>>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>>>> >>>>>>>> Dear all, >>>>>>>> >>>>>>>> First of, happy new year everyone !! All the best ! >>>>>>>> >>>>>>> >>>>>>> Happy New Year! >>>>>>> >>>>>>> >>>>>>>> I am starting to draft a new project that will be about >>>>>>>> fluid-structure interaction: in particular, the idea is to compute the >>>>>>>> Navier-Stokes (or Euler nevermind) flow around an object and _at the same >>>>>>>> time_ compute the heat equation inside the object. >>>>>>>> So basically, I am thinking a mesh of the fluid and a mesh of the >>>>>>>> object, both meshes being linked at the fluid - solid interface. >>>>>>>> >>>>>>> >>>>>>> First question: Are these meshes intended to match on the interface? >>>>>>> If not, this sounds like overset grids or immersed boundary/interface >>>>>>> methods. In this case, more than one mesh makes sense to me. If they are >>>>>>> intended to match, then I would advocate a single mesh with multiple >>>>>>> problems defined on it. I have experimented with this, for example see SNES >>>>>>> ex23 where I have a field in only part of the domain. I have a large >>>>>>> project to do exactly this in a rocket engine now. >>>>>>> >>>>>> >>>>>> Yes the way I see it is more of a single mesh with two distinct >>>>>> regions to distinguish between the fluid and the solid. I was talking about >>>>>> two meshes to try and explain my vision but it seems like it was unclear. >>>>>> Imagine if you wish a rectangular box with a sphere inclusion: the >>>>>> sphere would be tagged as a solid and the rest of the domain as fluid. >>>>>> Using Gmsh volumes for instance. >>>>>> Ill check out the SNES example ! Thanks ! >>>>>> >>>>>> >>>>>>> >>>>>>>> First (Matthew maybe ?) do you think it is something that could be >>>>>>>> done using two DMPlex's that would somehow be spawned from reading a Gmsh >>>>>>>> mesh with two volumes ? >>>>>>>> >>>>>>> >>>>>>> You can take a mesh and filter out part of it with DMPlexFilter(). >>>>>>> That is not used much so I may have to fix it to do what you want, but that >>>>>>> should be easy. >>>>>>> >>>>>>> >>>>>>>> And on one DMPlex we would have finite volume for the fluid, on the >>>>>>>> other finite elements for the heat eqn ? >>>>>>>> >>>>>>> >>>>>>> I have done this exact thing on a single mesh. It should be no >>>>>>> harder on two meshes if you go that route. >>>>>>> >>>>>>> >>>>>>>> Second, is it something that anyone in the community has ever >>>>>>>> imagined doing with PETSc DMPlex's ? 
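To make the "single mesh with multiple problems defined on it" idea concrete, here is a minimal sketch of the setup stage only, with no physics attached. The label names "fluid" and "solid" are an assumption about how the two Gmsh volumes come in (for example via named physical groups; they may instead appear as two values of a single "Cell Sets" label), and the component counts are placeholders:

#include <petsc.h>

int main(int argc, char **args)
{
  DM             dm;
  DMLabel        fluidLabel, solidLabel;
  PetscFV        fv;
  PetscFE        fe;
  PetscInt       dim;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &args, NULL, NULL); if (ierr) return ierr;
  ierr = DMPlexCreateGmshFromFile(PETSC_COMM_WORLD, "fsi.msh", PETSC_TRUE, &dm); CHKERRQ(ierr);
  ierr = DMGetDimension(dm, &dim); CHKERRQ(ierr);

  //Assumed label names: one label per Gmsh volume (fluid/solid).
  ierr = DMGetLabel(dm, "fluid", &fluidLabel); CHKERRQ(ierr);
  ierr = DMGetLabel(dm, "solid", &solidLabel); CHKERRQ(ierr);

  //Field 0: finite-volume flow state, supported only on the fluid cells.
  ierr = PetscFVCreate(PETSC_COMM_WORLD, &fv); CHKERRQ(ierr);
  ierr = PetscFVSetSpatialDimension(fv, dim); CHKERRQ(ierr);
  ierr = PetscFVSetNumComponents(fv, dim + 2); CHKERRQ(ierr);//e.g. Euler: rho, rho*u_i, rho*E.
  ierr = PetscFVSetFromOptions(fv); CHKERRQ(ierr);
  ierr = DMSetField(dm, 0, fluidLabel, (PetscObject)fv); CHKERRQ(ierr);

  //Field 1: P1 temperature, supported only on the solid cells (assumes a simplex mesh).
  ierr = PetscFECreateDefault(PETSC_COMM_WORLD, dim, 1, PETSC_TRUE, "temp_", -1, &fe); CHKERRQ(ierr);
  ierr = DMSetField(dm, 1, solidLabel, (PetscObject)fe); CHKERRQ(ierr);

  //One PetscDS is created per set of overlapping fields.
  ierr = DMCreateDS(dm); CHKERRQ(ierr);

  ierr = PetscFVDestroy(&fv); CHKERRQ(ierr);
  ierr = PetscFEDestroy(&fe); CHKERRQ(ierr);
  ierr = DMDestroy(&dm); CHKERRQ(ierr);
  return PetscFinalize();
}

After DMCreateDS(), DMGetNumDS() is a quick way to check that the two fields really were restricted to their regions rather than defined over the whole mesh.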
>>>>>>>> >>>>>>> >>>>>>> Yes, I had a combined FV+FEM simulation of magma dynamics (I should >>>>>>> make it an example), and currently we are doing FVM+FEM for simulation of a >>>>>>> rocket engine. >>>>>>> >>>>>> >>>>>> Wow so it seems like it?s the exact same thing I would like to >>>>>> achieve as the rocket engine example. >>>>>> So you have a single mesh and two regions tagged differently, and you >>>>>> use the DmPlexFilter to solve FVM and FEM separately ? >>>>>> >>>>> >>>>> With a single mesh, you do not even need DMPlexFilter. You just use >>>>> the labels that Gmsh gives you. I think we should be able to get it going >>>>> in a straightforward way. >>>>> >>>> >>>> Ok then ! Thanks ! I?ll give it a shot and see what happens ! >>>> Setting up the FVM and FEM discretizations will pass by DMSetField >>>> right ? With a single mesh tagged with two different regions, it should >>>> show up as two fields, is that correct ? >>>> >>> >>> Yes, the idea is as follows. Each field also has a label argument that >>> is the support of the field in the domain. Then we create PetscDS objects >>> for each >>> separate set of overlapping fields. The current algorithm is not >>> complete I think, so let me know if this step fails. >>> >> >> Ok, thanks. >> I?ll let you know and share snippets when I have something started ! >> >> Talk soon ! Thanks ! >> > > Hi Matthew, > > I thought about a little something else : what about setting two different > TS, one for each field of the DM ? Most probably the fluid part would be > solved with an explicit time stepping whereas the solid part with the heat > equation would benefit from implicit time stepping. TSSetDM does not allow > a field specification, is there a way to hack that so that each field has > its own TS ? > I see at least two options here: 1. Split the problems: You can use DMCreateSubDM() to split off part of a problem and use a solver on that. I have done this for problems with weak coupling. 2. Use IMEX For strong coupling, I have used the IMEX TSes in PETSc. You put the explicit terms in the RHS, and the implicit in the IFunction. Thanks, Matt > Thanks > > Thibault > > >> Thibault >> >> >>> Thanks, >>> >>> Matt >>> >>> >>>> Thanks, >>>> >>>> Thibault >>>> >>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> >>>>>> Thanks ! >>>>>> >>>>>> Thibault >>>>>> >>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Matt >>>>>>> >>>>>>> >>>>>>>> As I said it is very prospective, I just wanted to have your >>>>>>>> opinion !! >>>>>>>> >>>>>>>> Thanks very much in advance everyone !! >>>>>>>> >>>>>>>> Cheers, >>>>>>>> Thibault >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted before they begin their >>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>> experiments lead. >>>>>>> -- Norbert Wiener >>>>>>> >>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>> >>>>>>> >>>>>> -- >>>>>> Thibault Bridel-Bertomeu >>>>>> ? >>>>>> Eng, MSc, PhD >>>>>> Research Engineer >>>>>> CEA/CESTA >>>>>> 33114 LE BARP >>>>>> Tel.: (+33)557046924 >>>>>> Mob.: (+33)611025322 >>>>>> Mail: thibault.bridelbertomeu at gmail.com >>>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ >>>>> >>>>> >>>> -- >>>> Thibault Bridel-Bertomeu >>>> ? 
>>>> Eng, MSc, PhD >>>> Research Engineer >>>> CEA/CESTA >>>> 33114 LE BARP >>>> Tel.: (+33)557046924 >>>> Mob.: (+33)611025322 >>>> Mail: thibault.bridelbertomeu at gmail.com >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> -- >> Thibault Bridel-Bertomeu >> ? >> Eng, MSc, PhD >> Research Engineer >> CEA/CESTA >> 33114 LE BARP >> Tel.: (+33)557046924 >> Mob.: (+33)611025322 >> Mail: thibault.bridelbertomeu at gmail.com >> > -- > Thibault Bridel-Bertomeu > ? > Eng, MSc, PhD > Research Engineer > CEA/CESTA > 33114 LE BARP > Tel.: (+33)557046924 > Mob.: (+33)611025322 > Mail: thibault.bridelbertomeu at gmail.com > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Sat Jan 8 12:03:01 2022 From: mfadams at lbl.gov (Mark Adams) Date: Sat, 8 Jan 2022 13:03:01 -0500 Subject: [petsc-users] Fluid-Structure interaction with multiple DMPlex In-Reply-To: References: Message-ID: Can you subcycle with IMEX? On Sat, Jan 8, 2022 at 10:58 AM Matthew Knepley wrote: > On Sat, Jan 8, 2022 at 3:05 AM Thibault Bridel-Bertomeu < > thibault.bridelbertomeu at gmail.com> wrote: > >> Le ven. 7 janv. 2022 ? 19:45, Thibault Bridel-Bertomeu < >> thibault.bridelbertomeu at gmail.com> a ?crit : >> >>> Le ven. 7 janv. 2022 ? 19:23, Matthew Knepley a >>> ?crit : >>> >>>> On Fri, Jan 7, 2022 at 12:58 PM Thibault Bridel-Bertomeu < >>>> thibault.bridelbertomeu at gmail.com> wrote: >>>> >>>>> >>>>> Le ven. 7 janv. 2022 ? 14:54, Matthew Knepley a >>>>> ?crit : >>>>> >>>>>> On Fri, Jan 7, 2022 at 8:52 AM Thibault Bridel-Bertomeu < >>>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>>> >>>>>>> Hi Matthew, >>>>>>> >>>>>>> Le ven. 7 janv. 2022 ? 14:44, Matthew Knepley a >>>>>>> ?crit : >>>>>>> >>>>>>>> On Fri, Jan 7, 2022 at 5:46 AM Thibault Bridel-Bertomeu < >>>>>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>>>>> >>>>>>>>> Dear all, >>>>>>>>> >>>>>>>>> First of, happy new year everyone !! All the best ! >>>>>>>>> >>>>>>>> >>>>>>>> Happy New Year! >>>>>>>> >>>>>>>> >>>>>>>>> I am starting to draft a new project that will be about >>>>>>>>> fluid-structure interaction: in particular, the idea is to compute the >>>>>>>>> Navier-Stokes (or Euler nevermind) flow around an object and _at the same >>>>>>>>> time_ compute the heat equation inside the object. >>>>>>>>> So basically, I am thinking a mesh of the fluid and a mesh of the >>>>>>>>> object, both meshes being linked at the fluid - solid interface. >>>>>>>>> >>>>>>>> >>>>>>>> First question: Are these meshes intended to match on the >>>>>>>> interface? If not, this sounds like overset grids or immersed >>>>>>>> boundary/interface methods. In this case, more than one mesh makes sense to >>>>>>>> me. If they are intended to match, then I would advocate a single mesh with >>>>>>>> multiple problems defined on it. I have experimented with this, for example >>>>>>>> see SNES ex23 where I have a field in only part of the domain. I have a >>>>>>>> large project to do exactly this in a rocket engine now. 
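As a side note on the weak-coupling route mentioned earlier (DMCreateSubDM() to split off part of a problem and give it its own solver), here is a rough sketch of what one solid sub-step could look like. The field index, the function name, and the backward Euler choice are assumptions for illustration only, not an existing implementation:

#include <petsc.h>

//Sketch only: pull field 1 (assumed to be the solid temperature) out of the full DM,
//advance it with its own implicit TS over [t, t+dt], then hand the values back.
PetscErrorCode SolveSolidStep(DM dm, Vec U, PetscReal t, PetscReal dt)
{
  DM             subdm;
  IS             isField;
  Vec            Usolid;
  TS             ts;
  const PetscInt field = 1;//Assumed index of the solid temperature field.
  PetscErrorCode ierr;

  ierr = DMCreateSubDM(dm, 1, &field, &isField, &subdm); CHKERRQ(ierr);
  ierr = VecGetSubVector(U, isField, &Usolid); CHKERRQ(ierr);

  ierr = TSCreate(PetscObjectComm((PetscObject)dm), &ts); CHKERRQ(ierr);
  ierr = TSSetDM(ts, subdm); CHKERRQ(ierr);
  ierr = TSSetType(ts, TSBEULER); CHKERRQ(ierr);//Implicit stepping for the heat equation.
  //... set the IFunction/IJacobian of the heat equation on subdm here ...
  ierr = TSSetTime(ts, t); CHKERRQ(ierr);
  ierr = TSSetTimeStep(ts, dt); CHKERRQ(ierr);
  ierr = TSSetMaxTime(ts, t + dt); CHKERRQ(ierr);
  ierr = TSSetExactFinalTime(ts, TS_EXACTFINALTIME_MATCHSTEP); CHKERRQ(ierr);
  ierr = TSSetFromOptions(ts); CHKERRQ(ierr);
  ierr = TSSolve(ts, Usolid); CHKERRQ(ierr);

  ierr = VecRestoreSubVector(U, isField, &Usolid); CHKERRQ(ierr);
  ierr = TSDestroy(&ts); CHKERRQ(ierr);
  ierr = ISDestroy(&isField); CHKERRQ(ierr);
  ierr = DMDestroy(&subdm); CHKERRQ(ierr);
  return 0;
}

The fluid field would be advanced separately (explicitly) between such calls, which is where the weak coupling and any subcycling would live.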
>>>>>>>> >>>>>>> >>>>>>> Yes the way I see it is more of a single mesh with two distinct >>>>>>> regions to distinguish between the fluid and the solid. I was talking about >>>>>>> two meshes to try and explain my vision but it seems like it was unclear. >>>>>>> Imagine if you wish a rectangular box with a sphere inclusion: the >>>>>>> sphere would be tagged as a solid and the rest of the domain as fluid. >>>>>>> Using Gmsh volumes for instance. >>>>>>> Ill check out the SNES example ! Thanks ! >>>>>>> >>>>>>> >>>>>>>> >>>>>>>>> First (Matthew maybe ?) do you think it is something that could be >>>>>>>>> done using two DMPlex's that would somehow be spawned from reading a Gmsh >>>>>>>>> mesh with two volumes ? >>>>>>>>> >>>>>>>> >>>>>>>> You can take a mesh and filter out part of it with DMPlexFilter(). >>>>>>>> That is not used much so I may have to fix it to do what you want, but that >>>>>>>> should be easy. >>>>>>>> >>>>>>>> >>>>>>>>> And on one DMPlex we would have finite volume for the fluid, on >>>>>>>>> the other finite elements for the heat eqn ? >>>>>>>>> >>>>>>>> >>>>>>>> I have done this exact thing on a single mesh. It should be no >>>>>>>> harder on two meshes if you go that route. >>>>>>>> >>>>>>>> >>>>>>>>> Second, is it something that anyone in the community has ever >>>>>>>>> imagined doing with PETSc DMPlex's ? >>>>>>>>> >>>>>>>> >>>>>>>> Yes, I had a combined FV+FEM simulation of magma dynamics (I should >>>>>>>> make it an example), and currently we are doing FVM+FEM for simulation of a >>>>>>>> rocket engine. >>>>>>>> >>>>>>> >>>>>>> Wow so it seems like it?s the exact same thing I would like to >>>>>>> achieve as the rocket engine example. >>>>>>> So you have a single mesh and two regions tagged differently, and >>>>>>> you use the DmPlexFilter to solve FVM and FEM separately ? >>>>>>> >>>>>> >>>>>> With a single mesh, you do not even need DMPlexFilter. You just use >>>>>> the labels that Gmsh gives you. I think we should be able to get it going >>>>>> in a straightforward way. >>>>>> >>>>> >>>>> Ok then ! Thanks ! I?ll give it a shot and see what happens ! >>>>> Setting up the FVM and FEM discretizations will pass by DMSetField >>>>> right ? With a single mesh tagged with two different regions, it should >>>>> show up as two fields, is that correct ? >>>>> >>>> >>>> Yes, the idea is as follows. Each field also has a label argument that >>>> is the support of the field in the domain. Then we create PetscDS objects >>>> for each >>>> separate set of overlapping fields. The current algorithm is not >>>> complete I think, so let me know if this step fails. >>>> >>> >>> Ok, thanks. >>> I?ll let you know and share snippets when I have something started ! >>> >>> Talk soon ! Thanks ! >>> >> >> Hi Matthew, >> >> I thought about a little something else : what about setting two >> different TS, one for each field of the DM ? Most probably the fluid part >> would be solved with an explicit time stepping whereas the solid part with >> the heat equation would benefit from implicit time stepping. TSSetDM does >> not allow a field specification, is there a way to hack that so that each >> field has its own TS ? >> > > I see at least two options here: > > 1. Split the problems: > > You can use DMCreateSubDM() to split off part of a problem and use a > solver on that. I have done this for problems with weak coupling. > > 2. Use IMEX > > For strong coupling, I have used the IMEX TSes in PETSc. You put the > explicit terms in the RHS, and the implicit in the IFunction. 
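A minimal sketch of that IMEX split on a single TS follows; the two residual routines are placeholders for an explicit FV flux evaluation and an implicit FEM heat operator, not existing PETSc calls:

#include <petsc.h>

//Placeholders: FormRHSFunction_Fluid would hold the explicit FV flux evaluation,
//FormIFunction_Solid the stiff FEM heat-conduction terms.
extern PetscErrorCode FormRHSFunction_Fluid(TS, PetscReal, Vec, Vec, void *);
extern PetscErrorCode FormIFunction_Solid(TS, PetscReal, Vec, Vec, Vec, void *);

PetscErrorCode SetupIMEX(DM dm, Vec U, TS *ts)
{
  PetscErrorCode ierr;

  ierr = TSCreate(PetscObjectComm((PetscObject)dm), ts); CHKERRQ(ierr);
  ierr = TSSetDM(*ts, dm); CHKERRQ(ierr);
  ierr = TSSetType(*ts, TSARKIMEX); CHKERRQ(ierr);//Additive Runge-Kutta IMEX family.
  //Explicit part: the advective/flux terms go in the RHS function.
  ierr = TSSetRHSFunction(*ts, NULL, FormRHSFunction_Fluid, NULL); CHKERRQ(ierr);
  //Implicit part: the stiff terms go in the IFunction (an IJacobian, or -snes_mf, is also needed).
  ierr = TSSetIFunction(*ts, NULL, FormIFunction_Solid, NULL); CHKERRQ(ierr);
  ierr = TSSetSolution(*ts, U); CHKERRQ(ierr);
  ierr = TSSetFromOptions(*ts); CHKERRQ(ierr);
  return 0;
}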
> > Thanks, > > Matt > > >> Thanks >> >> Thibault >> >> >>> Thibault >>> >>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> Thanks, >>>>> >>>>> Thibault >>>>> >>>>> >>>>>> Thanks, >>>>>> >>>>>> Matt >>>>>> >>>>>> >>>>>>> Thanks ! >>>>>>> >>>>>>> Thibault >>>>>>> >>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Matt >>>>>>>> >>>>>>>> >>>>>>>>> As I said it is very prospective, I just wanted to have your >>>>>>>>> opinion !! >>>>>>>>> >>>>>>>>> Thanks very much in advance everyone !! >>>>>>>>> >>>>>>>>> Cheers, >>>>>>>>> Thibault >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> What most experimenters take for granted before they begin their >>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>> experiments lead. >>>>>>>> -- Norbert Wiener >>>>>>>> >>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>> >>>>>>>> >>>>>>> -- >>>>>>> Thibault Bridel-Bertomeu >>>>>>> ? >>>>>>> Eng, MSc, PhD >>>>>>> Research Engineer >>>>>>> CEA/CESTA >>>>>>> 33114 LE BARP >>>>>>> Tel.: (+33)557046924 >>>>>>> Mob.: (+33)611025322 >>>>>>> Mail: thibault.bridelbertomeu at gmail.com >>>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their >>>>>> experiments is infinitely more interesting than any results to which their >>>>>> experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>> >>>>>> >>>>> -- >>>>> Thibault Bridel-Bertomeu >>>>> ? >>>>> Eng, MSc, PhD >>>>> Research Engineer >>>>> CEA/CESTA >>>>> 33114 LE BARP >>>>> Tel.: (+33)557046924 >>>>> Mob.: (+33)611025322 >>>>> Mail: thibault.bridelbertomeu at gmail.com >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> -- >>> Thibault Bridel-Bertomeu >>> ? >>> Eng, MSc, PhD >>> Research Engineer >>> CEA/CESTA >>> 33114 LE BARP >>> Tel.: (+33)557046924 >>> Mob.: (+33)611025322 >>> Mail: thibault.bridelbertomeu at gmail.com >>> >> -- >> Thibault Bridel-Bertomeu >> ? >> Eng, MSc, PhD >> Research Engineer >> CEA/CESTA >> 33114 LE BARP >> Tel.: (+33)557046924 >> Mob.: (+33)611025322 >> Mail: thibault.bridelbertomeu at gmail.com >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sat Jan 8 12:22:02 2022 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 8 Jan 2022 13:22:02 -0500 Subject: [petsc-users] Fluid-Structure interaction with multiple DMPlex In-Reply-To: References: Message-ID: I do not know how. Right now, composable TS does not work all the way. Matt On Sat, Jan 8, 2022 at 1:03 PM Mark Adams wrote: > Can you subcycle with IMEX? > > On Sat, Jan 8, 2022 at 10:58 AM Matthew Knepley wrote: > >> On Sat, Jan 8, 2022 at 3:05 AM Thibault Bridel-Bertomeu < >> thibault.bridelbertomeu at gmail.com> wrote: >> >>> Le ven. 7 janv. 2022 ? 19:45, Thibault Bridel-Bertomeu < >>> thibault.bridelbertomeu at gmail.com> a ?crit : >>> >>>> Le ven. 7 janv. 2022 ? 19:23, Matthew Knepley a >>>> ?crit : >>>> >>>>> On Fri, Jan 7, 2022 at 12:58 PM Thibault Bridel-Bertomeu < >>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>> >>>>>> >>>>>> Le ven. 
7 janv. 2022 ? 14:54, Matthew Knepley a >>>>>> ?crit : >>>>>> >>>>>>> On Fri, Jan 7, 2022 at 8:52 AM Thibault Bridel-Bertomeu < >>>>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>>>> >>>>>>>> Hi Matthew, >>>>>>>> >>>>>>>> Le ven. 7 janv. 2022 ? 14:44, Matthew Knepley >>>>>>>> a ?crit : >>>>>>>> >>>>>>>>> On Fri, Jan 7, 2022 at 5:46 AM Thibault Bridel-Bertomeu < >>>>>>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>>>>>> >>>>>>>>>> Dear all, >>>>>>>>>> >>>>>>>>>> First of, happy new year everyone !! All the best ! >>>>>>>>>> >>>>>>>>> >>>>>>>>> Happy New Year! >>>>>>>>> >>>>>>>>> >>>>>>>>>> I am starting to draft a new project that will be about >>>>>>>>>> fluid-structure interaction: in particular, the idea is to compute the >>>>>>>>>> Navier-Stokes (or Euler nevermind) flow around an object and _at the same >>>>>>>>>> time_ compute the heat equation inside the object. >>>>>>>>>> So basically, I am thinking a mesh of the fluid and a mesh of the >>>>>>>>>> object, both meshes being linked at the fluid - solid interface. >>>>>>>>>> >>>>>>>>> >>>>>>>>> First question: Are these meshes intended to match on the >>>>>>>>> interface? If not, this sounds like overset grids or immersed >>>>>>>>> boundary/interface methods. In this case, more than one mesh makes sense to >>>>>>>>> me. If they are intended to match, then I would advocate a single mesh with >>>>>>>>> multiple problems defined on it. I have experimented with this, for example >>>>>>>>> see SNES ex23 where I have a field in only part of the domain. I have a >>>>>>>>> large project to do exactly this in a rocket engine now. >>>>>>>>> >>>>>>>> >>>>>>>> Yes the way I see it is more of a single mesh with two distinct >>>>>>>> regions to distinguish between the fluid and the solid. I was talking about >>>>>>>> two meshes to try and explain my vision but it seems like it was unclear. >>>>>>>> Imagine if you wish a rectangular box with a sphere inclusion: the >>>>>>>> sphere would be tagged as a solid and the rest of the domain as fluid. >>>>>>>> Using Gmsh volumes for instance. >>>>>>>> Ill check out the SNES example ! Thanks ! >>>>>>>> >>>>>>>> >>>>>>>>> >>>>>>>>>> First (Matthew maybe ?) do you think it is something that could >>>>>>>>>> be done using two DMPlex's that would somehow be spawned from reading a >>>>>>>>>> Gmsh mesh with two volumes ? >>>>>>>>>> >>>>>>>>> >>>>>>>>> You can take a mesh and filter out part of it with DMPlexFilter(). >>>>>>>>> That is not used much so I may have to fix it to do what you want, but that >>>>>>>>> should be easy. >>>>>>>>> >>>>>>>>> >>>>>>>>>> And on one DMPlex we would have finite volume for the fluid, on >>>>>>>>>> the other finite elements for the heat eqn ? >>>>>>>>>> >>>>>>>>> >>>>>>>>> I have done this exact thing on a single mesh. It should be no >>>>>>>>> harder on two meshes if you go that route. >>>>>>>>> >>>>>>>>> >>>>>>>>>> Second, is it something that anyone in the community has ever >>>>>>>>>> imagined doing with PETSc DMPlex's ? >>>>>>>>>> >>>>>>>>> >>>>>>>>> Yes, I had a combined FV+FEM simulation of magma dynamics (I >>>>>>>>> should make it an example), and currently we are doing FVM+FEM for >>>>>>>>> simulation of a rocket engine. >>>>>>>>> >>>>>>>> >>>>>>>> Wow so it seems like it?s the exact same thing I would like to >>>>>>>> achieve as the rocket engine example. >>>>>>>> So you have a single mesh and two regions tagged differently, and >>>>>>>> you use the DmPlexFilter to solve FVM and FEM separately ? 
>>>>>>>> >>>>>>> >>>>>>> With a single mesh, you do not even need DMPlexFilter. You just use >>>>>>> the labels that Gmsh gives you. I think we should be able to get it going >>>>>>> in a straightforward way. >>>>>>> >>>>>> >>>>>> Ok then ! Thanks ! I?ll give it a shot and see what happens ! >>>>>> Setting up the FVM and FEM discretizations will pass by DMSetField >>>>>> right ? With a single mesh tagged with two different regions, it should >>>>>> show up as two fields, is that correct ? >>>>>> >>>>> >>>>> Yes, the idea is as follows. Each field also has a label argument that >>>>> is the support of the field in the domain. Then we create PetscDS objects >>>>> for each >>>>> separate set of overlapping fields. The current algorithm is not >>>>> complete I think, so let me know if this step fails. >>>>> >>>> >>>> Ok, thanks. >>>> I?ll let you know and share snippets when I have something started ! >>>> >>>> Talk soon ! Thanks ! >>>> >>> >>> Hi Matthew, >>> >>> I thought about a little something else : what about setting two >>> different TS, one for each field of the DM ? Most probably the fluid part >>> would be solved with an explicit time stepping whereas the solid part with >>> the heat equation would benefit from implicit time stepping. TSSetDM does >>> not allow a field specification, is there a way to hack that so that each >>> field has its own TS ? >>> >> >> I see at least two options here: >> >> 1. Split the problems: >> >> You can use DMCreateSubDM() to split off part of a problem and use a >> solver on that. I have done this for problems with weak coupling. >> >> 2. Use IMEX >> >> For strong coupling, I have used the IMEX TSes in PETSc. You put the >> explicit terms in the RHS, and the implicit in the IFunction. >> >> Thanks, >> >> Matt >> >> >>> Thanks >>> >>> Thibault >>> >>> >>>> Thibault >>>> >>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> >>>>>> Thanks, >>>>>> >>>>>> Thibault >>>>>> >>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Matt >>>>>>> >>>>>>> >>>>>>>> Thanks ! >>>>>>>> >>>>>>>> Thibault >>>>>>>> >>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> Matt >>>>>>>>> >>>>>>>>> >>>>>>>>>> As I said it is very prospective, I just wanted to have your >>>>>>>>>> opinion !! >>>>>>>>>> >>>>>>>>>> Thanks very much in advance everyone !! >>>>>>>>>> >>>>>>>>>> Cheers, >>>>>>>>>> Thibault >>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> What most experimenters take for granted before they begin their >>>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>>> experiments lead. >>>>>>>>> -- Norbert Wiener >>>>>>>>> >>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>> >>>>>>>>> >>>>>>>> -- >>>>>>>> Thibault Bridel-Bertomeu >>>>>>>> ? >>>>>>>> Eng, MSc, PhD >>>>>>>> Research Engineer >>>>>>>> CEA/CESTA >>>>>>>> 33114 LE BARP >>>>>>>> Tel.: (+33)557046924 >>>>>>>> Mob.: (+33)611025322 >>>>>>>> Mail: thibault.bridelbertomeu at gmail.com >>>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted before they begin their >>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>> experiments lead. >>>>>>> -- Norbert Wiener >>>>>>> >>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>> >>>>>>> >>>>>> -- >>>>>> Thibault Bridel-Bertomeu >>>>>> ? 
>>>>>> Eng, MSc, PhD >>>>>> Research Engineer >>>>>> CEA/CESTA >>>>>> 33114 LE BARP >>>>>> Tel.: (+33)557046924 >>>>>> Mob.: (+33)611025322 >>>>>> Mail: thibault.bridelbertomeu at gmail.com >>>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ >>>>> >>>>> >>>> -- >>>> Thibault Bridel-Bertomeu >>>> ? >>>> Eng, MSc, PhD >>>> Research Engineer >>>> CEA/CESTA >>>> 33114 LE BARP >>>> Tel.: (+33)557046924 >>>> Mob.: (+33)611025322 >>>> Mail: thibault.bridelbertomeu at gmail.com >>>> >>> -- >>> Thibault Bridel-Bertomeu >>> ? >>> Eng, MSc, PhD >>> Research Engineer >>> CEA/CESTA >>> 33114 LE BARP >>> Tel.: (+33)557046924 >>> Mob.: (+33)611025322 >>> Mail: thibault.bridelbertomeu at gmail.com >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From thibault.bridelbertomeu at gmail.com Sat Jan 8 12:30:28 2022 From: thibault.bridelbertomeu at gmail.com (Thibault Bridel-Bertomeu) Date: Sat, 8 Jan 2022 19:30:28 +0100 Subject: [petsc-users] Fluid-Structure interaction with multiple DMPlex In-Reply-To: References: Message-ID: Yes I was wondering about different time steps as well because usually implicit integration moves much faster. But if it not implemented, then maybe going the ? weak coupling ? road with a sub-DM is the way. Can I ask how you proceed in the rocket engine code you are writing ? IMEX ? Thibault Le sam. 8 janv. 2022 ? 19:22, Matthew Knepley a ?crit : > I do not know how. Right now, composable TS does not work all the way. > > Matt > > On Sat, Jan 8, 2022 at 1:03 PM Mark Adams wrote: > >> Can you subcycle with IMEX? >> >> On Sat, Jan 8, 2022 at 10:58 AM Matthew Knepley >> wrote: >> >>> On Sat, Jan 8, 2022 at 3:05 AM Thibault Bridel-Bertomeu < >>> thibault.bridelbertomeu at gmail.com> wrote: >>> >>>> Le ven. 7 janv. 2022 ? 19:45, Thibault Bridel-Bertomeu < >>>> thibault.bridelbertomeu at gmail.com> a ?crit : >>>> >>>>> Le ven. 7 janv. 2022 ? 19:23, Matthew Knepley a >>>>> ?crit : >>>>> >>>>>> On Fri, Jan 7, 2022 at 12:58 PM Thibault Bridel-Bertomeu < >>>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>>> >>>>>>> >>>>>>> Le ven. 7 janv. 2022 ? 14:54, Matthew Knepley a >>>>>>> ?crit : >>>>>>> >>>>>>>> On Fri, Jan 7, 2022 at 8:52 AM Thibault Bridel-Bertomeu < >>>>>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>>>>> >>>>>>>>> Hi Matthew, >>>>>>>>> >>>>>>>>> Le ven. 7 janv. 2022 ? 14:44, Matthew Knepley >>>>>>>>> a ?crit : >>>>>>>>> >>>>>>>>>> On Fri, Jan 7, 2022 at 5:46 AM Thibault Bridel-Bertomeu < >>>>>>>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>>>>>>> >>>>>>>>>>> Dear all, >>>>>>>>>>> >>>>>>>>>>> First of, happy new year everyone !! All the best ! >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Happy New Year! 
>>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> I am starting to draft a new project that will be about >>>>>>>>>>> fluid-structure interaction: in particular, the idea is to compute the >>>>>>>>>>> Navier-Stokes (or Euler nevermind) flow around an object and _at the same >>>>>>>>>>> time_ compute the heat equation inside the object. >>>>>>>>>>> So basically, I am thinking a mesh of the fluid and a mesh of >>>>>>>>>>> the object, both meshes being linked at the fluid - solid interface. >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> First question: Are these meshes intended to match on the >>>>>>>>>> interface? If not, this sounds like overset grids or immersed >>>>>>>>>> boundary/interface methods. In this case, more than one mesh makes sense to >>>>>>>>>> me. If they are intended to match, then I would advocate a single mesh with >>>>>>>>>> multiple problems defined on it. I have experimented with this, for example >>>>>>>>>> see SNES ex23 where I have a field in only part of the domain. I have a >>>>>>>>>> large project to do exactly this in a rocket engine now. >>>>>>>>>> >>>>>>>>> >>>>>>>>> Yes the way I see it is more of a single mesh with two distinct >>>>>>>>> regions to distinguish between the fluid and the solid. I was talking about >>>>>>>>> two meshes to try and explain my vision but it seems like it was unclear. >>>>>>>>> Imagine if you wish a rectangular box with a sphere inclusion: the >>>>>>>>> sphere would be tagged as a solid and the rest of the domain as fluid. >>>>>>>>> Using Gmsh volumes for instance. >>>>>>>>> Ill check out the SNES example ! Thanks ! >>>>>>>>> >>>>>>>>> >>>>>>>>>> >>>>>>>>>>> First (Matthew maybe ?) do you think it is something that could >>>>>>>>>>> be done using two DMPlex's that would somehow be spawned from reading a >>>>>>>>>>> Gmsh mesh with two volumes ? >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> You can take a mesh and filter out part of it with >>>>>>>>>> DMPlexFilter(). That is not used much so I may have to fix it to do what >>>>>>>>>> you want, but that should be easy. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> And on one DMPlex we would have finite volume for the fluid, on >>>>>>>>>>> the other finite elements for the heat eqn ? >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> I have done this exact thing on a single mesh. It should be no >>>>>>>>>> harder on two meshes if you go that route. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> Second, is it something that anyone in the community has ever >>>>>>>>>>> imagined doing with PETSc DMPlex's ? >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Yes, I had a combined FV+FEM simulation of magma dynamics (I >>>>>>>>>> should make it an example), and currently we are doing FVM+FEM for >>>>>>>>>> simulation of a rocket engine. >>>>>>>>>> >>>>>>>>> >>>>>>>>> Wow so it seems like it?s the exact same thing I would like to >>>>>>>>> achieve as the rocket engine example. >>>>>>>>> So you have a single mesh and two regions tagged differently, and >>>>>>>>> you use the DmPlexFilter to solve FVM and FEM separately ? >>>>>>>>> >>>>>>>> >>>>>>>> With a single mesh, you do not even need DMPlexFilter. You just use >>>>>>>> the labels that Gmsh gives you. I think we should be able to get it going >>>>>>>> in a straightforward way. >>>>>>>> >>>>>>> >>>>>>> Ok then ! Thanks ! I?ll give it a shot and see what happens ! >>>>>>> Setting up the FVM and FEM discretizations will pass by DMSetField >>>>>>> right ? With a single mesh tagged with two different regions, it should >>>>>>> show up as two fields, is that correct ? >>>>>>> >>>>>> >>>>>> Yes, the idea is as follows. 
Each field also has a label argument >>>>>> that is the support of the field in the domain. Then we create PetscDS >>>>>> objects for each >>>>>> separate set of overlapping fields. The current algorithm is not >>>>>> complete I think, so let me know if this step fails. >>>>>> >>>>> >>>>> Ok, thanks. >>>>> I?ll let you know and share snippets when I have something started ! >>>>> >>>>> Talk soon ! Thanks ! >>>>> >>>> >>>> Hi Matthew, >>>> >>>> I thought about a little something else : what about setting two >>>> different TS, one for each field of the DM ? Most probably the fluid part >>>> would be solved with an explicit time stepping whereas the solid part with >>>> the heat equation would benefit from implicit time stepping. TSSetDM does >>>> not allow a field specification, is there a way to hack that so that each >>>> field has its own TS ? >>>> >>> >>> I see at least two options here: >>> >>> 1. Split the problems: >>> >>> You can use DMCreateSubDM() to split off part of a problem and use a >>> solver on that. I have done this for problems with weak coupling. >>> >>> 2. Use IMEX >>> >>> For strong coupling, I have used the IMEX TSes in PETSc. You put the >>> explicit terms in the RHS, and the implicit in the IFunction. >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> Thanks >>>> >>>> Thibault >>>> >>>> >>>>> Thibault >>>>> >>>>> >>>>>> Thanks, >>>>>> >>>>>> Matt >>>>>> >>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Thibault >>>>>>> >>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Matt >>>>>>>> >>>>>>>> >>>>>>>>> Thanks ! >>>>>>>>> >>>>>>>>> Thibault >>>>>>>>> >>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> >>>>>>>>>> Matt >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> As I said it is very prospective, I just wanted to have your >>>>>>>>>>> opinion !! >>>>>>>>>>> >>>>>>>>>>> Thanks very much in advance everyone !! >>>>>>>>>>> >>>>>>>>>>> Cheers, >>>>>>>>>>> Thibault >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> What most experimenters take for granted before they begin their >>>>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>>>> experiments lead. >>>>>>>>>> -- Norbert Wiener >>>>>>>>>> >>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>>> >>>>>>>>>> >>>>>>>>> -- >>>>>>>>> Thibault Bridel-Bertomeu >>>>>>>>> ? >>>>>>>>> Eng, MSc, PhD >>>>>>>>> Research Engineer >>>>>>>>> CEA/CESTA >>>>>>>>> 33114 LE BARP >>>>>>>>> Tel.: (+33)557046924 >>>>>>>>> Mob.: (+33)611025322 >>>>>>>>> Mail: thibault.bridelbertomeu at gmail.com >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> What most experimenters take for granted before they begin their >>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>> experiments lead. >>>>>>>> -- Norbert Wiener >>>>>>>> >>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>> >>>>>>>> >>>>>>> -- >>>>>>> Thibault Bridel-Bertomeu >>>>>>> ? >>>>>>> Eng, MSc, PhD >>>>>>> Research Engineer >>>>>>> CEA/CESTA >>>>>>> 33114 LE BARP >>>>>>> Tel.: (+33)557046924 >>>>>>> Mob.: (+33)611025322 >>>>>>> Mail: thibault.bridelbertomeu at gmail.com >>>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their >>>>>> experiments is infinitely more interesting than any results to which their >>>>>> experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>> >>>>>> >>>>> -- >>>>> Thibault Bridel-Bertomeu >>>>> ? 
>>>>> Eng, MSc, PhD >>>>> Research Engineer >>>>> CEA/CESTA >>>>> 33114 LE BARP >>>>> Tel.: (+33)557046924 >>>>> Mob.: (+33)611025322 >>>>> Mail: thibault.bridelbertomeu at gmail.com >>>>> >>>> -- >>>> Thibault Bridel-Bertomeu >>>> ? >>>> Eng, MSc, PhD >>>> Research Engineer >>>> CEA/CESTA >>>> 33114 LE BARP >>>> Tel.: (+33)557046924 >>>> Mob.: (+33)611025322 >>>> Mail: thibault.bridelbertomeu at gmail.com >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -- Thibault Bridel-Bertomeu ? Eng, MSc, PhD Research Engineer CEA/CESTA 33114 LE BARP Tel.: (+33)557046924 Mob.: (+33)611025322 Mail: thibault.bridelbertomeu at gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sat Jan 8 13:00:43 2022 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 8 Jan 2022 14:00:43 -0500 Subject: [petsc-users] Fluid-Structure interaction with multiple DMPlex In-Reply-To: References: Message-ID: On Sat, Jan 8, 2022 at 1:30 PM Thibault Bridel-Bertomeu < thibault.bridelbertomeu at gmail.com> wrote: > Yes I was wondering about different time steps as well because usually > implicit integration moves much faster. > But if it not implemented, then maybe going the ? weak coupling ? road > with a sub-DM is the way. > Can I ask how you proceed in the rocket engine code you are writing ? IMEX > ? > Right now it is IMEX, but we are explicitly substepping particles. Not sure what the final thing will be. Thanks, Matt > Thibault > > Le sam. 8 janv. 2022 ? 19:22, Matthew Knepley a > ?crit : > >> I do not know how. Right now, composable TS does not work all the way. >> >> Matt >> >> On Sat, Jan 8, 2022 at 1:03 PM Mark Adams wrote: >> >>> Can you subcycle with IMEX? >>> >>> On Sat, Jan 8, 2022 at 10:58 AM Matthew Knepley >>> wrote: >>> >>>> On Sat, Jan 8, 2022 at 3:05 AM Thibault Bridel-Bertomeu < >>>> thibault.bridelbertomeu at gmail.com> wrote: >>>> >>>>> Le ven. 7 janv. 2022 ? 19:45, Thibault Bridel-Bertomeu < >>>>> thibault.bridelbertomeu at gmail.com> a ?crit : >>>>> >>>>>> Le ven. 7 janv. 2022 ? 19:23, Matthew Knepley a >>>>>> ?crit : >>>>>> >>>>>>> On Fri, Jan 7, 2022 at 12:58 PM Thibault Bridel-Bertomeu < >>>>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>>>> >>>>>>>> >>>>>>>> Le ven. 7 janv. 2022 ? 14:54, Matthew Knepley >>>>>>>> a ?crit : >>>>>>>> >>>>>>>>> On Fri, Jan 7, 2022 at 8:52 AM Thibault Bridel-Bertomeu < >>>>>>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>>>>>> >>>>>>>>>> Hi Matthew, >>>>>>>>>> >>>>>>>>>> Le ven. 7 janv. 2022 ? 14:44, Matthew Knepley >>>>>>>>>> a ?crit : >>>>>>>>>> >>>>>>>>>>> On Fri, Jan 7, 2022 at 5:46 AM Thibault Bridel-Bertomeu < >>>>>>>>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>>>>>>>> >>>>>>>>>>>> Dear all, >>>>>>>>>>>> >>>>>>>>>>>> First of, happy new year everyone !! All the best ! >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Happy New Year! 
>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> I am starting to draft a new project that will be about >>>>>>>>>>>> fluid-structure interaction: in particular, the idea is to compute the >>>>>>>>>>>> Navier-Stokes (or Euler nevermind) flow around an object and _at the same >>>>>>>>>>>> time_ compute the heat equation inside the object. >>>>>>>>>>>> So basically, I am thinking a mesh of the fluid and a mesh of >>>>>>>>>>>> the object, both meshes being linked at the fluid - solid interface. >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> First question: Are these meshes intended to match on the >>>>>>>>>>> interface? If not, this sounds like overset grids or immersed >>>>>>>>>>> boundary/interface methods. In this case, more than one mesh makes sense to >>>>>>>>>>> me. If they are intended to match, then I would advocate a single mesh with >>>>>>>>>>> multiple problems defined on it. I have experimented with this, for example >>>>>>>>>>> see SNES ex23 where I have a field in only part of the domain. I have a >>>>>>>>>>> large project to do exactly this in a rocket engine now. >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Yes the way I see it is more of a single mesh with two distinct >>>>>>>>>> regions to distinguish between the fluid and the solid. I was talking about >>>>>>>>>> two meshes to try and explain my vision but it seems like it was unclear. >>>>>>>>>> Imagine if you wish a rectangular box with a sphere inclusion: >>>>>>>>>> the sphere would be tagged as a solid and the rest of the domain as fluid. >>>>>>>>>> Using Gmsh volumes for instance. >>>>>>>>>> Ill check out the SNES example ! Thanks ! >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> First (Matthew maybe ?) do you think it is something that could >>>>>>>>>>>> be done using two DMPlex's that would somehow be spawned from reading a >>>>>>>>>>>> Gmsh mesh with two volumes ? >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> You can take a mesh and filter out part of it with >>>>>>>>>>> DMPlexFilter(). That is not used much so I may have to fix it to do what >>>>>>>>>>> you want, but that should be easy. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> And on one DMPlex we would have finite volume for the fluid, on >>>>>>>>>>>> the other finite elements for the heat eqn ? >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> I have done this exact thing on a single mesh. It should be no >>>>>>>>>>> harder on two meshes if you go that route. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> Second, is it something that anyone in the community has ever >>>>>>>>>>>> imagined doing with PETSc DMPlex's ? >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Yes, I had a combined FV+FEM simulation of magma dynamics (I >>>>>>>>>>> should make it an example), and currently we are doing FVM+FEM for >>>>>>>>>>> simulation of a rocket engine. >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Wow so it seems like it?s the exact same thing I would like to >>>>>>>>>> achieve as the rocket engine example. >>>>>>>>>> So you have a single mesh and two regions tagged differently, and >>>>>>>>>> you use the DmPlexFilter to solve FVM and FEM separately ? >>>>>>>>>> >>>>>>>>> >>>>>>>>> With a single mesh, you do not even need DMPlexFilter. You just >>>>>>>>> use the labels that Gmsh gives you. I think we should be able to get it >>>>>>>>> going in a straightforward way. >>>>>>>>> >>>>>>>> >>>>>>>> Ok then ! Thanks ! I?ll give it a shot and see what happens ! >>>>>>>> Setting up the FVM and FEM discretizations will pass by DMSetField >>>>>>>> right ? With a single mesh tagged with two different regions, it should >>>>>>>> show up as two fields, is that correct ? 
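As an illustration of the exchange quoted here, a rough sketch of the "two fields, each supported on its own label" setup. It assumes the two Gmsh cell sets have already been exposed as DMLabels named "fluid" and "solid"; those names, the field numbering, the simplex flag and the discretizations are only illustrative, not taken from the thread:

  #include <petscdmplex.h>
  #include <petscfe.h>
  #include <petscfv.h>

  static PetscErrorCode SetupDiscretizations(DM dm)
  {
    DMLabel        fluidLabel, solidLabel;
    PetscFV        fv;
    PetscFE        fe;
    PetscInt       dim;
    PetscErrorCode ierr;

    PetscFunctionBeginUser;
    ierr = DMGetDimension(dm, &dim);CHKERRQ(ierr);
    ierr = DMGetLabel(dm, "fluid", &fluidLabel);CHKERRQ(ierr);
    ierr = DMGetLabel(dm, "solid", &solidLabel);CHKERRQ(ierr);
    /* Field 0: finite-volume Euler state, supported only on the fluid cells */
    ierr = PetscFVCreate(PetscObjectComm((PetscObject)dm), &fv);CHKERRQ(ierr);
    ierr = PetscFVSetNumComponents(fv, dim + 2);CHKERRQ(ierr);  /* rho, rho*u, rho*E */
    ierr = PetscFVSetSpatialDimension(fv, dim);CHKERRQ(ierr);
    ierr = DMSetField(dm, 0, fluidLabel, (PetscObject)fv);CHKERRQ(ierr);
    /* Field 1: finite-element temperature (degree set via -temp_petscspace_degree), on the solid cells */
    ierr = PetscFECreateDefault(PetscObjectComm((PetscObject)dm), dim, 1, PETSC_TRUE, "temp_", -1, &fe);CHKERRQ(ierr);
    ierr = DMSetField(dm, 1, solidLabel, (PetscObject)fe);CHKERRQ(ierr);
    /* One PetscDS gets created per set of fields with overlapping support */
    ierr = DMCreateDS(dm);CHKERRQ(ierr);
    ierr = PetscFVDestroy(&fv);CHKERRQ(ierr);
    ierr = PetscFEDestroy(&fe);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }

Whether the DS splitting covers this mixed FV+FE case out of the box is exactly the open point flagged in this thread ("the current algorithm is not complete"), so checking the result with -dm_view and DMGetNumDS() after DMCreateDS() seems worthwhile.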
>>>>>>>> >>>>>>> >>>>>>> Yes, the idea is as follows. Each field also has a label argument >>>>>>> that is the support of the field in the domain. Then we create PetscDS >>>>>>> objects for each >>>>>>> separate set of overlapping fields. The current algorithm is not >>>>>>> complete I think, so let me know if this step fails. >>>>>>> >>>>>> >>>>>> Ok, thanks. >>>>>> I?ll let you know and share snippets when I have something started ! >>>>>> >>>>>> Talk soon ! Thanks ! >>>>>> >>>>> >>>>> Hi Matthew, >>>>> >>>>> I thought about a little something else : what about setting two >>>>> different TS, one for each field of the DM ? Most probably the fluid part >>>>> would be solved with an explicit time stepping whereas the solid part with >>>>> the heat equation would benefit from implicit time stepping. TSSetDM does >>>>> not allow a field specification, is there a way to hack that so that each >>>>> field has its own TS ? >>>>> >>>> >>>> I see at least two options here: >>>> >>>> 1. Split the problems: >>>> >>>> You can use DMCreateSubDM() to split off part of a problem and use >>>> a solver on that. I have done this for problems with weak coupling. >>>> >>>> 2. Use IMEX >>>> >>>> For strong coupling, I have used the IMEX TSes in PETSc. You put >>>> the explicit terms in the RHS, and the implicit in the IFunction. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> Thanks >>>>> >>>>> Thibault >>>>> >>>>> >>>>>> Thibault >>>>>> >>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Matt >>>>>>> >>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Thibault >>>>>>>> >>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> Matt >>>>>>>>> >>>>>>>>> >>>>>>>>>> Thanks ! >>>>>>>>>> >>>>>>>>>> Thibault >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> >>>>>>>>>>> Matt >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> As I said it is very prospective, I just wanted to have your >>>>>>>>>>>> opinion !! >>>>>>>>>>>> >>>>>>>>>>>> Thanks very much in advance everyone !! >>>>>>>>>>>> >>>>>>>>>>>> Cheers, >>>>>>>>>>>> Thibault >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> What most experimenters take for granted before they begin their >>>>>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>>>>> experiments lead. >>>>>>>>>>> -- Norbert Wiener >>>>>>>>>>> >>>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> Thibault Bridel-Bertomeu >>>>>>>>>> ? >>>>>>>>>> Eng, MSc, PhD >>>>>>>>>> Research Engineer >>>>>>>>>> CEA/CESTA >>>>>>>>>> 33114 LE BARP >>>>>>>>>> Tel.: (+33)557046924 >>>>>>>>>> Mob.: (+33)611025322 >>>>>>>>>> Mail: thibault.bridelbertomeu at gmail.com >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> What most experimenters take for granted before they begin their >>>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>>> experiments lead. >>>>>>>>> -- Norbert Wiener >>>>>>>>> >>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>> >>>>>>>>> >>>>>>>> -- >>>>>>>> Thibault Bridel-Bertomeu >>>>>>>> ? >>>>>>>> Eng, MSc, PhD >>>>>>>> Research Engineer >>>>>>>> CEA/CESTA >>>>>>>> 33114 LE BARP >>>>>>>> Tel.: (+33)557046924 >>>>>>>> Mob.: (+33)611025322 >>>>>>>> Mail: thibault.bridelbertomeu at gmail.com >>>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted before they begin their >>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>> experiments lead. 
>>>>>>> -- Norbert Wiener >>>>>>> >>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>> >>>>>>> >>>>>> -- >>>>>> Thibault Bridel-Bertomeu >>>>>> ? >>>>>> Eng, MSc, PhD >>>>>> Research Engineer >>>>>> CEA/CESTA >>>>>> 33114 LE BARP >>>>>> Tel.: (+33)557046924 >>>>>> Mob.: (+33)611025322 >>>>>> Mail: thibault.bridelbertomeu at gmail.com >>>>>> >>>>> -- >>>>> Thibault Bridel-Bertomeu >>>>> ? >>>>> Eng, MSc, PhD >>>>> Research Engineer >>>>> CEA/CESTA >>>>> 33114 LE BARP >>>>> Tel.: (+33)557046924 >>>>> Mob.: (+33)611025322 >>>>> Mail: thibault.bridelbertomeu at gmail.com >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- > Thibault Bridel-Bertomeu > ? > Eng, MSc, PhD > Research Engineer > CEA/CESTA > 33114 LE BARP > Tel.: (+33)557046924 > Mob.: (+33)611025322 > Mail: thibault.bridelbertomeu at gmail.com > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From thibault.bridelbertomeu at gmail.com Sat Jan 8 13:13:00 2022 From: thibault.bridelbertomeu at gmail.com (Thibault Bridel-Bertomeu) Date: Sat, 8 Jan 2022 20:13:00 +0100 Subject: [petsc-users] Fluid-Structure interaction with multiple DMPlex In-Reply-To: References: Message-ID: However if you use IMEX for strong coupling of the two physics solved in each field, then it means you need to write a single set of PDEs that covers everything, don?t you ? If I want to solve Euler equations in one PetscDS and heat equation in the other one, then I need to write a global set of equations to use the IMEX TS , right ? Thanks, Thibault Le sam. 8 janv. 2022 ? 20:00, Matthew Knepley a ?crit : > On Sat, Jan 8, 2022 at 1:30 PM Thibault Bridel-Bertomeu < > thibault.bridelbertomeu at gmail.com> wrote: > >> Yes I was wondering about different time steps as well because usually >> implicit integration moves much faster. >> But if it not implemented, then maybe going the ? weak coupling ? road >> with a sub-DM is the way. >> Can I ask how you proceed in the rocket engine code you are writing ? >> IMEX ? >> > > Right now it is IMEX, but we are explicitly substepping particles. Not > sure what the final thing will be. > > Thanks, > > Matt > > >> Thibault >> >> Le sam. 8 janv. 2022 ? 19:22, Matthew Knepley a >> ?crit : >> >>> I do not know how. Right now, composable TS does not work all the way. >>> >>> Matt >>> >>> On Sat, Jan 8, 2022 at 1:03 PM Mark Adams wrote: >>> >>>> Can you subcycle with IMEX? >>>> >>>> On Sat, Jan 8, 2022 at 10:58 AM Matthew Knepley >>>> wrote: >>>> >>>>> On Sat, Jan 8, 2022 at 3:05 AM Thibault Bridel-Bertomeu < >>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>> >>>>>> Le ven. 7 janv. 2022 ? 19:45, Thibault Bridel-Bertomeu < >>>>>> thibault.bridelbertomeu at gmail.com> a ?crit : >>>>>> >>>>>>> Le ven. 7 janv. 2022 ? 
19:23, Matthew Knepley a >>>>>>> ?crit : >>>>>>> >>>>>>>> On Fri, Jan 7, 2022 at 12:58 PM Thibault Bridel-Bertomeu < >>>>>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>>>>> >>>>>>>>> >>>>>>>>> Le ven. 7 janv. 2022 ? 14:54, Matthew Knepley >>>>>>>>> a ?crit : >>>>>>>>> >>>>>>>>>> On Fri, Jan 7, 2022 at 8:52 AM Thibault Bridel-Bertomeu < >>>>>>>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>>>>>>> >>>>>>>>>>> Hi Matthew, >>>>>>>>>>> >>>>>>>>>>> Le ven. 7 janv. 2022 ? 14:44, Matthew Knepley >>>>>>>>>>> a ?crit : >>>>>>>>>>> >>>>>>>>>>>> On Fri, Jan 7, 2022 at 5:46 AM Thibault Bridel-Bertomeu < >>>>>>>>>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Dear all, >>>>>>>>>>>>> >>>>>>>>>>>>> First of, happy new year everyone !! All the best ! >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Happy New Year! >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> I am starting to draft a new project that will be about >>>>>>>>>>>>> fluid-structure interaction: in particular, the idea is to compute the >>>>>>>>>>>>> Navier-Stokes (or Euler nevermind) flow around an object and _at the same >>>>>>>>>>>>> time_ compute the heat equation inside the object. >>>>>>>>>>>>> So basically, I am thinking a mesh of the fluid and a mesh of >>>>>>>>>>>>> the object, both meshes being linked at the fluid - solid interface. >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> First question: Are these meshes intended to match on the >>>>>>>>>>>> interface? If not, this sounds like overset grids or immersed >>>>>>>>>>>> boundary/interface methods. In this case, more than one mesh makes sense to >>>>>>>>>>>> me. If they are intended to match, then I would advocate a single mesh with >>>>>>>>>>>> multiple problems defined on it. I have experimented with this, for example >>>>>>>>>>>> see SNES ex23 where I have a field in only part of the domain. I have a >>>>>>>>>>>> large project to do exactly this in a rocket engine now. >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Yes the way I see it is more of a single mesh with two distinct >>>>>>>>>>> regions to distinguish between the fluid and the solid. I was talking about >>>>>>>>>>> two meshes to try and explain my vision but it seems like it was unclear. >>>>>>>>>>> Imagine if you wish a rectangular box with a sphere inclusion: >>>>>>>>>>> the sphere would be tagged as a solid and the rest of the domain as fluid. >>>>>>>>>>> Using Gmsh volumes for instance. >>>>>>>>>>> Ill check out the SNES example ! Thanks ! >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> First (Matthew maybe ?) do you think it is something that >>>>>>>>>>>>> could be done using two DMPlex's that would somehow be spawned from reading >>>>>>>>>>>>> a Gmsh mesh with two volumes ? >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> You can take a mesh and filter out part of it with >>>>>>>>>>>> DMPlexFilter(). That is not used much so I may have to fix it to do what >>>>>>>>>>>> you want, but that should be easy. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> And on one DMPlex we would have finite volume for the fluid, >>>>>>>>>>>>> on the other finite elements for the heat eqn ? >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> I have done this exact thing on a single mesh. It should be no >>>>>>>>>>>> harder on two meshes if you go that route. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> Second, is it something that anyone in the community has ever >>>>>>>>>>>>> imagined doing with PETSc DMPlex's ? 
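As a side note to the exchange quoted here, the first of the two options mentioned earlier (splitting the problems) could look roughly like the sketch below: peel the solid/heat field off the composite DM with DMCreateSubDM() and give it its own implicit TS, while the fluid field keeps an explicit one. The field number 1 follows the illustrative DMSetField() ordering sketched above and is an assumption, as is the choice of TSBEULER:

  #include <petscdmplex.h>
  #include <petscts.h>

  static PetscErrorCode CreateSolidTS(DM dm, TS *tsHeat, IS *isHeat)
  {
    DM             subdm;
    const PetscInt fHeat = 1;  /* assumed field number of the FEM temperature field */
    PetscErrorCode ierr;

    PetscFunctionBeginUser;
    ierr = DMCreateSubDM(dm, 1, &fHeat, isHeat, &subdm);CHKERRQ(ierr);
    ierr = TSCreate(PetscObjectComm((PetscObject)dm), tsHeat);CHKERRQ(ierr);
    ierr = TSSetDM(*tsHeat, subdm);CHKERRQ(ierr);
    ierr = TSSetType(*tsHeat, TSBEULER);CHKERRQ(ierr);  /* implicit stepping for the heat part */
    ierr = TSSetFromOptions(*tsHeat);CHKERRQ(ierr);
    ierr = DMDestroy(&subdm);CHKERRQ(ierr);             /* the TS keeps its own reference */
    PetscFunctionReturn(0);
  }

The returned IS gives the positions of the solid dofs inside the global vector, which is what a weak-coupling loop would use to exchange interface data between the two sub-solves; that loop (and any subcycling) is left out of this sketch.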
>>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Yes, I had a combined FV+FEM simulation of magma dynamics (I >>>>>>>>>>>> should make it an example), and currently we are doing FVM+FEM for >>>>>>>>>>>> simulation of a rocket engine. >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Wow so it seems like it?s the exact same thing I would like to >>>>>>>>>>> achieve as the rocket engine example. >>>>>>>>>>> So you have a single mesh and two regions tagged differently, >>>>>>>>>>> and you use the DmPlexFilter to solve FVM and FEM separately ? >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> With a single mesh, you do not even need DMPlexFilter. You just >>>>>>>>>> use the labels that Gmsh gives you. I think we should be able to get it >>>>>>>>>> going in a straightforward way. >>>>>>>>>> >>>>>>>>> >>>>>>>>> Ok then ! Thanks ! I?ll give it a shot and see what happens ! >>>>>>>>> Setting up the FVM and FEM discretizations will pass by DMSetField >>>>>>>>> right ? With a single mesh tagged with two different regions, it should >>>>>>>>> show up as two fields, is that correct ? >>>>>>>>> >>>>>>>> >>>>>>>> Yes, the idea is as follows. Each field also has a label argument >>>>>>>> that is the support of the field in the domain. Then we create PetscDS >>>>>>>> objects for each >>>>>>>> separate set of overlapping fields. The current algorithm is not >>>>>>>> complete I think, so let me know if this step fails. >>>>>>>> >>>>>>> >>>>>>> Ok, thanks. >>>>>>> I?ll let you know and share snippets when I have something started ! >>>>>>> >>>>>>> Talk soon ! Thanks ! >>>>>>> >>>>>> >>>>>> Hi Matthew, >>>>>> >>>>>> I thought about a little something else : what about setting two >>>>>> different TS, one for each field of the DM ? Most probably the fluid part >>>>>> would be solved with an explicit time stepping whereas the solid part with >>>>>> the heat equation would benefit from implicit time stepping. TSSetDM does >>>>>> not allow a field specification, is there a way to hack that so that each >>>>>> field has its own TS ? >>>>>> >>>>> >>>>> I see at least two options here: >>>>> >>>>> 1. Split the problems: >>>>> >>>>> You can use DMCreateSubDM() to split off part of a problem and use >>>>> a solver on that. I have done this for problems with weak coupling. >>>>> >>>>> 2. Use IMEX >>>>> >>>>> For strong coupling, I have used the IMEX TSes in PETSc. You put >>>>> the explicit terms in the RHS, and the implicit in the IFunction. >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> >>>>>> Thanks >>>>>> >>>>>> Thibault >>>>>> >>>>>> >>>>>>> Thibault >>>>>>> >>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Matt >>>>>>>> >>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> Thibault >>>>>>>>> >>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> >>>>>>>>>> Matt >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> Thanks ! >>>>>>>>>>> >>>>>>>>>>> Thibault >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> >>>>>>>>>>>> Matt >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> As I said it is very prospective, I just wanted to have your >>>>>>>>>>>>> opinion !! >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks very much in advance everyone !! >>>>>>>>>>>>> >>>>>>>>>>>>> Cheers, >>>>>>>>>>>>> Thibault >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> -- >>>>>>>>>>>> What most experimenters take for granted before they begin >>>>>>>>>>>> their experiments is infinitely more interesting than any results to which >>>>>>>>>>>> their experiments lead. 
>>>>>>>>>>>> -- Norbert Wiener >>>>>>>>>>>> >>>>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> Thibault Bridel-Bertomeu >>>>>>>>>>> ? >>>>>>>>>>> Eng, MSc, PhD >>>>>>>>>>> Research Engineer >>>>>>>>>>> CEA/CESTA >>>>>>>>>>> 33114 LE BARP >>>>>>>>>>> Tel.: (+33)557046924 >>>>>>>>>>> Mob.: (+33)611025322 >>>>>>>>>>> Mail: thibault.bridelbertomeu at gmail.com >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> What most experimenters take for granted before they begin their >>>>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>>>> experiments lead. >>>>>>>>>> -- Norbert Wiener >>>>>>>>>> >>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>>> >>>>>>>>>> >>>>>>>>> -- >>>>>>>>> Thibault Bridel-Bertomeu >>>>>>>>> ? >>>>>>>>> Eng, MSc, PhD >>>>>>>>> Research Engineer >>>>>>>>> CEA/CESTA >>>>>>>>> 33114 LE BARP >>>>>>>>> Tel.: (+33)557046924 >>>>>>>>> Mob.: (+33)611025322 >>>>>>>>> Mail: thibault.bridelbertomeu at gmail.com >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> What most experimenters take for granted before they begin their >>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>> experiments lead. >>>>>>>> -- Norbert Wiener >>>>>>>> >>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>> >>>>>>>> >>>>>>> -- >>>>>>> Thibault Bridel-Bertomeu >>>>>>> ? >>>>>>> Eng, MSc, PhD >>>>>>> Research Engineer >>>>>>> CEA/CESTA >>>>>>> 33114 LE BARP >>>>>>> Tel.: (+33)557046924 >>>>>>> Mob.: (+33)611025322 >>>>>>> Mail: thibault.bridelbertomeu at gmail.com >>>>>>> >>>>>> -- >>>>>> Thibault Bridel-Bertomeu >>>>>> ? >>>>>> Eng, MSc, PhD >>>>>> Research Engineer >>>>>> CEA/CESTA >>>>>> 33114 LE BARP >>>>>> Tel.: (+33)557046924 >>>>>> Mob.: (+33)611025322 >>>>>> Mail: thibault.bridelbertomeu at gmail.com >>>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ >>>>> >>>>> >>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> -- >> Thibault Bridel-Bertomeu >> ? >> Eng, MSc, PhD >> Research Engineer >> CEA/CESTA >> 33114 LE BARP >> Tel.: (+33)557046924 >> Mob.: (+33)611025322 >> Mail: thibault.bridelbertomeu at gmail.com >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -- Thibault Bridel-Bertomeu ? Eng, MSc, PhD Research Engineer CEA/CESTA 33114 LE BARP Tel.: (+33)557046924 Mob.: (+33)611025322 Mail: thibault.bridelbertomeu at gmail.com -------------- next part -------------- An HTML attachment was scrubbed... 
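On the question asked just above, whether one "global set of equations" is needed for the IMEX TS: as it reads from this exchange, the wiring is a single ARKIMEX TS over the one global vector, with the explicit Euler terms in the RHS function and the implicit heat terms in the IFunction. A minimal sketch follows; FormRHS_Euler and FormIFunction_Heat are placeholder names, not routines from the thread:

  #include <petscts.h>

  extern PetscErrorCode FormRHS_Euler(TS, PetscReal, Vec, Vec, void *);           /* explicit part G(t,U)      */
  extern PetscErrorCode FormIFunction_Heat(TS, PetscReal, Vec, Vec, Vec, void *); /* implicit part F(t,U,Udot) */

  static PetscErrorCode CreateCoupledTS(DM dm, TS *ts)
  {
    PetscErrorCode ierr;

    PetscFunctionBeginUser;
    ierr = TSCreate(PetscObjectComm((PetscObject)dm), ts);CHKERRQ(ierr);
    ierr = TSSetDM(*ts, dm);CHKERRQ(ierr);
    ierr = TSSetType(*ts, TSARKIMEX);CHKERRQ(ierr);
    ierr = TSSetRHSFunction(*ts, NULL, FormRHS_Euler, NULL);CHKERRQ(ierr);    /* integrated explicitly */
    ierr = TSSetIFunction(*ts, NULL, FormIFunction_Heat, NULL);CHKERRQ(ierr); /* integrated implicitly */
    ierr = TSSetFromOptions(*ts);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }

PETSc's IMEX convention is F(t,U,Udot) = G(t,U), so the IFunction has to carry the time-derivative (mass) contribution as well as the implicit heat operator, while the RHSFunction returns only the explicit fluxes; each of the two then only contributes on the dofs of its own region.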
URL: From knepley at gmail.com Sun Jan 9 06:05:32 2022 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 9 Jan 2022 07:05:32 -0500 Subject: [petsc-users] Fluid-Structure interaction with multiple DMPlex In-Reply-To: References: Message-ID: On Sat, Jan 8, 2022 at 2:13 PM Thibault Bridel-Bertomeu < thibault.bridelbertomeu at gmail.com> wrote: > However if you use IMEX for strong coupling of the two physics solved in > each field, then it means you need to write a single set of PDEs that > covers everything, don?t you ? > If I want to solve Euler equations in one PetscDS and heat equation in the > other one, then I need to write a global set of equations to use the IMEX > TS , right ? > The way I think about it. You would have explicit terms for Euler, and they would also be confined to one part of the domain, but that just impacts how you do the residual integral. You do assemble a combined residual for all dogs, however, which I think is what you mean. Thanks, Matt > Thanks, > > Thibault > > Le sam. 8 janv. 2022 ? 20:00, Matthew Knepley a > ?crit : > >> On Sat, Jan 8, 2022 at 1:30 PM Thibault Bridel-Bertomeu < >> thibault.bridelbertomeu at gmail.com> wrote: >> >>> Yes I was wondering about different time steps as well because usually >>> implicit integration moves much faster. >>> But if it not implemented, then maybe going the ? weak coupling ? road >>> with a sub-DM is the way. >>> Can I ask how you proceed in the rocket engine code you are writing ? >>> IMEX ? >>> >> >> Right now it is IMEX, but we are explicitly substepping particles. Not >> sure what the final thing will be. >> >> Thanks, >> >> Matt >> >> >>> Thibault >>> >>> Le sam. 8 janv. 2022 ? 19:22, Matthew Knepley a >>> ?crit : >>> >>>> I do not know how. Right now, composable TS does not work all the way. >>>> >>>> Matt >>>> >>>> On Sat, Jan 8, 2022 at 1:03 PM Mark Adams wrote: >>>> >>>>> Can you subcycle with IMEX? >>>>> >>>>> On Sat, Jan 8, 2022 at 10:58 AM Matthew Knepley >>>>> wrote: >>>>> >>>>>> On Sat, Jan 8, 2022 at 3:05 AM Thibault Bridel-Bertomeu < >>>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>>> >>>>>>> Le ven. 7 janv. 2022 ? 19:45, Thibault Bridel-Bertomeu < >>>>>>> thibault.bridelbertomeu at gmail.com> a ?crit : >>>>>>> >>>>>>>> Le ven. 7 janv. 2022 ? 19:23, Matthew Knepley >>>>>>>> a ?crit : >>>>>>>> >>>>>>>>> On Fri, Jan 7, 2022 at 12:58 PM Thibault Bridel-Bertomeu < >>>>>>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>>>>>> >>>>>>>>>> >>>>>>>>>> Le ven. 7 janv. 2022 ? 14:54, Matthew Knepley >>>>>>>>>> a ?crit : >>>>>>>>>> >>>>>>>>>>> On Fri, Jan 7, 2022 at 8:52 AM Thibault Bridel-Bertomeu < >>>>>>>>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>>>>>>>> >>>>>>>>>>>> Hi Matthew, >>>>>>>>>>>> >>>>>>>>>>>> Le ven. 7 janv. 2022 ? 14:44, Matthew Knepley < >>>>>>>>>>>> knepley at gmail.com> a ?crit : >>>>>>>>>>>> >>>>>>>>>>>>> On Fri, Jan 7, 2022 at 5:46 AM Thibault Bridel-Bertomeu < >>>>>>>>>>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> Dear all, >>>>>>>>>>>>>> >>>>>>>>>>>>>> First of, happy new year everyone !! All the best ! >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Happy New Year! >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> I am starting to draft a new project that will be about >>>>>>>>>>>>>> fluid-structure interaction: in particular, the idea is to compute the >>>>>>>>>>>>>> Navier-Stokes (or Euler nevermind) flow around an object and _at the same >>>>>>>>>>>>>> time_ compute the heat equation inside the object. 
>>>>>>>>>>>>>> So basically, I am thinking a mesh of the fluid and a mesh of >>>>>>>>>>>>>> the object, both meshes being linked at the fluid - solid interface. >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> First question: Are these meshes intended to match on the >>>>>>>>>>>>> interface? If not, this sounds like overset grids or immersed >>>>>>>>>>>>> boundary/interface methods. In this case, more than one mesh makes sense to >>>>>>>>>>>>> me. If they are intended to match, then I would advocate a single mesh with >>>>>>>>>>>>> multiple problems defined on it. I have experimented with this, for example >>>>>>>>>>>>> see SNES ex23 where I have a field in only part of the domain. I have a >>>>>>>>>>>>> large project to do exactly this in a rocket engine now. >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Yes the way I see it is more of a single mesh with two distinct >>>>>>>>>>>> regions to distinguish between the fluid and the solid. I was talking about >>>>>>>>>>>> two meshes to try and explain my vision but it seems like it was unclear. >>>>>>>>>>>> Imagine if you wish a rectangular box with a sphere inclusion: >>>>>>>>>>>> the sphere would be tagged as a solid and the rest of the domain as fluid. >>>>>>>>>>>> Using Gmsh volumes for instance. >>>>>>>>>>>> Ill check out the SNES example ! Thanks ! >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> First (Matthew maybe ?) do you think it is something that >>>>>>>>>>>>>> could be done using two DMPlex's that would somehow be spawned from reading >>>>>>>>>>>>>> a Gmsh mesh with two volumes ? >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> You can take a mesh and filter out part of it with >>>>>>>>>>>>> DMPlexFilter(). That is not used much so I may have to fix it to do what >>>>>>>>>>>>> you want, but that should be easy. >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> And on one DMPlex we would have finite volume for the fluid, >>>>>>>>>>>>>> on the other finite elements for the heat eqn ? >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> I have done this exact thing on a single mesh. It should be no >>>>>>>>>>>>> harder on two meshes if you go that route. >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> Second, is it something that anyone in the community has ever >>>>>>>>>>>>>> imagined doing with PETSc DMPlex's ? >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Yes, I had a combined FV+FEM simulation of magma dynamics (I >>>>>>>>>>>>> should make it an example), and currently we are doing FVM+FEM for >>>>>>>>>>>>> simulation of a rocket engine. >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Wow so it seems like it?s the exact same thing I would like to >>>>>>>>>>>> achieve as the rocket engine example. >>>>>>>>>>>> So you have a single mesh and two regions tagged differently, >>>>>>>>>>>> and you use the DmPlexFilter to solve FVM and FEM separately ? >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> With a single mesh, you do not even need DMPlexFilter. You just >>>>>>>>>>> use the labels that Gmsh gives you. I think we should be able to get it >>>>>>>>>>> going in a straightforward way. >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Ok then ! Thanks ! I?ll give it a shot and see what happens ! >>>>>>>>>> Setting up the FVM and FEM discretizations will pass by >>>>>>>>>> DMSetField right ? With a single mesh tagged with two different regions, it >>>>>>>>>> should show up as two fields, is that correct ? >>>>>>>>>> >>>>>>>>> >>>>>>>>> Yes, the idea is as follows. Each field also has a label argument >>>>>>>>> that is the support of the field in the domain. 
Then we create PetscDS >>>>>>>>> objects for each >>>>>>>>> separate set of overlapping fields. The current algorithm is not >>>>>>>>> complete I think, so let me know if this step fails. >>>>>>>>> >>>>>>>> >>>>>>>> Ok, thanks. >>>>>>>> I?ll let you know and share snippets when I have something started ! >>>>>>>> >>>>>>>> Talk soon ! Thanks ! >>>>>>>> >>>>>>> >>>>>>> Hi Matthew, >>>>>>> >>>>>>> I thought about a little something else : what about setting two >>>>>>> different TS, one for each field of the DM ? Most probably the fluid part >>>>>>> would be solved with an explicit time stepping whereas the solid part with >>>>>>> the heat equation would benefit from implicit time stepping. TSSetDM does >>>>>>> not allow a field specification, is there a way to hack that so that each >>>>>>> field has its own TS ? >>>>>>> >>>>>> >>>>>> I see at least two options here: >>>>>> >>>>>> 1. Split the problems: >>>>>> >>>>>> You can use DMCreateSubDM() to split off part of a problem and >>>>>> use a solver on that. I have done this for problems with weak coupling. >>>>>> >>>>>> 2. Use IMEX >>>>>> >>>>>> For strong coupling, I have used the IMEX TSes in PETSc. You put >>>>>> the explicit terms in the RHS, and the implicit in the IFunction. >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Matt >>>>>> >>>>>> >>>>>>> Thanks >>>>>>> >>>>>>> Thibault >>>>>>> >>>>>>> >>>>>>>> Thibault >>>>>>>> >>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> Matt >>>>>>>>> >>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> >>>>>>>>>> Thibault >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> >>>>>>>>>>> Matt >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> Thanks ! >>>>>>>>>>>> >>>>>>>>>>>> Thibault >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> >>>>>>>>>>>>> Matt >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> As I said it is very prospective, I just wanted to have your >>>>>>>>>>>>>> opinion !! >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks very much in advance everyone !! >>>>>>>>>>>>>> >>>>>>>>>>>>>> Cheers, >>>>>>>>>>>>>> Thibault >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> -- >>>>>>>>>>>>> What most experimenters take for granted before they begin >>>>>>>>>>>>> their experiments is infinitely more interesting than any results to which >>>>>>>>>>>>> their experiments lead. >>>>>>>>>>>>> -- Norbert Wiener >>>>>>>>>>>>> >>>>>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> -- >>>>>>>>>>>> Thibault Bridel-Bertomeu >>>>>>>>>>>> ? >>>>>>>>>>>> Eng, MSc, PhD >>>>>>>>>>>> Research Engineer >>>>>>>>>>>> CEA/CESTA >>>>>>>>>>>> 33114 LE BARP >>>>>>>>>>>> Tel.: (+33)557046924 >>>>>>>>>>>> Mob.: (+33)611025322 >>>>>>>>>>>> Mail: thibault.bridelbertomeu at gmail.com >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> What most experimenters take for granted before they begin their >>>>>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>>>>> experiments lead. >>>>>>>>>>> -- Norbert Wiener >>>>>>>>>>> >>>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> Thibault Bridel-Bertomeu >>>>>>>>>> ? 
>>>>>>>>>> Eng, MSc, PhD >>>>>>>>>> Research Engineer >>>>>>>>>> CEA/CESTA >>>>>>>>>> 33114 LE BARP >>>>>>>>>> Tel.: (+33)557046924 >>>>>>>>>> Mob.: (+33)611025322 >>>>>>>>>> Mail: thibault.bridelbertomeu at gmail.com >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> What most experimenters take for granted before they begin their >>>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>>> experiments lead. >>>>>>>>> -- Norbert Wiener >>>>>>>>> >>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>> >>>>>>>>> >>>>>>>> -- >>>>>>>> Thibault Bridel-Bertomeu >>>>>>>> ? >>>>>>>> Eng, MSc, PhD >>>>>>>> Research Engineer >>>>>>>> CEA/CESTA >>>>>>>> 33114 LE BARP >>>>>>>> Tel.: (+33)557046924 >>>>>>>> Mob.: (+33)611025322 >>>>>>>> Mail: thibault.bridelbertomeu at gmail.com >>>>>>>> >>>>>>> -- >>>>>>> Thibault Bridel-Bertomeu >>>>>>> ? >>>>>>> Eng, MSc, PhD >>>>>>> Research Engineer >>>>>>> CEA/CESTA >>>>>>> 33114 LE BARP >>>>>>> Tel.: (+33)557046924 >>>>>>> Mob.: (+33)611025322 >>>>>>> Mail: thibault.bridelbertomeu at gmail.com >>>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their >>>>>> experiments is infinitely more interesting than any results to which their >>>>>> experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>> >>>>>> >>>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> -- >>> Thibault Bridel-Bertomeu >>> ? >>> Eng, MSc, PhD >>> Research Engineer >>> CEA/CESTA >>> 33114 LE BARP >>> Tel.: (+33)557046924 >>> Mob.: (+33)611025322 >>> Mail: thibault.bridelbertomeu at gmail.com >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- > Thibault Bridel-Bertomeu > ? > Eng, MSc, PhD > Research Engineer > CEA/CESTA > 33114 LE BARP > Tel.: (+33)557046924 > Mob.: (+33)611025322 > Mail: thibault.bridelbertomeu at gmail.com > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From thibault.bridelbertomeu at gmail.com Sun Jan 9 06:49:23 2022 From: thibault.bridelbertomeu at gmail.com (Thibault Bridel-Bertomeu) Date: Sun, 9 Jan 2022 13:49:23 +0100 Subject: [petsc-users] Fluid-Structure interaction with multiple DMPlex In-Reply-To: References: Message-ID: Le dim. 9 janv. 2022 ? 13:05, Matthew Knepley a ?crit : > On Sat, Jan 8, 2022 at 2:13 PM Thibault Bridel-Bertomeu < > thibault.bridelbertomeu at gmail.com> wrote: > >> However if you use IMEX for strong coupling of the two physics solved in >> each field, then it means you need to write a single set of PDEs that >> covers everything, don?t you ? >> If I want to solve Euler equations in one PetscDS and heat equation in >> the other one, then I need to write a global set of equations to use the >> IMEX TS , right ? >> > > The way I think about it. 
You would have explicit terms for Euler, and > they would also be confined to one part of the domain, but that just > impacts how you do the residual integral. You do assemble a combined > residual for all dogs, however, which I think is what you mean. > Hmm I'm not quite sure yet, probably because I haven't really started implementing it and I am not familiar with finite elements in PETSc. The way I see it is that a TS expects to be solving dU/dt = F, that's why I'm imagining that even with two domains with two different physics, one has to write the problem under the previous form. And when it comes to a FVM version of Euler + a FEM version of heat eqn, I'm not quite certain how to write it like that. Am I making any sense ? ?_o Thanks, Thibault > Thanks, > > Matt > > >> Thanks, >> >> Thibault >> >> Le sam. 8 janv. 2022 ? 20:00, Matthew Knepley a >> ?crit : >> >>> On Sat, Jan 8, 2022 at 1:30 PM Thibault Bridel-Bertomeu < >>> thibault.bridelbertomeu at gmail.com> wrote: >>> >>>> Yes I was wondering about different time steps as well because usually >>>> implicit integration moves much faster. >>>> But if it not implemented, then maybe going the ? weak coupling ? road >>>> with a sub-DM is the way. >>>> Can I ask how you proceed in the rocket engine code you are writing ? >>>> IMEX ? >>>> >>> >>> Right now it is IMEX, but we are explicitly substepping particles. Not >>> sure what the final thing will be. >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> Thibault >>>> >>>> Le sam. 8 janv. 2022 ? 19:22, Matthew Knepley a >>>> ?crit : >>>> >>>>> I do not know how. Right now, composable TS does not work all the way. >>>>> >>>>> Matt >>>>> >>>>> On Sat, Jan 8, 2022 at 1:03 PM Mark Adams wrote: >>>>> >>>>>> Can you subcycle with IMEX? >>>>>> >>>>>> On Sat, Jan 8, 2022 at 10:58 AM Matthew Knepley >>>>>> wrote: >>>>>> >>>>>>> On Sat, Jan 8, 2022 at 3:05 AM Thibault Bridel-Bertomeu < >>>>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>>>> >>>>>>>> Le ven. 7 janv. 2022 ? 19:45, Thibault Bridel-Bertomeu < >>>>>>>> thibault.bridelbertomeu at gmail.com> a ?crit : >>>>>>>> >>>>>>>>> Le ven. 7 janv. 2022 ? 19:23, Matthew Knepley >>>>>>>>> a ?crit : >>>>>>>>> >>>>>>>>>> On Fri, Jan 7, 2022 at 12:58 PM Thibault Bridel-Bertomeu < >>>>>>>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Le ven. 7 janv. 2022 ? 14:54, Matthew Knepley >>>>>>>>>>> a ?crit : >>>>>>>>>>> >>>>>>>>>>>> On Fri, Jan 7, 2022 at 8:52 AM Thibault Bridel-Bertomeu < >>>>>>>>>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Hi Matthew, >>>>>>>>>>>>> >>>>>>>>>>>>> Le ven. 7 janv. 2022 ? 14:44, Matthew Knepley < >>>>>>>>>>>>> knepley at gmail.com> a ?crit : >>>>>>>>>>>>> >>>>>>>>>>>>>> On Fri, Jan 7, 2022 at 5:46 AM Thibault Bridel-Bertomeu < >>>>>>>>>>>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Dear all, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> First of, happy new year everyone !! All the best ! >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Happy New Year! >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> I am starting to draft a new project that will be about >>>>>>>>>>>>>>> fluid-structure interaction: in particular, the idea is to compute the >>>>>>>>>>>>>>> Navier-Stokes (or Euler nevermind) flow around an object and _at the same >>>>>>>>>>>>>>> time_ compute the heat equation inside the object. >>>>>>>>>>>>>>> So basically, I am thinking a mesh of the fluid and a mesh >>>>>>>>>>>>>>> of the object, both meshes being linked at the fluid - solid interface. 
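One way to read the answer quoted just above together with the dU/dt = F worry (this is an interpretation, not something spelled out in the thread): the IMEX TS never needs a literal dU/dt = F for the whole coupled problem, it solves F(t, U, Udot) = G(t, U) for one global vector U that holds all dofs, FV Euler states on the fluid cells and FEM temperature dofs on the solid. G carries the explicit flux divergence and is simply zero on the solid dofs; F carries the Udot (mass) terms minus the implicit heat operator and reduces to Udot on the fluid dofs. So there is indeed one combined residual, but each physics only contributes on the cells carrying its label, which is what the per-label field and PetscDS setup discussed earlier is meant to provide.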
>>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> First question: Are these meshes intended to match on the >>>>>>>>>>>>>> interface? If not, this sounds like overset grids or immersed >>>>>>>>>>>>>> boundary/interface methods. In this case, more than one mesh makes sense to >>>>>>>>>>>>>> me. If they are intended to match, then I would advocate a single mesh with >>>>>>>>>>>>>> multiple problems defined on it. I have experimented with this, for example >>>>>>>>>>>>>> see SNES ex23 where I have a field in only part of the domain. I have a >>>>>>>>>>>>>> large project to do exactly this in a rocket engine now. >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Yes the way I see it is more of a single mesh with two >>>>>>>>>>>>> distinct regions to distinguish between the fluid and the solid. I was >>>>>>>>>>>>> talking about two meshes to try and explain my vision but it seems like it >>>>>>>>>>>>> was unclear. >>>>>>>>>>>>> Imagine if you wish a rectangular box with a sphere inclusion: >>>>>>>>>>>>> the sphere would be tagged as a solid and the rest of the domain as fluid. >>>>>>>>>>>>> Using Gmsh volumes for instance. >>>>>>>>>>>>> Ill check out the SNES example ! Thanks ! >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> First (Matthew maybe ?) do you think it is something that >>>>>>>>>>>>>>> could be done using two DMPlex's that would somehow be spawned from reading >>>>>>>>>>>>>>> a Gmsh mesh with two volumes ? >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> You can take a mesh and filter out part of it with >>>>>>>>>>>>>> DMPlexFilter(). That is not used much so I may have to fix it to do what >>>>>>>>>>>>>> you want, but that should be easy. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> And on one DMPlex we would have finite volume for the fluid, >>>>>>>>>>>>>>> on the other finite elements for the heat eqn ? >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> I have done this exact thing on a single mesh. It should be >>>>>>>>>>>>>> no harder on two meshes if you go that route. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Second, is it something that anyone in the community has >>>>>>>>>>>>>>> ever imagined doing with PETSc DMPlex's ? >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Yes, I had a combined FV+FEM simulation of magma dynamics (I >>>>>>>>>>>>>> should make it an example), and currently we are doing FVM+FEM for >>>>>>>>>>>>>> simulation of a rocket engine. >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Wow so it seems like it?s the exact same thing I would like to >>>>>>>>>>>>> achieve as the rocket engine example. >>>>>>>>>>>>> So you have a single mesh and two regions tagged differently, >>>>>>>>>>>>> and you use the DmPlexFilter to solve FVM and FEM separately ? >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> With a single mesh, you do not even need DMPlexFilter. You just >>>>>>>>>>>> use the labels that Gmsh gives you. I think we should be able to get it >>>>>>>>>>>> going in a straightforward way. >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Ok then ! Thanks ! I?ll give it a shot and see what happens ! >>>>>>>>>>> Setting up the FVM and FEM discretizations will pass by >>>>>>>>>>> DMSetField right ? With a single mesh tagged with two different regions, it >>>>>>>>>>> should show up as two fields, is that correct ? >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Yes, the idea is as follows. Each field also has a label argument >>>>>>>>>> that is the support of the field in the domain. Then we create PetscDS >>>>>>>>>> objects for each >>>>>>>>>> separate set of overlapping fields. 
The current algorithm is not >>>>>>>>>> complete I think, so let me know if this step fails. >>>>>>>>>> >>>>>>>>> >>>>>>>>> Ok, thanks. >>>>>>>>> I?ll let you know and share snippets when I have something started >>>>>>>>> ! >>>>>>>>> >>>>>>>>> Talk soon ! Thanks ! >>>>>>>>> >>>>>>>> >>>>>>>> Hi Matthew, >>>>>>>> >>>>>>>> I thought about a little something else : what about setting two >>>>>>>> different TS, one for each field of the DM ? Most probably the fluid part >>>>>>>> would be solved with an explicit time stepping whereas the solid part with >>>>>>>> the heat equation would benefit from implicit time stepping. TSSetDM does >>>>>>>> not allow a field specification, is there a way to hack that so that each >>>>>>>> field has its own TS ? >>>>>>>> >>>>>>> >>>>>>> I see at least two options here: >>>>>>> >>>>>>> 1. Split the problems: >>>>>>> >>>>>>> You can use DMCreateSubDM() to split off part of a problem and >>>>>>> use a solver on that. I have done this for problems with weak coupling. >>>>>>> >>>>>>> 2. Use IMEX >>>>>>> >>>>>>> For strong coupling, I have used the IMEX TSes in PETSc. You put >>>>>>> the explicit terms in the RHS, and the implicit in the IFunction. >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Matt >>>>>>> >>>>>>> >>>>>>>> Thanks >>>>>>>> >>>>>>>> Thibault >>>>>>>> >>>>>>>> >>>>>>>>> Thibault >>>>>>>>> >>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> >>>>>>>>>> Matt >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> >>>>>>>>>>> Thibault >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> >>>>>>>>>>>> Matt >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> Thanks ! >>>>>>>>>>>>> >>>>>>>>>>>>> Thibault >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Matt >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> As I said it is very prospective, I just wanted to have your >>>>>>>>>>>>>>> opinion !! >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks very much in advance everyone !! >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Cheers, >>>>>>>>>>>>>>> Thibault >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> -- >>>>>>>>>>>>>> What most experimenters take for granted before they begin >>>>>>>>>>>>>> their experiments is infinitely more interesting than any results to which >>>>>>>>>>>>>> their experiments lead. >>>>>>>>>>>>>> -- Norbert Wiener >>>>>>>>>>>>>> >>>>>>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>> -- >>>>>>>>>>>>> Thibault Bridel-Bertomeu >>>>>>>>>>>>> ? >>>>>>>>>>>>> Eng, MSc, PhD >>>>>>>>>>>>> Research Engineer >>>>>>>>>>>>> CEA/CESTA >>>>>>>>>>>>> 33114 LE BARP >>>>>>>>>>>>> Tel.: (+33)557046924 >>>>>>>>>>>>> Mob.: (+33)611025322 >>>>>>>>>>>>> Mail: thibault.bridelbertomeu at gmail.com >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> -- >>>>>>>>>>>> What most experimenters take for granted before they begin >>>>>>>>>>>> their experiments is infinitely more interesting than any results to which >>>>>>>>>>>> their experiments lead. >>>>>>>>>>>> -- Norbert Wiener >>>>>>>>>>>> >>>>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> Thibault Bridel-Bertomeu >>>>>>>>>>> ? 
>>>>>>>>>>> Eng, MSc, PhD >>>>>>>>>>> Research Engineer >>>>>>>>>>> CEA/CESTA >>>>>>>>>>> 33114 LE BARP >>>>>>>>>>> Tel.: (+33)557046924 >>>>>>>>>>> Mob.: (+33)611025322 >>>>>>>>>>> Mail: thibault.bridelbertomeu at gmail.com >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> What most experimenters take for granted before they begin their >>>>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>>>> experiments lead. >>>>>>>>>> -- Norbert Wiener >>>>>>>>>> >>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>>> >>>>>>>>>> >>>>>>>>> -- >>>>>>>>> Thibault Bridel-Bertomeu >>>>>>>>> ? >>>>>>>>> Eng, MSc, PhD >>>>>>>>> Research Engineer >>>>>>>>> CEA/CESTA >>>>>>>>> 33114 LE BARP >>>>>>>>> Tel.: (+33)557046924 >>>>>>>>> Mob.: (+33)611025322 >>>>>>>>> Mail: thibault.bridelbertomeu at gmail.com >>>>>>>>> >>>>>>>> -- >>>>>>>> Thibault Bridel-Bertomeu >>>>>>>> ? >>>>>>>> Eng, MSc, PhD >>>>>>>> Research Engineer >>>>>>>> CEA/CESTA >>>>>>>> 33114 LE BARP >>>>>>>> Tel.: (+33)557046924 >>>>>>>> Mob.: (+33)611025322 >>>>>>>> Mail: thibault.bridelbertomeu at gmail.com >>>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted before they begin their >>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>> experiments lead. >>>>>>> -- Norbert Wiener >>>>>>> >>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>> >>>>>>> >>>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ >>>>> >>>>> >>>> -- >>>> Thibault Bridel-Bertomeu >>>> ? >>>> Eng, MSc, PhD >>>> Research Engineer >>>> CEA/CESTA >>>> 33114 LE BARP >>>> Tel.: (+33)557046924 >>>> Mob.: (+33)611025322 >>>> Mail: thibault.bridelbertomeu at gmail.com >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> -- >> Thibault Bridel-Bertomeu >> ? >> Eng, MSc, PhD >> Research Engineer >> CEA/CESTA >> 33114 LE BARP >> Tel.: (+33)557046924 >> Mob.: (+33)611025322 >> Mail: thibault.bridelbertomeu at gmail.com >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sun Jan 9 08:38:23 2022 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 9 Jan 2022 09:38:23 -0500 Subject: [petsc-users] Fluid-Structure interaction with multiple DMPlex In-Reply-To: References: Message-ID: On Sun, Jan 9, 2022 at 7:49 AM Thibault Bridel-Bertomeu < thibault.bridelbertomeu at gmail.com> wrote: > Le dim. 9 janv. 2022 ? 13:05, Matthew Knepley a > ?crit : > >> On Sat, Jan 8, 2022 at 2:13 PM Thibault Bridel-Bertomeu < >> thibault.bridelbertomeu at gmail.com> wrote: >> >>> However if you use IMEX for strong coupling of the two physics solved in >>> each field, then it means you need to write a single set of PDEs that >>> covers everything, don?t you ? 
>>> If I want to solve Euler equations in one PetscDS and heat equation in >>> the other one, then I need to write a global set of equations to use the >>> IMEX TS , right ? >>> >> >> The way I think about it. You would have explicit terms for Euler, and >> they would also be confined to one part of the domain, but that just >> impacts how you do the residual integral. You do assemble a combined >> residual for all dogs, however, which I think is what you mean. >> > > Hmm I'm not quite sure yet, probably because I haven't really started > implementing it and I am not familiar with finite elements in PETSc. > The way I see it is that a TS expects to be solving dU/dt = F, that's why > I'm imagining that even with two domains with two different physics, one > has to write the problem under the previous form. And when it comes to a > FVM version of Euler + a FEM version of heat eqn, I'm not quite certain how > to write it like that. > Am I making any sense ? ?_o > Oh, if you have boundary communication, then formulating it as one system is difficult because different cells in supposedly the same DS would have different unknowns, yes. IB solves this by defining the other fields over the whole of each subdomain. Schwarz methods make two different problems and then pass values with what I call an "auxiliary field". You are right that you have to do something. Thanks, Matt > Thanks, > Thibault > > >> Thanks, >> >> Matt >> >> >>> Thanks, >>> >>> Thibault >>> >>> Le sam. 8 janv. 2022 ? 20:00, Matthew Knepley a >>> ?crit : >>> >>>> On Sat, Jan 8, 2022 at 1:30 PM Thibault Bridel-Bertomeu < >>>> thibault.bridelbertomeu at gmail.com> wrote: >>>> >>>>> Yes I was wondering about different time steps as well because usually >>>>> implicit integration moves much faster. >>>>> But if it not implemented, then maybe going the ? weak coupling ? road >>>>> with a sub-DM is the way. >>>>> Can I ask how you proceed in the rocket engine code you are writing ? >>>>> IMEX ? >>>>> >>>> >>>> Right now it is IMEX, but we are explicitly substepping particles. Not >>>> sure what the final thing will be. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> Thibault >>>>> >>>>> Le sam. 8 janv. 2022 ? 19:22, Matthew Knepley a >>>>> ?crit : >>>>> >>>>>> I do not know how. Right now, composable TS does not work all the way. >>>>>> >>>>>> Matt >>>>>> >>>>>> On Sat, Jan 8, 2022 at 1:03 PM Mark Adams wrote: >>>>>> >>>>>>> Can you subcycle with IMEX? >>>>>>> >>>>>>> On Sat, Jan 8, 2022 at 10:58 AM Matthew Knepley >>>>>>> wrote: >>>>>>> >>>>>>>> On Sat, Jan 8, 2022 at 3:05 AM Thibault Bridel-Bertomeu < >>>>>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>>>>> >>>>>>>>> Le ven. 7 janv. 2022 ? 19:45, Thibault Bridel-Bertomeu < >>>>>>>>> thibault.bridelbertomeu at gmail.com> a ?crit : >>>>>>>>> >>>>>>>>>> Le ven. 7 janv. 2022 ? 19:23, Matthew Knepley >>>>>>>>>> a ?crit : >>>>>>>>>> >>>>>>>>>>> On Fri, Jan 7, 2022 at 12:58 PM Thibault Bridel-Bertomeu < >>>>>>>>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Le ven. 7 janv. 2022 ? 14:54, Matthew Knepley < >>>>>>>>>>>> knepley at gmail.com> a ?crit : >>>>>>>>>>>> >>>>>>>>>>>>> On Fri, Jan 7, 2022 at 8:52 AM Thibault Bridel-Bertomeu < >>>>>>>>>>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> Hi Matthew, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Le ven. 7 janv. 2022 ? 
14:44, Matthew Knepley < >>>>>>>>>>>>>> knepley at gmail.com> a ?crit : >>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Fri, Jan 7, 2022 at 5:46 AM Thibault Bridel-Bertomeu < >>>>>>>>>>>>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Dear all, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> First of, happy new year everyone !! All the best ! >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Happy New Year! >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I am starting to draft a new project that will be about >>>>>>>>>>>>>>>> fluid-structure interaction: in particular, the idea is to compute the >>>>>>>>>>>>>>>> Navier-Stokes (or Euler nevermind) flow around an object and _at the same >>>>>>>>>>>>>>>> time_ compute the heat equation inside the object. >>>>>>>>>>>>>>>> So basically, I am thinking a mesh of the fluid and a mesh >>>>>>>>>>>>>>>> of the object, both meshes being linked at the fluid - solid interface. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> First question: Are these meshes intended to match on the >>>>>>>>>>>>>>> interface? If not, this sounds like overset grids or immersed >>>>>>>>>>>>>>> boundary/interface methods. In this case, more than one mesh makes sense to >>>>>>>>>>>>>>> me. If they are intended to match, then I would advocate a single mesh with >>>>>>>>>>>>>>> multiple problems defined on it. I have experimented with this, for example >>>>>>>>>>>>>>> see SNES ex23 where I have a field in only part of the domain. I have a >>>>>>>>>>>>>>> large project to do exactly this in a rocket engine now. >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Yes the way I see it is more of a single mesh with two >>>>>>>>>>>>>> distinct regions to distinguish between the fluid and the solid. I was >>>>>>>>>>>>>> talking about two meshes to try and explain my vision but it seems like it >>>>>>>>>>>>>> was unclear. >>>>>>>>>>>>>> Imagine if you wish a rectangular box with a sphere >>>>>>>>>>>>>> inclusion: the sphere would be tagged as a solid and the rest of the domain >>>>>>>>>>>>>> as fluid. Using Gmsh volumes for instance. >>>>>>>>>>>>>> Ill check out the SNES example ! Thanks ! >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> First (Matthew maybe ?) do you think it is something that >>>>>>>>>>>>>>>> could be done using two DMPlex's that would somehow be spawned from reading >>>>>>>>>>>>>>>> a Gmsh mesh with two volumes ? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> You can take a mesh and filter out part of it with >>>>>>>>>>>>>>> DMPlexFilter(). That is not used much so I may have to fix it to do what >>>>>>>>>>>>>>> you want, but that should be easy. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> And on one DMPlex we would have finite volume for the >>>>>>>>>>>>>>>> fluid, on the other finite elements for the heat eqn ? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I have done this exact thing on a single mesh. It should be >>>>>>>>>>>>>>> no harder on two meshes if you go that route. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Second, is it something that anyone in the community has >>>>>>>>>>>>>>>> ever imagined doing with PETSc DMPlex's ? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Yes, I had a combined FV+FEM simulation of magma dynamics (I >>>>>>>>>>>>>>> should make it an example), and currently we are doing FVM+FEM for >>>>>>>>>>>>>>> simulation of a rocket engine. >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Wow so it seems like it?s the exact same thing I would like >>>>>>>>>>>>>> to achieve as the rocket engine example. 
>>>>>>>>>>>>>> So you have a single mesh and two regions tagged differently, >>>>>>>>>>>>>> and you use the DmPlexFilter to solve FVM and FEM separately ? >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> With a single mesh, you do not even need DMPlexFilter. You >>>>>>>>>>>>> just use the labels that Gmsh gives you. I think we should be able to get >>>>>>>>>>>>> it going in a straightforward way. >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Ok then ! Thanks ! I?ll give it a shot and see what happens ! >>>>>>>>>>>> Setting up the FVM and FEM discretizations will pass by >>>>>>>>>>>> DMSetField right ? With a single mesh tagged with two different regions, it >>>>>>>>>>>> should show up as two fields, is that correct ? >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Yes, the idea is as follows. Each field also has a label >>>>>>>>>>> argument that is the support of the field in the domain. Then we create >>>>>>>>>>> PetscDS objects for each >>>>>>>>>>> separate set of overlapping fields. The current algorithm is not >>>>>>>>>>> complete I think, so let me know if this step fails. >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Ok, thanks. >>>>>>>>>> I?ll let you know and share snippets when I have something >>>>>>>>>> started ! >>>>>>>>>> >>>>>>>>>> Talk soon ! Thanks ! >>>>>>>>>> >>>>>>>>> >>>>>>>>> Hi Matthew, >>>>>>>>> >>>>>>>>> I thought about a little something else : what about setting two >>>>>>>>> different TS, one for each field of the DM ? Most probably the fluid part >>>>>>>>> would be solved with an explicit time stepping whereas the solid part with >>>>>>>>> the heat equation would benefit from implicit time stepping. TSSetDM does >>>>>>>>> not allow a field specification, is there a way to hack that so that each >>>>>>>>> field has its own TS ? >>>>>>>>> >>>>>>>> >>>>>>>> I see at least two options here: >>>>>>>> >>>>>>>> 1. Split the problems: >>>>>>>> >>>>>>>> You can use DMCreateSubDM() to split off part of a problem and >>>>>>>> use a solver on that. I have done this for problems with weak coupling. >>>>>>>> >>>>>>>> 2. Use IMEX >>>>>>>> >>>>>>>> For strong coupling, I have used the IMEX TSes in PETSc. You >>>>>>>> put the explicit terms in the RHS, and the implicit in the IFunction. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Matt >>>>>>>> >>>>>>>> >>>>>>>>> Thanks >>>>>>>>> >>>>>>>>> Thibault >>>>>>>>> >>>>>>>>> >>>>>>>>>> Thibault >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> >>>>>>>>>>> Matt >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> >>>>>>>>>>>> Thibault >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> >>>>>>>>>>>>> Matt >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks ! >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thibault >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Matt >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> As I said it is very prospective, I just wanted to have >>>>>>>>>>>>>>>> your opinion !! >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks very much in advance everyone !! >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Cheers, >>>>>>>>>>>>>>>> Thibault >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>> What most experimenters take for granted before they begin >>>>>>>>>>>>>>> their experiments is infinitely more interesting than any results to which >>>>>>>>>>>>>>> their experiments lead. >>>>>>>>>>>>>>> -- Norbert Wiener >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>> -- >>>>>>>>>>>>>> Thibault Bridel-Bertomeu >>>>>>>>>>>>>> ? 
>>>>>>>>>>>>>> Eng, MSc, PhD >>>>>>>>>>>>>> Research Engineer >>>>>>>>>>>>>> CEA/CESTA >>>>>>>>>>>>>> 33114 LE BARP >>>>>>>>>>>>>> Tel.: (+33)557046924 >>>>>>>>>>>>>> Mob.: (+33)611025322 >>>>>>>>>>>>>> Mail: thibault.bridelbertomeu at gmail.com >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> -- >>>>>>>>>>>>> What most experimenters take for granted before they begin >>>>>>>>>>>>> their experiments is infinitely more interesting than any results to which >>>>>>>>>>>>> their experiments lead. >>>>>>>>>>>>> -- Norbert Wiener >>>>>>>>>>>>> >>>>>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> -- >>>>>>>>>>>> Thibault Bridel-Bertomeu >>>>>>>>>>>> ? >>>>>>>>>>>> Eng, MSc, PhD >>>>>>>>>>>> Research Engineer >>>>>>>>>>>> CEA/CESTA >>>>>>>>>>>> 33114 LE BARP >>>>>>>>>>>> Tel.: (+33)557046924 >>>>>>>>>>>> Mob.: (+33)611025322 >>>>>>>>>>>> Mail: thibault.bridelbertomeu at gmail.com >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> What most experimenters take for granted before they begin their >>>>>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>>>>> experiments lead. >>>>>>>>>>> -- Norbert Wiener >>>>>>>>>>> >>>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> Thibault Bridel-Bertomeu >>>>>>>>>> ? >>>>>>>>>> Eng, MSc, PhD >>>>>>>>>> Research Engineer >>>>>>>>>> CEA/CESTA >>>>>>>>>> 33114 LE BARP >>>>>>>>>> Tel.: (+33)557046924 >>>>>>>>>> Mob.: (+33)611025322 >>>>>>>>>> Mail: thibault.bridelbertomeu at gmail.com >>>>>>>>>> >>>>>>>>> -- >>>>>>>>> Thibault Bridel-Bertomeu >>>>>>>>> ? >>>>>>>>> Eng, MSc, PhD >>>>>>>>> Research Engineer >>>>>>>>> CEA/CESTA >>>>>>>>> 33114 LE BARP >>>>>>>>> Tel.: (+33)557046924 >>>>>>>>> Mob.: (+33)611025322 >>>>>>>>> Mail: thibault.bridelbertomeu at gmail.com >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> What most experimenters take for granted before they begin their >>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>> experiments lead. >>>>>>>> -- Norbert Wiener >>>>>>>> >>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their >>>>>> experiments is infinitely more interesting than any results to which their >>>>>> experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>> >>>>>> >>>>> -- >>>>> Thibault Bridel-Bertomeu >>>>> ? >>>>> Eng, MSc, PhD >>>>> Research Engineer >>>>> CEA/CESTA >>>>> 33114 LE BARP >>>>> Tel.: (+33)557046924 >>>>> Mob.: (+33)611025322 >>>>> Mail: thibault.bridelbertomeu at gmail.com >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> -- >>> Thibault Bridel-Bertomeu >>> ? >>> Eng, MSc, PhD >>> Research Engineer >>> CEA/CESTA >>> 33114 LE BARP >>> Tel.: (+33)557046924 >>> Mob.: (+33)611025322 >>> Mail: thibault.bridelbertomeu at gmail.com >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. 
>> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From thibault.bridelbertomeu at gmail.com Sun Jan 9 09:52:59 2022 From: thibault.bridelbertomeu at gmail.com (Thibault Bridel-Bertomeu) Date: Sun, 9 Jan 2022 16:52:59 +0100 Subject: [petsc-users] Fluid-Structure interaction with multiple DMPlex In-Reply-To: References: Message-ID: Le dim. 9 janv. 2022 ? 15:38, Matthew Knepley a ?crit : > On Sun, Jan 9, 2022 at 7:49 AM Thibault Bridel-Bertomeu < > thibault.bridelbertomeu at gmail.com> wrote: > >> Le dim. 9 janv. 2022 ? 13:05, Matthew Knepley a >> ?crit : >> >>> On Sat, Jan 8, 2022 at 2:13 PM Thibault Bridel-Bertomeu < >>> thibault.bridelbertomeu at gmail.com> wrote: >>> >>>> However if you use IMEX for strong coupling of the two physics solved >>>> in each field, then it means you need to write a single set of PDEs that >>>> covers everything, don?t you ? >>>> If I want to solve Euler equations in one PetscDS and heat equation in >>>> the other one, then I need to write a global set of equations to use the >>>> IMEX TS , right ? >>>> >>> >>> The way I think about it. You would have explicit terms for Euler, and >>> they would also be confined to one part of the domain, but that just >>> impacts how you do the residual integral. You do assemble a combined >>> residual for all dogs, however, which I think is what you mean. >>> >> >> Hmm I'm not quite sure yet, probably because I haven't really started >> implementing it and I am not familiar with finite elements in PETSc. >> The way I see it is that a TS expects to be solving dU/dt = F, that's why >> I'm imagining that even with two domains with two different physics, one >> has to write the problem under the previous form. And when it comes to a >> FVM version of Euler + a FEM version of heat eqn, I'm not quite certain how >> to write it like that. >> Am I making any sense ? ?_o >> > > Oh, if you have boundary communication, then formulating it as one system > is difficult because different cells in supposedly the same DS would > have different unknowns, yes. IB solves this by defining the other fields > over the whole of each subdomain. Schwarz methods make two different > problems and then pass values with what I call an "auxiliary field". You > are right that you have to do something. > Let's imagine that we read in a Gmsh mesh into a DMPlex. That Gmsh mesh has two physical volumes so the DMPlex will a priori show two labels and therefore (?) two fields, meaning we can work with two SubDMs. Each SubDM basically matches a region of the whole mesh in this case. Now each SubDM can have its own DS and we can also attribute each DM to a TS. We can therefore solve the two problems, say one for fluid dynamics the other for heat eqn. The only thing I am not sure about (actually I haven't coded anything yet so I'm not sure of anything but ...) is the following. The two SubDMs come originally from the same DM right. Say we work in 3D, then the two SubDM must share a layer of triangles (and the segments and vertices that go along with them). That layer of triangles exist in both SubDM and is a boundary in both SubDM. 
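
For the setup itself, this is roughly what I am picturing. It is only a sketch (error checking omitted): the label name "Cell Sets", the physical volume ids 1 (fluid) and 2 (solid), and the way I split them into two labels are my assumptions about how the Gmsh regions come out of the reader, not something I have tested yet.

DM       dm, dmFluid, dmSolid;
DMLabel  cellSets, fluidLabel, solidLabel;
IS       fluidCells, solidCells, isFluid, isSolid;
PetscFV  fv;
PetscFE  fe;
PetscInt dim, fieldFluid = 0, fieldSolid = 1;

DMCreate(PETSC_COMM_WORLD, &dm);
DMSetType(dm, DMPLEX);
DMSetFromOptions(dm);                             /* run with -dm_plex_filename mesh.msh */
DMGetDimension(dm, &dim);

/* one label per region, built from the Gmsh "Cell Sets" label */
DMGetLabel(dm, "Cell Sets", &cellSets);
DMCreateLabel(dm, "fluid");
DMCreateLabel(dm, "solid");
DMGetLabel(dm, "fluid", &fluidLabel);
DMGetLabel(dm, "solid", &solidLabel);
DMLabelGetStratumIS(cellSets, 1, &fluidCells);    /* assumed: physical volume 1 = fluid */
DMLabelGetStratumIS(cellSets, 2, &solidCells);    /* assumed: physical volume 2 = solid */
DMLabelSetStratumIS(fluidLabel, 1, fluidCells);
DMLabelSetStratumIS(solidLabel, 1, solidCells);
DMPlexLabelComplete(dm, solidLabel);              /* FE dofs also sit on faces/vertices, I think */

/* field 0: Euler by FVM, supported on the fluid cells only */
PetscFVCreate(PETSC_COMM_WORLD, &fv);
PetscFVSetSpatialDimension(fv, dim);
PetscFVSetNumComponents(fv, dim + 2);
DMSetField(dm, fieldFluid, fluidLabel, (PetscObject)fv);

/* field 1: temperature by FEM, supported on the solid cells only */
PetscFECreateDefault(PETSC_COMM_WORLD, dim, 1, PETSC_TRUE, NULL, PETSC_DETERMINE, &fe);
DMSetField(dm, fieldSolid, solidLabel, (PetscObject)fe);

DMCreateDS(dm);                                   /* the PetscDS objects per set of overlapping fields */

/* one sub-DM per field; each IS maps the sub-problem dofs into the global ones */
DMCreateSubDM(dm, 1, &fieldFluid, &isFluid, &dmFluid);
DMCreateSubDM(dm, 1, &fieldSolid, &isSolid, &dmSolid);

If that is more or less right, each of dmFluid and dmSolid could then be given its own TS (explicit for the fluid, implicit for the solid), which is the weak coupling option from earlier in the thread.
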
How do I tell, for instance, the fluid SubDM that the information it needs on that layer of triangles comes from the other SubDM ? And vice versa ? Is it possible to create two SubDMs from the same DM that somehow still know each other after the creation ? Example 23 from SNES does not do that kind of thing right ? The "top" and "bottom" pieces are quite independent or am I misunderstanding sth ? Thanks !! Thibault > Thanks, > > Matt > > >> Thanks, >> Thibault >> >> >>> Thanks, >>> >>> Matt >>> >>> >>>> Thanks, >>>> >>>> Thibault >>>> >>>> Le sam. 8 janv. 2022 ? 20:00, Matthew Knepley a >>>> ?crit : >>>> >>>>> On Sat, Jan 8, 2022 at 1:30 PM Thibault Bridel-Bertomeu < >>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>> >>>>>> Yes I was wondering about different time steps as well because >>>>>> usually implicit integration moves much faster. >>>>>> But if it not implemented, then maybe going the ? weak coupling ? >>>>>> road with a sub-DM is the way. >>>>>> Can I ask how you proceed in the rocket engine code you are writing ? >>>>>> IMEX ? >>>>>> >>>>> >>>>> Right now it is IMEX, but we are explicitly substepping particles. Not >>>>> sure what the final thing will be. >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> >>>>>> Thibault >>>>>> >>>>>> Le sam. 8 janv. 2022 ? 19:22, Matthew Knepley a >>>>>> ?crit : >>>>>> >>>>>>> I do not know how. Right now, composable TS does not work all the >>>>>>> way. >>>>>>> >>>>>>> Matt >>>>>>> >>>>>>> On Sat, Jan 8, 2022 at 1:03 PM Mark Adams wrote: >>>>>>> >>>>>>>> Can you subcycle with IMEX? >>>>>>>> >>>>>>>> On Sat, Jan 8, 2022 at 10:58 AM Matthew Knepley >>>>>>>> wrote: >>>>>>>> >>>>>>>>> On Sat, Jan 8, 2022 at 3:05 AM Thibault Bridel-Bertomeu < >>>>>>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>>>>>> >>>>>>>>>> Le ven. 7 janv. 2022 ? 19:45, Thibault Bridel-Bertomeu < >>>>>>>>>> thibault.bridelbertomeu at gmail.com> a ?crit : >>>>>>>>>> >>>>>>>>>>> Le ven. 7 janv. 2022 ? 19:23, Matthew Knepley >>>>>>>>>>> a ?crit : >>>>>>>>>>> >>>>>>>>>>>> On Fri, Jan 7, 2022 at 12:58 PM Thibault Bridel-Bertomeu < >>>>>>>>>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Le ven. 7 janv. 2022 ? 14:54, Matthew Knepley < >>>>>>>>>>>>> knepley at gmail.com> a ?crit : >>>>>>>>>>>>> >>>>>>>>>>>>>> On Fri, Jan 7, 2022 at 8:52 AM Thibault Bridel-Bertomeu < >>>>>>>>>>>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Hi Matthew, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Le ven. 7 janv. 2022 ? 14:44, Matthew Knepley < >>>>>>>>>>>>>>> knepley at gmail.com> a ?crit : >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On Fri, Jan 7, 2022 at 5:46 AM Thibault Bridel-Bertomeu < >>>>>>>>>>>>>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Dear all, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> First of, happy new year everyone !! All the best ! >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Happy New Year! >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I am starting to draft a new project that will be about >>>>>>>>>>>>>>>>> fluid-structure interaction: in particular, the idea is to compute the >>>>>>>>>>>>>>>>> Navier-Stokes (or Euler nevermind) flow around an object and _at the same >>>>>>>>>>>>>>>>> time_ compute the heat equation inside the object. >>>>>>>>>>>>>>>>> So basically, I am thinking a mesh of the fluid and a mesh >>>>>>>>>>>>>>>>> of the object, both meshes being linked at the fluid - solid interface. 
>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> First question: Are these meshes intended to match on the >>>>>>>>>>>>>>>> interface? If not, this sounds like overset grids or immersed >>>>>>>>>>>>>>>> boundary/interface methods. In this case, more than one mesh makes sense to >>>>>>>>>>>>>>>> me. If they are intended to match, then I would advocate a single mesh with >>>>>>>>>>>>>>>> multiple problems defined on it. I have experimented with this, for example >>>>>>>>>>>>>>>> see SNES ex23 where I have a field in only part of the domain. I have a >>>>>>>>>>>>>>>> large project to do exactly this in a rocket engine now. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Yes the way I see it is more of a single mesh with two >>>>>>>>>>>>>>> distinct regions to distinguish between the fluid and the solid. I was >>>>>>>>>>>>>>> talking about two meshes to try and explain my vision but it seems like it >>>>>>>>>>>>>>> was unclear. >>>>>>>>>>>>>>> Imagine if you wish a rectangular box with a sphere >>>>>>>>>>>>>>> inclusion: the sphere would be tagged as a solid and the rest of the domain >>>>>>>>>>>>>>> as fluid. Using Gmsh volumes for instance. >>>>>>>>>>>>>>> Ill check out the SNES example ! Thanks ! >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> First (Matthew maybe ?) do you think it is something that >>>>>>>>>>>>>>>>> could be done using two DMPlex's that would somehow be spawned from reading >>>>>>>>>>>>>>>>> a Gmsh mesh with two volumes ? >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> You can take a mesh and filter out part of it with >>>>>>>>>>>>>>>> DMPlexFilter(). That is not used much so I may have to fix it to do what >>>>>>>>>>>>>>>> you want, but that should be easy. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> And on one DMPlex we would have finite volume for the >>>>>>>>>>>>>>>>> fluid, on the other finite elements for the heat eqn ? >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I have done this exact thing on a single mesh. It should be >>>>>>>>>>>>>>>> no harder on two meshes if you go that route. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Second, is it something that anyone in the community has >>>>>>>>>>>>>>>>> ever imagined doing with PETSc DMPlex's ? >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Yes, I had a combined FV+FEM simulation of magma dynamics >>>>>>>>>>>>>>>> (I should make it an example), and currently we are doing FVM+FEM for >>>>>>>>>>>>>>>> simulation of a rocket engine. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Wow so it seems like it?s the exact same thing I would like >>>>>>>>>>>>>>> to achieve as the rocket engine example. >>>>>>>>>>>>>>> So you have a single mesh and two regions tagged >>>>>>>>>>>>>>> differently, and you use the DmPlexFilter to solve FVM and FEM separately ? >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> With a single mesh, you do not even need DMPlexFilter. You >>>>>>>>>>>>>> just use the labels that Gmsh gives you. I think we should be able to get >>>>>>>>>>>>>> it going in a straightforward way. >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Ok then ! Thanks ! I?ll give it a shot and see what happens ! >>>>>>>>>>>>> Setting up the FVM and FEM discretizations will pass by >>>>>>>>>>>>> DMSetField right ? With a single mesh tagged with two different regions, it >>>>>>>>>>>>> should show up as two fields, is that correct ? >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Yes, the idea is as follows. 
Each field also has a label >>>>>>>>>>>> argument that is the support of the field in the domain. Then we create >>>>>>>>>>>> PetscDS objects for each >>>>>>>>>>>> separate set of overlapping fields. The current algorithm is >>>>>>>>>>>> not complete I think, so let me know if this step fails. >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Ok, thanks. >>>>>>>>>>> I?ll let you know and share snippets when I have something >>>>>>>>>>> started ! >>>>>>>>>>> >>>>>>>>>>> Talk soon ! Thanks ! >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Hi Matthew, >>>>>>>>>> >>>>>>>>>> I thought about a little something else : what about setting two >>>>>>>>>> different TS, one for each field of the DM ? Most probably the fluid part >>>>>>>>>> would be solved with an explicit time stepping whereas the solid part with >>>>>>>>>> the heat equation would benefit from implicit time stepping. TSSetDM does >>>>>>>>>> not allow a field specification, is there a way to hack that so that each >>>>>>>>>> field has its own TS ? >>>>>>>>>> >>>>>>>>> >>>>>>>>> I see at least two options here: >>>>>>>>> >>>>>>>>> 1. Split the problems: >>>>>>>>> >>>>>>>>> You can use DMCreateSubDM() to split off part of a problem and >>>>>>>>> use a solver on that. I have done this for problems with weak coupling. >>>>>>>>> >>>>>>>>> 2. Use IMEX >>>>>>>>> >>>>>>>>> For strong coupling, I have used the IMEX TSes in PETSc. You >>>>>>>>> put the explicit terms in the RHS, and the implicit in the IFunction. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> Matt >>>>>>>>> >>>>>>>>> >>>>>>>>>> Thanks >>>>>>>>>> >>>>>>>>>> Thibault >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> Thibault >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> >>>>>>>>>>>> Matt >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> >>>>>>>>>>>>> Thibault >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Matt >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks ! >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thibault >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Matt >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> As I said it is very prospective, I just wanted to have >>>>>>>>>>>>>>>>> your opinion !! >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thanks very much in advance everyone !! >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Cheers, >>>>>>>>>>>>>>>>> Thibault >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>> What most experimenters take for granted before they begin >>>>>>>>>>>>>>>> their experiments is infinitely more interesting than any results to which >>>>>>>>>>>>>>>> their experiments lead. >>>>>>>>>>>>>>>> -- Norbert Wiener >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>> Thibault Bridel-Bertomeu >>>>>>>>>>>>>>> ? >>>>>>>>>>>>>>> Eng, MSc, PhD >>>>>>>>>>>>>>> Research Engineer >>>>>>>>>>>>>>> CEA/CESTA >>>>>>>>>>>>>>> 33114 LE BARP >>>>>>>>>>>>>>> Tel.: (+33)557046924 >>>>>>>>>>>>>>> Mob.: (+33)611025322 >>>>>>>>>>>>>>> Mail: thibault.bridelbertomeu at gmail.com >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> -- >>>>>>>>>>>>>> What most experimenters take for granted before they begin >>>>>>>>>>>>>> their experiments is infinitely more interesting than any results to which >>>>>>>>>>>>>> their experiments lead. 
>>>>>>>>>>>>>> -- Norbert Wiener >>>>>>>>>>>>>> >>>>>>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>> -- >>>>>>>>>>>>> Thibault Bridel-Bertomeu >>>>>>>>>>>>> ? >>>>>>>>>>>>> Eng, MSc, PhD >>>>>>>>>>>>> Research Engineer >>>>>>>>>>>>> CEA/CESTA >>>>>>>>>>>>> 33114 LE BARP >>>>>>>>>>>>> Tel.: (+33)557046924 >>>>>>>>>>>>> Mob.: (+33)611025322 >>>>>>>>>>>>> Mail: thibault.bridelbertomeu at gmail.com >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> -- >>>>>>>>>>>> What most experimenters take for granted before they begin >>>>>>>>>>>> their experiments is infinitely more interesting than any results to which >>>>>>>>>>>> their experiments lead. >>>>>>>>>>>> -- Norbert Wiener >>>>>>>>>>>> >>>>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> Thibault Bridel-Bertomeu >>>>>>>>>>> ? >>>>>>>>>>> Eng, MSc, PhD >>>>>>>>>>> Research Engineer >>>>>>>>>>> CEA/CESTA >>>>>>>>>>> 33114 LE BARP >>>>>>>>>>> Tel.: (+33)557046924 >>>>>>>>>>> Mob.: (+33)611025322 >>>>>>>>>>> Mail: thibault.bridelbertomeu at gmail.com >>>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> Thibault Bridel-Bertomeu >>>>>>>>>> ? >>>>>>>>>> Eng, MSc, PhD >>>>>>>>>> Research Engineer >>>>>>>>>> CEA/CESTA >>>>>>>>>> 33114 LE BARP >>>>>>>>>> Tel.: (+33)557046924 >>>>>>>>>> Mob.: (+33)611025322 >>>>>>>>>> Mail: thibault.bridelbertomeu at gmail.com >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> What most experimenters take for granted before they begin their >>>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>>> experiments lead. >>>>>>>>> -- Norbert Wiener >>>>>>>>> >>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted before they begin their >>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>> experiments lead. >>>>>>> -- Norbert Wiener >>>>>>> >>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>> >>>>>>> >>>>>> -- >>>>>> Thibault Bridel-Bertomeu >>>>>> ? >>>>>> Eng, MSc, PhD >>>>>> Research Engineer >>>>>> CEA/CESTA >>>>>> 33114 LE BARP >>>>>> Tel.: (+33)557046924 >>>>>> Mob.: (+33)611025322 >>>>>> Mail: thibault.bridelbertomeu at gmail.com >>>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ >>>>> >>>>> >>>> -- >>>> Thibault Bridel-Bertomeu >>>> ? >>>> Eng, MSc, PhD >>>> Research Engineer >>>> CEA/CESTA >>>> 33114 LE BARP >>>> Tel.: (+33)557046924 >>>> Mob.: (+33)611025322 >>>> Mail: thibault.bridelbertomeu at gmail.com >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Sun Jan 9 10:07:55 2022 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 9 Jan 2022 11:07:55 -0500 Subject: [petsc-users] Fluid-Structure interaction with multiple DMPlex In-Reply-To: References: Message-ID: On Sun, Jan 9, 2022 at 10:53 AM Thibault Bridel-Bertomeu < thibault.bridelbertomeu at gmail.com> wrote: > Le dim. 9 janv. 2022 ? 15:38, Matthew Knepley a > ?crit : > >> On Sun, Jan 9, 2022 at 7:49 AM Thibault Bridel-Bertomeu < >> thibault.bridelbertomeu at gmail.com> wrote: >> >>> Le dim. 9 janv. 2022 ? 13:05, Matthew Knepley a >>> ?crit : >>> >>>> On Sat, Jan 8, 2022 at 2:13 PM Thibault Bridel-Bertomeu < >>>> thibault.bridelbertomeu at gmail.com> wrote: >>>> >>>>> However if you use IMEX for strong coupling of the two physics solved >>>>> in each field, then it means you need to write a single set of PDEs that >>>>> covers everything, don?t you ? >>>>> If I want to solve Euler equations in one PetscDS and heat equation in >>>>> the other one, then I need to write a global set of equations to use the >>>>> IMEX TS , right ? >>>>> >>>> >>>> The way I think about it. You would have explicit terms for Euler, and >>>> they would also be confined to one part of the domain, but that just >>>> impacts how you do the residual integral. You do assemble a combined >>>> residual for all dogs, however, which I think is what you mean. >>>> >>> >>> Hmm I'm not quite sure yet, probably because I haven't really started >>> implementing it and I am not familiar with finite elements in PETSc. >>> The way I see it is that a TS expects to be solving dU/dt = F, that's >>> why I'm imagining that even with two domains with two different physics, >>> one has to write the problem under the previous form. And when it comes to >>> a FVM version of Euler + a FEM version of heat eqn, I'm not quite certain >>> how to write it like that. >>> Am I making any sense ? ?_o >>> >> >> Oh, if you have boundary communication, then formulating it as one system >> is difficult because different cells in supposedly the same DS would >> have different unknowns, yes. IB solves this by defining the other fields >> over the whole of each subdomain. Schwarz methods make two different >> problems and then pass values with what I call an "auxiliary field". You >> are right that you have to do something. >> > > Let's imagine that we read in a Gmsh mesh into a DMPlex. > That Gmsh mesh has two physical volumes so the DMPlex will a priori show > two labels and therefore (?) two fields, meaning we can work with two > SubDMs. > Each SubDM basically matches a region of the whole mesh in this case. > Now each SubDM can have its own DS and we can also attribute each DM to a > TS. > We can therefore solve the two problems, say one for fluid dynamics the > other for heat eqn. > > The only thing I am not sure about (actually I haven't coded anything yet > so I'm not sure of anything but ...) is the following. > The two SubDMs come originally from the same DM right. Say we work in 3D, > then the two SubDM must share a layer of triangles (and the segments and > vertices that go along with them). That layer of triangles exist in both > SubDM and is a boundary in both SubDM. > How do I tell, for instance, the fluid SubDM that the information it needs > on that layer of triangles comes from the other SubDM ? And vice versa ? Is > it possible to create two SubDMs from the same DM that somehow still know > each other after the creation ? > Example 23 from SNES does not do that kind of thing right ? 
The "top" and > "bottom" pieces are quite independent or am I misunderstanding sth ? > The way I see it working is that you compose the maps from the original space to each subDM on the boundary. Then, when you get one solution, you can map it to points on the boundary of the other solution, which you use as an auxiliary field. Thanks, Matt > Thanks !! > > Thibault > > >> Thanks, >> >> Matt >> >> >>> Thanks, >>> Thibault >>> >>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> Thanks, >>>>> >>>>> Thibault >>>>> >>>>> Le sam. 8 janv. 2022 ? 20:00, Matthew Knepley a >>>>> ?crit : >>>>> >>>>>> On Sat, Jan 8, 2022 at 1:30 PM Thibault Bridel-Bertomeu < >>>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>>> >>>>>>> Yes I was wondering about different time steps as well because >>>>>>> usually implicit integration moves much faster. >>>>>>> But if it not implemented, then maybe going the ? weak coupling ? >>>>>>> road with a sub-DM is the way. >>>>>>> Can I ask how you proceed in the rocket engine code you are writing >>>>>>> ? IMEX ? >>>>>>> >>>>>> >>>>>> Right now it is IMEX, but we are explicitly substepping particles. >>>>>> Not sure what the final thing will be. >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Matt >>>>>> >>>>>> >>>>>>> Thibault >>>>>>> >>>>>>> Le sam. 8 janv. 2022 ? 19:22, Matthew Knepley a >>>>>>> ?crit : >>>>>>> >>>>>>>> I do not know how. Right now, composable TS does not work all the >>>>>>>> way. >>>>>>>> >>>>>>>> Matt >>>>>>>> >>>>>>>> On Sat, Jan 8, 2022 at 1:03 PM Mark Adams wrote: >>>>>>>> >>>>>>>>> Can you subcycle with IMEX? >>>>>>>>> >>>>>>>>> On Sat, Jan 8, 2022 at 10:58 AM Matthew Knepley >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> On Sat, Jan 8, 2022 at 3:05 AM Thibault Bridel-Bertomeu < >>>>>>>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>>>>>>> >>>>>>>>>>> Le ven. 7 janv. 2022 ? 19:45, Thibault Bridel-Bertomeu < >>>>>>>>>>> thibault.bridelbertomeu at gmail.com> a ?crit : >>>>>>>>>>> >>>>>>>>>>>> Le ven. 7 janv. 2022 ? 19:23, Matthew Knepley < >>>>>>>>>>>> knepley at gmail.com> a ?crit : >>>>>>>>>>>> >>>>>>>>>>>>> On Fri, Jan 7, 2022 at 12:58 PM Thibault Bridel-Bertomeu < >>>>>>>>>>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Le ven. 7 janv. 2022 ? 14:54, Matthew Knepley < >>>>>>>>>>>>>> knepley at gmail.com> a ?crit : >>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Fri, Jan 7, 2022 at 8:52 AM Thibault Bridel-Bertomeu < >>>>>>>>>>>>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Hi Matthew, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Le ven. 7 janv. 2022 ? 14:44, Matthew Knepley < >>>>>>>>>>>>>>>> knepley at gmail.com> a ?crit : >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On Fri, Jan 7, 2022 at 5:46 AM Thibault Bridel-Bertomeu < >>>>>>>>>>>>>>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Dear all, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> First of, happy new year everyone !! All the best ! >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Happy New Year! >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> I am starting to draft a new project that will be about >>>>>>>>>>>>>>>>>> fluid-structure interaction: in particular, the idea is to compute the >>>>>>>>>>>>>>>>>> Navier-Stokes (or Euler nevermind) flow around an object and _at the same >>>>>>>>>>>>>>>>>> time_ compute the heat equation inside the object. >>>>>>>>>>>>>>>>>> So basically, I am thinking a mesh of the fluid and a >>>>>>>>>>>>>>>>>> mesh of the object, both meshes being linked at the fluid - solid interface. 
>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> First question: Are these meshes intended to match on the >>>>>>>>>>>>>>>>> interface? If not, this sounds like overset grids or immersed >>>>>>>>>>>>>>>>> boundary/interface methods. In this case, more than one mesh makes sense to >>>>>>>>>>>>>>>>> me. If they are intended to match, then I would advocate a single mesh with >>>>>>>>>>>>>>>>> multiple problems defined on it. I have experimented with this, for example >>>>>>>>>>>>>>>>> see SNES ex23 where I have a field in only part of the domain. I have a >>>>>>>>>>>>>>>>> large project to do exactly this in a rocket engine now. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Yes the way I see it is more of a single mesh with two >>>>>>>>>>>>>>>> distinct regions to distinguish between the fluid and the solid. I was >>>>>>>>>>>>>>>> talking about two meshes to try and explain my vision but it seems like it >>>>>>>>>>>>>>>> was unclear. >>>>>>>>>>>>>>>> Imagine if you wish a rectangular box with a sphere >>>>>>>>>>>>>>>> inclusion: the sphere would be tagged as a solid and the rest of the domain >>>>>>>>>>>>>>>> as fluid. Using Gmsh volumes for instance. >>>>>>>>>>>>>>>> Ill check out the SNES example ! Thanks ! >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> First (Matthew maybe ?) do you think it is something that >>>>>>>>>>>>>>>>>> could be done using two DMPlex's that would somehow be spawned from reading >>>>>>>>>>>>>>>>>> a Gmsh mesh with two volumes ? >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> You can take a mesh and filter out part of it with >>>>>>>>>>>>>>>>> DMPlexFilter(). That is not used much so I may have to fix it to do what >>>>>>>>>>>>>>>>> you want, but that should be easy. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> And on one DMPlex we would have finite volume for the >>>>>>>>>>>>>>>>>> fluid, on the other finite elements for the heat eqn ? >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I have done this exact thing on a single mesh. It should >>>>>>>>>>>>>>>>> be no harder on two meshes if you go that route. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Second, is it something that anyone in the community has >>>>>>>>>>>>>>>>>> ever imagined doing with PETSc DMPlex's ? >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Yes, I had a combined FV+FEM simulation of magma dynamics >>>>>>>>>>>>>>>>> (I should make it an example), and currently we are doing FVM+FEM for >>>>>>>>>>>>>>>>> simulation of a rocket engine. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Wow so it seems like it?s the exact same thing I would like >>>>>>>>>>>>>>>> to achieve as the rocket engine example. >>>>>>>>>>>>>>>> So you have a single mesh and two regions tagged >>>>>>>>>>>>>>>> differently, and you use the DmPlexFilter to solve FVM and FEM separately ? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> With a single mesh, you do not even need DMPlexFilter. You >>>>>>>>>>>>>>> just use the labels that Gmsh gives you. I think we should be able to get >>>>>>>>>>>>>>> it going in a straightforward way. >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Ok then ! Thanks ! I?ll give it a shot and see what happens ! >>>>>>>>>>>>>> Setting up the FVM and FEM discretizations will pass by >>>>>>>>>>>>>> DMSetField right ? With a single mesh tagged with two different regions, it >>>>>>>>>>>>>> should show up as two fields, is that correct ? >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Yes, the idea is as follows. 
Each field also has a label >>>>>>>>>>>>> argument that is the support of the field in the domain. Then we create >>>>>>>>>>>>> PetscDS objects for each >>>>>>>>>>>>> separate set of overlapping fields. The current algorithm is >>>>>>>>>>>>> not complete I think, so let me know if this step fails. >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Ok, thanks. >>>>>>>>>>>> I?ll let you know and share snippets when I have something >>>>>>>>>>>> started ! >>>>>>>>>>>> >>>>>>>>>>>> Talk soon ! Thanks ! >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Hi Matthew, >>>>>>>>>>> >>>>>>>>>>> I thought about a little something else : what about setting two >>>>>>>>>>> different TS, one for each field of the DM ? Most probably the fluid part >>>>>>>>>>> would be solved with an explicit time stepping whereas the solid part with >>>>>>>>>>> the heat equation would benefit from implicit time stepping. TSSetDM does >>>>>>>>>>> not allow a field specification, is there a way to hack that so that each >>>>>>>>>>> field has its own TS ? >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> I see at least two options here: >>>>>>>>>> >>>>>>>>>> 1. Split the problems: >>>>>>>>>> >>>>>>>>>> You can use DMCreateSubDM() to split off part of a problem >>>>>>>>>> and use a solver on that. I have done this for problems with weak coupling. >>>>>>>>>> >>>>>>>>>> 2. Use IMEX >>>>>>>>>> >>>>>>>>>> For strong coupling, I have used the IMEX TSes in PETSc. You >>>>>>>>>> put the explicit terms in the RHS, and the implicit in the IFunction. >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> >>>>>>>>>> Matt >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> Thanks >>>>>>>>>>> >>>>>>>>>>> Thibault >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> Thibault >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> >>>>>>>>>>>>> Matt >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thibault >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Matt >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks ! >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thibault >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Matt >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> As I said it is very prospective, I just wanted to have >>>>>>>>>>>>>>>>>> your opinion !! >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Thanks very much in advance everyone !! >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Cheers, >>>>>>>>>>>>>>>>>> Thibault >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>>> What most experimenters take for granted before they begin >>>>>>>>>>>>>>>>> their experiments is infinitely more interesting than any results to which >>>>>>>>>>>>>>>>> their experiments lead. >>>>>>>>>>>>>>>>> -- Norbert Wiener >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>> Thibault Bridel-Bertomeu >>>>>>>>>>>>>>>> ? >>>>>>>>>>>>>>>> Eng, MSc, PhD >>>>>>>>>>>>>>>> Research Engineer >>>>>>>>>>>>>>>> CEA/CESTA >>>>>>>>>>>>>>>> 33114 LE BARP >>>>>>>>>>>>>>>> Tel.: (+33)557046924 >>>>>>>>>>>>>>>> Mob.: (+33)611025322 >>>>>>>>>>>>>>>> Mail: thibault.bridelbertomeu at gmail.com >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>> What most experimenters take for granted before they begin >>>>>>>>>>>>>>> their experiments is infinitely more interesting than any results to which >>>>>>>>>>>>>>> their experiments lead. 
>>>>>>>>>>>>>>> -- Norbert Wiener >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>> -- >>>>>>>>>>>>>> Thibault Bridel-Bertomeu >>>>>>>>>>>>>> ? >>>>>>>>>>>>>> Eng, MSc, PhD >>>>>>>>>>>>>> Research Engineer >>>>>>>>>>>>>> CEA/CESTA >>>>>>>>>>>>>> 33114 LE BARP >>>>>>>>>>>>>> Tel.: (+33)557046924 >>>>>>>>>>>>>> Mob.: (+33)611025322 >>>>>>>>>>>>>> Mail: thibault.bridelbertomeu at gmail.com >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> -- >>>>>>>>>>>>> What most experimenters take for granted before they begin >>>>>>>>>>>>> their experiments is infinitely more interesting than any results to which >>>>>>>>>>>>> their experiments lead. >>>>>>>>>>>>> -- Norbert Wiener >>>>>>>>>>>>> >>>>>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> -- >>>>>>>>>>>> Thibault Bridel-Bertomeu >>>>>>>>>>>> ? >>>>>>>>>>>> Eng, MSc, PhD >>>>>>>>>>>> Research Engineer >>>>>>>>>>>> CEA/CESTA >>>>>>>>>>>> 33114 LE BARP >>>>>>>>>>>> Tel.: (+33)557046924 >>>>>>>>>>>> Mob.: (+33)611025322 >>>>>>>>>>>> Mail: thibault.bridelbertomeu at gmail.com >>>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> Thibault Bridel-Bertomeu >>>>>>>>>>> ? >>>>>>>>>>> Eng, MSc, PhD >>>>>>>>>>> Research Engineer >>>>>>>>>>> CEA/CESTA >>>>>>>>>>> 33114 LE BARP >>>>>>>>>>> Tel.: (+33)557046924 >>>>>>>>>>> Mob.: (+33)611025322 >>>>>>>>>>> Mail: thibault.bridelbertomeu at gmail.com >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> What most experimenters take for granted before they begin their >>>>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>>>> experiments lead. >>>>>>>>>> -- Norbert Wiener >>>>>>>>>> >>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> What most experimenters take for granted before they begin their >>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>> experiments lead. >>>>>>>> -- Norbert Wiener >>>>>>>> >>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>> >>>>>>>> >>>>>>> -- >>>>>>> Thibault Bridel-Bertomeu >>>>>>> ? >>>>>>> Eng, MSc, PhD >>>>>>> Research Engineer >>>>>>> CEA/CESTA >>>>>>> 33114 LE BARP >>>>>>> Tel.: (+33)557046924 >>>>>>> Mob.: (+33)611025322 >>>>>>> Mail: thibault.bridelbertomeu at gmail.com >>>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their >>>>>> experiments is infinitely more interesting than any results to which their >>>>>> experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>> >>>>>> >>>>> -- >>>>> Thibault Bridel-Bertomeu >>>>> ? >>>>> Eng, MSc, PhD >>>>> Research Engineer >>>>> CEA/CESTA >>>>> 33114 LE BARP >>>>> Tel.: (+33)557046924 >>>>> Mob.: (+33)611025322 >>>>> Mail: thibault.bridelbertomeu at gmail.com >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. 
>> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From thibault.bridelbertomeu at gmail.com Sun Jan 9 10:16:42 2022 From: thibault.bridelbertomeu at gmail.com (Thibault Bridel-Bertomeu) Date: Sun, 9 Jan 2022 17:16:42 +0100 Subject: [petsc-users] Fluid-Structure interaction with multiple DMPlex In-Reply-To: References: Message-ID: Le dim. 9 janv. 2022 ? 17:08, Matthew Knepley a ?crit : > On Sun, Jan 9, 2022 at 10:53 AM Thibault Bridel-Bertomeu < > thibault.bridelbertomeu at gmail.com> wrote: > >> Le dim. 9 janv. 2022 ? 15:38, Matthew Knepley a >> ?crit : >> >>> On Sun, Jan 9, 2022 at 7:49 AM Thibault Bridel-Bertomeu < >>> thibault.bridelbertomeu at gmail.com> wrote: >>> >>>> Le dim. 9 janv. 2022 ? 13:05, Matthew Knepley a >>>> ?crit : >>>> >>>>> On Sat, Jan 8, 2022 at 2:13 PM Thibault Bridel-Bertomeu < >>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>> >>>>>> However if you use IMEX for strong coupling of the two physics solved >>>>>> in each field, then it means you need to write a single set of PDEs that >>>>>> covers everything, don?t you ? >>>>>> If I want to solve Euler equations in one PetscDS and heat equation >>>>>> in the other one, then I need to write a global set of equations to use the >>>>>> IMEX TS , right ? >>>>>> >>>>> >>>>> The way I think about it. You would have explicit terms for Euler, and >>>>> they would also be confined to one part of the domain, but that just >>>>> impacts how you do the residual integral. You do assemble a combined >>>>> residual for all dogs, however, which I think is what you mean. >>>>> >>>> >>>> Hmm I'm not quite sure yet, probably because I haven't really started >>>> implementing it and I am not familiar with finite elements in PETSc. >>>> The way I see it is that a TS expects to be solving dU/dt = F, that's >>>> why I'm imagining that even with two domains with two different physics, >>>> one has to write the problem under the previous form. And when it comes to >>>> a FVM version of Euler + a FEM version of heat eqn, I'm not quite certain >>>> how to write it like that. >>>> Am I making any sense ? ?_o >>>> >>> >>> Oh, if you have boundary communication, then formulating it as one >>> system is difficult because different cells in supposedly the same DS would >>> have different unknowns, yes. IB solves this by defining the other >>> fields over the whole of each subdomain. Schwarz methods make two different >>> problems and then pass values with what I call an "auxiliary field". You >>> are right that you have to do something. >>> >> >> Let's imagine that we read in a Gmsh mesh into a DMPlex. >> That Gmsh mesh has two physical volumes so the DMPlex will a priori show >> two labels and therefore (?) two fields, meaning we can work with two >> SubDMs. >> Each SubDM basically matches a region of the whole mesh in this case. >> Now each SubDM can have its own DS and we can also attribute each DM to a >> TS. >> We can therefore solve the two problems, say one for fluid dynamics the >> other for heat eqn. >> >> The only thing I am not sure about (actually I haven't coded anything yet >> so I'm not sure of anything but ...) is the following. >> The two SubDMs come originally from the same DM right. 
Say we work in 3D, >> then the two SubDM must share a layer of triangles (and the segments and >> vertices that go along with them). That layer of triangles exist in both >> SubDM and is a boundary in both SubDM. >> How do I tell, for instance, the fluid SubDM that the information it >> needs on that layer of triangles comes from the other SubDM ? And vice >> versa ? Is it possible to create two SubDMs from the same DM that somehow >> still know each other after the creation ? >> Example 23 from SNES does not do that kind of thing right ? The "top" and >> "bottom" pieces are quite independent or am I misunderstanding sth ? >> > > The way I see it working is that you compose the maps from the original > space to each subDM on the boundary. Then, when you get one solution, you > can map it to points on the boundary of the other solution, which you use > as an auxiliary field. > Okay yes, I get what you mean. Is there a method in PETSc to do such things ? Does it have to do with IS ? Thanks ! Thibault > > Thanks, > > Matt > > >> Thanks !! >> >> Thibault >> >> >>> Thanks, >>> >>> Matt >>> >>> >>>> Thanks, >>>> Thibault >>>> >>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> >>>>>> Thanks, >>>>>> >>>>>> Thibault >>>>>> >>>>>> Le sam. 8 janv. 2022 ? 20:00, Matthew Knepley a >>>>>> ?crit : >>>>>> >>>>>>> On Sat, Jan 8, 2022 at 1:30 PM Thibault Bridel-Bertomeu < >>>>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>>>> >>>>>>>> Yes I was wondering about different time steps as well because >>>>>>>> usually implicit integration moves much faster. >>>>>>>> But if it not implemented, then maybe going the ? weak coupling ? >>>>>>>> road with a sub-DM is the way. >>>>>>>> Can I ask how you proceed in the rocket engine code you are writing >>>>>>>> ? IMEX ? >>>>>>>> >>>>>>> >>>>>>> Right now it is IMEX, but we are explicitly substepping particles. >>>>>>> Not sure what the final thing will be. >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Matt >>>>>>> >>>>>>> >>>>>>>> Thibault >>>>>>>> >>>>>>>> Le sam. 8 janv. 2022 ? 19:22, Matthew Knepley >>>>>>>> a ?crit : >>>>>>>> >>>>>>>>> I do not know how. Right now, composable TS does not work all the >>>>>>>>> way. >>>>>>>>> >>>>>>>>> Matt >>>>>>>>> >>>>>>>>> On Sat, Jan 8, 2022 at 1:03 PM Mark Adams wrote: >>>>>>>>> >>>>>>>>>> Can you subcycle with IMEX? >>>>>>>>>> >>>>>>>>>> On Sat, Jan 8, 2022 at 10:58 AM Matthew Knepley < >>>>>>>>>> knepley at gmail.com> wrote: >>>>>>>>>> >>>>>>>>>>> On Sat, Jan 8, 2022 at 3:05 AM Thibault Bridel-Bertomeu < >>>>>>>>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>>>>>>>> >>>>>>>>>>>> Le ven. 7 janv. 2022 ? 19:45, Thibault Bridel-Bertomeu < >>>>>>>>>>>> thibault.bridelbertomeu at gmail.com> a ?crit : >>>>>>>>>>>> >>>>>>>>>>>>> Le ven. 7 janv. 2022 ? 19:23, Matthew Knepley < >>>>>>>>>>>>> knepley at gmail.com> a ?crit : >>>>>>>>>>>>> >>>>>>>>>>>>>> On Fri, Jan 7, 2022 at 12:58 PM Thibault Bridel-Bertomeu < >>>>>>>>>>>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Le ven. 7 janv. 2022 ? 14:54, Matthew Knepley < >>>>>>>>>>>>>>> knepley at gmail.com> a ?crit : >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On Fri, Jan 7, 2022 at 8:52 AM Thibault Bridel-Bertomeu < >>>>>>>>>>>>>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Hi Matthew, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Le ven. 7 janv. 2022 ? 
14:44, Matthew Knepley < >>>>>>>>>>>>>>>>> knepley at gmail.com> a ?crit : >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On Fri, Jan 7, 2022 at 5:46 AM Thibault Bridel-Bertomeu < >>>>>>>>>>>>>>>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Dear all, >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> First of, happy new year everyone !! All the best ! >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Happy New Year! >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> I am starting to draft a new project that will be about >>>>>>>>>>>>>>>>>>> fluid-structure interaction: in particular, the idea is to compute the >>>>>>>>>>>>>>>>>>> Navier-Stokes (or Euler nevermind) flow around an object and _at the same >>>>>>>>>>>>>>>>>>> time_ compute the heat equation inside the object. >>>>>>>>>>>>>>>>>>> So basically, I am thinking a mesh of the fluid and a >>>>>>>>>>>>>>>>>>> mesh of the object, both meshes being linked at the fluid - solid interface. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> First question: Are these meshes intended to match on the >>>>>>>>>>>>>>>>>> interface? If not, this sounds like overset grids or immersed >>>>>>>>>>>>>>>>>> boundary/interface methods. In this case, more than one mesh makes sense to >>>>>>>>>>>>>>>>>> me. If they are intended to match, then I would advocate a single mesh with >>>>>>>>>>>>>>>>>> multiple problems defined on it. I have experimented with this, for example >>>>>>>>>>>>>>>>>> see SNES ex23 where I have a field in only part of the domain. I have a >>>>>>>>>>>>>>>>>> large project to do exactly this in a rocket engine now. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Yes the way I see it is more of a single mesh with two >>>>>>>>>>>>>>>>> distinct regions to distinguish between the fluid and the solid. I was >>>>>>>>>>>>>>>>> talking about two meshes to try and explain my vision but it seems like it >>>>>>>>>>>>>>>>> was unclear. >>>>>>>>>>>>>>>>> Imagine if you wish a rectangular box with a sphere >>>>>>>>>>>>>>>>> inclusion: the sphere would be tagged as a solid and the rest of the domain >>>>>>>>>>>>>>>>> as fluid. Using Gmsh volumes for instance. >>>>>>>>>>>>>>>>> Ill check out the SNES example ! Thanks ! >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> First (Matthew maybe ?) do you think it is something >>>>>>>>>>>>>>>>>>> that could be done using two DMPlex's that would somehow be spawned from >>>>>>>>>>>>>>>>>>> reading a Gmsh mesh with two volumes ? >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> You can take a mesh and filter out part of it with >>>>>>>>>>>>>>>>>> DMPlexFilter(). That is not used much so I may have to fix it to do what >>>>>>>>>>>>>>>>>> you want, but that should be easy. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> And on one DMPlex we would have finite volume for the >>>>>>>>>>>>>>>>>>> fluid, on the other finite elements for the heat eqn ? >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> I have done this exact thing on a single mesh. It should >>>>>>>>>>>>>>>>>> be no harder on two meshes if you go that route. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Second, is it something that anyone in the community has >>>>>>>>>>>>>>>>>>> ever imagined doing with PETSc DMPlex's ? 
>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Yes, I had a combined FV+FEM simulation of magma dynamics >>>>>>>>>>>>>>>>>> (I should make it an example), and currently we are doing FVM+FEM for >>>>>>>>>>>>>>>>>> simulation of a rocket engine. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Wow so it seems like it?s the exact same thing I would >>>>>>>>>>>>>>>>> like to achieve as the rocket engine example. >>>>>>>>>>>>>>>>> So you have a single mesh and two regions tagged >>>>>>>>>>>>>>>>> differently, and you use the DmPlexFilter to solve FVM and FEM separately ? >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> With a single mesh, you do not even need DMPlexFilter. You >>>>>>>>>>>>>>>> just use the labels that Gmsh gives you. I think we should be able to get >>>>>>>>>>>>>>>> it going in a straightforward way. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Ok then ! Thanks ! I?ll give it a shot and see what happens >>>>>>>>>>>>>>> ! >>>>>>>>>>>>>>> Setting up the FVM and FEM discretizations will pass by >>>>>>>>>>>>>>> DMSetField right ? With a single mesh tagged with two different regions, it >>>>>>>>>>>>>>> should show up as two fields, is that correct ? >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Yes, the idea is as follows. Each field also has a label >>>>>>>>>>>>>> argument that is the support of the field in the domain. Then we create >>>>>>>>>>>>>> PetscDS objects for each >>>>>>>>>>>>>> separate set of overlapping fields. The current algorithm is >>>>>>>>>>>>>> not complete I think, so let me know if this step fails. >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Ok, thanks. >>>>>>>>>>>>> I?ll let you know and share snippets when I have something >>>>>>>>>>>>> started ! >>>>>>>>>>>>> >>>>>>>>>>>>> Talk soon ! Thanks ! >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Hi Matthew, >>>>>>>>>>>> >>>>>>>>>>>> I thought about a little something else : what about setting >>>>>>>>>>>> two different TS, one for each field of the DM ? Most probably the fluid >>>>>>>>>>>> part would be solved with an explicit time stepping whereas the solid part >>>>>>>>>>>> with the heat equation would benefit from implicit time stepping. TSSetDM >>>>>>>>>>>> does not allow a field specification, is there a way to hack that so that >>>>>>>>>>>> each field has its own TS ? >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> I see at least two options here: >>>>>>>>>>> >>>>>>>>>>> 1. Split the problems: >>>>>>>>>>> >>>>>>>>>>> You can use DMCreateSubDM() to split off part of a problem >>>>>>>>>>> and use a solver on that. I have done this for problems with weak coupling. >>>>>>>>>>> >>>>>>>>>>> 2. Use IMEX >>>>>>>>>>> >>>>>>>>>>> For strong coupling, I have used the IMEX TSes in PETSc. You >>>>>>>>>>> put the explicit terms in the RHS, and the implicit in the IFunction. >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> >>>>>>>>>>> Matt >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> Thanks >>>>>>>>>>>> >>>>>>>>>>>> Thibault >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> Thibault >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Matt >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thibault >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Matt >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thanks ! 
>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thibault >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Matt >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> As I said it is very prospective, I just wanted to have >>>>>>>>>>>>>>>>>>> your opinion !! >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Thanks very much in advance everyone !! >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Cheers, >>>>>>>>>>>>>>>>>>> Thibault >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>>>> What most experimenters take for granted before they >>>>>>>>>>>>>>>>>> begin their experiments is infinitely more interesting than any results to >>>>>>>>>>>>>>>>>> which their experiments lead. >>>>>>>>>>>>>>>>>> -- Norbert Wiener >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>>> Thibault Bridel-Bertomeu >>>>>>>>>>>>>>>>> ? >>>>>>>>>>>>>>>>> Eng, MSc, PhD >>>>>>>>>>>>>>>>> Research Engineer >>>>>>>>>>>>>>>>> CEA/CESTA >>>>>>>>>>>>>>>>> 33114 LE BARP >>>>>>>>>>>>>>>>> Tel.: (+33)557046924 >>>>>>>>>>>>>>>>> Mob.: (+33)611025322 >>>>>>>>>>>>>>>>> Mail: thibault.bridelbertomeu at gmail.com >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>> What most experimenters take for granted before they begin >>>>>>>>>>>>>>>> their experiments is infinitely more interesting than any results to which >>>>>>>>>>>>>>>> their experiments lead. >>>>>>>>>>>>>>>> -- Norbert Wiener >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>> Thibault Bridel-Bertomeu >>>>>>>>>>>>>>> ? >>>>>>>>>>>>>>> Eng, MSc, PhD >>>>>>>>>>>>>>> Research Engineer >>>>>>>>>>>>>>> CEA/CESTA >>>>>>>>>>>>>>> 33114 LE BARP >>>>>>>>>>>>>>> Tel.: (+33)557046924 >>>>>>>>>>>>>>> Mob.: (+33)611025322 >>>>>>>>>>>>>>> Mail: thibault.bridelbertomeu at gmail.com >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> -- >>>>>>>>>>>>>> What most experimenters take for granted before they begin >>>>>>>>>>>>>> their experiments is infinitely more interesting than any results to which >>>>>>>>>>>>>> their experiments lead. >>>>>>>>>>>>>> -- Norbert Wiener >>>>>>>>>>>>>> >>>>>>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>> -- >>>>>>>>>>>>> Thibault Bridel-Bertomeu >>>>>>>>>>>>> ? >>>>>>>>>>>>> Eng, MSc, PhD >>>>>>>>>>>>> Research Engineer >>>>>>>>>>>>> CEA/CESTA >>>>>>>>>>>>> 33114 LE BARP >>>>>>>>>>>>> Tel.: (+33)557046924 >>>>>>>>>>>>> Mob.: (+33)611025322 >>>>>>>>>>>>> Mail: thibault.bridelbertomeu at gmail.com >>>>>>>>>>>>> >>>>>>>>>>>> -- >>>>>>>>>>>> Thibault Bridel-Bertomeu >>>>>>>>>>>> ? >>>>>>>>>>>> Eng, MSc, PhD >>>>>>>>>>>> Research Engineer >>>>>>>>>>>> CEA/CESTA >>>>>>>>>>>> 33114 LE BARP >>>>>>>>>>>> Tel.: (+33)557046924 >>>>>>>>>>>> Mob.: (+33)611025322 >>>>>>>>>>>> Mail: thibault.bridelbertomeu at gmail.com >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> What most experimenters take for granted before they begin their >>>>>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>>>>> experiments lead. 
>>>>>>>>>>> -- Norbert Wiener >>>>>>>>>>> >>>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> What most experimenters take for granted before they begin their >>>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>>> experiments lead. >>>>>>>>> -- Norbert Wiener >>>>>>>>> >>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>> >>>>>>>>> >>>>>>>> -- >>>>>>>> Thibault Bridel-Bertomeu >>>>>>>> ? >>>>>>>> Eng, MSc, PhD >>>>>>>> Research Engineer >>>>>>>> CEA/CESTA >>>>>>>> 33114 LE BARP >>>>>>>> Tel.: (+33)557046924 >>>>>>>> Mob.: (+33)611025322 >>>>>>>> Mail: thibault.bridelbertomeu at gmail.com >>>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted before they begin their >>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>> experiments lead. >>>>>>> -- Norbert Wiener >>>>>>> >>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>> >>>>>>> >>>>>> -- >>>>>> Thibault Bridel-Bertomeu >>>>>> ? >>>>>> Eng, MSc, PhD >>>>>> Research Engineer >>>>>> CEA/CESTA >>>>>> 33114 LE BARP >>>>>> Tel.: (+33)557046924 >>>>>> Mob.: (+33)611025322 >>>>>> Mail: thibault.bridelbertomeu at gmail.com >>>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ >>>>> >>>>> >>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sun Jan 9 10:24:13 2022 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 9 Jan 2022 11:24:13 -0500 Subject: [petsc-users] Fluid-Structure interaction with multiple DMPlex In-Reply-To: References: Message-ID: On Sun, Jan 9, 2022 at 11:16 AM Thibault Bridel-Bertomeu < thibault.bridelbertomeu at gmail.com> wrote: > Le dim. 9 janv. 2022 ? 17:08, Matthew Knepley a > ?crit : > >> On Sun, Jan 9, 2022 at 10:53 AM Thibault Bridel-Bertomeu < >> thibault.bridelbertomeu at gmail.com> wrote: >> >>> Le dim. 9 janv. 2022 ? 15:38, Matthew Knepley a >>> ?crit : >>> >>>> On Sun, Jan 9, 2022 at 7:49 AM Thibault Bridel-Bertomeu < >>>> thibault.bridelbertomeu at gmail.com> wrote: >>>> >>>>> Le dim. 9 janv. 2022 ? 13:05, Matthew Knepley a >>>>> ?crit : >>>>> >>>>>> On Sat, Jan 8, 2022 at 2:13 PM Thibault Bridel-Bertomeu < >>>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>>> >>>>>>> However if you use IMEX for strong coupling of the two physics >>>>>>> solved in each field, then it means you need to write a single set of PDEs >>>>>>> that covers everything, don?t you ? >>>>>>> If I want to solve Euler equations in one PetscDS and heat equation >>>>>>> in the other one, then I need to write a global set of equations to use the >>>>>>> IMEX TS , right ? >>>>>>> >>>>>> >>>>>> The way I think about it. 
You would have explicit terms for Euler, >>>>>> and they would also be confined to one part of the domain, but that just >>>>>> impacts how you do the residual integral. You do assemble a combined >>>>>> residual for all dogs, however, which I think is what you mean. >>>>>> >>>>> >>>>> Hmm I'm not quite sure yet, probably because I haven't really started >>>>> implementing it and I am not familiar with finite elements in PETSc. >>>>> The way I see it is that a TS expects to be solving dU/dt = F, that's >>>>> why I'm imagining that even with two domains with two different physics, >>>>> one has to write the problem under the previous form. And when it comes to >>>>> a FVM version of Euler + a FEM version of heat eqn, I'm not quite certain >>>>> how to write it like that. >>>>> Am I making any sense ? ?_o >>>>> >>>> >>>> Oh, if you have boundary communication, then formulating it as one >>>> system is difficult because different cells in supposedly the same DS would >>>> have different unknowns, yes. IB solves this by defining the other >>>> fields over the whole of each subdomain. Schwarz methods make two different >>>> problems and then pass values with what I call an "auxiliary field". >>>> You are right that you have to do something. >>>> >>> >>> Let's imagine that we read in a Gmsh mesh into a DMPlex. >>> That Gmsh mesh has two physical volumes so the DMPlex will a priori show >>> two labels and therefore (?) two fields, meaning we can work with two >>> SubDMs. >>> Each SubDM basically matches a region of the whole mesh in this case. >>> Now each SubDM can have its own DS and we can also attribute each DM to >>> a TS. >>> We can therefore solve the two problems, say one for fluid dynamics the >>> other for heat eqn. >>> >>> The only thing I am not sure about (actually I haven't coded anything >>> yet so I'm not sure of anything but ...) is the following. >>> The two SubDMs come originally from the same DM right. Say we work in >>> 3D, then the two SubDM must share a layer of triangles (and the segments >>> and vertices that go along with them). That layer of triangles exist in >>> both SubDM and is a boundary in both SubDM. >>> How do I tell, for instance, the fluid SubDM that the information it >>> needs on that layer of triangles comes from the other SubDM ? And vice >>> versa ? Is it possible to create two SubDMs from the same DM that somehow >>> still know each other after the creation ? >>> Example 23 from SNES does not do that kind of thing right ? The "top" >>> and "bottom" pieces are quite independent or am I misunderstanding sth ? >>> >> >> The way I see it working is that you compose the maps from the original >> space to each subDM on the boundary. Then, when you get one solution, you >> can map it to points on the boundary of the other solution, which you use >> as an auxiliary field. >> > > Okay yes, I get what you mean. Is there a method in PETSc to do such > things ? Does it have to do with IS ? > Yes, we can do it. It is just IS manipulation. Thanks, Matt > Thanks ! > Thibault > > >> >> Thanks, >> >> Matt >> >> >>> Thanks !! >>> >>> Thibault >>> >>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> Thanks, >>>>> Thibault >>>>> >>>>> >>>>>> Thanks, >>>>>> >>>>>> Matt >>>>>> >>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Thibault >>>>>>> >>>>>>> Le sam. 8 janv. 2022 ? 
20:00, Matthew Knepley a >>>>>>> ?crit : >>>>>>> >>>>>>>> On Sat, Jan 8, 2022 at 1:30 PM Thibault Bridel-Bertomeu < >>>>>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>>>>> >>>>>>>>> Yes I was wondering about different time steps as well because >>>>>>>>> usually implicit integration moves much faster. >>>>>>>>> But if it not implemented, then maybe going the ? weak coupling ? >>>>>>>>> road with a sub-DM is the way. >>>>>>>>> Can I ask how you proceed in the rocket engine code you are >>>>>>>>> writing ? IMEX ? >>>>>>>>> >>>>>>>> >>>>>>>> Right now it is IMEX, but we are explicitly substepping particles. >>>>>>>> Not sure what the final thing will be. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Matt >>>>>>>> >>>>>>>> >>>>>>>>> Thibault >>>>>>>>> >>>>>>>>> Le sam. 8 janv. 2022 ? 19:22, Matthew Knepley >>>>>>>>> a ?crit : >>>>>>>>> >>>>>>>>>> I do not know how. Right now, composable TS does not work all the >>>>>>>>>> way. >>>>>>>>>> >>>>>>>>>> Matt >>>>>>>>>> >>>>>>>>>> On Sat, Jan 8, 2022 at 1:03 PM Mark Adams >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> Can you subcycle with IMEX? >>>>>>>>>>> >>>>>>>>>>> On Sat, Jan 8, 2022 at 10:58 AM Matthew Knepley < >>>>>>>>>>> knepley at gmail.com> wrote: >>>>>>>>>>> >>>>>>>>>>>> On Sat, Jan 8, 2022 at 3:05 AM Thibault Bridel-Bertomeu < >>>>>>>>>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Le ven. 7 janv. 2022 ? 19:45, Thibault Bridel-Bertomeu < >>>>>>>>>>>>> thibault.bridelbertomeu at gmail.com> a ?crit : >>>>>>>>>>>>> >>>>>>>>>>>>>> Le ven. 7 janv. 2022 ? 19:23, Matthew Knepley < >>>>>>>>>>>>>> knepley at gmail.com> a ?crit : >>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Fri, Jan 7, 2022 at 12:58 PM Thibault Bridel-Bertomeu < >>>>>>>>>>>>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Le ven. 7 janv. 2022 ? 14:54, Matthew Knepley < >>>>>>>>>>>>>>>> knepley at gmail.com> a ?crit : >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On Fri, Jan 7, 2022 at 8:52 AM Thibault Bridel-Bertomeu < >>>>>>>>>>>>>>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Hi Matthew, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Le ven. 7 janv. 2022 ? 14:44, Matthew Knepley < >>>>>>>>>>>>>>>>>> knepley at gmail.com> a ?crit : >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On Fri, Jan 7, 2022 at 5:46 AM Thibault Bridel-Bertomeu < >>>>>>>>>>>>>>>>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Dear all, >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> First of, happy new year everyone !! All the best ! >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Happy New Year! >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> I am starting to draft a new project that will be about >>>>>>>>>>>>>>>>>>>> fluid-structure interaction: in particular, the idea is to compute the >>>>>>>>>>>>>>>>>>>> Navier-Stokes (or Euler nevermind) flow around an object and _at the same >>>>>>>>>>>>>>>>>>>> time_ compute the heat equation inside the object. >>>>>>>>>>>>>>>>>>>> So basically, I am thinking a mesh of the fluid and a >>>>>>>>>>>>>>>>>>>> mesh of the object, both meshes being linked at the fluid - solid interface. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> First question: Are these meshes intended to match on >>>>>>>>>>>>>>>>>>> the interface? If not, this sounds like overset grids or immersed >>>>>>>>>>>>>>>>>>> boundary/interface methods. In this case, more than one mesh makes sense to >>>>>>>>>>>>>>>>>>> me. 
If they are intended to match, then I would advocate a single mesh with >>>>>>>>>>>>>>>>>>> multiple problems defined on it. I have experimented with this, for example >>>>>>>>>>>>>>>>>>> see SNES ex23 where I have a field in only part of the domain. I have a >>>>>>>>>>>>>>>>>>> large project to do exactly this in a rocket engine now. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Yes the way I see it is more of a single mesh with two >>>>>>>>>>>>>>>>>> distinct regions to distinguish between the fluid and the solid. I was >>>>>>>>>>>>>>>>>> talking about two meshes to try and explain my vision but it seems like it >>>>>>>>>>>>>>>>>> was unclear. >>>>>>>>>>>>>>>>>> Imagine if you wish a rectangular box with a sphere >>>>>>>>>>>>>>>>>> inclusion: the sphere would be tagged as a solid and the rest of the domain >>>>>>>>>>>>>>>>>> as fluid. Using Gmsh volumes for instance. >>>>>>>>>>>>>>>>>> Ill check out the SNES example ! Thanks ! >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> First (Matthew maybe ?) do you think it is something >>>>>>>>>>>>>>>>>>>> that could be done using two DMPlex's that would somehow be spawned from >>>>>>>>>>>>>>>>>>>> reading a Gmsh mesh with two volumes ? >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> You can take a mesh and filter out part of it with >>>>>>>>>>>>>>>>>>> DMPlexFilter(). That is not used much so I may have to fix it to do what >>>>>>>>>>>>>>>>>>> you want, but that should be easy. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> And on one DMPlex we would have finite volume for the >>>>>>>>>>>>>>>>>>>> fluid, on the other finite elements for the heat eqn ? >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> I have done this exact thing on a single mesh. It should >>>>>>>>>>>>>>>>>>> be no harder on two meshes if you go that route. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Second, is it something that anyone in the community >>>>>>>>>>>>>>>>>>>> has ever imagined doing with PETSc DMPlex's ? >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Yes, I had a combined FV+FEM simulation of magma >>>>>>>>>>>>>>>>>>> dynamics (I should make it an example), and currently we are doing FVM+FEM >>>>>>>>>>>>>>>>>>> for simulation of a rocket engine. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Wow so it seems like it?s the exact same thing I would >>>>>>>>>>>>>>>>>> like to achieve as the rocket engine example. >>>>>>>>>>>>>>>>>> So you have a single mesh and two regions tagged >>>>>>>>>>>>>>>>>> differently, and you use the DmPlexFilter to solve FVM and FEM separately ? >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> With a single mesh, you do not even need DMPlexFilter. You >>>>>>>>>>>>>>>>> just use the labels that Gmsh gives you. I think we should be able to get >>>>>>>>>>>>>>>>> it going in a straightforward way. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Ok then ! Thanks ! I?ll give it a shot and see what happens >>>>>>>>>>>>>>>> ! >>>>>>>>>>>>>>>> Setting up the FVM and FEM discretizations will pass by >>>>>>>>>>>>>>>> DMSetField right ? With a single mesh tagged with two different regions, it >>>>>>>>>>>>>>>> should show up as two fields, is that correct ? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Yes, the idea is as follows. Each field also has a label >>>>>>>>>>>>>>> argument that is the support of the field in the domain. 
Then we create >>>>>>>>>>>>>>> PetscDS objects for each >>>>>>>>>>>>>>> separate set of overlapping fields. The current algorithm is >>>>>>>>>>>>>>> not complete I think, so let me know if this step fails. >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Ok, thanks. >>>>>>>>>>>>>> I?ll let you know and share snippets when I have something >>>>>>>>>>>>>> started ! >>>>>>>>>>>>>> >>>>>>>>>>>>>> Talk soon ! Thanks ! >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Hi Matthew, >>>>>>>>>>>>> >>>>>>>>>>>>> I thought about a little something else : what about setting >>>>>>>>>>>>> two different TS, one for each field of the DM ? Most probably the fluid >>>>>>>>>>>>> part would be solved with an explicit time stepping whereas the solid part >>>>>>>>>>>>> with the heat equation would benefit from implicit time stepping. TSSetDM >>>>>>>>>>>>> does not allow a field specification, is there a way to hack that so that >>>>>>>>>>>>> each field has its own TS ? >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> I see at least two options here: >>>>>>>>>>>> >>>>>>>>>>>> 1. Split the problems: >>>>>>>>>>>> >>>>>>>>>>>> You can use DMCreateSubDM() to split off part of a problem >>>>>>>>>>>> and use a solver on that. I have done this for problems with weak coupling. >>>>>>>>>>>> >>>>>>>>>>>> 2. Use IMEX >>>>>>>>>>>> >>>>>>>>>>>> For strong coupling, I have used the IMEX TSes in PETSc. >>>>>>>>>>>> You put the explicit terms in the RHS, and the implicit in the IFunction. >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> >>>>>>>>>>>> Matt >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> Thanks >>>>>>>>>>>>> >>>>>>>>>>>>> Thibault >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> Thibault >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Matt >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thibault >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Matt >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Thanks ! >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Thibault >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Matt >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> As I said it is very prospective, I just wanted to have >>>>>>>>>>>>>>>>>>>> your opinion !! >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Thanks very much in advance everyone !! >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Cheers, >>>>>>>>>>>>>>>>>>>> Thibault >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>>>>> What most experimenters take for granted before they >>>>>>>>>>>>>>>>>>> begin their experiments is infinitely more interesting than any results to >>>>>>>>>>>>>>>>>>> which their experiments lead. >>>>>>>>>>>>>>>>>>> -- Norbert Wiener >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>>>> Thibault Bridel-Bertomeu >>>>>>>>>>>>>>>>>> ? 
>>>>>>> Eng, MSc, PhD >>>>>>> Research Engineer >>>>>>> CEA/CESTA >>>>>>> 33114 LE BARP >>>>>>> Tel.: (+33)557046924 >>>>>>> Mob.: (+33)611025322 >>>>>>> Mail: thibault.bridelbertomeu at gmail.com >>>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their >>>>>> experiments is infinitely more interesting than any results to which their >>>>>> experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>> >>>>>> >>>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From thibault.bridelbertomeu at gmail.com Sun Jan 9 10:32:57 2022 From: thibault.bridelbertomeu at gmail.com (Thibault Bridel-Bertomeu) Date: Sun, 9 Jan 2022 17:32:57 +0100 Subject: [petsc-users] Fluid-Structure interaction with multiple DMPlex In-Reply-To: References: Message-ID: Le dim. 9 janv. 2022 ? 17:24, Matthew Knepley a ?crit : > On Sun, Jan 9, 2022 at 11:16 AM Thibault Bridel-Bertomeu < > thibault.bridelbertomeu at gmail.com> wrote: > >> Le dim. 9 janv. 2022 ? 17:08, Matthew Knepley a >> ?crit : >> >>> On Sun, Jan 9, 2022 at 10:53 AM Thibault Bridel-Bertomeu < >>> thibault.bridelbertomeu at gmail.com> wrote: >>> >>>> Le dim. 9 janv. 2022 ? 15:38, Matthew Knepley a >>>> ?crit : >>>> >>>>> On Sun, Jan 9, 2022 at 7:49 AM Thibault Bridel-Bertomeu < >>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>> >>>>>> Le dim. 9 janv. 2022 ? 13:05, Matthew Knepley a >>>>>> ?crit : >>>>>> >>>>>>> On Sat, Jan 8, 2022 at 2:13 PM Thibault Bridel-Bertomeu < >>>>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>>>> >>>>>>>> However if you use IMEX for strong coupling of the two physics >>>>>>>> solved in each field, then it means you need to write a single set of PDEs >>>>>>>> that covers everything, don?t you ? >>>>>>>> If I want to solve Euler equations in one PetscDS and heat equation >>>>>>>> in the other one, then I need to write a global set of equations to use the >>>>>>>> IMEX TS , right ? >>>>>>>> >>>>>>> >>>>>>> The way I think about it. You would have explicit terms for Euler, >>>>>>> and they would also be confined to one part of the domain, but that just >>>>>>> impacts how you do the residual integral. You do assemble a combined >>>>>>> residual for all dogs, however, which I think is what you mean. >>>>>>> >>>>>> >>>>>> Hmm I'm not quite sure yet, probably because I haven't really started >>>>>> implementing it and I am not familiar with finite elements in PETSc. >>>>>> The way I see it is that a TS expects to be solving dU/dt = F, that's >>>>>> why I'm imagining that even with two domains with two different physics, >>>>>> one has to write the problem under the previous form. 
And when it comes to >>>>>> a FVM version of Euler + a FEM version of heat eqn, I'm not quite certain >>>>>> how to write it like that. >>>>>> Am I making any sense ? ?_o >>>>>> >>>>> >>>>> Oh, if you have boundary communication, then formulating it as one >>>>> system is difficult because different cells in supposedly the same DS would >>>>> have different unknowns, yes. IB solves this by defining the other >>>>> fields over the whole of each subdomain. Schwarz methods make two different >>>>> problems and then pass values with what I call an "auxiliary field". >>>>> You are right that you have to do something. >>>>> >>>> >>>> Let's imagine that we read in a Gmsh mesh into a DMPlex. >>>> That Gmsh mesh has two physical volumes so the DMPlex will a priori >>>> show two labels and therefore (?) two fields, meaning we can work with two >>>> SubDMs. >>>> Each SubDM basically matches a region of the whole mesh in this case. >>>> Now each SubDM can have its own DS and we can also attribute each DM to >>>> a TS. >>>> We can therefore solve the two problems, say one for fluid dynamics the >>>> other for heat eqn. >>>> >>>> The only thing I am not sure about (actually I haven't coded anything >>>> yet so I'm not sure of anything but ...) is the following. >>>> The two SubDMs come originally from the same DM right. Say we work in >>>> 3D, then the two SubDM must share a layer of triangles (and the segments >>>> and vertices that go along with them). That layer of triangles exist in >>>> both SubDM and is a boundary in both SubDM. >>>> How do I tell, for instance, the fluid SubDM that the information it >>>> needs on that layer of triangles comes from the other SubDM ? And vice >>>> versa ? Is it possible to create two SubDMs from the same DM that somehow >>>> still know each other after the creation ? >>>> Example 23 from SNES does not do that kind of thing right ? The "top" >>>> and "bottom" pieces are quite independent or am I misunderstanding sth ? >>>> >>> >>> The way I see it working is that you compose the maps from the original >>> space to each subDM on the boundary. Then, when you get one solution, you >>> can map it to points on the boundary of the other solution, which you use >>> as an auxiliary field. >>> >> >> Okay yes, I get what you mean. Is there a method in PETSc to do such >> things ? Does it have to do with IS ? >> > > Yes, we can do it. It is just IS manipulation. > Tagging that layer of triangles as a physical surface in Gmsh can help later because it will create another label in the DMPlex, right ? Then one could maybe rely on DMGetLabelIdIS ? Thank you Matt as usual for your precious help ! ;) Thibault > > Thanks, > > Matt > > >> Thanks ! >> Thibault >> >> >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> Thanks !! >>>> >>>> Thibault >>>> >>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> >>>>>> Thanks, >>>>>> Thibault >>>>>> >>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Matt >>>>>>> >>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Thibault >>>>>>>> >>>>>>>> Le sam. 8 janv. 2022 ? 20:00, Matthew Knepley >>>>>>>> a ?crit : >>>>>>>> >>>>>>>>> On Sat, Jan 8, 2022 at 1:30 PM Thibault Bridel-Bertomeu < >>>>>>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>>>>>> >>>>>>>>>> Yes I was wondering about different time steps as well because >>>>>>>>>> usually implicit integration moves much faster. >>>>>>>>>> But if it not implemented, then maybe going the ? weak coupling ? >>>>>>>>>> road with a sub-DM is the way. >>>>>>>>>> Can I ask how you proceed in the rocket engine code you are >>>>>>>>>> writing ? 
IMEX ? >>>>>>>>>> >>>>>>>>> >>>>>>>>> Right now it is IMEX, but we are explicitly substepping particles. >>>>>>>>> Not sure what the final thing will be. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> Matt >>>>>>>>> >>>>>>>>> >>>>>>>>>> Thibault >>>>>>>>>> >>>>>>>>>> Le sam. 8 janv. 2022 ? 19:22, Matthew Knepley >>>>>>>>>> a ?crit : >>>>>>>>>> >>>>>>>>>>> I do not know how. Right now, composable TS does not work all >>>>>>>>>>> the way. >>>>>>>>>>> >>>>>>>>>>> Matt >>>>>>>>>>> >>>>>>>>>>> On Sat, Jan 8, 2022 at 1:03 PM Mark Adams >>>>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>>> Can you subcycle with IMEX? >>>>>>>>>>>> >>>>>>>>>>>> On Sat, Jan 8, 2022 at 10:58 AM Matthew Knepley < >>>>>>>>>>>> knepley at gmail.com> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> On Sat, Jan 8, 2022 at 3:05 AM Thibault Bridel-Bertomeu < >>>>>>>>>>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> Le ven. 7 janv. 2022 ? 19:45, Thibault Bridel-Bertomeu < >>>>>>>>>>>>>> thibault.bridelbertomeu at gmail.com> a ?crit : >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Le ven. 7 janv. 2022 ? 19:23, Matthew Knepley < >>>>>>>>>>>>>>> knepley at gmail.com> a ?crit : >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On Fri, Jan 7, 2022 at 12:58 PM Thibault Bridel-Bertomeu < >>>>>>>>>>>>>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Le ven. 7 janv. 2022 ? 14:54, Matthew Knepley < >>>>>>>>>>>>>>>>> knepley at gmail.com> a ?crit : >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On Fri, Jan 7, 2022 at 8:52 AM Thibault Bridel-Bertomeu < >>>>>>>>>>>>>>>>>> thibault.bridelbertomeu at gmail.com> wrote: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Hi Matthew, >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Le ven. 7 janv. 2022 ? 14:44, Matthew Knepley < >>>>>>>>>>>>>>>>>>> knepley at gmail.com> a ?crit : >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> On Fri, Jan 7, 2022 at 5:46 AM Thibault Bridel-Bertomeu >>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Dear all, >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> First of, happy new year everyone !! All the best ! >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Happy New Year! >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> I am starting to draft a new project that will be >>>>>>>>>>>>>>>>>>>>> about fluid-structure interaction: in particular, the idea is to compute >>>>>>>>>>>>>>>>>>>>> the Navier-Stokes (or Euler nevermind) flow around an object and _at the >>>>>>>>>>>>>>>>>>>>> same time_ compute the heat equation inside the object. >>>>>>>>>>>>>>>>>>>>> So basically, I am thinking a mesh of the fluid and a >>>>>>>>>>>>>>>>>>>>> mesh of the object, both meshes being linked at the fluid - solid interface. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> First question: Are these meshes intended to match on >>>>>>>>>>>>>>>>>>>> the interface? If not, this sounds like overset grids or immersed >>>>>>>>>>>>>>>>>>>> boundary/interface methods. In this case, more than one mesh makes sense to >>>>>>>>>>>>>>>>>>>> me. If they are intended to match, then I would advocate a single mesh with >>>>>>>>>>>>>>>>>>>> multiple problems defined on it. I have experimented with this, for example >>>>>>>>>>>>>>>>>>>> see SNES ex23 where I have a field in only part of the domain. I have a >>>>>>>>>>>>>>>>>>>> large project to do exactly this in a rocket engine now. 
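A minimal sketch of the single-mesh setup described just above, assuming the Gmsh "Cell Sets" label carries values 1 (fluid) and 2 (solid); the label values, the discretization choices and every variable name below are assumptions rather than anything taken from this thread, and SNES ex23 remains the reference for the exact DMSetField/label semantics:

  PetscErrorCode ierr;
  DMLabel        cellSets, fluidLabel, solidLabel;
  IS             fluidIS, solidIS;
  PetscFV        fv;    /* finite-volume discretization for the fluid cells  */
  PetscFE        fe;    /* finite-element discretization for the solid cells */
  PetscInt       dim;

  /* dm is assumed to be the DMPlex read from the Gmsh file */
  ierr = DMGetDimension(dm, &dim);CHKERRQ(ierr);
  ierr = DMGetLabel(dm, "Cell Sets", &cellSets);CHKERRQ(ierr);

  /* turn each physical-volume stratum into its own label */
  ierr = DMLabelGetStratumIS(cellSets, 1, &fluidIS);CHKERRQ(ierr);
  ierr = DMLabelCreate(PETSC_COMM_SELF, "Fluid", &fluidLabel);CHKERRQ(ierr);
  ierr = DMLabelSetStratumIS(fluidLabel, 1, fluidIS);CHKERRQ(ierr);
  ierr = DMLabelGetStratumIS(cellSets, 2, &solidIS);CHKERRQ(ierr);
  ierr = DMLabelCreate(PETSC_COMM_SELF, "Solid", &solidLabel);CHKERRQ(ierr);
  ierr = DMLabelSetStratumIS(solidLabel, 1, solidIS);CHKERRQ(ierr);

  /* one discretization per region */
  ierr = PetscFVCreate(PETSC_COMM_WORLD, &fv);CHKERRQ(ierr);
  ierr = PetscFVSetSpatialDimension(fv, dim);CHKERRQ(ierr);
  ierr = PetscFVSetNumComponents(fv, dim + 2);CHKERRQ(ierr);  /* e.g. Euler: rho, rho*u, rho*E */
  ierr = PetscFECreateDefault(PETSC_COMM_WORLD, dim, 1, PETSC_TRUE, NULL, -1, &fe);CHKERRQ(ierr);  /* e.g. temperature */

  /* each field is supported only where its label is defined */
  ierr = DMSetField(dm, 0, fluidLabel, (PetscObject)fv);CHKERRQ(ierr);
  ierr = DMSetField(dm, 1, solidLabel, (PetscObject)fe);CHKERRQ(ierr);
  ierr = DMCreateDS(dm);CHKERRQ(ierr);  /* one PetscDS per set of overlapping fields */

The IS and DMLabel objects should be destroyed once the DS has been created; that cleanup is omitted here for brevity.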
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Yes the way I see it is more of a single mesh with two >>>>>>>>>>>>>>>>>>> distinct regions to distinguish between the fluid and the solid. I was >>>>>>>>>>>>>>>>>>> talking about two meshes to try and explain my vision but it seems like it >>>>>>>>>>>>>>>>>>> was unclear. >>>>>>>>>>>>>>>>>>> Imagine if you wish a rectangular box with a sphere >>>>>>>>>>>>>>>>>>> inclusion: the sphere would be tagged as a solid and the rest of the domain >>>>>>>>>>>>>>>>>>> as fluid. Using Gmsh volumes for instance. >>>>>>>>>>>>>>>>>>> Ill check out the SNES example ! Thanks ! >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> First (Matthew maybe ?) do you think it is something >>>>>>>>>>>>>>>>>>>>> that could be done using two DMPlex's that would somehow be spawned from >>>>>>>>>>>>>>>>>>>>> reading a Gmsh mesh with two volumes ? >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> You can take a mesh and filter out part of it with >>>>>>>>>>>>>>>>>>>> DMPlexFilter(). That is not used much so I may have to fix it to do what >>>>>>>>>>>>>>>>>>>> you want, but that should be easy. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> And on one DMPlex we would have finite volume for the >>>>>>>>>>>>>>>>>>>>> fluid, on the other finite elements for the heat eqn ? >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> I have done this exact thing on a single mesh. It >>>>>>>>>>>>>>>>>>>> should be no harder on two meshes if you go that route. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Second, is it something that anyone in the community >>>>>>>>>>>>>>>>>>>>> has ever imagined doing with PETSc DMPlex's ? >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Yes, I had a combined FV+FEM simulation of magma >>>>>>>>>>>>>>>>>>>> dynamics (I should make it an example), and currently we are doing FVM+FEM >>>>>>>>>>>>>>>>>>>> for simulation of a rocket engine. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Wow so it seems like it?s the exact same thing I would >>>>>>>>>>>>>>>>>>> like to achieve as the rocket engine example. >>>>>>>>>>>>>>>>>>> So you have a single mesh and two regions tagged >>>>>>>>>>>>>>>>>>> differently, and you use the DmPlexFilter to solve FVM and FEM separately ? >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> With a single mesh, you do not even need DMPlexFilter. >>>>>>>>>>>>>>>>>> You just use the labels that Gmsh gives you. I think we should be able to >>>>>>>>>>>>>>>>>> get it going in a straightforward way. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Ok then ! Thanks ! I?ll give it a shot and see what >>>>>>>>>>>>>>>>> happens ! >>>>>>>>>>>>>>>>> Setting up the FVM and FEM discretizations will pass by >>>>>>>>>>>>>>>>> DMSetField right ? With a single mesh tagged with two different regions, it >>>>>>>>>>>>>>>>> should show up as two fields, is that correct ? >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Yes, the idea is as follows. Each field also has a label >>>>>>>>>>>>>>>> argument that is the support of the field in the domain. Then we create >>>>>>>>>>>>>>>> PetscDS objects for each >>>>>>>>>>>>>>>> separate set of overlapping fields. The current algorithm >>>>>>>>>>>>>>>> is not complete I think, so let me know if this step fails. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Ok, thanks. >>>>>>>>>>>>>>> I?ll let you know and share snippets when I have something >>>>>>>>>>>>>>> started ! 
>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Talk soon ! Thanks ! >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Hi Matthew, >>>>>>>>>>>>>> >>>>>>>>>>>>>> I thought about a little something else : what about setting >>>>>>>>>>>>>> two different TS, one for each field of the DM ? Most probably the fluid >>>>>>>>>>>>>> part would be solved with an explicit time stepping whereas the solid part >>>>>>>>>>>>>> with the heat equation would benefit from implicit time stepping. TSSetDM >>>>>>>>>>>>>> does not allow a field specification, is there a way to hack that so that >>>>>>>>>>>>>> each field has its own TS ? >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> I see at least two options here: >>>>>>>>>>>>> >>>>>>>>>>>>> 1. Split the problems: >>>>>>>>>>>>> >>>>>>>>>>>>> You can use DMCreateSubDM() to split off part of a problem >>>>>>>>>>>>> and use a solver on that. I have done this for problems with weak coupling. >>>>>>>>>>>>> >>>>>>>>>>>>> 2. Use IMEX >>>>>>>>>>>>> >>>>>>>>>>>>> For strong coupling, I have used the IMEX TSes in PETSc. >>>>>>>>>>>>> You put the explicit terms in the RHS, and the implicit in the IFunction. >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> >>>>>>>>>>>>> Matt >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thibault >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thibault >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Matt >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thibault >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Matt >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Thanks ! >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Thibault >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Matt >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> As I said it is very prospective, I just wanted to >>>>>>>>>>>>>>>>>>>>> have your opinion !! >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Thanks very much in advance everyone !! >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Cheers, >>>>>>>>>>>>>>>>>>>>> Thibault >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>>>>>> What most experimenters take for granted before they >>>>>>>>>>>>>>>>>>>> begin their experiments is infinitely more interesting than any results to >>>>>>>>>>>>>>>>>>>> which their experiments lead. >>>>>>>>>>>>>>>>>>>> -- Norbert Wiener >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>>>>> Thibault Bridel-Bertomeu >>>>>>>>>>>>>>>>>>> ? >>>>>>>>>>>>>>>>>>> Eng, MSc, PhD >>>>>>>>>>>>>>>>>>> Research Engineer >>>>>>>>>>>>>>>>>>> CEA/CESTA >>>>>>>>>>>>>>>>>>> 33114 LE BARP >>>>>>>>>>>>>>>>>>> Tel.: (+33)557046924 >>>>>>>>>>>>>>>>>>> Mob.: (+33)611025322 >>>>>>>>>>>>>>>>>>> Mail: thibault.bridelbertomeu at gmail.com >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>>>> What most experimenters take for granted before they >>>>>>>>>>>>>>>>>> begin their experiments is infinitely more interesting than any results to >>>>>>>>>>>>>>>>>> which their experiments lead. 
>>>>>>> -- Norbert Wiener >>>>>>> >>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>> >>>>>>> >>>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ >>>>> >>>>> >>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Sun Jan 9 16:04:02 2022 From: jed at jedbrown.org (Jed Brown) Date: Sun, 09 Jan 2022 15:04:02 -0700 Subject: [petsc-users] Fluid-Structure interaction with multiple DMPlex In-Reply-To: References: Message-ID: <87wnj8tvl9.fsf@jedbrown.org> Thibault Bridel-Bertomeu writes: > However if you use IMEX for strong coupling of the two physics solved in > each field, then it means you need to write a single set of PDEs that > covers everything, don?t you ? > If I want to solve Euler equations in one PetscDS and heat equation in the > other one, then I need to write a global set of equations to use the IMEX > TS , right ? Yes. You can use multirate integrators, which are like subcycling while controlling splitting error. Most subcycling approaches will limit your global convergence to first order in time. First order with a very small coefficient ("weak coupling") might be okay, but you do need to quantify it for each physical regime and resolution. From danyang.su at gmail.com Mon Jan 10 13:33:45 2022 From: danyang.su at gmail.com (Danyang Su) Date: Mon, 10 Jan 2022 11:33:45 -0800 Subject: [petsc-users] Is old ex10.c (separated matrix and rhs) deprecated? Message-ID: Hi All, Back to PETSc-3.8 version, the example ex10.c supports reading matrix and vector from separated files. Is this feature deprecated in the new PETSc version? I have some matrix and rhs to test but could not use ex10 example under new PETSc version. Thanks, Danyang From bsmith at petsc.dev Mon Jan 10 13:52:06 2022 From: bsmith at petsc.dev (Barry Smith) Date: Mon, 10 Jan 2022 14:52:06 -0500 Subject: [petsc-users] Is old ex10.c (separated matrix and rhs) deprecated? In-Reply-To: References: Message-ID: <6A17173E-59F1-47BF-B813-D4C27C43A107@petsc.dev> You can put them in different files or in the same file. If you put them in the same file you need to know which object comes first the matrix or the vector, this tells you if you should call MatLoad first or VecLoad. Barry > On Jan 10, 2022, at 2:33 PM, Danyang Su wrote: > > Hi All, > > Back to PETSc-3.8 version, the example ex10.c supports reading matrix and vector from separated files. Is this feature deprecated in the new PETSc version? I have some matrix and rhs to test but could not use ex10 example under new PETSc version. 
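A minimal sketch of the load order described in the reply above, assuming both objects were written to one PETSc binary file with the matrix first; the file name and variable names are placeholders:

  PetscErrorCode ierr;
  PetscViewer    viewer;
  Mat            A;
  Vec            b;

  ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD, "system.bin", FILE_MODE_READ, &viewer);CHKERRQ(ierr);
  ierr = MatCreate(PETSC_COMM_WORLD, &A);CHKERRQ(ierr);
  ierr = MatLoad(A, viewer);CHKERRQ(ierr);   /* the matrix was written first, so it is read first */
  ierr = VecCreate(PETSC_COMM_WORLD, &b);CHKERRQ(ierr);
  ierr = VecLoad(b, viewer);CHKERRQ(ierr);   /* then the right-hand side */
  ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);

With separate files the calls are the same, but each load uses its own viewer opened on its own file.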
> > Thanks, > > Danyang > From lzou at anl.gov Mon Jan 10 15:35:31 2022 From: lzou at anl.gov (Zou, Ling) Date: Mon, 10 Jan 2022 21:35:31 +0000 Subject: [petsc-users] A problem with MatFDColoringSetFunction Message-ID: Hi All, I would appreciate if you could give some advice for setting the ?FormFunction? for the MatFDColoringSetFunction function call. I follow what is shown in the example: https://petsc.org/release/src/snes/tutorials/ex14.c.html I setup similar code structure like: PetscErrorCode FormFunction(SNES,Vec,Vec,void*); Then use it as: MatFDColoringSetFunction(fdcoloring, (PetscErrorCode(*)(void))SNESFormFunction, this); This works fine on my local MacOS. However, when commit it to a remote repo, the compiler there gives me the following error (where warnings are treated as errors): src/base/PETScProblemInterface.C:100:67: error: cast between incompatible function types from 'PetscErrorCode (*)(SNES, Vec, Vec, void*)' {aka 'int (*)(_p_SNES*, _p_Vec*, _p_Vec*, void*)'} to 'PetscErrorCode (*)()' {aka 'int (*)()'} [-Werror=cast-function-type] Any help is appreciated. -Ling -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Mon Jan 10 17:13:23 2022 From: bsmith at petsc.dev (Barry Smith) Date: Mon, 10 Jan 2022 18:13:23 -0500 Subject: [petsc-users] A problem with MatFDColoringSetFunction In-Reply-To: References: Message-ID: <1E74890F-2DCE-4B7C-84DB-35D53B0B6759@petsc.dev> This is annoying. You need to pass in a compiler flag to turn off the error conditioner for casting a function pointer. It may be -Wnoerror=cast-function-type just google. Barry > On Jan 10, 2022, at 4:35 PM, Zou, Ling via petsc-users wrote: > > Hi All, > > I would appreciate if you could give some advice for setting the ?FormFunction? for the MatFDColoringSetFunction function call. > I follow what is shown in the example: > https://petsc.org/release/src/snes/tutorials/ex14.c.html > > I setup similar code structure like: > PetscErrorCode FormFunction(SNES,Vec,Vec,void*); > Then use it as: > MatFDColoringSetFunction(fdcoloring, (PetscErrorCode(*)(void))SNESFormFunction, this); > > This works fine on my local MacOS. However, when commit it to a remote repo, the compiler there gives me the following error (where warnings are treated as errors): > > src/base/PETScProblemInterface.C:100:67: error: > cast between incompatible function types from > 'PetscErrorCode (*)(SNES, Vec, Vec, void*)' {aka 'int (*)(_p_SNES*, _p_Vec*, _p_Vec*, void*)'} > to > 'PetscErrorCode (*)()' {aka 'int (*)()'} [-Werror=cast-function-type] > > Any help is appreciated. > > -Ling -------------- next part -------------- An HTML attachment was scrubbed... URL: From lzou at anl.gov Mon Jan 10 21:23:33 2022 From: lzou at anl.gov (Zou, Ling) Date: Tue, 11 Jan 2022 03:23:33 +0000 Subject: [petsc-users] A problem with MatFDColoringSetFunction In-Reply-To: <1E74890F-2DCE-4B7C-84DB-35D53B0B6759@petsc.dev> References: <1E74890F-2DCE-4B7C-84DB-35D53B0B6759@petsc.dev> Message-ID: Thank you Barry. -Ling From: Barry Smith Date: Monday, January 10, 2022 at 5:13 PM To: Zou, Ling Cc: PETSc Subject: Re: [petsc-users] A problem with MatFDColoringSetFunction This is annoying. You need to pass in a compiler flag to turn off the error conditioner for casting a function pointer. It may be -Wnoerror=cast-function-type just google. Barry On Jan 10, 2022, at 4:35 PM, Zou, Ling via petsc-users > wrote: Hi All, I would appreciate if you could give some advice for setting the ?FormFunction? 
for the MatFDColoringSetFunction function call. I follow what is shown in the example: https://petsc.org/release/src/snes/tutorials/ex14.c.html I setup similar code structure like: PetscErrorCode FormFunction(SNES,Vec,Vec,void*); Then use it as: MatFDColoringSetFunction(fdcoloring, (PetscErrorCode(*)(void))SNESFormFunction, this); This works fine on my local MacOS. However, when commit it to a remote repo, the compiler there gives me the following error (where warnings are treated as errors): src/base/PETScProblemInterface.C:100:67: error: cast between incompatible function types from 'PetscErrorCode (*)(SNES, Vec, Vec, void*)' {aka 'int (*)(_p_SNES*, _p_Vec*, _p_Vec*, void*)'} to 'PetscErrorCode (*)()' {aka 'int (*)()'} [-Werror=cast-function-type] Any help is appreciated. -Ling -------------- next part -------------- An HTML attachment was scrubbed... URL: From thibault.bridelbertomeu at gmail.com Tue Jan 11 00:56:33 2022 From: thibault.bridelbertomeu at gmail.com (Thibault Bridel-Bertomeu) Date: Tue, 11 Jan 2022 07:56:33 +0100 Subject: [petsc-users] Fluid-Structure interaction with multiple DMPlex In-Reply-To: <87wnj8tvl9.fsf@jedbrown.org> References: <87wnj8tvl9.fsf@jedbrown.org> Message-ID: Hello everybody, So, let's say i have the mesh attached to this email that has 2 physical surfaces and 5 physical curves. This gives me 2 strata in the "Cell Sets" and 5 strata in the "Face Sets". Would something like the following piece of code be the right way to "extract" and manipulate each stratum of the "Cell Sets" to assign them a DS, a TS etc...? DMLabel surfacesLabel; ierr = DMGetLabel(dm, "Cell Sets", &surfacesLabel);CHKERRQ(ierr); IS fluidIS; ierr = DMLabelGetStratumIS(surfacesLabel, 2, &fluidIS);CHKERRQ(ierr); DMLabel fluidLabel; ierr = DMLabelCreate(PETSC_COMM_WORLD, "Fluid", &fluidLabel);CHKERRQ(ierr); ierr = DMLabelSetStratumIS(fluidLabel, 1, fluidIS);CHKERRQ(ierr); Once I have the Fluid label linked to the fluidIS (same for the solid), should I call DMPlexLabelComplete on both the labels before proceeding and calling the DMCreateSubDM with their IS ? Thanks, Thibault Le dim. 9 janv. 2022 ? 23:04, Jed Brown a ?crit : > Thibault Bridel-Bertomeu writes: > > > However if you use IMEX for strong coupling of the two physics solved in > > each field, then it means you need to write a single set of PDEs that > > covers everything, don?t you ? > > If I want to solve Euler equations in one PetscDS and heat equation in > the > > other one, then I need to write a global set of equations to use the IMEX > > TS , right ? > > Yes. > > You can use multirate integrators, which are like subcycling while > controlling splitting error. Most subcycling approaches will limit your > global convergence to first order in time. First order with a very small > coefficient ("weak coupling") might be okay, but you do need to quantify it > for each physical regime and resolution. > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: divided_square.geo Type: application/vnd.dynageo Size: 1020 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: divided_square_gmsh41ascii.msh Type: model/mesh Size: 111184 bytes Desc: not available URL: From jed at jedbrown.org Tue Jan 11 09:36:53 2022 From: jed at jedbrown.org (Jed Brown) Date: Tue, 11 Jan 2022 08:36:53 -0700 Subject: [petsc-users] Fluid-Structure interaction with multiple DMPlex In-Reply-To: References: <87wnj8tvl9.fsf@jedbrown.org> Message-ID: <87zgo2s2qy.fsf@jedbrown.org> Thibault Bridel-Bertomeu writes: > Hello everybody, > > So, let's say i have the mesh attached to this email that has 2 physical > surfaces and 5 physical curves. This gives me 2 strata in the "Cell Sets" > and 5 strata in the "Face Sets". > Would something like the following piece of code be the right way to > "extract" and manipulate each stratum of the "Cell Sets" to assign them a > DS, a TS etc...? > > DMLabel surfacesLabel; ierr = DMGetLabel(dm, "Cell Sets", > &surfacesLabel);CHKERRQ(ierr); IS fluidIS; ierr = > DMLabelGetStratumIS(surfacesLabel, 2, &fluidIS);CHKERRQ(ierr); > DMLabel fluidLabel; ierr = DMLabelCreate(PETSC_COMM_WORLD, "Fluid", > &fluidLabel);CHKERRQ(ierr); ierr = DMLabelSetStratumIS(fluidLabel, > 1, fluidIS);CHKERRQ(ierr); > > Once I have the Fluid label linked to the fluidIS (same for the > solid), should I call DMPlexLabelComplete on both the labels before > proceeding and calling the DMCreateSubDM with their IS ? How do you want to implement the function space and interface condition? As single-valued temperature seen from both sides? With a discontinuous space and a surface integral? Euler equations are commonly solved in conservative variables, thus you don't have an option of a continuous temperature space. From nicolas.barnafi at unimi.it Tue Jan 11 10:49:32 2022 From: nicolas.barnafi at unimi.it (Nicolas Alejandro Barnafi) Date: Tue, 11 Jan 2022 17:49:32 +0100 Subject: [petsc-users] On PCSetCoordinates with Fieldsplit Message-ID: <7e73a7ab1c10b8.61ddc32c@unimi.it> Dear community,? I am working on a block preconditioner, where one of the blocks uses HYPRE's AMS. As it requires the coordinates of the dofs, I have done so to the PC object. I expected the coordinates to be inherited down to the subblocks, is this not the case? (it seems so as I couldn't find a specialized FIELDSPLIT SetCoordinates function). If this feature is missing, please give me some hints on where to add the missing function, I would gladly do it. If not, please let me know why it was dismissed, in order to do things the hard way [as in hard-coded ;)]. Kind regards, Nicolas -------------- next part -------------- An HTML attachment was scrubbed... URL: From nabw91 at gmail.com Tue Jan 11 10:58:20 2022 From: nabw91 at gmail.com (=?UTF-8?Q?Nicol=C3=A1s_Barnafi?=) Date: Tue, 11 Jan 2022 17:58:20 +0100 Subject: [petsc-users] PCSetCoordinates does not set coordinates of sub PC (fieldsplit) objects Message-ID: Dear community, I am working on a block preconditioner, where one of the blocks uses HYPRE's AMS. As it requires the coordinates of the dofs, I have done so to the PC object. I expected the coordinates to be inherited down to the subblocks, is this not the case? (it seems so as I couldn't find a specialized FIELDSPLIT SetCoordinates function). If this feature is missing, please give me some hints on where to add the missing function, I would gladly do it. If not, please let me know why it was dismissed, in order to do things the hard way [as in hard-coded ;)]. Kind regards, Nicolas -------------- next part -------------- An HTML attachment was scrubbed... 
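[Editor's note] Until PCFIELDSPLIT itself learns to forward coordinates (see Barry's reply further down in this digest), one user-side workaround is to reach into the split after setup and hand the coordinates to the inner PC directly. This is a rough sketch only, not the thread's answer: the split index 1, and nloc/coords (the number and interleaved coordinates of the dofs owned by that block on this process), are placeholders for the user's own data.

  KSP            *subksp;
  PC             innerpc;
  PetscInt       nsplits;
  PetscErrorCode ierr;

  ierr = PCSetUp(pc);CHKERRQ(ierr);                    /* the sub-KSPs exist only after setup */
  ierr = PCFieldSplitGetSubKSP(pc, &nsplits, &subksp);CHKERRQ(ierr);
  ierr = KSPGetPC(subksp[1], &innerpc);CHKERRQ(ierr);  /* the block that runs AMS */
  ierr = PCSetCoordinates(innerpc, dim, nloc, coords);CHKERRQ(ierr);
  ierr = PetscFree(subksp);CHKERRQ(ierr);              /* the returned array is owned by the caller */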
URL: From patrick.sanan at gmail.com Tue Jan 11 11:08:43 2022 From: patrick.sanan at gmail.com (Patrick Sanan) Date: Tue, 11 Jan 2022 18:08:43 +0100 Subject: [petsc-users] Finite difference approximation of Jacobian In-Reply-To: References: <231abd15aab544f9850826cb437366f7@lanl.gov> Message-ID: Working on doing this incrementally, in progress here: https://gitlab.com/petsc/petsc/-/merge_requests/4712 This works in 1D for AIJ matrices, assembling a matrix with a maximal number of zero entries as dictated by the stencil width (which is intended to be very very close to what DMDA would do if you associated all the unknowns with a particular grid point, which is the way DMStag largely works under the hood). Dave, before I get into it, am I correct in my understanding that MATPREALLOCATOR would be better here because you would avoid superfluous zeros in the sparsity pattern, because this routine wouldn't have to assemble the Mat returned by DMCreateMatrix()? If this seems like a sane way to go, I will continue to add some more tests (in particular periodic BCs not tested yet) and add the code for 2D and 3D. Am Mo., 13. Dez. 2021 um 20:17 Uhr schrieb Dave May : > > > On Mon, 13 Dec 2021 at 20:13, Matthew Knepley wrote: > >> On Mon, Dec 13, 2021 at 1:52 PM Dave May wrote: >> >>> On Mon, 13 Dec 2021 at 19:29, Matthew Knepley wrote: >>> >>>> On Mon, Dec 13, 2021 at 1:16 PM Dave May >>>> wrote: >>>> >>>>> >>>>> >>>>> On Sat 11. Dec 2021 at 22:28, Matthew Knepley >>>>> wrote: >>>>> >>>>>> On Sat, Dec 11, 2021 at 1:58 PM Tang, Qi wrote: >>>>>> >>>>>>> Hi, >>>>>>> Does anyone have comment on finite difference coloring with DMStag? >>>>>>> We are using DMStag and TS to evolve some nonlinear equations implicitly. >>>>>>> It would be helpful to have the coloring Jacobian option with that. >>>>>>> >>>>>> >>>>>> Since DMStag produces the Jacobian connectivity, >>>>>> >>>>> >>>>> This is incorrect. >>>>> The DMCreateMatrix implementation for DMSTAG only sets the number of >>>>> nonzeros (very inaccurately). It does not insert any zero values and thus >>>>> the nonzero structure is actually not defined. >>>>> That is why coloring doesn?t work. >>>>> >>>> >>>> Ah, thanks Dave. >>>> >>>> Okay, we should fix that.It is perfectly possible to compute the >>>> nonzero pattern from the DMStag information. >>>> >>> >>> Agreed. The API for DMSTAG is complete enough to enable one to >>> loop over the cells, and for all quantities defined on the cell (centre, >>> face, vertex), >>> insert values into the appropriate slot in the matrix. >>> Combined with MATPREALLOCATOR, I believe a compact and readable >>> code should be possible to write for the preallocation (cf DMDA). >>> >>> I think the only caveat with the approach of using all quantities >>> defined on the cell is >>> It may slightly over allocate depending on how the user wishes to impose >>> the boundary condition, >>> or slightly over allocate for says Stokes where there is no >>> pressure-pressure coupling term. >>> >> >> Yes, and would not handle higher order stencils.I think the >> overallocating is livable for the first imeplementation. >> >> > Sure, but neither does DMDA. > > The user always has to know what they are doing and set the stencil width > accordingly. 
> I actually had this point listed in my initial email (and the stencil > growth issue when using FD for nonlinear problems), > however I deleted it as all the same issue exist in DMDA and no one > complains (at least not loudly) :D > > > > > >> Thanks, >> >> Matt >> >> >>> Thanks, >>> Dave >>> >>> >>>> Paging Patrick :) >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> Thanks, >>>>> Dave >>>>> >>>>> >>>>> you can use -snes_fd_color_use_mat. It has many options. Here is an >>>>>> example of us using that: >>>>>> >>>>>> >>>>>> https://gitlab.com/petsc/petsc/-/blob/main/src/snes/tutorials/ex19.c#L898 >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Matt >>>>>> >>>>>> >>>>>>> Thanks, >>>>>>> Qi >>>>>>> >>>>>>> >>>>>>> On Oct 15, 2021, at 3:07 PM, Jorti, Zakariae via petsc-users < >>>>>>> petsc-users at mcs.anl.gov> wrote: >>>>>>> >>>>>>> Hello, >>>>>>> >>>>>>> Does the Jacobian approximation using coloring and finite >>>>>>> differencing of the function evaluation work in DMStag? >>>>>>> Thank you. >>>>>>> Best regards, >>>>>>> >>>>>>> Zakariae >>>>>>> >>>>>>> >>>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their >>>>>> experiments is infinitely more interesting than any results to which their >>>>>> experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>> >>>>>> >>>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From lzou at anl.gov Tue Jan 11 11:11:58 2022 From: lzou at anl.gov (Zou, Ling) Date: Tue, 11 Jan 2022 17:11:58 +0000 Subject: [petsc-users] A problem with MatFDColoringSetFunction In-Reply-To: References: <1E74890F-2DCE-4B7C-84DB-35D53B0B6759@petsc.dev> Message-ID: A follow up: The compiler option '-Wno-bad-function-cast' did not work for c++. See message below: cc1plus: error: command line option '-Wno-bad-function-cast' is valid for C/ObjC but not for C++ [-Werror] Credit to David Andrs of INL, here is an alternative solution: MatFDColoringSetFunction(fdcoloring, (PetscErrorCode(*)(void))(void(*)(void))SNESFormFunction, this); Thanks, -Ling From: petsc-users on behalf of Zou, Ling via petsc-users Date: Monday, January 10, 2022 at 9:23 PM To: Barry Smith Cc: PETSc Subject: Re: [petsc-users] A problem with MatFDColoringSetFunction Thank you Barry. -Ling From: Barry Smith Date: Monday, January 10, 2022 at 5:13 PM To: Zou, Ling Cc: PETSc Subject: Re: [petsc-users] A problem with MatFDColoringSetFunction This is annoying. You need to pass in a compiler flag to turn off the error conditioner for casting a function pointer. It may be -Wnoerror=cast-function-type just google. Barry On Jan 10, 2022, at 4:35 PM, Zou, Ling via petsc-users > wrote: Hi All, I would appreciate if you could give some advice for setting the ?FormFunction? for the MatFDColoringSetFunction function call. 
I follow what is shown in the example: https://petsc.org/release/src/snes/tutorials/ex14.c.html I setup similar code structure like: PetscErrorCode FormFunction(SNES,Vec,Vec,void*); Then use it as: MatFDColoringSetFunction(fdcoloring, (PetscErrorCode(*)(void))SNESFormFunction, this); This works fine on my local MacOS. However, when commit it to a remote repo, the compiler there gives me the following error (where warnings are treated as errors): src/base/PETScProblemInterface.C:100:67: error: cast between incompatible function types from 'PetscErrorCode (*)(SNES, Vec, Vec, void*)' {aka 'int (*)(_p_SNES*, _p_Vec*, _p_Vec*, void*)'} to 'PetscErrorCode (*)()' {aka 'int (*)()'} [-Werror=cast-function-type] Any help is appreciated. -Ling -------------- next part -------------- An HTML attachment was scrubbed... URL: From thibault.bridelbertomeu at gmail.com Tue Jan 11 13:59:02 2022 From: thibault.bridelbertomeu at gmail.com (Thibault Bridel-Bertomeu) Date: Tue, 11 Jan 2022 20:59:02 +0100 Subject: [petsc-users] Fluid-Structure interaction with multiple DMPlex In-Reply-To: <87zgo2s2qy.fsf@jedbrown.org> References: <87wnj8tvl9.fsf@jedbrown.org> <87zgo2s2qy.fsf@jedbrown.org> Message-ID: Le mar. 11 janv. 2022 ? 16:36, Jed Brown a ?crit : > Thibault Bridel-Bertomeu writes: > > > Hello everybody, > > > > So, let's say i have the mesh attached to this email that has 2 physical > > surfaces and 5 physical curves. This gives me 2 strata in the "Cell Sets" > > and 5 strata in the "Face Sets". > > Would something like the following piece of code be the right way to > > "extract" and manipulate each stratum of the "Cell Sets" to assign them a > > DS, a TS etc...? > > > > DMLabel surfacesLabel; ierr = DMGetLabel(dm, "Cell Sets", > > &surfacesLabel);CHKERRQ(ierr); IS fluidIS; ierr = > > DMLabelGetStratumIS(surfacesLabel, 2, &fluidIS);CHKERRQ(ierr); > > DMLabel fluidLabel; ierr = DMLabelCreate(PETSC_COMM_WORLD, "Fluid", > > &fluidLabel);CHKERRQ(ierr); ierr = DMLabelSetStratumIS(fluidLabel, > > 1, fluidIS);CHKERRQ(ierr); > > > > Once I have the Fluid label linked to the fluidIS (same for the > > solid), should I call DMPlexLabelComplete on both the labels before > > proceeding and calling the DMCreateSubDM with their IS ? > > How do you want to implement the function space and interface condition? > As single-valued temperature seen from both sides? With a discontinuous > space and a surface integral? Euler equations are commonly solved in > conservative variables, thus you don't have an option of a continuous > temperature space. > A priori, the job of the fluid is to provide a certain heat flux to the solid that will subsequently warm up. So I am not expecting that both fluid and solid will have the same temperature at the interface, it will indeed be discontinuous. I do not get why you mention a surface integral though ? Anyways, for now, I do not really know how to handle this boundary condition problem ... it is one of the _big_ items of my todo list and I think I'll need your help. 
For now, I am trying to figure out how to handle both discretizations, and I am going the following way : // Set up the discrete systems for both domains //+ Fluid is finite volumes PetscFV fluidFV; ierr = PetscFVCreate(PetscObjectComm((PetscObject) dm), &fluidFV);CHKERRQ (ierr); user->dof = 4; ierr = PetscFVSetNumComponents(fluidFV, user->dof);CHKERRQ(ierr); PetscInt dim; ierr = DMGetDimension(dm, &dim);CHKERRQ(ierr); ierr = PetscFVSetSpatialDimension(fluidFV, dim);CHKERRQ(ierr); ierr = PetscObjectSetName((PetscObject) fluidFV, "EulerEquation");CHKERRQ (ierr); ierr = PetscFVSetType(fluidFV, PETSCFVLEASTSQUARES);CHKERRQ(ierr); ierr = PetscFVSetComputeGradients(fluidFV, PETSC_FALSE);CHKERRQ(ierr); PetscInt fluidFieldNum = 0; ierr = DMSetField(dm, fluidFieldNum, fluidLabel, (PetscObject) fluidFV); CHKERRQ(ierr); ierr = DMViewFromOptions(dm, NULL, "-dm_fv_view");CHKERRQ(ierr); //+ Solid is finite elements PetscBool simplex; ierr = DMPlexIsSimplex(dm, &simplex);CHKERRQ(ierr); PetscFE solidFE; ierr = PetscFECreateDefault(PetscObjectComm((PetscObject) dm), dim, 1, simplex, "heateqn_", -1, &solidFE);CHKERRQ(ierr); ierr = PetscObjectSetName((PetscObject) solidFE, "HeatEquation");CHKERRQ (ierr); PetscInt solidFieldNum = 1; ierr = DMSetField(dm, solidFieldNum, solidLabel, (PetscObject) solidFE); CHKERRQ(ierr); ierr = PetscFEDestroy(&solidFE);CHKERRQ(ierr); ierr = DMViewFromOptions(dm, NULL, "-dm_fe_view");CHKERRQ(ierr); // //+ Sub-DM for the fluid domain // DM fluidDM; // IS subFluidIS; // ierr = DMCreateSubDM(dm, 1, &fluidFieldNum, &subFluidIS, &fluidDM);CHKERRQ(ierr); // ierr = DMViewFromOptions(fluidDM, NULL, "-fluid_dm_view");CHKERRQ(ierr); // //+ Sub-DM for the solid domain // DM solidDM; // IS subSolidIS; // ierr = DMCreateSubDM(dm, 1, &solidFieldNum, &subSolidIS, &solidDM);CHKERRQ(ierr); // ierr = DMViewFromOptions(solidDM, NULL, "-solid_dm_view");CHKERRQ(ierr); //+ Create the DS covering both domains ierr = DMCreateDS(dm);CHKERRQ(ierr); ierr = DMViewFromOptions(dm, NULL, "-dm_ds_view");CHKERRQ(ierr); //+ Sub-DS for the fluid domain PetscDS fluidDS; ierr = DMGetRegionNumDS(dm, fluidFieldNum, &fluidLabel, &fluidIS, &fluidDS); CHKERRQ(ierr); ierr = PetscDSViewFromOptions(fluidDS, NULL, "-fluid_ds_view");CHKERRQ (ierr); //+ Sub-DS for the solid domain PetscDS solidDS; ierr = DMGetRegionNumDS(dm, solidFieldNum, &solidLabel, &solidIS, &solidDS); CHKERRQ(ierr); ierr = PetscDSViewFromOptions(solidDS, NULL, "-solid_ds_view");CHKERRQ (ierr); I am not sure where I am going with this for now, I am really experimenting ... I tried generating two sub-DMs because it seemed to me that, that way, I could have two TS and basically work my two problems independently and figure out the interface later. But! When I printed them (with fluid_dm_view and solid_dm_view), it seemed that something was wrong : the fluidDM showed a FEM field and both sub DMs appeared to know the whole mesh and all the strata of all the fields. The next step is to figure out the FEM for the heat equation -- I have the details for the FVM in another code, it should be pretty straightforward. The following step will be to plug in a timestepper, or two timesteppers. Then the interface and the connection between the two problems. Thanks for your help and all your input, as always !!!! Cheers, Thibault -------------- next part -------------- An HTML attachment was scrubbed... 
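[Editor's note] One small diagnostic that might help with the observation above that both sub-DMs appear to know the whole mesh: after DMCreateDS(), list every region DS the DM actually holds and how many fields each one carries. This is only a sketch (not from the thread), using the same 5-argument DMGetRegionNumDS() call as the code above:

  PetscInt nDS, s;
  ierr = DMGetNumDS(dm, &nDS);CHKERRQ(ierr);
  for (s = 0; s < nDS; ++s) {
    DMLabel  label;
    IS       fieldIS;
    PetscDS  ds;
    PetscInt Nf;
    ierr = DMGetRegionNumDS(dm, s, &label, &fieldIS, &ds);CHKERRQ(ierr);
    ierr = PetscDSGetNumFields(ds, &Nf);CHKERRQ(ierr);
    ierr = PetscPrintf(PETSC_COMM_WORLD, "DS %D carries %D field(s)\n", s, Nf);CHKERRQ(ierr);
    ierr = PetscDSViewFromOptions(ds, NULL, "-region_ds_view");CHKERRQ(ierr);
  }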
URL: From bsmith at petsc.dev Tue Jan 11 14:09:11 2022 From: bsmith at petsc.dev (Barry Smith) Date: Tue, 11 Jan 2022 15:09:11 -0500 Subject: [petsc-users] A problem with MatFDColoringSetFunction In-Reply-To: References: <1E74890F-2DCE-4B7C-84DB-35D53B0B6759@petsc.dev> Message-ID: <4DA70C43-8EB4-4B3B-99E7-265191B9DC36@petsc.dev> Thanks for the alternative. @jed does this mean we need to do our casting of functions like this everywhere? > On Jan 11, 2022, at 12:11 PM, Zou, Ling wrote: > > A follow up: > > The compiler option '-Wno-bad-function-cast' did not work for c++. See message below: > cc1plus: error: command line option '-Wno-bad-function-cast' is valid for C/ObjC but not for C++ [-Werror] > > Credit to David Andrs of INL, here is an alternative solution: > > MatFDColoringSetFunction(fdcoloring, > (PetscErrorCode(*)(void))(void(*)(void))SNESFormFunction, this); > > Thanks, > > -Ling > > From: petsc-users > on behalf of Zou, Ling via petsc-users > > Date: Monday, January 10, 2022 at 9:23 PM > To: Barry Smith > > Cc: PETSc > > Subject: Re: [petsc-users] A problem with MatFDColoringSetFunction > > Thank you Barry. > > -Ling > > From: Barry Smith > > Date: Monday, January 10, 2022 at 5:13 PM > To: Zou, Ling > > Cc: PETSc > > Subject: Re: [petsc-users] A problem with MatFDColoringSetFunction > > > This is annoying. You need to pass in a compiler flag to turn off the error conditioner for casting a function pointer. It may be -Wnoerror=cast-function-type just google. > > Barry > > > > On Jan 10, 2022, at 4:35 PM, Zou, Ling via petsc-users > wrote: > > Hi All, > > I would appreciate if you could give some advice for setting the ?FormFunction? for the MatFDColoringSetFunction function call. > I follow what is shown in the example: > https://petsc.org/release/src/snes/tutorials/ex14.c.html > > I setup similar code structure like: > PetscErrorCode FormFunction(SNES,Vec,Vec,void*); > Then use it as: > MatFDColoringSetFunction(fdcoloring, (PetscErrorCode(*)(void))SNESFormFunction, this); > > This works fine on my local MacOS. However, when commit it to a remote repo, the compiler there gives me the following error (where warnings are treated as errors): > > src/base/PETScProblemInterface.C:100:67: error: > cast between incompatible function types from > 'PetscErrorCode (*)(SNES, Vec, Vec, void*)' {aka 'int (*)(_p_SNES*, _p_Vec*, _p_Vec*, void*)'} > to > 'PetscErrorCode (*)()' {aka 'int (*)()'} [-Werror=cast-function-type] > > Any help is appreciated. > > -Ling -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Tue Jan 11 14:31:21 2022 From: bsmith at petsc.dev (Barry Smith) Date: Tue, 11 Jan 2022 15:31:21 -0500 Subject: [petsc-users] PCSetCoordinates does not set coordinates of sub PC (fieldsplit) objects In-Reply-To: References: Message-ID: Nicolas, For "simple" PCFIELDSPLIT it is possible to pass down the attached coordinate information. By simple I mean where the splitting is done by fields and not by general lists of IS (where one does not have enough information to know what the coordinates would mean to the subPCS). Look in fieldsplit.c PCFieldSplitSetFields_FieldSplit() where it does the KSPCreate(). I think you can do a KSPGetPC() on that ksp and PCSetCoordinates on that PC to supply the coordinates to the subPC. In the function PCFieldSplitSetIS_FieldSplit() you can also attach the coordinates to the subPCs IF defaultsplit is true. Sadly this is not the full story. 
The outer PC will not have any coordinates because calling PCSetCoordinates on a PCFIELDSPLIT does nothing since fieldsplit doesn't handle coordinates. So you need to do more, you need to provide a PCSetCoordinates_FieldSplit() that saves the coordinates in new entries in the PC_FieldSplit struct and then in PCFieldSplitSetFields_FieldSplit() you need to access those saved values and pass them into the PCSetCoordinates() that you call on the subPCs. Once you write PCSetCoordinates_FieldSplit() you need to call ierr = PetscObjectComposeFunction((PetscObject)pc,"PCSetCoordinates_C",PCSetCoordinates_FieldSplit);CHKERRQ(ierr); inside PCCreate_FieldSplit(). Any questions just let us know. Barry > On Jan 11, 2022, at 11:58 AM, Nicol?s Barnafi wrote: > > Dear community, > > I am working on a block preconditioner, where one of the blocks uses HYPRE's AMS. As it requires the coordinates of the dofs, I have done so to the PC object. I expected the coordinates to be inherited down to the subblocks, is this not the case? (it seems so as I couldn't find a specialized FIELDSPLIT SetCoordinates function). > > If this feature is missing, please give me some hints on where to add the missing function, I would gladly do it. If not, please let me know why it was dismissed, in order to do things the hard way [as in hard-coded ;)]. > > Kind regards, > Nicolas From bastian.loehrer at tu-dresden.de Tue Jan 11 15:33:50 2022 From: bastian.loehrer at tu-dresden.de (=?UTF-8?Q?Bastian_L=c3=b6hrer?=) Date: Tue, 11 Jan 2022 22:33:50 +0100 Subject: [petsc-users] Loading DM binary data into 64-bit-integer PETSc Message-ID: An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Tue Jan 11 15:49:08 2022 From: bsmith at petsc.dev (Barry Smith) Date: Tue, 11 Jan 2022 16:49:08 -0500 Subject: [petsc-users] Loading DM binary data into 64-bit-integer PETSc In-Reply-To: References: Message-ID: <5CB2B702-06AD-4AC4-B71E-38B809AA12D8@petsc.dev> In general, no because the integer values inside the files will have different lengths. You can use the Python lib/petsc/bin/PetscBinaryIO.py or MATLAB share/petsc/matlab/*.m utilities to read the data from one format and save it in the other. Sorry for the inconvenience. Barry > On Jan 11, 2022, at 4:33 PM, Bastian L?hrer wrote: > > Dear PETSc community, > > I have PETSc binaries containing DM PetscScalars which were written with a PETSc configured without 64bit integers. > > Is it possible to load this data using a PETSc configured with 64bit integers? > > Thanks, > Bastian > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Tue Jan 11 19:04:08 2022 From: jed at jedbrown.org (Jed Brown) Date: Tue, 11 Jan 2022 18:04:08 -0700 Subject: [petsc-users] A problem with MatFDColoringSetFunction In-Reply-To: <4DA70C43-8EB4-4B3B-99E7-265191B9DC36@petsc.dev> References: <1E74890F-2DCE-4B7C-84DB-35D53B0B6759@petsc.dev> <4DA70C43-8EB4-4B3B-99E7-265191B9DC36@petsc.dev> Message-ID: <87a6g1sr1z.fsf@jedbrown.org> Barry Smith writes: > Thanks for the alternative. > > @jed does this mean we need to do our casting of functions like this everywhere? Whenever our interfaces accept a function with a non-unique prototype, the interface should take void(*)(void) instead of PetscErrorCode(*)(void). I think we should also be a bit more judicious about where we absolutely need this hack. 
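[Editor's note] To collect the MatFDColoringSetFunction workaround from earlier in this digest in one place, here is a compact sketch of the double cast (first to the generic void(*)(void), then to the PetscErrorCode(*)(void) the interface expects), written in the ierr/CHKERRQ style used elsewhere in the thread; FormResidual and AttachResidual are illustrative names, not PETSc API.

#include <petscsnes.h>

/* residual callback with the usual SNES prototype; the body is the user's */
static PetscErrorCode FormResidual(SNES snes, Vec x, Vec f, void *ctx)
{
  PetscFunctionBeginUser;
  /* ... evaluate f(x) here ... */
  PetscFunctionReturn(0);
}

/* Going through void(*)(void) keeps -Wcast-function-type (promoted to an error
   in the build discussed above) quiet before casting to the generic function
   pointer type MatFDColoringSetFunction() accepts. */
static PetscErrorCode AttachResidual(MatFDColoring fdcoloring, void *ctx)
{
  PetscErrorCode ierr;
  PetscFunctionBeginUser;
  ierr = MatFDColoringSetFunction(fdcoloring,
           (PetscErrorCode (*)(void))(void (*)(void))FormResidual, ctx);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}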
From knepley at gmail.com Tue Jan 11 20:25:37 2022 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 11 Jan 2022 21:25:37 -0500 Subject: [petsc-users] Fluid-Structure interaction with multiple DMPlex In-Reply-To: References: <87wnj8tvl9.fsf@jedbrown.org> <87zgo2s2qy.fsf@jedbrown.org> Message-ID: On Tue, Jan 11, 2022 at 2:59 PM Thibault Bridel-Bertomeu < thibault.bridelbertomeu at gmail.com> wrote: > Le mar. 11 janv. 2022 ? 16:36, Jed Brown a ?crit : > >> Thibault Bridel-Bertomeu writes: >> >> > Hello everybody, >> > >> > So, let's say i have the mesh attached to this email that has 2 physical >> > surfaces and 5 physical curves. This gives me 2 strata in the "Cell >> Sets" >> > and 5 strata in the "Face Sets". >> > Would something like the following piece of code be the right way to >> > "extract" and manipulate each stratum of the "Cell Sets" to assign them >> a >> > DS, a TS etc...? >> > >> > DMLabel surfacesLabel; ierr = DMGetLabel(dm, "Cell Sets", >> > &surfacesLabel);CHKERRQ(ierr); IS fluidIS; ierr = >> > DMLabelGetStratumIS(surfacesLabel, 2, &fluidIS);CHKERRQ(ierr); >> > DMLabel fluidLabel; ierr = DMLabelCreate(PETSC_COMM_WORLD, "Fluid", >> > &fluidLabel);CHKERRQ(ierr); ierr = DMLabelSetStratumIS(fluidLabel, >> > 1, fluidIS);CHKERRQ(ierr); >> > >> > Once I have the Fluid label linked to the fluidIS (same for the >> > solid), should I call DMPlexLabelComplete on both the labels before >> > proceeding and calling the DMCreateSubDM with their IS ? >> >> How do you want to implement the function space and interface condition? >> As single-valued temperature seen from both sides? With a discontinuous >> space and a surface integral? Euler equations are commonly solved in >> conservative variables, thus you don't have an option of a continuous >> temperature space. >> > > A priori, the job of the fluid is to provide a certain heat flux to the > solid that will subsequently warm up. So I am not expecting that both fluid > and solid will have the same temperature at the interface, it will indeed > be discontinuous. > I do not get why you mention a surface integral though ? > Anyways, for now, I do not really know how to handle this boundary > condition problem ... it is one of the _big_ items of my todo list and I > think I'll need your help. > Hmm, let's start here because that does not make sense to me. Can you write down the mathematical model? Usually I think of temperature as being a continuous field. I have only seen discontinuities for rareified gases. 
Thanks, Matt > For now, I am trying to figure out how to handle both discretizations, and > I am going the following way : > > // Set up the discrete systems for both domains > //+ Fluid is finite volumes > PetscFV fluidFV; > ierr = PetscFVCreate(PetscObjectComm((PetscObject) dm), &fluidFV);CHKERRQ > (ierr); > user->dof = 4; > ierr = PetscFVSetNumComponents(fluidFV, user->dof);CHKERRQ(ierr); > PetscInt dim; > ierr = DMGetDimension(dm, &dim);CHKERRQ(ierr); > ierr = PetscFVSetSpatialDimension(fluidFV, dim);CHKERRQ(ierr); > ierr = PetscObjectSetName((PetscObject) fluidFV, "EulerEquation");CHKERRQ > (ierr); > ierr = PetscFVSetType(fluidFV, PETSCFVLEASTSQUARES);CHKERRQ(ierr); > ierr = PetscFVSetComputeGradients(fluidFV, PETSC_FALSE);CHKERRQ(ierr); > PetscInt fluidFieldNum = 0; > ierr = DMSetField(dm, fluidFieldNum, fluidLabel, (PetscObject) fluidFV); > CHKERRQ(ierr); > ierr = DMViewFromOptions(dm, NULL, "-dm_fv_view");CHKERRQ(ierr); > //+ Solid is finite elements > PetscBool simplex; > ierr = DMPlexIsSimplex(dm, &simplex);CHKERRQ(ierr); > PetscFE solidFE; > ierr = PetscFECreateDefault(PetscObjectComm((PetscObject) dm), dim, 1, > simplex, "heateqn_", -1, &solidFE);CHKERRQ(ierr); > ierr = PetscObjectSetName((PetscObject) solidFE, "HeatEquation");CHKERRQ > (ierr); > PetscInt solidFieldNum = 1; > ierr = DMSetField(dm, solidFieldNum, solidLabel, (PetscObject) solidFE); > CHKERRQ(ierr); > ierr = PetscFEDestroy(&solidFE);CHKERRQ(ierr); > ierr = DMViewFromOptions(dm, NULL, "-dm_fe_view");CHKERRQ(ierr); > // //+ Sub-DM for the fluid domain > // DM fluidDM; > // IS subFluidIS; > // ierr = DMCreateSubDM(dm, 1, &fluidFieldNum, &subFluidIS, > &fluidDM);CHKERRQ(ierr); > // ierr = DMViewFromOptions(fluidDM, NULL, "-fluid_dm_view");CHKERRQ(ierr); > // //+ Sub-DM for the solid domain > // DM solidDM; > // IS subSolidIS; > // ierr = DMCreateSubDM(dm, 1, &solidFieldNum, &subSolidIS, > &solidDM);CHKERRQ(ierr); > // ierr = DMViewFromOptions(solidDM, NULL, "-solid_dm_view");CHKERRQ(ierr); > //+ Create the DS covering both domains > ierr = DMCreateDS(dm);CHKERRQ(ierr); > ierr = DMViewFromOptions(dm, NULL, "-dm_ds_view");CHKERRQ(ierr); > //+ Sub-DS for the fluid domain > PetscDS fluidDS; > ierr = DMGetRegionNumDS(dm, fluidFieldNum, &fluidLabel, &fluidIS, > &fluidDS);CHKERRQ(ierr); > ierr = PetscDSViewFromOptions(fluidDS, NULL, "-fluid_ds_view");CHKERRQ > (ierr); > //+ Sub-DS for the solid domain > PetscDS solidDS; > ierr = DMGetRegionNumDS(dm, solidFieldNum, &solidLabel, &solidIS, > &solidDS);CHKERRQ(ierr); > ierr = PetscDSViewFromOptions(solidDS, NULL, "-solid_ds_view");CHKERRQ > (ierr); > > I am not sure where I am going with this for now, I am really > experimenting ... > I tried generating two sub-DMs because it seemed to me that, that way, I > could have two TS and basically work my two problems independently and > figure out the interface later. But! When I printed them (with > fluid_dm_view and solid_dm_view), it seemed that something was wrong : the > fluidDM showed a FEM field and both sub DMs appeared to know the whole mesh > and all the strata of all the fields. > The next step is to figure out the FEM for the heat equation -- I have the > details for the FVM in another code, it should be pretty straightforward. > The following step will be to plug in a timestepper, or two timesteppers. > Then the interface and the connection between the two problems. > > Thanks for your help and all your input, as always !!!! 
> > Cheers, > Thibault > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Jan 11 20:43:36 2022 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 11 Jan 2022 21:43:36 -0500 Subject: [petsc-users] Finite difference approximation of Jacobian In-Reply-To: References: <231abd15aab544f9850826cb437366f7@lanl.gov> Message-ID: On Tue, Jan 11, 2022 at 12:09 PM Patrick Sanan wrote: > Working on doing this incrementally, in progress here: > https://gitlab.com/petsc/petsc/-/merge_requests/4712 > > This works in 1D for AIJ matrices, assembling a matrix with a maximal > number of zero entries as dictated by the stencil width (which is intended > to be very very close to what DMDA would do if you > associated all the unknowns with a particular grid point, which is the way > DMStag largely works under the hood). > > Dave, before I get into it, am I correct in my understanding that > MATPREALLOCATOR would be better here because you would avoid superfluous > zeros in the sparsity pattern, > because this routine wouldn't have to assemble the Mat returned by > DMCreateMatrix()? > Yes, here is how it works. You throw in all the nonzeros you come across. Preallocator is a hash table that can check for duplicates. At the end, it returns the sparsity pattern. Thanks, Matt > If this seems like a sane way to go, I will continue to add some more > tests (in particular periodic BCs not tested yet) and add the code for 2D > and 3D. > > > > Am Mo., 13. Dez. 2021 um 20:17 Uhr schrieb Dave May < > dave.mayhem23 at gmail.com>: > >> >> >> On Mon, 13 Dec 2021 at 20:13, Matthew Knepley wrote: >> >>> On Mon, Dec 13, 2021 at 1:52 PM Dave May >>> wrote: >>> >>>> On Mon, 13 Dec 2021 at 19:29, Matthew Knepley >>>> wrote: >>>> >>>>> On Mon, Dec 13, 2021 at 1:16 PM Dave May >>>>> wrote: >>>>> >>>>>> >>>>>> >>>>>> On Sat 11. Dec 2021 at 22:28, Matthew Knepley >>>>>> wrote: >>>>>> >>>>>>> On Sat, Dec 11, 2021 at 1:58 PM Tang, Qi wrote: >>>>>>> >>>>>>>> Hi, >>>>>>>> Does anyone have comment on finite difference coloring with DMStag? >>>>>>>> We are using DMStag and TS to evolve some nonlinear equations implicitly. >>>>>>>> It would be helpful to have the coloring Jacobian option with that. >>>>>>>> >>>>>>> >>>>>>> Since DMStag produces the Jacobian connectivity, >>>>>>> >>>>>> >>>>>> This is incorrect. >>>>>> The DMCreateMatrix implementation for DMSTAG only sets the number of >>>>>> nonzeros (very inaccurately). It does not insert any zero values and thus >>>>>> the nonzero structure is actually not defined. >>>>>> That is why coloring doesn?t work. >>>>>> >>>>> >>>>> Ah, thanks Dave. >>>>> >>>>> Okay, we should fix that.It is perfectly possible to compute the >>>>> nonzero pattern from the DMStag information. >>>>> >>>> >>>> Agreed. The API for DMSTAG is complete enough to enable one to >>>> loop over the cells, and for all quantities defined on the cell >>>> (centre, face, vertex), >>>> insert values into the appropriate slot in the matrix. >>>> Combined with MATPREALLOCATOR, I believe a compact and readable >>>> code should be possible to write for the preallocation (cf DMDA). 
>>>> >>>> I think the only caveat with the approach of using all quantities >>>> defined on the cell is >>>> It may slightly over allocate depending on how the user wishes to >>>> impose the boundary condition, >>>> or slightly over allocate for says Stokes where there is no >>>> pressure-pressure coupling term. >>>> >>> >>> Yes, and would not handle higher order stencils.I think the >>> overallocating is livable for the first imeplementation. >>> >>> >> Sure, but neither does DMDA. >> >> The user always has to know what they are doing and set the stencil width >> accordingly. >> I actually had this point listed in my initial email (and the stencil >> growth issue when using FD for nonlinear problems), >> however I deleted it as all the same issue exist in DMDA and no one >> complains (at least not loudly) :D >> >> >> >> >> >>> Thanks, >>> >>> Matt >>> >>> >>>> Thanks, >>>> Dave >>>> >>>> >>>>> Paging Patrick :) >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> >>>>>> Thanks, >>>>>> Dave >>>>>> >>>>>> >>>>>> you can use -snes_fd_color_use_mat. It has many options. Here is an >>>>>>> example of us using that: >>>>>>> >>>>>>> >>>>>>> https://gitlab.com/petsc/petsc/-/blob/main/src/snes/tutorials/ex19.c#L898 >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Matt >>>>>>> >>>>>>> >>>>>>>> Thanks, >>>>>>>> Qi >>>>>>>> >>>>>>>> >>>>>>>> On Oct 15, 2021, at 3:07 PM, Jorti, Zakariae via petsc-users < >>>>>>>> petsc-users at mcs.anl.gov> wrote: >>>>>>>> >>>>>>>> Hello, >>>>>>>> >>>>>>>> Does the Jacobian approximation using coloring and finite >>>>>>>> differencing of the function evaluation work in DMStag? >>>>>>>> Thank you. >>>>>>>> Best regards, >>>>>>>> >>>>>>>> Zakariae >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted before they begin their >>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>> experiments lead. >>>>>>> -- Norbert Wiener >>>>>>> >>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>> >>>>>>> >>>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ >>>>> >>>>> >>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Jan 11 20:51:04 2022 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 11 Jan 2022 21:51:04 -0500 Subject: [petsc-users] PCSetCoordinates does not set coordinates of sub PC (fieldsplit) objects In-Reply-To: References: Message-ID: On Tue, Jan 11, 2022 at 3:31 PM Barry Smith wrote: > > Nicolas, > > For "simple" PCFIELDSPLIT it is possible to pass down the attached > coordinate information. By simple I mean where the splitting is done by > fields and not by general lists of IS (where one does not have enough > information to know what the coordinates would mean to the subPCS). 
> > Look in fieldsplit.c PCFieldSplitSetFields_FieldSplit() where it does > the KSPCreate(). I think you can do a KSPGetPC() on that ksp and > PCSetCoordinates on that PC to supply the coordinates to the subPC. In the > function PCFieldSplitSetIS_FieldSplit() you can also attach the coordinates > to the subPCs IF defaultsplit is true. > > Sadly this is not the full story. The outer PC will not have any > coordinates because calling PCSetCoordinates on a PCFIELDSPLIT does nothing > since fieldsplit doesn't handle coordinates. So you need to do more, you > need to provide a PCSetCoordinates_FieldSplit() that saves the coordinates > in new entries in the PC_FieldSplit struct and then in > PCFieldSplitSetFields_FieldSplit() you need to access those saved values > and pass them into the PCSetCoordinates() that you call on the subPCs. Once > you write > PCSetCoordinates_FieldSplit() you need to call > > ierr = > PetscObjectComposeFunction((PetscObject)pc,"PCSetCoordinates_C",PCSetCoordinates_FieldSplit);CHKERRQ(ierr); > > > inside PCCreate_FieldSplit(). > > Any questions just let us know. > I will add "Why is this so cumbersome?". This is a workaround in order to get geometric information into GAMG. It should really be PCGAMGSetCoordinates(), which are used to calculate the rigid body modes, and assume a bunch of stuff about the coordinate space. This would not help you, because it would still force you to pull out the correct subPC. The "right" way now to give geometric information to a TS/SNES/KSP/PC is through a DM, which are passed down through PCFIELDSPLIT, PCPATCH, etc. However they are heavier weight than just some coordinates. Thanks, Matt > Barry > > > > On Jan 11, 2022, at 11:58 AM, Nicol?s Barnafi wrote: > > > > Dear community, > > > > I am working on a block preconditioner, where one of the blocks uses > HYPRE's AMS. As it requires the coordinates of the dofs, I have done so to > the PC object. I expected the coordinates to be inherited down to the > subblocks, is this not the case? (it seems so as I couldn't find a > specialized FIELDSPLIT SetCoordinates function). > > > > If this feature is missing, please give me some hints on where to add > the missing function, I would gladly do it. If not, please let me know why > it was dismissed, in order to do things the hard way [as in hard-coded ;)]. > > > > Kind regards, > > Nicolas > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Tue Jan 11 23:22:43 2022 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 12 Jan 2022 00:22:43 -0500 Subject: [petsc-users] PCSetCoordinates does not set coordinates of sub PC (fieldsplit) objects In-Reply-To: References: Message-ID: > On Jan 11, 2022, at 9:51 PM, Matthew Knepley wrote: > > On Tue, Jan 11, 2022 at 3:31 PM Barry Smith > wrote: > > Nicolas, > > For "simple" PCFIELDSPLIT it is possible to pass down the attached coordinate information. By simple I mean where the splitting is done by fields and not by general lists of IS (where one does not have enough information to know what the coordinates would mean to the subPCS). > > Look in fieldsplit.c PCFieldSplitSetFields_FieldSplit() where it does the KSPCreate(). I think you can do a KSPGetPC() on that ksp and PCSetCoordinates on that PC to supply the coordinates to the subPC. 
In the function PCFieldSplitSetIS_FieldSplit() you can also attach the coordinates to the subPCs IF defaultsplit is true. > > Sadly this is not the full story. The outer PC will not have any coordinates because calling PCSetCoordinates on a PCFIELDSPLIT does nothing since fieldsplit doesn't handle coordinates. So you need to do more, you need to provide a PCSetCoordinates_FieldSplit() that saves the coordinates in new entries in the PC_FieldSplit struct and then in PCFieldSplitSetFields_FieldSplit() you need to access those saved values and pass them into the PCSetCoordinates() that you call on the subPCs. Once you write > PCSetCoordinates_FieldSplit() you need to call > > ierr = PetscObjectComposeFunction((PetscObject)pc,"PCSetCoordinates_C",PCSetCoordinates_FieldSplit);CHKERRQ(ierr); > > inside PCCreate_FieldSplit(). > > Any questions just let us know. > > I will add "Why is this so cumbersome?". This is a workaround in order to get geometric information into GAMG. It should really be PCGAMGSetCoordinates(), which > are used to calculate the rigid body modes, and assume a bunch of stuff about the coordinate space. This would not help you, because it would still force you to pull > out the correct subPC. The "right" way now to give geometric information to a TS/SNES/KSP/PC is through a DM, which are passed down through PCFIELDSPLIT, > PCPATCH, etc. However they are heavier weight than just some coordinates. This is not cumbersome at all. It is a simple natural way to pass around coordinates to PC's and, when possible, their children. Barry Note that we could also have a DMGetCoordinates() that pulled coordinates from a DM (that happended to have them) in this form associated with the PC and the PC could call it to get the coordinates and use them as needed. But this simple PCSetCoordinates() is a nice complement to that approach. > > Thanks, > > Matt > > Barry > > > > On Jan 11, 2022, at 11:58 AM, Nicol?s Barnafi > wrote: > > > > Dear community, > > > > I am working on a block preconditioner, where one of the blocks uses HYPRE's AMS. As it requires the coordinates of the dofs, I have done so to the PC object. I expected the coordinates to be inherited down to the subblocks, is this not the case? (it seems so as I couldn't find a specialized FIELDSPLIT SetCoordinates function). > > > > If this feature is missing, please give me some hints on where to add the missing function, I would gladly do it. If not, please let me know why it was dismissed, in order to do things the hard way [as in hard-coded ;)]. > > > > Kind regards, > > Nicolas > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Tue Jan 11 23:30:22 2022 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 12 Jan 2022 00:30:22 -0500 Subject: [petsc-users] Finite difference approximation of Jacobian In-Reply-To: References: <231abd15aab544f9850826cb437366f7@lanl.gov> Message-ID: I think the MATPREALLOCATOR as a MatType is a cumbersome strange thing and would prefer it was just functionality that Mat provided directly; for example MatSetOption(mat, preallocator_mode,true); matsetvalues,... MatAssemblyBegin/End. Now the matrix has its proper nonzero structure of whatever type the user set initially, aij, baij, sbaij, .... 
And the user can now use it efficiently. Barry So turning on the option just swaps out temporarily the operations for MatSetValues and AssemblyBegin/End to be essentially those in MATPREALLOCATOR. The refactorization should take almost no time and would be faster than trying to rig dmstag to use MATPREALLOCATOR as is. > On Jan 11, 2022, at 9:43 PM, Matthew Knepley wrote: > > On Tue, Jan 11, 2022 at 12:09 PM Patrick Sanan > wrote: > Working on doing this incrementally, in progress here: https://gitlab.com/petsc/petsc/-/merge_requests/4712 > > This works in 1D for AIJ matrices, assembling a matrix with a maximal number of zero entries as dictated by the stencil width (which is intended to be very very close to what DMDA would do if you > associated all the unknowns with a particular grid point, which is the way DMStag largely works under the hood). > > Dave, before I get into it, am I correct in my understanding that MATPREALLOCATOR would be better here because you would avoid superfluous zeros in the sparsity pattern, > because this routine wouldn't have to assemble the Mat returned by DMCreateMatrix()? > > Yes, here is how it works. You throw in all the nonzeros you come across. Preallocator is a hash table that can check for duplicates. At the end, it returns the sparsity pattern. > > Thanks, > > Matt > > If this seems like a sane way to go, I will continue to add some more tests (in particular periodic BCs not tested yet) and add the code for 2D and 3D. > > > > Am Mo., 13. Dez. 2021 um 20:17 Uhr schrieb Dave May >: > > > On Mon, 13 Dec 2021 at 20:13, Matthew Knepley > wrote: > On Mon, Dec 13, 2021 at 1:52 PM Dave May > wrote: > On Mon, 13 Dec 2021 at 19:29, Matthew Knepley > wrote: > On Mon, Dec 13, 2021 at 1:16 PM Dave May > wrote: > > > On Sat 11. Dec 2021 at 22:28, Matthew Knepley > wrote: > On Sat, Dec 11, 2021 at 1:58 PM Tang, Qi > wrote: > Hi, > Does anyone have comment on finite difference coloring with DMStag? We are using DMStag and TS to evolve some nonlinear equations implicitly. It would be helpful to have the coloring Jacobian option with that. > > Since DMStag produces the Jacobian connectivity, > > This is incorrect. > The DMCreateMatrix implementation for DMSTAG only sets the number of nonzeros (very inaccurately). It does not insert any zero values and thus the nonzero structure is actually not defined. > That is why coloring doesn?t work. > > Ah, thanks Dave. > > Okay, we should fix that.It is perfectly possible to compute the nonzero pattern from the DMStag information. > > Agreed. The API for DMSTAG is complete enough to enable one to > loop over the cells, and for all quantities defined on the cell (centre, face, vertex), > insert values into the appropriate slot in the matrix. > Combined with MATPREALLOCATOR, I believe a compact and readable > code should be possible to write for the preallocation (cf DMDA). > > I think the only caveat with the approach of using all quantities defined on the cell is > It may slightly over allocate depending on how the user wishes to impose the boundary condition, > or slightly over allocate for says Stokes where there is no pressure-pressure coupling term. > > Yes, and would not handle higher order stencils.I think the overallocating is livable for the first imeplementation. > > > Sure, but neither does DMDA. > > The user always has to know what they are doing and set the stencil width accordingly. 
> I actually had this point listed in my initial email (and the stencil growth issue when using FD for nonlinear problems), > however I deleted it as all the same issue exist in DMDA and no one complains (at least not loudly) :D > > > > > Thanks, > > Matt > > Thanks, > Dave > > > Paging Patrick :) > > Thanks, > > Matt > > Thanks, > Dave > > > you can use -snes_fd_color_use_mat. It has many options. Here is an example of us using that: > > https://gitlab.com/petsc/petsc/-/blob/main/src/snes/tutorials/ex19.c#L898 > > Thanks, > > Matt > > Thanks, > Qi > > >> On Oct 15, 2021, at 3:07 PM, Jorti, Zakariae via petsc-users > wrote: >> >> Hello, >> >> Does the Jacobian approximation using coloring and finite differencing of the function evaluation work in DMStag? >> Thank you. >> Best regards, >> >> Zakariae > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Tue Jan 11 23:33:44 2022 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 12 Jan 2022 00:33:44 -0500 Subject: [petsc-users] A problem with MatFDColoringSetFunction In-Reply-To: <87a6g1sr1z.fsf@jedbrown.org> References: <1E74890F-2DCE-4B7C-84DB-35D53B0B6759@petsc.dev> <4DA70C43-8EB4-4B3B-99E7-265191B9DC36@petsc.dev> <87a6g1sr1z.fsf@jedbrown.org> Message-ID: <289E8755-62C5-4FAC-9DDD-8DA6428726F5@petsc.dev> > On Jan 11, 2022, at 8:04 PM, Jed Brown wrote: > > Barry Smith writes: > >> Thanks for the alternative. >> >> @jed does this mean we need to do our casting of functions like this everywhere? > > Whenever our interfaces accept a function with a non-unique prototype, the interface should take void(*)(void) instead of PetscErrorCode(*)(void). Great. I didn't realise when I did the initial PetscErrorCode(*)(void). that void(*)(void) was a special portable castable function prototype I have made an issue. > > I think we should also be a bit more judicious about where we absolutely need this hack. From jed at jedbrown.org Tue Jan 11 23:43:00 2022 From: jed at jedbrown.org (Jed Brown) Date: Tue, 11 Jan 2022 22:43:00 -0700 Subject: [petsc-users] Finite difference approximation of Jacobian In-Reply-To: References: <231abd15aab544f9850826cb437366f7@lanl.gov> Message-ID: <877db5se57.fsf@jedbrown.org> I agree with this and even started a branch jed/mat-hash in (yikes!) 2017. I think it should be default if no preallocation functions are called. But it isn't exactly the MATPREALLOCATOR code because it needs to handle values too. Should not be a lot of code and will essentially remove this FAQ and one of the most irritating subtle aspects of new codes using PETSc matrices. 
https://petsc.org/release/faq/#assembling-large-sparse-matrices-takes-a-long-time-what-can-i-do-to-make-this-process-faster-or-matsetvalues-is-so-slow-what-can-i-do-to-speed-it-up Barry Smith writes: > I think the MATPREALLOCATOR as a MatType is a cumbersome strange thing and would prefer it was just functionality that Mat provided directly; for example MatSetOption(mat, preallocator_mode,true); matsetvalues,... MatAssemblyBegin/End. Now the matrix has its proper nonzero structure of whatever type the user set initially, aij, baij, sbaij, .... And the user can now use it efficiently. > > Barry > > So turning on the option just swaps out temporarily the operations for MatSetValues and AssemblyBegin/End to be essentially those in MATPREALLOCATOR. The refactorization should take almost no time and would be faster than trying to rig dmstag to use MATPREALLOCATOR as is. > > >> On Jan 11, 2022, at 9:43 PM, Matthew Knepley wrote: >> >> On Tue, Jan 11, 2022 at 12:09 PM Patrick Sanan > wrote: >> Working on doing this incrementally, in progress here: https://gitlab.com/petsc/petsc/-/merge_requests/4712 >> >> This works in 1D for AIJ matrices, assembling a matrix with a maximal number of zero entries as dictated by the stencil width (which is intended to be very very close to what DMDA would do if you >> associated all the unknowns with a particular grid point, which is the way DMStag largely works under the hood). >> >> Dave, before I get into it, am I correct in my understanding that MATPREALLOCATOR would be better here because you would avoid superfluous zeros in the sparsity pattern, >> because this routine wouldn't have to assemble the Mat returned by DMCreateMatrix()? >> >> Yes, here is how it works. You throw in all the nonzeros you come across. Preallocator is a hash table that can check for duplicates. At the end, it returns the sparsity pattern. >> >> Thanks, >> >> Matt >> >> If this seems like a sane way to go, I will continue to add some more tests (in particular periodic BCs not tested yet) and add the code for 2D and 3D. >> >> >> >> Am Mo., 13. Dez. 2021 um 20:17 Uhr schrieb Dave May >: >> >> >> On Mon, 13 Dec 2021 at 20:13, Matthew Knepley > wrote: >> On Mon, Dec 13, 2021 at 1:52 PM Dave May > wrote: >> On Mon, 13 Dec 2021 at 19:29, Matthew Knepley > wrote: >> On Mon, Dec 13, 2021 at 1:16 PM Dave May > wrote: >> >> >> On Sat 11. Dec 2021 at 22:28, Matthew Knepley > wrote: >> On Sat, Dec 11, 2021 at 1:58 PM Tang, Qi > wrote: >> Hi, >> Does anyone have comment on finite difference coloring with DMStag? We are using DMStag and TS to evolve some nonlinear equations implicitly. It would be helpful to have the coloring Jacobian option with that. >> >> Since DMStag produces the Jacobian connectivity, >> >> This is incorrect. >> The DMCreateMatrix implementation for DMSTAG only sets the number of nonzeros (very inaccurately). It does not insert any zero values and thus the nonzero structure is actually not defined. >> That is why coloring doesn?t work. >> >> Ah, thanks Dave. >> >> Okay, we should fix that.It is perfectly possible to compute the nonzero pattern from the DMStag information. >> >> Agreed. The API for DMSTAG is complete enough to enable one to >> loop over the cells, and for all quantities defined on the cell (centre, face, vertex), >> insert values into the appropriate slot in the matrix. >> Combined with MATPREALLOCATOR, I believe a compact and readable >> code should be possible to write for the preallocation (cf DMDA). 
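A minimal sketch of that MATPREALLOCATOR workflow, for a generic square AIJ matrix and relying only on the existing MatPreallocatorPreallocate() interface; the size n and the insert_pattern() call are placeholders for the user's own stencil loop, not DMStag code:

#include <petscmat.h>

/* Discover the nonzero pattern with a throw-away MATPREALLOCATOR, then use it
   to preallocate the real matrix. */
static PetscErrorCode PreallocateWithPreallocator(MPI_Comm comm, PetscInt n, Mat *A)
{
  Mat            preallocator;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  /* MatSetValues() on a MATPREALLOCATOR only records the pattern in a hash table */
  ierr = MatCreate(comm, &preallocator);CHKERRQ(ierr);
  ierr = MatSetSizes(preallocator, PETSC_DECIDE, PETSC_DECIDE, n, n);CHKERRQ(ierr);
  ierr = MatSetType(preallocator, MATPREALLOCATOR);CHKERRQ(ierr);
  ierr = MatSetUp(preallocator);CHKERRQ(ierr);
  /* insert_pattern(preallocator);   placeholder: same MatSetValues() loop as the real assembly, values may all be zero */
  ierr = MatAssemblyBegin(preallocator, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(preallocator, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

  /* Create the real matrix and let the preallocator set its nonzero structure;
     fill = PETSC_TRUE also inserts explicit zeros at the pattern locations so
     the structure is defined after assembly */
  ierr = MatCreate(comm, A);CHKERRQ(ierr);
  ierr = MatSetSizes(*A, PETSC_DECIDE, PETSC_DECIDE, n, n);CHKERRQ(ierr);
  ierr = MatSetType(*A, MATAIJ);CHKERRQ(ierr);
  ierr = MatPreallocatorPreallocate(preallocator, PETSC_TRUE, *A);CHKERRQ(ierr);
  ierr = MatDestroy(&preallocator);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}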
>> >> I think the only caveat with the approach of using all quantities defined on the cell is >> It may slightly over allocate depending on how the user wishes to impose the boundary condition, >> or slightly over allocate for says Stokes where there is no pressure-pressure coupling term. >> >> Yes, and would not handle higher order stencils.I think the overallocating is livable for the first imeplementation. >> >> >> Sure, but neither does DMDA. >> >> The user always has to know what they are doing and set the stencil width accordingly. >> I actually had this point listed in my initial email (and the stencil growth issue when using FD for nonlinear problems), >> however I deleted it as all the same issue exist in DMDA and no one complains (at least not loudly) :D >> >> >> >> >> Thanks, >> >> Matt >> >> Thanks, >> Dave >> >> >> Paging Patrick :) >> >> Thanks, >> >> Matt >> >> Thanks, >> Dave >> >> >> you can use -snes_fd_color_use_mat. It has many options. Here is an example of us using that: >> >> https://gitlab.com/petsc/petsc/-/blob/main/src/snes/tutorials/ex19.c#L898 >> >> Thanks, >> >> Matt >> >> Thanks, >> Qi >> >> >>> On Oct 15, 2021, at 3:07 PM, Jorti, Zakariae via petsc-users > wrote: >>> >>> Hello, >>> >>> Does the Jacobian approximation using coloring and finite differencing of the function evaluation work in DMStag? >>> Thank you. >>> Best regards, >>> >>> Zakariae >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ From jed at jedbrown.org Tue Jan 11 23:52:26 2022 From: jed at jedbrown.org (Jed Brown) Date: Tue, 11 Jan 2022 22:52:26 -0700 Subject: [petsc-users] Fluid-Structure interaction with multiple DMPlex In-Reply-To: References: <87wnj8tvl9.fsf@jedbrown.org> <87zgo2s2qy.fsf@jedbrown.org> Message-ID: <874k69sdph.fsf@jedbrown.org> Matthew Knepley writes: > Hmm, let's start here because that does not make sense to me. Can you write > down the mathematical model? Usually I think of temperature as being > a continuous field. I have only seen discontinuities for rareified gases. Well, you can still model it as a discontinuous field with a flux derived from a jump condition similar to DG. I think that's a reasonable choice when the fluid uses conservative variables and/or a function space that does not have boundary nodes (like FV or DG). But you should check the literature to identify a formulation you're comfortable with and then we can advise on how to implement it with DMPlex. 
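One common finite-volume realization of such a jump-condition flux at a fluid-solid face is the two-point (series-resistance) flux sketched below; this is only a sketch with placeholder names, assuming cell-centered temperatures on each side of the face:

/* Two-point flux across a fluid-solid face.  d_* are the distances from the
   cell centers to the face, k_* the thermal conductivities.  This choice
   enforces continuity of the normal heat flux and implies a single interface
   temperature without storing it as an extra unknown. */
static double interface_heat_flux(double T_fluid, double k_fluid, double d_fluid,
                                  double T_solid, double k_solid, double d_solid)
{
  double resistance = d_fluid / k_fluid + d_solid / k_solid;  /* series thermal resistance */
  return (T_fluid - T_solid) / resistance;                    /* positive from fluid to solid */
}

The implied interface temperature, if needed for output, is the resistance-weighted average (k_f/d_f*T_f + k_s/d_s*T_s) / (k_f/d_f + k_s/d_s), i.e. the value at which the two one-sided fluxes agree.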
From bsmith at petsc.dev Wed Jan 12 01:03:53 2022 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 12 Jan 2022 02:03:53 -0500 Subject: [petsc-users] Finite difference approximation of Jacobian In-Reply-To: <877db5se57.fsf@jedbrown.org> References: <231abd15aab544f9850826cb437366f7@lanl.gov> <877db5se57.fsf@jedbrown.org> Message-ID: Why does it need to handle values? > On Jan 12, 2022, at 12:43 AM, Jed Brown wrote: > > I agree with this and even started a branch jed/mat-hash in (yikes!) 2017. I think it should be default if no preallocation functions are called. But it isn't exactly the MATPREALLOCATOR code because it needs to handle values too. Should not be a lot of code and will essentially remove this FAQ and one of the most irritating subtle aspects of new codes using PETSc matrices. > > https://petsc.org/release/faq/#assembling-large-sparse-matrices-takes-a-long-time-what-can-i-do-to-make-this-process-faster-or-matsetvalues-is-so-slow-what-can-i-do-to-speed-it-up > > Barry Smith writes: > >> I think the MATPREALLOCATOR as a MatType is a cumbersome strange thing and would prefer it was just functionality that Mat provided directly; for example MatSetOption(mat, preallocator_mode,true); matsetvalues,... MatAssemblyBegin/End. Now the matrix has its proper nonzero structure of whatever type the user set initially, aij, baij, sbaij, .... And the user can now use it efficiently. >> >> Barry >> >> So turning on the option just swaps out temporarily the operations for MatSetValues and AssemblyBegin/End to be essentially those in MATPREALLOCATOR. The refactorization should take almost no time and would be faster than trying to rig dmstag to use MATPREALLOCATOR as is. >> >> >>> On Jan 11, 2022, at 9:43 PM, Matthew Knepley wrote: >>> >>> On Tue, Jan 11, 2022 at 12:09 PM Patrick Sanan > wrote: >>> Working on doing this incrementally, in progress here: https://gitlab.com/petsc/petsc/-/merge_requests/4712 >>> >>> This works in 1D for AIJ matrices, assembling a matrix with a maximal number of zero entries as dictated by the stencil width (which is intended to be very very close to what DMDA would do if you >>> associated all the unknowns with a particular grid point, which is the way DMStag largely works under the hood). >>> >>> Dave, before I get into it, am I correct in my understanding that MATPREALLOCATOR would be better here because you would avoid superfluous zeros in the sparsity pattern, >>> because this routine wouldn't have to assemble the Mat returned by DMCreateMatrix()? >>> >>> Yes, here is how it works. You throw in all the nonzeros you come across. Preallocator is a hash table that can check for duplicates. At the end, it returns the sparsity pattern. >>> >>> Thanks, >>> >>> Matt >>> >>> If this seems like a sane way to go, I will continue to add some more tests (in particular periodic BCs not tested yet) and add the code for 2D and 3D. >>> >>> >>> >>> Am Mo., 13. Dez. 2021 um 20:17 Uhr schrieb Dave May >: >>> >>> >>> On Mon, 13 Dec 2021 at 20:13, Matthew Knepley > wrote: >>> On Mon, Dec 13, 2021 at 1:52 PM Dave May > wrote: >>> On Mon, 13 Dec 2021 at 19:29, Matthew Knepley > wrote: >>> On Mon, Dec 13, 2021 at 1:16 PM Dave May > wrote: >>> >>> >>> On Sat 11. Dec 2021 at 22:28, Matthew Knepley > wrote: >>> On Sat, Dec 11, 2021 at 1:58 PM Tang, Qi > wrote: >>> Hi, >>> Does anyone have comment on finite difference coloring with DMStag? We are using DMStag and TS to evolve some nonlinear equations implicitly. It would be helpful to have the coloring Jacobian option with that. 
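For reference, a minimal sketch of how a colored finite-difference Jacobian is typically wired up once the matrix returned by DMCreateMatrix() has an assembled nonzero pattern; snes and dm are assumed to exist already, and this is not DMStag-specific code:

/* Colored finite differences need J's nonzero structure to actually be
   defined, which is exactly the DMStag gap discussed in this thread. */
Mat J;
ierr = DMCreateMatrix(dm, &J);CHKERRQ(ierr);
ierr = SNESSetJacobian(snes, J, J, SNESComputeJacobianDefaultColor, NULL);CHKERRQ(ierr);
/* roughly the same effect as the command-line option -snes_fd_color */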
>>> >>> Since DMStag produces the Jacobian connectivity, >>> >>> This is incorrect. >>> The DMCreateMatrix implementation for DMSTAG only sets the number of nonzeros (very inaccurately). It does not insert any zero values and thus the nonzero structure is actually not defined. >>> That is why coloring doesn?t work. >>> >>> Ah, thanks Dave. >>> >>> Okay, we should fix that.It is perfectly possible to compute the nonzero pattern from the DMStag information. >>> >>> Agreed. The API for DMSTAG is complete enough to enable one to >>> loop over the cells, and for all quantities defined on the cell (centre, face, vertex), >>> insert values into the appropriate slot in the matrix. >>> Combined with MATPREALLOCATOR, I believe a compact and readable >>> code should be possible to write for the preallocation (cf DMDA). >>> >>> I think the only caveat with the approach of using all quantities defined on the cell is >>> It may slightly over allocate depending on how the user wishes to impose the boundary condition, >>> or slightly over allocate for says Stokes where there is no pressure-pressure coupling term. >>> >>> Yes, and would not handle higher order stencils.I think the overallocating is livable for the first imeplementation. >>> >>> >>> Sure, but neither does DMDA. >>> >>> The user always has to know what they are doing and set the stencil width accordingly. >>> I actually had this point listed in my initial email (and the stencil growth issue when using FD for nonlinear problems), >>> however I deleted it as all the same issue exist in DMDA and no one complains (at least not loudly) :D >>> >>> >>> >>> >>> Thanks, >>> >>> Matt >>> >>> Thanks, >>> Dave >>> >>> >>> Paging Patrick :) >>> >>> Thanks, >>> >>> Matt >>> >>> Thanks, >>> Dave >>> >>> >>> you can use -snes_fd_color_use_mat. It has many options. Here is an example of us using that: >>> >>> https://gitlab.com/petsc/petsc/-/blob/main/src/snes/tutorials/ex19.c#L898 >>> >>> Thanks, >>> >>> Matt >>> >>> Thanks, >>> Qi >>> >>> >>>> On Oct 15, 2021, at 3:07 PM, Jorti, Zakariae via petsc-users > wrote: >>>> >>>> Hello, >>>> >>>> Does the Jacobian approximation using coloring and finite differencing of the function evaluation work in DMStag? >>>> Thank you. >>>> Best regards, >>>> >>>> Zakariae >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
>>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ From fdkong.jd at gmail.com Wed Jan 12 01:22:02 2022 From: fdkong.jd at gmail.com (Fande Kong) Date: Wed, 12 Jan 2022 00:22:02 -0700 Subject: [petsc-users] Finite difference approximation of Jacobian In-Reply-To: References: Message-ID: <1FDAAB7C-465D-4643-A196-5CE3166F084A@gmail.com> This is something I almost started a while ago. https://gitlab.com/petsc/petsc/-/issues/852 It would be a very interesting addition to us. Fande > On Jan 12, 2022, at 12:04 AM, Barry Smith wrote: > > ? > Why does it need to handle values? > >> On Jan 12, 2022, at 12:43 AM, Jed Brown wrote: >> >> I agree with this and even started a branch jed/mat-hash in (yikes!) 2017. I think it should be default if no preallocation functions are called. But it isn't exactly the MATPREALLOCATOR code because it needs to handle values too. Should not be a lot of code and will essentially remove this FAQ and one of the most irritating subtle aspects of new codes using PETSc matrices. >> >> https://petsc.org/release/faq/#assembling-large-sparse-matrices-takes-a-long-time-what-can-i-do-to-make-this-process-faster-or-matsetvalues-is-so-slow-what-can-i-do-to-speed-it-up >> >> Barry Smith writes: >> >>> I think the MATPREALLOCATOR as a MatType is a cumbersome strange thing and would prefer it was just functionality that Mat provided directly; for example MatSetOption(mat, preallocator_mode,true); matsetvalues,... MatAssemblyBegin/End. Now the matrix has its proper nonzero structure of whatever type the user set initially, aij, baij, sbaij, .... And the user can now use it efficiently. >>> >>> Barry >>> >>> So turning on the option just swaps out temporarily the operations for MatSetValues and AssemblyBegin/End to be essentially those in MATPREALLOCATOR. The refactorization should take almost no time and would be faster than trying to rig dmstag to use MATPREALLOCATOR as is. >>> >>> >>>> On Jan 11, 2022, at 9:43 PM, Matthew Knepley wrote: >>>> >>>> On Tue, Jan 11, 2022 at 12:09 PM Patrick Sanan > wrote: >>>> Working on doing this incrementally, in progress here: https://gitlab.com/petsc/petsc/-/merge_requests/4712 >>>> >>>> This works in 1D for AIJ matrices, assembling a matrix with a maximal number of zero entries as dictated by the stencil width (which is intended to be very very close to what DMDA would do if you >>>> associated all the unknowns with a particular grid point, which is the way DMStag largely works under the hood). >>>> >>>> Dave, before I get into it, am I correct in my understanding that MATPREALLOCATOR would be better here because you would avoid superfluous zeros in the sparsity pattern, >>>> because this routine wouldn't have to assemble the Mat returned by DMCreateMatrix()? >>>> >>>> Yes, here is how it works. You throw in all the nonzeros you come across. Preallocator is a hash table that can check for duplicates. At the end, it returns the sparsity pattern. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> If this seems like a sane way to go, I will continue to add some more tests (in particular periodic BCs not tested yet) and add the code for 2D and 3D. >>>> >>>> >>>> >>>> Am Mo., 13. Dez. 2021 um 20:17 Uhr schrieb Dave May >: >>>> >>>> >>>> On Mon, 13 Dec 2021 at 20:13, Matthew Knepley > wrote: >>>> On Mon, Dec 13, 2021 at 1:52 PM Dave May > wrote: >>>> On Mon, 13 Dec 2021 at 19:29, Matthew Knepley > wrote: >>>> On Mon, Dec 13, 2021 at 1:16 PM Dave May > wrote: >>>> >>>> >>>> On Sat 11. 
Dec 2021 at 22:28, Matthew Knepley > wrote: >>>> On Sat, Dec 11, 2021 at 1:58 PM Tang, Qi > wrote: >>>> Hi, >>>> Does anyone have comment on finite difference coloring with DMStag? We are using DMStag and TS to evolve some nonlinear equations implicitly. It would be helpful to have the coloring Jacobian option with that. >>>> >>>> Since DMStag produces the Jacobian connectivity, >>>> >>>> This is incorrect. >>>> The DMCreateMatrix implementation for DMSTAG only sets the number of nonzeros (very inaccurately). It does not insert any zero values and thus the nonzero structure is actually not defined. >>>> That is why coloring doesn?t work. >>>> >>>> Ah, thanks Dave. >>>> >>>> Okay, we should fix that.It is perfectly possible to compute the nonzero pattern from the DMStag information. >>>> >>>> Agreed. The API for DMSTAG is complete enough to enable one to >>>> loop over the cells, and for all quantities defined on the cell (centre, face, vertex), >>>> insert values into the appropriate slot in the matrix. >>>> Combined with MATPREALLOCATOR, I believe a compact and readable >>>> code should be possible to write for the preallocation (cf DMDA). >>>> >>>> I think the only caveat with the approach of using all quantities defined on the cell is >>>> It may slightly over allocate depending on how the user wishes to impose the boundary condition, >>>> or slightly over allocate for says Stokes where there is no pressure-pressure coupling term. >>>> >>>> Yes, and would not handle higher order stencils.I think the overallocating is livable for the first imeplementation. >>>> >>>> >>>> Sure, but neither does DMDA. >>>> >>>> The user always has to know what they are doing and set the stencil width accordingly. >>>> I actually had this point listed in my initial email (and the stencil growth issue when using FD for nonlinear problems), >>>> however I deleted it as all the same issue exist in DMDA and no one complains (at least not loudly) :D >>>> >>>> >>>> >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> Thanks, >>>> Dave >>>> >>>> >>>> Paging Patrick :) >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> Thanks, >>>> Dave >>>> >>>> >>>> you can use -snes_fd_color_use_mat. It has many options. Here is an example of us using that: >>>> >>>> https://gitlab.com/petsc/petsc/-/blob/main/src/snes/tutorials/ex19.c#L898 >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> Thanks, >>>> Qi >>>> >>>> >>>>> On Oct 15, 2021, at 3:07 PM, Jorti, Zakariae via petsc-users > wrote: >>>>> >>>>> Hello, >>>>> >>>>> Does the Jacobian approximation using coloring and finite differencing of the function evaluation work in DMStag? >>>>> Thank you. >>>>> Best regards, >>>>> >>>>> Zakariae >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
>>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Wed Jan 12 01:22:06 2022 From: jed at jedbrown.org (Jed Brown) Date: Wed, 12 Jan 2022 00:22:06 -0700 Subject: [petsc-users] Finite difference approximation of Jacobian In-Reply-To: References: <231abd15aab544f9850826cb437366f7@lanl.gov> <877db5se57.fsf@jedbrown.org> Message-ID: <87y23lquzl.fsf@jedbrown.org> Because if a user jumps right into MatSetValues() and then wants to use the result, it better have the values. It's true that DM preallocation normally just "inserts zeros" and thus matches what MATPREALLOCATOR does. Barry Smith writes: > Why does it need to handle values? > >> On Jan 12, 2022, at 12:43 AM, Jed Brown wrote: >> >> I agree with this and even started a branch jed/mat-hash in (yikes!) 2017. I think it should be default if no preallocation functions are called. But it isn't exactly the MATPREALLOCATOR code because it needs to handle values too. Should not be a lot of code and will essentially remove this FAQ and one of the most irritating subtle aspects of new codes using PETSc matrices. >> >> https://petsc.org/release/faq/#assembling-large-sparse-matrices-takes-a-long-time-what-can-i-do-to-make-this-process-faster-or-matsetvalues-is-so-slow-what-can-i-do-to-speed-it-up >> >> Barry Smith writes: >> >>> I think the MATPREALLOCATOR as a MatType is a cumbersome strange thing and would prefer it was just functionality that Mat provided directly; for example MatSetOption(mat, preallocator_mode,true); matsetvalues,... MatAssemblyBegin/End. Now the matrix has its proper nonzero structure of whatever type the user set initially, aij, baij, sbaij, .... And the user can now use it efficiently. >>> >>> Barry >>> >>> So turning on the option just swaps out temporarily the operations for MatSetValues and AssemblyBegin/End to be essentially those in MATPREALLOCATOR. The refactorization should take almost no time and would be faster than trying to rig dmstag to use MATPREALLOCATOR as is. >>> >>> >>>> On Jan 11, 2022, at 9:43 PM, Matthew Knepley wrote: >>>> >>>> On Tue, Jan 11, 2022 at 12:09 PM Patrick Sanan > wrote: >>>> Working on doing this incrementally, in progress here: https://gitlab.com/petsc/petsc/-/merge_requests/4712 >>>> >>>> This works in 1D for AIJ matrices, assembling a matrix with a maximal number of zero entries as dictated by the stencil width (which is intended to be very very close to what DMDA would do if you >>>> associated all the unknowns with a particular grid point, which is the way DMStag largely works under the hood). >>>> >>>> Dave, before I get into it, am I correct in my understanding that MATPREALLOCATOR would be better here because you would avoid superfluous zeros in the sparsity pattern, >>>> because this routine wouldn't have to assemble the Mat returned by DMCreateMatrix()? >>>> >>>> Yes, here is how it works. You throw in all the nonzeros you come across. Preallocator is a hash table that can check for duplicates. At the end, it returns the sparsity pattern. 
>>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> If this seems like a sane way to go, I will continue to add some more tests (in particular periodic BCs not tested yet) and add the code for 2D and 3D. >>>> >>>> >>>> >>>> Am Mo., 13. Dez. 2021 um 20:17 Uhr schrieb Dave May >: >>>> >>>> >>>> On Mon, 13 Dec 2021 at 20:13, Matthew Knepley > wrote: >>>> On Mon, Dec 13, 2021 at 1:52 PM Dave May > wrote: >>>> On Mon, 13 Dec 2021 at 19:29, Matthew Knepley > wrote: >>>> On Mon, Dec 13, 2021 at 1:16 PM Dave May > wrote: >>>> >>>> >>>> On Sat 11. Dec 2021 at 22:28, Matthew Knepley > wrote: >>>> On Sat, Dec 11, 2021 at 1:58 PM Tang, Qi > wrote: >>>> Hi, >>>> Does anyone have comment on finite difference coloring with DMStag? We are using DMStag and TS to evolve some nonlinear equations implicitly. It would be helpful to have the coloring Jacobian option with that. >>>> >>>> Since DMStag produces the Jacobian connectivity, >>>> >>>> This is incorrect. >>>> The DMCreateMatrix implementation for DMSTAG only sets the number of nonzeros (very inaccurately). It does not insert any zero values and thus the nonzero structure is actually not defined. >>>> That is why coloring doesn?t work. >>>> >>>> Ah, thanks Dave. >>>> >>>> Okay, we should fix that.It is perfectly possible to compute the nonzero pattern from the DMStag information. >>>> >>>> Agreed. The API for DMSTAG is complete enough to enable one to >>>> loop over the cells, and for all quantities defined on the cell (centre, face, vertex), >>>> insert values into the appropriate slot in the matrix. >>>> Combined with MATPREALLOCATOR, I believe a compact and readable >>>> code should be possible to write for the preallocation (cf DMDA). >>>> >>>> I think the only caveat with the approach of using all quantities defined on the cell is >>>> It may slightly over allocate depending on how the user wishes to impose the boundary condition, >>>> or slightly over allocate for says Stokes where there is no pressure-pressure coupling term. >>>> >>>> Yes, and would not handle higher order stencils.I think the overallocating is livable for the first imeplementation. >>>> >>>> >>>> Sure, but neither does DMDA. >>>> >>>> The user always has to know what they are doing and set the stencil width accordingly. >>>> I actually had this point listed in my initial email (and the stencil growth issue when using FD for nonlinear problems), >>>> however I deleted it as all the same issue exist in DMDA and no one complains (at least not loudly) :D >>>> >>>> >>>> >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> Thanks, >>>> Dave >>>> >>>> >>>> Paging Patrick :) >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> Thanks, >>>> Dave >>>> >>>> >>>> you can use -snes_fd_color_use_mat. It has many options. Here is an example of us using that: >>>> >>>> https://gitlab.com/petsc/petsc/-/blob/main/src/snes/tutorials/ex19.c#L898 >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> Thanks, >>>> Qi >>>> >>>> >>>>> On Oct 15, 2021, at 3:07 PM, Jorti, Zakariae via petsc-users > wrote: >>>>> >>>>> Hello, >>>>> >>>>> Does the Jacobian approximation using coloring and finite differencing of the function evaluation work in DMStag? >>>>> Thank you. >>>>> Best regards, >>>>> >>>>> Zakariae >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
>>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ From bsmith at petsc.dev Wed Jan 12 01:48:25 2022 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 12 Jan 2022 02:48:25 -0500 Subject: [petsc-users] Finite difference approximation of Jacobian In-Reply-To: <87y23lquzl.fsf@jedbrown.org> References: <231abd15aab544f9850826cb437366f7@lanl.gov> <877db5se57.fsf@jedbrown.org> <87y23lquzl.fsf@jedbrown.org> Message-ID: > On Jan 12, 2022, at 2:22 AM, Jed Brown wrote: > > Because if a user jumps right into MatSetValues() and then wants to use the result, it better have the values. It's true that DM preallocation normally just "inserts zeros" and thus matches what MATPREALLOCATOR does. I am not saying the user jumps right into MatSetValues(). I am saying they do a "value" free assembly to have the preallocation set up and then they do the first real assembly; hence very much JUST a refactorization of MATPREALLOCATE. So the user is "preallocation-aware" but does not have to do anything difficult to determine their pattern analytically. Of course handling the values also is fine and simplifies user code and the conceptual ideas (since preallocation ceases to exist as a concept for users). I have no problem with skipping the "JUST a refactorization of MATPREALLOCATE code" but for Patrick's current pressing need I think a refactorization of MATPREALLOCATE would be fastest to develop hence I suggested it. I did not look at Jed's branch but one way to eliminate the preallocation part completely would be to have something like MATHASH and then when the first assembly is complete do a MatConvert to AIJ or whatever. The conversion could be hidden inside the assemblyend of the hash so the user need not be aware that something different was done the first time through. Doing it this way would easily allow multiple MATHASH* implementations (written by different people) that would compete on speed and memory usage. So there could be MatSetType() and a new MatSetInitialType() where the initial type would be automatically set to some MATHASH if preallocation info is not provided. > > Barry Smith writes: > >> Why does it need to handle values? >> >>> On Jan 12, 2022, at 12:43 AM, Jed Brown wrote: >>> >>> I agree with this and even started a branch jed/mat-hash in (yikes!) 2017. I think it should be default if no preallocation functions are called. But it isn't exactly the MATPREALLOCATOR code because it needs to handle values too. Should not be a lot of code and will essentially remove this FAQ and one of the most irritating subtle aspects of new codes using PETSc matrices. 
>>> >>> https://petsc.org/release/faq/#assembling-large-sparse-matrices-takes-a-long-time-what-can-i-do-to-make-this-process-faster-or-matsetvalues-is-so-slow-what-can-i-do-to-speed-it-up >>> >>> Barry Smith writes: >>> >>>> I think the MATPREALLOCATOR as a MatType is a cumbersome strange thing and would prefer it was just functionality that Mat provided directly; for example MatSetOption(mat, preallocator_mode,true); matsetvalues,... MatAssemblyBegin/End. Now the matrix has its proper nonzero structure of whatever type the user set initially, aij, baij, sbaij, .... And the user can now use it efficiently. >>>> >>>> Barry >>>> >>>> So turning on the option just swaps out temporarily the operations for MatSetValues and AssemblyBegin/End to be essentially those in MATPREALLOCATOR. The refactorization should take almost no time and would be faster than trying to rig dmstag to use MATPREALLOCATOR as is. >>>> >>>> >>>>> On Jan 11, 2022, at 9:43 PM, Matthew Knepley wrote: >>>>> >>>>> On Tue, Jan 11, 2022 at 12:09 PM Patrick Sanan > wrote: >>>>> Working on doing this incrementally, in progress here: https://gitlab.com/petsc/petsc/-/merge_requests/4712 >>>>> >>>>> This works in 1D for AIJ matrices, assembling a matrix with a maximal number of zero entries as dictated by the stencil width (which is intended to be very very close to what DMDA would do if you >>>>> associated all the unknowns with a particular grid point, which is the way DMStag largely works under the hood). >>>>> >>>>> Dave, before I get into it, am I correct in my understanding that MATPREALLOCATOR would be better here because you would avoid superfluous zeros in the sparsity pattern, >>>>> because this routine wouldn't have to assemble the Mat returned by DMCreateMatrix()? >>>>> >>>>> Yes, here is how it works. You throw in all the nonzeros you come across. Preallocator is a hash table that can check for duplicates. At the end, it returns the sparsity pattern. >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> If this seems like a sane way to go, I will continue to add some more tests (in particular periodic BCs not tested yet) and add the code for 2D and 3D. >>>>> >>>>> >>>>> >>>>> Am Mo., 13. Dez. 2021 um 20:17 Uhr schrieb Dave May >: >>>>> >>>>> >>>>> On Mon, 13 Dec 2021 at 20:13, Matthew Knepley > wrote: >>>>> On Mon, Dec 13, 2021 at 1:52 PM Dave May > wrote: >>>>> On Mon, 13 Dec 2021 at 19:29, Matthew Knepley > wrote: >>>>> On Mon, Dec 13, 2021 at 1:16 PM Dave May > wrote: >>>>> >>>>> >>>>> On Sat 11. Dec 2021 at 22:28, Matthew Knepley > wrote: >>>>> On Sat, Dec 11, 2021 at 1:58 PM Tang, Qi > wrote: >>>>> Hi, >>>>> Does anyone have comment on finite difference coloring with DMStag? We are using DMStag and TS to evolve some nonlinear equations implicitly. It would be helpful to have the coloring Jacobian option with that. >>>>> >>>>> Since DMStag produces the Jacobian connectivity, >>>>> >>>>> This is incorrect. >>>>> The DMCreateMatrix implementation for DMSTAG only sets the number of nonzeros (very inaccurately). It does not insert any zero values and thus the nonzero structure is actually not defined. >>>>> That is why coloring doesn?t work. >>>>> >>>>> Ah, thanks Dave. >>>>> >>>>> Okay, we should fix that.It is perfectly possible to compute the nonzero pattern from the DMStag information. >>>>> >>>>> Agreed. The API for DMSTAG is complete enough to enable one to >>>>> loop over the cells, and for all quantities defined on the cell (centre, face, vertex), >>>>> insert values into the appropriate slot in the matrix. 
>>>>> Combined with MATPREALLOCATOR, I believe a compact and readable >>>>> code should be possible to write for the preallocation (cf DMDA). >>>>> >>>>> I think the only caveat with the approach of using all quantities defined on the cell is >>>>> It may slightly over allocate depending on how the user wishes to impose the boundary condition, >>>>> or slightly over allocate for says Stokes where there is no pressure-pressure coupling term. >>>>> >>>>> Yes, and would not handle higher order stencils.I think the overallocating is livable for the first imeplementation. >>>>> >>>>> >>>>> Sure, but neither does DMDA. >>>>> >>>>> The user always has to know what they are doing and set the stencil width accordingly. >>>>> I actually had this point listed in my initial email (and the stencil growth issue when using FD for nonlinear problems), >>>>> however I deleted it as all the same issue exist in DMDA and no one complains (at least not loudly) :D >>>>> >>>>> >>>>> >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> Thanks, >>>>> Dave >>>>> >>>>> >>>>> Paging Patrick :) >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> Thanks, >>>>> Dave >>>>> >>>>> >>>>> you can use -snes_fd_color_use_mat. It has many options. Here is an example of us using that: >>>>> >>>>> https://gitlab.com/petsc/petsc/-/blob/main/src/snes/tutorials/ex19.c#L898 >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> Thanks, >>>>> Qi >>>>> >>>>> >>>>>> On Oct 15, 2021, at 3:07 PM, Jorti, Zakariae via petsc-users > wrote: >>>>>> >>>>>> Hello, >>>>>> >>>>>> Does the Jacobian approximation using coloring and finite differencing of the function evaluation work in DMStag? >>>>>> Thank you. >>>>>> Best regards, >>>>>> >>>>>> Zakariae >>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ From bsmith at petsc.dev Wed Jan 12 01:53:09 2022 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 12 Jan 2022 02:53:09 -0500 Subject: [petsc-users] Finite difference approximation of Jacobian In-Reply-To: References: <231abd15aab544f9850826cb437366f7@lanl.gov> <877db5se57.fsf@jedbrown.org> <87y23lquzl.fsf@jedbrown.org> Message-ID: <249DED57-7AA6-4748-A15A-0B8DDFBC5B85@petsc.dev> Actually given the Subject of this email "Finite difference approximation of Jacobian" what I suggested is A complete story anyways for Patrick since the user cannot provide numerical values (of course Patrick could write a general new hash matrix type and then use it with zeros but that is asking a lot of him). Regarding Fande's needs. 
One could rig things so that later one could "flip" back the matrix to again use the hasher for setting values when the contacts change. > On Jan 12, 2022, at 2:48 AM, Barry Smith wrote: > > > >> On Jan 12, 2022, at 2:22 AM, Jed Brown wrote: >> >> Because if a user jumps right into MatSetValues() and then wants to use the result, it better have the values. It's true that DM preallocation normally just "inserts zeros" and thus matches what MATPREALLOCATOR does. > > I am not saying the user jumps right into MatSetValues(). I am saying they do a "value" free assembly to have the preallocation set up and then they do the first real assembly; hence very much JUST a refactorization of MATPREALLOCATE. So the user is "preallocation-aware" but does not have to do anything difficult to determine their pattern analytically. > > Of course handling the values also is fine and simplifies user code and the conceptual ideas (since preallocation ceases to exist as a concept for users). I have no problem with skipping the "JUST a refactorization of MATPREALLOCATE code" but for Patrick's current pressing need I think a refactorization of MATPREALLOCATE would be fastest to develop hence I suggested it. > > I did not look at Jed's branch but one way to eliminate the preallocation part completely would be to have something like MATHASH and then when the first assembly is complete do a MatConvert to AIJ or whatever. The conversion could be hidden inside the assemblyend of the hash so the user need not be aware that something different was done the first time through. Doing it this way would easily allow multiple MATHASH* implementations (written by different people) that would compete on speed and memory usage. So there could be MatSetType() and a new MatSetInitialType() where the initial type would be automatically set to some MATHASH if preallocation info is not provided. > >> >> Barry Smith writes: >> >>> Why does it need to handle values? >>> >>>> On Jan 12, 2022, at 12:43 AM, Jed Brown wrote: >>>> >>>> I agree with this and even started a branch jed/mat-hash in (yikes!) 2017. I think it should be default if no preallocation functions are called. But it isn't exactly the MATPREALLOCATOR code because it needs to handle values too. Should not be a lot of code and will essentially remove this FAQ and one of the most irritating subtle aspects of new codes using PETSc matrices. >>>> >>>> https://petsc.org/release/faq/#assembling-large-sparse-matrices-takes-a-long-time-what-can-i-do-to-make-this-process-faster-or-matsetvalues-is-so-slow-what-can-i-do-to-speed-it-up >>>> >>>> Barry Smith writes: >>>> >>>>> I think the MATPREALLOCATOR as a MatType is a cumbersome strange thing and would prefer it was just functionality that Mat provided directly; for example MatSetOption(mat, preallocator_mode,true); matsetvalues,... MatAssemblyBegin/End. Now the matrix has its proper nonzero structure of whatever type the user set initially, aij, baij, sbaij, .... And the user can now use it efficiently. >>>>> >>>>> Barry >>>>> >>>>> So turning on the option just swaps out temporarily the operations for MatSetValues and AssemblyBegin/End to be essentially those in MATPREALLOCATOR. The refactorization should take almost no time and would be faster than trying to rig dmstag to use MATPREALLOCATOR as is. 
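As a sketch only, the user-facing side of the proposal above might look like the following; MAT_PREALLOCATION_ONLY is a placeholder name for the suggested MatSetOption() mode and is not an existing PETSc option:

/* Hypothetical two-pass assembly; MAT_PREALLOCATION_ONLY is a placeholder,
   NOT an existing PETSc option, and insert_stencil() stands for the user's
   usual MatSetValues() loop. */
ierr = MatSetOption(A, MAT_PREALLOCATION_ONLY, PETSC_TRUE);CHKERRQ(ierr);   /* hypothetical */
insert_stencil(A);                                            /* pass 1: only the pattern is recorded */
ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);   /* structure fixed, storage allocated */
insert_stencil(A);                                            /* pass 2: ordinary, efficient assembly */
ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);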
>>>>> >>>>> >>>>>> On Jan 11, 2022, at 9:43 PM, Matthew Knepley wrote: >>>>>> >>>>>> On Tue, Jan 11, 2022 at 12:09 PM Patrick Sanan > wrote: >>>>>> Working on doing this incrementally, in progress here: https://gitlab.com/petsc/petsc/-/merge_requests/4712 >>>>>> >>>>>> This works in 1D for AIJ matrices, assembling a matrix with a maximal number of zero entries as dictated by the stencil width (which is intended to be very very close to what DMDA would do if you >>>>>> associated all the unknowns with a particular grid point, which is the way DMStag largely works under the hood). >>>>>> >>>>>> Dave, before I get into it, am I correct in my understanding that MATPREALLOCATOR would be better here because you would avoid superfluous zeros in the sparsity pattern, >>>>>> because this routine wouldn't have to assemble the Mat returned by DMCreateMatrix()? >>>>>> >>>>>> Yes, here is how it works. You throw in all the nonzeros you come across. Preallocator is a hash table that can check for duplicates. At the end, it returns the sparsity pattern. >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Matt >>>>>> >>>>>> If this seems like a sane way to go, I will continue to add some more tests (in particular periodic BCs not tested yet) and add the code for 2D and 3D. >>>>>> >>>>>> >>>>>> >>>>>> Am Mo., 13. Dez. 2021 um 20:17 Uhr schrieb Dave May >: >>>>>> >>>>>> >>>>>> On Mon, 13 Dec 2021 at 20:13, Matthew Knepley > wrote: >>>>>> On Mon, Dec 13, 2021 at 1:52 PM Dave May > wrote: >>>>>> On Mon, 13 Dec 2021 at 19:29, Matthew Knepley > wrote: >>>>>> On Mon, Dec 13, 2021 at 1:16 PM Dave May > wrote: >>>>>> >>>>>> >>>>>> On Sat 11. Dec 2021 at 22:28, Matthew Knepley > wrote: >>>>>> On Sat, Dec 11, 2021 at 1:58 PM Tang, Qi > wrote: >>>>>> Hi, >>>>>> Does anyone have comment on finite difference coloring with DMStag? We are using DMStag and TS to evolve some nonlinear equations implicitly. It would be helpful to have the coloring Jacobian option with that. >>>>>> >>>>>> Since DMStag produces the Jacobian connectivity, >>>>>> >>>>>> This is incorrect. >>>>>> The DMCreateMatrix implementation for DMSTAG only sets the number of nonzeros (very inaccurately). It does not insert any zero values and thus the nonzero structure is actually not defined. >>>>>> That is why coloring doesn?t work. >>>>>> >>>>>> Ah, thanks Dave. >>>>>> >>>>>> Okay, we should fix that.It is perfectly possible to compute the nonzero pattern from the DMStag information. >>>>>> >>>>>> Agreed. The API for DMSTAG is complete enough to enable one to >>>>>> loop over the cells, and for all quantities defined on the cell (centre, face, vertex), >>>>>> insert values into the appropriate slot in the matrix. >>>>>> Combined with MATPREALLOCATOR, I believe a compact and readable >>>>>> code should be possible to write for the preallocation (cf DMDA). >>>>>> >>>>>> I think the only caveat with the approach of using all quantities defined on the cell is >>>>>> It may slightly over allocate depending on how the user wishes to impose the boundary condition, >>>>>> or slightly over allocate for says Stokes where there is no pressure-pressure coupling term. >>>>>> >>>>>> Yes, and would not handle higher order stencils.I think the overallocating is livable for the first imeplementation. >>>>>> >>>>>> >>>>>> Sure, but neither does DMDA. >>>>>> >>>>>> The user always has to know what they are doing and set the stencil width accordingly. 
>>>>>> I actually had this point listed in my initial email (and the stencil growth issue when using FD for nonlinear problems), >>>>>> however I deleted it as all the same issue exist in DMDA and no one complains (at least not loudly) :D >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Matt >>>>>> >>>>>> Thanks, >>>>>> Dave >>>>>> >>>>>> >>>>>> Paging Patrick :) >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Matt >>>>>> >>>>>> Thanks, >>>>>> Dave >>>>>> >>>>>> >>>>>> you can use -snes_fd_color_use_mat. It has many options. Here is an example of us using that: >>>>>> >>>>>> https://gitlab.com/petsc/petsc/-/blob/main/src/snes/tutorials/ex19.c#L898 >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Matt >>>>>> >>>>>> Thanks, >>>>>> Qi >>>>>> >>>>>> >>>>>>> On Oct 15, 2021, at 3:07 PM, Jorti, Zakariae via petsc-users > wrote: >>>>>>> >>>>>>> Hello, >>>>>>> >>>>>>> Does the Jacobian approximation using coloring and finite differencing of the function evaluation work in DMStag? >>>>>>> Thank you. >>>>>>> Best regards, >>>>>>> >>>>>>> Zakariae >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>>> https://www.cse.buffalo.edu/~knepley/ > From varunhiremath at gmail.com Wed Jan 12 02:59:50 2022 From: varunhiremath at gmail.com (Varun Hiremath) Date: Wed, 12 Jan 2022 00:59:50 -0800 Subject: [petsc-users] PETSc MUMPS interface Message-ID: Hi All, I want to collect MUMPS memory estimates based on the initial symbolic factorization analysis before the actual numerical factorization starts to check if the estimated memory requirements fit the available memory. I am following the steps from https://petsc.org/main/src/ksp/ksp/tutorials/ex52.c.html PCFactorSetMatSolverType(pc,MATSOLVERMUMPS); PCFactorSetUpMatSolverType(pc); PCFactorGetMatrix(pc,&F); KSPSetUp(ksp); MatMumpsGetInfog(F,...) But it appears KSPSetUp calls both symbolic and numerical factorization. So is there some other way to get these statistics before the actual factorization starts? Thanks, Varun -------------- next part -------------- An HTML attachment was scrubbed... URL: From thibault.bridelbertomeu at gmail.com Wed Jan 12 04:27:35 2022 From: thibault.bridelbertomeu at gmail.com (Thibault Bridel-Bertomeu) Date: Wed, 12 Jan 2022 11:27:35 +0100 Subject: [petsc-users] Fluid-Structure interaction with multiple DMPlex In-Reply-To: <874k69sdph.fsf@jedbrown.org> References: <87wnj8tvl9.fsf@jedbrown.org> <87zgo2s2qy.fsf@jedbrown.org> <874k69sdph.fsf@jedbrown.org> Message-ID: Le mer. 12 janv. 2022 ? 
06:52, Jed Brown a ?crit : > Matthew Knepley writes: > > > Hmm, let's start here because that does not make sense to me. Can you > write > > down the mathematical model? Usually I think of temperature as being > > a continuous field. I have only seen discontinuities for rareified gases. > > Well, you can still model it as a discontinuous field with a flux derived > from a jump condition similar to DG. I think that's a reasonable choice > when the fluid uses conservative variables and/or a function space that > does not have boundary nodes (like FV or DG). > > But you should check the literature to identify a formulation you're > comfortable with and then we can advise on how to implement it with DMPlex. > OK so I checked with my colleague and it turns out I was wrong about the interface condition : we want a perfect thermal contact that ensures both temperature and heat flux continuity. The PDF attached summarizes the PDEs that we have to solve in the fluid and the solid, along with the interface boundary condition (the boundary conditions on other parts of the borders are "classical" boundary conditions for fluid - inlet, outlet, freestream - and for solid - pure neumann for instance a.k.a. insulation -). Thank you for your advice !! Cheers, Thibault -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Memo_Transient_Conjugate_Heat_Transfer.pdf Type: application/pdf Size: 198858 bytes Desc: not available URL: From patrick.sanan at gmail.com Wed Jan 12 07:53:06 2022 From: patrick.sanan at gmail.com (Patrick Sanan) Date: Wed, 12 Jan 2022 14:53:06 +0100 Subject: [petsc-users] Finite difference approximation of Jacobian In-Reply-To: <249DED57-7AA6-4748-A15A-0B8DDFBC5B85@petsc.dev> References: <231abd15aab544f9850826cb437366f7@lanl.gov> <877db5se57.fsf@jedbrown.org> <87y23lquzl.fsf@jedbrown.org> <249DED57-7AA6-4748-A15A-0B8DDFBC5B85@petsc.dev> Message-ID: Thanks a lot, all! So given that there's still some debate about whether we should even use MATPREALLOCATOR or a better integration of that hash logic , as in Issue 852, I'll proceed with simply aping what DMDA does (with apologies for all this code duplication). One thing I had missed, which I just added, is respect for DMSetMatrixPreallocation() / -dm_preallocate_only . Now, when that's activated, the previous behavior should be recovered, but by default it'll assemble a matrix which can be used for coloring, as is the main objective of this thread. Am Mi., 12. Jan. 2022 um 08:53 Uhr schrieb Barry Smith : > > Actually given the Subject of this email "Finite difference > approximation of Jacobian" what I suggested is A complete story anyways for > Patrick since the user cannot provide numerical values (of course Patrick > could write a general new hash matrix type and then use it with zeros but > that is asking a lot of him). > > Regarding Fande's needs. One could rig things so that later one could > "flip" back the matrix to again use the hasher for setting values when the > contacts change. > > > On Jan 12, 2022, at 2:48 AM, Barry Smith wrote: > > > > > > > >> On Jan 12, 2022, at 2:22 AM, Jed Brown wrote: > >> > >> Because if a user jumps right into MatSetValues() and then wants to use > the result, it better have the values. It's true that DM preallocation > normally just "inserts zeros" and thus matches what MATPREALLOCATOR does. > > > > I am not saying the user jumps right into MatSetValues(). 
I am saying > they do a "value" free assembly to have the preallocation set up and then > they do the first real assembly; hence very much JUST a refactorization of > MATPREALLOCATE. So the user is "preallocation-aware" but does not have to > do anything difficult to determine their pattern analytically. > > > > Of course handling the values also is fine and simplifies user code and > the conceptual ideas (since preallocation ceases to exist as a concept for > users). I have no problem with skipping the "JUST a refactorization of > MATPREALLOCATE code" but for Patrick's current pressing need I think a > refactorization of MATPREALLOCATE would be fastest to develop hence I > suggested it. > > > > I did not look at Jed's branch but one way to eliminate the > preallocation part completely would be to have something like MATHASH and > then when the first assembly is complete do a MatConvert to AIJ or > whatever. The conversion could be hidden inside the assemblyend of the > hash so the user need not be aware that something different was done the > first time through. Doing it this way would easily allow multiple MATHASH* > implementations (written by different people) that would compete on speed > and memory usage. So there could be MatSetType() and a new > MatSetInitialType() where the initial type would be automatically set to > some MATHASH if preallocation info is not provided. > > > >> > >> Barry Smith writes: > >> > >>> Why does it need to handle values? > >>> > >>>> On Jan 12, 2022, at 12:43 AM, Jed Brown wrote: > >>>> > >>>> I agree with this and even started a branch jed/mat-hash in (yikes!) > 2017. I think it should be default if no preallocation functions are > called. But it isn't exactly the MATPREALLOCATOR code because it needs to > handle values too. Should not be a lot of code and will essentially remove > this FAQ and one of the most irritating subtle aspects of new codes using > PETSc matrices. > >>>> > >>>> > https://petsc.org/release/faq/#assembling-large-sparse-matrices-takes-a-long-time-what-can-i-do-to-make-this-process-faster-or-matsetvalues-is-so-slow-what-can-i-do-to-speed-it-up > >>>> > >>>> Barry Smith writes: > >>>> > >>>>> I think the MATPREALLOCATOR as a MatType is a cumbersome strange > thing and would prefer it was just functionality that Mat provided > directly; for example MatSetOption(mat, preallocator_mode,true); > matsetvalues,... MatAssemblyBegin/End. Now the matrix has its proper > nonzero structure of whatever type the user set initially, aij, baij, > sbaij, .... And the user can now use it efficiently. > >>>>> > >>>>> Barry > >>>>> > >>>>> So turning on the option just swaps out temporarily the operations > for MatSetValues and AssemblyBegin/End to be essentially those in > MATPREALLOCATOR. The refactorization should take almost no time and would > be faster than trying to rig dmstag to use MATPREALLOCATOR as is. 
> >>>>> > >>>>> > >>>>>> On Jan 11, 2022, at 9:43 PM, Matthew Knepley > wrote: > >>>>>> > >>>>>> On Tue, Jan 11, 2022 at 12:09 PM Patrick Sanan < > patrick.sanan at gmail.com > wrote: > >>>>>> Working on doing this incrementally, in progress here: > https://gitlab.com/petsc/petsc/-/merge_requests/4712 < > https://gitlab.com/petsc/petsc/-/merge_requests/4712> > >>>>>> > >>>>>> This works in 1D for AIJ matrices, assembling a matrix with a > maximal number of zero entries as dictated by the stencil width (which is > intended to be very very close to what DMDA would do if you > >>>>>> associated all the unknowns with a particular grid point, which is > the way DMStag largely works under the hood). > >>>>>> > >>>>>> Dave, before I get into it, am I correct in my understanding that > MATPREALLOCATOR would be better here because you would avoid superfluous > zeros in the sparsity pattern, > >>>>>> because this routine wouldn't have to assemble the Mat returned by > DMCreateMatrix()? > >>>>>> > >>>>>> Yes, here is how it works. You throw in all the nonzeros you come > across. Preallocator is a hash table that can check for duplicates. At the > end, it returns the sparsity pattern. > >>>>>> > >>>>>> Thanks, > >>>>>> > >>>>>> Matt > >>>>>> > >>>>>> If this seems like a sane way to go, I will continue to add some > more tests (in particular periodic BCs not tested yet) and add the code for > 2D and 3D. > >>>>>> > >>>>>> > >>>>>> > >>>>>> Am Mo., 13. Dez. 2021 um 20:17 Uhr schrieb Dave May < > dave.mayhem23 at gmail.com >: > >>>>>> > >>>>>> > >>>>>> On Mon, 13 Dec 2021 at 20:13, Matthew Knepley > wrote: > >>>>>> On Mon, Dec 13, 2021 at 1:52 PM Dave May > wrote: > >>>>>> On Mon, 13 Dec 2021 at 19:29, Matthew Knepley > wrote: > >>>>>> On Mon, Dec 13, 2021 at 1:16 PM Dave May > wrote: > >>>>>> > >>>>>> > >>>>>> On Sat 11. Dec 2021 at 22:28, Matthew Knepley > wrote: > >>>>>> On Sat, Dec 11, 2021 at 1:58 PM Tang, Qi tangqi at msu.edu>> wrote: > >>>>>> Hi, > >>>>>> Does anyone have comment on finite difference coloring with DMStag? > We are using DMStag and TS to evolve some nonlinear equations implicitly. > It would be helpful to have the coloring Jacobian option with that. > >>>>>> > >>>>>> Since DMStag produces the Jacobian connectivity, > >>>>>> > >>>>>> This is incorrect. > >>>>>> The DMCreateMatrix implementation for DMSTAG only sets the number > of nonzeros (very inaccurately). It does not insert any zero values and > thus the nonzero structure is actually not defined. > >>>>>> That is why coloring doesn?t work. > >>>>>> > >>>>>> Ah, thanks Dave. > >>>>>> > >>>>>> Okay, we should fix that.It is perfectly possible to compute the > nonzero pattern from the DMStag information. > >>>>>> > >>>>>> Agreed. The API for DMSTAG is complete enough to enable one to > >>>>>> loop over the cells, and for all quantities defined on the cell > (centre, face, vertex), > >>>>>> insert values into the appropriate slot in the matrix. > >>>>>> Combined with MATPREALLOCATOR, I believe a compact and readable > >>>>>> code should be possible to write for the preallocation (cf DMDA). > >>>>>> > >>>>>> I think the only caveat with the approach of using all quantities > defined on the cell is > >>>>>> It may slightly over allocate depending on how the user wishes to > impose the boundary condition, > >>>>>> or slightly over allocate for says Stokes where there is no > pressure-pressure coupling term. 
> >>>>>> > >>>>>> Yes, and would not handle higher order stencils.I think the > overallocating is livable for the first imeplementation. > >>>>>> > >>>>>> > >>>>>> Sure, but neither does DMDA. > >>>>>> > >>>>>> The user always has to know what they are doing and set the stencil > width accordingly. > >>>>>> I actually had this point listed in my initial email (and the > stencil growth issue when using FD for nonlinear problems), > >>>>>> however I deleted it as all the same issue exist in DMDA and no one > complains (at least not loudly) :D > >>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> Thanks, > >>>>>> > >>>>>> Matt > >>>>>> > >>>>>> Thanks, > >>>>>> Dave > >>>>>> > >>>>>> > >>>>>> Paging Patrick :) > >>>>>> > >>>>>> Thanks, > >>>>>> > >>>>>> Matt > >>>>>> > >>>>>> Thanks, > >>>>>> Dave > >>>>>> > >>>>>> > >>>>>> you can use -snes_fd_color_use_mat. It has many options. Here is an > example of us using that: > >>>>>> > >>>>>> > https://gitlab.com/petsc/petsc/-/blob/main/src/snes/tutorials/ex19.c#L898 > > > >>>>>> > >>>>>> Thanks, > >>>>>> > >>>>>> Matt > >>>>>> > >>>>>> Thanks, > >>>>>> Qi > >>>>>> > >>>>>> > >>>>>>> On Oct 15, 2021, at 3:07 PM, Jorti, Zakariae via petsc-users < > petsc-users at mcs.anl.gov > wrote: > >>>>>>> > >>>>>>> Hello, > >>>>>>> > >>>>>>> Does the Jacobian approximation using coloring and finite > differencing of the function evaluation work in DMStag? > >>>>>>> Thank you. > >>>>>>> Best regards, > >>>>>>> > >>>>>>> Zakariae > >>>>>> > >>>>>> > >>>>>> > >>>>>> -- > >>>>>> What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > >>>>>> -- Norbert Wiener > >>>>>> > >>>>>> https://www.cse.buffalo.edu/~knepley/ < > http://www.cse.buffalo.edu/~knepley/> > >>>>>> > >>>>>> > >>>>>> -- > >>>>>> What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > >>>>>> -- Norbert Wiener > >>>>>> > >>>>>> https://www.cse.buffalo.edu/~knepley/ < > http://www.cse.buffalo.edu/~knepley/> > >>>>>> > >>>>>> > >>>>>> -- > >>>>>> What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > >>>>>> -- Norbert Wiener > >>>>>> > >>>>>> https://www.cse.buffalo.edu/~knepley/ < > http://www.cse.buffalo.edu/~knepley/> > >>>>>> > >>>>>> > >>>>>> -- > >>>>>> What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > >>>>>> -- Norbert Wiener > >>>>>> > >>>>>> https://www.cse.buffalo.edu/~knepley/ < > http://www.cse.buffalo.edu/~knepley/> > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From marco.cisternino at optimad.it Wed Jan 12 08:54:19 2022 From: marco.cisternino at optimad.it (Marco Cisternino) Date: Wed, 12 Jan 2022 14:54:19 +0000 Subject: [petsc-users] Nullspaces In-Reply-To: <45645EB2-6EA9-4B0E-9F87-1C64943DE8EA@petsc.dev> References: <34A686AA-D337-484B-9EB3-A01C7565AD48@dsic.upv.es> <45645EB2-6EA9-4B0E-9F87-1C64943DE8EA@petsc.dev> Message-ID: Hello Barry, To answer your question, both the eigenvectors contain only two values: the eigenvectors entries are different in the two eigenvectors but coherent with the belonging of the entry to the sub-domains. 
However, I was able to get the same behavior of the MatTestNullSpace using the two ways of creating the null space. To summarize the issue, I did: - MatNullSpaceCreate(getCommunicator(), PETSC_TRUE, 0, nullptr, &nullspace) (CASE 1) - Vec* nsp; VecDuplicateVecs(m_rhs, 1, &nsp); VecSet(nsp[0],1.0); VecNormalize(nsp[0], nullptr); MatNullSpaceCreate(getCommunicator(), PETSC_FALSE, 1, nsp, &nullspace); (CASE 2) and then I tested with MatNullSpaceTest(nullspace, m_A, &isNullSpaceValid). CASE 1 gave isNullSpaceValid=true but CASE 2 gave isNullSpaceValid=false. I found the problem is the normalization of the Vec nsp[0]. Modifying CASE 2 like this: Vec* nsp; VecDuplicateVecs(m_rhs, 1, &nsp); PetscInt N; VecGetSize(nsp[0],&N); VecSet(nsp[0],1.0 / N); MatNullSpaceCreate(getCommunicator(), PETSC_FALSE, 1, nsp, &nullspace); I can get isNullSpaceValid= true even for CASE 2. But most importantly, I can get isNullSpaceValid=true even creating the true 2-dimensional null space with explicit normalization (using the number of non-zero entries sub-domain per sub-domain) and not VecNormalize: MatNullSpaceCreate(getCommunicator(), PETSC_FALSE, nConstants, constants, &nullspace); where nConstants=2 and constants[0] (constants[1]) has been set with 1/N0 (1/N1) in entries relative to sub-domain 0 (1). I going to check which is the impact on CFD solution. Any comment on this would be really appreciated. Thank you all. Marco Cisternino -----Original Message----- From: Barry Smith Sent: mercoled? 5 gennaio 2022 23:17 To: Marco Cisternino Cc: Jose E. Roman ; petsc-users Subject: Re: [petsc-users] Nullspaces What do you get for the two eigenvectors ? > On Jan 5, 2022, at 11:21 AM, Marco Cisternino wrote: > > Hello Jose and Stefano. > Thank you, Jose for your hints. > I computed the two smallest eigenvalues of my operator and they are tiny but not zero. > The smallest 0 eigenvalue is = (4.71506e-08, 0) with abs error = > 3.95575e-07 The smallest 1 eigenvalue is = (1.95628e-07, 0) with abs > error = 4.048e-07 As Stefano remarked, I would have expected much tinier values, closer to zero. > Probably something is wrong in what I do: > EPS eps; > EPSCreate(PETSC_COMM_WORLD, &eps); > EPSSetOperators( eps, matrix, NULL ); > EPSSetWhichEigenpairs(eps, EPS_SMALLEST_MAGNITUDE); > EPSSetProblemType( eps, EPS_NHEP ); > EPSSetConvergenceTest(eps,EPS_CONV_ABS); > EPSSetTolerances(eps, 1.0e-10, 1000); > EPSSetDimensions(eps,2,PETSC_DEFAULT,PETSC_DEFAULT); > EPSSetFromOptions( eps ); > EPSSolve( eps ); > > Even commenting " EPSSetTolerances(eps, 1.0e-10, 1000);" and use default values, the results are exactly the same. > > Am I correctly computing the 2 smallest eigenvalues? > > They should be zeros but they are not. Any suggestions about how understanding why? > > In a previous email Mark remarked: "Also you say you divide by the cell volume. Maybe I am not understanding this but that is basically diagonal scaling and that will change the null space (ie, not a constant anymore)", therefore why does the null space built with MatNullSpaceCreate(PETSC_COMM_WORLD, PETSC_TRUE, 0, nullptr, &nullspace); passes the MatNullSpaceTest?? > > Thank you all! > > Marco Cisternino > > -----Original Message----- > From: Jose E. Roman > Sent: marted? 4 gennaio 2022 19:30 > To: Marco Cisternino > Cc: Stefano Zampini ; petsc-users > > Subject: Re: [petsc-users] Nullspaces > > To compute more than one eigenpair, call EPSSetDimensions(eps,nev,PETSC_DEFAULT,PETSC_DEFAULT). 
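For reference, the "sub-domain by sub-domain" constant null space Marco describes can be built roughly as follows. This is a minimal sketch rather than Marco's code; inDomain0() is an assumed, user-supplied query saying which sub-domain owns a given row. Because the two vectors have disjoint supports, normalizing each one already yields the orthonormal basis that MatNullSpaceCreate() expects.

#include <petscmat.h>

/* Two-vector constant null space for a domain made of two disconnected
   sub-domains: each vector is constant on the dofs of one sub-domain and
   zero elsewhere.  inDomain0() is hypothetical and application-specific.  */
PetscErrorCode BuildTwoDomainNullSpace(Mat A, PetscBool (*inDomain0)(PetscInt),
                                       MatNullSpace *nullspace, PetscBool *isValid)
{
  Vec            nsp[2];
  PetscInt       rstart, rend, row;
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = MatCreateVecs(A, &nsp[0], NULL);CHKERRQ(ierr);
  ierr = VecDuplicate(nsp[0], &nsp[1]);CHKERRQ(ierr);
  ierr = VecSet(nsp[0], 0.0);CHKERRQ(ierr);
  ierr = VecSet(nsp[1], 0.0);CHKERRQ(ierr);
  ierr = VecGetOwnershipRange(nsp[0], &rstart, &rend);CHKERRQ(ierr);
  for (row = rstart; row < rend; ++row) {
    const PetscInt d = inDomain0(row) ? 0 : 1;   /* which sub-domain owns this dof */
    ierr = VecSetValue(nsp[d], row, 1.0, INSERT_VALUES);CHKERRQ(ierr);
  }
  ierr = VecAssemblyBegin(nsp[0]);CHKERRQ(ierr);
  ierr = VecAssemblyEnd(nsp[0]);CHKERRQ(ierr);
  ierr = VecAssemblyBegin(nsp[1]);CHKERRQ(ierr);
  ierr = VecAssemblyEnd(nsp[1]);CHKERRQ(ierr);
  /* Disjoint supports => already orthogonal; normalize so the basis is
     orthonormal, which MatNullSpaceCreate() requires.                      */
  ierr = VecNormalize(nsp[0], NULL);CHKERRQ(ierr);
  ierr = VecNormalize(nsp[1], NULL);CHKERRQ(ierr);
  ierr = MatNullSpaceCreate(PetscObjectComm((PetscObject)A), PETSC_FALSE, 2, nsp, nullspace);CHKERRQ(ierr);
  ierr = MatNullSpaceTest(*nullspace, A, isValid);CHKERRQ(ierr);
  ierr = VecDestroy(&nsp[0]);CHKERRQ(ierr);      /* the null space keeps its own references */
  ierr = VecDestroy(&nsp[1]);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

The same MatNullSpaceTest() call used throughout this thread then reports whether the matrix accepts this basis.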
> > To compute zero eigenvalues you may want to use an absolute convergence criterion, with EPSSetConvergenceTest(eps,EPS_CONV_ABS), but then a tolerance of 1e-12 is probably too small. You can try without this, anyway. > > Jose > > >> El 4 ene 2022, a las 18:44, Marco Cisternino escribi?: >> >> Hello Stefano and thank you for your support. >> I never used SLEPc before but I did this: >> right after the matrix loading from file I added the following lines to my shared tiny code >> MatLoad(matrix, v); >> >> EPS eps; >> EPSCreate(PETSC_COMM_WORLD, &eps); >> EPSSetOperators( eps, matrix, NULL ); >> EPSSetWhichEigenpairs(eps, EPS_SMALLEST_MAGNITUDE); >> EPSSetProblemType( eps, EPS_NHEP ); >> EPSSetTolerances(eps, 1.0e-12, 1000); >> EPSSetFromOptions( eps ); >> EPSSolve( eps ); >> >> Vec xr, xi; /* eigenvector, x */ >> PetscScalar kr, ki; /* eigenvalue, k */ >> PetscInt j, nconv; >> PetscReal error; >> EPSGetConverged( eps, &nconv ); >> for (j=0; j> EPSGetEigenpair( eps, j, &kr, &ki, xr, xi ); >> EPSComputeError( eps, j, EPS_ERROR_ABSOLUTE, &error ); >> std::cout << "The smallest eigenvalue is = (" << kr << ", " << ki << ") with error = " << error << std::endl; >> } >> >> I launched using >> mpirun -n 1 ./testnullspace -eps_monitor >> >> and the output is >> >> 1 EPS nconv=0 first unconverged value (error) -1499.29 >> (6.57994794e+01) >> 2 EPS nconv=0 first unconverged value (error) -647.468 >> (5.39939262e+01) >> 3 EPS nconv=0 first unconverged value (error) -177.157 >> (9.49337698e+01) >> 4 EPS nconv=0 first unconverged value (error) 59.6771 >> (1.62531943e+02) >> 5 EPS nconv=0 first unconverged value (error) 41.755 >> (1.41965990e+02) >> 6 EPS nconv=0 first unconverged value (error) -11.5462 >> (3.60453662e+02) >> 7 EPS nconv=0 first unconverged value (error) -6.04493 >> (4.60890030e+02) >> 8 EPS nconv=0 first unconverged value (error) -22.7362 >> (8.67630086e+01) >> 9 EPS nconv=0 first unconverged value (error) -12.9637 >> (1.08507821e+02) >> 10 EPS nconv=0 first unconverged value (error) 7.7234 >> (1.53561979e+02) ? >> 111 EPS nconv=0 first unconverged value (error) -2.27e-08 >> (6.84762319e+00) >> 112 EPS nconv=0 first unconverged value (error) -2.60619e-08 >> (4.45245528e+00) >> 113 EPS nconv=0 first unconverged value (error) -5.49592e-09 >> (1.87798984e+01) >> 114 EPS nconv=0 first unconverged value (error) -9.9456e-09 >> (7.96711076e+00) >> 115 EPS nconv=0 first unconverged value (error) -1.89779e-08 >> (4.15471472e+00) >> 116 EPS nconv=0 first unconverged value (error) -2.05288e-08 >> (2.52953194e+00) >> 117 EPS nconv=0 first unconverged value (error) -2.02919e-08 >> (2.90090711e+00) >> 118 EPS nconv=0 first unconverged value (error) -3.8706e-08 >> (8.03595736e-01) >> 119 EPS nconv=1 first unconverged value (error) -61751.8 >> (9.58036571e-07) Computed 1 pairs The smallest eigenvalue is = >> (-3.8706e-08, 0) with error = 4.9707e-07 >> >> Am I using SLEPc in the right way at least for the first smallest eigenvalue? If I?m on the right way I can find out how to compute the second one. >> >> Thanks a lot >> >> Marco Cisternino > From junchao.zhang at gmail.com Wed Jan 12 09:03:19 2022 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Wed, 12 Jan 2022 09:03:19 -0600 Subject: [petsc-users] PETSc MUMPS interface In-Reply-To: References: Message-ID: Calling PCSetUp() before KSPSetUp()? 
--Junchao Zhang On Wed, Jan 12, 2022 at 3:00 AM Varun Hiremath wrote: > Hi All, > > I want to collect MUMPS memory estimates based on the initial > symbolic factorization analysis before the actual numerical factorization > starts to check if the estimated memory requirements fit the available > memory. > > I am following the steps from > https://petsc.org/main/src/ksp/ksp/tutorials/ex52.c.html > > PCFactorSetMatSolverType(pc,MATSOLVERMUMPS); > PCFactorSetUpMatSolverType(pc); > PCFactorGetMatrix(pc,&F); > > KSPSetUp(ksp); > MatMumpsGetInfog(F,...) > > But it appears KSPSetUp calls both symbolic and numerical factorization. > So is there some other way to get these statistics before the actual > factorization starts? > > Thanks, > Varun > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Wed Jan 12 09:58:27 2022 From: hzhang at mcs.anl.gov (Zhang, Hong) Date: Wed, 12 Jan 2022 15:58:27 +0000 Subject: [petsc-users] PETSc MUMPS interface In-Reply-To: References: Message-ID: PCFactorSetMatSolverType(pc,MATSOLVERMUMPS); PCFactorSetUpMatSolverType(pc); PCFactorGetMatrix(pc,&F); MatLUFactorSymbolic(F,A,...) You must provide row and column permutations etc, petsc/src/mat/tests/ex125.c may give you a clue on how to get these inputs. Hong ________________________________ From: petsc-users on behalf of Junchao Zhang Sent: Wednesday, January 12, 2022 9:03 AM To: Varun Hiremath Cc: Peder J?rgensgaard Olesen via petsc-users Subject: Re: [petsc-users] PETSc MUMPS interface Calling PCSetUp() before KSPSetUp()? --Junchao Zhang On Wed, Jan 12, 2022 at 3:00 AM Varun Hiremath > wrote: Hi All, I want to collect MUMPS memory estimates based on the initial symbolic factorization analysis before the actual numerical factorization starts to check if the estimated memory requirements fit the available memory. I am following the steps from https://petsc.org/main/src/ksp/ksp/tutorials/ex52.c.html PCFactorSetMatSolverType(pc,MATSOLVERMUMPS); PCFactorSetUpMatSolverType(pc); PCFactorGetMatrix(pc,&F); KSPSetUp(ksp); MatMumpsGetInfog(F,...) But it appears KSPSetUp calls both symbolic and numerical factorization. So is there some other way to get these statistics before the actual factorization starts? Thanks, Varun -------------- next part -------------- An HTML attachment was scrubbed... URL: From danyang.su at gmail.com Wed Jan 12 15:41:09 2022 From: danyang.su at gmail.com (Danyang Su) Date: Wed, 12 Jan 2022 13:41:09 -0800 Subject: [petsc-users] PETSc configuration error on macOS Monterey with Intel oneAPI Message-ID: Hi All, I got an error in PETSc configuration on macOS Monterey with Intel oneAPI using the following options: ./configure --with-cc=icc --with-cxx=icpc --with-fc=ifort --with-blas-lapack-dir=/opt/intel/oneapi/mkl/2022.0.0/lib/ --with-debugging=1 PETSC_ARCH=macos-intel-dbg --download-mumps --download-parmetis --download-metis --download-hypre --download-superlu --download-hdf5=yes --download-openmpi Error with downloaded OpenMPI: Cannot compile/link FC with /Users/danyangsu/Soft/PETSc/petsc-3.16.3/macos-intel-dbg/bin/mpif90. Any suggestions for that? There is no problem if I use GNU compiler and MPICH. Thanks, Danyang -------------- next part -------------- An HTML attachment was scrubbed... 
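Returning to the MUMPS memory-estimate question above, a hedged sketch of the sequence Hong suggests (analysis/symbolic phase only, no numerical factorization) might look like the following. The ordering index sets are required by the MatLUFactorSymbolic() interface even though MUMPS computes its own ordering, and the INFOG indices used here (16 and 17, assumed to be the per-process maximum and the summed memory estimate in megabytes) should be checked against the MUMPS manual.

#include <petscksp.h>

/* Query MUMPS memory estimates after the analysis (symbolic) phase only. */
PetscErrorCode MumpsMemoryEstimates(PC pc, Mat A, PetscInt *maxPerProcMB, PetscInt *totalMB)
{
  Mat            F;
  IS             rowperm, colperm;
  MatFactorInfo  info;
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = PCFactorSetMatSolverType(pc, MATSOLVERMUMPS);CHKERRQ(ierr);
  ierr = PCFactorSetUpMatSolverType(pc);CHKERRQ(ierr);
  ierr = PCFactorGetMatrix(pc, &F);CHKERRQ(ierr);

  /* Symbolic factorization (MUMPS analysis); cf. src/mat/tests/ex125.c   */
  ierr = MatGetOrdering(A, MATORDERINGNATURAL, &rowperm, &colperm);CHKERRQ(ierr);
  ierr = MatFactorInfoInitialize(&info);CHKERRQ(ierr);
  ierr = MatLUFactorSymbolic(F, A, rowperm, colperm, &info);CHKERRQ(ierr);

  /* Estimated working memory reported by the analysis phase (assumed INFOG entries) */
  ierr = MatMumpsGetInfog(F, 16, maxPerProcMB);CHKERRQ(ierr);
  ierr = MatMumpsGetInfog(F, 17, totalMB);CHKERRQ(ierr);

  ierr = ISDestroy(&rowperm);CHKERRQ(ierr);
  ierr = ISDestroy(&colperm);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

If the estimates fit in the available memory, the numerical factorization can then proceed as usual; whether the later KSPSetUp()/KSPSolve() reuses or repeats the symbolic phase is worth verifying for the PETSc version in use.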
URL: From samar.khatiwala at earth.ox.ac.uk Wed Jan 12 16:01:18 2022 From: samar.khatiwala at earth.ox.ac.uk (Samar Khatiwala) Date: Wed, 12 Jan 2022 22:01:18 +0000 Subject: [petsc-users] PETSc configuration error on macOS Monterey with Intel oneAPI In-Reply-To: References: Message-ID: Hi Danyang, I had trouble configuring PETSc on MacOS Monterey with ifort when using mpich (which I was building myself). I tracked it down to an errant "-Wl,-flat_namespace? option in the mpif90 wrapper. I rebuilt mpich with the "--enable-two-level-namespace? configuration option and the problem went away. I don?t know if there?s a similar issue with openmpi but you could check the corresponding mpif90 wrapper (mpif90 -show) whether "-Wl,-flat_namespace? is present or not. If so, perhaps passing "--enable-two-level-namespace? to PETSc configure might fix the problem (although I don?t know how you would set this flag *just* for building openmpi). Samar > On Jan 12, 2022, at 9:41 PM, Danyang Su wrote: > > Hi All, > > I got an error in PETSc configuration on macOS Monterey with Intel oneAPI using the following options: > > > ./configure --with-cc=icc --with-cxx=icpc --with-fc=ifort --with-blas-lapack-dir=/opt/intel/oneapi/mkl/2022.0.0/lib/ --with-debugging=1 PETSC_ARCH=macos-intel-dbg --download-mumps --download-parmetis --download-metis --download-hypre --download-superlu --download-hdf5=yes --download-openmpi > > > Error with downloaded OpenMPI: Cannot compile/link FC with /Users/danyangsu/Soft/PETSc/petsc-3.16.3/macos-intel-dbg/bin/mpif90. > > > Any suggestions for that? > > > There is no problem if I use GNU compiler and MPICH. > > > Thanks, > > > Danyang -------------- next part -------------- An HTML attachment was scrubbed... URL: From FERRANJ2 at my.erau.edu Wed Jan 12 18:55:06 2022 From: FERRANJ2 at my.erau.edu (Ferrand, Jesus A.) Date: Thu, 13 Jan 2022 00:55:06 +0000 Subject: [petsc-users] Parallel DMPlex Local to Global Mapping Message-ID: Dear PETSc Team: Hi! I'm working on a parallel version of a PETSc script that I wrote in serial using DMPlex. After calling DMPlexDistribute() each rank is assigned its own DAG where the points are numbered locally. For example, If I split a 100-cell mesh over 4 processors, each process numbers their cells 0-24 as oppossed to something like 0-24, 25-49, 50-74, and 74-99 on ranks 0,1,2, and 3 respectively. The same happens for Face, Edge, and Vertex points in that the local DAG's renumber the ID's starting with 0 instead of using global numbering. How can I distribute a mesh such that the global numbering is reflected in the local DAG's? If not, what would be the right way to retrieve the global numbering? I've seen the term "StarForest" in some [petsc-users] discussion threads discussing a similar issue but have little clue as how to use them. I've looked at the following functions: * DMPlexCreatePointNumbering() - Sounds like what I need, but I don't think it will work because I am relying on DMPlexGetDepthStratum() which returns bounds in local numbering. * DMPlexGetCellNumbering() - Only converts Cells * DMPlexGetVertexNumbering() - Only converts Vertices Basically, what I want to do is to have a global matrix and have my MPI ranks call MatSetValues() on it (with ADD_VALUES as the mode). In my serial code I was relying on the global point numbering to build the matrix. Without it, I can't do it my way : (. I'm manually assembling a global stiffness matrix out of element stiffness matrices to run FEA. Any help is much appreciated. Sincerely: J.A. 
Ferrand Embry-Riddle Aeronautical University - Daytona Beach FL M.Sc. Aerospace Engineering | May 2022 B.Sc. Aerospace Engineering B.Sc. Computational Mathematics Sigma Gamma Tau Tau Beta Pi Honors Program Phone: (386)-843-1829 Email(s): ferranj2 at my.erau.edu jesus.ferrand at gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From danyang.su at gmail.com Thu Jan 13 00:01:08 2022 From: danyang.su at gmail.com (Danyang Su) Date: Wed, 12 Jan 2022 22:01:08 -0800 Subject: [petsc-users] PETSc configuration error on macOS Monterey with Intel oneAPI In-Reply-To: References: Message-ID: <93338C45-052E-49DB-9A89-C3CE54FD95F9@gmail.com> Hi Samar, Thanks for your suggestion. Unfortunately, it does not work. I checked the mpif90 wrapper and the option "-Wl,-flat_namespace? is present. (base) ? bin ./mpif90 -show ifort -I/Users/danyangsu/Soft/PETSc/petsc-3.16.3/macos-intel-dbg/include -Wl,-flat_namespace -Wl,-commons,use_dylibs -I/Users/danyangsu/Soft/PETSc/petsc-3.16.3/macos-intel-dbg/lib -L/Users/danyangsu/Soft/PETSc/petsc-3.16.3/macos-intel-dbg/lib -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi Thanks anyway, Danyang From: Samar Khatiwala Date: Wednesday, January 12, 2022 at 2:01 PM To: Danyang Su Cc: PETSc Subject: Re: [petsc-users] PETSc configuration error on macOS Monterey with Intel oneAPI Hi Danyang, I had trouble configuring PETSc on MacOS Monterey with ifort when using mpich (which I was building myself). I tracked it down to an errant "-Wl,-flat_namespace? option in the mpif90 wrapper. I rebuilt mpich with the "--enable-two-level-namespace? configuration option and the problem went away. I don?t know if there?s a similar issue with openmpi but you could check the corresponding mpif90 wrapper (mpif90 -show) whether "-Wl,-flat_namespace? is present or not. If so, perhaps passing "--enable-two-level-namespace? to PETSc configure might fix the problem (although I don?t know how you would set this flag *just* for building openmpi). Samar On Jan 12, 2022, at 9:41 PM, Danyang Su wrote: Hi All, I got an error in PETSc configuration on macOS Monterey with Intel oneAPI using the following options: ./configure --with-cc=icc --with-cxx=icpc --with-fc=ifort --with-blas-lapack-dir=/opt/intel/oneapi/mkl/2022.0.0/lib/ --with-debugging=1 PETSC_ARCH=macos-intel-dbg --download-mumps --download-parmetis --download-metis --download-hypre --download-superlu --download-hdf5=yes --download-openmpi Error with downloaded OpenMPI: Cannot compile/link FC with /Users/danyangsu/Soft/PETSc/petsc-3.16.3/macos-intel-dbg/bin/mpif90. Any suggestions for that? There is no problem if I use GNU compiler and MPICH. Thanks, Danyang -------------- next part -------------- An HTML attachment was scrubbed... URL: From samar.khatiwala at earth.ox.ac.uk Thu Jan 13 03:16:18 2022 From: samar.khatiwala at earth.ox.ac.uk (Samar Khatiwala) Date: Thu, 13 Jan 2022 09:16:18 +0000 Subject: [petsc-users] PETSc configuration error on macOS Monterey with Intel oneAPI In-Reply-To: <93338C45-052E-49DB-9A89-C3CE54FD95F9@gmail.com> References: <93338C45-052E-49DB-9A89-C3CE54FD95F9@gmail.com> Message-ID: Hi Danyang, Just to reiterate, the presence of -Wl,-flat_namespace *is* the problem. I got rid of it by configuring mpich with --enable-two-level-namespace. I reported this problem to the PETSc folks a few weeks ago and they were going to patch MPICH.py (under config/BuildSystem/config/packages) to pass this flag. 
So you could try configuring with ?download-mpich (or build your own mpich, which is pretty straightforward). If you?re wedded to openmpi, you could patch up OpenMPI.py yourself (maybe --enable-two-level-namespace is called something else for openmpi). Best, Samar > On Jan 13, 2022, at 6:01 AM, Danyang Su wrote: > > Hi Samar, > > Thanks for your suggestion. Unfortunately, it does not work. I checked the mpif90 wrapper and the option "-Wl,-flat_namespace? is present. > > (base) ? bin ./mpif90 -show > ifort -I/Users/danyangsu/Soft/PETSc/petsc-3.16.3/macos-intel-dbg/include -Wl,-flat_namespace -Wl,-commons,use_dylibs -I/Users/danyangsu/Soft/PETSc/petsc-3.16.3/macos-intel-dbg/lib -L/Users/danyangsu/Soft/PETSc/petsc-3.16.3/macos-intel-dbg/lib -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi > > Thanks anyway, > > Danyang > From: Samar Khatiwala > Date: Wednesday, January 12, 2022 at 2:01 PM > To: Danyang Su > Cc: PETSc > Subject: Re: [petsc-users] PETSc configuration error on macOS Monterey with Intel oneAPI > > Hi Danyang, > > I had trouble configuring PETSc on MacOS Monterey with ifort when using mpich (which I was building myself). I tracked it down to an errant "-Wl,-flat_namespace? > option in the mpif90 wrapper. I rebuilt mpich with the "--enable-two-level-namespace? configuration option and the problem went away. I don?t know if there?s a similar > issue with openmpi but you could check the corresponding mpif90 wrapper (mpif90 -show) whether "-Wl,-flat_namespace? is present or not. If so, perhaps passing > "--enable-two-level-namespace? to PETSc configure might fix the problem (although I don?t know how you would set this flag *just* for building openmpi). > > Samar > > >> On Jan 12, 2022, at 9:41 PM, Danyang Su > wrote: >> >> Hi All, >> >> I got an error in PETSc configuration on macOS Monterey with Intel oneAPI using the following options: >> >> >> ./configure --with-cc=icc --with-cxx=icpc --with-fc=ifort --with-blas-lapack-dir=/opt/intel/oneapi/mkl/2022.0.0/lib/ --with-debugging=1 PETSC_ARCH=macos-intel-dbg --download-mumps --download-parmetis --download-metis --download-hypre --download-superlu --download-hdf5=yes --download-openmpi >> >> >> Error with downloaded OpenMPI: Cannot compile/link FC with /Users/danyangsu/Soft/PETSc/petsc-3.16.3/macos-intel-dbg/bin/mpif90. >> >> >> Any suggestions for that? >> >> >> There is no problem if I use GNU compiler and MPICH. >> >> >> Thanks, >> >> >> Danyang -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Jan 13 06:00:31 2022 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 13 Jan 2022 07:00:31 -0500 Subject: [petsc-users] Parallel DMPlex Local to Global Mapping In-Reply-To: References: Message-ID: On Wed, Jan 12, 2022 at 7:55 PM Ferrand, Jesus A. wrote: > Dear PETSc Team: > > Hi! I'm working on a parallel version of a PETSc script that I wrote in > serial using DMPlex. After calling DMPlexDistribute() each rank is assigned > its own DAG where the points are numbered locally. For example, If I split > a 100-cell mesh over 4 processors, each process numbers their cells 0-24 as > oppossed to something like 0-24, 25-49, 50-74, and 74-99 on ranks 0,1,2, > and 3 respectively. The same happens for Face, Edge, and Vertex points in > that the local DAG's renumber the ID's starting with 0 instead of using > global numbering. > A parallel Plex is a collection of local Plexes connected by a PetscSF, which tells us which pieces of the mesh are replicated on other processes. 
Each Plex is locally numbered. This allows us to expand to any number of processes without running out of numbers. > How can I distribute a mesh such that the global numbering is reflected in > the local DAG's? > You can always create a global numbering of points using https://petsc.org/main/docs/manualpages/DMPLEX/DMPlexCreatePointNumbering.html but you should not need to do this. > If not, what would be the right way to retrieve the global numbering? I've > seen the term "StarForest" in some [petsc-users] discussion threads > discussing a similar issue but have little clue as how to use them. > > I've looked at the following functions: > > - DMPlexCreatePointNumbering() - Sounds like what I need, but I don't > think it will work because I am relying on DMPlexGetDepthStratum() which > returns bounds in local numbering. > - DMPlexGetCellNumbering() - Only converts Cells > - DMPlexGetVertexNumbering() - Only converts Vertices > > Basically, what I want to do is to have a global matrix and have my MPI > ranks call MatSetValues() on it (with ADD_VALUES as the mode). In my serial > code I was relying on the global point numbering to build the matrix. > Without it, I can't do it my way : (. I'm manually assembling a global > stiffness matrix out of element stiffness matrices to run FEA. > You can do this at least two ways: 1) Right before you call MatSetValues(), use the IS from CreatePointNumbering() to translate your local numbers to global numbers/rows. 2) Call https://petsc.org/main/docs/manualpages/DMPLEX/DMPlexMatSetClosure.html This is how I set FEM element matrices in the PETSc examples. Thanks, Matt > Any help is much appreciated. > > Sincerely: > > *J.A. Ferrand* > > Embry-Riddle Aeronautical University - Daytona Beach FL > > M.Sc. Aerospace Engineering | May 2022 > > B.Sc. Aerospace Engineering > > B.Sc. Computational Mathematics > > > > Sigma Gamma Tau > > Tau Beta Pi > > Honors Program > > > > *Phone:* (386)-843-1829 > > *Email(s):* ferranj2 at my.erau.edu > > jesus.ferrand at gmail.com > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From lshtongying at 163.com Wed Jan 12 23:18:08 2022 From: lshtongying at 163.com (=?UTF-8?B?5L2f6I65?=) Date: Thu, 13 Jan 2022 13:18:08 +0800 (GMT+08:00) Subject: [petsc-users] Help letter from PETSc3.6 user Message-ID: <7cd93c2.20e1.17e51df325d.Coremail.lshtongying@163.com> Dear PETSc developers? Recently the following problem appeared in my code: [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Overflow in integer operation: http://www.mcs.anl.gov/petsc/documentation/faq.html#64-bit-indices [0]PETSC ERROR: Mesh of 513 by 513 by 19 (dof) is too large for 32 bit indices [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[0]PETSC ERROR: Petsc Release Version 3.6.1, Jul, 22, 2015 [0]PETSC ERROR: runSimulation on a arch-linux2-c-opt named cn91 by hyper Thu Jan 13 10:21:05 2022 [0]PETSC ERROR: Configure options prefix=/****/software/petsc/3.6.1-c-gcc4.9-mvapich2 --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 --with-mpiexec=yhrun --with-blas-lib=/****/software/lapack/3.8.0-gcc4.9/lib64/libblas.a --with-lapack-lib=/***/software/lapack/3.8.0-gcc4.9/lib64/liblapack.a --with-debugging=0 --with-shared-libraries [0]PETSC ERROR: #1 DMSetUp_DA_3D() line 211 in /****/project/petsc/petsc-3.6.1-gcc4.9.4-mvapich2/src/dm/impls/da/da3.c [0]PETSC ERROR: #2 DMSetUp_DA() line 27 in /****/project/petsc/petsc-3.6.1-gcc4.9.4-mvapich2/src/dm/impls/da/dareg.c [0]PETSC ERROR: #3 DMSetUp() line 569 in /****/project/petsc/petsc-3.6.1-gcc4.9.4-mvapich2/src/dm/interface/dm.c [0]PETSC ERROR: #4 DMDACreate3d() line 1441 in /****/project/petsc/petsc-3.6.1-gcc4.9.4-mvapich2/src/dm/impls/da/da3.c However, the troubleshooting website mentioned in the tip no longer exists. For this, I want your help. Hope you can tell me how can I troubleshoot the above. Looking forward to hearing from you. Sincerely, PETSc user Ying Tong | | ?? | | lshtongying at 163.com | ??????????? -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Thu Jan 13 08:18:03 2022 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 13 Jan 2022 08:18:03 -0600 (CST) Subject: [petsc-users] Help letter from PETSc3.6 user In-Reply-To: <7cd93c2.20e1.17e51df325d.Coremail.lshtongying@163.com> References: <7cd93c2.20e1.17e51df325d.Coremail.lshtongying@163.com> Message-ID: Try: https://petsc.org/release/faq/#when-should-can-i-use-the-configure-option-with-64-bit-indices Also best to use the current release 3.16 Satish On Thu, 13 Jan 2022, ?? wrote: > Dear PETSc developers? > Recently the following problem appeared in my code: > > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Overflow in integer operation: http://www.mcs.anl.gov/petsc/documentation/faq.html#64-bit-indices > [0]PETSC ERROR: Mesh of 513 by 513 by 19 (dof) is too large for 32 bit indices > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.6.1, Jul, 22, 2015 > [0]PETSC ERROR: runSimulation on a arch-linux2-c-opt named cn91 by hyper Thu Jan 13 10:21:05 2022 > [0]PETSC ERROR: Configure options prefix=/****/software/petsc/3.6.1-c-gcc4.9-mvapich2 --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 --with-mpiexec=yhrun --with-blas-lib=/****/software/lapack/3.8.0-gcc4.9/lib64/libblas.a --with-lapack-lib=/***/software/lapack/3.8.0-gcc4.9/lib64/liblapack.a --with-debugging=0 --with-shared-libraries > [0]PETSC ERROR: #1 DMSetUp_DA_3D() line 211 in /****/project/petsc/petsc-3.6.1-gcc4.9.4-mvapich2/src/dm/impls/da/da3.c > [0]PETSC ERROR: #2 DMSetUp_DA() line 27 in /****/project/petsc/petsc-3.6.1-gcc4.9.4-mvapich2/src/dm/impls/da/dareg.c > [0]PETSC ERROR: #3 DMSetUp() line 569 in /****/project/petsc/petsc-3.6.1-gcc4.9.4-mvapich2/src/dm/interface/dm.c > [0]PETSC ERROR: #4 DMDACreate3d() line 1441 in /****/project/petsc/petsc-3.6.1-gcc4.9.4-mvapich2/src/dm/impls/da/da3.c > > > However, the troubleshooting website mentioned in the tip no longer exists. For this, I want your help. Hope you can tell me how can I troubleshoot the above. Looking forward to hearing from you. 
> > > Sincerely, > > > > PETSc user > Ying Tong > > > > > > > > > > > > > > > > > > > | | > ?? > | > | > lshtongying at 163.com > | > ??????????? From nabw91 at gmail.com Thu Jan 13 12:14:59 2022 From: nabw91 at gmail.com (=?UTF-8?Q?Nicol=C3=A1s_Barnafi?=) Date: Thu, 13 Jan 2022 19:14:59 +0100 Subject: [petsc-users] PCSetCoordinates does not set coordinates of sub PC (fieldsplit) objects In-Reply-To: References: Message-ID: Dear all, I have created a first implementation. For now it must be called after setting the fields, eventually I would like to move it to the setup phase. The implementation seems clean, but it is giving me some memory errors (free() corrupted unsorted chunks). You may find the code below. After some work with gdb, I found out that the errors appears when calling the ISDestroy(&is_coords) line, which to me is not very clear, as I am indeed within the while scope creating and then destroying the is_coords object. I would greatly appreciate it if you could give me a hint on what the problem is. After debugging this, and working on your suggestion, I will open a PR. Best regards, NB ----- CODE ------ static PetscErrorCode PCSetCoordinates_FieldSplit(PC pc, PetscInt dim, PetscInt nloc, PetscReal coords[]) { PetscErrorCode ierr; PC_FieldSplit *jac = (PC_FieldSplit*)pc->data; PC_FieldSplitLink ilink_current = jac->head; PC pc_current; PetscInt nmin, nmax, ii, ndofs; PetscInt *owned_dofs; // Indexes owned by this processor PetscReal *coords_block; // Coordinates to be given to the current PC IS is_owned; PetscFunctionBegin; // Extract matrix ownership range to then compute subindexes for coordinates. This results in an IS object (is_owned). // TODO: This would be simpler with a general MatGetOwnershipIS (currently supported only by Elemental and BLAS matrices). ierr = MatGetOwnershipRange(pc->mat,&nmin,&nmax);CHKERRQ(ierr); ndofs = nmax - nmin; ierr = PetscMalloc1(ndofs, &owned_dofs); CHKERRQ(ierr); for(PetscInt i=nmin;iis, is_owned, PETSC_TRUE, &is_coords); CHKERRQ(ierr); // Setting drop to TRUE, although it should make no difference. ierr = PetscMalloc1(ndofs_block, &coords_block); CHKERRQ(ierr); ierr = ISGetLocalSize(is_coords, &ndofs_block); CHKERRQ(ierr); ierr = ISGetIndices(is_coords, &block_dofs_enumeration); CHKERRQ(ierr); // Having the indices computed and the memory allocated, we can copy the relevant coords and set them to the subPC. for(PetscInt dof=0;dofksp, &pc_current); CHKERRQ(ierr); ierr = PCSetCoordinates(pc_current, dim, ndofs_block, coords_block); CHKERRQ(ierr); ierr = PetscFree(coords_block); CHKERRQ(ierr); if(!pc_current) SETERRQ(PetscObjectComm((PetscObject)pc),PETSC_ERR_ORDER,"Setting coordinates to PCFIELDSPLIT but a subPC is null."); ilink_current = ilink_current->next; ++ii; } ierr = PetscFree(owned_dofs); CHKERRQ(ierr); PetscFunctionReturn(0); } On Wed, Jan 12, 2022 at 6:22 AM Barry Smith wrote: > > > On Jan 11, 2022, at 9:51 PM, Matthew Knepley wrote: > > On Tue, Jan 11, 2022 at 3:31 PM Barry Smith wrote: > >> >> Nicolas, >> >> For "simple" PCFIELDSPLIT it is possible to pass down the attached >> coordinate information. By simple I mean where the splitting is done by >> fields and not by general lists of IS (where one does not have enough >> information to know what the coordinates would mean to the subPCS). >> >> Look in fieldsplit.c PCFieldSplitSetFields_FieldSplit() where it does >> the KSPCreate(). I think you can do a KSPGetPC() on that ksp and >> PCSetCoordinates on that PC to supply the coordinates to the subPC. 
In the >> function PCFieldSplitSetIS_FieldSplit() you can also attach the coordinates >> to the subPCs IF defaultsplit is true. >> >> Sadly this is not the full story. The outer PC will not have any >> coordinates because calling PCSetCoordinates on a PCFIELDSPLIT does nothing >> since fieldsplit doesn't handle coordinates. So you need to do more, you >> need to provide a PCSetCoordinates_FieldSplit() that saves the coordinates >> in new entries in the PC_FieldSplit struct and then in >> PCFieldSplitSetFields_FieldSplit() you need to access those saved values >> and pass them into the PCSetCoordinates() that you call on the subPCs. Once >> you write >> PCSetCoordinates_FieldSplit() you need to call >> >> ierr = >> PetscObjectComposeFunction((PetscObject)pc,"PCSetCoordinates_C",PCSetCoordinates_FieldSplit);CHKERRQ(ierr); >> >> >> inside PCCreate_FieldSplit(). >> >> Any questions just let us know. >> > > I will add "Why is this so cumbersome?". This is a workaround in order to > get geometric information into GAMG. It should really be > PCGAMGSetCoordinates(), which > are used to calculate the rigid body modes, and assume a bunch of stuff > about the coordinate space. This would not help you, because it would still > force you to pull > out the correct subPC. The "right" way now to give geometric information > to a TS/SNES/KSP/PC is through a DM, which are passed down through > PCFIELDSPLIT, > PCPATCH, etc. However they are heavier weight than just some coordinates. > > > This is not cumbersome at all. It is a simple natural way to pass around > coordinates to PC's and, when possible, their children. > > Barry > > Note that we could also have a DMGetCoordinates() that pulled > coordinates from a DM (that happended to have them) in this form associated > with the PC and the PC could call it to get the coordinates and use them as > needed. But this simple PCSetCoordinates() is a nice complement to that > approach. > > > Thanks, > > Matt > > >> Barry >> >> >> > On Jan 11, 2022, at 11:58 AM, Nicol?s Barnafi wrote: >> > >> > Dear community, >> > >> > I am working on a block preconditioner, where one of the blocks uses >> HYPRE's AMS. As it requires the coordinates of the dofs, I have done so to >> the PC object. I expected the coordinates to be inherited down to the >> subblocks, is this not the case? (it seems so as I couldn't find a >> specialized FIELDSPLIT SetCoordinates function). >> > >> > If this feature is missing, please give me some hints on where to add >> the missing function, I would gladly do it. If not, please let me know why >> it was dismissed, in order to do things the hard way [as in hard-coded ;)]. >> > >> > Kind regards, >> > Nicolas >> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > > -- Nicol?s Alejandro Barnafi Wittwer -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Jan 13 12:20:50 2022 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 13 Jan 2022 13:20:50 -0500 Subject: [petsc-users] PCSetCoordinates does not set coordinates of sub PC (fieldsplit) objects In-Reply-To: References: Message-ID: On Thu, Jan 13, 2022 at 1:15 PM Nicol?s Barnafi wrote: > Dear all, > > I have created a first implementation. For now it must be called after > setting the fields, eventually I would like to move it to the setup phase. 
> The implementation seems clean, but it is giving me some memory errors > (free() corrupted unsorted chunks). > > You may find the code below. After some work with gdb, I found out that > the errors appears when calling the ISDestroy(&is_coords) line, which to me > is not very clear, as I am indeed within the while scope creating and then > destroying the is_coords object. I would greatly appreciate it if you could > give me a hint on what the problem is. > > After debugging this, and working on your suggestion, I will open a PR. > > Best regards, > NB > > > ----- CODE ------ > > static PetscErrorCode PCSetCoordinates_FieldSplit(PC pc, PetscInt dim, > PetscInt nloc, PetscReal coords[]) > { > PetscErrorCode ierr; > PC_FieldSplit *jac = (PC_FieldSplit*)pc->data; > PC_FieldSplitLink ilink_current = jac->head; > PC pc_current; > PetscInt nmin, nmax, ii, ndofs; > PetscInt *owned_dofs; // Indexes owned by this processor > PetscReal *coords_block; // Coordinates to be given to the current PC > IS is_owned; > > PetscFunctionBegin; > // Extract matrix ownership range to then compute subindexes for > coordinates. This results in an IS object (is_owned). > // TODO: This would be simpler with a general MatGetOwnershipIS > (currently supported only by Elemental and BLAS matrices). > ierr = MatGetOwnershipRange(pc->mat,&nmin,&nmax);CHKERRQ(ierr); > ndofs = nmax - nmin; > ierr = PetscMalloc1(ndofs, &owned_dofs); CHKERRQ(ierr); > for(PetscInt i=nmin;i owned_dofs[i] = nmin + i; > ierr = ISCreateGeneral(MPI_COMM_WORLD, ndofs, owned_dofs, > PETSC_OWN_POINTER, &is_owned); CHKERRQ(ierr); > Here you tell PETSc to take control of the memory for owned_dofs, but below you PetscFree(owned_dofs). This is not compatible. You should just destroy the IS when you are done. Thanks, Matt > // For each IS, embed it to get local coords indces and then set > coordinates in the subPC. > ii=0; > while(ilink_current) > { > IS is_coords; > PetscInt ndofs_block; > const PetscInt *block_dofs_enumeration; // Numbering of the dofs > relevant to the current block > > ierr = ISEmbed(ilink_current->is, is_owned, PETSC_TRUE, &is_coords); > CHKERRQ(ierr); // Setting drop to TRUE, although it should make no > difference. > ierr = PetscMalloc1(ndofs_block, &coords_block); CHKERRQ(ierr); > ierr = ISGetLocalSize(is_coords, &ndofs_block); CHKERRQ(ierr); > ierr = ISGetIndices(is_coords, &block_dofs_enumeration); CHKERRQ(ierr); > > // Having the indices computed and the memory allocated, we can copy > the relevant coords and set them to the subPC. 
> for(PetscInt dof=0;dof for(PetscInt d=0;d { > coords_block[dim*dof + d] = coords[dim * > block_dofs_enumeration[dof] + d]; > // printf("Dof: %d, Global: %f\n", block_dofs_enumeration[dof], > coords[dim * block_dofs_enumeration[dof] + d]); > } > ierr = ISRestoreIndices(is_coords, &block_dofs_enumeration); > CHKERRQ(ierr); > ierr = ISDestroy(&is_coords); CHKERRQ(ierr); > ierr = KSPGetPC(ilink_current->ksp, &pc_current); CHKERRQ(ierr); > ierr = PCSetCoordinates(pc_current, dim, ndofs_block, coords_block); > CHKERRQ(ierr); > ierr = PetscFree(coords_block); CHKERRQ(ierr); > if(!pc_current) > SETERRQ(PetscObjectComm((PetscObject)pc),PETSC_ERR_ORDER,"Setting > coordinates to PCFIELDSPLIT but a subPC is null."); > > ilink_current = ilink_current->next; > ++ii; > } > ierr = PetscFree(owned_dofs); CHKERRQ(ierr); > PetscFunctionReturn(0); > } > > On Wed, Jan 12, 2022 at 6:22 AM Barry Smith wrote: > >> >> >> On Jan 11, 2022, at 9:51 PM, Matthew Knepley wrote: >> >> On Tue, Jan 11, 2022 at 3:31 PM Barry Smith wrote: >> >>> >>> Nicolas, >>> >>> For "simple" PCFIELDSPLIT it is possible to pass down the attached >>> coordinate information. By simple I mean where the splitting is done by >>> fields and not by general lists of IS (where one does not have enough >>> information to know what the coordinates would mean to the subPCS). >>> >>> Look in fieldsplit.c PCFieldSplitSetFields_FieldSplit() where it does >>> the KSPCreate(). I think you can do a KSPGetPC() on that ksp and >>> PCSetCoordinates on that PC to supply the coordinates to the subPC. In the >>> function PCFieldSplitSetIS_FieldSplit() you can also attach the coordinates >>> to the subPCs IF defaultsplit is true. >>> >>> Sadly this is not the full story. The outer PC will not have any >>> coordinates because calling PCSetCoordinates on a PCFIELDSPLIT does nothing >>> since fieldsplit doesn't handle coordinates. So you need to do more, you >>> need to provide a PCSetCoordinates_FieldSplit() that saves the coordinates >>> in new entries in the PC_FieldSplit struct and then in >>> PCFieldSplitSetFields_FieldSplit() you need to access those saved values >>> and pass them into the PCSetCoordinates() that you call on the subPCs. Once >>> you write >>> PCSetCoordinates_FieldSplit() you need to call >>> >>> ierr = >>> PetscObjectComposeFunction((PetscObject)pc,"PCSetCoordinates_C",PCSetCoordinates_FieldSplit);CHKERRQ(ierr); >>> >>> >>> inside PCCreate_FieldSplit(). >>> >>> Any questions just let us know. >>> >> >> I will add "Why is this so cumbersome?". This is a workaround in order to >> get geometric information into GAMG. It should really be >> PCGAMGSetCoordinates(), which >> are used to calculate the rigid body modes, and assume a bunch of stuff >> about the coordinate space. This would not help you, because it would still >> force you to pull >> out the correct subPC. The "right" way now to give geometric information >> to a TS/SNES/KSP/PC is through a DM, which are passed down through >> PCFIELDSPLIT, >> PCPATCH, etc. However they are heavier weight than just some coordinates. >> >> >> This is not cumbersome at all. It is a simple natural way to pass >> around coordinates to PC's and, when possible, their children. >> >> Barry >> >> Note that we could also have a DMGetCoordinates() that pulled >> coordinates from a DM (that happended to have them) in this form associated >> with the PC and the PC could call it to get the coordinates and use them as >> needed. But this simple PCSetCoordinates() is a nice complement to that >> approach. 
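To make Matt's point about the ownership flag concrete, here is a small hedged sketch, reusing the variable names from the code in this thread, of the two consistent ways of handing the index array to ISCreateGeneral(): either the IS owns the array and the caller must not free it, or the IS copies it and the caller keeps ownership.

/* Variant A: the IS takes ownership of owned_dofs; do NOT PetscFree() it.   */
ierr = PetscMalloc1(ndofs, &owned_dofs);CHKERRQ(ierr);
for (i = 0; i < ndofs; ++i) owned_dofs[i] = nmin + i;
ierr = ISCreateGeneral(PetscObjectComm((PetscObject)pc), ndofs, owned_dofs,
                       PETSC_OWN_POINTER, &is_owned);CHKERRQ(ierr);
/* ... use is_owned ... */
ierr = ISDestroy(&is_owned);CHKERRQ(ierr);   /* this also frees owned_dofs   */

/* Variant B: the IS copies the array; the caller keeps it and frees it.     */
ierr = PetscMalloc1(ndofs, &owned_dofs);CHKERRQ(ierr);
for (i = 0; i < ndofs; ++i) owned_dofs[i] = nmin + i;
ierr = ISCreateGeneral(PetscObjectComm((PetscObject)pc), ndofs, owned_dofs,
                       PETSC_COPY_VALUES, &is_owned);CHKERRQ(ierr);
ierr = PetscFree(owned_dofs);CHKERRQ(ierr);  /* safe: the IS holds its own copy */
/* ... use is_owned ... */
ierr = ISDestroy(&is_owned);CHKERRQ(ierr);

For a contiguous ownership range like this one, ISCreateStride(comm, ndofs, nmin, 1, &is_owned) would avoid the temporary array altogether.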
>> >> >> Thanks, >> >> Matt >> >> >>> Barry >>> >>> >>> > On Jan 11, 2022, at 11:58 AM, Nicol?s Barnafi >>> wrote: >>> > >>> > Dear community, >>> > >>> > I am working on a block preconditioner, where one of the blocks uses >>> HYPRE's AMS. As it requires the coordinates of the dofs, I have done so to >>> the PC object. I expected the coordinates to be inherited down to the >>> subblocks, is this not the case? (it seems so as I couldn't find a >>> specialized FIELDSPLIT SetCoordinates function). >>> > >>> > If this feature is missing, please give me some hints on where to add >>> the missing function, I would gladly do it. If not, please let me know why >>> it was dismissed, in order to do things the hard way [as in hard-coded ;)]. >>> > >>> > Kind regards, >>> > Nicolas >>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> >> >> > > -- > Nicol?s Alejandro Barnafi Wittwer > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From danyang.su at gmail.com Thu Jan 13 14:24:34 2022 From: danyang.su at gmail.com (Danyang Su) Date: Thu, 13 Jan 2022 12:24:34 -0800 Subject: [petsc-users] PETSc configuration error on macOS Monterey with Intel oneAPI In-Reply-To: References: <93338C45-052E-49DB-9A89-C3CE54FD95F9@gmail.com> Message-ID: Hi Samar, Yes, with mpich, there is no such error. I will just use this configuration for now. Thanks, Danyang From: Samar Khatiwala Date: Thursday, January 13, 2022 at 1:16 AM To: Danyang Su Cc: PETSc Subject: Re: [petsc-users] PETSc configuration error on macOS Monterey with Intel oneAPI Hi Danyang, Just to reiterate, the presence of -Wl,-flat_namespace *is* the problem. I got rid of it by configuring mpich with --enable-two-level-namespace. I reported this problem to the PETSc folks a few weeks ago and they were going to patch MPICH.py (under config/BuildSystem/config/packages) to pass this flag. So you could try configuring with ?download-mpich (or build your own mpich, which is pretty straightforward). If you?re wedded to openmpi, you could patch up OpenMPI.py yourself (maybe --enable-two-level-namespace is called something else for openmpi). Best, Samar On Jan 13, 2022, at 6:01 AM, Danyang Su wrote: Hi Samar, Thanks for your suggestion. Unfortunately, it does not work. I checked the mpif90 wrapper and the option "-Wl,-flat_namespace? is present. (base) ? bin ./mpif90 -show ifort -I/Users/danyangsu/Soft/PETSc/petsc-3.16.3/macos-intel-dbg/include -Wl,-flat_namespace -Wl,-commons,use_dylibs -I/Users/danyangsu/Soft/PETSc/petsc-3.16.3/macos-intel-dbg/lib -L/Users/danyangsu/Soft/PETSc/petsc-3.16.3/macos-intel-dbg/lib -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi Thanks anyway, Danyang From: Samar Khatiwala Date: Wednesday, January 12, 2022 at 2:01 PM To: Danyang Su Cc: PETSc Subject: Re: [petsc-users] PETSc configuration error on macOS Monterey with Intel oneAPI Hi Danyang, I had trouble configuring PETSc on MacOS Monterey with ifort when using mpich (which I was building myself). I tracked it down to an errant "-Wl,-flat_namespace? option in the mpif90 wrapper. I rebuilt mpich with the "--enable-two-level-namespace? 
configuration option and the problem went away. I don?t know if there?s a similar issue with openmpi but you could check the corresponding mpif90 wrapper (mpif90 -show) whether "-Wl,-flat_namespace? is present or not. If so, perhaps passing "--enable-two-level-namespace? to PETSc configure might fix the problem (although I don?t know how you would set this flag *just* for building openmpi). Samar On Jan 12, 2022, at 9:41 PM, Danyang Su wrote: Hi All, I got an error in PETSc configuration on macOS Monterey with Intel oneAPI using the following options: ./configure --with-cc=icc --with-cxx=icpc --with-fc=ifort --with-blas-lapack-dir=/opt/intel/oneapi/mkl/2022.0.0/lib/ --with-debugging=1 PETSC_ARCH=macos-intel-dbg --download-mumps --download-parmetis --download-metis --download-hypre --download-superlu --download-hdf5=yes --download-openmpi Error with downloaded OpenMPI: Cannot compile/link FC with /Users/danyangsu/Soft/PETSc/petsc-3.16.3/macos-intel-dbg/bin/mpif90. Any suggestions for that? There is no problem if I use GNU compiler and MPICH. Thanks, Danyang -------------- next part -------------- An HTML attachment was scrubbed... URL: From dong-hao at outlook.com Fri Jan 14 00:49:15 2022 From: dong-hao at outlook.com (Hao DONG) Date: Fri, 14 Jan 2022 06:49:15 +0000 Subject: [petsc-users] Strange CUDA failure with a second petscfinalize with PETSc 3.16 Message-ID: <46A67106-A80C-4057-A301-E3C13236CC18@outlook.com> Dear All, I have encountered a peculiar problem when fiddling with a code with PETSC 3.16.3 (which worked fine with PETSc 3.15). It is a very straight forward PDE-based optimization code which repeatedly solves a linearized PDE problem with KSP in a subroutine (the rest of the code does not contain any PETSc related content). The main program provides the subroutine with an MPI comm. Then I set the comm as PETSC_COMM_WORLD to tell PETSC to attach to it (and detach with it when the solving is finished each time). Strangely, I observe a CUDA failure whenever the petscfinalize is called for a *second* time. In other words, the first and second PDE calculations with GPU are fine (with correct solutions). The petsc code just fails after the SECOND petscfinalize command is called. You can also see the PETSC config in the error message: [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [1]PETSC ERROR: GPU error [1]PETSC ERROR: cuda error 201 (cudaErrorDeviceUninitialized) : invalid device context [1]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. [1]PETSC ERROR: Petsc Release Version 3.16.3, unknown [1]PETSC ERROR: maxwell.gpu on a named stratosphere by hao Fri Jan 14 10:21:05 2022 [1]PETSC ERROR: Configure options --prefix=/opt/petsc/complex-double-with-cuda --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 COPTFLAGS="-O3 -mavx2" CXXOPTFLAGS="-O3 -mavx2" FOPTFLAGS="-O3 -ffree-line-length-none -mavx2" CUDAOPTFLAGS=-O3 --with-cxx-dialect=cxx14 --with-cuda-dialect=cxx14 --with-scalar-type=complex --with-precision=double --with-cuda-dir=/usr/local/cuda --with-debugging=1 [1]PETSC ERROR: #1 PetscFinalize() at /home/hao/packages/petsc-current/src/sys/objects/pinit.c:1638 You might have forgotten to call PetscInitialize(). The EXACT line numbers in the error traceback are not available. Instead the line number of the start of the function is given. 
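For the record, the workaround that closes this thread amounts to replacing --download-openmpi with --download-mpich in the original configure line, roughly as follows (an untested sketch assembled only from the options quoted above):

./configure --with-cc=icc --with-cxx=icpc --with-fc=ifort \
    --with-blas-lapack-dir=/opt/intel/oneapi/mkl/2022.0.0/lib/ \
    --with-debugging=1 PETSC_ARCH=macos-intel-dbg \
    --download-mumps --download-parmetis --download-metis --download-hypre \
    --download-superlu --download-hdf5=yes --download-mpich

If MPICH is built separately instead, Samar's --enable-two-level-namespace option at MPICH configure time is what removes the problematic -Wl,-flat_namespace flag from the mpif90 wrapper.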
[1] #1 PetscAbortFindSourceFile_Private() at /home/hao/packages/petsc-current/src/sys/error/err.c:35 [1] #2 PetscLogGetStageLog() at /home/hao/packages/petsc-current/src/sys/logging/utils/stagelog.c:29 [1] #3 PetscClassIdRegister() at /home/hao/packages/petsc-current/src/sys/logging/plog.c:2376 [1] #4 MatMFFDInitializePackage() at /home/hao/packages/petsc-current/src/mat/impls/mffd/mffd.c:45 [1] #5 MatInitializePackage() at /home/hao/packages/petsc-current/src/mat/interface/dlregismat.c:163 [1] #6 MatCreate() at /home/hao/packages/petsc-current/src/mat/utils/gcreate.c:77 However, it doesn?t seem to affect the other part of my code, so the code can continue running until it gets to the petsc part again (the *third* time). Unfortunately, it doesn?t give me any further information even if I set the debugging to yes in the configure file. It also worth noting that PETSC without CUDA (i.e. with simple MATMPIAIJ) works perfectly fine. I am able to re-produce the problem with a toy code modified from ex11f. Please see the attached file (ex11fc.F90) for details. Essentially the code does the same thing as ex11f, but three times with a do loop. To do that I added an extra MPI_INIT/MPI_FINALIZE to ensure that the MPI communicator is not destroyed when PETSC_FINALIZE is called. I used the PetscOptionsHasName utility to check if you have ?-usecuda? in the options. So running the code with and without that option can give you a comparison w/o CUDA. I can see that the code also fails after the second loop of the KSP operation. Could you kindly shed some lights on this problem? I should say that I am not even sure if the problem is from PETSc, as I also accidentally updated the NVIDIA driver (for now it is 510.06 with cuda 11.6). And it is well known that NVIDIA can give you some surprise in the updates (yes, I know I shouldn?t have touched that if it?s not broken). But my CUDA code without PETSC (which basically does the same PDE thing, but with cusparse/cublas directly) seems to work just fine after the update. It is also possible that my petsc code related to CUDA was not quite ?legitimate? ? I just use: MatSetType(A, MATMPIAIJCUSPARSE, ierr) and MatCreateVecs(A, u, PETSC_NULL_VEC, ierr) to make the data onto GPU. I would very much appreciate it if you could show me the ?right? way to do that. Thanks a lot in advance, and all the best, Hao -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ex11fc.F90 Type: application/octet-stream Size: 8925 bytes Desc: ex11fc.F90 URL: From colin.cotter at imperial.ac.uk Fri Jan 14 04:32:17 2022 From: colin.cotter at imperial.ac.uk (Cotter, Colin J) Date: Fri, 14 Jan 2022 10:32:17 +0000 Subject: [petsc-users] Postdoctoral position at Imperial College London Message-ID: Dear PETSc users, Just drawing your attention to a postdoctoral position I have available, aiming to start 1st April 2022. The skillset I'm looking for aligns with many PETSc users, hopefully. https://www.jobs.ac.uk/job/CMI339/research-associate-in-computational-mathematics all the best --cjc Professor Colin Cotter (he/him) Department of Mathematics 755, Huxley Building Imperial College London South Kensington Campus United Kingdom of Great Britain and Northern Ireland +44 2075943468 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From thibault.bridelbertomeu at gmail.com Fri Jan 14 07:23:34 2022 From: thibault.bridelbertomeu at gmail.com (Thibault Bridel-Bertomeu) Date: Fri, 14 Jan 2022 14:23:34 +0100 Subject: [petsc-users] DMPlex filter with Face Sets Message-ID: Dear all, I have a new question - in relation with my other ongoing thread "Fluid-Structure interaction with multiple DMPlex" actually. Is it possible to filter a DMPlex and keep in the new piece of DM the strata from the Face Sets label that belong to that piece ? Consider for instance the following example : // Standard includes #include // PETSc includes #include #include #include int main( int argc, /**< Number of CLI arguments */ char* argv[] /**< List of CLI arguments as strings */ ) { User user; DM dm; PetscErrorCode ierr = EXIT_SUCCESS; // Start-up the PETSc interface ierr = PetscInitialize(&argc, &argv, (char*)0, NULL);if (ierr) return ierr; // Allocate whatever needs to be ierr = PetscNew(&user);CHKERRQ(ierr); // Parse the mesh given by the user //+ Mesh filename ierr = PetscOptionsBegin(PETSC_COMM_WORLD, NULL, "PETSc DMPlex Coupled Physics Demo Mesh Options", "");CHKERRQ(ierr); { ierr = PetscStrcpy(user->meshName, "data/divided_square/divided_square_gmsh22ascii.msh");CHKERRQ(ierr); ierr = PetscOptionsString("-meshname", "Mesh filename", "", user->meshName, user->meshName, sizeof(user->meshName), NULL);CHKERRQ(ierr); } ierr = PetscOptionsEnd();CHKERRQ(ierr); //+ Read the mesh in ierr = DMPlexCreateFromFile(PETSC_COMM_WORLD, user->meshName, "CoupledPhysics_Plex", PETSC_TRUE, &dm);CHKERRQ(ierr); ierr = DMSetFromOptions(dm);CHKERRQ(ierr); ierr = DMViewFromOptions(dm, NULL, "-dm_view");CHKERRQ(ierr); // Extract both domains separately //+ Label of the fluid domain DMLabel fluidLabel; ierr = DMGetLabel(dm, "Fluid", &fluidLabel);CHKERRQ(ierr); //+ Sub-DM for the fluid domain DM fluidDM; // ierr = DMPlexFilter(dm, domainsLabel, 2, &fluidDM);CHKERRQ(ierr); ierr = DMPlexFilter(dm, fluidLabel, 2, &fluidDM);CHKERRQ(ierr); ierr = DMViewFromOptions(fluidDM, NULL, "-fluid_dm_view");CHKERRQ(ierr); //+ Label of the solid domain DMLabel solidLabel; ierr = DMGetLabel(dm, "Solid", &solidLabel);CHKERRQ(ierr); //+ Sub-DM for the solid domain DM solidDM; ierr = DMPlexFilter(dm, solidLabel, 1, &solidDM);CHKERRQ(ierr); ierr = DMViewFromOptions(solidDM, NULL, "-solid_dm_view");CHKERRQ(ierr); // Close the PETSc interface and end the parallel communicator ierr = PetscFree(user);CHKERRQ(ierr); ierr = PetscFinalize(); return(ierr); } run with the mesh attached to this email and the command ./program -meshname divided_square_gmsh41ascii.msh -dm_plex_gmsh_use_regions -dm_view -fluid_dm_view. 
It yields : DM Object: CoupledPhysics_Plex 1 MPI processes type: plex CoupledPhysics_Plex in 2 dimensions: Number of 0-cells per rank: 1409 Number of 1-cells per rank: 4084 Number of 2-cells per rank: 2676 Labels: celltype: 3 strata with value/size (0 (1409), 3 (2676), 1 (4084)) depth: 3 strata with value/size (0 (1409), 1 (4084), 2 (2676)) Cell Sets: 2 strata with value/size (1 (924), 2 (1752)) Solid: 1 strata with value/size (1 (924)) Fluid: 1 strata with value/size (2 (1752)) Face Sets: 5 strata with value/size (4 (60), 3 (40), 7 (20), 5 (40), 6 (20)) Insulation: 1 strata with value/size (4 (60)) Wall: 1 strata with value/size (3 (40)) Outlet: 1 strata with value/size (7 (20)) Freestream: 1 strata with value/size (5 (40)) Inlet: 1 strata with value/size (6 (20)) DM Object: 1 MPI processes type: plex DM_0x557ff009be70_1 in 2 dimensions: Number of 0-cells per rank: 937 Number of 1-cells per rank: 2688 Number of 2-cells per rank: 1752 Labels: celltype: 3 strata with value/size (0 (937), 1 (2688), 3 (1752)) depth: 3 strata with value/size (0 (937), 1 (2688), 2 (1752)) In the attached mesh, Wall, Outlet, Freestream and Inlet "belong" to the Fluid domain, therefore I would like to transfer them to the fluidDM during or right after the DMPlexFilter action. Is that possible ? I was looking at extracting the IS from those labels of dm and creating new labels in fluidDM from those IS, but the indices won't match, will they ? The filtering operation renumbers everything right ? Thank you very much !! Thibault Bridel-Bertomeu ? Eng, MSc, PhD Research Engineer CEA/CESTA 33114 LE BARP Tel.: (+33)557046924 Mob.: (+33)611025322 Mail: thibault.bridelbertomeu at gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: divided_square_gmsh41ascii.msh Type: model/mesh Size: 111184 bytes Desc: not available URL: From knepley at gmail.com Fri Jan 14 07:37:23 2022 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 14 Jan 2022 08:37:23 -0500 Subject: [petsc-users] DMPlex filter with Face Sets In-Reply-To: References: Message-ID: On Fri, Jan 14, 2022 at 8:23 AM Thibault Bridel-Bertomeu < thibault.bridelbertomeu at gmail.com> wrote: > Dear all, > > I have a new question - in relation with my other ongoing thread > "Fluid-Structure interaction with multiple DMPlex" actually. > Is it possible to filter a DMPlex and keep in the new piece of DM the > strata from the Face Sets label that belong to that piece ? > You are right. We are not doing labels. That is just an oversight. I will fix it. 
Thanks, Matt > Consider for instance the following example : > > // Standard includes > #include > > // PETSc includes > #include > #include > #include > > int main( > int argc, /**< Number of CLI arguments */ > char* argv[] /**< List of CLI arguments as strings */ > ) > { > User user; > DM dm; > PetscErrorCode ierr = EXIT_SUCCESS; > > // Start-up the PETSc interface > ierr = PetscInitialize(&argc, &argv, (char*)0, NULL);if (ierr) return > ierr; > > // Allocate whatever needs to be > ierr = PetscNew(&user);CHKERRQ(ierr); > > // Parse the mesh given by the user > //+ Mesh filename > ierr = PetscOptionsBegin(PETSC_COMM_WORLD, NULL, "PETSc DMPlex Coupled > Physics Demo Mesh Options", "");CHKERRQ(ierr); > { > ierr = PetscStrcpy(user->meshName, > "data/divided_square/divided_square_gmsh22ascii.msh");CHKERRQ(ierr); > ierr = PetscOptionsString("-meshname", "Mesh filename", "", > user->meshName, user->meshName, sizeof(user->meshName), NULL);CHKERRQ > (ierr); > } > ierr = PetscOptionsEnd();CHKERRQ(ierr); > //+ Read the mesh in > ierr = DMPlexCreateFromFile(PETSC_COMM_WORLD, user->meshName, > "CoupledPhysics_Plex", PETSC_TRUE, &dm);CHKERRQ(ierr); > ierr = DMSetFromOptions(dm);CHKERRQ(ierr); > ierr = DMViewFromOptions(dm, NULL, "-dm_view");CHKERRQ(ierr); > > // Extract both domains separately > //+ Label of the fluid domain > DMLabel fluidLabel; > ierr = DMGetLabel(dm, "Fluid", &fluidLabel);CHKERRQ(ierr); > //+ Sub-DM for the fluid domain > DM fluidDM; > // ierr = DMPlexFilter(dm, domainsLabel, 2, &fluidDM);CHKERRQ(ierr); > ierr = DMPlexFilter(dm, fluidLabel, 2, &fluidDM);CHKERRQ(ierr); > ierr = DMViewFromOptions(fluidDM, NULL, "-fluid_dm_view");CHKERRQ(ierr); > //+ Label of the solid domain > DMLabel solidLabel; > ierr = DMGetLabel(dm, "Solid", &solidLabel);CHKERRQ(ierr); > //+ Sub-DM for the solid domain > DM solidDM; > ierr = DMPlexFilter(dm, solidLabel, 1, &solidDM);CHKERRQ(ierr); > ierr = DMViewFromOptions(solidDM, NULL, "-solid_dm_view");CHKERRQ(ierr); > > // Close the PETSc interface and end the parallel communicator > ierr = PetscFree(user);CHKERRQ(ierr); > ierr = PetscFinalize(); > return(ierr); > } > > run with the mesh attached to this email and the command ./program > -meshname divided_square_gmsh41ascii.msh -dm_plex_gmsh_use_regions -dm_view > -fluid_dm_view. 
> It yields : > > DM Object: CoupledPhysics_Plex 1 MPI processes > type: plex > CoupledPhysics_Plex in 2 dimensions: > Number of 0-cells per rank: 1409 > Number of 1-cells per rank: 4084 > Number of 2-cells per rank: 2676 > Labels: > celltype: 3 strata with value/size (0 (1409), 3 (2676), 1 (4084)) > depth: 3 strata with value/size (0 (1409), 1 (4084), 2 (2676)) > Cell Sets: 2 strata with value/size (1 (924), 2 (1752)) > Solid: 1 strata with value/size (1 (924)) > Fluid: 1 strata with value/size (2 (1752)) > Face Sets: 5 strata with value/size (4 (60), 3 (40), 7 (20), 5 (40), 6 > (20)) > Insulation: 1 strata with value/size (4 (60)) > Wall: 1 strata with value/size (3 (40)) > Outlet: 1 strata with value/size (7 (20)) > Freestream: 1 strata with value/size (5 (40)) > Inlet: 1 strata with value/size (6 (20)) > > DM Object: 1 MPI processes > type: plex > DM_0x557ff009be70_1 in 2 dimensions: > Number of 0-cells per rank: 937 > Number of 1-cells per rank: 2688 > Number of 2-cells per rank: 1752 > Labels: > celltype: 3 strata with value/size (0 (937), 1 (2688), 3 (1752)) > depth: 3 strata with value/size (0 (937), 1 (2688), 2 (1752)) > > In the attached mesh, Wall, Outlet, Freestream and Inlet "belong" to the > Fluid domain, therefore I would like to transfer them to the fluidDM during > or right after the DMPlexFilter action. > Is that possible ? I was looking at extracting the IS from those labels of > dm and creating new labels in fluidDM from those IS, but the indices won't > match, will they ? The filtering operation renumbers everything right ? > > Thank you very much !! > > Thibault Bridel-Bertomeu > ? > Eng, MSc, PhD > Research Engineer > CEA/CESTA > 33114 LE BARP > Tel.: (+33)557046924 > Mob.: (+33)611025322 > Mail: thibault.bridelbertomeu at gmail.com > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From thibault.bridelbertomeu at gmail.com Fri Jan 14 07:55:11 2022 From: thibault.bridelbertomeu at gmail.com (Thibault Bridel-Bertomeu) Date: Fri, 14 Jan 2022 14:55:11 +0100 Subject: [petsc-users] DMPlex filter with Face Sets In-Reply-To: References: Message-ID: Hi Matt, Thank you for your help in this matter ! Do you mind posting the link to the gitlab issue when you have the occasion to open it ? Thanks ! Thibault Le ven. 14 janv. 2022 ? 14:37, Matthew Knepley a ?crit : > On Fri, Jan 14, 2022 at 8:23 AM Thibault Bridel-Bertomeu < > thibault.bridelbertomeu at gmail.com> wrote: > >> Dear all, >> >> I have a new question - in relation with my other ongoing thread >> "Fluid-Structure interaction with multiple DMPlex" actually. >> Is it possible to filter a DMPlex and keep in the new piece of DM the >> strata from the Face Sets label that belong to that piece ? >> > > You are right. We are not doing labels. That is just an oversight. I will > fix it. 
> > Thanks, > > Matt > > >> Consider for instance the following example : >> >> // Standard includes >> #include >> >> // PETSc includes >> #include >> #include >> #include >> >> int main( >> int argc, /**< Number of CLI arguments */ >> char* argv[] /**< List of CLI arguments as strings */ >> ) >> { >> User user; >> DM dm; >> PetscErrorCode ierr = EXIT_SUCCESS; >> >> // Start-up the PETSc interface >> ierr = PetscInitialize(&argc, &argv, (char*)0, NULL);if (ierr) return >> ierr; >> >> // Allocate whatever needs to be >> ierr = PetscNew(&user);CHKERRQ(ierr); >> >> // Parse the mesh given by the user >> //+ Mesh filename >> ierr = PetscOptionsBegin(PETSC_COMM_WORLD, NULL, "PETSc DMPlex Coupled >> Physics Demo Mesh Options", "");CHKERRQ(ierr); >> { >> ierr = PetscStrcpy(user->meshName, >> "data/divided_square/divided_square_gmsh22ascii.msh");CHKERRQ(ierr); >> ierr = PetscOptionsString("-meshname", "Mesh filename", "", >> user->meshName, user->meshName, sizeof(user->meshName), NULL);CHKERRQ >> (ierr); >> } >> ierr = PetscOptionsEnd();CHKERRQ(ierr); >> //+ Read the mesh in >> ierr = DMPlexCreateFromFile(PETSC_COMM_WORLD, user->meshName, >> "CoupledPhysics_Plex", PETSC_TRUE, &dm);CHKERRQ(ierr); >> ierr = DMSetFromOptions(dm);CHKERRQ(ierr); >> ierr = DMViewFromOptions(dm, NULL, "-dm_view");CHKERRQ(ierr); >> >> // Extract both domains separately >> //+ Label of the fluid domain >> DMLabel fluidLabel; >> ierr = DMGetLabel(dm, "Fluid", &fluidLabel);CHKERRQ(ierr); >> //+ Sub-DM for the fluid domain >> DM fluidDM; >> // ierr = DMPlexFilter(dm, domainsLabel, 2, &fluidDM);CHKERRQ(ierr); >> ierr = DMPlexFilter(dm, fluidLabel, 2, &fluidDM);CHKERRQ(ierr); >> ierr = DMViewFromOptions(fluidDM, NULL, "-fluid_dm_view");CHKERRQ(ierr); >> //+ Label of the solid domain >> DMLabel solidLabel; >> ierr = DMGetLabel(dm, "Solid", &solidLabel);CHKERRQ(ierr); >> //+ Sub-DM for the solid domain >> DM solidDM; >> ierr = DMPlexFilter(dm, solidLabel, 1, &solidDM);CHKERRQ(ierr); >> ierr = DMViewFromOptions(solidDM, NULL, "-solid_dm_view");CHKERRQ(ierr); >> >> // Close the PETSc interface and end the parallel communicator >> ierr = PetscFree(user);CHKERRQ(ierr); >> ierr = PetscFinalize(); >> return(ierr); >> } >> >> run with the mesh attached to this email and the command ./program >> -meshname divided_square_gmsh41ascii.msh -dm_plex_gmsh_use_regions -dm_view >> -fluid_dm_view. 
>> It yields : >> >> DM Object: CoupledPhysics_Plex 1 MPI processes >> type: plex >> CoupledPhysics_Plex in 2 dimensions: >> Number of 0-cells per rank: 1409 >> Number of 1-cells per rank: 4084 >> Number of 2-cells per rank: 2676 >> Labels: >> celltype: 3 strata with value/size (0 (1409), 3 (2676), 1 (4084)) >> depth: 3 strata with value/size (0 (1409), 1 (4084), 2 (2676)) >> Cell Sets: 2 strata with value/size (1 (924), 2 (1752)) >> Solid: 1 strata with value/size (1 (924)) >> Fluid: 1 strata with value/size (2 (1752)) >> Face Sets: 5 strata with value/size (4 (60), 3 (40), 7 (20), 5 (40), 6 >> (20)) >> Insulation: 1 strata with value/size (4 (60)) >> Wall: 1 strata with value/size (3 (40)) >> Outlet: 1 strata with value/size (7 (20)) >> Freestream: 1 strata with value/size (5 (40)) >> Inlet: 1 strata with value/size (6 (20)) >> >> DM Object: 1 MPI processes >> type: plex >> DM_0x557ff009be70_1 in 2 dimensions: >> Number of 0-cells per rank: 937 >> Number of 1-cells per rank: 2688 >> Number of 2-cells per rank: 1752 >> Labels: >> celltype: 3 strata with value/size (0 (937), 1 (2688), 3 (1752)) >> depth: 3 strata with value/size (0 (937), 1 (2688), 2 (1752)) >> >> In the attached mesh, Wall, Outlet, Freestream and Inlet "belong" to the >> Fluid domain, therefore I would like to transfer them to the fluidDM during >> or right after the DMPlexFilter action. >> Is that possible ? I was looking at extracting the IS from those labels >> of dm and creating new labels in fluidDM from those IS, but the indices >> won't match, will they ? The filtering operation renumbers everything right >> ? >> >> Thank you very much !! >> >> Thibault Bridel-Bertomeu >> ? >> Eng, MSc, PhD >> Research Engineer >> CEA/CESTA >> 33114 LE BARP >> Tel.: (+33)557046924 >> Mob.: (+33)611025322 >> Mail: thibault.bridelbertomeu at gmail.com >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From thibault.bridelbertomeu at gmail.com Fri Jan 14 08:07:25 2022 From: thibault.bridelbertomeu at gmail.com (Thibault Bridel-Bertomeu) Date: Fri, 14 Jan 2022 15:07:25 +0100 Subject: [petsc-users] DMPlex filter with Face Sets In-Reply-To: References: Message-ID: Also, if we still consider my example with Solid and Fluid, let's image we call DMPlexFilter twice. We then get two new DMs with Solid in one and Fluid in the other one. The labels will be communicated, so fluidDM will still know Wall, Inlet, Freestream and Outlet and on the other hand, solidDM will still know Wall and Insulation : those two domain share the Wall stratum of the Face Sets. Can I extract data at the Wall label from the solidDM and transfer it to the Wall label of the fluidDM ? Thanks ! Thibault Le ven. 14 janv. 2022 ? 14:55, Thibault Bridel-Bertomeu < thibault.bridelbertomeu at gmail.com> a ?crit : > Hi Matt, > > Thank you for your help in this matter ! > Do you mind posting the link to the gitlab issue when you have the > occasion to open it ? > > Thanks ! > Thibault > > > Le ven. 14 janv. 2022 ? 14:37, Matthew Knepley a > ?crit : > >> On Fri, Jan 14, 2022 at 8:23 AM Thibault Bridel-Bertomeu < >> thibault.bridelbertomeu at gmail.com> wrote: >> >>> Dear all, >>> >>> I have a new question - in relation with my other ongoing thread >>> "Fluid-Structure interaction with multiple DMPlex" actually. 
>>> Is it possible to filter a DMPlex and keep in the new piece of DM the >>> strata from the Face Sets label that belong to that piece ? >>> >> >> You are right. We are not doing labels. That is just an oversight. I will >> fix it. >> >> Thanks, >> >> Matt >> >> >>> Consider for instance the following example : >>> >>> // Standard includes >>> #include >>> >>> // PETSc includes >>> #include >>> #include >>> #include >>> >>> int main( >>> int argc, /**< Number of CLI arguments */ >>> char* argv[] /**< List of CLI arguments as strings */ >>> ) >>> { >>> User user; >>> DM dm; >>> PetscErrorCode ierr = EXIT_SUCCESS; >>> >>> // Start-up the PETSc interface >>> ierr = PetscInitialize(&argc, &argv, (char*)0, NULL);if (ierr) return >>> ierr; >>> >>> // Allocate whatever needs to be >>> ierr = PetscNew(&user);CHKERRQ(ierr); >>> >>> // Parse the mesh given by the user >>> //+ Mesh filename >>> ierr = PetscOptionsBegin(PETSC_COMM_WORLD, NULL, "PETSc DMPlex Coupled >>> Physics Demo Mesh Options", "");CHKERRQ(ierr); >>> { >>> ierr = PetscStrcpy(user->meshName, >>> "data/divided_square/divided_square_gmsh22ascii.msh");CHKERRQ(ierr); >>> ierr = PetscOptionsString("-meshname", "Mesh filename", "", >>> user->meshName, user->meshName, sizeof(user->meshName), NULL);CHKERRQ >>> (ierr); >>> } >>> ierr = PetscOptionsEnd();CHKERRQ(ierr); >>> //+ Read the mesh in >>> ierr = DMPlexCreateFromFile(PETSC_COMM_WORLD, user->meshName, >>> "CoupledPhysics_Plex", PETSC_TRUE, &dm);CHKERRQ(ierr); >>> ierr = DMSetFromOptions(dm);CHKERRQ(ierr); >>> ierr = DMViewFromOptions(dm, NULL, "-dm_view");CHKERRQ(ierr); >>> >>> // Extract both domains separately >>> //+ Label of the fluid domain >>> DMLabel fluidLabel; >>> ierr = DMGetLabel(dm, "Fluid", &fluidLabel);CHKERRQ(ierr); >>> //+ Sub-DM for the fluid domain >>> DM fluidDM; >>> // ierr = DMPlexFilter(dm, domainsLabel, 2, &fluidDM);CHKERRQ(ierr); >>> ierr = DMPlexFilter(dm, fluidLabel, 2, &fluidDM);CHKERRQ(ierr); >>> ierr = DMViewFromOptions(fluidDM, NULL, "-fluid_dm_view");CHKERRQ(ierr); >>> //+ Label of the solid domain >>> DMLabel solidLabel; >>> ierr = DMGetLabel(dm, "Solid", &solidLabel);CHKERRQ(ierr); >>> //+ Sub-DM for the solid domain >>> DM solidDM; >>> ierr = DMPlexFilter(dm, solidLabel, 1, &solidDM);CHKERRQ(ierr); >>> ierr = DMViewFromOptions(solidDM, NULL, "-solid_dm_view");CHKERRQ(ierr); >>> >>> // Close the PETSc interface and end the parallel communicator >>> ierr = PetscFree(user);CHKERRQ(ierr); >>> ierr = PetscFinalize(); >>> return(ierr); >>> } >>> >>> run with the mesh attached to this email and the command ./program >>> -meshname divided_square_gmsh41ascii.msh -dm_plex_gmsh_use_regions -dm_view >>> -fluid_dm_view. 
>>> It yields : >>> >>> DM Object: CoupledPhysics_Plex 1 MPI processes >>> type: plex >>> CoupledPhysics_Plex in 2 dimensions: >>> Number of 0-cells per rank: 1409 >>> Number of 1-cells per rank: 4084 >>> Number of 2-cells per rank: 2676 >>> Labels: >>> celltype: 3 strata with value/size (0 (1409), 3 (2676), 1 (4084)) >>> depth: 3 strata with value/size (0 (1409), 1 (4084), 2 (2676)) >>> Cell Sets: 2 strata with value/size (1 (924), 2 (1752)) >>> Solid: 1 strata with value/size (1 (924)) >>> Fluid: 1 strata with value/size (2 (1752)) >>> Face Sets: 5 strata with value/size (4 (60), 3 (40), 7 (20), 5 (40), 6 >>> (20)) >>> Insulation: 1 strata with value/size (4 (60)) >>> Wall: 1 strata with value/size (3 (40)) >>> Outlet: 1 strata with value/size (7 (20)) >>> Freestream: 1 strata with value/size (5 (40)) >>> Inlet: 1 strata with value/size (6 (20)) >>> >>> DM Object: 1 MPI processes >>> type: plex >>> DM_0x557ff009be70_1 in 2 dimensions: >>> Number of 0-cells per rank: 937 >>> Number of 1-cells per rank: 2688 >>> Number of 2-cells per rank: 1752 >>> Labels: >>> celltype: 3 strata with value/size (0 (937), 1 (2688), 3 (1752)) >>> depth: 3 strata with value/size (0 (937), 1 (2688), 2 (1752)) >>> >>> In the attached mesh, Wall, Outlet, Freestream and Inlet "belong" to the >>> Fluid domain, therefore I would like to transfer them to the fluidDM during >>> or right after the DMPlexFilter action. >>> Is that possible ? I was looking at extracting the IS from those labels >>> of dm and creating new labels in fluidDM from those IS, but the indices >>> won't match, will they ? The filtering operation renumbers everything right >>> ? >>> >>> Thank you very much !! >>> >>> Thibault Bridel-Bertomeu >>> ? >>> Eng, MSc, PhD >>> Research Engineer >>> CEA/CESTA >>> 33114 LE BARP >>> Tel.: (+33)557046924 >>> Mob.: (+33)611025322 >>> Mail: thibault.bridelbertomeu at gmail.com >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Jan 14 08:12:41 2022 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 14 Jan 2022 09:12:41 -0500 Subject: [petsc-users] DMPlex filter with Face Sets In-Reply-To: References: Message-ID: On Fri, Jan 14, 2022 at 9:07 AM Thibault Bridel-Bertomeu < thibault.bridelbertomeu at gmail.com> wrote: > Also, if we still consider my example with Solid and Fluid, let's image we > call DMPlexFilter twice. We then get two new DMs with Solid in one and > Fluid in the other one. > > The labels will be communicated, so fluidDM will still know Wall, Inlet, > Freestream and Outlet and on the other hand, solidDM will still know Wall > and Insulation : those two domain share the Wall stratum of the Face Sets. > > Can I extract data at the Wall label from the solidDM and transfer it to > the Wall label of the fluidDM ? > Yes, conceptually here is how that would work. You iterate over the label, extracting the values you want. You map those points to points in the original mesh using the subpointMap, and then map them again using the subpointMap from the fluidDM down to it. Now you can insert the values using the section in the fluidDM. I think the easiest way to do this is to setup a VecScatter (or PetscSF) from one boundary to the other. 
Then you would just stick in the two vectors and call VecScatterBegin/End() If this turns out to be useful, this construction is something we could easily automate in the library. Thanks, Matt > Thanks ! > Thibault > > > Le ven. 14 janv. 2022 ? 14:55, Thibault Bridel-Bertomeu < > thibault.bridelbertomeu at gmail.com> a ?crit : > >> Hi Matt, >> >> Thank you for your help in this matter ! >> Do you mind posting the link to the gitlab issue when you have the >> occasion to open it ? >> >> Thanks ! >> Thibault >> >> >> Le ven. 14 janv. 2022 ? 14:37, Matthew Knepley a >> ?crit : >> >>> On Fri, Jan 14, 2022 at 8:23 AM Thibault Bridel-Bertomeu < >>> thibault.bridelbertomeu at gmail.com> wrote: >>> >>>> Dear all, >>>> >>>> I have a new question - in relation with my other ongoing thread >>>> "Fluid-Structure interaction with multiple DMPlex" actually. >>>> Is it possible to filter a DMPlex and keep in the new piece of DM the >>>> strata from the Face Sets label that belong to that piece ? >>>> >>> >>> You are right. We are not doing labels. That is just an oversight. I >>> will fix it. >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> Consider for instance the following example : >>>> >>>> // Standard includes >>>> #include >>>> >>>> // PETSc includes >>>> #include >>>> #include >>>> #include >>>> >>>> int main( >>>> int argc, /**< Number of CLI arguments */ >>>> char* argv[] /**< List of CLI arguments as strings */ >>>> ) >>>> { >>>> User user; >>>> DM dm; >>>> PetscErrorCode ierr = EXIT_SUCCESS; >>>> >>>> // Start-up the PETSc interface >>>> ierr = PetscInitialize(&argc, &argv, (char*)0, NULL);if (ierr) return >>>> ierr; >>>> >>>> // Allocate whatever needs to be >>>> ierr = PetscNew(&user);CHKERRQ(ierr); >>>> >>>> // Parse the mesh given by the user >>>> //+ Mesh filename >>>> ierr = PetscOptionsBegin(PETSC_COMM_WORLD, NULL, "PETSc DMPlex Coupled >>>> Physics Demo Mesh Options", "");CHKERRQ(ierr); >>>> { >>>> ierr = PetscStrcpy(user->meshName, >>>> "data/divided_square/divided_square_gmsh22ascii.msh");CHKERRQ(ierr); >>>> ierr = PetscOptionsString("-meshname", "Mesh filename", "", >>>> user->meshName, user->meshName, sizeof(user->meshName), NULL);CHKERRQ >>>> (ierr); >>>> } >>>> ierr = PetscOptionsEnd();CHKERRQ(ierr); >>>> //+ Read the mesh in >>>> ierr = DMPlexCreateFromFile(PETSC_COMM_WORLD, user->meshName, >>>> "CoupledPhysics_Plex", PETSC_TRUE, &dm);CHKERRQ(ierr); >>>> ierr = DMSetFromOptions(dm);CHKERRQ(ierr); >>>> ierr = DMViewFromOptions(dm, NULL, "-dm_view");CHKERRQ(ierr); >>>> >>>> // Extract both domains separately >>>> //+ Label of the fluid domain >>>> DMLabel fluidLabel; >>>> ierr = DMGetLabel(dm, "Fluid", &fluidLabel);CHKERRQ(ierr); >>>> //+ Sub-DM for the fluid domain >>>> DM fluidDM; >>>> // ierr = DMPlexFilter(dm, domainsLabel, 2, &fluidDM);CHKERRQ(ierr); >>>> ierr = DMPlexFilter(dm, fluidLabel, 2, &fluidDM);CHKERRQ(ierr); >>>> ierr = DMViewFromOptions(fluidDM, NULL, "-fluid_dm_view");CHKERRQ >>>> (ierr); >>>> //+ Label of the solid domain >>>> DMLabel solidLabel; >>>> ierr = DMGetLabel(dm, "Solid", &solidLabel);CHKERRQ(ierr); >>>> //+ Sub-DM for the solid domain >>>> DM solidDM; >>>> ierr = DMPlexFilter(dm, solidLabel, 1, &solidDM);CHKERRQ(ierr); >>>> ierr = DMViewFromOptions(solidDM, NULL, "-solid_dm_view");CHKERRQ >>>> (ierr); >>>> >>>> // Close the PETSc interface and end the parallel communicator >>>> ierr = PetscFree(user);CHKERRQ(ierr); >>>> ierr = PetscFinalize(); >>>> return(ierr); >>>> } >>>> >>>> run with the mesh attached to this email and the command ./program >>>> 
-meshname divided_square_gmsh41ascii.msh -dm_plex_gmsh_use_regions -dm_view >>>> -fluid_dm_view. >>>> It yields : >>>> >>>> DM Object: CoupledPhysics_Plex 1 MPI processes >>>> type: plex >>>> CoupledPhysics_Plex in 2 dimensions: >>>> Number of 0-cells per rank: 1409 >>>> Number of 1-cells per rank: 4084 >>>> Number of 2-cells per rank: 2676 >>>> Labels: >>>> celltype: 3 strata with value/size (0 (1409), 3 (2676), 1 (4084)) >>>> depth: 3 strata with value/size (0 (1409), 1 (4084), 2 (2676)) >>>> Cell Sets: 2 strata with value/size (1 (924), 2 (1752)) >>>> Solid: 1 strata with value/size (1 (924)) >>>> Fluid: 1 strata with value/size (2 (1752)) >>>> Face Sets: 5 strata with value/size (4 (60), 3 (40), 7 (20), 5 (40), >>>> 6 (20)) >>>> Insulation: 1 strata with value/size (4 (60)) >>>> Wall: 1 strata with value/size (3 (40)) >>>> Outlet: 1 strata with value/size (7 (20)) >>>> Freestream: 1 strata with value/size (5 (40)) >>>> Inlet: 1 strata with value/size (6 (20)) >>>> >>>> DM Object: 1 MPI processes >>>> type: plex >>>> DM_0x557ff009be70_1 in 2 dimensions: >>>> Number of 0-cells per rank: 937 >>>> Number of 1-cells per rank: 2688 >>>> Number of 2-cells per rank: 1752 >>>> Labels: >>>> celltype: 3 strata with value/size (0 (937), 1 (2688), 3 (1752)) >>>> depth: 3 strata with value/size (0 (937), 1 (2688), 2 (1752)) >>>> >>>> In the attached mesh, Wall, Outlet, Freestream and Inlet "belong" to >>>> the Fluid domain, therefore I would like to transfer them to the fluidDM >>>> during or right after the DMPlexFilter action. >>>> Is that possible ? I was looking at extracting the IS from those labels >>>> of dm and creating new labels in fluidDM from those IS, but the indices >>>> won't match, will they ? The filtering operation renumbers everything right >>>> ? >>>> >>>> Thank you very much !! >>>> >>>> Thibault Bridel-Bertomeu >>>> ? >>>> Eng, MSc, PhD >>>> Research Engineer >>>> CEA/CESTA >>>> 33114 LE BARP >>>> Tel.: (+33)557046924 >>>> Mob.: (+33)611025322 >>>> Mail: thibault.bridelbertomeu at gmail.com >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From wencel at gmail.com Fri Jan 14 08:48:30 2022 From: wencel at gmail.com (Lawrence Mitchell) Date: Fri, 14 Jan 2022 14:48:30 +0000 Subject: [petsc-users] DMPlex filter with Face Sets In-Reply-To: References: Message-ID: <2AB212C1-49B9-4FE0-B9C9-DFCB355E1A27@gmail.com> > On 14 Jan 2022, at 14:12, Matthew Knepley wrote: > > On Fri, Jan 14, 2022 at 9:07 AM Thibault Bridel-Bertomeu wrote: > Also, if we still consider my example with Solid and Fluid, let's image we call DMPlexFilter twice. We then get two new DMs with Solid in one and Fluid in the other one. > > The labels will be communicated, so fluidDM will still know Wall, Inlet, Freestream and Outlet and on the other hand, solidDM will still know Wall and Insulation : those two domain share the Wall stratum of the Face Sets. > > Can I extract data at the Wall label from the solidDM and transfer it to the Wall label of the fluidDM ? > > Yes, conceptually here is how that would work. 
You iterate over the label, extracting the values you want. You map those points > to points in the original mesh using the subpointMap, and then map them again using the subpointMap from the fluidDM down to it. > Now you can insert the values using the section in the fluidDM. I think the easiest way to do this is to setup a VecScatter (or PetscSF) > from one boundary to the other. Then you would just stick in the two vectors and call VecScatterBegin/End() > > If this turns out to be useful, this construction is something we could easily automate in the library. I've needed this too, I think I have some code lying around, let me see if I can port it into DMPlexFilter.. Lawrence From knepley at gmail.com Fri Jan 14 08:52:19 2022 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 14 Jan 2022 09:52:19 -0500 Subject: [petsc-users] DMPlex filter with Face Sets In-Reply-To: <2AB212C1-49B9-4FE0-B9C9-DFCB355E1A27@gmail.com> References: <2AB212C1-49B9-4FE0-B9C9-DFCB355E1A27@gmail.com> Message-ID: On Fri, Jan 14, 2022 at 9:48 AM Lawrence Mitchell wrote: > > On 14 Jan 2022, at 14:12, Matthew Knepley wrote: > > > > On Fri, Jan 14, 2022 at 9:07 AM Thibault Bridel-Bertomeu < > thibault.bridelbertomeu at gmail.com> wrote: > > Also, if we still consider my example with Solid and Fluid, let's image > we call DMPlexFilter twice. We then get two new DMs with Solid in one and > Fluid in the other one. > > > > The labels will be communicated, so fluidDM will still know Wall, Inlet, > Freestream and Outlet and on the other hand, solidDM will still know Wall > and Insulation : those two domain share the Wall stratum of the Face Sets. > > > > Can I extract data at the Wall label from the solidDM and transfer it to > the Wall label of the fluidDM ? > > > > Yes, conceptually here is how that would work. You iterate over the > label, extracting the values you want. You map those points > > to points in the original mesh using the subpointMap, and then map them > again using the subpointMap from the fluidDM down to it. > > Now you can insert the values using the section in the fluidDM. I think > the easiest way to do this is to setup a VecScatter (or PetscSF) > > from one boundary to the other. Then you would just stick in the two > vectors and call VecScatterBegin/End() > > > > If this turns out to be useful, this construction is something we could > easily automate in the library. > > I've needed this too, I think I have some code lying around, let me see if > I can port it into DMPlexFilter.. > I did the initial implementation for the label filtering. It is here https://gitlab.com/petsc/petsc/-/merge_requests/4717 Thibault, can you try it on your example? I will not have time to code up a nice test until I get home from this conference. Thanks, Matt > Lawrence -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
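As a concrete sketch of the mapping Matt describes, and not code from the merge request itself, the following assumes fluidDM was produced by DMPlexFilter() from the original dm as in the example earlier in the thread (so it carries a subpoint map), and uses the Wall value 3 from that mesh; the exact helper used here, DMPlexGetSubpointIS(), should be checked against the installed PETSc version:

/* Sketch only: map the "Wall" faces of the filtered fluid DM back to point
 * numbers of the original (unfiltered) dm via the subpoint IS of the filter. */
DMLabel         wallLabel;
IS              wallIS, subpointIS;
const PetscInt *wallPoints, *subpoints;
PetscInt        nWall = 0, p;

ierr = DMGetLabel(fluidDM, "Wall", &wallLabel);CHKERRQ(ierr);
ierr = DMLabelGetStratumIS(wallLabel, 3, &wallIS);CHKERRQ(ierr);  /* "Wall" has value 3 in this mesh */
ierr = DMPlexGetSubpointIS(fluidDM, &subpointIS);CHKERRQ(ierr);   /* filtered point -> original point */
if (wallIS) {                                                     /* may be NULL on ranks with no Wall faces */
  ierr = ISGetLocalSize(wallIS, &nWall);CHKERRQ(ierr);
  ierr = ISGetIndices(wallIS, &wallPoints);CHKERRQ(ierr);
  ierr = ISGetIndices(subpointIS, &subpoints);CHKERRQ(ierr);
  for (p = 0; p < nWall; ++p) {
    const PetscInt fluidPoint = wallPoints[p];          /* face number in fluidDM */
    const PetscInt origPoint  = subpoints[fluidPoint];  /* the same face in the unfiltered dm */
    /* ... gather or scatter data indexed by origPoint here ... */
  }
  ierr = ISRestoreIndices(subpointIS, &subpoints);CHKERRQ(ierr);
  ierr = ISRestoreIndices(wallIS, &wallPoints);CHKERRQ(ierr);
  ierr = ISDestroy(&wallIS);CHKERRQ(ierr);
}

Going the other way, from solidDM to fluidDM as asked above, additionally needs the reverse lookup on the fluid side; building a VecScatter or PetscSF once from these index pairs, as Matt suggests, avoids redoing that search at every transfer.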
URL: From junchao.zhang at gmail.com Fri Jan 14 19:58:28 2022 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Fri, 14 Jan 2022 19:58:28 -0600 Subject: [petsc-users] Strange CUDA failure with a second petscfinalize with PETSc 3.16 In-Reply-To: <46A67106-A80C-4057-A301-E3C13236CC18@outlook.com> References: <46A67106-A80C-4057-A301-E3C13236CC18@outlook.com> Message-ID: Jacob, Could you have a look as it seems the "invalid device context" is in your newly added module? Thanks --Junchao Zhang On Fri, Jan 14, 2022 at 12:49 AM Hao DONG wrote: > Dear All, > > > > I have encountered a peculiar problem when fiddling with a code with PETSC > 3.16.3 (which worked fine with PETSc 3.15). It is a very straight forward > PDE-based optimization code which repeatedly solves a linearized PDE > problem with KSP in a subroutine (the rest of the code does not contain any > PETSc related content). The main program provides the subroutine with an > MPI comm. Then I set the comm as PETSC_COMM_WORLD to tell PETSC to attach > to it (and detach with it when the solving is finished each time). > > > > Strangely, I observe a CUDA failure whenever the petscfinalize is called > for a *second* time. In other words, the first and second PDE calculations > with GPU are fine (with correct solutions). The petsc code just fails after > the SECOND petscfinalize command is called. You can also see the PETSC > config in the error message: > > > > [1]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [1]PETSC ERROR: GPU error > > [1]PETSC ERROR: cuda error 201 (cudaErrorDeviceUninitialized) : invalid > device context > > [1]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. > > [1]PETSC ERROR: Petsc Release Version 3.16.3, unknown > > [1]PETSC ERROR: maxwell.gpu on a named stratosphere by hao Fri Jan 14 > 10:21:05 2022 > > [1]PETSC ERROR: Configure options > --prefix=/opt/petsc/complex-double-with-cuda --with-cc=mpicc > --with-cxx=mpicxx --with-fc=mpif90 COPTFLAGS="-O3 -mavx2" CXXOPTFLAGS="-O3 > -mavx2" FOPTFLAGS="-O3 -ffree-line-length-none -mavx2" CUDAOPTFLAGS=-O3 > --with-cxx-dialect=cxx14 --with-cuda-dialect=cxx14 > --with-scalar-type=complex --with-precision=double > --with-cuda-dir=/usr/local/cuda --with-debugging=1 > > [1]PETSC ERROR: #1 PetscFinalize() at > /home/hao/packages/petsc-current/src/sys/objects/pinit.c:1638 > > You might have forgotten to call PetscInitialize(). > > The EXACT line numbers in the error traceback are not available. > > Instead the line number of the start of the function is given. > > [1] #1 PetscAbortFindSourceFile_Private() at > /home/hao/packages/petsc-current/src/sys/error/err.c:35 > > [1] #2 PetscLogGetStageLog() at > /home/hao/packages/petsc-current/src/sys/logging/utils/stagelog.c:29 > > [1] #3 PetscClassIdRegister() at > /home/hao/packages/petsc-current/src/sys/logging/plog.c:2376 > > [1] #4 MatMFFDInitializePackage() at > /home/hao/packages/petsc-current/src/mat/impls/mffd/mffd.c:45 > > [1] #5 MatInitializePackage() at > /home/hao/packages/petsc-current/src/mat/interface/dlregismat.c:163 > > [1] #6 MatCreate() at > /home/hao/packages/petsc-current/src/mat/utils/gcreate.c:77 > > > > However, it doesn?t seem to affect the other part of my code, so the code > can continue running until it gets to the petsc part again (the **third** > time). Unfortunately, it doesn?t give me any further information even if I > set the debugging to yes in the configure file. 
It also worth noting that > PETSC without CUDA (i.e. with simple MATMPIAIJ) works perfectly fine. > > > > I am able to re-produce the problem with a toy code modified from ex11f. > Please see the attached file (ex11fc.F90) for details. Essentially the > code does the same thing as ex11f, but three times with a do loop. To do > that I added an extra MPI_INIT/MPI_FINALIZE to ensure that the MPI > communicator is not destroyed when PETSC_FINALIZE is called. I used the > PetscOptionsHasName utility to check if you have ?-usecuda? in the options. > So running the code with and without that option can give you a comparison > w/o CUDA. I can see that the code also fails after the second loop of the > KSP operation. Could you kindly shed some lights on this problem? > > > > I should say that I am not even sure if the problem is from PETSc, as I > also accidentally updated the NVIDIA driver (for now it is 510.06 with cuda > 11.6). And it is well known that NVIDIA can give you some surprise in the > updates (yes, I know I shouldn?t have touched that if it?s not broken). But > my CUDA code without PETSC (which basically does the same PDE thing, but > with cusparse/cublas directly) seems to work just fine after the update. It > is also possible that my petsc code related to CUDA was not quite > ?legitimate? ? I just use: > > MatSetType(A, MATMPIAIJCUSPARSE, ierr) > > and > > MatCreateVecs(A, u, PETSC_NULL_VEC, ierr) > > to make the data onto GPU. I would very much appreciate it if you could > show me the ?right? way to do that. > > > > Thanks a lot in advance, and all the best, > > Hao > -------------- next part -------------- An HTML attachment was scrubbed... URL: From thibault.bridelbertomeu at gmail.com Sat Jan 15 06:07:23 2022 From: thibault.bridelbertomeu at gmail.com (Thibault Bridel-Bertomeu) Date: Sat, 15 Jan 2022 13:07:23 +0100 Subject: [petsc-users] DMPlex filter with Face Sets In-Reply-To: References: <2AB212C1-49B9-4FE0-B9C9-DFCB355E1A27@gmail.com> Message-ID: Le ven. 14 janv. 2022 ? 15:52, Matthew Knepley a ?crit : > On Fri, Jan 14, 2022 at 9:48 AM Lawrence Mitchell > wrote: > >> > On 14 Jan 2022, at 14:12, Matthew Knepley wrote: >> > >> > On Fri, Jan 14, 2022 at 9:07 AM Thibault Bridel-Bertomeu < >> thibault.bridelbertomeu at gmail.com> wrote: >> > Also, if we still consider my example with Solid and Fluid, let's image >> we call DMPlexFilter twice. We then get two new DMs with Solid in one and >> Fluid in the other one. >> > >> > The labels will be communicated, so fluidDM will still know Wall, >> Inlet, Freestream and Outlet and on the other hand, solidDM will still know >> Wall and Insulation : those two domain share the Wall stratum of the Face >> Sets. >> > >> > Can I extract data at the Wall label from the solidDM and transfer it >> to the Wall label of the fluidDM ? >> > >> > Yes, conceptually here is how that would work. You iterate over the >> label, extracting the values you want. You map those points >> > to points in the original mesh using the subpointMap, and then map them >> again using the subpointMap from the fluidDM down to it. >> > Now you can insert the values using the section in the fluidDM. I think >> the easiest way to do this is to setup a VecScatter (or PetscSF) >> > from one boundary to the other. Then you would just stick in the two >> vectors and call VecScatterBegin/End() >> > >> > If this turns out to be useful, this construction is something we could >> easily automate in the library. 
>> >> I've needed this too, I think I have some code lying around, let me see >> if I can port it into DMPlexFilter.. > > Hi Lawrence, Thank you so much, that would be great ! Would you mind sending the original snippet of code, maybe I can figure it out ? > > I did the initial implementation for the label filtering. It is here > > https://gitlab.com/petsc/petsc/-/merge_requests/4717 > > Thibault, can you try it on your example? I will not have time to code up > a nice test until I get home from this conference. > Hi Matt, Thank you for the quick mod. It works well, the labels are passed down to the filtered DMPlex as we can see in the example below : // Overall plex : DM Object: CoupledPhysics_Plex 1 MPI processes type: plex CoupledPhysics_Plex in 2 dimensions: Number of 0-cells per rank: 1409 Number of 1-cells per rank: 4084 Number of 2-cells per rank: 2676 Labels: celltype: 3 strata with value/size (0 (1409), 3 (2676), 1 (4084)) depth: 3 strata with value/size (0 (1409), 1 (4084), 2 (2676)) Cell Sets: 2 strata with value/size (1 (924), 2 (1752)) Solid: 1 strata with value/size (1 (924)) Fluid: 1 strata with value/size (2 (1752)) Face Sets: 5 strata with value/size (4 (60), 3 (40), 7 (20), 5 (40), 6 (20)) Insulation: 1 strata with value/size (4 (60)) Wall: 1 strata with value/size (3 (40)) Outlet: 1 strata with value/size (7 (20)) Freestream: 1 strata with value/size (5 (40)) Inlet: 1 strata with value/size (6 (20)) // Fluid plex : DM Object: 1 MPI processes type: plex DM_0x12471c2f0_1 in 2 dimensions: Number of 0-cells per rank: 937 Number of 1-cells per rank: 2688 Number of 2-cells per rank: 1752 Labels: celltype: 3 strata with value/size (0 (937), 1 (2688), 3 (1752)) depth: 3 strata with value/size (0 (937), 1 (2688), 2 (1752)) Cell Sets: 1 strata with value/size (2 (1752)) Solid: 0 strata with value/size () Fluid: 1 strata with value/size (2 (1752)) Face Sets: 4 strata with value/size (3 (40), 7 (20), 5 (40), 6 (20)) Insulation: 0 strata with value/size () Wall: 1 strata with value/size (3 (40)) Outlet: 1 strata with value/size (7 (20)) Freestream: 1 strata with value/size (5 (40)) Inlet: 1 strata with value/size (6 (20)) // Solid plex : DM Object: 1 MPI processes type: plex DM_0x12471c2f0_2 in 2 dimensions: Number of 0-cells per rank: 513 Number of 1-cells per rank: 1436 Number of 2-cells per rank: 924 Labels: celltype: 3 strata with value/size (0 (513), 1 (1436), 3 (924)) depth: 3 strata with value/size (0 (513), 1 (1436), 2 (924)) Cell Sets: 1 strata with value/size (1 (924)) Solid: 1 strata with value/size (1 (924)) Fluid: 0 strata with value/size () Face Sets: 2 strata with value/size (4 (60), 3 (40)) Insulation: 1 strata with value/size (4 (60)) Wall: 1 strata with value/size (3 (40)) Outlet: 0 strata with value/size () Freestream: 0 strata with value/size () Inlet: 0 strata with value/size () I think it would be perfect if the 0-sized labels were also completely filtered out. Is that something that you could add to the DMPlexFliterLabels_Internal function ? By the way, I believe there is a typo in the name of the function, it should read "F*il*ter". Thanks again !! Thibault Thanks, > > Matt > > >> Lawrence > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... 
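On the suggestion of dropping globally empty labels: whether a stratum is empty everywhere can only be decided collectively (see the reply that follows), so an application-side guard before a collective call such as DMAddBoundary() could look like the sketch below, reusing the Insulation label and its value 4 from the mesh above; this is an illustration, not an API proposal:

/* Sketch: decide collectively whether a face-set label is empty on the filtered DM,
 * so that every rank takes the same branch before a collective call. */
DMLabel  label;
PetscInt nLocal = 0, nGlobal = 0;

ierr = DMGetLabel(fluidDM, "Insulation", &label);CHKERRQ(ierr);
if (label) {ierr = DMLabelGetStratumSize(label, 4, &nLocal);CHKERRQ(ierr);}
ierr = MPIU_Allreduce(&nLocal, &nGlobal, 1, MPIU_INT, MPI_SUM, PetscObjectComm((PetscObject)fluidDM));CHKERRQ(ierr);
if (nGlobal > 0) {
  /* all ranks reach this point together, so it is safe to add the boundary condition */
}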
URL: From jed at jedbrown.org Sat Jan 15 08:23:31 2022 From: jed at jedbrown.org (Jed Brown) Date: Sat, 15 Jan 2022 07:23:31 -0700 Subject: [petsc-users] DMPlex filter with Face Sets In-Reply-To: References: <2AB212C1-49B9-4FE0-B9C9-DFCB355E1A27@gmail.com> Message-ID: <878rvhoz6k.fsf@jedbrown.org> Thibault Bridel-Bertomeu writes: > I think it would be perfect if the 0-sized labels were also completely > filtered out. Is that something that you could add to the > DMPlexFliterLabels_Internal function ? Why do you want it filtered? Determining that it's zero sized requires all processes agreeing, and DMAddBoundary is collective so *no* ranks can add a boundary condition unless the label exists on *all* ranks (even if zero sized on many). From jacob.fai at gmail.com Sat Jan 15 10:12:25 2022 From: jacob.fai at gmail.com (Jacob Faibussowitsch) Date: Sat, 15 Jan 2022 11:12:25 -0500 Subject: [petsc-users] Strange CUDA failure with a second petscfinalize with PETSc 3.16 In-Reply-To: References: <46A67106-A80C-4057-A301-E3C13236CC18@outlook.com> Message-ID: I don?t quite understand how it is getting to the CUDA error to be honest. None of the code in the stack trace is anywhere near the device code. Reading the error message carefully, it first chokes on PetscLogGetStageLog() from a call to PetscClassIdRegister(): PetscErrorCode PetscLogGetStageLog(PetscStageLog *stageLog) { PetscFunctionBegin; PetscValidPointer(stageLog,1); if (!petsc_stageLog) { fprintf(stderr, "PETSC ERROR: Logging has not been enabled.\nYou might have forgotten to call PetscInitialize().\n"); PETSCABORT(MPI_COMM_WORLD, PETSC_ERR_SUP); // Here } ... But then jumps to PetscFinalize(). You can also see the "You might have forgotten to call PetscInitialize().? message in the error message, just under the topmost level of the stack trace. Can you check the value of ierr of each function call (use the CHKERRA() macro to do so)? I suspect the problem here that errors occurring previously in the program are being ignored, leading to the garbled stack trace. Best regards, Jacob Faibussowitsch (Jacob Fai - booss - oh - vitch) > On Jan 14, 2022, at 20:58, Junchao Zhang wrote: > > Jacob, > Could you have a look as it seems the "invalid device context" is in your newly added module? > Thanks > --Junchao Zhang > > > On Fri, Jan 14, 2022 at 12:49 AM Hao DONG > wrote: > Dear All, > > > > I have encountered a peculiar problem when fiddling with a code with PETSC 3.16.3 (which worked fine with PETSc 3.15). It is a very straight forward PDE-based optimization code which repeatedly solves a linearized PDE problem with KSP in a subroutine (the rest of the code does not contain any PETSc related content). The main program provides the subroutine with an MPI comm. Then I set the comm as PETSC_COMM_WORLD to tell PETSC to attach to it (and detach with it when the solving is finished each time). > > > > Strangely, I observe a CUDA failure whenever the petscfinalize is called for a *second* time. In other words, the first and second PDE calculations with GPU are fine (with correct solutions). The petsc code just fails after the SECOND petscfinalize command is called. You can also see the PETSC config in the error message: > > > > [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [1]PETSC ERROR: GPU error > > [1]PETSC ERROR: cuda error 201 (cudaErrorDeviceUninitialized) : invalid device context > > [1]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. 
> > [1]PETSC ERROR: Petsc Release Version 3.16.3, unknown > > [1]PETSC ERROR: maxwell.gpu on a named stratosphere by hao Fri Jan 14 10:21:05 2022 > > [1]PETSC ERROR: Configure options --prefix=/opt/petsc/complex-double-with-cuda --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 COPTFLAGS="-O3 -mavx2" CXXOPTFLAGS="-O3 -mavx2" FOPTFLAGS="-O3 -ffree-line-length-none -mavx2" CUDAOPTFLAGS=-O3 --with-cxx-dialect=cxx14 --with-cuda-dialect=cxx14 --with-scalar-type=complex --with-precision=double --with-cuda-dir=/usr/local/cuda --with-debugging=1 > > [1]PETSC ERROR: #1 PetscFinalize() at /home/hao/packages/petsc-current/src/sys/objects/pinit.c:1638 > > You might have forgotten to call PetscInitialize(). > > The EXACT line numbers in the error traceback are not available. > > Instead the line number of the start of the function is given. > > [1] #1 PetscAbortFindSourceFile_Private() at /home/hao/packages/petsc-current/src/sys/error/err.c:35 > > [1] #2 PetscLogGetStageLog() at /home/hao/packages/petsc-current/src/sys/logging/utils/stagelog.c:29 > > [1] #3 PetscClassIdRegister() at /home/hao/packages/petsc-current/src/sys/logging/plog.c:2376 > > [1] #4 MatMFFDInitializePackage() at /home/hao/packages/petsc-current/src/mat/impls/mffd/mffd.c:45 > > [1] #5 MatInitializePackage() at /home/hao/packages/petsc-current/src/mat/interface/dlregismat.c:163 > > [1] #6 MatCreate() at /home/hao/packages/petsc-current/src/mat/utils/gcreate.c:77 > > > > However, it doesn?t seem to affect the other part of my code, so the code can continue running until it gets to the petsc part again (the *third* time). Unfortunately, it doesn?t give me any further information even if I set the debugging to yes in the configure file. It also worth noting that PETSC without CUDA (i.e. with simple MATMPIAIJ) works perfectly fine. > > > > I am able to re-produce the problem with a toy code modified from ex11f. Please see the attached file (ex11fc.F90) for details. Essentially the code does the same thing as ex11f, but three times with a do loop. To do that I added an extra MPI_INIT/MPI_FINALIZE to ensure that the MPI communicator is not destroyed when PETSC_FINALIZE is called. I used the PetscOptionsHasName utility to check if you have ?-usecuda? in the options. So running the code with and without that option can give you a comparison w/o CUDA. I can see that the code also fails after the second loop of the KSP operation. Could you kindly shed some lights on this problem? > > > > I should say that I am not even sure if the problem is from PETSc, as I also accidentally updated the NVIDIA driver (for now it is 510.06 with cuda 11.6). And it is well known that NVIDIA can give you some surprise in the updates (yes, I know I shouldn?t have touched that if it?s not broken). But my CUDA code without PETSC (which basically does the same PDE thing, but with cusparse/cublas directly) seems to work just fine after the update. It is also possible that my petsc code related to CUDA was not quite ?legitimate? ? I just use: > > MatSetType(A, MATMPIAIJCUSPARSE, ierr) > > and > > MatCreateVecs(A, u, PETSC_NULL_VEC, ierr) > > to make the data onto GPU. I would very much appreciate it if you could show me the ?right? way to do that. > > > > Thanks a lot in advance, and all the best, > > Hao > -------------- next part -------------- An HTML attachment was scrubbed... 
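Regarding the quoted question about the "right" way to put the matrix and vectors on the GPU: one common pattern, shown here as a C sketch even though the reproducer is Fortran, is to leave the type choice to the options database so the same executable runs with or without CUDA; this is a convention many PETSc examples follow rather than an official recommendation, and n below is a placeholder for the global problem size:

/* Sketch: let the command line pick the backend, e.g. run with
 *   -mat_type aijcusparse   to store A on the GPU (CUDA builds only);
 * without the option the default AIJ (CPU) type is used. */
Mat A;
Vec u, b;

ierr = MatCreate(PETSC_COMM_WORLD, &A);CHKERRQ(ierr);
ierr = MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n);CHKERRQ(ierr);
ierr = MatSetType(A, MATAIJ);CHKERRQ(ierr);        /* default type, overridden by -mat_type */
ierr = MatSetFromOptions(A);CHKERRQ(ierr);
/* ... preallocate and assemble A as usual ... */
ierr = MatCreateVecs(A, &u, &b);CHKERRQ(ierr);     /* u and b inherit a GPU vector type from A */

Hard-coding MatSetType(A, MATMPIAIJCUSPARSE, ierr) as in the reproducer should also be legitimate; the options-based variant just turns the CPU/GPU comparison into a runtime switch instead of a code change.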
URL: From thibault.bridelbertomeu at gmail.com Sat Jan 15 10:58:55 2022 From: thibault.bridelbertomeu at gmail.com (Thibault Bridel-Bertomeu) Date: Sat, 15 Jan 2022 17:58:55 +0100 Subject: [petsc-users] DMPlex filter with Face Sets In-Reply-To: <878rvhoz6k.fsf@jedbrown.org> References: <2AB212C1-49B9-4FE0-B9C9-DFCB355E1A27@gmail.com> <878rvhoz6k.fsf@jedbrown.org> Message-ID: Le sam. 15 janv. 2022 ? 15:23, Jed Brown a ?crit : > Thibault Bridel-Bertomeu writes: > > > I think it would be perfect if the 0-sized labels were also completely > > filtered out. Is that something that you could add to the > > DMPlexFliterLabels_Internal function ? > > Why do you want it filtered? Determining that it's zero sized requires all > processes agreeing, and DMAddBoundary is collective so *no* ranks can add a > boundary condition unless the label exists on *all* ranks (even if zero > sized on many). > Well, an empty boundary condition globally-speaking will not be used - that?s my main argument for taking it out altogether. I understand that a bunch of extra Allgather might be too costly if many calls to DMPlexFilter are made, but if the users know that they?ll call it maybe only once or twice, then it would make sense to clean the filtered pieces. What about an option for the DMPlexFilter set to false by default to prevent extra MPI comms? -- Thibault Bridel-Bertomeu ? Eng, MSc, PhD Research Engineer CEA/CESTA 33114 LE BARP Tel.: (+33)557046924 Mob.: (+33)611025322 Mail: thibault.bridelbertomeu at gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From varunhiremath at gmail.com Sun Jan 16 03:40:16 2022 From: varunhiremath at gmail.com (Varun Hiremath) Date: Sun, 16 Jan 2022 01:40:16 -0800 Subject: [petsc-users] PETSc MUMPS interface In-Reply-To: References: Message-ID: Hi Hong, Thank you, this is very helpful! Using this method I am able to get the memory estimates before the actual solve, however, I think my code may be causing the symbolic factorization to be run twice. Attached is my code where I am using SLEPc to compute eigenvalues, and I use MUMPS for factorization. I have commented above the code that computes the memory estimates, could you please check and tell me if this would cause the symbolic factor to be computed twice (a second time inside EPSSolve?), as I am seeing a slight increase in the overall computation time? Regards, Varun On Wed, Jan 12, 2022 at 7:58 AM Zhang, Hong wrote: > PCFactorSetMatSolverType(pc,MATSOLVERMUMPS); > PCFactorSetUpMatSolverType(pc); > PCFactorGetMatrix(pc,&F); > > MatLUFactorSymbolic(F,A,...) > You must provide row and column permutations etc, > petsc/src/mat/tests/ex125.c may give you a clue on how to get these > inputs. > > Hong > > > ------------------------------ > *From:* petsc-users on behalf of > Junchao Zhang > *Sent:* Wednesday, January 12, 2022 9:03 AM > *To:* Varun Hiremath > *Cc:* Peder J?rgensgaard Olesen via petsc-users > *Subject:* Re: [petsc-users] PETSc MUMPS interface > > Calling PCSetUp() before KSPSetUp()? > > --Junchao Zhang > > > On Wed, Jan 12, 2022 at 3:00 AM Varun Hiremath > wrote: > > Hi All, > > I want to collect MUMPS memory estimates based on the initial > symbolic factorization analysis before the actual numerical factorization > starts to check if the estimated memory requirements fit the available > memory. 
> > I am following the steps from > https://petsc.org/main/src/ksp/ksp/tutorials/ex52.c.html > > PCFactorSetMatSolverType(pc,MATSOLVERMUMPS); > PCFactorSetUpMatSolverType(pc); > PCFactorGetMatrix(pc,&F); > > KSPSetUp(ksp); > MatMumpsGetInfog(F,...) > > But it appears KSPSetUp calls both symbolic and numerical factorization. > So is there some other way to get these statistics before the actual > factorization starts? > > Thanks, > Varun > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: slepc_eps_mumps_test.cpp Type: application/octet-stream Size: 2798 bytes Desc: not available URL: From varunhiremath at gmail.com Sun Jan 16 04:37:04 2022 From: varunhiremath at gmail.com (Varun Hiremath) Date: Sun, 16 Jan 2022 02:37:04 -0800 Subject: [petsc-users] Inconsistent PETSc MUMPS statistics Message-ID: Hi All, I am using SLEPc to compute eigenvalues and MUMPS for factorization. Please find attached: 1) A simple test program slepc_eps_mumps_test.cpp that reads a given PETSc matrix and computes the smallest eigenvalues using MUMPS for factorization 2) An example PETSc matrix MatA of size 581343 rows (sending in .gz format via Google drive link, please extract it "gunzip MatA.gz" before using). You should be able to reproduce this issue with any other matrix of a similar or bigger size. I notice that when I run the attached test program in parallel with the attached test matrix the MUMPS statistics printed (using the -eps_view option in the command line) change with every run. This is how I run the test: *$ *mpiexec -n 24 ./slepc_eps_mumps_test.o -nev 5 -f MatA -eps_view and for example, the output of this includes the following MUMPS stats ... PC Object: (st_) 24 MPI processes type: lu out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: external factor fill ratio given 0., needed 0. Factored matrix follows: Mat Object: 24 MPI processes type: mumps rows=581343, cols=581343 package used to perform factorization: mumps *total: nonzeros=348236349, allocated nonzeros=348236349* MUMPS run parameters: SYM (matrix type): 0 ... I ran this test 10 times as follows and got a different number of nonzeros (line highlighted above ) reported in each run. (If you save the full output and compare, you will notice many other differences, but I wouldn't have expected the nonzeros to change with every run.) 
$ for i in `seq 1 10`; do echo "run $i :-----"; mpiexec -n 24 ./slepc_eps_mumps_test.o -nev 5 -f MatA -eps_view | grep -A 1 "factorization: mumps"; done

run 1 :-----
package used to perform factorization: mumps
total: nonzeros=354789915, allocated nonzeros=354789915
run 2 :-----
package used to perform factorization: mumps
total: nonzeros=359811101, allocated nonzeros=359811101
run 3 :-----
package used to perform factorization: mumps
total: nonzeros=354834871, allocated nonzeros=354834871
run 4 :-----
package used to perform factorization: mumps
total: nonzeros=354830397, allocated nonzeros=354830397
run 5 :-----
package used to perform factorization: mumps
total: nonzeros=353942929, allocated nonzeros=353942929
run 6 :-----
package used to perform factorization: mumps
total: nonzeros=354147241, allocated nonzeros=354147241
run 7 :-----
package used to perform factorization: mumps
total: nonzeros=354980083, allocated nonzeros=354980083
run 8 :-----
package used to perform factorization: mumps
total: nonzeros=354980083, allocated nonzeros=354980083
run 9 :-----
package used to perform factorization: mumps
total: nonzeros=354214219, allocated nonzeros=354214219
run 10 :-----
package used to perform factorization: mumps
total: nonzeros=355894047, allocated nonzeros=355894047

Can somebody please explain what causes these differences in MUMPS stats?

Thanks,
Varun
MatA.gz
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: slepc_eps_mumps_test.cpp
Type: application/octet-stream
Size: 2255 bytes
Desc: not available
URL: 

From jroman at dsic.upv.es Sun Jan 16 05:11:39 2022
From: jroman at dsic.upv.es (Jose E. Roman)
Date: Sun, 16 Jan 2022 12:11:39 +0100
Subject: [petsc-users] PETSc MUMPS interface
In-Reply-To: 
References: 
Message-ID: <1AD9EBCE-4B48-4D54-9829-DFD5EDC68B76@dsic.upv.es>

Hong may give a better answer, but if you look at PCSetUp_LU() https://petsc.org/main/src/ksp/pc/impls/factor/lu/lu.c.html#PCSetUp_LU you will see that MatLUFactorSymbolic() is called unconditionally during the first PCSetUp(). Currently there is no way to check if the user has already called MatLUFactorSymbolic().

Jose

> El 16 ene 2022, a las 10:40, Varun Hiremath escribió:
> 
> Hi Hong,
> 
> Thank you, this is very helpful!
> 
> Using this method I am able to get the memory estimates before the actual solve, however, I think my code may be causing the symbolic factorization to be run twice. Attached is my code where I am using SLEPc to compute eigenvalues, and I use MUMPS for factorization. I have commented above the code that computes the memory estimates, could you please check and tell me if this would cause the symbolic factor to be computed twice (a second time inside EPSSolve?), as I am seeing a slight increase in the overall computation time?
> 
> Regards,
> Varun
> 
> On Wed, Jan 12, 2022 at 7:58 AM Zhang, Hong wrote:
> PCFactorSetMatSolverType(pc,MATSOLVERMUMPS);
> PCFactorSetUpMatSolverType(pc);
> PCFactorGetMatrix(pc,&F);
> 
> MatLUFactorSymbolic(F,A,...)
> You must provide row and column permutations etc, petsc/src/mat/tests/ex125.c may give you a clue on how to get these inputs.
> 
> Hong
> 
> 
> From: petsc-users on behalf of Junchao Zhang
> Sent: Wednesday, January 12, 2022 9:03 AM
> To: Varun Hiremath
> Cc: Peder Jørgensgaard Olesen via petsc-users
> Subject: Re: [petsc-users] PETSc MUMPS interface
> 
> Calling PCSetUp() before KSPSetUp()?
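For reference, a minimal sketch in the spirit of Hong's recipe quoted above and of src/mat/tests/ex125.c, using MatGetFactor() directly so that only the symbolic phase runs before the MUMPS estimates are read; A is assumed to be the assembled system matrix, and the INFOG entries 16 and 17 are assumed here to hold the estimated per-process maximum and total memory in megabytes, which should be double-checked against the MUMPS manual:

/* Sketch: symbolic analysis only, then read the MUMPS estimates before deciding
 * whether to go ahead with the numeric factorization. */
Mat           F;
MatFactorInfo info;
IS            rowperm, colperm;
PetscInt      maxMB, totalMB;

ierr = MatGetFactor(A, MATSOLVERMUMPS, MAT_FACTOR_LU, &F);CHKERRQ(ierr);
ierr = MatGetOrdering(A, MATORDERINGNATURAL, &rowperm, &colperm);CHKERRQ(ierr); /* MUMPS orders internally anyway */
ierr = MatFactorInfoInitialize(&info);CHKERRQ(ierr);
ierr = MatLUFactorSymbolic(F, A, rowperm, colperm, &info);CHKERRQ(ierr);
ierr = MatMumpsGetInfog(F, 16, &maxMB);CHKERRQ(ierr);    /* assumed: estimated max memory per process (MB) */
ierr = MatMumpsGetInfog(F, 17, &totalMB);CHKERRQ(ierr);  /* assumed: estimated total memory (MB) */
ierr = PetscPrintf(PETSC_COMM_WORLD, "MUMPS estimates: %d MB max per process, %d MB total\n", (int)maxMB, (int)totalMB);CHKERRQ(ierr);
ierr = ISDestroy(&rowperm);CHKERRQ(ierr);
ierr = ISDestroy(&colperm);CHKERRQ(ierr);
/* if acceptable: MatLUFactorNumeric(F, A, &info); otherwise destroy F and bail out */

Note that a factor obtained this way (or even via PCFactorGetMatrix()) does not stop PCSetUp() from redoing the symbolic factorization later, which is exactly the duplication discussed in this thread.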
> > --Junchao Zhang > > > On Wed, Jan 12, 2022 at 3:00 AM Varun Hiremath wrote: > Hi All, > > I want to collect MUMPS memory estimates based on the initial symbolic factorization analysis before the actual numerical factorization starts to check if the estimated memory requirements fit the available memory. > > I am following the steps from https://petsc.org/main/src/ksp/ksp/tutorials/ex52.c.html > > PCFactorSetMatSolverType(pc,MATSOLVERMUMPS); > PCFactorSetUpMatSolverType(pc); > PCFactorGetMatrix(pc,&F); > > KSPSetUp(ksp); > MatMumpsGetInfog(F,...) > > But it appears KSPSetUp calls both symbolic and numerical factorization. So is there some other way to get these statistics before the actual factorization starts? > > Thanks, > Varun > From jroman at dsic.upv.es Sun Jan 16 05:28:31 2022 From: jroman at dsic.upv.es (Jose E. Roman) Date: Sun, 16 Jan 2022 12:28:31 +0100 Subject: [petsc-users] Inconsistent PETSc MUMPS statistics In-Reply-To: References: Message-ID: <29D6CB25-F735-4874-B3B4-EB450FF58E5E@dsic.upv.es> Probably someone else can give a better answer, but if I remember correctly ParMetis relies on certain random number generator that makes it produce different partitions for different runs. I think this was fixed in the ParMetis that --download-parmetis installs, but if I am not wrong you are not using that version. This would explain what you get. Jose > El 16 ene 2022, a las 11:37, Varun Hiremath escribi?: > > Hi All, > > I am using SLEPc to compute eigenvalues and MUMPS for factorization. > > Please find attached: > 1) A simple test program slepc_eps_mumps_test.cpp that reads a given PETSc matrix and computes the smallest eigenvalues using MUMPS for factorization > 2) An example PETSc matrix MatA of size 581343 rows (sending in .gz format via Google drive link, please extract it "gunzip MatA.gz" before using). You should be able to reproduce this issue with any other matrix of a similar or bigger size. > > I notice that when I run the attached test program in parallel with the attached test matrix the MUMPS statistics printed (using the -eps_view option in the command line) change with every run. > > This is how I run the test: > $ mpiexec -n 24 ./slepc_eps_mumps_test.o -nev 5 -f MatA -eps_view > and for example, the output of this includes the following MUMPS stats > ... > PC Object: (st_) 24 MPI processes > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: external > factor fill ratio given 0., needed 0. > Factored matrix follows: > Mat Object: 24 MPI processes > type: mumps > rows=581343, cols=581343 > package used to perform factorization: mumps > total: nonzeros=348236349, allocated nonzeros=348236349 > MUMPS run parameters: > SYM (matrix type): 0 > ... > > I ran this test 10 times as follows and got a different number of nonzeros (line highlighted above ) reported in each run. (If you save the full output and compare, you will notice many other differences, but I wouldn't have expected the nonzeros to change with every run.) 
> > $ for i in `seq 1 10`; do echo "run $i :-----"; mpiexec -n 24 ./slepc_eps_mumps_test.o -nev 5 -f MatA -eps_view | grep -A 1 "factorization: mumps"; done > > run 1 :----- > package used to perform factorization: mumps > total: nonzeros=354789915, allocated nonzeros=354789915 > run 2 :----- > package used to perform factorization: mumps > total: nonzeros=359811101, allocated nonzeros=359811101 > run 3 :----- > package used to perform factorization: mumps > total: nonzeros=354834871, allocated nonzeros=354834871 > run 4 :----- > package used to perform factorization: mumps > total: nonzeros=354830397, allocated nonzeros=354830397 > run 5 :----- > package used to perform factorization: mumps > total: nonzeros=353942929, allocated nonzeros=353942929 > run 6 :----- > package used to perform factorization: mumps > total: nonzeros=354147241, allocated nonzeros=354147241 > run 7 :----- > package used to perform factorization: mumps > total: nonzeros=354980083, allocated nonzeros=354980083 > run 8 :----- > package used to perform factorization: mumps > total: nonzeros=354980083, allocated nonzeros=354980083 > run 9 :----- > package used to perform factorization: mumps > total: nonzeros=354214219, allocated nonzeros=354214219 > run 10 :----- > package used to perform factorization: mumps > total: nonzeros=355894047, allocated nonzeros=355894047 > > Can somebody please explain what causes these differences in MUMPS stats? > > Thanks, > Varun > MatA.gz > > From pierre at joliv.et Sun Jan 16 08:09:29 2022 From: pierre at joliv.et (Pierre Jolivet) Date: Sun, 16 Jan 2022 15:09:29 +0100 Subject: [petsc-users] Inconsistent PETSc MUMPS statistics In-Reply-To: <29D6CB25-F735-4874-B3B4-EB450FF58E5E@dsic.upv.es> References: <29D6CB25-F735-4874-B3B4-EB450FF58E5E@dsic.upv.es> Message-ID: <0B934B14-BAC3-4050-9BC4-45CD3E6A95DC@joliv.et> Default renumbering is sequential. Since --download-parmetis requires --download-metis, I doubt Varun is using ParMETIS since the appropriate ICNTL flag is not explicitly set. If we go back to the original ?issue?, I believe this is because you are doing an LU factorization with a large number of off diagonal pivots, check INFOG(12), which are handled dynamically, and thus may yield different factors. Thanks, Pierre > On 16 Jan 2022, at 12:28 PM, Jose E. Roman wrote: > > Probably someone else can give a better answer, but if I remember correctly ParMetis relies on certain random number generator that makes it produce different partitions for different runs. I think this was fixed in the ParMetis that --download-parmetis installs, but if I am not wrong you are not using that version. This would explain what you get. > > Jose > > >> El 16 ene 2022, a las 11:37, Varun Hiremath escribi?: >> >> Hi All, >> >> I am using SLEPc to compute eigenvalues and MUMPS for factorization. >> >> Please find attached: >> 1) A simple test program slepc_eps_mumps_test.cpp that reads a given PETSc matrix and computes the smallest eigenvalues using MUMPS for factorization >> 2) An example PETSc matrix MatA of size 581343 rows (sending in .gz format via Google drive link, please extract it "gunzip MatA.gz" before using). You should be able to reproduce this issue with any other matrix of a similar or bigger size. >> >> I notice that when I run the attached test program in parallel with the attached test matrix the MUMPS statistics printed (using the -eps_view option in the command line) change with every run. 
>> >> This is how I run the test: >> $ mpiexec -n 24 ./slepc_eps_mumps_test.o -nev 5 -f MatA -eps_view >> and for example, the output of this includes the following MUMPS stats >> ... >> PC Object: (st_) 24 MPI processes >> type: lu >> out-of-place factorization >> tolerance for zero pivot 2.22045e-14 >> matrix ordering: external >> factor fill ratio given 0., needed 0. >> Factored matrix follows: >> Mat Object: 24 MPI processes >> type: mumps >> rows=581343, cols=581343 >> package used to perform factorization: mumps >> total: nonzeros=348236349, allocated nonzeros=348236349 >> MUMPS run parameters: >> SYM (matrix type): 0 >> ... >> >> I ran this test 10 times as follows and got a different number of nonzeros (line highlighted above ) reported in each run. (If you save the full output and compare, you will notice many other differences, but I wouldn't have expected the nonzeros to change with every run.) >> >> $ for i in `seq 1 10`; do echo "run $i :-----"; mpiexec -n 24 ./slepc_eps_mumps_test.o -nev 5 -f MatA -eps_view | grep -A 1 "factorization: mumps"; done >> >> run 1 :----- >> package used to perform factorization: mumps >> total: nonzeros=354789915, allocated nonzeros=354789915 >> run 2 :----- >> package used to perform factorization: mumps >> total: nonzeros=359811101, allocated nonzeros=359811101 >> run 3 :----- >> package used to perform factorization: mumps >> total: nonzeros=354834871, allocated nonzeros=354834871 >> run 4 :----- >> package used to perform factorization: mumps >> total: nonzeros=354830397, allocated nonzeros=354830397 >> run 5 :----- >> package used to perform factorization: mumps >> total: nonzeros=353942929, allocated nonzeros=353942929 >> run 6 :----- >> package used to perform factorization: mumps >> total: nonzeros=354147241, allocated nonzeros=354147241 >> run 7 :----- >> package used to perform factorization: mumps >> total: nonzeros=354980083, allocated nonzeros=354980083 >> run 8 :----- >> package used to perform factorization: mumps >> total: nonzeros=354980083, allocated nonzeros=354980083 >> run 9 :----- >> package used to perform factorization: mumps >> total: nonzeros=354214219, allocated nonzeros=354214219 >> run 10 :----- >> package used to perform factorization: mumps >> total: nonzeros=355894047, allocated nonzeros=355894047 >> >> Can somebody please explain what causes these differences in MUMPS stats? >> >> Thanks, >> Varun >> MatA.gz >> >> > From hzhang at mcs.anl.gov Sun Jan 16 18:01:41 2022 From: hzhang at mcs.anl.gov (Zhang, Hong) Date: Mon, 17 Jan 2022 00:01:41 +0000 Subject: [petsc-users] PETSc MUMPS interface In-Reply-To: <1AD9EBCE-4B48-4D54-9829-DFD5EDC68B76@dsic.upv.es> References: <1AD9EBCE-4B48-4D54-9829-DFD5EDC68B76@dsic.upv.es> Message-ID: Varun, I believe Jose is correct. You may verify it by running your code with option '-log_view', then check the number of calls to MatLUFactorSym. I guess I can add a flag in PCSetUp() to check if user has already called MatLUFactorSymbolic() and wants to skip it. Normally, users simply allocate sufficient memory in the symbolic factorization. Why do you want to check it? Hong ________________________________ From: Jose E. 
Roman Sent: Sunday, January 16, 2022 5:11 AM To: Varun Hiremath Cc: Zhang, Hong ; Peder J?rgensgaard Olesen via petsc-users Subject: Re: [petsc-users] PETSc MUMPS interface Hong may give a better answer, but if you look at PCSetUp_LU() https://petsc.org/main/src/ksp/pc/impls/factor/lu/lu.c.html#PCSetUp_LU you will see that MatLUFactorSymbolic() is called unconditionally during the first PCSetUp(). Currently there is no way to check if the user has already called MatLUFactorSymbolic(). Jose > El 16 ene 2022, a las 10:40, Varun Hiremath escribi?: > > Hi Hong, > > Thank you, this is very helpful! > > Using this method I am able to get the memory estimates before the actual solve, however, I think my code may be causing the symbolic factorization to be run twice. Attached is my code where I am using SLEPc to compute eigenvalues, and I use MUMPS for factorization. I have commented above the code that computes the memory estimates, could you please check and tell me if this would cause the symbolic factor to be computed twice (a second time inside EPSSolve?), as I am seeing a slight increase in the overall computation time? > > Regards, > Varun > > On Wed, Jan 12, 2022 at 7:58 AM Zhang, Hong wrote: > PCFactorSetMatSolverType(pc,MATSOLVERMUMPS); > PCFactorSetUpMatSolverType(pc); > PCFactorGetMatrix(pc,&F); > > MatLUFactorSymbolic(F,A,...) > You must provide row and column permutations etc, petsc/src/mat/tests/ex125.c may give you a clue on how to get these inputs. > > Hong > > > From: petsc-users on behalf of Junchao Zhang > Sent: Wednesday, January 12, 2022 9:03 AM > To: Varun Hiremath > Cc: Peder J?rgensgaard Olesen via petsc-users > Subject: Re: [petsc-users] PETSc MUMPS interface > > Calling PCSetUp() before KSPSetUp()? > > --Junchao Zhang > > > On Wed, Jan 12, 2022 at 3:00 AM Varun Hiremath wrote: > Hi All, > > I want to collect MUMPS memory estimates based on the initial symbolic factorization analysis before the actual numerical factorization starts to check if the estimated memory requirements fit the available memory. > > I am following the steps from https://petsc.org/main/src/ksp/ksp/tutorials/ex52.c.html > > PCFactorSetMatSolverType(pc,MATSOLVERMUMPS); > PCFactorSetUpMatSolverType(pc); > PCFactorGetMatrix(pc,&F); > > KSPSetUp(ksp); > MatMumpsGetInfog(F,...) > > But it appears KSPSetUp calls both symbolic and numerical factorization. So is there some other way to get these statistics before the actual factorization starts? > > Thanks, > Varun > -------------- next part -------------- An HTML attachment was scrubbed... URL: From varunhiremath at gmail.com Sun Jan 16 22:10:42 2022 From: varunhiremath at gmail.com (Varun Hiremath) Date: Sun, 16 Jan 2022 20:10:42 -0800 Subject: [petsc-users] PETSc MUMPS interface In-Reply-To: References: <1AD9EBCE-4B48-4D54-9829-DFD5EDC68B76@dsic.upv.es> Message-ID: Hi Jose, Hong, Thanks for the explanation. I have verified using -log_view that MatLUFactorSymbolic is indeed getting called twice. Hong, we use MUMPS solver for other things, and we typically run the symbolic analysis first and get memory estimates to ensure that we have enough memory available to run the case. If the available memory is not enough, we can stop or switch to the out-of-core (OOC) option in MUMPS. We wanted to do the same when using MUMPS via SLEPc/PETSc. Please let me know if there are other ways of getting these memory stats and switching options during runtime with PETSc. Appreciate your help! 
Thanks, Varun On Sun, Jan 16, 2022 at 4:01 PM Zhang, Hong wrote: > Varun, > I believe Jose is correct. You may verify it by running your code with > option '-log_view', then check the number of calls to MatLUFactorSym. > > I guess I can add a flag in PCSetUp() to check if user has already called > MatLUFactorSymbolic() and wants to skip it. Normally, users simply allocate > sufficient memory in the symbolic factorization. Why do you want to check > it? > Hong > > ------------------------------ > *From:* Jose E. Roman > *Sent:* Sunday, January 16, 2022 5:11 AM > *To:* Varun Hiremath > *Cc:* Zhang, Hong ; Peder J?rgensgaard Olesen via > petsc-users > *Subject:* Re: [petsc-users] PETSc MUMPS interface > > Hong may give a better answer, but if you look at PCSetUp_LU() > https://petsc.org/main/src/ksp/pc/impls/factor/lu/lu.c.html#PCSetUp_LU > you will see that MatLUFactorSymbolic() is called unconditionally during > the first PCSetUp(). Currently there is no way to check if the user has > already called MatLUFactorSymbolic(). > > Jose > > > > El 16 ene 2022, a las 10:40, Varun Hiremath > escribi?: > > > > Hi Hong, > > > > Thank you, this is very helpful! > > > > Using this method I am able to get the memory estimates before the > actual solve, however, I think my code may be causing the symbolic > factorization to be run twice. Attached is my code where I am using SLEPc > to compute eigenvalues, and I use MUMPS for factorization. I have commented > above the code that computes the memory estimates, could you please check > and tell me if this would cause the symbolic factor to be computed twice (a > second time inside EPSSolve?), as I am seeing a slight increase in the > overall computation time? > > > > Regards, > > Varun > > > > On Wed, Jan 12, 2022 at 7:58 AM Zhang, Hong wrote: > > PCFactorSetMatSolverType(pc,MATSOLVERMUMPS); > > PCFactorSetUpMatSolverType(pc); > > PCFactorGetMatrix(pc,&F); > > > > MatLUFactorSymbolic(F,A,...) > > You must provide row and column permutations etc, > petsc/src/mat/tests/ex125.c may give you a clue on how to get these inputs. > > > > Hong > > > > > > From: petsc-users on behalf of > Junchao Zhang > > Sent: Wednesday, January 12, 2022 9:03 AM > > To: Varun Hiremath > > Cc: Peder J?rgensgaard Olesen via petsc-users > > Subject: Re: [petsc-users] PETSc MUMPS interface > > > > Calling PCSetUp() before KSPSetUp()? > > > > --Junchao Zhang > > > > > > On Wed, Jan 12, 2022 at 3:00 AM Varun Hiremath > wrote: > > Hi All, > > > > I want to collect MUMPS memory estimates based on the initial symbolic > factorization analysis before the actual numerical factorization starts to > check if the estimated memory requirements fit the available memory. > > > > I am following the steps from > https://petsc.org/main/src/ksp/ksp/tutorials/ex52.c.html > > > > PCFactorSetMatSolverType(pc,MATSOLVERMUMPS); > > PCFactorSetUpMatSolverType(pc); > > PCFactorGetMatrix(pc,&F); > > > > KSPSetUp(ksp); > > MatMumpsGetInfog(F,...) > > > > But it appears KSPSetUp calls both symbolic and numerical factorization. > So is there some other way to get these statistics before the actual > factorization starts? > > > > Thanks, > > Varun > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
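A minimal sketch of the workflow discussed in this message, assuming an already assembled matrix A and a KSP ksp whose operators have been set (for SLEPc, the KSP obtained from the ST of the EPS), and following the PCFactorGetMatrix() route from ex52.c quoted earlier in the thread. The memory estimates are read from INFOG(16)/INFOG(17) after the symbolic phase (sizes in MB according to the MUMPS manual) and ICNTL(22) is switched to 1 for out-of-core factorization if they do not fit; the threshold available_mb is a made-up placeholder, not a value from this thread.

PC             pc;
Mat            F;
IS             perm, iperm;
MatFactorInfo  info;
PetscInt       est_mb_rank, est_mb_total;
PetscInt       available_mb = 16000;   /* placeholder: per-rank memory budget in MB */
PetscErrorCode ierr;

ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
ierr = PCSetType(pc, PCLU);CHKERRQ(ierr);
ierr = PCFactorSetMatSolverType(pc, MATSOLVERMUMPS);CHKERRQ(ierr);
ierr = PCFactorSetUpMatSolverType(pc);CHKERRQ(ierr);
ierr = PCFactorGetMatrix(pc, &F);CHKERRQ(ierr);

/* run only the symbolic (analysis) phase */
ierr = MatGetOrdering(A, MATORDERINGNATURAL, &perm, &iperm);CHKERRQ(ierr);
ierr = MatFactorInfoInitialize(&info);CHKERRQ(ierr);
ierr = MatLUFactorSymbolic(F, A, perm, iperm, &info);CHKERRQ(ierr);

/* INFOG(16): estimated working memory of the most loaded rank (MB);
   INFOG(17): estimated total working memory over all ranks (MB) */
ierr = MatMumpsGetInfog(F, 16, &est_mb_rank);CHKERRQ(ierr);
ierr = MatMumpsGetInfog(F, 17, &est_mb_total);CHKERRQ(ierr);

if (est_mb_rank > available_mb) {
  ierr = MatMumpsSetIcntl(F, 22, 1);CHKERRQ(ierr);  /* ICNTL(22)=1: out-of-core factorization */
}

The caveat raised by Jose still applies: unless the library is told that the symbolic phase is already done, the subsequent EPSSolve()/PCSetUp() will call MatLUFactorSymbolic() a second time.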
URL: From varunhiremath at gmail.com Sun Jan 16 22:42:21 2022 From: varunhiremath at gmail.com (Varun Hiremath) Date: Sun, 16 Jan 2022 20:42:21 -0800 Subject: [petsc-users] Inconsistent PETSc MUMPS statistics In-Reply-To: <0B934B14-BAC3-4050-9BC4-45CD3E6A95DC@joliv.et> References: <29D6CB25-F735-4874-B3B4-EB450FF58E5E@dsic.upv.es> <0B934B14-BAC3-4050-9BC4-45CD3E6A95DC@joliv.et> Message-ID: Hi Jose, Pierre, Yes, I am not using ParMetis or Metis. This is how I am configuring PETSc: $ ./configure --with-debugging=no --with-mpi-dir= --with-scalar-type=complex --download-mumps --download-scalapack --with-blaslapack-dir= Would you happen to know if there is any option available to force MUMPS to be deterministic/repeatable in parallel? I couldn't find anything in the MUMPS user guide that suggests non-deterministic behavior. For the example matrix that I shared, the final computed eigenvalues are within machine precision on multiple runs. However, for some other bigger cases that I am testing, I get different eigenvalues on multiple runs and I am trying to figure out the source of these inconsistencies. Thanks, Varun On Sun, Jan 16, 2022 at 6:09 AM Pierre Jolivet wrote: > Default renumbering is sequential. Since --download-parmetis requires > --download-metis, I doubt Varun is using ParMETIS since the appropriate > ICNTL flag is not explicitly set. > If we go back to the original ?issue?, I believe this is because you are > doing an LU factorization with a large number of off diagonal pivots, check > INFOG(12), which are handled dynamically, and thus may yield different > factors. > > Thanks, > Pierre > > > On 16 Jan 2022, at 12:28 PM, Jose E. Roman wrote: > > > > Probably someone else can give a better answer, but if I remember > correctly ParMetis relies on certain random number generator that makes it > produce different partitions for different runs. I think this was fixed in > the ParMetis that --download-parmetis installs, but if I am not wrong you > are not using that version. This would explain what you get. > > > > Jose > > > > > >> El 16 ene 2022, a las 11:37, Varun Hiremath > escribi?: > >> > >> Hi All, > >> > >> I am using SLEPc to compute eigenvalues and MUMPS for factorization. > >> > >> Please find attached: > >> 1) A simple test program slepc_eps_mumps_test.cpp that reads a given > PETSc matrix and computes the smallest eigenvalues using MUMPS for > factorization > >> 2) An example PETSc matrix MatA of size 581343 rows (sending in .gz > format via Google drive link, please extract it "gunzip MatA.gz" before > using). You should be able to reproduce this issue with any other matrix of > a similar or bigger size. > >> > >> I notice that when I run the attached test program in parallel with the > attached test matrix the MUMPS statistics printed (using the -eps_view > option in the command line) change with every run. > >> > >> This is how I run the test: > >> $ mpiexec -n 24 ./slepc_eps_mumps_test.o -nev 5 -f MatA -eps_view > >> and for example, the output of this includes the following MUMPS stats > >> ... > >> PC Object: (st_) 24 MPI processes > >> type: lu > >> out-of-place factorization > >> tolerance for zero pivot 2.22045e-14 > >> matrix ordering: external > >> factor fill ratio given 0., needed 0. 
> >> Factored matrix follows: > >> Mat Object: 24 MPI processes > >> type: mumps > >> rows=581343, cols=581343 > >> package used to perform factorization: mumps > >> total: nonzeros=348236349, allocated nonzeros=348236349 > >> MUMPS run parameters: > >> SYM (matrix type): 0 > >> ... > >> > >> I ran this test 10 times as follows and got a different number of > nonzeros (line highlighted above ) reported in each run. (If you save the > full output and compare, you will notice many other differences, but I > wouldn't have expected the nonzeros to change with every run.) > >> > >> $ for i in `seq 1 10`; do echo "run $i :-----"; mpiexec -n 24 > ./slepc_eps_mumps_test.o -nev 5 -f MatA -eps_view | grep -A 1 > "factorization: mumps"; done > >> > >> run 1 :----- > >> package used to perform factorization: mumps > >> total: nonzeros=354789915, allocated nonzeros=354789915 > >> run 2 :----- > >> package used to perform factorization: mumps > >> total: nonzeros=359811101, allocated nonzeros=359811101 > >> run 3 :----- > >> package used to perform factorization: mumps > >> total: nonzeros=354834871, allocated nonzeros=354834871 > >> run 4 :----- > >> package used to perform factorization: mumps > >> total: nonzeros=354830397, allocated nonzeros=354830397 > >> run 5 :----- > >> package used to perform factorization: mumps > >> total: nonzeros=353942929, allocated nonzeros=353942929 > >> run 6 :----- > >> package used to perform factorization: mumps > >> total: nonzeros=354147241, allocated nonzeros=354147241 > >> run 7 :----- > >> package used to perform factorization: mumps > >> total: nonzeros=354980083, allocated nonzeros=354980083 > >> run 8 :----- > >> package used to perform factorization: mumps > >> total: nonzeros=354980083, allocated nonzeros=354980083 > >> run 9 :----- > >> package used to perform factorization: mumps > >> total: nonzeros=354214219, allocated nonzeros=354214219 > >> run 10 :----- > >> package used to perform factorization: mumps > >> total: nonzeros=355894047, allocated nonzeros=355894047 > >> > >> Can somebody please explain what causes these differences in MUMPS > stats? > >> > >> Thanks, > >> Varun > >> MatA.gz > >> > >> > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From thibault.bridelbertomeu at gmail.com Mon Jan 17 04:23:49 2022 From: thibault.bridelbertomeu at gmail.com (Thibault Bridel-Bertomeu) Date: Mon, 17 Jan 2022 11:23:49 +0100 Subject: [petsc-users] PetscFV without ghost boundary conditions Message-ID: Hi everyone, I was wondering if it was possible to build a solver based on PetscFV without using ghost cells for the boundary conditions ? Would it be possible to call PetscDSAddBoundary with DM_BC_ESSENTIAL and so on instead of DM_BC_NATURAL_RIEMANN ? Mostly in the case of an hybrid problem with FVM and FEM (like for instance ex18.c from the TS tutorials), it would make sense that the boundaries shared by the two discretizations be set at the same locations, i.e. for FEM the quadrature points, wouldn't it ? Thanks !! Thibault -------------- next part -------------- An HTML attachment was scrubbed... 
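To make the question concrete: the usual PetscFV route, as in the TS finite-volume tutorials, first inserts a layer of ghost cells outside the boundary faces and then registers DM_BC_NATURAL_RIEMANN conditions that fill those ghosts; the question above asks whether that step can be skipped in favour of DM_BC_ESSENTIAL-style conditions applied directly at the faces. A minimal sketch of the standard ghost-cell step, assuming a distributed DMPlex mesh dm with its boundary faces already labeled:

DM gdm;

ierr = DMPlexConstructGhostCells(dm, NULL, NULL, &gdm);CHKERRQ(ierr);
ierr = DMDestroy(&dm);CHKERRQ(ierr);
dm   = gdm;   /* solve on the mesh that carries the boundary ghost cells */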
URL: From jed at jedbrown.org Mon Jan 17 08:52:47 2022 From: jed at jedbrown.org (Jed Brown) Date: Mon, 17 Jan 2022 07:52:47 -0700 Subject: [petsc-users] PetscFV without ghost boundary conditions In-Reply-To: References: Message-ID: <87czkqo1mo.fsf@jedbrown.org> Thibault Bridel-Bertomeu writes: > Hi everyone, > > I was wondering if it was possible to build a solver based on PetscFV > without using ghost cells for the boundary conditions ? > Would it be possible to call PetscDSAddBoundary with DM_BC_ESSENTIAL and so > on instead of DM_BC_NATURAL_RIEMANN ? Are you thinking this sets the value in a ghost cell or the value in a cell adjacent to the domain? For the former, the value in a ghost cell must typically be set to a (nonlinear for nonlinear PDE) function of the value in the adjacent cell to implement desired BCs. > Mostly in the case of an hybrid problem with FVM and FEM (like for instance > ex18.c from the TS tutorials), it would make sense that the boundaries > shared by the two discretizations be set at the same locations, i.e. for > FEM the quadrature points, wouldn't it ? You either have a surface integral (NATURAL boundary condition, with the integrand specified at quadrature points) or an ESSENTIAL condition, which is implemented nodally (with some lifting for non-Lagrange bases). From thibault.bridelbertomeu at gmail.com Mon Jan 17 08:58:53 2022 From: thibault.bridelbertomeu at gmail.com (Thibault Bridel-Bertomeu) Date: Mon, 17 Jan 2022 15:58:53 +0100 Subject: [petsc-users] PetscFV without ghost boundary conditions In-Reply-To: <87czkqo1mo.fsf@jedbrown.org> References: <87czkqo1mo.fsf@jedbrown.org> Message-ID: Hello everyone, Thank you for your answer Jed. Le lun. 17 janv. 2022 ? 15:52, Jed Brown a ?crit : > Thibault Bridel-Bertomeu writes: > > > Hi everyone, > > > > I was wondering if it was possible to build a solver based on PetscFV > > without using ghost cells for the boundary conditions ? > > Would it be possible to call PetscDSAddBoundary with DM_BC_ESSENTIAL and > so > > on instead of DM_BC_NATURAL_RIEMANN ? > > Are you thinking this sets the value in a ghost cell or the value in a > cell adjacent to the domain? For the former, the value in a ghost cell must > typically be set to a (nonlinear for nonlinear PDE) function of the value > in the adjacent cell to implement desired BCs. > Sorry I wasn't clear I think. I was thinking that for finite volume codes, you have two options: either you use ghost cells that you fill up with the appropriate primitive state for the boundary condition, or you directly enforce the boundary condition at the face. In the PetscFV examples, the DM_BC_NATURAL_RIEMANN type of boundary condition with ghost cells is always used, but what about _not_ generating the ghost cells and using a DM_BC_ESSENTIAL to enforce the boundary condition directly at the face (or at the quadrature points on the face if you will) ? > > > Mostly in the case of an hybrid problem with FVM and FEM (like for > instance > > ex18.c from the TS tutorials), it would make sense that the boundaries > > shared by the two discretizations be set at the same locations, i.e. for > > FEM the quadrature points, wouldn't it ? > > You either have a surface integral (NATURAL boundary condition, with the > integrand specified at quadrature points) or an ESSENTIAL condition, which > is implemented nodally (with some lifting for non-Lagrange bases). > -------------- next part -------------- An HTML attachment was scrubbed... 
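To illustrate the point about the ghost value being a function of the adjacent interior value, here is a sketch of a ghost-filling boundary callback in the style of the PetscFV tutorials: a slip wall for the 2D Euler equations with the conserved-variable layout [rho, rho*u, rho*v, E] assumed. The prototype and the assumption that n is the unit outward face normal follow the tutorial style and may differ between PETSc versions, so treat them as assumptions rather than the exact API.

static PetscErrorCode BoundaryGhost_SlipWall(PetscReal time, const PetscReal *c, const PetscReal *n, const PetscScalar *xI, PetscScalar *xG, void *ctx)
{
  /* ghost state = interior state with the normal momentum component reflected */
  PetscScalar mdotn = xI[1]*n[0] + xI[2]*n[1];

  PetscFunctionBeginUser;
  xG[0] = xI[0];                    /* density copied       */
  xG[1] = xI[1] - 2.0*mdotn*n[0];   /* reflect rho*u        */
  xG[2] = xI[2] - 2.0*mdotn*n[1];   /* reflect rho*v        */
  xG[3] = xI[3];                    /* total energy copied  */
  PetscFunctionReturn(0);
}

A nonlinear condition (an isothermal wall, a subsonic inflow, ...) would simply make xG a nonlinear function of xI in the same place, which is the point Jed makes above.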
URL: From jed at jedbrown.org Mon Jan 17 09:49:08 2022 From: jed at jedbrown.org (Jed Brown) Date: Mon, 17 Jan 2022 08:49:08 -0700 Subject: [petsc-users] PetscFV without ghost boundary conditions In-Reply-To: References: <87czkqo1mo.fsf@jedbrown.org> Message-ID: <87a6funz0r.fsf@jedbrown.org> Thibault Bridel-Bertomeu writes: > Sorry I wasn't clear I think. > I was thinking that for finite volume codes, you have two options: either > you use ghost cells that you fill up with the appropriate primitive state > for the boundary condition, or you directly enforce the boundary condition > at the face. > In the PetscFV examples, the DM_BC_NATURAL_RIEMANN type of boundary > condition with ghost cells is always used, but what about _not_ generating > the ghost cells and using a DM_BC_ESSENTIAL to enforce the boundary > condition directly at the face (or at the quadrature points on the face if > you will) ? Essential boundary conditions change the function space in which you seek a solution. There is no dof at the face in an FV method, so how would you propose modifying the function space. Also, FV methods with higher than first order require a reconstruction (generally nonlinear) using neighbor cells to create a richer (usually piecewise linear) representation inside cells. From knepley at gmail.com Mon Jan 17 10:19:22 2022 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 17 Jan 2022 11:19:22 -0500 Subject: [petsc-users] PetscFV without ghost boundary conditions In-Reply-To: <87a6funz0r.fsf@jedbrown.org> References: <87czkqo1mo.fsf@jedbrown.org> <87a6funz0r.fsf@jedbrown.org> Message-ID: On Mon, Jan 17, 2022 at 10:49 AM Jed Brown wrote: > Thibault Bridel-Bertomeu writes: > > > Sorry I wasn't clear I think. > > I was thinking that for finite volume codes, you have two options: either > > you use ghost cells that you fill up with the appropriate primitive state > > for the boundary condition, or you directly enforce the boundary > condition > > at the face. > > In the PetscFV examples, the DM_BC_NATURAL_RIEMANN type of boundary > > condition with ghost cells is always used, but what about _not_ > generating > > the ghost cells and using a DM_BC_ESSENTIAL to enforce the boundary > > condition directly at the face (or at the quadrature points on the face > if > > you will) ? > > Essential boundary conditions change the function space in which you seek > a solution. There is no dof at the face in an FV method, so how would you > propose modifying the function space. > > Also, FV methods with higher than first order require a reconstruction > (generally nonlinear) using neighbor cells to create a richer (usually > piecewise linear) representation inside cells. > What you suggest (flux boundary conditions) would require short-circuiting the Riemann solver loop to stick in the flux. It is doable, but we opted against it in the initial design because it seemed more complex than ghost cells and did not seem to have any advantages. For multi-domain things, wouldn't you require that the cell average of the adjacent cell be used as the ghost value for consistency? We would need a new interface to mark existing cells as ghosts, but that is quite simple compared to a flux BC addition. Thanks, Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From thibault.bridelbertomeu at gmail.com Mon Jan 17 10:36:34 2022 From: thibault.bridelbertomeu at gmail.com (Thibault Bridel-Bertomeu) Date: Mon, 17 Jan 2022 17:36:34 +0100 Subject: [petsc-users] PetscFV without ghost boundary conditions In-Reply-To: References: <87czkqo1mo.fsf@jedbrown.org> <87a6funz0r.fsf@jedbrown.org> Message-ID: Le lun. 17 janv. 2022 ? 17:19, Matthew Knepley a ?crit : > On Mon, Jan 17, 2022 at 10:49 AM Jed Brown wrote: > >> Thibault Bridel-Bertomeu writes: >> >> > Sorry I wasn't clear I think. >> > I was thinking that for finite volume codes, you have two options: >> either >> > you use ghost cells that you fill up with the appropriate primitive >> state >> > for the boundary condition, or you directly enforce the boundary >> condition >> > at the face. >> > In the PetscFV examples, the DM_BC_NATURAL_RIEMANN type of boundary >> > condition with ghost cells is always used, but what about _not_ >> generating >> > the ghost cells and using a DM_BC_ESSENTIAL to enforce the boundary >> > condition directly at the face (or at the quadrature points on the face >> if >> > you will) ? >> >> Essential boundary conditions change the function space in which you seek >> a solution. There is no dof at the face in an FV method, so how would you >> propose modifying the function space. >> > There can be dof at the faces actually, higher order FV can also be based on a quadrature representation that involves extra dof in the cells and in the faces for accurate integration. > >> Also, FV methods with higher than first order require a reconstruction >> (generally nonlinear) using neighbor cells to create a richer (usually >> piecewise linear) representation inside cells. > > That also can be worked around by upwinding the reconstruction schemes - it's unpleasant work, but it can be done :) > What you suggest (flux boundary conditions) would require short-circuiting > the Riemann solver loop to stick in the flux. It is doable, but we opted > against it in the initial > design because it seemed more complex than ghost cells and did not seem to > have any advantages. > OK ! I agree it is more complex, ghost cells are quite straightforward to use and don't require any extra upwinding of the reconstructions etc ... I actually mostly also opt for ghost cells in the codes I write, so I understand your decision ! > For multi-domain things, wouldn't you require that the cell average of > the adjacent cell be used as the ghost value for consistency? We would > need a new interface to mark existing cells as ghosts, but that is quite > simple compared to a flux BC > addition. > Well I think you read my mind. My question is indeed still in the context of the thread I started last week about multi-physics stuff. Actually, if we can mark cells that are actually part of the mesh as ghost to be used by boundary conditions, then it would be perfect ! I was thinking how I could do that actually ... If we think about the same example I have been using so far (a rectangle split into two distinct domains to represent fluid on one hand and solid on the other hand) then the ghost cells for the fluid at the fluid/solid interface would actually be cells located in the solid domain, and those are the ones we could be marking if you think it's possible ! It would be about handling the special case of creating ghost cells for faces that already have two cells on each side I believe. 
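A rough sketch of what marking those solid-side cells could look like in DMPlex terms. All label names here ("Interface" for the shared faces, "Solid" for the solid cells, "FluidGhost" for the cells the fluid solver should treat as ghosts) are placeholders, not names from this thread, and the loop assumes the interface faces are interior faces whose support holds the two adjacent cells.

DMLabel         interfaceLabel, solidLabel, ghostLabel;
IS              faceIS;
const PetscInt *faces;
PetscInt        nfaces, f;

ierr = DMGetLabel(dm, "Interface", &interfaceLabel);CHKERRQ(ierr);
ierr = DMGetLabel(dm, "Solid", &solidLabel);CHKERRQ(ierr);
ierr = DMCreateLabel(dm, "FluidGhost");CHKERRQ(ierr);
ierr = DMGetLabel(dm, "FluidGhost", &ghostLabel);CHKERRQ(ierr);
ierr = DMLabelGetStratumIS(interfaceLabel, 1, &faceIS);CHKERRQ(ierr);
if (faceIS) {
  ierr = ISGetLocalSize(faceIS, &nfaces);CHKERRQ(ierr);
  ierr = ISGetIndices(faceIS, &faces);CHKERRQ(ierr);
  for (f = 0; f < nfaces; ++f) {
    const PetscInt *support;
    PetscInt        supportSize, s, val;

    ierr = DMPlexGetSupportSize(dm, faces[f], &supportSize);CHKERRQ(ierr);
    ierr = DMPlexGetSupport(dm, faces[f], &support);CHKERRQ(ierr);
    for (s = 0; s < supportSize; ++s) {  /* the (up to) two cells sharing this face */
      ierr = DMLabelGetValue(solidLabel, support[s], &val);CHKERRQ(ierr);
      if (val == 1) {ierr = DMLabelSetValue(ghostLabel, support[s], 1);CHKERRQ(ierr);}
    }
  }
  ierr = ISRestoreIndices(faceIS, &faces);CHKERRQ(ierr);
  ierr = ISDestroy(&faceIS);CHKERRQ(ierr);
}

Whether the FV residual evaluation can then be made to treat "FluidGhost" cells the same way as the ghost cells created by DMPlexConstructGhostCells() is exactly the new interface discussed above.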
Anyway for my example, I was thinking of working my way from the "Wall" label, and identifying for each face in the label, the cells on each side and put those cells that also belong to the "Solid" label in the "ghost" label - I don't know if that makes sense. (By the way Matt the MR 4717 you asked me to check works well, thanks !) Thanks for your feedback, Thibault > > Thanks, > > Matt > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Mon Jan 17 11:18:08 2022 From: hzhang at mcs.anl.gov (Zhang, Hong) Date: Mon, 17 Jan 2022 17:18:08 +0000 Subject: [petsc-users] PETSc MUMPS interface In-Reply-To: References: <1AD9EBCE-4B48-4D54-9829-DFD5EDC68B76@dsic.upv.es> Message-ID: Varun, I am trying to find a way to enable you to switch options after MatLUFactorSymbolic(). A hack is modifying the flag 'mumps->matstruc' inside MatLUFactorSymbolic_AIJMUMPS() and MatFactorNumeric_MUMPS(). My understanding of what you want is: // collect mumps memory info ... MatLUFactorSymbolic(F,A,perm,iperm,&info); printMumpsMemoryInfo(F); //--------- if (memory is available) { EPSSolve(eps); --> skip calling of MatLUFactorSymbolic() } else { //out-of-core (OOC) option in MUMPS } Am I correct? I'll let you know once I work out a solution. Hong ________________________________ From: Varun Hiremath Sent: Sunday, January 16, 2022 10:10 PM To: Zhang, Hong Cc: Jose E. Roman ; Peder J?rgensgaard Olesen via petsc-users Subject: Re: [petsc-users] PETSc MUMPS interface Hi Jose, Hong, Thanks for the explanation. I have verified using -log_view that MatLUFactorSymbolic is indeed getting called twice. Hong, we use MUMPS solver for other things, and we typically run the symbolic analysis first and get memory estimates to ensure that we have enough memory available to run the case. If the available memory is not enough, we can stop or switch to the out-of-core (OOC) option in MUMPS. We wanted to do the same when using MUMPS via SLEPc/PETSc. Please let me know if there are other ways of getting these memory stats and switching options during runtime with PETSc. Appreciate your help! Thanks, Varun On Sun, Jan 16, 2022 at 4:01 PM Zhang, Hong > wrote: Varun, I believe Jose is correct. You may verify it by running your code with option '-log_view', then check the number of calls to MatLUFactorSym. I guess I can add a flag in PCSetUp() to check if user has already called MatLUFactorSymbolic() and wants to skip it. Normally, users simply allocate sufficient memory in the symbolic factorization. Why do you want to check it? Hong ________________________________ From: Jose E. Roman > Sent: Sunday, January 16, 2022 5:11 AM To: Varun Hiremath > Cc: Zhang, Hong >; Peder J?rgensgaard Olesen via petsc-users > Subject: Re: [petsc-users] PETSc MUMPS interface Hong may give a better answer, but if you look at PCSetUp_LU() https://petsc.org/main/src/ksp/pc/impls/factor/lu/lu.c.html#PCSetUp_LU you will see that MatLUFactorSymbolic() is called unconditionally during the first PCSetUp(). Currently there is no way to check if the user has already called MatLUFactorSymbolic(). Jose > El 16 ene 2022, a las 10:40, Varun Hiremath > escribi?: > > Hi Hong, > > Thank you, this is very helpful! 
> > Using this method I am able to get the memory estimates before the actual solve, however, I think my code may be causing the symbolic factorization to be run twice. Attached is my code where I am using SLEPc to compute eigenvalues, and I use MUMPS for factorization. I have commented above the code that computes the memory estimates, could you please check and tell me if this would cause the symbolic factor to be computed twice (a second time inside EPSSolve?), as I am seeing a slight increase in the overall computation time? > > Regards, > Varun > > On Wed, Jan 12, 2022 at 7:58 AM Zhang, Hong > wrote: > PCFactorSetMatSolverType(pc,MATSOLVERMUMPS); > PCFactorSetUpMatSolverType(pc); > PCFactorGetMatrix(pc,&F); > > MatLUFactorSymbolic(F,A,...) > You must provide row and column permutations etc, petsc/src/mat/tests/ex125.c may give you a clue on how to get these inputs. > > Hong > > > From: petsc-users > on behalf of Junchao Zhang > > Sent: Wednesday, January 12, 2022 9:03 AM > To: Varun Hiremath > > Cc: Peder J?rgensgaard Olesen via petsc-users > > Subject: Re: [petsc-users] PETSc MUMPS interface > > Calling PCSetUp() before KSPSetUp()? > > --Junchao Zhang > > > On Wed, Jan 12, 2022 at 3:00 AM Varun Hiremath > wrote: > Hi All, > > I want to collect MUMPS memory estimates based on the initial symbolic factorization analysis before the actual numerical factorization starts to check if the estimated memory requirements fit the available memory. > > I am following the steps from https://petsc.org/main/src/ksp/ksp/tutorials/ex52.c.html > > PCFactorSetMatSolverType(pc,MATSOLVERMUMPS); > PCFactorSetUpMatSolverType(pc); > PCFactorGetMatrix(pc,&F); > > KSPSetUp(ksp); > MatMumpsGetInfog(F,...) > > But it appears KSPSetUp calls both symbolic and numerical factorization. So is there some other way to get these statistics before the actual factorization starts? > > Thanks, > Varun > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Mon Jan 17 11:46:59 2022 From: jed at jedbrown.org (Jed Brown) Date: Mon, 17 Jan 2022 10:46:59 -0700 Subject: [petsc-users] PetscFV without ghost boundary conditions In-Reply-To: References: <87czkqo1mo.fsf@jedbrown.org> <87a6funz0r.fsf@jedbrown.org> Message-ID: <877dayntkc.fsf@jedbrown.org> Matthew Knepley writes: > What you suggest (flux boundary conditions) would require short-circuiting > the Riemann solver loop to stick in the flux. It is doable, but we opted > against it in the initial > design because it seemed more complex than ghost cells and did not seem to > have any advantages. For multi-domain things, wouldn't you require that the > cell average of > the adjacent cell be used as the ghost value for consistency? We would need > a new interface to mark existing cells as ghosts, but that is quite simple > compared to a flux BC > addition. FV methods usually need a ghost for slope reconstruction. FE methods (CG or DG) don't have a reconstruction and thus can use a flux boundary condition by solving a degenerate Riemann problem. IMO, this is simpler than creating a ghost cell and solving a full Riemann problem. It's what we use for subsonic inflow in libceed-fluids, for example. From varunhiremath at gmail.com Mon Jan 17 12:41:27 2022 From: varunhiremath at gmail.com (Varun Hiremath) Date: Mon, 17 Jan 2022 10:41:27 -0800 Subject: [petsc-users] PETSc MUMPS interface In-Reply-To: References: <1AD9EBCE-4B48-4D54-9829-DFD5EDC68B76@dsic.upv.es> Message-ID: Hi Hong, Thanks for looking into this. 
Here is the workflow that I might use: MatLUFactorSymbolic(F,A,perm,iperm,&info); // get memory estimates from MUMPS e.g. INFO(3), INFOG(16), INFOG(17) // find available memory on the system e.g. RAM size if (estimated_memory > available_memory) { // inform and stop; or // switch MUMPS to out-of-core factorization ICNTL(22) = 1; } else { // set appropriate settings for in-core factorization } // Now we call the solve and inside if MatLUFactorSymbolic is already called then it should be skipped EPSSolve(eps); Thanks, Varun On Mon, Jan 17, 2022 at 9:18 AM Zhang, Hong wrote: > Varun, > I am trying to find a way to enable you to switch options after MatLUFactorSymbolic(). > A hack is modifying the flag 'mumps->matstruc' > inside MatLUFactorSymbolic_AIJMUMPS() and MatFactorNumeric_MUMPS(). > > My understanding of what you want is: > // collect mumps memory info > ... > MatLUFactorSymbolic(F,A,perm,iperm,&info); > printMumpsMemoryInfo(F); > //--------- > if (memory is available) { > EPSSolve(eps); --> skip calling of MatLUFactorSymbolic() > } else { > //out-of-core (OOC) option in MUMPS > } > > Am I correct? I'll let you know once I work out a solution. > Hong > > ------------------------------ > *From:* Varun Hiremath > *Sent:* Sunday, January 16, 2022 10:10 PM > *To:* Zhang, Hong > *Cc:* Jose E. Roman ; Peder J?rgensgaard Olesen via > petsc-users > *Subject:* Re: [petsc-users] PETSc MUMPS interface > > Hi Jose, Hong, > > Thanks for the explanation. I have verified using -log_view that MatLUFactorSymbolic > is indeed getting called twice. > > Hong, we use MUMPS solver for other things, and we typically run the > symbolic analysis first and get memory estimates to ensure that we have > enough memory available to run the case. If the available memory is not > enough, we can stop or switch to the out-of-core (OOC) option in MUMPS. We > wanted to do the same when using MUMPS via SLEPc/PETSc. Please let me know > if there are other ways of getting these memory stats and switching options > during runtime with PETSc. > Appreciate your help! > > Thanks, > Varun > > On Sun, Jan 16, 2022 at 4:01 PM Zhang, Hong wrote: > > Varun, > I believe Jose is correct. You may verify it by running your code with > option '-log_view', then check the number of calls to MatLUFactorSym. > > I guess I can add a flag in PCSetUp() to check if user has already called > MatLUFactorSymbolic() and wants to skip it. Normally, users simply allocate > sufficient memory in the symbolic factorization. Why do you want to check > it? > Hong > > ------------------------------ > *From:* Jose E. Roman > *Sent:* Sunday, January 16, 2022 5:11 AM > *To:* Varun Hiremath > *Cc:* Zhang, Hong ; Peder J?rgensgaard Olesen via > petsc-users > *Subject:* Re: [petsc-users] PETSc MUMPS interface > > Hong may give a better answer, but if you look at PCSetUp_LU() > https://petsc.org/main/src/ksp/pc/impls/factor/lu/lu.c.html#PCSetUp_LU > you will see that MatLUFactorSymbolic() is called unconditionally during > the first PCSetUp(). Currently there is no way to check if the user has > already called MatLUFactorSymbolic(). > > Jose > > > > El 16 ene 2022, a las 10:40, Varun Hiremath > escribi?: > > > > Hi Hong, > > > > Thank you, this is very helpful! > > > > Using this method I am able to get the memory estimates before the > actual solve, however, I think my code may be causing the symbolic > factorization to be run twice. Attached is my code where I am using SLEPc > to compute eigenvalues, and I use MUMPS for factorization. 
I have commented > above the code that computes the memory estimates, could you please check > and tell me if this would cause the symbolic factor to be computed twice (a > second time inside EPSSolve?), as I am seeing a slight increase in the > overall computation time? > > > > Regards, > > Varun > > > > On Wed, Jan 12, 2022 at 7:58 AM Zhang, Hong wrote: > > PCFactorSetMatSolverType(pc,MATSOLVERMUMPS); > > PCFactorSetUpMatSolverType(pc); > > PCFactorGetMatrix(pc,&F); > > > > MatLUFactorSymbolic(F,A,...) > > You must provide row and column permutations etc, > petsc/src/mat/tests/ex125.c may give you a clue on how to get these inputs. > > > > Hong > > > > > > From: petsc-users on behalf of > Junchao Zhang > > Sent: Wednesday, January 12, 2022 9:03 AM > > To: Varun Hiremath > > Cc: Peder J?rgensgaard Olesen via petsc-users > > Subject: Re: [petsc-users] PETSc MUMPS interface > > > > Calling PCSetUp() before KSPSetUp()? > > > > --Junchao Zhang > > > > > > On Wed, Jan 12, 2022 at 3:00 AM Varun Hiremath > wrote: > > Hi All, > > > > I want to collect MUMPS memory estimates based on the initial symbolic > factorization analysis before the actual numerical factorization starts to > check if the estimated memory requirements fit the available memory. > > > > I am following the steps from > https://petsc.org/main/src/ksp/ksp/tutorials/ex52.c.html > > > > PCFactorSetMatSolverType(pc,MATSOLVERMUMPS); > > PCFactorSetUpMatSolverType(pc); > > PCFactorGetMatrix(pc,&F); > > > > KSPSetUp(ksp); > > MatMumpsGetInfog(F,...) > > > > But it appears KSPSetUp calls both symbolic and numerical factorization. > So is there some other way to get these statistics before the actual > factorization starts? > > > > Thanks, > > Varun > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fdkong.jd at gmail.com Mon Jan 17 13:54:39 2022 From: fdkong.jd at gmail.com (Fande Kong) Date: Mon, 17 Jan 2022 12:54:39 -0700 Subject: [petsc-users] Downloaded superlu_dist could not be used. Please check install in $PREFIX Message-ID: I am trying to port PETSc-3.16.3 to the MOOSE ecosystem. I got an error that PETSc could not build superlu_dist. The log file was attached. PETSc-3.15.x worked correctly in the same environment. Thanks, Fande -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log.zip Type: application/zip Size: 167246 bytes Desc: not available URL: From FERRANJ2 at my.erau.edu Mon Jan 17 17:40:27 2022 From: FERRANJ2 at my.erau.edu (Ferrand, Jesus A.) Date: Mon, 17 Jan 2022 23:40:27 +0000 Subject: [petsc-users] MatMPIBAIJSetPreallocation() diagonal block: Message-ID: Greetings! I have a question about preallocating memory for a parallel BAIJ matrix. From the documentation of MatMPIBAIJSetPreallocation(), the preallocation is divided between the so-called "diagonal" and "off-diagonal" sub matrices. In the example from the documentation, the following illustration is given for the portion of a matrix that owns rows 3,4, and 5 of some 12-column matrix: 0 1 2 3 4 5 6 7 8 9 10 11 -------------------------- row 3 |o o o d d d o o o o o o row 4 |o o o d d d o o o o o o row 5 |o o o d d d o o o o o o -------------------------- I have a few questions: about MPIBAIJ matrices, I am somewhat confused by the description: 1. If the rank owned, say 5 rows instead of 3 on the same matrix, would the "d" block become a 5 by 5? as in: 2. 
0 1 2 3 4 5 6 7 8 9 10 11 -------------------------- row 3 |o o o d d d d d o o o o row 4 |o o o d d d d d o o o o row 5 |o o o d d d d d o o o o row 6 |o o o d d d d d o o o o row 7 |o o o d d d d d o o o o -------------------------- 1. Is there a way to not have to deal with "d" blocks and "o" blocks separately? I know how many nonzero blocks are in each row of my matrix but I can't easily determine which of those zeros are in the "d" block or "o" block of my ranks. 2. The block size "bs", that is the size of each individual "d" or "o" as shown in the diagrams right? 3. TRUE/FALSE: The parameters d_nz, o_nz, d_nnz, and o_nnz, those specify the number of "d" and "o" blocks that are nonzero (where each "d" or "o" is its own dense submatrix), not the actual number of nonzeros in the rows, right? Say the block size is 3 and I set d_nz to 2, then for the diagram from the documentation this translates to 54 actual nonzeros in the diagonal block, right? Sincerely: J.A. Ferrand Embry-Riddle Aeronautical University - Daytona Beach FL M.Sc. Aerospace Engineering | May 2022 B.Sc. Aerospace Engineering B.Sc. Computational Mathematics Sigma Gamma Tau Tau Beta Pi Honors Program Phone: (386)-843-1829 Email(s): ferranj2 at my.erau.edu jesus.ferrand at gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Mon Jan 17 19:32:39 2022 From: mfadams at lbl.gov (Mark Adams) Date: Mon, 17 Jan 2022 20:32:39 -0500 Subject: [petsc-users] MatMPIBAIJSetPreallocation() diagonal block: In-Reply-To: References: Message-ID: On Mon, Jan 17, 2022 at 6:40 PM Ferrand, Jesus A. wrote: > Greetings! > > I have a question about preallocating memory for a parallel BAIJ matrix. > From the documentation of MatMPIBAIJSetPreallocation(), the preallocation > is divided between the so-called "diagonal" and "off-diagonal" sub > matrices. In the example from the documentation, the following illustration > is given for the portion of a matrix that owns rows 3,4, and 5 of some > 12-column matrix: > > 0 1 2 3 4 5 6 7 8 9 10 11 > -------------------------- > row 3 |o o o d d d o o o o o o > row 4 |o o o d d d o o o o o o > row 5 |o o o d d d o o o o o o > -------------------------- > > I have a few questions: about MPIBAIJ matrices, I am somewhat confused by > the description: > > 1. If the rank owned, say 5 rows instead of 3 on the same matrix, > would the "d" block become a 5 by 5? as in: > > When you create a matrix you specify how many local rows -- and columns (n) -- that you want. int MatCreateMPIBAIJ(MPI_Comm comm,int bs,int m,int n,int M,int N,int d_nz,int *d_nnz,int o_nz,int *o_nnz,Mat *A) > > 1. > > 0 1 2 3 4 5 6 7 8 9 10 11 > -------------------------- > row 3 |o o o d d d d d o o o o > row 4 |o o o d d d d d o o o o > row 5 |o o o d d d d d o o o o > row 6 |o o o d d d d d o o o o > row 7 |o o o d d d d d o o o o > -------------------------- > > > > 1. Is there a way to not have to deal with "d" blocks and "o" blocks > separately? I know how many nonzero blocks are in each row of my matrix but > I can't easily determine which of those zeros are in the "d" block or "o" > block of my ranks. > > No. It is best for performance to give an upper bound for each. > > 1. The block size "bs", that is the size of each individual "d" or "o" > as shown in the diagrams right? > > d and o have the same "size" but the API is such that > > 1. 
TRUE/FALSE: The parameters d_nz, o_nz, d_nnz, and o_nnz, those > specify the number of "d" and "o" blocks that are nonzero (where each "d" > or "o" is its own dense submatrix), not the actual number of nonzeros in > the rows, right? Say the block size is 3 and I set d_nz to 2, then for the > diagram from the documentation this translates to 54 actual nonzeros in the > diagonal block, right? > > FALSE. This is a little confusing but the interface is the same for blocked or unblocked versions of the same matrix. Thus you can set from the command line -mat_type aij or baij and they both work. Keep that in mind to figure out the API. This first example would work with a bs==3 or bs==1, which is an unblocked matrix. Your bs==5 example will not work with a blocked BAIJ matrix. Now performance sensitive functions like MatSetValues need blocked versions (https://petsc.org/main/docs/manualpages/Mat/MatSetValuesBlocked.html). MatSetValues will work for a blocked matrix, but it is slower than exploiting the blocks with the blocked version. To be flexible, as I described with command line options, you would need to query the matrix to figure out which type it is and have two versions of the assembly (MatSetValues) code. In practice, people often hardwire the code to a BAIJ matrix to avoid this complexity if they don't care about this flexibility. > > 1. > > > > Sincerely: > > *J.A. Ferrand* > > Embry-Riddle Aeronautical University - Daytona Beach FL > > M.Sc. Aerospace Engineering | May 2022 > > B.Sc. Aerospace Engineering > > B.Sc. Computational Mathematics > > > > Sigma Gamma Tau > > Tau Beta Pi > > Honors Program > > > > *Phone:* (386)-843-1829 > > *Email(s):* ferranj2 at my.erau.edu > > jesus.ferrand at gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Mon Jan 17 21:49:47 2022 From: hzhang at mcs.anl.gov (Zhang, Hong) Date: Tue, 18 Jan 2022 03:49:47 +0000 Subject: [petsc-users] PETSc MUMPS interface In-Reply-To: References: <1AD9EBCE-4B48-4D54-9829-DFD5EDC68B76@dsic.upv.es> Message-ID: Varun, I created a branch hzhang/feature-mumps-mem-estimate, see https://gitlab.com/petsc/petsc/-/merge_requests/4727 You may give it a try and let me know if this is what you want. src/ksp/ksp/tutorials/ex52.c is an example. Hong ________________________________ From: Varun Hiremath Sent: Monday, January 17, 2022 12:41 PM To: Zhang, Hong Cc: Jose E. Roman ; Peder J?rgensgaard Olesen via petsc-users Subject: Re: [petsc-users] PETSc MUMPS interface Hi Hong, Thanks for looking into this. Here is the workflow that I might use: MatLUFactorSymbolic(F,A,perm,iperm,&info); // get memory estimates from MUMPS e.g. INFO(3), INFOG(16), INFOG(17) // find available memory on the system e.g. RAM size if (estimated_memory > available_memory) { // inform and stop; or // switch MUMPS to out-of-core factorization ICNTL(22) = 1; } else { // set appropriate settings for in-core factorization } // Now we call the solve and inside if MatLUFactorSymbolic is already called then it should be skipped EPSSolve(eps); Thanks, Varun On Mon, Jan 17, 2022 at 9:18 AM Zhang, Hong > wrote: Varun, I am trying to find a way to enable you to switch options after MatLUFactorSymbolic(). A hack is modifying the flag 'mumps->matstruc' inside MatLUFactorSymbolic_AIJMUMPS() and MatFactorNumeric_MUMPS(). My understanding of what you want is: // collect mumps memory info ... 
MatLUFactorSymbolic(F,A,perm,iperm,&info); printMumpsMemoryInfo(F); //--------- if (memory is available) { EPSSolve(eps); --> skip calling of MatLUFactorSymbolic() } else { //out-of-core (OOC) option in MUMPS } Am I correct? I'll let you know once I work out a solution. Hong ________________________________ From: Varun Hiremath > Sent: Sunday, January 16, 2022 10:10 PM To: Zhang, Hong > Cc: Jose E. Roman >; Peder J?rgensgaard Olesen via petsc-users > Subject: Re: [petsc-users] PETSc MUMPS interface Hi Jose, Hong, Thanks for the explanation. I have verified using -log_view that MatLUFactorSymbolic is indeed getting called twice. Hong, we use MUMPS solver for other things, and we typically run the symbolic analysis first and get memory estimates to ensure that we have enough memory available to run the case. If the available memory is not enough, we can stop or switch to the out-of-core (OOC) option in MUMPS. We wanted to do the same when using MUMPS via SLEPc/PETSc. Please let me know if there are other ways of getting these memory stats and switching options during runtime with PETSc. Appreciate your help! Thanks, Varun On Sun, Jan 16, 2022 at 4:01 PM Zhang, Hong > wrote: Varun, I believe Jose is correct. You may verify it by running your code with option '-log_view', then check the number of calls to MatLUFactorSym. I guess I can add a flag in PCSetUp() to check if user has already called MatLUFactorSymbolic() and wants to skip it. Normally, users simply allocate sufficient memory in the symbolic factorization. Why do you want to check it? Hong ________________________________ From: Jose E. Roman > Sent: Sunday, January 16, 2022 5:11 AM To: Varun Hiremath > Cc: Zhang, Hong >; Peder J?rgensgaard Olesen via petsc-users > Subject: Re: [petsc-users] PETSc MUMPS interface Hong may give a better answer, but if you look at PCSetUp_LU() https://petsc.org/main/src/ksp/pc/impls/factor/lu/lu.c.html#PCSetUp_LU you will see that MatLUFactorSymbolic() is called unconditionally during the first PCSetUp(). Currently there is no way to check if the user has already called MatLUFactorSymbolic(). Jose > El 16 ene 2022, a las 10:40, Varun Hiremath > escribi?: > > Hi Hong, > > Thank you, this is very helpful! > > Using this method I am able to get the memory estimates before the actual solve, however, I think my code may be causing the symbolic factorization to be run twice. Attached is my code where I am using SLEPc to compute eigenvalues, and I use MUMPS for factorization. I have commented above the code that computes the memory estimates, could you please check and tell me if this would cause the symbolic factor to be computed twice (a second time inside EPSSolve?), as I am seeing a slight increase in the overall computation time? > > Regards, > Varun > > On Wed, Jan 12, 2022 at 7:58 AM Zhang, Hong > wrote: > PCFactorSetMatSolverType(pc,MATSOLVERMUMPS); > PCFactorSetUpMatSolverType(pc); > PCFactorGetMatrix(pc,&F); > > MatLUFactorSymbolic(F,A,...) > You must provide row and column permutations etc, petsc/src/mat/tests/ex125.c may give you a clue on how to get these inputs. > > Hong > > > From: petsc-users > on behalf of Junchao Zhang > > Sent: Wednesday, January 12, 2022 9:03 AM > To: Varun Hiremath > > Cc: Peder J?rgensgaard Olesen via petsc-users > > Subject: Re: [petsc-users] PETSc MUMPS interface > > Calling PCSetUp() before KSPSetUp()? 
> > --Junchao Zhang > > > On Wed, Jan 12, 2022 at 3:00 AM Varun Hiremath > wrote: > Hi All, > > I want to collect MUMPS memory estimates based on the initial symbolic factorization analysis before the actual numerical factorization starts to check if the estimated memory requirements fit the available memory. > > I am following the steps from https://petsc.org/main/src/ksp/ksp/tutorials/ex52.c.html > > PCFactorSetMatSolverType(pc,MATSOLVERMUMPS); > PCFactorSetUpMatSolverType(pc); > PCFactorGetMatrix(pc,&F); > > KSPSetUp(ksp); > MatMumpsGetInfog(F,...) > > But it appears KSPSetUp calls both symbolic and numerical factorization. So is there some other way to get these statistics before the actual factorization starts? > > Thanks, > Varun > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dong-hao at outlook.com Mon Jan 17 22:06:39 2022 From: dong-hao at outlook.com (Hao DONG) Date: Tue, 18 Jan 2022 04:06:39 +0000 Subject: [petsc-users] Strange CUDA failure with a second petscfinalize with PETSc 3.16 In-Reply-To: References: <46A67106-A80C-4057-A301-E3C13236CC18@outlook.com> Message-ID: Dear Junchao and Jacob, Thanks a lot for the response ? I also don?t understand why this is related to the device, especially on why the procedure can be successfully finished for *once* ? As instructed, I tried to add a CHKERRA() macro after (almost) every petsc line ? such as the initialization, mat assemble, ksp create, solve, mat destroy, etc. However, all other petsc commands returns with error code 0. It only gives me a similar (still not very informative) error after I call the petscfinalize (again for the second time), with error code 97: [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: GPU error [0]PETSC ERROR: cuda error 709 (cudaErrorContextIsDestroyed) : context is destroyed [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.16.3, unknown [0]PETSC ERROR: ./ex11f on a named stratosphere by donghao Tue Jan 18 11:39:43 2022 [0]PETSC ERROR: Configure options --prefix=/opt/petsc/complex-double-with-cuda --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 COPTFLAGS="-O3 -mavx2" CXXOPTFLAGS="-O3 -mavx2" FOPTFLAGS="-O3 -ffree-line-length-none -mavx2" CUDAOPTFLAGS=-O3 --with-cxx-dialect=cxx14 --with-cuda-dialect=cxx14 --with-scalar-type=complex --with-precision=double --with-cuda-dir=/usr/local/cuda --with-debugging=1 [0]PETSC ERROR: #1 PetscFinalize() at /home/donghao/packages/petsc-current/src/sys/objects/pinit.c:1638 [0]PETSC ERROR: #2 User provided function() at User file:0 I can also confirm that rolling back to petsc 3.15 will *not* see the problem, even with the new nvidia driver. And petsc 3.16.3 with an old nvidia driver (470.42) also get this same error. So it?s probably not connected to the nvidia driver. Any idea on where I should look at next? Thanks a lot in advance, and all the best, Hao From: Jacob Faibussowitsch Sent: Sunday, January 16, 2022 12:12 AM To: Junchao Zhang Cc: petsc-users; Hao DONG Subject: Re: [petsc-users] Strange CUDA failure with a second petscfinalize with PETSc 3.16 I don?t quite understand how it is getting to the CUDA error to be honest. None of the code in the stack trace is anywhere near the device code. 
Reading the error message carefully, it first chokes on PetscLogGetStageLog() from a call to PetscClassIdRegister(): PetscErrorCode PetscLogGetStageLog(PetscStageLog *stageLog) { PetscFunctionBegin; PetscValidPointer(stageLog,1); if (!petsc_stageLog) { fprintf(stderr, "PETSC ERROR: Logging has not been enabled.\nYou might have forgotten to call PetscInitialize().\n"); PETSCABORT(MPI_COMM_WORLD, PETSC_ERR_SUP); // Here } ... But then jumps to PetscFinalize(). You can also see the "You might have forgotten to call PetscInitialize().? message in the error message, just under the topmost level of the stack trace. Can you check the value of ierr of each function call (use the CHKERRA() macro to do so)? I suspect the problem here that errors occurring previously in the program are being ignored, leading to the garbled stack trace. Best regards, Jacob Faibussowitsch (Jacob Fai - booss - oh - vitch) On Jan 14, 2022, at 20:58, Junchao Zhang > wrote: Jacob, Could you have a look as it seems the "invalid device context" is in your newly added module? Thanks --Junchao Zhang On Fri, Jan 14, 2022 at 12:49 AM Hao DONG > wrote: Dear All, I have encountered a peculiar problem when fiddling with a code with PETSC 3.16.3 (which worked fine with PETSc 3.15). It is a very straight forward PDE-based optimization code which repeatedly solves a linearized PDE problem with KSP in a subroutine (the rest of the code does not contain any PETSc related content). The main program provides the subroutine with an MPI comm. Then I set the comm as PETSC_COMM_WORLD to tell PETSC to attach to it (and detach with it when the solving is finished each time). Strangely, I observe a CUDA failure whenever the petscfinalize is called for a *second* time. In other words, the first and second PDE calculations with GPU are fine (with correct solutions). The petsc code just fails after the SECOND petscfinalize command is called. You can also see the PETSC config in the error message: [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [1]PETSC ERROR: GPU error [1]PETSC ERROR: cuda error 201 (cudaErrorDeviceUninitialized) : invalid device context [1]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. [1]PETSC ERROR: Petsc Release Version 3.16.3, unknown [1]PETSC ERROR: maxwell.gpu on a named stratosphere by hao Fri Jan 14 10:21:05 2022 [1]PETSC ERROR: Configure options --prefix=/opt/petsc/complex-double-with-cuda --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 COPTFLAGS="-O3 -mavx2" CXXOPTFLAGS="-O3 -mavx2" FOPTFLAGS="-O3 -ffree-line-length-none -mavx2" CUDAOPTFLAGS=-O3 --with-cxx-dialect=cxx14 --with-cuda-dialect=cxx14 --with-scalar-type=complex --with-precision=double --with-cuda-dir=/usr/local/cuda --with-debugging=1 [1]PETSC ERROR: #1 PetscFinalize() at /home/hao/packages/petsc-current/src/sys/objects/pinit.c:1638 You might have forgotten to call PetscInitialize(). The EXACT line numbers in the error traceback are not available. Instead the line number of the start of the function is given. 
[1] #1 PetscAbortFindSourceFile_Private() at /home/hao/packages/petsc-current/src/sys/error/err.c:35 [1] #2 PetscLogGetStageLog() at /home/hao/packages/petsc-current/src/sys/logging/utils/stagelog.c:29 [1] #3 PetscClassIdRegister() at /home/hao/packages/petsc-current/src/sys/logging/plog.c:2376 [1] #4 MatMFFDInitializePackage() at /home/hao/packages/petsc-current/src/mat/impls/mffd/mffd.c:45 [1] #5 MatInitializePackage() at /home/hao/packages/petsc-current/src/mat/interface/dlregismat.c:163 [1] #6 MatCreate() at /home/hao/packages/petsc-current/src/mat/utils/gcreate.c:77 However, it doesn?t seem to affect the other part of my code, so the code can continue running until it gets to the petsc part again (the *third* time). Unfortunately, it doesn?t give me any further information even if I set the debugging to yes in the configure file. It also worth noting that PETSC without CUDA (i.e. with simple MATMPIAIJ) works perfectly fine. I am able to re-produce the problem with a toy code modified from ex11f. Please see the attached file (ex11fc.F90) for details. Essentially the code does the same thing as ex11f, but three times with a do loop. To do that I added an extra MPI_INIT/MPI_FINALIZE to ensure that the MPI communicator is not destroyed when PETSC_FINALIZE is called. I used the PetscOptionsHasName utility to check if you have ?-usecuda? in the options. So running the code with and without that option can give you a comparison w/o CUDA. I can see that the code also fails after the second loop of the KSP operation. Could you kindly shed some lights on this problem? I should say that I am not even sure if the problem is from PETSc, as I also accidentally updated the NVIDIA driver (for now it is 510.06 with cuda 11.6). And it is well known that NVIDIA can give you some surprise in the updates (yes, I know I shouldn?t have touched that if it?s not broken). But my CUDA code without PETSC (which basically does the same PDE thing, but with cusparse/cublas directly) seems to work just fine after the update. It is also possible that my petsc code related to CUDA was not quite ?legitimate? ? I just use: MatSetType(A, MATMPIAIJCUSPARSE, ierr) and MatCreateVecs(A, u, PETSC_NULL_VEC, ierr) to make the data onto GPU. I would very much appreciate it if you could show me the ?right? way to do that. Thanks a lot in advance, and all the best, Hao -------------- next part -------------- An HTML attachment was scrubbed... URL: From jacob.fai at gmail.com Tue Jan 18 13:06:46 2022 From: jacob.fai at gmail.com (Jacob Faibussowitsch) Date: Tue, 18 Jan 2022 14:06:46 -0500 Subject: [petsc-users] Strange CUDA failure with a second petscfinalize with PETSc 3.16 In-Reply-To: References: Message-ID: <45585074-8C0E-453B-993B-554D23A1E971@gmail.com> Can you send your updated source file as well as your configure.log (should be $PETSC_DIR/configure.log). I will see if I can reproduce the error on my end. Best regards, Jacob Faibussowitsch (Jacob Fai - booss - oh - vitch) > On Jan 17, 2022, at 23:06, Hao DONG wrote: > ? > Dear Junchao and Jacob, > > Thanks a lot for the response ? I also don?t understand why this is related to the device, especially on why the procedure can be successfully finished for *once* ? As instructed, I tried to add a CHKERRA() macro after (almost) every petsc line ? such as the initialization, mat assemble, ksp create, solve, mat destroy, etc. However, all other petsc commands returns with error code 0. 
It only gives me a similar (still not very informative) error after I call the petscfinalize (again for the second time), with error code 97: > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: GPU error > [0]PETSC ERROR: cuda error 709 (cudaErrorContextIsDestroyed) : context is destroyed > [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.16.3, unknown > [0]PETSC ERROR: ./ex11f on a named stratosphere by donghao Tue Jan 18 11:39:43 2022 > [0]PETSC ERROR: Configure options --prefix=/opt/petsc/complex-double-with-cuda --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 COPTFLAGS="-O3 -mavx2" CXXOPTFLAGS="-O3 -mavx2" FOPTFLAGS="-O3 -ffree-line-length-none -mavx2" CUDAOPTFLAGS=-O3 --with-cxx-dialect=cxx14 --with-cuda-dialect=cxx14 --with-scalar-type=complex --with-precision=double --with-cuda-dir=/usr/local/cuda --with-debugging=1 > [0]PETSC ERROR: #1 PetscFinalize() at /home/donghao/packages/petsc-current/src/sys/objects/pinit.c:1638 > [0]PETSC ERROR: #2 User provided function() at User file:0 > > I can also confirm that rolling back to petsc 3.15 will *not* see the problem, even with the new nvidia driver. And petsc 3.16.3 with an old nvidia driver (470.42) also get this same error. So it?s probably not connected to the nvidia driver. > > Any idea on where I should look at next? > Thanks a lot in advance, and all the best, > Hao > > From: Jacob Faibussowitsch > Sent: Sunday, January 16, 2022 12:12 AM > To: Junchao Zhang > Cc: petsc-users; Hao DONG > Subject: Re: [petsc-users] Strange CUDA failure with a second petscfinalize with PETSc 3.16 > > I don?t quite understand how it is getting to the CUDA error to be honest. None of the code in the stack trace is anywhere near the device code. Reading the error message carefully, it first chokes on PetscLogGetStageLog() from a call to PetscClassIdRegister(): > > PetscErrorCode PetscLogGetStageLog(PetscStageLog *stageLog) > { > PetscFunctionBegin; > PetscValidPointer(stageLog,1); > if (!petsc_stageLog) { > fprintf(stderr, "PETSC ERROR: Logging has not been enabled.\nYou might have forgotten to call PetscInitialize().\n"); > PETSCABORT(MPI_COMM_WORLD, PETSC_ERR_SUP); // Here > } > ... > > But then jumps to PetscFinalize(). You can also see the "You might have forgotten to call PetscInitialize().? message in the error message, just under the topmost level of the stack trace. > > Can you check the value of ierr of each function call (use the CHKERRA() macro to do so)? I suspect the problem here that errors occurring previously in the program are being ignored, leading to the garbled stack trace. > > Best regards, > > Jacob Faibussowitsch > (Jacob Fai - booss - oh - vitch) > > > On Jan 14, 2022, at 20:58, Junchao Zhang wrote: > > Jacob, > Could you have a look as it seems the "invalid device context" is in your newly added module? > Thanks > --Junchao Zhang > > > On Fri, Jan 14, 2022 at 12:49 AM Hao DONG wrote: > Dear All, > > I have encountered a peculiar problem when fiddling with a code with PETSC 3.16.3 (which worked fine with PETSc 3.15). It is a very straight forward PDE-based optimization code which repeatedly solves a linearized PDE problem with KSP in a subroutine (the rest of the code does not contain any PETSc related content). The main program provides the subroutine with an MPI comm. 
Then I set the comm as PETSC_COMM_WORLD to tell PETSC to attach to it (and detach with it when the solving is finished each time). > > Strangely, I observe a CUDA failure whenever the petscfinalize is called for a *second* time. In other words, the first and second PDE calculations with GPU are fine (with correct solutions). The petsc code just fails after the SECOND petscfinalize command is called. You can also see the PETSC config in the error message: > > [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [1]PETSC ERROR: GPU error > [1]PETSC ERROR: cuda error 201 (cudaErrorDeviceUninitialized) : invalid device context > [1]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. > [1]PETSC ERROR: Petsc Release Version 3.16.3, unknown > [1]PETSC ERROR: maxwell.gpu on a named stratosphere by hao Fri Jan 14 10:21:05 2022 > [1]PETSC ERROR: Configure options --prefix=/opt/petsc/complex-double-with-cuda --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 COPTFLAGS="-O3 -mavx2" CXXOPTFLAGS="-O3 -mavx2" FOPTFLAGS="-O3 -ffree-line-length-none -mavx2" CUDAOPTFLAGS=-O3 --with-cxx-dialect=cxx14 --with-cuda-dialect=cxx14 --with-scalar-type=complex --with-precision=double --with-cuda-dir=/usr/local/cuda --with-debugging=1 > [1]PETSC ERROR: #1 PetscFinalize() at /home/hao/packages/petsc-current/src/sys/objects/pinit.c:1638 > You might have forgotten to call PetscInitialize(). > The EXACT line numbers in the error traceback are not available. > Instead the line number of the start of the function is given. > [1] #1 PetscAbortFindSourceFile_Private() at /home/hao/packages/petsc-current/src/sys/error/err.c:35 > [1] #2 PetscLogGetStageLog() at /home/hao/packages/petsc-current/src/sys/logging/utils/stagelog.c:29 > [1] #3 PetscClassIdRegister() at /home/hao/packages/petsc-current/src/sys/logging/plog.c:2376 > [1] #4 MatMFFDInitializePackage() at /home/hao/packages/petsc-current/src/mat/impls/mffd/mffd.c:45 > [1] #5 MatInitializePackage() at /home/hao/packages/petsc-current/src/mat/interface/dlregismat.c:163 > [1] #6 MatCreate() at /home/hao/packages/petsc-current/src/mat/utils/gcreate.c:77 > > However, it doesn?t seem to affect the other part of my code, so the code can continue running until it gets to the petsc part again (the *third* time). Unfortunately, it doesn?t give me any further information even if I set the debugging to yes in the configure file. It also worth noting that PETSC without CUDA (i.e. with simple MATMPIAIJ) works perfectly fine. > > I am able to re-produce the problem with a toy code modified from ex11f. Please see the attached file (ex11fc.F90) for details. Essentially the code does the same thing as ex11f, but three times with a do loop. To do that I added an extra MPI_INIT/MPI_FINALIZE to ensure that the MPI communicator is not destroyed when PETSC_FINALIZE is called. I used the PetscOptionsHasName utility to check if you have ?-usecuda? in the options. So running the code with and without that option can give you a comparison w/o CUDA. I can see that the code also fails after the second loop of the KSP operation. Could you kindly shed some lights on this problem? > > I should say that I am not even sure if the problem is from PETSc, as I also accidentally updated the NVIDIA driver (for now it is 510.06 with cuda 11.6). And it is well known that NVIDIA can give you some surprise in the updates (yes, I know I shouldn?t have touched that if it?s not broken). 
But my CUDA code without PETSC (which basically does the same PDE thing, but with cusparse/cublas directly) seems to work just fine after the update. It is also possible that my petsc code related to CUDA was not quite ?legitimate? ? I just use: > MatSetType(A, MATMPIAIJCUSPARSE, ierr) > and > MatCreateVecs(A, u, PETSC_NULL_VEC, ierr) > to make the data onto GPU. I would very much appreciate it if you could show me the ?right? way to do that. > > Thanks a lot in advance, and all the best, > Hao -------------- next part -------------- An HTML attachment was scrubbed... URL: From jacob.fai at gmail.com Tue Jan 18 13:38:37 2022 From: jacob.fai at gmail.com (Jacob Faibussowitsch) Date: Tue, 18 Jan 2022 14:38:37 -0500 Subject: [petsc-users] Strange CUDA failure with a second petscfinalize with PETSc 3.16 In-Reply-To: <45585074-8C0E-453B-993B-554D23A1E971@gmail.com> References: <45585074-8C0E-453B-993B-554D23A1E971@gmail.com> Message-ID: <60E1BD31-E98A-462F-BA2B-2099745B2582@gmail.com> Apologies, forgot to mention in my previous email but can you also include a copy of the full printout of the error message that you get? It will include all the command-line flags that you ran with (if any) so I can exactly mirror your environment. Best regards, Jacob Faibussowitsch (Jacob Fai - booss - oh - vitch) > On Jan 18, 2022, at 14:06, Jacob Faibussowitsch wrote: > > Can you send your updated source file as well as your configure.log (should be $PETSC_DIR/configure.log). I will see if I can reproduce the error on my end. > > Best regards, > > Jacob Faibussowitsch > (Jacob Fai - booss - oh - vitch) > >> On Jan 17, 2022, at 23:06, Hao DONG wrote: >> >> ? >> Dear Junchao and Jacob, >> >> Thanks a lot for the response ? I also don?t understand why this is related to the device, especially on why the procedure can be successfully finished for *once* ? As instructed, I tried to add a CHKERRA() macro after (almost) every petsc line ? such as the initialization, mat assemble, ksp create, solve, mat destroy, etc. However, all other petsc commands returns with error code 0. It only gives me a similar (still not very informative) error after I call the petscfinalize (again for the second time), with error code 97: >> >> [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [0]PETSC ERROR: GPU error >> [0]PETSC ERROR: cuda error 709 (cudaErrorContextIsDestroyed) : context is destroyed >> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. >> [0]PETSC ERROR: Petsc Release Version 3.16.3, unknown >> [0]PETSC ERROR: ./ex11f on a named stratosphere by donghao Tue Jan 18 11:39:43 2022 >> [0]PETSC ERROR: Configure options --prefix=/opt/petsc/complex-double-with-cuda --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 COPTFLAGS="-O3 -mavx2" CXXOPTFLAGS="-O3 -mavx2" FOPTFLAGS="-O3 -ffree-line-length-none -mavx2" CUDAOPTFLAGS=-O3 --with-cxx-dialect=cxx14 --with-cuda-dialect=cxx14 --with-scalar-type=complex --with-precision=double --with-cuda-dir=/usr/local/cuda --with-debugging=1 >> [0]PETSC ERROR: #1 PetscFinalize() at /home/donghao/packages/petsc-current/src/sys/objects/pinit.c:1638 >> [0]PETSC ERROR: #2 User provided function() at User file:0 >> >> I can also confirm that rolling back to petsc 3.15 will *not* see the problem, even with the new nvidia driver. And petsc 3.16.3 with an old nvidia driver (470.42) also get this same error. So it?s probably not connected to the nvidia driver. >> >> Any idea on where I should look at next? 
>> Thanks a lot in advance, and all the best, >> Hao >> >> From: Jacob Faibussowitsch >> Sent: Sunday, January 16, 2022 12:12 AM >> To: Junchao Zhang >> Cc: petsc-users ; Hao DONG >> Subject: Re: [petsc-users] Strange CUDA failure with a second petscfinalize with PETSc 3.16 >> >> I don?t quite understand how it is getting to the CUDA error to be honest. None of the code in the stack trace is anywhere near the device code. Reading the error message carefully, it first chokes on PetscLogGetStageLog() from a call to PetscClassIdRegister(): >> >> PetscErrorCode PetscLogGetStageLog(PetscStageLog *stageLog) >> { >> PetscFunctionBegin; >> PetscValidPointer(stageLog,1); >> if (!petsc_stageLog) { >> fprintf(stderr, "PETSC ERROR: Logging has not been enabled.\nYou might have forgotten to call PetscInitialize().\n"); >> PETSCABORT(MPI_COMM_WORLD, PETSC_ERR_SUP); // Here >> } >> ... >> >> But then jumps to PetscFinalize(). You can also see the "You might have forgotten to call PetscInitialize().? message in the error message, just under the topmost level of the stack trace. >> >> Can you check the value of ierr of each function call (use the CHKERRA() macro to do so)? I suspect the problem here that errors occurring previously in the program are being ignored, leading to the garbled stack trace. >> >> Best regards, >> >> Jacob Faibussowitsch >> (Jacob Fai - booss - oh - vitch) >> >> >> On Jan 14, 2022, at 20:58, Junchao Zhang > wrote: >> >> Jacob, >> Could you have a look as it seems the "invalid device context" is in your newly added module? >> Thanks >> --Junchao Zhang >> >> >> On Fri, Jan 14, 2022 at 12:49 AM Hao DONG > wrote: >> Dear All, >> >> I have encountered a peculiar problem when fiddling with a code with PETSC 3.16.3 (which worked fine with PETSc 3.15). It is a very straight forward PDE-based optimization code which repeatedly solves a linearized PDE problem with KSP in a subroutine (the rest of the code does not contain any PETSc related content). The main program provides the subroutine with an MPI comm. Then I set the comm as PETSC_COMM_WORLD to tell PETSC to attach to it (and detach with it when the solving is finished each time). >> >> Strangely, I observe a CUDA failure whenever the petscfinalize is called for a *second* time. In other words, the first and second PDE calculations with GPU are fine (with correct solutions). The petsc code just fails after the SECOND petscfinalize command is called. You can also see the PETSC config in the error message: >> >> [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [1]PETSC ERROR: GPU error >> [1]PETSC ERROR: cuda error 201 (cudaErrorDeviceUninitialized) : invalid device context >> [1]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. >> [1]PETSC ERROR: Petsc Release Version 3.16.3, unknown >> [1]PETSC ERROR: maxwell.gpu on a named stratosphere by hao Fri Jan 14 10:21:05 2022 >> [1]PETSC ERROR: Configure options --prefix=/opt/petsc/complex-double-with-cuda --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 COPTFLAGS="-O3 -mavx2" CXXOPTFLAGS="-O3 -mavx2" FOPTFLAGS="-O3 -ffree-line-length-none -mavx2" CUDAOPTFLAGS=-O3 --with-cxx-dialect=cxx14 --with-cuda-dialect=cxx14 --with-scalar-type=complex --with-precision=double --with-cuda-dir=/usr/local/cuda --with-debugging=1 >> [1]PETSC ERROR: #1 PetscFinalize() at /home/hao/packages/petsc-current/src/sys/objects/pinit.c:1638 >> You might have forgotten to call PetscInitialize(). 
>> The EXACT line numbers in the error traceback are not available. >> Instead the line number of the start of the function is given. >> [1] #1 PetscAbortFindSourceFile_Private() at /home/hao/packages/petsc-current/src/sys/error/err.c:35 >> [1] #2 PetscLogGetStageLog() at /home/hao/packages/petsc-current/src/sys/logging/utils/stagelog.c:29 >> [1] #3 PetscClassIdRegister() at /home/hao/packages/petsc-current/src/sys/logging/plog.c:2376 >> [1] #4 MatMFFDInitializePackage() at /home/hao/packages/petsc-current/src/mat/impls/mffd/mffd.c:45 >> [1] #5 MatInitializePackage() at /home/hao/packages/petsc-current/src/mat/interface/dlregismat.c:163 >> [1] #6 MatCreate() at /home/hao/packages/petsc-current/src/mat/utils/gcreate.c:77 >> >> However, it doesn?t seem to affect the other part of my code, so the code can continue running until it gets to the petsc part again (the *third* time). Unfortunately, it doesn?t give me any further information even if I set the debugging to yes in the configure file. It also worth noting that PETSC without CUDA (i.e. with simple MATMPIAIJ) works perfectly fine. >> >> I am able to re-produce the problem with a toy code modified from ex11f. Please see the attached file (ex11fc.F90) for details. Essentially the code does the same thing as ex11f, but three times with a do loop. To do that I added an extra MPI_INIT/MPI_FINALIZE to ensure that the MPI communicator is not destroyed when PETSC_FINALIZE is called. I used the PetscOptionsHasName utility to check if you have ?-usecuda? in the options. So running the code with and without that option can give you a comparison w/o CUDA. I can see that the code also fails after the second loop of the KSP operation. Could you kindly shed some lights on this problem? >> >> I should say that I am not even sure if the problem is from PETSc, as I also accidentally updated the NVIDIA driver (for now it is 510.06 with cuda 11.6). And it is well known that NVIDIA can give you some surprise in the updates (yes, I know I shouldn?t have touched that if it?s not broken). But my CUDA code without PETSC (which basically does the same PDE thing, but with cusparse/cublas directly) seems to work just fine after the update. It is also possible that my petsc code related to CUDA was not quite ?legitimate? ? I just use: >> MatSetType(A, MATMPIAIJCUSPARSE, ierr) >> and >> MatCreateVecs(A, u, PETSC_NULL_VEC, ierr) >> to make the data onto GPU. I would very much appreciate it if you could show me the ?right? way to do that. >> >> Thanks a lot in advance, and all the best, >> Hao -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Jan 18 14:20:04 2022 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 18 Jan 2022 15:20:04 -0500 Subject: [petsc-users] Nullspaces In-Reply-To: References: Message-ID: On Thu, Dec 16, 2021 at 11:09 AM Marco Cisternino < marco.cisternino at optimad.it> wrote: > Hello Matthew, > > as promised I prepared a minimal (112960 rows. I?m not able to produce > anything smaller than this and triggering the issue) example of the > behavior I was talking about some days ago. > > What I did is to produce matrix, right hand side and initial solution of > the linear system. > > > > As I told you before, this linear system is the discretization of the > pressure equation of a predictor-corrector method for NS equations in the > framework of finite volume method. > > This case has homogeneous Neumann boundary conditions. 
Computational > domain has two independent and separated sub-domains. > > I discretize the weak formulation and I divide every row of the linear > system by the volume of the relative cell. > > The underlying mesh is not uniform, therefore cells have different > volumes. > > The issue I?m going to explain does not show up if the mesh is uniform, > same volume for all the cells. > > > > I usually build the null space sub-domain by sub-domain with > > MatNullSpaceCreate(getCommunicator(), PETSC_FALSE, nConstants, constants, > &nullspace); > > Where nConstants = 2 and constants contains two normalized arrays with > constant values on degrees of freedom relative to the associated sub-domain > and zeros elsewhere. > > > > However, as a test I tried the constant over the whole domain using 2 > alternatives that should produce the same null space: > > 1. MatNullSpaceCreate(getCommunicator(), PETSC_TRUE, 0, nullptr, > &nullspace); > 2. Vec* nsp; > > VecDuplicateVecs(solution, 1, &nsp); > > VecSet(nsp[0],1.0); > > VecNormalize(nsp[0], nullptr); > > MatNullSpaceCreate(getCommunicator(), PETSC_FALSE, 1, nsp, &nullspace); > > > > Once I created the null space I test it using: > > MatNullSpaceTest(nullspace, m_A, &isNullSpaceValid); > > > > The case 1 pass the test while case 2 don?t. > > > > I have a small code for matrix loading, null spaces creation and testing. > > Unfortunately I cannot implement a small code able to produce that linear > system. > > > > As attachment you can find an archive containing the matrix, the initial > solution (used to manually build the null space) and the rhs (not used in > the test code) in binary format. > > You can also find the testing code in the same archive. > > I used petsc 3.12(gcc+openMPI) and petsc 3.15.2(intelOneAPI) same results. > > If the attachment is not delivered, I can share a link to it. > Marco, please forgive me for taking so long to get to your issue. It has been crazy. You are correct, we had a bug. it is in MatNullSpaceTest. The normalization for the constant vector was wrong: diff --git a/src/mat/interface/matnull.c b/src/mat/interface/matnull.c index f8ab2925988..0c4c3855be0 100644 --- a/src/mat/interface/matnull.c +++ b/src/mat/interface/matnull.c @@ -429,7 +429,7 @@ PetscErrorCode MatNullSpaceTest(MatNullSpace sp,Mat mat,PetscBool *isNull) if (sp->has_cnst) { ierr = VecDuplicate(l,&r);CHKERRQ(ierr); ierr = VecGetSize(l,&N);CHKERRQ(ierr); - sum = 1.0/N; + sum = 1.0/PetscSqrtReal(N); ierr = VecSet(l,sum);CHKERRQ(ierr); ierr = MatMult(mat,l,r);CHKERRQ(ierr); ierr = VecNorm(r,NORM_2,&nrm);CHKERRQ(ierr); With this fix, your two cases give the same answer, namely that the constant vector is not a null vector of your operator, but it is close, as your can see using -mat_null_space_test_view. Thanks, Matt > Thanks for any help. > > > > Marco Cisternino > > > > > > Marco Cisternino, PhD > marco.cisternino at optimad.it > > ______________________ > > Optimad Engineering Srl > > Via Bligny 5, Torino, Italia. > +3901119719782 > www.optimad.it > > > > *From:* Marco Cisternino > *Sent:* marted? 7 dicembre 2021 19:36 > *To:* Matthew Knepley > *Cc:* petsc-users > *Subject:* Re: [petsc-users] Nullspaces > > > > I will, as soon as possible... 
> > > > Scarica Outlook per Android > ------------------------------ > > *From:* Matthew Knepley > *Sent:* Tuesday, December 7, 2021 7:25:43 PM > *To:* Marco Cisternino > *Cc:* petsc-users > *Subject:* Re: [petsc-users] Nullspaces > > > > On Tue, Dec 7, 2021 at 11:19 AM Marco Cisternino < > marco.cisternino at optimad.it> wrote: > > Good morning, > > I?m still struggling with the Poisson equation with Neumann BCs. > > I discretize the equation by finite volume method and I divide every line > of the linear system by the volume of the cell. I could avoid this > division, but I?m trying to understand. > > My mesh is not uniform, i.e. cells have different volumes (it is an octree > mesh). > > Moreover, in my computational domain there are 2 separated sub-domains. > > I build the null space and then I use MatNullSpaceTest to check it. > > > > If I do this: > > MatNullSpaceCreate(getCommunicator(), PETSC_TRUE, 0, nullptr, &nullspace); > > It works > > > > This produces the normalized constant vector. > > > > If I do this: > > Vec nsp; > > VecDuplicate(m_rhs, &nsp); > > VecSet(nsp,1.0); > > VecNormalize(nsp, nullptr); > > MatNullSpaceCreate(getCommunicator(), PETSC_FALSE, 1, &nsp, &nullspace); > > It does not work > > > > This is also the normalized constant vector. > > > > So you are saying that these two vectors give different results with > MatNullSpaceTest()? > > Something must be wrong in the code. Can you send a minimal example of > this? I will go > > through and debug it. > > > > Thanks, > > > > Matt > > > > Probably, I have wrong expectations, but should not it be the same? > > > > Thanks > > > > Marco Cisternino, PhD > marco.cisternino at optimad.it > > ______________________ > > Optimad Engineering Srl > > Via Bligny 5, Torino, Italia. > +3901119719782 > www.optimad.it > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Jan 18 14:24:35 2022 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 18 Jan 2022 15:24:35 -0500 Subject: [petsc-users] Nullspaces In-Reply-To: References: Message-ID: I made a fix for this: https://gitlab.com/petsc/petsc/-/merge_requests/4729 Thanks, Matt On Tue, Jan 18, 2022 at 3:20 PM Matthew Knepley wrote: > On Thu, Dec 16, 2021 at 11:09 AM Marco Cisternino < > marco.cisternino at optimad.it> wrote: > >> Hello Matthew, >> >> as promised I prepared a minimal (112960 rows. I?m not able to produce >> anything smaller than this and triggering the issue) example of the >> behavior I was talking about some days ago. >> >> What I did is to produce matrix, right hand side and initial solution of >> the linear system. >> >> >> >> As I told you before, this linear system is the discretization of the >> pressure equation of a predictor-corrector method for NS equations in the >> framework of finite volume method. >> >> This case has homogeneous Neumann boundary conditions. Computational >> domain has two independent and separated sub-domains. 
>> >> I discretize the weak formulation and I divide every row of the linear >> system by the volume of the relative cell. >> >> The underlying mesh is not uniform, therefore cells have different >> volumes. >> >> The issue I?m going to explain does not show up if the mesh is uniform, >> same volume for all the cells. >> >> >> >> I usually build the null space sub-domain by sub-domain with >> >> MatNullSpaceCreate(getCommunicator(), PETSC_FALSE, nConstants, constants, >> &nullspace); >> >> Where nConstants = 2 and constants contains two normalized arrays with >> constant values on degrees of freedom relative to the associated sub-domain >> and zeros elsewhere. >> >> >> >> However, as a test I tried the constant over the whole domain using 2 >> alternatives that should produce the same null space: >> >> 1. MatNullSpaceCreate(getCommunicator(), PETSC_TRUE, 0, nullptr, >> &nullspace); >> 2. Vec* nsp; >> >> VecDuplicateVecs(solution, 1, &nsp); >> >> VecSet(nsp[0],1.0); >> >> VecNormalize(nsp[0], nullptr); >> >> MatNullSpaceCreate(getCommunicator(), PETSC_FALSE, 1, nsp, &nullspace); >> >> >> >> Once I created the null space I test it using: >> >> MatNullSpaceTest(nullspace, m_A, &isNullSpaceValid); >> >> >> >> The case 1 pass the test while case 2 don?t. >> >> >> >> I have a small code for matrix loading, null spaces creation and testing. >> >> Unfortunately I cannot implement a small code able to produce that linear >> system. >> >> >> >> As attachment you can find an archive containing the matrix, the initial >> solution (used to manually build the null space) and the rhs (not used in >> the test code) in binary format. >> >> You can also find the testing code in the same archive. >> >> I used petsc 3.12(gcc+openMPI) and petsc 3.15.2(intelOneAPI) same results. >> >> If the attachment is not delivered, I can share a link to it. >> > > Marco, please forgive me for taking so long to get to your issue. It has > been crazy. > > You are correct, we had a bug. it is in MatNullSpaceTest. The > normalization for the constant vector was wrong: > > diff --git a/src/mat/interface/matnull.c b/src/mat/interface/matnull.c > index f8ab2925988..0c4c3855be0 100644 > --- a/src/mat/interface/matnull.c > +++ b/src/mat/interface/matnull.c > @@ -429,7 +429,7 @@ PetscErrorCode MatNullSpaceTest(MatNullSpace sp,Mat > mat,PetscBool *isNull) > if (sp->has_cnst) { > ierr = VecDuplicate(l,&r);CHKERRQ(ierr); > ierr = VecGetSize(l,&N);CHKERRQ(ierr); > - sum = 1.0/N; > > + sum = 1.0/PetscSqrtReal(N); > ierr = VecSet(l,sum);CHKERRQ(ierr); > ierr = MatMult(mat,l,r);CHKERRQ(ierr); > ierr = VecNorm(r,NORM_2,&nrm);CHKERRQ(ierr); > > With this fix, your two cases give the same answer, namely that the > constant vector is not a null vector of your > operator, but it is close, as your can see using -mat_null_space_test_view. > > Thanks, > > Matt > > >> Thanks for any help. >> >> >> >> Marco Cisternino >> >> >> >> >> >> Marco Cisternino, PhD >> marco.cisternino at optimad.it >> >> ______________________ >> >> Optimad Engineering Srl >> >> Via Bligny 5, Torino, Italia. >> +3901119719782 >> www.optimad.it >> >> >> >> *From:* Marco Cisternino >> *Sent:* marted? 7 dicembre 2021 19:36 >> *To:* Matthew Knepley >> *Cc:* petsc-users >> *Subject:* Re: [petsc-users] Nullspaces >> >> >> >> I will, as soon as possible... 
>> >> >> >> Scarica Outlook per Android >> ------------------------------ >> >> *From:* Matthew Knepley >> *Sent:* Tuesday, December 7, 2021 7:25:43 PM >> *To:* Marco Cisternino >> *Cc:* petsc-users >> *Subject:* Re: [petsc-users] Nullspaces >> >> >> >> On Tue, Dec 7, 2021 at 11:19 AM Marco Cisternino < >> marco.cisternino at optimad.it> wrote: >> >> Good morning, >> >> I?m still struggling with the Poisson equation with Neumann BCs. >> >> I discretize the equation by finite volume method and I divide every line >> of the linear system by the volume of the cell. I could avoid this >> division, but I?m trying to understand. >> >> My mesh is not uniform, i.e. cells have different volumes (it is an >> octree mesh). >> >> Moreover, in my computational domain there are 2 separated sub-domains. >> >> I build the null space and then I use MatNullSpaceTest to check it. >> >> >> >> If I do this: >> >> MatNullSpaceCreate(getCommunicator(), PETSC_TRUE, 0, nullptr, &nullspace); >> >> It works >> >> >> >> This produces the normalized constant vector. >> >> >> >> If I do this: >> >> Vec nsp; >> >> VecDuplicate(m_rhs, &nsp); >> >> VecSet(nsp,1.0); >> >> VecNormalize(nsp, nullptr); >> >> MatNullSpaceCreate(getCommunicator(), PETSC_FALSE, 1, &nsp, &nullspace); >> >> It does not work >> >> >> >> This is also the normalized constant vector. >> >> >> >> So you are saying that these two vectors give different results with >> MatNullSpaceTest()? >> >> Something must be wrong in the code. Can you send a minimal example of >> this? I will go >> >> through and debug it. >> >> >> >> Thanks, >> >> >> >> Matt >> >> >> >> Probably, I have wrong expectations, but should not it be the same? >> >> >> >> Thanks >> >> >> >> Marco Cisternino, PhD >> marco.cisternino at optimad.it >> >> ______________________ >> >> Optimad Engineering Srl >> >> Via Bligny 5, Torino, Italia. >> +3901119719782 >> www.optimad.it >> >> >> >> >> >> >> -- >> >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From xsli at lbl.gov Tue Jan 18 16:50:28 2022 From: xsli at lbl.gov (Xiaoye S. Li) Date: Tue, 18 Jan 2022 14:50:28 -0800 Subject: [petsc-users] Downloaded superlu_dist could not be used. Please check install in $PREFIX In-Reply-To: References: Message-ID: There was a merge error in the master branch. I fixed it today. Not sure whether that's causing your problem. Can you try now? Sherry On Mon, Jan 17, 2022 at 11:55 AM Fande Kong wrote: > I am trying to port PETSc-3.16.3 to the MOOSE ecosystem. I got an error > that PETSc could not build superlu_dist. The log file was attached. > > PETSc-3.15.x worked correctly in the same environment. > > Thanks, > Fande > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From balay at mcs.anl.gov Tue Jan 18 23:48:04 2022 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 18 Jan 2022 23:48:04 -0600 (CST) Subject: [petsc-users] Downloaded superlu_dist could not be used. Please check install in $PREFIX In-Reply-To: References: Message-ID: <7a5749b1-2695-d7c0-beaf-44415b3e68b9@mcs.anl.gov> Sherry, This is with superlu-dist-7.1.1 [not master branch] Fande, >>>>>> Executing: mpifort -o /tmp/petsc-UYa6A8/config.compilers/conftest -fopenmp -fopenmp -I$PREFIX/include -fPIC -O3 -fopenmp /tmp/petsc-UYa6A8/config.compilers/conftest.o /tmp/petsc-UYa6A8/config.compilers/confc.o -Wl,-rpath,$PREFIX/lib -L$PREFIX/lib -lsuperlu_dist -lpthread -Wl,-rpath,$PREFIX/lib -L$PREFIX/lib -lparmetis -Wl,-rpath,$PREFIX/lib -L$PREFIX/lib -lmetis -Wl,-rpath,$PREFIX/lib -L$PREFIX/lib -lflapack -Wl,-rpath,$PREFIX/lib -L$PREFIX/lib -lfblas -lm -Wl,-rpath,$BUILD_PREFIX/lib -L$BUILD_PREFIX/lib -lstdc++ -ldl -lmpifort -lmpi -lgfortran -lm -Wl,-rpath,$BUILD_PREFIX/lib/gcc/x86_64-conda-linux-gnu/9.3.0 -L$BUILD_PREFIX/lib/gcc/x86_64-conda-linux-gnu/9.3.0 -Wl,-rpath,$BUILD_PREFIX/lib/gcc -L$BUILD_PREFIX/lib/gcc -Wl,-rpath,$BUILD_PREFIX/x86_64-conda-linux-gnu/lib -L$BUILD_PREFIX/x86_64-conda-linux-gnu/lib -Wl,-rpath,$BUILD_PREFIX/lib -lgfortran -lm -lgcc_s -lquadmath -lrt -lquadmath -lstdc++ -ldl Possible ERROR while running linker: stderr: $BUILD_PREFIX/bin/../lib/gcc/x86_64-conda-linux-gnu/9.3.0/../../../../x86_64-conda-linux-gnu/bin/ld: warning: libmpicxx.so.12, needed by $PREFIX/lib/libsuperlu_dist.so, not found (try using -rpath or -rpath-link) <<< I don't really understand why this error comes up [as with shared libraries we should be able to link with -lsuperlu_dist - without having to link with libmpicxx.so.12 What do you get for: ldd $PREFIX/lib/libstdc++.so BTW: is configure.log modified to replace realpaths with $PREFIX $BUILD_PREFIX etc? Can you try additional configure option LIBS=-lmpicxx and see if that works around this problem? Satish On Tue, 18 Jan 2022, Xiaoye S. Li wrote: > There was a merge error in the master branch. I fixed it today. Not sure > whether that's causing your problem. Can you try now? > > Sherry > > On Mon, Jan 17, 2022 at 11:55 AM Fande Kong wrote: > > > I am trying to port PETSc-3.16.3 to the MOOSE ecosystem. I got an error > > that PETSc could not build superlu_dist. The log file was attached. > > > > PETSc-3.15.x worked correctly in the same environment. > > > > Thanks, > > Fande > > > From dong-hao at outlook.com Wed Jan 19 01:01:12 2022 From: dong-hao at outlook.com (Hao DONG) Date: Wed, 19 Jan 2022 07:01:12 +0000 Subject: [petsc-users] Strange CUDA failure with a second petscfinalize with PETSc 3.16 In-Reply-To: <60E1BD31-E98A-462F-BA2B-2099745B2582@gmail.com> References: <45585074-8C0E-453B-993B-554D23A1E971@gmail.com> <60E1BD31-E98A-462F-BA2B-2099745B2582@gmail.com> Message-ID: Thanks Jacob for looking into this ? You can see the updated source code of ex11fc in the attachment ? although there is not much that I modified (except for the jabbers I outputted). I also attached the full output (ex11fc.log) along with the configure.log file. It?s an old dual Xeon workstation (one of my ?production? machines) with Linux kernel 5.4.0 and gcc 9.3. I simply ran the code with mpiexec -np 2 ex11fc -usecuda for GPU test. And as stated before, calling without the ?-usecuda? option shows no errors. Please let me know if you find anything wrong with the configure/code. 
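For reference, a minimal C sketch of the loop structure described above (the actual reproducer is the attached Fortran ex11fc.F90; the matrix/KSP body is elided, and the three iterations mirror the description):

#include <petscksp.h>

int main(int argc, char **argv)
{
  PetscErrorCode ierr;
  PetscBool      usecuda;
  int            i;

  MPI_Init(&argc, &argv);                  /* MPI lives across all PETSc cycles */
  for (i = 0; i < 3; i++) {
    PETSC_COMM_WORLD = MPI_COMM_WORLD;     /* attach PETSc to the existing communicator */
    ierr = PetscInitialize(&argc, &argv, NULL, NULL); if (ierr) return ierr;
    ierr = PetscOptionsHasName(NULL, NULL, "-usecuda", &usecuda);CHKERRA(ierr);
    /* ... create A; if (usecuda) MatSetType(A, MATMPIAIJCUSPARSE);
       assemble, KSPSolve, destroy objects ... */
    ierr = PetscFinalize(); if (ierr) return ierr;  /* the reported failure appears on the second pass with 3.16 */
  }
  MPI_Finalize();
  return 0;
}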
Cheers, Hao From: Jacob Faibussowitsch Sent: Wednesday, January 19, 2022 3:38 AM To: Hao DONG Cc: Junchao Zhang; petsc-users Subject: Re: [petsc-users] Strange CUDA failure with a second petscfinalize with PETSc 3.16 Apologies, forgot to mention in my previous email but can you also include a copy of the full printout of the error message that you get? It will include all the command-line flags that you ran with (if any) so I can exactly mirror your environment. Best regards, Jacob Faibussowitsch (Jacob Fai - booss - oh - vitch) On Jan 18, 2022, at 14:06, Jacob Faibussowitsch > wrote: Can you send your updated source file as well as your configure.log (should be $PETSC_DIR/configure.log). I will see if I can reproduce the error on my end. Best regards, Jacob Faibussowitsch (Jacob Fai - booss - oh - vitch) On Jan 17, 2022, at 23:06, Hao DONG > wrote: ? Dear Junchao and Jacob, Thanks a lot for the response ? I also don?t understand why this is related to the device, especially on why the procedure can be successfully finished for *once* ? As instructed, I tried to add a CHKERRA() macro after (almost) every petsc line ? such as the initialization, mat assemble, ksp create, solve, mat destroy, etc. However, all other petsc commands returns with error code 0. It only gives me a similar (still not very informative) error after I call the petscfinalize (again for the second time), with error code 97: [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: GPU error [0]PETSC ERROR: cuda error 709 (cudaErrorContextIsDestroyed) : context is destroyed [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.16.3, unknown [0]PETSC ERROR: ./ex11f on a named stratosphere by donghao Tue Jan 18 11:39:43 2022 [0]PETSC ERROR: Configure options --prefix=/opt/petsc/complex-double-with-cuda --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 COPTFLAGS="-O3 -mavx2" CXXOPTFLAGS="-O3 -mavx2" FOPTFLAGS="-O3 -ffree-line-length-none -mavx2" CUDAOPTFLAGS=-O3 --with-cxx-dialect=cxx14 --with-cuda-dialect=cxx14 --with-scalar-type=complex --with-precision=double --with-cuda-dir=/usr/local/cuda --with-debugging=1 [0]PETSC ERROR: #1 PetscFinalize() at /home/donghao/packages/petsc-current/src/sys/objects/pinit.c:1638 [0]PETSC ERROR: #2 User provided function() at User file:0 I can also confirm that rolling back to petsc 3.15 will *not* see the problem, even with the new nvidia driver. And petsc 3.16.3 with an old nvidia driver (470.42) also get this same error. So it?s probably not connected to the nvidia driver. Any idea on where I should look at next? Thanks a lot in advance, and all the best, Hao From: Jacob Faibussowitsch Sent: Sunday, January 16, 2022 12:12 AM To: Junchao Zhang Cc: petsc-users; Hao DONG Subject: Re: [petsc-users] Strange CUDA failure with a second petscfinalize with PETSc 3.16 I don?t quite understand how it is getting to the CUDA error to be honest. None of the code in the stack trace is anywhere near the device code. Reading the error message carefully, it first chokes on PetscLogGetStageLog() from a call to PetscClassIdRegister(): PetscErrorCode PetscLogGetStageLog(PetscStageLog *stageLog) { PetscFunctionBegin; PetscValidPointer(stageLog,1); if (!petsc_stageLog) { fprintf(stderr, "PETSC ERROR: Logging has not been enabled.\nYou might have forgotten to call PetscInitialize().\n"); PETSCABORT(MPI_COMM_WORLD, PETSC_ERR_SUP); // Here } ... But then jumps to PetscFinalize(). 
You can also see the "You might have forgotten to call PetscInitialize().? message in the error message, just under the topmost level of the stack trace. Can you check the value of ierr of each function call (use the CHKERRA() macro to do so)? I suspect the problem here that errors occurring previously in the program are being ignored, leading to the garbled stack trace. Best regards, Jacob Faibussowitsch (Jacob Fai - booss - oh - vitch) On Jan 14, 2022, at 20:58, Junchao Zhang > wrote: Jacob, Could you have a look as it seems the "invalid device context" is in your newly added module? Thanks --Junchao Zhang On Fri, Jan 14, 2022 at 12:49 AM Hao DONG > wrote: Dear All, I have encountered a peculiar problem when fiddling with a code with PETSC 3.16.3 (which worked fine with PETSc 3.15). It is a very straight forward PDE-based optimization code which repeatedly solves a linearized PDE problem with KSP in a subroutine (the rest of the code does not contain any PETSc related content). The main program provides the subroutine with an MPI comm. Then I set the comm as PETSC_COMM_WORLD to tell PETSC to attach to it (and detach with it when the solving is finished each time). Strangely, I observe a CUDA failure whenever the petscfinalize is called for a *second* time. In other words, the first and second PDE calculations with GPU are fine (with correct solutions). The petsc code just fails after the SECOND petscfinalize command is called. You can also see the PETSC config in the error message: [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [1]PETSC ERROR: GPU error [1]PETSC ERROR: cuda error 201 (cudaErrorDeviceUninitialized) : invalid device context [1]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. [1]PETSC ERROR: Petsc Release Version 3.16.3, unknown [1]PETSC ERROR: maxwell.gpu on a named stratosphere by hao Fri Jan 14 10:21:05 2022 [1]PETSC ERROR: Configure options --prefix=/opt/petsc/complex-double-with-cuda --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 COPTFLAGS="-O3 -mavx2" CXXOPTFLAGS="-O3 -mavx2" FOPTFLAGS="-O3 -ffree-line-length-none -mavx2" CUDAOPTFLAGS=-O3 --with-cxx-dialect=cxx14 --with-cuda-dialect=cxx14 --with-scalar-type=complex --with-precision=double --with-cuda-dir=/usr/local/cuda --with-debugging=1 [1]PETSC ERROR: #1 PetscFinalize() at /home/hao/packages/petsc-current/src/sys/objects/pinit.c:1638 You might have forgotten to call PetscInitialize(). The EXACT line numbers in the error traceback are not available. Instead the line number of the start of the function is given. [1] #1 PetscAbortFindSourceFile_Private() at /home/hao/packages/petsc-current/src/sys/error/err.c:35 [1] #2 PetscLogGetStageLog() at /home/hao/packages/petsc-current/src/sys/logging/utils/stagelog.c:29 [1] #3 PetscClassIdRegister() at /home/hao/packages/petsc-current/src/sys/logging/plog.c:2376 [1] #4 MatMFFDInitializePackage() at /home/hao/packages/petsc-current/src/mat/impls/mffd/mffd.c:45 [1] #5 MatInitializePackage() at /home/hao/packages/petsc-current/src/mat/interface/dlregismat.c:163 [1] #6 MatCreate() at /home/hao/packages/petsc-current/src/mat/utils/gcreate.c:77 However, it doesn?t seem to affect the other part of my code, so the code can continue running until it gets to the petsc part again (the *third* time). Unfortunately, it doesn?t give me any further information even if I set the debugging to yes in the configure file. It also worth noting that PETSC without CUDA (i.e. 
with simple MATMPIAIJ) works perfectly fine. I am able to re-produce the problem with a toy code modified from ex11f. Please see the attached file (ex11fc.F90) for details. Essentially the code does the same thing as ex11f, but three times with a do loop. To do that I added an extra MPI_INIT/MPI_FINALIZE to ensure that the MPI communicator is not destroyed when PETSC_FINALIZE is called. I used the PetscOptionsHasName utility to check if you have ?-usecuda? in the options. So running the code with and without that option can give you a comparison w/o CUDA. I can see that the code also fails after the second loop of the KSP operation. Could you kindly shed some lights on this problem? I should say that I am not even sure if the problem is from PETSc, as I also accidentally updated the NVIDIA driver (for now it is 510.06 with cuda 11.6). And it is well known that NVIDIA can give you some surprise in the updates (yes, I know I shouldn?t have touched that if it?s not broken). But my CUDA code without PETSC (which basically does the same PDE thing, but with cusparse/cublas directly) seems to work just fine after the update. It is also possible that my petsc code related to CUDA was not quite ?legitimate? ? I just use: MatSetType(A, MATMPIAIJCUSPARSE, ierr) and MatCreateVecs(A, u, PETSC_NULL_VEC, ierr) to make the data onto GPU. I would very much appreciate it if you could show me the ?right? way to do that. Thanks a lot in advance, and all the best, Hao -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: application/octet-stream Size: 1006969 bytes Desc: configure.log URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ex11fc.F90 Type: application/octet-stream Size: 10723 bytes Desc: ex11fc.F90 URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ex11fc.log Type: application/octet-stream Size: 6686 bytes Desc: ex11fc.log URL: From varunhiremath at gmail.com Wed Jan 19 02:44:44 2022 From: varunhiremath at gmail.com (Varun Hiremath) Date: Wed, 19 Jan 2022 00:44:44 -0800 Subject: [petsc-users] PETSc MUMPS interface In-Reply-To: References: <1AD9EBCE-4B48-4D54-9829-DFD5EDC68B76@dsic.upv.es> Message-ID: Hi Hong, Thanks, I tested your branch and I think it is working fine. I don't see any increase in runtime, however with -log_view I see that the MatLUFactorSymbolic function is still being called twice, so is this expected? Is the second call a no-op? $ ./ex52.o -use_mumps_lu -print_mumps_memory -log_view | grep MatLUFactorSym MatLUFactorSym 2 1.0 4.4411e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 Thanks, Varun On Mon, Jan 17, 2022 at 7:49 PM Zhang, Hong wrote: > Varun, > I created a branch hzhang/feature-mumps-mem-estimate, > see https://gitlab.com/petsc/petsc/-/merge_requests/4727 > > You may give it a try and let me know if this is what you want. > src/ksp/ksp/tutorials/ex52.c is an example. > > Hong > ------------------------------ > *From:* Varun Hiremath > *Sent:* Monday, January 17, 2022 12:41 PM > *To:* Zhang, Hong > *Cc:* Jose E. Roman ; Peder J?rgensgaard Olesen via > petsc-users > *Subject:* Re: [petsc-users] PETSc MUMPS interface > > Hi Hong, > > Thanks for looking into this. Here is the workflow that I might use: > > MatLUFactorSymbolic(F,A,perm,iperm,&info); > > // get memory estimates from MUMPS e.g. 
INFO(3), INFOG(16), INFOG(17) > // find available memory on the system e.g. RAM size > if (estimated_memory > available_memory) > { > // inform and stop; or > // switch MUMPS to out-of-core factorization > ICNTL(22) = 1; > } > else > { > // set appropriate settings for in-core factorization > } > > // Now we call the solve and inside if MatLUFactorSymbolic is already > called then it should be skipped > EPSSolve(eps); > > Thanks, > Varun > > On Mon, Jan 17, 2022 at 9:18 AM Zhang, Hong wrote: > > Varun, > I am trying to find a way to enable you to switch options after MatLUFactorSymbolic(). > A hack is modifying the flag 'mumps->matstruc' > inside MatLUFactorSymbolic_AIJMUMPS() and MatFactorNumeric_MUMPS(). > > My understanding of what you want is: > // collect mumps memory info > ... > MatLUFactorSymbolic(F,A,perm,iperm,&info); > printMumpsMemoryInfo(F); > //--------- > if (memory is available) { > EPSSolve(eps); --> skip calling of MatLUFactorSymbolic() > } else { > //out-of-core (OOC) option in MUMPS > } > > Am I correct? I'll let you know once I work out a solution. > Hong > > ------------------------------ > *From:* Varun Hiremath > *Sent:* Sunday, January 16, 2022 10:10 PM > *To:* Zhang, Hong > *Cc:* Jose E. Roman ; Peder J?rgensgaard Olesen via > petsc-users > *Subject:* Re: [petsc-users] PETSc MUMPS interface > > Hi Jose, Hong, > > Thanks for the explanation. I have verified using -log_view that MatLUFactorSymbolic > is indeed getting called twice. > > Hong, we use MUMPS solver for other things, and we typically run the > symbolic analysis first and get memory estimates to ensure that we have > enough memory available to run the case. If the available memory is not > enough, we can stop or switch to the out-of-core (OOC) option in MUMPS. We > wanted to do the same when using MUMPS via SLEPc/PETSc. Please let me know > if there are other ways of getting these memory stats and switching options > during runtime with PETSc. > Appreciate your help! > > Thanks, > Varun > > On Sun, Jan 16, 2022 at 4:01 PM Zhang, Hong wrote: > > Varun, > I believe Jose is correct. You may verify it by running your code with > option '-log_view', then check the number of calls to MatLUFactorSym. > > I guess I can add a flag in PCSetUp() to check if user has already called > MatLUFactorSymbolic() and wants to skip it. Normally, users simply allocate > sufficient memory in the symbolic factorization. Why do you want to check > it? > Hong > > ------------------------------ > *From:* Jose E. Roman > *Sent:* Sunday, January 16, 2022 5:11 AM > *To:* Varun Hiremath > *Cc:* Zhang, Hong ; Peder J?rgensgaard Olesen via > petsc-users > *Subject:* Re: [petsc-users] PETSc MUMPS interface > > Hong may give a better answer, but if you look at PCSetUp_LU() > https://petsc.org/main/src/ksp/pc/impls/factor/lu/lu.c.html#PCSetUp_LU > you will see that MatLUFactorSymbolic() is called unconditionally during > the first PCSetUp(). Currently there is no way to check if the user has > already called MatLUFactorSymbolic(). > > Jose > > > > El 16 ene 2022, a las 10:40, Varun Hiremath > escribi?: > > > > Hi Hong, > > > > Thank you, this is very helpful! > > > > Using this method I am able to get the memory estimates before the > actual solve, however, I think my code may be causing the symbolic > factorization to be run twice. Attached is my code where I am using SLEPc > to compute eigenvalues, and I use MUMPS for factorization. 
I have commented > above the code that computes the memory estimates, could you please check > and tell me if this would cause the symbolic factor to be computed twice (a > second time inside EPSSolve?), as I am seeing a slight increase in the > overall computation time? > > > > Regards, > > Varun > > > > On Wed, Jan 12, 2022 at 7:58 AM Zhang, Hong wrote: > > PCFactorSetMatSolverType(pc,MATSOLVERMUMPS); > > PCFactorSetUpMatSolverType(pc); > > PCFactorGetMatrix(pc,&F); > > > > MatLUFactorSymbolic(F,A,...) > > You must provide row and column permutations etc, > petsc/src/mat/tests/ex125.c may give you a clue on how to get these inputs. > > > > Hong > > > > > > From: petsc-users on behalf of > Junchao Zhang > > Sent: Wednesday, January 12, 2022 9:03 AM > > To: Varun Hiremath > > Cc: Peder J?rgensgaard Olesen via petsc-users > > Subject: Re: [petsc-users] PETSc MUMPS interface > > > > Calling PCSetUp() before KSPSetUp()? > > > > --Junchao Zhang > > > > > > On Wed, Jan 12, 2022 at 3:00 AM Varun Hiremath > wrote: > > Hi All, > > > > I want to collect MUMPS memory estimates based on the initial symbolic > factorization analysis before the actual numerical factorization starts to > check if the estimated memory requirements fit the available memory. > > > > I am following the steps from > https://petsc.org/main/src/ksp/ksp/tutorials/ex52.c.html > > > > PCFactorSetMatSolverType(pc,MATSOLVERMUMPS); > > PCFactorSetUpMatSolverType(pc); > > PCFactorGetMatrix(pc,&F); > > > > KSPSetUp(ksp); > > MatMumpsGetInfog(F,...) > > > > But it appears KSPSetUp calls both symbolic and numerical factorization. > So is there some other way to get these statistics before the actual > factorization starts? > > > > Thanks, > > Varun > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From marco.cisternino at optimad.it Wed Jan 19 03:52:05 2022 From: marco.cisternino at optimad.it (Marco Cisternino) Date: Wed, 19 Jan 2022 09:52:05 +0000 Subject: [petsc-users] Nullspaces In-Reply-To: References: Message-ID: Thank you Matthew. But I cannot get the point. I got the point about the test but to try to explain my doubt I?m going to prepare another toy code. By words? I usually have a finite volume discretization of the Laplace operator with homogeneous Neumann BC on an octree mesh and it reads Aij * xj = bi, being the discretization of Int|Vi(nabla^2 pi dV) = Int|Vi(nabla dot ui) (I omit constants), where Int|Vi(?dV) is a volume integral on the I cell, pi is cell pressure, ui is the cell velocity. The computational domain contains 2 separated sub-domains. Let?s consider 4 cells into the whole domain and 2 cells for each sub-domain. I would expect a null space made of 2 vectors and from your patch they should look like [1/sqrt(2) 1/sqrt(2) 0 0] and [0 0 1/sqrt(2) 1/sqrt(2)], i.e. norm2 = 1 for both. With MatNullSpaceCreate(getCommunicator(), PETSC_TRUE, 0, nullptr, &nullspace) I?m saying that [1/sqrt(4) 1/sqrt(4) 1/sqrt(4) 1/sqrt(4)] is the null space, which is not but it is in the null space. But this is not the case I sent you, not exactly. The case I sent is 1/Vi * Aij * xj = 1/Vi bi, where Vi is the volume of the cell i. Let?s admit that yj is in the null space of of Aij, it should be in the null space of 1/Vi * Aij, therefore Aij*yj = 0 and 1/Vi * Aij*yj = 0 too. But in the framework of the test, this is true with infinite precision. What happens if norm2(Aij*yj) = 10^-15 and Vi = 10^-5? Norm2(1/Vi * Aij * yj) = 10^-10!!! Is yi still in the null space numerically? 
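
A small sketch of that check, assuming the matrix and the cell-volume vector are available (for instance loaded with MatLoad/VecLoad; all names here are illustrative, nothing is tied to the attached case): it tests the normalized constant vector first against A and then against the row-scaled operator diag(1/Vi)*A, and running with -mat_null_space_test_view shows how the residual of the very same vector is amplified by the 1/Vi factors.

#include <petscmat.h>

/* Test a candidate constant null vector against A and against diag(1/V) A.
   Note that MatDiagonalScale modifies A in place. */
static PetscErrorCode TestNullSpaceWithAndWithoutScaling(Mat A, Vec vol)
{
  PetscErrorCode ierr;
  Vec            c, invVol;
  MatNullSpace   nsp;
  PetscBool      isNull;

  PetscFunctionBeginUser;
  ierr = MatCreateVecs(A, &c, NULL);CHKERRQ(ierr);
  ierr = VecSet(c, 1.0);CHKERRQ(ierr);
  ierr = VecNormalize(c, NULL);CHKERRQ(ierr);                   /* entries 1/sqrt(N) */
  ierr = MatNullSpaceCreate(PetscObjectComm((PetscObject)A), PETSC_FALSE, 1, &c, &nsp);CHKERRQ(ierr);

  ierr = MatNullSpaceTest(nsp, A, &isNull);CHKERRQ(ierr);       /* residual ~ ||A c||             */
  ierr = PetscPrintf(PETSC_COMM_WORLD, "unscaled operator:   %s\n", isNull ? "null" : "not null");CHKERRQ(ierr);

  ierr = VecDuplicate(vol, &invVol);CHKERRQ(ierr);
  ierr = VecCopy(vol, invVol);CHKERRQ(ierr);
  ierr = VecReciprocal(invVol);CHKERRQ(ierr);
  ierr = MatDiagonalScale(A, invVol, NULL);CHKERRQ(ierr);       /* rows of A scaled by 1/Vi       */

  ierr = MatNullSpaceTest(nsp, A, &isNull);CHKERRQ(ierr);       /* residual ~ ||diag(1/V) A c||   */
  ierr = PetscPrintf(PETSC_COMM_WORLD, "row-scaled operator: %s\n", isNull ? "null" : "not null");CHKERRQ(ierr);

  ierr = MatNullSpaceDestroy(&nsp);CHKERRQ(ierr);
  ierr = VecDestroy(&invVol);CHKERRQ(ierr);
  ierr = VecDestroy(&c);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}
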
Let?s say yi is the constant vector over the whole domain, i.e. [1/sqrt(4) 1/sqrt(4) 1/sqrt(4) 1/sqrt(4)]. Should this be in the null space of 1/Vi * Aij, shouldn?t it? An analogous argument should be for the compatibility condition that concerns bi. My current problem is that properly creating the null space for Aij, i.e. [1/sqrt(2) 1/sqrt(2) 0 0] and [0 0 1/sqrt(2) 1/sqrt(2)], allows me to solve and find xi, but multiplying by 1/Vi, I cannot get any solution using both FGMRES+ILU and FGMRE+GAMG. The tiny problem will load Aij, Vi and bi and show the problem by testing the proper null space and trying to solve. I will include the patch to my PETSc version. I hope to come back to you very soon. Thank you very much for your support! Marco Cisternino From: Matthew Knepley Sent: marted? 18 gennaio 2022 21:25 To: Marco Cisternino Cc: petsc-users Subject: Re: [petsc-users] Nullspaces I made a fix for this: https://gitlab.com/petsc/petsc/-/merge_requests/4729 Thanks, Matt On Tue, Jan 18, 2022 at 3:20 PM Matthew Knepley > wrote: On Thu, Dec 16, 2021 at 11:09 AM Marco Cisternino > wrote: Hello Matthew, as promised I prepared a minimal (112960 rows. I?m not able to produce anything smaller than this and triggering the issue) example of the behavior I was talking about some days ago. What I did is to produce matrix, right hand side and initial solution of the linear system. As I told you before, this linear system is the discretization of the pressure equation of a predictor-corrector method for NS equations in the framework of finite volume method. This case has homogeneous Neumann boundary conditions. Computational domain has two independent and separated sub-domains. I discretize the weak formulation and I divide every row of the linear system by the volume of the relative cell. The underlying mesh is not uniform, therefore cells have different volumes. The issue I?m going to explain does not show up if the mesh is uniform, same volume for all the cells. I usually build the null space sub-domain by sub-domain with MatNullSpaceCreate(getCommunicator(), PETSC_FALSE, nConstants, constants, &nullspace); Where nConstants = 2 and constants contains two normalized arrays with constant values on degrees of freedom relative to the associated sub-domain and zeros elsewhere. However, as a test I tried the constant over the whole domain using 2 alternatives that should produce the same null space: 1. MatNullSpaceCreate(getCommunicator(), PETSC_TRUE, 0, nullptr, &nullspace); 2. Vec* nsp; VecDuplicateVecs(solution, 1, &nsp); VecSet(nsp[0],1.0); VecNormalize(nsp[0], nullptr); MatNullSpaceCreate(getCommunicator(), PETSC_FALSE, 1, nsp, &nullspace); Once I created the null space I test it using: MatNullSpaceTest(nullspace, m_A, &isNullSpaceValid); The case 1 pass the test while case 2 don?t. I have a small code for matrix loading, null spaces creation and testing. Unfortunately I cannot implement a small code able to produce that linear system. As attachment you can find an archive containing the matrix, the initial solution (used to manually build the null space) and the rhs (not used in the test code) in binary format. You can also find the testing code in the same archive. I used petsc 3.12(gcc+openMPI) and petsc 3.15.2(intelOneAPI) same results. If the attachment is not delivered, I can share a link to it. Marco, please forgive me for taking so long to get to your issue. It has been crazy. You are correct, we had a bug. it is in MatNullSpaceTest. 
The normalization for the constant vector was wrong: diff --git a/src/mat/interface/matnull.c b/src/mat/interface/matnull.c index f8ab2925988..0c4c3855be0 100644 --- a/src/mat/interface/matnull.c +++ b/src/mat/interface/matnull.c @@ -429,7 +429,7 @@ PetscErrorCode MatNullSpaceTest(MatNullSpace sp,Mat mat,PetscBool *isNull) if (sp->has_cnst) { ierr = VecDuplicate(l,&r);CHKERRQ(ierr); ierr = VecGetSize(l,&N);CHKERRQ(ierr); - sum = 1.0/N; + sum = 1.0/PetscSqrtReal(N); ierr = VecSet(l,sum);CHKERRQ(ierr); ierr = MatMult(mat,l,r);CHKERRQ(ierr); ierr = VecNorm(r,NORM_2,&nrm);CHKERRQ(ierr); With this fix, your two cases give the same answer, namely that the constant vector is not a null vector of your operator, but it is close, as your can see using -mat_null_space_test_view. Thanks, Matt Thanks for any help. Marco Cisternino Marco Cisternino, PhD marco.cisternino at optimad.it ______________________ Optimad Engineering Srl Via Bligny 5, Torino, Italia. +3901119719782 www.optimad.it From: Marco Cisternino > Sent: marted? 7 dicembre 2021 19:36 To: Matthew Knepley > Cc: petsc-users > Subject: Re: [petsc-users] Nullspaces I will, as soon as possible... Scarica Outlook per Android ________________________________ From: Matthew Knepley > Sent: Tuesday, December 7, 2021 7:25:43 PM To: Marco Cisternino > Cc: petsc-users > Subject: Re: [petsc-users] Nullspaces On Tue, Dec 7, 2021 at 11:19 AM Marco Cisternino > wrote: Good morning, I?m still struggling with the Poisson equation with Neumann BCs. I discretize the equation by finite volume method and I divide every line of the linear system by the volume of the cell. I could avoid this division, but I?m trying to understand. My mesh is not uniform, i.e. cells have different volumes (it is an octree mesh). Moreover, in my computational domain there are 2 separated sub-domains. I build the null space and then I use MatNullSpaceTest to check it. If I do this: MatNullSpaceCreate(getCommunicator(), PETSC_TRUE, 0, nullptr, &nullspace); It works This produces the normalized constant vector. If I do this: Vec nsp; VecDuplicate(m_rhs, &nsp); VecSet(nsp,1.0); VecNormalize(nsp, nullptr); MatNullSpaceCreate(getCommunicator(), PETSC_FALSE, 1, &nsp, &nullspace); It does not work This is also the normalized constant vector. So you are saying that these two vectors give different results with MatNullSpaceTest()? Something must be wrong in the code. Can you send a minimal example of this? I will go through and debug it. Thanks, Matt Probably, I have wrong expectations, but should not it be the same? Thanks Marco Cisternino, PhD marco.cisternino at optimad.it ______________________ Optimad Engineering Srl Via Bligny 5, Torino, Italia. +3901119719782 www.optimad.it -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Wed Jan 19 06:19:06 2022 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 19 Jan 2022 07:19:06 -0500 Subject: [petsc-users] Nullspaces In-Reply-To: References: Message-ID: On Wed, Jan 19, 2022 at 4:52 AM Marco Cisternino < marco.cisternino at optimad.it> wrote: > Thank you Matthew. > > But I cannot get the point. I got the point about the test but to try to > explain my doubt I?m going to prepare another toy code. > > > > By words? > > I usually have a finite volume discretization of the Laplace operator with > homogeneous Neumann BC on an octree mesh and it reads > > Aij * xj = bi, > > being the discretization of > > Int|Vi(nabla^2 pi dV) = Int|Vi(nabla dot ui) > > (I omit constants), where Int|Vi(?dV) is a volume integral on the I cell, > pi is cell pressure, ui is the cell velocity. > > The computational domain contains 2 separated sub-domains. > > Let?s consider 4 cells into the whole domain and 2 cells for each > sub-domain. > > I would expect a null space made of 2 vectors and from your patch they > should look like [1/sqrt(2) 1/sqrt(2) 0 0] and [0 0 1/sqrt(2) 1/sqrt(2)], > i.e. norm2 = 1 for both. > > With MatNullSpaceCreate(getCommunicator(), PETSC_TRUE, 0, nullptr, > &nullspace) I?m saying that [1/sqrt(4) 1/sqrt(4) 1/sqrt(4) 1/sqrt(4)] is > the null space, which is not but it is in the null space. > > But this is not the case I sent you, not exactly. > > > > The case I sent is 1/Vi * Aij * xj = 1/Vi bi, where Vi is the volume of > the cell i. > > Let?s admit that yj is in the null space of of Aij, it should be in the > null space of 1/Vi * Aij, therefore Aij*yj = 0 and 1/Vi * Aij*yj = 0 too. > > But in the framework of the test, this is true with infinite precision. > > What happens if norm2(Aij*yj) = 10^-15 and Vi = 10^-5? Norm2(1/Vi * Aij * > yj) = 10^-10!!! Is yi still in the null space numerically? > > Let?s say yi is the constant vector over the whole domain, i.e. [1/sqrt(4) > 1/sqrt(4) 1/sqrt(4) 1/sqrt(4)]. Should this be in the null space of 1/Vi * > Aij, shouldn?t it? > > An analogous argument should be for the compatibility condition that > concerns bi. > > > > My current problem is that properly creating the null space for Aij, i.e. > [1/sqrt(2) 1/sqrt(2) 0 0] and [0 0 1/sqrt(2) 1/sqrt(2)], allows me to solve > and find xi, but multiplying by 1/Vi, I cannot get any solution using both > FGMRES+ILU and FGMRE+GAMG. > > > > The tiny problem will load Aij, Vi and bi and show the problem by testing > the proper null space and trying to solve. I will include the patch to my > PETSc version. I hope to come back to you very soon. > This sounds like a dimensionalization problem to me. It is best to choose length (and other) units that make the matrix entries about 1. It seems like you are far from this, and it is causing a loss of accuracy (your null vector has a residual of about 1e-7). Thanks, Matt > Thank you very much for your support! > > > > > > Marco Cisternino > > > > *From:* Matthew Knepley > *Sent:* marted? 18 gennaio 2022 21:25 > *To:* Marco Cisternino > *Cc:* petsc-users > *Subject:* Re: [petsc-users] Nullspaces > > > > I made a fix for this: > > > > https://gitlab.com/petsc/petsc/-/merge_requests/4729 > > > > Thanks, > > > > Matt > > > > On Tue, Jan 18, 2022 at 3:20 PM Matthew Knepley wrote: > > On Thu, Dec 16, 2021 at 11:09 AM Marco Cisternino < > marco.cisternino at optimad.it> wrote: > > Hello Matthew, > > as promised I prepared a minimal (112960 rows. 
I?m not able to produce > anything smaller than this and triggering the issue) example of the > behavior I was talking about some days ago. > > What I did is to produce matrix, right hand side and initial solution of > the linear system. > > > > As I told you before, this linear system is the discretization of the > pressure equation of a predictor-corrector method for NS equations in the > framework of finite volume method. > > This case has homogeneous Neumann boundary conditions. Computational > domain has two independent and separated sub-domains. > > I discretize the weak formulation and I divide every row of the linear > system by the volume of the relative cell. > > The underlying mesh is not uniform, therefore cells have different > volumes. > > The issue I?m going to explain does not show up if the mesh is uniform, > same volume for all the cells. > > > > I usually build the null space sub-domain by sub-domain with > > MatNullSpaceCreate(getCommunicator(), PETSC_FALSE, nConstants, constants, > &nullspace); > > Where nConstants = 2 and constants contains two normalized arrays with > constant values on degrees of freedom relative to the associated sub-domain > and zeros elsewhere. > > > > However, as a test I tried the constant over the whole domain using 2 > alternatives that should produce the same null space: > > 1. MatNullSpaceCreate(getCommunicator(), PETSC_TRUE, 0, nullptr, > &nullspace); > 2. Vec* nsp; > > VecDuplicateVecs(solution, 1, &nsp); > > VecSet(nsp[0],1.0); > > VecNormalize(nsp[0], nullptr); > > MatNullSpaceCreate(getCommunicator(), PETSC_FALSE, 1, nsp, &nullspace); > > > > Once I created the null space I test it using: > > MatNullSpaceTest(nullspace, m_A, &isNullSpaceValid); > > > > The case 1 pass the test while case 2 don?t. > > > > I have a small code for matrix loading, null spaces creation and testing. > > Unfortunately I cannot implement a small code able to produce that linear > system. > > > > As attachment you can find an archive containing the matrix, the initial > solution (used to manually build the null space) and the rhs (not used in > the test code) in binary format. > > You can also find the testing code in the same archive. > > I used petsc 3.12(gcc+openMPI) and petsc 3.15.2(intelOneAPI) same results. > > If the attachment is not delivered, I can share a link to it. > > > > Marco, please forgive me for taking so long to get to your issue. It has > been crazy. > > > > You are correct, we had a bug. it is in MatNullSpaceTest. The > normalization for the constant vector was wrong: > > > > diff --git a/src/mat/interface/matnull.c b/src/mat/interface/matnull.c > index f8ab2925988..0c4c3855be0 100644 > --- a/src/mat/interface/matnull.c > +++ b/src/mat/interface/matnull.c > @@ -429,7 +429,7 @@ PetscErrorCode MatNullSpaceTest(MatNullSpace sp,Mat > mat,PetscBool *isNull) > if (sp->has_cnst) { > ierr = VecDuplicate(l,&r);CHKERRQ(ierr); > ierr = VecGetSize(l,&N);CHKERRQ(ierr); > - sum = 1.0/N; > > + sum = 1.0/PetscSqrtReal(N); > ierr = VecSet(l,sum);CHKERRQ(ierr); > ierr = MatMult(mat,l,r);CHKERRQ(ierr); > ierr = VecNorm(r,NORM_2,&nrm);CHKERRQ(ierr); > > > > With this fix, your two cases give the same answer, namely that the > constant vector is not a null vector of your > > operator, but it is close, as your can see using -mat_null_space_test_view. > > > > Thanks, > > > > Matt > > > > Thanks for any help. 
> > > > Marco Cisternino > > > > > > Marco Cisternino, PhD > marco.cisternino at optimad.it > > ______________________ > > Optimad Engineering Srl > > Via Bligny 5, Torino, Italia. > +3901119719782 > www.optimad.it > > > > *From:* Marco Cisternino > *Sent:* marted? 7 dicembre 2021 19:36 > *To:* Matthew Knepley > *Cc:* petsc-users > *Subject:* Re: [petsc-users] Nullspaces > > > > I will, as soon as possible... > > > > Scarica Outlook per Android > ------------------------------ > > *From:* Matthew Knepley > *Sent:* Tuesday, December 7, 2021 7:25:43 PM > *To:* Marco Cisternino > *Cc:* petsc-users > *Subject:* Re: [petsc-users] Nullspaces > > > > On Tue, Dec 7, 2021 at 11:19 AM Marco Cisternino < > marco.cisternino at optimad.it> wrote: > > Good morning, > > I?m still struggling with the Poisson equation with Neumann BCs. > > I discretize the equation by finite volume method and I divide every line > of the linear system by the volume of the cell. I could avoid this > division, but I?m trying to understand. > > My mesh is not uniform, i.e. cells have different volumes (it is an octree > mesh). > > Moreover, in my computational domain there are 2 separated sub-domains. > > I build the null space and then I use MatNullSpaceTest to check it. > > > > If I do this: > > MatNullSpaceCreate(getCommunicator(), PETSC_TRUE, 0, nullptr, &nullspace); > > It works > > > > This produces the normalized constant vector. > > > > If I do this: > > Vec nsp; > > VecDuplicate(m_rhs, &nsp); > > VecSet(nsp,1.0); > > VecNormalize(nsp, nullptr); > > MatNullSpaceCreate(getCommunicator(), PETSC_FALSE, 1, &nsp, &nullspace); > > It does not work > > > > This is also the normalized constant vector. > > > > So you are saying that these two vectors give different results with > MatNullSpaceTest()? > > Something must be wrong in the code. Can you send a minimal example of > this? I will go > > through and debug it. > > > > Thanks, > > > > Matt > > > > Probably, I have wrong expectations, but should not it be the same? > > > > Thanks > > > > Marco Cisternino, PhD > marco.cisternino at optimad.it > > ______________________ > > Optimad Engineering Srl > > Via Bligny 5, Torino, Italia. > +3901119719782 > www.optimad.it > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Wed Jan 19 08:15:31 2022 From: mfadams at lbl.gov (Mark Adams) Date: Wed, 19 Jan 2022 09:15:31 -0500 Subject: [petsc-users] Nullspaces In-Reply-To: References: Message-ID: ILU is LU for this 1D problem and its singular. 
ILU might have some logic to deal with a singular system. Not sure. LU should fail. You might try -ksp_type preonly and -pc_type lu And -pc_type jacobi should not have any numerical problems. Try that. On Wed, Jan 19, 2022 at 7:19 AM Matthew Knepley wrote: > On Wed, Jan 19, 2022 at 4:52 AM Marco Cisternino < > marco.cisternino at optimad.it> wrote: > >> Thank you Matthew. >> >> But I cannot get the point. I got the point about the test but to try to >> explain my doubt I?m going to prepare another toy code. >> >> >> >> By words? >> >> I usually have a finite volume discretization of the Laplace operator >> with homogeneous Neumann BC on an octree mesh and it reads >> >> Aij * xj = bi, >> >> being the discretization of >> >> Int|Vi(nabla^2 pi dV) = Int|Vi(nabla dot ui) >> >> (I omit constants), where Int|Vi(?dV) is a volume integral on the I cell, >> pi is cell pressure, ui is the cell velocity. >> >> The computational domain contains 2 separated sub-domains. >> >> Let?s consider 4 cells into the whole domain and 2 cells for each >> sub-domain. >> >> I would expect a null space made of 2 vectors and from your patch they >> should look like [1/sqrt(2) 1/sqrt(2) 0 0] and [0 0 1/sqrt(2) 1/sqrt(2)], >> i.e. norm2 = 1 for both. >> >> With MatNullSpaceCreate(getCommunicator(), PETSC_TRUE, 0, nullptr, >> &nullspace) I?m saying that [1/sqrt(4) 1/sqrt(4) 1/sqrt(4) 1/sqrt(4)] is >> the null space, which is not but it is in the null space. >> >> But this is not the case I sent you, not exactly. >> >> >> >> The case I sent is 1/Vi * Aij * xj = 1/Vi bi, where Vi is the volume of >> the cell i. >> >> Let?s admit that yj is in the null space of of Aij, it should be in the >> null space of 1/Vi * Aij, therefore Aij*yj = 0 and 1/Vi * Aij*yj = 0 too. >> >> But in the framework of the test, this is true with infinite precision. >> >> What happens if norm2(Aij*yj) = 10^-15 and Vi = 10^-5? Norm2(1/Vi * Aij >> * yj) = 10^-10!!! Is yi still in the null space numerically? >> >> Let?s say yi is the constant vector over the whole domain, i.e. >> [1/sqrt(4) 1/sqrt(4) 1/sqrt(4) 1/sqrt(4)]. Should this be in the null space >> of 1/Vi * Aij, shouldn?t it? >> >> An analogous argument should be for the compatibility condition that >> concerns bi. >> >> >> >> My current problem is that properly creating the null space for Aij, i.e. >> [1/sqrt(2) 1/sqrt(2) 0 0] and [0 0 1/sqrt(2) 1/sqrt(2)], allows me to solve >> and find xi, but multiplying by 1/Vi, I cannot get any solution using both >> FGMRES+ILU and FGMRE+GAMG. >> >> >> >> The tiny problem will load Aij, Vi and bi and show the problem by testing >> the proper null space and trying to solve. I will include the patch to my >> PETSc version. I hope to come back to you very soon. >> > > This sounds like a dimensionalization problem to me. It is best to choose > length (and other) units that make the matrix entries about 1. It seems > like you are far from this, and it is causing a loss of accuracy (your > null vector has a residual of about 1e-7). > > Thanks, > > Matt > > >> Thank you very much for your support! >> >> >> >> >> >> Marco Cisternino >> >> >> >> *From:* Matthew Knepley >> *Sent:* marted? 
18 gennaio 2022 21:25 >> *To:* Marco Cisternino >> *Cc:* petsc-users >> *Subject:* Re: [petsc-users] Nullspaces >> >> >> >> I made a fix for this: >> >> >> >> https://gitlab.com/petsc/petsc/-/merge_requests/4729 >> >> >> >> Thanks, >> >> >> >> Matt >> >> >> >> On Tue, Jan 18, 2022 at 3:20 PM Matthew Knepley >> wrote: >> >> On Thu, Dec 16, 2021 at 11:09 AM Marco Cisternino < >> marco.cisternino at optimad.it> wrote: >> >> Hello Matthew, >> >> as promised I prepared a minimal (112960 rows. I?m not able to produce >> anything smaller than this and triggering the issue) example of the >> behavior I was talking about some days ago. >> >> What I did is to produce matrix, right hand side and initial solution of >> the linear system. >> >> >> >> As I told you before, this linear system is the discretization of the >> pressure equation of a predictor-corrector method for NS equations in the >> framework of finite volume method. >> >> This case has homogeneous Neumann boundary conditions. Computational >> domain has two independent and separated sub-domains. >> >> I discretize the weak formulation and I divide every row of the linear >> system by the volume of the relative cell. >> >> The underlying mesh is not uniform, therefore cells have different >> volumes. >> >> The issue I?m going to explain does not show up if the mesh is uniform, >> same volume for all the cells. >> >> >> >> I usually build the null space sub-domain by sub-domain with >> >> MatNullSpaceCreate(getCommunicator(), PETSC_FALSE, nConstants, constants, >> &nullspace); >> >> Where nConstants = 2 and constants contains two normalized arrays with >> constant values on degrees of freedom relative to the associated sub-domain >> and zeros elsewhere. >> >> >> >> However, as a test I tried the constant over the whole domain using 2 >> alternatives that should produce the same null space: >> >> 1. MatNullSpaceCreate(getCommunicator(), PETSC_TRUE, 0, nullptr, >> &nullspace); >> 2. Vec* nsp; >> >> VecDuplicateVecs(solution, 1, &nsp); >> >> VecSet(nsp[0],1.0); >> >> VecNormalize(nsp[0], nullptr); >> >> MatNullSpaceCreate(getCommunicator(), PETSC_FALSE, 1, nsp, &nullspace); >> >> >> >> Once I created the null space I test it using: >> >> MatNullSpaceTest(nullspace, m_A, &isNullSpaceValid); >> >> >> >> The case 1 pass the test while case 2 don?t. >> >> >> >> I have a small code for matrix loading, null spaces creation and testing. >> >> Unfortunately I cannot implement a small code able to produce that linear >> system. >> >> >> >> As attachment you can find an archive containing the matrix, the initial >> solution (used to manually build the null space) and the rhs (not used in >> the test code) in binary format. >> >> You can also find the testing code in the same archive. >> >> I used petsc 3.12(gcc+openMPI) and petsc 3.15.2(intelOneAPI) same results. >> >> If the attachment is not delivered, I can share a link to it. >> >> >> >> Marco, please forgive me for taking so long to get to your issue. It has >> been crazy. >> >> >> >> You are correct, we had a bug. it is in MatNullSpaceTest. 
The >> normalization for the constant vector was wrong: >> >> >> >> diff --git a/src/mat/interface/matnull.c b/src/mat/interface/matnull.c >> index f8ab2925988..0c4c3855be0 100644 >> --- a/src/mat/interface/matnull.c >> +++ b/src/mat/interface/matnull.c >> @@ -429,7 +429,7 @@ PetscErrorCode MatNullSpaceTest(MatNullSpace sp,Mat >> mat,PetscBool *isNull) >> if (sp->has_cnst) { >> ierr = VecDuplicate(l,&r);CHKERRQ(ierr); >> ierr = VecGetSize(l,&N);CHKERRQ(ierr); >> - sum = 1.0/N; >> >> + sum = 1.0/PetscSqrtReal(N); >> ierr = VecSet(l,sum);CHKERRQ(ierr); >> ierr = MatMult(mat,l,r);CHKERRQ(ierr); >> ierr = VecNorm(r,NORM_2,&nrm);CHKERRQ(ierr); >> >> >> >> With this fix, your two cases give the same answer, namely that the >> constant vector is not a null vector of your >> >> operator, but it is close, as your can see >> using -mat_null_space_test_view. >> >> >> >> Thanks, >> >> >> >> Matt >> >> >> >> Thanks for any help. >> >> >> >> Marco Cisternino >> >> >> >> >> >> Marco Cisternino, PhD >> marco.cisternino at optimad.it >> >> ______________________ >> >> Optimad Engineering Srl >> >> Via Bligny 5, Torino, Italia. >> +3901119719782 >> www.optimad.it >> >> >> >> *From:* Marco Cisternino >> *Sent:* marted? 7 dicembre 2021 19:36 >> *To:* Matthew Knepley >> *Cc:* petsc-users >> *Subject:* Re: [petsc-users] Nullspaces >> >> >> >> I will, as soon as possible... >> >> >> >> Scarica Outlook per Android >> ------------------------------ >> >> *From:* Matthew Knepley >> *Sent:* Tuesday, December 7, 2021 7:25:43 PM >> *To:* Marco Cisternino >> *Cc:* petsc-users >> *Subject:* Re: [petsc-users] Nullspaces >> >> >> >> On Tue, Dec 7, 2021 at 11:19 AM Marco Cisternino < >> marco.cisternino at optimad.it> wrote: >> >> Good morning, >> >> I?m still struggling with the Poisson equation with Neumann BCs. >> >> I discretize the equation by finite volume method and I divide every line >> of the linear system by the volume of the cell. I could avoid this >> division, but I?m trying to understand. >> >> My mesh is not uniform, i.e. cells have different volumes (it is an >> octree mesh). >> >> Moreover, in my computational domain there are 2 separated sub-domains. >> >> I build the null space and then I use MatNullSpaceTest to check it. >> >> >> >> If I do this: >> >> MatNullSpaceCreate(getCommunicator(), PETSC_TRUE, 0, nullptr, &nullspace); >> >> It works >> >> >> >> This produces the normalized constant vector. >> >> >> >> If I do this: >> >> Vec nsp; >> >> VecDuplicate(m_rhs, &nsp); >> >> VecSet(nsp,1.0); >> >> VecNormalize(nsp, nullptr); >> >> MatNullSpaceCreate(getCommunicator(), PETSC_FALSE, 1, &nsp, &nullspace); >> >> It does not work >> >> >> >> This is also the normalized constant vector. >> >> >> >> So you are saying that these two vectors give different results with >> MatNullSpaceTest()? >> >> Something must be wrong in the code. Can you send a minimal example of >> this? I will go >> >> through and debug it. >> >> >> >> Thanks, >> >> >> >> Matt >> >> >> >> Probably, I have wrong expectations, but should not it be the same? >> >> >> >> Thanks >> >> >> >> Marco Cisternino, PhD >> marco.cisternino at optimad.it >> >> ______________________ >> >> Optimad Engineering Srl >> >> Via Bligny 5, Torino, Italia. >> +3901119719782 >> www.optimad.it >> >> >> >> >> >> >> -- >> >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. 
>> -- Norbert Wiener >> >> >> >> https://www.cse.buffalo.edu/~knepley/ >> >> >> >> >> >> -- >> >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> >> >> https://www.cse.buffalo.edu/~knepley/ >> >> >> >> >> >> -- >> >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fdkong.jd at gmail.com Wed Jan 19 09:26:09 2022 From: fdkong.jd at gmail.com (Fande Kong) Date: Wed, 19 Jan 2022 08:26:09 -0700 Subject: [petsc-users] Downloaded superlu_dist could not be used. Please check install in $PREFIX In-Reply-To: <7a5749b1-2695-d7c0-beaf-44415b3e68b9@mcs.anl.gov> References: <7a5749b1-2695-d7c0-beaf-44415b3e68b9@mcs.anl.gov> Message-ID: Thanks, Sherry, and Satish, I will try your suggestion, and report back to you as soon as possible. Thanks, Fande On Tue, Jan 18, 2022 at 10:48 PM Satish Balay wrote: > Sherry, > > This is with superlu-dist-7.1.1 [not master branch] > > > Fande, > > >>>>>> > Executing: mpifort -o /tmp/petsc-UYa6A8/config.compilers/conftest > -fopenmp -fopenmp -I$PREFIX/include -fPIC -O3 -fopenmp > /tmp/petsc-UYa6A8/config.compilers/conftest.o > /tmp/petsc-UYa6A8/config.compilers/confc.o -Wl,-rpath,$PREFIX/lib > -L$PREFIX/lib -lsuperlu_dist -lpthread -Wl,-rpath,$PREFIX/lib -L$PREFIX/lib > -lparmetis -Wl,-rpath,$PREFIX/lib -L$PREFIX/lib -lmetis > -Wl,-rpath,$PREFIX/lib -L$PREFIX/lib -lflapack -Wl,-rpath,$PREFIX/lib > -L$PREFIX/lib -lfblas -lm -Wl,-rpath,$BUILD_PREFIX/lib -L$BUILD_PREFIX/lib > -lstdc++ -ldl -lmpifort -lmpi -lgfortran -lm > -Wl,-rpath,$BUILD_PREFIX/lib/gcc/x86_64-conda-linux-gnu/9.3.0 > -L$BUILD_PREFIX/lib/gcc/x86_64-conda-linux-gnu/9.3.0 > -Wl,-rpath,$BUILD_PREFIX/lib/gcc -L$BUILD_PREFIX/lib/gcc > -Wl,-rpath,$BUILD_PREFIX/x86_64-conda-linux-gnu/lib > -L$BUILD_PREFIX/x86_64-conda-linux-gnu/lib -Wl,-rpath,$BUILD_PREFIX/lib > -lgfortran -lm -lgcc_s -lquadmath -lrt -lquadmath -lstdc++ -ldl > Possible ERROR while running linker: > stderr: > $BUILD_PREFIX/bin/../lib/gcc/x86_64-conda-linux-gnu/9.3.0/../../../../x86_64-conda-linux-gnu/bin/ld: > warning: libmpicxx.so.12, needed by $PREFIX/lib/libsuperlu_dist.so, not > found (try using -rpath or -rpath-link) > <<< > > I don't really understand why this error comes up [as with shared > libraries we should be able to link with -lsuperlu_dist - without having to > link with libmpicxx.so.12 > > What do you get for: > > ldd $PREFIX/lib/libstdc++.so > > > BTW: is configure.log modified to replace realpaths with $PREFIX > $BUILD_PREFIX etc? > > Can you try additional configure option LIBS=-lmpicxx and see if that > works around this problem? > > Satish > > On Tue, 18 Jan 2022, Xiaoye S. Li wrote: > > > There was a merge error in the master branch. I fixed it today. Not sure > > whether that's causing your problem. Can you try now? > > > > Sherry > > > > On Mon, Jan 17, 2022 at 11:55 AM Fande Kong wrote: > > > > > I am trying to port PETSc-3.16.3 to the MOOSE ecosystem. 
I got an error > > > that PETSc could not build superlu_dist. The log file was attached. > > > > > > PETSc-3.15.x worked correctly in the same environment. > > > > > > Thanks, > > > Fande > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Wed Jan 19 09:37:04 2022 From: hzhang at mcs.anl.gov (Zhang, Hong) Date: Wed, 19 Jan 2022 15:37:04 +0000 Subject: [petsc-users] PETSc MUMPS interface In-Reply-To: References: <1AD9EBCE-4B48-4D54-9829-DFD5EDC68B76@dsic.upv.es> Message-ID: Varun, Good to know it works. FactorSymbolic function is still being called twice, but the 2nd call is a no-op, thus it still appears in '-log_view'. I made changes in the low level of mumps routine, not within PCSetUp() because I feel your use case is limited to mumps, not other matrix package solvers. Hong ________________________________ From: Varun Hiremath Sent: Wednesday, January 19, 2022 2:44 AM To: Zhang, Hong Cc: Peder J?rgensgaard Olesen via petsc-users Subject: Re: [petsc-users] PETSc MUMPS interface Hi Hong, Thanks, I tested your branch and I think it is working fine. I don't see any increase in runtime, however with -log_view I see that the MatLUFactorSymbolic function is still being called twice, so is this expected? Is the second call a no-op? $ ./ex52.o -use_mumps_lu -print_mumps_memory -log_view | grep MatLUFactorSym MatLUFactorSym 2 1.0 4.4411e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 Thanks, Varun On Mon, Jan 17, 2022 at 7:49 PM Zhang, Hong > wrote: Varun, I created a branch hzhang/feature-mumps-mem-estimate, see https://gitlab.com/petsc/petsc/-/merge_requests/4727 You may give it a try and let me know if this is what you want. src/ksp/ksp/tutorials/ex52.c is an example. Hong ________________________________ From: Varun Hiremath > Sent: Monday, January 17, 2022 12:41 PM To: Zhang, Hong > Cc: Jose E. Roman >; Peder J?rgensgaard Olesen via petsc-users > Subject: Re: [petsc-users] PETSc MUMPS interface Hi Hong, Thanks for looking into this. Here is the workflow that I might use: MatLUFactorSymbolic(F,A,perm,iperm,&info); // get memory estimates from MUMPS e.g. INFO(3), INFOG(16), INFOG(17) // find available memory on the system e.g. RAM size if (estimated_memory > available_memory) { // inform and stop; or // switch MUMPS to out-of-core factorization ICNTL(22) = 1; } else { // set appropriate settings for in-core factorization } // Now we call the solve and inside if MatLUFactorSymbolic is already called then it should be skipped EPSSolve(eps); Thanks, Varun On Mon, Jan 17, 2022 at 9:18 AM Zhang, Hong > wrote: Varun, I am trying to find a way to enable you to switch options after MatLUFactorSymbolic(). A hack is modifying the flag 'mumps->matstruc' inside MatLUFactorSymbolic_AIJMUMPS() and MatFactorNumeric_MUMPS(). My understanding of what you want is: // collect mumps memory info ... MatLUFactorSymbolic(F,A,perm,iperm,&info); printMumpsMemoryInfo(F); //--------- if (memory is available) { EPSSolve(eps); --> skip calling of MatLUFactorSymbolic() } else { //out-of-core (OOC) option in MUMPS } Am I correct? I'll let you know once I work out a solution. Hong ________________________________ From: Varun Hiremath > Sent: Sunday, January 16, 2022 10:10 PM To: Zhang, Hong > Cc: Jose E. Roman >; Peder J?rgensgaard Olesen via petsc-users > Subject: Re: [petsc-users] PETSc MUMPS interface Hi Jose, Hong, Thanks for the explanation. I have verified using -log_view that MatLUFactorSymbolic is indeed getting called twice. 
Hong, we use MUMPS solver for other things, and we typically run the symbolic analysis first and get memory estimates to ensure that we have enough memory available to run the case. If the available memory is not enough, we can stop or switch to the out-of-core (OOC) option in MUMPS. We wanted to do the same when using MUMPS via SLEPc/PETSc. Please let me know if there are other ways of getting these memory stats and switching options during runtime with PETSc. Appreciate your help! Thanks, Varun On Sun, Jan 16, 2022 at 4:01 PM Zhang, Hong > wrote: Varun, I believe Jose is correct. You may verify it by running your code with option '-log_view', then check the number of calls to MatLUFactorSym. I guess I can add a flag in PCSetUp() to check if user has already called MatLUFactorSymbolic() and wants to skip it. Normally, users simply allocate sufficient memory in the symbolic factorization. Why do you want to check it? Hong ________________________________ From: Jose E. Roman > Sent: Sunday, January 16, 2022 5:11 AM To: Varun Hiremath > Cc: Zhang, Hong >; Peder J?rgensgaard Olesen via petsc-users > Subject: Re: [petsc-users] PETSc MUMPS interface Hong may give a better answer, but if you look at PCSetUp_LU() https://petsc.org/main/src/ksp/pc/impls/factor/lu/lu.c.html#PCSetUp_LU you will see that MatLUFactorSymbolic() is called unconditionally during the first PCSetUp(). Currently there is no way to check if the user has already called MatLUFactorSymbolic(). Jose > El 16 ene 2022, a las 10:40, Varun Hiremath > escribi?: > > Hi Hong, > > Thank you, this is very helpful! > > Using this method I am able to get the memory estimates before the actual solve, however, I think my code may be causing the symbolic factorization to be run twice. Attached is my code where I am using SLEPc to compute eigenvalues, and I use MUMPS for factorization. I have commented above the code that computes the memory estimates, could you please check and tell me if this would cause the symbolic factor to be computed twice (a second time inside EPSSolve?), as I am seeing a slight increase in the overall computation time? > > Regards, > Varun > > On Wed, Jan 12, 2022 at 7:58 AM Zhang, Hong > wrote: > PCFactorSetMatSolverType(pc,MATSOLVERMUMPS); > PCFactorSetUpMatSolverType(pc); > PCFactorGetMatrix(pc,&F); > > MatLUFactorSymbolic(F,A,...) > You must provide row and column permutations etc, petsc/src/mat/tests/ex125.c may give you a clue on how to get these inputs. > > Hong > > > From: petsc-users > on behalf of Junchao Zhang > > Sent: Wednesday, January 12, 2022 9:03 AM > To: Varun Hiremath > > Cc: Peder J?rgensgaard Olesen via petsc-users > > Subject: Re: [petsc-users] PETSc MUMPS interface > > Calling PCSetUp() before KSPSetUp()? > > --Junchao Zhang > > > On Wed, Jan 12, 2022 at 3:00 AM Varun Hiremath > wrote: > Hi All, > > I want to collect MUMPS memory estimates based on the initial symbolic factorization analysis before the actual numerical factorization starts to check if the estimated memory requirements fit the available memory. > > I am following the steps from https://petsc.org/main/src/ksp/ksp/tutorials/ex52.c.html > > PCFactorSetMatSolverType(pc,MATSOLVERMUMPS); > PCFactorSetUpMatSolverType(pc); > PCFactorGetMatrix(pc,&F); > > KSPSetUp(ksp); > MatMumpsGetInfog(F,...) > > But it appears KSPSetUp calls both symbolic and numerical factorization. So is there some other way to get these statistics before the actual factorization starts? 
> > Thanks, > Varun > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fdkong.jd at gmail.com Wed Jan 19 10:04:44 2022 From: fdkong.jd at gmail.com (Fande Kong) Date: Wed, 19 Jan 2022 09:04:44 -0700 Subject: [petsc-users] Cannot eagerly initialize cuda, as doing so results in cuda error 35 (cudaErrorInsufficientDriver) : CUDA driver version is insufficient for CUDA runtime version Message-ID: Hi All, Upgraded PETSc from 3.16.1 to the current main branch. I suddenly got the following error message: 2d_diffusion]$ ../../../moose_test-dbg -i 2d_diffusion_test.i -use_gpu_aware_mpi 0 -gpu_mat_type aijcusparse -gpu_vec_type cuda -log_view [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Missing or incorrect user input [0]PETSC ERROR: Cannot eagerly initialize cuda, as doing so results in cuda error 35 (cudaErrorInsufficientDriver) : CUDA driver version is insufficient for CUDA runtime version [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. [0]PETSC ERROR: Petsc Development GIT revision: v3.16.3-618-gad32f7e GIT Date: 2022-01-18 16:04:31 +0000 [0]PETSC ERROR: ../../../moose_test-dbg on a arch-linux-c-opt named r8i3n0 by kongf Wed Jan 19 08:30:13 2022 [0]PETSC ERROR: Configure options --with-debugging=no --with-shared-libraries=1 --download-fblaslapack=1 --download-metis=1 --download-ptscotch=1 --download-parmetis=1 --download-mumps=1 --download-strumpack=1 --download-scalapack=1 --download-slepc=1 --with-mpi=1 --with-cxx-dialect=C++14 --with-fortran-bindings=0 --with-sowing=0 --with-64-bit-indices --with-make-np=24 --with-cuda --with-cudac=nvcc --with-cuda-arch=70 --download-kokkos=1 [0]PETSC ERROR: #1 initialize() at /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:298 [0]PETSC ERROR: #2 PetscDeviceInitializeTypeFromOptions_Private() at /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:299 [0]PETSC ERROR: #3 PetscDeviceInitializeFromOptions_Internal() at /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:425 [0]PETSC ERROR: #4 PetscInitialize_Common() at /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/pinit.c:963 [0]PETSC ERROR: #5 PetscInitialize() at /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/pinit.c:1238 [0]PETSC ERROR: #6 SlepcInitialize() at /home/kongf/workhome/sawtooth/moosegpu/petsc/arch-linux-c-opt/externalpackages/git.slepc/src/sys/slepcinit.c:275 [0]PETSC ERROR: #7 LibMeshInit() at ../src/base/libmesh.C:522 [r8i3n0:mpi_rank_0][MPIDI_CH3_Abort] application called MPI_Abort(MPI_COMM_WORLD, 95) - process 0: No such file or directory (2) Thanks, Fande -------------- next part -------------- An HTML attachment was scrubbed... URL: From jacob.fai at gmail.com Wed Jan 19 11:32:29 2022 From: jacob.fai at gmail.com (Jacob Faibussowitsch) Date: Wed, 19 Jan 2022 11:32:29 -0600 Subject: [petsc-users] Cannot eagerly initialize cuda, as doing so results in cuda error 35 (cudaErrorInsufficientDriver) : CUDA driver version is insufficient for CUDA runtime version In-Reply-To: References: Message-ID: Hi Fande, What machine are you running this on? Please attach configure.log so I can troubleshoot this. Best regards, Jacob Faibussowitsch (Jacob Fai - booss - oh - vitch) > On Jan 19, 2022, at 10:04, Fande Kong wrote: > > Hi All, > > Upgraded PETSc from 3.16.1 to the current main branch. 
I suddenly got the following error message: > > 2d_diffusion]$ ../../../moose_test-dbg -i 2d_diffusion_test.i -use_gpu_aware_mpi 0 -gpu_mat_type aijcusparse -gpu_vec_type cuda -log_view > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Missing or incorrect user input > [0]PETSC ERROR: Cannot eagerly initialize cuda, as doing so results in cuda error 35 (cudaErrorInsufficientDriver) : CUDA driver version is insufficient for CUDA runtime version > [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. > [0]PETSC ERROR: Petsc Development GIT revision: v3.16.3-618-gad32f7e GIT Date: 2022-01-18 16:04:31 +0000 > [0]PETSC ERROR: ../../../moose_test-dbg on a arch-linux-c-opt named r8i3n0 by kongf Wed Jan 19 08:30:13 2022 > [0]PETSC ERROR: Configure options --with-debugging=no --with-shared-libraries=1 --download-fblaslapack=1 --download-metis=1 --download-ptscotch=1 --download-parmetis=1 --download-mumps=1 --download-strumpack=1 --download-scalapack=1 --download-slepc=1 --with-mpi=1 --with-cxx-dialect=C++14 --with-fortran-bindings=0 --with-sowing=0 --with-64-bit-indices --with-make-np=24 --with-cuda --with-cudac=nvcc --with-cuda-arch=70 --download-kokkos=1 > [0]PETSC ERROR: #1 initialize() at /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:298 > [0]PETSC ERROR: #2 PetscDeviceInitializeTypeFromOptions_Private() at /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:299 > [0]PETSC ERROR: #3 PetscDeviceInitializeFromOptions_Internal() at /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:425 > [0]PETSC ERROR: #4 PetscInitialize_Common() at /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/pinit.c:963 > [0]PETSC ERROR: #5 PetscInitialize() at /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/pinit.c:1238 > [0]PETSC ERROR: #6 SlepcInitialize() at /home/kongf/workhome/sawtooth/moosegpu/petsc/arch-linux-c-opt/externalpackages/git.slepc/src/sys/slepcinit.c:275 > [0]PETSC ERROR: #7 LibMeshInit() at ../src/base/libmesh.C:522 > [r8i3n0:mpi_rank_0][MPIDI_CH3_Abort] application called MPI_Abort(MPI_COMM_WORLD, 95) - process 0: No such file or directory (2) > > Thanks, > > Fande -------------- next part -------------- An HTML attachment was scrubbed... URL: From marco.cisternino at optimad.it Wed Jan 19 11:54:01 2022 From: marco.cisternino at optimad.it (Marco Cisternino) Date: Wed, 19 Jan 2022 17:54:01 +0000 Subject: [petsc-users] Nullspaces In-Reply-To: References: Message-ID: Thank you, Matthew. I?m going to pay attention to our non-dimensionalization, avoiding division by cell volume helps a lot. Sorry, Mark, I cannot get your point: which 1D problem are you referring to? The case I?m talking about is based on a 3D octree mesh. Thank you all for your support. Bests, Marco Cisternino From: Mark Adams Sent: mercoled? 19 gennaio 2022 15:16 To: Matthew Knepley Cc: Marco Cisternino ; petsc-users Subject: Re: [petsc-users] Nullspaces ILU is LU for this 1D problem and its singular. ILU might have some logic to deal with a singular system. Not sure. LU should fail. You might try -ksp_type preonly and -pc_type lu And -pc_type jacobi should not have any numerical problems. Try that. On Wed, Jan 19, 2022 at 7:19 AM Matthew Knepley > wrote: On Wed, Jan 19, 2022 at 4:52 AM Marco Cisternino > wrote: Thank you Matthew. But I cannot get the point. 
I got the point about the test but to try to explain my doubt I?m going to prepare another toy code. By words? I usually have a finite volume discretization of the Laplace operator with homogeneous Neumann BC on an octree mesh and it reads Aij * xj = bi, being the discretization of Int|Vi(nabla^2 pi dV) = Int|Vi(nabla dot ui) (I omit constants), where Int|Vi(?dV) is a volume integral on the I cell, pi is cell pressure, ui is the cell velocity. The computational domain contains 2 separated sub-domains. Let?s consider 4 cells into the whole domain and 2 cells for each sub-domain. I would expect a null space made of 2 vectors and from your patch they should look like [1/sqrt(2) 1/sqrt(2) 0 0] and [0 0 1/sqrt(2) 1/sqrt(2)], i.e. norm2 = 1 for both. With MatNullSpaceCreate(getCommunicator(), PETSC_TRUE, 0, nullptr, &nullspace) I?m saying that [1/sqrt(4) 1/sqrt(4) 1/sqrt(4) 1/sqrt(4)] is the null space, which is not but it is in the null space. But this is not the case I sent you, not exactly. The case I sent is 1/Vi * Aij * xj = 1/Vi bi, where Vi is the volume of the cell i. Let?s admit that yj is in the null space of of Aij, it should be in the null space of 1/Vi * Aij, therefore Aij*yj = 0 and 1/Vi * Aij*yj = 0 too. But in the framework of the test, this is true with infinite precision. What happens if norm2(Aij*yj) = 10^-15 and Vi = 10^-5? Norm2(1/Vi * Aij * yj) = 10^-10!!! Is yi still in the null space numerically? Let?s say yi is the constant vector over the whole domain, i.e. [1/sqrt(4) 1/sqrt(4) 1/sqrt(4) 1/sqrt(4)]. Should this be in the null space of 1/Vi * Aij, shouldn?t it? An analogous argument should be for the compatibility condition that concerns bi. My current problem is that properly creating the null space for Aij, i.e. [1/sqrt(2) 1/sqrt(2) 0 0] and [0 0 1/sqrt(2) 1/sqrt(2)], allows me to solve and find xi, but multiplying by 1/Vi, I cannot get any solution using both FGMRES+ILU and FGMRE+GAMG. The tiny problem will load Aij, Vi and bi and show the problem by testing the proper null space and trying to solve. I will include the patch to my PETSc version. I hope to come back to you very soon. This sounds like a dimensionalization problem to me. It is best to choose length (and other) units that make the matrix entries about 1. It seems like you are far from this, and it is causing a loss of accuracy (your null vector has a residual of about 1e-7). Thanks, Matt Thank you very much for your support! Marco Cisternino From: Matthew Knepley > Sent: marted? 18 gennaio 2022 21:25 To: Marco Cisternino > Cc: petsc-users > Subject: Re: [petsc-users] Nullspaces I made a fix for this: https://gitlab.com/petsc/petsc/-/merge_requests/4729 Thanks, Matt On Tue, Jan 18, 2022 at 3:20 PM Matthew Knepley > wrote: On Thu, Dec 16, 2021 at 11:09 AM Marco Cisternino > wrote: Hello Matthew, as promised I prepared a minimal (112960 rows. I?m not able to produce anything smaller than this and triggering the issue) example of the behavior I was talking about some days ago. What I did is to produce matrix, right hand side and initial solution of the linear system. As I told you before, this linear system is the discretization of the pressure equation of a predictor-corrector method for NS equations in the framework of finite volume method. This case has homogeneous Neumann boundary conditions. Computational domain has two independent and separated sub-domains. I discretize the weak formulation and I divide every row of the linear system by the volume of the relative cell. 
The underlying mesh is not uniform, therefore cells have different volumes. The issue I?m going to explain does not show up if the mesh is uniform, same volume for all the cells. I usually build the null space sub-domain by sub-domain with MatNullSpaceCreate(getCommunicator(), PETSC_FALSE, nConstants, constants, &nullspace); Where nConstants = 2 and constants contains two normalized arrays with constant values on degrees of freedom relative to the associated sub-domain and zeros elsewhere. However, as a test I tried the constant over the whole domain using 2 alternatives that should produce the same null space: 1. MatNullSpaceCreate(getCommunicator(), PETSC_TRUE, 0, nullptr, &nullspace); 2. Vec* nsp; VecDuplicateVecs(solution, 1, &nsp); VecSet(nsp[0],1.0); VecNormalize(nsp[0], nullptr); MatNullSpaceCreate(getCommunicator(), PETSC_FALSE, 1, nsp, &nullspace); Once I created the null space I test it using: MatNullSpaceTest(nullspace, m_A, &isNullSpaceValid); The case 1 pass the test while case 2 don?t. I have a small code for matrix loading, null spaces creation and testing. Unfortunately I cannot implement a small code able to produce that linear system. As attachment you can find an archive containing the matrix, the initial solution (used to manually build the null space) and the rhs (not used in the test code) in binary format. You can also find the testing code in the same archive. I used petsc 3.12(gcc+openMPI) and petsc 3.15.2(intelOneAPI) same results. If the attachment is not delivered, I can share a link to it. Marco, please forgive me for taking so long to get to your issue. It has been crazy. You are correct, we had a bug. it is in MatNullSpaceTest. The normalization for the constant vector was wrong: diff --git a/src/mat/interface/matnull.c b/src/mat/interface/matnull.c index f8ab2925988..0c4c3855be0 100644 --- a/src/mat/interface/matnull.c +++ b/src/mat/interface/matnull.c @@ -429,7 +429,7 @@ PetscErrorCode MatNullSpaceTest(MatNullSpace sp,Mat mat,PetscBool *isNull) if (sp->has_cnst) { ierr = VecDuplicate(l,&r);CHKERRQ(ierr); ierr = VecGetSize(l,&N);CHKERRQ(ierr); - sum = 1.0/N; + sum = 1.0/PetscSqrtReal(N); ierr = VecSet(l,sum);CHKERRQ(ierr); ierr = MatMult(mat,l,r);CHKERRQ(ierr); ierr = VecNorm(r,NORM_2,&nrm);CHKERRQ(ierr); With this fix, your two cases give the same answer, namely that the constant vector is not a null vector of your operator, but it is close, as your can see using -mat_null_space_test_view. Thanks, Matt Thanks for any help. Marco Cisternino Marco Cisternino, PhD marco.cisternino at optimad.it ______________________ Optimad Engineering Srl Via Bligny 5, Torino, Italia. +3901119719782 www.optimad.it From: Marco Cisternino > Sent: marted? 7 dicembre 2021 19:36 To: Matthew Knepley > Cc: petsc-users > Subject: Re: [petsc-users] Nullspaces I will, as soon as possible... Scarica Outlook per Android ________________________________ From: Matthew Knepley > Sent: Tuesday, December 7, 2021 7:25:43 PM To: Marco Cisternino > Cc: petsc-users > Subject: Re: [petsc-users] Nullspaces On Tue, Dec 7, 2021 at 11:19 AM Marco Cisternino > wrote: Good morning, I?m still struggling with the Poisson equation with Neumann BCs. I discretize the equation by finite volume method and I divide every line of the linear system by the volume of the cell. I could avoid this division, but I?m trying to understand. My mesh is not uniform, i.e. cells have different volumes (it is an octree mesh). Moreover, in my computational domain there are 2 separated sub-domains. 
I build the null space and then I use MatNullSpaceTest to check it. If I do this: MatNullSpaceCreate(getCommunicator(), PETSC_TRUE, 0, nullptr, &nullspace); It works This produces the normalized constant vector. If I do this: Vec nsp; VecDuplicate(m_rhs, &nsp); VecSet(nsp,1.0); VecNormalize(nsp, nullptr); MatNullSpaceCreate(getCommunicator(), PETSC_FALSE, 1, &nsp, &nullspace); It does not work This is also the normalized constant vector. So you are saying that these two vectors give different results with MatNullSpaceTest()? Something must be wrong in the code. Can you send a minimal example of this? I will go through and debug it. Thanks, Matt Probably, I have wrong expectations, but should not it be the same? Thanks Marco Cisternino, PhD marco.cisternino at optimad.it ______________________ Optimad Engineering Srl Via Bligny 5, Torino, Italia. +3901119719782 www.optimad.it -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Wed Jan 19 12:00:59 2022 From: mfadams at lbl.gov (Mark Adams) Date: Wed, 19 Jan 2022 13:00:59 -0500 Subject: [petsc-users] Nullspaces In-Reply-To: References: Message-ID: On Wed, Jan 19, 2022 at 12:54 PM Marco Cisternino < marco.cisternino at optimad.it> wrote: > Thank you, Matthew. > > I?m going to pay attention to our non-dimensionalization, avoiding > division by cell volume helps a lot. > > > > Sorry, Mark, I cannot get your point: which 1D problem are you referring > to? The case I?m talking about is based on a 3D octree mesh. > > > Woops, getting my threads mixed up. -------------- next part -------------- An HTML attachment was scrubbed... URL: From fdkong.jd at gmail.com Wed Jan 19 12:07:20 2022 From: fdkong.jd at gmail.com (Fande Kong) Date: Wed, 19 Jan 2022 11:07:20 -0700 Subject: [petsc-users] Cannot eagerly initialize cuda, as doing so results in cuda error 35 (cudaErrorInsufficientDriver) : CUDA driver version is insufficient for CUDA runtime version In-Reply-To: References: Message-ID: Thanks, Jacob, and Junchao The log was attached. I am using Sawtooth at INL https://hpc.inl.gov/SitePages/Home.aspx Thanks, Fande On Wed, Jan 19, 2022 at 10:32 AM Jacob Faibussowitsch wrote: > Hi Fande, > > What machine are you running this on? Please attach configure.log so I can > troubleshoot this. > > Best regards, > > Jacob Faibussowitsch > (Jacob Fai - booss - oh - vitch) > > On Jan 19, 2022, at 10:04, Fande Kong wrote: > > Hi All, > > Upgraded PETSc from 3.16.1 to the current main branch. 
I suddenly got the > following error message: > > 2d_diffusion]$ ../../../moose_test-dbg -i 2d_diffusion_test.i > -use_gpu_aware_mpi 0 -gpu_mat_type aijcusparse -gpu_vec_type cuda > -log_view > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: Missing or incorrect user input > [0]PETSC ERROR: Cannot eagerly initialize cuda, as doing so results in > cuda error 35 (cudaErrorInsufficientDriver) : CUDA driver version is > insufficient for CUDA runtime version > [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. > [0]PETSC ERROR: Petsc Development GIT revision: v3.16.3-618-gad32f7e GIT > Date: 2022-01-18 16:04:31 +0000 > [0]PETSC ERROR: ../../../moose_test-dbg on a arch-linux-c-opt named r8i3n0 > by kongf Wed Jan 19 08:30:13 2022 > [0]PETSC ERROR: Configure options --with-debugging=no > --with-shared-libraries=1 --download-fblaslapack=1 --download-metis=1 > --download-ptscotch=1 --download-parmetis=1 --download-mumps=1 > --download-strumpack=1 --download-scalapack=1 --download-slepc=1 > --with-mpi=1 --with-cxx-dialect=C++14 --with-fortran-bindings=0 > --with-sowing=0 --with-64-bit-indices --with-make-np=24 --with-cuda > --with-cudac=nvcc --with-cuda-arch=70 --download-kokkos=1 > [0]PETSC ERROR: #1 initialize() at > /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:298 > [0]PETSC ERROR: #2 PetscDeviceInitializeTypeFromOptions_Private() at > /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:299 > [0]PETSC ERROR: #3 PetscDeviceInitializeFromOptions_Internal() at > /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:425 > [0]PETSC ERROR: #4 PetscInitialize_Common() at > /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/pinit.c:963 > [0]PETSC ERROR: #5 PetscInitialize() at > /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/pinit.c:1238 > [0]PETSC ERROR: #6 SlepcInitialize() at > /home/kongf/workhome/sawtooth/moosegpu/petsc/arch-linux-c-opt/externalpackages/git.slepc/src/sys/slepcinit.c:275 > [0]PETSC ERROR: #7 LibMeshInit() at ../src/base/libmesh.C:522 > [r8i3n0:mpi_rank_0][MPIDI_CH3_Abort] application called > MPI_Abort(MPI_COMM_WORLD, 95) - process 0: No such file or directory (2) > > Thanks, > > Fande > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: application/octet-stream Size: 1337432 bytes Desc: not available URL: From jacob.fai at gmail.com Wed Jan 19 12:39:49 2022 From: jacob.fai at gmail.com (Jacob Faibussowitsch) Date: Wed, 19 Jan 2022 12:39:49 -0600 Subject: [petsc-users] Cannot eagerly initialize cuda, as doing so results in cuda error 35 (cudaErrorInsufficientDriver) : CUDA driver version is insufficient for CUDA runtime version In-Reply-To: References: Message-ID: <233D0F20-AD95-481B-B862-353D5789C556@gmail.com> Are you running on login nodes or compute nodes (I can?t seem to tell from the configure.log)? If running from login nodes, do they support running with GPU?s? Some clusters will install stub versions of cuda runtime on login nodes (such that configuration can find them), but that won?t actually work in practice. If this is the case then CUDA will fail to initialize with this exact error. IIRC It wasn?t until CUDA 11.1 that they created a specific error code (cudaErrorStubLibrary) for it. 
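A quick way to tell whether a given node can initialize CUDA at all (for example to spot a stub runtime or a missing driver on a login node) is a tiny standalone check along these lines; this is only an illustrative sketch, assuming nvcc and the CUDA runtime headers are available on the machine:

/* cuda_check.cu -- the first runtime call fails immediately if only a stub
   runtime or no driver is present (cudaErrorInsufficientDriver, or the
   dedicated stub-library error code on newer CUDA versions). */
#include <cuda_runtime.h>
#include <stdio.h>

int main(void)
{
  int         ndev = 0;
  cudaError_t err  = cudaGetDeviceCount(&ndev);
  if (err != cudaSuccess) {
    printf("CUDA not usable on this node: %s (%s)\n",
           cudaGetErrorName(err), cudaGetErrorString(err));
    return 1;
  }
  printf("found %d CUDA device(s)\n", ndev);
  return 0;
}

Running it once on a login node and once inside a compute-node job makes it easy to separate a cluster-setup issue from a PETSc one.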
Best regards, Jacob Faibussowitsch (Jacob Fai - booss - oh - vitch) > On Jan 19, 2022, at 12:07, Fande Kong wrote: > > Thanks, Jacob, and Junchao > > The log was attached. I am using Sawtooth at INL https://hpc.inl.gov/SitePages/Home.aspx > > > Thanks, > > Fande > > On Wed, Jan 19, 2022 at 10:32 AM Jacob Faibussowitsch > wrote: > Hi Fande, > > What machine are you running this on? Please attach configure.log so I can troubleshoot this. > > Best regards, > > Jacob Faibussowitsch > (Jacob Fai - booss - oh - vitch) > >> On Jan 19, 2022, at 10:04, Fande Kong > wrote: >> >> Hi All, >> >> Upgraded PETSc from 3.16.1 to the current main branch. I suddenly got the following error message: >> >> 2d_diffusion]$ ../../../moose_test-dbg -i 2d_diffusion_test.i -use_gpu_aware_mpi 0 -gpu_mat_type aijcusparse -gpu_vec_type cuda -log_view >> [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [0]PETSC ERROR: Missing or incorrect user input >> [0]PETSC ERROR: Cannot eagerly initialize cuda, as doing so results in cuda error 35 (cudaErrorInsufficientDriver) : CUDA driver version is insufficient for CUDA runtime version >> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. >> [0]PETSC ERROR: Petsc Development GIT revision: v3.16.3-618-gad32f7e GIT Date: 2022-01-18 16:04:31 +0000 >> [0]PETSC ERROR: ../../../moose_test-dbg on a arch-linux-c-opt named r8i3n0 by kongf Wed Jan 19 08:30:13 2022 >> [0]PETSC ERROR: Configure options --with-debugging=no --with-shared-libraries=1 --download-fblaslapack=1 --download-metis=1 --download-ptscotch=1 --download-parmetis=1 --download-mumps=1 --download-strumpack=1 --download-scalapack=1 --download-slepc=1 --with-mpi=1 --with-cxx-dialect=C++14 --with-fortran-bindings=0 --with-sowing=0 --with-64-bit-indices --with-make-np=24 --with-cuda --with-cudac=nvcc --with-cuda-arch=70 --download-kokkos=1 >> [0]PETSC ERROR: #1 initialize() at /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:298 >> [0]PETSC ERROR: #2 PetscDeviceInitializeTypeFromOptions_Private() at /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:299 >> [0]PETSC ERROR: #3 PetscDeviceInitializeFromOptions_Internal() at /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:425 >> [0]PETSC ERROR: #4 PetscInitialize_Common() at /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/pinit.c:963 >> [0]PETSC ERROR: #5 PetscInitialize() at /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/pinit.c:1238 >> [0]PETSC ERROR: #6 SlepcInitialize() at /home/kongf/workhome/sawtooth/moosegpu/petsc/arch-linux-c-opt/externalpackages/git.slepc/src/sys/slepcinit.c:275 >> [0]PETSC ERROR: #7 LibMeshInit() at ../src/base/libmesh.C:522 >> [r8i3n0:mpi_rank_0][MPIDI_CH3_Abort] application called MPI_Abort(MPI_COMM_WORLD, 95) - process 0: No such file or directory (2) >> >> Thanks, >> >> Fande > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From fdkong.jd at gmail.com Wed Jan 19 13:18:09 2022 From: fdkong.jd at gmail.com (Fande Kong) Date: Wed, 19 Jan 2022 12:18:09 -0700 Subject: [petsc-users] Cannot eagerly initialize cuda, as doing so results in cuda error 35 (cudaErrorInsufficientDriver) : CUDA driver version is insufficient for CUDA runtime version In-Reply-To: <233D0F20-AD95-481B-B862-353D5789C556@gmail.com> References: <233D0F20-AD95-481B-B862-353D5789C556@gmail.com> Message-ID: On Wed, Jan 19, 2022 at 11:39 AM Jacob Faibussowitsch wrote: > Are you running on login nodes or compute nodes (I can?t seem to tell from > the configure.log)? > I was compiling codes on login nodes, and running codes on compute nodes. Login nodes do not have GPUs, but compute nodes do have GPUs. Just to be clear, the same thing (code, machine) with PETSc-3.16.1 worked perfectly. I have this trouble with PETSc-main. I might do "git bisect" when I have time Thanks, Fande If running from login nodes, do they support running with GPU?s? Some > clusters will install stub versions of cuda runtime on login nodes (such > that configuration can find them), but that won?t actually work in > practice. > > If this is the case then CUDA will fail to initialize with this exact > error. IIRC It wasn?t until CUDA 11.1 that they created a specific error > code (cudaErrorStubLibrary) for it. > > Best regards, > > Jacob Faibussowitsch > (Jacob Fai - booss - oh - vitch) > > On Jan 19, 2022, at 12:07, Fande Kong wrote: > > Thanks, Jacob, and Junchao > > The log was attached. I am using Sawtooth at INL > https://hpc.inl.gov/SitePages/Home.aspx > > > Thanks, > > Fande > > On Wed, Jan 19, 2022 at 10:32 AM Jacob Faibussowitsch > wrote: > >> Hi Fande, >> >> What machine are you running this on? Please attach configure.log so I >> can troubleshoot this. >> >> Best regards, >> >> Jacob Faibussowitsch >> (Jacob Fai - booss - oh - vitch) >> >> On Jan 19, 2022, at 10:04, Fande Kong wrote: >> >> Hi All, >> >> Upgraded PETSc from 3.16.1 to the current main branch. I suddenly got the >> following error message: >> >> 2d_diffusion]$ ../../../moose_test-dbg -i 2d_diffusion_test.i >> -use_gpu_aware_mpi 0 -gpu_mat_type aijcusparse -gpu_vec_type cuda >> -log_view >> [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> [0]PETSC ERROR: Missing or incorrect user input >> [0]PETSC ERROR: Cannot eagerly initialize cuda, as doing so results in >> cuda error 35 (cudaErrorInsufficientDriver) : CUDA driver version is >> insufficient for CUDA runtime version >> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. 
>> [0]PETSC ERROR: Petsc Development GIT revision: v3.16.3-618-gad32f7e GIT >> Date: 2022-01-18 16:04:31 +0000 >> [0]PETSC ERROR: ../../../moose_test-dbg on a arch-linux-c-opt named >> r8i3n0 by kongf Wed Jan 19 08:30:13 2022 >> [0]PETSC ERROR: Configure options --with-debugging=no >> --with-shared-libraries=1 --download-fblaslapack=1 --download-metis=1 >> --download-ptscotch=1 --download-parmetis=1 --download-mumps=1 >> --download-strumpack=1 --download-scalapack=1 --download-slepc=1 >> --with-mpi=1 --with-cxx-dialect=C++14 --with-fortran-bindings=0 >> --with-sowing=0 --with-64-bit-indices --with-make-np=24 --with-cuda >> --with-cudac=nvcc --with-cuda-arch=70 --download-kokkos=1 >> [0]PETSC ERROR: #1 initialize() at >> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:298 >> [0]PETSC ERROR: #2 PetscDeviceInitializeTypeFromOptions_Private() at >> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:299 >> [0]PETSC ERROR: #3 PetscDeviceInitializeFromOptions_Internal() at >> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:425 >> [0]PETSC ERROR: #4 PetscInitialize_Common() at >> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/pinit.c:963 >> [0]PETSC ERROR: #5 PetscInitialize() at >> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/pinit.c:1238 >> [0]PETSC ERROR: #6 SlepcInitialize() at >> /home/kongf/workhome/sawtooth/moosegpu/petsc/arch-linux-c-opt/externalpackages/git.slepc/src/sys/slepcinit.c:275 >> [0]PETSC ERROR: #7 LibMeshInit() at ../src/base/libmesh.C:522 >> [r8i3n0:mpi_rank_0][MPIDI_CH3_Abort] application called >> MPI_Abort(MPI_COMM_WORLD, 95) - process 0: No such file or directory (2) >> >> Thanks, >> >> Fande >> >> >> > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Wed Jan 19 17:33:52 2022 From: hzhang at mcs.anl.gov (Zhang, Hong) Date: Wed, 19 Jan 2022 23:33:52 +0000 Subject: [petsc-users] PETSc MUMPS interface In-Reply-To: References: <1AD9EBCE-4B48-4D54-9829-DFD5EDC68B76@dsic.upv.es> Message-ID: Varun, This feature is merged to petsc main https://gitlab.com/petsc/petsc/-/merge_requests/4727 Hong ________________________________ From: petsc-users on behalf of Zhang, Hong via petsc-users Sent: Wednesday, January 19, 2022 9:37 AM To: Varun Hiremath Cc: Peder J?rgensgaard Olesen via petsc-users Subject: Re: [petsc-users] PETSc MUMPS interface Varun, Good to know it works. FactorSymbolic function is still being called twice, but the 2nd call is a no-op, thus it still appears in '-log_view'. I made changes in the low level of mumps routine, not within PCSetUp() because I feel your use case is limited to mumps, not other matrix package solvers. Hong ________________________________ From: Varun Hiremath Sent: Wednesday, January 19, 2022 2:44 AM To: Zhang, Hong Cc: Peder J?rgensgaard Olesen via petsc-users Subject: Re: [petsc-users] PETSc MUMPS interface Hi Hong, Thanks, I tested your branch and I think it is working fine. I don't see any increase in runtime, however with -log_view I see that the MatLUFactorSymbolic function is still being called twice, so is this expected? Is the second call a no-op? 
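For context, the workflow being discussed in this thread, in sketch form only: run the symbolic/analysis phase, read the MUMPS estimates, and decide how to proceed before the solver triggers the numeric factorization. Names such as availableMB are application-side placeholders, not PETSc API, and the INFOG entries should be checked against the MUMPS manual.

/* pc obtained from the KSP/EPS solver beforehand, e.g. via KSPGetPC() */
Mat           F;
IS            rowperm, colperm;
MatFactorInfo info;
PetscInt      estMB;                  /* e.g. INFOG(16), estimated working memory in MB */

PCFactorSetMatSolverType(pc, MATSOLVERMUMPS);
PCFactorSetUpMatSolverType(pc);
PCFactorGetMatrix(pc, &F);

MatFactorInfoInitialize(&info);
MatGetOrdering(A, MATORDERINGNATURAL, &rowperm, &colperm);
MatLUFactorSymbolic(F, A, rowperm, colperm, &info);   /* analysis phase only */
MatMumpsGetInfog(F, 16, &estMB);                      /* memory estimate from the analysis */

if (estMB > availableMB) {            /* availableMB: provided by the application */
  MatMumpsSetIcntl(F, 22, 1);         /* ICNTL(22)=1: switch to out-of-core factorization */
}
/* the numeric factorization then runs inside KSPSolve()/EPSSolve() */

With the branch mentioned above, the symbolic factorization already done here is not repeated (the second call is a no-op), which is exactly what the -log_view output below is checking.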
$ ./ex52.o -use_mumps_lu -print_mumps_memory -log_view | grep MatLUFactorSym MatLUFactorSym 2 1.0 4.4411e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 Thanks, Varun On Mon, Jan 17, 2022 at 7:49 PM Zhang, Hong > wrote: Varun, I created a branch hzhang/feature-mumps-mem-estimate, see https://gitlab.com/petsc/petsc/-/merge_requests/4727 You may give it a try and let me know if this is what you want. src/ksp/ksp/tutorials/ex52.c is an example. Hong ________________________________ From: Varun Hiremath > Sent: Monday, January 17, 2022 12:41 PM To: Zhang, Hong > Cc: Jose E. Roman >; Peder J?rgensgaard Olesen via petsc-users > Subject: Re: [petsc-users] PETSc MUMPS interface Hi Hong, Thanks for looking into this. Here is the workflow that I might use: MatLUFactorSymbolic(F,A,perm,iperm,&info); // get memory estimates from MUMPS e.g. INFO(3), INFOG(16), INFOG(17) // find available memory on the system e.g. RAM size if (estimated_memory > available_memory) { // inform and stop; or // switch MUMPS to out-of-core factorization ICNTL(22) = 1; } else { // set appropriate settings for in-core factorization } // Now we call the solve and inside if MatLUFactorSymbolic is already called then it should be skipped EPSSolve(eps); Thanks, Varun On Mon, Jan 17, 2022 at 9:18 AM Zhang, Hong > wrote: Varun, I am trying to find a way to enable you to switch options after MatLUFactorSymbolic(). A hack is modifying the flag 'mumps->matstruc' inside MatLUFactorSymbolic_AIJMUMPS() and MatFactorNumeric_MUMPS(). My understanding of what you want is: // collect mumps memory info ... MatLUFactorSymbolic(F,A,perm,iperm,&info); printMumpsMemoryInfo(F); //--------- if (memory is available) { EPSSolve(eps); --> skip calling of MatLUFactorSymbolic() } else { //out-of-core (OOC) option in MUMPS } Am I correct? I'll let you know once I work out a solution. Hong ________________________________ From: Varun Hiremath > Sent: Sunday, January 16, 2022 10:10 PM To: Zhang, Hong > Cc: Jose E. Roman >; Peder J?rgensgaard Olesen via petsc-users > Subject: Re: [petsc-users] PETSc MUMPS interface Hi Jose, Hong, Thanks for the explanation. I have verified using -log_view that MatLUFactorSymbolic is indeed getting called twice. Hong, we use MUMPS solver for other things, and we typically run the symbolic analysis first and get memory estimates to ensure that we have enough memory available to run the case. If the available memory is not enough, we can stop or switch to the out-of-core (OOC) option in MUMPS. We wanted to do the same when using MUMPS via SLEPc/PETSc. Please let me know if there are other ways of getting these memory stats and switching options during runtime with PETSc. Appreciate your help! Thanks, Varun On Sun, Jan 16, 2022 at 4:01 PM Zhang, Hong > wrote: Varun, I believe Jose is correct. You may verify it by running your code with option '-log_view', then check the number of calls to MatLUFactorSym. I guess I can add a flag in PCSetUp() to check if user has already called MatLUFactorSymbolic() and wants to skip it. Normally, users simply allocate sufficient memory in the symbolic factorization. Why do you want to check it? Hong ________________________________ From: Jose E. 
Roman > Sent: Sunday, January 16, 2022 5:11 AM To: Varun Hiremath > Cc: Zhang, Hong >; Peder J?rgensgaard Olesen via petsc-users > Subject: Re: [petsc-users] PETSc MUMPS interface Hong may give a better answer, but if you look at PCSetUp_LU() https://petsc.org/main/src/ksp/pc/impls/factor/lu/lu.c.html#PCSetUp_LU you will see that MatLUFactorSymbolic() is called unconditionally during the first PCSetUp(). Currently there is no way to check if the user has already called MatLUFactorSymbolic(). Jose > El 16 ene 2022, a las 10:40, Varun Hiremath > escribi?: > > Hi Hong, > > Thank you, this is very helpful! > > Using this method I am able to get the memory estimates before the actual solve, however, I think my code may be causing the symbolic factorization to be run twice. Attached is my code where I am using SLEPc to compute eigenvalues, and I use MUMPS for factorization. I have commented above the code that computes the memory estimates, could you please check and tell me if this would cause the symbolic factor to be computed twice (a second time inside EPSSolve?), as I am seeing a slight increase in the overall computation time? > > Regards, > Varun > > On Wed, Jan 12, 2022 at 7:58 AM Zhang, Hong > wrote: > PCFactorSetMatSolverType(pc,MATSOLVERMUMPS); > PCFactorSetUpMatSolverType(pc); > PCFactorGetMatrix(pc,&F); > > MatLUFactorSymbolic(F,A,...) > You must provide row and column permutations etc, petsc/src/mat/tests/ex125.c may give you a clue on how to get these inputs. > > Hong > > > From: petsc-users > on behalf of Junchao Zhang > > Sent: Wednesday, January 12, 2022 9:03 AM > To: Varun Hiremath > > Cc: Peder J?rgensgaard Olesen via petsc-users > > Subject: Re: [petsc-users] PETSc MUMPS interface > > Calling PCSetUp() before KSPSetUp()? > > --Junchao Zhang > > > On Wed, Jan 12, 2022 at 3:00 AM Varun Hiremath > wrote: > Hi All, > > I want to collect MUMPS memory estimates based on the initial symbolic factorization analysis before the actual numerical factorization starts to check if the estimated memory requirements fit the available memory. > > I am following the steps from https://petsc.org/main/src/ksp/ksp/tutorials/ex52.c.html > > PCFactorSetMatSolverType(pc,MATSOLVERMUMPS); > PCFactorSetUpMatSolverType(pc); > PCFactorGetMatrix(pc,&F); > > KSPSetUp(ksp); > MatMumpsGetInfog(F,...) > > But it appears KSPSetUp calls both symbolic and numerical factorization. So is there some other way to get these statistics before the actual factorization starts? > > Thanks, > Varun > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fdkong.jd at gmail.com Wed Jan 19 23:11:56 2022 From: fdkong.jd at gmail.com (Fande Kong) Date: Wed, 19 Jan 2022 22:11:56 -0700 Subject: [petsc-users] Does mpiaijkok intend to support 64-bit integers? Message-ID: Hi All, It seems that mpiaijkok does not support 64-bit integers at this time. Do we have any motivation for this? Or Is it just a bug? 
Thanks, Fande petsc/src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx(306): error: a value of type "MatColumnIndexType *" cannot be assigned to an entity of type "int *" petsc/src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx(308): error: a value of type "PetscInt *" cannot be assigned to an entity of type "int *" petsc/src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx(310): error: a value of type "PetscInt *" cannot be assigned to an entity of type "int *" petsc/src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx(316): error: a value of type "MatColumnIndexType *" cannot be assigned to an entity of type "int *" petsc/src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx(329): error: a value of type "PetscInt *" cannot be assigned to an entity of type "int *" petsc/src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx(331): error: a value of type "PetscInt *" cannot be assigned to an entity of type "int *" 6 errors detected in the compilation of "/tmp/tmpxft_00017e46_00000000-6_mpiaijkok.kokkos.cpp1.ii". gmake[3]: *** [arch-linux-c-opt/obj/mat/impls/aij/mpi/kokkos/mpiaijkok.o] Error 1 -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Thu Jan 20 00:03:15 2022 From: jed at jedbrown.org (Jed Brown) Date: Wed, 19 Jan 2022 23:03:15 -0700 Subject: [petsc-users] Cannot eagerly initialize cuda, as doing so results in cuda error 35 (cudaErrorInsufficientDriver) : CUDA driver version is insufficient for CUDA runtime version In-Reply-To: References: <233D0F20-AD95-481B-B862-353D5789C556@gmail.com> Message-ID: <87czkn9c64.fsf@jedbrown.org> Fande Kong writes: > On Wed, Jan 19, 2022 at 11:39 AM Jacob Faibussowitsch > wrote: > >> Are you running on login nodes or compute nodes (I can?t seem to tell from >> the configure.log)? >> > > I was compiling codes on login nodes, and running codes on compute nodes. > Login nodes do not have GPUs, but compute nodes do have GPUs. > > Just to be clear, the same thing (code, machine) with PETSc-3.16.1 worked > perfectly. I have this trouble with PETSc-main. I assume you can export PETSC_OPTIONS='-device_enable lazy' and it'll work. I think this should be the default. The main complaint is that timing the first GPU-using event isn't accurate if it includes initialization, but I think this is mostly hypothetical because you can't trust any timing that doesn't preload in some form and the first GPU-using event will almost always be something uninteresting so I think it will rarely lead to confusion. Meanwhile, eager initialization is viscerally disruptive for lots of people. From mfadams at lbl.gov Thu Jan 20 07:49:33 2022 From: mfadams at lbl.gov (Mark Adams) Date: Thu, 20 Jan 2022 08:49:33 -0500 Subject: [petsc-users] Does mpiaijkok intend to support 64-bit integers? In-Reply-To: References: Message-ID: Humm, I was not able to reproduce this on my Mac. Trying Crusher now. Are you using main? or even a recent release. We did fix a 64 bit int bug recently in mpiaijkok. Thanks, Mark On Thu, Jan 20, 2022 at 12:12 AM Fande Kong wrote: > Hi All, > > It seems that mpiaijkok does not support 64-bit integers at this time. Do > we have any motivation for this? Or Is it just a bug? 
> > Thanks, > > Fande > > petsc/src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx(306): error: a > value of type "MatColumnIndexType *" cannot be assigned to an entity of > type "int *" > > > petsc/src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx(308): error: a > value of type "PetscInt *" cannot be assigned to an entity of type "int *" > > > petsc/src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx(310): error: a > value of type "PetscInt *" cannot be assigned to an entity of type "int *" > > > petsc/src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx(316): error: a > value of type "MatColumnIndexType *" cannot be assigned to an entity of > type "int *" > > > petsc/src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx(329): error: a > value of type "PetscInt *" cannot be assigned to an entity of type "int *" > > > petsc/src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx(331): error: a > value of type "PetscInt *" cannot be assigned to an entity of type "int *" > > > 6 errors detected in the compilation of > "/tmp/tmpxft_00017e46_00000000-6_mpiaijkok.kokkos.cpp1.ii". > > gmake[3]: *** [arch-linux-c-opt/obj/mat/impls/aij/mpi/kokkos/mpiaijkok.o] > Error 1 > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fdkong.jd at gmail.com Thu Jan 20 10:14:45 2022 From: fdkong.jd at gmail.com (Fande Kong) Date: Thu, 20 Jan 2022 09:14:45 -0700 Subject: [petsc-users] Does mpiaijkok intend to support 64-bit integers? In-Reply-To: References: Message-ID: On Thu, Jan 20, 2022 at 6:49 AM Mark Adams wrote: > Humm, I was not able to reproduce this on my Mac. Trying Crusher now. > Are you using main? or even a recent release. > I am working with PETSc-3.16.1 I will try main now Thanks, Fande > We did fix a 64 bit int bug recently in mpiaijkok. > > Thanks, > Mark > > On Thu, Jan 20, 2022 at 12:12 AM Fande Kong wrote: > >> Hi All, >> >> It seems that mpiaijkok does not support 64-bit integers at this time. Do >> we have any motivation for this? Or Is it just a bug? >> >> Thanks, >> >> Fande >> >> petsc/src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx(306): error: a >> value of type "MatColumnIndexType *" cannot be assigned to an entity of >> type "int *" >> >> >> petsc/src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx(308): error: a >> value of type "PetscInt *" cannot be assigned to an entity of type "int *" >> >> >> petsc/src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx(310): error: a >> value of type "PetscInt *" cannot be assigned to an entity of type "int *" >> >> >> petsc/src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx(316): error: a >> value of type "MatColumnIndexType *" cannot be assigned to an entity of >> type "int *" >> >> >> petsc/src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx(329): error: a >> value of type "PetscInt *" cannot be assigned to an entity of type "int *" >> >> >> petsc/src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx(331): error: a >> value of type "PetscInt *" cannot be assigned to an entity of type "int *" >> >> >> 6 errors detected in the compilation of >> "/tmp/tmpxft_00017e46_00000000-6_mpiaijkok.kokkos.cpp1.ii". >> >> gmake[3]: *** [arch-linux-c-opt/obj/mat/impls/aij/mpi/kokkos/mpiaijkok.o] >> Error 1 >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fdkong.jd at gmail.com Thu Jan 20 13:21:09 2022 From: fdkong.jd at gmail.com (Fande Kong) Date: Thu, 20 Jan 2022 12:21:09 -0700 Subject: [petsc-users] Does mpiaijkok intend to support 64-bit integers? 
In-Reply-To: References: Message-ID: Thanks, Mark, PETSc-main has no issue. Fande On Thu, Jan 20, 2022 at 9:14 AM Fande Kong wrote: > > > On Thu, Jan 20, 2022 at 6:49 AM Mark Adams wrote: > >> Humm, I was not able to reproduce this on my Mac. Trying Crusher now. >> Are you using main? or even a recent release. >> > > I am working with PETSc-3.16.1 > > I will try main now > > Thanks, > Fande > > > >> We did fix a 64 bit int bug recently in mpiaijkok. >> >> Thanks, >> Mark >> >> On Thu, Jan 20, 2022 at 12:12 AM Fande Kong wrote: >> >>> Hi All, >>> >>> It seems that mpiaijkok does not support 64-bit integers at this time. >>> Do we have any motivation for this? Or Is it just a bug? >>> >>> Thanks, >>> >>> Fande >>> >>> petsc/src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx(306): error: a >>> value of type "MatColumnIndexType *" cannot be assigned to an entity of >>> type "int *" >>> >>> >>> petsc/src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx(308): error: a >>> value of type "PetscInt *" cannot be assigned to an entity of type "int *" >>> >>> >>> petsc/src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx(310): error: a >>> value of type "PetscInt *" cannot be assigned to an entity of type "int *" >>> >>> >>> petsc/src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx(316): error: a >>> value of type "MatColumnIndexType *" cannot be assigned to an entity of >>> type "int *" >>> >>> >>> petsc/src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx(329): error: a >>> value of type "PetscInt *" cannot be assigned to an entity of type "int *" >>> >>> >>> petsc/src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx(331): error: a >>> value of type "PetscInt *" cannot be assigned to an entity of type "int *" >>> >>> >>> 6 errors detected in the compilation of >>> "/tmp/tmpxft_00017e46_00000000-6_mpiaijkok.kokkos.cpp1.ii". >>> >>> gmake[3]: *** >>> [arch-linux-c-opt/obj/mat/impls/aij/mpi/kokkos/mpiaijkok.o] Error 1 >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From fdkong.jd at gmail.com Thu Jan 20 14:09:51 2022 From: fdkong.jd at gmail.com (Fande Kong) Date: Thu, 20 Jan 2022 13:09:51 -0700 Subject: [petsc-users] Cannot eagerly initialize cuda, as doing so results in cuda error 35 (cudaErrorInsufficientDriver) : CUDA driver version is insufficient for CUDA runtime version In-Reply-To: <87czkn9c64.fsf@jedbrown.org> References: <233D0F20-AD95-481B-B862-353D5789C556@gmail.com> <87czkn9c64.fsf@jedbrown.org> Message-ID: Thanks, Jed, This worked! Fande On Wed, Jan 19, 2022 at 11:03 PM Jed Brown wrote: > Fande Kong writes: > > > On Wed, Jan 19, 2022 at 11:39 AM Jacob Faibussowitsch < > jacob.fai at gmail.com> > > wrote: > > > >> Are you running on login nodes or compute nodes (I can?t seem to tell > from > >> the configure.log)? > >> > > > > I was compiling codes on login nodes, and running codes on compute nodes. > > Login nodes do not have GPUs, but compute nodes do have GPUs. > > > > Just to be clear, the same thing (code, machine) with PETSc-3.16.1 worked > > perfectly. I have this trouble with PETSc-main. > > I assume you can > > export PETSC_OPTIONS='-device_enable lazy' > > and it'll work. > > I think this should be the default. The main complaint is that timing the > first GPU-using event isn't accurate if it includes initialization, but I > think this is mostly hypothetical because you can't trust any timing that > doesn't preload in some form and the first GPU-using event will almost > always be something uninteresting so I think it will rarely lead to > confusion. 
Meanwhile, eager initialization is viscerally disruptive for > lots of people. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fdkong.jd at gmail.com Thu Jan 20 14:29:20 2022 From: fdkong.jd at gmail.com (Fande Kong) Date: Thu, 20 Jan 2022 13:29:20 -0700 Subject: [petsc-users] Cannot eagerly initialize cuda, as doing so results in cuda error 35 (cudaErrorInsufficientDriver) : CUDA driver version is insufficient for CUDA runtime version In-Reply-To: References: <233D0F20-AD95-481B-B862-353D5789C556@gmail.com> <87czkn9c64.fsf@jedbrown.org> Message-ID: I spoke too soon. It seems that we have trouble creating cuda/kokkos vecs now. Got Segmentation fault. Thanks, Fande Program received signal SIGSEGV, Segmentation fault. 0x00002aaab5558b11 in Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::CUPMDeviceInternal::initialize (this=0x1) at /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:54 54 PetscErrorCode CUPMDevice::CUPMDeviceInternal::initialize() noexcept Missing separate debuginfos, use: debuginfo-install bzip2-libs-1.0.6-13.el7.x86_64 elfutils-libelf-0.176-5.el7.x86_64 elfutils-libs-0.176-5.el7.x86_64 glibc-2.17-325.el7_9.x86_64 libX11-1.6.7-4.el7_9.x86_64 libXau-1.0.8-2.1.el7.x86_64 libattr-2.4.46-13.el7.x86_64 libcap-2.22-11.el7.x86_64 libibmad-5.4.0.MLNX20190423.1d917ae-0.1.49224.x86_64 libibumad-43.1.1.MLNX20200211.078947f-0.1.49224.x86_64 libibverbs-41mlnx1-OFED.4.9.0.0.7.49224.x86_64 libmlx4-41mlnx1-OFED.4.7.3.0.3.49224.x86_64 libmlx5-41mlnx1-OFED.4.9.0.1.2.49224.x86_64 libnl3-3.2.28-4.el7.x86_64 librdmacm-41mlnx1-OFED.4.7.3.0.6.49224.x86_64 librxe-41mlnx1-OFED.4.4.2.4.6.49224.x86_64 libxcb-1.13-1.el7.x86_64 libxml2-2.9.1-6.el7_9.6.x86_64 numactl-libs-2.0.12-5.el7.x86_64 systemd-libs-219-78.el7_9.3.x86_64 xz-libs-5.2.2-1.el7.x86_64 zlib-1.2.7-19.el7_9.x86_64 (gdb) bt #0 0x00002aaab5558b11 in Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::CUPMDeviceInternal::initialize (this=0x1) at /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:54 #1 0x00002aaab5558db7 in Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::getDevice (this=this at entry=0x2aaab7f37b70 , device=0x115da00, id=-35, id at entry=-1) at /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:344 #2 0x00002aaab55577de in PetscDeviceCreate (type=type at entry=PETSC_DEVICE_CUDA, devid=devid at entry=-1, device=device at entry=0x2aaab7f37b48 ) at /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:107 #3 0x00002aaab5557b3a in PetscDeviceInitializeDefaultDevice_Internal (type=type at entry=PETSC_DEVICE_CUDA, defaultDeviceId=defaultDeviceId at entry=-1) at /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:273 #4 0x00002aaab5557bf6 in PetscDeviceInitialize (type=type at entry=PETSC_DEVICE_CUDA) at /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:234 #5 0x00002aaab5661fcd in VecCreate_SeqCUDA (V=0x115d150) at /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/impls/seq/seqcuda/veccuda.c:244 #6 0x00002aaab5649b40 in VecSetType (vec=vec at entry=0x115d150, method=method at entry=0x2aaab70b45b8 "seqcuda") at /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vecreg.c:93 #7 0x00002aaab579c33f in VecCreate_CUDA (v=0x115d150) at /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/impls/mpi/mpicuda/ mpicuda.cu:214 #8 0x00002aaab5649b40 in 
VecSetType (vec=vec at entry=0x115d150, method=method at entry=0x7fffffff9260 "cuda") at /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vecreg.c:93 #9 0x00002aaab5648bf1 in VecSetTypeFromOptions_Private (vec=0x115d150, PetscOptionsObject=0x7fffffff9210) at /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vector.c:1263 #10 VecSetFromOptions (vec=0x115d150) at /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vector.c:1297 #11 0x00002aaab02ef227 in libMesh::PetscVector::init (this=0x11cd1a0, n=441, n_local=441, fast=false, ptype=libMesh::PARALLEL) at /home/kongf/workhome/sawtooth/moosegpu/scripts/../libmesh/installed/include/libmesh/petsc_vector.h:693 On Thu, Jan 20, 2022 at 1:09 PM Fande Kong wrote: > Thanks, Jed, > > This worked! > > Fande > > On Wed, Jan 19, 2022 at 11:03 PM Jed Brown wrote: > >> Fande Kong writes: >> >> > On Wed, Jan 19, 2022 at 11:39 AM Jacob Faibussowitsch < >> jacob.fai at gmail.com> >> > wrote: >> > >> >> Are you running on login nodes or compute nodes (I can?t seem to tell >> from >> >> the configure.log)? >> >> >> > >> > I was compiling codes on login nodes, and running codes on compute >> nodes. >> > Login nodes do not have GPUs, but compute nodes do have GPUs. >> > >> > Just to be clear, the same thing (code, machine) with PETSc-3.16.1 >> worked >> > perfectly. I have this trouble with PETSc-main. >> >> I assume you can >> >> export PETSC_OPTIONS='-device_enable lazy' >> >> and it'll work. >> >> I think this should be the default. The main complaint is that timing the >> first GPU-using event isn't accurate if it includes initialization, but I >> think this is mostly hypothetical because you can't trust any timing that >> doesn't preload in some form and the first GPU-using event will almost >> always be something uninteresting so I think it will rarely lead to >> confusion. Meanwhile, eager initialization is viscerally disruptive for >> lots of people. >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jxiong at anl.gov Thu Jan 20 16:13:57 2022 From: jxiong at anl.gov (Xiong, Jing) Date: Thu, 20 Jan 2022 22:13:57 +0000 Subject: [petsc-users] Asking examples about solving DAE in python Message-ID: Hi, I hope you are well. I'm very interested in PETSc and want to explore the possibility of whether it could solve Differential-algebraic equations (DAE) in python. I know there are great examples in C, but I'm struggling to connect the examples in python. The only example I got right now is for solving ODEs in the paper: PETSc/TS: A Modern Scalable ODE/DAE Solver Library. And I got the following questions: 1. Is petsc4py the right package to use? 2. Could you give an example for solving DAEs in python? 3. Is Jacobian must be specified? If not, could your show an example for solving DAEs without specified Jacobian in python? Thank you for your help. Best, Jing -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Thu Jan 20 16:21:48 2022 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 20 Jan 2022 17:21:48 -0500 Subject: [petsc-users] Asking examples about solving DAE in python In-Reply-To: References: Message-ID: <7C9E66F4-AC3C-4289-AA12-AAE5DAB41042@petsc.dev> Hong Zhang can give you more correct details but I can get you started. Yes, you can use Petsc4py for this purpose. 
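To make the DAE setup concrete, a minimal hand-written C sketch is given below; the tiny 2x2 DAE and all names in it are made up for illustration (it is not one of the PETSc tutorials), and the same structure carries over to petsc4py via TS.setIFunction(). Running it with -snes_mf or -snes_fd means no analytic Jacobian has to be coded.

/* Sketch: the DAE  u0' = -u0,  0 = u1 - u0  in fully implicit form F(t,u,udot) = 0 */
#include <petscts.h>

static PetscErrorCode IFunction(TS ts, PetscReal t, Vec U, Vec Udot, Vec F, void *ctx)
{
  const PetscScalar *u, *udot;
  PetscScalar       *f;
  PetscErrorCode    ierr;

  PetscFunctionBeginUser;
  ierr = VecGetArrayRead(U, &u);CHKERRQ(ierr);
  ierr = VecGetArrayRead(Udot, &udot);CHKERRQ(ierr);
  ierr = VecGetArray(F, &f);CHKERRQ(ierr);
  f[0] = udot[0] + u[0];   /* differential equation */
  f[1] = u[1] - u[0];      /* algebraic constraint  */
  ierr = VecRestoreArrayRead(U, &u);CHKERRQ(ierr);
  ierr = VecRestoreArrayRead(Udot, &udot);CHKERRQ(ierr);
  ierr = VecRestoreArray(F, &f);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

int main(int argc, char **argv)
{
  TS             ts;
  Vec            u;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL);if (ierr) return ierr;
  ierr = VecCreateSeq(PETSC_COMM_SELF, 2, &u);CHKERRQ(ierr);
  ierr = VecSet(u, 1.0);CHKERRQ(ierr);                    /* consistent initial state */
  ierr = TSCreate(PETSC_COMM_SELF, &ts);CHKERRQ(ierr);
  ierr = TSSetType(ts, TSBDF);CHKERRQ(ierr);              /* implicit method, suitable for DAEs */
  ierr = TSSetIFunction(ts, NULL, IFunction, NULL);CHKERRQ(ierr);
  ierr = TSSetTimeStep(ts, 0.01);CHKERRQ(ierr);
  ierr = TSSetMaxTime(ts, 1.0);CHKERRQ(ierr);
  ierr = TSSetExactFinalTime(ts, TS_EXACTFINALTIME_MATCHSTEP);CHKERRQ(ierr);
  ierr = TSSetFromOptions(ts);CHKERRQ(ierr);              /* pass -snes_mf or -snes_fd here */
  ierr = TSSolve(ts, u);CHKERRQ(ierr);
  ierr = VecDestroy(&u);CHKERRQ(ierr);
  ierr = TSDestroy(&ts);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}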
If you cannot provide a Jacobian then you should be able to use either a matrix-free solver for the Jacobian system or have PETSc build the Jacobian for you with finite differencing and coloring. Or if the problem is small just have PETSc brute force the Jacobian with finite differencing. > On Jan 20, 2022, at 5:13 PM, Xiong, Jing via petsc-users wrote: > > Hi, > > I hope you are well. > I'm very interested in PETSc and want to explore the possibility of whether it could solve Differential-algebraic equations (DAE) in python. I know there are great examples in C, but I'm struggling to connect the examples in python. > > The only example I got right now is for solving ODEs in the paper: PETSc/TS: A Modern Scalable ODE/DAE Solver Library. > And I got the following questions: > Is petsc4py the right package to use? > Could you give an example for solving DAEs in python? > Is Jacobian must be specified? If not, could your show an example for solving DAEs without specified Jacobian in python? > > Thank you for your help. > > Best, > Jing -------------- next part -------------- An HTML attachment was scrubbed... URL: From jxiong at anl.gov Thu Jan 20 16:45:51 2022 From: jxiong at anl.gov (Xiong, Jing) Date: Thu, 20 Jan 2022 22:45:51 +0000 Subject: [petsc-users] Asking examples about solving DAE in python In-Reply-To: <7C9E66F4-AC3C-4289-AA12-AAE5DAB41042@petsc.dev> References: <7C9E66F4-AC3C-4289-AA12-AAE5DAB41042@petsc.dev> Message-ID: Good afternoon Dr. Smith, Thank you for your quick response. I would like to know more details for "a matrix-free solver for the Jacobian system or have PETSc build the Jacobian for you with finite differencing and coloring." Do you happen to have any examples for this? Maybe I can run a demo and test the performance of my problem. Best, Jing ________________________________ From: Barry Smith Sent: Thursday, January 20, 2022 2:21 PM To: Xiong, Jing Cc: petsc-users at mcs.anl.gov ; Zhao, Dongbo ; Hong, Tianqi Subject: Re: [petsc-users] Asking examples about solving DAE in python Hong Zhang can give you more correct details but I can get you started. Yes, you can use Petsc4py for this purpose. If you cannot provide a Jacobian then you should be able to use either a matrix-free solver for the Jacobian system or have PETSc build the Jacobian for you with finite differencing and coloring. Or if the problem is small just have PETSc brute force the Jacobian with finite differencing. On Jan 20, 2022, at 5:13 PM, Xiong, Jing via petsc-users > wrote: Hi, I hope you are well. I'm very interested in PETSc and want to explore the possibility of whether it could solve Differential-algebraic equations (DAE) in python. I know there are great examples in C, but I'm struggling to connect the examples in python. The only example I got right now is for solving ODEs in the paper: PETSc/TS: A Modern Scalable ODE/DAE Solver Library. And I got the following questions: 1. Is petsc4py the right package to use? 2. Could you give an example for solving DAEs in python? 3. Is Jacobian must be specified? If not, could your show an example for solving DAEs without specified Jacobian in python? Thank you for your help. Best, Jing -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Thu Jan 20 16:57:11 2022 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 20 Jan 2022 17:57:11 -0500 Subject: [petsc-users] Asking examples about solving DAE in python In-Reply-To: References: <7C9E66F4-AC3C-4289-AA12-AAE5DAB41042@petsc.dev> Message-ID: On Thu, Jan 20, 2022 at 5:45 PM Xiong, Jing via petsc-users < petsc-users at mcs.anl.gov> wrote: > Good afternoon Dr. Smith, > > Thank you for your quick response. > I would like to know more details for "a matrix-free solver for the > Jacobian system or have PETSc build the Jacobian for you with finite > differencing and coloring." > If you do not specify anything, we do this by default, so the first thing to do is just run with your residual function. > Do you happen to have any examples for this? Maybe I can run a demo and > test the performance of my problem. > Most of the TS examples can be configured to run in these modes. There is a chapter in the users manual about this. Thanks, Matt > Best, > Jing > ------------------------------ > *From:* Barry Smith > *Sent:* Thursday, January 20, 2022 2:21 PM > *To:* Xiong, Jing > *Cc:* petsc-users at mcs.anl.gov ; Zhao, Dongbo < > dongbo.zhao at anl.gov>; Hong, Tianqi > *Subject:* Re: [petsc-users] Asking examples about solving DAE in python > > > Hong Zhang can give you more correct details but I can get you started. > > Yes, you can use Petsc4py for this purpose. > > If you cannot provide a Jacobian then you should be able to use either > a matrix-free solver for the Jacobian system or have PETSc build the > Jacobian for you with finite differencing and coloring. Or if the problem > is small just have PETSc brute force the Jacobian with finite differencing. > > > > On Jan 20, 2022, at 5:13 PM, Xiong, Jing via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > Hi, > > I hope you are well. > I'm very interested in PETSc and want to explore the possibility of > whether it could solve Differential-algebraic equations (DAE) in python. I > know there are great examples in C, but I'm struggling to connect the > examples in python. > > The only example I got right now is for solving ODEs in the paper: > PETSc/TS: A Modern Scalable ODE/DAE Solver Library. > And I got the following questions: > > 1. *Is petsc4py the right package to use?* > 2. *Could you give an example for solving DAEs in python?* > 3. *Is Jacobian must be specified? If not, could your show an example > for solving DAEs without specified Jacobian in python?* > > > Thank you for your help. > > Best, > Jing > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From hongzhang at anl.gov Thu Jan 20 17:05:36 2022 From: hongzhang at anl.gov (Zhang, Hong) Date: Thu, 20 Jan 2022 23:05:36 +0000 Subject: [petsc-users] Asking examples about solving DAE in python In-Reply-To: References: Message-ID: On Jan 20, 2022, at 4:13 PM, Xiong, Jing via petsc-users > wrote: Hi, I hope you are well. I'm very interested in PETSc and want to explore the possibility of whether it could solve Differential-algebraic equations (DAE) in python. I know there are great examples in C, but I'm struggling to connect the examples in python. The only example I got right now is for solving ODEs in the paper: PETSc/TS: A Modern Scalable ODE/DAE Solver Library. 
And I got the following questions: 1. Is petsc4py the right package to use? Yes, you need petsc4py for python. 1. Could you give an example for solving DAEs in python? src/binding/petsc4py/demo/ode/vanderpol.py gives a simple example to demonstrate a variety of PETSc capabilities. The corresponding C version of this example is ex20adj.c in src/ts/tutorials/. 1. How to solve an ODE with explicit methods. 2. How to solve an ODE/DAE with implicit methods. 3. How to use TSAdjoint to calculate adjoint sensitivities. 4. How to do a manual matrix-free implementation (e.g. you may already have a way to differentiate your RHS function to generate the Jacobian-vector product). 1. Is Jacobian must be specified? If not, could your show an example for solving DAEs without specified Jacobian in python? PETSc can do finite-difference approximations to generate the Jacobian matrix automatically. This may work nicely for small-sized problems. You can also try to use an AD tool to produce the Jacobian-vector product and use it in a matrix-free implementation. Hong Thank you for your help. Best, Jing -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Thu Jan 20 17:34:06 2022 From: jed at jedbrown.org (Jed Brown) Date: Thu, 20 Jan 2022 16:34:06 -0700 Subject: [petsc-users] Cannot eagerly initialize cuda, as doing so results in cuda error 35 (cudaErrorInsufficientDriver) : CUDA driver version is insufficient for CUDA runtime version In-Reply-To: References: <233D0F20-AD95-481B-B862-353D5789C556@gmail.com> <87czkn9c64.fsf@jedbrown.org> Message-ID: <875yqe7zip.fsf@jedbrown.org> You can't create CUDA or Kokkos Vecs if you're running on a node without a GPU. The point of lazy initialization is to make it possible to run a solve that doesn't use a GPU in PETSC_ARCH that supports GPUs, regardless of whether a GPU is actually present. Fande Kong writes: > I spoke too soon. It seems that we have trouble creating cuda/kokkos vecs > now. Got Segmentation fault. > > Thanks, > > Fande > > Program received signal SIGSEGV, Segmentation fault. 
> 0x00002aaab5558b11 in > Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::CUPMDeviceInternal::initialize > (this=0x1) at > /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:54 > 54 PetscErrorCode CUPMDevice::CUPMDeviceInternal::initialize() noexcept > Missing separate debuginfos, use: debuginfo-install > bzip2-libs-1.0.6-13.el7.x86_64 elfutils-libelf-0.176-5.el7.x86_64 > elfutils-libs-0.176-5.el7.x86_64 glibc-2.17-325.el7_9.x86_64 > libX11-1.6.7-4.el7_9.x86_64 libXau-1.0.8-2.1.el7.x86_64 > libattr-2.4.46-13.el7.x86_64 libcap-2.22-11.el7.x86_64 > libibmad-5.4.0.MLNX20190423.1d917ae-0.1.49224.x86_64 > libibumad-43.1.1.MLNX20200211.078947f-0.1.49224.x86_64 > libibverbs-41mlnx1-OFED.4.9.0.0.7.49224.x86_64 > libmlx4-41mlnx1-OFED.4.7.3.0.3.49224.x86_64 > libmlx5-41mlnx1-OFED.4.9.0.1.2.49224.x86_64 libnl3-3.2.28-4.el7.x86_64 > librdmacm-41mlnx1-OFED.4.7.3.0.6.49224.x86_64 > librxe-41mlnx1-OFED.4.4.2.4.6.49224.x86_64 libxcb-1.13-1.el7.x86_64 > libxml2-2.9.1-6.el7_9.6.x86_64 numactl-libs-2.0.12-5.el7.x86_64 > systemd-libs-219-78.el7_9.3.x86_64 xz-libs-5.2.2-1.el7.x86_64 > zlib-1.2.7-19.el7_9.x86_64 > (gdb) bt > #0 0x00002aaab5558b11 in > Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::CUPMDeviceInternal::initialize > (this=0x1) at > /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:54 > #1 0x00002aaab5558db7 in > Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::getDevice > (this=this at entry=0x2aaab7f37b70 > , device=0x115da00, id=-35, id at entry=-1) at > /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:344 > #2 0x00002aaab55577de in PetscDeviceCreate (type=type at entry=PETSC_DEVICE_CUDA, > devid=devid at entry=-1, device=device at entry=0x2aaab7f37b48 > ) at > /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:107 > #3 0x00002aaab5557b3a in PetscDeviceInitializeDefaultDevice_Internal > (type=type at entry=PETSC_DEVICE_CUDA, defaultDeviceId=defaultDeviceId at entry=-1) > at > /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:273 > #4 0x00002aaab5557bf6 in PetscDeviceInitialize > (type=type at entry=PETSC_DEVICE_CUDA) > at > /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:234 > #5 0x00002aaab5661fcd in VecCreate_SeqCUDA (V=0x115d150) at > /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/impls/seq/seqcuda/veccuda.c:244 > #6 0x00002aaab5649b40 in VecSetType (vec=vec at entry=0x115d150, > method=method at entry=0x2aaab70b45b8 "seqcuda") at > /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vecreg.c:93 > #7 0x00002aaab579c33f in VecCreate_CUDA (v=0x115d150) at > /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/impls/mpi/mpicuda/ > mpicuda.cu:214 > #8 0x00002aaab5649b40 in VecSetType (vec=vec at entry=0x115d150, > method=method at entry=0x7fffffff9260 "cuda") at > /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vecreg.c:93 > #9 0x00002aaab5648bf1 in VecSetTypeFromOptions_Private (vec=0x115d150, > PetscOptionsObject=0x7fffffff9210) at > /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vector.c:1263 > #10 VecSetFromOptions (vec=0x115d150) at > /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vector.c:1297 > #11 0x00002aaab02ef227 in libMesh::PetscVector::init > (this=0x11cd1a0, n=441, n_local=441, fast=false, ptype=libMesh::PARALLEL) > at > 
/home/kongf/workhome/sawtooth/moosegpu/scripts/../libmesh/installed/include/libmesh/petsc_vector.h:693 > > On Thu, Jan 20, 2022 at 1:09 PM Fande Kong wrote: > >> Thanks, Jed, >> >> This worked! >> >> Fande >> >> On Wed, Jan 19, 2022 at 11:03 PM Jed Brown wrote: >> >>> Fande Kong writes: >>> >>> > On Wed, Jan 19, 2022 at 11:39 AM Jacob Faibussowitsch < >>> jacob.fai at gmail.com> >>> > wrote: >>> > >>> >> Are you running on login nodes or compute nodes (I can?t seem to tell >>> from >>> >> the configure.log)? >>> >> >>> > >>> > I was compiling codes on login nodes, and running codes on compute >>> nodes. >>> > Login nodes do not have GPUs, but compute nodes do have GPUs. >>> > >>> > Just to be clear, the same thing (code, machine) with PETSc-3.16.1 >>> worked >>> > perfectly. I have this trouble with PETSc-main. >>> >>> I assume you can >>> >>> export PETSC_OPTIONS='-device_enable lazy' >>> >>> and it'll work. >>> >>> I think this should be the default. The main complaint is that timing the >>> first GPU-using event isn't accurate if it includes initialization, but I >>> think this is mostly hypothetical because you can't trust any timing that >>> doesn't preload in some form and the first GPU-using event will almost >>> always be something uninteresting so I think it will rarely lead to >>> confusion. Meanwhile, eager initialization is viscerally disruptive for >>> lots of people. >>> >> From fdkong.jd at gmail.com Thu Jan 20 17:43:41 2022 From: fdkong.jd at gmail.com (Fande Kong) Date: Thu, 20 Jan 2022 16:43:41 -0700 Subject: [petsc-users] Cannot eagerly initialize cuda, as doing so results in cuda error 35 (cudaErrorInsufficientDriver) : CUDA driver version is insufficient for CUDA runtime version In-Reply-To: <875yqe7zip.fsf@jedbrown.org> References: <233D0F20-AD95-481B-B862-353D5789C556@gmail.com> <87czkn9c64.fsf@jedbrown.org> <875yqe7zip.fsf@jedbrown.org> Message-ID: Thanks, Jed On Thu, Jan 20, 2022 at 4:34 PM Jed Brown wrote: > You can't create CUDA or Kokkos Vecs if you're running on a node without a > GPU. I am running the code on compute nodes that do have GPUs. With PETSc-3.16.1, I got good speedup by running GAMG on GPUs. That might be a bug of PETSc-main. 
Thanks, Fande KSPSetUp 13 1.0 6.4400e-01 1.0 2.02e+09 1.0 0.0e+00 0.0e+00 0.0e+00 0 5 0 0 0 0 5 0 0 0 3140 64630 15 1.05e+02 5 3.49e+01 100 KSPSolve 1 1.0 1.0109e+00 1.0 3.49e+10 1.0 0.0e+00 0.0e+00 0.0e+00 0 87 0 0 0 0 87 0 0 0 34522 69556 4 4.35e-03 1 2.38e-03 100 KSPGMRESOrthog 142 1.0 1.2674e-01 1.0 1.06e+10 1.0 0.0e+00 0.0e+00 0.0e+00 0 27 0 0 0 0 27 0 0 0 83755 87801 0 0.00e+00 0 0.00e+00 100 SNESSolve 1 1.0 4.4402e+01 1.0 4.00e+10 1.0 0.0e+00 0.0e+00 0.0e+00 21100 0 0 0 21100 0 0 0 901 51365 57 1.10e+03 52 8.78e+02 100 SNESSetUp 1 1.0 3.9101e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 SNESFunctionEval 2 1.0 1.7097e+01 1.0 1.60e+07 1.0 0.0e+00 0.0e+00 0.0e+00 8 0 0 0 0 8 0 0 0 0 1 0 0 0.00e+00 6 1.92e+02 0 SNESJacobianEval 1 1.0 1.6213e+01 1.0 2.80e+07 1.0 0.0e+00 0.0e+00 0.0e+00 8 0 0 0 0 8 0 0 0 0 2 0 0 0.00e+00 1 3.20e+01 0 SNESLineSearch 1 1.0 8.5582e+00 1.0 1.24e+08 1.0 0.0e+00 0.0e+00 0.0e+00 4 0 0 0 0 4 0 0 0 0 14 64153 1 3.20e+01 3 9.61e+01 94 PCGAMGGraph_AGG 5 1.0 3.0509e+00 1.0 8.19e+07 1.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 27 0 5 3.49e+01 9 7.43e+01 0 PCGAMGCoarse_AGG 5 1.0 3.8711e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 PCGAMGProl_AGG 5 1.0 7.0748e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 PCGAMGPOpt_AGG 5 1.0 1.2904e+00 1.0 2.14e+09 1.0 0.0e+00 0.0e+00 0.0e+00 1 5 0 0 0 1 5 0 0 0 1661 29807 26 7.15e+02 20 2.90e+02 99 GAMG: createProl 5 1.0 8.9489e+00 1.0 2.22e+09 1.0 0.0e+00 0.0e+00 0.0e+00 4 6 0 0 0 4 6 0 0 0 249 29666 31 7.50e+02 29 3.64e+02 96 Graph 10 1.0 3.0478e+00 1.0 8.19e+07 1.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 27 0 5 3.49e+01 9 7.43e+01 0 MIS/Agg 5 1.0 4.1290e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 SA: col data 5 1.0 1.9127e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 SA: frmProl0 5 1.0 6.2662e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 SA: smooth 5 1.0 4.9595e-01 1.0 1.21e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 244 2709 15 1.97e+02 15 2.55e+02 90 GAMG: partLevel 5 1.0 4.7330e-01 1.0 6.98e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 1475 4120 5 1.78e+02 10 2.55e+02 100 PCGAMG Squ l00 1 1.0 2.6027e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 PCGAMG Gal l00 1 1.0 3.8406e-01 1.0 5.48e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 1426 4270 1 1.48e+02 2 2.11e+02 100 PCGAMG Opt l00 1 1.0 2.4932e-01 1.0 7.20e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 289 2653 1 6.41e+01 1 1.13e+02 100 PCGAMG Gal l01 1 1.0 6.6279e-02 1.0 1.09e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1645 3851 1 2.40e+01 2 3.64e+01 100 PCGAMG Opt l01 1 1.0 2.9544e-02 1.0 7.15e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 242 1671 1 4.84e+00 1 1.23e+01 100 PCGAMG Gal l02 1 1.0 1.8874e-02 1.0 3.72e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1974 3636 1 5.04e+00 2 6.58e+00 100 PCGAMG Opt l02 1 1.0 7.4353e-03 1.0 2.40e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 323 1457 1 7.71e-01 1 2.30e+00 100 PCGAMG Gal l03 1 1.0 2.8479e-03 1.0 4.10e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1440 2266 1 4.44e-01 2 5.51e-01 100 PCGAMG Opt l03 1 1.0 8.2684e-04 1.0 2.80e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 339 1667 1 6.72e-02 1 2.03e-01 100 PCGAMG Gal l04 1 1.0 1.2238e-03 
1.0 2.09e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 170 244 1 2.05e-02 2 2.53e-02 100 PCGAMG Opt l04 1 1.0 4.1008e-04 1.0 1.77e+04 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 43 165 1 4.49e-03 1 1.19e-02 100 PCSetUp 2 1.0 9.9632e+00 1.0 4.95e+09 1.0 0.0e+00 0.0e+00 0.0e+00 5 12 0 0 0 5 12 0 0 0 496 17826 55 1.03e+03 45 6.54e+02 98 PCSetUpOnBlocks 44 1.0 9.9087e-04 1.0 2.88e+03 1.0 > The point of lazy initialization is to make it possible to run a solve > that doesn't use a GPU in PETSC_ARCH that supports GPUs, regardless of > whether a GPU is actually present. > > Fande Kong writes: > > > I spoke too soon. It seems that we have trouble creating cuda/kokkos vecs > > now. Got Segmentation fault. > > > > Thanks, > > > > Fande > > > > Program received signal SIGSEGV, Segmentation fault. > > 0x00002aaab5558b11 in > > > Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::CUPMDeviceInternal::initialize > > (this=0x1) at > > > /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:54 > > 54 PetscErrorCode CUPMDevice::CUPMDeviceInternal::initialize() > noexcept > > Missing separate debuginfos, use: debuginfo-install > > bzip2-libs-1.0.6-13.el7.x86_64 elfutils-libelf-0.176-5.el7.x86_64 > > elfutils-libs-0.176-5.el7.x86_64 glibc-2.17-325.el7_9.x86_64 > > libX11-1.6.7-4.el7_9.x86_64 libXau-1.0.8-2.1.el7.x86_64 > > libattr-2.4.46-13.el7.x86_64 libcap-2.22-11.el7.x86_64 > > libibmad-5.4.0.MLNX20190423.1d917ae-0.1.49224.x86_64 > > libibumad-43.1.1.MLNX20200211.078947f-0.1.49224.x86_64 > > libibverbs-41mlnx1-OFED.4.9.0.0.7.49224.x86_64 > > libmlx4-41mlnx1-OFED.4.7.3.0.3.49224.x86_64 > > libmlx5-41mlnx1-OFED.4.9.0.1.2.49224.x86_64 libnl3-3.2.28-4.el7.x86_64 > > librdmacm-41mlnx1-OFED.4.7.3.0.6.49224.x86_64 > > librxe-41mlnx1-OFED.4.4.2.4.6.49224.x86_64 libxcb-1.13-1.el7.x86_64 > > libxml2-2.9.1-6.el7_9.6.x86_64 numactl-libs-2.0.12-5.el7.x86_64 > > systemd-libs-219-78.el7_9.3.x86_64 xz-libs-5.2.2-1.el7.x86_64 > > zlib-1.2.7-19.el7_9.x86_64 > > (gdb) bt > > #0 0x00002aaab5558b11 in > > > Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::CUPMDeviceInternal::initialize > > (this=0x1) at > > > /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:54 > > #1 0x00002aaab5558db7 in > > Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::getDevice > > (this=this at entry=0x2aaab7f37b70 > > , device=0x115da00, id=-35, id at entry=-1) at > > > /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:344 > > #2 0x00002aaab55577de in PetscDeviceCreate (type=type at entry > =PETSC_DEVICE_CUDA, > > devid=devid at entry=-1, device=device at entry=0x2aaab7f37b48 > > ) at > > > /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:107 > > #3 0x00002aaab5557b3a in PetscDeviceInitializeDefaultDevice_Internal > > (type=type at entry=PETSC_DEVICE_CUDA, > defaultDeviceId=defaultDeviceId at entry=-1) > > at > > > /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:273 > > #4 0x00002aaab5557bf6 in PetscDeviceInitialize > > (type=type at entry=PETSC_DEVICE_CUDA) > > at > > > /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:234 > > #5 0x00002aaab5661fcd in VecCreate_SeqCUDA (V=0x115d150) at > > > /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/impls/seq/seqcuda/veccuda.c:244 > > #6 0x00002aaab5649b40 in VecSetType (vec=vec at entry=0x115d150, > > method=method at entry=0x2aaab70b45b8 "seqcuda") at > > > 
/home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vecreg.c:93 > > #7 0x00002aaab579c33f in VecCreate_CUDA (v=0x115d150) at > > > /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/impls/mpi/mpicuda/ > > mpicuda.cu:214 > > #8 0x00002aaab5649b40 in VecSetType (vec=vec at entry=0x115d150, > > method=method at entry=0x7fffffff9260 "cuda") at > > > /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vecreg.c:93 > > #9 0x00002aaab5648bf1 in VecSetTypeFromOptions_Private (vec=0x115d150, > > PetscOptionsObject=0x7fffffff9210) at > > > /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vector.c:1263 > > #10 VecSetFromOptions (vec=0x115d150) at > > > /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vector.c:1297 > > #11 0x00002aaab02ef227 in libMesh::PetscVector::init > > (this=0x11cd1a0, n=441, n_local=441, fast=false, ptype=libMesh::PARALLEL) > > at > > > /home/kongf/workhome/sawtooth/moosegpu/scripts/../libmesh/installed/include/libmesh/petsc_vector.h:693 > > > > On Thu, Jan 20, 2022 at 1:09 PM Fande Kong wrote: > > > >> Thanks, Jed, > >> > >> This worked! > >> > >> Fande > >> > >> On Wed, Jan 19, 2022 at 11:03 PM Jed Brown wrote: > >> > >>> Fande Kong writes: > >>> > >>> > On Wed, Jan 19, 2022 at 11:39 AM Jacob Faibussowitsch < > >>> jacob.fai at gmail.com> > >>> > wrote: > >>> > > >>> >> Are you running on login nodes or compute nodes (I can?t seem to > tell > >>> from > >>> >> the configure.log)? > >>> >> > >>> > > >>> > I was compiling codes on login nodes, and running codes on compute > >>> nodes. > >>> > Login nodes do not have GPUs, but compute nodes do have GPUs. > >>> > > >>> > Just to be clear, the same thing (code, machine) with PETSc-3.16.1 > >>> worked > >>> > perfectly. I have this trouble with PETSc-main. > >>> > >>> I assume you can > >>> > >>> export PETSC_OPTIONS='-device_enable lazy' > >>> > >>> and it'll work. > >>> > >>> I think this should be the default. The main complaint is that timing > the > >>> first GPU-using event isn't accurate if it includes initialization, > but I > >>> think this is mostly hypothetical because you can't trust any timing > that > >>> doesn't preload in some form and the first GPU-using event will almost > >>> always be something uninteresting so I think it will rarely lead to > >>> confusion. Meanwhile, eager initialization is viscerally disruptive for > >>> lots of people. > >>> > >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Jan 20 17:47:58 2022 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 20 Jan 2022 18:47:58 -0500 Subject: [petsc-users] Cannot eagerly initialize cuda, as doing so results in cuda error 35 (cudaErrorInsufficientDriver) : CUDA driver version is insufficient for CUDA runtime version In-Reply-To: References: <233D0F20-AD95-481B-B862-353D5789C556@gmail.com> <87czkn9c64.fsf@jedbrown.org> <875yqe7zip.fsf@jedbrown.org> Message-ID: On Thu, Jan 20, 2022 at 6:44 PM Fande Kong wrote: > Thanks, Jed > > On Thu, Jan 20, 2022 at 4:34 PM Jed Brown wrote: > >> You can't create CUDA or Kokkos Vecs if you're running on a node without >> a GPU. > > > I am running the code on compute nodes that do have GPUs. > If you are actually running on GPUs, why would you need lazy initialization? It would not break with GPUs present. Matt > With PETSc-3.16.1, I got good speedup by running GAMG on GPUs. That > might be a bug of PETSc-main. 
> > Thanks, > > Fande > > > > KSPSetUp 13 1.0 6.4400e-01 1.0 2.02e+09 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 5 0 0 0 0 5 0 0 0 3140 64630 15 1.05e+02 5 > 3.49e+01 100 > KSPSolve 1 1.0 1.0109e+00 1.0 3.49e+10 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 87 0 0 0 0 87 0 0 0 34522 69556 4 4.35e-03 1 > 2.38e-03 100 > KSPGMRESOrthog 142 1.0 1.2674e-01 1.0 1.06e+10 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 27 0 0 0 0 27 0 0 0 83755 87801 0 0.00e+00 0 > 0.00e+00 100 > SNESSolve 1 1.0 4.4402e+01 1.0 4.00e+10 1.0 0.0e+00 0.0e+00 > 0.0e+00 21100 0 0 0 21100 0 0 0 901 51365 57 1.10e+03 52 > 8.78e+02 100 > SNESSetUp 1 1.0 3.9101e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 > 0.00e+00 0 > SNESFunctionEval 2 1.0 1.7097e+01 1.0 1.60e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 8 0 0 0 0 8 0 0 0 0 1 0 0 0.00e+00 6 > 1.92e+02 0 > SNESJacobianEval 1 1.0 1.6213e+01 1.0 2.80e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 8 0 0 0 0 8 0 0 0 0 2 0 0 0.00e+00 1 > 3.20e+01 0 > SNESLineSearch 1 1.0 8.5582e+00 1.0 1.24e+08 1.0 0.0e+00 0.0e+00 > 0.0e+00 4 0 0 0 0 4 0 0 0 0 14 64153 1 3.20e+01 3 > 9.61e+01 94 > PCGAMGGraph_AGG 5 1.0 3.0509e+00 1.0 8.19e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 1 0 0 0 0 1 0 0 0 0 27 0 5 3.49e+01 9 > 7.43e+01 0 > PCGAMGCoarse_AGG 5 1.0 3.8711e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 0 0 0.00e+00 0 > 0.00e+00 0 > PCGAMGProl_AGG 5 1.0 7.0748e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 > 0.00e+00 0 > PCGAMGPOpt_AGG 5 1.0 1.2904e+00 1.0 2.14e+09 1.0 0.0e+00 0.0e+00 > 0.0e+00 1 5 0 0 0 1 5 0 0 0 1661 29807 26 7.15e+02 20 > 2.90e+02 99 > GAMG: createProl 5 1.0 8.9489e+00 1.0 2.22e+09 1.0 0.0e+00 0.0e+00 > 0.0e+00 4 6 0 0 0 4 6 0 0 0 249 29666 31 7.50e+02 29 > 3.64e+02 96 > Graph 10 1.0 3.0478e+00 1.0 8.19e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 1 0 0 0 0 1 0 0 0 0 27 0 5 3.49e+01 9 > 7.43e+01 0 > MIS/Agg 5 1.0 4.1290e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 > 0.00e+00 0 > SA: col data 5 1.0 1.9127e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 > 0.00e+00 0 > SA: frmProl0 5 1.0 6.2662e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 > 0.00e+00 0 > SA: smooth 5 1.0 4.9595e-01 1.0 1.21e+08 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 244 2709 15 1.97e+02 15 > 2.55e+02 90 > GAMG: partLevel 5 1.0 4.7330e-01 1.0 6.98e+08 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 2 0 0 0 0 2 0 0 0 1475 4120 5 1.78e+02 10 > 2.55e+02 100 > PCGAMG Squ l00 1 1.0 2.6027e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 0 0 0.00e+00 0 > 0.00e+00 0 > PCGAMG Gal l00 1 1.0 3.8406e-01 1.0 5.48e+08 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 1 0 0 0 0 1 0 0 0 1426 4270 1 1.48e+02 2 > 2.11e+02 100 > PCGAMG Opt l00 1 1.0 2.4932e-01 1.0 7.20e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 289 2653 1 6.41e+01 1 > 1.13e+02 100 > PCGAMG Gal l01 1 1.0 6.6279e-02 1.0 1.09e+08 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 1645 3851 1 2.40e+01 2 > 3.64e+01 100 > PCGAMG Opt l01 1 1.0 2.9544e-02 1.0 7.15e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 242 1671 1 4.84e+00 1 > 1.23e+01 100 > PCGAMG Gal l02 1 1.0 1.8874e-02 1.0 3.72e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 1974 3636 1 5.04e+00 2 > 6.58e+00 100 > PCGAMG Opt l02 1 1.0 7.4353e-03 1.0 2.40e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 323 1457 1 7.71e-01 1 > 2.30e+00 100 > PCGAMG Gal l03 1 1.0 2.8479e-03 1.0 4.10e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 1440 2266 1 4.44e-01 2 
> 5.51e-01 100 > PCGAMG Opt l03 1 1.0 8.2684e-04 1.0 2.80e+05 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 339 1667 1 6.72e-02 1 > 2.03e-01 100 > PCGAMG Gal l04 1 1.0 1.2238e-03 1.0 2.09e+05 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 170 244 1 2.05e-02 2 > 2.53e-02 100 > PCGAMG Opt l04 1 1.0 4.1008e-04 1.0 1.77e+04 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 43 165 1 4.49e-03 1 > 1.19e-02 100 > PCSetUp 2 1.0 9.9632e+00 1.0 4.95e+09 1.0 0.0e+00 0.0e+00 > 0.0e+00 5 12 0 0 0 5 12 0 0 0 496 17826 55 1.03e+03 45 > 6.54e+02 98 > PCSetUpOnBlocks 44 1.0 9.9087e-04 1.0 2.88e+03 1.0 > > > > > > >> The point of lazy initialization is to make it possible to run a solve >> that doesn't use a GPU in PETSC_ARCH that supports GPUs, regardless of >> whether a GPU is actually present. >> >> Fande Kong writes: >> >> > I spoke too soon. It seems that we have trouble creating cuda/kokkos >> vecs >> > now. Got Segmentation fault. >> > >> > Thanks, >> > >> > Fande >> > >> > Program received signal SIGSEGV, Segmentation fault. >> > 0x00002aaab5558b11 in >> > >> Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::CUPMDeviceInternal::initialize >> > (this=0x1) at >> > >> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:54 >> > 54 PetscErrorCode CUPMDevice::CUPMDeviceInternal::initialize() >> noexcept >> > Missing separate debuginfos, use: debuginfo-install >> > bzip2-libs-1.0.6-13.el7.x86_64 elfutils-libelf-0.176-5.el7.x86_64 >> > elfutils-libs-0.176-5.el7.x86_64 glibc-2.17-325.el7_9.x86_64 >> > libX11-1.6.7-4.el7_9.x86_64 libXau-1.0.8-2.1.el7.x86_64 >> > libattr-2.4.46-13.el7.x86_64 libcap-2.22-11.el7.x86_64 >> > libibmad-5.4.0.MLNX20190423.1d917ae-0.1.49224.x86_64 >> > libibumad-43.1.1.MLNX20200211.078947f-0.1.49224.x86_64 >> > libibverbs-41mlnx1-OFED.4.9.0.0.7.49224.x86_64 >> > libmlx4-41mlnx1-OFED.4.7.3.0.3.49224.x86_64 >> > libmlx5-41mlnx1-OFED.4.9.0.1.2.49224.x86_64 libnl3-3.2.28-4.el7.x86_64 >> > librdmacm-41mlnx1-OFED.4.7.3.0.6.49224.x86_64 >> > librxe-41mlnx1-OFED.4.4.2.4.6.49224.x86_64 libxcb-1.13-1.el7.x86_64 >> > libxml2-2.9.1-6.el7_9.6.x86_64 numactl-libs-2.0.12-5.el7.x86_64 >> > systemd-libs-219-78.el7_9.3.x86_64 xz-libs-5.2.2-1.el7.x86_64 >> > zlib-1.2.7-19.el7_9.x86_64 >> > (gdb) bt >> > #0 0x00002aaab5558b11 in >> > >> Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::CUPMDeviceInternal::initialize >> > (this=0x1) at >> > >> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:54 >> > #1 0x00002aaab5558db7 in >> > Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::getDevice >> > (this=this at entry=0x2aaab7f37b70 >> > , device=0x115da00, id=-35, id at entry=-1) at >> > >> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:344 >> > #2 0x00002aaab55577de in PetscDeviceCreate (type=type at entry >> =PETSC_DEVICE_CUDA, >> > devid=devid at entry=-1, device=device at entry=0x2aaab7f37b48 >> > ) at >> > >> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:107 >> > #3 0x00002aaab5557b3a in PetscDeviceInitializeDefaultDevice_Internal >> > (type=type at entry=PETSC_DEVICE_CUDA, >> defaultDeviceId=defaultDeviceId at entry=-1) >> > at >> > >> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:273 >> > #4 0x00002aaab5557bf6 in PetscDeviceInitialize >> > (type=type at entry=PETSC_DEVICE_CUDA) >> > at >> > >> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:234 >> > #5 
0x00002aaab5661fcd in VecCreate_SeqCUDA (V=0x115d150) at >> > >> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/impls/seq/seqcuda/veccuda.c:244 >> > #6 0x00002aaab5649b40 in VecSetType (vec=vec at entry=0x115d150, >> > method=method at entry=0x2aaab70b45b8 "seqcuda") at >> > >> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vecreg.c:93 >> > #7 0x00002aaab579c33f in VecCreate_CUDA (v=0x115d150) at >> > >> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/impls/mpi/mpicuda/ >> > mpicuda.cu:214 >> > #8 0x00002aaab5649b40 in VecSetType (vec=vec at entry=0x115d150, >> > method=method at entry=0x7fffffff9260 "cuda") at >> > >> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vecreg.c:93 >> > #9 0x00002aaab5648bf1 in VecSetTypeFromOptions_Private (vec=0x115d150, >> > PetscOptionsObject=0x7fffffff9210) at >> > >> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vector.c:1263 >> > #10 VecSetFromOptions (vec=0x115d150) at >> > >> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vector.c:1297 >> > #11 0x00002aaab02ef227 in libMesh::PetscVector::init >> > (this=0x11cd1a0, n=441, n_local=441, fast=false, >> ptype=libMesh::PARALLEL) >> > at >> > >> /home/kongf/workhome/sawtooth/moosegpu/scripts/../libmesh/installed/include/libmesh/petsc_vector.h:693 >> > >> > On Thu, Jan 20, 2022 at 1:09 PM Fande Kong wrote: >> > >> >> Thanks, Jed, >> >> >> >> This worked! >> >> >> >> Fande >> >> >> >> On Wed, Jan 19, 2022 at 11:03 PM Jed Brown wrote: >> >> >> >>> Fande Kong writes: >> >>> >> >>> > On Wed, Jan 19, 2022 at 11:39 AM Jacob Faibussowitsch < >> >>> jacob.fai at gmail.com> >> >>> > wrote: >> >>> > >> >>> >> Are you running on login nodes or compute nodes (I can?t seem to >> tell >> >>> from >> >>> >> the configure.log)? >> >>> >> >> >>> > >> >>> > I was compiling codes on login nodes, and running codes on compute >> >>> nodes. >> >>> > Login nodes do not have GPUs, but compute nodes do have GPUs. >> >>> > >> >>> > Just to be clear, the same thing (code, machine) with PETSc-3.16.1 >> >>> worked >> >>> > perfectly. I have this trouble with PETSc-main. >> >>> >> >>> I assume you can >> >>> >> >>> export PETSC_OPTIONS='-device_enable lazy' >> >>> >> >>> and it'll work. >> >>> >> >>> I think this should be the default. The main complaint is that timing >> the >> >>> first GPU-using event isn't accurate if it includes initialization, >> but I >> >>> think this is mostly hypothetical because you can't trust any timing >> that >> >>> doesn't preload in some form and the first GPU-using event will almost >> >>> always be something uninteresting so I think it will rarely lead to >> >>> confusion. Meanwhile, eager initialization is viscerally disruptive >> for >> >>> lots of people. >> >>> >> >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jacob.fai at gmail.com Thu Jan 20 19:25:28 2022 From: jacob.fai at gmail.com (Jacob Faibussowitsch) Date: Thu, 20 Jan 2022 19:25:28 -0600 Subject: [petsc-users] Cannot eagerly initialize cuda, as doing so results in cuda error 35 (cudaErrorInsufficientDriver) : CUDA driver version is insufficient for CUDA runtime version In-Reply-To: References: <233D0F20-AD95-481B-B862-353D5789C556@gmail.com> <87czkn9c64.fsf@jedbrown.org> <875yqe7zip.fsf@jedbrown.org> Message-ID: Segfault is caused by the following check at src/sys/objects/device/impls/cupm/cupmdevice.cxx:349 being a PetscUnlikelyDebug() rather than just PetscUnlikely(): ``` if (PetscUnlikelyDebug(_defaultDevice < 0)) { // _defaultDevice is in fact < 0 here and uncaught ``` To clarify: ?lazy? initialization is not that lazy after all, it still does some 50% of the initialization that ?eager? initialization does. It stops short initializing the CUDA runtime, checking CUDA aware MPI, gathering device data, and initializing cublas and friends. Lazy also importantly swallows any errors that crop up during initialization, storing the resulting error code for later (specifically _defaultDevice = -init_error_value;). So whether you initialize lazily or eagerly makes no difference here, as _defaultDevice will always contain -35. The bigger question is why cudaGetDeviceCount() is returning cudaErrorInsufficientDriver. Can you compile and run ``` #include int main() { int ndev; return cudaGetDeviceCount(&ndev): } ``` Then show the value of "echo $??? Best regards, Jacob Faibussowitsch (Jacob Fai - booss - oh - vitch) > On Jan 20, 2022, at 17:47, Matthew Knepley wrote: > > On Thu, Jan 20, 2022 at 6:44 PM Fande Kong > wrote: > Thanks, Jed > > On Thu, Jan 20, 2022 at 4:34 PM Jed Brown > wrote: > You can't create CUDA or Kokkos Vecs if you're running on a node without a GPU. > > I am running the code on compute nodes that do have GPUs. > > If you are actually running on GPUs, why would you need lazy initialization? It would not break with GPUs present. > > Matt > > With PETSc-3.16.1, I got good speedup by running GAMG on GPUs. That might be a bug of PETSc-main. 
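For reference, a self-contained version of the device-count check suggested above; the cuda_runtime.h include and the printed error string are additions to the original one-liner, so treat this as a sketch:

```c
#include <cuda_runtime.h>
#include <stdio.h>

int main(void)
{
  int         ndev = 0;
  cudaError_t err  = cudaGetDeviceCount(&ndev);

  /* cudaErrorInsufficientDriver is error code 35 */
  printf("device count = %d, cudaGetDeviceCount returned %d (%s)\n",
         ndev, (int)err, cudaGetErrorString(err));
  return (int)err; /* so "echo $?" reports the CUDA error code */
}
```

Building it with nvcc (or a host compiler linked against libcudart) and running it directly on a compute node separates a driver/runtime mismatch from anything PETSc is doing.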
> > Thanks, > > Fande > > > > KSPSetUp 13 1.0 6.4400e-01 1.0 2.02e+09 1.0 0.0e+00 0.0e+00 0.0e+00 0 5 0 0 0 0 5 0 0 0 3140 64630 15 1.05e+02 5 3.49e+01 100 > KSPSolve 1 1.0 1.0109e+00 1.0 3.49e+10 1.0 0.0e+00 0.0e+00 0.0e+00 0 87 0 0 0 0 87 0 0 0 34522 69556 4 4.35e-03 1 2.38e-03 100 > KSPGMRESOrthog 142 1.0 1.2674e-01 1.0 1.06e+10 1.0 0.0e+00 0.0e+00 0.0e+00 0 27 0 0 0 0 27 0 0 0 83755 87801 0 0.00e+00 0 0.00e+00 100 > SNESSolve 1 1.0 4.4402e+01 1.0 4.00e+10 1.0 0.0e+00 0.0e+00 0.0e+00 21100 0 0 0 21100 0 0 0 901 51365 57 1.10e+03 52 8.78e+02 100 > SNESSetUp 1 1.0 3.9101e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 > SNESFunctionEval 2 1.0 1.7097e+01 1.0 1.60e+07 1.0 0.0e+00 0.0e+00 0.0e+00 8 0 0 0 0 8 0 0 0 0 1 0 0 0.00e+00 6 1.92e+02 0 > SNESJacobianEval 1 1.0 1.6213e+01 1.0 2.80e+07 1.0 0.0e+00 0.0e+00 0.0e+00 8 0 0 0 0 8 0 0 0 0 2 0 0 0.00e+00 1 3.20e+01 0 > SNESLineSearch 1 1.0 8.5582e+00 1.0 1.24e+08 1.0 0.0e+00 0.0e+00 0.0e+00 4 0 0 0 0 4 0 0 0 0 14 64153 1 3.20e+01 3 9.61e+01 94 > PCGAMGGraph_AGG 5 1.0 3.0509e+00 1.0 8.19e+07 1.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 27 0 5 3.49e+01 9 7.43e+01 0 > PCGAMGCoarse_AGG 5 1.0 3.8711e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 > PCGAMGProl_AGG 5 1.0 7.0748e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 > PCGAMGPOpt_AGG 5 1.0 1.2904e+00 1.0 2.14e+09 1.0 0.0e+00 0.0e+00 0.0e+00 1 5 0 0 0 1 5 0 0 0 1661 29807 26 7.15e+02 20 2.90e+02 99 > GAMG: createProl 5 1.0 8.9489e+00 1.0 2.22e+09 1.0 0.0e+00 0.0e+00 0.0e+00 4 6 0 0 0 4 6 0 0 0 249 29666 31 7.50e+02 29 3.64e+02 96 > Graph 10 1.0 3.0478e+00 1.0 8.19e+07 1.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 27 0 5 3.49e+01 9 7.43e+01 0 > MIS/Agg 5 1.0 4.1290e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 > SA: col data 5 1.0 1.9127e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 > SA: frmProl0 5 1.0 6.2662e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 > SA: smooth 5 1.0 4.9595e-01 1.0 1.21e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 244 2709 15 1.97e+02 15 2.55e+02 90 > GAMG: partLevel 5 1.0 4.7330e-01 1.0 6.98e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 1475 4120 5 1.78e+02 10 2.55e+02 100 > PCGAMG Squ l00 1 1.0 2.6027e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 > PCGAMG Gal l00 1 1.0 3.8406e-01 1.0 5.48e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 1426 4270 1 1.48e+02 2 2.11e+02 100 > PCGAMG Opt l00 1 1.0 2.4932e-01 1.0 7.20e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 289 2653 1 6.41e+01 1 1.13e+02 100 > PCGAMG Gal l01 1 1.0 6.6279e-02 1.0 1.09e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1645 3851 1 2.40e+01 2 3.64e+01 100 > PCGAMG Opt l01 1 1.0 2.9544e-02 1.0 7.15e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 242 1671 1 4.84e+00 1 1.23e+01 100 > PCGAMG Gal l02 1 1.0 1.8874e-02 1.0 3.72e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1974 3636 1 5.04e+00 2 6.58e+00 100 > PCGAMG Opt l02 1 1.0 7.4353e-03 1.0 2.40e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 323 1457 1 7.71e-01 1 2.30e+00 100 > PCGAMG Gal l03 1 1.0 2.8479e-03 1.0 4.10e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1440 2266 1 4.44e-01 2 5.51e-01 100 > PCGAMG Opt l03 1 1.0 8.2684e-04 1.0 2.80e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 
0 339 1667 1 6.72e-02 1 2.03e-01 100 > PCGAMG Gal l04 1 1.0 1.2238e-03 1.0 2.09e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 170 244 1 2.05e-02 2 2.53e-02 100 > PCGAMG Opt l04 1 1.0 4.1008e-04 1.0 1.77e+04 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 43 165 1 4.49e-03 1 1.19e-02 100 > PCSetUp 2 1.0 9.9632e+00 1.0 4.95e+09 1.0 0.0e+00 0.0e+00 0.0e+00 5 12 0 0 0 5 12 0 0 0 496 17826 55 1.03e+03 45 6.54e+02 98 > PCSetUpOnBlocks 44 1.0 9.9087e-04 1.0 2.88e+03 1.0 > > > > > > The point of lazy initialization is to make it possible to run a solve that doesn't use a GPU in PETSC_ARCH that supports GPUs, regardless of whether a GPU is actually present. > > Fande Kong > writes: > > > I spoke too soon. It seems that we have trouble creating cuda/kokkos vecs > > now. Got Segmentation fault. > > > > Thanks, > > > > Fande > > > > Program received signal SIGSEGV, Segmentation fault. > > 0x00002aaab5558b11 in > > Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::CUPMDeviceInternal::initialize > > (this=0x1) at > > /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:54 > > 54 PetscErrorCode CUPMDevice::CUPMDeviceInternal::initialize() noexcept > > Missing separate debuginfos, use: debuginfo-install > > bzip2-libs-1.0.6-13.el7.x86_64 elfutils-libelf-0.176-5.el7.x86_64 > > elfutils-libs-0.176-5.el7.x86_64 glibc-2.17-325.el7_9.x86_64 > > libX11-1.6.7-4.el7_9.x86_64 libXau-1.0.8-2.1.el7.x86_64 > > libattr-2.4.46-13.el7.x86_64 libcap-2.22-11.el7.x86_64 > > libibmad-5.4.0.MLNX20190423.1d917ae-0.1.49224.x86_64 > > libibumad-43.1.1.MLNX20200211.078947f-0.1.49224.x86_64 > > libibverbs-41mlnx1-OFED.4.9.0.0.7.49224.x86_64 > > libmlx4-41mlnx1-OFED.4.7.3.0.3.49224.x86_64 > > libmlx5-41mlnx1-OFED.4.9.0.1.2.49224.x86_64 libnl3-3.2.28-4.el7.x86_64 > > librdmacm-41mlnx1-OFED.4.7.3.0.6.49224.x86_64 > > librxe-41mlnx1-OFED.4.4.2.4.6.49224.x86_64 libxcb-1.13-1.el7.x86_64 > > libxml2-2.9.1-6.el7_9.6.x86_64 numactl-libs-2.0.12-5.el7.x86_64 > > systemd-libs-219-78.el7_9.3.x86_64 xz-libs-5.2.2-1.el7.x86_64 > > zlib-1.2.7-19.el7_9.x86_64 > > (gdb) bt > > #0 0x00002aaab5558b11 in > > Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::CUPMDeviceInternal::initialize > > (this=0x1) at > > /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:54 > > #1 0x00002aaab5558db7 in > > Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::getDevice > > (this=this at entry=0x2aaab7f37b70 > > , device=0x115da00, id=-35, id at entry=-1) at > > /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:344 > > #2 0x00002aaab55577de in PetscDeviceCreate (type=type at entry=PETSC_DEVICE_CUDA, > > devid=devid at entry=-1, device=device at entry=0x2aaab7f37b48 > > ) at > > /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:107 > > #3 0x00002aaab5557b3a in PetscDeviceInitializeDefaultDevice_Internal > > (type=type at entry=PETSC_DEVICE_CUDA, defaultDeviceId=defaultDeviceId at entry=-1) > > at > > /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:273 > > #4 0x00002aaab5557bf6 in PetscDeviceInitialize > > (type=type at entry=PETSC_DEVICE_CUDA) > > at > > /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:234 > > #5 0x00002aaab5661fcd in VecCreate_SeqCUDA (V=0x115d150) at > > /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/impls/seq/seqcuda/veccuda.c:244 > > #6 0x00002aaab5649b40 in VecSetType (vec=vec at entry=0x115d150, > > 
method=method at entry=0x2aaab70b45b8 "seqcuda") at > > /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vecreg.c:93 > > #7 0x00002aaab579c33f in VecCreate_CUDA (v=0x115d150) at > > /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/impls/mpi/mpicuda/ > > mpicuda.cu:214 > > #8 0x00002aaab5649b40 in VecSetType (vec=vec at entry=0x115d150, > > method=method at entry=0x7fffffff9260 "cuda") at > > /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vecreg.c:93 > > #9 0x00002aaab5648bf1 in VecSetTypeFromOptions_Private (vec=0x115d150, > > PetscOptionsObject=0x7fffffff9210) at > > /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vector.c:1263 > > #10 VecSetFromOptions (vec=0x115d150) at > > /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vector.c:1297 > > #11 0x00002aaab02ef227 in libMesh::PetscVector::init > > (this=0x11cd1a0, n=441, n_local=441, fast=false, ptype=libMesh::PARALLEL) > > at > > /home/kongf/workhome/sawtooth/moosegpu/scripts/../libmesh/installed/include/libmesh/petsc_vector.h:693 > > > > On Thu, Jan 20, 2022 at 1:09 PM Fande Kong > wrote: > > > >> Thanks, Jed, > >> > >> This worked! > >> > >> Fande > >> > >> On Wed, Jan 19, 2022 at 11:03 PM Jed Brown > wrote: > >> > >>> Fande Kong > writes: > >>> > >>> > On Wed, Jan 19, 2022 at 11:39 AM Jacob Faibussowitsch < > >>> jacob.fai at gmail.com > > >>> > wrote: > >>> > > >>> >> Are you running on login nodes or compute nodes (I can?t seem to tell > >>> from > >>> >> the configure.log)? > >>> >> > >>> > > >>> > I was compiling codes on login nodes, and running codes on compute > >>> nodes. > >>> > Login nodes do not have GPUs, but compute nodes do have GPUs. > >>> > > >>> > Just to be clear, the same thing (code, machine) with PETSc-3.16.1 > >>> worked > >>> > perfectly. I have this trouble with PETSc-main. > >>> > >>> I assume you can > >>> > >>> export PETSC_OPTIONS='-device_enable lazy' > >>> > >>> and it'll work. > >>> > >>> I think this should be the default. The main complaint is that timing the > >>> first GPU-using event isn't accurate if it includes initialization, but I > >>> think this is mostly hypothetical because you can't trust any timing that > >>> doesn't preload in some form and the first GPU-using event will almost > >>> always be something uninteresting so I think it will rarely lead to > >>> confusion. Meanwhile, eager initialization is viscerally disruptive for > >>> lots of people. > >>> > >> > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Thu Jan 20 20:29:51 2022 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Thu, 20 Jan 2022 20:29:51 -0600 Subject: [petsc-users] Cannot eagerly initialize cuda, as doing so results in cuda error 35 (cudaErrorInsufficientDriver) : CUDA driver version is insufficient for CUDA runtime version In-Reply-To: References: <233D0F20-AD95-481B-B862-353D5789C556@gmail.com> <87czkn9c64.fsf@jedbrown.org> <875yqe7zip.fsf@jedbrown.org> Message-ID: I don't see values using PetscUnlikely() today. 
--Junchao Zhang On Thu, Jan 20, 2022 at 7:26 PM Jacob Faibussowitsch wrote: > Segfault is caused by the following check at > src/sys/objects/device/impls/cupm/cupmdevice.cxx:349 being a > PetscUnlikelyDebug() rather than just PetscUnlikely(): > > ``` > if (PetscUnlikelyDebug(_defaultDevice < 0)) { // _defaultDevice is in fact > < 0 here and uncaught > ``` > > To clarify: > > ?lazy? initialization is not that lazy after all, it still does some 50% > of the initialization that ?eager? initialization does. It stops short > initializing the CUDA runtime, checking CUDA aware MPI, gathering device > data, and initializing cublas and friends. Lazy also importantly swallows > any errors that crop up during initialization, storing the resulting error > code for later (specifically _defaultDevice = -init_error_value;). > > So whether you initialize lazily or eagerly makes no difference here, as > _defaultDevice will always contain -35. > > The bigger question is why cudaGetDeviceCount() is returning > cudaErrorInsufficientDriver. Can you compile and run > > ``` > #include > > int main() > { > int ndev; > return cudaGetDeviceCount(&ndev): > } > ``` > > Then show the value of "echo $??? > > Best regards, > > Jacob Faibussowitsch > (Jacob Fai - booss - oh - vitch) > > On Jan 20, 2022, at 17:47, Matthew Knepley wrote: > > On Thu, Jan 20, 2022 at 6:44 PM Fande Kong wrote: > >> Thanks, Jed >> >> On Thu, Jan 20, 2022 at 4:34 PM Jed Brown wrote: >> >>> You can't create CUDA or Kokkos Vecs if you're running on a node without >>> a GPU. >> >> >> I am running the code on compute nodes that do have GPUs. >> > > If you are actually running on GPUs, why would you need lazy > initialization? It would not break with GPUs present. > > Matt > > >> With PETSc-3.16.1, I got good speedup by running GAMG on GPUs. That >> might be a bug of PETSc-main. 
>> >> Thanks, >> >> Fande >> >> >> >> KSPSetUp 13 1.0 6.4400e-01 1.0 2.02e+09 1.0 0.0e+00 0.0e+00 >> 0.0e+00 0 5 0 0 0 0 5 0 0 0 3140 64630 15 1.05e+02 5 >> 3.49e+01 100 >> KSPSolve 1 1.0 1.0109e+00 1.0 3.49e+10 1.0 0.0e+00 0.0e+00 >> 0.0e+00 0 87 0 0 0 0 87 0 0 0 34522 69556 4 4.35e-03 1 >> 2.38e-03 100 >> KSPGMRESOrthog 142 1.0 1.2674e-01 1.0 1.06e+10 1.0 0.0e+00 0.0e+00 >> 0.0e+00 0 27 0 0 0 0 27 0 0 0 83755 87801 0 0.00e+00 0 >> 0.00e+00 100 >> SNESSolve 1 1.0 4.4402e+01 1.0 4.00e+10 1.0 0.0e+00 0.0e+00 >> 0.0e+00 21100 0 0 0 21100 0 0 0 901 51365 57 1.10e+03 52 >> 8.78e+02 100 >> SNESSetUp 1 1.0 3.9101e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 >> 0.00e+00 0 >> SNESFunctionEval 2 1.0 1.7097e+01 1.0 1.60e+07 1.0 0.0e+00 0.0e+00 >> 0.0e+00 8 0 0 0 0 8 0 0 0 0 1 0 0 0.00e+00 6 >> 1.92e+02 0 >> SNESJacobianEval 1 1.0 1.6213e+01 1.0 2.80e+07 1.0 0.0e+00 0.0e+00 >> 0.0e+00 8 0 0 0 0 8 0 0 0 0 2 0 0 0.00e+00 1 >> 3.20e+01 0 >> SNESLineSearch 1 1.0 8.5582e+00 1.0 1.24e+08 1.0 0.0e+00 0.0e+00 >> 0.0e+00 4 0 0 0 0 4 0 0 0 0 14 64153 1 3.20e+01 3 >> 9.61e+01 94 >> PCGAMGGraph_AGG 5 1.0 3.0509e+00 1.0 8.19e+07 1.0 0.0e+00 0.0e+00 >> 0.0e+00 1 0 0 0 0 1 0 0 0 0 27 0 5 3.49e+01 9 >> 7.43e+01 0 >> PCGAMGCoarse_AGG 5 1.0 3.8711e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 0 0 0.00e+00 0 >> 0.00e+00 0 >> PCGAMGProl_AGG 5 1.0 7.0748e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 >> 0.00e+00 0 >> PCGAMGPOpt_AGG 5 1.0 1.2904e+00 1.0 2.14e+09 1.0 0.0e+00 0.0e+00 >> 0.0e+00 1 5 0 0 0 1 5 0 0 0 1661 29807 26 7.15e+02 20 >> 2.90e+02 99 >> GAMG: createProl 5 1.0 8.9489e+00 1.0 2.22e+09 1.0 0.0e+00 0.0e+00 >> 0.0e+00 4 6 0 0 0 4 6 0 0 0 249 29666 31 7.50e+02 29 >> 3.64e+02 96 >> Graph 10 1.0 3.0478e+00 1.0 8.19e+07 1.0 0.0e+00 0.0e+00 >> 0.0e+00 1 0 0 0 0 1 0 0 0 0 27 0 5 3.49e+01 9 >> 7.43e+01 0 >> MIS/Agg 5 1.0 4.1290e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 >> 0.00e+00 0 >> SA: col data 5 1.0 1.9127e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 >> 0.00e+00 0 >> SA: frmProl0 5 1.0 6.2662e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 >> 0.00e+00 0 >> SA: smooth 5 1.0 4.9595e-01 1.0 1.21e+08 1.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 244 2709 15 1.97e+02 15 >> 2.55e+02 90 >> GAMG: partLevel 5 1.0 4.7330e-01 1.0 6.98e+08 1.0 0.0e+00 0.0e+00 >> 0.0e+00 0 2 0 0 0 0 2 0 0 0 1475 4120 5 1.78e+02 10 >> 2.55e+02 100 >> PCGAMG Squ l00 1 1.0 2.6027e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 0 0 0.00e+00 0 >> 0.00e+00 0 >> PCGAMG Gal l00 1 1.0 3.8406e-01 1.0 5.48e+08 1.0 0.0e+00 0.0e+00 >> 0.0e+00 0 1 0 0 0 0 1 0 0 0 1426 4270 1 1.48e+02 2 >> 2.11e+02 100 >> PCGAMG Opt l00 1 1.0 2.4932e-01 1.0 7.20e+07 1.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 289 2653 1 6.41e+01 1 >> 1.13e+02 100 >> PCGAMG Gal l01 1 1.0 6.6279e-02 1.0 1.09e+08 1.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 1645 3851 1 2.40e+01 2 >> 3.64e+01 100 >> PCGAMG Opt l01 1 1.0 2.9544e-02 1.0 7.15e+06 1.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 242 1671 1 4.84e+00 1 >> 1.23e+01 100 >> PCGAMG Gal l02 1 1.0 1.8874e-02 1.0 3.72e+07 1.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 1974 3636 1 5.04e+00 2 >> 6.58e+00 100 >> PCGAMG Opt l02 1 1.0 7.4353e-03 1.0 2.40e+06 1.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 323 1457 1 7.71e-01 1 >> 2.30e+00 100 >> PCGAMG Gal l03 1 1.0 2.8479e-03 
1.0 4.10e+06 1.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 1440 2266 1 4.44e-01 2 >> 5.51e-01 100 >> PCGAMG Opt l03 1 1.0 8.2684e-04 1.0 2.80e+05 1.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 339 1667 1 6.72e-02 1 >> 2.03e-01 100 >> PCGAMG Gal l04 1 1.0 1.2238e-03 1.0 2.09e+05 1.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 170 244 1 2.05e-02 2 >> 2.53e-02 100 >> PCGAMG Opt l04 1 1.0 4.1008e-04 1.0 1.77e+04 1.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 43 165 1 4.49e-03 1 >> 1.19e-02 100 >> PCSetUp 2 1.0 9.9632e+00 1.0 4.95e+09 1.0 0.0e+00 0.0e+00 >> 0.0e+00 5 12 0 0 0 5 12 0 0 0 496 17826 55 1.03e+03 45 >> 6.54e+02 98 >> PCSetUpOnBlocks 44 1.0 9.9087e-04 1.0 2.88e+03 1.0 >> >> >> >> >> >> >>> The point of lazy initialization is to make it possible to run a solve >>> that doesn't use a GPU in PETSC_ARCH that supports GPUs, regardless of >>> whether a GPU is actually present. >>> >>> Fande Kong writes: >>> >>> > I spoke too soon. It seems that we have trouble creating cuda/kokkos >>> vecs >>> > now. Got Segmentation fault. >>> > >>> > Thanks, >>> > >>> > Fande >>> > >>> > Program received signal SIGSEGV, Segmentation fault. >>> > 0x00002aaab5558b11 in >>> > >>> Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::CUPMDeviceInternal::initialize >>> > (this=0x1) at >>> > >>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:54 >>> > 54 PetscErrorCode CUPMDevice::CUPMDeviceInternal::initialize() >>> noexcept >>> > Missing separate debuginfos, use: debuginfo-install >>> > bzip2-libs-1.0.6-13.el7.x86_64 elfutils-libelf-0.176-5.el7.x86_64 >>> > elfutils-libs-0.176-5.el7.x86_64 glibc-2.17-325.el7_9.x86_64 >>> > libX11-1.6.7-4.el7_9.x86_64 libXau-1.0.8-2.1.el7.x86_64 >>> > libattr-2.4.46-13.el7.x86_64 libcap-2.22-11.el7.x86_64 >>> > libibmad-5.4.0.MLNX20190423.1d917ae-0.1.49224.x86_64 >>> > libibumad-43.1.1.MLNX20200211.078947f-0.1.49224.x86_64 >>> > libibverbs-41mlnx1-OFED.4.9.0.0.7.49224.x86_64 >>> > libmlx4-41mlnx1-OFED.4.7.3.0.3.49224.x86_64 >>> > libmlx5-41mlnx1-OFED.4.9.0.1.2.49224.x86_64 libnl3-3.2.28-4.el7.x86_64 >>> > librdmacm-41mlnx1-OFED.4.7.3.0.6.49224.x86_64 >>> > librxe-41mlnx1-OFED.4.4.2.4.6.49224.x86_64 libxcb-1.13-1.el7.x86_64 >>> > libxml2-2.9.1-6.el7_9.6.x86_64 numactl-libs-2.0.12-5.el7.x86_64 >>> > systemd-libs-219-78.el7_9.3.x86_64 xz-libs-5.2.2-1.el7.x86_64 >>> > zlib-1.2.7-19.el7_9.x86_64 >>> > (gdb) bt >>> > #0 0x00002aaab5558b11 in >>> > >>> Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::CUPMDeviceInternal::initialize >>> > (this=0x1) at >>> > >>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:54 >>> > #1 0x00002aaab5558db7 in >>> > Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::getDevice >>> > (this=this at entry=0x2aaab7f37b70 >>> > , device=0x115da00, id=-35, id at entry=-1) at >>> > >>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:344 >>> > #2 0x00002aaab55577de in PetscDeviceCreate (type=type at entry >>> =PETSC_DEVICE_CUDA, >>> > devid=devid at entry=-1, device=device at entry=0x2aaab7f37b48 >>> > ) at >>> > >>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:107 >>> > #3 0x00002aaab5557b3a in PetscDeviceInitializeDefaultDevice_Internal >>> > (type=type at entry=PETSC_DEVICE_CUDA, >>> defaultDeviceId=defaultDeviceId at entry=-1) >>> > at >>> > >>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:273 >>> > #4 0x00002aaab5557bf6 in 
PetscDeviceInitialize >>> > (type=type at entry=PETSC_DEVICE_CUDA) >>> > at >>> > >>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:234 >>> > #5 0x00002aaab5661fcd in VecCreate_SeqCUDA (V=0x115d150) at >>> > >>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/impls/seq/seqcuda/veccuda.c:244 >>> > #6 0x00002aaab5649b40 in VecSetType (vec=vec at entry=0x115d150, >>> > method=method at entry=0x2aaab70b45b8 "seqcuda") at >>> > >>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vecreg.c:93 >>> > #7 0x00002aaab579c33f in VecCreate_CUDA (v=0x115d150) at >>> > >>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/impls/mpi/mpicuda/ >>> > mpicuda.cu:214 >>> > #8 0x00002aaab5649b40 in VecSetType (vec=vec at entry=0x115d150, >>> > method=method at entry=0x7fffffff9260 "cuda") at >>> > >>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vecreg.c:93 >>> > #9 0x00002aaab5648bf1 in VecSetTypeFromOptions_Private (vec=0x115d150, >>> > PetscOptionsObject=0x7fffffff9210) at >>> > >>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vector.c:1263 >>> > #10 VecSetFromOptions (vec=0x115d150) at >>> > >>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vector.c:1297 >>> > #11 0x00002aaab02ef227 in libMesh::PetscVector::init >>> > (this=0x11cd1a0, n=441, n_local=441, fast=false, >>> ptype=libMesh::PARALLEL) >>> > at >>> > >>> /home/kongf/workhome/sawtooth/moosegpu/scripts/../libmesh/installed/include/libmesh/petsc_vector.h:693 >>> > >>> > On Thu, Jan 20, 2022 at 1:09 PM Fande Kong >>> wrote: >>> > >>> >> Thanks, Jed, >>> >> >>> >> This worked! >>> >> >>> >> Fande >>> >> >>> >> On Wed, Jan 19, 2022 at 11:03 PM Jed Brown wrote: >>> >> >>> >>> Fande Kong writes: >>> >>> >>> >>> > On Wed, Jan 19, 2022 at 11:39 AM Jacob Faibussowitsch < >>> >>> jacob.fai at gmail.com> >>> >>> > wrote: >>> >>> > >>> >>> >> Are you running on login nodes or compute nodes (I can?t seem to >>> tell >>> >>> from >>> >>> >> the configure.log)? >>> >>> >> >>> >>> > >>> >>> > I was compiling codes on login nodes, and running codes on compute >>> >>> nodes. >>> >>> > Login nodes do not have GPUs, but compute nodes do have GPUs. >>> >>> > >>> >>> > Just to be clear, the same thing (code, machine) with PETSc-3.16.1 >>> >>> worked >>> >>> > perfectly. I have this trouble with PETSc-main. >>> >>> >>> >>> I assume you can >>> >>> >>> >>> export PETSC_OPTIONS='-device_enable lazy' >>> >>> >>> >>> and it'll work. >>> >>> >>> >>> I think this should be the default. The main complaint is that >>> timing the >>> >>> first GPU-using event isn't accurate if it includes initialization, >>> but I >>> >>> think this is mostly hypothetical because you can't trust any timing >>> that >>> >>> doesn't preload in some form and the first GPU-using event will >>> almost >>> >>> always be something uninteresting so I think it will rarely lead to >>> >>> confusion. Meanwhile, eager initialization is viscerally disruptive >>> for >>> >>> lots of people. >>> >>> >>> >> >>> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jed at jedbrown.org Thu Jan 20 23:49:35 2022 From: jed at jedbrown.org (Jed Brown) Date: Thu, 20 Jan 2022 22:49:35 -0700 Subject: [petsc-users] Cannot eagerly initialize cuda, as doing so results in cuda error 35 (cudaErrorInsufficientDriver) : CUDA driver version is insufficient for CUDA runtime version In-Reply-To: References: <233D0F20-AD95-481B-B862-353D5789C556@gmail.com> <87czkn9c64.fsf@jedbrown.org> <875yqe7zip.fsf@jedbrown.org> Message-ID: <87zgnp7i4w.fsf@jedbrown.org> Junchao Zhang writes: > I don't see values using PetscUnlikely() today. It's usually premature optimization and PetscUnlikelyDebug makes it too easy to skip important checks. But at the time when I added PetscUnlikely, it was important for CHKERRQ(ierr). Specifically, without PetsUnlikely, many compilers (even at high optimization) would put the error handling code (a few instructions for every source line) in with the error-free path, using forward jumps to bypass it. Most CPUs predict that backward jumps are taken and forward jumps are not so this impacts both branch prediction and instruction cache locality. With PetscUnlikely, the error-handling code is reliably deposited at the end of the function where it usually won't have to enter the instruction cache and is never branch predicted. From mfadams at lbl.gov Fri Jan 21 08:05:10 2022 From: mfadams at lbl.gov (Mark Adams) Date: Fri, 21 Jan 2022 09:05:10 -0500 Subject: [petsc-users] hypre / hip usage Message-ID: Two questions about hypre on HIP: * I am doing this now. Is this correct? '--download-hypre', '--download-hypre-configure-arguments=--enable-unified-memory', '--with-hypre-gpuarch=gfx90a', * -mat_type hypre fails, so I am not using a -mat_type now. Just -vec_type hip. Hypre does seem to be running on the GPU from looking at scaling data and comparing it to GAMG. Is there a way to tell from log_view data that hypre is running on the GPU? Thanks, Mark -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Fri Jan 21 08:19:05 2022 From: jed at jedbrown.org (Jed Brown) Date: Fri, 21 Jan 2022 07:19:05 -0700 Subject: [petsc-users] hypre / hip usage In-Reply-To: References: Message-ID: <87r1916ujq.fsf@jedbrown.org> Mark Adams writes: > Two questions about hypre on HIP: > > * I am doing this now. Is this correct? > > '--download-hypre', > '--download-hypre-configure-arguments=--enable-unified-memory', > '--with-hypre-gpuarch=gfx90a', I's recommended to use --with-hip-arch=gfx90a, which forwards to Hypre. > * -mat_type hypre fails, so I am not using a -mat_type now. > Just -vec_type hip. > Hypre does seem to be running on the GPU from looking at scaling data and > comparing it to GAMG. > Is there a way to tell from log_view data that hypre is running on the GPU? Is it clear from data transfer within PCApply? I think it should (but might not currently) say in -ksp_view output. From ptbauman at gmail.com Fri Jan 21 08:29:53 2022 From: ptbauman at gmail.com (Paul T. Bauman) Date: Fri, 21 Jan 2022 08:29:53 -0600 Subject: [petsc-users] hypre / hip usage In-Reply-To: <87r1916ujq.fsf@jedbrown.org> References: <87r1916ujq.fsf@jedbrown.org> Message-ID: On Fri, Jan 21, 2022 at 8:19 AM Jed Brown wrote: > Mark Adams writes: > > > Two questions about hypre on HIP: > > > > * I am doing this now. Is this correct? 
> > > > '--download-hypre', > > '--download-hypre-configure-arguments=--enable-unified-memory', > Apologies for interjecting, but I want to point out here that a pretty good chunk of BoomerAMG is ported to the GPU and you may not need this unified-memory option. I point this out because you will get substantially better performance without this option, i.e. using "native" GPU memory. I do not know the intricacies of the PETSc/HYPRE/GPU interaction so maybe PETSc won't handle the CPU->GPU memcopies for you (I'm assuming vecs, mats are assembled on the CPU) in which case you might need the option. And if you do run into code paths in BoomerAMG that are not ported to the GPU and you want to use them, I'd be very interested to know what the options are that are missing a GPU port. Thanks, Paul -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Fri Jan 21 08:37:47 2022 From: jed at jedbrown.org (Jed Brown) Date: Fri, 21 Jan 2022 07:37:47 -0700 Subject: [petsc-users] hypre / hip usage In-Reply-To: References: <87r1916ujq.fsf@jedbrown.org> Message-ID: <87lez96tok.fsf@jedbrown.org> "Paul T. Bauman" writes: > On Fri, Jan 21, 2022 at 8:19 AM Jed Brown wrote: > >> Mark Adams writes: >> >> > Two questions about hypre on HIP: >> > >> > * I am doing this now. Is this correct? >> > >> > '--download-hypre', >> > '--download-hypre-configure-arguments=--enable-unified-memory', >> > > Apologies for interjecting, but I want to point out here that a pretty good > chunk of BoomerAMG is ported to the GPU and you may not need this > unified-memory option. I point this out because you will get substantially > better performance without this option, i.e. using "native" GPU memory. I > do not know the intricacies of the PETSc/HYPRE/GPU interaction so maybe > PETSc won't handle the CPU->GPU memcopies for you (I'm assuming vecs, mats > are assembled on the CPU) in which case you might need the option. And if > you do run into code paths in BoomerAMG that are not ported to the GPU and > you want to use them, I'd be very interested to know what the options are > that are missing a GPU port. We have matrices and vectors assembled on the device and logic to pass the device data to Hypre. Stefano knows the details. Will the option --enable-unified-memory hurt performance if we provide all data on the device? From nicolas.barnafi at unimi.it Fri Jan 21 08:50:17 2022 From: nicolas.barnafi at unimi.it (Nicolas Alejandro Barnafi) Date: Fri, 21 Jan 2022 15:50:17 +0100 Subject: [petsc-users] HYPRE AMS - Segmentation Violation with discrete gradient Message-ID: <7bf2bb42c9ccf.61ead639@unimi.it> Dear community,? I'm giving the discrete gradient to a PC object (sub PC of a fieldsplit) but HYPRE internally gives a segmentation violation error. The matrix has been adequately set, as I have added it to the program output for inspection. Has this happened to anyone? I have attached the error below together with the discrete gradient, in case you see something I am missing. The code is currently running in serial, so I don't expect communication/partitioning to be an issue (although it could be in the future). 
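For context, the setup on the PETSc side is essentially the following; this is a minimal sketch with placeholder names (the split index, the gradient matrix G, and the coordinate data are assumptions, not the exact code):

```c
/* Sketch: point the H(curl) sub-PC of a PCFIELDSPLIT at hypre AMS and
   hand it the discrete gradient. "pc" is the outer fieldsplit PC and
   "G" is the discrete gradient matrix printed below.                  */
KSP     *subksp;
PetscInt nsplits;
PC       subpc;

/* after PCSetUp()/KSPSetUp() on the outer solver, so the splits exist */
PCFieldSplitGetSubKSP(pc, &nsplits, &subksp);
KSPGetPC(subksp[0], &subpc);          /* assuming split 0 is the curl-curl block */
PCSetType(subpc, PCHYPRE);
PCHYPRESetType(subpc, "ams");
PCHYPRESetDiscreteGradient(subpc, G);
/* AMS also expects vertex coordinates or edge constant vectors, e.g.
   PCSetCoordinates(subpc, 3, nvertices, coords);                      */
PetscFree(subksp);
```

Whether coordinates or edge constant vectors are being provided is worth double-checking as well, since AMS needs one of them in addition to the discrete gradient.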
Thanks in advance, Nicolas ------------------------------ PETSc output ------------------------------------ Mat Object: 1 MPI processes type: seqaij -1.00000e+00 1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 1.00000e+00 0.00000e+00 -1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 -1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 -1.00000e+00 0.00000e+00 1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 -1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 1.00000e+00 -1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 -1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 -1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 1.00000e+00 -1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 -1.00000e+00 0.00000e+00 1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 1.00000e+00 0.00000e+00 -1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 -1.00000e+00 1.00000e+00 [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: likely location of problem given in stack below [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [0]PETSC ERROR: INSTEAD the line number of the start of the function [0]PETSC ERROR: is given. [0]PETSC ERROR: [0] jac->setup line 408 /home/ubuntu/petsc/src/ksp/pc/impls/hypre/hypre.c [0]PETSC ERROR: [0] PCSetUp_HYPRE line 223 /home/ubuntu/petsc/src/ksp/pc/impls/hypre/hypre.c [0]PETSC ERROR: [0] PCSetUp line 971 /home/ubuntu/petsc/src/ksp/pc/interface/precon.c [0]PETSC ERROR: [0] KSPSetUp line 319 /home/ubuntu/petsc/src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: [0] KSPSolve_Private line 615 /home/ubuntu/petsc/src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: [0] KSPSolve line 884 /home/ubuntu/petsc/src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: [0] PCApply_FieldSplit line 1241 /home/ubuntu/petsc/src/ksp/pc/impls/fieldsplit/fieldsplit.c [0]PETSC ERROR: [0] PCApply line 426 /home/ubuntu/petsc/src/ksp/pc/interface/precon.c [0]PETSC ERROR: [0] KSP_PCApply line 281 /home/ubuntu/petsc/include/petsc/private/kspimpl.h [0]PETSC ERROR: [0] KSPInitialResidual line 40 /home/ubuntu/petsc/src/ksp/ksp/interface/itres.c [0]PETSC ERROR: [0] KSPSolve_GMRES line 233 /home/ubuntu/petsc/src/ksp/ksp/impls/gmres/gmres.c [0]PETSC ERROR: [0] KSPSolve_Private line 615 /home/ubuntu/petsc/src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: [0] KSPSolve line 884 /home/ubuntu/petsc/src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Signal received [0]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[0]PETSC ERROR: Petsc Release Version 3.14.6, unknown [0]PETSC ERROR: Unknown Name on a arch-linux-c-debug named ubuntu-ThinkPad-L14-Gen-1 by ubuntu Fri Jan 21 15:37:45 2022 [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-mpich --download-fblaslapack --with-mpi=1 --download-superlu_dist --download-mumps --download-hypre --with-debugging=1 COPTFLAGS="-O3 -march=native -mtune=native" CXXOPTFLAGS="-O3 -march=native -mtune=native" FOPTFLAGS="-O3 -march=native -mtune=native" --download-scalapack --download-hpddm [0]PETSC ERROR: #1 User provided function() line 0 in unknown file [0]PETSC ERROR: Checking the memory for corruption. application called MPI_Abort(MPI_COMM_WORLD, 50176059) - process 0 [unset]: write_line error; fd=-1 buf=:cmd=abort exitcode=50176059 : system msg for write_line failure : Bad file descriptor -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Fri Jan 21 09:27:54 2022 From: mfadams at lbl.gov (Mark Adams) Date: Fri, 21 Jan 2022 10:27:54 -0500 Subject: [petsc-users] hypre / hip usage In-Reply-To: <87r1916ujq.fsf@jedbrown.org> References: <87r1916ujq.fsf@jedbrown.org> Message-ID: > > > > > Is there a way to tell from log_view data that hypre is running on the > GPU? > > Is it clear from data transfer within PCApply? > > Well, this does not look right. '-mat_type hypre' fails. I guess we have to get that working or could/should it work with -mat_type aijkokkos ? --- Event Stage 2: KSP Solve only MatMult 230 1.0 1.0922e-01 2.0 1.50e+07 2.1 2.3e+06 2.7e+02 0.0e+00 1 58 81 64 0 3 91100100 0 62942 0 0 0.00e+00 920 4.26e+00 0 KSPSolve 10 1.0 3.0406e+00 1.0 1.64e+07 2.0 2.3e+06 2.7e+02 7.0e+02 51 64 81 64 74 100100100100100 2488 4253 230 8.99e-01 1620 4.27e+00 9 SFPack 230 1.0 3.6798e-03 2.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 SFUnpack 230 1.0 1.5381e-04 4.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 VecTDot 460 1.0 3.9809e-01 1.7 4.71e+05 1.5 0.0e+00 0.0e+00 4.6e+02 5 2 0 0 49 10 3 0 0 66 577 5656 230 8.99e-01 460 3.68e-03 100 VecNorm 240 1.0 1.5313e-01 1.2 2.46e+05 1.5 0.0e+00 0.0e+00 2.4e+02 2 1 0 0 25 5 2 0 0 34 783 4140 0 0.00e+00 240 1.92e-03 100 VecCopy 20 1.0 3.9648e-03 2.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 VecSet 250 1.0 4.8203e-02 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 VecAXPY 460 1.0 6.6492e-02 1.4 4.71e+05 1.5 0.0e+00 0.0e+00 0.0e+00 1 2 0 0 0 2 3 0 0 0 3460 5734 0 0.00e+00 0 0.00e+00 100 VecAYPX 220 1.0 4.4230e-02 1.9 2.25e+05 1.5 0.0e+00 0.0e+00 0.0e+00 1 1 0 0 0 1 1 0 0 0 2487 3157 0 0.00e+00 0 0.00e+00 100 VecScatterBegin 230 1.0 7.2590e-02 2.9 0.00e+00 0.0 2.3e+06 2.7e+02 0.0e+00 1 0 81 64 0 1 0100100 0 0 0 0 0.00e+00 460 2.46e+00 0 VecScatterEnd 230 1.0 4.1541e-0213.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 VecHIPCopyTo 230 1.0 5.0658e-03 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 230 8.99e-01 0 0.00e+00 0 VecHIPCopyFrom 460 1.0 1.1580e-02 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 460 1.80e+00 0 PCApply 230 1.0 2.4747e+00 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 40 0 0 0 0 80 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 I think it should (but might not currently) say in -ksp_view output. > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ptbauman at gmail.com Fri Jan 21 09:32:37 2022 From: ptbauman at gmail.com (Paul T. Bauman) Date: Fri, 21 Jan 2022 09:32:37 -0600 Subject: [petsc-users] hypre / hip usage In-Reply-To: References: <87r1916ujq.fsf@jedbrown.org> <87lez96tok.fsf@jedbrown.org> Message-ID: Dammit, didn't reply-all. Sorry. On Fri, Jan 21, 2022 at 8:52 AM Paul T. Bauman wrote: > > > On Fri, Jan 21, 2022 at 8:37 AM Jed Brown wrote: > >> "Paul T. Bauman" writes: >> >> > On Fri, Jan 21, 2022 at 8:19 AM Jed Brown wrote: >> > >> >> Mark Adams writes: >> >> >> >> > Two questions about hypre on HIP: >> >> > >> >> > * I am doing this now. Is this correct? >> >> > >> >> > '--download-hypre', >> >> > '--download-hypre-configure-arguments=--enable-unified-memory', >> >> >> > >> > Apologies for interjecting, but I want to point out here that a pretty >> good >> > chunk of BoomerAMG is ported to the GPU and you may not need this >> > unified-memory option. I point this out because you will get >> substantially >> > better performance without this option, i.e. using "native" GPU memory. >> I >> > do not know the intricacies of the PETSc/HYPRE/GPU interaction so maybe >> > PETSc won't handle the CPU->GPU memcopies for you (I'm assuming vecs, >> mats >> > are assembled on the CPU) in which case you might need the option. And >> if >> > you do run into code paths in BoomerAMG that are not ported to the GPU >> and >> > you want to use them, I'd be very interested to know what the options >> are >> > that are missing a GPU port. >> >> We have matrices and vectors assembled on the device and logic to pass >> the device data to Hypre. Stefano knows the details. >> >> Will the option --enable-unified-memory hurt performance if we provide >> all data on the device? >> > > Yes. The way HYPRE's memory model is setup is that ALL GPU allocations are > "native" (i.e. [cuda,hip]Malloc) or, if unified memory is enabled, then ALL > GPU allocations are unified memory (i.e. [cuda,hip]MallocManaged). > Regarding HIP, there is an HMM implementation of hipMallocManaged planned, > but is it not yet delivered AFAIK (and it will *not* support gfx906, e.g. > RVII, FYI), so, today, under the covers, hipMallocManaged is calling > hipHostMalloc. So, today, all your unified memory allocations in HYPRE on > HIP are doing CPU-pinned memory accesses. And performance is just truly > terrible (as you might expect). > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Jan 21 09:34:25 2022 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 21 Jan 2022 10:34:25 -0500 Subject: [petsc-users] HYPRE AMS - Segmentation Violation with discrete gradient In-Reply-To: <7bf2bb42c9ccf.61ead639@unimi.it> References: <7bf2bb42c9ccf.61ead639@unimi.it> Message-ID: On Fri, Jan 21, 2022 at 9:50 AM Nicolas Alejandro Barnafi < nicolas.barnafi at unimi.it> wrote: > Dear community, > > I'm giving the discrete gradient to a PC object (sub PC of a fieldsplit) > but HYPRE internally gives a segmentation violation error. The matrix has > been adequately set, as I have added it to the program output for > inspection. Has this happened to anyone? > Is there a chance of sending us something that we can run? Alternatively, can you run https://gitlab.com/petsc/petsc/-/blob/main/src/ksp/ksp/tutorials/ex10.c loading your matrix and giving options to select Hypre? Then we can do the same thing here with your matrix. 
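For example, something along these lines; ex10 reads a binary matrix with -f0, and since the AMS-specific data has to be attached in code, a plain hypre/boomeramg run would at least exercise the matrix itself (exact options may need adjusting):

  ./ex10 -f0 your_matrix.bin -ksp_type gmres -pc_type hypre -pc_hypre_type boomeramg -ksp_monitor_true_residual -ksp_converged_reason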
Thanks, Matt > I have attached the error below together with the discrete gradient, in > case you see something I am missing. > > The code is currently running in serial, so I don't expect > communication/partitioning to be an issue (although it could be in the > future). > > Thanks in advance, > Nicolas > > > ------------------------------ PETSc output > ------------------------------------ > > Mat Object: 1 MPI processes > type: seqaij > -1.00000e+00 1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 > 0.00000e+00 0.00000e+00 0.00000e+00 > 1.00000e+00 0.00000e+00 -1.00000e+00 0.00000e+00 0.00000e+00 > 0.00000e+00 0.00000e+00 0.00000e+00 > -1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 1.00000e+00 > 0.00000e+00 0.00000e+00 0.00000e+00 > 0.00000e+00 -1.00000e+00 0.00000e+00 1.00000e+00 0.00000e+00 > 0.00000e+00 0.00000e+00 0.00000e+00 > 0.00000e+00 1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 > -1.00000e+00 0.00000e+00 0.00000e+00 > 0.00000e+00 0.00000e+00 1.00000e+00 -1.00000e+00 0.00000e+00 > 0.00000e+00 0.00000e+00 0.00000e+00 > 0.00000e+00 0.00000e+00 1.00000e+00 0.00000e+00 0.00000e+00 > 0.00000e+00 -1.00000e+00 0.00000e+00 > 0.00000e+00 0.00000e+00 0.00000e+00 -1.00000e+00 0.00000e+00 > 0.00000e+00 0.00000e+00 1.00000e+00 > 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 1.00000e+00 > -1.00000e+00 0.00000e+00 0.00000e+00 > 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 -1.00000e+00 > 0.00000e+00 1.00000e+00 0.00000e+00 > 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 > 1.00000e+00 0.00000e+00 -1.00000e+00 > 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 > 0.00000e+00 -1.00000e+00 1.00000e+00 > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see > https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS > X to find memory corruption errors > [0]PETSC ERROR: likely location of problem given in stack below > [0]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > [0]PETSC ERROR: INSTEAD the line number of the start of the function > [0]PETSC ERROR: is given. 
> [0]PETSC ERROR: [0] jac->setup line 408 > /home/ubuntu/petsc/src/ksp/pc/impls/hypre/hypre.c > [0]PETSC ERROR: [0] PCSetUp_HYPRE line 223 > /home/ubuntu/petsc/src/ksp/pc/impls/hypre/hypre.c > [0]PETSC ERROR: [0] PCSetUp line 971 > /home/ubuntu/petsc/src/ksp/pc/interface/precon.c > [0]PETSC ERROR: [0] KSPSetUp line 319 > /home/ubuntu/petsc/src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: [0] KSPSolve_Private line 615 > /home/ubuntu/petsc/src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: [0] KSPSolve line 884 > /home/ubuntu/petsc/src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: [0] PCApply_FieldSplit line 1241 > /home/ubuntu/petsc/src/ksp/pc/impls/fieldsplit/fieldsplit.c > [0]PETSC ERROR: [0] PCApply line 426 > /home/ubuntu/petsc/src/ksp/pc/interface/precon.c > [0]PETSC ERROR: [0] KSP_PCApply line 281 > /home/ubuntu/petsc/include/petsc/private/kspimpl.h > [0]PETSC ERROR: [0] KSPInitialResidual line 40 > /home/ubuntu/petsc/src/ksp/ksp/interface/itres.c > [0]PETSC ERROR: [0] KSPSolve_GMRES line 233 > /home/ubuntu/petsc/src/ksp/ksp/impls/gmres/gmres.c > [0]PETSC ERROR: [0] KSPSolve_Private line 615 > /home/ubuntu/petsc/src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: [0] KSPSolve line 884 > /home/ubuntu/petsc/src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: Signal received > [0]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.14.6, unknown > [0]PETSC ERROR: Unknown Name on a arch-linux-c-debug named > ubuntu-ThinkPad-L14-Gen-1 by ubuntu Fri Jan 21 15:37:45 2022 > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --download-mpich --download-fblaslapack --with-mpi=1 > --download-superlu_dist --download-mumps --download-hypre > --with-debugging=1 COPTFLAGS="-O3 -march=native -mtune=native" > CXXOPTFLAGS="-O3 -march=native -mtune=native" FOPTFLAGS="-O3 -march=native > -mtune=native" --download-scalapack --download-hpddm > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > [0]PETSC ERROR: Checking the memory for corruption. > application called MPI_Abort(MPI_COMM_WORLD, 50176059) - process 0 > [unset]: write_line error; fd=-1 buf=:cmd=abort exitcode=50176059 > : > system msg for write_line failure : Bad file descriptor > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Fri Jan 21 09:44:41 2022 From: jed at jedbrown.org (Jed Brown) Date: Fri, 21 Jan 2022 08:44:41 -0700 Subject: [petsc-users] hypre / hip usage In-Reply-To: References: <87r1916ujq.fsf@jedbrown.org> Message-ID: <87ilud6ql2.fsf@jedbrown.org> Mark Adams writes: >> >> >> >> > Is there a way to tell from log_view data that hypre is running on the >> GPU? >> >> Is it clear from data transfer within PCApply? >> >> > Well, this does not look right. '-mat_type hypre' fails. I guess we have to > get that working or could/should it work with -mat_type aijkokkos ? 
> > --- Event Stage 2: KSP Solve only > > MatMult 230 1.0 1.0922e-01 2.0 1.50e+07 2.1 2.3e+06 2.7e+02 > 0.0e+00 1 58 81 64 0 3 91100100 0 62942 0 0 0.00e+00 920 > 4.26e+00 0 > KSPSolve 10 1.0 3.0406e+00 1.0 1.64e+07 2.0 2.3e+06 2.7e+02 > 7.0e+02 51 64 81 64 74 100100100100100 2488 4253 230 8.99e-01 1620 > 4.27e+00 9 This 9% on GPU isn't good. For comparison (debug on my laptop) $ ompi-cuda-g/tests/snes/tutorials/ex5 -da_refine 7 -dm_mat_type aijcusparse -dm_vec_type cuda -pc_type hypre -log_view [...] Event Count Time (sec) Flop --- Global --- --- Stage ---- Total GPU - CpuToGpu - - GpuToCpu - GPU Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s Mflop/s Count Size Count Size %F --------------------------------------------------------------------------------------------------------------------------------------------------------------- --- Event Stage 0: Main Stage BuildTwoSided 1 1.0 4.7631e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 MatMult 28 1.0 1.0347e-02 1.0 3.73e+07 1.0 0.0e+00 0.0e+00 0.0e+00 1 30 0 0 0 1 30 0 0 0 3601 8498 4 2.72e+01 0 0.00e+00 100 MatConvert 4 1.0 4.0087e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 3 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 MatAssemblyBegin 9 1.0 2.2978e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 MatAssemblyEnd 9 1.0 2.0741e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 MatGetRowIJ 4 1.0 8.6590e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 MatView 1 1.0 5.0983e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 MatCUSPARSCopyTo 4 1.0 5.4931e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 4 2.72e+01 0 0.00e+00 0 KSPSetUp 4 1.0 1.9243e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 KSPSolve 4 1.0 1.0843e-01 1.0 1.03e+08 1.0 0.0e+00 0.0e+00 0.0e+00 9 83 0 0 0 9 83 0 0 0 945 7695 4 2.72e+01 0 0.00e+00 100 KSPGMRESOrthog 24 1.0 5.3822e-03 1.0 4.98e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 40 0 0 0 0 40 0 0 0 9255 16695 0 0.00e+00 0 0.00e+00 100 So all the recorded flops are on the GPU. 
SNESSolve 1 1.0 9.3487e-01 1.0 1.23e+08 1.0 0.0e+00 0.0e+00 0.0e+00 76100 0 0 0 76100 0 0 0 132 6212 11 3.55e+01 10 1.19e+01 93 SNESSetUp 1 1.0 2.5863e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 21 0 0 0 0 21 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 SNESFunctionEval 5 1.0 4.3279e-02 1.0 8.23e+06 1.0 0.0e+00 0.0e+00 0.0e+00 4 7 0 0 0 4 7 0 0 0 190 0 1 1.19e+00 7 8.30e+00 0 SNESJacobianEval 4 1.0 4.0748e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 33 0 0 0 0 33 0 0 0 0 0 0 0 0.00e+00 3 3.56e+00 0 SNESLineSearch 4 1.0 3.4112e-02 1.0 1.90e+07 1.0 0.0e+00 0.0e+00 0.0e+00 3 15 0 0 0 3 15 0 0 0 558 2556 5 5.93e+00 5 5.93e+00 65 DMCreateMat 1 1.0 2.5826e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 21 0 0 0 0 21 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 SFSetGraph 2 1.0 7.5938e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 SFSetUp 1 1.0 7.5895e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 SFPack 17 1.0 3.9599e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 SFUnpack 17 1.0 3.9463e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 VecDot 4 1.0 4.1088e-04 1.0 1.19e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 2886 3698 0 0.00e+00 0 0.00e+00 100 VecMDot 24 1.0 2.9923e-03 1.0 2.49e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 20 0 0 0 0 20 0 0 0 8325 17385 0 0.00e+00 0 0.00e+00 100 VecNorm 37 1.0 8.1704e-03 1.0 1.10e+07 1.0 0.0e+00 0.0e+00 0.0e+00 1 9 0 0 0 1 9 0 0 0 1342 1638 5 5.93e+00 0 0.00e+00 100 VecScale 32 1.0 6.4316e-04 1.0 4.15e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 3 0 0 0 0 3 0 0 0 6453 11772 0 0.00e+00 0 0.00e+00 100 VecCopy 12 1.0 4.6137e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 VecSet 37 1.0 9.6073e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 VecAXPY 4 1.0 1.5693e-04 1.0 1.19e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 7556 11591 0 0.00e+00 0 0.00e+00 100 VecWAXPY 4 1.0 3.6415e-04 1.0 1.19e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 3256 4372 0 0.00e+00 0 0.00e+00 100 VecMAXPY 28 1.0 2.4212e-03 1.0 3.20e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 26 0 0 0 0 26 0 0 0 13223 15414 0 0.00e+00 0 0.00e+00 100 VecScatterBegin 9 1.0 1.0676e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 0 1 1.19e+00 7 8.30e+00 0 VecScatterEnd 9 1.0 9.2955e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 VecReduceArith 8 1.0 2.2297e-03 1.0 2.37e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 1064 1246 1 1.19e+00 0 0.00e+00 100 VecReduceComm 4 1.0 1.3690e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 VecNormalize 28 1.0 6.0496e-03 1.0 1.25e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 10 0 0 0 0 10 0 0 0 2058 2357 0 0.00e+00 0 0.00e+00 100 VecCUDACopyTo 6 1.0 1.1751e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 6 7.11e+00 0 0.00e+00 0 VecCUDACopyFrom 4 1.0 7.8236e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 4 4.74e+00 0 PCSetUp 4 1.0 3.6268e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 29 0 0 0 0 29 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 PCApply 28 1.0 8.1240e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 7 0 0 0 0 7 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 and PCApply doesn't transfer anything to the device. Perhaps we should use operator complexity to make a nonzero placeholder for Hypre flops. 
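Something along these lines inside PCApply_HYPRE, say (sketch only; opComplexity and nnzA are made-up names, and the complexity value would have to come out of hypre somehow):

ierr = PetscLogGpuFlops(2.0*opComplexity*nnzA);CHKERRQ(ierr);  /* rough placeholder for the work of one V-cycle */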
I don't think BoomerAMG has an API for operator complexity, but it looks like there is code to print it so maybe we can obtain it (or ask them to add an API). From jed at jedbrown.org Fri Jan 21 10:14:05 2022 From: jed at jedbrown.org (Jed Brown) Date: Fri, 21 Jan 2022 09:14:05 -0700 Subject: [petsc-users] hypre / hip usage In-Reply-To: References: <87r1916ujq.fsf@jedbrown.org> <87lez96tok.fsf@jedbrown.org> Message-ID: <87fsph6p82.fsf@jedbrown.org> "Paul T. Bauman" writes: > On Fri, Jan 21, 2022 at 8:52 AM Paul T. Bauman wrote: >> Yes. The way HYPRE's memory model is setup is that ALL GPU allocations are >> "native" (i.e. [cuda,hip]Malloc) or, if unified memory is enabled, then ALL >> GPU allocations are unified memory (i.e. [cuda,hip]MallocManaged). >> Regarding HIP, there is an HMM implementation of hipMallocManaged planned, >> but is it not yet delivered AFAIK (and it will *not* support gfx906, e.g. >> RVII, FYI), so, today, under the covers, hipMallocManaged is calling >> hipHostMalloc. So, today, all your unified memory allocations in HYPRE on >> HIP are doing CPU-pinned memory accesses. And performance is just truly >> terrible (as you might expect). Thanks for this important bit of information. And it sounds like when we add support to hand off Kokkos matrices and vectors (our current support for matrices on ROCm devices uses Kokkos) or add direct support for hipSparse, we'll avoid touching host memory in assembly-to-solve with hypre. From mfadams at lbl.gov Fri Jan 21 10:23:53 2022 From: mfadams at lbl.gov (Mark Adams) Date: Fri, 21 Jan 2022 11:23:53 -0500 Subject: [petsc-users] hypre / hip usage In-Reply-To: <87fsph6p82.fsf@jedbrown.org> References: <87r1916ujq.fsf@jedbrown.org> <87lez96tok.fsf@jedbrown.org> <87fsph6p82.fsf@jedbrown.org> Message-ID: On Fri, Jan 21, 2022 at 11:14 AM Jed Brown wrote: > "Paul T. Bauman" writes: > > > On Fri, Jan 21, 2022 at 8:52 AM Paul T. Bauman > wrote: > >> Yes. The way HYPRE's memory model is setup is that ALL GPU allocations > are > >> "native" (i.e. [cuda,hip]Malloc) or, if unified memory is enabled, then > ALL > >> GPU allocations are unified memory (i.e. [cuda,hip]MallocManaged). > >> Regarding HIP, there is an HMM implementation of hipMallocManaged > planned, > >> but is it not yet delivered AFAIK (and it will *not* support gfx906, > e.g. > >> RVII, FYI), so, today, under the covers, hipMallocManaged is calling > >> hipHostMalloc. So, today, all your unified memory allocations in HYPRE > on > >> HIP are doing CPU-pinned memory accesses. And performance is just truly > >> terrible (as you might expect). > > Thanks for this important bit of information. > > And it sounds like when we add support to hand off Kokkos matrices and > vectors (our current support for matrices on ROCm devices uses Kokkos) or > add direct support for hipSparse, we'll avoid touching host memory in > assembly-to-solve with hypre. > It does not look like anyone has made Hypre work with HIP. Stafano added a runex19_hypre_hip target 4 months ago and hypre.py has some HIP things. I have a user that would like to try this, no hurry but, can I get an idea of a plan for this? Thanks, Mark -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From nicolas.barnafi at unimi.it Fri Jan 21 11:05:09 2022 From: nicolas.barnafi at unimi.it (Nicolas Alejandro Barnafi) Date: Fri, 21 Jan 2022 18:05:09 +0100 Subject: [petsc-users] HYPRE AMS - Segmentation Violation with discrete gradient In-Reply-To: <7d0fabc8cb8c2.61eae7b5@unimi.it> References: <7bf2bb42c9ccf.61ead639@unimi.it> <7d64948dc9956.61eae6c2@unimi.it> <7e7dd1a3ce69c.61eae6fe@unimi.it> <7e7daa98ce073.61eae73b@unimi.it> <7f28becdc88d0.61eae779@unimi.it> <7d0fabc8cb8c2.61eae7b5@unimi.it> Message-ID: <7f2cb725cce30.61eaf5d5@unimi.it> Thank you Matt, I have trimmed down ex10 (a lot)?to do as required, and it indeed reproduces the error.? You may find it attached and it can be reproduced with ./ex10 -fA Amat -fP Pmat -fG Gmat -ksp_type gmres -pc_type hypre -pc_hypre_type ams Thank you! Il 21/01/22 16:35, Matthew Knepley ha scritto: > > > > On Fri, Jan 21, 2022 at 9:50 AM Nicolas Alejandro Barnafi wrote: > > > > Dear community,? > > > > I'm giving the discrete gradient to a PC object (sub PC of a fieldsplit) but HYPRE internally gives a segmentation violation error. The matrix has been adequately set, as I have added it to the program output for inspection. Has this happened to anyone? > > > > > Is there a chance of sending us something that we can run? Alternatively, can you run > > https://gitlab.com/petsc/petsc/-/blob/main/src/ksp/ksp/tutorials/ex10.c > > loading your matrix and giving options to select Hypre? Then we can do the same thing here with your matrix. > > Thanks, > > Matt > > > > > > I have attached the error below together with the discrete gradient, in case you see something I am missing. > > > > The code is currently running in serial, so I don't expect communication/partitioning to be an issue (although it could be in the future). 
> > > > Thanks in advance, > > Nicolas > > > > > > > > ------------------------------ PETSc output ------------------------------------ > > > > > > Mat Object: 1 MPI processes > > type: seqaij > > -1.00000e+00 1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 > > 1.00000e+00 0.00000e+00 -1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 > > -1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 > > 0.00000e+00 -1.00000e+00 0.00000e+00 1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 > > 0.00000e+00 1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 -1.00000e+00 0.00000e+00 0.00000e+00 > > 0.00000e+00 0.00000e+00 1.00000e+00 -1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 > > 0.00000e+00 0.00000e+00 1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 -1.00000e+00 0.00000e+00 > > 0.00000e+00 0.00000e+00 0.00000e+00 -1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 1.00000e+00 > > 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 1.00000e+00 -1.00000e+00 0.00000e+00 0.00000e+00 > > 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 -1.00000e+00 0.00000e+00 1.00000e+00 0.00000e+00 > > 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 1.00000e+00 0.00000e+00 -1.00000e+00 > > 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 -1.00000e+00 1.00000e+00 > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [0]PETSC ERROR: or see https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > > [0]PETSC ERROR: likely location of problem given in stack below > > [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > > [0]PETSC ERROR: INSTEAD the line number of the start of the function > > [0]PETSC ERROR: is given. 
> > [0]PETSC ERROR: [0] jac->setup line 408 /home/ubuntu/petsc/src/ksp/pc/impls/hypre/hypre.c > > [0]PETSC ERROR: [0] PCSetUp_HYPRE line 223 /home/ubuntu/petsc/src/ksp/pc/impls/hypre/hypre.c > > [0]PETSC ERROR: [0] PCSetUp line 971 /home/ubuntu/petsc/src/ksp/pc/interface/precon.c > > [0]PETSC ERROR: [0] KSPSetUp line 319 /home/ubuntu/petsc/src/ksp/ksp/interface/itfunc.c > > [0]PETSC ERROR: [0] KSPSolve_Private line 615 /home/ubuntu/petsc/src/ksp/ksp/interface/itfunc.c > > [0]PETSC ERROR: [0] KSPSolve line 884 /home/ubuntu/petsc/src/ksp/ksp/interface/itfunc.c > > [0]PETSC ERROR: [0] PCApply_FieldSplit line 1241 /home/ubuntu/petsc/src/ksp/pc/impls/fieldsplit/fieldsplit.c > > [0]PETSC ERROR: [0] PCApply line 426 /home/ubuntu/petsc/src/ksp/pc/interface/precon.c > > [0]PETSC ERROR: [0] KSP_PCApply line 281 /home/ubuntu/petsc/include/petsc/private/kspimpl.h > > [0]PETSC ERROR: [0] KSPInitialResidual line 40 /home/ubuntu/petsc/src/ksp/ksp/interface/itres.c > > [0]PETSC ERROR: [0] KSPSolve_GMRES line 233 /home/ubuntu/petsc/src/ksp/ksp/impls/gmres/gmres.c > > [0]PETSC ERROR: [0] KSPSolve_Private line 615 /home/ubuntu/petsc/src/ksp/ksp/interface/itfunc.c > > [0]PETSC ERROR: [0] KSPSolve line 884 /home/ubuntu/petsc/src/ksp/ksp/interface/itfunc.c > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [0]PETSC ERROR: Signal received > > [0]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.14.6, unknown > > [0]PETSC ERROR: Unknown Name on a arch-linux-c-debug named ubuntu-ThinkPad-L14-Gen-1 by ubuntu Fri Jan 21 15:37:45 2022 > > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-mpich --download-fblaslapack --with-mpi=1 --download-superlu_dist --download-mumps --download-hypre --with-debugging=1 COPTFLAGS="-O3 -march=native -mtune=native" CXXOPTFLAGS="-O3 -march=native -mtune=native" FOPTFLAGS="-O3 -march=native -mtune=native" --download-scalapack --download-hpddm > > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > > [0]PETSC ERROR: Checking the memory for corruption. > > application called MPI_Abort(MPI_COMM_WORLD, 50176059) - process 0 > > [unset]: write_line error; fd=-1 buf=:cmd=abort exitcode=50176059 > > : > > system msg for write_line failure : Bad file descriptor > > > > > > -- > > > > > > > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/(http://www.cse.buffalo.edu/~knepley/ ) > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ex10-hypre-ams.tar Type: application/x-tar Size: 20480 bytes Desc: not available URL: From varunhiremath at gmail.com Sat Jan 22 02:11:22 2022 From: varunhiremath at gmail.com (Varun Hiremath) Date: Sat, 22 Jan 2022 00:11:22 -0800 Subject: [petsc-users] PETSc MUMPS interface In-Reply-To: References: <1AD9EBCE-4B48-4D54-9829-DFD5EDC68B76@dsic.upv.es> Message-ID: Hi Hong, I tested this in the latest petsc main branch and it appears to be working fine. Thanks for implementing this so quickly! 
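In case it helps others on the list, the way I use it now is roughly the following (illustrative only; the INFOG index and the memory budget are my own choices, and F comes from PCFactorGetMatrix() after PCFactorSetUpMatSolverType(), as in ex52.c):

PetscErrorCode CheckMumpsEstimates(Mat F,Mat A,IS perm,IS iperm,PetscInt maxMB)
{
  MatFactorInfo  info;
  PetscInt       estMB;
  PetscErrorCode ierr;
  ierr = MatFactorInfoInitialize(&info);CHKERRQ(ierr);
  ierr = MatLUFactorSymbolic(F,A,perm,iperm,&info);CHKERRQ(ierr);  /* symbolic phase only */
  ierr = MatMumpsGetInfog(F,17,&estMB);CHKERRQ(ierr);              /* INFOG(17): estimated total memory in MB */
  if (estMB > maxMB) {
    ierr = MatMumpsSetIcntl(F,22,1);CHKERRQ(ierr);                 /* switch MUMPS to out-of-core */
  }
  return 0;
}

With the new flag in place, the EPSSolve() that follows no longer redoes the symbolic work.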
Regards, Varun On Wed, Jan 19, 2022 at 3:33 PM Zhang, Hong wrote: > Varun, > This feature is merged to petsc main > https://gitlab.com/petsc/petsc/-/merge_requests/4727 > Hong > ------------------------------ > *From:* petsc-users on behalf of Zhang, > Hong via petsc-users > *Sent:* Wednesday, January 19, 2022 9:37 AM > *To:* Varun Hiremath > *Cc:* Peder J?rgensgaard Olesen via petsc-users > *Subject:* Re: [petsc-users] PETSc MUMPS interface > > Varun, > Good to know it works. FactorSymbolic function is still being called > twice, but the 2nd call is a no-op, thus it still appears in '-log_view'. I > made changes in the low level of mumps routine, not within PCSetUp() > because I feel your use case is limited to mumps, not other matrix package > solvers. > Hong > ------------------------------ > *From:* Varun Hiremath > *Sent:* Wednesday, January 19, 2022 2:44 AM > *To:* Zhang, Hong > *Cc:* Peder J?rgensgaard Olesen via petsc-users > *Subject:* Re: [petsc-users] PETSc MUMPS interface > > Hi Hong, > > Thanks, I tested your branch and I think it is working fine. I don't see > any increase in runtime, however with -log_view I see that the > MatLUFactorSymbolic function is still being called twice, so is this > expected? Is the second call a no-op? > > $ ./ex52.o -use_mumps_lu -print_mumps_memory -log_view | grep > MatLUFactorSym > MatLUFactorSym 2 1.0 4.4411e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 > > Thanks, > Varun > > On Mon, Jan 17, 2022 at 7:49 PM Zhang, Hong wrote: > > Varun, > I created a branch hzhang/feature-mumps-mem-estimate, > see https://gitlab.com/petsc/petsc/-/merge_requests/4727 > > You may give it a try and let me know if this is what you want. > src/ksp/ksp/tutorials/ex52.c is an example. > > Hong > ------------------------------ > *From:* Varun Hiremath > *Sent:* Monday, January 17, 2022 12:41 PM > *To:* Zhang, Hong > *Cc:* Jose E. Roman ; Peder J?rgensgaard Olesen via > petsc-users > *Subject:* Re: [petsc-users] PETSc MUMPS interface > > Hi Hong, > > Thanks for looking into this. Here is the workflow that I might use: > > MatLUFactorSymbolic(F,A,perm,iperm,&info); > > // get memory estimates from MUMPS e.g. INFO(3), INFOG(16), INFOG(17) > // find available memory on the system e.g. RAM size > if (estimated_memory > available_memory) > { > // inform and stop; or > // switch MUMPS to out-of-core factorization > ICNTL(22) = 1; > } > else > { > // set appropriate settings for in-core factorization > } > > // Now we call the solve and inside if MatLUFactorSymbolic is already > called then it should be skipped > EPSSolve(eps); > > Thanks, > Varun > > On Mon, Jan 17, 2022 at 9:18 AM Zhang, Hong wrote: > > Varun, > I am trying to find a way to enable you to switch options after MatLUFactorSymbolic(). > A hack is modifying the flag 'mumps->matstruc' > inside MatLUFactorSymbolic_AIJMUMPS() and MatFactorNumeric_MUMPS(). > > My understanding of what you want is: > // collect mumps memory info > ... > MatLUFactorSymbolic(F,A,perm,iperm,&info); > printMumpsMemoryInfo(F); > //--------- > if (memory is available) { > EPSSolve(eps); --> skip calling of MatLUFactorSymbolic() > } else { > //out-of-core (OOC) option in MUMPS > } > > Am I correct? I'll let you know once I work out a solution. > Hong > > ------------------------------ > *From:* Varun Hiremath > *Sent:* Sunday, January 16, 2022 10:10 PM > *To:* Zhang, Hong > *Cc:* Jose E. 
Roman ; Peder J?rgensgaard Olesen via > petsc-users > *Subject:* Re: [petsc-users] PETSc MUMPS interface > > Hi Jose, Hong, > > Thanks for the explanation. I have verified using -log_view that MatLUFactorSymbolic > is indeed getting called twice. > > Hong, we use MUMPS solver for other things, and we typically run the > symbolic analysis first and get memory estimates to ensure that we have > enough memory available to run the case. If the available memory is not > enough, we can stop or switch to the out-of-core (OOC) option in MUMPS. We > wanted to do the same when using MUMPS via SLEPc/PETSc. Please let me know > if there are other ways of getting these memory stats and switching options > during runtime with PETSc. > Appreciate your help! > > Thanks, > Varun > > On Sun, Jan 16, 2022 at 4:01 PM Zhang, Hong wrote: > > Varun, > I believe Jose is correct. You may verify it by running your code with > option '-log_view', then check the number of calls to MatLUFactorSym. > > I guess I can add a flag in PCSetUp() to check if user has already called > MatLUFactorSymbolic() and wants to skip it. Normally, users simply allocate > sufficient memory in the symbolic factorization. Why do you want to check > it? > Hong > > ------------------------------ > *From:* Jose E. Roman > *Sent:* Sunday, January 16, 2022 5:11 AM > *To:* Varun Hiremath > *Cc:* Zhang, Hong ; Peder J?rgensgaard Olesen via > petsc-users > *Subject:* Re: [petsc-users] PETSc MUMPS interface > > Hong may give a better answer, but if you look at PCSetUp_LU() > https://petsc.org/main/src/ksp/pc/impls/factor/lu/lu.c.html#PCSetUp_LU > you will see that MatLUFactorSymbolic() is called unconditionally during > the first PCSetUp(). Currently there is no way to check if the user has > already called MatLUFactorSymbolic(). > > Jose > > > > El 16 ene 2022, a las 10:40, Varun Hiremath > escribi?: > > > > Hi Hong, > > > > Thank you, this is very helpful! > > > > Using this method I am able to get the memory estimates before the > actual solve, however, I think my code may be causing the symbolic > factorization to be run twice. Attached is my code where I am using SLEPc > to compute eigenvalues, and I use MUMPS for factorization. I have commented > above the code that computes the memory estimates, could you please check > and tell me if this would cause the symbolic factor to be computed twice (a > second time inside EPSSolve?), as I am seeing a slight increase in the > overall computation time? > > > > Regards, > > Varun > > > > On Wed, Jan 12, 2022 at 7:58 AM Zhang, Hong wrote: > > PCFactorSetMatSolverType(pc,MATSOLVERMUMPS); > > PCFactorSetUpMatSolverType(pc); > > PCFactorGetMatrix(pc,&F); > > > > MatLUFactorSymbolic(F,A,...) > > You must provide row and column permutations etc, > petsc/src/mat/tests/ex125.c may give you a clue on how to get these inputs. > > > > Hong > > > > > > From: petsc-users on behalf of > Junchao Zhang > > Sent: Wednesday, January 12, 2022 9:03 AM > > To: Varun Hiremath > > Cc: Peder J?rgensgaard Olesen via petsc-users > > Subject: Re: [petsc-users] PETSc MUMPS interface > > > > Calling PCSetUp() before KSPSetUp()? > > > > --Junchao Zhang > > > > > > On Wed, Jan 12, 2022 at 3:00 AM Varun Hiremath > wrote: > > Hi All, > > > > I want to collect MUMPS memory estimates based on the initial symbolic > factorization analysis before the actual numerical factorization starts to > check if the estimated memory requirements fit the available memory. 
> > > > I am following the steps from > https://petsc.org/main/src/ksp/ksp/tutorials/ex52.c.html > > > > PCFactorSetMatSolverType(pc,MATSOLVERMUMPS); > > PCFactorSetUpMatSolverType(pc); > > PCFactorGetMatrix(pc,&F); > > > > KSPSetUp(ksp); > > MatMumpsGetInfog(F,...) > > > > But it appears KSPSetUp calls both symbolic and numerical factorization. > So is there some other way to get these statistics before the actual > factorization starts? > > > > Thanks, > > Varun > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefano.zampini at gmail.com Sat Jan 22 10:30:53 2022 From: stefano.zampini at gmail.com (Stefano Zampini) Date: Sat, 22 Jan 2022 19:30:53 +0300 Subject: [petsc-users] hypre / hip usage In-Reply-To: References: <87r1916ujq.fsf@jedbrown.org> <87lez96tok.fsf@jedbrown.org> <87fsph6p82.fsf@jedbrown.org> Message-ID: Mark the two options are only there to test the code in CI, and are not needed in general '--download-hypre-configure-arguments=--enable-unified-memory', This is only here to test the unified memory code path '--with-hypre-gpuarch=gfx90a', This is not needed if rocminfo is in PATH Our interface code with HYPRE GPU works fine for HIP, it is tested in CI. The -mat_type hypre assembling for ex19 does not work because ex19 uses FDColoring. Just assemble in mpiaij format (look at runex19_hypre_hip in src/snes/tutorials/makefile); the interface code will copy the matrix to the GPU Il giorno ven 21 gen 2022 alle ore 19:24 Mark Adams ha scritto: > > > On Fri, Jan 21, 2022 at 11:14 AM Jed Brown wrote: > >> "Paul T. Bauman" writes: >> >> > On Fri, Jan 21, 2022 at 8:52 AM Paul T. Bauman >> wrote: >> >> Yes. The way HYPRE's memory model is setup is that ALL GPU allocations >> are >> >> "native" (i.e. [cuda,hip]Malloc) or, if unified memory is enabled, >> then ALL >> >> GPU allocations are unified memory (i.e. [cuda,hip]MallocManaged). >> >> Regarding HIP, there is an HMM implementation of hipMallocManaged >> planned, >> >> but is it not yet delivered AFAIK (and it will *not* support gfx906, >> e.g. >> >> RVII, FYI), so, today, under the covers, hipMallocManaged is calling >> >> hipHostMalloc. So, today, all your unified memory allocations in HYPRE >> on >> >> HIP are doing CPU-pinned memory accesses. And performance is just truly >> >> terrible (as you might expect). >> >> Thanks for this important bit of information. >> >> And it sounds like when we add support to hand off Kokkos matrices and >> vectors (our current support for matrices on ROCm devices uses Kokkos) or >> add direct support for hipSparse, we'll avoid touching host memory in >> assembly-to-solve with hypre. >> > > It does not look like anyone has made Hypre work with HIP. Stafano added a > runex19_hypre_hip target 4 months ago and hypre.py has some HIP things. > > I have a user that would like to try this, no hurry but, can I get an idea > of a plan for this? > > Thanks, > Mark > > -- Stefano -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Sun Jan 23 07:15:28 2022 From: mfadams at lbl.gov (Mark Adams) Date: Sun, 23 Jan 2022 08:15:28 -0500 Subject: [petsc-users] hypre / hip usage In-Reply-To: References: <87r1916ujq.fsf@jedbrown.org> <87lez96tok.fsf@jedbrown.org> <87fsph6p82.fsf@jedbrown.org> Message-ID: Thanks, '-mat_type hypre' was failing for me. I could not find a test that used it and I was not sure it was considered functional. I will look at it again and collect a bug report if needed. 
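If I read the advice below correctly, the thing to try is to keep the assembly in aij and only select the hypre preconditioner, e.g. something like

  srun -n1 ../ex13 -dm_refine 2 -dm_mat_type aij -dm_vec_type hip -pc_type hypre

(options illustrative, from my usual ex13 run), i.e. drop -dm_mat_type hypre and let the PCHYPRE interface copy the assembled matrix over to the GPU.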
On Sat, Jan 22, 2022 at 11:31 AM Stefano Zampini wrote: > Mark > > the two options are only there to test the code in CI, and are not needed > in general > > '--download-hypre-configure-arguments=--enable-unified-memory', > This is only here to test the unified memory code path > > '--with-hypre-gpuarch=gfx90a', > This is not needed if rocminfo is in PATH > > Our interface code with HYPRE GPU works fine for HIP, it is tested in CI. > The -mat_type hypre assembling for ex19 does not work because ex19 uses > FDColoring. Just assemble in mpiaij format (look at runex19_hypre_hip in > src/snes/tutorials/makefile); the interface code will copy the matrix to > the GPU > > Il giorno ven 21 gen 2022 alle ore 19:24 Mark Adams ha > scritto: > >> >> >> On Fri, Jan 21, 2022 at 11:14 AM Jed Brown wrote: >> >>> "Paul T. Bauman" writes: >>> >>> > On Fri, Jan 21, 2022 at 8:52 AM Paul T. Bauman >>> wrote: >>> >> Yes. The way HYPRE's memory model is setup is that ALL GPU >>> allocations are >>> >> "native" (i.e. [cuda,hip]Malloc) or, if unified memory is enabled, >>> then ALL >>> >> GPU allocations are unified memory (i.e. [cuda,hip]MallocManaged). >>> >> Regarding HIP, there is an HMM implementation of hipMallocManaged >>> planned, >>> >> but is it not yet delivered AFAIK (and it will *not* support gfx906, >>> e.g. >>> >> RVII, FYI), so, today, under the covers, hipMallocManaged is calling >>> >> hipHostMalloc. So, today, all your unified memory allocations in >>> HYPRE on >>> >> HIP are doing CPU-pinned memory accesses. And performance is just >>> truly >>> >> terrible (as you might expect). >>> >>> Thanks for this important bit of information. >>> >>> And it sounds like when we add support to hand off Kokkos matrices and >>> vectors (our current support for matrices on ROCm devices uses Kokkos) or >>> add direct support for hipSparse, we'll avoid touching host memory in >>> assembly-to-solve with hypre. >>> >> >> It does not look like anyone has made Hypre work with HIP. Stafano added >> a runex19_hypre_hip target 4 months ago and hypre.py has some HIP things. >> >> I have a user that would like to try this, no hurry but, can I get an >> idea of a plan for this? >> >> Thanks, >> Mark >> >> > > > -- > Stefano > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Sun Jan 23 08:55:18 2022 From: mfadams at lbl.gov (Mark Adams) Date: Sun, 23 Jan 2022 09:55:18 -0500 Subject: [petsc-users] hypre / hip usage In-Reply-To: References: <87r1916ujq.fsf@jedbrown.org> <87lez96tok.fsf@jedbrown.org> <87fsph6p82.fsf@jedbrown.org> Message-ID: Stefano and Matt, This segv looks like a Plexism. + srun -n1 -N1 --ntasks-per-gpu=1 --gpu-bind=closest ../ex13 -dm_plex_box_faces 2,2,2 -petscpartitioner_simple_process_grid 1,1,1 -dm_plex_box_upper 1,1,1 -petscpartitioner_simple_node_grid 1,1,1 -dm_refine 2 -dm_view -malloc_debug -log_trace -pc_type hypre -dm_vec_type hip -dm_mat_type hypre + tee out_001_kokkos_Crusher_2_8_hypre.txt [0] 1.293e-06 Event begin: DMPlexSymmetrize [0] 8.9463e-05 Event end: DMPlexSymmetrize ..... 
[0] 0.554529 Event end: VecHIPCopyFrom [0] 0.559891 Event begin: DMCreateInterp [0] 0.560603 Event begin: DMPlexInterpFE [0] 0.566707 Event begin: MatAssemblyBegin [0] 0.566962 Event begin: BuildTwoSidedF [0] 0.567068 Event begin: BuildTwoSided [0] 0.567119 Event end: BuildTwoSided [0] 0.567154 Event end: BuildTwoSidedF [0] 0.567162 Event end: MatAssemblyBegin [0] 0.567164 Event begin: MatAssemblyEnd [0] 0.567356 Event end: MatAssemblyEnd [0] 0.572884 Event begin: MatAssemblyBegin [0] 0.57289 Event end: MatAssemblyBegin [0] 0.572892 Event begin: MatAssemblyEnd [0] 0.573978 Event end: MatAssemblyEnd [0] 0.574428 Event begin: MatZeroEntries [0] 0.579998 Event end: MatZeroEntries :0:rocdevice.cpp :2589: 257935591316 us: Device::callbackQueue aborting with error : HSA_STATUS_ERROR_MEMORY_FAULT: Agent attempted to access an inaccessible address. code: 0x2b srun: error: crusher001: task 0: Aborted srun: launch/slurm: _step_signal: Terminating StepId=65929.4 + date Sun 23 Jan 2022 09:46:55 AM EST On Sun, Jan 23, 2022 at 8:15 AM Mark Adams wrote: > Thanks, > '-mat_type hypre' was failing for me. I could not find a test that used it > and I was not sure it was considered functional. > I will look at it again and collect a bug report if needed. > > On Sat, Jan 22, 2022 at 11:31 AM Stefano Zampini < > stefano.zampini at gmail.com> wrote: > >> Mark >> >> the two options are only there to test the code in CI, and are not needed >> in general >> >> '--download-hypre-configure-arguments=--enable-unified-memory', >> This is only here to test the unified memory code path >> >> '--with-hypre-gpuarch=gfx90a', >> This is not needed if rocminfo is in PATH >> >> Our interface code with HYPRE GPU works fine for HIP, it is tested in CI. >> The -mat_type hypre assembling for ex19 does not work because ex19 uses >> FDColoring. Just assemble in mpiaij format (look at runex19_hypre_hip in >> src/snes/tutorials/makefile); the interface code will copy the matrix to >> the GPU >> >> Il giorno ven 21 gen 2022 alle ore 19:24 Mark Adams ha >> scritto: >> >>> >>> >>> On Fri, Jan 21, 2022 at 11:14 AM Jed Brown wrote: >>> >>>> "Paul T. Bauman" writes: >>>> >>>> > On Fri, Jan 21, 2022 at 8:52 AM Paul T. Bauman >>>> wrote: >>>> >> Yes. The way HYPRE's memory model is setup is that ALL GPU >>>> allocations are >>>> >> "native" (i.e. [cuda,hip]Malloc) or, if unified memory is enabled, >>>> then ALL >>>> >> GPU allocations are unified memory (i.e. [cuda,hip]MallocManaged). >>>> >> Regarding HIP, there is an HMM implementation of hipMallocManaged >>>> planned, >>>> >> but is it not yet delivered AFAIK (and it will *not* support gfx906, >>>> e.g. >>>> >> RVII, FYI), so, today, under the covers, hipMallocManaged is calling >>>> >> hipHostMalloc. So, today, all your unified memory allocations in >>>> HYPRE on >>>> >> HIP are doing CPU-pinned memory accesses. And performance is just >>>> truly >>>> >> terrible (as you might expect). >>>> >>>> Thanks for this important bit of information. >>>> >>>> And it sounds like when we add support to hand off Kokkos matrices and >>>> vectors (our current support for matrices on ROCm devices uses Kokkos) or >>>> add direct support for hipSparse, we'll avoid touching host memory in >>>> assembly-to-solve with hypre. >>>> >>> >>> It does not look like anyone has made Hypre work with HIP. Stafano added >>> a runex19_hypre_hip target 4 months ago and hypre.py has some HIP things. >>> >>> I have a user that would like to try this, no hurry but, can I get an >>> idea of a plan for this? 
>>> >>> Thanks, >>> Mark >>> >>> >> >> >> -- >> Stefano >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dong-hao at outlook.com Sun Jan 23 21:29:58 2022 From: dong-hao at outlook.com (Hao DONG) Date: Mon, 24 Jan 2022 03:29:58 +0000 Subject: [petsc-users] Strange CUDA failure with a second petscfinalize with PETSc 3.16 In-Reply-To: References: <45585074-8C0E-453B-993B-554D23A1E971@gmail.com> <60E1BD31-E98A-462F-BA2B-2099745B2582@gmail.com> Message-ID: Dear Jacob, Any luck reproducing the CUDA problem? - just write to check in, in case somehow the response did not reach me (this happens to my colleagues abroad sometimes, probably due to the Wall). All the best, Hao On Jan 19, 2022, at 3:01 PM, Hao DONG wrote: ? Thanks Jacob for looking into this ? You can see the updated source code of ex11fc in the attachment ? although there is not much that I modified (except for the jabbers I outputted). I also attached the full output (ex11fc.log) along with the configure.log file. It?s an old dual Xeon workstation (one of my ?production? machines) with Linux kernel 5.4.0 and gcc 9.3. I simply ran the code with mpiexec -np 2 ex11fc -usecuda for GPU test. And as stated before, calling without the ?-usecuda? option shows no errors. Please let me know if you find anything wrong with the configure/code. Cheers, Hao From: Jacob Faibussowitsch Sent: Wednesday, January 19, 2022 3:38 AM To: Hao DONG Cc: Junchao Zhang; petsc-users Subject: Re: [petsc-users] Strange CUDA failure with a second petscfinalize with PETSc 3.16 Apologies, forgot to mention in my previous email but can you also include a copy of the full printout of the error message that you get? It will include all the command-line flags that you ran with (if any) so I can exactly mirror your environment. Best regards, Jacob Faibussowitsch (Jacob Fai - booss - oh - vitch) On Jan 18, 2022, at 14:06, Jacob Faibussowitsch > wrote: Can you send your updated source file as well as your configure.log (should be $PETSC_DIR/configure.log). I will see if I can reproduce the error on my end. Best regards, Jacob Faibussowitsch (Jacob Fai - booss - oh - vitch) On Jan 17, 2022, at 23:06, Hao DONG > wrote: ? Dear Junchao and Jacob, Thanks a lot for the response ? I also don?t understand why this is related to the device, especially on why the procedure can be successfully finished for *once* ? As instructed, I tried to add a CHKERRA() macro after (almost) every petsc line ? such as the initialization, mat assemble, ksp create, solve, mat destroy, etc. However, all other petsc commands returns with error code 0. It only gives me a similar (still not very informative) error after I call the petscfinalize (again for the second time), with error code 97: [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: GPU error [0]PETSC ERROR: cuda error 709 (cudaErrorContextIsDestroyed) : context is destroyed [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. 
[0]PETSC ERROR: Petsc Release Version 3.16.3, unknown [0]PETSC ERROR: ./ex11f on a named stratosphere by donghao Tue Jan 18 11:39:43 2022 [0]PETSC ERROR: Configure options --prefix=/opt/petsc/complex-double-with-cuda --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 COPTFLAGS="-O3 -mavx2" CXXOPTFLAGS="-O3 -mavx2" FOPTFLAGS="-O3 -ffree-line-length-none -mavx2" CUDAOPTFLAGS=-O3 --with-cxx-dialect=cxx14 --with-cuda-dialect=cxx14 --with-scalar-type=complex --with-precision=double --with-cuda-dir=/usr/local/cuda --with-debugging=1 [0]PETSC ERROR: #1 PetscFinalize() at /home/donghao/packages/petsc-current/src/sys/objects/pinit.c:1638 [0]PETSC ERROR: #2 User provided function() at User file:0 I can also confirm that rolling back to petsc 3.15 will *not* see the problem, even with the new nvidia driver. And petsc 3.16.3 with an old nvidia driver (470.42) also get this same error. So it?s probably not connected to the nvidia driver. Any idea on where I should look at next? Thanks a lot in advance, and all the best, Hao From: Jacob Faibussowitsch Sent: Sunday, January 16, 2022 12:12 AM To: Junchao Zhang Cc: petsc-users; Hao DONG Subject: Re: [petsc-users] Strange CUDA failure with a second petscfinalize with PETSc 3.16 I don?t quite understand how it is getting to the CUDA error to be honest. None of the code in the stack trace is anywhere near the device code. Reading the error message carefully, it first chokes on PetscLogGetStageLog() from a call to PetscClassIdRegister(): PetscErrorCode PetscLogGetStageLog(PetscStageLog *stageLog) { PetscFunctionBegin; PetscValidPointer(stageLog,1); if (!petsc_stageLog) { fprintf(stderr, "PETSC ERROR: Logging has not been enabled.\nYou might have forgotten to call PetscInitialize().\n"); PETSCABORT(MPI_COMM_WORLD, PETSC_ERR_SUP); // Here } ... But then jumps to PetscFinalize(). You can also see the "You might have forgotten to call PetscInitialize().? message in the error message, just under the topmost level of the stack trace. Can you check the value of ierr of each function call (use the CHKERRA() macro to do so)? I suspect the problem here that errors occurring previously in the program are being ignored, leading to the garbled stack trace. Best regards, Jacob Faibussowitsch (Jacob Fai - booss - oh - vitch) On Jan 14, 2022, at 20:58, Junchao Zhang > wrote: Jacob, Could you have a look as it seems the "invalid device context" is in your newly added module? Thanks --Junchao Zhang On Fri, Jan 14, 2022 at 12:49 AM Hao DONG > wrote: Dear All, I have encountered a peculiar problem when fiddling with a code with PETSC 3.16.3 (which worked fine with PETSc 3.15). It is a very straight forward PDE-based optimization code which repeatedly solves a linearized PDE problem with KSP in a subroutine (the rest of the code does not contain any PETSc related content). The main program provides the subroutine with an MPI comm. Then I set the comm as PETSC_COMM_WORLD to tell PETSC to attach to it (and detach with it when the solving is finished each time). Strangely, I observe a CUDA failure whenever the petscfinalize is called for a *second* time. In other words, the first and second PDE calculations with GPU are fine (with correct solutions). The petsc code just fails after the SECOND petscfinalize command is called. 
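Schematically the driver does the following (a C-style sketch of the Fortran logic, simplified; user_comm stands for the communicator handed in by the main program):

for (i = 0; i < 3; i++) {
  PETSC_COMM_WORLD = user_comm;                      /* attach PETSc to the comm from the main program */
  ierr = PetscInitialize(&argc,&argv,NULL,NULL);if (ierr) return ierr;
  /* assemble the MATMPIAIJCUSPARSE system, KSPSolve, destroy all PETSc objects */
  ierr = PetscFinalize();if (ierr) return ierr;      /* the second pass dies in here */
}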
You can also see the PETSC config in the error message: [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [1]PETSC ERROR: GPU error [1]PETSC ERROR: cuda error 201 (cudaErrorDeviceUninitialized) : invalid device context [1]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. [1]PETSC ERROR: Petsc Release Version 3.16.3, unknown [1]PETSC ERROR: maxwell.gpu on a named stratosphere by hao Fri Jan 14 10:21:05 2022 [1]PETSC ERROR: Configure options --prefix=/opt/petsc/complex-double-with-cuda --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 COPTFLAGS="-O3 -mavx2" CXXOPTFLAGS="-O3 -mavx2" FOPTFLAGS="-O3 -ffree-line-length-none -mavx2" CUDAOPTFLAGS=-O3 --with-cxx-dialect=cxx14 --with-cuda-dialect=cxx14 --with-scalar-type=complex --with-precision=double --with-cuda-dir=/usr/local/cuda --with-debugging=1 [1]PETSC ERROR: #1 PetscFinalize() at /home/hao/packages/petsc-current/src/sys/objects/pinit.c:1638 You might have forgotten to call PetscInitialize(). The EXACT line numbers in the error traceback are not available. Instead the line number of the start of the function is given. [1] #1 PetscAbortFindSourceFile_Private() at /home/hao/packages/petsc-current/src/sys/error/err.c:35 [1] #2 PetscLogGetStageLog() at /home/hao/packages/petsc-current/src/sys/logging/utils/stagelog.c:29 [1] #3 PetscClassIdRegister() at /home/hao/packages/petsc-current/src/sys/logging/plog.c:2376 [1] #4 MatMFFDInitializePackage() at /home/hao/packages/petsc-current/src/mat/impls/mffd/mffd.c:45 [1] #5 MatInitializePackage() at /home/hao/packages/petsc-current/src/mat/interface/dlregismat.c:163 [1] #6 MatCreate() at /home/hao/packages/petsc-current/src/mat/utils/gcreate.c:77 However, it doesn?t seem to affect the other part of my code, so the code can continue running until it gets to the petsc part again (the *third* time). Unfortunately, it doesn?t give me any further information even if I set the debugging to yes in the configure file. It also worth noting that PETSC without CUDA (i.e. with simple MATMPIAIJ) works perfectly fine. I am able to re-produce the problem with a toy code modified from ex11f. Please see the attached file (ex11fc.F90) for details. Essentially the code does the same thing as ex11f, but three times with a do loop. To do that I added an extra MPI_INIT/MPI_FINALIZE to ensure that the MPI communicator is not destroyed when PETSC_FINALIZE is called. I used the PetscOptionsHasName utility to check if you have ?-usecuda? in the options. So running the code with and without that option can give you a comparison w/o CUDA. I can see that the code also fails after the second loop of the KSP operation. Could you kindly shed some lights on this problem? I should say that I am not even sure if the problem is from PETSc, as I also accidentally updated the NVIDIA driver (for now it is 510.06 with cuda 11.6). And it is well known that NVIDIA can give you some surprise in the updates (yes, I know I shouldn?t have touched that if it?s not broken). But my CUDA code without PETSC (which basically does the same PDE thing, but with cusparse/cublas directly) seems to work just fine after the update. It is also possible that my petsc code related to CUDA was not quite ?legitimate? ? I just use: MatSetType(A, MATMPIAIJCUSPARSE, ierr) and MatCreateVecs(A, u, PETSC_NULL_VEC, ierr) to make the data onto GPU. I would very much appreciate it if you could show me the ?right? way to do that. 
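To be concrete, the pattern is essentially this (written as C for brevity, my Fortran calls mirror it; n is just a placeholder for the matrix size):

ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr);
ierr = MatSetSizes(A,PETSC_DECIDE,PETSC_DECIDE,n,n);CHKERRQ(ierr);
ierr = MatSetType(A,MATMPIAIJCUSPARSE);CHKERRQ(ierr);   /* or MatSetFromOptions() plus -mat_type aijcusparse */
ierr = MatSetUp(A);CHKERRQ(ierr);
/* MatSetValues() / MatAssemblyBegin() / MatAssemblyEnd() as usual */
ierr = MatCreateVecs(A,&u,NULL);CHKERRQ(ierr);          /* the vector picks up the CUDA-capable type from A */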
Thanks a lot in advance, and all the best, Hao -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: application/octet-stream Size: 1006969 bytes Desc: configure.log URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ex11fc.F90 Type: application/octet-stream Size: 11039 bytes Desc: ex11fc.F90 URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ex11fc.log Type: application/octet-stream Size: 6844 bytes Desc: ex11fc.log URL: From mbuerkle at web.de Mon Jan 24 03:33:47 2022 From: mbuerkle at web.de (Marius Buerkle) Date: Mon, 24 Jan 2022 10:33:47 +0100 Subject: [petsc-users] MatPreallocatorPreallocate segfault with PETSC 3.16 Message-ID: Hi, ? I try to use?MatPreallocatorPreallocate to allocate a MATMPIAIJ matrix A . I define the MATPREALLOCATOR preM with MatSetValues and then call MatPreallocatorPreallocate to get A. This works on the first call to MatPreallocatorPreallocate, but if I call MatPreallocatorPreallocate again with the same preM to get another matrix B then I get a segfault, although the program continues to run (see below). It worked with PETSC 3.15 but with 3.16 I stopped working. When I check mat_info_nz_allocated and mat_info_nz_used for the allocated matrix it looks correct for the first call, but on the second call mat_info_nz_used is 0. I also attached a minimal example. [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Null argument, when expecting valid pointer [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Null Pointer: Parameter # 1 [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. [0]PETSC ERROR: [1]PETSC ERROR: Null argument, when expecting valid pointer [1]PETSC ERROR: Petsc Development GIT revision: v3.16.3-686-g5e81a90 GIT Date: 2022-01-23 05:13:26 +0000 [0]PETSC ERROR: ./prem_test on a named cd001 by cdfmat_marius Mon Jan 24 18:21:17 2022 [0]PETSC ERROR: Null Pointer: Parameter # 1 [1]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. 
[1]PETSC ERROR: Configure options --prefix=/home/cdfmat_marius/prog/petsc/petsc_main_dbg --with-scalar-type=complex --with-fortran-kernels=1 --with-64-bit-indices=0 --CC=mpiicc --COPTFLAGS="-g -traceback" --CXX=mpiicpc --CXXOPTFLAGS="-g -traceback" --FC=mpiifort --FOPTFLAGS="-g -traceback" --with-mpi=1 --with-x=0 --with-cuda=0 --download-parmetis=/home/cdfmat_marius/prog/petsc/git/petsc_main/externalpackages/git.parmetis.tar.gz --download-parmetis-commit=HEAD --download-metis=/home/cdfmat_marius/prog/petsc/git/petsc_main/externalpackages/git.metis.tar.gz --download-metis-commit=HEAD --download-slepc=/home/cdfmat_marius/prog/petsc/git/petsc_main/externalpackages/git.slepc_main.tar.gz --download-slepc-commit=HEAD --download-superlu_dist=/home/cdfmat_marius/prog/petsc/git/petsc_main/externalpackages/git.superlu_dist.tar.gz --download-superlu_dist-commit=HEAD --download-mumps=/home/cdfmat_marius/prog/petsc/git/petsc_main/externalpackages/git.mumps.tar.gz --download-mumps-commit=HEAD --download-hypre=/home/cdfmat_marius/prog/petsc/git/petsc_main/externalpackages/git.hypre.tar.gz --download-hypre-commit=HEAD --download-hwloc=/home/cdfmat_marius/prog/petsc/git/petsc_main/externalpackages/hwloc-2.5.0.tar.gz --download-sowing=/home/cdfmat_marius/prog/petsc/git/petsc_main/externalpackages/git.sowing.tar.gz --download-elemental=/home/cdfmat_marius/prog/petsc/git/petsc_main/externalpackages/git.elemental.tar.gz --download-elemental-commit=HEAD --download-make=/home/cdfmat_marius/prog/petsc/git/petsc_main/externalpackages/make-4.2.1-6.fc28.tar.gz --download-ptscotch=/home/cdfmat_marius/prog/petsc/git/petsc_main/externalpackages/git.ptscotch.tar.gz --download-ptscotch-commit=HEAD --with-openmp=0 --with-pthread=0 --with-cxx-dialect=c++11 --with-debugging=1 --with-cuda=0 --with-cudac=0 --with-valgrind=0 --with-blaslapack-lib="-mkl=sequential -lmkl_scalapack_lp64 -lmkl_blacs_intelmpi_lp64 -lpthread -lm -ldl" --with-scalapack-lib="-mkl=sequential -lmkl_scalapack_lp64 -lmkl_blacs_intelmpi_lp64 -lpthread -lm -ldl" --with-mkl_pardiso-dir=/home/appli/intel/compilers_and_libraries_2020.4.304/linux/mkl --with-mkl_cpardiso-dir=/home/appli/intel/compilers_and_libraries_2020.4.304/linux/mkl --with-mkl_sparse-dir=/home/appli/intel/compilers_and_libraries_2020.4.304/linux/mkl --with-mkl_sparse_optimize-dir=/home/appli/intel/compilers_and_libraries_2020.4.304/linux/mkl [0]PETSC ERROR: Petsc Development GIT revision: v3.16.3-686-g5e81a90 GIT Date: 2022-01-23 05:13:26 +0000 [1]PETSC ERROR: ./prem_test on a named cd001 by cdfmat_marius Mon Jan 24 18:21:17 2022 [1]PETSC ERROR: #1 PetscHSetIJGetSize() at /home/cdfmat_marius/prog/petsc/git/petsc_main/include/petsc/private/hashsetij.h:13 [0]PETSC ERROR: #2 MatPreallocatorPreallocate_Preallocator() at /home/cdfmat_marius/prog/petsc/git/petsc_main/src/mat/impls/preallocator/matpreallocator.c:165 [0]PETSC ERROR: #3 MatPreallocatorPreallocate() at /home/cdfmat_marius/prog/petsc/git/petsc_main/src/mat/impls/preallocator/matpreallocator.c:234 Configure options --prefix=/home/cdfmat_marius/prog/petsc/petsc_main_dbg --with-scalar-type=complex --with-fortran-kernels=1 --with-64-bit-indices=0 --CC=mpiicc --COPTFLAGS="-g -traceback" --CXX=mpiicpc --CXXOPTFLAGS="-g -traceback" --FC=mpiifort --FOPTFLAGS="-g -traceback" --with-mpi=1 --with-x=0 --with-cuda=0 --download-parmetis=/home/cdfmat_marius/prog/petsc/git/petsc_main/externalpackages/git.parmetis.tar.gz --download-parmetis-commit=HEAD --download-metis=/home/cdfmat_marius/prog/petsc/git/petsc_main/externalpackages/git.metis.tar.gz 
--download-metis-commit=HEAD --download-slepc=/home/cdfmat_marius/prog/petsc/git/petsc_main/externalpackages/git.slepc_main.tar.gz --download-slepc-commit=HEAD --download-superlu_dist=/home/cdfmat_marius/prog/petsc/git/petsc_main/externalpackages/git.superlu_dist.tar.gz --download-superlu_dist-commit=HEAD --download-mumps=/home/cdfmat_marius/prog/petsc/git/petsc_main/externalpackages/git.mumps.tar.gz --download-mumps-commit=HEAD --download-hypre=/home/cdfmat_marius/prog/petsc/git/petsc_main/externalpackages/git.hypre.tar.gz --download-hypre-commit=HEAD --download-hwloc=/home/cdfmat_marius/prog/petsc/git/petsc_main/externalpackages/hwloc-2.5.0.tar.gz --download-sowing=/home/cdfmat_marius/prog/petsc/git/petsc_main/externalpackages/git.sowing.tar.gz --download-elemental=/home/cdfmat_marius/prog/petsc/git/petsc_main/externalpackages/git.elemental.tar.gz --download-elemental-commit=HEAD --download-make=/home/cdfmat_marius/prog/petsc/git/petsc_main/externalpackages/make-4.2.1-6.fc28.tar.gz --download-ptscotch=/home/cdfmat_marius/prog/petsc/git/petsc_main/externalpackages/git.ptscotch.tar.gz --download-ptscotch-commit=HEAD --with-openmp=0 --with-pthread=0 --with-cxx-dialect=c++11 --with-debugging=1 --with-cuda=0 --with-cudac=0 --with-valgrind=0 --with-blaslapack-lib="-mkl=sequential -lmkl_scalapack_lp64 -lmkl_blacs_intelmpi_lp64 -lpthread -lm -ldl" --with-scalapack-lib="-mkl=sequential -lmkl_scalapack_lp64 -lmkl_blacs_intelmpi_lp64 -lpthread -lm -ldl" --with-mkl_pardiso-dir=/home/appli/intel/compilers_and_libraries_2020.4.304/linux/mkl --with-mkl_cpardiso-dir=/home/appli/intel/compilers_and_libraries_2020.4.304/linux/mkl --with-mkl_sparse-dir=/home/appli/intel/compilers_and_libraries_2020.4.304/linux/mkl --with-mkl_sparse_optimize-dir=/home/appli/intel/compilers_and_libraries_2020.4.304/linux/mkl [1]PETSC ERROR: #1 PetscHSetIJGetSize() at /home/cdfmat_marius/prog/petsc/git/petsc_main/include/petsc/private/hashsetij.h:13 [1]PETSC ERROR: #2 MatPreallocatorPreallocate_Preallocator() at /home/cdfmat_marius/prog/petsc/git/petsc_main/src/mat/impls/preallocator/matpreallocator.c:165 [1]PETSC ERROR: #3 MatPreallocatorPreallocate() at /home/cdfmat_marius/prog/petsc/git/petsc_main/src/mat/impls/preallocator/matpreallocator.c:234 Best and Thanks, Marius -------------- next part -------------- A non-text attachment was scrubbed... Name: prem_test.tar.gz Type: application/octet-stream Size: 2373 bytes Desc: not available URL: From mfadams at lbl.gov Mon Jan 24 08:24:07 2022 From: mfadams at lbl.gov (Mark Adams) Date: Mon, 24 Jan 2022 09:24:07 -0500 Subject: [petsc-users] hypre / hip usage In-Reply-To: References: <87r1916ujq.fsf@jedbrown.org> <87lez96tok.fsf@jedbrown.org> <87fsph6p82.fsf@jedbrown.org> Message-ID: What is the fastest way to rebuild hypre? reconfiguring did not work and is slow. I am printf debugging to find this HSA_STATUS_ERROR_MEMORY_FAULT (no debuggers other than valgrind on Crusher??!?!) and I get to a hypre call: PetscStackCallStandard(HYPRE_IJMatrixAddToValues,(hA->ij,1,&hnc,(HYPRE_BigInt*)(rows+i),(HYPRE_BigInt*)cscr[0],sscr)); This is from DMPlexComputeJacobian_Internal and MatSetClosure. HYPRE_IJMatrixAddToValues is successfully called in earlier parts of the run. The args look OK, so I am going into HYPRE_IJMatrixAddToValues. Thanks, Mark On Sun, Jan 23, 2022 at 9:55 AM Mark Adams wrote: > Stefano and Matt, This segv looks like a Plexism. 
> > + srun -n1 -N1 --ntasks-per-gpu=1 --gpu-bind=closest ../ex13 > -dm_plex_box_faces 2,2,2 -petscpartitioner_simple_process_grid > 1,1,1 -dm_plex_box_upper 1,1,1 -petscpartitioner_simple_node_grid 1,1,1 > -dm_refine 2 -dm_view -malloc_debug -log_trace -pc_type hypre -dm_vec_type > hip -dm_mat_type hypre > + tee out_001_kokkos_Crusher_2_8_hypre.txt > [0] 1.293e-06 Event begin: DMPlexSymmetrize > [0] 8.9463e-05 Event end: DMPlexSymmetrize > ..... > [0] 0.554529 Event end: VecHIPCopyFrom > [0] 0.559891 Event begin: DMCreateInterp > [0] 0.560603 Event begin: DMPlexInterpFE > [0] 0.566707 Event begin: MatAssemblyBegin > [0] 0.566962 Event begin: BuildTwoSidedF > [0] 0.567068 Event begin: BuildTwoSided > [0] 0.567119 Event end: BuildTwoSided > [0] 0.567154 Event end: BuildTwoSidedF > [0] 0.567162 Event end: MatAssemblyBegin > [0] 0.567164 Event begin: MatAssemblyEnd > [0] 0.567356 Event end: MatAssemblyEnd > [0] 0.572884 Event begin: MatAssemblyBegin > [0] 0.57289 Event end: MatAssemblyBegin > [0] 0.572892 Event begin: MatAssemblyEnd > [0] 0.573978 Event end: MatAssemblyEnd > [0] 0.574428 Event begin: MatZeroEntries > [0] 0.579998 Event end: MatZeroEntries > :0:rocdevice.cpp :2589: 257935591316 us: Device::callbackQueue > aborting with error : HSA_STATUS_ERROR_MEMORY_FAULT: Agent attempted to > access an inaccessible address. code: 0x2b > srun: error: crusher001: task 0: Aborted > srun: launch/slurm: _step_signal: Terminating StepId=65929.4 > + date > Sun 23 Jan 2022 09:46:55 AM EST > > On Sun, Jan 23, 2022 at 8:15 AM Mark Adams wrote: > >> Thanks, >> '-mat_type hypre' was failing for me. I could not find a test that used >> it and I was not sure it was considered functional. >> I will look at it again and collect a bug report if needed. >> >> On Sat, Jan 22, 2022 at 11:31 AM Stefano Zampini < >> stefano.zampini at gmail.com> wrote: >> >>> Mark >>> >>> the two options are only there to test the code in CI, and are not >>> needed in general >>> >>> '--download-hypre-configure-arguments=--enable-unified-memory', >>> This is only here to test the unified memory code path >>> >>> '--with-hypre-gpuarch=gfx90a', >>> This is not needed if rocminfo is in PATH >>> >>> Our interface code with HYPRE GPU works fine for HIP, it is tested in CI. >>> The -mat_type hypre assembling for ex19 does not work because ex19 uses >>> FDColoring. Just assemble in mpiaij format (look at runex19_hypre_hip in >>> src/snes/tutorials/makefile); the interface code will copy the matrix to >>> the GPU >>> >>> Il giorno ven 21 gen 2022 alle ore 19:24 Mark Adams >>> ha scritto: >>> >>>> >>>> >>>> On Fri, Jan 21, 2022 at 11:14 AM Jed Brown wrote: >>>> >>>>> "Paul T. Bauman" writes: >>>>> >>>>> > On Fri, Jan 21, 2022 at 8:52 AM Paul T. Bauman >>>>> wrote: >>>>> >> Yes. The way HYPRE's memory model is setup is that ALL GPU >>>>> allocations are >>>>> >> "native" (i.e. [cuda,hip]Malloc) or, if unified memory is enabled, >>>>> then ALL >>>>> >> GPU allocations are unified memory (i.e. [cuda,hip]MallocManaged). >>>>> >> Regarding HIP, there is an HMM implementation of hipMallocManaged >>>>> planned, >>>>> >> but is it not yet delivered AFAIK (and it will *not* support >>>>> gfx906, e.g. >>>>> >> RVII, FYI), so, today, under the covers, hipMallocManaged is calling >>>>> >> hipHostMalloc. So, today, all your unified memory allocations in >>>>> HYPRE on >>>>> >> HIP are doing CPU-pinned memory accesses. And performance is just >>>>> truly >>>>> >> terrible (as you might expect). >>>>> >>>>> Thanks for this important bit of information. 
>>>>> >>>>> And it sounds like when we add support to hand off Kokkos matrices and >>>>> vectors (our current support for matrices on ROCm devices uses Kokkos) or >>>>> add direct support for hipSparse, we'll avoid touching host memory in >>>>> assembly-to-solve with hypre. >>>>> >>>> >>>> It does not look like anyone has made Hypre work with HIP. Stafano >>>> added a runex19_hypre_hip target 4 months ago and hypre.py has some HIP >>>> things. >>>> >>>> I have a user that would like to try this, no hurry but, can I get an >>>> idea of a plan for this? >>>> >>>> Thanks, >>>> Mark >>>> >>>> >>> >>> >>> -- >>> Stefano >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Jan 24 08:52:58 2022 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 24 Jan 2022 09:52:58 -0500 Subject: [petsc-users] hypre / hip usage In-Reply-To: References: <87r1916ujq.fsf@jedbrown.org> <87lez96tok.fsf@jedbrown.org> <87fsph6p82.fsf@jedbrown.org> Message-ID: On Mon, Jan 24, 2022 at 9:24 AM Mark Adams wrote: > What is the fastest way to rebuild hypre? reconfiguring did not work and > is slow. > > I am printf debugging to find this HSA_STATUS_ERROR_MEMORY_FAULT (no > debuggers other than valgrind on Crusher??!?!) and I get to a hypre call: > > > PetscStackCallStandard(HYPRE_IJMatrixAddToValues,(hA->ij,1,&hnc,(HYPRE_BigInt*)(rows+i),(HYPRE_BigInt*)cscr[0],sscr)); > > This is from DMPlexComputeJacobian_Internal and MatSetClosure. > HYPRE_IJMatrixAddToValues is successfully called in earlier parts of the > run. > So MatSetClosure() just calls MatSetValues(). That should find any out of range index. I guess it is possible that the element matrix storage is invalid, but that is a hard thing to mess up. Hopefully, debugging shows the SEGV in Hypre. Thanks, Matt > The args look OK, so I am going into HYPRE_IJMatrixAddToValues. > > Thanks, > Mark > > > > On Sun, Jan 23, 2022 at 9:55 AM Mark Adams wrote: > >> Stefano and Matt, This segv looks like a Plexism. >> >> + srun -n1 -N1 --ntasks-per-gpu=1 --gpu-bind=closest ../ex13 >> -dm_plex_box_faces 2,2,2 -petscpartitioner_simple_process_grid >> 1,1,1 -dm_plex_box_upper 1,1,1 -petscpartitioner_simple_node_grid 1,1,1 >> -dm_refine 2 -dm_view -malloc_debug -log_trace -pc_type hypre -dm_vec_type >> hip -dm_mat_type hypre >> + tee out_001_kokkos_Crusher_2_8_hypre.txt >> [0] 1.293e-06 Event begin: DMPlexSymmetrize >> [0] 8.9463e-05 Event end: DMPlexSymmetrize >> ..... >> [0] 0.554529 Event end: VecHIPCopyFrom >> [0] 0.559891 Event begin: DMCreateInterp >> [0] 0.560603 Event begin: DMPlexInterpFE >> [0] 0.566707 Event begin: MatAssemblyBegin >> [0] 0.566962 Event begin: BuildTwoSidedF >> [0] 0.567068 Event begin: BuildTwoSided >> [0] 0.567119 Event end: BuildTwoSided >> [0] 0.567154 Event end: BuildTwoSidedF >> [0] 0.567162 Event end: MatAssemblyBegin >> [0] 0.567164 Event begin: MatAssemblyEnd >> [0] 0.567356 Event end: MatAssemblyEnd >> [0] 0.572884 Event begin: MatAssemblyBegin >> [0] 0.57289 Event end: MatAssemblyBegin >> [0] 0.572892 Event begin: MatAssemblyEnd >> [0] 0.573978 Event end: MatAssemblyEnd >> [0] 0.574428 Event begin: MatZeroEntries >> [0] 0.579998 Event end: MatZeroEntries >> :0:rocdevice.cpp :2589: 257935591316 us: Device::callbackQueue >> aborting with error : HSA_STATUS_ERROR_MEMORY_FAULT: Agent attempted to >> access an inaccessible address. 
code: 0x2b >> srun: error: crusher001: task 0: Aborted >> srun: launch/slurm: _step_signal: Terminating StepId=65929.4 >> + date >> Sun 23 Jan 2022 09:46:55 AM EST >> >> On Sun, Jan 23, 2022 at 8:15 AM Mark Adams wrote: >> >>> Thanks, >>> '-mat_type hypre' was failing for me. I could not find a test that used >>> it and I was not sure it was considered functional. >>> I will look at it again and collect a bug report if needed. >>> >>> On Sat, Jan 22, 2022 at 11:31 AM Stefano Zampini < >>> stefano.zampini at gmail.com> wrote: >>> >>>> Mark >>>> >>>> the two options are only there to test the code in CI, and are not >>>> needed in general >>>> >>>> '--download-hypre-configure-arguments=--enable-unified-memory', >>>> This is only here to test the unified memory code path >>>> >>>> '--with-hypre-gpuarch=gfx90a', >>>> This is not needed if rocminfo is in PATH >>>> >>>> Our interface code with HYPRE GPU works fine for HIP, it is tested in >>>> CI. >>>> The -mat_type hypre assembling for ex19 does not work because ex19 uses >>>> FDColoring. Just assemble in mpiaij format (look at runex19_hypre_hip in >>>> src/snes/tutorials/makefile); the interface code will copy the matrix to >>>> the GPU >>>> >>>> Il giorno ven 21 gen 2022 alle ore 19:24 Mark Adams >>>> ha scritto: >>>> >>>>> >>>>> >>>>> On Fri, Jan 21, 2022 at 11:14 AM Jed Brown wrote: >>>>> >>>>>> "Paul T. Bauman" writes: >>>>>> >>>>>> > On Fri, Jan 21, 2022 at 8:52 AM Paul T. Bauman >>>>>> wrote: >>>>>> >> Yes. The way HYPRE's memory model is setup is that ALL GPU >>>>>> allocations are >>>>>> >> "native" (i.e. [cuda,hip]Malloc) or, if unified memory is enabled, >>>>>> then ALL >>>>>> >> GPU allocations are unified memory (i.e. [cuda,hip]MallocManaged). >>>>>> >> Regarding HIP, there is an HMM implementation of hipMallocManaged >>>>>> planned, >>>>>> >> but is it not yet delivered AFAIK (and it will *not* support >>>>>> gfx906, e.g. >>>>>> >> RVII, FYI), so, today, under the covers, hipMallocManaged is >>>>>> calling >>>>>> >> hipHostMalloc. So, today, all your unified memory allocations in >>>>>> HYPRE on >>>>>> >> HIP are doing CPU-pinned memory accesses. And performance is just >>>>>> truly >>>>>> >> terrible (as you might expect). >>>>>> >>>>>> Thanks for this important bit of information. >>>>>> >>>>>> And it sounds like when we add support to hand off Kokkos matrices >>>>>> and vectors (our current support for matrices on ROCm devices uses Kokkos) >>>>>> or add direct support for hipSparse, we'll avoid touching host memory in >>>>>> assembly-to-solve with hypre. >>>>>> >>>>> >>>>> It does not look like anyone has made Hypre work with HIP. Stafano >>>>> added a runex19_hypre_hip target 4 months ago and hypre.py has some HIP >>>>> things. >>>>> >>>>> I have a user that would like to try this, no hurry but, can I get an >>>>> idea of a plan for this? >>>>> >>>>> Thanks, >>>>> Mark >>>>> >>>>> >>>> >>>> >>>> -- >>>> Stefano >>>> >>> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From ptbauman at gmail.com Mon Jan 24 09:16:22 2022 From: ptbauman at gmail.com (Paul T. 
Bauman) Date: Mon, 24 Jan 2022 09:16:22 -0600 Subject: [petsc-users] hypre / hip usage In-Reply-To: References: <87r1916ujq.fsf@jedbrown.org> <87lez96tok.fsf@jedbrown.org> <87fsph6p82.fsf@jedbrown.org> Message-ID: On Mon, Jan 24, 2022 at 8:53 AM Matthew Knepley wrote: > On Mon, Jan 24, 2022 at 9:24 AM Mark Adams wrote: > >> What is the fastest way to rebuild hypre? reconfiguring did not work and >> is slow. >> >> I am printf debugging to find this HSA_STATUS_ERROR_MEMORY_FAULT (no >> debuggers other than valgrind on Crusher??!?!) >> > Again, apologies for interjecting, but I wanted to offer a few pointers here. 1. `rocgdb` will be in your PATH when the `rocm` module is loaded. This is gdb, but with some extra AMDGPU goodies. AFAIK, you cannot, yet, do stepping through a kernel in the source (only the ISA), but you can query device variables in host code, print their values, etc. 1a. Note that multiple threads can be spawned by the HIP runtime. Furthermore, it's likely the thread you'll be on when you catch the error is (one of) the runtime thread(s). You'll need to do `info threads` and then select your host thread to get back to it. 2. To get an accurate stacktrace (meaning get the line in the host code where the error is actually happening), I recommend setting the following environment variables for debugging that will force the serialization of async memcopies and kernel launches: AMD_SERIALIZE_KERNEL = 3 AMD_SERIALIZE_COPY=3 Thanks, Paul -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Mon Jan 24 09:31:27 2022 From: mfadams at lbl.gov (Mark Adams) Date: Mon, 24 Jan 2022 10:31:27 -0500 Subject: [petsc-users] hypre / hip usage In-Reply-To: References: <87r1916ujq.fsf@jedbrown.org> <87lez96tok.fsf@jedbrown.org> <87fsph6p82.fsf@jedbrown.org> Message-ID: Thanks Paul, How do I get a stack trace? I have been relying on PETSc's which piggybacks on timers so it is not getting too deep here. On Mon, Jan 24, 2022 at 10:16 AM Paul T. Bauman wrote: > On Mon, Jan 24, 2022 at 8:53 AM Matthew Knepley wrote: > >> On Mon, Jan 24, 2022 at 9:24 AM Mark Adams wrote: >> >>> What is the fastest way to rebuild hypre? reconfiguring did not work and >>> is slow. >>> >>> I am printf debugging to find this HSA_STATUS_ERROR_MEMORY_FAULT (no >>> debuggers other than valgrind on Crusher??!?!) >>> >> > Again, apologies for interjecting, but I wanted to offer a few pointers > here. > > 1. `rocgdb` will be in your PATH when the `rocm` module is loaded. This is > gdb, but with some extra AMDGPU goodies. AFAIK, you cannot, yet, do > stepping through a kernel in the source (only the ISA), but you can query > device variables in host code, print their values, etc. > 1a. Note that multiple threads can be spawned by the HIP runtime. > Furthermore, it's likely the thread you'll be on when you catch the error > is (one of) the runtime thread(s). You'll need to do `info threads` and > then select your host thread to get back to it. > 2. To get an accurate stacktrace (meaning get the line in the host code > where the error is actually happening), I recommend setting the following > environment variables for debugging that will force the serialization of > async memcopies and kernel launches: > AMD_SERIALIZE_KERNEL = 3 > AMD_SERIALIZE_COPY=3 > > Thanks, > > Paul > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mfadams at lbl.gov Mon Jan 24 09:36:31 2022 From: mfadams at lbl.gov (Mark Adams) Date: Mon, 24 Jan 2022 10:36:31 -0500 Subject: [petsc-users] hypre / hip usage In-Reply-To: References: <87r1916ujq.fsf@jedbrown.org> <87lez96tok.fsf@jedbrown.org> <87fsph6p82.fsf@jedbrown.org> Message-ID: Sorry for the dumb question , but I added print statements in my hypre code and want to rebuild hypre to get these changes. How does one do that? 'make all' in the hypre directory does not do it. Thanks, Mark On Mon, Jan 24, 2022 at 10:31 AM Mark Adams wrote: > Thanks Paul, > > How do I get a stack trace? I have been relying on PETSc's > which piggybacks on timers so it is not getting too deep here. > > On Mon, Jan 24, 2022 at 10:16 AM Paul T. Bauman > wrote: > >> On Mon, Jan 24, 2022 at 8:53 AM Matthew Knepley >> wrote: >> >>> On Mon, Jan 24, 2022 at 9:24 AM Mark Adams wrote: >>> >>>> What is the fastest way to rebuild hypre? reconfiguring did not work >>>> and is slow. >>>> >>>> I am printf debugging to find this HSA_STATUS_ERROR_MEMORY_FAULT (no >>>> debuggers other than valgrind on Crusher??!?!) >>>> >>> >> Again, apologies for interjecting, but I wanted to offer a few pointers >> here. >> >> 1. `rocgdb` will be in your PATH when the `rocm` module is loaded. This >> is gdb, but with some extra AMDGPU goodies. AFAIK, you cannot, yet, do >> stepping through a kernel in the source (only the ISA), but you can query >> device variables in host code, print their values, etc. >> 1a. Note that multiple threads can be spawned by the HIP runtime. >> Furthermore, it's likely the thread you'll be on when you catch the error >> is (one of) the runtime thread(s). You'll need to do `info threads` and >> then select your host thread to get back to it. >> 2. To get an accurate stacktrace (meaning get the line in the host code >> where the error is actually happening), I recommend setting the following >> environment variables for debugging that will force the serialization of >> async memcopies and kernel launches: >> AMD_SERIALIZE_KERNEL = 3 >> AMD_SERIALIZE_COPY=3 >> >> Thanks, >> >> Paul >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ptbauman at gmail.com Mon Jan 24 09:43:45 2022 From: ptbauman at gmail.com (Paul T. Bauman) Date: Mon, 24 Jan 2022 09:43:45 -0600 Subject: [petsc-users] hypre / hip usage In-Reply-To: References: <87r1916ujq.fsf@jedbrown.org> <87lez96tok.fsf@jedbrown.org> <87fsph6p82.fsf@jedbrown.org> Message-ID: On Mon, Jan 24, 2022 at 9:31 AM Mark Adams wrote: > Thanks Paul, > > How do I get a stack trace? I have been relying on PETSc's > which piggybacks on timers so it is not getting too deep here. > I'm not sure what the "PETSc way" is, but I just run the executable through `rocgdb` as one would do with `gdb` (`rocgdb` is literally `gdb` built with extra AMD stuff (that stuff is either upstreamed or being upstreamed to gdb BTW)). You can do it in batch mode as well so you can dump the logs from each MPI process. > > On Mon, Jan 24, 2022 at 10:16 AM Paul T. Bauman > wrote: > >> On Mon, Jan 24, 2022 at 8:53 AM Matthew Knepley >> wrote: >> >>> On Mon, Jan 24, 2022 at 9:24 AM Mark Adams wrote: >>> >>>> What is the fastest way to rebuild hypre? reconfiguring did not work >>>> and is slow. >>>> >>>> I am printf debugging to find this HSA_STATUS_ERROR_MEMORY_FAULT (no >>>> debuggers other than valgrind on Crusher??!?!) >>>> >>> >> Again, apologies for interjecting, but I wanted to offer a few pointers >> here. >> >> 1. 
`rocgdb` will be in your PATH when the `rocm` module is loaded. This >> is gdb, but with some extra AMDGPU goodies. AFAIK, you cannot, yet, do >> stepping through a kernel in the source (only the ISA), but you can query >> device variables in host code, print their values, etc. >> 1a. Note that multiple threads can be spawned by the HIP runtime. >> Furthermore, it's likely the thread you'll be on when you catch the error >> is (one of) the runtime thread(s). You'll need to do `info threads` and >> then select your host thread to get back to it. >> 2. To get an accurate stacktrace (meaning get the line in the host code >> where the error is actually happening), I recommend setting the following >> environment variables for debugging that will force the serialization of >> async memcopies and kernel launches: >> AMD_SERIALIZE_KERNEL = 3 >> AMD_SERIALIZE_COPY=3 >> >> Thanks, >> >> Paul >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Mon Jan 24 09:53:03 2022 From: jed at jedbrown.org (Jed Brown) Date: Mon, 24 Jan 2022 08:53:03 -0700 Subject: [petsc-users] hypre / hip usage In-Reply-To: References: <87r1916ujq.fsf@jedbrown.org> <87lez96tok.fsf@jedbrown.org> <87fsph6p82.fsf@jedbrown.org> Message-ID: <87sftdceqo.fsf@jedbrown.org> "Paul T. Bauman" writes: > 1. `rocgdb` will be in your PATH when the `rocm` module is loaded. This is > gdb, but with some extra AMDGPU goodies. AFAIK, you cannot, yet, do > stepping through a kernel in the source (only the ISA), but you can query > device variables in host code, print their values, etc. > 1a. Note that multiple threads can be spawned by the HIP runtime. > Furthermore, it's likely the thread you'll be on when you catch the error > is (one of) the runtime thread(s). You'll need to do `info threads` and > then select your host thread to get back to it. > 2. To get an accurate stacktrace (meaning get the line in the host code > where the error is actually happening), I recommend setting the following > environment variables for debugging that will force the serialization of > async memcopies and kernel launches: > AMD_SERIALIZE_KERNEL = 3 > AMD_SERIALIZE_COPY=3 Is there a tutorial on this? I bet a 10-minute screencast demo would make a big impact in the use of these tools. AMD_SERIALIZE_COPY isn't documented at all and AMD_SERIALIZE_KERNEL isn't mentioned in this context. https://rocmdocs.amd.com/en/latest/search.html?q=amd_serialize_copy&check_keywords=yes&area=default From mfadams at lbl.gov Mon Jan 24 09:54:39 2022 From: mfadams at lbl.gov (Mark Adams) Date: Mon, 24 Jan 2022 10:54:39 -0500 Subject: [petsc-users] hypre / hip usage In-Reply-To: References: <87r1916ujq.fsf@jedbrown.org> <87lez96tok.fsf@jedbrown.org> <87fsph6p82.fsf@jedbrown.org> Message-ID: OK, you meant in gdb. rocgdb seems to be hung here. Do you see a problem? Thanks, + srun -n1 -N1 --ntasks-per-gpu=1 --gpu-bind=closest rocgdb --args ../ex13 -dm_plex_box_faces 2,2,2 -petscpartitioner_simple_process_grid 2,2,2 -dm_plex_box_upper 1,1,1 -petscpartitioner_simple_node_grid 1,1,1 -dm_refine 6 -dm_view -dm_mat_type aijkokkos -dm_vec_type kokkos -pc_type jacobi -log_view -ksp_view -use_gpu_aware_mpi true GNU gdb (rocm-rel-4.5-56) 11.1 Copyright (C) 2021 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. 
This GDB was configured as "x86_64-pc-linux-gnu". Type "show configuration" for configuration details. For bug reporting instructions, please see: . Find the GDB manual and other documentation resources online at: . For help, type "help". Type "apropos word" to search for commands related to "word"... Reading symbols from ../ex13... On Mon, Jan 24, 2022 at 10:43 AM Paul T. Bauman wrote: > > > On Mon, Jan 24, 2022 at 9:31 AM Mark Adams wrote: > >> Thanks Paul, >> >> How do I get a stack trace? I have been relying on PETSc's >> which piggybacks on timers so it is not getting too deep here. >> > > I'm not sure what the "PETSc way" is, but I just run the executable > through `rocgdb` as one would do with `gdb` (`rocgdb` is literally `gdb` > built with extra AMD stuff (that stuff is either upstreamed or being > upstreamed to gdb BTW)). You can do it in batch mode as well so you can > dump the logs from each MPI process. > > >> >> On Mon, Jan 24, 2022 at 10:16 AM Paul T. Bauman >> wrote: >> >>> On Mon, Jan 24, 2022 at 8:53 AM Matthew Knepley >>> wrote: >>> >>>> On Mon, Jan 24, 2022 at 9:24 AM Mark Adams wrote: >>>> >>>>> What is the fastest way to rebuild hypre? reconfiguring did not work >>>>> and is slow. >>>>> >>>>> I am printf debugging to find this HSA_STATUS_ERROR_MEMORY_FAULT (no >>>>> debuggers other than valgrind on Crusher??!?!) >>>>> >>>> >>> Again, apologies for interjecting, but I wanted to offer a few pointers >>> here. >>> >>> 1. `rocgdb` will be in your PATH when the `rocm` module is loaded. This >>> is gdb, but with some extra AMDGPU goodies. AFAIK, you cannot, yet, do >>> stepping through a kernel in the source (only the ISA), but you can query >>> device variables in host code, print their values, etc. >>> 1a. Note that multiple threads can be spawned by the HIP runtime. >>> Furthermore, it's likely the thread you'll be on when you catch the error >>> is (one of) the runtime thread(s). You'll need to do `info threads` and >>> then select your host thread to get back to it. >>> 2. To get an accurate stacktrace (meaning get the line in the host code >>> where the error is actually happening), I recommend setting the following >>> environment variables for debugging that will force the serialization of >>> async memcopies and kernel launches: >>> AMD_SERIALIZE_KERNEL = 3 >>> AMD_SERIALIZE_COPY=3 >>> >>> Thanks, >>> >>> Paul >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From ptbauman at gmail.com Mon Jan 24 10:14:32 2022 From: ptbauman at gmail.com (Paul T. Bauman) Date: Mon, 24 Jan 2022 10:14:32 -0600 Subject: [petsc-users] hypre / hip usage In-Reply-To: <87sftdceqo.fsf@jedbrown.org> References: <87r1916ujq.fsf@jedbrown.org> <87lez96tok.fsf@jedbrown.org> <87fsph6p82.fsf@jedbrown.org> <87sftdceqo.fsf@jedbrown.org> Message-ID: On Mon, Jan 24, 2022 at 9:53 AM Jed Brown wrote: > "Paul T. Bauman" writes: > > > 1. `rocgdb` will be in your PATH when the `rocm` module is loaded. This > is > > gdb, but with some extra AMDGPU goodies. AFAIK, you cannot, yet, do > > stepping through a kernel in the source (only the ISA), but you can query > > device variables in host code, print their values, etc. > > 1a. Note that multiple threads can be spawned by the HIP runtime. > > Furthermore, it's likely the thread you'll be on when you catch the error > > is (one of) the runtime thread(s). You'll need to do `info threads` and > > then select your host thread to get back to it. > > 2. 
To get an accurate stacktrace (meaning get the line in the host code > > where the error is actually happening), I recommend setting the following > > environment variables for debugging that will force the serialization of > > async memcopies and kernel launches: > > AMD_SERIALIZE_KERNEL = 3 > > AMD_SERIALIZE_COPY=3 > > Is there a tutorial on this? I bet a 10-minute screencast demo would make > a big impact in the use of these tools. > The one that springs to mind is a 3-day (virtual) workshop from last May at OLCF. There was a recent workshop on crusher that may also cover this. https://www.olcf.ornl.gov/calendar/2021hip/ https://www.olcf.ornl.gov/wp-content/uploads/2021/04/rocgdb_hipmath_ornl_2021_v2.pdf They recorded it, but I can't seem to find the recordings, not sure what OLCF did with them. Justin did live demos of the debugger during his talk. :( AMD_SERIALIZE_COPY isn't documented at all and AMD_SERIALIZE_KERNEL isn't > mentioned in this context. > > > https://rocmdocs.amd.com/en/latest/search.html?q=amd_serialize_copy&check_keywords=yes&area=default Sigh. This is a never-ending source of frustration on my end. Sorry, it is really unacceptable. This link is probably the best description at this moment: https://github.com/ROCm-Developer-Tools/HIP/blob/develop/docs/markdown/hip_debugging.md -------------- next part -------------- An HTML attachment was scrubbed... URL: From jacob.fai at gmail.com Mon Jan 24 11:22:46 2022 From: jacob.fai at gmail.com (Jacob Faibussowitsch) Date: Mon, 24 Jan 2022 11:22:46 -0600 Subject: [petsc-users] Strange CUDA failure with a second petscfinalize with PETSc 3.16 In-Reply-To: References: <45585074-8C0E-453B-993B-554D23A1E971@gmail.com> <60E1BD31-E98A-462F-BA2B-2099745B2582@gmail.com> Message-ID: Hi Hao, > Any luck reproducing the CUDA problem? Sorry for the long radio silence, I still have not been able to reproduce the problem unfortunately. I have tried on a local machine, and a few larger clusters and all return without errors both with and without cuda? Can you try pulling the latest version of main, reconfiguring and trying again? BTW, your configure arguments are a little wonky: 1. --with-clanguage=c - this isn?t needed, PETSc will default to C 2. --with-cxx-dialect=cxx14 --with-cuda-dialect=cxx14 - use --with-cxx-dialect=14 instead, PETSc will detect that you have gnu compilers and enable gnu extensions 3. -with-debugging=1 - this is missing an extra dash, but you also have optimization flags set so maybe just leave this one out Best regards, Jacob Faibussowitsch (Jacob Fai - booss - oh - vitch) > On Jan 23, 2022, at 21:29, Hao DONG wrote: > > Dear Jacob, > > Any luck reproducing the CUDA problem? - just write to check in, in case somehow the response did not reach me (this happens to my colleagues abroad sometimes, probably due to the Wall). > > All the best, > Hao > >> On Jan 19, 2022, at 3:01 PM, Hao DONG wrote: >> >> ? >> Thanks Jacob for looking into this ? >> >> You can see the updated source code of ex11fc in the attachment ? although there is not much that I modified (except for the jabbers I outputted). I also attached the full output (ex11fc.log) along with the configure.log file. It?s an old dual Xeon workstation (one of my ?production? machines) with Linux kernel 5.4.0 and gcc 9.3. >> >> I simply ran the code with >> >> mpiexec -np 2 ex11fc -usecuda >> >> for GPU test. And as stated before, calling without the ?-usecuda? option shows no errors. >> >> Please let me know if you find anything wrong with the configure/code. 
>> >> Cheers, >> Hao >> >> From: Jacob Faibussowitsch >> Sent: Wednesday, January 19, 2022 3:38 AM >> To: Hao DONG >> Cc: Junchao Zhang ; petsc-users >> Subject: Re: [petsc-users] Strange CUDA failure with a second petscfinalize with PETSc 3.16 >> >> Apologies, forgot to mention in my previous email but can you also include a copy of the full printout of the error message that you get? It will include all the command-line flags that you ran with (if any) so I can exactly mirror your environment. >> >> Best regards, >> >> Jacob Faibussowitsch >> (Jacob Fai - booss - oh - vitch) >> >> >> On Jan 18, 2022, at 14:06, Jacob Faibussowitsch > wrote: >> >> Can you send your updated source file as well as your configure.log (should be $PETSC_DIR/configure.log). I will see if I can reproduce the error on my end. >> >> Best regards, >> >> Jacob Faibussowitsch >> (Jacob Fai - booss - oh - vitch) >> >> >> On Jan 17, 2022, at 23:06, Hao DONG > wrote: >> >> ? >> Dear Junchao and Jacob, >> >> Thanks a lot for the response ? I also don?t understand why this is related to the device, especially on why the procedure can be successfully finished for *once* ? As instructed, I tried to add a CHKERRA() macro after (almost) every petsc line ? such as the initialization, mat assemble, ksp create, solve, mat destroy, etc. However, all other petsc commands returns with error code 0. It only gives me a similar (still not very informative) error after I call the petscfinalize (again for the second time), with error code 97: >> >> [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [0]PETSC ERROR: GPU error >> [0]PETSC ERROR: cuda error 709 (cudaErrorContextIsDestroyed) : context is destroyed >> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. >> [0]PETSC ERROR: Petsc Release Version 3.16.3, unknown >> [0]PETSC ERROR: ./ex11f on a named stratosphere by donghao Tue Jan 18 11:39:43 2022 >> [0]PETSC ERROR: Configure options --prefix=/opt/petsc/complex-double-with-cuda --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 COPTFLAGS="-O3 -mavx2" CXXOPTFLAGS="-O3 -mavx2" FOPTFLAGS="-O3 -ffree-line-length-none -mavx2" CUDAOPTFLAGS=-O3 --with-cxx-dialect=cxx14 --with-cuda-dialect=cxx14 --with-scalar-type=complex --with-precision=double --with-cuda-dir=/usr/local/cuda --with-debugging=1 >> [0]PETSC ERROR: #1 PetscFinalize() at /home/donghao/packages/petsc-current/src/sys/objects/pinit.c:1638 >> [0]PETSC ERROR: #2 User provided function() at User file:0 >> >> I can also confirm that rolling back to petsc 3.15 will *not* see the problem, even with the new nvidia driver. And petsc 3.16.3 with an old nvidia driver (470.42) also get this same error. So it?s probably not connected to the nvidia driver. >> >> Any idea on where I should look at next? >> Thanks a lot in advance, and all the best, >> Hao >> >> From: Jacob Faibussowitsch >> Sent: Sunday, January 16, 2022 12:12 AM >> To: Junchao Zhang >> Cc: petsc-users ; Hao DONG >> Subject: Re: [petsc-users] Strange CUDA failure with a second petscfinalize with PETSc 3.16 >> >> I don?t quite understand how it is getting to the CUDA error to be honest. None of the code in the stack trace is anywhere near the device code. 
Reading the error message carefully, it first chokes on PetscLogGetStageLog() from a call to PetscClassIdRegister(): >> >> PetscErrorCode PetscLogGetStageLog(PetscStageLog *stageLog) >> { >> PetscFunctionBegin; >> PetscValidPointer(stageLog,1); >> if (!petsc_stageLog) { >> fprintf(stderr, "PETSC ERROR: Logging has not been enabled.\nYou might have forgotten to call PetscInitialize().\n"); >> PETSCABORT(MPI_COMM_WORLD, PETSC_ERR_SUP); // Here >> } >> ... >> >> But then jumps to PetscFinalize(). You can also see the "You might have forgotten to call PetscInitialize().? message in the error message, just under the topmost level of the stack trace. >> >> Can you check the value of ierr of each function call (use the CHKERRA() macro to do so)? I suspect the problem here that errors occurring previously in the program are being ignored, leading to the garbled stack trace. >> >> Best regards, >> >> Jacob Faibussowitsch >> (Jacob Fai - booss - oh - vitch) >> >> >> >> On Jan 14, 2022, at 20:58, Junchao Zhang > wrote: >> >> Jacob, >> Could you have a look as it seems the "invalid device context" is in your newly added module? >> Thanks >> --Junchao Zhang >> >> >> On Fri, Jan 14, 2022 at 12:49 AM Hao DONG > wrote: >> Dear All, >> >> I have encountered a peculiar problem when fiddling with a code with PETSC 3.16.3 (which worked fine with PETSc 3.15). It is a very straight forward PDE-based optimization code which repeatedly solves a linearized PDE problem with KSP in a subroutine (the rest of the code does not contain any PETSc related content). The main program provides the subroutine with an MPI comm. Then I set the comm as PETSC_COMM_WORLD to tell PETSC to attach to it (and detach with it when the solving is finished each time). >> >> Strangely, I observe a CUDA failure whenever the petscfinalize is called for a *second* time. In other words, the first and second PDE calculations with GPU are fine (with correct solutions). The petsc code just fails after the SECOND petscfinalize command is called. You can also see the PETSC config in the error message: >> >> [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [1]PETSC ERROR: GPU error >> [1]PETSC ERROR: cuda error 201 (cudaErrorDeviceUninitialized) : invalid device context >> [1]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. >> [1]PETSC ERROR: Petsc Release Version 3.16.3, unknown >> [1]PETSC ERROR: maxwell.gpu on a named stratosphere by hao Fri Jan 14 10:21:05 2022 >> [1]PETSC ERROR: Configure options --prefix=/opt/petsc/complex-double-with-cuda --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 COPTFLAGS="-O3 -mavx2" CXXOPTFLAGS="-O3 -mavx2" FOPTFLAGS="-O3 -ffree-line-length-none -mavx2" CUDAOPTFLAGS=-O3 --with-cxx-dialect=cxx14 --with-cuda-dialect=cxx14 --with-scalar-type=complex --with-precision=double --with-cuda-dir=/usr/local/cuda --with-debugging=1 >> [1]PETSC ERROR: #1 PetscFinalize() at /home/hao/packages/petsc-current/src/sys/objects/pinit.c:1638 >> You might have forgotten to call PetscInitialize(). >> The EXACT line numbers in the error traceback are not available. >> Instead the line number of the start of the function is given. 
>> [1] #1 PetscAbortFindSourceFile_Private() at /home/hao/packages/petsc-current/src/sys/error/err.c:35 >> [1] #2 PetscLogGetStageLog() at /home/hao/packages/petsc-current/src/sys/logging/utils/stagelog.c:29 >> [1] #3 PetscClassIdRegister() at /home/hao/packages/petsc-current/src/sys/logging/plog.c:2376 >> [1] #4 MatMFFDInitializePackage() at /home/hao/packages/petsc-current/src/mat/impls/mffd/mffd.c:45 >> [1] #5 MatInitializePackage() at /home/hao/packages/petsc-current/src/mat/interface/dlregismat.c:163 >> [1] #6 MatCreate() at /home/hao/packages/petsc-current/src/mat/utils/gcreate.c:77 >> >> However, it doesn?t seem to affect the other part of my code, so the code can continue running until it gets to the petsc part again (the *third* time). Unfortunately, it doesn?t give me any further information even if I set the debugging to yes in the configure file. It also worth noting that PETSC without CUDA (i.e. with simple MATMPIAIJ) works perfectly fine. >> >> I am able to re-produce the problem with a toy code modified from ex11f. Please see the attached file (ex11fc.F90) for details. Essentially the code does the same thing as ex11f, but three times with a do loop. To do that I added an extra MPI_INIT/MPI_FINALIZE to ensure that the MPI communicator is not destroyed when PETSC_FINALIZE is called. I used the PetscOptionsHasName utility to check if you have ?-usecuda? in the options. So running the code with and without that option can give you a comparison w/o CUDA. I can see that the code also fails after the second loop of the KSP operation. Could you kindly shed some lights on this problem? >> >> I should say that I am not even sure if the problem is from PETSc, as I also accidentally updated the NVIDIA driver (for now it is 510.06 with cuda 11.6). And it is well known that NVIDIA can give you some surprise in the updates (yes, I know I shouldn?t have touched that if it?s not broken). But my CUDA code without PETSC (which basically does the same PDE thing, but with cusparse/cublas directly) seems to work just fine after the update. It is also possible that my petsc code related to CUDA was not quite ?legitimate? ? I just use: >> MatSetType(A, MATMPIAIJCUSPARSE, ierr) >> and >> MatCreateVecs(A, u, PETSC_NULL_VEC, ierr) >> to make the data onto GPU. I would very much appreciate it if you could show me the ?right? way to do that. >> >> Thanks a lot in advance, and all the best, >> Hao >> >> > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nicolas.barnafi at unimi.it Mon Jan 24 11:39:12 2022 From: nicolas.barnafi at unimi.it (Nicolas Alejandro Barnafi) Date: Mon, 24 Jan 2022 18:39:12 +0100 Subject: [petsc-users] HYPRE AMS - Segmentation Violation with discrete gradient In-Reply-To: <7f2cb725cce30.61eaf5d5@unimi.it> References: <7bf2bb42c9ccf.61ead639@unimi.it> <7d64948dc9956.61eae6c2@unimi.it> <7e7dd1a3ce69c.61eae6fe@unimi.it> <7e7daa98ce073.61eae73b@unimi.it> <7f28becdc88d0.61eae779@unimi.it> <7d0fabc8cb8c2.61eae7b5@unimi.it> <7f2cb725cce30.61eaf5d5@unimi.it> Message-ID: <7f36cea7174331.61eef250@unimi.it> Please let me know if you were able to run the example (or anyone else). Thanks for the help. Kind regards Il 21/01/22 18:05, "Nicolas Alejandro Barnafi" ha scritto: > > Thank you Matt, I have trimmed down ex10 (a lot)?to do as required, and it indeed reproduces the error.? > You may find it attached and it can be reproduced with > ./ex10 -fA Amat -fP Pmat -fG Gmat -ksp_type gmres -pc_type hypre -pc_hypre_type ams > > Thank you! 
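The reproducer above only loads the three matrices and selects AMS through the command-line options shown; for anyone who wants to try the same setup programmatically, a minimal sketch follows, written in petsc4py notation purely for brevity (the attached reproducer itself is C, where the corresponding calls are PCSetType, PCHYPRESetType and PCHYPRESetDiscreteGradient). The file names are placeholders, and it is an assumption on my side that petsc4py exposes PC.setHYPREType and PC.setHYPREDiscreteGradient as the bindings of PCHYPRESetType and PCHYPRESetDiscreteGradient; if it does not, the C calls are the ones to use.

import sys, petsc4py
petsc4py.init(sys.argv)
from petsc4py import PETSc

def load_mat(fname):
    # load a matrix written with PETSc's binary viewer (placeholder file name)
    viewer = PETSc.Viewer().createBinary(fname, 'r')
    M = PETSc.Mat().create()
    M.setFromOptions()
    M.load(viewer)
    return M

A = load_mat('Amat')   # square operator on the edge space
G = load_mat('Gmat')   # discrete gradient: one row per edge, one column per vertex

ksp = PETSc.KSP().create()
ksp.setOperators(A)
ksp.setType('gmres')
pc = ksp.getPC()
pc.setType('hypre')
pc.setHYPREType('ams')              # same effect as -pc_hypre_type ams
pc.setHYPREDiscreteGradient(G)      # assumed binding of PCHYPRESetDiscreteGradient
ksp.setFromOptions()

x = A.createVecRight()
b = A.createVecLeft()
b.set(1.0)
ksp.solve(b, x)

Note also that, as far as I recall, AMS wants more than the discrete gradient: either vertex coordinates (PCSetCoordinates) or the constant edge vectors (PCHYPRESetEdgeConstantVectors) are normally supplied as well, which the trimmed-down reproducer does not do.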
> > Il 21/01/22 16:35, Matthew Knepley ha scritto: > > > > > > > > On Fri, Jan 21, 2022 at 9:50 AM Nicolas Alejandro Barnafi wrote: > > > > > > > Dear community,? > > > > > > I'm giving the discrete gradient to a PC object (sub PC of a fieldsplit) but HYPRE internally gives a segmentation violation error. The matrix has been adequately set, as I have added it to the program output for inspection. Has this happened to anyone? > > > > > > > > > Is there a chance of sending us something that we can run? Alternatively, can you run > > > > https://gitlab.com/petsc/petsc/-/blob/main/src/ksp/ksp/tutorials/ex10.c > > > > loading your matrix and giving options to select Hypre? Then we can do the same thing here with your matrix. > > > > Thanks, > > > > Matt > > > > > > > > > > I have attached the error below together with the discrete gradient, in case you see something I am missing. > > > > > > The code is currently running in serial, so I don't expect communication/partitioning to be an issue (although it could be in the future). > > > > > > Thanks in advance, > > > Nicolas > > > > > > > > > > > > ------------------------------ PETSc output ------------------------------------ > > > > > > > > > Mat Object: 1 MPI processes > > > type: seqaij > > > -1.00000e+00 1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 > > > 1.00000e+00 0.00000e+00 -1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 > > > -1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 > > > 0.00000e+00 -1.00000e+00 0.00000e+00 1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 > > > 0.00000e+00 1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 -1.00000e+00 0.00000e+00 0.00000e+00 > > > 0.00000e+00 0.00000e+00 1.00000e+00 -1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 > > > 0.00000e+00 0.00000e+00 1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 -1.00000e+00 0.00000e+00 > > > 0.00000e+00 0.00000e+00 0.00000e+00 -1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 1.00000e+00 > > > 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 1.00000e+00 -1.00000e+00 0.00000e+00 0.00000e+00 > > > 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 -1.00000e+00 0.00000e+00 1.00000e+00 0.00000e+00 > > > 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 1.00000e+00 0.00000e+00 -1.00000e+00 > > > 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 -1.00000e+00 1.00000e+00 > > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > > [0]PETSC ERROR: or see https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > > > [0]PETSC ERROR: likely location of problem given in stack below > > > [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > > > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > > > [0]PETSC ERROR: INSTEAD the line number of the start of the function > > > [0]PETSC ERROR: is given. 
> > > [0]PETSC ERROR: [0] jac->setup line 408 /home/ubuntu/petsc/src/ksp/pc/impls/hypre/hypre.c > > > [0]PETSC ERROR: [0] PCSetUp_HYPRE line 223 /home/ubuntu/petsc/src/ksp/pc/impls/hypre/hypre.c > > > [0]PETSC ERROR: [0] PCSetUp line 971 /home/ubuntu/petsc/src/ksp/pc/interface/precon.c > > > [0]PETSC ERROR: [0] KSPSetUp line 319 /home/ubuntu/petsc/src/ksp/ksp/interface/itfunc.c > > > [0]PETSC ERROR: [0] KSPSolve_Private line 615 /home/ubuntu/petsc/src/ksp/ksp/interface/itfunc.c > > > [0]PETSC ERROR: [0] KSPSolve line 884 /home/ubuntu/petsc/src/ksp/ksp/interface/itfunc.c > > > [0]PETSC ERROR: [0] PCApply_FieldSplit line 1241 /home/ubuntu/petsc/src/ksp/pc/impls/fieldsplit/fieldsplit.c > > > [0]PETSC ERROR: [0] PCApply line 426 /home/ubuntu/petsc/src/ksp/pc/interface/precon.c > > > [0]PETSC ERROR: [0] KSP_PCApply line 281 /home/ubuntu/petsc/include/petsc/private/kspimpl.h > > > [0]PETSC ERROR: [0] KSPInitialResidual line 40 /home/ubuntu/petsc/src/ksp/ksp/interface/itres.c > > > [0]PETSC ERROR: [0] KSPSolve_GMRES line 233 /home/ubuntu/petsc/src/ksp/ksp/impls/gmres/gmres.c > > > [0]PETSC ERROR: [0] KSPSolve_Private line 615 /home/ubuntu/petsc/src/ksp/ksp/interface/itfunc.c > > > [0]PETSC ERROR: [0] KSPSolve line 884 /home/ubuntu/petsc/src/ksp/ksp/interface/itfunc.c > > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > > [0]PETSC ERROR: Signal received > > > [0]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > [0]PETSC ERROR: Petsc Release Version 3.14.6, unknown > > > [0]PETSC ERROR: Unknown Name on a arch-linux-c-debug named ubuntu-ThinkPad-L14-Gen-1 by ubuntu Fri Jan 21 15:37:45 2022 > > > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-mpich --download-fblaslapack --with-mpi=1 --download-superlu_dist --download-mumps --download-hypre --with-debugging=1 COPTFLAGS="-O3 -march=native -mtune=native" CXXOPTFLAGS="-O3 -march=native -mtune=native" FOPTFLAGS="-O3 -march=native -mtune=native" --download-scalapack --download-hpddm > > > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > > > [0]PETSC ERROR: Checking the memory for corruption. > > > application called MPI_Abort(MPI_COMM_WORLD, 50176059) - process 0 > > > [unset]: write_line error; fd=-1 buf=:cmd=abort exitcode=50176059 > > > : > > > system msg for write_line failure : Bad file descriptor > > > > > > > > > > > -- > > > > > > > > > > > > > > > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/(http://www.cse.buffalo.edu/~knepley/ ) > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mfadams at lbl.gov Mon Jan 24 13:42:55 2022 From: mfadams at lbl.gov (Mark Adams) Date: Mon, 24 Jan 2022 14:42:55 -0500 Subject: [petsc-users] hypre / hip usage In-Reply-To: References: <87r1916ujq.fsf@jedbrown.org> <87lez96tok.fsf@jedbrown.org> <87fsph6p82.fsf@jedbrown.org> Message-ID: On Sat, Jan 22, 2022 at 11:31 AM Stefano Zampini wrote: > Mark > > the two options are only there to test the code in CI, and are not needed > in general > > '--download-hypre-configure-arguments=--enable-unified-memory', > This is only here to test the unified memory code path > > '--with-hypre-gpuarch=gfx90a', > This is not needed if rocminfo is in PATH > > Our interface code with HYPRE GPU works fine for HIP, it is tested in CI. > I think there may be a problem with Crusher (he says after trying to debug this all day). I was not able to get to the error. rocgdb hung and I did not manage to get a print statement from hypre. It could be that this error happens _at_ the call to HYPRE_IJMatrixAddToValues in PETSc (ie, it never gets to hypre code). Not sure. Oddly HYPRE_IJMatrixSetToValues worked in snes/ex56. I didn't figure out how to simple do this until now: 14:21 adams/aijkokkos-gpu-logging *= crusher:/gpfs/alpine/csc314/scratch/adams/petsc$ make -f gmakefile PETSC_ARCH=arch-olcf-crusher-g test search='ksp_ksp_tutorials-ex55_hypre_device' Using MAKEFLAGS: -- search=ksp_ksp_tutorials-ex55_hypre_device PETSC_ARCH=arch-olcf-crusher-g TEST arch-olcf-crusher-g/tests/counts/ksp_ksp_tutorials-ex55_hypre_device.counts # retrying ksp_ksp_tutorials-ex55_hypre_device not ok ksp_ksp_tutorials-ex55_hypre_device # Error code: 134 # :0:rocdevice.cpp :2589: 360810731350 us: Device::callbackQueue aborting with error : HSA_STATUS_ERROR_INVALID_ISA: The instruction set architecture is invalid. code: 0x100f # :0:rocdevice.cpp :2589: 360810732560 us: Device::callbackQueue aborting with error : HSA_STATUS_ERROR_INVALID_ISA: The instruction set architecture is invalid. code: 0x100f # :0:rocdevice.cpp :2589: 360810735300 us: Device::callbackQueue aborting with error : HSA_STATUS_ERROR_INVALID_ISA: The instruction set architecture is invalid. code: 0x100f # :0:rocdevice.cpp :2589: 360810736352 us: Device::callbackQueue aborting with error : HSA_STATUS_ERROR_INVALID_ISA: The instruction set architecture is invalid. code: 0x100f # srun: error: crusher002: tasks 0-3: Aborted # srun: launch/slurm: _step_signal: Terminating StepId=66195.4 ok ksp_ksp_tutorials-ex55_hypre_device # SKIP Command failed so no diff # FAILED ksp_ksp_tutorials-ex55_hypre_device # # To rerun failed tests: # /usr/bin/gmake -f gmakefile test test-fail=1 > The -mat_type hypre assembling for ex19 does not work because ex19 uses > FDColoring. Just assemble in mpiaij format (look at runex19_hypre_hip in > src/snes/tutorials/makefile); the interface code will copy the matrix to > the GPU > > Il giorno ven 21 gen 2022 alle ore 19:24 Mark Adams ha > scritto: > >> >> >> On Fri, Jan 21, 2022 at 11:14 AM Jed Brown wrote: >> >>> "Paul T. Bauman" writes: >>> >>> > On Fri, Jan 21, 2022 at 8:52 AM Paul T. Bauman >>> wrote: >>> >> Yes. The way HYPRE's memory model is setup is that ALL GPU >>> allocations are >>> >> "native" (i.e. [cuda,hip]Malloc) or, if unified memory is enabled, >>> then ALL >>> >> GPU allocations are unified memory (i.e. [cuda,hip]MallocManaged). >>> >> Regarding HIP, there is an HMM implementation of hipMallocManaged >>> planned, >>> >> but is it not yet delivered AFAIK (and it will *not* support gfx906, >>> e.g. 
>>> >> RVII, FYI), so, today, under the covers, hipMallocManaged is calling >>> >> hipHostMalloc. So, today, all your unified memory allocations in >>> HYPRE on >>> >> HIP are doing CPU-pinned memory accesses. And performance is just >>> truly >>> >> terrible (as you might expect). >>> >>> Thanks for this important bit of information. >>> >>> And it sounds like when we add support to hand off Kokkos matrices and >>> vectors (our current support for matrices on ROCm devices uses Kokkos) or >>> add direct support for hipSparse, we'll avoid touching host memory in >>> assembly-to-solve with hypre. >>> >> >> It does not look like anyone has made Hypre work with HIP. Stafano added >> a runex19_hypre_hip target 4 months ago and hypre.py has some HIP things. >> >> I have a user that would like to try this, no hurry but, can I get an >> idea of a plan for this? >> >> Thanks, >> Mark >> >> > > > -- > Stefano > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Jan 24 15:42:45 2022 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 24 Jan 2022 16:42:45 -0500 Subject: [petsc-users] HYPRE AMS - Segmentation Violation with discrete gradient In-Reply-To: <7f36cea7174331.61eef250@unimi.it> References: <7bf2bb42c9ccf.61ead639@unimi.it> <7d64948dc9956.61eae6c2@unimi.it> <7e7dd1a3ce69c.61eae6fe@unimi.it> <7e7daa98ce073.61eae73b@unimi.it> <7f28becdc88d0.61eae779@unimi.it> <7d0fabc8cb8c2.61eae7b5@unimi.it> <7f2cb725cce30.61eaf5d5@unimi.it> <7f36cea7174331.61eef250@unimi.it> Message-ID: Okay, I can run your example and get an error. I am looking at it. Not sure if it is us or Hypre yet. Thanks, Matt On Mon, Jan 24, 2022 at 12:39 PM Nicolas Alejandro Barnafi < nicolas.barnafi at unimi.it> wrote: > Please let me know if you were able to run the example (or anyone else). > Thanks for the help. > Kind regards > > Il 21/01/22 18:05, *"Nicolas Alejandro Barnafi" * < > nicolas.barnafi at unimi.it> ha scritto: > > Thank you Matt, I have trimmed down ex10 (a lot) to do as required, and it > indeed reproduces the error. > You may find it attached and it can be reproduced with > ./ex10 -fA Amat -fP Pmat -fG Gmat -ksp_type gmres -pc_type hypre > -pc_hypre_type ams > > Thank you! > > Il 21/01/22 16:35, *Matthew Knepley * ha scritto: > > On Fri, Jan 21, 2022 at 9:50 AM Nicolas Alejandro Barnafi < > nicolas.barnafi at unimi.it> wrote: > >> Dear community, >> >> I'm giving the discrete gradient to a PC object (sub PC of a fieldsplit) >> but HYPRE internally gives a segmentation violation error. The matrix has >> been adequately set, as I have added it to the program output for >> inspection. Has this happened to anyone? >> > > Is there a chance of sending us something that we can run? Alternatively, > can you run > > https://gitlab.com/petsc/petsc/-/blob/main/src/ksp/ksp/tutorials/ex10.c > > loading your matrix and giving options to select Hypre? Then we can do the > same thing here with your matrix. > > Thanks, > > Matt > > I have attached the error below together with the discrete gradient, in >> case you see something I am missing. >> >> The code is currently running in serial, so I don't expect >> communication/partitioning to be an issue (although it could be in the >> future). 
>> >> Thanks in advance, >> Nicolas >> >> >> ------------------------------ PETSc output >> ------------------------------------ >> >> Mat Object: 1 MPI processes >> type: seqaij >> -1.00000e+00 1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 >> 0.00000e+00 0.00000e+00 0.00000e+00 >> 1.00000e+00 0.00000e+00 -1.00000e+00 0.00000e+00 0.00000e+00 >> 0.00000e+00 0.00000e+00 0.00000e+00 >> -1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 1.00000e+00 >> 0.00000e+00 0.00000e+00 0.00000e+00 >> 0.00000e+00 -1.00000e+00 0.00000e+00 1.00000e+00 0.00000e+00 >> 0.00000e+00 0.00000e+00 0.00000e+00 >> 0.00000e+00 1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 >> -1.00000e+00 0.00000e+00 0.00000e+00 >> 0.00000e+00 0.00000e+00 1.00000e+00 -1.00000e+00 0.00000e+00 >> 0.00000e+00 0.00000e+00 0.00000e+00 >> 0.00000e+00 0.00000e+00 1.00000e+00 0.00000e+00 0.00000e+00 >> 0.00000e+00 -1.00000e+00 0.00000e+00 >> 0.00000e+00 0.00000e+00 0.00000e+00 -1.00000e+00 0.00000e+00 >> 0.00000e+00 0.00000e+00 1.00000e+00 >> 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 1.00000e+00 >> -1.00000e+00 0.00000e+00 0.00000e+00 >> 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 -1.00000e+00 >> 0.00000e+00 1.00000e+00 0.00000e+00 >> 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 >> 1.00000e+00 0.00000e+00 -1.00000e+00 >> 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 >> 0.00000e+00 -1.00000e+00 1.00000e+00 >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >> probably memory access out of range >> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> [0]PETSC ERROR: or see >> https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS >> X to find memory corruption errors >> [0]PETSC ERROR: likely location of problem given in stack below >> [0]PETSC ERROR: --------------------- Stack Frames >> ------------------------------------ >> [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not >> available, >> [0]PETSC ERROR: INSTEAD the line number of the start of the function >> [0]PETSC ERROR: is given. 
>> [0]PETSC ERROR: [0] jac->setup line 408 >> /home/ubuntu/petsc/src/ksp/pc/impls/hypre/hypre.c >> [0]PETSC ERROR: [0] PCSetUp_HYPRE line 223 >> /home/ubuntu/petsc/src/ksp/pc/impls/hypre/hypre.c >> [0]PETSC ERROR: [0] PCSetUp line 971 >> /home/ubuntu/petsc/src/ksp/pc/interface/precon.c >> [0]PETSC ERROR: [0] KSPSetUp line 319 >> /home/ubuntu/petsc/src/ksp/ksp/interface/itfunc.c >> [0]PETSC ERROR: [0] KSPSolve_Private line 615 >> /home/ubuntu/petsc/src/ksp/ksp/interface/itfunc.c >> [0]PETSC ERROR: [0] KSPSolve line 884 >> /home/ubuntu/petsc/src/ksp/ksp/interface/itfunc.c >> [0]PETSC ERROR: [0] PCApply_FieldSplit line 1241 >> /home/ubuntu/petsc/src/ksp/pc/impls/fieldsplit/fieldsplit.c >> [0]PETSC ERROR: [0] PCApply line 426 >> /home/ubuntu/petsc/src/ksp/pc/interface/precon.c >> [0]PETSC ERROR: [0] KSP_PCApply line 281 >> /home/ubuntu/petsc/include/petsc/private/kspimpl.h >> [0]PETSC ERROR: [0] KSPInitialResidual line 40 >> /home/ubuntu/petsc/src/ksp/ksp/interface/itres.c >> [0]PETSC ERROR: [0] KSPSolve_GMRES line 233 >> /home/ubuntu/petsc/src/ksp/ksp/impls/gmres/gmres.c >> [0]PETSC ERROR: [0] KSPSolve_Private line 615 >> /home/ubuntu/petsc/src/ksp/ksp/interface/itfunc.c >> [0]PETSC ERROR: [0] KSPSolve line 884 >> /home/ubuntu/petsc/src/ksp/ksp/interface/itfunc.c >> [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> [0]PETSC ERROR: Signal received >> [0]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html >> for trouble shooting. >> [0]PETSC ERROR: Petsc Release Version 3.14.6, unknown >> [0]PETSC ERROR: Unknown Name on a arch-linux-c-debug named >> ubuntu-ThinkPad-L14-Gen-1 by ubuntu Fri Jan 21 15:37:45 2022 >> [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ >> --with-fc=gfortran --download-mpich --download-fblaslapack --with-mpi=1 >> --download-superlu_dist --download-mumps --download-hypre >> --with-debugging=1 COPTFLAGS="-O3 -march=native -mtune=native" >> CXXOPTFLAGS="-O3 -march=native -mtune=native" FOPTFLAGS="-O3 -march=native >> -mtune=native" --download-scalapack --download-hpddm >> [0]PETSC ERROR: #1 User provided function() line 0 in unknown file >> [0]PETSC ERROR: Checking the memory for corruption. >> application called MPI_Abort(MPI_COMM_WORLD, 50176059) - process 0 >> [unset]: write_line error; fd=-1 buf=:cmd=abort exitcode=50176059 >> : >> system msg for write_line failure : Bad file descriptor >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Mon Jan 24 16:02:07 2022 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 24 Jan 2022 17:02:07 -0500 Subject: [petsc-users] HYPRE AMS - Segmentation Violation with discrete gradient In-Reply-To: References: <7bf2bb42c9ccf.61ead639@unimi.it> <7d64948dc9956.61eae6c2@unimi.it> <7e7dd1a3ce69c.61eae6fe@unimi.it> <7e7daa98ce073.61eae73b@unimi.it> <7f28becdc88d0.61eae779@unimi.it> <7d0fabc8cb8c2.61eae7b5@unimi.it> <7f2cb725cce30.61eaf5d5@unimi.it> <7f36cea7174331.61eef250@unimi.it> Message-ID: On Mon, Jan 24, 2022 at 4:42 PM Matthew Knepley wrote: > Okay, I can run your example and get an error. I am looking at it. Not > sure if it is us or Hypre yet. > This is failing inside Hypre. It appears that the A_diag offset array does not match the matrix. Are you giving Hypre a rectangular matrix? If so, it appears that the code cannot handle it and accidentally indexes out of the array. Thanks, Matt > Thanks, > > Matt > > On Mon, Jan 24, 2022 at 12:39 PM Nicolas Alejandro Barnafi < > nicolas.barnafi at unimi.it> wrote: > >> Please let me know if you were able to run the example (or anyone else). >> Thanks for the help. >> Kind regards >> >> Il 21/01/22 18:05, *"Nicolas Alejandro Barnafi" * < >> nicolas.barnafi at unimi.it> ha scritto: >> >> Thank you Matt, I have trimmed down ex10 (a lot) to do as required, and >> it indeed reproduces the error. >> You may find it attached and it can be reproduced with >> ./ex10 -fA Amat -fP Pmat -fG Gmat -ksp_type gmres -pc_type hypre >> -pc_hypre_type ams >> >> Thank you! >> >> Il 21/01/22 16:35, *Matthew Knepley * ha scritto: >> >> On Fri, Jan 21, 2022 at 9:50 AM Nicolas Alejandro Barnafi < >> nicolas.barnafi at unimi.it> wrote: >> >>> Dear community, >>> >>> I'm giving the discrete gradient to a PC object (sub PC of a fieldsplit) >>> but HYPRE internally gives a segmentation violation error. The matrix has >>> been adequately set, as I have added it to the program output for >>> inspection. Has this happened to anyone? >>> >> >> Is there a chance of sending us something that we can run? Alternatively, >> can you run >> >> https://gitlab.com/petsc/petsc/-/blob/main/src/ksp/ksp/tutorials/ex10.c >> >> loading your matrix and giving options to select Hypre? Then we can do >> the same thing here with your matrix. >> >> Thanks, >> >> Matt >> >> I have attached the error below together with the discrete gradient, in >>> case you see something I am missing. >>> >>> The code is currently running in serial, so I don't expect >>> communication/partitioning to be an issue (although it could be in the >>> future). 
>>> >>> Thanks in advance, >>> Nicolas >>> >>> >>> ------------------------------ PETSc output >>> ------------------------------------ >>> >>> Mat Object: 1 MPI processes >>> type: seqaij >>> -1.00000e+00 1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 >>> 0.00000e+00 0.00000e+00 0.00000e+00 >>> 1.00000e+00 0.00000e+00 -1.00000e+00 0.00000e+00 0.00000e+00 >>> 0.00000e+00 0.00000e+00 0.00000e+00 >>> -1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 1.00000e+00 >>> 0.00000e+00 0.00000e+00 0.00000e+00 >>> 0.00000e+00 -1.00000e+00 0.00000e+00 1.00000e+00 0.00000e+00 >>> 0.00000e+00 0.00000e+00 0.00000e+00 >>> 0.00000e+00 1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 >>> -1.00000e+00 0.00000e+00 0.00000e+00 >>> 0.00000e+00 0.00000e+00 1.00000e+00 -1.00000e+00 0.00000e+00 >>> 0.00000e+00 0.00000e+00 0.00000e+00 >>> 0.00000e+00 0.00000e+00 1.00000e+00 0.00000e+00 0.00000e+00 >>> 0.00000e+00 -1.00000e+00 0.00000e+00 >>> 0.00000e+00 0.00000e+00 0.00000e+00 -1.00000e+00 0.00000e+00 >>> 0.00000e+00 0.00000e+00 1.00000e+00 >>> 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 1.00000e+00 >>> -1.00000e+00 0.00000e+00 0.00000e+00 >>> 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 -1.00000e+00 >>> 0.00000e+00 1.00000e+00 0.00000e+00 >>> 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 >>> 1.00000e+00 0.00000e+00 -1.00000e+00 >>> 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 >>> 0.00000e+00 -1.00000e+00 1.00000e+00 >>> [0]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >>> probably memory access out of range >>> [0]PETSC ERROR: Try option -start_in_debugger or >>> -on_error_attach_debugger >>> [0]PETSC ERROR: or see >>> https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >>> [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac >>> OS X to find memory corruption errors >>> [0]PETSC ERROR: likely location of problem given in stack below >>> [0]PETSC ERROR: --------------------- Stack Frames >>> ------------------------------------ >>> [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not >>> available, >>> [0]PETSC ERROR: INSTEAD the line number of the start of the >>> function >>> [0]PETSC ERROR: is given. 
>>> [0]PETSC ERROR: [0] jac->setup line 408 >>> /home/ubuntu/petsc/src/ksp/pc/impls/hypre/hypre.c >>> [0]PETSC ERROR: [0] PCSetUp_HYPRE line 223 >>> /home/ubuntu/petsc/src/ksp/pc/impls/hypre/hypre.c >>> [0]PETSC ERROR: [0] PCSetUp line 971 >>> /home/ubuntu/petsc/src/ksp/pc/interface/precon.c >>> [0]PETSC ERROR: [0] KSPSetUp line 319 >>> /home/ubuntu/petsc/src/ksp/ksp/interface/itfunc.c >>> [0]PETSC ERROR: [0] KSPSolve_Private line 615 >>> /home/ubuntu/petsc/src/ksp/ksp/interface/itfunc.c >>> [0]PETSC ERROR: [0] KSPSolve line 884 >>> /home/ubuntu/petsc/src/ksp/ksp/interface/itfunc.c >>> [0]PETSC ERROR: [0] PCApply_FieldSplit line 1241 >>> /home/ubuntu/petsc/src/ksp/pc/impls/fieldsplit/fieldsplit.c >>> [0]PETSC ERROR: [0] PCApply line 426 >>> /home/ubuntu/petsc/src/ksp/pc/interface/precon.c >>> [0]PETSC ERROR: [0] KSP_PCApply line 281 >>> /home/ubuntu/petsc/include/petsc/private/kspimpl.h >>> [0]PETSC ERROR: [0] KSPInitialResidual line 40 >>> /home/ubuntu/petsc/src/ksp/ksp/interface/itres.c >>> [0]PETSC ERROR: [0] KSPSolve_GMRES line 233 >>> /home/ubuntu/petsc/src/ksp/ksp/impls/gmres/gmres.c >>> [0]PETSC ERROR: [0] KSPSolve_Private line 615 >>> /home/ubuntu/petsc/src/ksp/ksp/interface/itfunc.c >>> [0]PETSC ERROR: [0] KSPSolve line 884 >>> /home/ubuntu/petsc/src/ksp/ksp/interface/itfunc.c >>> [0]PETSC ERROR: --------------------- Error Message >>> -------------------------------------------------------------- >>> [0]PETSC ERROR: Signal received >>> [0]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html >>> for trouble shooting. >>> [0]PETSC ERROR: Petsc Release Version 3.14.6, unknown >>> [0]PETSC ERROR: Unknown Name on a arch-linux-c-debug named >>> ubuntu-ThinkPad-L14-Gen-1 by ubuntu Fri Jan 21 15:37:45 2022 >>> [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ >>> --with-fc=gfortran --download-mpich --download-fblaslapack --with-mpi=1 >>> --download-superlu_dist --download-mumps --download-hypre >>> --with-debugging=1 COPTFLAGS="-O3 -march=native -mtune=native" >>> CXXOPTFLAGS="-O3 -march=native -mtune=native" FOPTFLAGS="-O3 -march=native >>> -mtune=native" --download-scalapack --download-hpddm >>> [0]PETSC ERROR: #1 User provided function() line 0 in unknown file >>> [0]PETSC ERROR: Checking the memory for corruption. >>> application called MPI_Abort(MPI_COMM_WORLD, 50176059) - process 0 >>> [unset]: write_line error; fd=-1 buf=:cmd=abort exitcode=50176059 >>> : >>> system msg for write_line failure : Bad file descriptor >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
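For anyone hitting the same crash: Hypre's AMS setup expects a square system matrix defined on the edge space together with a discrete gradient G that maps nodal (vertex) values to edges, attached on the PETSc side with PCHYPRESetDiscreteGradient(). A quick dimension check before handing the operators to the solver catches the kind of mismatch suspected above. The sketch below is a minimal petsc4py version; the file names Amat and Gmat mirror the ex10 reproduction command and are otherwise just placeholders.

```
from petsc4py import PETSc

def load_mat(fname):
    # Read a PETSc binary matrix (as written by MatView with a binary viewer).
    viewer = PETSc.Viewer().createBinary(fname, 'r')
    A = PETSc.Mat().create(comm=PETSc.COMM_WORLD)
    A.load(viewer)
    return A

A = load_mat('Amat')   # edge-space operator, must be square
G = load_mat('Gmat')   # discrete gradient: rows = edges, cols = vertices

m, n = A.getSize()
gm, gn = G.getSize()
assert m == n, "system matrix must be square"
# AMS needs G to act from the vertex space into the edge space of A,
# so the row count of G has to match the size of A.
assert gm == m, "G has %d rows but A is %d x %d" % (gm, m, n)
```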
URL: From jxiong at anl.gov Tue Jan 25 00:03:50 2022 From: jxiong at anl.gov (Xiong, Jing) Date: Tue, 25 Jan 2022 06:03:50 +0000 Subject: [petsc-users] Asking examples about solving DAE in python In-Reply-To: References: Message-ID: Good morning, Thanks for all your help. I'm trying to implement a toy example solving a DAE for a Pendulum system and I got two questions: 1. How to set the initial value for xdot? 2. I got the following error information: * Assertion failed: (!PyErr_Occurred()), function __Pyx_PyCFunction_FastCall, file src/petsc4py.PETSc.c, line 359099. The system equation is given in https://en.wikipedia.org/wiki/Differential-algebraic_system_of_equations The formula I used is shown as follows: [cid:3001ee1d-031e-4500-b5e0-bc14e99812b5] I also attached my code below: import sys, petsc4py petsc4py.init(sys.argv) import numpy as np from petsc4py import PETSc class Pendulum(object): n = 5 comm = PETSc.COMM_SELF def initialCondition(self, x): # mu = self.mu_ l = 1.0 m = 1.0 g = 1.0 #initial condition theta0= np.pi/3 #starting angle x0=np.sin(theta0) y0=-(l-x0**2)**.5 lambdaval = 0.1 x[0] = x0 x[1] = y0 x[4] = lambdaval x.assemble() def evalIFunction(self, ts, t, x, xdot, f): f.setArray ([x[2]-xdot[0]], [x[3]-xdot[1]], [-xdot[2]+x[4]*x[0]/m], [-xdot[3]+x[4]*x[1]/m-g], [x[2]**2 + x[3]**2 + (x[0]**2 + x[1]**2)/m*x[4] - x[1] * g]) OptDB = PETSc.Options() ode = Pendulum() x = PETSc.Vec().createSeq(ode.n, comm=ode.comm) f = x.duplicate() ts = PETSc.TS().create(comm=ode.comm) ts.setType(ts.Type.CN) ts.setIFunction(ode.evalIFunction, f) ts.setSaveTrajectory() ts.setTime(0.0) ts.setTimeStep(0.001) ts.setMaxTime(0.5) ts.setMaxSteps(1000) ts.setExactFinalTime(PETSc.TS.ExactFinalTime.MATCHSTEP) ts.setFromOptions() ode.initialCondition(x) ts.solve(x) Best, Jing ________________________________ From: Zhang, Hong Sent: Thursday, January 20, 2022 3:05 PM To: Xiong, Jing Cc: petsc-users at mcs.anl.gov ; Zhao, Dongbo ; Hong, Tianqi Subject: Re: [petsc-users] Asking examples about solving DAE in python On Jan 20, 2022, at 4:13 PM, Xiong, Jing via petsc-users > wrote: Hi, I hope you are well. I'm very interested in PETSc and want to explore the possibility of whether it could solve Differential-algebraic equations (DAE) in python. I know there are great examples in C, but I'm struggling to connect the examples in python. The only example I got right now is for solving ODEs in the paper: PETSc/TS: A Modern Scalable ODE/DAE Solver Library. And I got the following questions: 1. Is petsc4py the right package to use? Yes, you need petsc4py for python. 1. Could you give an example for solving DAEs in python? src/binding/petsc4py/demo/ode/vanderpol.py gives a simple example to demonstrate a variety of PETSc capabilities. The corresponding C version of this example is ex20adj.c in src/ts/tutorials/. 1. How to solve an ODE with explicit methods. 2. How to solve an ODE/DAE with implicit methods. 3. How to use TSAdjoint to calculate adjoint sensitivities. 4. How to do a manual matrix-free implementation (e.g. you may already have a way to differentiate your RHS function to generate the Jacobian-vector product). 1. Is Jacobian must be specified? If not, could your show an example for solving DAEs without specified Jacobian in python? PETSc can do finite-difference approximations to generate the Jacobian matrix automatically. This may work nicely for small-sized problems. You can also try to use an AD tool to produce the Jacobian-vector product and use it in a matrix-free implementation. 
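A side note on the assertion failure quoted above: in petsc4py that assertion usually means a Python exception was raised inside a user callback. In the residual routine above, m and g are not defined in the function's scope, and Vec.setArray() is given five separate lists while it takes a single array. Below is a minimal sketch of a callback that avoids both problems, keeping the constants as class attributes and mirroring the equations exactly as written in the message above.

```
import numpy as np
from petsc4py import PETSc

class Pendulum:
    n = 5
    comm = PETSc.COMM_SELF
    l, m, g = 1.0, 1.0, 1.0          # constants visible to the callbacks via self

    def evalIFunction(self, ts, t, x, xdot, f):
        # Residual F(t, x, xdot) = 0 with state ordering (x, y, u, v, lambda).
        m, g = self.m, self.g
        f.setArray(np.array([
            x[2] - xdot[0],
            x[3] - xdot[1],
            -xdot[2] + x[4] * x[0] / m,
            -xdot[3] + x[4] * x[1] / m - g,
            x[2]**2 + x[3]**2 + (x[0]**2 + x[1]**2) / m * x[4] - x[1] * g,
        ]))
```

Writing the five entries one at a time and calling f.assemble(), as in the corrected version further down the thread, works just as well.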
Hong Thank you for your help. Best, Jing -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 30250 bytes Desc: image.png URL: From jxiong at anl.gov Tue Jan 25 00:13:11 2022 From: jxiong at anl.gov (Xiong, Jing) Date: Tue, 25 Jan 2022 06:13:11 +0000 Subject: [petsc-users] Asking examples about solving DAE in python In-Reply-To: References: Message-ID: Sorry for the confusion. I was implemented following the pendulum example given by https://github.com/bmcage/odes/blob/master/ipython_examples/Planar%20Pendulum%20as%20DAE.ipynb. The formula is [cid:b49e0bc9-f98e-44a8-a55a-caa0fd0522e3] ________________________________ From: Xiong, Jing Sent: Monday, January 24, 2022 10:03 PM To: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Asking examples about solving DAE in python Good morning, Thanks for all your help. I'm trying to implement a toy example solving a DAE for a Pendulum system and I got two questions: 1. How to set the initial value for xdot? 2. I got the following error information: * Assertion failed: (!PyErr_Occurred()), function __Pyx_PyCFunction_FastCall, file src/petsc4py.PETSc.c, line 359099. The system equation is given in https://en.wikipedia.org/wiki/Differential-algebraic_system_of_equations The formula I used is shown as follows: [cid:3001ee1d-031e-4500-b5e0-bc14e99812b5] I also attached my code below: import sys, petsc4py petsc4py.init(sys.argv) import numpy as np from petsc4py import PETSc class Pendulum(object): n = 5 comm = PETSc.COMM_SELF def initialCondition(self, x): # mu = self.mu_ l = 1.0 m = 1.0 g = 1.0 #initial condition theta0= np.pi/3 #starting angle x0=np.sin(theta0) y0=-(l-x0**2)**.5 lambdaval = 0.1 x[0] = x0 x[1] = y0 x[4] = lambdaval x.assemble() def evalIFunction(self, ts, t, x, xdot, f): f.setArray ([x[2]-xdot[0]], [x[3]-xdot[1]], [-xdot[2]+x[4]*x[0]/m], [-xdot[3]+x[4]*x[1]/m-g], [x[2]**2 + x[3]**2 + (x[0]**2 + x[1]**2)/m*x[4] - x[1] * g]) OptDB = PETSc.Options() ode = Pendulum() x = PETSc.Vec().createSeq(ode.n, comm=ode.comm) f = x.duplicate() ts = PETSc.TS().create(comm=ode.comm) ts.setType(ts.Type.CN) ts.setIFunction(ode.evalIFunction, f) ts.setSaveTrajectory() ts.setTime(0.0) ts.setTimeStep(0.001) ts.setMaxTime(0.5) ts.setMaxSteps(1000) ts.setExactFinalTime(PETSc.TS.ExactFinalTime.MATCHSTEP) ts.setFromOptions() ode.initialCondition(x) ts.solve(x) Best, Jing ________________________________ From: Zhang, Hong Sent: Thursday, January 20, 2022 3:05 PM To: Xiong, Jing Cc: petsc-users at mcs.anl.gov ; Zhao, Dongbo ; Hong, Tianqi Subject: Re: [petsc-users] Asking examples about solving DAE in python On Jan 20, 2022, at 4:13 PM, Xiong, Jing via petsc-users > wrote: Hi, I hope you are well. I'm very interested in PETSc and want to explore the possibility of whether it could solve Differential-algebraic equations (DAE) in python. I know there are great examples in C, but I'm struggling to connect the examples in python. The only example I got right now is for solving ODEs in the paper: PETSc/TS: A Modern Scalable ODE/DAE Solver Library. And I got the following questions: 1. Is petsc4py the right package to use? Yes, you need petsc4py for python. 1. Could you give an example for solving DAEs in python? src/binding/petsc4py/demo/ode/vanderpol.py gives a simple example to demonstrate a variety of PETSc capabilities. The corresponding C version of this example is ex20adj.c in src/ts/tutorials/. 1. 
How to solve an ODE with explicit methods. 2. How to solve an ODE/DAE with implicit methods. 3. How to use TSAdjoint to calculate adjoint sensitivities. 4. How to do a manual matrix-free implementation (e.g. you may already have a way to differentiate your RHS function to generate the Jacobian-vector product). 1. Is Jacobian must be specified? If not, could your show an example for solving DAEs without specified Jacobian in python? PETSc can do finite-difference approximations to generate the Jacobian matrix automatically. This may work nicely for small-sized problems. You can also try to use an AD tool to produce the Jacobian-vector product and use it in a matrix-free implementation. Hong Thank you for your help. Best, Jing -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 30250 bytes Desc: image.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 21718 bytes Desc: image.png URL: From dong-hao at outlook.com Tue Jan 25 03:42:32 2022 From: dong-hao at outlook.com (Hao DONG) Date: Tue, 25 Jan 2022 09:42:32 +0000 Subject: [petsc-users] Strange CUDA failure with a second petscfinalize with PETSc 3.16 In-Reply-To: References: <45585074-8C0E-453B-993B-554D23A1E971@gmail.com> <60E1BD31-E98A-462F-BA2B-2099745B2582@gmail.com> Message-ID: Hi Jacob, Thanks for the comments ? silly that I have overlooked the debugging flag so far. Unfortunately, I am out of office for a couple of days so I cannot confirm the result on my workstation, for now. However, I have a laptop with nvidia graphic card (old gtx1050, which is actually slower than the cpu in terms of double precision calculation), I have tried to git pull on the laptop from main and re-config as you suggested. However, using ?--with-cxx-dialect=14? throws up an error like: Unknown C++ dialect: with-cxx-dialect=14 And omitting the ?--with-cuda-dialect=cxx14? also gives me a similar complaint with: CUDA Error: Using CUDA with PetscComplex requires a C++ dialect at least cxx11. Use --with-cxx-dialect=xxx and --with-cuda-dialect=xxx to specify a suitable compiler. Eventually I was able to configure and compile with the following config setup: ./configure --prefix=/opt/petsc/debug --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 --with-scalar-type=complex --with-precision=double --with-cuda-dir=/usr/local/cuda --with-fortran-kernels=1 --with-cxx-dialect=cxx14 --with-cuda-dialect=cxx14 --with-debugging=1 But still, I got the same error regarding cuda (still the error code 97 thing). I attached the configure and output log of my ex11 on my laptop ? is there anything that can help pinpoint the problem? I can also confirm that PETSc 3.15.2 works well with my ex11fc code with cuda, on my laptop. Sadly, my laptop setup is based on WSL, which is far from an ideal environment to test CUDA. I will let you know once I get my hands on my workstations. Cheers, Hao Sent from Mail for Windows From: Jacob Faibussowitsch Sent: Tuesday, January 25, 2022 1:22 AM To: Hao DONG Cc: petsc-users; Junchao Zhang Subject: Re: [petsc-users] Strange CUDA failure with a second petscfinalize with PETSc 3.16 Hi Hao, Any luck reproducing the CUDA problem? Sorry for the long radio silence, I still have not been able to reproduce the problem unfortunately. 
I have tried on a local machine, and a few larger clusters and all return without errors both with and without cuda? Can you try pulling the latest version of main, reconfiguring and trying again? BTW, your configure arguments are a little wonky: 1. --with-clanguage=c - this isn?t needed, PETSc will default to C 2. --with-cxx-dialect=cxx14 --with-cuda-dialect=cxx14 - use --with-cxx-dialect=14 instead, PETSc will detect that you have gnu compilers and enable gnu extensions 3. -with-debugging=1 - this is missing an extra dash, but you also have optimization flags set so maybe just leave this one out Best regards, Jacob Faibussowitsch (Jacob Fai - booss - oh - vitch) On Jan 23, 2022, at 21:29, Hao DONG > wrote: Dear Jacob, Any luck reproducing the CUDA problem? - just write to check in, in case somehow the response did not reach me (this happens to my colleagues abroad sometimes, probably due to the Wall). All the best, Hao On Jan 19, 2022, at 3:01 PM, Hao DONG > wrote: ? Thanks Jacob for looking into this ? You can see the updated source code of ex11fc in the attachment ? although there is not much that I modified (except for the jabbers I outputted). I also attached the full output (ex11fc.log) along with the configure.log file. It?s an old dual Xeon workstation (one of my ?production? machines) with Linux kernel 5.4.0 and gcc 9.3. I simply ran the code with mpiexec -np 2 ex11fc -usecuda for GPU test. And as stated before, calling without the ?-usecuda? option shows no errors. Please let me know if you find anything wrong with the configure/code. Cheers, Hao From: Jacob Faibussowitsch Sent: Wednesday, January 19, 2022 3:38 AM To: Hao DONG Cc: Junchao Zhang; petsc-users Subject: Re: [petsc-users] Strange CUDA failure with a second petscfinalize with PETSc 3.16 Apologies, forgot to mention in my previous email but can you also include a copy of the full printout of the error message that you get? It will include all the command-line flags that you ran with (if any) so I can exactly mirror your environment. Best regards, Jacob Faibussowitsch (Jacob Fai - booss - oh - vitch) On Jan 18, 2022, at 14:06, Jacob Faibussowitsch > wrote: Can you send your updated source file as well as your configure.log (should be $PETSC_DIR/configure.log). I will see if I can reproduce the error on my end. Best regards, Jacob Faibussowitsch (Jacob Fai - booss - oh - vitch) On Jan 17, 2022, at 23:06, Hao DONG > wrote: ? Dear Junchao and Jacob, Thanks a lot for the response ? I also don?t understand why this is related to the device, especially on why the procedure can be successfully finished for *once* ? As instructed, I tried to add a CHKERRA() macro after (almost) every petsc line ? such as the initialization, mat assemble, ksp create, solve, mat destroy, etc. However, all other petsc commands returns with error code 0. It only gives me a similar (still not very informative) error after I call the petscfinalize (again for the second time), with error code 97: [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: GPU error [0]PETSC ERROR: cuda error 709 (cudaErrorContextIsDestroyed) : context is destroyed [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. 
[0]PETSC ERROR: Petsc Release Version 3.16.3, unknown [0]PETSC ERROR: ./ex11f on a named stratosphere by donghao Tue Jan 18 11:39:43 2022 [0]PETSC ERROR: Configure options --prefix=/opt/petsc/complex-double-with-cuda --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 COPTFLAGS="-O3 -mavx2" CXXOPTFLAGS="-O3 -mavx2" FOPTFLAGS="-O3 -ffree-line-length-none -mavx2" CUDAOPTFLAGS=-O3 --with-cxx-dialect=cxx14 --with-cuda-dialect=cxx14 --with-scalar-type=complex --with-precision=double --with-cuda-dir=/usr/local/cuda --with-debugging=1 [0]PETSC ERROR: #1 PetscFinalize() at /home/donghao/packages/petsc-current/src/sys/objects/pinit.c:1638 [0]PETSC ERROR: #2 User provided function() at User file:0 I can also confirm that rolling back to petsc 3.15 will *not* see the problem, even with the new nvidia driver. And petsc 3.16.3 with an old nvidia driver (470.42) also get this same error. So it?s probably not connected to the nvidia driver. Any idea on where I should look at next? Thanks a lot in advance, and all the best, Hao From: Jacob Faibussowitsch Sent: Sunday, January 16, 2022 12:12 AM To: Junchao Zhang Cc: petsc-users; Hao DONG Subject: Re: [petsc-users] Strange CUDA failure with a second petscfinalize with PETSc 3.16 I don?t quite understand how it is getting to the CUDA error to be honest. None of the code in the stack trace is anywhere near the device code. Reading the error message carefully, it first chokes on PetscLogGetStageLog() from a call to PetscClassIdRegister(): PetscErrorCode PetscLogGetStageLog(PetscStageLog *stageLog) { PetscFunctionBegin; PetscValidPointer(stageLog,1); if (!petsc_stageLog) { fprintf(stderr, "PETSC ERROR: Logging has not been enabled.\nYou might have forgotten to call PetscInitialize().\n"); PETSCABORT(MPI_COMM_WORLD, PETSC_ERR_SUP); // Here } ... But then jumps to PetscFinalize(). You can also see the "You might have forgotten to call PetscInitialize().? message in the error message, just under the topmost level of the stack trace. Can you check the value of ierr of each function call (use the CHKERRA() macro to do so)? I suspect the problem here that errors occurring previously in the program are being ignored, leading to the garbled stack trace. Best regards, Jacob Faibussowitsch (Jacob Fai - booss - oh - vitch) On Jan 14, 2022, at 20:58, Junchao Zhang > wrote: Jacob, Could you have a look as it seems the "invalid device context" is in your newly added module? Thanks --Junchao Zhang On Fri, Jan 14, 2022 at 12:49 AM Hao DONG > wrote: Dear All, I have encountered a peculiar problem when fiddling with a code with PETSC 3.16.3 (which worked fine with PETSc 3.15). It is a very straight forward PDE-based optimization code which repeatedly solves a linearized PDE problem with KSP in a subroutine (the rest of the code does not contain any PETSc related content). The main program provides the subroutine with an MPI comm. Then I set the comm as PETSC_COMM_WORLD to tell PETSC to attach to it (and detach with it when the solving is finished each time). Strangely, I observe a CUDA failure whenever the petscfinalize is called for a *second* time. In other words, the first and second PDE calculations with GPU are fine (with correct solutions). The petsc code just fails after the SECOND petscfinalize command is called. 
You can also see the PETSC config in the error message: [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [1]PETSC ERROR: GPU error [1]PETSC ERROR: cuda error 201 (cudaErrorDeviceUninitialized) : invalid device context [1]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. [1]PETSC ERROR: Petsc Release Version 3.16.3, unknown [1]PETSC ERROR: maxwell.gpu on a named stratosphere by hao Fri Jan 14 10:21:05 2022 [1]PETSC ERROR: Configure options --prefix=/opt/petsc/complex-double-with-cuda --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 COPTFLAGS="-O3 -mavx2" CXXOPTFLAGS="-O3 -mavx2" FOPTFLAGS="-O3 -ffree-line-length-none -mavx2" CUDAOPTFLAGS=-O3 --with-cxx-dialect=cxx14 --with-cuda-dialect=cxx14 --with-scalar-type=complex --with-precision=double --with-cuda-dir=/usr/local/cuda --with-debugging=1 [1]PETSC ERROR: #1 PetscFinalize() at /home/hao/packages/petsc-current/src/sys/objects/pinit.c:1638 You might have forgotten to call PetscInitialize(). The EXACT line numbers in the error traceback are not available. Instead the line number of the start of the function is given. [1] #1 PetscAbortFindSourceFile_Private() at /home/hao/packages/petsc-current/src/sys/error/err.c:35 [1] #2 PetscLogGetStageLog() at /home/hao/packages/petsc-current/src/sys/logging/utils/stagelog.c:29 [1] #3 PetscClassIdRegister() at /home/hao/packages/petsc-current/src/sys/logging/plog.c:2376 [1] #4 MatMFFDInitializePackage() at /home/hao/packages/petsc-current/src/mat/impls/mffd/mffd.c:45 [1] #5 MatInitializePackage() at /home/hao/packages/petsc-current/src/mat/interface/dlregismat.c:163 [1] #6 MatCreate() at /home/hao/packages/petsc-current/src/mat/utils/gcreate.c:77 However, it doesn?t seem to affect the other part of my code, so the code can continue running until it gets to the petsc part again (the *third* time). Unfortunately, it doesn?t give me any further information even if I set the debugging to yes in the configure file. It also worth noting that PETSC without CUDA (i.e. with simple MATMPIAIJ) works perfectly fine. I am able to re-produce the problem with a toy code modified from ex11f. Please see the attached file (ex11fc.F90) for details. Essentially the code does the same thing as ex11f, but three times with a do loop. To do that I added an extra MPI_INIT/MPI_FINALIZE to ensure that the MPI communicator is not destroyed when PETSC_FINALIZE is called. I used the PetscOptionsHasName utility to check if you have ?-usecuda? in the options. So running the code with and without that option can give you a comparison w/o CUDA. I can see that the code also fails after the second loop of the KSP operation. Could you kindly shed some lights on this problem? I should say that I am not even sure if the problem is from PETSc, as I also accidentally updated the NVIDIA driver (for now it is 510.06 with cuda 11.6). And it is well known that NVIDIA can give you some surprise in the updates (yes, I know I shouldn?t have touched that if it?s not broken). But my CUDA code without PETSC (which basically does the same PDE thing, but with cusparse/cublas directly) seems to work just fine after the update. It is also possible that my petsc code related to CUDA was not quite ?legitimate? ? I just use: MatSetType(A, MATMPIAIJCUSPARSE, ierr) and MatCreateVecs(A, u, PETSC_NULL_VEC, ierr) to make the data onto GPU. I would very much appreciate it if you could show me the ?right? way to do that. 
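On the question of the "right" way to get the data onto the GPU: setting the type to MATMPIAIJCUSPARSE and deriving the vectors with MatCreateVecs() is legitimate; another common pattern is to set a GPU default but let the options database override it, so the same code runs on CPU or GPU. A small petsc4py sketch of that pattern follows (the Fortran/C calls map one-to-one); it assumes a CUDA-enabled build, and n is just a placeholder problem size.

```
from petsc4py import PETSc

n = 100                                # assumed global problem size
A = PETSc.Mat().create(comm=PETSc.COMM_WORLD)
A.setSizes([n, n])
A.setType('aijcusparse')               # GPU default; -mat_type can still override it
A.setFromOptions()                     # picks up -mat_type aij / aijcusparse at run time
A.setUp()

rstart, rend = A.getOwnershipRange()
for i in range(rstart, rend):          # toy assembly; host-to-device copies are handled internally
    A.setValue(i, i, 2.0)
A.assemble()

u, b = A.createVecs()                  # vectors whose type is compatible with the matrix
```

Running with -mat_type aijcusparse (or plain -mat_type aij for a CPU run) then selects the back end without touching the code; vectors obtained from the matrix follow its type automatically.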
Thanks a lot in advance, and all the best, Hao -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure-xps.log Type: application/octet-stream Size: 1278191 bytes Desc: configure-xps.log URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ex11fc-xps.log Type: application/octet-stream Size: 6697 bytes Desc: ex11fc-xps.log URL: From patrick.sanan at gmail.com Tue Jan 25 06:36:17 2022 From: patrick.sanan at gmail.com (Patrick Sanan) Date: Tue, 25 Jan 2022 13:36:17 +0100 Subject: [petsc-users] Question about PCFieldSplit In-Reply-To: References: <9bd268fa58a14c8ba55abffa3661dad1@lanl.gov> <8A8FE6DE-F995-4242-92C7-878C48ED1A70@gmail.com> <037C3F53-8329-4950-B83B-63AC118526E3@msu.edu> <2e1a71b944a949c99c24ad85bed9350a@lanl.gov> <3667C2C9-C4E7-48C7-B7FC-B48DDA210BCF@gmail.com> Message-ID: Here is an MR which intends to introduce some logic to support DMCreateFieldDecomposition(). It doesn't use the PetscSection approach, which might be preferable, but nonetheless is a necessary component so It'd like to get it in, even if it has room for further optimization. Hopefully this can be followed fairly soon with some more examples and tests using PCFieldSplit itself. https://gitlab.com/petsc/petsc/-/merge_requests/4740 Am Mi., 23. Juni 2021 um 12:15 Uhr schrieb Matthew Knepley < knepley at gmail.com>: > On Wed, Jun 23, 2021 at 12:51 AM Patrick Sanan > wrote: > >> Hi Zakariae - >> >> The usual way to do this is to define an IS (index set) with the degrees >> of freedom of interest for the rows, and another one for the columns, and >> then use MatCreateSubmatrix [1] . >> >> There's not a particularly convenient way to create an IS with the >> degrees of freedom corresponding to a particular "stratum" (i.e. elements, >> faces, edges, or vertices) of a DMStag, but fortunately I believe we have >> some code to do exactly this in a development branch. >> >> I'll track it down and see if it can quickly be added to the main branch. >> > > Note that an easy way to keep track of this would be to create a section > with the different locations as fields. This Section could then > easily create the ISes, and could automatically interface with > PCFIELDSPLIT. > > Thanks, > > Matt > > >> >> [1]: >> https://petsc.org/release/docs/manualpages/Mat/MatCreateSubMatrix.html >> >> Am 22.06.2021 um 22:29 schrieb Jorti, Zakariae : >> >> Hello, >> >> I am working on DMStag and I have one dof on vertices (let us call it V), one >> dof on edges (let us call it E), one dof on faces ((let us call it F)) and >> one dof on cells (let us call it C). >> I build a matrix on this DM, and I was wondering if there was a way to >> get blocks (or sub matrices) of this matrix corresponding to specific >> degrees of freedom, for example rows corresponding to V dofs and columns >> corresponding to E dofs. >> I already asked this question before and the answer I got was I could >> call PCFieldSplitSetDetectSaddlePoint with the diagonal entries being of >> the matrix being zero or nonzero. >> That worked well. Nonetheless, I am curious to know if there was another >> alternative that does not require creating a dummy matrix with >> appropriate diagonal entries and solving a dummy linear system with this >> matrix to define the splits. >> >> >> Many thanks. 
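To make the row/column IS recipe described above concrete, here is a small self-contained petsc4py sketch (serial, with a toy matrix); with DMStag the two index lists would instead come from its layout, which is exactly what the development-branch helper mentioned above is meant to provide.

```
from petsc4py import PETSc

comm = PETSc.COMM_SELF
n = 8                                   # toy problem size
A = PETSc.Mat().createAIJ([n, n], comm=comm)
A.setUp()
for i in range(n):
    A.setValue(i, i, 2.0)
    if i + 1 < n:
        A.setValue(i, i + 1, -1.0)
A.assemble()

# Pretend global indices 0..3 are the V (vertex) dofs and 4..7 the E (edge) dofs.
isV = PETSc.IS().createGeneral(list(range(0, 4)), comm=comm)
isE = PETSc.IS().createGeneral(list(range(4, 8)), comm=comm)

A_VE = A.createSubMatrix(isV, isE)      # rows on V dofs, columns on E dofs
A_VE.view()
```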
>> >> Best regards, >> >> Zakariae >> ------------------------------ >> *From:* petsc-users on behalf of Tang, >> Qi >> *Sent:* Sunday, April 18, 2021 11:51:59 PM >> *To:* Patrick Sanan >> *Cc:* petsc-users at mcs.anl.gov; Tang, Xianzhu >> *Subject:* [EXTERNAL] Re: [petsc-users] Question about PCFieldSplit >> >> Thanks a lot, Patrick. We appreciate your help. >> >> Qi >> >> >> >> On Apr 18, 2021, at 11:30 PM, Patrick Sanan >> wrote: >> >> We have this functionality in a branch, which I'm working on cleaning up >> to get to master. It doesn't use PETScSection. Sorry about the delay! >> >> You can only use PCFieldSplitSetDetectSaddlePoint when your diagonal >> entries being zero or non-zero defines the splits correctly. >> >> Am 17.04.2021 um 21:09 schrieb Matthew Knepley : >> >> On Fri, Apr 16, 2021 at 8:39 PM Jorti, Zakariae via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> >>> Hello, >>> >>> I have a DMStag grid with one dof on each edge and face center. >>> I want to use a PCFieldSplit preconditioner on a Jacobian matrix that I >>> assume is already split but I am not sure how to determine the fields. >>> In the DMStag examples (ex2.c and ex3.c), the >>> function PCFieldSplitSetDetectSaddlePoint is used to determine those fields >>> based on zero diagonal entries. In my case, I have a Jacobian matrix that >>> does not have zero diagonal entries. >>> Can I use that PCFieldSplitSetDetectSaddlePoint in this case? >>> If not, how should I do? >>> Should I do like this example ( >>> https://www.mcs.anl.gov/petsc/petsc-master/src/ksp/ksp/tutorials/ex43.c.html >>> >>> ): >>> const PetscInt Bfields[1] = {0},Efields[1] = {1}; >>> KSPGetPC(ksp,&pc); >>> PCFieldSplitSetBlockSize(pc,2); >>> PCFieldSplitSetFields(pc,"B",1,Bfields,Bfields); >>> PCFieldSplitSetFields(pc,"E",1,Efields,Efields); >>> where my B unknowns are defined on face centers and E unknowns are >>> defined on edge centers? >>> >> That will not work.That interface only works for colocated fields that >> you get from DMDA. >> >> Patrick, does DMSTAG use PetscSection? Then the field split would be >> automatically calculated. If not, does it maintain the >> field division so that it could be given to PCFIELDSPLIT as ISes? >> >> Thanks, >> >> Matt >> >>> One last thing, I do not know which field comes first. Is it the one >>> defined for face dofs or edge dofs. >>> >>> Thank you. >>> Best regards, >>> >>> Zakariae >>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> >> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... 
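Since PCFieldSplitSetFields() only covers the strided/colocated case discussed above, the IS-based route goes through PCFieldSplitSetIS(). Below is a rough, self-contained petsc4py sketch; the split sizes and the diagonal toy matrix are only placeholders, and the setFieldSplitIS binding is assumed to be available in your petsc4py version.

```
from petsc4py import PETSc

comm = PETSc.COMM_SELF
n_b, n_e = 4, 4                        # placeholder counts of B (face) and E (edge) dofs
A = PETSc.Mat().createAIJ([n_b + n_e, n_b + n_e], comm=comm)
A.setUp()
for i in range(n_b + n_e):
    A.setValue(i, i, 1.0)              # toy diagonal operator
A.assemble()

# Index sets naming which global rows/columns belong to each physical field.
isB = PETSc.IS().createGeneral(list(range(0, n_b)), comm=comm)
isE = PETSc.IS().createGeneral(list(range(n_b, n_b + n_e)), comm=comm)

ksp = PETSc.KSP().create(comm=comm)
ksp.setOperators(A)
pc = ksp.getPC()
pc.setType(PETSc.PC.Type.FIELDSPLIT)
pc.setFieldSplitIS(("B", isB), ("E", isE))
ksp.setFromOptions()                   # e.g. -pc_fieldsplit_type schur, -fieldsplit_B_pc_type ...
x, b = A.createVecs()
b.set(1.0)
ksp.solve(b, x)
```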
URL: From nicolas.barnafi at unimi.it Tue Jan 25 07:06:31 2022 From: nicolas.barnafi at unimi.it (Nicolas Alejandro Barnafi) Date: Tue, 25 Jan 2022 14:06:31 +0100 Subject: [petsc-users] HYPRE AMS - Segmentation Violation with discrete gradient In-Reply-To: References: <7bf2bb42c9ccf.61ead639@unimi.it> <7d64948dc9956.61eae6c2@unimi.it> <7e7dd1a3ce69c.61eae6fe@unimi.it> <7e7daa98ce073.61eae73b@unimi.it> <7f28becdc88d0.61eae779@unimi.it> <7d0fabc8cb8c2.61eae7b5@unimi.it> <7f2cb725cce30.61eaf5d5@unimi.it> <7f36cea7174331.61eef250@unimi.it> Message-ID: <7df4d9f61a7ea7.61f003e7@unimi.it> I have added -mat_view to see the matrices and perhaps I found the issue. The matrix G (third one) should have as many columns as the other two, and this is indeed not the case: Mat Object: 1 MPI processes type: seqaij row 0: (0, 3.) row 1: (1, 3.) row 2: (2, 3.) row 3: (3, 3.) row 4: (4, 3.) row 5: (5, 3.) Mat Object: 1 MPI processes type: seqaij row 0: (0, 3.) row 1: (1, 3.) row 2: (2, 3.) row 3: (3, 3.) row 4: (4, 3.) row 5: (5, 3.) Mat Object: 1 MPI processes type: seqaij row 0: (0, -1.) (1, 1.) row 1: (0, 1.) (2, -1.) row 2: (0, -1.) (4, 1.) row 3: (1, -1.) (3, 1.) row 4: (1, 1.) (5, -1.) row 5: (2, 1.) (3, -1.) row 6: (2, 1.) (6, -1.) row 7: (3, -1.) (7, 1.) row 8: (4, 1.) (5, -1.) row 9: (4, -1.) (6, 1.) row 10: (5, 1.) (7, -1.) row 11: (6, -1.) (7, 1.)? Thank you for the guidance, hopefully after correcting these issues the problem will be solved. I guess the method somewhere considers maybe the columns of G as a reference for the inner sbspace correction matrix and it ends up creating a mismatch. Perhaps I should let them know, an assert for matrix sizes should be there somehwere. Best regards Il 24/01/22 23:02, Matthew Knepley ha scritto: > > > > On Mon, Jan 24, 2022 at 4:42 PM Matthew Knepley wrote: > > > > > > Okay, I can run your example and get an error. I am looking at it. Not sure if it is us or Hypre yet. > > > > > This is failing inside Hypre. It appears that the A_diag offset array does not match the matrix. Are you giving Hypre > a rectangular matrix? If so, it appears that the code cannot handle it and accidentally indexes out of the array. > > Thanks, > > Matt > > > > > > > > Thanks, > > > > Matt > > > > > > > > On Mon, Jan 24, 2022 at 12:39 PM Nicolas Alejandro Barnafi wrote: > > > > > > > > Please let me know if you were able to run the example (or anyone else). Thanks for the help. > > > Kind regards > > > > > > Il 21/01/22 18:05, "Nicolas Alejandro Barnafi" ha scritto: > > > > > > > > Thank you Matt, I have trimmed down ex10 (a lot)?to do as required, and it indeed reproduces the error.? > > > > You may find it attached and it can be reproduced with > > > > ./ex10 -fA Amat -fP Pmat -fG Gmat -ksp_type gmres -pc_type hypre -pc_hypre_type ams > > > > > > > > Thank you! > > > > > > > > Il 21/01/22 16:35, Matthew Knepley ha scritto: > > > > > > > > > > > > > > > > > > > > On Fri, Jan 21, 2022 at 9:50 AM Nicolas Alejandro Barnafi wrote: > > > > > > > > > > > > > > > > Dear community,? > > > > > > > > > > > > I'm giving the discrete gradient to a PC object (sub PC of a fieldsplit) but HYPRE internally gives a segmentation violation error. The matrix has been adequately set, as I have added it to the program output for inspection. Has this happened to anyone? > > > > > > > > > > > > > > > > > > > > > Is there a chance of sending us something that we can run? 
Alternatively, can you run > > > > > > > > > > https://gitlab.com/petsc/petsc/-/blob/main/src/ksp/ksp/tutorials/ex10.c > > > > > > > > > > loading your matrix and giving options to select Hypre? Then we can do the same thing here with your matrix. > > > > > > > > > > Thanks, > > > > > > > > > > Matt > > > > > > > > > > > > > > > > > > > > > > I have attached the error below together with the discrete gradient, in case you see something I am missing. > > > > > > > > > > > > The code is currently running in serial, so I don't expect communication/partitioning to be an issue (although it could be in the future). > > > > > > > > > > > > Thanks in advance, > > > > > > Nicolas > > > > > > > > > > > > > > > > > > > > > > > > ------------------------------ PETSc output ------------------------------------ > > > > > > > > > > > > > > > > > > Mat Object: 1 MPI processes > > > > > > type: seqaij > > > > > > -1.00000e+00 1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 > > > > > > 1.00000e+00 0.00000e+00 -1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 > > > > > > -1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 > > > > > > 0.00000e+00 -1.00000e+00 0.00000e+00 1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 > > > > > > 0.00000e+00 1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 -1.00000e+00 0.00000e+00 0.00000e+00 > > > > > > 0.00000e+00 0.00000e+00 1.00000e+00 -1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 > > > > > > 0.00000e+00 0.00000e+00 1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 -1.00000e+00 0.00000e+00 > > > > > > 0.00000e+00 0.00000e+00 0.00000e+00 -1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 1.00000e+00 > > > > > > 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 1.00000e+00 -1.00000e+00 0.00000e+00 0.00000e+00 > > > > > > 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 -1.00000e+00 0.00000e+00 1.00000e+00 0.00000e+00 > > > > > > 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 1.00000e+00 0.00000e+00 -1.00000e+00 > > > > > > 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 -1.00000e+00 1.00000e+00 > > > > > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > > > > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > > > > > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > > > > > [0]PETSC ERROR: or see https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > > > > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > > > > > > [0]PETSC ERROR: likely location of problem given in stack below > > > > > > [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > > > > > > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > > > > > > [0]PETSC ERROR: INSTEAD the line number of the start of the function > > > > > > [0]PETSC ERROR: is given. 
> > > > > > [0]PETSC ERROR: [0] jac->setup line 408 /home/ubuntu/petsc/src/ksp/pc/impls/hypre/hypre.c > > > > > > [0]PETSC ERROR: [0] PCSetUp_HYPRE line 223 /home/ubuntu/petsc/src/ksp/pc/impls/hypre/hypre.c > > > > > > [0]PETSC ERROR: [0] PCSetUp line 971 /home/ubuntu/petsc/src/ksp/pc/interface/precon.c > > > > > > [0]PETSC ERROR: [0] KSPSetUp line 319 /home/ubuntu/petsc/src/ksp/ksp/interface/itfunc.c > > > > > > [0]PETSC ERROR: [0] KSPSolve_Private line 615 /home/ubuntu/petsc/src/ksp/ksp/interface/itfunc.c > > > > > > [0]PETSC ERROR: [0] KSPSolve line 884 /home/ubuntu/petsc/src/ksp/ksp/interface/itfunc.c > > > > > > [0]PETSC ERROR: [0] PCApply_FieldSplit line 1241 /home/ubuntu/petsc/src/ksp/pc/impls/fieldsplit/fieldsplit.c > > > > > > [0]PETSC ERROR: [0] PCApply line 426 /home/ubuntu/petsc/src/ksp/pc/interface/precon.c > > > > > > [0]PETSC ERROR: [0] KSP_PCApply line 281 /home/ubuntu/petsc/include/petsc/private/kspimpl.h > > > > > > [0]PETSC ERROR: [0] KSPInitialResidual line 40 /home/ubuntu/petsc/src/ksp/ksp/interface/itres.c > > > > > > [0]PETSC ERROR: [0] KSPSolve_GMRES line 233 /home/ubuntu/petsc/src/ksp/ksp/impls/gmres/gmres.c > > > > > > [0]PETSC ERROR: [0] KSPSolve_Private line 615 /home/ubuntu/petsc/src/ksp/ksp/interface/itfunc.c > > > > > > [0]PETSC ERROR: [0] KSPSolve line 884 /home/ubuntu/petsc/src/ksp/ksp/interface/itfunc.c > > > > > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > > > > > [0]PETSC ERROR: Signal received > > > > > > [0]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > > > > [0]PETSC ERROR: Petsc Release Version 3.14.6, unknown > > > > > > [0]PETSC ERROR: Unknown Name on a arch-linux-c-debug named ubuntu-ThinkPad-L14-Gen-1 by ubuntu Fri Jan 21 15:37:45 2022 > > > > > > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-mpich --download-fblaslapack --with-mpi=1 --download-superlu_dist --download-mumps --download-hypre --with-debugging=1 COPTFLAGS="-O3 -march=native -mtune=native" CXXOPTFLAGS="-O3 -march=native -mtune=native" FOPTFLAGS="-O3 -march=native -mtune=native" --download-scalapack --download-hpddm > > > > > > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > > > > > > [0]PETSC ERROR: Checking the memory for corruption. > > > > > > application called MPI_Abort(MPI_COMM_WORLD, 50176059) - process 0 > > > > > > [unset]: write_line error; fd=-1 buf=:cmd=abort exitcode=50176059 > > > > > > : > > > > > > system msg for write_line failure : Bad file descriptor > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > > > > -- Norbert Wiener > > > > > > > > > > https://www.cse.buffalo.edu/~knepley/(http://www.cse.buffalo.edu/~knepley/ ) > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > > > > > > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
> > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/(http://www.cse.buffalo.edu/~knepley/ ) > > > > > > > > > > > > -- > > > > > > > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/(http://www.cse.buffalo.edu/~knepley/ ) > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From karthikeyan.chockalingam at stfc.ac.uk Tue Jan 25 08:30:26 2022 From: karthikeyan.chockalingam at stfc.ac.uk (Karthikeyan Chockalingam - STFC UKRI) Date: Tue, 25 Jan 2022 14:30:26 +0000 Subject: [petsc-users] NVIDIA's AmgX library Message-ID: <51B476AF-873A-416B-8504-45309E29B631@stfc.ac.uk> Hello, Is there a way one can use AmgX multigrid via PETSc? Kind regards, Karthik. This email and any attachments are intended solely for the use of the named recipients. If you are not the intended recipient you must not use, disclose, copy or distribute this email or any of its attachments and should notify the sender immediately and delete this email from your system. UK Research and Innovation (UKRI) has taken every reasonable precaution to minimise risk of this email or any attachments containing viruses or malware but the recipient should carry out its own virus and malware checks before opening the attachments. UKRI does not accept any liability for any losses or damages which the recipient may sustain due to presence of any viruses. -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Tue Jan 25 08:54:05 2022 From: mfadams at lbl.gov (Mark Adams) Date: Tue, 25 Jan 2022 09:54:05 -0500 Subject: [petsc-users] NVIDIA's AmgX library In-Reply-To: <51B476AF-873A-416B-8504-45309E29B631@stfc.ac.uk> References: <51B476AF-873A-416B-8504-45309E29B631@stfc.ac.uk> Message-ID: We have an AMGx interface under development. Mark On Tue, Jan 25, 2022 at 9:30 AM Karthikeyan Chockalingam - STFC UKRI via petsc-users wrote: > Hello, > > > > Is there a way one can use AmgX multigrid via PETSc? > > > > Kind regards, > > Karthik. > > This email and any attachments are intended solely for the use of the > named recipients. If you are not the intended recipient you must not use, > disclose, copy or distribute this email or any of its attachments and > should notify the sender immediately and delete this email from your > system. UK Research and Innovation (UKRI) has taken every reasonable > precaution to minimise risk of this email or any attachments containing > viruses or malware but the recipient should carry out its own virus and > malware checks before opening the attachments. UKRI does not accept any > liability for any losses or damages which the recipient may sustain due to > presence of any viruses. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jacob.fai at gmail.com Tue Jan 25 09:19:34 2022 From: jacob.fai at gmail.com (Jacob Faibussowitsch) Date: Tue, 25 Jan 2022 09:19:34 -0600 Subject: [petsc-users] Strange CUDA failure with a second petscfinalize with PETSc 3.16 In-Reply-To: References: <45585074-8C0E-453B-993B-554D23A1E971@gmail.com> <60E1BD31-E98A-462F-BA2B-2099745B2582@gmail.com> Message-ID: Hi Hao, > I have tried to git pull on the laptop from main and re-config as you suggested. It looks like you?re still on the release branch. Do ``` $ git checkout main $ git pull ``` Then reconfigure. 
This is also why the cxx dialect flag did not work, I forgot that this change had not made it to release yet. > my laptop setup is based on WSL What version of windows do you have? And what version of WSL? And what version is the linux kernel? You will need at least WSL2 and both your NVIDIA driver, windows version, and linux kernel version are required to be fairly new AFAIK to be able to run CUDA on them. See here https://docs.nvidia.com/cuda/wsl-user-guide/index.html . To get your windows version: 1. Press Windows key+R 2. Type winver in the box, and press enter 3. You should see a line with Version and a build number. For example on my windows machine I see ?Version 21H2 (OS Build 19044.1496)? To get WSL version: 1. Open WSL 2. Type uname -r, for example I get "5.10.60.1-microsoft-standard-wsl2" To get NVIDIA driver version: 1. Open up the NVIDIA control panel 2. Click on ?System Information? in the bottom left corner 3. You should see a dual list, ?Items? and ?Details?. In the details column. You should see ?Driver verion?. For example on my machine I see ?511.23? Best regards, Jacob Faibussowitsch (Jacob Fai - booss - oh - vitch) > On Jan 25, 2022, at 03:42, Hao DONG wrote: > > Hi Jacob, > > Thanks for the comments ? silly that I have overlooked the debugging flag so far. Unfortunately, I am out of office for a couple of days so I cannot confirm the result on my workstation, for now. > > However, I have a laptop with nvidia graphic card (old gtx1050, which is actually slower than the cpu in terms of double precision calculation), I have tried to git pull on the laptop from main and re-config as you suggested. > > However, using ?--with-cxx-dialect=14? throws up an error like: > > Unknown C++ dialect: with-cxx-dialect=14 > > And omitting the ?--with-cuda-dialect=cxx14? also gives me a similar complaint with: > > CUDA Error: Using CUDA with PetscComplex requires a C++ dialect at least cxx11. Use --with-cxx-dialect=xxx and --with-cuda-dialect=xxx to specify a suitable compiler. > > Eventually I was able to configure and compile with the following config setup: > > ./configure --prefix=/opt/petsc/debug --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 --with-scalar-type=complex --with-precision=double --with-cuda-dir=/usr/local/cuda --with-fortran-kernels=1 --with-cxx-dialect=cxx14 --with-cuda-dialect=cxx14 --with-debugging=1 > > But still, I got the same error regarding cuda (still the error code 97 thing). I attached the configure and output log of my ex11 on my laptop ? is there anything that can help pinpoint the problem? I can also confirm that PETSc 3.15.2 works well with my ex11fc code with cuda, on my laptop. Sadly, my laptop setup is based on WSL, which is far from an ideal environment to test CUDA. I will let you know once I get my hands on my workstations. > > Cheers, > Hao > > Sent from Mail for Windows > > From: Jacob Faibussowitsch > Sent: Tuesday, January 25, 2022 1:22 AM > To: Hao DONG > Cc: petsc-users ; Junchao Zhang > Subject: Re: [petsc-users] Strange CUDA failure with a second petscfinalize with PETSc 3.16 > > Hi Hao, > > Any luck reproducing the CUDA problem? > > Sorry for the long radio silence, I still have not been able to reproduce the problem unfortunately. I have tried on a local machine, and a few larger clusters and all return without errors both with and without cuda? > > Can you try pulling the latest version of main, reconfiguring and trying again? > > BTW, your configure arguments are a little wonky: > > 1. 
--with-clanguage=c - this isn?t needed, PETSc will default to C > 2. --with-cxx-dialect=cxx14 --with-cuda-dialect=cxx14 - use --with-cxx-dialect=14 instead, PETSc will detect that you have gnu compilers and enable gnu extensions > 3. -with-debugging=1 - this is missing an extra dash, but you also have optimization flags set so maybe just leave this one out > > Best regards, > > Jacob Faibussowitsch > (Jacob Fai - booss - oh - vitch) > > > On Jan 23, 2022, at 21:29, Hao DONG > wrote: > > Dear Jacob, > > Any luck reproducing the CUDA problem? - just write to check in, in case somehow the response did not reach me (this happens to my colleagues abroad sometimes, probably due to the Wall). > > All the best, > Hao > > > On Jan 19, 2022, at 3:01 PM, Hao DONG > wrote: > > ? > Thanks Jacob for looking into this ? > > You can see the updated source code of ex11fc in the attachment ? although there is not much that I modified (except for the jabbers I outputted). I also attached the full output (ex11fc.log) along with the configure.log file. It?s an old dual Xeon workstation (one of my ?production? machines) with Linux kernel 5.4.0 and gcc 9.3. > > I simply ran the code with > > mpiexec -np 2 ex11fc -usecuda > > for GPU test. And as stated before, calling without the ?-usecuda? option shows no errors. > > Please let me know if you find anything wrong with the configure/code. > > Cheers, > Hao > > From: Jacob Faibussowitsch > Sent: Wednesday, January 19, 2022 3:38 AM > To: Hao DONG > Cc: Junchao Zhang ; petsc-users > Subject: Re: [petsc-users] Strange CUDA failure with a second petscfinalize with PETSc 3.16 > > Apologies, forgot to mention in my previous email but can you also include a copy of the full printout of the error message that you get? It will include all the command-line flags that you ran with (if any) so I can exactly mirror your environment. > > Best regards, > > Jacob Faibussowitsch > (Jacob Fai - booss - oh - vitch) > > > > On Jan 18, 2022, at 14:06, Jacob Faibussowitsch > wrote: > > Can you send your updated source file as well as your configure.log (should be $PETSC_DIR/configure.log). I will see if I can reproduce the error on my end. > > Best regards, > > Jacob Faibussowitsch > (Jacob Fai - booss - oh - vitch) > > > > On Jan 17, 2022, at 23:06, Hao DONG > wrote: > > ? > Dear Junchao and Jacob, > > Thanks a lot for the response ? I also don?t understand why this is related to the device, especially on why the procedure can be successfully finished for *once* ? As instructed, I tried to add a CHKERRA() macro after (almost) every petsc line ? such as the initialization, mat assemble, ksp create, solve, mat destroy, etc. However, all other petsc commands returns with error code 0. It only gives me a similar (still not very informative) error after I call the petscfinalize (again for the second time), with error code 97: > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: GPU error > [0]PETSC ERROR: cuda error 709 (cudaErrorContextIsDestroyed) : context is destroyed > [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. 
> [0]PETSC ERROR: Petsc Release Version 3.16.3, unknown > [0]PETSC ERROR: ./ex11f on a named stratosphere by donghao Tue Jan 18 11:39:43 2022 > [0]PETSC ERROR: Configure options --prefix=/opt/petsc/complex-double-with-cuda --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 COPTFLAGS="-O3 -mavx2" CXXOPTFLAGS="-O3 -mavx2" FOPTFLAGS="-O3 -ffree-line-length-none -mavx2" CUDAOPTFLAGS=-O3 --with-cxx-dialect=cxx14 --with-cuda-dialect=cxx14 --with-scalar-type=complex --with-precision=double --with-cuda-dir=/usr/local/cuda --with-debugging=1 > [0]PETSC ERROR: #1 PetscFinalize() at /home/donghao/packages/petsc-current/src/sys/objects/pinit.c:1638 > [0]PETSC ERROR: #2 User provided function() at User file:0 > > I can also confirm that rolling back to petsc 3.15 will *not* see the problem, even with the new nvidia driver. And petsc 3.16.3 with an old nvidia driver (470.42) also get this same error. So it?s probably not connected to the nvidia driver. > > Any idea on where I should look at next? > Thanks a lot in advance, and all the best, > Hao > > From: Jacob Faibussowitsch > Sent: Sunday, January 16, 2022 12:12 AM > To: Junchao Zhang > Cc: petsc-users ; Hao DONG > Subject: Re: [petsc-users] Strange CUDA failure with a second petscfinalize with PETSc 3.16 > > I don?t quite understand how it is getting to the CUDA error to be honest. None of the code in the stack trace is anywhere near the device code. Reading the error message carefully, it first chokes on PetscLogGetStageLog() from a call to PetscClassIdRegister(): > > PetscErrorCode PetscLogGetStageLog(PetscStageLog *stageLog) > { > PetscFunctionBegin; > PetscValidPointer(stageLog,1); > if (!petsc_stageLog) { > fprintf(stderr, "PETSC ERROR: Logging has not been enabled.\nYou might have forgotten to call PetscInitialize().\n"); > PETSCABORT(MPI_COMM_WORLD, PETSC_ERR_SUP); // Here > } > ... > > But then jumps to PetscFinalize(). You can also see the "You might have forgotten to call PetscInitialize().? message in the error message, just under the topmost level of the stack trace. > > Can you check the value of ierr of each function call (use the CHKERRA() macro to do so)? I suspect the problem here that errors occurring previously in the program are being ignored, leading to the garbled stack trace. > > Best regards, > > Jacob Faibussowitsch > (Jacob Fai - booss - oh - vitch) > > > > > On Jan 14, 2022, at 20:58, Junchao Zhang > wrote: > > Jacob, > Could you have a look as it seems the "invalid device context" is in your newly added module? > Thanks > --Junchao Zhang > > > On Fri, Jan 14, 2022 at 12:49 AM Hao DONG > wrote: > Dear All, > > I have encountered a peculiar problem when fiddling with a code with PETSC 3.16.3 (which worked fine with PETSc 3.15). It is a very straight forward PDE-based optimization code which repeatedly solves a linearized PDE problem with KSP in a subroutine (the rest of the code does not contain any PETSc related content). The main program provides the subroutine with an MPI comm. Then I set the comm as PETSC_COMM_WORLD to tell PETSC to attach to it (and detach with it when the solving is finished each time). > > Strangely, I observe a CUDA failure whenever the petscfinalize is called for a *second* time. In other words, the first and second PDE calculations with GPU are fine (with correct solutions). The petsc code just fails after the SECOND petscfinalize command is called. 
You can also see the PETSC config in the error message: > > [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [1]PETSC ERROR: GPU error > [1]PETSC ERROR: cuda error 201 (cudaErrorDeviceUninitialized) : invalid device context > [1]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. > [1]PETSC ERROR: Petsc Release Version 3.16.3, unknown > [1]PETSC ERROR: maxwell.gpu on a named stratosphere by hao Fri Jan 14 10:21:05 2022 > [1]PETSC ERROR: Configure options --prefix=/opt/petsc/complex-double-with-cuda --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 COPTFLAGS="-O3 -mavx2" CXXOPTFLAGS="-O3 -mavx2" FOPTFLAGS="-O3 -ffree-line-length-none -mavx2" CUDAOPTFLAGS=-O3 --with-cxx-dialect=cxx14 --with-cuda-dialect=cxx14 --with-scalar-type=complex --with-precision=double --with-cuda-dir=/usr/local/cuda --with-debugging=1 > [1]PETSC ERROR: #1 PetscFinalize() at /home/hao/packages/petsc-current/src/sys/objects/pinit.c:1638 > You might have forgotten to call PetscInitialize(). > The EXACT line numbers in the error traceback are not available. > Instead the line number of the start of the function is given. > [1] #1 PetscAbortFindSourceFile_Private() at /home/hao/packages/petsc-current/src/sys/error/err.c:35 > [1] #2 PetscLogGetStageLog() at /home/hao/packages/petsc-current/src/sys/logging/utils/stagelog.c:29 > [1] #3 PetscClassIdRegister() at /home/hao/packages/petsc-current/src/sys/logging/plog.c:2376 > [1] #4 MatMFFDInitializePackage() at /home/hao/packages/petsc-current/src/mat/impls/mffd/mffd.c:45 > [1] #5 MatInitializePackage() at /home/hao/packages/petsc-current/src/mat/interface/dlregismat.c:163 > [1] #6 MatCreate() at /home/hao/packages/petsc-current/src/mat/utils/gcreate.c:77 > > However, it doesn?t seem to affect the other part of my code, so the code can continue running until it gets to the petsc part again (the *third* time). Unfortunately, it doesn?t give me any further information even if I set the debugging to yes in the configure file. It also worth noting that PETSC without CUDA (i.e. with simple MATMPIAIJ) works perfectly fine. > > I am able to re-produce the problem with a toy code modified from ex11f. Please see the attached file (ex11fc.F90) for details. Essentially the code does the same thing as ex11f, but three times with a do loop. To do that I added an extra MPI_INIT/MPI_FINALIZE to ensure that the MPI communicator is not destroyed when PETSC_FINALIZE is called. I used the PetscOptionsHasName utility to check if you have ?-usecuda? in the options. So running the code with and without that option can give you a comparison w/o CUDA. I can see that the code also fails after the second loop of the KSP operation. Could you kindly shed some lights on this problem? > > I should say that I am not even sure if the problem is from PETSc, as I also accidentally updated the NVIDIA driver (for now it is 510.06 with cuda 11.6). And it is well known that NVIDIA can give you some surprise in the updates (yes, I know I shouldn?t have touched that if it?s not broken). But my CUDA code without PETSC (which basically does the same PDE thing, but with cusparse/cublas directly) seems to work just fine after the update. It is also possible that my petsc code related to CUDA was not quite ?legitimate? ? I just use: > MatSetType(A, MATMPIAIJCUSPARSE, ierr) > and > MatCreateVecs(A, u, PETSC_NULL_VEC, ierr) > to make the data onto GPU. I would very much appreciate it if you could show me the ?right? way to do that. 
> > Thanks a lot in advance, and all the best, > Hao > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From axel.fourmont at cea.fr Tue Jan 25 08:05:17 2022 From: axel.fourmont at cea.fr (FOURMONT Axel) Date: Tue, 25 Jan 2022 14:05:17 +0000 Subject: [petsc-users] problem with spack instaler (trilinos link) Message-ID: <6498bab351304fb38d92b12ebf7563ca@cea.fr> Dear PETSc developers, First of all thank you for your work! I try to use the spack tool to install petsc with mumps: spack install petsc+mumps~hdf5 (with the good version for compilers). All is OK, PETSc works fine. But now, I want acces to ML preconditioner, so I need install a PETSc version with trilinos: spack install petsc+mumps+trilinos~hdf5 The compilation fails (in the check phase), I notices 2 things: petsc links on trilinos/lib64 but the directory path is lib I make a symbolic link: ln -s trilinos/lib trilinos/lib64 to try solve it Also there is an error with the definition of Zoltan_Create() Can you help me please? I attach the configure.log. I use the last spack release v0.17.1 and ubuntu 20.04 with gcc-10 gfortran-10, openmpi 4.0.3 from apt instaler Thanks, Axel -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: text/x-log Size: 1231000 bytes Desc: configure.log URL: From balay at mcs.anl.gov Tue Jan 25 10:46:50 2022 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 25 Jan 2022 10:46:50 -0600 (CST) Subject: [petsc-users] problem with spack instaler (trilinos link) In-Reply-To: <6498bab351304fb38d92b12ebf7563ca@cea.fr> References: <6498bab351304fb38d92b12ebf7563ca@cea.fr> Message-ID: Yeah petsc+trilinos is currently broken in spack. On the petsc side - we currently use minimal ml tarball and that doesn't translate to full trilinos install. So if you need this feature - the current fix is to install petsc manually - without spack [or hack spack to add in '--download-ml' to petsc configure option - and not use +trilinos] Satish On Tue, 25 Jan 2022, FOURMONT Axel wrote: > Dear PETSc developers, > > First of all thank you for your work! > > I try to use the spack tool to install petsc with mumps: spack install petsc+mumps~hdf5 (with the good version for compilers). All is OK, PETSc works fine. > But now, I want acces to ML preconditioner, so I need install a PETSc version with trilinos: spack install petsc+mumps+trilinos~hdf5 > > The compilation fails (in the check phase), I notices 2 things: > petsc links on trilinos/lib64 but the directory path is lib > I make a symbolic link: ln -s trilinos/lib trilinos/lib64 to try solve it > Also there is an error with the definition of Zoltan_Create() > > Can you help me please? > > I attach the configure.log. > I use the last spack release v0.17.1 and > ubuntu 20.04 with gcc-10 gfortran-10, openmpi 4.0.3 from apt instaler > > > Thanks, > Axel > > From hongzhang at anl.gov Tue Jan 25 14:29:25 2022 From: hongzhang at anl.gov (Zhang, Hong) Date: Tue, 25 Jan 2022 20:29:25 +0000 Subject: [petsc-users] Asking examples about solving DAE in python In-Reply-To: References: Message-ID: <8088CFA5-E3E2-4D68-A7C2-C2F212722101@anl.gov> The following code should work. 
class Pendulum(object): n = 5 comm = PETSc.COMM_SELF l = 1.0 m = 1.0 g = 1.0 def initialCondition(self, x): # mu = self.mu_ #initial condition theta0= np.pi/3 #starting angle x0=np.sin(theta0) y0=-(self.l-x0**2)**.5 lambdaval = 0.1 x[0] = x0 x[1] = y0 x[2] = 0 x[4] = 0 x[5] = lambdaval x.assemble() def evalIFunction(self, ts, t, x, xdot, f): f[0] = x[2]-xdot[0] f[1] = x[3]-xdot[1] f[2] = -xdot[2]+x[4]*x[0]/self.m f[3] = -xdot[3]+x[4]*x[1]/self.m-self.g f[4] = [x[2]**2 + x[3]**2 + (x[0]**2 + x[1]**2)/self.m*x[4] - x[1] * self.g] f.assemble() You do not need to initialize xdot. Only x needs to be initialized. Hong On Jan 25, 2022, at 12:03 AM, Xiong, Jing via petsc-users > wrote: Good morning, Thanks for all your help. I'm trying to implement a toy example solving a DAE for a Pendulum system and I got two questions: 1. How to set the initial value for xdot? 2. I got the following error information: * Assertion failed: (!PyErr_Occurred()), function __Pyx_PyCFunction_FastCall, file src/petsc4py.PETSc.c, line 359099. The system equation is given in https://en.wikipedia.org/wiki/Differential-algebraic_system_of_equations The formula I used is shown as follows: I also attached my code below: import sys, petsc4py petsc4py.init(sys.argv) import numpy as np from petsc4py import PETSc class Pendulum(object): n = 5 comm = PETSc.COMM_SELF def initialCondition(self, x): # mu = self.mu_ l = 1.0 m = 1.0 g = 1.0 #initial condition theta0= np.pi/3 #starting angle x0=np.sin(theta0) y0=-(l-x0**2)**.5 lambdaval = 0.1 x[0] = x0 x[1] = y0 x[4] = lambdaval x.assemble() def evalIFunction(self, ts, t, x, xdot, f): f.setArray ([x[2]-xdot[0]], [x[3]-xdot[1]], [-xdot[2]+x[4]*x[0]/m], [-xdot[3]+x[4]*x[1]/m-g], [x[2]**2 + x[3]**2 + (x[0]**2 + x[1]**2)/m*x[4] - x[1] * g]) OptDB = PETSc.Options() ode = Pendulum() x = PETSc.Vec().createSeq(ode.n, comm=ode.comm) f = x.duplicate() ts = PETSc.TS().create(comm=ode.comm) ts.setType(ts.Type.CN) ts.setIFunction(ode.evalIFunction, f) ts.setSaveTrajectory() ts.setTime(0.0) ts.setTimeStep(0.001) ts.setMaxTime(0.5) ts.setMaxSteps(1000) ts.setExactFinalTime(PETSc.TS.ExactFinalTime.MATCHSTEP) ts.setFromOptions() ode.initialCondition(x) ts.solve(x) Best, Jing ________________________________ From: Zhang, Hong > Sent: Thursday, January 20, 2022 3:05 PM To: Xiong, Jing > Cc: petsc-users at mcs.anl.gov >; Zhao, Dongbo >; Hong, Tianqi > Subject: Re: [petsc-users] Asking examples about solving DAE in python On Jan 20, 2022, at 4:13 PM, Xiong, Jing via petsc-users > wrote: Hi, I hope you are well. I'm very interested in PETSc and want to explore the possibility of whether it could solve Differential-algebraic equations (DAE) in python. I know there are great examples in C, but I'm struggling to connect the examples in python. The only example I got right now is for solving ODEs in the paper: PETSc/TS: A Modern Scalable ODE/DAE Solver Library. And I got the following questions: 1. Is petsc4py the right package to use? Yes, you need petsc4py for python. 1. Could you give an example for solving DAEs in python? src/binding/petsc4py/demo/ode/vanderpol.py gives a simple example to demonstrate a variety of PETSc capabilities. The corresponding C version of this example is ex20adj.c in src/ts/tutorials/. 1. How to solve an ODE with explicit methods. 2. How to solve an ODE/DAE with implicit methods. 3. How to use TSAdjoint to calculate adjoint sensitivities. 4. How to do a manual matrix-free implementation (e.g. 
you may already have a way to differentiate your RHS function to generate the Jacobian-vector product). 1. Is Jacobian must be specified? If not, could your show an example for solving DAEs without specified Jacobian in python? PETSc can do finite-difference approximations to generate the Jacobian matrix automatically. This may work nicely for small-sized problems. You can also try to use an AD tool to produce the Jacobian-vector product and use it in a matrix-free implementation. Hong Thank you for your help. Best, Jing -------------- next part -------------- An HTML attachment was scrubbed... URL: From hongzhang at anl.gov Tue Jan 25 14:41:27 2022 From: hongzhang at anl.gov (Zhang, Hong) Date: Tue, 25 Jan 2022 20:41:27 +0000 Subject: [petsc-users] Asking examples about solving DAE in python In-Reply-To: <8088CFA5-E3E2-4D68-A7C2-C2F212722101@anl.gov> References: <8088CFA5-E3E2-4D68-A7C2-C2F212722101@anl.gov> Message-ID: <00A3FBE7-9DCA-4B80-930C-A27B8395C60D@anl.gov> Oops. Some typos in the index. They are corrected in below. x[0] = x0 x[1] = y0 x[2] = 0 x[3] = 0 x[4] = lambdaval Note that these initial values are randomly picked. Ideally, you should make sure the initial condition is consistent for DAEs. In other words, the algebraic equations (the last equation in your case) should be satisfied when you choose initial values. Hong > On Jan 25, 2022, at 2:29 PM, Zhang, Hong via petsc-users wrote: > > The following code should work. > > class Pendulum(object): > n = 5 > comm = PETSc.COMM_SELF > l = 1.0 > m = 1.0 > g = 1.0 > def initialCondition(self, x): > # mu = self.mu_ > #initial condition > theta0= np.pi/3 #starting angle > x0=np.sin(theta0) > y0=-(self.l-x0**2)**.5 > lambdaval = 0.1 > x[0] = x0 > x[1] = y0 > x[2] = 0 > x[4] = 0 > x[5] = lambdaval > x.assemble() > > def evalIFunction(self, ts, t, x, xdot, f): > f[0] = x[2]-xdot[0] > f[1] = x[3]-xdot[1] > f[2] = -xdot[2]+x[4]*x[0]/self.m > f[3] = -xdot[3]+x[4]*x[1]/self.m-self.g > f[4] = [x[2]**2 + x[3]**2 + (x[0]**2 + x[1]**2)/self.m*x[4] - x[1] * self.g] > f.assemble() > > You do not need to initialize xdot. Only x needs to be initialized. > > Hong > >> On Jan 25, 2022, at 12:03 AM, Xiong, Jing via petsc-users wrote: >> >> Good morning, >> >> Thanks for all your help. >> I'm trying to implement a toy example solving a DAE for a Pendulum system and I got two questions: >> ? How to set the initial value for xdot? >> ? I got the following error information: >> ? Assertion failed: (!PyErr_Occurred()), function __Pyx_PyCFunction_FastCall, file src/petsc4py.PETSc.c, line 359099. 
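To make the consistency note above concrete: with the reduced (index-1) constraint used in this thread and a pendulum starting at rest, lambda can be chosen so that the algebraic equation already holds at t = 0. The short sketch below is only an illustration of that calculation (it is not part of the original exchange, and it assumes x[2] = x[3] = 0 initially):

```
import numpy as np

# Pendulum parameters as in the thread
l, m, g = 1.0, 1.0, 1.0
theta0 = np.pi / 3                    # starting angle

x0 = np.sin(theta0)
y0 = -np.sqrt(l**2 - x0**2)           # note l**2 under the root; (l - x0**2) coincides only because l = 1
# With zero initial velocity the algebraic equation
#   x2**2 + x3**2 + (x0**2 + y0**2)/m * lam - y0*g = 0
# reduces to:
lambdaval = m * g * y0 / (x0**2 + y0**2)

print(x0, y0, lambdaval)              # consistent values for x[0], x[1], x[4]; x[2] = x[3] = 0
```

With these values the residual f[4] evaluates to zero at the initial time, so the DAE starts from a consistent state.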
>> The system equation is given in https://en.wikipedia.org/wiki/Differential-algebraic_system_of_equations >> The formula I used is shown as follows: >> >> >> I also attached my code below: >> import sys, petsc4py >> petsc4py.init(sys.argv) >> import numpy as np >> >> from petsc4py import PETSc >> >> class Pendulum(object): >> n = 5 >> comm = PETSc.COMM_SELF >> def initialCondition(self, x): >> # mu = self.mu_ >> l = 1.0 >> m = 1.0 >> g = 1.0 >> #initial condition >> theta0= np.pi/3 #starting angle >> x0=np.sin(theta0) >> y0=-(l-x0**2)**.5 >> lambdaval = 0.1 >> x[0] = x0 >> x[1] = y0 >> x[4] = lambdaval >> x.assemble() >> >> def evalIFunction(self, ts, t, x, xdot, f): >> f.setArray ([x[2]-xdot[0]], >> [x[3]-xdot[1]], >> [-xdot[2]+x[4]*x[0]/m], >> [-xdot[3]+x[4]*x[1]/m-g], >> [x[2]**2 + x[3]**2 + (x[0]**2 + x[1]**2)/m*x[4] - x[1] * g]) >> >> OptDB = PETSc.Options() >> ode = Pendulum() >> >> x = PETSc.Vec().createSeq(ode.n, comm=ode.comm) >> f = x.duplicate() >> >> ts = PETSc.TS().create(comm=ode.comm) >> >> ts.setType(ts.Type.CN) >> ts.setIFunction(ode.evalIFunction, f) >> >> ts.setSaveTrajectory() >> ts.setTime(0.0) >> ts.setTimeStep(0.001) >> ts.setMaxTime(0.5) >> ts.setMaxSteps(1000) >> ts.setExactFinalTime(PETSc.TS.ExactFinalTime.MATCHSTEP) >> >> ts.setFromOptions() >> ode.initialCondition(x) >> ts.solve(x) >> >> Best, >> Jing >> From: Zhang, Hong >> Sent: Thursday, January 20, 2022 3:05 PM >> To: Xiong, Jing >> Cc: petsc-users at mcs.anl.gov ; Zhao, Dongbo ; Hong, Tianqi >> Subject: Re: [petsc-users] Asking examples about solving DAE in python >> >> >> >>> On Jan 20, 2022, at 4:13 PM, Xiong, Jing via petsc-users wrote: >>> >>> Hi, >>> >>> I hope you are well. >>> I'm very interested in PETSc and want to explore the possibility of whether it could solve Differential-algebraic equations (DAE) in python. I know there are great examples in C, but I'm struggling to connect the examples in python. >>> >>> The only example I got right now is for solving ODEs in the paper: PETSc/TS: A Modern Scalable ODE/DAE Solver Library. >>> And I got the following questions: >>> ? Is petsc4py the right package to use? >> >> Yes, you need petsc4py for python. >> >>> ? Could you give an example for solving DAEs in python? >> >> src/binding/petsc4py/demo/ode/vanderpol.py gives a simple example to demonstrate a variety of PETSc capabilities. The corresponding C version of this example is ex20adj.c in src/ts/tutorials/. >> ? How to solve an ODE with explicit methods. >> ? How to solve an ODE/DAE with implicit methods. >> ? How to use TSAdjoint to calculate adjoint sensitivities. >> ? How to do a manual matrix-free implementation (e.g. you may already have a way to differentiate your RHS function to generate the Jacobian-vector product). >> >>> ? Is Jacobian must be specified? If not, could your show an example for solving DAEs without specified Jacobian in python? >> >> PETSc can do finite-difference approximations to generate the Jacobian matrix automatically. This may work nicely for small-sized problems. You can also try to use an AD tool to produce the Jacobian-vector product and use it in a matrix-free implementation. >> >> Hong >> >>> >>> Thank you for your help. 
>>> >>> Best, >>> Jing > From Kang_Peng at uml.edu Tue Jan 25 15:53:58 2022 From: Kang_Peng at uml.edu (Peng, Kang) Date: Tue, 25 Jan 2022 21:53:58 +0000 Subject: [petsc-users] Error when configuring the PETSC environment Message-ID: Hi PETSc, I am trying to configure the PETSC environment in MacOS (Apple M1 pro chip, macOS 12.1), but something went wrong when I executing those command below. I tried many methods but failed to solve it. Could you help me to solve it? I?ve been following this instruction to install PETSc and configure the environment, but I can?t do it after changing to the new chip. https://www.pflotran.org/documentation/user_guide/how_to/installation/linux.html#linux-install Error is as follows: von at MacBook-Pro-VON petsc % ./configure --CFLAGS='-O3' --CXXFLAGS='-O3' --FFLAGS='-O3' --with-debugging=no --download-mpich=yes --download-hdf5=yes --download-fblaslapack=yes --download-cmake=yes --download-metis=yes --download-parmetis=yes --download-hdf5-fortran-bindings=yes --download-hdf5-configure-arguments="--with-zlib=yes" ============================================================================================= Configuring PETSc to compile on your system ============================================================================================= ============================================================================================= ***** WARNING: You have a version of GNU make older than 4.0. It will work, but may not support all the parallel testing options. You can install the latest GNU make with your package manager, such as brew or macports, or use the --download-make option to get the latest GNU make ***** ============================================================================================= ============================================================================================= Running configure on CMAKE; this may take several minutes ============================================================================================= ******************************************************************************* UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): ------------------------------------------------------------------------------- Error running configure on CMAKE ******************************************************************************* ******************************************************************************* UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): ------------------------------------------------------------------------------- Error running configure on CMAKE ******************************************************************************* File "/Users/von/petsc/config/configure.py", line 465, in petsc_configure framework.configure(out = sys.stdout) File "/Users/von/petsc/config/BuildSystem/config/framework.py", line 1385, in configure self.processChildren() File "/Users/von/petsc/config/BuildSystem/config/framework.py", line 1373, in processChildren self.serialEvaluation(self.childGraph) File "/Users/von/petsc/config/BuildSystem/config/framework.py", line 1348, in serialEvaluation child.configure() File "/Users/von/petsc/config/BuildSystem/config/packages/cmake.py", line 75, in configure config.package.GNUPackage.configure(self) File "/Users/von/petsc/config/BuildSystem/config/package.py", line 1189, in configure self.executeTest(self.configureLibrary) File "/Users/von/petsc/config/BuildSystem/config/base.py", line 138, in executeTest ret = test(*args,**kargs) File 
"/Users/von/petsc/config/BuildSystem/config/package.py", line 935, in configureLibrary for location, directory, lib, incl in self.generateGuesses(): File "/Users/von/petsc/config/BuildSystem/config/package.py", line 509, in generateGuesses d = self.checkDownload() File "/Users/von/petsc/config/BuildSystem/config/package.py", line 643, in checkDownload return self.getInstallDir() File "/Users/von/petsc/config/BuildSystem/config/package.py", line 405, in getInstallDir installDir = self.Install() File "/Users/von/petsc/config/BuildSystem/config/packages/cmake.py", line 47, in Install retdir = config.package.GNUPackage.Install(self) File "/Users/von/petsc/config/BuildSystem/config/package.py", line 1733, in Install raise RuntimeError('Error running configure on ' + self.PACKAGE) ================================================================================ Finishing configure run at Tue, 25 Jan 2022 16:21:58 -0500 ================================================================================ Thanks, Kang -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefano.zampini at gmail.com Tue Jan 25 15:57:33 2022 From: stefano.zampini at gmail.com (Stefano Zampini) Date: Wed, 26 Jan 2022 00:57:33 +0300 Subject: [petsc-users] Error when configuring the PETSC environment In-Reply-To: References: Message-ID: You should attach configure.log if you want us to take a look at the failure. In any case, you should use gmake that you can install via brew > On Jan 26, 2022, at 12:53 AM, Peng, Kang wrote: > > Hi PETSc, > > I am trying to configure the PETSC environment in MacOS (Apple M1 pro chip, macOS 12.1), but something went wrong when I executing those command below. I tried many methods but failed to solve it. Could you help me to solve it? > > I?ve been following this instruction to install PETSc and configure the environment, but I can?t do it after changing to the new chip. > https://www.pflotran.org/documentation/user_guide/how_to/installation/linux.html#linux-install > > Error is as follows: > von at MacBook-Pro-VON petsc % ./configure --CFLAGS='-O3' --CXXFLAGS='-O3' --FFLAGS='-O3' --with-debugging=no --download-mpich=yes --download-hdf5=yes --download-fblaslapack=yes --download-cmake=yes --download-metis=yes --download-parmetis=yes --download-hdf5-fortran-bindings=yes --download-hdf5-configure-arguments="--with-zlib=yes" > ============================================================================================= > Configuring PETSc to compile on your system > ============================================================================================= > ============================================================================================= ***** WARNING: You have a version of GNU make older than 4.0. It will work, but may not support all the parallel testing options. 
You can install the latest GNU make with your package manager, such as brew or macports, or use the --download-make option to get the latest GNU make ***** ============================================================================================= ============================================================================================= Running configure on CMAKE; this may take several minutes ============================================================================================= ******************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): > ------------------------------------------------------------------------------- > Error running configure on CMAKE > ******************************************************************************* > > ******************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): > ------------------------------------------------------------------------------- > Error running configure on CMAKE > ******************************************************************************* > File "/Users/von/petsc/config/configure.py", line 465, in petsc_configure > framework.configure(out = sys.stdout) > File "/Users/von/petsc/config/BuildSystem/config/framework.py", line 1385, in configure > self.processChildren() > File "/Users/von/petsc/config/BuildSystem/config/framework.py", line 1373, in processChildren > self.serialEvaluation(self.childGraph) > File "/Users/von/petsc/config/BuildSystem/config/framework.py", line 1348, in serialEvaluation > child.configure() > File "/Users/von/petsc/config/BuildSystem/config/packages/cmake.py", line 75, in configure > config.package.GNUPackage.configure(self) > File "/Users/von/petsc/config/BuildSystem/config/package.py", line 1189, in configure > self.executeTest(self.configureLibrary) > File "/Users/von/petsc/config/BuildSystem/config/base.py", line 138, in executeTest > ret = test(*args,**kargs) > File "/Users/von/petsc/config/BuildSystem/config/package.py", line 935, in configureLibrary > for location, directory, lib, incl in self.generateGuesses(): > File "/Users/von/petsc/config/BuildSystem/config/package.py", line 509, in generateGuesses > d = self.checkDownload() > File "/Users/von/petsc/config/BuildSystem/config/package.py", line 643, in checkDownload > return self.getInstallDir() > File "/Users/von/petsc/config/BuildSystem/config/package.py", line 405, in getInstallDir > installDir = self.Install() > File "/Users/von/petsc/config/BuildSystem/config/packages/cmake.py", line 47, in Install > retdir = config.package.GNUPackage.Install(self) > File "/Users/von/petsc/config/BuildSystem/config/package.py", line 1733, in Install > raise RuntimeError('Error running configure on ' + self.PACKAGE) > ================================================================================ > Finishing configure run at Tue, 25 Jan 2022 16:21:58 -0500 > ================================================================================ > > Thanks, > Kang -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From Kang_Peng at uml.edu Tue Jan 25 15:43:21 2022 From: Kang_Peng at uml.edu (Peng, Kang) Date: Tue, 25 Jan 2022 21:43:21 +0000 Subject: [petsc-users] Error when configuring the PETSC environment Message-ID: <9FABB7EF-1506-4E92-AB68-6B55C85F43B6@uml.edu> Hi PETSc, I am trying to configure the PETSC environment in MacOS (Apple M1 pro chip, macOS 12.1), but something went wrong when I executing those command below. I tried many methods but failed to solve it. Could you help me to solve it? I?ve been following this instruction to install PETSc and configure the environment, but I can?t do it after changing to the new chip. https://www.pflotran.org/documentation/user_guide/how_to/installation/linux.html#linux-install Error is as follows: von at MacBook-Pro-VON petsc % ./configure --CFLAGS='-O3' --CXXFLAGS='-O3' --FFLAGS='-O3' --with-debugging=no --download-mpich=yes --download-hdf5=yes --download-fblaslapack=yes --download-cmake=yes --download-metis=yes --download-parmetis=yes --download-hdf5-fortran-bindings=yes --download-hdf5-configure-arguments="--with-zlib=yes" ============================================================================================= Configuring PETSc to compile on your system ============================================================================================= ============================================================================================= ***** WARNING: You have a version of GNU make older than 4.0. It will work, but may not support all the parallel testing options. You can install the latest GNU make with your package manager, such as brew or macports, or use the --download-make option to get the latest GNU make ***** ============================================================================================= ============================================================================================= Running configure on CMAKE; this may take several minutes ============================================================================================= ******************************************************************************* UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): ------------------------------------------------------------------------------- Error running configure on CMAKE ******************************************************************************* ******************************************************************************* UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): ------------------------------------------------------------------------------- Error running configure on CMAKE ******************************************************************************* File "/Users/von/petsc/config/configure.py", line 465, in petsc_configure framework.configure(out = sys.stdout) File "/Users/von/petsc/config/BuildSystem/config/framework.py", line 1385, in configure self.processChildren() File "/Users/von/petsc/config/BuildSystem/config/framework.py", line 1373, in processChildren self.serialEvaluation(self.childGraph) File "/Users/von/petsc/config/BuildSystem/config/framework.py", line 1348, in serialEvaluation child.configure() File "/Users/von/petsc/config/BuildSystem/config/packages/cmake.py", line 75, in configure config.package.GNUPackage.configure(self) File "/Users/von/petsc/config/BuildSystem/config/package.py", line 1189, in configure self.executeTest(self.configureLibrary) File "/Users/von/petsc/config/BuildSystem/config/base.py", line 138, in executeTest ret = 
test(*args,**kargs) File "/Users/von/petsc/config/BuildSystem/config/package.py", line 935, in configureLibrary for location, directory, lib, incl in self.generateGuesses(): File "/Users/von/petsc/config/BuildSystem/config/package.py", line 509, in generateGuesses d = self.checkDownload() File "/Users/von/petsc/config/BuildSystem/config/package.py", line 643, in checkDownload return self.getInstallDir() File "/Users/von/petsc/config/BuildSystem/config/package.py", line 405, in getInstallDir installDir = self.Install() File "/Users/von/petsc/config/BuildSystem/config/packages/cmake.py", line 47, in Install retdir = config.package.GNUPackage.Install(self) File "/Users/von/petsc/config/BuildSystem/config/package.py", line 1733, in Install raise RuntimeError('Error running configure on ' + self.PACKAGE) ================================================================================ Finishing configure run at Tue, 25 Jan 2022 16:21:58 -0500 ================================================================================ Thanks, Kang -------------- next part -------------- An HTML attachment was scrubbed... URL: From Kang_Peng at uml.edu Tue Jan 25 16:04:20 2022 From: Kang_Peng at uml.edu (Peng, Kang) Date: Tue, 25 Jan 2022 22:04:20 +0000 Subject: [petsc-users] Error when configuring the PETSC environment In-Reply-To: References: Message-ID: The blue word in the last Email is part of the configure.log file, this time I put the whole file in the attachment. I actually tried to install it via brew, but it still failed. And I could follow Pflotran's installation tutorial on my old computer before, but I don't know why it doesn't work now. 2022?1?25? 16:57?Stefano Zampini > ??? This e-mail originated from outside the UMass Lowell network. ________________________________ You should attach configure.log if you want us to take a look at the failure. In any case, you should use gmake that you can install via brew On Jan 26, 2022, at 12:53 AM, Peng, Kang > wrote: Hi PETSc, I am trying to configure the PETSC environment in MacOS (Apple M1 pro chip, macOS 12.1), but something went wrong when I executing those command below. I tried many methods but failed to solve it. Could you help me to solve it? I?ve been following this instruction to install PETSc and configure the environment, but I can?t do it after changing to the new chip. https://www.pflotran.org/documentation/user_guide/how_to/installation/linux.html#linux-install Error is as follows: von at MacBook-Pro-VON petsc % ./configure --CFLAGS='-O3' --CXXFLAGS='-O3' --FFLAGS='-O3' --with-debugging=no --download-mpich=yes --download-hdf5=yes --download-fblaslapack=yes --download-cmake=yes --download-metis=yes --download-parmetis=yes --download-hdf5-fortran-bindings=yes --download-hdf5-configure-arguments="--with-zlib=yes" ============================================================================================= Configuring PETSc to compile on your system ============================================================================================= ============================================================================================= ***** WARNING: You have a version of GNU make older than 4.0. It will work, but may not support all the parallel testing options. 
You can install the latest GNU make with your package manager, such as brew or macports, or use the --download-make option to get the latest GNU make ***** ============================================================================================= ============================================================================================= Running configure on CMAKE; this may take several minutes ============================================================================================= ******************************************************************************* UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): ------------------------------------------------------------------------------- Error running configure on CMAKE ******************************************************************************* ******************************************************************************* UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): ------------------------------------------------------------------------------- Error running configure on CMAKE ******************************************************************************* File "/Users/von/petsc/config/configure.py", line 465, in petsc_configure framework.configure(out = sys.stdout) File "/Users/von/petsc/config/BuildSystem/config/framework.py", line 1385, in configure self.processChildren() File "/Users/von/petsc/config/BuildSystem/config/framework.py", line 1373, in processChildren self.serialEvaluation(self.childGraph) File "/Users/von/petsc/config/BuildSystem/config/framework.py", line 1348, in serialEvaluation child.configure() File "/Users/von/petsc/config/BuildSystem/config/packages/cmake.py", line 75, in configure config.package.GNUPackage.configure(self) File "/Users/von/petsc/config/BuildSystem/config/package.py", line 1189, in configure self.executeTest(self.configureLibrary) File "/Users/von/petsc/config/BuildSystem/config/base.py", line 138, in executeTest ret = test(*args,**kargs) File "/Users/von/petsc/config/BuildSystem/config/package.py", line 935, in configureLibrary for location, directory, lib, incl in self.generateGuesses(): File "/Users/von/petsc/config/BuildSystem/config/package.py", line 509, in generateGuesses d = self.checkDownload() File "/Users/von/petsc/config/BuildSystem/config/package.py", line 643, in checkDownload return self.getInstallDir() File "/Users/von/petsc/config/BuildSystem/config/package.py", line 405, in getInstallDir installDir = self.Install() File "/Users/von/petsc/config/BuildSystem/config/packages/cmake.py", line 47, in Install retdir = config.package.GNUPackage.Install(self) File "/Users/von/petsc/config/BuildSystem/config/package.py", line 1733, in Install raise RuntimeError('Error running configure on ' + self.PACKAGE) ================================================================================ Finishing configure run at Tue, 25 Jan 2022 16:21:58 -0500 ================================================================================ Thanks, Kang -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: application/octet-stream Size: 992259 bytes Desc: configure.log URL: From bastian.loehrer at tu-dresden.de Tue Jan 25 16:52:01 2022 From: bastian.loehrer at tu-dresden.de (=?UTF-8?Q?Bastian_L=c3=b6hrer?=) Date: Tue, 25 Jan 2022 23:52:01 +0100 Subject: [petsc-users] DMDA with 0 in lx, ly, lz ... 
or how to create a DMDA for a subregion Message-ID: An HTML attachment was scrubbed... URL: From rlmackie862 at gmail.com Tue Jan 25 17:09:25 2022 From: rlmackie862 at gmail.com (Randall Mackie) Date: Tue, 25 Jan 2022 15:09:25 -0800 Subject: [petsc-users] DMDA with 0 in lx, ly, lz ... or how to create a DMDA for a subregion In-Reply-To: References: Message-ID: <8F6D360E-1AA0-464E-A9D2-E1EB8736EA21@gmail.com> Take a look at these posts from last year and see if they will help you at least get a slice: https://lists.mcs.anl.gov/pipermail/petsc-users/2021-January/043037.html https://lists.mcs.anl.gov/pipermail/petsc-users/2021-January/043043.html Good luck, Randy M. > On Jan 25, 2022, at 2:52 PM, Bastian L?hrer wrote: > > Dear PETSc community, > > in our CFD code we use a 3D DMDA to organize our data. > > Now I need to compute a derived quantity in a subregion of the global domain and to write these data to disk for further post-processing. > > The subregion is actually a planar slice for now, but it could also be a boxed-shaped region in future applications. > > Hence, I figured I would create a new DMDA for this subregion by writing something along the lines of > > call DMDACreate3d( & ! https://www.mcs.anl.gov/petsc/petsc-3.8.4/docs/manualpages/DMDA/DMDACreate3d.html > PETSC_COMM_WORLD, & > DM_BOUNDARY_NONE, DM_BOUNDARY_NONE, DM_BOUNDARY_NONE, & > DMDA_STENCIL_BOX, & ! <= stencil type > sum(lx), sum(ly), sum(lz), & ! <= global dimension (of data) in each direction > px, py, pz, & ! <= number of processors in each dimension > 1, 0, & ! <= dof per node, stencil width (no ghost cell) > lx, ly, lz, & ! <= numbers of nodes held by processors > dmobject, ierr ) > > where lx, ly and lz could look like > > lx = (/ 0, 1, 0 /) > ly = (/ 16, 16 /) > lz = (/ 16, 16 /) > > Unfortunately, this does not work: > > >> [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [0]PETSC ERROR: Argument out of range >> [0]PETSC ERROR: Partition in x direction is too fine! 1 2 >> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >> [0]PETSC ERROR: Petsc Release Version 3.8.4, Mar, 24, 2018 >> [0]PETSC ERROR: ./prime on a foss_debug named laptwo by bastian Tue Jan 25 23:32:30 2022 >> [0]PETSC ERROR: Configure options PETSC_ARCH=foss_debug --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-mpich --with-large-file-io=1 --with-shared-libraries=1 COPTFLAGS=" " CXXOPTFLAGS=" " FOPTFLAGS=" " --march=native --mtune=native --download-hypre=/soft/petsc-3.8.4/foss_debug/hypre-v2.12.0.tar.gz --with-debugging=yes >> [0]PETSC ERROR: #1 DMSetUp_DA_3D() line 318 in /soft/petsc-3.8.4/src/dm/impls/da/da3.c >> [0]PETSC ERROR: #2 DMSetUp_DA() line 25 in /soft/petsc-3.8.4/src/dm/impls/da/dareg.c >> [0]PETSC ERROR: #3 DMSetUp() line 720 in /soft/petsc-3.8.4/src/dm/interface/dm.c >> [0]PETSC ERROR: #4 User provided function() line 0 in User file >> application called MPI_Abort(MPI_COMM_WORLD, 63) - process 0 > Apparently, lx, ly and lz cannot contain zeros. > (Which would be a useful information in the documentation, too.) > > Is there any workaround? > > My current understanding is that I need to go the extra mile of creating an additional Communicator involving only those ranks that will share at least one cell in the subregion DMDA. > > If this is the way to go? If so, how can I control which rank receives which subdomain in the subregion? 
DMDACreate3d does not enable me to do so, but I need to make sure that a rank holds only those cells of the subregion which are also present in its share of the global domain. > > Many thanks in advance, > Bastian L?hrer > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Jan 25 18:04:45 2022 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 25 Jan 2022 19:04:45 -0500 Subject: [petsc-users] Error when configuring the PETSC environment In-Reply-To: <9FABB7EF-1506-4E92-AB68-6B55C85F43B6@uml.edu> References: <9FABB7EF-1506-4E92-AB68-6B55C85F43B6@uml.edu> Message-ID: On Tue, Jan 25, 2022 at 4:59 PM Peng, Kang wrote: > Hi PETSc, > > I am trying to configure the PETSC environment in MacOS (Apple M1 pro > chip, macOS 12.1), but something went wrong when I executing those command > below. I tried many methods but failed to solve it. Could you help me to > solve it? > > I?ve been following this instruction to install PETSc and configure the > environment, but I can?t do it after changing to the new chip. > > https://www.pflotran.org/documentation/user_guide/how_to/installation/linux.html#linux-install > With any configure failure, you must send us configure.log, or we are just guessing what went wrong. Below it looks like there is a problem with the CMake build. It might be that they have a bug for the M1. Thanks, Matt > Error is as follows: > von at MacBook-Pro-VON petsc % ./configure --CFLAGS='-O3' --CXXFLAGS='-O3' > --FFLAGS='-O3' --with-debugging=no --download-mpich=yes --download-hdf5=yes > --download-fblaslapack=yes --download-cmake=yes --download-metis=yes > --download-parmetis=yes --download-hdf5-fortran-bindings=yes > --download-hdf5-configure-arguments="--with-zlib=yes" > > ============================================================================================= > Configuring PETSc to compile on your system > > > ============================================================================================= > ============================================================================================= > > ***** WARNING: You have a version of GNU make older than 4.0. It will > work, > but may not support all the parallel testing options. 
> You can install the > latest GNU make with your package > manager, such as brew or macports, or use > the > --download-make option to get the latest GNU make ***** > > > ============================================================================================= > > ============================================================================================= > > Running configure on CMAKE; this may take several minutes > > > ============================================================================================= > > > ******************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for > details): > > ------------------------------------------------------------------------------- > Error running configure on CMAKE > > ******************************************************************************* > > > ******************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for > details): > > ------------------------------------------------------------------------------- > Error running configure on CMAKE > > ******************************************************************************* > File "/Users/von/petsc/config/configure.py", line 465, in petsc_configure > framework.configure(out = sys.stdout) > File "/Users/von/petsc/config/BuildSystem/config/framework.py", line > 1385, in configure > self.processChildren() > File "/Users/von/petsc/config/BuildSystem/config/framework.py", line > 1373, in processChildren > self.serialEvaluation(self.childGraph) > File "/Users/von/petsc/config/BuildSystem/config/framework.py", line > 1348, in serialEvaluation > child.configure() > File "/Users/von/petsc/config/BuildSystem/config/packages/cmake.py", > line 75, in configure > config.package.GNUPackage.configure(self) > File "/Users/von/petsc/config/BuildSystem/config/package.py", line 1189, > in configure > self.executeTest(self.configureLibrary) > File "/Users/von/petsc/config/BuildSystem/config/base.py", line 138, in > executeTest > ret = test(*args,**kargs) > File "/Users/von/petsc/config/BuildSystem/config/package.py", line 935, > in configureLibrary > for location, directory, lib, incl in self.generateGuesses(): > File "/Users/von/petsc/config/BuildSystem/config/package.py", line 509, > in generateGuesses > d = self.checkDownload() > File "/Users/von/petsc/config/BuildSystem/config/package.py", line 643, > in checkDownload > return self.getInstallDir() > File "/Users/von/petsc/config/BuildSystem/config/package.py", line 405, > in getInstallDir > installDir = self.Install() > File "/Users/von/petsc/config/BuildSystem/config/packages/cmake.py", > line 47, in Install > retdir = config.package.GNUPackage.Install(self) > File "/Users/von/petsc/config/BuildSystem/config/package.py", line 1733, > in Install > raise RuntimeError('Error running configure on ' + self.PACKAGE) > > ================================================================================ > Finishing configure run at Tue, 25 Jan 2022 16:21:58 -0500 > > ================================================================================ > > Thanks, > Kang > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Tue Jan 25 18:07:03 2022 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 25 Jan 2022 19:07:03 -0500 Subject: [petsc-users] Error when configuring the PETSC environment In-Reply-To: References: Message-ID: On Tue, Jan 25, 2022 at 5:05 PM Peng, Kang wrote: > The blue word in the last Email is part of the configure.log file, this > time I put the whole file in the attachment. > I actually tried to install it via brew, but it still failed. And I could > follow Pflotran's installation tutorial on my old computer before, but I > don't know why it doesn't work now. > CMake does not like your Clang compiler, Can you specify it explicitly using --with-cc= Thanks, Matt ============================================================================================= Running configure on CMAKE; this may take several minutes ============================================================================================= Executing: ./configure --prefix=/Users/von/petsc/arch-darwin-c-opt --parallel=8 -- -DCMAKE_USE_OPENSSL=OFF stdout: --------------------------------------------- CMake 3.20.1, Copyright 2000-2021 Kitware, Inc. and Contributors Found Clang toolchain --------------------------------------------- Error when bootstrapping CMake: Cannot find appropriate C compiler on this system. Please specify one using environment variable CC. See cmake_bootstrap.log for compilers attempted. --------------------------------------------- Log of errors: /Users/von/petsc/arch-darwin-c-opt/externalpackages/cmake-3.20.1/Bootstrap.cmk/cmake_bootstrap.log --------------------------------------------- > 2022?1?25? 16:57?Stefano Zampini ??? > > *This e-mail originated from outside the UMass Lowell network.* > > ------------------------------ > You should attach configure.log if you want us to take a look at the > failure. In any case, you should use gmake that you can install via brew > > On Jan 26, 2022, at 12:53 AM, Peng, Kang wrote: > > Hi PETSc, > > I am trying to configure the PETSC environment in MacOS (Apple M1 pro > chip, macOS 12.1), but something went wrong when I executing those command > below. I tried many methods but failed to solve it. Could you help me to > solve it? > > I?ve been following this instruction to install PETSc and configure the > environment, but I can?t do it after changing to the new chip. > > https://www.pflotran.org/documentation/user_guide/how_to/installation/linux.html#linux-install > > > Error is as follows: > von at MacBook-Pro-VON petsc % ./configure --CFLAGS='-O3' --CXXFLAGS='-O3' > --FFLAGS='-O3' --with-debugging=no --download-mpich=yes --download-hdf5=yes > --download-fblaslapack=yes --download-cmake=yes --download-metis=yes > --download-parmetis=yes --download-hdf5-fortran-bindings=yes > --download-hdf5-configure-arguments="--with-zlib=yes" > > ============================================================================================= > Configuring PETSc to compile on your system > > > ============================================================================================= > ============================================================================================= > > ***** WARNING: You have a version of GNU make older than 4.0. It will > work, > but may not support all the parallel testing options. 
> You can install the > latest GNU make with your package > manager, such as brew or macports, or use > the > --download-make option to get the latest GNU make ***** > > > ============================================================================================= > > ============================================================================================= > > Running configure on CMAKE; this may take several minutes > > > ============================================================================================= > > > ******************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for > details): > > ------------------------------------------------------------------------------- > Error running configure on CMAKE > > ******************************************************************************* > > > ******************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for > details): > > ------------------------------------------------------------------------------- > Error running configure on CMAKE > > ******************************************************************************* > File "/Users/von/petsc/config/configure.py", line 465, in petsc_configure > framework.configure(out = sys.stdout) > File "/Users/von/petsc/config/BuildSystem/config/framework.py", line > 1385, in configure > self.processChildren() > File "/Users/von/petsc/config/BuildSystem/config/framework.py", line > 1373, in processChildren > self.serialEvaluation(self.childGraph) > File "/Users/von/petsc/config/BuildSystem/config/framework.py", line > 1348, in serialEvaluation > child.configure() > File "/Users/von/petsc/config/BuildSystem/config/packages/cmake.py", > line 75, in configure > config.package.GNUPackage.configure(self) > File "/Users/von/petsc/config/BuildSystem/config/package.py", line 1189, > in configure > self.executeTest(self.configureLibrary) > File "/Users/von/petsc/config/BuildSystem/config/base.py", line 138, in > executeTest > ret = test(*args,**kargs) > File "/Users/von/petsc/config/BuildSystem/config/package.py", line 935, > in configureLibrary > for location, directory, lib, incl in self.generateGuesses(): > File "/Users/von/petsc/config/BuildSystem/config/package.py", line 509, > in generateGuesses > d = self.checkDownload() > File "/Users/von/petsc/config/BuildSystem/config/package.py", line 643, > in checkDownload > return self.getInstallDir() > File "/Users/von/petsc/config/BuildSystem/config/package.py", line 405, > in getInstallDir > installDir = self.Install() > File "/Users/von/petsc/config/BuildSystem/config/packages/cmake.py", > line 47, in Install > retdir = config.package.GNUPackage.Install(self) > File "/Users/von/petsc/config/BuildSystem/config/package.py", line 1733, > in Install > raise RuntimeError('Error running configure on ' + self.PACKAGE) > > ================================================================================ > Finishing configure run at Tue, 25 Jan 2022 16:21:58 -0500 > > ================================================================================ > > Thanks, > Kang > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From fdkong.jd at gmail.com Tue Jan 25 20:18:39 2022 From: fdkong.jd at gmail.com (Fande Kong) Date: Tue, 25 Jan 2022 19:18:39 -0700 Subject: [petsc-users] Cannot eagerly initialize cuda, as doing so results in cuda error 35 (cudaErrorInsufficientDriver) : CUDA driver version is insufficient for CUDA runtime version In-Reply-To: References: <233D0F20-AD95-481B-B862-353D5789C556@gmail.com> <87czkn9c64.fsf@jedbrown.org> <875yqe7zip.fsf@jedbrown.org> Message-ID: On Tue, Jan 25, 2022 at 9:04 AM Jacob Faibussowitsch wrote: > Configure should not have an impact here I think. The reason I had you run > `cudaGetDeviceCount()` is because this is the CUDA call (and in fact the > only CUDA call) in the initialization sequence that returns the error code. > There should be no prior CUDA calls. Maybe this is a problem with > oversubscribing GPU?s? In the runs that crash, how many ranks are using any > given GPU at once? Maybe MPS is required. > I used one MPI rank. Fande > > Best regards, > > Jacob Faibussowitsch > (Jacob Fai - booss - oh - vitch) > > On Jan 21, 2022, at 12:01, Fande Kong wrote: > > Thanks Jacob, > > On Thu, Jan 20, 2022 at 6:25 PM Jacob Faibussowitsch > wrote: > >> Segfault is caused by the following check at >> src/sys/objects/device/impls/cupm/cupmdevice.cxx:349 being a >> PetscUnlikelyDebug() rather than just PetscUnlikely(): >> >> ``` >> if (PetscUnlikelyDebug(_defaultDevice < 0)) { // _defaultDevice is in >> fact < 0 here and uncaught >> ``` >> >> To clarify: >> >> ?lazy? initialization is not that lazy after all, it still does some 50% >> of the initialization that ?eager? initialization does. It stops short >> initializing the CUDA runtime, checking CUDA aware MPI, gathering device >> data, and initializing cublas and friends. Lazy also importantly swallows >> any errors that crop up during initialization, storing the resulting error >> code for later (specifically _defaultDevice = -init_error_value;). >> >> So whether you initialize lazily or eagerly makes no difference here, as >> _defaultDevice will always contain -35. >> >> The bigger question is why cudaGetDeviceCount() is returning >> cudaErrorInsufficientDriver. Can you compile and run >> >> ``` >> #include >> >> int main() >> { >> int ndev; >> return cudaGetDeviceCount(&ndev): >> } >> ``` >> >> Then show the value of "echo $??? >> > > Modify your code a little to get more information. > > #include > #include > > int main() > { > int ndev; > int error = cudaGetDeviceCount(&ndev); > printf("ndev %d \n", ndev); > printf("error %d \n", error); > return 0; > } > > Results: > > $ ./a.out > ndev 4 > error 0 > > > I have not read the PETSc cuda initialization code yet. If I need to guess > at what was happening. I will naively think that PETSc did not get correct > GPU information in the configuration because the compiler node does not > have GPUs, and there was no way to get any GPU device information. > > During the runtime on GPU nodes, PETSc might have incorrect information > grabbed during configuration and had this kind of false error message. > > Thanks, > > Fande > > > >> >> Best regards, >> >> Jacob Faibussowitsch >> (Jacob Fai - booss - oh - vitch) >> >> On Jan 20, 2022, at 17:47, Matthew Knepley wrote: >> >> On Thu, Jan 20, 2022 at 6:44 PM Fande Kong wrote: >> >>> Thanks, Jed >>> >>> On Thu, Jan 20, 2022 at 4:34 PM Jed Brown wrote: >>> >>>> You can't create CUDA or Kokkos Vecs if you're running on a node >>>> without a GPU. >>> >>> >>> I am running the code on compute nodes that do have GPUs. 
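A side note on the two snippets quoted earlier in this message: the header names after #include were stripped when the messages were archived. A self-contained version of the device-count check, with the headers restored as a best guess (cuda_runtime.h and stdio.h) and the stray colon replaced by a semicolon, might look like this:

```
/* Minimal device-count check; build with e.g. "nvcc devcount.cu -o devcount" */
#include <cuda_runtime.h>
#include <stdio.h>

int main(void)
{
  int ndev = -1;
  cudaError_t err = cudaGetDeviceCount(&ndev);
  printf("ndev  %d\n", ndev);
  printf("error %d (%s)\n", (int)err, cudaGetErrorString(err));
  return (int)err;   /* so "echo $?" reports the CUDA error code */
}
```

Compiling this on a compute node and running it prints the device count plus the textual CUDA error, which is easier to interpret than the bare exit status.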
>>> >> >> If you are actually running on GPUs, why would you need lazy >> initialization? It would not break with GPUs present. >> >> Matt >> >> >>> With PETSc-3.16.1, I got good speedup by running GAMG on GPUs. That >>> might be a bug of PETSc-main. >>> >>> Thanks, >>> >>> Fande >>> >>> >>> >>> KSPSetUp 13 1.0 6.4400e-01 1.0 2.02e+09 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 5 0 0 0 0 5 0 0 0 3140 64630 15 1.05e+02 5 >>> 3.49e+01 100 >>> KSPSolve 1 1.0 1.0109e+00 1.0 3.49e+10 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 87 0 0 0 0 87 0 0 0 34522 69556 4 4.35e-03 1 >>> 2.38e-03 100 >>> KSPGMRESOrthog 142 1.0 1.2674e-01 1.0 1.06e+10 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 27 0 0 0 0 27 0 0 0 83755 87801 0 0.00e+00 0 >>> 0.00e+00 100 >>> SNESSolve 1 1.0 4.4402e+01 1.0 4.00e+10 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 21100 0 0 0 21100 0 0 0 901 51365 57 1.10e+03 52 >>> 8.78e+02 100 >>> SNESSetUp 1 1.0 3.9101e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 >>> 0.00e+00 0 >>> SNESFunctionEval 2 1.0 1.7097e+01 1.0 1.60e+07 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 8 0 0 0 0 8 0 0 0 0 1 0 0 0.00e+00 6 >>> 1.92e+02 0 >>> SNESJacobianEval 1 1.0 1.6213e+01 1.0 2.80e+07 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 8 0 0 0 0 8 0 0 0 0 2 0 0 0.00e+00 1 >>> 3.20e+01 0 >>> SNESLineSearch 1 1.0 8.5582e+00 1.0 1.24e+08 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 4 0 0 0 0 4 0 0 0 0 14 64153 1 3.20e+01 3 >>> 9.61e+01 94 >>> PCGAMGGraph_AGG 5 1.0 3.0509e+00 1.0 8.19e+07 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 1 0 0 0 0 1 0 0 0 0 27 0 5 3.49e+01 9 >>> 7.43e+01 0 >>> PCGAMGCoarse_AGG 5 1.0 3.8711e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 0 0 0.00e+00 0 >>> 0.00e+00 0 >>> PCGAMGProl_AGG 5 1.0 7.0748e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 >>> 0.00e+00 0 >>> PCGAMGPOpt_AGG 5 1.0 1.2904e+00 1.0 2.14e+09 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 1 5 0 0 0 1 5 0 0 0 1661 29807 26 7.15e+02 20 >>> 2.90e+02 99 >>> GAMG: createProl 5 1.0 8.9489e+00 1.0 2.22e+09 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 4 6 0 0 0 4 6 0 0 0 249 29666 31 7.50e+02 29 >>> 3.64e+02 96 >>> Graph 10 1.0 3.0478e+00 1.0 8.19e+07 1.0 0.0e+00 >>> 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 27 0 5 >>> 3.49e+01 9 7.43e+01 0 >>> MIS/Agg 5 1.0 4.1290e-01 1.0 0.00e+00 0.0 0.0e+00 >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>> 0.00e+00 0 0.00e+00 0 >>> SA: col data 5 1.0 1.9127e-02 1.0 0.00e+00 0.0 0.0e+00 >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>> 0.00e+00 0 0.00e+00 0 >>> SA: frmProl0 5 1.0 6.2662e-01 1.0 0.00e+00 0.0 0.0e+00 >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>> 0.00e+00 0 0.00e+00 0 >>> SA: smooth 5 1.0 4.9595e-01 1.0 1.21e+08 1.0 0.0e+00 >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 244 2709 15 >>> 1.97e+02 15 2.55e+02 90 >>> GAMG: partLevel 5 1.0 4.7330e-01 1.0 6.98e+08 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 2 0 0 0 0 2 0 0 0 1475 4120 5 1.78e+02 10 >>> 2.55e+02 100 >>> PCGAMG Squ l00 1 1.0 2.6027e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 0 0 0.00e+00 0 >>> 0.00e+00 0 >>> PCGAMG Gal l00 1 1.0 3.8406e-01 1.0 5.48e+08 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 1 0 0 0 0 1 0 0 0 1426 4270 1 1.48e+02 2 >>> 2.11e+02 100 >>> PCGAMG Opt l00 1 1.0 2.4932e-01 1.0 7.20e+07 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 289 2653 1 6.41e+01 1 >>> 1.13e+02 100 >>> PCGAMG Gal l01 1 1.0 6.6279e-02 1.0 1.09e+08 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 1645 3851 1 2.40e+01 2 >>> 3.64e+01 100 >>> PCGAMG Opt l01 1 1.0 2.9544e-02 1.0 7.15e+06 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 242 1671 1 
4.84e+00 1 >>> 1.23e+01 100 >>> PCGAMG Gal l02 1 1.0 1.8874e-02 1.0 3.72e+07 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 1974 3636 1 5.04e+00 2 >>> 6.58e+00 100 >>> PCGAMG Opt l02 1 1.0 7.4353e-03 1.0 2.40e+06 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 323 1457 1 7.71e-01 1 >>> 2.30e+00 100 >>> PCGAMG Gal l03 1 1.0 2.8479e-03 1.0 4.10e+06 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 1440 2266 1 4.44e-01 2 >>> 5.51e-01 100 >>> PCGAMG Opt l03 1 1.0 8.2684e-04 1.0 2.80e+05 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 339 1667 1 6.72e-02 1 >>> 2.03e-01 100 >>> PCGAMG Gal l04 1 1.0 1.2238e-03 1.0 2.09e+05 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 170 244 1 2.05e-02 2 >>> 2.53e-02 100 >>> PCGAMG Opt l04 1 1.0 4.1008e-04 1.0 1.77e+04 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 43 165 1 4.49e-03 1 >>> 1.19e-02 100 >>> PCSetUp 2 1.0 9.9632e+00 1.0 4.95e+09 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 5 12 0 0 0 5 12 0 0 0 496 17826 55 1.03e+03 45 >>> 6.54e+02 98 >>> PCSetUpOnBlocks 44 1.0 9.9087e-04 1.0 2.88e+03 1.0 >>> >>> >>> >>> >>> >>> >>>> The point of lazy initialization is to make it possible to run a solve >>>> that doesn't use a GPU in PETSC_ARCH that supports GPUs, regardless of >>>> whether a GPU is actually present. >>>> >>>> Fande Kong writes: >>>> >>>> > I spoke too soon. It seems that we have trouble creating cuda/kokkos >>>> vecs >>>> > now. Got Segmentation fault. >>>> > >>>> > Thanks, >>>> > >>>> > Fande >>>> > >>>> > Program received signal SIGSEGV, Segmentation fault. >>>> > 0x00002aaab5558b11 in >>>> > >>>> Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::CUPMDeviceInternal::initialize >>>> > (this=0x1) at >>>> > >>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:54 >>>> > 54 PetscErrorCode CUPMDevice::CUPMDeviceInternal::initialize() >>>> noexcept >>>> > Missing separate debuginfos, use: debuginfo-install >>>> > bzip2-libs-1.0.6-13.el7.x86_64 elfutils-libelf-0.176-5.el7.x86_64 >>>> > elfutils-libs-0.176-5.el7.x86_64 glibc-2.17-325.el7_9.x86_64 >>>> > libX11-1.6.7-4.el7_9.x86_64 libXau-1.0.8-2.1.el7.x86_64 >>>> > libattr-2.4.46-13.el7.x86_64 libcap-2.22-11.el7.x86_64 >>>> > libibmad-5.4.0.MLNX20190423.1d917ae-0.1.49224.x86_64 >>>> > libibumad-43.1.1.MLNX20200211.078947f-0.1.49224.x86_64 >>>> > libibverbs-41mlnx1-OFED.4.9.0.0.7.49224.x86_64 >>>> > libmlx4-41mlnx1-OFED.4.7.3.0.3.49224.x86_64 >>>> > libmlx5-41mlnx1-OFED.4.9.0.1.2.49224.x86_64 libnl3-3.2.28-4.el7.x86_64 >>>> > librdmacm-41mlnx1-OFED.4.7.3.0.6.49224.x86_64 >>>> > librxe-41mlnx1-OFED.4.4.2.4.6.49224.x86_64 libxcb-1.13-1.el7.x86_64 >>>> > libxml2-2.9.1-6.el7_9.6.x86_64 numactl-libs-2.0.12-5.el7.x86_64 >>>> > systemd-libs-219-78.el7_9.3.x86_64 xz-libs-5.2.2-1.el7.x86_64 >>>> > zlib-1.2.7-19.el7_9.x86_64 >>>> > (gdb) bt >>>> > #0 0x00002aaab5558b11 in >>>> > >>>> Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::CUPMDeviceInternal::initialize >>>> > (this=0x1) at >>>> > >>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:54 >>>> > #1 0x00002aaab5558db7 in >>>> > Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::getDevice >>>> > (this=this at entry=0x2aaab7f37b70 >>>> > , device=0x115da00, id=-35, id at entry=-1) at >>>> > >>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:344 >>>> > #2 0x00002aaab55577de in PetscDeviceCreate (type=type at entry >>>> =PETSC_DEVICE_CUDA, >>>> > devid=devid at entry=-1, device=device at entry=0x2aaab7f37b48 >>>> > ) at >>>> > 
>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:107 >>>> > #3 0x00002aaab5557b3a in PetscDeviceInitializeDefaultDevice_Internal >>>> > (type=type at entry=PETSC_DEVICE_CUDA, >>>> defaultDeviceId=defaultDeviceId at entry=-1) >>>> > at >>>> > >>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:273 >>>> > #4 0x00002aaab5557bf6 in PetscDeviceInitialize >>>> > (type=type at entry=PETSC_DEVICE_CUDA) >>>> > at >>>> > >>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:234 >>>> > #5 0x00002aaab5661fcd in VecCreate_SeqCUDA (V=0x115d150) at >>>> > >>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/impls/seq/seqcuda/veccuda.c:244 >>>> > #6 0x00002aaab5649b40 in VecSetType (vec=vec at entry=0x115d150, >>>> > method=method at entry=0x2aaab70b45b8 "seqcuda") at >>>> > >>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vecreg.c:93 >>>> > #7 0x00002aaab579c33f in VecCreate_CUDA (v=0x115d150) at >>>> > >>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/impls/mpi/mpicuda/ >>>> > mpicuda.cu:214 >>>> > #8 0x00002aaab5649b40 in VecSetType (vec=vec at entry=0x115d150, >>>> > method=method at entry=0x7fffffff9260 "cuda") at >>>> > >>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vecreg.c:93 >>>> > #9 0x00002aaab5648bf1 in VecSetTypeFromOptions_Private >>>> (vec=0x115d150, >>>> > PetscOptionsObject=0x7fffffff9210) at >>>> > >>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vector.c:1263 >>>> > #10 VecSetFromOptions (vec=0x115d150) at >>>> > >>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vector.c:1297 >>>> > #11 0x00002aaab02ef227 in libMesh::PetscVector::init >>>> > (this=0x11cd1a0, n=441, n_local=441, fast=false, >>>> ptype=libMesh::PARALLEL) >>>> > at >>>> > >>>> /home/kongf/workhome/sawtooth/moosegpu/scripts/../libmesh/installed/include/libmesh/petsc_vector.h:693 >>>> > >>>> > On Thu, Jan 20, 2022 at 1:09 PM Fande Kong >>>> wrote: >>>> > >>>> >> Thanks, Jed, >>>> >> >>>> >> This worked! >>>> >> >>>> >> Fande >>>> >> >>>> >> On Wed, Jan 19, 2022 at 11:03 PM Jed Brown wrote: >>>> >> >>>> >>> Fande Kong writes: >>>> >>> >>>> >>> > On Wed, Jan 19, 2022 at 11:39 AM Jacob Faibussowitsch < >>>> >>> jacob.fai at gmail.com> >>>> >>> > wrote: >>>> >>> > >>>> >>> >> Are you running on login nodes or compute nodes (I can?t seem to >>>> tell >>>> >>> from >>>> >>> >> the configure.log)? >>>> >>> >> >>>> >>> > >>>> >>> > I was compiling codes on login nodes, and running codes on compute >>>> >>> nodes. >>>> >>> > Login nodes do not have GPUs, but compute nodes do have GPUs. >>>> >>> > >>>> >>> > Just to be clear, the same thing (code, machine) with PETSc-3.16.1 >>>> >>> worked >>>> >>> > perfectly. I have this trouble with PETSc-main. >>>> >>> >>>> >>> I assume you can >>>> >>> >>>> >>> export PETSC_OPTIONS='-device_enable lazy' >>>> >>> >>>> >>> and it'll work. >>>> >>> >>>> >>> I think this should be the default. The main complaint is that >>>> timing the >>>> >>> first GPU-using event isn't accurate if it includes initialization, >>>> but I >>>> >>> think this is mostly hypothetical because you can't trust any >>>> timing that >>>> >>> doesn't preload in some form and the first GPU-using event will >>>> almost >>>> >>> always be something uninteresting so I think it will rarely lead to >>>> >>> confusion. 
Meanwhile, eager initialization is viscerally disruptive >>>> for >>>> >>> lots of people. >>>> >>> >>>> >> >>>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fdkong.jd at gmail.com Tue Jan 25 20:20:47 2022 From: fdkong.jd at gmail.com (Fande Kong) Date: Tue, 25 Jan 2022 19:20:47 -0700 Subject: [petsc-users] Cannot eagerly initialize cuda, as doing so results in cuda error 35 (cudaErrorInsufficientDriver) : CUDA driver version is insufficient for CUDA runtime version In-Reply-To: References: <233D0F20-AD95-481B-B862-353D5789C556@gmail.com> <87czkn9c64.fsf@jedbrown.org> <875yqe7zip.fsf@jedbrown.org> Message-ID: Not sure if this is helpful. I did "git bisect", and here was the result: [kongf at sawtooth2 petsc]$ git bisect bad 246ba74192519a5f34fb6e227d1c64364e19ce2c is the first bad commit commit 246ba74192519a5f34fb6e227d1c64364e19ce2c Author: Junchao Zhang Date: Wed Oct 13 05:32:43 2021 +0000 Config: fix CUDA library and header dirs :040000 040000 187c86055adb80f53c1d0565a8888704fec43a96 ea1efd7f594fd5e8df54170bc1bc7b00f35e4d5f M config Started from this commit, and GPU did not work for me on our HPC Thanks, Fande On Tue, Jan 25, 2022 at 7:18 PM Fande Kong wrote: > > > On Tue, Jan 25, 2022 at 9:04 AM Jacob Faibussowitsch > wrote: > >> Configure should not have an impact here I think. The reason I had you >> run `cudaGetDeviceCount()` is because this is the CUDA call (and in fact >> the only CUDA call) in the initialization sequence that returns the error >> code. There should be no prior CUDA calls. Maybe this is a problem with >> oversubscribing GPU?s? In the runs that crash, how many ranks are using any >> given GPU at once? Maybe MPS is required. >> > > I used one MPI rank. > > Fande > > > >> >> Best regards, >> >> Jacob Faibussowitsch >> (Jacob Fai - booss - oh - vitch) >> >> On Jan 21, 2022, at 12:01, Fande Kong wrote: >> >> Thanks Jacob, >> >> On Thu, Jan 20, 2022 at 6:25 PM Jacob Faibussowitsch >> wrote: >> >>> Segfault is caused by the following check at >>> src/sys/objects/device/impls/cupm/cupmdevice.cxx:349 being a >>> PetscUnlikelyDebug() rather than just PetscUnlikely(): >>> >>> ``` >>> if (PetscUnlikelyDebug(_defaultDevice < 0)) { // _defaultDevice is in >>> fact < 0 here and uncaught >>> ``` >>> >>> To clarify: >>> >>> ?lazy? initialization is not that lazy after all, it still does some 50% >>> of the initialization that ?eager? initialization does. It stops short >>> initializing the CUDA runtime, checking CUDA aware MPI, gathering device >>> data, and initializing cublas and friends. Lazy also importantly swallows >>> any errors that crop up during initialization, storing the resulting error >>> code for later (specifically _defaultDevice = -init_error_value;). >>> >>> So whether you initialize lazily or eagerly makes no difference here, as >>> _defaultDevice will always contain -35. >>> >>> The bigger question is why cudaGetDeviceCount() is returning >>> cudaErrorInsufficientDriver. Can you compile and run >>> >>> ``` >>> #include >>> >>> int main() >>> { >>> int ndev; >>> return cudaGetDeviceCount(&ndev): >>> } >>> ``` >>> >>> Then show the value of "echo $??? >>> >> >> Modify your code a little to get more information. 
>> >> #include >> #include >> >> int main() >> { >> int ndev; >> int error = cudaGetDeviceCount(&ndev); >> printf("ndev %d \n", ndev); >> printf("error %d \n", error); >> return 0; >> } >> >> Results: >> >> $ ./a.out >> ndev 4 >> error 0 >> >> >> I have not read the PETSc cuda initialization code yet. If I need to >> guess at what was happening. I will naively think that PETSc did not get >> correct GPU information in the configuration because the compiler node does >> not have GPUs, and there was no way to get any GPU device information. >> >> During the runtime on GPU nodes, PETSc might have incorrect information >> grabbed during configuration and had this kind of false error message. >> >> Thanks, >> >> Fande >> >> >> >>> >>> Best regards, >>> >>> Jacob Faibussowitsch >>> (Jacob Fai - booss - oh - vitch) >>> >>> On Jan 20, 2022, at 17:47, Matthew Knepley wrote: >>> >>> On Thu, Jan 20, 2022 at 6:44 PM Fande Kong wrote: >>> >>>> Thanks, Jed >>>> >>>> On Thu, Jan 20, 2022 at 4:34 PM Jed Brown wrote: >>>> >>>>> You can't create CUDA or Kokkos Vecs if you're running on a node >>>>> without a GPU. >>>> >>>> >>>> I am running the code on compute nodes that do have GPUs. >>>> >>> >>> If you are actually running on GPUs, why would you need lazy >>> initialization? It would not break with GPUs present. >>> >>> Matt >>> >>> >>>> With PETSc-3.16.1, I got good speedup by running GAMG on GPUs. That >>>> might be a bug of PETSc-main. >>>> >>>> Thanks, >>>> >>>> Fande >>>> >>>> >>>> >>>> KSPSetUp 13 1.0 6.4400e-01 1.0 2.02e+09 1.0 0.0e+00 >>>> 0.0e+00 0.0e+00 0 5 0 0 0 0 5 0 0 0 3140 64630 15 >>>> 1.05e+02 5 3.49e+01 100 >>>> KSPSolve 1 1.0 1.0109e+00 1.0 3.49e+10 1.0 0.0e+00 >>>> 0.0e+00 0.0e+00 0 87 0 0 0 0 87 0 0 0 34522 69556 4 >>>> 4.35e-03 1 2.38e-03 100 >>>> KSPGMRESOrthog 142 1.0 1.2674e-01 1.0 1.06e+10 1.0 0.0e+00 >>>> 0.0e+00 0.0e+00 0 27 0 0 0 0 27 0 0 0 83755 87801 0 >>>> 0.00e+00 0 0.00e+00 100 >>>> SNESSolve 1 1.0 4.4402e+01 1.0 4.00e+10 1.0 0.0e+00 >>>> 0.0e+00 0.0e+00 21100 0 0 0 21100 0 0 0 901 51365 57 >>>> 1.10e+03 52 8.78e+02 100 >>>> SNESSetUp 1 1.0 3.9101e-05 1.0 0.00e+00 0.0 0.0e+00 >>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>> 0.00e+00 0 0.00e+00 0 >>>> SNESFunctionEval 2 1.0 1.7097e+01 1.0 1.60e+07 1.0 0.0e+00 >>>> 0.0e+00 0.0e+00 8 0 0 0 0 8 0 0 0 0 1 0 0 >>>> 0.00e+00 6 1.92e+02 0 >>>> SNESJacobianEval 1 1.0 1.6213e+01 1.0 2.80e+07 1.0 0.0e+00 >>>> 0.0e+00 0.0e+00 8 0 0 0 0 8 0 0 0 0 2 0 0 >>>> 0.00e+00 1 3.20e+01 0 >>>> SNESLineSearch 1 1.0 8.5582e+00 1.0 1.24e+08 1.0 0.0e+00 >>>> 0.0e+00 0.0e+00 4 0 0 0 0 4 0 0 0 0 14 64153 1 >>>> 3.20e+01 3 9.61e+01 94 >>>> PCGAMGGraph_AGG 5 1.0 3.0509e+00 1.0 8.19e+07 1.0 0.0e+00 >>>> 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 27 0 5 >>>> 3.49e+01 9 7.43e+01 0 >>>> PCGAMGCoarse_AGG 5 1.0 3.8711e+00 1.0 0.00e+00 0.0 0.0e+00 >>>> 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 0 0 >>>> 0.00e+00 0 0.00e+00 0 >>>> PCGAMGProl_AGG 5 1.0 7.0748e-01 1.0 0.00e+00 0.0 0.0e+00 >>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>> 0.00e+00 0 0.00e+00 0 >>>> PCGAMGPOpt_AGG 5 1.0 1.2904e+00 1.0 2.14e+09 1.0 0.0e+00 >>>> 0.0e+00 0.0e+00 1 5 0 0 0 1 5 0 0 0 1661 29807 26 >>>> 7.15e+02 20 2.90e+02 99 >>>> GAMG: createProl 5 1.0 8.9489e+00 1.0 2.22e+09 1.0 0.0e+00 >>>> 0.0e+00 0.0e+00 4 6 0 0 0 4 6 0 0 0 249 29666 31 >>>> 7.50e+02 29 3.64e+02 96 >>>> Graph 10 1.0 3.0478e+00 1.0 8.19e+07 1.0 0.0e+00 >>>> 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 27 0 5 >>>> 3.49e+01 9 7.43e+01 0 >>>> MIS/Agg 5 1.0 4.1290e-01 1.0 0.00e+00 0.0 0.0e+00 >>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 
0 0 >>>> 0.00e+00 0 0.00e+00 0 >>>> SA: col data 5 1.0 1.9127e-02 1.0 0.00e+00 0.0 0.0e+00 >>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>> 0.00e+00 0 0.00e+00 0 >>>> SA: frmProl0 5 1.0 6.2662e-01 1.0 0.00e+00 0.0 0.0e+00 >>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>> 0.00e+00 0 0.00e+00 0 >>>> SA: smooth 5 1.0 4.9595e-01 1.0 1.21e+08 1.0 0.0e+00 >>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 244 2709 15 >>>> 1.97e+02 15 2.55e+02 90 >>>> GAMG: partLevel 5 1.0 4.7330e-01 1.0 6.98e+08 1.0 0.0e+00 >>>> 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 1475 4120 5 >>>> 1.78e+02 10 2.55e+02 100 >>>> PCGAMG Squ l00 1 1.0 2.6027e+00 1.0 0.00e+00 0.0 0.0e+00 >>>> 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 0 0 >>>> 0.00e+00 0 0.00e+00 0 >>>> PCGAMG Gal l00 1 1.0 3.8406e-01 1.0 5.48e+08 1.0 0.0e+00 >>>> 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 1426 4270 1 >>>> 1.48e+02 2 2.11e+02 100 >>>> PCGAMG Opt l00 1 1.0 2.4932e-01 1.0 7.20e+07 1.0 0.0e+00 >>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 289 2653 1 >>>> 6.41e+01 1 1.13e+02 100 >>>> PCGAMG Gal l01 1 1.0 6.6279e-02 1.0 1.09e+08 1.0 0.0e+00 >>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1645 3851 1 >>>> 2.40e+01 2 3.64e+01 100 >>>> PCGAMG Opt l01 1 1.0 2.9544e-02 1.0 7.15e+06 1.0 0.0e+00 >>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 242 1671 1 >>>> 4.84e+00 1 1.23e+01 100 >>>> PCGAMG Gal l02 1 1.0 1.8874e-02 1.0 3.72e+07 1.0 0.0e+00 >>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1974 3636 1 >>>> 5.04e+00 2 6.58e+00 100 >>>> PCGAMG Opt l02 1 1.0 7.4353e-03 1.0 2.40e+06 1.0 0.0e+00 >>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 323 1457 1 >>>> 7.71e-01 1 2.30e+00 100 >>>> PCGAMG Gal l03 1 1.0 2.8479e-03 1.0 4.10e+06 1.0 0.0e+00 >>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1440 2266 1 >>>> 4.44e-01 2 5.51e-01 100 >>>> PCGAMG Opt l03 1 1.0 8.2684e-04 1.0 2.80e+05 1.0 0.0e+00 >>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 339 1667 1 >>>> 6.72e-02 1 2.03e-01 100 >>>> PCGAMG Gal l04 1 1.0 1.2238e-03 1.0 2.09e+05 1.0 0.0e+00 >>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 170 244 1 >>>> 2.05e-02 2 2.53e-02 100 >>>> PCGAMG Opt l04 1 1.0 4.1008e-04 1.0 1.77e+04 1.0 0.0e+00 >>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 43 165 1 >>>> 4.49e-03 1 1.19e-02 100 >>>> PCSetUp 2 1.0 9.9632e+00 1.0 4.95e+09 1.0 0.0e+00 >>>> 0.0e+00 0.0e+00 5 12 0 0 0 5 12 0 0 0 496 17826 55 >>>> 1.03e+03 45 6.54e+02 98 >>>> PCSetUpOnBlocks 44 1.0 9.9087e-04 1.0 2.88e+03 1.0 >>>> >>>> >>>> >>>> >>>> >>>> >>>>> The point of lazy initialization is to make it possible to run a solve >>>>> that doesn't use a GPU in PETSC_ARCH that supports GPUs, regardless of >>>>> whether a GPU is actually present. >>>>> >>>>> Fande Kong writes: >>>>> >>>>> > I spoke too soon. It seems that we have trouble creating cuda/kokkos >>>>> vecs >>>>> > now. Got Segmentation fault. >>>>> > >>>>> > Thanks, >>>>> > >>>>> > Fande >>>>> > >>>>> > Program received signal SIGSEGV, Segmentation fault. 
>>>>> > 0x00002aaab5558b11 in >>>>> > >>>>> Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::CUPMDeviceInternal::initialize >>>>> > (this=0x1) at >>>>> > >>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:54 >>>>> > 54 PetscErrorCode CUPMDevice::CUPMDeviceInternal::initialize() >>>>> noexcept >>>>> > Missing separate debuginfos, use: debuginfo-install >>>>> > bzip2-libs-1.0.6-13.el7.x86_64 elfutils-libelf-0.176-5.el7.x86_64 >>>>> > elfutils-libs-0.176-5.el7.x86_64 glibc-2.17-325.el7_9.x86_64 >>>>> > libX11-1.6.7-4.el7_9.x86_64 libXau-1.0.8-2.1.el7.x86_64 >>>>> > libattr-2.4.46-13.el7.x86_64 libcap-2.22-11.el7.x86_64 >>>>> > libibmad-5.4.0.MLNX20190423.1d917ae-0.1.49224.x86_64 >>>>> > libibumad-43.1.1.MLNX20200211.078947f-0.1.49224.x86_64 >>>>> > libibverbs-41mlnx1-OFED.4.9.0.0.7.49224.x86_64 >>>>> > libmlx4-41mlnx1-OFED.4.7.3.0.3.49224.x86_64 >>>>> > libmlx5-41mlnx1-OFED.4.9.0.1.2.49224.x86_64 >>>>> libnl3-3.2.28-4.el7.x86_64 >>>>> > librdmacm-41mlnx1-OFED.4.7.3.0.6.49224.x86_64 >>>>> > librxe-41mlnx1-OFED.4.4.2.4.6.49224.x86_64 libxcb-1.13-1.el7.x86_64 >>>>> > libxml2-2.9.1-6.el7_9.6.x86_64 numactl-libs-2.0.12-5.el7.x86_64 >>>>> > systemd-libs-219-78.el7_9.3.x86_64 xz-libs-5.2.2-1.el7.x86_64 >>>>> > zlib-1.2.7-19.el7_9.x86_64 >>>>> > (gdb) bt >>>>> > #0 0x00002aaab5558b11 in >>>>> > >>>>> Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::CUPMDeviceInternal::initialize >>>>> > (this=0x1) at >>>>> > >>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:54 >>>>> > #1 0x00002aaab5558db7 in >>>>> > Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::getDevice >>>>> > (this=this at entry=0x2aaab7f37b70 >>>>> > , device=0x115da00, id=-35, id at entry=-1) at >>>>> > >>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:344 >>>>> > #2 0x00002aaab55577de in PetscDeviceCreate (type=type at entry >>>>> =PETSC_DEVICE_CUDA, >>>>> > devid=devid at entry=-1, device=device at entry=0x2aaab7f37b48 >>>>> > ) at >>>>> > >>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:107 >>>>> > #3 0x00002aaab5557b3a in PetscDeviceInitializeDefaultDevice_Internal >>>>> > (type=type at entry=PETSC_DEVICE_CUDA, >>>>> defaultDeviceId=defaultDeviceId at entry=-1) >>>>> > at >>>>> > >>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:273 >>>>> > #4 0x00002aaab5557bf6 in PetscDeviceInitialize >>>>> > (type=type at entry=PETSC_DEVICE_CUDA) >>>>> > at >>>>> > >>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:234 >>>>> > #5 0x00002aaab5661fcd in VecCreate_SeqCUDA (V=0x115d150) at >>>>> > >>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/impls/seq/seqcuda/veccuda.c:244 >>>>> > #6 0x00002aaab5649b40 in VecSetType (vec=vec at entry=0x115d150, >>>>> > method=method at entry=0x2aaab70b45b8 "seqcuda") at >>>>> > >>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vecreg.c:93 >>>>> > #7 0x00002aaab579c33f in VecCreate_CUDA (v=0x115d150) at >>>>> > >>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/impls/mpi/mpicuda/ >>>>> > mpicuda.cu:214 >>>>> > #8 0x00002aaab5649b40 in VecSetType (vec=vec at entry=0x115d150, >>>>> > method=method at entry=0x7fffffff9260 "cuda") at >>>>> > >>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vecreg.c:93 >>>>> > #9 0x00002aaab5648bf1 in VecSetTypeFromOptions_Private >>>>> (vec=0x115d150, 
>>>>> > PetscOptionsObject=0x7fffffff9210) at >>>>> > >>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vector.c:1263 >>>>> > #10 VecSetFromOptions (vec=0x115d150) at >>>>> > >>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vector.c:1297 >>>>> > #11 0x00002aaab02ef227 in libMesh::PetscVector::init >>>>> > (this=0x11cd1a0, n=441, n_local=441, fast=false, >>>>> ptype=libMesh::PARALLEL) >>>>> > at >>>>> > >>>>> /home/kongf/workhome/sawtooth/moosegpu/scripts/../libmesh/installed/include/libmesh/petsc_vector.h:693 >>>>> > >>>>> > On Thu, Jan 20, 2022 at 1:09 PM Fande Kong >>>>> wrote: >>>>> > >>>>> >> Thanks, Jed, >>>>> >> >>>>> >> This worked! >>>>> >> >>>>> >> Fande >>>>> >> >>>>> >> On Wed, Jan 19, 2022 at 11:03 PM Jed Brown >>>>> wrote: >>>>> >> >>>>> >>> Fande Kong writes: >>>>> >>> >>>>> >>> > On Wed, Jan 19, 2022 at 11:39 AM Jacob Faibussowitsch < >>>>> >>> jacob.fai at gmail.com> >>>>> >>> > wrote: >>>>> >>> > >>>>> >>> >> Are you running on login nodes or compute nodes (I can?t seem >>>>> to tell >>>>> >>> from >>>>> >>> >> the configure.log)? >>>>> >>> >> >>>>> >>> > >>>>> >>> > I was compiling codes on login nodes, and running codes on >>>>> compute >>>>> >>> nodes. >>>>> >>> > Login nodes do not have GPUs, but compute nodes do have GPUs. >>>>> >>> > >>>>> >>> > Just to be clear, the same thing (code, machine) with >>>>> PETSc-3.16.1 >>>>> >>> worked >>>>> >>> > perfectly. I have this trouble with PETSc-main. >>>>> >>> >>>>> >>> I assume you can >>>>> >>> >>>>> >>> export PETSC_OPTIONS='-device_enable lazy' >>>>> >>> >>>>> >>> and it'll work. >>>>> >>> >>>>> >>> I think this should be the default. The main complaint is that >>>>> timing the >>>>> >>> first GPU-using event isn't accurate if it includes >>>>> initialization, but I >>>>> >>> think this is mostly hypothetical because you can't trust any >>>>> timing that >>>>> >>> doesn't preload in some form and the first GPU-using event will >>>>> almost >>>>> >>> always be something uninteresting so I think it will rarely lead to >>>>> >>> confusion. Meanwhile, eager initialization is viscerally >>>>> disruptive for >>>>> >>> lots of people. >>>>> >>> >>>>> >> >>>>> >>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Tue Jan 25 21:21:12 2022 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Tue, 25 Jan 2022 21:21:12 -0600 Subject: [petsc-users] Cannot eagerly initialize cuda, as doing so results in cuda error 35 (cudaErrorInsufficientDriver) : CUDA driver version is insufficient for CUDA runtime version In-Reply-To: References: <233D0F20-AD95-481B-B862-353D5789C556@gmail.com> <87czkn9c64.fsf@jedbrown.org> <875yqe7zip.fsf@jedbrown.org> Message-ID: Fande, could you send the configure.log that works (i.e., before this offending commit)? --Junchao Zhang On Tue, Jan 25, 2022 at 8:21 PM Fande Kong wrote: > Not sure if this is helpful. 
I did "git bisect", and here was the result: > > [kongf at sawtooth2 petsc]$ git bisect bad > 246ba74192519a5f34fb6e227d1c64364e19ce2c is the first bad commit > commit 246ba74192519a5f34fb6e227d1c64364e19ce2c > Author: Junchao Zhang > Date: Wed Oct 13 05:32:43 2021 +0000 > > Config: fix CUDA library and header dirs > > :040000 040000 187c86055adb80f53c1d0565a8888704fec43a96 > ea1efd7f594fd5e8df54170bc1bc7b00f35e4d5f M config > > > Started from this commit, and GPU did not work for me on our HPC > > Thanks, > Fande > > On Tue, Jan 25, 2022 at 7:18 PM Fande Kong wrote: > >> >> >> On Tue, Jan 25, 2022 at 9:04 AM Jacob Faibussowitsch >> wrote: >> >>> Configure should not have an impact here I think. The reason I had you >>> run `cudaGetDeviceCount()` is because this is the CUDA call (and in fact >>> the only CUDA call) in the initialization sequence that returns the error >>> code. There should be no prior CUDA calls. Maybe this is a problem with >>> oversubscribing GPU?s? In the runs that crash, how many ranks are using any >>> given GPU at once? Maybe MPS is required. >>> >> >> I used one MPI rank. >> >> Fande >> >> >> >>> >>> Best regards, >>> >>> Jacob Faibussowitsch >>> (Jacob Fai - booss - oh - vitch) >>> >>> On Jan 21, 2022, at 12:01, Fande Kong wrote: >>> >>> Thanks Jacob, >>> >>> On Thu, Jan 20, 2022 at 6:25 PM Jacob Faibussowitsch < >>> jacob.fai at gmail.com> wrote: >>> >>>> Segfault is caused by the following check at >>>> src/sys/objects/device/impls/cupm/cupmdevice.cxx:349 being a >>>> PetscUnlikelyDebug() rather than just PetscUnlikely(): >>>> >>>> ``` >>>> if (PetscUnlikelyDebug(_defaultDevice < 0)) { // _defaultDevice is in >>>> fact < 0 here and uncaught >>>> ``` >>>> >>>> To clarify: >>>> >>>> ?lazy? initialization is not that lazy after all, it still does some >>>> 50% of the initialization that ?eager? initialization does. It stops short >>>> initializing the CUDA runtime, checking CUDA aware MPI, gathering device >>>> data, and initializing cublas and friends. Lazy also importantly swallows >>>> any errors that crop up during initialization, storing the resulting error >>>> code for later (specifically _defaultDevice = -init_error_value;). >>>> >>>> So whether you initialize lazily or eagerly makes no difference here, >>>> as _defaultDevice will always contain -35. >>>> >>>> The bigger question is why cudaGetDeviceCount() is returning >>>> cudaErrorInsufficientDriver. Can you compile and run >>>> >>>> ``` >>>> #include >>>> >>>> int main() >>>> { >>>> int ndev; >>>> return cudaGetDeviceCount(&ndev): >>>> } >>>> ``` >>>> >>>> Then show the value of "echo $??? >>>> >>> >>> Modify your code a little to get more information. >>> >>> #include >>> #include >>> >>> int main() >>> { >>> int ndev; >>> int error = cudaGetDeviceCount(&ndev); >>> printf("ndev %d \n", ndev); >>> printf("error %d \n", error); >>> return 0; >>> } >>> >>> Results: >>> >>> $ ./a.out >>> ndev 4 >>> error 0 >>> >>> >>> I have not read the PETSc cuda initialization code yet. If I need to >>> guess at what was happening. I will naively think that PETSc did not get >>> correct GPU information in the configuration because the compiler node does >>> not have GPUs, and there was no way to get any GPU device information. >>> >>> During the runtime on GPU nodes, PETSc might have incorrect information >>> grabbed during configuration and had this kind of false error message. 
>>> >>> Thanks, >>> >>> Fande >>> >>> >>> >>>> >>>> Best regards, >>>> >>>> Jacob Faibussowitsch >>>> (Jacob Fai - booss - oh - vitch) >>>> >>>> On Jan 20, 2022, at 17:47, Matthew Knepley wrote: >>>> >>>> On Thu, Jan 20, 2022 at 6:44 PM Fande Kong wrote: >>>> >>>>> Thanks, Jed >>>>> >>>>> On Thu, Jan 20, 2022 at 4:34 PM Jed Brown wrote: >>>>> >>>>>> You can't create CUDA or Kokkos Vecs if you're running on a node >>>>>> without a GPU. >>>>> >>>>> >>>>> I am running the code on compute nodes that do have GPUs. >>>>> >>>> >>>> If you are actually running on GPUs, why would you need lazy >>>> initialization? It would not break with GPUs present. >>>> >>>> Matt >>>> >>>> >>>>> With PETSc-3.16.1, I got good speedup by running GAMG on GPUs. That >>>>> might be a bug of PETSc-main. >>>>> >>>>> Thanks, >>>>> >>>>> Fande >>>>> >>>>> >>>>> >>>>> KSPSetUp 13 1.0 6.4400e-01 1.0 2.02e+09 1.0 0.0e+00 >>>>> 0.0e+00 0.0e+00 0 5 0 0 0 0 5 0 0 0 3140 64630 15 >>>>> 1.05e+02 5 3.49e+01 100 >>>>> KSPSolve 1 1.0 1.0109e+00 1.0 3.49e+10 1.0 0.0e+00 >>>>> 0.0e+00 0.0e+00 0 87 0 0 0 0 87 0 0 0 34522 69556 4 >>>>> 4.35e-03 1 2.38e-03 100 >>>>> KSPGMRESOrthog 142 1.0 1.2674e-01 1.0 1.06e+10 1.0 0.0e+00 >>>>> 0.0e+00 0.0e+00 0 27 0 0 0 0 27 0 0 0 83755 87801 0 >>>>> 0.00e+00 0 0.00e+00 100 >>>>> SNESSolve 1 1.0 4.4402e+01 1.0 4.00e+10 1.0 0.0e+00 >>>>> 0.0e+00 0.0e+00 21100 0 0 0 21100 0 0 0 901 51365 57 >>>>> 1.10e+03 52 8.78e+02 100 >>>>> SNESSetUp 1 1.0 3.9101e-05 1.0 0.00e+00 0.0 0.0e+00 >>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>> 0.00e+00 0 0.00e+00 0 >>>>> SNESFunctionEval 2 1.0 1.7097e+01 1.0 1.60e+07 1.0 0.0e+00 >>>>> 0.0e+00 0.0e+00 8 0 0 0 0 8 0 0 0 0 1 0 0 >>>>> 0.00e+00 6 1.92e+02 0 >>>>> SNESJacobianEval 1 1.0 1.6213e+01 1.0 2.80e+07 1.0 0.0e+00 >>>>> 0.0e+00 0.0e+00 8 0 0 0 0 8 0 0 0 0 2 0 0 >>>>> 0.00e+00 1 3.20e+01 0 >>>>> SNESLineSearch 1 1.0 8.5582e+00 1.0 1.24e+08 1.0 0.0e+00 >>>>> 0.0e+00 0.0e+00 4 0 0 0 0 4 0 0 0 0 14 64153 1 >>>>> 3.20e+01 3 9.61e+01 94 >>>>> PCGAMGGraph_AGG 5 1.0 3.0509e+00 1.0 8.19e+07 1.0 0.0e+00 >>>>> 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 27 0 5 >>>>> 3.49e+01 9 7.43e+01 0 >>>>> PCGAMGCoarse_AGG 5 1.0 3.8711e+00 1.0 0.00e+00 0.0 0.0e+00 >>>>> 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 0 0 >>>>> 0.00e+00 0 0.00e+00 0 >>>>> PCGAMGProl_AGG 5 1.0 7.0748e-01 1.0 0.00e+00 0.0 0.0e+00 >>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>> 0.00e+00 0 0.00e+00 0 >>>>> PCGAMGPOpt_AGG 5 1.0 1.2904e+00 1.0 2.14e+09 1.0 0.0e+00 >>>>> 0.0e+00 0.0e+00 1 5 0 0 0 1 5 0 0 0 1661 29807 26 >>>>> 7.15e+02 20 2.90e+02 99 >>>>> GAMG: createProl 5 1.0 8.9489e+00 1.0 2.22e+09 1.0 0.0e+00 >>>>> 0.0e+00 0.0e+00 4 6 0 0 0 4 6 0 0 0 249 29666 31 >>>>> 7.50e+02 29 3.64e+02 96 >>>>> Graph 10 1.0 3.0478e+00 1.0 8.19e+07 1.0 0.0e+00 >>>>> 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 27 0 5 >>>>> 3.49e+01 9 7.43e+01 0 >>>>> MIS/Agg 5 1.0 4.1290e-01 1.0 0.00e+00 0.0 0.0e+00 >>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>> 0.00e+00 0 0.00e+00 0 >>>>> SA: col data 5 1.0 1.9127e-02 1.0 0.00e+00 0.0 0.0e+00 >>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>> 0.00e+00 0 0.00e+00 0 >>>>> SA: frmProl0 5 1.0 6.2662e-01 1.0 0.00e+00 0.0 0.0e+00 >>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>> 0.00e+00 0 0.00e+00 0 >>>>> SA: smooth 5 1.0 4.9595e-01 1.0 1.21e+08 1.0 0.0e+00 >>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 244 2709 15 >>>>> 1.97e+02 15 2.55e+02 90 >>>>> GAMG: partLevel 5 1.0 4.7330e-01 1.0 6.98e+08 1.0 0.0e+00 >>>>> 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 1475 4120 5 >>>>> 1.78e+02 10 2.55e+02 100 >>>>> 
PCGAMG Squ l00 1 1.0 2.6027e+00 1.0 0.00e+00 0.0 0.0e+00 >>>>> 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 0 0 >>>>> 0.00e+00 0 0.00e+00 0 >>>>> PCGAMG Gal l00 1 1.0 3.8406e-01 1.0 5.48e+08 1.0 0.0e+00 >>>>> 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 1426 4270 1 >>>>> 1.48e+02 2 2.11e+02 100 >>>>> PCGAMG Opt l00 1 1.0 2.4932e-01 1.0 7.20e+07 1.0 0.0e+00 >>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 289 2653 1 >>>>> 6.41e+01 1 1.13e+02 100 >>>>> PCGAMG Gal l01 1 1.0 6.6279e-02 1.0 1.09e+08 1.0 0.0e+00 >>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1645 3851 1 >>>>> 2.40e+01 2 3.64e+01 100 >>>>> PCGAMG Opt l01 1 1.0 2.9544e-02 1.0 7.15e+06 1.0 0.0e+00 >>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 242 1671 1 >>>>> 4.84e+00 1 1.23e+01 100 >>>>> PCGAMG Gal l02 1 1.0 1.8874e-02 1.0 3.72e+07 1.0 0.0e+00 >>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1974 3636 1 >>>>> 5.04e+00 2 6.58e+00 100 >>>>> PCGAMG Opt l02 1 1.0 7.4353e-03 1.0 2.40e+06 1.0 0.0e+00 >>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 323 1457 1 >>>>> 7.71e-01 1 2.30e+00 100 >>>>> PCGAMG Gal l03 1 1.0 2.8479e-03 1.0 4.10e+06 1.0 0.0e+00 >>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1440 2266 1 >>>>> 4.44e-01 2 5.51e-01 100 >>>>> PCGAMG Opt l03 1 1.0 8.2684e-04 1.0 2.80e+05 1.0 0.0e+00 >>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 339 1667 1 >>>>> 6.72e-02 1 2.03e-01 100 >>>>> PCGAMG Gal l04 1 1.0 1.2238e-03 1.0 2.09e+05 1.0 0.0e+00 >>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 170 244 1 >>>>> 2.05e-02 2 2.53e-02 100 >>>>> PCGAMG Opt l04 1 1.0 4.1008e-04 1.0 1.77e+04 1.0 0.0e+00 >>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 43 165 1 >>>>> 4.49e-03 1 1.19e-02 100 >>>>> PCSetUp 2 1.0 9.9632e+00 1.0 4.95e+09 1.0 0.0e+00 >>>>> 0.0e+00 0.0e+00 5 12 0 0 0 5 12 0 0 0 496 17826 55 >>>>> 1.03e+03 45 6.54e+02 98 >>>>> PCSetUpOnBlocks 44 1.0 9.9087e-04 1.0 2.88e+03 1.0 >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>>> The point of lazy initialization is to make it possible to run a >>>>>> solve that doesn't use a GPU in PETSC_ARCH that supports GPUs, regardless >>>>>> of whether a GPU is actually present. >>>>>> >>>>>> Fande Kong writes: >>>>>> >>>>>> > I spoke too soon. It seems that we have trouble creating >>>>>> cuda/kokkos vecs >>>>>> > now. Got Segmentation fault. >>>>>> > >>>>>> > Thanks, >>>>>> > >>>>>> > Fande >>>>>> > >>>>>> > Program received signal SIGSEGV, Segmentation fault. 
>>>>>> > 0x00002aaab5558b11 in >>>>>> > >>>>>> Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::CUPMDeviceInternal::initialize >>>>>> > (this=0x1) at >>>>>> > >>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:54 >>>>>> > 54 PetscErrorCode CUPMDevice::CUPMDeviceInternal::initialize() >>>>>> noexcept >>>>>> > Missing separate debuginfos, use: debuginfo-install >>>>>> > bzip2-libs-1.0.6-13.el7.x86_64 elfutils-libelf-0.176-5.el7.x86_64 >>>>>> > elfutils-libs-0.176-5.el7.x86_64 glibc-2.17-325.el7_9.x86_64 >>>>>> > libX11-1.6.7-4.el7_9.x86_64 libXau-1.0.8-2.1.el7.x86_64 >>>>>> > libattr-2.4.46-13.el7.x86_64 libcap-2.22-11.el7.x86_64 >>>>>> > libibmad-5.4.0.MLNX20190423.1d917ae-0.1.49224.x86_64 >>>>>> > libibumad-43.1.1.MLNX20200211.078947f-0.1.49224.x86_64 >>>>>> > libibverbs-41mlnx1-OFED.4.9.0.0.7.49224.x86_64 >>>>>> > libmlx4-41mlnx1-OFED.4.7.3.0.3.49224.x86_64 >>>>>> > libmlx5-41mlnx1-OFED.4.9.0.1.2.49224.x86_64 >>>>>> libnl3-3.2.28-4.el7.x86_64 >>>>>> > librdmacm-41mlnx1-OFED.4.7.3.0.6.49224.x86_64 >>>>>> > librxe-41mlnx1-OFED.4.4.2.4.6.49224.x86_64 libxcb-1.13-1.el7.x86_64 >>>>>> > libxml2-2.9.1-6.el7_9.6.x86_64 numactl-libs-2.0.12-5.el7.x86_64 >>>>>> > systemd-libs-219-78.el7_9.3.x86_64 xz-libs-5.2.2-1.el7.x86_64 >>>>>> > zlib-1.2.7-19.el7_9.x86_64 >>>>>> > (gdb) bt >>>>>> > #0 0x00002aaab5558b11 in >>>>>> > >>>>>> Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::CUPMDeviceInternal::initialize >>>>>> > (this=0x1) at >>>>>> > >>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:54 >>>>>> > #1 0x00002aaab5558db7 in >>>>>> > Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::getDevice >>>>>> > (this=this at entry=0x2aaab7f37b70 >>>>>> > , device=0x115da00, id=-35, id at entry=-1) at >>>>>> > >>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:344 >>>>>> > #2 0x00002aaab55577de in PetscDeviceCreate (type=type at entry >>>>>> =PETSC_DEVICE_CUDA, >>>>>> > devid=devid at entry=-1, device=device at entry=0x2aaab7f37b48 >>>>>> > ) at >>>>>> > >>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:107 >>>>>> > #3 0x00002aaab5557b3a in >>>>>> PetscDeviceInitializeDefaultDevice_Internal >>>>>> > (type=type at entry=PETSC_DEVICE_CUDA, >>>>>> defaultDeviceId=defaultDeviceId at entry=-1) >>>>>> > at >>>>>> > >>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:273 >>>>>> > #4 0x00002aaab5557bf6 in PetscDeviceInitialize >>>>>> > (type=type at entry=PETSC_DEVICE_CUDA) >>>>>> > at >>>>>> > >>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:234 >>>>>> > #5 0x00002aaab5661fcd in VecCreate_SeqCUDA (V=0x115d150) at >>>>>> > >>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/impls/seq/seqcuda/veccuda.c:244 >>>>>> > #6 0x00002aaab5649b40 in VecSetType (vec=vec at entry=0x115d150, >>>>>> > method=method at entry=0x2aaab70b45b8 "seqcuda") at >>>>>> > >>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vecreg.c:93 >>>>>> > #7 0x00002aaab579c33f in VecCreate_CUDA (v=0x115d150) at >>>>>> > >>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/impls/mpi/mpicuda/ >>>>>> > mpicuda.cu:214 >>>>>> > #8 0x00002aaab5649b40 in VecSetType (vec=vec at entry=0x115d150, >>>>>> > method=method at entry=0x7fffffff9260 "cuda") at >>>>>> > >>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vecreg.c:93 >>>>>> > 
#9 0x00002aaab5648bf1 in VecSetTypeFromOptions_Private >>>>>> (vec=0x115d150, >>>>>> > PetscOptionsObject=0x7fffffff9210) at >>>>>> > >>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vector.c:1263 >>>>>> > #10 VecSetFromOptions (vec=0x115d150) at >>>>>> > >>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vector.c:1297 >>>>>> > #11 0x00002aaab02ef227 in libMesh::PetscVector::init >>>>>> > (this=0x11cd1a0, n=441, n_local=441, fast=false, >>>>>> ptype=libMesh::PARALLEL) >>>>>> > at >>>>>> > >>>>>> /home/kongf/workhome/sawtooth/moosegpu/scripts/../libmesh/installed/include/libmesh/petsc_vector.h:693 >>>>>> > >>>>>> > On Thu, Jan 20, 2022 at 1:09 PM Fande Kong >>>>>> wrote: >>>>>> > >>>>>> >> Thanks, Jed, >>>>>> >> >>>>>> >> This worked! >>>>>> >> >>>>>> >> Fande >>>>>> >> >>>>>> >> On Wed, Jan 19, 2022 at 11:03 PM Jed Brown >>>>>> wrote: >>>>>> >> >>>>>> >>> Fande Kong writes: >>>>>> >>> >>>>>> >>> > On Wed, Jan 19, 2022 at 11:39 AM Jacob Faibussowitsch < >>>>>> >>> jacob.fai at gmail.com> >>>>>> >>> > wrote: >>>>>> >>> > >>>>>> >>> >> Are you running on login nodes or compute nodes (I can?t seem >>>>>> to tell >>>>>> >>> from >>>>>> >>> >> the configure.log)? >>>>>> >>> >> >>>>>> >>> > >>>>>> >>> > I was compiling codes on login nodes, and running codes on >>>>>> compute >>>>>> >>> nodes. >>>>>> >>> > Login nodes do not have GPUs, but compute nodes do have GPUs. >>>>>> >>> > >>>>>> >>> > Just to be clear, the same thing (code, machine) with >>>>>> PETSc-3.16.1 >>>>>> >>> worked >>>>>> >>> > perfectly. I have this trouble with PETSc-main. >>>>>> >>> >>>>>> >>> I assume you can >>>>>> >>> >>>>>> >>> export PETSC_OPTIONS='-device_enable lazy' >>>>>> >>> >>>>>> >>> and it'll work. >>>>>> >>> >>>>>> >>> I think this should be the default. The main complaint is that >>>>>> timing the >>>>>> >>> first GPU-using event isn't accurate if it includes >>>>>> initialization, but I >>>>>> >>> think this is mostly hypothetical because you can't trust any >>>>>> timing that >>>>>> >>> doesn't preload in some form and the first GPU-using event will >>>>>> almost >>>>>> >>> always be something uninteresting so I think it will rarely lead >>>>>> to >>>>>> >>> confusion. Meanwhile, eager initialization is viscerally >>>>>> disruptive for >>>>>> >>> lots of people. >>>>>> >>> >>>>>> >> >>>>>> >>>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From dong-hao at outlook.com Tue Jan 25 21:52:47 2022 From: dong-hao at outlook.com (Hao DONG) Date: Wed, 26 Jan 2022 03:52:47 +0000 Subject: [petsc-users] Strange CUDA failure with a second petscfinalize with PETSc 3.16 In-Reply-To: References: <45585074-8C0E-453B-993B-554D23A1E971@gmail.com> <60E1BD31-E98A-462F-BA2B-2099745B2582@gmail.com> Message-ID: Thanks Jacob, Right, I forgot to change the clone -b release to main - my mistake. The c++ dialect option now works without problem. My laptop system is indeed configured with WSL2, with Linux kernel of ?5.10.60.1-microsoft-standard-WSL2?. And I have a windows 11 Version 21H2 (OS Build 22000.438) as the host system. The Nvidia driver version is ?510.06? with cuda 11.4. Interestingly, my ex11fc can now get pass the second petscfinalize. 
The code can now get to the third loop of kspsolve and reach the mpifinalize without a problem. So changing to main branch solves the petscfinalize error problem. However, it still complains with an error like: -------------------------------------------------------------------------- The call to cuEventDestory failed. This is a unrecoverable error and will cause the program to abort. cuEventDestory return value: 400 Check the cuda.h file for what the return value means. -------------------------------------------------------------------------- The full log files are also attached. I also noticed there are other event-management related errors like cuEventCreate and cuIpcGetEventHandle in the log. Does it give any insights on why we have the problem? Cheers, Hao Sent from Mail for Windows From: Jacob Faibussowitsch Sent: Tuesday, January 25, 2022 11:19 PM To: Hao DONG Cc: petsc-users; Junchao Zhang Subject: Re: [petsc-users] Strange CUDA failure with a second petscfinalize with PETSc 3.16 Hi Hao, I have tried to git pull on the laptop from main and re-config as you suggested. It looks like you?re still on the release branch. Do ``` $ git checkout main $ git pull ``` Then reconfigure. This is also why the cxx dialect flag did not work, I forgot that this change had not made it to release yet. my laptop setup is based on WSL What version of windows do you have? And what version of WSL? And what version is the linux kernel? You will need at least WSL2 and both your NVIDIA driver, windows version, and linux kernel version are required to be fairly new AFAIK to be able to run CUDA on them. See here https://docs.nvidia.com/cuda/wsl-user-guide/index.html. To get your windows version: 1. Press Windows key+R 2. Type winver in the box, and press enter 3. You should see a line with Version and a build number. For example on my windows machine I see ?Version 21H2 (OS Build 19044.1496)? To get WSL version: 1. Open WSL 2. Type uname -r, for example I get "5.10.60.1-microsoft-standard-wsl2" To get NVIDIA driver version: 1. Open up the NVIDIA control panel 2. Click on ?System Information? in the bottom left corner 3. You should see a dual list, ?Items? and ?Details?. In the details column. You should see ?Driver verion?. For example on my machine I see ?511.23? Best regards, Jacob Faibussowitsch (Jacob Fai - booss - oh - vitch) On Jan 25, 2022, at 03:42, Hao DONG > wrote: Hi Jacob, Thanks for the comments ? silly that I have overlooked the debugging flag so far. Unfortunately, I am out of office for a couple of days so I cannot confirm the result on my workstation, for now. However, I have a laptop with nvidia graphic card (old gtx1050, which is actually slower than the cpu in terms of double precision calculation), I have tried to git pull on the laptop from main and re-config as you suggested. However, using ?--with-cxx-dialect=14? throws up an error like: Unknown C++ dialect: with-cxx-dialect=14 And omitting the ?--with-cuda-dialect=cxx14? also gives me a similar complaint with: CUDA Error: Using CUDA with PetscComplex requires a C++ dialect at least cxx11. Use --with-cxx-dialect=xxx and --with-cuda-dialect=xxx to specify a suitable compiler. 
Eventually I was able to configure and compile with the following config setup: ./configure --prefix=/opt/petsc/debug --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 --with-scalar-type=complex --with-precision=double --with-cuda-dir=/usr/local/cuda --with-fortran-kernels=1 --with-cxx-dialect=cxx14 --with-cuda-dialect=cxx14 --with-debugging=1 But still, I got the same error regarding cuda (still the error code 97 thing). I attached the configure and output log of my ex11 on my laptop ? is there anything that can help pinpoint the problem? I can also confirm that PETSc 3.15.2 works well with my ex11fc code with cuda, on my laptop. Sadly, my laptop setup is based on WSL, which is far from an ideal environment to test CUDA. I will let you know once I get my hands on my workstations. Cheers, Hao Sent from Mail for Windows From: Jacob Faibussowitsch Sent: Tuesday, January 25, 2022 1:22 AM To: Hao DONG Cc: petsc-users; Junchao Zhang Subject: Re: [petsc-users] Strange CUDA failure with a second petscfinalize with PETSc 3.16 Hi Hao, Any luck reproducing the CUDA problem? Sorry for the long radio silence, I still have not been able to reproduce the problem unfortunately. I have tried on a local machine, and a few larger clusters and all return without errors both with and without cuda? Can you try pulling the latest version of main, reconfiguring and trying again? BTW, your configure arguments are a little wonky: 1. --with-clanguage=c - this isn?t needed, PETSc will default to C 2. --with-cxx-dialect=cxx14 --with-cuda-dialect=cxx14 - use --with-cxx-dialect=14 instead, PETSc will detect that you have gnu compilers and enable gnu extensions 3. -with-debugging=1 - this is missing an extra dash, but you also have optimization flags set so maybe just leave this one out Best regards, Jacob Faibussowitsch (Jacob Fai - booss - oh - vitch) On Jan 23, 2022, at 21:29, Hao DONG > wrote: Dear Jacob, Any luck reproducing the CUDA problem? - just write to check in, in case somehow the response did not reach me (this happens to my colleagues abroad sometimes, probably due to the Wall). All the best, Hao On Jan 19, 2022, at 3:01 PM, Hao DONG > wrote: ? Thanks Jacob for looking into this ? You can see the updated source code of ex11fc in the attachment ? although there is not much that I modified (except for the jabbers I outputted). I also attached the full output (ex11fc.log) along with the configure.log file. It?s an old dual Xeon workstation (one of my ?production? machines) with Linux kernel 5.4.0 and gcc 9.3. I simply ran the code with mpiexec -np 2 ex11fc -usecuda for GPU test. And as stated before, calling without the ?-usecuda? option shows no errors. Please let me know if you find anything wrong with the configure/code. Cheers, Hao From: Jacob Faibussowitsch Sent: Wednesday, January 19, 2022 3:38 AM To: Hao DONG Cc: Junchao Zhang; petsc-users Subject: Re: [petsc-users] Strange CUDA failure with a second petscfinalize with PETSc 3.16 Apologies, forgot to mention in my previous email but can you also include a copy of the full printout of the error message that you get? It will include all the command-line flags that you ran with (if any) so I can exactly mirror your environment. Best regards, Jacob Faibussowitsch (Jacob Fai - booss - oh - vitch) On Jan 18, 2022, at 14:06, Jacob Faibussowitsch > wrote: Can you send your updated source file as well as your configure.log (should be $PETSC_DIR/configure.log). I will see if I can reproduce the error on my end. 
Best regards, Jacob Faibussowitsch (Jacob Fai - booss - oh - vitch) On Jan 17, 2022, at 23:06, Hao DONG > wrote: ? Dear Junchao and Jacob, Thanks a lot for the response ? I also don?t understand why this is related to the device, especially on why the procedure can be successfully finished for *once* ? As instructed, I tried to add a CHKERRA() macro after (almost) every petsc line ? such as the initialization, mat assemble, ksp create, solve, mat destroy, etc. However, all other petsc commands returns with error code 0. It only gives me a similar (still not very informative) error after I call the petscfinalize (again for the second time), with error code 97: [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: GPU error [0]PETSC ERROR: cuda error 709 (cudaErrorContextIsDestroyed) : context is destroyed [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.16.3, unknown [0]PETSC ERROR: ./ex11f on a named stratosphere by donghao Tue Jan 18 11:39:43 2022 [0]PETSC ERROR: Configure options --prefix=/opt/petsc/complex-double-with-cuda --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 COPTFLAGS="-O3 -mavx2" CXXOPTFLAGS="-O3 -mavx2" FOPTFLAGS="-O3 -ffree-line-length-none -mavx2" CUDAOPTFLAGS=-O3 --with-cxx-dialect=cxx14 --with-cuda-dialect=cxx14 --with-scalar-type=complex --with-precision=double --with-cuda-dir=/usr/local/cuda --with-debugging=1 [0]PETSC ERROR: #1 PetscFinalize() at /home/donghao/packages/petsc-current/src/sys/objects/pinit.c:1638 [0]PETSC ERROR: #2 User provided function() at User file:0 I can also confirm that rolling back to petsc 3.15 will *not* see the problem, even with the new nvidia driver. And petsc 3.16.3 with an old nvidia driver (470.42) also get this same error. So it?s probably not connected to the nvidia driver. Any idea on where I should look at next? Thanks a lot in advance, and all the best, Hao From: Jacob Faibussowitsch Sent: Sunday, January 16, 2022 12:12 AM To: Junchao Zhang Cc: petsc-users; Hao DONG Subject: Re: [petsc-users] Strange CUDA failure with a second petscfinalize with PETSc 3.16 I don?t quite understand how it is getting to the CUDA error to be honest. None of the code in the stack trace is anywhere near the device code. Reading the error message carefully, it first chokes on PetscLogGetStageLog() from a call to PetscClassIdRegister(): PetscErrorCode PetscLogGetStageLog(PetscStageLog *stageLog) { PetscFunctionBegin; PetscValidPointer(stageLog,1); if (!petsc_stageLog) { fprintf(stderr, "PETSC ERROR: Logging has not been enabled.\nYou might have forgotten to call PetscInitialize().\n"); PETSCABORT(MPI_COMM_WORLD, PETSC_ERR_SUP); // Here } ... But then jumps to PetscFinalize(). You can also see the "You might have forgotten to call PetscInitialize().? message in the error message, just under the topmost level of the stack trace. Can you check the value of ierr of each function call (use the CHKERRA() macro to do so)? I suspect the problem here that errors occurring previously in the program are being ignored, leading to the garbled stack trace. Best regards, Jacob Faibussowitsch (Jacob Fai - booss - oh - vitch) On Jan 14, 2022, at 20:58, Junchao Zhang > wrote: Jacob, Could you have a look as it seems the "invalid device context" is in your newly added module? 
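(For reference, a minimal C sketch of the per-call error checking Jacob suggests above: the reproducer itself is Fortran, where each call would instead be followed by CHKERRA(ierr), and the Mat calls below are placeholders rather than the actual ex11fc code.)

```
PetscErrorCode ierr;
Mat            A;

/* Check the returned error code after every PETSc call so the first failing call
   is reported, instead of a garbled stack trace surfacing later at PetscFinalize(). */
ierr = MatCreate(PETSC_COMM_WORLD, &A);CHKERRQ(ierr);
ierr = MatSetType(A, MATMPIAIJCUSPARSE);CHKERRQ(ierr);
/* ... assembly, KSPCreate()/KSPSolve(), each followed by CHKERRQ(ierr) ... */
ierr = MatDestroy(&A);CHKERRQ(ierr);
```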
Thanks --Junchao Zhang On Fri, Jan 14, 2022 at 12:49 AM Hao DONG > wrote: Dear All, I have encountered a peculiar problem when fiddling with a code with PETSC 3.16.3 (which worked fine with PETSc 3.15). It is a very straight forward PDE-based optimization code which repeatedly solves a linearized PDE problem with KSP in a subroutine (the rest of the code does not contain any PETSc related content). The main program provides the subroutine with an MPI comm. Then I set the comm as PETSC_COMM_WORLD to tell PETSC to attach to it (and detach with it when the solving is finished each time). Strangely, I observe a CUDA failure whenever the petscfinalize is called for a *second* time. In other words, the first and second PDE calculations with GPU are fine (with correct solutions). The petsc code just fails after the SECOND petscfinalize command is called. You can also see the PETSC config in the error message: [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [1]PETSC ERROR: GPU error [1]PETSC ERROR: cuda error 201 (cudaErrorDeviceUninitialized) : invalid device context [1]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. [1]PETSC ERROR: Petsc Release Version 3.16.3, unknown [1]PETSC ERROR: maxwell.gpu on a named stratosphere by hao Fri Jan 14 10:21:05 2022 [1]PETSC ERROR: Configure options --prefix=/opt/petsc/complex-double-with-cuda --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 COPTFLAGS="-O3 -mavx2" CXXOPTFLAGS="-O3 -mavx2" FOPTFLAGS="-O3 -ffree-line-length-none -mavx2" CUDAOPTFLAGS=-O3 --with-cxx-dialect=cxx14 --with-cuda-dialect=cxx14 --with-scalar-type=complex --with-precision=double --with-cuda-dir=/usr/local/cuda --with-debugging=1 [1]PETSC ERROR: #1 PetscFinalize() at /home/hao/packages/petsc-current/src/sys/objects/pinit.c:1638 You might have forgotten to call PetscInitialize(). The EXACT line numbers in the error traceback are not available. Instead the line number of the start of the function is given. [1] #1 PetscAbortFindSourceFile_Private() at /home/hao/packages/petsc-current/src/sys/error/err.c:35 [1] #2 PetscLogGetStageLog() at /home/hao/packages/petsc-current/src/sys/logging/utils/stagelog.c:29 [1] #3 PetscClassIdRegister() at /home/hao/packages/petsc-current/src/sys/logging/plog.c:2376 [1] #4 MatMFFDInitializePackage() at /home/hao/packages/petsc-current/src/mat/impls/mffd/mffd.c:45 [1] #5 MatInitializePackage() at /home/hao/packages/petsc-current/src/mat/interface/dlregismat.c:163 [1] #6 MatCreate() at /home/hao/packages/petsc-current/src/mat/utils/gcreate.c:77 However, it doesn?t seem to affect the other part of my code, so the code can continue running until it gets to the petsc part again (the *third* time). Unfortunately, it doesn?t give me any further information even if I set the debugging to yes in the configure file. It also worth noting that PETSC without CUDA (i.e. with simple MATMPIAIJ) works perfectly fine. I am able to re-produce the problem with a toy code modified from ex11f. Please see the attached file (ex11fc.F90) for details. Essentially the code does the same thing as ex11f, but three times with a do loop. To do that I added an extra MPI_INIT/MPI_FINALIZE to ensure that the MPI communicator is not destroyed when PETSC_FINALIZE is called. I used the PetscOptionsHasName utility to check if you have ?-usecuda? in the options. So running the code with and without that option can give you a comparison w/o CUDA. 
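(For reference, a minimal C sketch of the structure just described is given below; the actual reproducer ex11fc.F90 is Fortran, and everything beyond the MPI/PETSc calls named in this thread is a placeholder. The outer MPI_Init/MPI_Finalize keeps the communicator alive while PETSc is initialized and finalized once per pass of the loop, and -usecuda switches the matrix type to CUSPARSE.)

```
#include <petscksp.h>

int main(int argc, char **argv)
{
  PetscErrorCode ierr;

  MPI_Init(&argc, &argv);                /* outer MPI session owns the communicator */
  for (int i = 0; i < 3; ++i) {
    PetscBool usecuda = PETSC_FALSE;
    Mat       A;

    ierr = PetscInitialize(&argc, &argv, NULL, NULL);if (ierr) return ierr;
    ierr = PetscOptionsHasName(NULL, NULL, "-usecuda", &usecuda);CHKERRQ(ierr);
    ierr = MatCreate(PETSC_COMM_WORLD, &A);CHKERRQ(ierr);
    if (usecuda) {ierr = MatSetType(A, MATMPIAIJCUSPARSE);CHKERRQ(ierr);}
    /* ... set sizes, assemble A, MatCreateVecs(), KSPSolve() as in ex11 ... */
    ierr = MatDestroy(&A);CHKERRQ(ierr);
    ierr = PetscFinalize();if (ierr) return ierr;   /* the reported failure appears after the 2nd pass */
  }
  MPI_Finalize();
  return 0;
}
```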
I can see that the code also fails after the second loop of the KSP operation. Could you kindly shed some lights on this problem? I should say that I am not even sure if the problem is from PETSc, as I also accidentally updated the NVIDIA driver (for now it is 510.06 with cuda 11.6). And it is well known that NVIDIA can give you some surprise in the updates (yes, I know I shouldn?t have touched that if it?s not broken). But my CUDA code without PETSC (which basically does the same PDE thing, but with cusparse/cublas directly) seems to work just fine after the update. It is also possible that my petsc code related to CUDA was not quite ?legitimate? ? I just use: MatSetType(A, MATMPIAIJCUSPARSE, ierr) and MatCreateVecs(A, u, PETSC_NULL_VEC, ierr) to make the data onto GPU. I would very much appreciate it if you could show me the ?right? way to do that. Thanks a lot in advance, and all the best, Hao -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure-main.log Type: application/octet-stream Size: 1286795 bytes Desc: configure-main.log URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ex11fc-main.log Type: application/octet-stream Size: 8548 bytes Desc: ex11fc-main.log URL: From fdkong.jd at gmail.com Tue Jan 25 22:29:39 2022 From: fdkong.jd at gmail.com (Fande Kong) Date: Tue, 25 Jan 2022 21:29:39 -0700 Subject: [petsc-users] Cannot eagerly initialize cuda, as doing so results in cuda error 35 (cudaErrorInsufficientDriver) : CUDA driver version is insufficient for CUDA runtime version In-Reply-To: References: <233D0F20-AD95-481B-B862-353D5789C556@gmail.com> <87czkn9c64.fsf@jedbrown.org> <875yqe7zip.fsf@jedbrown.org> Message-ID: Hi Junchao, I attached a "bad" configure log and a "good" configure log. The "bad" one was on produced at 246ba74192519a5f34fb6e227d1c64364e19ce2c and the "good" one at 384645a00975869a1aacbd3169de62ba40cad683 This good hash is the last good hash that is just the right before the bad one. I think you could do a comparison between these two logs, and check what the differences were. Thanks, Fande On Tue, Jan 25, 2022 at 8:21 PM Junchao Zhang wrote: > Fande, could you send the configure.log that works (i.e., before this > offending commit)? > --Junchao Zhang > > > On Tue, Jan 25, 2022 at 8:21 PM Fande Kong wrote: > >> Not sure if this is helpful. I did "git bisect", and here was the result: >> >> [kongf at sawtooth2 petsc]$ git bisect bad >> 246ba74192519a5f34fb6e227d1c64364e19ce2c is the first bad commit >> commit 246ba74192519a5f34fb6e227d1c64364e19ce2c >> Author: Junchao Zhang >> Date: Wed Oct 13 05:32:43 2021 +0000 >> >> Config: fix CUDA library and header dirs >> >> :040000 040000 187c86055adb80f53c1d0565a8888704fec43a96 >> ea1efd7f594fd5e8df54170bc1bc7b00f35e4d5f M config >> >> >> Started from this commit, and GPU did not work for me on our HPC >> >> Thanks, >> Fande >> >> On Tue, Jan 25, 2022 at 7:18 PM Fande Kong wrote: >> >>> >>> >>> On Tue, Jan 25, 2022 at 9:04 AM Jacob Faibussowitsch < >>> jacob.fai at gmail.com> wrote: >>> >>>> Configure should not have an impact here I think. The reason I had you >>>> run `cudaGetDeviceCount()` is because this is the CUDA call (and in fact >>>> the only CUDA call) in the initialization sequence that returns the error >>>> code. There should be no prior CUDA calls. Maybe this is a problem with >>>> oversubscribing GPU?s? 
In the runs that crash, how many ranks are using any >>>> given GPU at once? Maybe MPS is required. >>>> >>> >>> I used one MPI rank. >>> >>> Fande >>> >>> >>> >>>> >>>> Best regards, >>>> >>>> Jacob Faibussowitsch >>>> (Jacob Fai - booss - oh - vitch) >>>> >>>> On Jan 21, 2022, at 12:01, Fande Kong wrote: >>>> >>>> Thanks Jacob, >>>> >>>> On Thu, Jan 20, 2022 at 6:25 PM Jacob Faibussowitsch < >>>> jacob.fai at gmail.com> wrote: >>>> >>>>> Segfault is caused by the following check at >>>>> src/sys/objects/device/impls/cupm/cupmdevice.cxx:349 being a >>>>> PetscUnlikelyDebug() rather than just PetscUnlikely(): >>>>> >>>>> ``` >>>>> if (PetscUnlikelyDebug(_defaultDevice < 0)) { // _defaultDevice is in >>>>> fact < 0 here and uncaught >>>>> ``` >>>>> >>>>> To clarify: >>>>> >>>>> ?lazy? initialization is not that lazy after all, it still does some >>>>> 50% of the initialization that ?eager? initialization does. It stops short >>>>> initializing the CUDA runtime, checking CUDA aware MPI, gathering device >>>>> data, and initializing cublas and friends. Lazy also importantly swallows >>>>> any errors that crop up during initialization, storing the resulting error >>>>> code for later (specifically _defaultDevice = -init_error_value;). >>>>> >>>>> So whether you initialize lazily or eagerly makes no difference here, >>>>> as _defaultDevice will always contain -35. >>>>> >>>>> The bigger question is why cudaGetDeviceCount() is returning >>>>> cudaErrorInsufficientDriver. Can you compile and run >>>>> >>>>> ``` >>>>> #include >>>>> >>>>> int main() >>>>> { >>>>> int ndev; >>>>> return cudaGetDeviceCount(&ndev): >>>>> } >>>>> ``` >>>>> >>>>> Then show the value of "echo $??? >>>>> >>>> >>>> Modify your code a little to get more information. >>>> >>>> #include >>>> #include >>>> >>>> int main() >>>> { >>>> int ndev; >>>> int error = cudaGetDeviceCount(&ndev); >>>> printf("ndev %d \n", ndev); >>>> printf("error %d \n", error); >>>> return 0; >>>> } >>>> >>>> Results: >>>> >>>> $ ./a.out >>>> ndev 4 >>>> error 0 >>>> >>>> >>>> I have not read the PETSc cuda initialization code yet. If I need to >>>> guess at what was happening. I will naively think that PETSc did not get >>>> correct GPU information in the configuration because the compiler node does >>>> not have GPUs, and there was no way to get any GPU device information. >>>> >>>> During the runtime on GPU nodes, PETSc might have incorrect information >>>> grabbed during configuration and had this kind of false error message. >>>> >>>> Thanks, >>>> >>>> Fande >>>> >>>> >>>> >>>>> >>>>> Best regards, >>>>> >>>>> Jacob Faibussowitsch >>>>> (Jacob Fai - booss - oh - vitch) >>>>> >>>>> On Jan 20, 2022, at 17:47, Matthew Knepley wrote: >>>>> >>>>> On Thu, Jan 20, 2022 at 6:44 PM Fande Kong >>>>> wrote: >>>>> >>>>>> Thanks, Jed >>>>>> >>>>>> On Thu, Jan 20, 2022 at 4:34 PM Jed Brown wrote: >>>>>> >>>>>>> You can't create CUDA or Kokkos Vecs if you're running on a node >>>>>>> without a GPU. >>>>>> >>>>>> >>>>>> I am running the code on compute nodes that do have GPUs. >>>>>> >>>>> >>>>> If you are actually running on GPUs, why would you need lazy >>>>> initialization? It would not break with GPUs present. >>>>> >>>>> Matt >>>>> >>>>> >>>>>> With PETSc-3.16.1, I got good speedup by running GAMG on GPUs. That >>>>>> might be a bug of PETSc-main. 
>>>>>> >>>>>> Thanks, >>>>>> >>>>>> Fande >>>>>> >>>>>> >>>>>> >>>>>> KSPSetUp 13 1.0 6.4400e-01 1.0 2.02e+09 1.0 0.0e+00 >>>>>> 0.0e+00 0.0e+00 0 5 0 0 0 0 5 0 0 0 3140 64630 15 >>>>>> 1.05e+02 5 3.49e+01 100 >>>>>> KSPSolve 1 1.0 1.0109e+00 1.0 3.49e+10 1.0 0.0e+00 >>>>>> 0.0e+00 0.0e+00 0 87 0 0 0 0 87 0 0 0 34522 69556 4 >>>>>> 4.35e-03 1 2.38e-03 100 >>>>>> KSPGMRESOrthog 142 1.0 1.2674e-01 1.0 1.06e+10 1.0 0.0e+00 >>>>>> 0.0e+00 0.0e+00 0 27 0 0 0 0 27 0 0 0 83755 87801 0 >>>>>> 0.00e+00 0 0.00e+00 100 >>>>>> SNESSolve 1 1.0 4.4402e+01 1.0 4.00e+10 1.0 0.0e+00 >>>>>> 0.0e+00 0.0e+00 21100 0 0 0 21100 0 0 0 901 51365 57 >>>>>> 1.10e+03 52 8.78e+02 100 >>>>>> SNESSetUp 1 1.0 3.9101e-05 1.0 0.00e+00 0.0 0.0e+00 >>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>> 0.00e+00 0 0.00e+00 0 >>>>>> SNESFunctionEval 2 1.0 1.7097e+01 1.0 1.60e+07 1.0 0.0e+00 >>>>>> 0.0e+00 0.0e+00 8 0 0 0 0 8 0 0 0 0 1 0 0 >>>>>> 0.00e+00 6 1.92e+02 0 >>>>>> SNESJacobianEval 1 1.0 1.6213e+01 1.0 2.80e+07 1.0 0.0e+00 >>>>>> 0.0e+00 0.0e+00 8 0 0 0 0 8 0 0 0 0 2 0 0 >>>>>> 0.00e+00 1 3.20e+01 0 >>>>>> SNESLineSearch 1 1.0 8.5582e+00 1.0 1.24e+08 1.0 0.0e+00 >>>>>> 0.0e+00 0.0e+00 4 0 0 0 0 4 0 0 0 0 14 64153 1 >>>>>> 3.20e+01 3 9.61e+01 94 >>>>>> PCGAMGGraph_AGG 5 1.0 3.0509e+00 1.0 8.19e+07 1.0 0.0e+00 >>>>>> 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 27 0 5 >>>>>> 3.49e+01 9 7.43e+01 0 >>>>>> PCGAMGCoarse_AGG 5 1.0 3.8711e+00 1.0 0.00e+00 0.0 0.0e+00 >>>>>> 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 0 0 >>>>>> 0.00e+00 0 0.00e+00 0 >>>>>> PCGAMGProl_AGG 5 1.0 7.0748e-01 1.0 0.00e+00 0.0 0.0e+00 >>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>> 0.00e+00 0 0.00e+00 0 >>>>>> PCGAMGPOpt_AGG 5 1.0 1.2904e+00 1.0 2.14e+09 1.0 0.0e+00 >>>>>> 0.0e+00 0.0e+00 1 5 0 0 0 1 5 0 0 0 1661 29807 26 >>>>>> 7.15e+02 20 2.90e+02 99 >>>>>> GAMG: createProl 5 1.0 8.9489e+00 1.0 2.22e+09 1.0 0.0e+00 >>>>>> 0.0e+00 0.0e+00 4 6 0 0 0 4 6 0 0 0 249 29666 31 >>>>>> 7.50e+02 29 3.64e+02 96 >>>>>> Graph 10 1.0 3.0478e+00 1.0 8.19e+07 1.0 0.0e+00 >>>>>> 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 27 0 5 >>>>>> 3.49e+01 9 7.43e+01 0 >>>>>> MIS/Agg 5 1.0 4.1290e-01 1.0 0.00e+00 0.0 0.0e+00 >>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>> 0.00e+00 0 0.00e+00 0 >>>>>> SA: col data 5 1.0 1.9127e-02 1.0 0.00e+00 0.0 0.0e+00 >>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>> 0.00e+00 0 0.00e+00 0 >>>>>> SA: frmProl0 5 1.0 6.2662e-01 1.0 0.00e+00 0.0 0.0e+00 >>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>> 0.00e+00 0 0.00e+00 0 >>>>>> SA: smooth 5 1.0 4.9595e-01 1.0 1.21e+08 1.0 0.0e+00 >>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 244 2709 15 >>>>>> 1.97e+02 15 2.55e+02 90 >>>>>> GAMG: partLevel 5 1.0 4.7330e-01 1.0 6.98e+08 1.0 0.0e+00 >>>>>> 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 1475 4120 5 >>>>>> 1.78e+02 10 2.55e+02 100 >>>>>> PCGAMG Squ l00 1 1.0 2.6027e+00 1.0 0.00e+00 0.0 0.0e+00 >>>>>> 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 0 0 >>>>>> 0.00e+00 0 0.00e+00 0 >>>>>> PCGAMG Gal l00 1 1.0 3.8406e-01 1.0 5.48e+08 1.0 0.0e+00 >>>>>> 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 1426 4270 1 >>>>>> 1.48e+02 2 2.11e+02 100 >>>>>> PCGAMG Opt l00 1 1.0 2.4932e-01 1.0 7.20e+07 1.0 0.0e+00 >>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 289 2653 1 >>>>>> 6.41e+01 1 1.13e+02 100 >>>>>> PCGAMG Gal l01 1 1.0 6.6279e-02 1.0 1.09e+08 1.0 0.0e+00 >>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1645 3851 1 >>>>>> 2.40e+01 2 3.64e+01 100 >>>>>> PCGAMG Opt l01 1 1.0 2.9544e-02 1.0 7.15e+06 1.0 0.0e+00 >>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 242 1671 1 >>>>>> 4.84e+00 1 
1.23e+01 100 >>>>>> PCGAMG Gal l02 1 1.0 1.8874e-02 1.0 3.72e+07 1.0 0.0e+00 >>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1974 3636 1 >>>>>> 5.04e+00 2 6.58e+00 100 >>>>>> PCGAMG Opt l02 1 1.0 7.4353e-03 1.0 2.40e+06 1.0 0.0e+00 >>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 323 1457 1 >>>>>> 7.71e-01 1 2.30e+00 100 >>>>>> PCGAMG Gal l03 1 1.0 2.8479e-03 1.0 4.10e+06 1.0 0.0e+00 >>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1440 2266 1 >>>>>> 4.44e-01 2 5.51e-01 100 >>>>>> PCGAMG Opt l03 1 1.0 8.2684e-04 1.0 2.80e+05 1.0 0.0e+00 >>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 339 1667 1 >>>>>> 6.72e-02 1 2.03e-01 100 >>>>>> PCGAMG Gal l04 1 1.0 1.2238e-03 1.0 2.09e+05 1.0 0.0e+00 >>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 170 244 1 >>>>>> 2.05e-02 2 2.53e-02 100 >>>>>> PCGAMG Opt l04 1 1.0 4.1008e-04 1.0 1.77e+04 1.0 0.0e+00 >>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 43 165 1 >>>>>> 4.49e-03 1 1.19e-02 100 >>>>>> PCSetUp 2 1.0 9.9632e+00 1.0 4.95e+09 1.0 0.0e+00 >>>>>> 0.0e+00 0.0e+00 5 12 0 0 0 5 12 0 0 0 496 17826 55 >>>>>> 1.03e+03 45 6.54e+02 98 >>>>>> PCSetUpOnBlocks 44 1.0 9.9087e-04 1.0 2.88e+03 1.0 >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>>> The point of lazy initialization is to make it possible to run a >>>>>>> solve that doesn't use a GPU in PETSC_ARCH that supports GPUs, regardless >>>>>>> of whether a GPU is actually present. >>>>>>> >>>>>>> Fande Kong writes: >>>>>>> >>>>>>> > I spoke too soon. It seems that we have trouble creating >>>>>>> cuda/kokkos vecs >>>>>>> > now. Got Segmentation fault. >>>>>>> > >>>>>>> > Thanks, >>>>>>> > >>>>>>> > Fande >>>>>>> > >>>>>>> > Program received signal SIGSEGV, Segmentation fault. >>>>>>> > 0x00002aaab5558b11 in >>>>>>> > >>>>>>> Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::CUPMDeviceInternal::initialize >>>>>>> > (this=0x1) at >>>>>>> > >>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:54 >>>>>>> > 54 PetscErrorCode CUPMDevice::CUPMDeviceInternal::initialize() >>>>>>> noexcept >>>>>>> > Missing separate debuginfos, use: debuginfo-install >>>>>>> > bzip2-libs-1.0.6-13.el7.x86_64 elfutils-libelf-0.176-5.el7.x86_64 >>>>>>> > elfutils-libs-0.176-5.el7.x86_64 glibc-2.17-325.el7_9.x86_64 >>>>>>> > libX11-1.6.7-4.el7_9.x86_64 libXau-1.0.8-2.1.el7.x86_64 >>>>>>> > libattr-2.4.46-13.el7.x86_64 libcap-2.22-11.el7.x86_64 >>>>>>> > libibmad-5.4.0.MLNX20190423.1d917ae-0.1.49224.x86_64 >>>>>>> > libibumad-43.1.1.MLNX20200211.078947f-0.1.49224.x86_64 >>>>>>> > libibverbs-41mlnx1-OFED.4.9.0.0.7.49224.x86_64 >>>>>>> > libmlx4-41mlnx1-OFED.4.7.3.0.3.49224.x86_64 >>>>>>> > libmlx5-41mlnx1-OFED.4.9.0.1.2.49224.x86_64 >>>>>>> libnl3-3.2.28-4.el7.x86_64 >>>>>>> > librdmacm-41mlnx1-OFED.4.7.3.0.6.49224.x86_64 >>>>>>> > librxe-41mlnx1-OFED.4.4.2.4.6.49224.x86_64 libxcb-1.13-1.el7.x86_64 >>>>>>> > libxml2-2.9.1-6.el7_9.6.x86_64 numactl-libs-2.0.12-5.el7.x86_64 >>>>>>> > systemd-libs-219-78.el7_9.3.x86_64 xz-libs-5.2.2-1.el7.x86_64 >>>>>>> > zlib-1.2.7-19.el7_9.x86_64 >>>>>>> > (gdb) bt >>>>>>> > #0 0x00002aaab5558b11 in >>>>>>> > >>>>>>> Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::CUPMDeviceInternal::initialize >>>>>>> > (this=0x1) at >>>>>>> > >>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:54 >>>>>>> > #1 0x00002aaab5558db7 in >>>>>>> > Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::getDevice >>>>>>> > (this=this at entry=0x2aaab7f37b70 >>>>>>> > , device=0x115da00, id=-35, id at entry=-1) at >>>>>>> > >>>>>>> 
/home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:344 >>>>>>> > #2 0x00002aaab55577de in PetscDeviceCreate (type=type at entry >>>>>>> =PETSC_DEVICE_CUDA, >>>>>>> > devid=devid at entry=-1, device=device at entry=0x2aaab7f37b48 >>>>>>> > ) at >>>>>>> > >>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:107 >>>>>>> > #3 0x00002aaab5557b3a in >>>>>>> PetscDeviceInitializeDefaultDevice_Internal >>>>>>> > (type=type at entry=PETSC_DEVICE_CUDA, >>>>>>> defaultDeviceId=defaultDeviceId at entry=-1) >>>>>>> > at >>>>>>> > >>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:273 >>>>>>> > #4 0x00002aaab5557bf6 in PetscDeviceInitialize >>>>>>> > (type=type at entry=PETSC_DEVICE_CUDA) >>>>>>> > at >>>>>>> > >>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:234 >>>>>>> > #5 0x00002aaab5661fcd in VecCreate_SeqCUDA (V=0x115d150) at >>>>>>> > >>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/impls/seq/seqcuda/veccuda.c:244 >>>>>>> > #6 0x00002aaab5649b40 in VecSetType (vec=vec at entry=0x115d150, >>>>>>> > method=method at entry=0x2aaab70b45b8 "seqcuda") at >>>>>>> > >>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vecreg.c:93 >>>>>>> > #7 0x00002aaab579c33f in VecCreate_CUDA (v=0x115d150) at >>>>>>> > >>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/impls/mpi/mpicuda/ >>>>>>> > mpicuda.cu:214 >>>>>>> > #8 0x00002aaab5649b40 in VecSetType (vec=vec at entry=0x115d150, >>>>>>> > method=method at entry=0x7fffffff9260 "cuda") at >>>>>>> > >>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vecreg.c:93 >>>>>>> > #9 0x00002aaab5648bf1 in VecSetTypeFromOptions_Private >>>>>>> (vec=0x115d150, >>>>>>> > PetscOptionsObject=0x7fffffff9210) at >>>>>>> > >>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vector.c:1263 >>>>>>> > #10 VecSetFromOptions (vec=0x115d150) at >>>>>>> > >>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vector.c:1297 >>>>>>> > #11 0x00002aaab02ef227 in libMesh::PetscVector::init >>>>>>> > (this=0x11cd1a0, n=441, n_local=441, fast=false, >>>>>>> ptype=libMesh::PARALLEL) >>>>>>> > at >>>>>>> > >>>>>>> /home/kongf/workhome/sawtooth/moosegpu/scripts/../libmesh/installed/include/libmesh/petsc_vector.h:693 >>>>>>> > >>>>>>> > On Thu, Jan 20, 2022 at 1:09 PM Fande Kong >>>>>>> wrote: >>>>>>> > >>>>>>> >> Thanks, Jed, >>>>>>> >> >>>>>>> >> This worked! >>>>>>> >> >>>>>>> >> Fande >>>>>>> >> >>>>>>> >> On Wed, Jan 19, 2022 at 11:03 PM Jed Brown >>>>>>> wrote: >>>>>>> >> >>>>>>> >>> Fande Kong writes: >>>>>>> >>> >>>>>>> >>> > On Wed, Jan 19, 2022 at 11:39 AM Jacob Faibussowitsch < >>>>>>> >>> jacob.fai at gmail.com> >>>>>>> >>> > wrote: >>>>>>> >>> > >>>>>>> >>> >> Are you running on login nodes or compute nodes (I can?t seem >>>>>>> to tell >>>>>>> >>> from >>>>>>> >>> >> the configure.log)? >>>>>>> >>> >> >>>>>>> >>> > >>>>>>> >>> > I was compiling codes on login nodes, and running codes on >>>>>>> compute >>>>>>> >>> nodes. >>>>>>> >>> > Login nodes do not have GPUs, but compute nodes do have GPUs. >>>>>>> >>> > >>>>>>> >>> > Just to be clear, the same thing (code, machine) with >>>>>>> PETSc-3.16.1 >>>>>>> >>> worked >>>>>>> >>> > perfectly. I have this trouble with PETSc-main. 
>>>>>>> >>> >>>>>>> >>> I assume you can >>>>>>> >>> >>>>>>> >>> export PETSC_OPTIONS='-device_enable lazy' >>>>>>> >>> >>>>>>> >>> and it'll work. >>>>>>> >>> >>>>>>> >>> I think this should be the default. The main complaint is that >>>>>>> timing the >>>>>>> >>> first GPU-using event isn't accurate if it includes >>>>>>> initialization, but I >>>>>>> >>> think this is mostly hypothetical because you can't trust any >>>>>>> timing that >>>>>>> >>> doesn't preload in some form and the first GPU-using event will >>>>>>> almost >>>>>>> >>> always be something uninteresting so I think it will rarely lead >>>>>>> to >>>>>>> >>> confusion. Meanwhile, eager initialization is viscerally >>>>>>> disruptive for >>>>>>> >>> lots of people. >>>>>>> >>> >>>>>>> >> >>>>>>> >>>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ >>>>> >>>>> >>>>> >>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure_bad.log Type: application/octet-stream Size: 2944742 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure_good.log Type: application/octet-stream Size: 2862450 bytes Desc: not available URL: From sjtang1223 at gmail.com Tue Jan 25 22:37:50 2022 From: sjtang1223 at gmail.com (sijie tang) Date: Tue, 25 Jan 2022 21:37:50 -0700 Subject: [petsc-users] Hypre in Petsc Message-ID: <5B1A5E6E-A134-408B-B998-DDB4910BC56F@gmail.com> Hi, I have a question about, I want to use AMG as a solver in PETSc, but from the user manual, I find that AMG only can be used as a preconditioner. Am I right? So I want to use the boomerAMG in HYPRE as the solver, Can I use the function like VecHYPRE_IJVectorCreate (Petsc function) in these links? https://petsc.org/release/src/vec/vec/impls/hypre/vhyp.c . (It Creates hypre ijvector from Petsc vector) https://petsc.org/main/src/mat/impls/hypre/mhypre.c.html (It Creates hypre ijmatrix from Petsc matrix) But there is a another question: how to transfer the vector/matrix structure from hypre back petsc ? I don?t find any function for that work, Do you know what function can work, or do I implement this function by myself? Thanks, Sijie -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Tue Jan 25 23:34:23 2022 From: jed at jedbrown.org (Jed Brown) Date: Tue, 25 Jan 2022 22:34:23 -0700 Subject: [petsc-users] Hypre in Petsc In-Reply-To: <5B1A5E6E-A134-408B-B998-DDB4910BC56F@gmail.com> References: <5B1A5E6E-A134-408B-B998-DDB4910BC56F@gmail.com> Message-ID: <87ee4vawm8.fsf@jedbrown.org> "multigrid as a solver" generally means stationary (Richardson) iterations: -ksp_type richardson -pc_type hypre This might not converge, and you'll almost certainly see faster convergence if you use it with a Krylov method. -ksp_type cg -pc_type hypre if your problem is SPD. sijie tang writes: > Hi, > > I have a question about, I want to use AMG as a solver in PETSc, but from the user manual, I find that AMG only can be used as a preconditioner. Am I right? > > So I want to use the boomerAMG in HYPRE as the solver, Can I use the function like VecHYPRE_IJVectorCreate (Petsc function) in these links? 
> > https://petsc.org/release/src/vec/vec/impls/hypre/vhyp.c . (It Creates hypre ijvector from Petsc vector) > https://petsc.org/main/src/mat/impls/hypre/mhypre.c.html (It Creates hypre ijmatrix from Petsc matrix) > > > But there is a another question: how to transfer the vector/matrix structure from hypre back petsc ? > > I don?t find any function for that work, Do you know what function can work, or do I implement this function by myself? > > > > Thanks, > Sijie From junchao.zhang at gmail.com Tue Jan 25 23:41:37 2022 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Tue, 25 Jan 2022 23:41:37 -0600 Subject: [petsc-users] Cannot eagerly initialize cuda, as doing so results in cuda error 35 (cudaErrorInsufficientDriver) : CUDA driver version is insufficient for CUDA runtime version In-Reply-To: References: <233D0F20-AD95-481B-B862-353D5789C556@gmail.com> <87czkn9c64.fsf@jedbrown.org> <875yqe7zip.fsf@jedbrown.org> Message-ID: Fande, do you have a configure.log with current petsc/main? --Junchao Zhang On Tue, Jan 25, 2022 at 10:30 PM Fande Kong wrote: > Hi Junchao, > > I attached a "bad" configure log and a "good" configure log. > > The "bad" one was on produced at 246ba74192519a5f34fb6e227d1c64364e19ce2c > > and the "good" one at 384645a00975869a1aacbd3169de62ba40cad683 > > This good hash is the last good hash that is just the right before the bad > one. > > I think you could do a comparison between these two logs, and check what > the differences were. > > Thanks, > > Fande > > On Tue, Jan 25, 2022 at 8:21 PM Junchao Zhang > wrote: > >> Fande, could you send the configure.log that works (i.e., before this >> offending commit)? >> --Junchao Zhang >> >> >> On Tue, Jan 25, 2022 at 8:21 PM Fande Kong wrote: >> >>> Not sure if this is helpful. I did "git bisect", and here was the result: >>> >>> [kongf at sawtooth2 petsc]$ git bisect bad >>> 246ba74192519a5f34fb6e227d1c64364e19ce2c is the first bad commit >>> commit 246ba74192519a5f34fb6e227d1c64364e19ce2c >>> Author: Junchao Zhang >>> Date: Wed Oct 13 05:32:43 2021 +0000 >>> >>> Config: fix CUDA library and header dirs >>> >>> :040000 040000 187c86055adb80f53c1d0565a8888704fec43a96 >>> ea1efd7f594fd5e8df54170bc1bc7b00f35e4d5f M config >>> >>> >>> Started from this commit, and GPU did not work for me on our HPC >>> >>> Thanks, >>> Fande >>> >>> On Tue, Jan 25, 2022 at 7:18 PM Fande Kong wrote: >>> >>>> >>>> >>>> On Tue, Jan 25, 2022 at 9:04 AM Jacob Faibussowitsch < >>>> jacob.fai at gmail.com> wrote: >>>> >>>>> Configure should not have an impact here I think. The reason I had you >>>>> run `cudaGetDeviceCount()` is because this is the CUDA call (and in fact >>>>> the only CUDA call) in the initialization sequence that returns the error >>>>> code. There should be no prior CUDA calls. Maybe this is a problem with >>>>> oversubscribing GPU?s? In the runs that crash, how many ranks are using any >>>>> given GPU at once? Maybe MPS is required. >>>>> >>>> >>>> I used one MPI rank. 
>>>> >>>> Fande >>>> >>>> >>>> >>>>> >>>>> Best regards, >>>>> >>>>> Jacob Faibussowitsch >>>>> (Jacob Fai - booss - oh - vitch) >>>>> >>>>> On Jan 21, 2022, at 12:01, Fande Kong wrote: >>>>> >>>>> Thanks Jacob, >>>>> >>>>> On Thu, Jan 20, 2022 at 6:25 PM Jacob Faibussowitsch < >>>>> jacob.fai at gmail.com> wrote: >>>>> >>>>>> Segfault is caused by the following check at >>>>>> src/sys/objects/device/impls/cupm/cupmdevice.cxx:349 being a >>>>>> PetscUnlikelyDebug() rather than just PetscUnlikely(): >>>>>> >>>>>> ``` >>>>>> if (PetscUnlikelyDebug(_defaultDevice < 0)) { // _defaultDevice is in >>>>>> fact < 0 here and uncaught >>>>>> ``` >>>>>> >>>>>> To clarify: >>>>>> >>>>>> ?lazy? initialization is not that lazy after all, it still does some >>>>>> 50% of the initialization that ?eager? initialization does. It stops short >>>>>> initializing the CUDA runtime, checking CUDA aware MPI, gathering device >>>>>> data, and initializing cublas and friends. Lazy also importantly swallows >>>>>> any errors that crop up during initialization, storing the resulting error >>>>>> code for later (specifically _defaultDevice = -init_error_value;). >>>>>> >>>>>> So whether you initialize lazily or eagerly makes no difference here, >>>>>> as _defaultDevice will always contain -35. >>>>>> >>>>>> The bigger question is why cudaGetDeviceCount() is returning >>>>>> cudaErrorInsufficientDriver. Can you compile and run >>>>>> >>>>>> ``` >>>>>> #include >>>>>> >>>>>> int main() >>>>>> { >>>>>> int ndev; >>>>>> return cudaGetDeviceCount(&ndev): >>>>>> } >>>>>> ``` >>>>>> >>>>>> Then show the value of "echo $??? >>>>>> >>>>> >>>>> Modify your code a little to get more information. >>>>> >>>>> #include >>>>> #include >>>>> >>>>> int main() >>>>> { >>>>> int ndev; >>>>> int error = cudaGetDeviceCount(&ndev); >>>>> printf("ndev %d \n", ndev); >>>>> printf("error %d \n", error); >>>>> return 0; >>>>> } >>>>> >>>>> Results: >>>>> >>>>> $ ./a.out >>>>> ndev 4 >>>>> error 0 >>>>> >>>>> >>>>> I have not read the PETSc cuda initialization code yet. If I need to >>>>> guess at what was happening. I will naively think that PETSc did not get >>>>> correct GPU information in the configuration because the compiler node does >>>>> not have GPUs, and there was no way to get any GPU device information. >>>>> >>>>> >>>>> During the runtime on GPU nodes, PETSc might have incorrect >>>>> information grabbed during configuration and had this kind of false error >>>>> message. >>>>> >>>>> Thanks, >>>>> >>>>> Fande >>>>> >>>>> >>>>> >>>>>> >>>>>> Best regards, >>>>>> >>>>>> Jacob Faibussowitsch >>>>>> (Jacob Fai - booss - oh - vitch) >>>>>> >>>>>> On Jan 20, 2022, at 17:47, Matthew Knepley wrote: >>>>>> >>>>>> On Thu, Jan 20, 2022 at 6:44 PM Fande Kong >>>>>> wrote: >>>>>> >>>>>>> Thanks, Jed >>>>>>> >>>>>>> On Thu, Jan 20, 2022 at 4:34 PM Jed Brown wrote: >>>>>>> >>>>>>>> You can't create CUDA or Kokkos Vecs if you're running on a node >>>>>>>> without a GPU. >>>>>>> >>>>>>> >>>>>>> I am running the code on compute nodes that do have GPUs. >>>>>>> >>>>>> >>>>>> If you are actually running on GPUs, why would you need lazy >>>>>> initialization? It would not break with GPUs present. >>>>>> >>>>>> Matt >>>>>> >>>>>> >>>>>>> With PETSc-3.16.1, I got good speedup by running GAMG on GPUs. >>>>>>> That might be a bug of PETSc-main. 
>>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Fande >>>>>>> >>>>>>> >>>>>>> >>>>>>> KSPSetUp 13 1.0 6.4400e-01 1.0 2.02e+09 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 0 5 0 0 0 0 5 0 0 0 3140 64630 15 >>>>>>> 1.05e+02 5 3.49e+01 100 >>>>>>> KSPSolve 1 1.0 1.0109e+00 1.0 3.49e+10 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 0 87 0 0 0 0 87 0 0 0 34522 69556 4 >>>>>>> 4.35e-03 1 2.38e-03 100 >>>>>>> KSPGMRESOrthog 142 1.0 1.2674e-01 1.0 1.06e+10 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 0 27 0 0 0 0 27 0 0 0 83755 87801 0 >>>>>>> 0.00e+00 0 0.00e+00 100 >>>>>>> SNESSolve 1 1.0 4.4402e+01 1.0 4.00e+10 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 21100 0 0 0 21100 0 0 0 901 51365 57 >>>>>>> 1.10e+03 52 8.78e+02 100 >>>>>>> SNESSetUp 1 1.0 3.9101e-05 1.0 0.00e+00 0.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>> SNESFunctionEval 2 1.0 1.7097e+01 1.0 1.60e+07 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 8 0 0 0 0 8 0 0 0 0 1 0 0 >>>>>>> 0.00e+00 6 1.92e+02 0 >>>>>>> SNESJacobianEval 1 1.0 1.6213e+01 1.0 2.80e+07 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 8 0 0 0 0 8 0 0 0 0 2 0 0 >>>>>>> 0.00e+00 1 3.20e+01 0 >>>>>>> SNESLineSearch 1 1.0 8.5582e+00 1.0 1.24e+08 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 4 0 0 0 0 4 0 0 0 0 14 64153 1 >>>>>>> 3.20e+01 3 9.61e+01 94 >>>>>>> PCGAMGGraph_AGG 5 1.0 3.0509e+00 1.0 8.19e+07 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 27 0 5 >>>>>>> 3.49e+01 9 7.43e+01 0 >>>>>>> PCGAMGCoarse_AGG 5 1.0 3.8711e+00 1.0 0.00e+00 0.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 0 0 >>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>> PCGAMGProl_AGG 5 1.0 7.0748e-01 1.0 0.00e+00 0.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>> PCGAMGPOpt_AGG 5 1.0 1.2904e+00 1.0 2.14e+09 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 1 5 0 0 0 1 5 0 0 0 1661 29807 26 >>>>>>> 7.15e+02 20 2.90e+02 99 >>>>>>> GAMG: createProl 5 1.0 8.9489e+00 1.0 2.22e+09 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 4 6 0 0 0 4 6 0 0 0 249 29666 31 >>>>>>> 7.50e+02 29 3.64e+02 96 >>>>>>> Graph 10 1.0 3.0478e+00 1.0 8.19e+07 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 27 0 5 >>>>>>> 3.49e+01 9 7.43e+01 0 >>>>>>> MIS/Agg 5 1.0 4.1290e-01 1.0 0.00e+00 0.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>> SA: col data 5 1.0 1.9127e-02 1.0 0.00e+00 0.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>> SA: frmProl0 5 1.0 6.2662e-01 1.0 0.00e+00 0.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>> SA: smooth 5 1.0 4.9595e-01 1.0 1.21e+08 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 244 2709 15 >>>>>>> 1.97e+02 15 2.55e+02 90 >>>>>>> GAMG: partLevel 5 1.0 4.7330e-01 1.0 6.98e+08 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 1475 4120 5 >>>>>>> 1.78e+02 10 2.55e+02 100 >>>>>>> PCGAMG Squ l00 1 1.0 2.6027e+00 1.0 0.00e+00 0.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 0 0 >>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>> PCGAMG Gal l00 1 1.0 3.8406e-01 1.0 5.48e+08 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 1426 4270 1 >>>>>>> 1.48e+02 2 2.11e+02 100 >>>>>>> PCGAMG Opt l00 1 1.0 2.4932e-01 1.0 7.20e+07 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 289 2653 1 >>>>>>> 6.41e+01 1 1.13e+02 100 >>>>>>> PCGAMG Gal l01 1 1.0 6.6279e-02 1.0 1.09e+08 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1645 3851 1 >>>>>>> 2.40e+01 2 3.64e+01 100 >>>>>>> PCGAMG Opt l01 1 1.0 2.9544e-02 1.0 7.15e+06 1.0 0.0e+00 
>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 242 1671 1 >>>>>>> 4.84e+00 1 1.23e+01 100 >>>>>>> PCGAMG Gal l02 1 1.0 1.8874e-02 1.0 3.72e+07 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1974 3636 1 >>>>>>> 5.04e+00 2 6.58e+00 100 >>>>>>> PCGAMG Opt l02 1 1.0 7.4353e-03 1.0 2.40e+06 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 323 1457 1 >>>>>>> 7.71e-01 1 2.30e+00 100 >>>>>>> PCGAMG Gal l03 1 1.0 2.8479e-03 1.0 4.10e+06 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1440 2266 1 >>>>>>> 4.44e-01 2 5.51e-01 100 >>>>>>> PCGAMG Opt l03 1 1.0 8.2684e-04 1.0 2.80e+05 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 339 1667 1 >>>>>>> 6.72e-02 1 2.03e-01 100 >>>>>>> PCGAMG Gal l04 1 1.0 1.2238e-03 1.0 2.09e+05 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 170 244 1 >>>>>>> 2.05e-02 2 2.53e-02 100 >>>>>>> PCGAMG Opt l04 1 1.0 4.1008e-04 1.0 1.77e+04 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 43 165 1 >>>>>>> 4.49e-03 1 1.19e-02 100 >>>>>>> PCSetUp 2 1.0 9.9632e+00 1.0 4.95e+09 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 5 12 0 0 0 5 12 0 0 0 496 17826 55 >>>>>>> 1.03e+03 45 6.54e+02 98 >>>>>>> PCSetUpOnBlocks 44 1.0 9.9087e-04 1.0 2.88e+03 1.0 >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>>> The point of lazy initialization is to make it possible to run a >>>>>>>> solve that doesn't use a GPU in PETSC_ARCH that supports GPUs, regardless >>>>>>>> of whether a GPU is actually present. >>>>>>>> >>>>>>>> Fande Kong writes: >>>>>>>> >>>>>>>> > I spoke too soon. It seems that we have trouble creating >>>>>>>> cuda/kokkos vecs >>>>>>>> > now. Got Segmentation fault. >>>>>>>> > >>>>>>>> > Thanks, >>>>>>>> > >>>>>>>> > Fande >>>>>>>> > >>>>>>>> > Program received signal SIGSEGV, Segmentation fault. >>>>>>>> > 0x00002aaab5558b11 in >>>>>>>> > >>>>>>>> Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::CUPMDeviceInternal::initialize >>>>>>>> > (this=0x1) at >>>>>>>> > >>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:54 >>>>>>>> > 54 PetscErrorCode CUPMDevice::CUPMDeviceInternal::initialize() >>>>>>>> noexcept >>>>>>>> > Missing separate debuginfos, use: debuginfo-install >>>>>>>> > bzip2-libs-1.0.6-13.el7.x86_64 elfutils-libelf-0.176-5.el7.x86_64 >>>>>>>> > elfutils-libs-0.176-5.el7.x86_64 glibc-2.17-325.el7_9.x86_64 >>>>>>>> > libX11-1.6.7-4.el7_9.x86_64 libXau-1.0.8-2.1.el7.x86_64 >>>>>>>> > libattr-2.4.46-13.el7.x86_64 libcap-2.22-11.el7.x86_64 >>>>>>>> > libibmad-5.4.0.MLNX20190423.1d917ae-0.1.49224.x86_64 >>>>>>>> > libibumad-43.1.1.MLNX20200211.078947f-0.1.49224.x86_64 >>>>>>>> > libibverbs-41mlnx1-OFED.4.9.0.0.7.49224.x86_64 >>>>>>>> > libmlx4-41mlnx1-OFED.4.7.3.0.3.49224.x86_64 >>>>>>>> > libmlx5-41mlnx1-OFED.4.9.0.1.2.49224.x86_64 >>>>>>>> libnl3-3.2.28-4.el7.x86_64 >>>>>>>> > librdmacm-41mlnx1-OFED.4.7.3.0.6.49224.x86_64 >>>>>>>> > librxe-41mlnx1-OFED.4.4.2.4.6.49224.x86_64 >>>>>>>> libxcb-1.13-1.el7.x86_64 >>>>>>>> > libxml2-2.9.1-6.el7_9.6.x86_64 numactl-libs-2.0.12-5.el7.x86_64 >>>>>>>> > systemd-libs-219-78.el7_9.3.x86_64 xz-libs-5.2.2-1.el7.x86_64 >>>>>>>> > zlib-1.2.7-19.el7_9.x86_64 >>>>>>>> > (gdb) bt >>>>>>>> > #0 0x00002aaab5558b11 in >>>>>>>> > >>>>>>>> Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::CUPMDeviceInternal::initialize >>>>>>>> > (this=0x1) at >>>>>>>> > >>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:54 >>>>>>>> > #1 0x00002aaab5558db7 in >>>>>>>> > Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::getDevice >>>>>>>> > 
(this=this at entry=0x2aaab7f37b70 >>>>>>>> > , device=0x115da00, id=-35, id at entry=-1) at >>>>>>>> > >>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:344 >>>>>>>> > #2 0x00002aaab55577de in PetscDeviceCreate (type=type at entry >>>>>>>> =PETSC_DEVICE_CUDA, >>>>>>>> > devid=devid at entry=-1, device=device at entry=0x2aaab7f37b48 >>>>>>>> > ) at >>>>>>>> > >>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:107 >>>>>>>> > #3 0x00002aaab5557b3a in >>>>>>>> PetscDeviceInitializeDefaultDevice_Internal >>>>>>>> > (type=type at entry=PETSC_DEVICE_CUDA, >>>>>>>> defaultDeviceId=defaultDeviceId at entry=-1) >>>>>>>> > at >>>>>>>> > >>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:273 >>>>>>>> > #4 0x00002aaab5557bf6 in PetscDeviceInitialize >>>>>>>> > (type=type at entry=PETSC_DEVICE_CUDA) >>>>>>>> > at >>>>>>>> > >>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:234 >>>>>>>> > #5 0x00002aaab5661fcd in VecCreate_SeqCUDA (V=0x115d150) at >>>>>>>> > >>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/impls/seq/seqcuda/veccuda.c:244 >>>>>>>> > #6 0x00002aaab5649b40 in VecSetType (vec=vec at entry=0x115d150, >>>>>>>> > method=method at entry=0x2aaab70b45b8 "seqcuda") at >>>>>>>> > >>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vecreg.c:93 >>>>>>>> > #7 0x00002aaab579c33f in VecCreate_CUDA (v=0x115d150) at >>>>>>>> > >>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/impls/mpi/mpicuda/ >>>>>>>> > mpicuda.cu:214 >>>>>>>> > #8 0x00002aaab5649b40 in VecSetType (vec=vec at entry=0x115d150, >>>>>>>> > method=method at entry=0x7fffffff9260 "cuda") at >>>>>>>> > >>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vecreg.c:93 >>>>>>>> > #9 0x00002aaab5648bf1 in VecSetTypeFromOptions_Private >>>>>>>> (vec=0x115d150, >>>>>>>> > PetscOptionsObject=0x7fffffff9210) at >>>>>>>> > >>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vector.c:1263 >>>>>>>> > #10 VecSetFromOptions (vec=0x115d150) at >>>>>>>> > >>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vector.c:1297 >>>>>>>> > #11 0x00002aaab02ef227 in libMesh::PetscVector::init >>>>>>>> > (this=0x11cd1a0, n=441, n_local=441, fast=false, >>>>>>>> ptype=libMesh::PARALLEL) >>>>>>>> > at >>>>>>>> > >>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/scripts/../libmesh/installed/include/libmesh/petsc_vector.h:693 >>>>>>>> > >>>>>>>> > On Thu, Jan 20, 2022 at 1:09 PM Fande Kong >>>>>>>> wrote: >>>>>>>> > >>>>>>>> >> Thanks, Jed, >>>>>>>> >> >>>>>>>> >> This worked! >>>>>>>> >> >>>>>>>> >> Fande >>>>>>>> >> >>>>>>>> >> On Wed, Jan 19, 2022 at 11:03 PM Jed Brown >>>>>>>> wrote: >>>>>>>> >> >>>>>>>> >>> Fande Kong writes: >>>>>>>> >>> >>>>>>>> >>> > On Wed, Jan 19, 2022 at 11:39 AM Jacob Faibussowitsch < >>>>>>>> >>> jacob.fai at gmail.com> >>>>>>>> >>> > wrote: >>>>>>>> >>> > >>>>>>>> >>> >> Are you running on login nodes or compute nodes (I can?t >>>>>>>> seem to tell >>>>>>>> >>> from >>>>>>>> >>> >> the configure.log)? >>>>>>>> >>> >> >>>>>>>> >>> > >>>>>>>> >>> > I was compiling codes on login nodes, and running codes on >>>>>>>> compute >>>>>>>> >>> nodes. >>>>>>>> >>> > Login nodes do not have GPUs, but compute nodes do have GPUs. 
>>>>>>>> >>> > >>>>>>>> >>> > Just to be clear, the same thing (code, machine) with >>>>>>>> PETSc-3.16.1 >>>>>>>> >>> worked >>>>>>>> >>> > perfectly. I have this trouble with PETSc-main. >>>>>>>> >>> >>>>>>>> >>> I assume you can >>>>>>>> >>> >>>>>>>> >>> export PETSC_OPTIONS='-device_enable lazy' >>>>>>>> >>> >>>>>>>> >>> and it'll work. >>>>>>>> >>> >>>>>>>> >>> I think this should be the default. The main complaint is that >>>>>>>> timing the >>>>>>>> >>> first GPU-using event isn't accurate if it includes >>>>>>>> initialization, but I >>>>>>>> >>> think this is mostly hypothetical because you can't trust any >>>>>>>> timing that >>>>>>>> >>> doesn't preload in some form and the first GPU-using event will >>>>>>>> almost >>>>>>>> >>> always be something uninteresting so I think it will rarely >>>>>>>> lead to >>>>>>>> >>> confusion. Meanwhile, eager initialization is viscerally >>>>>>>> disruptive for >>>>>>>> >>> lots of people. >>>>>>>> >>> >>>>>>>> >> >>>>>>>> >>>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their >>>>>> experiments is infinitely more interesting than any results to which their >>>>>> experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>> >>>>>> >>>>>> >>>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Tue Jan 25 23:59:21 2022 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 26 Jan 2022 00:59:21 -0500 Subject: [petsc-users] Cannot eagerly initialize cuda, as doing so results in cuda error 35 (cudaErrorInsufficientDriver) : CUDA driver version is insufficient for CUDA runtime version In-Reply-To: References: <233D0F20-AD95-481B-B862-353D5789C556@gmail.com> <87czkn9c64.fsf@jedbrown.org> <875yqe7zip.fsf@jedbrown.org> Message-ID: bad has extra -L/apps/local/spack/software/gcc-7.5.0/cuda-10.1.243-v4ymjqcrr7f72qfiuzsstuy5jiajbuey/lib64/stubs -lcuda good does not. Try removing the stubs directory and -lcuda from the bad $PETSC_ARCH/lib/petsc/conf/variables and likely the bad will start working. Barry I never liked the stubs stuff. > On Jan 25, 2022, at 11:29 PM, Fande Kong wrote: > > Hi Junchao, > > I attached a "bad" configure log and a "good" configure log. > > The "bad" one was on produced at 246ba74192519a5f34fb6e227d1c64364e19ce2c > > and the "good" one at 384645a00975869a1aacbd3169de62ba40cad683 > > This good hash is the last good hash that is just the right before the bad one. > > I think you could do a comparison between these two logs, and check what the differences were. > > Thanks, > > Fande > > On Tue, Jan 25, 2022 at 8:21 PM Junchao Zhang > wrote: > Fande, could you send the configure.log that works (i.e., before this offending commit)? > --Junchao Zhang > > > On Tue, Jan 25, 2022 at 8:21 PM Fande Kong > wrote: > Not sure if this is helpful. 
I did "git bisect", and here was the result: > > [kongf at sawtooth2 petsc]$ git bisect bad > 246ba74192519a5f34fb6e227d1c64364e19ce2c is the first bad commit > commit 246ba74192519a5f34fb6e227d1c64364e19ce2c > Author: Junchao Zhang > > Date: Wed Oct 13 05:32:43 2021 +0000 > > Config: fix CUDA library and header dirs > > :040000 040000 187c86055adb80f53c1d0565a8888704fec43a96 ea1efd7f594fd5e8df54170bc1bc7b00f35e4d5f M config > > > Started from this commit, and GPU did not work for me on our HPC > > Thanks, > Fande > > On Tue, Jan 25, 2022 at 7:18 PM Fande Kong > wrote: > > > On Tue, Jan 25, 2022 at 9:04 AM Jacob Faibussowitsch > wrote: > Configure should not have an impact here I think. The reason I had you run `cudaGetDeviceCount()` is because this is the CUDA call (and in fact the only CUDA call) in the initialization sequence that returns the error code. There should be no prior CUDA calls. Maybe this is a problem with oversubscribing GPU?s? In the runs that crash, how many ranks are using any given GPU at once? Maybe MPS is required. > > I used one MPI rank. > > Fande > > > > Best regards, > > Jacob Faibussowitsch > (Jacob Fai - booss - oh - vitch) > >> On Jan 21, 2022, at 12:01, Fande Kong > wrote: >> >> Thanks Jacob, >> >> On Thu, Jan 20, 2022 at 6:25 PM Jacob Faibussowitsch > wrote: >> Segfault is caused by the following check at src/sys/objects/device/impls/cupm/cupmdevice.cxx:349 being a PetscUnlikelyDebug() rather than just PetscUnlikely(): >> >> ``` >> if (PetscUnlikelyDebug(_defaultDevice < 0)) { // _defaultDevice is in fact < 0 here and uncaught >> ``` >> >> To clarify: >> >> ?lazy? initialization is not that lazy after all, it still does some 50% of the initialization that ?eager? initialization does. It stops short initializing the CUDA runtime, checking CUDA aware MPI, gathering device data, and initializing cublas and friends. Lazy also importantly swallows any errors that crop up during initialization, storing the resulting error code for later (specifically _defaultDevice = -init_error_value;). >> >> So whether you initialize lazily or eagerly makes no difference here, as _defaultDevice will always contain -35. >> >> The bigger question is why cudaGetDeviceCount() is returning cudaErrorInsufficientDriver. Can you compile and run >> >> ``` >> #include >> >> int main() >> { >> int ndev; >> return cudaGetDeviceCount(&ndev): >> } >> ``` >> >> Then show the value of "echo $??? >> >> Modify your code a little to get more information. >> >> #include >> #include >> >> int main() >> { >> int ndev; >> int error = cudaGetDeviceCount(&ndev); >> printf("ndev %d \n", ndev); >> printf("error %d \n", error); >> return 0; >> } >> >> Results: >> >> $ ./a.out >> ndev 4 >> error 0 >> >> >> I have not read the PETSc cuda initialization code yet. If I need to guess at what was happening. I will naively think that PETSc did not get correct GPU information in the configuration because the compiler node does not have GPUs, and there was no way to get any GPU device information. >> >> During the runtime on GPU nodes, PETSc might have incorrect information grabbed during configuration and had this kind of false error message. 
>> >> Thanks, >> >> Fande >> >> >> >> Best regards, >> >> Jacob Faibussowitsch >> (Jacob Fai - booss - oh - vitch) >> >>> On Jan 20, 2022, at 17:47, Matthew Knepley > wrote: >>> >>> On Thu, Jan 20, 2022 at 6:44 PM Fande Kong > wrote: >>> Thanks, Jed >>> >>> On Thu, Jan 20, 2022 at 4:34 PM Jed Brown > wrote: >>> You can't create CUDA or Kokkos Vecs if you're running on a node without a GPU. >>> >>> I am running the code on compute nodes that do have GPUs. >>> >>> If you are actually running on GPUs, why would you need lazy initialization? It would not break with GPUs present. >>> >>> Matt >>> >>> With PETSc-3.16.1, I got good speedup by running GAMG on GPUs. That might be a bug of PETSc-main. >>> >>> Thanks, >>> >>> Fande >>> >>> >>> >>> KSPSetUp 13 1.0 6.4400e-01 1.0 2.02e+09 1.0 0.0e+00 0.0e+00 0.0e+00 0 5 0 0 0 0 5 0 0 0 3140 64630 15 1.05e+02 5 3.49e+01 100 >>> KSPSolve 1 1.0 1.0109e+00 1.0 3.49e+10 1.0 0.0e+00 0.0e+00 0.0e+00 0 87 0 0 0 0 87 0 0 0 34522 69556 4 4.35e-03 1 2.38e-03 100 >>> KSPGMRESOrthog 142 1.0 1.2674e-01 1.0 1.06e+10 1.0 0.0e+00 0.0e+00 0.0e+00 0 27 0 0 0 0 27 0 0 0 83755 87801 0 0.00e+00 0 0.00e+00 100 >>> SNESSolve 1 1.0 4.4402e+01 1.0 4.00e+10 1.0 0.0e+00 0.0e+00 0.0e+00 21100 0 0 0 21100 0 0 0 901 51365 57 1.10e+03 52 8.78e+02 100 >>> SNESSetUp 1 1.0 3.9101e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 >>> SNESFunctionEval 2 1.0 1.7097e+01 1.0 1.60e+07 1.0 0.0e+00 0.0e+00 0.0e+00 8 0 0 0 0 8 0 0 0 0 1 0 0 0.00e+00 6 1.92e+02 0 >>> SNESJacobianEval 1 1.0 1.6213e+01 1.0 2.80e+07 1.0 0.0e+00 0.0e+00 0.0e+00 8 0 0 0 0 8 0 0 0 0 2 0 0 0.00e+00 1 3.20e+01 0 >>> SNESLineSearch 1 1.0 8.5582e+00 1.0 1.24e+08 1.0 0.0e+00 0.0e+00 0.0e+00 4 0 0 0 0 4 0 0 0 0 14 64153 1 3.20e+01 3 9.61e+01 94 >>> PCGAMGGraph_AGG 5 1.0 3.0509e+00 1.0 8.19e+07 1.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 27 0 5 3.49e+01 9 7.43e+01 0 >>> PCGAMGCoarse_AGG 5 1.0 3.8711e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 >>> PCGAMGProl_AGG 5 1.0 7.0748e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 >>> PCGAMGPOpt_AGG 5 1.0 1.2904e+00 1.0 2.14e+09 1.0 0.0e+00 0.0e+00 0.0e+00 1 5 0 0 0 1 5 0 0 0 1661 29807 26 7.15e+02 20 2.90e+02 99 >>> GAMG: createProl 5 1.0 8.9489e+00 1.0 2.22e+09 1.0 0.0e+00 0.0e+00 0.0e+00 4 6 0 0 0 4 6 0 0 0 249 29666 31 7.50e+02 29 3.64e+02 96 >>> Graph 10 1.0 3.0478e+00 1.0 8.19e+07 1.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 27 0 5 3.49e+01 9 7.43e+01 0 >>> MIS/Agg 5 1.0 4.1290e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 >>> SA: col data 5 1.0 1.9127e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 >>> SA: frmProl0 5 1.0 6.2662e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 >>> SA: smooth 5 1.0 4.9595e-01 1.0 1.21e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 244 2709 15 1.97e+02 15 2.55e+02 90 >>> GAMG: partLevel 5 1.0 4.7330e-01 1.0 6.98e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 1475 4120 5 1.78e+02 10 2.55e+02 100 >>> PCGAMG Squ l00 1 1.0 2.6027e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 >>> PCGAMG Gal l00 1 1.0 3.8406e-01 1.0 5.48e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 1426 4270 1 1.48e+02 2 2.11e+02 100 >>> PCGAMG Opt l00 1 1.0 2.4932e-01 1.0 7.20e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 289 2653 1 6.41e+01 1 
1.13e+02 100 >>> PCGAMG Gal l01 1 1.0 6.6279e-02 1.0 1.09e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1645 3851 1 2.40e+01 2 3.64e+01 100 >>> PCGAMG Opt l01 1 1.0 2.9544e-02 1.0 7.15e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 242 1671 1 4.84e+00 1 1.23e+01 100 >>> PCGAMG Gal l02 1 1.0 1.8874e-02 1.0 3.72e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1974 3636 1 5.04e+00 2 6.58e+00 100 >>> PCGAMG Opt l02 1 1.0 7.4353e-03 1.0 2.40e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 323 1457 1 7.71e-01 1 2.30e+00 100 >>> PCGAMG Gal l03 1 1.0 2.8479e-03 1.0 4.10e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1440 2266 1 4.44e-01 2 5.51e-01 100 >>> PCGAMG Opt l03 1 1.0 8.2684e-04 1.0 2.80e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 339 1667 1 6.72e-02 1 2.03e-01 100 >>> PCGAMG Gal l04 1 1.0 1.2238e-03 1.0 2.09e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 170 244 1 2.05e-02 2 2.53e-02 100 >>> PCGAMG Opt l04 1 1.0 4.1008e-04 1.0 1.77e+04 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 43 165 1 4.49e-03 1 1.19e-02 100 >>> PCSetUp 2 1.0 9.9632e+00 1.0 4.95e+09 1.0 0.0e+00 0.0e+00 0.0e+00 5 12 0 0 0 5 12 0 0 0 496 17826 55 1.03e+03 45 6.54e+02 98 >>> PCSetUpOnBlocks 44 1.0 9.9087e-04 1.0 2.88e+03 1.0 >>> >>> >>> >>> >>> >>> The point of lazy initialization is to make it possible to run a solve that doesn't use a GPU in PETSC_ARCH that supports GPUs, regardless of whether a GPU is actually present. >>> >>> Fande Kong > writes: >>> >>> > I spoke too soon. It seems that we have trouble creating cuda/kokkos vecs >>> > now. Got Segmentation fault. >>> > >>> > Thanks, >>> > >>> > Fande >>> > >>> > Program received signal SIGSEGV, Segmentation fault. >>> > 0x00002aaab5558b11 in >>> > Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::CUPMDeviceInternal::initialize >>> > (this=0x1) at >>> > /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:54 >>> > 54 PetscErrorCode CUPMDevice::CUPMDeviceInternal::initialize() noexcept >>> > Missing separate debuginfos, use: debuginfo-install >>> > bzip2-libs-1.0.6-13.el7.x86_64 elfutils-libelf-0.176-5.el7.x86_64 >>> > elfutils-libs-0.176-5.el7.x86_64 glibc-2.17-325.el7_9.x86_64 >>> > libX11-1.6.7-4.el7_9.x86_64 libXau-1.0.8-2.1.el7.x86_64 >>> > libattr-2.4.46-13.el7.x86_64 libcap-2.22-11.el7.x86_64 >>> > libibmad-5.4.0.MLNX20190423.1d917ae-0.1.49224.x86_64 >>> > libibumad-43.1.1.MLNX20200211.078947f-0.1.49224.x86_64 >>> > libibverbs-41mlnx1-OFED.4.9.0.0.7.49224.x86_64 >>> > libmlx4-41mlnx1-OFED.4.7.3.0.3.49224.x86_64 >>> > libmlx5-41mlnx1-OFED.4.9.0.1.2.49224.x86_64 libnl3-3.2.28-4.el7.x86_64 >>> > librdmacm-41mlnx1-OFED.4.7.3.0.6.49224.x86_64 >>> > librxe-41mlnx1-OFED.4.4.2.4.6.49224.x86_64 libxcb-1.13-1.el7.x86_64 >>> > libxml2-2.9.1-6.el7_9.6.x86_64 numactl-libs-2.0.12-5.el7.x86_64 >>> > systemd-libs-219-78.el7_9.3.x86_64 xz-libs-5.2.2-1.el7.x86_64 >>> > zlib-1.2.7-19.el7_9.x86_64 >>> > (gdb) bt >>> > #0 0x00002aaab5558b11 in >>> > Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::CUPMDeviceInternal::initialize >>> > (this=0x1) at >>> > /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:54 >>> > #1 0x00002aaab5558db7 in >>> > Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::getDevice >>> > (this=this at entry=0x2aaab7f37b70 >>> > , device=0x115da00, id=-35, id at entry=-1) at >>> > /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:344 >>> > #2 0x00002aaab55577de in PetscDeviceCreate (type=type at 
entry=PETSC_DEVICE_CUDA, >>> > devid=devid at entry=-1, device=device at entry=0x2aaab7f37b48 >>> > ) at >>> > /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:107 >>> > #3 0x00002aaab5557b3a in PetscDeviceInitializeDefaultDevice_Internal >>> > (type=type at entry=PETSC_DEVICE_CUDA, defaultDeviceId=defaultDeviceId at entry=-1) >>> > at >>> > /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:273 >>> > #4 0x00002aaab5557bf6 in PetscDeviceInitialize >>> > (type=type at entry=PETSC_DEVICE_CUDA) >>> > at >>> > /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:234 >>> > #5 0x00002aaab5661fcd in VecCreate_SeqCUDA (V=0x115d150) at >>> > /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/impls/seq/seqcuda/veccuda.c:244 >>> > #6 0x00002aaab5649b40 in VecSetType (vec=vec at entry=0x115d150, >>> > method=method at entry=0x2aaab70b45b8 "seqcuda") at >>> > /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vecreg.c:93 >>> > #7 0x00002aaab579c33f in VecCreate_CUDA (v=0x115d150) at >>> > /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/impls/mpi/mpicuda/ >>> > mpicuda.cu:214 >>> > #8 0x00002aaab5649b40 in VecSetType (vec=vec at entry=0x115d150, >>> > method=method at entry=0x7fffffff9260 "cuda") at >>> > /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vecreg.c:93 >>> > #9 0x00002aaab5648bf1 in VecSetTypeFromOptions_Private (vec=0x115d150, >>> > PetscOptionsObject=0x7fffffff9210) at >>> > /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vector.c:1263 >>> > #10 VecSetFromOptions (vec=0x115d150) at >>> > /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vector.c:1297 >>> > #11 0x00002aaab02ef227 in libMesh::PetscVector::init >>> > (this=0x11cd1a0, n=441, n_local=441, fast=false, ptype=libMesh::PARALLEL) >>> > at >>> > /home/kongf/workhome/sawtooth/moosegpu/scripts/../libmesh/installed/include/libmesh/petsc_vector.h:693 >>> > >>> > On Thu, Jan 20, 2022 at 1:09 PM Fande Kong > wrote: >>> > >>> >> Thanks, Jed, >>> >> >>> >> This worked! >>> >> >>> >> Fande >>> >> >>> >> On Wed, Jan 19, 2022 at 11:03 PM Jed Brown > wrote: >>> >> >>> >>> Fande Kong > writes: >>> >>> >>> >>> > On Wed, Jan 19, 2022 at 11:39 AM Jacob Faibussowitsch < >>> >>> jacob.fai at gmail.com > >>> >>> > wrote: >>> >>> > >>> >>> >> Are you running on login nodes or compute nodes (I can?t seem to tell >>> >>> from >>> >>> >> the configure.log)? >>> >>> >> >>> >>> > >>> >>> > I was compiling codes on login nodes, and running codes on compute >>> >>> nodes. >>> >>> > Login nodes do not have GPUs, but compute nodes do have GPUs. >>> >>> > >>> >>> > Just to be clear, the same thing (code, machine) with PETSc-3.16.1 >>> >>> worked >>> >>> > perfectly. I have this trouble with PETSc-main. >>> >>> >>> >>> I assume you can >>> >>> >>> >>> export PETSC_OPTIONS='-device_enable lazy' >>> >>> >>> >>> and it'll work. >>> >>> >>> >>> I think this should be the default. The main complaint is that timing the >>> >>> first GPU-using event isn't accurate if it includes initialization, but I >>> >>> think this is mostly hypothetical because you can't trust any timing that >>> >>> doesn't preload in some form and the first GPU-using event will almost >>> >>> always be something uninteresting so I think it will rarely lead to >>> >>> confusion. Meanwhile, eager initialization is viscerally disruptive for >>> >>> lots of people. 
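
For completeness, a minimal sketch of selecting the lazy-initialization workaround quoted above from code rather than through the PETSC_OPTIONS environment variable. Only the option name "-device_enable lazy" comes from the discussion; the rest of the program is illustrative and untested:

```c
#include <petscsys.h>

int main(int argc, char **argv)
{
  PetscErrorCode ierr;

  /* PetscOptionsSetValue() is documented as callable before
     PetscInitialize(), which is what makes it usable for options that are
     read during startup, such as the device-initialization mode. */
  ierr = PetscOptionsSetValue(NULL, "-device_enable", "lazy");
  if (ierr) return (int)ierr;
  ierr = PetscInitialize(&argc, &argv, NULL, NULL);
  if (ierr) return (int)ierr;

  /* ... create Vec/Mat/KSP objects as usual; with lazy initialization the
     CUDA runtime is only touched when a GPU object is actually needed ... */

  ierr = PetscFinalize();
  return (int)ierr;
}
```

The environment-variable and command-line forms quoted in the thread should behave the same; this is just the programmatic spelling.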
>>> >>> >>> >> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From axel.fourmont at cea.fr Wed Jan 26 02:55:17 2022 From: axel.fourmont at cea.fr (FOURMONT Axel) Date: Wed, 26 Jan 2022 08:55:17 +0000 Subject: [petsc-users] problem with spack instaler (trilinos link) In-Reply-To: References: <6498bab351304fb38d92b12ebf7563ca@cea.fr>, Message-ID: <770c6a604e9c4352bafd4868a2726d70@cea.fr> ok thanks Axel ________________________________ De : Satish Balay Envoy? : mardi 25 janvier 2022 17:46:50 ? : FOURMONT Axel Cc : petsc-users at mcs.anl.gov Objet : Re: [petsc-users] problem with spack instaler (trilinos link) Yeah petsc+trilinos is currently broken in spack. On the petsc side - we currently use minimal ml tarball and that doesn't translate to full trilinos install. So if you need this feature - the current fix is to install petsc manually - without spack [or hack spack to add in '--download-ml' to petsc configure option - and not use +trilinos] Satish On Tue, 25 Jan 2022, FOURMONT Axel wrote: > Dear PETSc developers, > > First of all thank you for your work! > > I try to use the spack tool to install petsc with mumps: spack install petsc+mumps~hdf5 (with the good version for compilers). All is OK, PETSc works fine. > But now, I want acces to ML preconditioner, so I need install a PETSc version with trilinos: spack install petsc+mumps+trilinos~hdf5 > > The compilation fails (in the check phase), I notices 2 things: > petsc links on trilinos/lib64 but the directory path is lib > I make a symbolic link: ln -s trilinos/lib trilinos/lib64 to try solve it > Also there is an error with the definition of Zoltan_Create() > > Can you help me please? > > I attach the configure.log. > I use the last spack release v0.17.1 and > ubuntu 20.04 with gcc-10 gfortran-10, openmpi 4.0.3 from apt instaler > > > Thanks, > Axel > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Wed Jan 26 08:40:19 2022 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Wed, 26 Jan 2022 08:40:19 -0600 Subject: [petsc-users] Cannot eagerly initialize cuda, as doing so results in cuda error 35 (cudaErrorInsufficientDriver) : CUDA driver version is insufficient for CUDA runtime version In-Reply-To: References: <233D0F20-AD95-481B-B862-353D5789C556@gmail.com> <87czkn9c64.fsf@jedbrown.org> <875yqe7zip.fsf@jedbrown.org> Message-ID: The good uses the compiler's default library/header path. The bad searches from cuda toolkit path and uses rpath linking. Though the paths look the same on the login node, they could have different behavior on a compute node depending on its environment. I think we fixed the issue in cuda.py (i.e., first try the compiler's default, then toolkit). That's why I wanted Fande to use petsc/main. --Junchao Zhang On Tue, Jan 25, 2022 at 11:59 PM Barry Smith wrote: > > bad has extra > > -L/apps/local/spack/software/gcc-7.5.0/cuda-10.1.243-v4ymjqcrr7f72qfiuzsstuy5jiajbuey/lib64/stubs > -lcuda > > good does not. > > Try removing the stubs directory and -lcuda from the bad > $PETSC_ARCH/lib/petsc/conf/variables and likely the bad will start working. > > Barry > > I never liked the stubs stuff. 
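
As a quick run-time sanity check for this kind of driver/stub mix-up, the standalone cudaGetDeviceCount() test from earlier in the thread can be extended to also report the driver and runtime versions. This is only a sketch (compile it with nvcc or a CUDA-aware compiler); if the stubs libcuda rather than the real driver library ends up being loaded at run time, the reported driver version may come back as 0 or the device query may fail with the same insufficient-driver error:

```c
#include <stdio.h>
#include <cuda_runtime.h>

int main(void)
{
  int driver = 0, runtime = 0, ndev = 0;
  cudaError_t err;

  /* Version reported by the driver library that was actually loaded. */
  cudaDriverGetVersion(&driver);
  /* Version of the CUDA runtime this program was built against. */
  cudaRuntimeGetVersion(&runtime);
  err = cudaGetDeviceCount(&ndev);

  printf("driver version  : %d\n", driver);
  printf("runtime version : %d\n", runtime);
  printf("device count    : %d (error %d: %s)\n", ndev, (int)err, cudaGetErrorString(err));
  return 0;
}
```

Comparing the two version numbers on a compute node against what the failing PETSc build reports should make it clearer whether the problem is the link line or the driver installation itself.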
> > On Jan 25, 2022, at 11:29 PM, Fande Kong wrote: > > Hi Junchao, > > I attached a "bad" configure log and a "good" configure log. > > The "bad" one was on produced at 246ba74192519a5f34fb6e227d1c64364e19ce2c > > and the "good" one at 384645a00975869a1aacbd3169de62ba40cad683 > > This good hash is the last good hash that is just the right before the bad > one. > > I think you could do a comparison between these two logs, and check what > the differences were. > > Thanks, > > Fande > > On Tue, Jan 25, 2022 at 8:21 PM Junchao Zhang > wrote: > >> Fande, could you send the configure.log that works (i.e., before this >> offending commit)? >> --Junchao Zhang >> >> >> On Tue, Jan 25, 2022 at 8:21 PM Fande Kong wrote: >> >>> Not sure if this is helpful. I did "git bisect", and here was the result: >>> >>> [kongf at sawtooth2 petsc]$ git bisect bad >>> 246ba74192519a5f34fb6e227d1c64364e19ce2c is the first bad commit >>> commit 246ba74192519a5f34fb6e227d1c64364e19ce2c >>> Author: Junchao Zhang >>> Date: Wed Oct 13 05:32:43 2021 +0000 >>> >>> Config: fix CUDA library and header dirs >>> >>> :040000 040000 187c86055adb80f53c1d0565a8888704fec43a96 >>> ea1efd7f594fd5e8df54170bc1bc7b00f35e4d5f M config >>> >>> >>> Started from this commit, and GPU did not work for me on our HPC >>> >>> Thanks, >>> Fande >>> >>> On Tue, Jan 25, 2022 at 7:18 PM Fande Kong wrote: >>> >>>> >>>> >>>> On Tue, Jan 25, 2022 at 9:04 AM Jacob Faibussowitsch < >>>> jacob.fai at gmail.com> wrote: >>>> >>>>> Configure should not have an impact here I think. The reason I had you >>>>> run `cudaGetDeviceCount()` is because this is the CUDA call (and in fact >>>>> the only CUDA call) in the initialization sequence that returns the error >>>>> code. There should be no prior CUDA calls. Maybe this is a problem with >>>>> oversubscribing GPU?s? In the runs that crash, how many ranks are using any >>>>> given GPU at once? Maybe MPS is required. >>>>> >>>> >>>> I used one MPI rank. >>>> >>>> Fande >>>> >>>> >>>> >>>>> >>>>> Best regards, >>>>> >>>>> Jacob Faibussowitsch >>>>> (Jacob Fai - booss - oh - vitch) >>>>> >>>>> On Jan 21, 2022, at 12:01, Fande Kong wrote: >>>>> >>>>> Thanks Jacob, >>>>> >>>>> On Thu, Jan 20, 2022 at 6:25 PM Jacob Faibussowitsch < >>>>> jacob.fai at gmail.com> wrote: >>>>> >>>>>> Segfault is caused by the following check at >>>>>> src/sys/objects/device/impls/cupm/cupmdevice.cxx:349 being a >>>>>> PetscUnlikelyDebug() rather than just PetscUnlikely(): >>>>>> >>>>>> ``` >>>>>> if (PetscUnlikelyDebug(_defaultDevice < 0)) { // _defaultDevice is in >>>>>> fact < 0 here and uncaught >>>>>> ``` >>>>>> >>>>>> To clarify: >>>>>> >>>>>> ?lazy? initialization is not that lazy after all, it still does some >>>>>> 50% of the initialization that ?eager? initialization does. It stops short >>>>>> initializing the CUDA runtime, checking CUDA aware MPI, gathering device >>>>>> data, and initializing cublas and friends. Lazy also importantly swallows >>>>>> any errors that crop up during initialization, storing the resulting error >>>>>> code for later (specifically _defaultDevice = -init_error_value;). >>>>>> >>>>>> So whether you initialize lazily or eagerly makes no difference here, >>>>>> as _defaultDevice will always contain -35. >>>>>> >>>>>> The bigger question is why cudaGetDeviceCount() is returning >>>>>> cudaErrorInsufficientDriver. 
Can you compile and run >>>>>> >>>>>> ``` >>>>>> #include >>>>>> >>>>>> int main() >>>>>> { >>>>>> int ndev; >>>>>> return cudaGetDeviceCount(&ndev): >>>>>> } >>>>>> ``` >>>>>> >>>>>> Then show the value of "echo $??? >>>>>> >>>>> >>>>> Modify your code a little to get more information. >>>>> >>>>> #include >>>>> #include >>>>> >>>>> int main() >>>>> { >>>>> int ndev; >>>>> int error = cudaGetDeviceCount(&ndev); >>>>> printf("ndev %d \n", ndev); >>>>> printf("error %d \n", error); >>>>> return 0; >>>>> } >>>>> >>>>> Results: >>>>> >>>>> $ ./a.out >>>>> ndev 4 >>>>> error 0 >>>>> >>>>> >>>>> I have not read the PETSc cuda initialization code yet. If I need to >>>>> guess at what was happening. I will naively think that PETSc did not get >>>>> correct GPU information in the configuration because the compiler node does >>>>> not have GPUs, and there was no way to get any GPU device information. >>>>> >>>>> >>>>> During the runtime on GPU nodes, PETSc might have incorrect >>>>> information grabbed during configuration and had this kind of false error >>>>> message. >>>>> >>>>> Thanks, >>>>> >>>>> Fande >>>>> >>>>> >>>>> >>>>>> >>>>>> Best regards, >>>>>> >>>>>> Jacob Faibussowitsch >>>>>> (Jacob Fai - booss - oh - vitch) >>>>>> >>>>>> On Jan 20, 2022, at 17:47, Matthew Knepley wrote: >>>>>> >>>>>> On Thu, Jan 20, 2022 at 6:44 PM Fande Kong >>>>>> wrote: >>>>>> >>>>>>> Thanks, Jed >>>>>>> >>>>>>> On Thu, Jan 20, 2022 at 4:34 PM Jed Brown wrote: >>>>>>> >>>>>>>> You can't create CUDA or Kokkos Vecs if you're running on a node >>>>>>>> without a GPU. >>>>>>> >>>>>>> >>>>>>> I am running the code on compute nodes that do have GPUs. >>>>>>> >>>>>> >>>>>> If you are actually running on GPUs, why would you need lazy >>>>>> initialization? It would not break with GPUs present. >>>>>> >>>>>> Matt >>>>>> >>>>>> >>>>>>> With PETSc-3.16.1, I got good speedup by running GAMG on GPUs. >>>>>>> That might be a bug of PETSc-main. 
>>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Fande >>>>>>> >>>>>>> >>>>>>> >>>>>>> KSPSetUp 13 1.0 6.4400e-01 1.0 2.02e+09 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 0 5 0 0 0 0 5 0 0 0 3140 64630 15 >>>>>>> 1.05e+02 5 3.49e+01 100 >>>>>>> KSPSolve 1 1.0 1.0109e+00 1.0 3.49e+10 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 0 87 0 0 0 0 87 0 0 0 34522 69556 4 >>>>>>> 4.35e-03 1 2.38e-03 100 >>>>>>> KSPGMRESOrthog 142 1.0 1.2674e-01 1.0 1.06e+10 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 0 27 0 0 0 0 27 0 0 0 83755 87801 0 >>>>>>> 0.00e+00 0 0.00e+00 100 >>>>>>> SNESSolve 1 1.0 4.4402e+01 1.0 4.00e+10 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 21100 0 0 0 21100 0 0 0 901 51365 57 >>>>>>> 1.10e+03 52 8.78e+02 100 >>>>>>> SNESSetUp 1 1.0 3.9101e-05 1.0 0.00e+00 0.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>> SNESFunctionEval 2 1.0 1.7097e+01 1.0 1.60e+07 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 8 0 0 0 0 8 0 0 0 0 1 0 0 >>>>>>> 0.00e+00 6 1.92e+02 0 >>>>>>> SNESJacobianEval 1 1.0 1.6213e+01 1.0 2.80e+07 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 8 0 0 0 0 8 0 0 0 0 2 0 0 >>>>>>> 0.00e+00 1 3.20e+01 0 >>>>>>> SNESLineSearch 1 1.0 8.5582e+00 1.0 1.24e+08 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 4 0 0 0 0 4 0 0 0 0 14 64153 1 >>>>>>> 3.20e+01 3 9.61e+01 94 >>>>>>> PCGAMGGraph_AGG 5 1.0 3.0509e+00 1.0 8.19e+07 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 27 0 5 >>>>>>> 3.49e+01 9 7.43e+01 0 >>>>>>> PCGAMGCoarse_AGG 5 1.0 3.8711e+00 1.0 0.00e+00 0.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 0 0 >>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>> PCGAMGProl_AGG 5 1.0 7.0748e-01 1.0 0.00e+00 0.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>> PCGAMGPOpt_AGG 5 1.0 1.2904e+00 1.0 2.14e+09 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 1 5 0 0 0 1 5 0 0 0 1661 29807 26 >>>>>>> 7.15e+02 20 2.90e+02 99 >>>>>>> GAMG: createProl 5 1.0 8.9489e+00 1.0 2.22e+09 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 4 6 0 0 0 4 6 0 0 0 249 29666 31 >>>>>>> 7.50e+02 29 3.64e+02 96 >>>>>>> Graph 10 1.0 3.0478e+00 1.0 8.19e+07 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 27 0 5 >>>>>>> 3.49e+01 9 7.43e+01 0 >>>>>>> MIS/Agg 5 1.0 4.1290e-01 1.0 0.00e+00 0.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>> SA: col data 5 1.0 1.9127e-02 1.0 0.00e+00 0.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>> SA: frmProl0 5 1.0 6.2662e-01 1.0 0.00e+00 0.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>> SA: smooth 5 1.0 4.9595e-01 1.0 1.21e+08 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 244 2709 15 >>>>>>> 1.97e+02 15 2.55e+02 90 >>>>>>> GAMG: partLevel 5 1.0 4.7330e-01 1.0 6.98e+08 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 1475 4120 5 >>>>>>> 1.78e+02 10 2.55e+02 100 >>>>>>> PCGAMG Squ l00 1 1.0 2.6027e+00 1.0 0.00e+00 0.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 0 0 >>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>> PCGAMG Gal l00 1 1.0 3.8406e-01 1.0 5.48e+08 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 1426 4270 1 >>>>>>> 1.48e+02 2 2.11e+02 100 >>>>>>> PCGAMG Opt l00 1 1.0 2.4932e-01 1.0 7.20e+07 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 289 2653 1 >>>>>>> 6.41e+01 1 1.13e+02 100 >>>>>>> PCGAMG Gal l01 1 1.0 6.6279e-02 1.0 1.09e+08 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1645 3851 1 >>>>>>> 2.40e+01 2 3.64e+01 100 >>>>>>> PCGAMG Opt l01 1 1.0 2.9544e-02 1.0 7.15e+06 1.0 0.0e+00 
>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 242 1671 1 >>>>>>> 4.84e+00 1 1.23e+01 100 >>>>>>> PCGAMG Gal l02 1 1.0 1.8874e-02 1.0 3.72e+07 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1974 3636 1 >>>>>>> 5.04e+00 2 6.58e+00 100 >>>>>>> PCGAMG Opt l02 1 1.0 7.4353e-03 1.0 2.40e+06 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 323 1457 1 >>>>>>> 7.71e-01 1 2.30e+00 100 >>>>>>> PCGAMG Gal l03 1 1.0 2.8479e-03 1.0 4.10e+06 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1440 2266 1 >>>>>>> 4.44e-01 2 5.51e-01 100 >>>>>>> PCGAMG Opt l03 1 1.0 8.2684e-04 1.0 2.80e+05 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 339 1667 1 >>>>>>> 6.72e-02 1 2.03e-01 100 >>>>>>> PCGAMG Gal l04 1 1.0 1.2238e-03 1.0 2.09e+05 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 170 244 1 >>>>>>> 2.05e-02 2 2.53e-02 100 >>>>>>> PCGAMG Opt l04 1 1.0 4.1008e-04 1.0 1.77e+04 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 43 165 1 >>>>>>> 4.49e-03 1 1.19e-02 100 >>>>>>> PCSetUp 2 1.0 9.9632e+00 1.0 4.95e+09 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 5 12 0 0 0 5 12 0 0 0 496 17826 55 >>>>>>> 1.03e+03 45 6.54e+02 98 >>>>>>> PCSetUpOnBlocks 44 1.0 9.9087e-04 1.0 2.88e+03 1.0 >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>>> The point of lazy initialization is to make it possible to run a >>>>>>>> solve that doesn't use a GPU in PETSC_ARCH that supports GPUs, regardless >>>>>>>> of whether a GPU is actually present. >>>>>>>> >>>>>>>> Fande Kong writes: >>>>>>>> >>>>>>>> > I spoke too soon. It seems that we have trouble creating >>>>>>>> cuda/kokkos vecs >>>>>>>> > now. Got Segmentation fault. >>>>>>>> > >>>>>>>> > Thanks, >>>>>>>> > >>>>>>>> > Fande >>>>>>>> > >>>>>>>> > Program received signal SIGSEGV, Segmentation fault. >>>>>>>> > 0x00002aaab5558b11 in >>>>>>>> > >>>>>>>> Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::CUPMDeviceInternal::initialize >>>>>>>> > (this=0x1) at >>>>>>>> > >>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:54 >>>>>>>> > 54 PetscErrorCode CUPMDevice::CUPMDeviceInternal::initialize() >>>>>>>> noexcept >>>>>>>> > Missing separate debuginfos, use: debuginfo-install >>>>>>>> > bzip2-libs-1.0.6-13.el7.x86_64 elfutils-libelf-0.176-5.el7.x86_64 >>>>>>>> > elfutils-libs-0.176-5.el7.x86_64 glibc-2.17-325.el7_9.x86_64 >>>>>>>> > libX11-1.6.7-4.el7_9.x86_64 libXau-1.0.8-2.1.el7.x86_64 >>>>>>>> > libattr-2.4.46-13.el7.x86_64 libcap-2.22-11.el7.x86_64 >>>>>>>> > libibmad-5.4.0.MLNX20190423.1d917ae-0.1.49224.x86_64 >>>>>>>> > libibumad-43.1.1.MLNX20200211.078947f-0.1.49224.x86_64 >>>>>>>> > libibverbs-41mlnx1-OFED.4.9.0.0.7.49224.x86_64 >>>>>>>> > libmlx4-41mlnx1-OFED.4.7.3.0.3.49224.x86_64 >>>>>>>> > libmlx5-41mlnx1-OFED.4.9.0.1.2.49224.x86_64 >>>>>>>> libnl3-3.2.28-4.el7.x86_64 >>>>>>>> > librdmacm-41mlnx1-OFED.4.7.3.0.6.49224.x86_64 >>>>>>>> > librxe-41mlnx1-OFED.4.4.2.4.6.49224.x86_64 >>>>>>>> libxcb-1.13-1.el7.x86_64 >>>>>>>> > libxml2-2.9.1-6.el7_9.6.x86_64 numactl-libs-2.0.12-5.el7.x86_64 >>>>>>>> > systemd-libs-219-78.el7_9.3.x86_64 xz-libs-5.2.2-1.el7.x86_64 >>>>>>>> > zlib-1.2.7-19.el7_9.x86_64 >>>>>>>> > (gdb) bt >>>>>>>> > #0 0x00002aaab5558b11 in >>>>>>>> > >>>>>>>> Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::CUPMDeviceInternal::initialize >>>>>>>> > (this=0x1) at >>>>>>>> > >>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:54 >>>>>>>> > #1 0x00002aaab5558db7 in >>>>>>>> > Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::getDevice >>>>>>>> > 
(this=this at entry=0x2aaab7f37b70 >>>>>>>> > , device=0x115da00, id=-35, id at entry=-1) at >>>>>>>> > >>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:344 >>>>>>>> > #2 0x00002aaab55577de in PetscDeviceCreate (type=type at entry >>>>>>>> =PETSC_DEVICE_CUDA, >>>>>>>> > devid=devid at entry=-1, device=device at entry=0x2aaab7f37b48 >>>>>>>> > ) at >>>>>>>> > >>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:107 >>>>>>>> > #3 0x00002aaab5557b3a in >>>>>>>> PetscDeviceInitializeDefaultDevice_Internal >>>>>>>> > (type=type at entry=PETSC_DEVICE_CUDA, >>>>>>>> defaultDeviceId=defaultDeviceId at entry=-1) >>>>>>>> > at >>>>>>>> > >>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:273 >>>>>>>> > #4 0x00002aaab5557bf6 in PetscDeviceInitialize >>>>>>>> > (type=type at entry=PETSC_DEVICE_CUDA) >>>>>>>> > at >>>>>>>> > >>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:234 >>>>>>>> > #5 0x00002aaab5661fcd in VecCreate_SeqCUDA (V=0x115d150) at >>>>>>>> > >>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/impls/seq/seqcuda/veccuda.c:244 >>>>>>>> > #6 0x00002aaab5649b40 in VecSetType (vec=vec at entry=0x115d150, >>>>>>>> > method=method at entry=0x2aaab70b45b8 "seqcuda") at >>>>>>>> > >>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vecreg.c:93 >>>>>>>> > #7 0x00002aaab579c33f in VecCreate_CUDA (v=0x115d150) at >>>>>>>> > >>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/impls/mpi/mpicuda/ >>>>>>>> > mpicuda.cu:214 >>>>>>>> > #8 0x00002aaab5649b40 in VecSetType (vec=vec at entry=0x115d150, >>>>>>>> > method=method at entry=0x7fffffff9260 "cuda") at >>>>>>>> > >>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vecreg.c:93 >>>>>>>> > #9 0x00002aaab5648bf1 in VecSetTypeFromOptions_Private >>>>>>>> (vec=0x115d150, >>>>>>>> > PetscOptionsObject=0x7fffffff9210) at >>>>>>>> > >>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vector.c:1263 >>>>>>>> > #10 VecSetFromOptions (vec=0x115d150) at >>>>>>>> > >>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vector.c:1297 >>>>>>>> > #11 0x00002aaab02ef227 in libMesh::PetscVector::init >>>>>>>> > (this=0x11cd1a0, n=441, n_local=441, fast=false, >>>>>>>> ptype=libMesh::PARALLEL) >>>>>>>> > at >>>>>>>> > >>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/scripts/../libmesh/installed/include/libmesh/petsc_vector.h:693 >>>>>>>> > >>>>>>>> > On Thu, Jan 20, 2022 at 1:09 PM Fande Kong >>>>>>>> wrote: >>>>>>>> > >>>>>>>> >> Thanks, Jed, >>>>>>>> >> >>>>>>>> >> This worked! >>>>>>>> >> >>>>>>>> >> Fande >>>>>>>> >> >>>>>>>> >> On Wed, Jan 19, 2022 at 11:03 PM Jed Brown >>>>>>>> wrote: >>>>>>>> >> >>>>>>>> >>> Fande Kong writes: >>>>>>>> >>> >>>>>>>> >>> > On Wed, Jan 19, 2022 at 11:39 AM Jacob Faibussowitsch < >>>>>>>> >>> jacob.fai at gmail.com> >>>>>>>> >>> > wrote: >>>>>>>> >>> > >>>>>>>> >>> >> Are you running on login nodes or compute nodes (I can?t >>>>>>>> seem to tell >>>>>>>> >>> from >>>>>>>> >>> >> the configure.log)? >>>>>>>> >>> >> >>>>>>>> >>> > >>>>>>>> >>> > I was compiling codes on login nodes, and running codes on >>>>>>>> compute >>>>>>>> >>> nodes. >>>>>>>> >>> > Login nodes do not have GPUs, but compute nodes do have GPUs. 
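The backtrace above bottoms out in VecSetType()/VecCreate_SeqCUDA() calling PetscDeviceInitialize(), so a minimal standalone reproducer along the following lines may be easier to iterate on than the full libMesh/MOOSE stack. This is only a sketch (the vector size 441 is taken from the backtrace) and assumes a CUDA-enabled PETSc build:

```
#include <petscvec.h>

int main(int argc, char **argv)
{
  Vec            v;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL); if (ierr) return ierr;

  /* Mirrors the backtrace: setting the type to VECCUDA is what ends up
     calling PetscDeviceInitialize() when the device was not set up eagerly. */
  ierr = VecCreate(PETSC_COMM_WORLD, &v); CHKERRQ(ierr);
  ierr = VecSetSizes(v, PETSC_DECIDE, 441); CHKERRQ(ierr);
  ierr = VecSetType(v, VECCUDA); CHKERRQ(ierr);
  ierr = VecSet(v, 1.0); CHKERRQ(ierr);
  ierr = VecDestroy(&v); CHKERRQ(ierr);

  ierr = PetscFinalize();
  return ierr;
}
```

Running it on a compute node with and without -device_enable lazy might help separate the eager-initialization error 35 from the lazy-path segfault shown above.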
>>>>>>>> >>> > >>>>>>>> >>> > Just to be clear, the same thing (code, machine) with >>>>>>>> PETSc-3.16.1 >>>>>>>> >>> worked >>>>>>>> >>> > perfectly. I have this trouble with PETSc-main. >>>>>>>> >>> >>>>>>>> >>> I assume you can >>>>>>>> >>> >>>>>>>> >>> export PETSC_OPTIONS='-device_enable lazy' >>>>>>>> >>> >>>>>>>> >>> and it'll work. >>>>>>>> >>> >>>>>>>> >>> I think this should be the default. The main complaint is that >>>>>>>> timing the >>>>>>>> >>> first GPU-using event isn't accurate if it includes >>>>>>>> initialization, but I >>>>>>>> >>> think this is mostly hypothetical because you can't trust any >>>>>>>> timing that >>>>>>>> >>> doesn't preload in some form and the first GPU-using event will >>>>>>>> almost >>>>>>>> >>> always be something uninteresting so I think it will rarely >>>>>>>> lead to >>>>>>>> >>> confusion. Meanwhile, eager initialization is viscerally >>>>>>>> disruptive for >>>>>>>> >>> lots of people. >>>>>>>> >>> >>>>>>>> >> >>>>>>>> >>>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their >>>>>> experiments is infinitely more interesting than any results to which their >>>>>> experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>> >>>>>> >>>>>> >>>>> > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fdkong.jd at gmail.com Wed Jan 26 12:24:26 2022 From: fdkong.jd at gmail.com (Fande Kong) Date: Wed, 26 Jan 2022 11:24:26 -0700 Subject: [petsc-users] Cannot eagerly initialize cuda, as doing so results in cuda error 35 (cudaErrorInsufficientDriver) : CUDA driver version is insufficient for CUDA runtime version In-Reply-To: References: <233D0F20-AD95-481B-B862-353D5789C556@gmail.com> <87czkn9c64.fsf@jedbrown.org> <875yqe7zip.fsf@jedbrown.org> Message-ID: On Tue, Jan 25, 2022 at 10:59 PM Barry Smith wrote: > > bad has extra > > -L/apps/local/spack/software/gcc-7.5.0/cuda-10.1.243-v4ymjqcrr7f72qfiuzsstuy5jiajbuey/lib64/stubs > -lcuda > > good does not. > > Try removing the stubs directory and -lcuda from the bad > $PETSC_ARCH/lib/petsc/conf/variables and likely the bad will start working. > It seems I still got the same issue after removing stubs directory and -lcuda. Thanks, Fande > > Barry > > I never liked the stubs stuff. > > On Jan 25, 2022, at 11:29 PM, Fande Kong wrote: > > Hi Junchao, > > I attached a "bad" configure log and a "good" configure log. > > The "bad" one was on produced at 246ba74192519a5f34fb6e227d1c64364e19ce2c > > and the "good" one at 384645a00975869a1aacbd3169de62ba40cad683 > > This good hash is the last good hash that is just the right before the bad > one. > > I think you could do a comparison between these two logs, and check what > the differences were. > > Thanks, > > Fande > > On Tue, Jan 25, 2022 at 8:21 PM Junchao Zhang > wrote: > >> Fande, could you send the configure.log that works (i.e., before this >> offending commit)? >> --Junchao Zhang >> >> >> On Tue, Jan 25, 2022 at 8:21 PM Fande Kong wrote: >> >>> Not sure if this is helpful. 
I did "git bisect", and here was the result: >>> >>> [kongf at sawtooth2 petsc]$ git bisect bad >>> 246ba74192519a5f34fb6e227d1c64364e19ce2c is the first bad commit >>> commit 246ba74192519a5f34fb6e227d1c64364e19ce2c >>> Author: Junchao Zhang >>> Date: Wed Oct 13 05:32:43 2021 +0000 >>> >>> Config: fix CUDA library and header dirs >>> >>> :040000 040000 187c86055adb80f53c1d0565a8888704fec43a96 >>> ea1efd7f594fd5e8df54170bc1bc7b00f35e4d5f M config >>> >>> >>> Started from this commit, and GPU did not work for me on our HPC >>> >>> Thanks, >>> Fande >>> >>> On Tue, Jan 25, 2022 at 7:18 PM Fande Kong wrote: >>> >>>> >>>> >>>> On Tue, Jan 25, 2022 at 9:04 AM Jacob Faibussowitsch < >>>> jacob.fai at gmail.com> wrote: >>>> >>>>> Configure should not have an impact here I think. The reason I had you >>>>> run `cudaGetDeviceCount()` is because this is the CUDA call (and in fact >>>>> the only CUDA call) in the initialization sequence that returns the error >>>>> code. There should be no prior CUDA calls. Maybe this is a problem with >>>>> oversubscribing GPU?s? In the runs that crash, how many ranks are using any >>>>> given GPU at once? Maybe MPS is required. >>>>> >>>> >>>> I used one MPI rank. >>>> >>>> Fande >>>> >>>> >>>> >>>>> >>>>> Best regards, >>>>> >>>>> Jacob Faibussowitsch >>>>> (Jacob Fai - booss - oh - vitch) >>>>> >>>>> On Jan 21, 2022, at 12:01, Fande Kong wrote: >>>>> >>>>> Thanks Jacob, >>>>> >>>>> On Thu, Jan 20, 2022 at 6:25 PM Jacob Faibussowitsch < >>>>> jacob.fai at gmail.com> wrote: >>>>> >>>>>> Segfault is caused by the following check at >>>>>> src/sys/objects/device/impls/cupm/cupmdevice.cxx:349 being a >>>>>> PetscUnlikelyDebug() rather than just PetscUnlikely(): >>>>>> >>>>>> ``` >>>>>> if (PetscUnlikelyDebug(_defaultDevice < 0)) { // _defaultDevice is in >>>>>> fact < 0 here and uncaught >>>>>> ``` >>>>>> >>>>>> To clarify: >>>>>> >>>>>> ?lazy? initialization is not that lazy after all, it still does some >>>>>> 50% of the initialization that ?eager? initialization does. It stops short >>>>>> initializing the CUDA runtime, checking CUDA aware MPI, gathering device >>>>>> data, and initializing cublas and friends. Lazy also importantly swallows >>>>>> any errors that crop up during initialization, storing the resulting error >>>>>> code for later (specifically _defaultDevice = -init_error_value;). >>>>>> >>>>>> So whether you initialize lazily or eagerly makes no difference here, >>>>>> as _defaultDevice will always contain -35. >>>>>> >>>>>> The bigger question is why cudaGetDeviceCount() is returning >>>>>> cudaErrorInsufficientDriver. Can you compile and run >>>>>> >>>>>> ``` >>>>>> #include >>>>>> >>>>>> int main() >>>>>> { >>>>>> int ndev; >>>>>> return cudaGetDeviceCount(&ndev): >>>>>> } >>>>>> ``` >>>>>> >>>>>> Then show the value of "echo $??? >>>>>> >>>>> >>>>> Modify your code a little to get more information. >>>>> >>>>> #include >>>>> #include >>>>> >>>>> int main() >>>>> { >>>>> int ndev; >>>>> int error = cudaGetDeviceCount(&ndev); >>>>> printf("ndev %d \n", ndev); >>>>> printf("error %d \n", error); >>>>> return 0; >>>>> } >>>>> >>>>> Results: >>>>> >>>>> $ ./a.out >>>>> ndev 4 >>>>> error 0 >>>>> >>>>> >>>>> I have not read the PETSc cuda initialization code yet. If I need to >>>>> guess at what was happening. I will naively think that PETSc did not get >>>>> correct GPU information in the configuration because the compiler node does >>>>> not have GPUs, and there was no way to get any GPU device information. 
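Since the lazy/eager distinction keeps coming up, a programmatic equivalent of Jed's "export PETSC_OPTIONS='-device_enable lazy'" workaround might look like the sketch below; it assumes the pre-set option is consulted during PetscInitialize() in the same way the environment variable is:

```
#include <petscsys.h>

int main(int argc, char **argv)
{
  PetscErrorCode ierr;

  /* Equivalent to `export PETSC_OPTIONS='-device_enable lazy'`;
     PetscOptionsSetValue() may be called before PetscInitialize(). */
  ierr = PetscOptionsSetValue(NULL, "-device_enable", "lazy"); if (ierr) return ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL); if (ierr) return ierr;
  /* ... create Vecs/Mats as usual; the device is initialized on first use ... */
  ierr = PetscFinalize();
  return ierr;
}
```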
>>>>> >>>>> >>>>> During the runtime on GPU nodes, PETSc might have incorrect >>>>> information grabbed during configuration and had this kind of false error >>>>> message. >>>>> >>>>> Thanks, >>>>> >>>>> Fande >>>>> >>>>> >>>>> >>>>>> >>>>>> Best regards, >>>>>> >>>>>> Jacob Faibussowitsch >>>>>> (Jacob Fai - booss - oh - vitch) >>>>>> >>>>>> On Jan 20, 2022, at 17:47, Matthew Knepley wrote: >>>>>> >>>>>> On Thu, Jan 20, 2022 at 6:44 PM Fande Kong >>>>>> wrote: >>>>>> >>>>>>> Thanks, Jed >>>>>>> >>>>>>> On Thu, Jan 20, 2022 at 4:34 PM Jed Brown wrote: >>>>>>> >>>>>>>> You can't create CUDA or Kokkos Vecs if you're running on a node >>>>>>>> without a GPU. >>>>>>> >>>>>>> >>>>>>> I am running the code on compute nodes that do have GPUs. >>>>>>> >>>>>> >>>>>> If you are actually running on GPUs, why would you need lazy >>>>>> initialization? It would not break with GPUs present. >>>>>> >>>>>> Matt >>>>>> >>>>>> >>>>>>> With PETSc-3.16.1, I got good speedup by running GAMG on GPUs. >>>>>>> That might be a bug of PETSc-main. >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Fande >>>>>>> >>>>>>> >>>>>>> >>>>>>> KSPSetUp 13 1.0 6.4400e-01 1.0 2.02e+09 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 0 5 0 0 0 0 5 0 0 0 3140 64630 15 >>>>>>> 1.05e+02 5 3.49e+01 100 >>>>>>> KSPSolve 1 1.0 1.0109e+00 1.0 3.49e+10 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 0 87 0 0 0 0 87 0 0 0 34522 69556 4 >>>>>>> 4.35e-03 1 2.38e-03 100 >>>>>>> KSPGMRESOrthog 142 1.0 1.2674e-01 1.0 1.06e+10 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 0 27 0 0 0 0 27 0 0 0 83755 87801 0 >>>>>>> 0.00e+00 0 0.00e+00 100 >>>>>>> SNESSolve 1 1.0 4.4402e+01 1.0 4.00e+10 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 21100 0 0 0 21100 0 0 0 901 51365 57 >>>>>>> 1.10e+03 52 8.78e+02 100 >>>>>>> SNESSetUp 1 1.0 3.9101e-05 1.0 0.00e+00 0.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>> SNESFunctionEval 2 1.0 1.7097e+01 1.0 1.60e+07 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 8 0 0 0 0 8 0 0 0 0 1 0 0 >>>>>>> 0.00e+00 6 1.92e+02 0 >>>>>>> SNESJacobianEval 1 1.0 1.6213e+01 1.0 2.80e+07 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 8 0 0 0 0 8 0 0 0 0 2 0 0 >>>>>>> 0.00e+00 1 3.20e+01 0 >>>>>>> SNESLineSearch 1 1.0 8.5582e+00 1.0 1.24e+08 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 4 0 0 0 0 4 0 0 0 0 14 64153 1 >>>>>>> 3.20e+01 3 9.61e+01 94 >>>>>>> PCGAMGGraph_AGG 5 1.0 3.0509e+00 1.0 8.19e+07 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 27 0 5 >>>>>>> 3.49e+01 9 7.43e+01 0 >>>>>>> PCGAMGCoarse_AGG 5 1.0 3.8711e+00 1.0 0.00e+00 0.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 0 0 >>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>> PCGAMGProl_AGG 5 1.0 7.0748e-01 1.0 0.00e+00 0.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>> PCGAMGPOpt_AGG 5 1.0 1.2904e+00 1.0 2.14e+09 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 1 5 0 0 0 1 5 0 0 0 1661 29807 26 >>>>>>> 7.15e+02 20 2.90e+02 99 >>>>>>> GAMG: createProl 5 1.0 8.9489e+00 1.0 2.22e+09 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 4 6 0 0 0 4 6 0 0 0 249 29666 31 >>>>>>> 7.50e+02 29 3.64e+02 96 >>>>>>> Graph 10 1.0 3.0478e+00 1.0 8.19e+07 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 27 0 5 >>>>>>> 3.49e+01 9 7.43e+01 0 >>>>>>> MIS/Agg 5 1.0 4.1290e-01 1.0 0.00e+00 0.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>> SA: col data 5 1.0 1.9127e-02 1.0 0.00e+00 0.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>> SA: frmProl0 5 1.0 6.2662e-01 1.0 0.00e+00 0.0 0.0e+00 >>>>>>> 0.0e+00 
0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>> SA: smooth 5 1.0 4.9595e-01 1.0 1.21e+08 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 244 2709 15 >>>>>>> 1.97e+02 15 2.55e+02 90 >>>>>>> GAMG: partLevel 5 1.0 4.7330e-01 1.0 6.98e+08 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 1475 4120 5 >>>>>>> 1.78e+02 10 2.55e+02 100 >>>>>>> PCGAMG Squ l00 1 1.0 2.6027e+00 1.0 0.00e+00 0.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 0 0 >>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>> PCGAMG Gal l00 1 1.0 3.8406e-01 1.0 5.48e+08 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 1426 4270 1 >>>>>>> 1.48e+02 2 2.11e+02 100 >>>>>>> PCGAMG Opt l00 1 1.0 2.4932e-01 1.0 7.20e+07 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 289 2653 1 >>>>>>> 6.41e+01 1 1.13e+02 100 >>>>>>> PCGAMG Gal l01 1 1.0 6.6279e-02 1.0 1.09e+08 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1645 3851 1 >>>>>>> 2.40e+01 2 3.64e+01 100 >>>>>>> PCGAMG Opt l01 1 1.0 2.9544e-02 1.0 7.15e+06 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 242 1671 1 >>>>>>> 4.84e+00 1 1.23e+01 100 >>>>>>> PCGAMG Gal l02 1 1.0 1.8874e-02 1.0 3.72e+07 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1974 3636 1 >>>>>>> 5.04e+00 2 6.58e+00 100 >>>>>>> PCGAMG Opt l02 1 1.0 7.4353e-03 1.0 2.40e+06 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 323 1457 1 >>>>>>> 7.71e-01 1 2.30e+00 100 >>>>>>> PCGAMG Gal l03 1 1.0 2.8479e-03 1.0 4.10e+06 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1440 2266 1 >>>>>>> 4.44e-01 2 5.51e-01 100 >>>>>>> PCGAMG Opt l03 1 1.0 8.2684e-04 1.0 2.80e+05 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 339 1667 1 >>>>>>> 6.72e-02 1 2.03e-01 100 >>>>>>> PCGAMG Gal l04 1 1.0 1.2238e-03 1.0 2.09e+05 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 170 244 1 >>>>>>> 2.05e-02 2 2.53e-02 100 >>>>>>> PCGAMG Opt l04 1 1.0 4.1008e-04 1.0 1.77e+04 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 43 165 1 >>>>>>> 4.49e-03 1 1.19e-02 100 >>>>>>> PCSetUp 2 1.0 9.9632e+00 1.0 4.95e+09 1.0 0.0e+00 >>>>>>> 0.0e+00 0.0e+00 5 12 0 0 0 5 12 0 0 0 496 17826 55 >>>>>>> 1.03e+03 45 6.54e+02 98 >>>>>>> PCSetUpOnBlocks 44 1.0 9.9087e-04 1.0 2.88e+03 1.0 >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>>> The point of lazy initialization is to make it possible to run a >>>>>>>> solve that doesn't use a GPU in PETSC_ARCH that supports GPUs, regardless >>>>>>>> of whether a GPU is actually present. >>>>>>>> >>>>>>>> Fande Kong writes: >>>>>>>> >>>>>>>> > I spoke too soon. It seems that we have trouble creating >>>>>>>> cuda/kokkos vecs >>>>>>>> > now. Got Segmentation fault. >>>>>>>> > >>>>>>>> > Thanks, >>>>>>>> > >>>>>>>> > Fande >>>>>>>> > >>>>>>>> > Program received signal SIGSEGV, Segmentation fault. 
>>>>>>>> > 0x00002aaab5558b11 in >>>>>>>> > >>>>>>>> Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::CUPMDeviceInternal::initialize >>>>>>>> > (this=0x1) at >>>>>>>> > >>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:54 >>>>>>>> > 54 PetscErrorCode CUPMDevice::CUPMDeviceInternal::initialize() >>>>>>>> noexcept >>>>>>>> > Missing separate debuginfos, use: debuginfo-install >>>>>>>> > bzip2-libs-1.0.6-13.el7.x86_64 elfutils-libelf-0.176-5.el7.x86_64 >>>>>>>> > elfutils-libs-0.176-5.el7.x86_64 glibc-2.17-325.el7_9.x86_64 >>>>>>>> > libX11-1.6.7-4.el7_9.x86_64 libXau-1.0.8-2.1.el7.x86_64 >>>>>>>> > libattr-2.4.46-13.el7.x86_64 libcap-2.22-11.el7.x86_64 >>>>>>>> > libibmad-5.4.0.MLNX20190423.1d917ae-0.1.49224.x86_64 >>>>>>>> > libibumad-43.1.1.MLNX20200211.078947f-0.1.49224.x86_64 >>>>>>>> > libibverbs-41mlnx1-OFED.4.9.0.0.7.49224.x86_64 >>>>>>>> > libmlx4-41mlnx1-OFED.4.7.3.0.3.49224.x86_64 >>>>>>>> > libmlx5-41mlnx1-OFED.4.9.0.1.2.49224.x86_64 >>>>>>>> libnl3-3.2.28-4.el7.x86_64 >>>>>>>> > librdmacm-41mlnx1-OFED.4.7.3.0.6.49224.x86_64 >>>>>>>> > librxe-41mlnx1-OFED.4.4.2.4.6.49224.x86_64 >>>>>>>> libxcb-1.13-1.el7.x86_64 >>>>>>>> > libxml2-2.9.1-6.el7_9.6.x86_64 numactl-libs-2.0.12-5.el7.x86_64 >>>>>>>> > systemd-libs-219-78.el7_9.3.x86_64 xz-libs-5.2.2-1.el7.x86_64 >>>>>>>> > zlib-1.2.7-19.el7_9.x86_64 >>>>>>>> > (gdb) bt >>>>>>>> > #0 0x00002aaab5558b11 in >>>>>>>> > >>>>>>>> Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::CUPMDeviceInternal::initialize >>>>>>>> > (this=0x1) at >>>>>>>> > >>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:54 >>>>>>>> > #1 0x00002aaab5558db7 in >>>>>>>> > Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::getDevice >>>>>>>> > (this=this at entry=0x2aaab7f37b70 >>>>>>>> > , device=0x115da00, id=-35, id at entry=-1) at >>>>>>>> > >>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:344 >>>>>>>> > #2 0x00002aaab55577de in PetscDeviceCreate (type=type at entry >>>>>>>> =PETSC_DEVICE_CUDA, >>>>>>>> > devid=devid at entry=-1, device=device at entry=0x2aaab7f37b48 >>>>>>>> > ) at >>>>>>>> > >>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:107 >>>>>>>> > #3 0x00002aaab5557b3a in >>>>>>>> PetscDeviceInitializeDefaultDevice_Internal >>>>>>>> > (type=type at entry=PETSC_DEVICE_CUDA, >>>>>>>> defaultDeviceId=defaultDeviceId at entry=-1) >>>>>>>> > at >>>>>>>> > >>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:273 >>>>>>>> > #4 0x00002aaab5557bf6 in PetscDeviceInitialize >>>>>>>> > (type=type at entry=PETSC_DEVICE_CUDA) >>>>>>>> > at >>>>>>>> > >>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:234 >>>>>>>> > #5 0x00002aaab5661fcd in VecCreate_SeqCUDA (V=0x115d150) at >>>>>>>> > >>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/impls/seq/seqcuda/veccuda.c:244 >>>>>>>> > #6 0x00002aaab5649b40 in VecSetType (vec=vec at entry=0x115d150, >>>>>>>> > method=method at entry=0x2aaab70b45b8 "seqcuda") at >>>>>>>> > >>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vecreg.c:93 >>>>>>>> > #7 0x00002aaab579c33f in VecCreate_CUDA (v=0x115d150) at >>>>>>>> > >>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/impls/mpi/mpicuda/ >>>>>>>> > mpicuda.cu:214 >>>>>>>> > #8 0x00002aaab5649b40 in VecSetType (vec=vec at entry=0x115d150, >>>>>>>> > 
method=method at entry=0x7fffffff9260 "cuda") at >>>>>>>> > >>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vecreg.c:93 >>>>>>>> > #9 0x00002aaab5648bf1 in VecSetTypeFromOptions_Private >>>>>>>> (vec=0x115d150, >>>>>>>> > PetscOptionsObject=0x7fffffff9210) at >>>>>>>> > >>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vector.c:1263 >>>>>>>> > #10 VecSetFromOptions (vec=0x115d150) at >>>>>>>> > >>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vector.c:1297 >>>>>>>> > #11 0x00002aaab02ef227 in libMesh::PetscVector::init >>>>>>>> > (this=0x11cd1a0, n=441, n_local=441, fast=false, >>>>>>>> ptype=libMesh::PARALLEL) >>>>>>>> > at >>>>>>>> > >>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/scripts/../libmesh/installed/include/libmesh/petsc_vector.h:693 >>>>>>>> > >>>>>>>> > On Thu, Jan 20, 2022 at 1:09 PM Fande Kong >>>>>>>> wrote: >>>>>>>> > >>>>>>>> >> Thanks, Jed, >>>>>>>> >> >>>>>>>> >> This worked! >>>>>>>> >> >>>>>>>> >> Fande >>>>>>>> >> >>>>>>>> >> On Wed, Jan 19, 2022 at 11:03 PM Jed Brown >>>>>>>> wrote: >>>>>>>> >> >>>>>>>> >>> Fande Kong writes: >>>>>>>> >>> >>>>>>>> >>> > On Wed, Jan 19, 2022 at 11:39 AM Jacob Faibussowitsch < >>>>>>>> >>> jacob.fai at gmail.com> >>>>>>>> >>> > wrote: >>>>>>>> >>> > >>>>>>>> >>> >> Are you running on login nodes or compute nodes (I can?t >>>>>>>> seem to tell >>>>>>>> >>> from >>>>>>>> >>> >> the configure.log)? >>>>>>>> >>> >> >>>>>>>> >>> > >>>>>>>> >>> > I was compiling codes on login nodes, and running codes on >>>>>>>> compute >>>>>>>> >>> nodes. >>>>>>>> >>> > Login nodes do not have GPUs, but compute nodes do have GPUs. >>>>>>>> >>> > >>>>>>>> >>> > Just to be clear, the same thing (code, machine) with >>>>>>>> PETSc-3.16.1 >>>>>>>> >>> worked >>>>>>>> >>> > perfectly. I have this trouble with PETSc-main. >>>>>>>> >>> >>>>>>>> >>> I assume you can >>>>>>>> >>> >>>>>>>> >>> export PETSC_OPTIONS='-device_enable lazy' >>>>>>>> >>> >>>>>>>> >>> and it'll work. >>>>>>>> >>> >>>>>>>> >>> I think this should be the default. The main complaint is that >>>>>>>> timing the >>>>>>>> >>> first GPU-using event isn't accurate if it includes >>>>>>>> initialization, but I >>>>>>>> >>> think this is mostly hypothetical because you can't trust any >>>>>>>> timing that >>>>>>>> >>> doesn't preload in some form and the first GPU-using event will >>>>>>>> almost >>>>>>>> >>> always be something uninteresting so I think it will rarely >>>>>>>> lead to >>>>>>>> >>> confusion. Meanwhile, eager initialization is viscerally >>>>>>>> disruptive for >>>>>>>> >>> lots of people. >>>>>>>> >>> >>>>>>>> >> >>>>>>>> >>>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their >>>>>> experiments is infinitely more interesting than any results to which their >>>>>> experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>> >>>>>> >>>>>> >>>>> > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From fdkong.jd at gmail.com Wed Jan 26 12:25:17 2022 From: fdkong.jd at gmail.com (Fande Kong) Date: Wed, 26 Jan 2022 11:25:17 -0700 Subject: [petsc-users] Cannot eagerly initialize cuda, as doing so results in cuda error 35 (cudaErrorInsufficientDriver) : CUDA driver version is insufficient for CUDA runtime version In-Reply-To: References: <233D0F20-AD95-481B-B862-353D5789C556@gmail.com> <87czkn9c64.fsf@jedbrown.org> <875yqe7zip.fsf@jedbrown.org> Message-ID: I am on the petsc-main commit 1390d3a27d88add7d79c9b38bf1a895ae5e67af6 Merge: 96c919c d5f3255 Author: Satish Balay Date: Wed Jan 26 10:28:32 2022 -0600 Merge remote-tracking branch 'origin/release' It is still broken. Thanks, Fande On Wed, Jan 26, 2022 at 7:40 AM Junchao Zhang wrote: > The good uses the compiler's default library/header path. The bad > searches from cuda toolkit path and uses rpath linking. > Though the paths look the same on the login node, they could have > different behavior on a compute node depending on its environment. > I think we fixed the issue in cuda.py (i.e., first try the compiler's > default, then toolkit). That's why I wanted Fande to use petsc/main. > > --Junchao Zhang > > > On Tue, Jan 25, 2022 at 11:59 PM Barry Smith wrote: > >> >> bad has extra >> >> -L/apps/local/spack/software/gcc-7.5.0/cuda-10.1.243-v4ymjqcrr7f72qfiuzsstuy5jiajbuey/lib64/stubs >> -lcuda >> >> good does not. >> >> Try removing the stubs directory and -lcuda from the bad >> $PETSC_ARCH/lib/petsc/conf/variables and likely the bad will start working. >> >> Barry >> >> I never liked the stubs stuff. >> >> On Jan 25, 2022, at 11:29 PM, Fande Kong wrote: >> >> Hi Junchao, >> >> I attached a "bad" configure log and a "good" configure log. >> >> The "bad" one was on produced at 246ba74192519a5f34fb6e227d1c64364e19ce2c >> >> and the "good" one at 384645a00975869a1aacbd3169de62ba40cad683 >> >> This good hash is the last good hash that is just the right before the >> bad one. >> >> I think you could do a comparison between these two logs, and check what >> the differences were. >> >> Thanks, >> >> Fande >> >> On Tue, Jan 25, 2022 at 8:21 PM Junchao Zhang >> wrote: >> >>> Fande, could you send the configure.log that works (i.e., before this >>> offending commit)? >>> --Junchao Zhang >>> >>> >>> On Tue, Jan 25, 2022 at 8:21 PM Fande Kong wrote: >>> >>>> Not sure if this is helpful. I did "git bisect", and here was the >>>> result: >>>> >>>> [kongf at sawtooth2 petsc]$ git bisect bad >>>> 246ba74192519a5f34fb6e227d1c64364e19ce2c is the first bad commit >>>> commit 246ba74192519a5f34fb6e227d1c64364e19ce2c >>>> Author: Junchao Zhang >>>> Date: Wed Oct 13 05:32:43 2021 +0000 >>>> >>>> Config: fix CUDA library and header dirs >>>> >>>> :040000 040000 187c86055adb80f53c1d0565a8888704fec43a96 >>>> ea1efd7f594fd5e8df54170bc1bc7b00f35e4d5f M config >>>> >>>> >>>> Started from this commit, and GPU did not work for me on our HPC >>>> >>>> Thanks, >>>> Fande >>>> >>>> On Tue, Jan 25, 2022 at 7:18 PM Fande Kong wrote: >>>> >>>>> >>>>> >>>>> On Tue, Jan 25, 2022 at 9:04 AM Jacob Faibussowitsch < >>>>> jacob.fai at gmail.com> wrote: >>>>> >>>>>> Configure should not have an impact here I think. The reason I had >>>>>> you run `cudaGetDeviceCount()` is because this is the CUDA call (and in >>>>>> fact the only CUDA call) in the initialization sequence that returns the >>>>>> error code. There should be no prior CUDA calls. Maybe this is a problem >>>>>> with oversubscribing GPU?s? 
In the runs that crash, how many ranks are >>>>>> using any given GPU at once? Maybe MPS is required. >>>>>> >>>>> >>>>> I used one MPI rank. >>>>> >>>>> Fande >>>>> >>>>> >>>>> >>>>>> >>>>>> Best regards, >>>>>> >>>>>> Jacob Faibussowitsch >>>>>> (Jacob Fai - booss - oh - vitch) >>>>>> >>>>>> On Jan 21, 2022, at 12:01, Fande Kong wrote: >>>>>> >>>>>> Thanks Jacob, >>>>>> >>>>>> On Thu, Jan 20, 2022 at 6:25 PM Jacob Faibussowitsch < >>>>>> jacob.fai at gmail.com> wrote: >>>>>> >>>>>>> Segfault is caused by the following check at >>>>>>> src/sys/objects/device/impls/cupm/cupmdevice.cxx:349 being a >>>>>>> PetscUnlikelyDebug() rather than just PetscUnlikely(): >>>>>>> >>>>>>> ``` >>>>>>> if (PetscUnlikelyDebug(_defaultDevice < 0)) { // _defaultDevice is >>>>>>> in fact < 0 here and uncaught >>>>>>> ``` >>>>>>> >>>>>>> To clarify: >>>>>>> >>>>>>> ?lazy? initialization is not that lazy after all, it still does some >>>>>>> 50% of the initialization that ?eager? initialization does. It stops short >>>>>>> initializing the CUDA runtime, checking CUDA aware MPI, gathering device >>>>>>> data, and initializing cublas and friends. Lazy also importantly swallows >>>>>>> any errors that crop up during initialization, storing the resulting error >>>>>>> code for later (specifically _defaultDevice = -init_error_value;). >>>>>>> >>>>>>> So whether you initialize lazily or eagerly makes no difference >>>>>>> here, as _defaultDevice will always contain -35. >>>>>>> >>>>>>> The bigger question is why cudaGetDeviceCount() is returning >>>>>>> cudaErrorInsufficientDriver. Can you compile and run >>>>>>> >>>>>>> ``` >>>>>>> #include >>>>>>> >>>>>>> int main() >>>>>>> { >>>>>>> int ndev; >>>>>>> return cudaGetDeviceCount(&ndev): >>>>>>> } >>>>>>> ``` >>>>>>> >>>>>>> Then show the value of "echo $??? >>>>>>> >>>>>> >>>>>> Modify your code a little to get more information. >>>>>> >>>>>> #include >>>>>> #include >>>>>> >>>>>> int main() >>>>>> { >>>>>> int ndev; >>>>>> int error = cudaGetDeviceCount(&ndev); >>>>>> printf("ndev %d \n", ndev); >>>>>> printf("error %d \n", error); >>>>>> return 0; >>>>>> } >>>>>> >>>>>> Results: >>>>>> >>>>>> $ ./a.out >>>>>> ndev 4 >>>>>> error 0 >>>>>> >>>>>> >>>>>> I have not read the PETSc cuda initialization code yet. If I need to >>>>>> guess at what was happening. I will naively think that PETSc did not get >>>>>> correct GPU information in the configuration because the compiler node does >>>>>> not have GPUs, and there was no way to get any GPU device information. >>>>>> >>>>>> >>>>>> During the runtime on GPU nodes, PETSc might have incorrect >>>>>> information grabbed during configuration and had this kind of false error >>>>>> message. >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Fande >>>>>> >>>>>> >>>>>> >>>>>>> >>>>>>> Best regards, >>>>>>> >>>>>>> Jacob Faibussowitsch >>>>>>> (Jacob Fai - booss - oh - vitch) >>>>>>> >>>>>>> On Jan 20, 2022, at 17:47, Matthew Knepley >>>>>>> wrote: >>>>>>> >>>>>>> On Thu, Jan 20, 2022 at 6:44 PM Fande Kong >>>>>>> wrote: >>>>>>> >>>>>>>> Thanks, Jed >>>>>>>> >>>>>>>> On Thu, Jan 20, 2022 at 4:34 PM Jed Brown wrote: >>>>>>>> >>>>>>>>> You can't create CUDA or Kokkos Vecs if you're running on a node >>>>>>>>> without a GPU. >>>>>>>> >>>>>>>> >>>>>>>> I am running the code on compute nodes that do have GPUs. >>>>>>>> >>>>>>> >>>>>>> If you are actually running on GPUs, why would you need lazy >>>>>>> initialization? It would not break with GPUs present. 
>>>>>>> >>>>>>> Matt >>>>>>> >>>>>>> >>>>>>>> With PETSc-3.16.1, I got good speedup by running GAMG on GPUs. >>>>>>>> That might be a bug of PETSc-main. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Fande >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> KSPSetUp 13 1.0 6.4400e-01 1.0 2.02e+09 1.0 0.0e+00 >>>>>>>> 0.0e+00 0.0e+00 0 5 0 0 0 0 5 0 0 0 3140 64630 15 >>>>>>>> 1.05e+02 5 3.49e+01 100 >>>>>>>> KSPSolve 1 1.0 1.0109e+00 1.0 3.49e+10 1.0 0.0e+00 >>>>>>>> 0.0e+00 0.0e+00 0 87 0 0 0 0 87 0 0 0 34522 69556 4 >>>>>>>> 4.35e-03 1 2.38e-03 100 >>>>>>>> KSPGMRESOrthog 142 1.0 1.2674e-01 1.0 1.06e+10 1.0 0.0e+00 >>>>>>>> 0.0e+00 0.0e+00 0 27 0 0 0 0 27 0 0 0 83755 87801 0 >>>>>>>> 0.00e+00 0 0.00e+00 100 >>>>>>>> SNESSolve 1 1.0 4.4402e+01 1.0 4.00e+10 1.0 0.0e+00 >>>>>>>> 0.0e+00 0.0e+00 21100 0 0 0 21100 0 0 0 901 51365 57 >>>>>>>> 1.10e+03 52 8.78e+02 100 >>>>>>>> SNESSetUp 1 1.0 3.9101e-05 1.0 0.00e+00 0.0 0.0e+00 >>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>>> SNESFunctionEval 2 1.0 1.7097e+01 1.0 1.60e+07 1.0 0.0e+00 >>>>>>>> 0.0e+00 0.0e+00 8 0 0 0 0 8 0 0 0 0 1 0 0 >>>>>>>> 0.00e+00 6 1.92e+02 0 >>>>>>>> SNESJacobianEval 1 1.0 1.6213e+01 1.0 2.80e+07 1.0 0.0e+00 >>>>>>>> 0.0e+00 0.0e+00 8 0 0 0 0 8 0 0 0 0 2 0 0 >>>>>>>> 0.00e+00 1 3.20e+01 0 >>>>>>>> SNESLineSearch 1 1.0 8.5582e+00 1.0 1.24e+08 1.0 0.0e+00 >>>>>>>> 0.0e+00 0.0e+00 4 0 0 0 0 4 0 0 0 0 14 64153 1 >>>>>>>> 3.20e+01 3 9.61e+01 94 >>>>>>>> PCGAMGGraph_AGG 5 1.0 3.0509e+00 1.0 8.19e+07 1.0 0.0e+00 >>>>>>>> 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 27 0 5 >>>>>>>> 3.49e+01 9 7.43e+01 0 >>>>>>>> PCGAMGCoarse_AGG 5 1.0 3.8711e+00 1.0 0.00e+00 0.0 0.0e+00 >>>>>>>> 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 0 0 >>>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>>> PCGAMGProl_AGG 5 1.0 7.0748e-01 1.0 0.00e+00 0.0 0.0e+00 >>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>>> PCGAMGPOpt_AGG 5 1.0 1.2904e+00 1.0 2.14e+09 1.0 0.0e+00 >>>>>>>> 0.0e+00 0.0e+00 1 5 0 0 0 1 5 0 0 0 1661 29807 26 >>>>>>>> 7.15e+02 20 2.90e+02 99 >>>>>>>> GAMG: createProl 5 1.0 8.9489e+00 1.0 2.22e+09 1.0 0.0e+00 >>>>>>>> 0.0e+00 0.0e+00 4 6 0 0 0 4 6 0 0 0 249 29666 31 >>>>>>>> 7.50e+02 29 3.64e+02 96 >>>>>>>> Graph 10 1.0 3.0478e+00 1.0 8.19e+07 1.0 0.0e+00 >>>>>>>> 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 27 0 5 >>>>>>>> 3.49e+01 9 7.43e+01 0 >>>>>>>> MIS/Agg 5 1.0 4.1290e-01 1.0 0.00e+00 0.0 0.0e+00 >>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>>> SA: col data 5 1.0 1.9127e-02 1.0 0.00e+00 0.0 0.0e+00 >>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>>> SA: frmProl0 5 1.0 6.2662e-01 1.0 0.00e+00 0.0 0.0e+00 >>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>>> SA: smooth 5 1.0 4.9595e-01 1.0 1.21e+08 1.0 0.0e+00 >>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 244 2709 15 >>>>>>>> 1.97e+02 15 2.55e+02 90 >>>>>>>> GAMG: partLevel 5 1.0 4.7330e-01 1.0 6.98e+08 1.0 0.0e+00 >>>>>>>> 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 1475 4120 5 >>>>>>>> 1.78e+02 10 2.55e+02 100 >>>>>>>> PCGAMG Squ l00 1 1.0 2.6027e+00 1.0 0.00e+00 0.0 0.0e+00 >>>>>>>> 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 0 0 >>>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>>> PCGAMG Gal l00 1 1.0 3.8406e-01 1.0 5.48e+08 1.0 0.0e+00 >>>>>>>> 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 1426 4270 1 >>>>>>>> 1.48e+02 2 2.11e+02 100 >>>>>>>> PCGAMG Opt l00 1 1.0 2.4932e-01 1.0 7.20e+07 1.0 0.0e+00 >>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 289 2653 1 >>>>>>>> 6.41e+01 1 
1.13e+02 100 >>>>>>>> PCGAMG Gal l01 1 1.0 6.6279e-02 1.0 1.09e+08 1.0 0.0e+00 >>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1645 3851 1 >>>>>>>> 2.40e+01 2 3.64e+01 100 >>>>>>>> PCGAMG Opt l01 1 1.0 2.9544e-02 1.0 7.15e+06 1.0 0.0e+00 >>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 242 1671 1 >>>>>>>> 4.84e+00 1 1.23e+01 100 >>>>>>>> PCGAMG Gal l02 1 1.0 1.8874e-02 1.0 3.72e+07 1.0 0.0e+00 >>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1974 3636 1 >>>>>>>> 5.04e+00 2 6.58e+00 100 >>>>>>>> PCGAMG Opt l02 1 1.0 7.4353e-03 1.0 2.40e+06 1.0 0.0e+00 >>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 323 1457 1 >>>>>>>> 7.71e-01 1 2.30e+00 100 >>>>>>>> PCGAMG Gal l03 1 1.0 2.8479e-03 1.0 4.10e+06 1.0 0.0e+00 >>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1440 2266 1 >>>>>>>> 4.44e-01 2 5.51e-01 100 >>>>>>>> PCGAMG Opt l03 1 1.0 8.2684e-04 1.0 2.80e+05 1.0 0.0e+00 >>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 339 1667 1 >>>>>>>> 6.72e-02 1 2.03e-01 100 >>>>>>>> PCGAMG Gal l04 1 1.0 1.2238e-03 1.0 2.09e+05 1.0 0.0e+00 >>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 170 244 1 >>>>>>>> 2.05e-02 2 2.53e-02 100 >>>>>>>> PCGAMG Opt l04 1 1.0 4.1008e-04 1.0 1.77e+04 1.0 0.0e+00 >>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 43 165 1 >>>>>>>> 4.49e-03 1 1.19e-02 100 >>>>>>>> PCSetUp 2 1.0 9.9632e+00 1.0 4.95e+09 1.0 0.0e+00 >>>>>>>> 0.0e+00 0.0e+00 5 12 0 0 0 5 12 0 0 0 496 17826 55 >>>>>>>> 1.03e+03 45 6.54e+02 98 >>>>>>>> PCSetUpOnBlocks 44 1.0 9.9087e-04 1.0 2.88e+03 1.0 >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>> The point of lazy initialization is to make it possible to run a >>>>>>>>> solve that doesn't use a GPU in PETSC_ARCH that supports GPUs, regardless >>>>>>>>> of whether a GPU is actually present. >>>>>>>>> >>>>>>>>> Fande Kong writes: >>>>>>>>> >>>>>>>>> > I spoke too soon. It seems that we have trouble creating >>>>>>>>> cuda/kokkos vecs >>>>>>>>> > now. Got Segmentation fault. >>>>>>>>> > >>>>>>>>> > Thanks, >>>>>>>>> > >>>>>>>>> > Fande >>>>>>>>> > >>>>>>>>> > Program received signal SIGSEGV, Segmentation fault. 
>>>>>>>>> > 0x00002aaab5558b11 in >>>>>>>>> > >>>>>>>>> Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::CUPMDeviceInternal::initialize >>>>>>>>> > (this=0x1) at >>>>>>>>> > >>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:54 >>>>>>>>> > 54 PetscErrorCode >>>>>>>>> CUPMDevice::CUPMDeviceInternal::initialize() noexcept >>>>>>>>> > Missing separate debuginfos, use: debuginfo-install >>>>>>>>> > bzip2-libs-1.0.6-13.el7.x86_64 elfutils-libelf-0.176-5.el7.x86_64 >>>>>>>>> > elfutils-libs-0.176-5.el7.x86_64 glibc-2.17-325.el7_9.x86_64 >>>>>>>>> > libX11-1.6.7-4.el7_9.x86_64 libXau-1.0.8-2.1.el7.x86_64 >>>>>>>>> > libattr-2.4.46-13.el7.x86_64 libcap-2.22-11.el7.x86_64 >>>>>>>>> > libibmad-5.4.0.MLNX20190423.1d917ae-0.1.49224.x86_64 >>>>>>>>> > libibumad-43.1.1.MLNX20200211.078947f-0.1.49224.x86_64 >>>>>>>>> > libibverbs-41mlnx1-OFED.4.9.0.0.7.49224.x86_64 >>>>>>>>> > libmlx4-41mlnx1-OFED.4.7.3.0.3.49224.x86_64 >>>>>>>>> > libmlx5-41mlnx1-OFED.4.9.0.1.2.49224.x86_64 >>>>>>>>> libnl3-3.2.28-4.el7.x86_64 >>>>>>>>> > librdmacm-41mlnx1-OFED.4.7.3.0.6.49224.x86_64 >>>>>>>>> > librxe-41mlnx1-OFED.4.4.2.4.6.49224.x86_64 >>>>>>>>> libxcb-1.13-1.el7.x86_64 >>>>>>>>> > libxml2-2.9.1-6.el7_9.6.x86_64 numactl-libs-2.0.12-5.el7.x86_64 >>>>>>>>> > systemd-libs-219-78.el7_9.3.x86_64 xz-libs-5.2.2-1.el7.x86_64 >>>>>>>>> > zlib-1.2.7-19.el7_9.x86_64 >>>>>>>>> > (gdb) bt >>>>>>>>> > #0 0x00002aaab5558b11 in >>>>>>>>> > >>>>>>>>> Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::CUPMDeviceInternal::initialize >>>>>>>>> > (this=0x1) at >>>>>>>>> > >>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:54 >>>>>>>>> > #1 0x00002aaab5558db7 in >>>>>>>>> > Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::getDevice >>>>>>>>> > (this=this at entry=0x2aaab7f37b70 >>>>>>>>> > , device=0x115da00, id=-35, id at entry=-1) at >>>>>>>>> > >>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:344 >>>>>>>>> > #2 0x00002aaab55577de in PetscDeviceCreate (type=type at entry >>>>>>>>> =PETSC_DEVICE_CUDA, >>>>>>>>> > devid=devid at entry=-1, device=device at entry=0x2aaab7f37b48 >>>>>>>>> > ) at >>>>>>>>> > >>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:107 >>>>>>>>> > #3 0x00002aaab5557b3a in >>>>>>>>> PetscDeviceInitializeDefaultDevice_Internal >>>>>>>>> > (type=type at entry=PETSC_DEVICE_CUDA, >>>>>>>>> defaultDeviceId=defaultDeviceId at entry=-1) >>>>>>>>> > at >>>>>>>>> > >>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:273 >>>>>>>>> > #4 0x00002aaab5557bf6 in PetscDeviceInitialize >>>>>>>>> > (type=type at entry=PETSC_DEVICE_CUDA) >>>>>>>>> > at >>>>>>>>> > >>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:234 >>>>>>>>> > #5 0x00002aaab5661fcd in VecCreate_SeqCUDA (V=0x115d150) at >>>>>>>>> > >>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/impls/seq/seqcuda/veccuda.c:244 >>>>>>>>> > #6 0x00002aaab5649b40 in VecSetType (vec=vec at entry=0x115d150, >>>>>>>>> > method=method at entry=0x2aaab70b45b8 "seqcuda") at >>>>>>>>> > >>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vecreg.c:93 >>>>>>>>> > #7 0x00002aaab579c33f in VecCreate_CUDA (v=0x115d150) at >>>>>>>>> > >>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/impls/mpi/mpicuda/ >>>>>>>>> > mpicuda.cu:214 >>>>>>>>> > #8 
0x00002aaab5649b40 in VecSetType (vec=vec at entry=0x115d150, >>>>>>>>> > method=method at entry=0x7fffffff9260 "cuda") at >>>>>>>>> > >>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vecreg.c:93 >>>>>>>>> > #9 0x00002aaab5648bf1 in VecSetTypeFromOptions_Private >>>>>>>>> (vec=0x115d150, >>>>>>>>> > PetscOptionsObject=0x7fffffff9210) at >>>>>>>>> > >>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vector.c:1263 >>>>>>>>> > #10 VecSetFromOptions (vec=0x115d150) at >>>>>>>>> > >>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vector.c:1297 >>>>>>>>> > #11 0x00002aaab02ef227 in libMesh::PetscVector::init >>>>>>>>> > (this=0x11cd1a0, n=441, n_local=441, fast=false, >>>>>>>>> ptype=libMesh::PARALLEL) >>>>>>>>> > at >>>>>>>>> > >>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/scripts/../libmesh/installed/include/libmesh/petsc_vector.h:693 >>>>>>>>> > >>>>>>>>> > On Thu, Jan 20, 2022 at 1:09 PM Fande Kong >>>>>>>>> wrote: >>>>>>>>> > >>>>>>>>> >> Thanks, Jed, >>>>>>>>> >> >>>>>>>>> >> This worked! >>>>>>>>> >> >>>>>>>>> >> Fande >>>>>>>>> >> >>>>>>>>> >> On Wed, Jan 19, 2022 at 11:03 PM Jed Brown >>>>>>>>> wrote: >>>>>>>>> >> >>>>>>>>> >>> Fande Kong writes: >>>>>>>>> >>> >>>>>>>>> >>> > On Wed, Jan 19, 2022 at 11:39 AM Jacob Faibussowitsch < >>>>>>>>> >>> jacob.fai at gmail.com> >>>>>>>>> >>> > wrote: >>>>>>>>> >>> > >>>>>>>>> >>> >> Are you running on login nodes or compute nodes (I can?t >>>>>>>>> seem to tell >>>>>>>>> >>> from >>>>>>>>> >>> >> the configure.log)? >>>>>>>>> >>> >> >>>>>>>>> >>> > >>>>>>>>> >>> > I was compiling codes on login nodes, and running codes on >>>>>>>>> compute >>>>>>>>> >>> nodes. >>>>>>>>> >>> > Login nodes do not have GPUs, but compute nodes do have GPUs. >>>>>>>>> >>> > >>>>>>>>> >>> > Just to be clear, the same thing (code, machine) with >>>>>>>>> PETSc-3.16.1 >>>>>>>>> >>> worked >>>>>>>>> >>> > perfectly. I have this trouble with PETSc-main. >>>>>>>>> >>> >>>>>>>>> >>> I assume you can >>>>>>>>> >>> >>>>>>>>> >>> export PETSC_OPTIONS='-device_enable lazy' >>>>>>>>> >>> >>>>>>>>> >>> and it'll work. >>>>>>>>> >>> >>>>>>>>> >>> I think this should be the default. The main complaint is that >>>>>>>>> timing the >>>>>>>>> >>> first GPU-using event isn't accurate if it includes >>>>>>>>> initialization, but I >>>>>>>>> >>> think this is mostly hypothetical because you can't trust any >>>>>>>>> timing that >>>>>>>>> >>> doesn't preload in some form and the first GPU-using event >>>>>>>>> will almost >>>>>>>>> >>> always be something uninteresting so I think it will rarely >>>>>>>>> lead to >>>>>>>>> >>> confusion. Meanwhile, eager initialization is viscerally >>>>>>>>> disruptive for >>>>>>>>> >>> lots of people. >>>>>>>>> >>> >>>>>>>>> >> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted before they begin their >>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>> experiments lead. >>>>>>> -- Norbert Wiener >>>>>>> >>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>> >>>>>>> >>>>>>> >>>>>> >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From fdkong.jd at gmail.com Wed Jan 26 12:42:55 2022 From: fdkong.jd at gmail.com (Fande Kong) Date: Wed, 26 Jan 2022 11:42:55 -0700 Subject: [petsc-users] Cannot eagerly initialize cuda, as doing so results in cuda error 35 (cudaErrorInsufficientDriver) : CUDA driver version is insufficient for CUDA runtime version In-Reply-To: References: <233D0F20-AD95-481B-B862-353D5789C556@gmail.com> <87czkn9c64.fsf@jedbrown.org> <875yqe7zip.fsf@jedbrown.org> Message-ID: The make.log generated after removing the stubs directory and -lcuda is attached, in case it might be helpful. I am not aware of the motivation for making the changes in cuda.py. Might I ask to revert that bad commit before we fully understand the issue? Thanks, Fande On Wed, Jan 26, 2022 at 11:25 AM Fande Kong wrote: > I am on the petsc-main > > commit 1390d3a27d88add7d79c9b38bf1a895ae5e67af6 > > Merge: 96c919c d5f3255 > > Author: Satish Balay > > Date: Wed Jan 26 10:28:32 2022 -0600 > > > Merge remote-tracking branch 'origin/release' > > > It is still broken. > > Thanks, > > > Fande > > On Wed, Jan 26, 2022 at 7:40 AM Junchao Zhang > wrote: > >> The good uses the compiler's default library/header path. The bad >> searches from cuda toolkit path and uses rpath linking. >> Though the paths look the same on the login node, they could have >> different behavior on a compute node depending on its environment. >> I think we fixed the issue in cuda.py (i.e., first try the compiler's >> default, then toolkit). That's why I wanted Fande to use petsc/main. >> >> --Junchao Zhang >> >> >> On Tue, Jan 25, 2022 at 11:59 PM Barry Smith wrote: >> >>> >>> bad has extra >>> >>> -L/apps/local/spack/software/gcc-7.5.0/cuda-10.1.243-v4ymjqcrr7f72qfiuzsstuy5jiajbuey/lib64/stubs >>> -lcuda >>> >>> good does not. >>> >>> Try removing the stubs directory and -lcuda from the bad >>> $PETSC_ARCH/lib/petsc/conf/variables and likely the bad will start working. >>> >>> Barry >>> >>> I never liked the stubs stuff. >>> >>> On Jan 25, 2022, at 11:29 PM, Fande Kong wrote: >>> >>> Hi Junchao, >>> >>> I attached a "bad" configure log and a "good" configure log. >>> >>> The "bad" one was produced at 246ba74192519a5f34fb6e227d1c64364e19ce2c >>> >>> and the "good" one at 384645a00975869a1aacbd3169de62ba40cad683 >>> >>> This good hash is the last good hash, just before the >>> bad one. >>> >>> I think you could do a comparison between these two logs, and check >>> what the differences were. >>> >>> Thanks, >>> >>> Fande >>> >>> On Tue, Jan 25, 2022 at 8:21 PM Junchao Zhang >>> wrote: >>> >>>> Fande, could you send the configure.log that works (i.e., before this >>>> offending commit)? >>>> --Junchao Zhang >>>> >>>> >>>> On Tue, Jan 25, 2022 at 8:21 PM Fande Kong wrote: >>>> >>>>> Not sure if this is helpful. 
I did "git bisect", and here was the >>>>> result: >>>>> >>>>> [kongf at sawtooth2 petsc]$ git bisect bad >>>>> 246ba74192519a5f34fb6e227d1c64364e19ce2c is the first bad commit >>>>> commit 246ba74192519a5f34fb6e227d1c64364e19ce2c >>>>> Author: Junchao Zhang >>>>> Date: Wed Oct 13 05:32:43 2021 +0000 >>>>> >>>>> Config: fix CUDA library and header dirs >>>>> >>>>> :040000 040000 187c86055adb80f53c1d0565a8888704fec43a96 >>>>> ea1efd7f594fd5e8df54170bc1bc7b00f35e4d5f M config >>>>> >>>>> >>>>> Started from this commit, and GPU did not work for me on our HPC >>>>> >>>>> Thanks, >>>>> Fande >>>>> >>>>> On Tue, Jan 25, 2022 at 7:18 PM Fande Kong >>>>> wrote: >>>>> >>>>>> >>>>>> >>>>>> On Tue, Jan 25, 2022 at 9:04 AM Jacob Faibussowitsch < >>>>>> jacob.fai at gmail.com> wrote: >>>>>> >>>>>>> Configure should not have an impact here I think. The reason I had >>>>>>> you run `cudaGetDeviceCount()` is because this is the CUDA call (and in >>>>>>> fact the only CUDA call) in the initialization sequence that returns the >>>>>>> error code. There should be no prior CUDA calls. Maybe this is a problem >>>>>>> with oversubscribing GPU?s? In the runs that crash, how many ranks are >>>>>>> using any given GPU at once? Maybe MPS is required. >>>>>>> >>>>>> >>>>>> I used one MPI rank. >>>>>> >>>>>> Fande >>>>>> >>>>>> >>>>>> >>>>>>> >>>>>>> Best regards, >>>>>>> >>>>>>> Jacob Faibussowitsch >>>>>>> (Jacob Fai - booss - oh - vitch) >>>>>>> >>>>>>> On Jan 21, 2022, at 12:01, Fande Kong wrote: >>>>>>> >>>>>>> Thanks Jacob, >>>>>>> >>>>>>> On Thu, Jan 20, 2022 at 6:25 PM Jacob Faibussowitsch < >>>>>>> jacob.fai at gmail.com> wrote: >>>>>>> >>>>>>>> Segfault is caused by the following check at >>>>>>>> src/sys/objects/device/impls/cupm/cupmdevice.cxx:349 being a >>>>>>>> PetscUnlikelyDebug() rather than just PetscUnlikely(): >>>>>>>> >>>>>>>> ``` >>>>>>>> if (PetscUnlikelyDebug(_defaultDevice < 0)) { // _defaultDevice is >>>>>>>> in fact < 0 here and uncaught >>>>>>>> ``` >>>>>>>> >>>>>>>> To clarify: >>>>>>>> >>>>>>>> ?lazy? initialization is not that lazy after all, it still does >>>>>>>> some 50% of the initialization that ?eager? initialization does. It stops >>>>>>>> short initializing the CUDA runtime, checking CUDA aware MPI, gathering >>>>>>>> device data, and initializing cublas and friends. Lazy also importantly >>>>>>>> swallows any errors that crop up during initialization, storing the >>>>>>>> resulting error code for later (specifically _defaultDevice = >>>>>>>> -init_error_value;). >>>>>>>> >>>>>>>> So whether you initialize lazily or eagerly makes no difference >>>>>>>> here, as _defaultDevice will always contain -35. >>>>>>>> >>>>>>>> The bigger question is why cudaGetDeviceCount() is returning >>>>>>>> cudaErrorInsufficientDriver. Can you compile and run >>>>>>>> >>>>>>>> ``` >>>>>>>> #include >>>>>>>> >>>>>>>> int main() >>>>>>>> { >>>>>>>> int ndev; >>>>>>>> return cudaGetDeviceCount(&ndev): >>>>>>>> } >>>>>>>> ``` >>>>>>>> >>>>>>>> Then show the value of "echo $??? >>>>>>>> >>>>>>> >>>>>>> Modify your code a little to get more information. >>>>>>> >>>>>>> #include >>>>>>> #include >>>>>>> >>>>>>> int main() >>>>>>> { >>>>>>> int ndev; >>>>>>> int error = cudaGetDeviceCount(&ndev); >>>>>>> printf("ndev %d \n", ndev); >>>>>>> printf("error %d \n", error); >>>>>>> return 0; >>>>>>> } >>>>>>> >>>>>>> Results: >>>>>>> >>>>>>> $ ./a.out >>>>>>> ndev 4 >>>>>>> error 0 >>>>>>> >>>>>>> >>>>>>> I have not read the PETSc cuda initialization code yet. If I need to >>>>>>> guess at what was happening. 
I will naively think that PETSc did not get >>>>>>> correct GPU information in the configuration because the compiler node does >>>>>>> not have GPUs, and there was no way to get any GPU device information. >>>>>>> >>>>>>> >>>>>>> During the runtime on GPU nodes, PETSc might have incorrect >>>>>>> information grabbed during configuration and had this kind of false error >>>>>>> message. >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Fande >>>>>>> >>>>>>> >>>>>>> >>>>>>>> >>>>>>>> Best regards, >>>>>>>> >>>>>>>> Jacob Faibussowitsch >>>>>>>> (Jacob Fai - booss - oh - vitch) >>>>>>>> >>>>>>>> On Jan 20, 2022, at 17:47, Matthew Knepley >>>>>>>> wrote: >>>>>>>> >>>>>>>> On Thu, Jan 20, 2022 at 6:44 PM Fande Kong >>>>>>>> wrote: >>>>>>>> >>>>>>>>> Thanks, Jed >>>>>>>>> >>>>>>>>> On Thu, Jan 20, 2022 at 4:34 PM Jed Brown >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> You can't create CUDA or Kokkos Vecs if you're running on a node >>>>>>>>>> without a GPU. >>>>>>>>> >>>>>>>>> >>>>>>>>> I am running the code on compute nodes that do have GPUs. >>>>>>>>> >>>>>>>> >>>>>>>> If you are actually running on GPUs, why would you need lazy >>>>>>>> initialization? It would not break with GPUs present. >>>>>>>> >>>>>>>> Matt >>>>>>>> >>>>>>>> >>>>>>>>> With PETSc-3.16.1, I got good speedup by running GAMG on GPUs. >>>>>>>>> That might be a bug of PETSc-main. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> Fande >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> KSPSetUp 13 1.0 6.4400e-01 1.0 2.02e+09 1.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 0 5 0 0 0 0 5 0 0 0 3140 64630 15 >>>>>>>>> 1.05e+02 5 3.49e+01 100 >>>>>>>>> KSPSolve 1 1.0 1.0109e+00 1.0 3.49e+10 1.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 0 87 0 0 0 0 87 0 0 0 34522 69556 4 >>>>>>>>> 4.35e-03 1 2.38e-03 100 >>>>>>>>> KSPGMRESOrthog 142 1.0 1.2674e-01 1.0 1.06e+10 1.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 0 27 0 0 0 0 27 0 0 0 83755 87801 0 >>>>>>>>> 0.00e+00 0 0.00e+00 100 >>>>>>>>> SNESSolve 1 1.0 4.4402e+01 1.0 4.00e+10 1.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 21100 0 0 0 21100 0 0 0 901 51365 57 >>>>>>>>> 1.10e+03 52 8.78e+02 100 >>>>>>>>> SNESSetUp 1 1.0 3.9101e-05 1.0 0.00e+00 0.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>>>> SNESFunctionEval 2 1.0 1.7097e+01 1.0 1.60e+07 1.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 8 0 0 0 0 8 0 0 0 0 1 0 0 >>>>>>>>> 0.00e+00 6 1.92e+02 0 >>>>>>>>> SNESJacobianEval 1 1.0 1.6213e+01 1.0 2.80e+07 1.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 8 0 0 0 0 8 0 0 0 0 2 0 0 >>>>>>>>> 0.00e+00 1 3.20e+01 0 >>>>>>>>> SNESLineSearch 1 1.0 8.5582e+00 1.0 1.24e+08 1.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 4 0 0 0 0 4 0 0 0 0 14 64153 1 >>>>>>>>> 3.20e+01 3 9.61e+01 94 >>>>>>>>> PCGAMGGraph_AGG 5 1.0 3.0509e+00 1.0 8.19e+07 1.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 27 0 5 >>>>>>>>> 3.49e+01 9 7.43e+01 0 >>>>>>>>> PCGAMGCoarse_AGG 5 1.0 3.8711e+00 1.0 0.00e+00 0.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 0 0 >>>>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>>>> PCGAMGProl_AGG 5 1.0 7.0748e-01 1.0 0.00e+00 0.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>>>> PCGAMGPOpt_AGG 5 1.0 1.2904e+00 1.0 2.14e+09 1.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 1 5 0 0 0 1 5 0 0 0 1661 29807 26 >>>>>>>>> 7.15e+02 20 2.90e+02 99 >>>>>>>>> GAMG: createProl 5 1.0 8.9489e+00 1.0 2.22e+09 1.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 4 6 0 0 0 4 6 0 0 0 249 29666 31 >>>>>>>>> 7.50e+02 29 3.64e+02 96 >>>>>>>>> Graph 10 1.0 3.0478e+00 1.0 8.19e+07 1.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 1 0 0 0 0 
1 0 0 0 0 27 0 5 >>>>>>>>> 3.49e+01 9 7.43e+01 0 >>>>>>>>> MIS/Agg 5 1.0 4.1290e-01 1.0 0.00e+00 0.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>>>> SA: col data 5 1.0 1.9127e-02 1.0 0.00e+00 0.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>>>> SA: frmProl0 5 1.0 6.2662e-01 1.0 0.00e+00 0.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>>>> SA: smooth 5 1.0 4.9595e-01 1.0 1.21e+08 1.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 244 2709 15 >>>>>>>>> 1.97e+02 15 2.55e+02 90 >>>>>>>>> GAMG: partLevel 5 1.0 4.7330e-01 1.0 6.98e+08 1.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 1475 4120 5 >>>>>>>>> 1.78e+02 10 2.55e+02 100 >>>>>>>>> PCGAMG Squ l00 1 1.0 2.6027e+00 1.0 0.00e+00 0.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 0 0 >>>>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>>>> PCGAMG Gal l00 1 1.0 3.8406e-01 1.0 5.48e+08 1.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 1426 4270 1 >>>>>>>>> 1.48e+02 2 2.11e+02 100 >>>>>>>>> PCGAMG Opt l00 1 1.0 2.4932e-01 1.0 7.20e+07 1.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 289 2653 1 >>>>>>>>> 6.41e+01 1 1.13e+02 100 >>>>>>>>> PCGAMG Gal l01 1 1.0 6.6279e-02 1.0 1.09e+08 1.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1645 3851 1 >>>>>>>>> 2.40e+01 2 3.64e+01 100 >>>>>>>>> PCGAMG Opt l01 1 1.0 2.9544e-02 1.0 7.15e+06 1.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 242 1671 1 >>>>>>>>> 4.84e+00 1 1.23e+01 100 >>>>>>>>> PCGAMG Gal l02 1 1.0 1.8874e-02 1.0 3.72e+07 1.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1974 3636 1 >>>>>>>>> 5.04e+00 2 6.58e+00 100 >>>>>>>>> PCGAMG Opt l02 1 1.0 7.4353e-03 1.0 2.40e+06 1.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 323 1457 1 >>>>>>>>> 7.71e-01 1 2.30e+00 100 >>>>>>>>> PCGAMG Gal l03 1 1.0 2.8479e-03 1.0 4.10e+06 1.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1440 2266 1 >>>>>>>>> 4.44e-01 2 5.51e-01 100 >>>>>>>>> PCGAMG Opt l03 1 1.0 8.2684e-04 1.0 2.80e+05 1.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 339 1667 1 >>>>>>>>> 6.72e-02 1 2.03e-01 100 >>>>>>>>> PCGAMG Gal l04 1 1.0 1.2238e-03 1.0 2.09e+05 1.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 170 244 1 >>>>>>>>> 2.05e-02 2 2.53e-02 100 >>>>>>>>> PCGAMG Opt l04 1 1.0 4.1008e-04 1.0 1.77e+04 1.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 43 165 1 >>>>>>>>> 4.49e-03 1 1.19e-02 100 >>>>>>>>> PCSetUp 2 1.0 9.9632e+00 1.0 4.95e+09 1.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 5 12 0 0 0 5 12 0 0 0 496 17826 55 >>>>>>>>> 1.03e+03 45 6.54e+02 98 >>>>>>>>> PCSetUpOnBlocks 44 1.0 9.9087e-04 1.0 2.88e+03 1.0 >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>>> The point of lazy initialization is to make it possible to run a >>>>>>>>>> solve that doesn't use a GPU in PETSC_ARCH that supports GPUs, regardless >>>>>>>>>> of whether a GPU is actually present. >>>>>>>>>> >>>>>>>>>> Fande Kong writes: >>>>>>>>>> >>>>>>>>>> > I spoke too soon. It seems that we have trouble creating >>>>>>>>>> cuda/kokkos vecs >>>>>>>>>> > now. Got Segmentation fault. >>>>>>>>>> > >>>>>>>>>> > Thanks, >>>>>>>>>> > >>>>>>>>>> > Fande >>>>>>>>>> > >>>>>>>>>> > Program received signal SIGSEGV, Segmentation fault. 
>>>>>>>>>> > 0x00002aaab5558b11 in >>>>>>>>>> > >>>>>>>>>> Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::CUPMDeviceInternal::initialize >>>>>>>>>> > (this=0x1) at >>>>>>>>>> > >>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:54 >>>>>>>>>> > 54 PetscErrorCode >>>>>>>>>> CUPMDevice::CUPMDeviceInternal::initialize() noexcept >>>>>>>>>> > Missing separate debuginfos, use: debuginfo-install >>>>>>>>>> > bzip2-libs-1.0.6-13.el7.x86_64 >>>>>>>>>> elfutils-libelf-0.176-5.el7.x86_64 >>>>>>>>>> > elfutils-libs-0.176-5.el7.x86_64 glibc-2.17-325.el7_9.x86_64 >>>>>>>>>> > libX11-1.6.7-4.el7_9.x86_64 libXau-1.0.8-2.1.el7.x86_64 >>>>>>>>>> > libattr-2.4.46-13.el7.x86_64 libcap-2.22-11.el7.x86_64 >>>>>>>>>> > libibmad-5.4.0.MLNX20190423.1d917ae-0.1.49224.x86_64 >>>>>>>>>> > libibumad-43.1.1.MLNX20200211.078947f-0.1.49224.x86_64 >>>>>>>>>> > libibverbs-41mlnx1-OFED.4.9.0.0.7.49224.x86_64 >>>>>>>>>> > libmlx4-41mlnx1-OFED.4.7.3.0.3.49224.x86_64 >>>>>>>>>> > libmlx5-41mlnx1-OFED.4.9.0.1.2.49224.x86_64 >>>>>>>>>> libnl3-3.2.28-4.el7.x86_64 >>>>>>>>>> > librdmacm-41mlnx1-OFED.4.7.3.0.6.49224.x86_64 >>>>>>>>>> > librxe-41mlnx1-OFED.4.4.2.4.6.49224.x86_64 >>>>>>>>>> libxcb-1.13-1.el7.x86_64 >>>>>>>>>> > libxml2-2.9.1-6.el7_9.6.x86_64 numactl-libs-2.0.12-5.el7.x86_64 >>>>>>>>>> > systemd-libs-219-78.el7_9.3.x86_64 xz-libs-5.2.2-1.el7.x86_64 >>>>>>>>>> > zlib-1.2.7-19.el7_9.x86_64 >>>>>>>>>> > (gdb) bt >>>>>>>>>> > #0 0x00002aaab5558b11 in >>>>>>>>>> > >>>>>>>>>> Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::CUPMDeviceInternal::initialize >>>>>>>>>> > (this=0x1) at >>>>>>>>>> > >>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:54 >>>>>>>>>> > #1 0x00002aaab5558db7 in >>>>>>>>>> > Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::getDevice >>>>>>>>>> > (this=this at entry=0x2aaab7f37b70 >>>>>>>>>> > , device=0x115da00, id=-35, id at entry=-1) at >>>>>>>>>> > >>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:344 >>>>>>>>>> > #2 0x00002aaab55577de in PetscDeviceCreate (type=type at entry >>>>>>>>>> =PETSC_DEVICE_CUDA, >>>>>>>>>> > devid=devid at entry=-1, device=device at entry=0x2aaab7f37b48 >>>>>>>>>> > ) at >>>>>>>>>> > >>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:107 >>>>>>>>>> > #3 0x00002aaab5557b3a in >>>>>>>>>> PetscDeviceInitializeDefaultDevice_Internal >>>>>>>>>> > (type=type at entry=PETSC_DEVICE_CUDA, >>>>>>>>>> defaultDeviceId=defaultDeviceId at entry=-1) >>>>>>>>>> > at >>>>>>>>>> > >>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:273 >>>>>>>>>> > #4 0x00002aaab5557bf6 in PetscDeviceInitialize >>>>>>>>>> > (type=type at entry=PETSC_DEVICE_CUDA) >>>>>>>>>> > at >>>>>>>>>> > >>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:234 >>>>>>>>>> > #5 0x00002aaab5661fcd in VecCreate_SeqCUDA (V=0x115d150) at >>>>>>>>>> > >>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/impls/seq/seqcuda/veccuda.c:244 >>>>>>>>>> > #6 0x00002aaab5649b40 in VecSetType (vec=vec at entry=0x115d150, >>>>>>>>>> > method=method at entry=0x2aaab70b45b8 "seqcuda") at >>>>>>>>>> > >>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vecreg.c:93 >>>>>>>>>> > #7 0x00002aaab579c33f in VecCreate_CUDA (v=0x115d150) at >>>>>>>>>> > >>>>>>>>>> 
/home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/impls/mpi/mpicuda/ >>>>>>>>>> > mpicuda.cu:214 >>>>>>>>>> > #8 0x00002aaab5649b40 in VecSetType (vec=vec at entry=0x115d150, >>>>>>>>>> > method=method at entry=0x7fffffff9260 "cuda") at >>>>>>>>>> > >>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vecreg.c:93 >>>>>>>>>> > #9 0x00002aaab5648bf1 in VecSetTypeFromOptions_Private >>>>>>>>>> (vec=0x115d150, >>>>>>>>>> > PetscOptionsObject=0x7fffffff9210) at >>>>>>>>>> > >>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vector.c:1263 >>>>>>>>>> > #10 VecSetFromOptions (vec=0x115d150) at >>>>>>>>>> > >>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vector.c:1297 >>>>>>>>>> > #11 0x00002aaab02ef227 in libMesh::PetscVector::init >>>>>>>>>> > (this=0x11cd1a0, n=441, n_local=441, fast=false, >>>>>>>>>> ptype=libMesh::PARALLEL) >>>>>>>>>> > at >>>>>>>>>> > >>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/scripts/../libmesh/installed/include/libmesh/petsc_vector.h:693 >>>>>>>>>> > >>>>>>>>>> > On Thu, Jan 20, 2022 at 1:09 PM Fande Kong >>>>>>>>>> wrote: >>>>>>>>>> > >>>>>>>>>> >> Thanks, Jed, >>>>>>>>>> >> >>>>>>>>>> >> This worked! >>>>>>>>>> >> >>>>>>>>>> >> Fande >>>>>>>>>> >> >>>>>>>>>> >> On Wed, Jan 19, 2022 at 11:03 PM Jed Brown >>>>>>>>>> wrote: >>>>>>>>>> >> >>>>>>>>>> >>> Fande Kong writes: >>>>>>>>>> >>> >>>>>>>>>> >>> > On Wed, Jan 19, 2022 at 11:39 AM Jacob Faibussowitsch < >>>>>>>>>> >>> jacob.fai at gmail.com> >>>>>>>>>> >>> > wrote: >>>>>>>>>> >>> > >>>>>>>>>> >>> >> Are you running on login nodes or compute nodes (I can?t >>>>>>>>>> seem to tell >>>>>>>>>> >>> from >>>>>>>>>> >>> >> the configure.log)? >>>>>>>>>> >>> >> >>>>>>>>>> >>> > >>>>>>>>>> >>> > I was compiling codes on login nodes, and running codes on >>>>>>>>>> compute >>>>>>>>>> >>> nodes. >>>>>>>>>> >>> > Login nodes do not have GPUs, but compute nodes do have >>>>>>>>>> GPUs. >>>>>>>>>> >>> > >>>>>>>>>> >>> > Just to be clear, the same thing (code, machine) with >>>>>>>>>> PETSc-3.16.1 >>>>>>>>>> >>> worked >>>>>>>>>> >>> > perfectly. I have this trouble with PETSc-main. >>>>>>>>>> >>> >>>>>>>>>> >>> I assume you can >>>>>>>>>> >>> >>>>>>>>>> >>> export PETSC_OPTIONS='-device_enable lazy' >>>>>>>>>> >>> >>>>>>>>>> >>> and it'll work. >>>>>>>>>> >>> >>>>>>>>>> >>> I think this should be the default. The main complaint is >>>>>>>>>> that timing the >>>>>>>>>> >>> first GPU-using event isn't accurate if it includes >>>>>>>>>> initialization, but I >>>>>>>>>> >>> think this is mostly hypothetical because you can't trust any >>>>>>>>>> timing that >>>>>>>>>> >>> doesn't preload in some form and the first GPU-using event >>>>>>>>>> will almost >>>>>>>>>> >>> always be something uninteresting so I think it will rarely >>>>>>>>>> lead to >>>>>>>>>> >>> confusion. Meanwhile, eager initialization is viscerally >>>>>>>>>> disruptive for >>>>>>>>>> >>> lots of people. >>>>>>>>>> >>> >>>>>>>>>> >> >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> What most experimenters take for granted before they begin their >>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>> experiments lead. >>>>>>>> -- Norbert Wiener >>>>>>>> >>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> >>> >>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: make.log Type: application/octet-stream Size: 110081 bytes Desc: not available URL: From junchao.zhang at gmail.com Wed Jan 26 12:49:26 2022 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Wed, 26 Jan 2022 12:49:26 -0600 Subject: [petsc-users] Cannot eagerly initialize cuda, as doing so results in cuda error 35 (cudaErrorInsufficientDriver) : CUDA driver version is insufficient for CUDA runtime version In-Reply-To: References: <233D0F20-AD95-481B-B862-353D5789C556@gmail.com> <87czkn9c64.fsf@jedbrown.org> <875yqe7zip.fsf@jedbrown.org> Message-ID: Do you have the configure.log with main? --Junchao Zhang On Wed, Jan 26, 2022 at 12:26 PM Fande Kong wrote: > I am on the petsc-main > > commit 1390d3a27d88add7d79c9b38bf1a895ae5e67af6 > > Merge: 96c919c d5f3255 > > Author: Satish Balay > > Date: Wed Jan 26 10:28:32 2022 -0600 > > > Merge remote-tracking branch 'origin/release' > > > It is still broken. > > Thanks, > > > Fande > > On Wed, Jan 26, 2022 at 7:40 AM Junchao Zhang > wrote: > >> The good uses the compiler's default library/header path. The bad >> searches from cuda toolkit path and uses rpath linking. >> Though the paths look the same on the login node, they could have >> different behavior on a compute node depending on its environment. >> I think we fixed the issue in cuda.py (i.e., first try the compiler's >> default, then toolkit). That's why I wanted Fande to use petsc/main. >> >> --Junchao Zhang >> >> >> On Tue, Jan 25, 2022 at 11:59 PM Barry Smith wrote: >> >>> >>> bad has extra >>> >>> -L/apps/local/spack/software/gcc-7.5.0/cuda-10.1.243-v4ymjqcrr7f72qfiuzsstuy5jiajbuey/lib64/stubs >>> -lcuda >>> >>> good does not. >>> >>> Try removing the stubs directory and -lcuda from the bad >>> $PETSC_ARCH/lib/petsc/conf/variables and likely the bad will start working. >>> >>> Barry >>> >>> I never liked the stubs stuff. >>> >>> On Jan 25, 2022, at 11:29 PM, Fande Kong wrote: >>> >>> Hi Junchao, >>> >>> I attached a "bad" configure log and a "good" configure log. >>> >>> The "bad" one was on produced at 246ba74192519a5f34fb6e227d1c64364e19ce2c >>> >>> and the "good" one at 384645a00975869a1aacbd3169de62ba40cad683 >>> >>> This good hash is the last good hash that is just the right before the >>> bad one. >>> >>> I think you could do a comparison between these two logs, and check >>> what the differences were. >>> >>> Thanks, >>> >>> Fande >>> >>> On Tue, Jan 25, 2022 at 8:21 PM Junchao Zhang >>> wrote: >>> >>>> Fande, could you send the configure.log that works (i.e., before this >>>> offending commit)? >>>> --Junchao Zhang >>>> >>>> >>>> On Tue, Jan 25, 2022 at 8:21 PM Fande Kong wrote: >>>> >>>>> Not sure if this is helpful. I did "git bisect", and here was the >>>>> result: >>>>> >>>>> [kongf at sawtooth2 petsc]$ git bisect bad >>>>> 246ba74192519a5f34fb6e227d1c64364e19ce2c is the first bad commit >>>>> commit 246ba74192519a5f34fb6e227d1c64364e19ce2c >>>>> Author: Junchao Zhang >>>>> Date: Wed Oct 13 05:32:43 2021 +0000 >>>>> >>>>> Config: fix CUDA library and header dirs >>>>> >>>>> :040000 040000 187c86055adb80f53c1d0565a8888704fec43a96 >>>>> ea1efd7f594fd5e8df54170bc1bc7b00f35e4d5f M config >>>>> >>>>> >>>>> Started from this commit, and GPU did not work for me on our HPC >>>>> >>>>> Thanks, >>>>> Fande >>>>> >>>>> On Tue, Jan 25, 2022 at 7:18 PM Fande Kong >>>>> wrote: >>>>> >>>>>> >>>>>> >>>>>> On Tue, Jan 25, 2022 at 9:04 AM Jacob Faibussowitsch < >>>>>> jacob.fai at gmail.com> wrote: >>>>>> >>>>>>> Configure should not have an impact here I think. 
The reason I had >>>>>>> you run `cudaGetDeviceCount()` is because this is the CUDA call (and in >>>>>>> fact the only CUDA call) in the initialization sequence that returns the >>>>>>> error code. There should be no prior CUDA calls. Maybe this is a problem >>>>>>> with oversubscribing GPU?s? In the runs that crash, how many ranks are >>>>>>> using any given GPU at once? Maybe MPS is required. >>>>>>> >>>>>> >>>>>> I used one MPI rank. >>>>>> >>>>>> Fande >>>>>> >>>>>> >>>>>> >>>>>>> >>>>>>> Best regards, >>>>>>> >>>>>>> Jacob Faibussowitsch >>>>>>> (Jacob Fai - booss - oh - vitch) >>>>>>> >>>>>>> On Jan 21, 2022, at 12:01, Fande Kong wrote: >>>>>>> >>>>>>> Thanks Jacob, >>>>>>> >>>>>>> On Thu, Jan 20, 2022 at 6:25 PM Jacob Faibussowitsch < >>>>>>> jacob.fai at gmail.com> wrote: >>>>>>> >>>>>>>> Segfault is caused by the following check at >>>>>>>> src/sys/objects/device/impls/cupm/cupmdevice.cxx:349 being a >>>>>>>> PetscUnlikelyDebug() rather than just PetscUnlikely(): >>>>>>>> >>>>>>>> ``` >>>>>>>> if (PetscUnlikelyDebug(_defaultDevice < 0)) { // _defaultDevice is >>>>>>>> in fact < 0 here and uncaught >>>>>>>> ``` >>>>>>>> >>>>>>>> To clarify: >>>>>>>> >>>>>>>> ?lazy? initialization is not that lazy after all, it still does >>>>>>>> some 50% of the initialization that ?eager? initialization does. It stops >>>>>>>> short initializing the CUDA runtime, checking CUDA aware MPI, gathering >>>>>>>> device data, and initializing cublas and friends. Lazy also importantly >>>>>>>> swallows any errors that crop up during initialization, storing the >>>>>>>> resulting error code for later (specifically _defaultDevice = >>>>>>>> -init_error_value;). >>>>>>>> >>>>>>>> So whether you initialize lazily or eagerly makes no difference >>>>>>>> here, as _defaultDevice will always contain -35. >>>>>>>> >>>>>>>> The bigger question is why cudaGetDeviceCount() is returning >>>>>>>> cudaErrorInsufficientDriver. Can you compile and run >>>>>>>> >>>>>>>> ``` >>>>>>>> #include >>>>>>>> >>>>>>>> int main() >>>>>>>> { >>>>>>>> int ndev; >>>>>>>> return cudaGetDeviceCount(&ndev): >>>>>>>> } >>>>>>>> ``` >>>>>>>> >>>>>>>> Then show the value of "echo $??? >>>>>>>> >>>>>>> >>>>>>> Modify your code a little to get more information. >>>>>>> >>>>>>> #include >>>>>>> #include >>>>>>> >>>>>>> int main() >>>>>>> { >>>>>>> int ndev; >>>>>>> int error = cudaGetDeviceCount(&ndev); >>>>>>> printf("ndev %d \n", ndev); >>>>>>> printf("error %d \n", error); >>>>>>> return 0; >>>>>>> } >>>>>>> >>>>>>> Results: >>>>>>> >>>>>>> $ ./a.out >>>>>>> ndev 4 >>>>>>> error 0 >>>>>>> >>>>>>> >>>>>>> I have not read the PETSc cuda initialization code yet. If I need to >>>>>>> guess at what was happening. I will naively think that PETSc did not get >>>>>>> correct GPU information in the configuration because the compiler node does >>>>>>> not have GPUs, and there was no way to get any GPU device information. >>>>>>> >>>>>>> >>>>>>> During the runtime on GPU nodes, PETSc might have incorrect >>>>>>> information grabbed during configuration and had this kind of false error >>>>>>> message. 
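A side note on diagnosing error 35 (this sketch is not part of the original exchange): the test program quoted above reports only the device count. The variant below, assuming it is compiled on the compute node with something like `nvcc versioncheck.cu -o versioncheck` (file name arbitrary), also prints the driver and runtime versions, since cudaErrorInsufficientDriver normally means the CUDA runtime the binary was linked against is newer than what the installed kernel driver supports, and comparing the two numbers is the quickest way to confirm that.

```
#include <stdio.h>
#include <cuda_runtime.h>

int main(void)
{
  int         ndev = -1, driverVersion = 0, runtimeVersion = 0;
  cudaError_t err  = cudaGetDeviceCount(&ndev);

  /* error 35 corresponds to cudaErrorInsufficientDriver */
  printf("cudaGetDeviceCount: error %d (%s), ndev %d\n", (int)err, cudaGetErrorString(err), ndev);

  /* highest CUDA version supported by the installed kernel driver */
  cudaDriverGetVersion(&driverVersion);
  /* CUDA runtime version this binary was linked against */
  cudaRuntimeGetVersion(&runtimeVersion);
  printf("driver supports CUDA %d, runtime is CUDA %d\n", driverVersion, runtimeVersion);
  return (int)err;
}
```

If the driver number is smaller than the runtime number, error 35 is expected and is an installation/module issue rather than a PETSc one.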
>>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Fande >>>>>>> >>>>>>> >>>>>>> >>>>>>>> >>>>>>>> Best regards, >>>>>>>> >>>>>>>> Jacob Faibussowitsch >>>>>>>> (Jacob Fai - booss - oh - vitch) >>>>>>>> >>>>>>>> On Jan 20, 2022, at 17:47, Matthew Knepley >>>>>>>> wrote: >>>>>>>> >>>>>>>> On Thu, Jan 20, 2022 at 6:44 PM Fande Kong >>>>>>>> wrote: >>>>>>>> >>>>>>>>> Thanks, Jed >>>>>>>>> >>>>>>>>> On Thu, Jan 20, 2022 at 4:34 PM Jed Brown >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> You can't create CUDA or Kokkos Vecs if you're running on a node >>>>>>>>>> without a GPU. >>>>>>>>> >>>>>>>>> >>>>>>>>> I am running the code on compute nodes that do have GPUs. >>>>>>>>> >>>>>>>> >>>>>>>> If you are actually running on GPUs, why would you need lazy >>>>>>>> initialization? It would not break with GPUs present. >>>>>>>> >>>>>>>> Matt >>>>>>>> >>>>>>>> >>>>>>>>> With PETSc-3.16.1, I got good speedup by running GAMG on GPUs. >>>>>>>>> That might be a bug of PETSc-main. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> Fande >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> KSPSetUp 13 1.0 6.4400e-01 1.0 2.02e+09 1.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 0 5 0 0 0 0 5 0 0 0 3140 64630 15 >>>>>>>>> 1.05e+02 5 3.49e+01 100 >>>>>>>>> KSPSolve 1 1.0 1.0109e+00 1.0 3.49e+10 1.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 0 87 0 0 0 0 87 0 0 0 34522 69556 4 >>>>>>>>> 4.35e-03 1 2.38e-03 100 >>>>>>>>> KSPGMRESOrthog 142 1.0 1.2674e-01 1.0 1.06e+10 1.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 0 27 0 0 0 0 27 0 0 0 83755 87801 0 >>>>>>>>> 0.00e+00 0 0.00e+00 100 >>>>>>>>> SNESSolve 1 1.0 4.4402e+01 1.0 4.00e+10 1.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 21100 0 0 0 21100 0 0 0 901 51365 57 >>>>>>>>> 1.10e+03 52 8.78e+02 100 >>>>>>>>> SNESSetUp 1 1.0 3.9101e-05 1.0 0.00e+00 0.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>>>> SNESFunctionEval 2 1.0 1.7097e+01 1.0 1.60e+07 1.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 8 0 0 0 0 8 0 0 0 0 1 0 0 >>>>>>>>> 0.00e+00 6 1.92e+02 0 >>>>>>>>> SNESJacobianEval 1 1.0 1.6213e+01 1.0 2.80e+07 1.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 8 0 0 0 0 8 0 0 0 0 2 0 0 >>>>>>>>> 0.00e+00 1 3.20e+01 0 >>>>>>>>> SNESLineSearch 1 1.0 8.5582e+00 1.0 1.24e+08 1.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 4 0 0 0 0 4 0 0 0 0 14 64153 1 >>>>>>>>> 3.20e+01 3 9.61e+01 94 >>>>>>>>> PCGAMGGraph_AGG 5 1.0 3.0509e+00 1.0 8.19e+07 1.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 27 0 5 >>>>>>>>> 3.49e+01 9 7.43e+01 0 >>>>>>>>> PCGAMGCoarse_AGG 5 1.0 3.8711e+00 1.0 0.00e+00 0.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 0 0 >>>>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>>>> PCGAMGProl_AGG 5 1.0 7.0748e-01 1.0 0.00e+00 0.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>>>> PCGAMGPOpt_AGG 5 1.0 1.2904e+00 1.0 2.14e+09 1.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 1 5 0 0 0 1 5 0 0 0 1661 29807 26 >>>>>>>>> 7.15e+02 20 2.90e+02 99 >>>>>>>>> GAMG: createProl 5 1.0 8.9489e+00 1.0 2.22e+09 1.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 4 6 0 0 0 4 6 0 0 0 249 29666 31 >>>>>>>>> 7.50e+02 29 3.64e+02 96 >>>>>>>>> Graph 10 1.0 3.0478e+00 1.0 8.19e+07 1.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 27 0 5 >>>>>>>>> 3.49e+01 9 7.43e+01 0 >>>>>>>>> MIS/Agg 5 1.0 4.1290e-01 1.0 0.00e+00 0.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>>>> SA: col data 5 1.0 1.9127e-02 1.0 0.00e+00 0.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>>>> SA: frmProl0 5 1.0 6.2662e-01 1.0 
0.00e+00 0.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>>>> SA: smooth 5 1.0 4.9595e-01 1.0 1.21e+08 1.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 244 2709 15 >>>>>>>>> 1.97e+02 15 2.55e+02 90 >>>>>>>>> GAMG: partLevel 5 1.0 4.7330e-01 1.0 6.98e+08 1.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 1475 4120 5 >>>>>>>>> 1.78e+02 10 2.55e+02 100 >>>>>>>>> PCGAMG Squ l00 1 1.0 2.6027e+00 1.0 0.00e+00 0.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 0 0 >>>>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>>>> PCGAMG Gal l00 1 1.0 3.8406e-01 1.0 5.48e+08 1.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 1426 4270 1 >>>>>>>>> 1.48e+02 2 2.11e+02 100 >>>>>>>>> PCGAMG Opt l00 1 1.0 2.4932e-01 1.0 7.20e+07 1.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 289 2653 1 >>>>>>>>> 6.41e+01 1 1.13e+02 100 >>>>>>>>> PCGAMG Gal l01 1 1.0 6.6279e-02 1.0 1.09e+08 1.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1645 3851 1 >>>>>>>>> 2.40e+01 2 3.64e+01 100 >>>>>>>>> PCGAMG Opt l01 1 1.0 2.9544e-02 1.0 7.15e+06 1.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 242 1671 1 >>>>>>>>> 4.84e+00 1 1.23e+01 100 >>>>>>>>> PCGAMG Gal l02 1 1.0 1.8874e-02 1.0 3.72e+07 1.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1974 3636 1 >>>>>>>>> 5.04e+00 2 6.58e+00 100 >>>>>>>>> PCGAMG Opt l02 1 1.0 7.4353e-03 1.0 2.40e+06 1.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 323 1457 1 >>>>>>>>> 7.71e-01 1 2.30e+00 100 >>>>>>>>> PCGAMG Gal l03 1 1.0 2.8479e-03 1.0 4.10e+06 1.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1440 2266 1 >>>>>>>>> 4.44e-01 2 5.51e-01 100 >>>>>>>>> PCGAMG Opt l03 1 1.0 8.2684e-04 1.0 2.80e+05 1.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 339 1667 1 >>>>>>>>> 6.72e-02 1 2.03e-01 100 >>>>>>>>> PCGAMG Gal l04 1 1.0 1.2238e-03 1.0 2.09e+05 1.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 170 244 1 >>>>>>>>> 2.05e-02 2 2.53e-02 100 >>>>>>>>> PCGAMG Opt l04 1 1.0 4.1008e-04 1.0 1.77e+04 1.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 43 165 1 >>>>>>>>> 4.49e-03 1 1.19e-02 100 >>>>>>>>> PCSetUp 2 1.0 9.9632e+00 1.0 4.95e+09 1.0 0.0e+00 >>>>>>>>> 0.0e+00 0.0e+00 5 12 0 0 0 5 12 0 0 0 496 17826 55 >>>>>>>>> 1.03e+03 45 6.54e+02 98 >>>>>>>>> PCSetUpOnBlocks 44 1.0 9.9087e-04 1.0 2.88e+03 1.0 >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>>> The point of lazy initialization is to make it possible to run a >>>>>>>>>> solve that doesn't use a GPU in PETSC_ARCH that supports GPUs, regardless >>>>>>>>>> of whether a GPU is actually present. >>>>>>>>>> >>>>>>>>>> Fande Kong writes: >>>>>>>>>> >>>>>>>>>> > I spoke too soon. It seems that we have trouble creating >>>>>>>>>> cuda/kokkos vecs >>>>>>>>>> > now. Got Segmentation fault. >>>>>>>>>> > >>>>>>>>>> > Thanks, >>>>>>>>>> > >>>>>>>>>> > Fande >>>>>>>>>> > >>>>>>>>>> > Program received signal SIGSEGV, Segmentation fault. 
>>>>>>>>>> > 0x00002aaab5558b11 in >>>>>>>>>> > >>>>>>>>>> Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::CUPMDeviceInternal::initialize >>>>>>>>>> > (this=0x1) at >>>>>>>>>> > >>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:54 >>>>>>>>>> > 54 PetscErrorCode >>>>>>>>>> CUPMDevice::CUPMDeviceInternal::initialize() noexcept >>>>>>>>>> > Missing separate debuginfos, use: debuginfo-install >>>>>>>>>> > bzip2-libs-1.0.6-13.el7.x86_64 >>>>>>>>>> elfutils-libelf-0.176-5.el7.x86_64 >>>>>>>>>> > elfutils-libs-0.176-5.el7.x86_64 glibc-2.17-325.el7_9.x86_64 >>>>>>>>>> > libX11-1.6.7-4.el7_9.x86_64 libXau-1.0.8-2.1.el7.x86_64 >>>>>>>>>> > libattr-2.4.46-13.el7.x86_64 libcap-2.22-11.el7.x86_64 >>>>>>>>>> > libibmad-5.4.0.MLNX20190423.1d917ae-0.1.49224.x86_64 >>>>>>>>>> > libibumad-43.1.1.MLNX20200211.078947f-0.1.49224.x86_64 >>>>>>>>>> > libibverbs-41mlnx1-OFED.4.9.0.0.7.49224.x86_64 >>>>>>>>>> > libmlx4-41mlnx1-OFED.4.7.3.0.3.49224.x86_64 >>>>>>>>>> > libmlx5-41mlnx1-OFED.4.9.0.1.2.49224.x86_64 >>>>>>>>>> libnl3-3.2.28-4.el7.x86_64 >>>>>>>>>> > librdmacm-41mlnx1-OFED.4.7.3.0.6.49224.x86_64 >>>>>>>>>> > librxe-41mlnx1-OFED.4.4.2.4.6.49224.x86_64 >>>>>>>>>> libxcb-1.13-1.el7.x86_64 >>>>>>>>>> > libxml2-2.9.1-6.el7_9.6.x86_64 numactl-libs-2.0.12-5.el7.x86_64 >>>>>>>>>> > systemd-libs-219-78.el7_9.3.x86_64 xz-libs-5.2.2-1.el7.x86_64 >>>>>>>>>> > zlib-1.2.7-19.el7_9.x86_64 >>>>>>>>>> > (gdb) bt >>>>>>>>>> > #0 0x00002aaab5558b11 in >>>>>>>>>> > >>>>>>>>>> Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::CUPMDeviceInternal::initialize >>>>>>>>>> > (this=0x1) at >>>>>>>>>> > >>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:54 >>>>>>>>>> > #1 0x00002aaab5558db7 in >>>>>>>>>> > Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::getDevice >>>>>>>>>> > (this=this at entry=0x2aaab7f37b70 >>>>>>>>>> > , device=0x115da00, id=-35, id at entry=-1) at >>>>>>>>>> > >>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:344 >>>>>>>>>> > #2 0x00002aaab55577de in PetscDeviceCreate (type=type at entry >>>>>>>>>> =PETSC_DEVICE_CUDA, >>>>>>>>>> > devid=devid at entry=-1, device=device at entry=0x2aaab7f37b48 >>>>>>>>>> > ) at >>>>>>>>>> > >>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:107 >>>>>>>>>> > #3 0x00002aaab5557b3a in >>>>>>>>>> PetscDeviceInitializeDefaultDevice_Internal >>>>>>>>>> > (type=type at entry=PETSC_DEVICE_CUDA, >>>>>>>>>> defaultDeviceId=defaultDeviceId at entry=-1) >>>>>>>>>> > at >>>>>>>>>> > >>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:273 >>>>>>>>>> > #4 0x00002aaab5557bf6 in PetscDeviceInitialize >>>>>>>>>> > (type=type at entry=PETSC_DEVICE_CUDA) >>>>>>>>>> > at >>>>>>>>>> > >>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:234 >>>>>>>>>> > #5 0x00002aaab5661fcd in VecCreate_SeqCUDA (V=0x115d150) at >>>>>>>>>> > >>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/impls/seq/seqcuda/veccuda.c:244 >>>>>>>>>> > #6 0x00002aaab5649b40 in VecSetType (vec=vec at entry=0x115d150, >>>>>>>>>> > method=method at entry=0x2aaab70b45b8 "seqcuda") at >>>>>>>>>> > >>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vecreg.c:93 >>>>>>>>>> > #7 0x00002aaab579c33f in VecCreate_CUDA (v=0x115d150) at >>>>>>>>>> > >>>>>>>>>> 
/home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/impls/mpi/mpicuda/ >>>>>>>>>> > mpicuda.cu:214 >>>>>>>>>> > #8 0x00002aaab5649b40 in VecSetType (vec=vec at entry=0x115d150, >>>>>>>>>> > method=method at entry=0x7fffffff9260 "cuda") at >>>>>>>>>> > >>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vecreg.c:93 >>>>>>>>>> > #9 0x00002aaab5648bf1 in VecSetTypeFromOptions_Private >>>>>>>>>> (vec=0x115d150, >>>>>>>>>> > PetscOptionsObject=0x7fffffff9210) at >>>>>>>>>> > >>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vector.c:1263 >>>>>>>>>> > #10 VecSetFromOptions (vec=0x115d150) at >>>>>>>>>> > >>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vector.c:1297 >>>>>>>>>> > #11 0x00002aaab02ef227 in libMesh::PetscVector::init >>>>>>>>>> > (this=0x11cd1a0, n=441, n_local=441, fast=false, >>>>>>>>>> ptype=libMesh::PARALLEL) >>>>>>>>>> > at >>>>>>>>>> > >>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/scripts/../libmesh/installed/include/libmesh/petsc_vector.h:693 >>>>>>>>>> > >>>>>>>>>> > On Thu, Jan 20, 2022 at 1:09 PM Fande Kong >>>>>>>>>> wrote: >>>>>>>>>> > >>>>>>>>>> >> Thanks, Jed, >>>>>>>>>> >> >>>>>>>>>> >> This worked! >>>>>>>>>> >> >>>>>>>>>> >> Fande >>>>>>>>>> >> >>>>>>>>>> >> On Wed, Jan 19, 2022 at 11:03 PM Jed Brown >>>>>>>>>> wrote: >>>>>>>>>> >> >>>>>>>>>> >>> Fande Kong writes: >>>>>>>>>> >>> >>>>>>>>>> >>> > On Wed, Jan 19, 2022 at 11:39 AM Jacob Faibussowitsch < >>>>>>>>>> >>> jacob.fai at gmail.com> >>>>>>>>>> >>> > wrote: >>>>>>>>>> >>> > >>>>>>>>>> >>> >> Are you running on login nodes or compute nodes (I can?t >>>>>>>>>> seem to tell >>>>>>>>>> >>> from >>>>>>>>>> >>> >> the configure.log)? >>>>>>>>>> >>> >> >>>>>>>>>> >>> > >>>>>>>>>> >>> > I was compiling codes on login nodes, and running codes on >>>>>>>>>> compute >>>>>>>>>> >>> nodes. >>>>>>>>>> >>> > Login nodes do not have GPUs, but compute nodes do have >>>>>>>>>> GPUs. >>>>>>>>>> >>> > >>>>>>>>>> >>> > Just to be clear, the same thing (code, machine) with >>>>>>>>>> PETSc-3.16.1 >>>>>>>>>> >>> worked >>>>>>>>>> >>> > perfectly. I have this trouble with PETSc-main. >>>>>>>>>> >>> >>>>>>>>>> >>> I assume you can >>>>>>>>>> >>> >>>>>>>>>> >>> export PETSC_OPTIONS='-device_enable lazy' >>>>>>>>>> >>> >>>>>>>>>> >>> and it'll work. >>>>>>>>>> >>> >>>>>>>>>> >>> I think this should be the default. The main complaint is >>>>>>>>>> that timing the >>>>>>>>>> >>> first GPU-using event isn't accurate if it includes >>>>>>>>>> initialization, but I >>>>>>>>>> >>> think this is mostly hypothetical because you can't trust any >>>>>>>>>> timing that >>>>>>>>>> >>> doesn't preload in some form and the first GPU-using event >>>>>>>>>> will almost >>>>>>>>>> >>> always be something uninteresting so I think it will rarely >>>>>>>>>> lead to >>>>>>>>>> >>> confusion. Meanwhile, eager initialization is viscerally >>>>>>>>>> disruptive for >>>>>>>>>> >>> lots of people. >>>>>>>>>> >>> >>>>>>>>>> >> >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> What most experimenters take for granted before they begin their >>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>> experiments lead. >>>>>>>> -- Norbert Wiener >>>>>>>> >>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> >>> >>> >>> -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From fdkong.jd at gmail.com Wed Jan 26 14:49:45 2022 From: fdkong.jd at gmail.com (Fande Kong) Date: Wed, 26 Jan 2022 13:49:45 -0700 Subject: [petsc-users] Cannot eagerly initialize cuda, as doing so results in cuda error 35 (cudaErrorInsufficientDriver) : CUDA driver version is insufficient for CUDA runtime version In-Reply-To: References: <233D0F20-AD95-481B-B862-353D5789C556@gmail.com> <87czkn9c64.fsf@jedbrown.org> <875yqe7zip.fsf@jedbrown.org> Message-ID: Yes, please see the attached file. Fande On Wed, Jan 26, 2022 at 11:49 AM Junchao Zhang wrote: > Do you have the configure.log with main? > > --Junchao Zhang > > > On Wed, Jan 26, 2022 at 12:26 PM Fande Kong wrote: > >> I am on the petsc-main >> >> commit 1390d3a27d88add7d79c9b38bf1a895ae5e67af6 >> >> Merge: 96c919c d5f3255 >> >> Author: Satish Balay >> >> Date: Wed Jan 26 10:28:32 2022 -0600 >> >> >> Merge remote-tracking branch 'origin/release' >> >> >> It is still broken. >> >> Thanks, >> >> >> Fande >> >> On Wed, Jan 26, 2022 at 7:40 AM Junchao Zhang >> wrote: >> >>> The good uses the compiler's default library/header path. The bad >>> searches from cuda toolkit path and uses rpath linking. >>> Though the paths look the same on the login node, they could have >>> different behavior on a compute node depending on its environment. >>> I think we fixed the issue in cuda.py (i.e., first try the compiler's >>> default, then toolkit). That's why I wanted Fande to use petsc/main. >>> >>> --Junchao Zhang >>> >>> >>> On Tue, Jan 25, 2022 at 11:59 PM Barry Smith wrote: >>> >>>> >>>> bad has extra >>>> >>>> -L/apps/local/spack/software/gcc-7.5.0/cuda-10.1.243-v4ymjqcrr7f72qfiuzsstuy5jiajbuey/lib64/stubs >>>> -lcuda >>>> >>>> good does not. >>>> >>>> Try removing the stubs directory and -lcuda from the bad >>>> $PETSC_ARCH/lib/petsc/conf/variables and likely the bad will start working. >>>> >>>> Barry >>>> >>>> I never liked the stubs stuff. >>>> >>>> On Jan 25, 2022, at 11:29 PM, Fande Kong wrote: >>>> >>>> Hi Junchao, >>>> >>>> I attached a "bad" configure log and a "good" configure log. >>>> >>>> The "bad" one was on produced >>>> at 246ba74192519a5f34fb6e227d1c64364e19ce2c >>>> >>>> and the "good" one at 384645a00975869a1aacbd3169de62ba40cad683 >>>> >>>> This good hash is the last good hash that is just the right before the >>>> bad one. >>>> >>>> I think you could do a comparison between these two logs, and check >>>> what the differences were. >>>> >>>> Thanks, >>>> >>>> Fande >>>> >>>> On Tue, Jan 25, 2022 at 8:21 PM Junchao Zhang >>>> wrote: >>>> >>>>> Fande, could you send the configure.log that works (i.e., before this >>>>> offending commit)? >>>>> --Junchao Zhang >>>>> >>>>> >>>>> On Tue, Jan 25, 2022 at 8:21 PM Fande Kong >>>>> wrote: >>>>> >>>>>> Not sure if this is helpful. 
I did "git bisect", and here was the >>>>>> result: >>>>>> >>>>>> [kongf at sawtooth2 petsc]$ git bisect bad >>>>>> 246ba74192519a5f34fb6e227d1c64364e19ce2c is the first bad commit >>>>>> commit 246ba74192519a5f34fb6e227d1c64364e19ce2c >>>>>> Author: Junchao Zhang >>>>>> Date: Wed Oct 13 05:32:43 2021 +0000 >>>>>> >>>>>> Config: fix CUDA library and header dirs >>>>>> >>>>>> :040000 040000 187c86055adb80f53c1d0565a8888704fec43a96 >>>>>> ea1efd7f594fd5e8df54170bc1bc7b00f35e4d5f M config >>>>>> >>>>>> >>>>>> Started from this commit, and GPU did not work for me on our HPC >>>>>> >>>>>> Thanks, >>>>>> Fande >>>>>> >>>>>> On Tue, Jan 25, 2022 at 7:18 PM Fande Kong >>>>>> wrote: >>>>>> >>>>>>> >>>>>>> >>>>>>> On Tue, Jan 25, 2022 at 9:04 AM Jacob Faibussowitsch < >>>>>>> jacob.fai at gmail.com> wrote: >>>>>>> >>>>>>>> Configure should not have an impact here I think. The reason I had >>>>>>>> you run `cudaGetDeviceCount()` is because this is the CUDA call (and in >>>>>>>> fact the only CUDA call) in the initialization sequence that returns the >>>>>>>> error code. There should be no prior CUDA calls. Maybe this is a problem >>>>>>>> with oversubscribing GPU?s? In the runs that crash, how many ranks are >>>>>>>> using any given GPU at once? Maybe MPS is required. >>>>>>>> >>>>>>> >>>>>>> I used one MPI rank. >>>>>>> >>>>>>> Fande >>>>>>> >>>>>>> >>>>>>> >>>>>>>> >>>>>>>> Best regards, >>>>>>>> >>>>>>>> Jacob Faibussowitsch >>>>>>>> (Jacob Fai - booss - oh - vitch) >>>>>>>> >>>>>>>> On Jan 21, 2022, at 12:01, Fande Kong wrote: >>>>>>>> >>>>>>>> Thanks Jacob, >>>>>>>> >>>>>>>> On Thu, Jan 20, 2022 at 6:25 PM Jacob Faibussowitsch < >>>>>>>> jacob.fai at gmail.com> wrote: >>>>>>>> >>>>>>>>> Segfault is caused by the following check at >>>>>>>>> src/sys/objects/device/impls/cupm/cupmdevice.cxx:349 being a >>>>>>>>> PetscUnlikelyDebug() rather than just PetscUnlikely(): >>>>>>>>> >>>>>>>>> ``` >>>>>>>>> if (PetscUnlikelyDebug(_defaultDevice < 0)) { // _defaultDevice is >>>>>>>>> in fact < 0 here and uncaught >>>>>>>>> ``` >>>>>>>>> >>>>>>>>> To clarify: >>>>>>>>> >>>>>>>>> ?lazy? initialization is not that lazy after all, it still does >>>>>>>>> some 50% of the initialization that ?eager? initialization does. It stops >>>>>>>>> short initializing the CUDA runtime, checking CUDA aware MPI, gathering >>>>>>>>> device data, and initializing cublas and friends. Lazy also importantly >>>>>>>>> swallows any errors that crop up during initialization, storing the >>>>>>>>> resulting error code for later (specifically _defaultDevice = >>>>>>>>> -init_error_value;). >>>>>>>>> >>>>>>>>> So whether you initialize lazily or eagerly makes no difference >>>>>>>>> here, as _defaultDevice will always contain -35. >>>>>>>>> >>>>>>>>> The bigger question is why cudaGetDeviceCount() is returning >>>>>>>>> cudaErrorInsufficientDriver. Can you compile and run >>>>>>>>> >>>>>>>>> ``` >>>>>>>>> #include >>>>>>>>> >>>>>>>>> int main() >>>>>>>>> { >>>>>>>>> int ndev; >>>>>>>>> return cudaGetDeviceCount(&ndev): >>>>>>>>> } >>>>>>>>> ``` >>>>>>>>> >>>>>>>>> Then show the value of "echo $??? >>>>>>>>> >>>>>>>> >>>>>>>> Modify your code a little to get more information. 
>>>>>>>> >>>>>>>> #include >>>>>>>> #include >>>>>>>> >>>>>>>> int main() >>>>>>>> { >>>>>>>> int ndev; >>>>>>>> int error = cudaGetDeviceCount(&ndev); >>>>>>>> printf("ndev %d \n", ndev); >>>>>>>> printf("error %d \n", error); >>>>>>>> return 0; >>>>>>>> } >>>>>>>> >>>>>>>> Results: >>>>>>>> >>>>>>>> $ ./a.out >>>>>>>> ndev 4 >>>>>>>> error 0 >>>>>>>> >>>>>>>> >>>>>>>> I have not read the PETSc cuda initialization code yet. If I need >>>>>>>> to guess at what was happening. I will naively think that PETSc did not get >>>>>>>> correct GPU information in the configuration because the compiler node does >>>>>>>> not have GPUs, and there was no way to get any GPU device information. >>>>>>>> >>>>>>>> >>>>>>>> During the runtime on GPU nodes, PETSc might have incorrect >>>>>>>> information grabbed during configuration and had this kind of false error >>>>>>>> message. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Fande >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>> >>>>>>>>> Best regards, >>>>>>>>> >>>>>>>>> Jacob Faibussowitsch >>>>>>>>> (Jacob Fai - booss - oh - vitch) >>>>>>>>> >>>>>>>>> On Jan 20, 2022, at 17:47, Matthew Knepley >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>> On Thu, Jan 20, 2022 at 6:44 PM Fande Kong >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> Thanks, Jed >>>>>>>>>> >>>>>>>>>> On Thu, Jan 20, 2022 at 4:34 PM Jed Brown >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> You can't create CUDA or Kokkos Vecs if you're running on a node >>>>>>>>>>> without a GPU. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> I am running the code on compute nodes that do have GPUs. >>>>>>>>>> >>>>>>>>> >>>>>>>>> If you are actually running on GPUs, why would you need lazy >>>>>>>>> initialization? It would not break with GPUs present. >>>>>>>>> >>>>>>>>> Matt >>>>>>>>> >>>>>>>>> >>>>>>>>>> With PETSc-3.16.1, I got good speedup by running GAMG on GPUs. >>>>>>>>>> That might be a bug of PETSc-main. 
>>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> >>>>>>>>>> Fande >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> KSPSetUp 13 1.0 6.4400e-01 1.0 2.02e+09 1.0 0.0e+00 >>>>>>>>>> 0.0e+00 0.0e+00 0 5 0 0 0 0 5 0 0 0 3140 64630 15 >>>>>>>>>> 1.05e+02 5 3.49e+01 100 >>>>>>>>>> KSPSolve 1 1.0 1.0109e+00 1.0 3.49e+10 1.0 0.0e+00 >>>>>>>>>> 0.0e+00 0.0e+00 0 87 0 0 0 0 87 0 0 0 34522 69556 4 >>>>>>>>>> 4.35e-03 1 2.38e-03 100 >>>>>>>>>> KSPGMRESOrthog 142 1.0 1.2674e-01 1.0 1.06e+10 1.0 0.0e+00 >>>>>>>>>> 0.0e+00 0.0e+00 0 27 0 0 0 0 27 0 0 0 83755 87801 0 >>>>>>>>>> 0.00e+00 0 0.00e+00 100 >>>>>>>>>> SNESSolve 1 1.0 4.4402e+01 1.0 4.00e+10 1.0 0.0e+00 >>>>>>>>>> 0.0e+00 0.0e+00 21100 0 0 0 21100 0 0 0 901 51365 57 >>>>>>>>>> 1.10e+03 52 8.78e+02 100 >>>>>>>>>> SNESSetUp 1 1.0 3.9101e-05 1.0 0.00e+00 0.0 0.0e+00 >>>>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>>>>> SNESFunctionEval 2 1.0 1.7097e+01 1.0 1.60e+07 1.0 0.0e+00 >>>>>>>>>> 0.0e+00 0.0e+00 8 0 0 0 0 8 0 0 0 0 1 0 0 >>>>>>>>>> 0.00e+00 6 1.92e+02 0 >>>>>>>>>> SNESJacobianEval 1 1.0 1.6213e+01 1.0 2.80e+07 1.0 0.0e+00 >>>>>>>>>> 0.0e+00 0.0e+00 8 0 0 0 0 8 0 0 0 0 2 0 0 >>>>>>>>>> 0.00e+00 1 3.20e+01 0 >>>>>>>>>> SNESLineSearch 1 1.0 8.5582e+00 1.0 1.24e+08 1.0 0.0e+00 >>>>>>>>>> 0.0e+00 0.0e+00 4 0 0 0 0 4 0 0 0 0 14 64153 1 >>>>>>>>>> 3.20e+01 3 9.61e+01 94 >>>>>>>>>> PCGAMGGraph_AGG 5 1.0 3.0509e+00 1.0 8.19e+07 1.0 0.0e+00 >>>>>>>>>> 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 27 0 5 >>>>>>>>>> 3.49e+01 9 7.43e+01 0 >>>>>>>>>> PCGAMGCoarse_AGG 5 1.0 3.8711e+00 1.0 0.00e+00 0.0 0.0e+00 >>>>>>>>>> 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 0 0 >>>>>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>>>>> PCGAMGProl_AGG 5 1.0 7.0748e-01 1.0 0.00e+00 0.0 0.0e+00 >>>>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>>>>> PCGAMGPOpt_AGG 5 1.0 1.2904e+00 1.0 2.14e+09 1.0 0.0e+00 >>>>>>>>>> 0.0e+00 0.0e+00 1 5 0 0 0 1 5 0 0 0 1661 29807 26 >>>>>>>>>> 7.15e+02 20 2.90e+02 99 >>>>>>>>>> GAMG: createProl 5 1.0 8.9489e+00 1.0 2.22e+09 1.0 0.0e+00 >>>>>>>>>> 0.0e+00 0.0e+00 4 6 0 0 0 4 6 0 0 0 249 29666 31 >>>>>>>>>> 7.50e+02 29 3.64e+02 96 >>>>>>>>>> Graph 10 1.0 3.0478e+00 1.0 8.19e+07 1.0 0.0e+00 >>>>>>>>>> 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 27 0 5 >>>>>>>>>> 3.49e+01 9 7.43e+01 0 >>>>>>>>>> MIS/Agg 5 1.0 4.1290e-01 1.0 0.00e+00 0.0 0.0e+00 >>>>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>>>>> SA: col data 5 1.0 1.9127e-02 1.0 0.00e+00 0.0 0.0e+00 >>>>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>>>>> SA: frmProl0 5 1.0 6.2662e-01 1.0 0.00e+00 0.0 0.0e+00 >>>>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>>>>> SA: smooth 5 1.0 4.9595e-01 1.0 1.21e+08 1.0 0.0e+00 >>>>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 244 2709 15 >>>>>>>>>> 1.97e+02 15 2.55e+02 90 >>>>>>>>>> GAMG: partLevel 5 1.0 4.7330e-01 1.0 6.98e+08 1.0 0.0e+00 >>>>>>>>>> 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 1475 4120 5 >>>>>>>>>> 1.78e+02 10 2.55e+02 100 >>>>>>>>>> PCGAMG Squ l00 1 1.0 2.6027e+00 1.0 0.00e+00 0.0 0.0e+00 >>>>>>>>>> 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 0 0 >>>>>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>>>>> PCGAMG Gal l00 1 1.0 3.8406e-01 1.0 5.48e+08 1.0 0.0e+00 >>>>>>>>>> 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 1426 4270 1 >>>>>>>>>> 1.48e+02 2 2.11e+02 100 >>>>>>>>>> PCGAMG Opt l00 1 1.0 2.4932e-01 1.0 7.20e+07 1.0 0.0e+00 >>>>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 289 2653 1 >>>>>>>>>> 6.41e+01 1 1.13e+02 100 
>>>>>>>>>> PCGAMG Gal l01 1 1.0 6.6279e-02 1.0 1.09e+08 1.0 0.0e+00 >>>>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1645 3851 1 >>>>>>>>>> 2.40e+01 2 3.64e+01 100 >>>>>>>>>> PCGAMG Opt l01 1 1.0 2.9544e-02 1.0 7.15e+06 1.0 0.0e+00 >>>>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 242 1671 1 >>>>>>>>>> 4.84e+00 1 1.23e+01 100 >>>>>>>>>> PCGAMG Gal l02 1 1.0 1.8874e-02 1.0 3.72e+07 1.0 0.0e+00 >>>>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1974 3636 1 >>>>>>>>>> 5.04e+00 2 6.58e+00 100 >>>>>>>>>> PCGAMG Opt l02 1 1.0 7.4353e-03 1.0 2.40e+06 1.0 0.0e+00 >>>>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 323 1457 1 >>>>>>>>>> 7.71e-01 1 2.30e+00 100 >>>>>>>>>> PCGAMG Gal l03 1 1.0 2.8479e-03 1.0 4.10e+06 1.0 0.0e+00 >>>>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1440 2266 1 >>>>>>>>>> 4.44e-01 2 5.51e-01 100 >>>>>>>>>> PCGAMG Opt l03 1 1.0 8.2684e-04 1.0 2.80e+05 1.0 0.0e+00 >>>>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 339 1667 1 >>>>>>>>>> 6.72e-02 1 2.03e-01 100 >>>>>>>>>> PCGAMG Gal l04 1 1.0 1.2238e-03 1.0 2.09e+05 1.0 0.0e+00 >>>>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 170 244 1 >>>>>>>>>> 2.05e-02 2 2.53e-02 100 >>>>>>>>>> PCGAMG Opt l04 1 1.0 4.1008e-04 1.0 1.77e+04 1.0 0.0e+00 >>>>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 43 165 1 >>>>>>>>>> 4.49e-03 1 1.19e-02 100 >>>>>>>>>> PCSetUp 2 1.0 9.9632e+00 1.0 4.95e+09 1.0 0.0e+00 >>>>>>>>>> 0.0e+00 0.0e+00 5 12 0 0 0 5 12 0 0 0 496 17826 55 >>>>>>>>>> 1.03e+03 45 6.54e+02 98 >>>>>>>>>> PCSetUpOnBlocks 44 1.0 9.9087e-04 1.0 2.88e+03 1.0 >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> The point of lazy initialization is to make it possible to run a >>>>>>>>>>> solve that doesn't use a GPU in PETSC_ARCH that supports GPUs, regardless >>>>>>>>>>> of whether a GPU is actually present. >>>>>>>>>>> >>>>>>>>>>> Fande Kong writes: >>>>>>>>>>> >>>>>>>>>>> > I spoke too soon. It seems that we have trouble creating >>>>>>>>>>> cuda/kokkos vecs >>>>>>>>>>> > now. Got Segmentation fault. >>>>>>>>>>> > >>>>>>>>>>> > Thanks, >>>>>>>>>>> > >>>>>>>>>>> > Fande >>>>>>>>>>> > >>>>>>>>>>> > Program received signal SIGSEGV, Segmentation fault. 
>>>>>>>>>>> > 0x00002aaab5558b11 in >>>>>>>>>>> > >>>>>>>>>>> Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::CUPMDeviceInternal::initialize >>>>>>>>>>> > (this=0x1) at >>>>>>>>>>> > >>>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:54 >>>>>>>>>>> > 54 PetscErrorCode >>>>>>>>>>> CUPMDevice::CUPMDeviceInternal::initialize() noexcept >>>>>>>>>>> > Missing separate debuginfos, use: debuginfo-install >>>>>>>>>>> > bzip2-libs-1.0.6-13.el7.x86_64 >>>>>>>>>>> elfutils-libelf-0.176-5.el7.x86_64 >>>>>>>>>>> > elfutils-libs-0.176-5.el7.x86_64 glibc-2.17-325.el7_9.x86_64 >>>>>>>>>>> > libX11-1.6.7-4.el7_9.x86_64 libXau-1.0.8-2.1.el7.x86_64 >>>>>>>>>>> > libattr-2.4.46-13.el7.x86_64 libcap-2.22-11.el7.x86_64 >>>>>>>>>>> > libibmad-5.4.0.MLNX20190423.1d917ae-0.1.49224.x86_64 >>>>>>>>>>> > libibumad-43.1.1.MLNX20200211.078947f-0.1.49224.x86_64 >>>>>>>>>>> > libibverbs-41mlnx1-OFED.4.9.0.0.7.49224.x86_64 >>>>>>>>>>> > libmlx4-41mlnx1-OFED.4.7.3.0.3.49224.x86_64 >>>>>>>>>>> > libmlx5-41mlnx1-OFED.4.9.0.1.2.49224.x86_64 >>>>>>>>>>> libnl3-3.2.28-4.el7.x86_64 >>>>>>>>>>> > librdmacm-41mlnx1-OFED.4.7.3.0.6.49224.x86_64 >>>>>>>>>>> > librxe-41mlnx1-OFED.4.4.2.4.6.49224.x86_64 >>>>>>>>>>> libxcb-1.13-1.el7.x86_64 >>>>>>>>>>> > libxml2-2.9.1-6.el7_9.6.x86_64 numactl-libs-2.0.12-5.el7.x86_64 >>>>>>>>>>> > systemd-libs-219-78.el7_9.3.x86_64 xz-libs-5.2.2-1.el7.x86_64 >>>>>>>>>>> > zlib-1.2.7-19.el7_9.x86_64 >>>>>>>>>>> > (gdb) bt >>>>>>>>>>> > #0 0x00002aaab5558b11 in >>>>>>>>>>> > >>>>>>>>>>> Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::CUPMDeviceInternal::initialize >>>>>>>>>>> > (this=0x1) at >>>>>>>>>>> > >>>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:54 >>>>>>>>>>> > #1 0x00002aaab5558db7 in >>>>>>>>>>> > Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::getDevice >>>>>>>>>>> > (this=this at entry=0x2aaab7f37b70 >>>>>>>>>>> > , device=0x115da00, id=-35, id at entry=-1) at >>>>>>>>>>> > >>>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:344 >>>>>>>>>>> > #2 0x00002aaab55577de in PetscDeviceCreate (type=type at entry >>>>>>>>>>> =PETSC_DEVICE_CUDA, >>>>>>>>>>> > devid=devid at entry=-1, device=device at entry=0x2aaab7f37b48 >>>>>>>>>>> > ) at >>>>>>>>>>> > >>>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:107 >>>>>>>>>>> > #3 0x00002aaab5557b3a in >>>>>>>>>>> PetscDeviceInitializeDefaultDevice_Internal >>>>>>>>>>> > (type=type at entry=PETSC_DEVICE_CUDA, >>>>>>>>>>> defaultDeviceId=defaultDeviceId at entry=-1) >>>>>>>>>>> > at >>>>>>>>>>> > >>>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:273 >>>>>>>>>>> > #4 0x00002aaab5557bf6 in PetscDeviceInitialize >>>>>>>>>>> > (type=type at entry=PETSC_DEVICE_CUDA) >>>>>>>>>>> > at >>>>>>>>>>> > >>>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:234 >>>>>>>>>>> > #5 0x00002aaab5661fcd in VecCreate_SeqCUDA (V=0x115d150) at >>>>>>>>>>> > >>>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/impls/seq/seqcuda/veccuda.c:244 >>>>>>>>>>> > #6 0x00002aaab5649b40 in VecSetType (vec=vec at entry=0x115d150, >>>>>>>>>>> > method=method at entry=0x2aaab70b45b8 "seqcuda") at >>>>>>>>>>> > >>>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vecreg.c:93 >>>>>>>>>>> > #7 0x00002aaab579c33f in VecCreate_CUDA (v=0x115d150) at >>>>>>>>>>> 
> >>>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/impls/mpi/mpicuda/ >>>>>>>>>>> > mpicuda.cu:214 >>>>>>>>>>> > #8 0x00002aaab5649b40 in VecSetType (vec=vec at entry=0x115d150, >>>>>>>>>>> > method=method at entry=0x7fffffff9260 "cuda") at >>>>>>>>>>> > >>>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vecreg.c:93 >>>>>>>>>>> > #9 0x00002aaab5648bf1 in VecSetTypeFromOptions_Private >>>>>>>>>>> (vec=0x115d150, >>>>>>>>>>> > PetscOptionsObject=0x7fffffff9210) at >>>>>>>>>>> > >>>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vector.c:1263 >>>>>>>>>>> > #10 VecSetFromOptions (vec=0x115d150) at >>>>>>>>>>> > >>>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vector.c:1297 >>>>>>>>>>> > #11 0x00002aaab02ef227 in libMesh::PetscVector::init >>>>>>>>>>> > (this=0x11cd1a0, n=441, n_local=441, fast=false, >>>>>>>>>>> ptype=libMesh::PARALLEL) >>>>>>>>>>> > at >>>>>>>>>>> > >>>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/scripts/../libmesh/installed/include/libmesh/petsc_vector.h:693 >>>>>>>>>>> > >>>>>>>>>>> > On Thu, Jan 20, 2022 at 1:09 PM Fande Kong < >>>>>>>>>>> fdkong.jd at gmail.com> wrote: >>>>>>>>>>> > >>>>>>>>>>> >> Thanks, Jed, >>>>>>>>>>> >> >>>>>>>>>>> >> This worked! >>>>>>>>>>> >> >>>>>>>>>>> >> Fande >>>>>>>>>>> >> >>>>>>>>>>> >> On Wed, Jan 19, 2022 at 11:03 PM Jed Brown >>>>>>>>>>> wrote: >>>>>>>>>>> >> >>>>>>>>>>> >>> Fande Kong writes: >>>>>>>>>>> >>> >>>>>>>>>>> >>> > On Wed, Jan 19, 2022 at 11:39 AM Jacob Faibussowitsch < >>>>>>>>>>> >>> jacob.fai at gmail.com> >>>>>>>>>>> >>> > wrote: >>>>>>>>>>> >>> > >>>>>>>>>>> >>> >> Are you running on login nodes or compute nodes (I can?t >>>>>>>>>>> seem to tell >>>>>>>>>>> >>> from >>>>>>>>>>> >>> >> the configure.log)? >>>>>>>>>>> >>> >> >>>>>>>>>>> >>> > >>>>>>>>>>> >>> > I was compiling codes on login nodes, and running codes on >>>>>>>>>>> compute >>>>>>>>>>> >>> nodes. >>>>>>>>>>> >>> > Login nodes do not have GPUs, but compute nodes do have >>>>>>>>>>> GPUs. >>>>>>>>>>> >>> > >>>>>>>>>>> >>> > Just to be clear, the same thing (code, machine) with >>>>>>>>>>> PETSc-3.16.1 >>>>>>>>>>> >>> worked >>>>>>>>>>> >>> > perfectly. I have this trouble with PETSc-main. >>>>>>>>>>> >>> >>>>>>>>>>> >>> I assume you can >>>>>>>>>>> >>> >>>>>>>>>>> >>> export PETSC_OPTIONS='-device_enable lazy' >>>>>>>>>>> >>> >>>>>>>>>>> >>> and it'll work. >>>>>>>>>>> >>> >>>>>>>>>>> >>> I think this should be the default. The main complaint is >>>>>>>>>>> that timing the >>>>>>>>>>> >>> first GPU-using event isn't accurate if it includes >>>>>>>>>>> initialization, but I >>>>>>>>>>> >>> think this is mostly hypothetical because you can't trust >>>>>>>>>>> any timing that >>>>>>>>>>> >>> doesn't preload in some form and the first GPU-using event >>>>>>>>>>> will almost >>>>>>>>>>> >>> always be something uninteresting so I think it will rarely >>>>>>>>>>> lead to >>>>>>>>>>> >>> confusion. Meanwhile, eager initialization is viscerally >>>>>>>>>>> disruptive for >>>>>>>>>>> >>> lots of people. >>>>>>>>>>> >>> >>>>>>>>>>> >> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> What most experimenters take for granted before they begin their >>>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>>> experiments lead. 
>>>>>>>>> -- Norbert Wiener >>>>>>>>> >>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>> >>>> >>>> >>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure_main.log Type: application/octet-stream Size: 2451022 bytes Desc: not available URL: From patrick.sanan at gmail.com Thu Jan 27 08:08:21 2022 From: patrick.sanan at gmail.com (Patrick Sanan) Date: Thu, 27 Jan 2022 15:08:21 +0100 Subject: [petsc-users] Finite difference approximation of Jacobian In-Reply-To: References: <231abd15aab544f9850826cb437366f7@lanl.gov> <877db5se57.fsf@jedbrown.org> <87y23lquzl.fsf@jedbrown.org> <249DED57-7AA6-4748-A15A-0B8DDFBC5B85@petsc.dev> Message-ID: Dave convinced me that it's so relatively easy to use MatPreallocator that we might as well do it here even if we're going to do something more general later. Here's a draft of how it looks in 1D - most of the changes in the MR are just putting the guts of the matrix assembly in a separate function so we can call it twice. The part that actually uses the MatPreallocator API is very contained and so doesn't seem like it would be difficult to refactor. https://gitlab.com/petsc/petsc/-/merge_requests/4769 Am Mi., 12. Jan. 2022 um 14:53 Uhr schrieb Patrick Sanan < patrick.sanan at gmail.com>: > Thanks a lot, all! So given that there's still some debate about whether > we should even use MATPREALLOCATOR or a better integration of that hash > logic , as in Issue 852, I'll proceed with simply aping what DMDA does > (with apologies for all this code duplication). > > One thing I had missed, which I just added, is respect for > DMSetMatrixPreallocation() / -dm_preallocate_only . Now, when that's > activated, the previous behavior should be recovered, but by default it'll > assemble a matrix which can be used for coloring, as is the main objective > of this thread. > > Am Mi., 12. Jan. 2022 um 08:53 Uhr schrieb Barry Smith : > >> >> Actually given the Subject of this email "Finite difference >> approximation of Jacobian" what I suggested is A complete story anyways for >> Patrick since the user cannot provide numerical values (of course Patrick >> could write a general new hash matrix type and then use it with zeros but >> that is asking a lot of him). >> >> Regarding Fande's needs. One could rig things so that later one could >> "flip" back the matrix to again use the hasher for setting values when the >> contacts change. >> >> > On Jan 12, 2022, at 2:48 AM, Barry Smith wrote: >> > >> > >> > >> >> On Jan 12, 2022, at 2:22 AM, Jed Brown wrote: >> >> >> >> Because if a user jumps right into MatSetValues() and then wants to >> use the result, it better have the values. It's true that DM preallocation >> normally just "inserts zeros" and thus matches what MATPREALLOCATOR does. >> > >> > I am not saying the user jumps right into MatSetValues(). I am saying >> they do a "value" free assembly to have the preallocation set up and then >> they do the first real assembly; hence very much JUST a refactorization of >> MATPREALLOCATE. So the user is "preallocation-aware" but does not have to >> do anything difficult to determine their pattern analytically. >> > >> > Of course handling the values also is fine and simplifies user code >> and the conceptual ideas (since preallocation ceases to exist as a concept >> for users). 
I have no problem with skipping the "JUST a refactorization of >> MATPREALLOCATE code" but for Patrick's current pressing need I think a >> refactorization of MATPREALLOCATE would be fastest to develop hence I >> suggested it. >> > >> > I did not look at Jed's branch but one way to eliminate the >> preallocation part completely would be to have something like MATHASH and >> then when the first assembly is complete do a MatConvert to AIJ or >> whatever. The conversion could be hidden inside the assemblyend of the >> hash so the user need not be aware that something different was done the >> first time through. Doing it this way would easily allow multiple MATHASH* >> implementations (written by different people) that would compete on speed >> and memory usage. So there could be MatSetType() and a new >> MatSetInitialType() where the initial type would be automatically set to >> some MATHASH if preallocation info is not provided. >> > >> >> >> >> Barry Smith writes: >> >> >> >>> Why does it need to handle values? >> >>> >> >>>> On Jan 12, 2022, at 12:43 AM, Jed Brown wrote: >> >>>> >> >>>> I agree with this and even started a branch jed/mat-hash in (yikes!) >> 2017. I think it should be default if no preallocation functions are >> called. But it isn't exactly the MATPREALLOCATOR code because it needs to >> handle values too. Should not be a lot of code and will essentially remove >> this FAQ and one of the most irritating subtle aspects of new codes using >> PETSc matrices. >> >>>> >> >>>> >> https://petsc.org/release/faq/#assembling-large-sparse-matrices-takes-a-long-time-what-can-i-do-to-make-this-process-faster-or-matsetvalues-is-so-slow-what-can-i-do-to-speed-it-up >> >>>> >> >>>> Barry Smith writes: >> >>>> >> >>>>> I think the MATPREALLOCATOR as a MatType is a cumbersome strange >> thing and would prefer it was just functionality that Mat provided >> directly; for example MatSetOption(mat, preallocator_mode,true); >> matsetvalues,... MatAssemblyBegin/End. Now the matrix has its proper >> nonzero structure of whatever type the user set initially, aij, baij, >> sbaij, .... And the user can now use it efficiently. >> >>>>> >> >>>>> Barry >> >>>>> >> >>>>> So turning on the option just swaps out temporarily the operations >> for MatSetValues and AssemblyBegin/End to be essentially those in >> MATPREALLOCATOR. The refactorization should take almost no time and would >> be faster than trying to rig dmstag to use MATPREALLOCATOR as is. >> >>>>> >> >>>>> >> >>>>>> On Jan 11, 2022, at 9:43 PM, Matthew Knepley >> wrote: >> >>>>>> >> >>>>>> On Tue, Jan 11, 2022 at 12:09 PM Patrick Sanan < >> patrick.sanan at gmail.com > wrote: >> >>>>>> Working on doing this incrementally, in progress here: >> https://gitlab.com/petsc/petsc/-/merge_requests/4712 < >> https://gitlab.com/petsc/petsc/-/merge_requests/4712> >> >>>>>> >> >>>>>> This works in 1D for AIJ matrices, assembling a matrix with a >> maximal number of zero entries as dictated by the stencil width (which is >> intended to be very very close to what DMDA would do if you >> >>>>>> associated all the unknowns with a particular grid point, which is >> the way DMStag largely works under the hood). >> >>>>>> >> >>>>>> Dave, before I get into it, am I correct in my understanding that >> MATPREALLOCATOR would be better here because you would avoid superfluous >> zeros in the sparsity pattern, >> >>>>>> because this routine wouldn't have to assemble the Mat returned by >> DMCreateMatrix()? >> >>>>>> >> >>>>>> Yes, here is how it works. 
You throw in all the nonzeros you come >> across. Preallocator is a hash table that can check for duplicates. At the >> end, it returns the sparsity pattern. >> >>>>>> >> >>>>>> Thanks, >> >>>>>> >> >>>>>> Matt >> >>>>>> >> >>>>>> If this seems like a sane way to go, I will continue to add some >> more tests (in particular periodic BCs not tested yet) and add the code for >> 2D and 3D. >> >>>>>> >> >>>>>> >> >>>>>> >> >>>>>> Am Mo., 13. Dez. 2021 um 20:17 Uhr schrieb Dave May < >> dave.mayhem23 at gmail.com >: >> >>>>>> >> >>>>>> >> >>>>>> On Mon, 13 Dec 2021 at 20:13, Matthew Knepley > > wrote: >> >>>>>> On Mon, Dec 13, 2021 at 1:52 PM Dave May > > wrote: >> >>>>>> On Mon, 13 Dec 2021 at 19:29, Matthew Knepley > > wrote: >> >>>>>> On Mon, Dec 13, 2021 at 1:16 PM Dave May > > wrote: >> >>>>>> >> >>>>>> >> >>>>>> On Sat 11. Dec 2021 at 22:28, Matthew Knepley > > wrote: >> >>>>>> On Sat, Dec 11, 2021 at 1:58 PM Tang, Qi > tangqi at msu.edu>> wrote: >> >>>>>> Hi, >> >>>>>> Does anyone have comment on finite difference coloring with >> DMStag? We are using DMStag and TS to evolve some nonlinear equations >> implicitly. It would be helpful to have the coloring Jacobian option with >> that. >> >>>>>> >> >>>>>> Since DMStag produces the Jacobian connectivity, >> >>>>>> >> >>>>>> This is incorrect. >> >>>>>> The DMCreateMatrix implementation for DMSTAG only sets the number >> of nonzeros (very inaccurately). It does not insert any zero values and >> thus the nonzero structure is actually not defined. >> >>>>>> That is why coloring doesn?t work. >> >>>>>> >> >>>>>> Ah, thanks Dave. >> >>>>>> >> >>>>>> Okay, we should fix that.It is perfectly possible to compute the >> nonzero pattern from the DMStag information. >> >>>>>> >> >>>>>> Agreed. The API for DMSTAG is complete enough to enable one to >> >>>>>> loop over the cells, and for all quantities defined on the cell >> (centre, face, vertex), >> >>>>>> insert values into the appropriate slot in the matrix. >> >>>>>> Combined with MATPREALLOCATOR, I believe a compact and readable >> >>>>>> code should be possible to write for the preallocation (cf DMDA). >> >>>>>> >> >>>>>> I think the only caveat with the approach of using all quantities >> defined on the cell is >> >>>>>> It may slightly over allocate depending on how the user wishes to >> impose the boundary condition, >> >>>>>> or slightly over allocate for says Stokes where there is no >> pressure-pressure coupling term. >> >>>>>> >> >>>>>> Yes, and would not handle higher order stencils.I think the >> overallocating is livable for the first imeplementation. >> >>>>>> >> >>>>>> >> >>>>>> Sure, but neither does DMDA. >> >>>>>> >> >>>>>> The user always has to know what they are doing and set the >> stencil width accordingly. >> >>>>>> I actually had this point listed in my initial email (and the >> stencil growth issue when using FD for nonlinear problems), >> >>>>>> however I deleted it as all the same issue exist in DMDA and no >> one complains (at least not loudly) :D >> >>>>>> >> >>>>>> >> >>>>>> >> >>>>>> >> >>>>>> Thanks, >> >>>>>> >> >>>>>> Matt >> >>>>>> >> >>>>>> Thanks, >> >>>>>> Dave >> >>>>>> >> >>>>>> >> >>>>>> Paging Patrick :) >> >>>>>> >> >>>>>> Thanks, >> >>>>>> >> >>>>>> Matt >> >>>>>> >> >>>>>> Thanks, >> >>>>>> Dave >> >>>>>> >> >>>>>> >> >>>>>> you can use -snes_fd_color_use_mat. It has many options. 
Here is >> an example of us using that: >> >>>>>> >> >>>>>> >> https://gitlab.com/petsc/petsc/-/blob/main/src/snes/tutorials/ex19.c#L898 >> < >> https://gitlab.com/petsc/petsc/-/blob/main/src/snes/tutorials/ex19.c#L898 >> > >> >>>>>> >> >>>>>> Thanks, >> >>>>>> >> >>>>>> Matt >> >>>>>> >> >>>>>> Thanks, >> >>>>>> Qi >> >>>>>> >> >>>>>> >> >>>>>>> On Oct 15, 2021, at 3:07 PM, Jorti, Zakariae via petsc-users < >> petsc-users at mcs.anl.gov > wrote: >> >>>>>>> >> >>>>>>> Hello, >> >>>>>>> >> >>>>>>> Does the Jacobian approximation using coloring and finite >> differencing of the function evaluation work in DMStag? >> >>>>>>> Thank you. >> >>>>>>> Best regards, >> >>>>>>> >> >>>>>>> Zakariae >> >>>>>> >> >>>>>> >> >>>>>> >> >>>>>> -- >> >>>>>> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> >>>>>> -- Norbert Wiener >> >>>>>> >> >>>>>> https://www.cse.buffalo.edu/~knepley/ < >> http://www.cse.buffalo.edu/~knepley/> >> >>>>>> >> >>>>>> >> >>>>>> -- >> >>>>>> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> >>>>>> -- Norbert Wiener >> >>>>>> >> >>>>>> https://www.cse.buffalo.edu/~knepley/ < >> http://www.cse.buffalo.edu/~knepley/> >> >>>>>> >> >>>>>> >> >>>>>> -- >> >>>>>> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> >>>>>> -- Norbert Wiener >> >>>>>> >> >>>>>> https://www.cse.buffalo.edu/~knepley/ < >> http://www.cse.buffalo.edu/~knepley/> >> >>>>>> >> >>>>>> >> >>>>>> -- >> >>>>>> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> >>>>>> -- Norbert Wiener >> >>>>>> >> >>>>>> https://www.cse.buffalo.edu/~knepley/ < >> http://www.cse.buffalo.edu/~knepley/> >> > >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre.seize at onera.fr Thu Jan 27 10:21:53 2022 From: pierre.seize at onera.fr (Pierre Seize) Date: Thu, 27 Jan 2022 17:21:53 +0100 Subject: [petsc-users] Multiple phi-function evaluations (SLEPc) Message-ID: <9b43ba65-309a-c6b2-2d04-7e1a850d9911@onera.fr> Hello PETSc and SLEPc users I read that I can ask questions regarding SLEPc here, so here I go: I'm using SLEPc to compute some phi-functions of a given matrix A. At some point I want to compute phi_0(A)b, phi_1(A)b and phi_1(hA)b where h is a scalar coefficient and b a given vector. If I'm correct, I could reuse some information, as the Krylov decomposition used in those three computation is the same. Only the matrix used in the Pade approximant changes with the scalar coefficient, and phi function with different indices can be evaluated together. What's the most efficient way to get the three desired results with SLEPc ? As always, thank you for your help, other users and the PETSc / SLEPc teams. Pierre Seize From jroman at dsic.upv.es Thu Jan 27 11:03:47 2022 From: jroman at dsic.upv.es (Jose E. 
Roman) Date: Thu, 27 Jan 2022 18:03:47 +0100 Subject: [petsc-users] Multiple phi-function evaluations (SLEPc) In-Reply-To: <9b43ba65-309a-c6b2-2d04-7e1a850d9911@onera.fr> References: <9b43ba65-309a-c6b2-2d04-7e1a850d9911@onera.fr> Message-ID: In a basic Arnoldi implementation, it would be possible to do what you suggest: run Arnoldi once and compute the three approximations from it. But SLEPc's implementation is a restarted method, where the computed Krylov basis is discarded when the maximum dimension is reached. Not sure how this would interact with what you propose. I think it would be feasible, but there could be issues such as each phi_k function converging at different rates. Also, the user interface for this functionality would not fit well with the current interface. I am interested in getting feedback from you related to the performance of the MFN solver with your problem, i.e., convergence and timings. If you want, contact me to my personal email and we can discuss further and see if the multiple phi-function stuff can be accomodated in the current solver. Jose > El 27 ene 2022, a las 17:21, Pierre Seize escribi?: > > Hello PETSc and SLEPc users > > I read that I can ask questions regarding SLEPc here, so here I go: > > I'm using SLEPc to compute some phi-functions of a given matrix A. At some point I want to compute phi_0(A)b, phi_1(A)b and phi_1(hA)b where h is a scalar coefficient and b a given vector. > > If I'm correct, I could reuse some information, as the Krylov decomposition used in those three computation is the same. Only the matrix used in the Pade approximant changes with the scalar coefficient, and phi function with different indices can be evaluated together. What's the most efficient way to get the three desired results with SLEPc ? > > > As always, thank you for your help, other users and the PETSc / SLEPc teams. > > > Pierre Seize > From jacob.fai at gmail.com Fri Jan 28 11:37:37 2022 From: jacob.fai at gmail.com (Jacob Faibussowitsch) Date: Fri, 28 Jan 2022 11:37:37 -0600 Subject: [petsc-users] Strange CUDA failure with a second petscfinalize with PETSc 3.16 In-Reply-To: References: <45585074-8C0E-453B-993B-554D23A1E971@gmail.com> <60E1BD31-E98A-462F-BA2B-2099745B2582@gmail.com> Message-ID: Hi Hao, Strange? these must be internal errors to CUDA, as we don?t call any any of these directly. FYI, CUDA has 2 API?s: 1. The simpler ?runtime? API (which we use) which handles many things like CUDA context management and the like behind the scenes. 2. The more advanced ?driver? API, where one can explicitly control the above. The errors you are seeing are from the driver API. This would indicate to me that the bug may be on NVIDIA?s side; if they were caused on our end I would expect to get CUDA runtime errors. Best regards, Jacob Faibussowitsch (Jacob Fai - booss - oh - vitch) > On Jan 25, 2022, at 21:52, Hao DONG wrote: > > Thanks Jacob, > > Right, I forgot to change the clone -b release to main - my mistake. The c++ dialect option now works without problem. > > My laptop system is indeed configured with WSL2, with Linux kernel of ?5.10.60.1-microsoft-standard-WSL2?. And I have a windows 11 Version 21H2 (OS Build 22000.438) as the host system. The Nvidia driver version is ?510.06? with cuda 11.4. > > Interestingly, my ex11fc can now get pass the second petscfinalize. The code can now get to the third loop of kspsolve and reach the mpifinalize without a problem. So changing to main branch solves the petscfinalize error problem. 
However, it still complains with an error like: > -------------------------------------------------------------------------- > The call to cuEventDestory failed. This is a unrecoverable error and will > cause the program to abort. > cuEventDestory return value: 400 > Check the cuda.h file for what the return value means. > -------------------------------------------------------------------------- > The full log files are also attached. I also noticed there are other event-management related errors like cuEventCreate and cuIpcGetEventHandle in the log. Does it give any insights on why we have the problem? > > Cheers, > Hao > > Sent from Mail for Windows > > From: Jacob Faibussowitsch > Sent: Tuesday, January 25, 2022 11:19 PM > To: Hao DONG > Cc: petsc-users ; Junchao Zhang > Subject: Re: [petsc-users] Strange CUDA failure with a second petscfinalize with PETSc 3.16 > > Hi Hao, > > I have tried to git pull on the laptop from main and re-config as you suggested. > > It looks like you?re still on the release branch. Do > > ``` > $ git checkout main > $ git pull > ``` > > Then reconfigure. This is also why the cxx dialect flag did not work, I forgot that this change had not made it to release yet. > > my laptop setup is based on WSL > > What version of windows do you have? And what version of WSL? And what version is the linux kernel? You will need at least WSL2 and both your NVIDIA driver, windows version, and linux kernel version are required to be fairly new AFAIK to be able to run CUDA on them. See here https://docs.nvidia.com/cuda/wsl-user-guide/index.html . > > To get your windows version: > > 1. Press Windows key+R > 2. Type winver in the box, and press enter > 3. You should see a line with Version and a build number. For example on my windows machine I see ?Version 21H2 (OS Build 19044.1496)? > > To get WSL version: > > 1. Open WSL > 2. Type uname -r, for example I get "5.10.60.1-microsoft-standard-wsl2" > > To get NVIDIA driver version: > > 1. Open up the NVIDIA control panel > 2. Click on ?System Information? in the bottom left corner > 3. You should see a dual list, ?Items? and ?Details?. In the details column. You should see ?Driver verion?. For example on my machine I see ?511.23? > > Best regards, > > Jacob Faibussowitsch > (Jacob Fai - booss - oh - vitch) > > > On Jan 25, 2022, at 03:42, Hao DONG > wrote: > > Hi Jacob, > > Thanks for the comments ? silly that I have overlooked the debugging flag so far. Unfortunately, I am out of office for a couple of days so I cannot confirm the result on my workstation, for now. > > However, I have a laptop with nvidia graphic card (old gtx1050, which is actually slower than the cpu in terms of double precision calculation), I have tried to git pull on the laptop from main and re-config as you suggested. > > However, using ?--with-cxx-dialect=14? throws up an error like: > > Unknown C++ dialect: with-cxx-dialect=14 > > And omitting the ?--with-cuda-dialect=cxx14? also gives me a similar complaint with: > > CUDA Error: Using CUDA with PetscComplex requires a C++ dialect at least cxx11. Use --with-cxx-dialect=xxx and --with-cuda-dialect=xxx to specify a suitable compiler. 
> > Eventually I was able to configure and compile with the following config setup: > > ./configure --prefix=/opt/petsc/debug --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 --with-scalar-type=complex --with-precision=double --with-cuda-dir=/usr/local/cuda --with-fortran-kernels=1 --with-cxx-dialect=cxx14 --with-cuda-dialect=cxx14 --with-debugging=1 > > But still, I got the same error regarding cuda (still the error code 97 thing). I attached the configure and output log of my ex11 on my laptop ? is there anything that can help pinpoint the problem? I can also confirm that PETSc 3.15.2 works well with my ex11fc code with cuda, on my laptop. Sadly, my laptop setup is based on WSL, which is far from an ideal environment to test CUDA. I will let you know once I get my hands on my workstations. > > Cheers, > Hao > > Sent from Mail for Windows > > From: Jacob Faibussowitsch > Sent: Tuesday, January 25, 2022 1:22 AM > To: Hao DONG > Cc: petsc-users ; Junchao Zhang > Subject: Re: [petsc-users] Strange CUDA failure with a second petscfinalize with PETSc 3.16 > > Hi Hao, > > Any luck reproducing the CUDA problem? > > Sorry for the long radio silence, I still have not been able to reproduce the problem unfortunately. I have tried on a local machine, and a few larger clusters and all return without errors both with and without cuda? > > Can you try pulling the latest version of main, reconfiguring and trying again? > > BTW, your configure arguments are a little wonky: > > 1. --with-clanguage=c - this isn?t needed, PETSc will default to C > 2. --with-cxx-dialect=cxx14 --with-cuda-dialect=cxx14 - use --with-cxx-dialect=14 instead, PETSc will detect that you have gnu compilers and enable gnu extensions > 3. -with-debugging=1 - this is missing an extra dash, but you also have optimization flags set so maybe just leave this one out > > Best regards, > > Jacob Faibussowitsch > (Jacob Fai - booss - oh - vitch) > > > > On Jan 23, 2022, at 21:29, Hao DONG > wrote: > > Dear Jacob, > > Any luck reproducing the CUDA problem? - just write to check in, in case somehow the response did not reach me (this happens to my colleagues abroad sometimes, probably due to the Wall). > > All the best, > Hao > > > > On Jan 19, 2022, at 3:01 PM, Hao DONG > wrote: > > ? > Thanks Jacob for looking into this ? > > You can see the updated source code of ex11fc in the attachment ? although there is not much that I modified (except for the jabbers I outputted). I also attached the full output (ex11fc.log) along with the configure.log file. It?s an old dual Xeon workstation (one of my ?production? machines) with Linux kernel 5.4.0 and gcc 9.3. > > I simply ran the code with > > mpiexec -np 2 ex11fc -usecuda > > for GPU test. And as stated before, calling without the ?-usecuda? option shows no errors. > > Please let me know if you find anything wrong with the configure/code. > > Cheers, > Hao > > From: Jacob Faibussowitsch > Sent: Wednesday, January 19, 2022 3:38 AM > To: Hao DONG > Cc: Junchao Zhang ; petsc-users > Subject: Re: [petsc-users] Strange CUDA failure with a second petscfinalize with PETSc 3.16 > > Apologies, forgot to mention in my previous email but can you also include a copy of the full printout of the error message that you get? It will include all the command-line flags that you ran with (if any) so I can exactly mirror your environment. 
> > Best regards, > > Jacob Faibussowitsch > (Jacob Fai - booss - oh - vitch) > > > > > On Jan 18, 2022, at 14:06, Jacob Faibussowitsch > wrote: > > Can you send your updated source file as well as your configure.log (should be $PETSC_DIR/configure.log). I will see if I can reproduce the error on my end. > > Best regards, > > Jacob Faibussowitsch > (Jacob Fai - booss - oh - vitch) > > > > > On Jan 17, 2022, at 23:06, Hao DONG > wrote: > > ? > Dear Junchao and Jacob, > > Thanks a lot for the response ? I also don?t understand why this is related to the device, especially on why the procedure can be successfully finished for *once* ? As instructed, I tried to add a CHKERRA() macro after (almost) every petsc line ? such as the initialization, mat assemble, ksp create, solve, mat destroy, etc. However, all other petsc commands returns with error code 0. It only gives me a similar (still not very informative) error after I call the petscfinalize (again for the second time), with error code 97: > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: GPU error > [0]PETSC ERROR: cuda error 709 (cudaErrorContextIsDestroyed) : context is destroyed > [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.16.3, unknown > [0]PETSC ERROR: ./ex11f on a named stratosphere by donghao Tue Jan 18 11:39:43 2022 > [0]PETSC ERROR: Configure options --prefix=/opt/petsc/complex-double-with-cuda --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 COPTFLAGS="-O3 -mavx2" CXXOPTFLAGS="-O3 -mavx2" FOPTFLAGS="-O3 -ffree-line-length-none -mavx2" CUDAOPTFLAGS=-O3 --with-cxx-dialect=cxx14 --with-cuda-dialect=cxx14 --with-scalar-type=complex --with-precision=double --with-cuda-dir=/usr/local/cuda --with-debugging=1 > [0]PETSC ERROR: #1 PetscFinalize() at /home/donghao/packages/petsc-current/src/sys/objects/pinit.c:1638 > [0]PETSC ERROR: #2 User provided function() at User file:0 > > I can also confirm that rolling back to petsc 3.15 will *not* see the problem, even with the new nvidia driver. And petsc 3.16.3 with an old nvidia driver (470.42) also get this same error. So it?s probably not connected to the nvidia driver. > > Any idea on where I should look at next? > Thanks a lot in advance, and all the best, > Hao > > From: Jacob Faibussowitsch > Sent: Sunday, January 16, 2022 12:12 AM > To: Junchao Zhang > Cc: petsc-users ; Hao DONG > Subject: Re: [petsc-users] Strange CUDA failure with a second petscfinalize with PETSc 3.16 > > I don?t quite understand how it is getting to the CUDA error to be honest. None of the code in the stack trace is anywhere near the device code. Reading the error message carefully, it first chokes on PetscLogGetStageLog() from a call to PetscClassIdRegister(): > > PetscErrorCode PetscLogGetStageLog(PetscStageLog *stageLog) > { > PetscFunctionBegin; > PetscValidPointer(stageLog,1); > if (!petsc_stageLog) { > fprintf(stderr, "PETSC ERROR: Logging has not been enabled.\nYou might have forgotten to call PetscInitialize().\n"); > PETSCABORT(MPI_COMM_WORLD, PETSC_ERR_SUP); // Here > } > ... > > But then jumps to PetscFinalize(). You can also see the "You might have forgotten to call PetscInitialize().? message in the error message, just under the topmost level of the stack trace. > > Can you check the value of ierr of each function call (use the CHKERRA() macro to do so)? 
I suspect the problem here that errors occurring previously in the program are being ignored, leading to the garbled stack trace. > > Best regards, > > Jacob Faibussowitsch > (Jacob Fai - booss - oh - vitch) > > > > > > On Jan 14, 2022, at 20:58, Junchao Zhang > wrote: > > Jacob, > Could you have a look as it seems the "invalid device context" is in your newly added module? > Thanks > --Junchao Zhang > > > On Fri, Jan 14, 2022 at 12:49 AM Hao DONG > wrote: > Dear All, > > I have encountered a peculiar problem when fiddling with a code with PETSC 3.16.3 (which worked fine with PETSc 3.15). It is a very straight forward PDE-based optimization code which repeatedly solves a linearized PDE problem with KSP in a subroutine (the rest of the code does not contain any PETSc related content). The main program provides the subroutine with an MPI comm. Then I set the comm as PETSC_COMM_WORLD to tell PETSC to attach to it (and detach with it when the solving is finished each time). > > Strangely, I observe a CUDA failure whenever the petscfinalize is called for a *second* time. In other words, the first and second PDE calculations with GPU are fine (with correct solutions). The petsc code just fails after the SECOND petscfinalize command is called. You can also see the PETSC config in the error message: > > [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [1]PETSC ERROR: GPU error > [1]PETSC ERROR: cuda error 201 (cudaErrorDeviceUninitialized) : invalid device context > [1]PETSC ERROR: Seehttps://petsc.org/release/faq/ for trouble shooting. > [1]PETSC ERROR: Petsc Release Version 3.16.3, unknown > [1]PETSC ERROR: maxwell.gpu on a named stratosphere by hao Fri Jan 14 10:21:05 2022 > [1]PETSC ERROR: Configure options --prefix=/opt/petsc/complex-double-with-cuda --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 COPTFLAGS="-O3 -mavx2" CXXOPTFLAGS="-O3 -mavx2" FOPTFLAGS="-O3 -ffree-line-length-none -mavx2" CUDAOPTFLAGS=-O3 --with-cxx-dialect=cxx14 --with-cuda-dialect=cxx14 --with-scalar-type=complex --with-precision=double --with-cuda-dir=/usr/local/cuda --with-debugging=1 > [1]PETSC ERROR: #1 PetscFinalize() at /home/hao/packages/petsc-current/src/sys/objects/pinit.c:1638 > You might have forgotten to call PetscInitialize(). > The EXACT line numbers in the error traceback are not available. > Instead the line number of the start of the function is given. > [1] #1 PetscAbortFindSourceFile_Private() at /home/hao/packages/petsc-current/src/sys/error/err.c:35 > [1] #2 PetscLogGetStageLog() at /home/hao/packages/petsc-current/src/sys/logging/utils/stagelog.c:29 > [1] #3 PetscClassIdRegister() at /home/hao/packages/petsc-current/src/sys/logging/plog.c:2376 > [1] #4 MatMFFDInitializePackage() at /home/hao/packages/petsc-current/src/mat/impls/mffd/mffd.c:45 > [1] #5 MatInitializePackage() at /home/hao/packages/petsc-current/src/mat/interface/dlregismat.c:163 > [1] #6 MatCreate() at /home/hao/packages/petsc-current/src/mat/utils/gcreate.c:77 > > However, it doesn?t seem to affect the other part of my code, so the code can continue running until it gets to the petsc part again (the *third* time). Unfortunately, it doesn?t give me any further information even if I set the debugging to yes in the configure file. It also worth noting that PETSC without CUDA (i.e. with simple MATMPIAIJ) works perfectly fine. > > I am able to re-produce the problem with a toy code modified from ex11f. Please see the attached file (ex11fc.F90) for details. 
Essentially the code does the same thing as ex11f, but three times with a do loop. To do that I added an extra MPI_INIT/MPI_FINALIZE to ensure that the MPI communicator is not destroyed when PETSC_FINALIZE is called. I used the PetscOptionsHasName utility to check if you have ?-usecuda? in the options. So running the code with and without that option can give you a comparison w/o CUDA. I can see that the code also fails after the second loop of the KSP operation. Could you kindly shed some lights on this problem? > > I should say that I am not even sure if the problem is from PETSc, as I also accidentally updated the NVIDIA driver (for now it is 510.06 with cuda 11.6). And it is well known that NVIDIA can give you some surprise in the updates (yes, I know I shouldn?t have touched that if it?s not broken). But my CUDA code without PETSC (which basically does the same PDE thing, but with cusparse/cublas directly) seems to work just fine after the update. It is also possible that my petsc code related to CUDA was not quite ?legitimate? ? I just use: > MatSetType(A, MATMPIAIJCUSPARSE, ierr) > and > MatCreateVecs(A, u, PETSC_NULL_VEC, ierr) > to make the data onto GPU. I would very much appreciate it if you could show me the ?right? way to do that. > > Thanks a lot in advance, and all the best, > Hao > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Fri Jan 28 12:07:32 2022 From: mfadams at lbl.gov (Mark Adams) Date: Fri, 28 Jan 2022 13:07:32 -0500 Subject: [petsc-users] Crusher configure problem Message-ID: Crusher has been giving me fits and now I get this error (empty log) 13:01 main *= crusher:/gpfs/alpine/csc314/scratch/adams/petsc$ ../arch-olcf-crusher.py Traceback (most recent call last): File "../arch-olcf-crusher.py", line 55, in configure.petsc_configure(configure_options) AttributeError: module 'configure' has no attribute 'petsc_configure' This is a modified version of the repo configure file. This was working this AM. Any idea what could cause this? Thanks, Mark #!/usr/bin/python3 # Modules loaded by default (on login to Crusher): # # 1) craype-x86-trento 9) cce/13.0.0 # 2) libfabric/1.13.0.0 10) craype/2.7.13 # 3) craype-network-ofi 11) cray-dsmml/0.2.2 # 4) perftools-base/21.12.0 12) cray-libsci/21.08.1.2 # 5) xpmem/2.3.2-2.2_1.16__g9ea452c.shasta 13) PrgEnv-cray/8.2.0 # 6) cray-pmi/6.0.16 14) DefApps/default # 7) cray-pmi-lib/6.0.16 15) rocm/4.5.0 # 8) tmux/3.2a 16) cray-mpich/8.1.12 # # We use Cray Programming Environment, Cray compilers, Cray-mpich. 
# To enable GPU-aware MPI, one has to also set this environment variable # # export MPICH_GPU_SUPPORT_ENABLED=1 # # Additional note: If "craype-accel-amd-gfx90a" module is loaded (that is # needed for "OpenMP offload") - it causes link errors when using 'cc or hipcc' # with fortran objs, hence not used # if __name__ == '__main__': import sys import os sys.path.insert(0, os.path.abspath('config')) import configure configure_options = [ '--with-cc=cc', '--with-cxx=CC', '--with-fc=ftn', '--COPTFLAGS=-g -ggdb', '--CXXOPTFLAGS=-g -ggdb', '--FOPTFLAGS=-g', '--with-fortran-bindings=0', 'LIBS=-L{x}/gtl/lib -lmpi_gtl_hsa'.format(x=os.environ['CRAY_MPICH_ROOTDIR']), '--with-debugging=1', #'--with-64-bit-indices=1', '--with-mpiexec=srun -p batch -N 1 -A csc314_crusher -t 00:10:00', '--with-hip', '--with-hipc=hipcc', '--download-hypre', #'--download-hypre-commit=HEAD', #'--download-hypre-configure-arguments=--enable-unified-memory', #'--with-hypre-gpuarch=gfx90a', '--with-hip-arch=gfx90a', '--download-kokkos', '--download-kokkos-kernels', #'--with-kokkos-kernels-tpl=1', #'--prefix=/gpfs/alpine/world-shared/geo127/petsc/arch-crusher-opt-cray', # /gpfs/alpine/phy122/proj-shared/petsc/current/arch-opt-amd-hypre', ] configure.petsc_configure(configure_options) -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Fri Jan 28 12:15:18 2022 From: mfadams at lbl.gov (Mark Adams) Date: Fri, 28 Jan 2022 13:15:18 -0500 Subject: [petsc-users] Crusher configure problem In-Reply-To: References: Message-ID: Something is very messed up on Crusher. I've never seen this "Cannot allocate memory", but see it for everything: 13:11 1 main= crusher:/gpfs/alpine/csc314/scratch/adams/petsc2$ ll ls: cannot read symbolic link 'configure.log.bkp': Cannot allocate memory ls: cannot read symbolic link 'make.log': Cannot allocate memory ls: cannot read symbolic link 'configure.log': Cannot allocate memory total 25637 drwxr-xr-x 8 adams adams 4096 Jan 20 08:51 arch-olcf-crusher drwxr-xr-x 8 adams adams 4096 Jan 18 20:23 arch-spock-amd drwxrwxr-x 5 adams adams 4096 Jan 26 19:10 arch-summit-dbg-gnu-cuda .... and 13:11 main= crusher:/gpfs/alpine/csc314/scratch/adams/petsc2$ git fetch remote: Enumerating objects: 648, done. remote: Counting objects: 100% (547/547), done. remote: Compressing objects: 100% (238/238), done. remote: Total 648 (delta 379), reused 442 (delta 308), pack-reused 101 Receiving objects: 100% (648/648), 591.28 KiB | 1.38 MiB/s, done. Resolving deltas: 100% (402/402), completed with 114 local objects. error: cannot update the ref 'refs/remotes/origin/main': unable to append to '.git/logs/refs/remotes/origin/main': Cannot allocate memory >From https://gitlab.com/petsc/petsc On Fri, Jan 28, 2022 at 1:07 PM Mark Adams wrote: > Crusher has been giving me fits and now I get this error (empty log) > > 13:01 main *= crusher:/gpfs/alpine/csc314/scratch/adams/petsc$ > ../arch-olcf-crusher.py > Traceback (most recent call last): > File "../arch-olcf-crusher.py", line 55, in > configure.petsc_configure(configure_options) > AttributeError: module 'configure' has no attribute 'petsc_configure' > > This is a modified version of the repo configure file. This was working > this AM. > Any idea what could cause this? 
> > Thanks, > Mark > > #!/usr/bin/python3 > > # Modules loaded by default (on login to Crusher): > # > # 1) craype-x86-trento 9) cce/13.0.0 > # 2) libfabric/1.13.0.0 10) craype/2.7.13 > # 3) craype-network-ofi 11) cray-dsmml/0.2.2 > # 4) perftools-base/21.12.0 12) cray-libsci/21.08.1.2 > # 5) xpmem/2.3.2-2.2_1.16__g9ea452c.shasta 13) PrgEnv-cray/8.2.0 > # 6) cray-pmi/6.0.16 14) DefApps/default > # 7) cray-pmi-lib/6.0.16 15) rocm/4.5.0 > # 8) tmux/3.2a 16) cray-mpich/8.1.12 > # > # We use Cray Programming Environment, Cray compilers, Cray-mpich. > # To enable GPU-aware MPI, one has to also set this environment variable > # > # export MPICH_GPU_SUPPORT_ENABLED=1 > # > # Additional note: If "craype-accel-amd-gfx90a" module is loaded (that is > # needed for "OpenMP offload") - it causes link errors when using 'cc or > hipcc' > # with fortran objs, hence not used > # > > if __name__ == '__main__': > import sys > import os > sys.path.insert(0, os.path.abspath('config')) > import configure > configure_options = [ > '--with-cc=cc', > '--with-cxx=CC', > '--with-fc=ftn', > '--COPTFLAGS=-g -ggdb', > '--CXXOPTFLAGS=-g -ggdb', > '--FOPTFLAGS=-g', > '--with-fortran-bindings=0', > 'LIBS=-L{x}/gtl/lib > -lmpi_gtl_hsa'.format(x=os.environ['CRAY_MPICH_ROOTDIR']), > '--with-debugging=1', > #'--with-64-bit-indices=1', > '--with-mpiexec=srun -p batch -N 1 -A csc314_crusher -t 00:10:00', > '--with-hip', > '--with-hipc=hipcc', > '--download-hypre', > #'--download-hypre-commit=HEAD', > #'--download-hypre-configure-arguments=--enable-unified-memory', > #'--with-hypre-gpuarch=gfx90a', > '--with-hip-arch=gfx90a', > '--download-kokkos', > '--download-kokkos-kernels', > #'--with-kokkos-kernels-tpl=1', > > #'--prefix=/gpfs/alpine/world-shared/geo127/petsc/arch-crusher-opt-cray', # > /gpfs/alpine/phy122/proj-shared/petsc/current/arch-opt-amd-hypre', > ] > configure.petsc_configure(configure_options) > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Fri Jan 28 12:18:22 2022 From: mfadams at lbl.gov (Mark Adams) Date: Fri, 28 Jan 2022 13:18:22 -0500 Subject: [petsc-users] Crusher configure problem In-Reply-To: References: Message-ID: And, building in my home directory looks fine. Looks like a problem with the scratch directory. On Fri, Jan 28, 2022 at 1:15 PM Mark Adams wrote: > Something is very messed up on Crusher. I've never seen this "Cannot > allocate memory", but see it for everything: > > 13:11 1 main= crusher:/gpfs/alpine/csc314/scratch/adams/petsc2$ ll > ls: cannot read symbolic link 'configure.log.bkp': Cannot allocate memory > ls: cannot read symbolic link 'make.log': Cannot allocate memory > ls: cannot read symbolic link 'configure.log': Cannot allocate memory > total 25637 > drwxr-xr-x 8 adams adams 4096 Jan 20 08:51 arch-olcf-crusher > drwxr-xr-x 8 adams adams 4096 Jan 18 20:23 arch-spock-amd > drwxrwxr-x 5 adams adams 4096 Jan 26 19:10 arch-summit-dbg-gnu-cuda > .... > > and > > 13:11 main= crusher:/gpfs/alpine/csc314/scratch/adams/petsc2$ git fetch > remote: Enumerating objects: 648, done. > remote: Counting objects: 100% (547/547), done. > remote: Compressing objects: 100% (238/238), done. > remote: Total 648 (delta 379), reused 442 (delta 308), pack-reused 101 > Receiving objects: 100% (648/648), 591.28 KiB | 1.38 MiB/s, done. > Resolving deltas: 100% (402/402), completed with 114 local objects. 
> error: cannot update the ref 'refs/remotes/origin/main': unable to append > to '.git/logs/refs/remotes/origin/main': Cannot allocate memory > From https://gitlab.com/petsc/petsc > > On Fri, Jan 28, 2022 at 1:07 PM Mark Adams wrote: > >> Crusher has been giving me fits and now I get this error (empty log) >> >> 13:01 main *= crusher:/gpfs/alpine/csc314/scratch/adams/petsc$ >> ../arch-olcf-crusher.py >> Traceback (most recent call last): >> File "../arch-olcf-crusher.py", line 55, in >> configure.petsc_configure(configure_options) >> AttributeError: module 'configure' has no attribute 'petsc_configure' >> >> This is a modified version of the repo configure file. This was working >> this AM. >> Any idea what could cause this? >> >> Thanks, >> Mark >> >> #!/usr/bin/python3 >> >> # Modules loaded by default (on login to Crusher): >> # >> # 1) craype-x86-trento 9) cce/13.0.0 >> # 2) libfabric/1.13.0.0 10) craype/2.7.13 >> # 3) craype-network-ofi 11) cray-dsmml/0.2.2 >> # 4) perftools-base/21.12.0 12) cray-libsci/21.08.1.2 >> # 5) xpmem/2.3.2-2.2_1.16__g9ea452c.shasta 13) PrgEnv-cray/8.2.0 >> # 6) cray-pmi/6.0.16 14) DefApps/default >> # 7) cray-pmi-lib/6.0.16 15) rocm/4.5.0 >> # 8) tmux/3.2a 16) cray-mpich/8.1.12 >> # >> # We use Cray Programming Environment, Cray compilers, Cray-mpich. >> # To enable GPU-aware MPI, one has to also set this environment variable >> # >> # export MPICH_GPU_SUPPORT_ENABLED=1 >> # >> # Additional note: If "craype-accel-amd-gfx90a" module is loaded (that is >> # needed for "OpenMP offload") - it causes link errors when using 'cc or >> hipcc' >> # with fortran objs, hence not used >> # >> >> if __name__ == '__main__': >> import sys >> import os >> sys.path.insert(0, os.path.abspath('config')) >> import configure >> configure_options = [ >> '--with-cc=cc', >> '--with-cxx=CC', >> '--with-fc=ftn', >> '--COPTFLAGS=-g -ggdb', >> '--CXXOPTFLAGS=-g -ggdb', >> '--FOPTFLAGS=-g', >> '--with-fortran-bindings=0', >> 'LIBS=-L{x}/gtl/lib >> -lmpi_gtl_hsa'.format(x=os.environ['CRAY_MPICH_ROOTDIR']), >> '--with-debugging=1', >> #'--with-64-bit-indices=1', >> '--with-mpiexec=srun -p batch -N 1 -A csc314_crusher -t 00:10:00', >> '--with-hip', >> '--with-hipc=hipcc', >> '--download-hypre', >> #'--download-hypre-commit=HEAD', >> #'--download-hypre-configure-arguments=--enable-unified-memory', >> #'--with-hypre-gpuarch=gfx90a', >> '--with-hip-arch=gfx90a', >> '--download-kokkos', >> '--download-kokkos-kernels', >> #'--with-kokkos-kernels-tpl=1', >> >> #'--prefix=/gpfs/alpine/world-shared/geo127/petsc/arch-crusher-opt-cray', # >> /gpfs/alpine/phy122/proj-shared/petsc/current/arch-opt-amd-hypre', >> ] >> configure.petsc_configure(configure_options) >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Fri Jan 28 12:18:42 2022 From: jed at jedbrown.org (Jed Brown) Date: Fri, 28 Jan 2022 11:18:42 -0700 Subject: [petsc-users] Crusher configure problem In-Reply-To: References: Message-ID: <87zgnfoha5.fsf@jedbrown.org> Let's move Crusher stuff to petsc-maint. If top/htop doesn't make it obvious why there is no memory, I think you should follow up with OLCF support. Mark Adams writes: > Something is very messed up on Crusher. 
I've never seen this "Cannot > allocate memory", but see it for everything: > > 13:11 1 main= crusher:/gpfs/alpine/csc314/scratch/adams/petsc2$ ll > ls: cannot read symbolic link 'configure.log.bkp': Cannot allocate memory > ls: cannot read symbolic link 'make.log': Cannot allocate memory > ls: cannot read symbolic link 'configure.log': Cannot allocate memory > total 25637 > drwxr-xr-x 8 adams adams 4096 Jan 20 08:51 arch-olcf-crusher > drwxr-xr-x 8 adams adams 4096 Jan 18 20:23 arch-spock-amd > drwxrwxr-x 5 adams adams 4096 Jan 26 19:10 arch-summit-dbg-gnu-cuda > .... > > and > > 13:11 main= crusher:/gpfs/alpine/csc314/scratch/adams/petsc2$ git fetch > remote: Enumerating objects: 648, done. > remote: Counting objects: 100% (547/547), done. > remote: Compressing objects: 100% (238/238), done. > remote: Total 648 (delta 379), reused 442 (delta 308), pack-reused 101 > Receiving objects: 100% (648/648), 591.28 KiB | 1.38 MiB/s, done. > Resolving deltas: 100% (402/402), completed with 114 local objects. > error: cannot update the ref 'refs/remotes/origin/main': unable to append > to '.git/logs/refs/remotes/origin/main': Cannot allocate memory > From https://gitlab.com/petsc/petsc > > On Fri, Jan 28, 2022 at 1:07 PM Mark Adams wrote: > >> Crusher has been giving me fits and now I get this error (empty log) >> >> 13:01 main *= crusher:/gpfs/alpine/csc314/scratch/adams/petsc$ >> ../arch-olcf-crusher.py >> Traceback (most recent call last): >> File "../arch-olcf-crusher.py", line 55, in >> configure.petsc_configure(configure_options) >> AttributeError: module 'configure' has no attribute 'petsc_configure' >> >> This is a modified version of the repo configure file. This was working >> this AM. >> Any idea what could cause this? >> >> Thanks, >> Mark >> >> #!/usr/bin/python3 >> >> # Modules loaded by default (on login to Crusher): >> # >> # 1) craype-x86-trento 9) cce/13.0.0 >> # 2) libfabric/1.13.0.0 10) craype/2.7.13 >> # 3) craype-network-ofi 11) cray-dsmml/0.2.2 >> # 4) perftools-base/21.12.0 12) cray-libsci/21.08.1.2 >> # 5) xpmem/2.3.2-2.2_1.16__g9ea452c.shasta 13) PrgEnv-cray/8.2.0 >> # 6) cray-pmi/6.0.16 14) DefApps/default >> # 7) cray-pmi-lib/6.0.16 15) rocm/4.5.0 >> # 8) tmux/3.2a 16) cray-mpich/8.1.12 >> # >> # We use Cray Programming Environment, Cray compilers, Cray-mpich. 
>> # To enable GPU-aware MPI, one has to also set this environment variable >> # >> # export MPICH_GPU_SUPPORT_ENABLED=1 >> # >> # Additional note: If "craype-accel-amd-gfx90a" module is loaded (that is >> # needed for "OpenMP offload") - it causes link errors when using 'cc or >> hipcc' >> # with fortran objs, hence not used >> # >> >> if __name__ == '__main__': >> import sys >> import os >> sys.path.insert(0, os.path.abspath('config')) >> import configure >> configure_options = [ >> '--with-cc=cc', >> '--with-cxx=CC', >> '--with-fc=ftn', >> '--COPTFLAGS=-g -ggdb', >> '--CXXOPTFLAGS=-g -ggdb', >> '--FOPTFLAGS=-g', >> '--with-fortran-bindings=0', >> 'LIBS=-L{x}/gtl/lib >> -lmpi_gtl_hsa'.format(x=os.environ['CRAY_MPICH_ROOTDIR']), >> '--with-debugging=1', >> #'--with-64-bit-indices=1', >> '--with-mpiexec=srun -p batch -N 1 -A csc314_crusher -t 00:10:00', >> '--with-hip', >> '--with-hipc=hipcc', >> '--download-hypre', >> #'--download-hypre-commit=HEAD', >> #'--download-hypre-configure-arguments=--enable-unified-memory', >> #'--with-hypre-gpuarch=gfx90a', >> '--with-hip-arch=gfx90a', >> '--download-kokkos', >> '--download-kokkos-kernels', >> #'--with-kokkos-kernels-tpl=1', >> >> #'--prefix=/gpfs/alpine/world-shared/geo127/petsc/arch-crusher-opt-cray', # >> /gpfs/alpine/phy122/proj-shared/petsc/current/arch-opt-amd-hypre', >> ] >> configure.petsc_configure(configure_options) >> >> >> From mfadams at lbl.gov Fri Jan 28 12:42:52 2022 From: mfadams at lbl.gov (Mark Adams) Date: Fri, 28 Jan 2022 13:42:52 -0500 Subject: [petsc-users] Use of hypre in your application In-Reply-To: References: Message-ID: Moving this to the users list (We can not talk about Crusher on public forums, but this is on Summit. I had to check this thread carefully!) Treb is using hypre on Summit and getting this error: CUSPARSE ERROR (code = 11, insufficient resources) at csr_spgemm_device_cusparse.c:128 This is probably from Hypre's RAP. He has contacted OLCF, which seems like the right place to go, but does anyone have any ideas? Treb: You might ask Hypre also. We do actually have a fair amount of experience with hypre but hypre has more! 
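One thing that might be worth trying while OLCF looks at it: hypre has a runtime switch to use its own device SpGEMM instead of cuSPARSE for the RAP products. A rough sketch of what that would look like in user code is below; the guard and the placement are assumptions on my part (hypre has to be recent enough to provide HYPRE_SetSpGemmUseCusparse(), and its handle has to be initialized before the call has any effect), so treat it as a starting point rather than a recipe.

#include <petscksp.h>
#if defined(PETSC_HAVE_HYPRE)
#include <HYPRE_utilities.h>
#endif

/* Ask hypre for its native device SpGEMM (0) instead of cuSPARSE (1).
   Call once before the first PCSetUp()/KSPSolve() that builds the
   boomeramg hierarchy. */
static PetscErrorCode UseHypreNativeSpGEMM(void)
{
  PetscFunctionBeginUser;
#if defined(PETSC_HAVE_HYPRE)
  HYPRE_SetSpGemmUseCusparse(0);
#endif
  PetscFunctionReturn(0);
}

An options-database switch wrapping this inside PCHYPRE would of course be nicer than calling the hypre API directly from application code.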
Thanks, Mark On Fri, Jan 28, 2022 at 1:31 PM David Trebotich wrote: > Thought I sent you this...will change the order of MatSetOption to see if > that helps > > I ran it and get that error which I have already sent a ticket to OLCF: > CUSPARSE ERROR (code = 11, insufficient resources) at > csr_spgemm_device_cusparse.c:128 > > Here's my petscrc > #do not use -mat_view with hypre-cuda if running on gpu > #-mat_view :A.m:ascii_matlab > -help > -proj_mac_pc_type hypre > -proj_mac_pc_hypre_type boomeramg > -proj_mac_pc_hypre_boomeramg_no_CF > -proj_mac_pc_hypre_boomeramg_agg_nl 0 > -proj_mac_pc_hypre_boomeramg_coarsen_type PMIS > -proj_mac_pc_hypre_boomeramg_interp_type ext+i > -proj_mac_pc_hypre_boomeramg_print_statistics > -proj_mac_pc_hypre_boomeramg_relax_type_all l1scaled-Jacobi > -proj_mac_ksp_type gmres > -proj_mac_ksp_max_it 50 > -proj_mac_ksp_rtol 1.e-12 > -proj_mac_ksp_atol 1.e-30 > -mat_type hypre > -use_gpu_aware_mpi 0 > -log_view > -history PETSc.history > -visc_ksp_rtol 1.e-12 > -visc_pc_type jacobi > -visc_ksp_type gmres > -visc_ksp_max_it 50 > -diff_ksp_rtol 1.e-6 > -diff_pc_type jacobi > -diff_ksp_max_it 50 > -proj_mac_ksp_converged_reason > -visc_ksp_converged_reason > -diff_ksp_converged_reason > -proj_mac_ksp_norm_type unpreconditioned > -diff_ksp_norm_type unpreconditioned > -visc_ksp_norm_type unpreconditioned > > And here's my code: > ierr = > MatSetSizes(m_mat,NN,NN,PETSC_DECIDE,PETSC_DECIDE);CHKERRQ(ierr); > ierr = MatSetBlockSize(m_mat,nc);CHKERRQ(ierr); > ierr = MatSetType(m_mat,MATAIJ);CHKERRQ(ierr); > // ierr = > MatSetOption(m_mat,MAT_NEW_NONZERO_ALLOCATION_ERR,PETSC_FALSE) > ;CHKERRQ(ierr); > ierr = MatSetFromOptions( m_mat ); CHKERRQ(ierr); > ierr = MatSeqAIJSetPreallocation(m_mat,nnzrow, d_nnz);CHKERRQ(ierr); > ierr = MatMPIAIJSetPreallocation(m_mat,nnzrow, d_nnz, nnzrow/2, > o_nnz);CHKERRQ(ierr); > ierr = > MatSetOption(m_mat,MAT_NEW_NONZERO_ALLOCATION_ERR,PETSC_FALSE) > ;CHKERRQ(ierr); > > #if defined(PETSC_HAVE_HYPRE) > ierr = MatHYPRESetPreallocation(m_mat,nnzrow, d_nnz, nnzrow/2, > o_nnz);CHKERRQ(ierr); > #endif > > On Wed, Jan 26, 2022 at 6:18 PM Mark Adams wrote: > >> >> >> >> On Wed, Jan 26, 2022 at 7:43 PM David Trebotich >> wrote: >> >>> Can you confirm with me on the settings in .petscrc for Summit with >>> -pc_type hypre? We were using >>> -mat_type aijcusparse >>> which worked a few months ago and now is not working. >>> >>> I don't know the difference between cusparse, aijcusparse and hypre as >>> -mat_type >>> >>> >> cusparse is not a matrix type. The other two are and they both should >> work. >> >> I tested the builds that I just sent in another email (eg, >> PETSC_DIR=/gpfs/alpine/world-shared/geo127/petsc/arch-summit-dbg-gcc-cuda >> PETSC_ARCH="") >> >> $ make PETSC_DIR=$PWD PETSC_ARCH=arch-summit-opt-gnu-hypre-cuda -f >> gmakefile test search='ksp_ksp_tutorials-ex55_hypre_device' >> Using MAKEFLAGS: -- search=ksp_ksp_tutorials-ex55_hypre_device >> PETSC_ARCH=arch-summit-opt-gnu-hypre-cuda >> PETSC_DIR=/gpfs/alpine/csc314/scratch/adams/petsc2 >> CC arch-summit-opt-gnu-hypre-cuda/tests/ksp/ksp/tutorials/ex55.o >> CLINKER arch-summit-opt-gnu-hypre-cuda/tests/ksp/ksp/tutorials/ex55 >> TEST >> arch-summit-opt-gnu-hypre-cuda/tests/counts/ksp_ksp_tutorials-ex55_hypre_device.counts >> ok ksp_ksp_tutorials-ex55_hypre_device >> ok diff-ksp_ksp_tutorials-ex55_hypre_device >> >> So this work. 
In this file (attached) you will see an example of a >> construction a matrix that we have gone over before: >> >> /* create stiffness matrix */ >> ierr = MatCreate(comm,&Amat);CHKERRQ(ierr); >> ierr = MatSetSizes(Amat,m,m,M,M);CHKERRQ(ierr); >> ierr = MatSetType(Amat,MATAIJ);CHKERRQ(ierr); >> ierr = MatSetOption(Amat,MAT_SPD,PETSC_TRUE);CHKERRQ(ierr); >> ierr = MatSetFromOptions(Amat);CHKERRQ(ierr); >> ierr = MatSetBlockSize(Amat,2);CHKERRQ(ierr); >> ierr = MatSeqAIJSetPreallocation(Amat,18,NULL);CHKERRQ(ierr); >> ierr = MatMPIAIJSetPreallocation(Amat,18,NULL,18,NULL);CHKERRQ(ierr); >> #if defined(PETSC_HAVE_HYPRE) >> ierr = MatHYPRESetPreallocation(Amat,18,NULL,18,NULL);CHKERRQ(ierr); >> #endif >> >> At the end of the file you will what is executed with this test for >> "hypre_device": >> >> # command line options match GPU defaults >> test: >> suffix: hypre_device >> nsize: 4 >> requires: hypre !complex >> args: *-mat_type hypre* -ksp_view -ne 29 -alpha 1.e-3 -ksp_type cg *-pc_type >> hypre **-pc_hypre_type boomeramg *-ksp_monitor_short *-pc_hypre_boomeramg_relax_type_all >> l1scaled-Jacobi -pc_hypre_boomeramg_interp_type ext+i >> -pc_hypre_boomeramg_coarsen_type PMIS -pc_hypre_boomeramg_no_CF* >> >> All you need is *-mat_type hypre and -pc_type hypre*. *You could also >> add these hypre arguments.* >> >> If this is not working please send me a description of the problem, like >> any error output on your screen and your petsc.history file. >> >> >> > > > -- > ---------------------- > David Trebotich > Lawrence Berkeley National Laboratory > Computational Research Division > Applied Numerical Algorithms Group > treb at lbl.gov > (510) 486-5984 office > (510) 384-6868 mobile > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Fri Jan 28 15:27:47 2022 From: mfadams at lbl.gov (Mark Adams) Date: Fri, 28 Jan 2022 16:27:47 -0500 Subject: [petsc-users] Use of hypre in your application In-Reply-To: References: Message-ID: (Junchao), Ruipeng said this was an OOM error and suggested trying the in-house with SpGEMM: HYPRE_SetSpGemmUseCusparse(FALSE); Should I clone a hypre argument to make a -pc_hypre_use_tpl_spgemm or -pc_hypre_use_cusparse_spgemm ? On Fri, Jan 28, 2022 at 1:42 PM Mark Adams wrote: > Moving this to the users list (We can not talk about Crusher on public > forums, but this is on Summit. I had to check this thread carefully!) > > Treb is using hypre on Summit and getting this error: > > CUSPARSE ERROR (code = 11, insufficient resources) at > csr_spgemm_device_cusparse.c:128 > > This is probably from Hypre's RAP. > > He has contacted OLCF, which seems like the right place to go, but does > anyone have any ideas? > > Treb: You might ask Hypre also. We do actually have a fair amount of > experience with hypre but hypre has more! 
> > Thanks, > Mark > > > > > > On Fri, Jan 28, 2022 at 1:31 PM David Trebotich > wrote: > >> Thought I sent you this...will change the order of MatSetOption to see if >> that helps >> >> I ran it and get that error which I have already sent a ticket to OLCF: >> CUSPARSE ERROR (code = 11, insufficient resources) at >> csr_spgemm_device_cusparse.c:128 >> >> Here's my petscrc >> #do not use -mat_view with hypre-cuda if running on gpu >> #-mat_view :A.m:ascii_matlab >> -help >> -proj_mac_pc_type hypre >> -proj_mac_pc_hypre_type boomeramg >> -proj_mac_pc_hypre_boomeramg_no_CF >> -proj_mac_pc_hypre_boomeramg_agg_nl 0 >> -proj_mac_pc_hypre_boomeramg_coarsen_type PMIS >> -proj_mac_pc_hypre_boomeramg_interp_type ext+i >> -proj_mac_pc_hypre_boomeramg_print_statistics >> -proj_mac_pc_hypre_boomeramg_relax_type_all l1scaled-Jacobi >> -proj_mac_ksp_type gmres >> -proj_mac_ksp_max_it 50 >> -proj_mac_ksp_rtol 1.e-12 >> -proj_mac_ksp_atol 1.e-30 >> -mat_type hypre >> -use_gpu_aware_mpi 0 >> -log_view >> -history PETSc.history >> -visc_ksp_rtol 1.e-12 >> -visc_pc_type jacobi >> -visc_ksp_type gmres >> -visc_ksp_max_it 50 >> -diff_ksp_rtol 1.e-6 >> -diff_pc_type jacobi >> -diff_ksp_max_it 50 >> -proj_mac_ksp_converged_reason >> -visc_ksp_converged_reason >> -diff_ksp_converged_reason >> -proj_mac_ksp_norm_type unpreconditioned >> -diff_ksp_norm_type unpreconditioned >> -visc_ksp_norm_type unpreconditioned >> >> And here's my code: >> ierr = >> MatSetSizes(m_mat,NN,NN,PETSC_DECIDE,PETSC_DECIDE);CHKERRQ(ierr); >> ierr = MatSetBlockSize(m_mat,nc);CHKERRQ(ierr); >> ierr = MatSetType(m_mat,MATAIJ);CHKERRQ(ierr); >> // ierr = >> MatSetOption(m_mat,MAT_NEW_NONZERO_ALLOCATION_ERR,PETSC_FALSE) >> ;CHKERRQ(ierr); >> ierr = MatSetFromOptions( m_mat ); CHKERRQ(ierr); >> ierr = MatSeqAIJSetPreallocation(m_mat,nnzrow, d_nnz);CHKERRQ(ierr); >> ierr = MatMPIAIJSetPreallocation(m_mat,nnzrow, d_nnz, nnzrow/2, >> o_nnz);CHKERRQ(ierr); >> ierr = >> MatSetOption(m_mat,MAT_NEW_NONZERO_ALLOCATION_ERR,PETSC_FALSE) >> ;CHKERRQ(ierr); >> >> #if defined(PETSC_HAVE_HYPRE) >> ierr = MatHYPRESetPreallocation(m_mat,nnzrow, d_nnz, nnzrow/2, >> o_nnz);CHKERRQ(ierr); >> #endif >> >> On Wed, Jan 26, 2022 at 6:18 PM Mark Adams wrote: >> >>> >>> >>> >>> On Wed, Jan 26, 2022 at 7:43 PM David Trebotich >>> wrote: >>> >>>> Can you confirm with me on the settings in .petscrc for Summit with >>>> -pc_type hypre? We were using >>>> -mat_type aijcusparse >>>> which worked a few months ago and now is not working. >>>> >>>> I don't know the difference between cusparse, aijcusparse and hypre as >>>> -mat_type >>>> >>>> >>> cusparse is not a matrix type. The other two are and they both should >>> work. >>> >>> I tested the builds that I just sent in another email (eg, >>> PETSC_DIR=/gpfs/alpine/world-shared/geo127/petsc/arch-summit-dbg-gcc-cuda >>> PETSC_ARCH="") >>> >>> $ make PETSC_DIR=$PWD PETSC_ARCH=arch-summit-opt-gnu-hypre-cuda -f >>> gmakefile test search='ksp_ksp_tutorials-ex55_hypre_device' >>> Using MAKEFLAGS: -- search=ksp_ksp_tutorials-ex55_hypre_device >>> PETSC_ARCH=arch-summit-opt-gnu-hypre-cuda >>> PETSC_DIR=/gpfs/alpine/csc314/scratch/adams/petsc2 >>> CC >>> arch-summit-opt-gnu-hypre-cuda/tests/ksp/ksp/tutorials/ex55.o >>> CLINKER arch-summit-opt-gnu-hypre-cuda/tests/ksp/ksp/tutorials/ex55 >>> TEST >>> arch-summit-opt-gnu-hypre-cuda/tests/counts/ksp_ksp_tutorials-ex55_hypre_device.counts >>> ok ksp_ksp_tutorials-ex55_hypre_device >>> ok diff-ksp_ksp_tutorials-ex55_hypre_device >>> >>> So this work. 
In this file (attached) you will see an example of a >>> construction a matrix that we have gone over before: >>> >>> /* create stiffness matrix */ >>> ierr = MatCreate(comm,&Amat);CHKERRQ(ierr); >>> ierr = MatSetSizes(Amat,m,m,M,M);CHKERRQ(ierr); >>> ierr = MatSetType(Amat,MATAIJ);CHKERRQ(ierr); >>> ierr = MatSetOption(Amat,MAT_SPD,PETSC_TRUE);CHKERRQ(ierr); >>> ierr = MatSetFromOptions(Amat);CHKERRQ(ierr); >>> ierr = MatSetBlockSize(Amat,2);CHKERRQ(ierr); >>> ierr = MatSeqAIJSetPreallocation(Amat,18,NULL);CHKERRQ(ierr); >>> ierr = MatMPIAIJSetPreallocation(Amat,18,NULL,18,NULL);CHKERRQ(ierr); >>> #if defined(PETSC_HAVE_HYPRE) >>> ierr = MatHYPRESetPreallocation(Amat,18,NULL,18,NULL);CHKERRQ(ierr); >>> #endif >>> >>> At the end of the file you will what is executed with this test for >>> "hypre_device": >>> >>> # command line options match GPU defaults >>> test: >>> suffix: hypre_device >>> nsize: 4 >>> requires: hypre !complex >>> args: *-mat_type hypre* -ksp_view -ne 29 -alpha 1.e-3 -ksp_type >>> cg *-pc_type hypre **-pc_hypre_type boomeramg *-ksp_monitor_short *-pc_hypre_boomeramg_relax_type_all >>> l1scaled-Jacobi -pc_hypre_boomeramg_interp_type ext+i >>> -pc_hypre_boomeramg_coarsen_type PMIS -pc_hypre_boomeramg_no_CF* >>> >>> All you need is *-mat_type hypre and -pc_type hypre*. *You could also >>> add these hypre arguments.* >>> >>> If this is not working please send me a description of the problem, like >>> any error output on your screen and your petsc.history file. >>> >>> >>> >> >> >> -- >> ---------------------- >> David Trebotich >> Lawrence Berkeley National Laboratory >> Computational Research Division >> Applied Numerical Algorithms Group >> treb at lbl.gov >> (510) 486-5984 office >> (510) 384-6868 mobile >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Fri Jan 28 20:37:26 2022 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Fri, 28 Jan 2022 20:37:26 -0600 Subject: [petsc-users] Use of hypre in your application In-Reply-To: References: Message-ID: On Fri, Jan 28, 2022 at 3:27 PM Mark Adams wrote: > (Junchao), Ruipeng said this was an OOM error and suggested trying the > in-house with > > SpGEMM: HYPRE_SetSpGemmUseCusparse(FALSE); > > Should I clone a hypre argument to make a -pc_hypre_use_tpl_spgemm or > -pc_hypre_use_cusparse_spgemm ? > Better use the same words as hypre, e.g., -pc_hypre_set_spgemm_use_cusparse > > > On Fri, Jan 28, 2022 at 1:42 PM Mark Adams wrote: > >> Moving this to the users list (We can not talk about Crusher on public >> forums, but this is on Summit. I had to check this thread carefully!) >> >> Treb is using hypre on Summit and getting this error: >> >> CUSPARSE ERROR (code = 11, insufficient resources) at >> csr_spgemm_device_cusparse.c:128 >> >> This is probably from Hypre's RAP. >> >> He has contacted OLCF, which seems like the right place to go, but does >> anyone have any ideas? >> >> Treb: You might ask Hypre also. We do actually have a fair amount of >> experience with hypre but hypre has more! 
>> >> Thanks, >> Mark >> >> >> >> >> >> On Fri, Jan 28, 2022 at 1:31 PM David Trebotich >> wrote: >> >>> Thought I sent you this...will change the order of MatSetOption to see >>> if that helps >>> >>> I ran it and get that error which I have already sent a ticket to OLCF: >>> CUSPARSE ERROR (code = 11, insufficient resources) at >>> csr_spgemm_device_cusparse.c:128 >>> >>> Here's my petscrc >>> #do not use -mat_view with hypre-cuda if running on gpu >>> #-mat_view :A.m:ascii_matlab >>> -help >>> -proj_mac_pc_type hypre >>> -proj_mac_pc_hypre_type boomeramg >>> -proj_mac_pc_hypre_boomeramg_no_CF >>> -proj_mac_pc_hypre_boomeramg_agg_nl 0 >>> -proj_mac_pc_hypre_boomeramg_coarsen_type PMIS >>> -proj_mac_pc_hypre_boomeramg_interp_type ext+i >>> -proj_mac_pc_hypre_boomeramg_print_statistics >>> -proj_mac_pc_hypre_boomeramg_relax_type_all l1scaled-Jacobi >>> -proj_mac_ksp_type gmres >>> -proj_mac_ksp_max_it 50 >>> -proj_mac_ksp_rtol 1.e-12 >>> -proj_mac_ksp_atol 1.e-30 >>> -mat_type hypre >>> -use_gpu_aware_mpi 0 >>> -log_view >>> -history PETSc.history >>> -visc_ksp_rtol 1.e-12 >>> -visc_pc_type jacobi >>> -visc_ksp_type gmres >>> -visc_ksp_max_it 50 >>> -diff_ksp_rtol 1.e-6 >>> -diff_pc_type jacobi >>> -diff_ksp_max_it 50 >>> -proj_mac_ksp_converged_reason >>> -visc_ksp_converged_reason >>> -diff_ksp_converged_reason >>> -proj_mac_ksp_norm_type unpreconditioned >>> -diff_ksp_norm_type unpreconditioned >>> -visc_ksp_norm_type unpreconditioned >>> >>> And here's my code: >>> ierr = >>> MatSetSizes(m_mat,NN,NN,PETSC_DECIDE,PETSC_DECIDE);CHKERRQ(ierr); >>> ierr = MatSetBlockSize(m_mat,nc);CHKERRQ(ierr); >>> ierr = MatSetType(m_mat,MATAIJ);CHKERRQ(ierr); >>> // ierr = >>> MatSetOption(m_mat,MAT_NEW_NONZERO_ALLOCATION_ERR,PETSC_FALSE) >>> ;CHKERRQ(ierr); >>> ierr = MatSetFromOptions( m_mat ); CHKERRQ(ierr); >>> ierr = MatSeqAIJSetPreallocation(m_mat,nnzrow, >>> d_nnz);CHKERRQ(ierr); >>> ierr = MatMPIAIJSetPreallocation(m_mat,nnzrow, d_nnz, nnzrow/2, >>> o_nnz);CHKERRQ(ierr); >>> ierr = >>> MatSetOption(m_mat,MAT_NEW_NONZERO_ALLOCATION_ERR,PETSC_FALSE) >>> ;CHKERRQ(ierr); >>> >>> #if defined(PETSC_HAVE_HYPRE) >>> ierr = MatHYPRESetPreallocation(m_mat,nnzrow, d_nnz, nnzrow/2, >>> o_nnz);CHKERRQ(ierr); >>> #endif >>> >>> On Wed, Jan 26, 2022 at 6:18 PM Mark Adams wrote: >>> >>>> >>>> >>>> >>>> On Wed, Jan 26, 2022 at 7:43 PM David Trebotich >>>> wrote: >>>> >>>>> Can you confirm with me on the settings in .petscrc for Summit with >>>>> -pc_type hypre? We were using >>>>> -mat_type aijcusparse >>>>> which worked a few months ago and now is not working. >>>>> >>>>> I don't know the difference between cusparse, aijcusparse and hypre as >>>>> -mat_type >>>>> >>>>> >>>> cusparse is not a matrix type. The other two are and they both should >>>> work. 
>>>> >>>> I tested the builds that I just sent in another email (eg, >>>> PETSC_DIR=/gpfs/alpine/world-shared/geo127/petsc/arch-summit-dbg-gcc-cuda >>>> PETSC_ARCH="") >>>> >>>> $ make PETSC_DIR=$PWD PETSC_ARCH=arch-summit-opt-gnu-hypre-cuda -f >>>> gmakefile test search='ksp_ksp_tutorials-ex55_hypre_device' >>>> Using MAKEFLAGS: -- search=ksp_ksp_tutorials-ex55_hypre_device >>>> PETSC_ARCH=arch-summit-opt-gnu-hypre-cuda >>>> PETSC_DIR=/gpfs/alpine/csc314/scratch/adams/petsc2 >>>> CC >>>> arch-summit-opt-gnu-hypre-cuda/tests/ksp/ksp/tutorials/ex55.o >>>> CLINKER arch-summit-opt-gnu-hypre-cuda/tests/ksp/ksp/tutorials/ex55 >>>> TEST >>>> arch-summit-opt-gnu-hypre-cuda/tests/counts/ksp_ksp_tutorials-ex55_hypre_device.counts >>>> ok ksp_ksp_tutorials-ex55_hypre_device >>>> ok diff-ksp_ksp_tutorials-ex55_hypre_device >>>> >>>> So this work. In this file (attached) you will see an example of a >>>> construction a matrix that we have gone over before: >>>> >>>> /* create stiffness matrix */ >>>> ierr = MatCreate(comm,&Amat);CHKERRQ(ierr); >>>> ierr = MatSetSizes(Amat,m,m,M,M);CHKERRQ(ierr); >>>> ierr = MatSetType(Amat,MATAIJ);CHKERRQ(ierr); >>>> ierr = MatSetOption(Amat,MAT_SPD,PETSC_TRUE);CHKERRQ(ierr); >>>> ierr = MatSetFromOptions(Amat);CHKERRQ(ierr); >>>> ierr = MatSetBlockSize(Amat,2);CHKERRQ(ierr); >>>> ierr = MatSeqAIJSetPreallocation(Amat,18,NULL);CHKERRQ(ierr); >>>> ierr = MatMPIAIJSetPreallocation(Amat,18,NULL,18,NULL);CHKERRQ(ierr); >>>> #if defined(PETSC_HAVE_HYPRE) >>>> ierr = MatHYPRESetPreallocation(Amat,18,NULL,18,NULL);CHKERRQ(ierr); >>>> #endif >>>> >>>> At the end of the file you will what is executed with this test for >>>> "hypre_device": >>>> >>>> # command line options match GPU defaults >>>> test: >>>> suffix: hypre_device >>>> nsize: 4 >>>> requires: hypre !complex >>>> args: *-mat_type hypre* -ksp_view -ne 29 -alpha 1.e-3 -ksp_type >>>> cg *-pc_type hypre **-pc_hypre_type boomeramg *-ksp_monitor_short *-pc_hypre_boomeramg_relax_type_all >>>> l1scaled-Jacobi -pc_hypre_boomeramg_interp_type ext+i >>>> -pc_hypre_boomeramg_coarsen_type PMIS -pc_hypre_boomeramg_no_CF* >>>> >>>> All you need is *-mat_type hypre and -pc_type hypre*. *You could also >>>> add these hypre arguments.* >>>> >>>> If this is not working please send me a description of the problem, >>>> like any error output on your screen and your petsc.history file. >>>> >>>> >>>> >>> >>> >>> -- >>> ---------------------- >>> David Trebotich >>> Lawrence Berkeley National Laboratory >>> Computational Research Division >>> Applied Numerical Algorithms Group >>> treb at lbl.gov >>> (510) 486-5984 office >>> (510) 384-6868 mobile >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From zjorti at lanl.gov Mon Jan 31 09:29:42 2022 From: zjorti at lanl.gov (Jorti, Zakariae) Date: Mon, 31 Jan 2022 15:29:42 +0000 Subject: [petsc-users] [EXTERNAL] Re: Question about PCFieldSplit In-Reply-To: References: <9bd268fa58a14c8ba55abffa3661dad1@lanl.gov> <8A8FE6DE-F995-4242-92C7-878C48ED1A70@gmail.com> <037C3F53-8329-4950-B83B-63AC118526E3@msu.edu> <2e1a71b944a949c99c24ad85bed9350a@lanl.gov> <3667C2C9-C4E7-48C7-B7FC-B48DDA210BCF@gmail.com> , Message-ID: <5f9009ef45944d068fd0e6d0c3cc8ca7@lanl.gov> Hi Patrick, Thanks for your recent updates on DMStag. After getting the Finite Difference Coloring to work with our solver, I was testing that DMCreateFieldDecomposition routine that you added last week. It seems to work fine when there is only one unknown per location (i.e. 
one unknown on vertices and/or one unknown on edges and/or one unknown on faces and/or one unknown on elements). That being said, when there is more than one unknown in some location (let's say 2 unknowns on vertices for instance), I could not get the ISs for those two unknowns with that routine. Should I still rely on PCFieldSplitSetDetectSaddlePoint in this case? Many thanks once again. Kind regards, Zakariae ________________________________ From: Patrick Sanan Sent: Tuesday, January 25, 2022 5:36:17 AM To: Matthew Knepley Cc: Jorti, Zakariae; petsc-users at mcs.anl.gov; Tang, Xianzhu; Dave May Subject: [EXTERNAL] Re: [petsc-users] Question about PCFieldSplit Here is an MR which intends to introduce some logic to support DMCreateFieldDecomposition(). It doesn't use the PetscSection approach, which might be preferable, but nonetheless is a necessary component so It'd like to get it in, even if it has room for further optimization. Hopefully this can be followed fairly soon with some more examples and tests using PCFieldSplit itself. https://gitlab.com/petsc/petsc/-/merge_requests/4740 Am Mi., 23. Juni 2021 um 12:15 Uhr schrieb Matthew Knepley >: On Wed, Jun 23, 2021 at 12:51 AM Patrick Sanan > wrote: Hi Zakariae - The usual way to do this is to define an IS (index set) with the degrees of freedom of interest for the rows, and another one for the columns, and then use MatCreateSubmatrix [1] . There's not a particularly convenient way to create an IS with the degrees of freedom corresponding to a particular "stratum" (i.e. elements, faces, edges, or vertices) of a DMStag, but fortunately I believe we have some code to do exactly this in a development branch. I'll track it down and see if it can quickly be added to the main branch. Note that an easy way to keep track of this would be to create a section with the different locations as fields. This Section could then easily create the ISes, and could automatically interface with PCFIELDSPLIT. Thanks, Matt [1]: https://petsc.org/release/docs/manualpages/Mat/MatCreateSubMatrix.html Am 22.06.2021 um 22:29 schrieb Jorti, Zakariae >: Hello, I am working on DMStag and I have one dof on vertices (let us call it V), one dof on edges (let us call it E), one dof on faces ((let us call it F)) and one dof on cells (let us call it C). I build a matrix on this DM, and I was wondering if there was a way to get blocks (or sub matrices) of this matrix corresponding to specific degrees of freedom, for example rows corresponding to V dofs and columns corresponding to E dofs. I already asked this question before and the answer I got was I could call PCFieldSplitSetDetectSaddlePoint with the diagonal entries being of the matrix being zero or nonzero. That worked well. Nonetheless, I am curious to know if there was another alternative that does not require creating a dummy matrix with appropriate diagonal entries and solving a dummy linear system with this matrix to define the splits. Many thanks. Best regards, Zakariae ________________________________ From: petsc-users > on behalf of Tang, Qi > Sent: Sunday, April 18, 2021 11:51:59 PM To: Patrick Sanan Cc: petsc-users at mcs.anl.gov; Tang, Xianzhu Subject: [EXTERNAL] Re: [petsc-users] Question about PCFieldSplit Thanks a lot, Patrick. We appreciate your help. Qi On Apr 18, 2021, at 11:30 PM, Patrick Sanan > wrote: We have this functionality in a branch, which I'm working on cleaning up to get to master. It doesn't use PETScSection. Sorry about the delay! 
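Once the decomposition ISes exist, handing them to PCFieldSplit directly (instead of relying on the saddle-point detection) is short. A minimal sketch, assuming DMCreateFieldDecomposition() returns one IS per split you want and following the usual PETSc error-checking pattern:

#include <petscksp.h>
#include <petscdm.h>

/* Feed the ISes produced by DMCreateFieldDecomposition() to PCFIELDSPLIT,
   one split per field, then free everything the DM returned. */
static PetscErrorCode SetFieldSplitFromDM(KSP ksp, DM dm)
{
  PC             pc;
  PetscInt       nf, f;
  char         **names;
  IS            *is;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
  ierr = PCSetType(pc,PCFIELDSPLIT);CHKERRQ(ierr);
  ierr = DMCreateFieldDecomposition(dm,&nf,&names,&is,NULL);CHKERRQ(ierr);
  for (f=0; f<nf; f++) {
    ierr = PCFieldSplitSetIS(pc,names[f],is[f]);CHKERRQ(ierr);
  }
  for (f=0; f<nf; f++) {
    ierr = ISDestroy(&is[f]);CHKERRQ(ierr);
    ierr = PetscFree(names[f]);CHKERRQ(ierr);
  }
  ierr = PetscFree(is);CHKERRQ(ierr);
  ierr = PetscFree(names);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}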
You can only use PCFieldSplitSetDetectSaddlePoint when your diagonal entries being zero or non-zero defines the splits correctly. Am 17.04.2021 um 21:09 schrieb Matthew Knepley >: On Fri, Apr 16, 2021 at 8:39 PM Jorti, Zakariae via petsc-users > wrote: Hello, I have a DMStag grid with one dof on each edge and face center. I want to use a PCFieldSplit preconditioner on a Jacobian matrix that I assume is already split but I am not sure how to determine the fields. In the DMStag examples (ex2.c and ex3.c), the function PCFieldSplitSetDetectSaddlePoint is used to determine those fields based on zero diagonal entries. In my case, I have a Jacobian matrix that does not have zero diagonal entries. Can I use that PCFieldSplitSetDetectSaddlePoint in this case? If not, how should I do? Should I do like this example (https://www.mcs.anl.gov/petsc/petsc-master/src/ksp/ksp/tutorials/ex43.c.html): const PetscInt Bfields[1] = {0},Efields[1] = {1}; KSPGetPC(ksp,&pc); PCFieldSplitSetBlockSize(pc,2); PCFieldSplitSetFields(pc,"B",1,Bfields,Bfields); PCFieldSplitSetFields(pc,"E",1,Efields,Efields); where my B unknowns are defined on face centers and E unknowns are defined on edge centers? That will not work.That interface only works for colocated fields that you get from DMDA. Patrick, does DMSTAG use PetscSection? Then the field split would be automatically calculated. If not, does it maintain the field division so that it could be given to PCFIELDSPLIT as ISes? Thanks, Matt One last thing, I do not know which field comes first. Is it the one defined for face dofs or edge dofs. Thank you. Best regards, Zakariae -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From fdkong.jd at gmail.com Mon Jan 31 09:40:24 2022 From: fdkong.jd at gmail.com (Fande Kong) Date: Mon, 31 Jan 2022 08:40:24 -0700 Subject: [petsc-users] Cannot eagerly initialize cuda, as doing so results in cuda error 35 (cudaErrorInsufficientDriver) : CUDA driver version is insufficient for CUDA runtime version In-Reply-To: References: <233D0F20-AD95-481B-B862-353D5789C556@gmail.com> <87czkn9c64.fsf@jedbrown.org> <875yqe7zip.fsf@jedbrown.org> Message-ID: OK, Finally we resolved the issue. The issue was that there were two libcuda libs on a GPU compute node: /usr/lib64/libcuda and /apps/local/spack/software/gcc-7.5.0/cuda-10.1.243-v4ymjqcrr7f72qfiuzsstuy5jiajbuey/lib64/stubs/libcuda. But on a login node there is one libcuda lib: /apps/local/spack/software/gcc-7.5.0/cuda-10.1.243-v4ymjqcrr7f72qfiuzsstuy5jiajbuey/lib64/stubs/libcuda. We can not see /usr/lib64/libcuda from a login node where I was compiling the code. Before the Junchao's commit, we did not have "-Wl,-rpath" to force PETSc take /apps/local/spack/software/gcc-7.5.0/cuda-10.1.243-v4ymjqcrr7f72qfiuzsstuy5jiajbuey/lib64/stubs/libcuda. A code compiled on a login node could correctly pick up the cuda lib from /usr/lib64/libcuda at runtime. When with "-Wl,-rpath", the code always takes the cuda lib from /apps/local/spack/software/gcc-7.5.0/cuda-10.1.243-v4ymjqcrr7f72qfiuzsstuy5jiajbuey/lib64/stubs/libcuda, wihch was a bad lib. 
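(A quick way to confirm which libcuda a given executable will actually resolve on a particular node - the binary name below is just a placeholder - is to look at the embedded rpath and the runtime resolution on that node:

$ readelf -d ./my_app | grep -i -E 'rpath|runpath'
$ ldd ./my_app | grep libcuda

If the rpath/runpath points into .../lib64/stubs, that stub libcuda is normally found before the real driver library in /usr/lib64.)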
Right now, I just compiled code on a compute node instead of a login node, PETSc was able to pick up the correct lib from /usr/lib64/libcuda, and everything ran fine. I am not sure whether or not it is a good idea to search for "stubs" since the system might have the correct ones in other places. Should not I do a batch compiling? Thanks, Fande On Wed, Jan 26, 2022 at 1:49 PM Fande Kong wrote: > Yes, please see the attached file. > > Fande > > On Wed, Jan 26, 2022 at 11:49 AM Junchao Zhang > wrote: > >> Do you have the configure.log with main? >> >> --Junchao Zhang >> >> >> On Wed, Jan 26, 2022 at 12:26 PM Fande Kong wrote: >> >>> I am on the petsc-main >>> >>> commit 1390d3a27d88add7d79c9b38bf1a895ae5e67af6 >>> >>> Merge: 96c919c d5f3255 >>> >>> Author: Satish Balay >>> >>> Date: Wed Jan 26 10:28:32 2022 -0600 >>> >>> >>> Merge remote-tracking branch 'origin/release' >>> >>> >>> It is still broken. >>> >>> Thanks, >>> >>> >>> Fande >>> >>> On Wed, Jan 26, 2022 at 7:40 AM Junchao Zhang >>> wrote: >>> >>>> The good uses the compiler's default library/header path. The bad >>>> searches from cuda toolkit path and uses rpath linking. >>>> Though the paths look the same on the login node, they could have >>>> different behavior on a compute node depending on its environment. >>>> I think we fixed the issue in cuda.py (i.e., first try the compiler's >>>> default, then toolkit). That's why I wanted Fande to use petsc/main. >>>> >>>> --Junchao Zhang >>>> >>>> >>>> On Tue, Jan 25, 2022 at 11:59 PM Barry Smith wrote: >>>> >>>>> >>>>> bad has extra >>>>> >>>>> -L/apps/local/spack/software/gcc-7.5.0/cuda-10.1.243-v4ymjqcrr7f72qfiuzsstuy5jiajbuey/lib64/stubs >>>>> -lcuda >>>>> >>>>> good does not. >>>>> >>>>> Try removing the stubs directory and -lcuda from the bad >>>>> $PETSC_ARCH/lib/petsc/conf/variables and likely the bad will start working. >>>>> >>>>> Barry >>>>> >>>>> I never liked the stubs stuff. >>>>> >>>>> On Jan 25, 2022, at 11:29 PM, Fande Kong wrote: >>>>> >>>>> Hi Junchao, >>>>> >>>>> I attached a "bad" configure log and a "good" configure log. >>>>> >>>>> The "bad" one was on produced >>>>> at 246ba74192519a5f34fb6e227d1c64364e19ce2c >>>>> >>>>> and the "good" one at 384645a00975869a1aacbd3169de62ba40cad683 >>>>> >>>>> This good hash is the last good hash that is just the right before the >>>>> bad one. >>>>> >>>>> I think you could do a comparison between these two logs, and check >>>>> what the differences were. >>>>> >>>>> Thanks, >>>>> >>>>> Fande >>>>> >>>>> On Tue, Jan 25, 2022 at 8:21 PM Junchao Zhang >>>>> wrote: >>>>> >>>>>> Fande, could you send the configure.log that works (i.e., before this >>>>>> offending commit)? >>>>>> --Junchao Zhang >>>>>> >>>>>> >>>>>> On Tue, Jan 25, 2022 at 8:21 PM Fande Kong >>>>>> wrote: >>>>>> >>>>>>> Not sure if this is helpful. 
I did "git bisect", and here was the >>>>>>> result: >>>>>>> >>>>>>> [kongf at sawtooth2 petsc]$ git bisect bad >>>>>>> 246ba74192519a5f34fb6e227d1c64364e19ce2c is the first bad commit >>>>>>> commit 246ba74192519a5f34fb6e227d1c64364e19ce2c >>>>>>> Author: Junchao Zhang >>>>>>> Date: Wed Oct 13 05:32:43 2021 +0000 >>>>>>> >>>>>>> Config: fix CUDA library and header dirs >>>>>>> >>>>>>> :040000 040000 187c86055adb80f53c1d0565a8888704fec43a96 >>>>>>> ea1efd7f594fd5e8df54170bc1bc7b00f35e4d5f M config >>>>>>> >>>>>>> >>>>>>> Started from this commit, and GPU did not work for me on our HPC >>>>>>> >>>>>>> Thanks, >>>>>>> Fande >>>>>>> >>>>>>> On Tue, Jan 25, 2022 at 7:18 PM Fande Kong >>>>>>> wrote: >>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On Tue, Jan 25, 2022 at 9:04 AM Jacob Faibussowitsch < >>>>>>>> jacob.fai at gmail.com> wrote: >>>>>>>> >>>>>>>>> Configure should not have an impact here I think. The reason I had >>>>>>>>> you run `cudaGetDeviceCount()` is because this is the CUDA call (and in >>>>>>>>> fact the only CUDA call) in the initialization sequence that returns the >>>>>>>>> error code. There should be no prior CUDA calls. Maybe this is a problem >>>>>>>>> with oversubscribing GPU?s? In the runs that crash, how many ranks are >>>>>>>>> using any given GPU at once? Maybe MPS is required. >>>>>>>>> >>>>>>>> >>>>>>>> I used one MPI rank. >>>>>>>> >>>>>>>> Fande >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>> >>>>>>>>> Best regards, >>>>>>>>> >>>>>>>>> Jacob Faibussowitsch >>>>>>>>> (Jacob Fai - booss - oh - vitch) >>>>>>>>> >>>>>>>>> On Jan 21, 2022, at 12:01, Fande Kong wrote: >>>>>>>>> >>>>>>>>> Thanks Jacob, >>>>>>>>> >>>>>>>>> On Thu, Jan 20, 2022 at 6:25 PM Jacob Faibussowitsch < >>>>>>>>> jacob.fai at gmail.com> wrote: >>>>>>>>> >>>>>>>>>> Segfault is caused by the following check at >>>>>>>>>> src/sys/objects/device/impls/cupm/cupmdevice.cxx:349 being a >>>>>>>>>> PetscUnlikelyDebug() rather than just PetscUnlikely(): >>>>>>>>>> >>>>>>>>>> ``` >>>>>>>>>> if (PetscUnlikelyDebug(_defaultDevice < 0)) { // _defaultDevice >>>>>>>>>> is in fact < 0 here and uncaught >>>>>>>>>> ``` >>>>>>>>>> >>>>>>>>>> To clarify: >>>>>>>>>> >>>>>>>>>> ?lazy? initialization is not that lazy after all, it still does >>>>>>>>>> some 50% of the initialization that ?eager? initialization does. It stops >>>>>>>>>> short initializing the CUDA runtime, checking CUDA aware MPI, gathering >>>>>>>>>> device data, and initializing cublas and friends. Lazy also importantly >>>>>>>>>> swallows any errors that crop up during initialization, storing the >>>>>>>>>> resulting error code for later (specifically _defaultDevice = >>>>>>>>>> -init_error_value;). >>>>>>>>>> >>>>>>>>>> So whether you initialize lazily or eagerly makes no difference >>>>>>>>>> here, as _defaultDevice will always contain -35. >>>>>>>>>> >>>>>>>>>> The bigger question is why cudaGetDeviceCount() is returning >>>>>>>>>> cudaErrorInsufficientDriver. Can you compile and run >>>>>>>>>> >>>>>>>>>> ``` >>>>>>>>>> #include >>>>>>>>>> >>>>>>>>>> int main() >>>>>>>>>> { >>>>>>>>>> int ndev; >>>>>>>>>> return cudaGetDeviceCount(&ndev): >>>>>>>>>> } >>>>>>>>>> ``` >>>>>>>>>> >>>>>>>>>> Then show the value of "echo $??? >>>>>>>>>> >>>>>>>>> >>>>>>>>> Modify your code a little to get more information. 
>>>>>>>>> >>>>>>>>> #include >>>>>>>>> #include >>>>>>>>> >>>>>>>>> int main() >>>>>>>>> { >>>>>>>>> int ndev; >>>>>>>>> int error = cudaGetDeviceCount(&ndev); >>>>>>>>> printf("ndev %d \n", ndev); >>>>>>>>> printf("error %d \n", error); >>>>>>>>> return 0; >>>>>>>>> } >>>>>>>>> >>>>>>>>> Results: >>>>>>>>> >>>>>>>>> $ ./a.out >>>>>>>>> ndev 4 >>>>>>>>> error 0 >>>>>>>>> >>>>>>>>> >>>>>>>>> I have not read the PETSc cuda initialization code yet. If I need >>>>>>>>> to guess at what was happening. I will naively think that PETSc did not get >>>>>>>>> correct GPU information in the configuration because the compiler node does >>>>>>>>> not have GPUs, and there was no way to get any GPU device information. >>>>>>>>> >>>>>>>>> >>>>>>>>> During the runtime on GPU nodes, PETSc might have incorrect >>>>>>>>> information grabbed during configuration and had this kind of false error >>>>>>>>> message. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> Fande >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>>> >>>>>>>>>> Best regards, >>>>>>>>>> >>>>>>>>>> Jacob Faibussowitsch >>>>>>>>>> (Jacob Fai - booss - oh - vitch) >>>>>>>>>> >>>>>>>>>> On Jan 20, 2022, at 17:47, Matthew Knepley >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>> On Thu, Jan 20, 2022 at 6:44 PM Fande Kong >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> Thanks, Jed >>>>>>>>>>> >>>>>>>>>>> On Thu, Jan 20, 2022 at 4:34 PM Jed Brown >>>>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>>> You can't create CUDA or Kokkos Vecs if you're running on a >>>>>>>>>>>> node without a GPU. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> I am running the code on compute nodes that do have GPUs. >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> If you are actually running on GPUs, why would you need lazy >>>>>>>>>> initialization? It would not break with GPUs present. >>>>>>>>>> >>>>>>>>>> Matt >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> With PETSc-3.16.1, I got good speedup by running GAMG on GPUs. >>>>>>>>>>> That might be a bug of PETSc-main. 
>>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> >>>>>>>>>>> Fande >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> KSPSetUp 13 1.0 6.4400e-01 1.0 2.02e+09 1.0 0.0e+00 >>>>>>>>>>> 0.0e+00 0.0e+00 0 5 0 0 0 0 5 0 0 0 3140 64630 15 >>>>>>>>>>> 1.05e+02 5 3.49e+01 100 >>>>>>>>>>> KSPSolve 1 1.0 1.0109e+00 1.0 3.49e+10 1.0 0.0e+00 >>>>>>>>>>> 0.0e+00 0.0e+00 0 87 0 0 0 0 87 0 0 0 34522 69556 4 >>>>>>>>>>> 4.35e-03 1 2.38e-03 100 >>>>>>>>>>> KSPGMRESOrthog 142 1.0 1.2674e-01 1.0 1.06e+10 1.0 0.0e+00 >>>>>>>>>>> 0.0e+00 0.0e+00 0 27 0 0 0 0 27 0 0 0 83755 87801 0 >>>>>>>>>>> 0.00e+00 0 0.00e+00 100 >>>>>>>>>>> SNESSolve 1 1.0 4.4402e+01 1.0 4.00e+10 1.0 0.0e+00 >>>>>>>>>>> 0.0e+00 0.0e+00 21100 0 0 0 21100 0 0 0 901 51365 57 >>>>>>>>>>> 1.10e+03 52 8.78e+02 100 >>>>>>>>>>> SNESSetUp 1 1.0 3.9101e-05 1.0 0.00e+00 0.0 0.0e+00 >>>>>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>>>>>> SNESFunctionEval 2 1.0 1.7097e+01 1.0 1.60e+07 1.0 0.0e+00 >>>>>>>>>>> 0.0e+00 0.0e+00 8 0 0 0 0 8 0 0 0 0 1 0 0 >>>>>>>>>>> 0.00e+00 6 1.92e+02 0 >>>>>>>>>>> SNESJacobianEval 1 1.0 1.6213e+01 1.0 2.80e+07 1.0 0.0e+00 >>>>>>>>>>> 0.0e+00 0.0e+00 8 0 0 0 0 8 0 0 0 0 2 0 0 >>>>>>>>>>> 0.00e+00 1 3.20e+01 0 >>>>>>>>>>> SNESLineSearch 1 1.0 8.5582e+00 1.0 1.24e+08 1.0 0.0e+00 >>>>>>>>>>> 0.0e+00 0.0e+00 4 0 0 0 0 4 0 0 0 0 14 64153 1 >>>>>>>>>>> 3.20e+01 3 9.61e+01 94 >>>>>>>>>>> PCGAMGGraph_AGG 5 1.0 3.0509e+00 1.0 8.19e+07 1.0 0.0e+00 >>>>>>>>>>> 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 27 0 5 >>>>>>>>>>> 3.49e+01 9 7.43e+01 0 >>>>>>>>>>> PCGAMGCoarse_AGG 5 1.0 3.8711e+00 1.0 0.00e+00 0.0 0.0e+00 >>>>>>>>>>> 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 0 0 >>>>>>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>>>>>> PCGAMGProl_AGG 5 1.0 7.0748e-01 1.0 0.00e+00 0.0 0.0e+00 >>>>>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>>>>>> PCGAMGPOpt_AGG 5 1.0 1.2904e+00 1.0 2.14e+09 1.0 0.0e+00 >>>>>>>>>>> 0.0e+00 0.0e+00 1 5 0 0 0 1 5 0 0 0 1661 29807 26 >>>>>>>>>>> 7.15e+02 20 2.90e+02 99 >>>>>>>>>>> GAMG: createProl 5 1.0 8.9489e+00 1.0 2.22e+09 1.0 0.0e+00 >>>>>>>>>>> 0.0e+00 0.0e+00 4 6 0 0 0 4 6 0 0 0 249 29666 31 >>>>>>>>>>> 7.50e+02 29 3.64e+02 96 >>>>>>>>>>> Graph 10 1.0 3.0478e+00 1.0 8.19e+07 1.0 >>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 27 0 5 >>>>>>>>>>> 3.49e+01 9 7.43e+01 0 >>>>>>>>>>> MIS/Agg 5 1.0 4.1290e-01 1.0 0.00e+00 0.0 >>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>>>>>> SA: col data 5 1.0 1.9127e-02 1.0 0.00e+00 0.0 >>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>>>>>> SA: frmProl0 5 1.0 6.2662e-01 1.0 0.00e+00 0.0 >>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>>>>>> SA: smooth 5 1.0 4.9595e-01 1.0 1.21e+08 1.0 >>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 244 2709 15 >>>>>>>>>>> 1.97e+02 15 2.55e+02 90 >>>>>>>>>>> GAMG: partLevel 5 1.0 4.7330e-01 1.0 6.98e+08 1.0 0.0e+00 >>>>>>>>>>> 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 1475 4120 5 >>>>>>>>>>> 1.78e+02 10 2.55e+02 100 >>>>>>>>>>> PCGAMG Squ l00 1 1.0 2.6027e+00 1.0 0.00e+00 0.0 0.0e+00 >>>>>>>>>>> 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 0 0 >>>>>>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>>>>>> PCGAMG Gal l00 1 1.0 3.8406e-01 1.0 5.48e+08 1.0 0.0e+00 >>>>>>>>>>> 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 1426 4270 1 >>>>>>>>>>> 1.48e+02 2 2.11e+02 100 >>>>>>>>>>> PCGAMG Opt l00 1 1.0 2.4932e-01 1.0 7.20e+07 1.0 0.0e+00 >>>>>>>>>>> 0.0e+00 
0.0e+00 0 0 0 0 0 0 0 0 0 0 289 2653 1 >>>>>>>>>>> 6.41e+01 1 1.13e+02 100 >>>>>>>>>>> PCGAMG Gal l01 1 1.0 6.6279e-02 1.0 1.09e+08 1.0 0.0e+00 >>>>>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1645 3851 1 >>>>>>>>>>> 2.40e+01 2 3.64e+01 100 >>>>>>>>>>> PCGAMG Opt l01 1 1.0 2.9544e-02 1.0 7.15e+06 1.0 0.0e+00 >>>>>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 242 1671 1 >>>>>>>>>>> 4.84e+00 1 1.23e+01 100 >>>>>>>>>>> PCGAMG Gal l02 1 1.0 1.8874e-02 1.0 3.72e+07 1.0 0.0e+00 >>>>>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1974 3636 1 >>>>>>>>>>> 5.04e+00 2 6.58e+00 100 >>>>>>>>>>> PCGAMG Opt l02 1 1.0 7.4353e-03 1.0 2.40e+06 1.0 0.0e+00 >>>>>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 323 1457 1 >>>>>>>>>>> 7.71e-01 1 2.30e+00 100 >>>>>>>>>>> PCGAMG Gal l03 1 1.0 2.8479e-03 1.0 4.10e+06 1.0 0.0e+00 >>>>>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1440 2266 1 >>>>>>>>>>> 4.44e-01 2 5.51e-01 100 >>>>>>>>>>> PCGAMG Opt l03 1 1.0 8.2684e-04 1.0 2.80e+05 1.0 0.0e+00 >>>>>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 339 1667 1 >>>>>>>>>>> 6.72e-02 1 2.03e-01 100 >>>>>>>>>>> PCGAMG Gal l04 1 1.0 1.2238e-03 1.0 2.09e+05 1.0 0.0e+00 >>>>>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 170 244 1 >>>>>>>>>>> 2.05e-02 2 2.53e-02 100 >>>>>>>>>>> PCGAMG Opt l04 1 1.0 4.1008e-04 1.0 1.77e+04 1.0 0.0e+00 >>>>>>>>>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 43 165 1 >>>>>>>>>>> 4.49e-03 1 1.19e-02 100 >>>>>>>>>>> PCSetUp 2 1.0 9.9632e+00 1.0 4.95e+09 1.0 0.0e+00 >>>>>>>>>>> 0.0e+00 0.0e+00 5 12 0 0 0 5 12 0 0 0 496 17826 55 >>>>>>>>>>> 1.03e+03 45 6.54e+02 98 >>>>>>>>>>> PCSetUpOnBlocks 44 1.0 9.9087e-04 1.0 2.88e+03 1.0 >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> The point of lazy initialization is to make it possible to run >>>>>>>>>>>> a solve that doesn't use a GPU in PETSC_ARCH that supports GPUs, regardless >>>>>>>>>>>> of whether a GPU is actually present. >>>>>>>>>>>> >>>>>>>>>>>> Fande Kong writes: >>>>>>>>>>>> >>>>>>>>>>>> > I spoke too soon. It seems that we have trouble creating >>>>>>>>>>>> cuda/kokkos vecs >>>>>>>>>>>> > now. Got Segmentation fault. >>>>>>>>>>>> > >>>>>>>>>>>> > Thanks, >>>>>>>>>>>> > >>>>>>>>>>>> > Fande >>>>>>>>>>>> > >>>>>>>>>>>> > Program received signal SIGSEGV, Segmentation fault. 
>>>>>>>>>>>> > 0x00002aaab5558b11 in >>>>>>>>>>>> > >>>>>>>>>>>> Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::CUPMDeviceInternal::initialize >>>>>>>>>>>> > (this=0x1) at >>>>>>>>>>>> > >>>>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:54 >>>>>>>>>>>> > 54 PetscErrorCode >>>>>>>>>>>> CUPMDevice::CUPMDeviceInternal::initialize() noexcept >>>>>>>>>>>> > Missing separate debuginfos, use: debuginfo-install >>>>>>>>>>>> > bzip2-libs-1.0.6-13.el7.x86_64 >>>>>>>>>>>> elfutils-libelf-0.176-5.el7.x86_64 >>>>>>>>>>>> > elfutils-libs-0.176-5.el7.x86_64 glibc-2.17-325.el7_9.x86_64 >>>>>>>>>>>> > libX11-1.6.7-4.el7_9.x86_64 libXau-1.0.8-2.1.el7.x86_64 >>>>>>>>>>>> > libattr-2.4.46-13.el7.x86_64 libcap-2.22-11.el7.x86_64 >>>>>>>>>>>> > libibmad-5.4.0.MLNX20190423.1d917ae-0.1.49224.x86_64 >>>>>>>>>>>> > libibumad-43.1.1.MLNX20200211.078947f-0.1.49224.x86_64 >>>>>>>>>>>> > libibverbs-41mlnx1-OFED.4.9.0.0.7.49224.x86_64 >>>>>>>>>>>> > libmlx4-41mlnx1-OFED.4.7.3.0.3.49224.x86_64 >>>>>>>>>>>> > libmlx5-41mlnx1-OFED.4.9.0.1.2.49224.x86_64 >>>>>>>>>>>> libnl3-3.2.28-4.el7.x86_64 >>>>>>>>>>>> > librdmacm-41mlnx1-OFED.4.7.3.0.6.49224.x86_64 >>>>>>>>>>>> > librxe-41mlnx1-OFED.4.4.2.4.6.49224.x86_64 >>>>>>>>>>>> libxcb-1.13-1.el7.x86_64 >>>>>>>>>>>> > libxml2-2.9.1-6.el7_9.6.x86_64 >>>>>>>>>>>> numactl-libs-2.0.12-5.el7.x86_64 >>>>>>>>>>>> > systemd-libs-219-78.el7_9.3.x86_64 xz-libs-5.2.2-1.el7.x86_64 >>>>>>>>>>>> > zlib-1.2.7-19.el7_9.x86_64 >>>>>>>>>>>> > (gdb) bt >>>>>>>>>>>> > #0 0x00002aaab5558b11 in >>>>>>>>>>>> > >>>>>>>>>>>> Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::CUPMDeviceInternal::initialize >>>>>>>>>>>> > (this=0x1) at >>>>>>>>>>>> > >>>>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:54 >>>>>>>>>>>> > #1 0x00002aaab5558db7 in >>>>>>>>>>>> > Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::getDevice >>>>>>>>>>>> > (this=this at entry=0x2aaab7f37b70 >>>>>>>>>>>> > , device=0x115da00, id=-35, id at entry=-1) at >>>>>>>>>>>> > >>>>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:344 >>>>>>>>>>>> > #2 0x00002aaab55577de in PetscDeviceCreate (type=type at entry >>>>>>>>>>>> =PETSC_DEVICE_CUDA, >>>>>>>>>>>> > devid=devid at entry=-1, device=device at entry=0x2aaab7f37b48 >>>>>>>>>>>> > ) at >>>>>>>>>>>> > >>>>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:107 >>>>>>>>>>>> > #3 0x00002aaab5557b3a in >>>>>>>>>>>> PetscDeviceInitializeDefaultDevice_Internal >>>>>>>>>>>> > (type=type at entry=PETSC_DEVICE_CUDA, >>>>>>>>>>>> defaultDeviceId=defaultDeviceId at entry=-1) >>>>>>>>>>>> > at >>>>>>>>>>>> > >>>>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:273 >>>>>>>>>>>> > #4 0x00002aaab5557bf6 in PetscDeviceInitialize >>>>>>>>>>>> > (type=type at entry=PETSC_DEVICE_CUDA) >>>>>>>>>>>> > at >>>>>>>>>>>> > >>>>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:234 >>>>>>>>>>>> > #5 0x00002aaab5661fcd in VecCreate_SeqCUDA (V=0x115d150) at >>>>>>>>>>>> > >>>>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/impls/seq/seqcuda/veccuda.c:244 >>>>>>>>>>>> > #6 0x00002aaab5649b40 in VecSetType (vec=vec at entry >>>>>>>>>>>> =0x115d150, >>>>>>>>>>>> > method=method at entry=0x2aaab70b45b8 "seqcuda") at >>>>>>>>>>>> > >>>>>>>>>>>> 
/home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vecreg.c:93 >>>>>>>>>>>> > #7 0x00002aaab579c33f in VecCreate_CUDA (v=0x115d150) at >>>>>>>>>>>> > >>>>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/impls/mpi/mpicuda/ >>>>>>>>>>>> > mpicuda.cu:214 >>>>>>>>>>>> > #8 0x00002aaab5649b40 in VecSetType (vec=vec at entry >>>>>>>>>>>> =0x115d150, >>>>>>>>>>>> > method=method at entry=0x7fffffff9260 "cuda") at >>>>>>>>>>>> > >>>>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vecreg.c:93 >>>>>>>>>>>> > #9 0x00002aaab5648bf1 in VecSetTypeFromOptions_Private >>>>>>>>>>>> (vec=0x115d150, >>>>>>>>>>>> > PetscOptionsObject=0x7fffffff9210) at >>>>>>>>>>>> > >>>>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vector.c:1263 >>>>>>>>>>>> > #10 VecSetFromOptions (vec=0x115d150) at >>>>>>>>>>>> > >>>>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vector.c:1297 >>>>>>>>>>>> > #11 0x00002aaab02ef227 in libMesh::PetscVector::init >>>>>>>>>>>> > (this=0x11cd1a0, n=441, n_local=441, fast=false, >>>>>>>>>>>> ptype=libMesh::PARALLEL) >>>>>>>>>>>> > at >>>>>>>>>>>> > >>>>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/scripts/../libmesh/installed/include/libmesh/petsc_vector.h:693 >>>>>>>>>>>> > >>>>>>>>>>>> > On Thu, Jan 20, 2022 at 1:09 PM Fande Kong < >>>>>>>>>>>> fdkong.jd at gmail.com> wrote: >>>>>>>>>>>> > >>>>>>>>>>>> >> Thanks, Jed, >>>>>>>>>>>> >> >>>>>>>>>>>> >> This worked! >>>>>>>>>>>> >> >>>>>>>>>>>> >> Fande >>>>>>>>>>>> >> >>>>>>>>>>>> >> On Wed, Jan 19, 2022 at 11:03 PM Jed Brown >>>>>>>>>>>> wrote: >>>>>>>>>>>> >> >>>>>>>>>>>> >>> Fande Kong writes: >>>>>>>>>>>> >>> >>>>>>>>>>>> >>> > On Wed, Jan 19, 2022 at 11:39 AM Jacob Faibussowitsch < >>>>>>>>>>>> >>> jacob.fai at gmail.com> >>>>>>>>>>>> >>> > wrote: >>>>>>>>>>>> >>> > >>>>>>>>>>>> >>> >> Are you running on login nodes or compute nodes (I can?t >>>>>>>>>>>> seem to tell >>>>>>>>>>>> >>> from >>>>>>>>>>>> >>> >> the configure.log)? >>>>>>>>>>>> >>> >> >>>>>>>>>>>> >>> > >>>>>>>>>>>> >>> > I was compiling codes on login nodes, and running codes >>>>>>>>>>>> on compute >>>>>>>>>>>> >>> nodes. >>>>>>>>>>>> >>> > Login nodes do not have GPUs, but compute nodes do have >>>>>>>>>>>> GPUs. >>>>>>>>>>>> >>> > >>>>>>>>>>>> >>> > Just to be clear, the same thing (code, machine) with >>>>>>>>>>>> PETSc-3.16.1 >>>>>>>>>>>> >>> worked >>>>>>>>>>>> >>> > perfectly. I have this trouble with PETSc-main. >>>>>>>>>>>> >>> >>>>>>>>>>>> >>> I assume you can >>>>>>>>>>>> >>> >>>>>>>>>>>> >>> export PETSC_OPTIONS='-device_enable lazy' >>>>>>>>>>>> >>> >>>>>>>>>>>> >>> and it'll work. >>>>>>>>>>>> >>> >>>>>>>>>>>> >>> I think this should be the default. The main complaint is >>>>>>>>>>>> that timing the >>>>>>>>>>>> >>> first GPU-using event isn't accurate if it includes >>>>>>>>>>>> initialization, but I >>>>>>>>>>>> >>> think this is mostly hypothetical because you can't trust >>>>>>>>>>>> any timing that >>>>>>>>>>>> >>> doesn't preload in some form and the first GPU-using event >>>>>>>>>>>> will almost >>>>>>>>>>>> >>> always be something uninteresting so I think it will rarely >>>>>>>>>>>> lead to >>>>>>>>>>>> >>> confusion. Meanwhile, eager initialization is viscerally >>>>>>>>>>>> disruptive for >>>>>>>>>>>> >>> lots of people. 
>>>>>>>>>>>> >>> >>>>>>>>>>>> >> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> What most experimenters take for granted before they begin their >>>>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>>>> experiments lead. >>>>>>>>>> -- Norbert Wiener >>>>>>>>>> >>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>> >>>>> >>>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From patrick.sanan at gmail.com Mon Jan 31 09:47:23 2022 From: patrick.sanan at gmail.com (Patrick Sanan) Date: Mon, 31 Jan 2022 16:47:23 +0100 Subject: [petsc-users] [EXTERNAL] Re: Question about PCFieldSplit In-Reply-To: <5f9009ef45944d068fd0e6d0c3cc8ca7@lanl.gov> References: <9bd268fa58a14c8ba55abffa3661dad1@lanl.gov> <8A8FE6DE-F995-4242-92C7-878C48ED1A70@gmail.com> <037C3F53-8329-4950-B83B-63AC118526E3@msu.edu> <2e1a71b944a949c99c24ad85bed9350a@lanl.gov> <3667C2C9-C4E7-48C7-B7FC-B48DDA210BCF@gmail.com> <5f9009ef45944d068fd0e6d0c3cc8ca7@lanl.gov> Message-ID: The current behavior is that a single IS is returned for each stratum, so if you have 2 unknowns on vertices, it should still return a single IS including both of those unknowns per vertex. Am I understanding that that's working as expected but you need *two* ISs in that case? Am Mo., 31. Jan. 2022 um 16:29 Uhr schrieb Jorti, Zakariae : > Hi Patrick, > > > Thanks for your recent updates on DMStag. > After getting the Finite Difference Coloring to work with our solver, I > was testing that DMCreateFieldDecomposition routine that you added last > week. > It seems to work fine when there is only one unknown per location (i.e. > one unknown on vertices and/or one unknown on edges and/or one unknown on > faces and/or one unknown on elements). > That being said, when there is more than one unknown in some location > (let's say 2 unknowns on vertices for instance), I could not get the ISs > for those two unknowns with that routine. > Should I still rely on PCFieldSplitSetDetectSaddlePoint in this case? > Many thanks once again. > > > Kind regards, > > > Zakariae > ------------------------------ > *From:* Patrick Sanan > *Sent:* Tuesday, January 25, 2022 5:36:17 AM > *To:* Matthew Knepley > *Cc:* Jorti, Zakariae; petsc-users at mcs.anl.gov; Tang, Xianzhu; Dave May > *Subject:* [EXTERNAL] Re: [petsc-users] Question about PCFieldSplit > > Here is an MR which intends to introduce some logic to support > DMCreateFieldDecomposition(). It doesn't use the PetscSection approach, > which might be preferable, but nonetheless is a necessary component so It'd > like to get it in, even if it has room for further optimization. Hopefully > this can be followed fairly soon with some more examples and tests using > PCFieldSplit itself. > > https://gitlab.com/petsc/petsc/-/merge_requests/4740 > > Am Mi., 23. Juni 2021 um 12:15 Uhr schrieb Matthew Knepley < > knepley at gmail.com>: > >> On Wed, Jun 23, 2021 at 12:51 AM Patrick Sanan >> wrote: >> >>> Hi Zakariae - >>> >>> The usual way to do this is to define an IS (index set) with the degrees >>> of freedom of interest for the rows, and another one for the columns, and >>> then use MatCreateSubmatrix [1] . >>> >>> There's not a particularly convenient way to create an IS with the >>> degrees of freedom corresponding to a particular "stratum" (i.e. elements, >>> faces, edges, or vertices) of a DMStag, but fortunately I believe we have >>> some code to do exactly this in a development branch. 
>>> >>> I'll track it down and see if it can quickly be added to the main branch. >>> >> >> Note that an easy way to keep track of this would be to create a section >> with the different locations as fields. This Section could then >> easily create the ISes, and could automatically interface with >> PCFIELDSPLIT. >> >> Thanks, >> >> Matt >> >> >>> >>> [1]: >>> https://petsc.org/release/docs/manualpages/Mat/MatCreateSubMatrix.html >>> >>> Am 22.06.2021 um 22:29 schrieb Jorti, Zakariae : >>> >>> Hello, >>> >>> I am working on DMStag and I have one dof on vertices (let us call >>> it V), one dof on edges (let us call it E), one dof on faces ((let us >>> call it F)) and one dof on cells (let us call it C). >>> I build a matrix on this DM, and I was wondering if there was a way to >>> get blocks (or sub matrices) of this matrix corresponding to specific >>> degrees of freedom, for example rows corresponding to V dofs and columns >>> corresponding to E dofs. >>> I already asked this question before and the answer I got was I could >>> call PCFieldSplitSetDetectSaddlePoint with the diagonal entries being >>> of the matrix being zero or nonzero. >>> That worked well. Nonetheless, I am curious to know if there >>> was another alternative that does not require creating a dummy matrix >>> with appropriate diagonal entries and solving a dummy linear system with >>> this matrix to define the splits. >>> >>> >>> Many thanks. >>> >>> Best regards, >>> >>> Zakariae >>> ------------------------------ >>> *From:* petsc-users on behalf of >>> Tang, Qi >>> *Sent:* Sunday, April 18, 2021 11:51:59 PM >>> *To:* Patrick Sanan >>> *Cc:* petsc-users at mcs.anl.gov; Tang, Xianzhu >>> *Subject:* [EXTERNAL] Re: [petsc-users] Question about PCFieldSplit >>> >>> Thanks a lot, Patrick. We appreciate your help. >>> >>> Qi >>> >>> >>> >>> On Apr 18, 2021, at 11:30 PM, Patrick Sanan >>> wrote: >>> >>> We have this functionality in a branch, which I'm working on cleaning up >>> to get to master. It doesn't use PETScSection. Sorry about the delay! >>> >>> You can only use PCFieldSplitSetDetectSaddlePoint when your diagonal >>> entries being zero or non-zero defines the splits correctly. >>> >>> Am 17.04.2021 um 21:09 schrieb Matthew Knepley : >>> >>> On Fri, Apr 16, 2021 at 8:39 PM Jorti, Zakariae via petsc-users < >>> petsc-users at mcs.anl.gov> wrote: >>> >>>> Hello, >>>> >>>> I have a DMStag grid with one dof on each edge and face center. >>>> I want to use a PCFieldSplit preconditioner on a Jacobian matrix that I >>>> assume is already split but I am not sure how to determine the fields. >>>> In the DMStag examples (ex2.c and ex3.c), the >>>> function PCFieldSplitSetDetectSaddlePoint is used to determine those fields >>>> based on zero diagonal entries. In my case, I have a Jacobian matrix that >>>> does not have zero diagonal entries. >>>> Can I use that PCFieldSplitSetDetectSaddlePoint in this case? >>>> If not, how should I do? >>>> Should I do like this example ( >>>> https://www.mcs.anl.gov/petsc/petsc-master/src/ksp/ksp/tutorials/ex43.c.html >>>> >>>> ): >>>> const PetscInt Bfields[1] = {0},Efields[1] = {1}; >>>> KSPGetPC(ksp,&pc); >>>> PCFieldSplitSetBlockSize(pc,2); >>>> PCFieldSplitSetFields(pc,"B",1,Bfields,Bfields); >>>> PCFieldSplitSetFields(pc,"E",1,Efields,Efields); >>>> where my B unknowns are defined on face centers and E unknowns are >>>> defined on edge centers? >>>> >>> That will not work.That interface only works for colocated fields that >>> you get from DMDA. 
>>> >>> Patrick, does DMSTAG use PetscSection? Then the field split would be >>> automatically calculated. If not, does it maintain the >>> field division so that it could be given to PCFIELDSPLIT as ISes? >>> >>> Thanks, >>> >>> Matt >>> >>>> One last thing, I do not know which field comes first. Is it the one >>>> defined for face dofs or edge dofs. >>>> >>>> Thank you. >>>> Best regards, >>>> >>>> Zakariae >>>> >>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From zjorti at lanl.gov Mon Jan 31 09:48:34 2022 From: zjorti at lanl.gov (Jorti, Zakariae) Date: Mon, 31 Jan 2022 15:48:34 +0000 Subject: [petsc-users] [EXTERNAL] Re: Question about PCFieldSplit In-Reply-To: References: <9bd268fa58a14c8ba55abffa3661dad1@lanl.gov> <8A8FE6DE-F995-4242-92C7-878C48ED1A70@gmail.com> <037C3F53-8329-4950-B83B-63AC118526E3@msu.edu> <2e1a71b944a949c99c24ad85bed9350a@lanl.gov> <3667C2C9-C4E7-48C7-B7FC-B48DDA210BCF@gmail.com> <5f9009ef45944d068fd0e6d0c3cc8ca7@lanl.gov>, Message-ID: <03813e5c80f74ae3a643ef2628086c12@lanl.gov> Yes, exactly. ________________________________ From: Patrick Sanan Sent: Monday, January 31, 2022 8:47:23 AM To: Jorti, Zakariae Cc: petsc-users at mcs.anl.gov; Tang, Xianzhu; Tang, Qi Subject: Re: [EXTERNAL] Re: [petsc-users] Question about PCFieldSplit The current behavior is that a single IS is returned for each stratum, so if you have 2 unknowns on vertices, it should still return a single IS including both of those unknowns per vertex. Am I understanding that that's working as expected but you need *two* ISs in that case? Am Mo., 31. Jan. 2022 um 16:29 Uhr schrieb Jorti, Zakariae >: Hi Patrick, Thanks for your recent updates on DMStag. After getting the Finite Difference Coloring to work with our solver, I was testing that DMCreateFieldDecomposition routine that you added last week. It seems to work fine when there is only one unknown per location (i.e. one unknown on vertices and/or one unknown on edges and/or one unknown on faces and/or one unknown on elements). That being said, when there is more than one unknown in some location (let's say 2 unknowns on vertices for instance), I could not get the ISs for those two unknowns with that routine. Should I still rely on PCFieldSplitSetDetectSaddlePoint in this case? Many thanks once again. Kind regards, Zakariae ________________________________ From: Patrick Sanan > Sent: Tuesday, January 25, 2022 5:36:17 AM To: Matthew Knepley Cc: Jorti, Zakariae; petsc-users at mcs.anl.gov; Tang, Xianzhu; Dave May Subject: [EXTERNAL] Re: [petsc-users] Question about PCFieldSplit Here is an MR which intends to introduce some logic to support DMCreateFieldDecomposition(). It doesn't use the PetscSection approach, which might be preferable, but nonetheless is a necessary component so It'd like to get it in, even if it has room for further optimization. Hopefully this can be followed fairly soon with some more examples and tests using PCFieldSplit itself. 
https://gitlab.com/petsc/petsc/-/merge_requests/4740 Am Mi., 23. Juni 2021 um 12:15 Uhr schrieb Matthew Knepley >: On Wed, Jun 23, 2021 at 12:51 AM Patrick Sanan > wrote: Hi Zakariae - The usual way to do this is to define an IS (index set) with the degrees of freedom of interest for the rows, and another one for the columns, and then use MatCreateSubmatrix [1] . There's not a particularly convenient way to create an IS with the degrees of freedom corresponding to a particular "stratum" (i.e. elements, faces, edges, or vertices) of a DMStag, but fortunately I believe we have some code to do exactly this in a development branch. I'll track it down and see if it can quickly be added to the main branch. Note that an easy way to keep track of this would be to create a section with the different locations as fields. This Section could then easily create the ISes, and could automatically interface with PCFIELDSPLIT. Thanks, Matt [1]: https://petsc.org/release/docs/manualpages/Mat/MatCreateSubMatrix.html Am 22.06.2021 um 22:29 schrieb Jorti, Zakariae >: Hello, I am working on DMStag and I have one dof on vertices (let us call it V), one dof on edges (let us call it E), one dof on faces ((let us call it F)) and one dof on cells (let us call it C). I build a matrix on this DM, and I was wondering if there was a way to get blocks (or sub matrices) of this matrix corresponding to specific degrees of freedom, for example rows corresponding to V dofs and columns corresponding to E dofs. I already asked this question before and the answer I got was I could call PCFieldSplitSetDetectSaddlePoint with the diagonal entries being of the matrix being zero or nonzero. That worked well. Nonetheless, I am curious to know if there was another alternative that does not require creating a dummy matrix with appropriate diagonal entries and solving a dummy linear system with this matrix to define the splits. Many thanks. Best regards, Zakariae ________________________________ From: petsc-users > on behalf of Tang, Qi > Sent: Sunday, April 18, 2021 11:51:59 PM To: Patrick Sanan Cc: petsc-users at mcs.anl.gov; Tang, Xianzhu Subject: [EXTERNAL] Re: [petsc-users] Question about PCFieldSplit Thanks a lot, Patrick. We appreciate your help. Qi On Apr 18, 2021, at 11:30 PM, Patrick Sanan > wrote: We have this functionality in a branch, which I'm working on cleaning up to get to master. It doesn't use PETScSection. Sorry about the delay! You can only use PCFieldSplitSetDetectSaddlePoint when your diagonal entries being zero or non-zero defines the splits correctly. Am 17.04.2021 um 21:09 schrieb Matthew Knepley >: On Fri, Apr 16, 2021 at 8:39 PM Jorti, Zakariae via petsc-users > wrote: Hello, I have a DMStag grid with one dof on each edge and face center. I want to use a PCFieldSplit preconditioner on a Jacobian matrix that I assume is already split but I am not sure how to determine the fields. In the DMStag examples (ex2.c and ex3.c), the function PCFieldSplitSetDetectSaddlePoint is used to determine those fields based on zero diagonal entries. In my case, I have a Jacobian matrix that does not have zero diagonal entries. Can I use that PCFieldSplitSetDetectSaddlePoint in this case? If not, how should I do? 
Should I do like this example (https://www.mcs.anl.gov/petsc/petsc-master/src/ksp/ksp/tutorials/ex43.c.html): const PetscInt Bfields[1] = {0},Efields[1] = {1}; KSPGetPC(ksp,&pc); PCFieldSplitSetBlockSize(pc,2); PCFieldSplitSetFields(pc,"B",1,Bfields,Bfields); PCFieldSplitSetFields(pc,"E",1,Efields,Efields); where my B unknowns are defined on face centers and E unknowns are defined on edge centers? That will not work.That interface only works for colocated fields that you get from DMDA. Patrick, does DMSTAG use PetscSection? Then the field split would be automatically calculated. If not, does it maintain the field division so that it could be given to PCFIELDSPLIT as ISes? Thanks, Matt One last thing, I do not know which field comes first. Is it the one defined for face dofs or edge dofs. Thank you. Best regards, Zakariae -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Jan 31 10:01:02 2022 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 31 Jan 2022 11:01:02 -0500 Subject: [petsc-users] [EXTERNAL] Re: Question about PCFieldSplit In-Reply-To: <03813e5c80f74ae3a643ef2628086c12@lanl.gov> References: <9bd268fa58a14c8ba55abffa3661dad1@lanl.gov> <8A8FE6DE-F995-4242-92C7-878C48ED1A70@gmail.com> <037C3F53-8329-4950-B83B-63AC118526E3@msu.edu> <2e1a71b944a949c99c24ad85bed9350a@lanl.gov> <3667C2C9-C4E7-48C7-B7FC-B48DDA210BCF@gmail.com> <5f9009ef45944d068fd0e6d0c3cc8ca7@lanl.gov> <03813e5c80f74ae3a643ef2628086c12@lanl.gov> Message-ID: On Mon, Jan 31, 2022 at 10:48 AM Jorti, Zakariae via petsc-users < petsc-users at mcs.anl.gov> wrote: > Yes, exactly. > If you already have the FieldSplit for the 2 dofs on vertices, you can easily recursively split this into 1 dof per block, since those are laid out like a colocated field. You just give another PCFIELDSPLIT of size 2 and the split will be automatic. Thanks, Matt > ------------------------------ > *From:* Patrick Sanan > *Sent:* Monday, January 31, 2022 8:47:23 AM > *To:* Jorti, Zakariae > *Cc:* petsc-users at mcs.anl.gov; Tang, Xianzhu; Tang, Qi > *Subject:* Re: [EXTERNAL] Re: [petsc-users] Question about PCFieldSplit > > The current behavior is that a single IS is returned for each stratum, so > if you have 2 unknowns on vertices, it should still return a single IS > including both of those unknowns per vertex. Am I understanding that that's > working as expected but you need *two* ISs in that case? > > > Am Mo., 31. Jan. 2022 um 16:29 Uhr schrieb Jorti, Zakariae < > zjorti at lanl.gov>: > >> Hi Patrick, >> >> >> Thanks for your recent updates on DMStag. >> After getting the Finite Difference Coloring to work with our solver, I >> was testing that DMCreateFieldDecomposition routine that you added last >> week. >> It seems to work fine when there is only one unknown per location (i.e. >> one unknown on vertices and/or one unknown on edges and/or one unknown >> on faces and/or one unknown on elements). 
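To spell out the recursive-split suggestion above: since the two vertex unknowns are colocated, the single split returned for the vertex stratum can itself be preconditioned with another PCFIELDSPLIT of block size 2. Assuming, purely for illustration, that the vertex split ends up with the options prefix fieldsplit_0_ (the actual prefix depends on the split name), the nested split could be requested from the command line with something like

  -pc_type fieldsplit -fieldsplit_0_pc_type fieldsplit -fieldsplit_0_pc_fieldsplit_block_size 2

so that the inner PCFIELDSPLIT treats the two colocated vertex unknowns as a blocked field and generates the two sub-splits on its own.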
>> That being said, when there is more than one unknown in some location >> (let's say 2 unknowns on vertices for instance), I could not get the ISs >> for those two unknowns with that routine. >> Should I still rely on PCFieldSplitSetDetectSaddlePoint in this case? >> Many thanks once again. >> >> >> Kind regards, >> >> >> Zakariae >> ------------------------------ >> *From:* Patrick Sanan >> *Sent:* Tuesday, January 25, 2022 5:36:17 AM >> *To:* Matthew Knepley >> *Cc:* Jorti, Zakariae; petsc-users at mcs.anl.gov; Tang, Xianzhu; Dave May >> *Subject:* [EXTERNAL] Re: [petsc-users] Question about PCFieldSplit >> >> Here is an MR which intends to introduce some logic to support >> DMCreateFieldDecomposition(). It doesn't use the PetscSection approach, >> which might be preferable, but nonetheless is a necessary component so It'd >> like to get it in, even if it has room for further optimization. Hopefully >> this can be followed fairly soon with some more examples and tests using >> PCFieldSplit itself. >> >> https://gitlab.com/petsc/petsc/-/merge_requests/4740 >> >> Am Mi., 23. Juni 2021 um 12:15 Uhr schrieb Matthew Knepley < >> knepley at gmail.com>: >> >>> On Wed, Jun 23, 2021 at 12:51 AM Patrick Sanan >>> wrote: >>> >>>> Hi Zakariae - >>>> >>>> The usual way to do this is to define an IS (index set) with the >>>> degrees of freedom of interest for the rows, and another one for the >>>> columns, and then use MatCreateSubmatrix [1] . >>>> >>>> There's not a particularly convenient way to create an IS with the >>>> degrees of freedom corresponding to a particular "stratum" (i.e. elements, >>>> faces, edges, or vertices) of a DMStag, but fortunately I believe we have >>>> some code to do exactly this in a development branch. >>>> >>>> I'll track it down and see if it can quickly be added to the main >>>> branch. >>>> >>> >>> Note that an easy way to keep track of this would be to create a section >>> with the different locations as fields. This Section could then >>> easily create the ISes, and could automatically interface with >>> PCFIELDSPLIT. >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> >>>> [1]: >>>> https://petsc.org/release/docs/manualpages/Mat/MatCreateSubMatrix.html >>>> >>>> Am 22.06.2021 um 22:29 schrieb Jorti, Zakariae : >>>> >>>> Hello, >>>> >>>> I am working on DMStag and I have one dof on vertices (let us call >>>> it V), one dof on edges (let us call it E), one dof on faces ((let us >>>> call it F)) and one dof on cells (let us call it C). >>>> I build a matrix on this DM, and I was wondering if there was a way to >>>> get blocks (or sub matrices) of this matrix corresponding to specific >>>> degrees of freedom, for example rows corresponding to V dofs and columns >>>> corresponding to E dofs. >>>> I already asked this question before and the answer I got was I could >>>> call PCFieldSplitSetDetectSaddlePoint with the diagonal entries being >>>> of the matrix being zero or nonzero. >>>> That worked well. Nonetheless, I am curious to know if there >>>> was another alternative that does not require creating a dummy matrix >>>> with appropriate diagonal entries and solving a dummy linear system with >>>> this matrix to define the splits. >>>> >>>> >>>> Many thanks. 
>>>> >>>> Best regards, >>>> >>>> Zakariae >>>> ------------------------------ >>>> *From:* petsc-users on behalf of >>>> Tang, Qi >>>> *Sent:* Sunday, April 18, 2021 11:51:59 PM >>>> *To:* Patrick Sanan >>>> *Cc:* petsc-users at mcs.anl.gov; Tang, Xianzhu >>>> *Subject:* [EXTERNAL] Re: [petsc-users] Question about PCFieldSplit >>>> >>>> Thanks a lot, Patrick. We appreciate your help. >>>> >>>> Qi >>>> >>>> >>>> >>>> On Apr 18, 2021, at 11:30 PM, Patrick Sanan >>>> wrote: >>>> >>>> We have this functionality in a branch, which I'm working on cleaning >>>> up to get to master. It doesn't use PETScSection. Sorry about the delay! >>>> >>>> You can only use PCFieldSplitSetDetectSaddlePoint when your diagonal >>>> entries being zero or non-zero defines the splits correctly. >>>> >>>> Am 17.04.2021 um 21:09 schrieb Matthew Knepley : >>>> >>>> On Fri, Apr 16, 2021 at 8:39 PM Jorti, Zakariae via petsc-users < >>>> petsc-users at mcs.anl.gov> wrote: >>>> >>>>> Hello, >>>>> >>>>> I have a DMStag grid with one dof on each edge and face center. >>>>> I want to use a PCFieldSplit preconditioner on a Jacobian matrix that >>>>> I assume is already split but I am not sure how to determine the fields. >>>>> In the DMStag examples (ex2.c and ex3.c), the >>>>> function PCFieldSplitSetDetectSaddlePoint is used to determine those fields >>>>> based on zero diagonal entries. In my case, I have a Jacobian matrix that >>>>> does not have zero diagonal entries. >>>>> Can I use that PCFieldSplitSetDetectSaddlePoint in this case? >>>>> If not, how should I do? >>>>> Should I do like this example ( >>>>> https://www.mcs.anl.gov/petsc/petsc-master/src/ksp/ksp/tutorials/ex43.c.html >>>>> >>>>> ): >>>>> const PetscInt Bfields[1] = {0},Efields[1] = {1}; >>>>> KSPGetPC(ksp,&pc); >>>>> PCFieldSplitSetBlockSize(pc,2); >>>>> PCFieldSplitSetFields(pc,"B",1,Bfields,Bfields); >>>>> PCFieldSplitSetFields(pc,"E",1,Efields,Efields); >>>>> where my B unknowns are defined on face centers and E unknowns are >>>>> defined on edge centers? >>>>> >>>> That will not work.That interface only works for colocated fields that >>>> you get from DMDA. >>>> >>>> Patrick, does DMSTAG use PetscSection? Then the field split would be >>>> automatically calculated. If not, does it maintain the >>>> field division so that it could be given to PCFIELDSPLIT as ISes? >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>>> One last thing, I do not know which field comes first. Is it the one >>>>> defined for face dofs or edge dofs. >>>>> >>>>> Thank you. >>>>> Best regards, >>>>> >>>>> Zakariae >>>>> >>>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>>> >>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From junchao.zhang at gmail.com Mon Jan 31 10:19:38 2022 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Mon, 31 Jan 2022 10:19:38 -0600 Subject: [petsc-users] Cannot eagerly initialize cuda, as doing so results in cuda error 35 (cudaErrorInsufficientDriver) : CUDA driver version is insufficient for CUDA runtime version In-Reply-To: References: <233D0F20-AD95-481B-B862-353D5789C556@gmail.com> <87czkn9c64.fsf@jedbrown.org> <875yqe7zip.fsf@jedbrown.org> Message-ID: Fande, From your configure_main.log cuda: Version: 10.1 Includes: -I/apps/local/spack/software/gcc-7.5.0/cuda-10.1.243-v4ymjqcrr7f72qfiuzsstuy5jiajbuey/include Library: -Wl,-rpath,/apps/local/spack/software/gcc-7.5.0/cuda-10.1.243-v4ymjqcrr7f72qfiuzsstuy5jiajbuey/lib64 -L/apps/local/spack/software/gcc-7.5.0/cuda-10.1.243-v4ymjqcrr7f72qfiuzsstuy5jiajbuey/lib64 -L/apps/local/spack/software/gcc-7.5.0/cuda-10.1.243-v4ymjqcrr7f72qfiuzsstuy5jiajbuey/lib64/stubs -lcudart -lcufft -lcublas -lcusparse -lcusolver -lcurand -lcuda You can see the `stubs` directory is not in rpath. We took a lot of effort to achieve that. You need to double check the reason. --Junchao Zhang On Mon, Jan 31, 2022 at 9:40 AM Fande Kong wrote: > OK, > > Finally we resolved the issue. The issue was that there were two libcuda > libs on a GPU compute node: /usr/lib64/libcuda > and /apps/local/spack/software/gcc-7.5.0/cuda-10.1.243-v4ymjqcrr7f72qfiuzsstuy5jiajbuey/lib64/stubs/libcuda. > But on a login node there is one libcuda lib: > /apps/local/spack/software/gcc-7.5.0/cuda-10.1.243-v4ymjqcrr7f72qfiuzsstuy5jiajbuey/lib64/stubs/libcuda. > We can not see /usr/lib64/libcuda from a login node where I was compiling > the code. > > Before the Junchao's commit, we did not have "-Wl,-rpath" to force PETSc > take > /apps/local/spack/software/gcc-7.5.0/cuda-10.1.243-v4ymjqcrr7f72qfiuzsstuy5jiajbuey/lib64/stubs/libcuda. > A code compiled on a login node could correctly pick up the cuda lib > from /usr/lib64/libcuda at runtime. When with "-Wl,-rpath", the code > always takes the cuda lib from > /apps/local/spack/software/gcc-7.5.0/cuda-10.1.243-v4ymjqcrr7f72qfiuzsstuy5jiajbuey/lib64/stubs/libcuda, > wihch was a bad lib. > > Right now, I just compiled code on a compute node instead of a login node, > PETSc was able to pick up the correct lib from /usr/lib64/libcuda, and > everything ran fine. > > I am not sure whether or not it is a good idea to search for "stubs" since > the system might have the correct ones in other places. Should not I do a > batch compiling? > > Thanks, > > Fande > > > On Wed, Jan 26, 2022 at 1:49 PM Fande Kong wrote: > >> Yes, please see the attached file. >> >> Fande >> >> On Wed, Jan 26, 2022 at 11:49 AM Junchao Zhang >> wrote: >> >>> Do you have the configure.log with main? >>> >>> --Junchao Zhang >>> >>> >>> On Wed, Jan 26, 2022 at 12:26 PM Fande Kong wrote: >>> >>>> I am on the petsc-main >>>> >>>> commit 1390d3a27d88add7d79c9b38bf1a895ae5e67af6 >>>> >>>> Merge: 96c919c d5f3255 >>>> >>>> Author: Satish Balay >>>> >>>> Date: Wed Jan 26 10:28:32 2022 -0600 >>>> >>>> >>>> Merge remote-tracking branch 'origin/release' >>>> >>>> >>>> It is still broken. >>>> >>>> Thanks, >>>> >>>> >>>> Fande >>>> >>>> On Wed, Jan 26, 2022 at 7:40 AM Junchao Zhang >>>> wrote: >>>> >>>>> The good uses the compiler's default library/header path. The bad >>>>> searches from cuda toolkit path and uses rpath linking. >>>>> Though the paths look the same on the login node, they could have >>>>> different behavior on a compute node depending on its environment. 
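(For anyone checking their own build: a quick way to see whether the stubs directory and -lcuda ended up in the generated link information, besides the cuda: Library line from configure.log quoted above, is to grep the variables file Barry points at further down in this thread, e.g.

$ grep -n stubs $PETSC_DIR/$PETSC_ARCH/lib/petsc/conf/variables

The grep pattern is only an illustration, and the exact file name can differ slightly between PETSc versions.)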
>>>>> I think we fixed the issue in cuda.py (i.e., first try the compiler's >>>>> default, then toolkit). That's why I wanted Fande to use petsc/main. >>>>> >>>>> --Junchao Zhang >>>>> >>>>> >>>>> On Tue, Jan 25, 2022 at 11:59 PM Barry Smith wrote: >>>>> >>>>>> >>>>>> bad has extra >>>>>> >>>>>> -L/apps/local/spack/software/gcc-7.5.0/cuda-10.1.243-v4ymjqcrr7f72qfiuzsstuy5jiajbuey/lib64/stubs >>>>>> -lcuda >>>>>> >>>>>> good does not. >>>>>> >>>>>> Try removing the stubs directory and -lcuda from the bad >>>>>> $PETSC_ARCH/lib/petsc/conf/variables and likely the bad will start working. >>>>>> >>>>>> Barry >>>>>> >>>>>> I never liked the stubs stuff. >>>>>> >>>>>> On Jan 25, 2022, at 11:29 PM, Fande Kong wrote: >>>>>> >>>>>> Hi Junchao, >>>>>> >>>>>> I attached a "bad" configure log and a "good" configure log. >>>>>> >>>>>> The "bad" one was on produced >>>>>> at 246ba74192519a5f34fb6e227d1c64364e19ce2c >>>>>> >>>>>> and the "good" one at 384645a00975869a1aacbd3169de62ba40cad683 >>>>>> >>>>>> This good hash is the last good hash that is just the right before >>>>>> the bad one. >>>>>> >>>>>> I think you could do a comparison between these two logs, and check >>>>>> what the differences were. >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Fande >>>>>> >>>>>> On Tue, Jan 25, 2022 at 8:21 PM Junchao Zhang < >>>>>> junchao.zhang at gmail.com> wrote: >>>>>> >>>>>>> Fande, could you send the configure.log that works (i.e., before >>>>>>> this offending commit)? >>>>>>> --Junchao Zhang >>>>>>> >>>>>>> >>>>>>> On Tue, Jan 25, 2022 at 8:21 PM Fande Kong >>>>>>> wrote: >>>>>>> >>>>>>>> Not sure if this is helpful. I did "git bisect", and here was the >>>>>>>> result: >>>>>>>> >>>>>>>> [kongf at sawtooth2 petsc]$ git bisect bad >>>>>>>> 246ba74192519a5f34fb6e227d1c64364e19ce2c is the first bad commit >>>>>>>> commit 246ba74192519a5f34fb6e227d1c64364e19ce2c >>>>>>>> Author: Junchao Zhang >>>>>>>> Date: Wed Oct 13 05:32:43 2021 +0000 >>>>>>>> >>>>>>>> Config: fix CUDA library and header dirs >>>>>>>> >>>>>>>> :040000 040000 187c86055adb80f53c1d0565a8888704fec43a96 >>>>>>>> ea1efd7f594fd5e8df54170bc1bc7b00f35e4d5f M config >>>>>>>> >>>>>>>> >>>>>>>> Started from this commit, and GPU did not work for me on our HPC >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Fande >>>>>>>> >>>>>>>> On Tue, Jan 25, 2022 at 7:18 PM Fande Kong >>>>>>>> wrote: >>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On Tue, Jan 25, 2022 at 9:04 AM Jacob Faibussowitsch < >>>>>>>>> jacob.fai at gmail.com> wrote: >>>>>>>>> >>>>>>>>>> Configure should not have an impact here I think. The reason I >>>>>>>>>> had you run `cudaGetDeviceCount()` is because this is the CUDA call (and in >>>>>>>>>> fact the only CUDA call) in the initialization sequence that returns the >>>>>>>>>> error code. There should be no prior CUDA calls. Maybe this is a problem >>>>>>>>>> with oversubscribing GPU?s? In the runs that crash, how many ranks are >>>>>>>>>> using any given GPU at once? Maybe MPS is required. >>>>>>>>>> >>>>>>>>> >>>>>>>>> I used one MPI rank. 
>>>>>>>>> >>>>>>>>> Fande >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>>> >>>>>>>>>> Best regards, >>>>>>>>>> >>>>>>>>>> Jacob Faibussowitsch >>>>>>>>>> (Jacob Fai - booss - oh - vitch) >>>>>>>>>> >>>>>>>>>> On Jan 21, 2022, at 12:01, Fande Kong >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>> Thanks Jacob, >>>>>>>>>> >>>>>>>>>> On Thu, Jan 20, 2022 at 6:25 PM Jacob Faibussowitsch < >>>>>>>>>> jacob.fai at gmail.com> wrote: >>>>>>>>>> >>>>>>>>>>> Segfault is caused by the following check at >>>>>>>>>>> src/sys/objects/device/impls/cupm/cupmdevice.cxx:349 being a >>>>>>>>>>> PetscUnlikelyDebug() rather than just PetscUnlikely(): >>>>>>>>>>> >>>>>>>>>>> ``` >>>>>>>>>>> if (PetscUnlikelyDebug(_defaultDevice < 0)) { // _defaultDevice >>>>>>>>>>> is in fact < 0 here and uncaught >>>>>>>>>>> ``` >>>>>>>>>>> >>>>>>>>>>> To clarify: >>>>>>>>>>> >>>>>>>>>>> ?lazy? initialization is not that lazy after all, it still does >>>>>>>>>>> some 50% of the initialization that ?eager? initialization does. It stops >>>>>>>>>>> short initializing the CUDA runtime, checking CUDA aware MPI, gathering >>>>>>>>>>> device data, and initializing cublas and friends. Lazy also importantly >>>>>>>>>>> swallows any errors that crop up during initialization, storing the >>>>>>>>>>> resulting error code for later (specifically _defaultDevice = >>>>>>>>>>> -init_error_value;). >>>>>>>>>>> >>>>>>>>>>> So whether you initialize lazily or eagerly makes no difference >>>>>>>>>>> here, as _defaultDevice will always contain -35. >>>>>>>>>>> >>>>>>>>>>> The bigger question is why cudaGetDeviceCount() is returning >>>>>>>>>>> cudaErrorInsufficientDriver. Can you compile and run >>>>>>>>>>> >>>>>>>>>>> ``` >>>>>>>>>>> #include >>>>>>>>>>> >>>>>>>>>>> int main() >>>>>>>>>>> { >>>>>>>>>>> int ndev; >>>>>>>>>>> return cudaGetDeviceCount(&ndev): >>>>>>>>>>> } >>>>>>>>>>> ``` >>>>>>>>>>> >>>>>>>>>>> Then show the value of "echo $??? >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Modify your code a little to get more information. >>>>>>>>>> >>>>>>>>>> #include >>>>>>>>>> #include >>>>>>>>>> >>>>>>>>>> int main() >>>>>>>>>> { >>>>>>>>>> int ndev; >>>>>>>>>> int error = cudaGetDeviceCount(&ndev); >>>>>>>>>> printf("ndev %d \n", ndev); >>>>>>>>>> printf("error %d \n", error); >>>>>>>>>> return 0; >>>>>>>>>> } >>>>>>>>>> >>>>>>>>>> Results: >>>>>>>>>> >>>>>>>>>> $ ./a.out >>>>>>>>>> ndev 4 >>>>>>>>>> error 0 >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> I have not read the PETSc cuda initialization code yet. If I need >>>>>>>>>> to guess at what was happening. I will naively think that PETSc did not get >>>>>>>>>> correct GPU information in the configuration because the compiler node does >>>>>>>>>> not have GPUs, and there was no way to get any GPU device information. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> During the runtime on GPU nodes, PETSc might have incorrect >>>>>>>>>> information grabbed during configuration and had this kind of false error >>>>>>>>>> message. 
>>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> >>>>>>>>>> Fande >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Best regards, >>>>>>>>>>> >>>>>>>>>>> Jacob Faibussowitsch >>>>>>>>>>> (Jacob Fai - booss - oh - vitch) >>>>>>>>>>> >>>>>>>>>>> On Jan 20, 2022, at 17:47, Matthew Knepley >>>>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>> On Thu, Jan 20, 2022 at 6:44 PM Fande Kong >>>>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>>> Thanks, Jed >>>>>>>>>>>> >>>>>>>>>>>> On Thu, Jan 20, 2022 at 4:34 PM Jed Brown >>>>>>>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> You can't create CUDA or Kokkos Vecs if you're running on a >>>>>>>>>>>>> node without a GPU. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> I am running the code on compute nodes that do have GPUs. >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> If you are actually running on GPUs, why would you need lazy >>>>>>>>>>> initialization? It would not break with GPUs present. >>>>>>>>>>> >>>>>>>>>>> Matt >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> With PETSc-3.16.1, I got good speedup by running GAMG on >>>>>>>>>>>> GPUs. That might be a bug of PETSc-main. >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> >>>>>>>>>>>> Fande >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> KSPSetUp 13 1.0 6.4400e-01 1.0 2.02e+09 1.0 >>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 5 0 0 0 0 5 0 0 0 3140 64630 15 >>>>>>>>>>>> 1.05e+02 5 3.49e+01 100 >>>>>>>>>>>> KSPSolve 1 1.0 1.0109e+00 1.0 3.49e+10 1.0 >>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 87 0 0 0 0 87 0 0 0 34522 69556 4 >>>>>>>>>>>> 4.35e-03 1 2.38e-03 100 >>>>>>>>>>>> KSPGMRESOrthog 142 1.0 1.2674e-01 1.0 1.06e+10 1.0 >>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 27 0 0 0 0 27 0 0 0 83755 87801 0 >>>>>>>>>>>> 0.00e+00 0 0.00e+00 100 >>>>>>>>>>>> SNESSolve 1 1.0 4.4402e+01 1.0 4.00e+10 1.0 >>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 21100 0 0 0 21100 0 0 0 901 51365 57 >>>>>>>>>>>> 1.10e+03 52 8.78e+02 100 >>>>>>>>>>>> SNESSetUp 1 1.0 3.9101e-05 1.0 0.00e+00 0.0 >>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>>>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>>>>>>> SNESFunctionEval 2 1.0 1.7097e+01 1.0 1.60e+07 1.0 >>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 8 0 0 0 0 8 0 0 0 0 1 0 0 >>>>>>>>>>>> 0.00e+00 6 1.92e+02 0 >>>>>>>>>>>> SNESJacobianEval 1 1.0 1.6213e+01 1.0 2.80e+07 1.0 >>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 8 0 0 0 0 8 0 0 0 0 2 0 0 >>>>>>>>>>>> 0.00e+00 1 3.20e+01 0 >>>>>>>>>>>> SNESLineSearch 1 1.0 8.5582e+00 1.0 1.24e+08 1.0 >>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 4 0 0 0 0 4 0 0 0 0 14 64153 1 >>>>>>>>>>>> 3.20e+01 3 9.61e+01 94 >>>>>>>>>>>> PCGAMGGraph_AGG 5 1.0 3.0509e+00 1.0 8.19e+07 1.0 >>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 27 0 5 >>>>>>>>>>>> 3.49e+01 9 7.43e+01 0 >>>>>>>>>>>> PCGAMGCoarse_AGG 5 1.0 3.8711e+00 1.0 0.00e+00 0.0 >>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 0 0 >>>>>>>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>>>>>>> PCGAMGProl_AGG 5 1.0 7.0748e-01 1.0 0.00e+00 0.0 >>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>>>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>>>>>>> PCGAMGPOpt_AGG 5 1.0 1.2904e+00 1.0 2.14e+09 1.0 >>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 1 5 0 0 0 1 5 0 0 0 1661 29807 26 >>>>>>>>>>>> 7.15e+02 20 2.90e+02 99 >>>>>>>>>>>> GAMG: createProl 5 1.0 8.9489e+00 1.0 2.22e+09 1.0 >>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 4 6 0 0 0 4 6 0 0 0 249 29666 31 >>>>>>>>>>>> 7.50e+02 29 3.64e+02 96 >>>>>>>>>>>> Graph 10 1.0 3.0478e+00 1.0 8.19e+07 1.0 >>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 27 0 5 >>>>>>>>>>>> 3.49e+01 9 7.43e+01 0 >>>>>>>>>>>> MIS/Agg 5 1.0 4.1290e-01 1.0 0.00e+00 0.0 >>>>>>>>>>>> 0.0e+00 
0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>>>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>>>>>>> SA: col data 5 1.0 1.9127e-02 1.0 0.00e+00 0.0 >>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>>>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>>>>>>> SA: frmProl0 5 1.0 6.2662e-01 1.0 0.00e+00 0.0 >>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>>>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>>>>>>> SA: smooth 5 1.0 4.9595e-01 1.0 1.21e+08 1.0 >>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 244 2709 15 >>>>>>>>>>>> 1.97e+02 15 2.55e+02 90 >>>>>>>>>>>> GAMG: partLevel 5 1.0 4.7330e-01 1.0 6.98e+08 1.0 >>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 1475 4120 5 >>>>>>>>>>>> 1.78e+02 10 2.55e+02 100 >>>>>>>>>>>> PCGAMG Squ l00 1 1.0 2.6027e+00 1.0 0.00e+00 0.0 >>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 0 0 >>>>>>>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>>>>>>> PCGAMG Gal l00 1 1.0 3.8406e-01 1.0 5.48e+08 1.0 >>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 1426 4270 1 >>>>>>>>>>>> 1.48e+02 2 2.11e+02 100 >>>>>>>>>>>> PCGAMG Opt l00 1 1.0 2.4932e-01 1.0 7.20e+07 1.0 >>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 289 2653 1 >>>>>>>>>>>> 6.41e+01 1 1.13e+02 100 >>>>>>>>>>>> PCGAMG Gal l01 1 1.0 6.6279e-02 1.0 1.09e+08 1.0 >>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1645 3851 1 >>>>>>>>>>>> 2.40e+01 2 3.64e+01 100 >>>>>>>>>>>> PCGAMG Opt l01 1 1.0 2.9544e-02 1.0 7.15e+06 1.0 >>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 242 1671 1 >>>>>>>>>>>> 4.84e+00 1 1.23e+01 100 >>>>>>>>>>>> PCGAMG Gal l02 1 1.0 1.8874e-02 1.0 3.72e+07 1.0 >>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1974 3636 1 >>>>>>>>>>>> 5.04e+00 2 6.58e+00 100 >>>>>>>>>>>> PCGAMG Opt l02 1 1.0 7.4353e-03 1.0 2.40e+06 1.0 >>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 323 1457 1 >>>>>>>>>>>> 7.71e-01 1 2.30e+00 100 >>>>>>>>>>>> PCGAMG Gal l03 1 1.0 2.8479e-03 1.0 4.10e+06 1.0 >>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1440 2266 1 >>>>>>>>>>>> 4.44e-01 2 5.51e-01 100 >>>>>>>>>>>> PCGAMG Opt l03 1 1.0 8.2684e-04 1.0 2.80e+05 1.0 >>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 339 1667 1 >>>>>>>>>>>> 6.72e-02 1 2.03e-01 100 >>>>>>>>>>>> PCGAMG Gal l04 1 1.0 1.2238e-03 1.0 2.09e+05 1.0 >>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 170 244 1 >>>>>>>>>>>> 2.05e-02 2 2.53e-02 100 >>>>>>>>>>>> PCGAMG Opt l04 1 1.0 4.1008e-04 1.0 1.77e+04 1.0 >>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 43 165 1 >>>>>>>>>>>> 4.49e-03 1 1.19e-02 100 >>>>>>>>>>>> PCSetUp 2 1.0 9.9632e+00 1.0 4.95e+09 1.0 >>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 5 12 0 0 0 5 12 0 0 0 496 17826 55 >>>>>>>>>>>> 1.03e+03 45 6.54e+02 98 >>>>>>>>>>>> PCSetUpOnBlocks 44 1.0 9.9087e-04 1.0 2.88e+03 1.0 >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> The point of lazy initialization is to make it possible to run >>>>>>>>>>>>> a solve that doesn't use a GPU in PETSC_ARCH that supports GPUs, regardless >>>>>>>>>>>>> of whether a GPU is actually present. >>>>>>>>>>>>> >>>>>>>>>>>>> Fande Kong writes: >>>>>>>>>>>>> >>>>>>>>>>>>> > I spoke too soon. It seems that we have trouble creating >>>>>>>>>>>>> cuda/kokkos vecs >>>>>>>>>>>>> > now. Got Segmentation fault. >>>>>>>>>>>>> > >>>>>>>>>>>>> > Thanks, >>>>>>>>>>>>> > >>>>>>>>>>>>> > Fande >>>>>>>>>>>>> > >>>>>>>>>>>>> > Program received signal SIGSEGV, Segmentation fault. 
>>>>>>>>>>>>> > 0x00002aaab5558b11 in >>>>>>>>>>>>> > >>>>>>>>>>>>> Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::CUPMDeviceInternal::initialize >>>>>>>>>>>>> > (this=0x1) at >>>>>>>>>>>>> > >>>>>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:54 >>>>>>>>>>>>> > 54 PetscErrorCode >>>>>>>>>>>>> CUPMDevice::CUPMDeviceInternal::initialize() noexcept >>>>>>>>>>>>> > Missing separate debuginfos, use: debuginfo-install >>>>>>>>>>>>> > bzip2-libs-1.0.6-13.el7.x86_64 >>>>>>>>>>>>> elfutils-libelf-0.176-5.el7.x86_64 >>>>>>>>>>>>> > elfutils-libs-0.176-5.el7.x86_64 glibc-2.17-325.el7_9.x86_64 >>>>>>>>>>>>> > libX11-1.6.7-4.el7_9.x86_64 libXau-1.0.8-2.1.el7.x86_64 >>>>>>>>>>>>> > libattr-2.4.46-13.el7.x86_64 libcap-2.22-11.el7.x86_64 >>>>>>>>>>>>> > libibmad-5.4.0.MLNX20190423.1d917ae-0.1.49224.x86_64 >>>>>>>>>>>>> > libibumad-43.1.1.MLNX20200211.078947f-0.1.49224.x86_64 >>>>>>>>>>>>> > libibverbs-41mlnx1-OFED.4.9.0.0.7.49224.x86_64 >>>>>>>>>>>>> > libmlx4-41mlnx1-OFED.4.7.3.0.3.49224.x86_64 >>>>>>>>>>>>> > libmlx5-41mlnx1-OFED.4.9.0.1.2.49224.x86_64 >>>>>>>>>>>>> libnl3-3.2.28-4.el7.x86_64 >>>>>>>>>>>>> > librdmacm-41mlnx1-OFED.4.7.3.0.6.49224.x86_64 >>>>>>>>>>>>> > librxe-41mlnx1-OFED.4.4.2.4.6.49224.x86_64 >>>>>>>>>>>>> libxcb-1.13-1.el7.x86_64 >>>>>>>>>>>>> > libxml2-2.9.1-6.el7_9.6.x86_64 >>>>>>>>>>>>> numactl-libs-2.0.12-5.el7.x86_64 >>>>>>>>>>>>> > systemd-libs-219-78.el7_9.3.x86_64 xz-libs-5.2.2-1.el7.x86_64 >>>>>>>>>>>>> > zlib-1.2.7-19.el7_9.x86_64 >>>>>>>>>>>>> > (gdb) bt >>>>>>>>>>>>> > #0 0x00002aaab5558b11 in >>>>>>>>>>>>> > >>>>>>>>>>>>> Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::CUPMDeviceInternal::initialize >>>>>>>>>>>>> > (this=0x1) at >>>>>>>>>>>>> > >>>>>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:54 >>>>>>>>>>>>> > #1 0x00002aaab5558db7 in >>>>>>>>>>>>> > Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::getDevice >>>>>>>>>>>>> > (this=this at entry=0x2aaab7f37b70 >>>>>>>>>>>>> > , device=0x115da00, id=-35, id at entry=-1) at >>>>>>>>>>>>> > >>>>>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:344 >>>>>>>>>>>>> > #2 0x00002aaab55577de in PetscDeviceCreate (type=type at entry >>>>>>>>>>>>> =PETSC_DEVICE_CUDA, >>>>>>>>>>>>> > devid=devid at entry=-1, device=device at entry=0x2aaab7f37b48 >>>>>>>>>>>>> > ) at >>>>>>>>>>>>> > >>>>>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:107 >>>>>>>>>>>>> > #3 0x00002aaab5557b3a in >>>>>>>>>>>>> PetscDeviceInitializeDefaultDevice_Internal >>>>>>>>>>>>> > (type=type at entry=PETSC_DEVICE_CUDA, >>>>>>>>>>>>> defaultDeviceId=defaultDeviceId at entry=-1) >>>>>>>>>>>>> > at >>>>>>>>>>>>> > >>>>>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:273 >>>>>>>>>>>>> > #4 0x00002aaab5557bf6 in PetscDeviceInitialize >>>>>>>>>>>>> > (type=type at entry=PETSC_DEVICE_CUDA) >>>>>>>>>>>>> > at >>>>>>>>>>>>> > >>>>>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:234 >>>>>>>>>>>>> > #5 0x00002aaab5661fcd in VecCreate_SeqCUDA (V=0x115d150) at >>>>>>>>>>>>> > >>>>>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/impls/seq/seqcuda/veccuda.c:244 >>>>>>>>>>>>> > #6 0x00002aaab5649b40 in VecSetType (vec=vec at entry >>>>>>>>>>>>> =0x115d150, >>>>>>>>>>>>> > method=method at entry=0x2aaab70b45b8 "seqcuda") at >>>>>>>>>>>>> > >>>>>>>>>>>>> 
/home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vecreg.c:93 >>>>>>>>>>>>> > #7 0x00002aaab579c33f in VecCreate_CUDA (v=0x115d150) at >>>>>>>>>>>>> > >>>>>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/impls/mpi/mpicuda/ >>>>>>>>>>>>> > mpicuda.cu:214 >>>>>>>>>>>>> > #8 0x00002aaab5649b40 in VecSetType (vec=vec at entry >>>>>>>>>>>>> =0x115d150, >>>>>>>>>>>>> > method=method at entry=0x7fffffff9260 "cuda") at >>>>>>>>>>>>> > >>>>>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vecreg.c:93 >>>>>>>>>>>>> > #9 0x00002aaab5648bf1 in VecSetTypeFromOptions_Private >>>>>>>>>>>>> (vec=0x115d150, >>>>>>>>>>>>> > PetscOptionsObject=0x7fffffff9210) at >>>>>>>>>>>>> > >>>>>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vector.c:1263 >>>>>>>>>>>>> > #10 VecSetFromOptions (vec=0x115d150) at >>>>>>>>>>>>> > >>>>>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vector.c:1297 >>>>>>>>>>>>> > #11 0x00002aaab02ef227 in libMesh::PetscVector::init >>>>>>>>>>>>> > (this=0x11cd1a0, n=441, n_local=441, fast=false, >>>>>>>>>>>>> ptype=libMesh::PARALLEL) >>>>>>>>>>>>> > at >>>>>>>>>>>>> > >>>>>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/scripts/../libmesh/installed/include/libmesh/petsc_vector.h:693 >>>>>>>>>>>>> > >>>>>>>>>>>>> > On Thu, Jan 20, 2022 at 1:09 PM Fande Kong < >>>>>>>>>>>>> fdkong.jd at gmail.com> wrote: >>>>>>>>>>>>> > >>>>>>>>>>>>> >> Thanks, Jed, >>>>>>>>>>>>> >> >>>>>>>>>>>>> >> This worked! >>>>>>>>>>>>> >> >>>>>>>>>>>>> >> Fande >>>>>>>>>>>>> >> >>>>>>>>>>>>> >> On Wed, Jan 19, 2022 at 11:03 PM Jed Brown < >>>>>>>>>>>>> jed at jedbrown.org> wrote: >>>>>>>>>>>>> >> >>>>>>>>>>>>> >>> Fande Kong writes: >>>>>>>>>>>>> >>> >>>>>>>>>>>>> >>> > On Wed, Jan 19, 2022 at 11:39 AM Jacob Faibussowitsch < >>>>>>>>>>>>> >>> jacob.fai at gmail.com> >>>>>>>>>>>>> >>> > wrote: >>>>>>>>>>>>> >>> > >>>>>>>>>>>>> >>> >> Are you running on login nodes or compute nodes (I >>>>>>>>>>>>> can?t seem to tell >>>>>>>>>>>>> >>> from >>>>>>>>>>>>> >>> >> the configure.log)? >>>>>>>>>>>>> >>> >> >>>>>>>>>>>>> >>> > >>>>>>>>>>>>> >>> > I was compiling codes on login nodes, and running codes >>>>>>>>>>>>> on compute >>>>>>>>>>>>> >>> nodes. >>>>>>>>>>>>> >>> > Login nodes do not have GPUs, but compute nodes do have >>>>>>>>>>>>> GPUs. >>>>>>>>>>>>> >>> > >>>>>>>>>>>>> >>> > Just to be clear, the same thing (code, machine) with >>>>>>>>>>>>> PETSc-3.16.1 >>>>>>>>>>>>> >>> worked >>>>>>>>>>>>> >>> > perfectly. I have this trouble with PETSc-main. >>>>>>>>>>>>> >>> >>>>>>>>>>>>> >>> I assume you can >>>>>>>>>>>>> >>> >>>>>>>>>>>>> >>> export PETSC_OPTIONS='-device_enable lazy' >>>>>>>>>>>>> >>> >>>>>>>>>>>>> >>> and it'll work. >>>>>>>>>>>>> >>> >>>>>>>>>>>>> >>> I think this should be the default. The main complaint is >>>>>>>>>>>>> that timing the >>>>>>>>>>>>> >>> first GPU-using event isn't accurate if it includes >>>>>>>>>>>>> initialization, but I >>>>>>>>>>>>> >>> think this is mostly hypothetical because you can't trust >>>>>>>>>>>>> any timing that >>>>>>>>>>>>> >>> doesn't preload in some form and the first GPU-using event >>>>>>>>>>>>> will almost >>>>>>>>>>>>> >>> always be something uninteresting so I think it will >>>>>>>>>>>>> rarely lead to >>>>>>>>>>>>> >>> confusion. Meanwhile, eager initialization is viscerally >>>>>>>>>>>>> disruptive for >>>>>>>>>>>>> >>> lots of people. 
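For completeness, the same workaround can also be set from the application itself rather than through PETSC_OPTIONS. A minimal sketch, assuming (as I believe the man page states) that PetscOptionsSetValue() may be called before PetscInitialize():

```
#include <petscsys.h>

int main(int argc, char **argv)
{
  PetscErrorCode ierr;

  /* equivalent to: export PETSC_OPTIONS='-device_enable lazy' */
  ierr = PetscOptionsSetValue(NULL, "-device_enable", "lazy"); if (ierr) return (int)ierr;
  ierr = PetscInitialize(&argc, &argv, NULL, NULL);            if (ierr) return (int)ierr;
  /* ... the usual Vec/Mat/KSP work ... */
  ierr = PetscFinalize();
  return (int)ierr;
}
```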
>>>>>>>>>>>>> >>> >>>>>>>>>>>>> >> >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> What most experimenters take for granted before they begin their >>>>>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>>>>> experiments lead. >>>>>>>>>>> -- Norbert Wiener >>>>>>>>>>> >>>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>> >>>>>> >>>>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From fdkong.jd at gmail.com Mon Jan 31 10:49:56 2022 From: fdkong.jd at gmail.com (Fande Kong) Date: Mon, 31 Jan 2022 09:49:56 -0700 Subject: [petsc-users] Cannot eagerly initialize cuda, as doing so results in cuda error 35 (cudaErrorInsufficientDriver) : CUDA driver version is insufficient for CUDA runtime version In-Reply-To: References: <233D0F20-AD95-481B-B862-353D5789C556@gmail.com> <87czkn9c64.fsf@jedbrown.org> <875yqe7zip.fsf@jedbrown.org> Message-ID: Sorry for the confusion. I thought I explained pretty well :-) Good: PETSc was linked to /usr/lib64/libcuda for libcuda Bad: PETSc was linked to /apps/local/spack/software/gcc-7.5.0/cuda-10.1.243-v4ymjqcrr7f72qfiuzsstuy5jiajbuey/lib64/stubs for libcuda My question would be: where should I look for libcuda? Our HPC admin told me that I should use the one from /usr/lib64/libcuda I am trying to understand why we need to link to "stubs"? Just to be clear, I am fine with PETSc-main as is since I can use a compute node to compile PETSc. However, here I am trying really hard to understand where I should look for the right libcuda. Thanks for your help Fande On Mon, Jan 31, 2022 at 9:19 AM Junchao Zhang wrote: > Fande, > From your configure_main.log > > cuda: > Version: 10.1 > Includes: > -I/apps/local/spack/software/gcc-7.5.0/cuda-10.1.243-v4ymjqcrr7f72qfiuzsstuy5jiajbuey/include > Library: > -Wl,-rpath,/apps/local/spack/software/gcc-7.5.0/cuda-10.1.243-v4ymjqcrr7f72qfiuzsstuy5jiajbuey/lib64 > -L/apps/local/spack/software/gcc-7.5.0/cuda-10.1.243-v4ymjqcrr7f72qfiuzsstuy5jiajbuey/lib64 > -L/apps/local/spack/software/gcc-7.5.0/cuda-10.1.243-v4ymjqcrr7f72qfiuzsstuy5jiajbuey/lib64/stubs > -lcudart -lcufft -lcublas -lcusparse -lcusolver -lcurand -lcuda > > > You can see the `stubs` directory is not in rpath. We took a lot of > effort to achieve that. You need to double check the reason. > > --Junchao Zhang > > > On Mon, Jan 31, 2022 at 9:40 AM Fande Kong wrote: > >> OK, >> >> Finally we resolved the issue. The issue was that there were two libcuda >> libs on a GPU compute node: /usr/lib64/libcuda >> and /apps/local/spack/software/gcc-7.5.0/cuda-10.1.243-v4ymjqcrr7f72qfiuzsstuy5jiajbuey/lib64/stubs/libcuda. >> But on a login node there is one libcuda lib: >> /apps/local/spack/software/gcc-7.5.0/cuda-10.1.243-v4ymjqcrr7f72qfiuzsstuy5jiajbuey/lib64/stubs/libcuda. >> We can not see /usr/lib64/libcuda from a login node where I was compiling >> the code. >> >> Before the Junchao's commit, we did not have "-Wl,-rpath" to force PETSc >> take >> /apps/local/spack/software/gcc-7.5.0/cuda-10.1.243-v4ymjqcrr7f72qfiuzsstuy5jiajbuey/lib64/stubs/libcuda. >> A code compiled on a login node could correctly pick up the cuda lib >> from /usr/lib64/libcuda at runtime. When with "-Wl,-rpath", the code >> always takes the cuda lib from >> /apps/local/spack/software/gcc-7.5.0/cuda-10.1.243-v4ymjqcrr7f72qfiuzsstuy5jiajbuey/lib64/stubs/libcuda, >> wihch was a bad lib. 
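One way to check this directly is to ask the dynamic loader which libcuda it actually mapped into the process. A minimal sketch (assuming it is linked with the same CUDA link line PETSc uses, so it reflects the same search order as the failing executable):

```
/* run on a compute node; prints the full path of every "cuda" shared
   object the loader mapped, i.e. whether /usr/lib64/libcuda.so.1 or the
   toolkit's stubs/libcuda.so was picked up */
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <link.h>
#include <cuda_runtime.h>

static int print_cuda_objects(struct dl_phdr_info *info, size_t size, void *data)
{
  (void)size; (void)data;
  if (info->dlpi_name && strstr(info->dlpi_name, "cuda"))
    printf("mapped: %s\n", info->dlpi_name);
  return 0; /* keep iterating over all loaded objects */
}

int main(void)
{
  int ndev = -1;
  cudaError_t err = cudaGetDeviceCount(&ndev); /* ensure runtime/driver are loaded */
  printf("cudaGetDeviceCount: error %d, ndev %d\n", (int)err, ndev);
  dl_iterate_phdr(print_cuda_objects, NULL);
  return 0;
}
```

Running ldd on the executable (or readelf -d to look at its RPATH/RUNPATH entries) on the compute node should tell the same story.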
>> >> Right now, I just compiled code on a compute node instead of a login >> node, PETSc was able to pick up the correct lib from /usr/lib64/libcuda, >> and everything ran fine. >> >> I am not sure whether or not it is a good idea to search for "stubs" >> since the system might have the correct ones in other places. Should not I >> do a batch compiling? >> >> Thanks, >> >> Fande >> >> >> On Wed, Jan 26, 2022 at 1:49 PM Fande Kong wrote: >> >>> Yes, please see the attached file. >>> >>> Fande >>> >>> On Wed, Jan 26, 2022 at 11:49 AM Junchao Zhang >>> wrote: >>> >>>> Do you have the configure.log with main? >>>> >>>> --Junchao Zhang >>>> >>>> >>>> On Wed, Jan 26, 2022 at 12:26 PM Fande Kong >>>> wrote: >>>> >>>>> I am on the petsc-main >>>>> >>>>> commit 1390d3a27d88add7d79c9b38bf1a895ae5e67af6 >>>>> >>>>> Merge: 96c919c d5f3255 >>>>> >>>>> Author: Satish Balay >>>>> >>>>> Date: Wed Jan 26 10:28:32 2022 -0600 >>>>> >>>>> >>>>> Merge remote-tracking branch 'origin/release' >>>>> >>>>> >>>>> It is still broken. >>>>> >>>>> Thanks, >>>>> >>>>> >>>>> Fande >>>>> >>>>> On Wed, Jan 26, 2022 at 7:40 AM Junchao Zhang >>>>> wrote: >>>>> >>>>>> The good uses the compiler's default library/header path. The bad >>>>>> searches from cuda toolkit path and uses rpath linking. >>>>>> Though the paths look the same on the login node, they could have >>>>>> different behavior on a compute node depending on its environment. >>>>>> I think we fixed the issue in cuda.py (i.e., first try the compiler's >>>>>> default, then toolkit). That's why I wanted Fande to use petsc/main. >>>>>> >>>>>> --Junchao Zhang >>>>>> >>>>>> >>>>>> On Tue, Jan 25, 2022 at 11:59 PM Barry Smith >>>>>> wrote: >>>>>> >>>>>>> >>>>>>> bad has extra >>>>>>> >>>>>>> -L/apps/local/spack/software/gcc-7.5.0/cuda-10.1.243-v4ymjqcrr7f72qfiuzsstuy5jiajbuey/lib64/stubs >>>>>>> -lcuda >>>>>>> >>>>>>> good does not. >>>>>>> >>>>>>> Try removing the stubs directory and -lcuda from the bad >>>>>>> $PETSC_ARCH/lib/petsc/conf/variables and likely the bad will start working. >>>>>>> >>>>>>> Barry >>>>>>> >>>>>>> I never liked the stubs stuff. >>>>>>> >>>>>>> On Jan 25, 2022, at 11:29 PM, Fande Kong >>>>>>> wrote: >>>>>>> >>>>>>> Hi Junchao, >>>>>>> >>>>>>> I attached a "bad" configure log and a "good" configure log. >>>>>>> >>>>>>> The "bad" one was on produced >>>>>>> at 246ba74192519a5f34fb6e227d1c64364e19ce2c >>>>>>> >>>>>>> and the "good" one at 384645a00975869a1aacbd3169de62ba40cad683 >>>>>>> >>>>>>> This good hash is the last good hash that is just the right before >>>>>>> the bad one. >>>>>>> >>>>>>> I think you could do a comparison between these two logs, and check >>>>>>> what the differences were. >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Fande >>>>>>> >>>>>>> On Tue, Jan 25, 2022 at 8:21 PM Junchao Zhang < >>>>>>> junchao.zhang at gmail.com> wrote: >>>>>>> >>>>>>>> Fande, could you send the configure.log that works (i.e., before >>>>>>>> this offending commit)? >>>>>>>> --Junchao Zhang >>>>>>>> >>>>>>>> >>>>>>>> On Tue, Jan 25, 2022 at 8:21 PM Fande Kong >>>>>>>> wrote: >>>>>>>> >>>>>>>>> Not sure if this is helpful. 
I did "git bisect", and here was the >>>>>>>>> result: >>>>>>>>> >>>>>>>>> [kongf at sawtooth2 petsc]$ git bisect bad >>>>>>>>> 246ba74192519a5f34fb6e227d1c64364e19ce2c is the first bad commit >>>>>>>>> commit 246ba74192519a5f34fb6e227d1c64364e19ce2c >>>>>>>>> Author: Junchao Zhang >>>>>>>>> Date: Wed Oct 13 05:32:43 2021 +0000 >>>>>>>>> >>>>>>>>> Config: fix CUDA library and header dirs >>>>>>>>> >>>>>>>>> :040000 040000 187c86055adb80f53c1d0565a8888704fec43a96 >>>>>>>>> ea1efd7f594fd5e8df54170bc1bc7b00f35e4d5f M config >>>>>>>>> >>>>>>>>> >>>>>>>>> Started from this commit, and GPU did not work for me on our HPC >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Fande >>>>>>>>> >>>>>>>>> On Tue, Jan 25, 2022 at 7:18 PM Fande Kong >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Tue, Jan 25, 2022 at 9:04 AM Jacob Faibussowitsch < >>>>>>>>>> jacob.fai at gmail.com> wrote: >>>>>>>>>> >>>>>>>>>>> Configure should not have an impact here I think. The reason I >>>>>>>>>>> had you run `cudaGetDeviceCount()` is because this is the CUDA call (and in >>>>>>>>>>> fact the only CUDA call) in the initialization sequence that returns the >>>>>>>>>>> error code. There should be no prior CUDA calls. Maybe this is a problem >>>>>>>>>>> with oversubscribing GPU?s? In the runs that crash, how many ranks are >>>>>>>>>>> using any given GPU at once? Maybe MPS is required. >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> I used one MPI rank. >>>>>>>>>> >>>>>>>>>> Fande >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Best regards, >>>>>>>>>>> >>>>>>>>>>> Jacob Faibussowitsch >>>>>>>>>>> (Jacob Fai - booss - oh - vitch) >>>>>>>>>>> >>>>>>>>>>> On Jan 21, 2022, at 12:01, Fande Kong >>>>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>> Thanks Jacob, >>>>>>>>>>> >>>>>>>>>>> On Thu, Jan 20, 2022 at 6:25 PM Jacob Faibussowitsch < >>>>>>>>>>> jacob.fai at gmail.com> wrote: >>>>>>>>>>> >>>>>>>>>>>> Segfault is caused by the following check at >>>>>>>>>>>> src/sys/objects/device/impls/cupm/cupmdevice.cxx:349 being a >>>>>>>>>>>> PetscUnlikelyDebug() rather than just PetscUnlikely(): >>>>>>>>>>>> >>>>>>>>>>>> ``` >>>>>>>>>>>> if (PetscUnlikelyDebug(_defaultDevice < 0)) { // _defaultDevice >>>>>>>>>>>> is in fact < 0 here and uncaught >>>>>>>>>>>> ``` >>>>>>>>>>>> >>>>>>>>>>>> To clarify: >>>>>>>>>>>> >>>>>>>>>>>> ?lazy? initialization is not that lazy after all, it still does >>>>>>>>>>>> some 50% of the initialization that ?eager? initialization does. It stops >>>>>>>>>>>> short initializing the CUDA runtime, checking CUDA aware MPI, gathering >>>>>>>>>>>> device data, and initializing cublas and friends. Lazy also importantly >>>>>>>>>>>> swallows any errors that crop up during initialization, storing the >>>>>>>>>>>> resulting error code for later (specifically _defaultDevice = >>>>>>>>>>>> -init_error_value;). >>>>>>>>>>>> >>>>>>>>>>>> So whether you initialize lazily or eagerly makes no difference >>>>>>>>>>>> here, as _defaultDevice will always contain -35. >>>>>>>>>>>> >>>>>>>>>>>> The bigger question is why cudaGetDeviceCount() is returning >>>>>>>>>>>> cudaErrorInsufficientDriver. Can you compile and run >>>>>>>>>>>> >>>>>>>>>>>> ``` >>>>>>>>>>>> #include >>>>>>>>>>>> >>>>>>>>>>>> int main() >>>>>>>>>>>> { >>>>>>>>>>>> int ndev; >>>>>>>>>>>> return cudaGetDeviceCount(&ndev): >>>>>>>>>>>> } >>>>>>>>>>>> ``` >>>>>>>>>>>> >>>>>>>>>>>> Then show the value of "echo $??? >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Modify your code a little to get more information. 
>>>>>>>>>>> >>>>>>>>>>> #include >>>>>>>>>>> #include >>>>>>>>>>> >>>>>>>>>>> int main() >>>>>>>>>>> { >>>>>>>>>>> int ndev; >>>>>>>>>>> int error = cudaGetDeviceCount(&ndev); >>>>>>>>>>> printf("ndev %d \n", ndev); >>>>>>>>>>> printf("error %d \n", error); >>>>>>>>>>> return 0; >>>>>>>>>>> } >>>>>>>>>>> >>>>>>>>>>> Results: >>>>>>>>>>> >>>>>>>>>>> $ ./a.out >>>>>>>>>>> ndev 4 >>>>>>>>>>> error 0 >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> I have not read the PETSc cuda initialization code yet. If I >>>>>>>>>>> need to guess at what was happening. I will naively think that PETSc did >>>>>>>>>>> not get correct GPU information in the configuration because the compiler >>>>>>>>>>> node does not have GPUs, and there was no way to get any GPU device >>>>>>>>>>> information. >>>>>>>>>>> >>>>>>>>>>> During the runtime on GPU nodes, PETSc might have incorrect >>>>>>>>>>> information grabbed during configuration and had this kind of false error >>>>>>>>>>> message. >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> >>>>>>>>>>> Fande >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Best regards, >>>>>>>>>>>> >>>>>>>>>>>> Jacob Faibussowitsch >>>>>>>>>>>> (Jacob Fai - booss - oh - vitch) >>>>>>>>>>>> >>>>>>>>>>>> On Jan 20, 2022, at 17:47, Matthew Knepley >>>>>>>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>> On Thu, Jan 20, 2022 at 6:44 PM Fande Kong >>>>>>>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Thanks, Jed >>>>>>>>>>>>> >>>>>>>>>>>>> On Thu, Jan 20, 2022 at 4:34 PM Jed Brown >>>>>>>>>>>>> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> You can't create CUDA or Kokkos Vecs if you're running on a >>>>>>>>>>>>>> node without a GPU. >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> I am running the code on compute nodes that do have GPUs. >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> If you are actually running on GPUs, why would you need lazy >>>>>>>>>>>> initialization? It would not break with GPUs present. >>>>>>>>>>>> >>>>>>>>>>>> Matt >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> With PETSc-3.16.1, I got good speedup by running GAMG on >>>>>>>>>>>>> GPUs. That might be a bug of PETSc-main. 
>>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> >>>>>>>>>>>>> Fande >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> KSPSetUp 13 1.0 6.4400e-01 1.0 2.02e+09 1.0 >>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 5 0 0 0 0 5 0 0 0 3140 64630 15 >>>>>>>>>>>>> 1.05e+02 5 3.49e+01 100 >>>>>>>>>>>>> KSPSolve 1 1.0 1.0109e+00 1.0 3.49e+10 1.0 >>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 87 0 0 0 0 87 0 0 0 34522 69556 4 >>>>>>>>>>>>> 4.35e-03 1 2.38e-03 100 >>>>>>>>>>>>> KSPGMRESOrthog 142 1.0 1.2674e-01 1.0 1.06e+10 1.0 >>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 27 0 0 0 0 27 0 0 0 83755 87801 0 >>>>>>>>>>>>> 0.00e+00 0 0.00e+00 100 >>>>>>>>>>>>> SNESSolve 1 1.0 4.4402e+01 1.0 4.00e+10 1.0 >>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 21100 0 0 0 21100 0 0 0 901 51365 57 >>>>>>>>>>>>> 1.10e+03 52 8.78e+02 100 >>>>>>>>>>>>> SNESSetUp 1 1.0 3.9101e-05 1.0 0.00e+00 0.0 >>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>>>>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>>>>>>>> SNESFunctionEval 2 1.0 1.7097e+01 1.0 1.60e+07 1.0 >>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 8 0 0 0 0 8 0 0 0 0 1 0 0 >>>>>>>>>>>>> 0.00e+00 6 1.92e+02 0 >>>>>>>>>>>>> SNESJacobianEval 1 1.0 1.6213e+01 1.0 2.80e+07 1.0 >>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 8 0 0 0 0 8 0 0 0 0 2 0 0 >>>>>>>>>>>>> 0.00e+00 1 3.20e+01 0 >>>>>>>>>>>>> SNESLineSearch 1 1.0 8.5582e+00 1.0 1.24e+08 1.0 >>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 4 0 0 0 0 4 0 0 0 0 14 64153 1 >>>>>>>>>>>>> 3.20e+01 3 9.61e+01 94 >>>>>>>>>>>>> PCGAMGGraph_AGG 5 1.0 3.0509e+00 1.0 8.19e+07 1.0 >>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 27 0 5 >>>>>>>>>>>>> 3.49e+01 9 7.43e+01 0 >>>>>>>>>>>>> PCGAMGCoarse_AGG 5 1.0 3.8711e+00 1.0 0.00e+00 0.0 >>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 0 0 >>>>>>>>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>>>>>>>> PCGAMGProl_AGG 5 1.0 7.0748e-01 1.0 0.00e+00 0.0 >>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>>>>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>>>>>>>> PCGAMGPOpt_AGG 5 1.0 1.2904e+00 1.0 2.14e+09 1.0 >>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 1 5 0 0 0 1 5 0 0 0 1661 29807 26 >>>>>>>>>>>>> 7.15e+02 20 2.90e+02 99 >>>>>>>>>>>>> GAMG: createProl 5 1.0 8.9489e+00 1.0 2.22e+09 1.0 >>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 4 6 0 0 0 4 6 0 0 0 249 29666 31 >>>>>>>>>>>>> 7.50e+02 29 3.64e+02 96 >>>>>>>>>>>>> Graph 10 1.0 3.0478e+00 1.0 8.19e+07 1.0 >>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 27 0 5 >>>>>>>>>>>>> 3.49e+01 9 7.43e+01 0 >>>>>>>>>>>>> MIS/Agg 5 1.0 4.1290e-01 1.0 0.00e+00 0.0 >>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>>>>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>>>>>>>> SA: col data 5 1.0 1.9127e-02 1.0 0.00e+00 0.0 >>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>>>>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>>>>>>>> SA: frmProl0 5 1.0 6.2662e-01 1.0 0.00e+00 0.0 >>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>>>>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>>>>>>>> SA: smooth 5 1.0 4.9595e-01 1.0 1.21e+08 1.0 >>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 244 2709 15 >>>>>>>>>>>>> 1.97e+02 15 2.55e+02 90 >>>>>>>>>>>>> GAMG: partLevel 5 1.0 4.7330e-01 1.0 6.98e+08 1.0 >>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 1475 4120 5 >>>>>>>>>>>>> 1.78e+02 10 2.55e+02 100 >>>>>>>>>>>>> PCGAMG Squ l00 1 1.0 2.6027e+00 1.0 0.00e+00 0.0 >>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 0 0 >>>>>>>>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>>>>>>>> PCGAMG Gal l00 1 1.0 3.8406e-01 1.0 5.48e+08 1.0 >>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 
1426 4270 1 >>>>>>>>>>>>> 1.48e+02 2 2.11e+02 100 >>>>>>>>>>>>> PCGAMG Opt l00 1 1.0 2.4932e-01 1.0 7.20e+07 1.0 >>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 289 2653 1 >>>>>>>>>>>>> 6.41e+01 1 1.13e+02 100 >>>>>>>>>>>>> PCGAMG Gal l01 1 1.0 6.6279e-02 1.0 1.09e+08 1.0 >>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1645 3851 1 >>>>>>>>>>>>> 2.40e+01 2 3.64e+01 100 >>>>>>>>>>>>> PCGAMG Opt l01 1 1.0 2.9544e-02 1.0 7.15e+06 1.0 >>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 242 1671 1 >>>>>>>>>>>>> 4.84e+00 1 1.23e+01 100 >>>>>>>>>>>>> PCGAMG Gal l02 1 1.0 1.8874e-02 1.0 3.72e+07 1.0 >>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1974 3636 1 >>>>>>>>>>>>> 5.04e+00 2 6.58e+00 100 >>>>>>>>>>>>> PCGAMG Opt l02 1 1.0 7.4353e-03 1.0 2.40e+06 1.0 >>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 323 1457 1 >>>>>>>>>>>>> 7.71e-01 1 2.30e+00 100 >>>>>>>>>>>>> PCGAMG Gal l03 1 1.0 2.8479e-03 1.0 4.10e+06 1.0 >>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1440 2266 1 >>>>>>>>>>>>> 4.44e-01 2 5.51e-01 100 >>>>>>>>>>>>> PCGAMG Opt l03 1 1.0 8.2684e-04 1.0 2.80e+05 1.0 >>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 339 1667 1 >>>>>>>>>>>>> 6.72e-02 1 2.03e-01 100 >>>>>>>>>>>>> PCGAMG Gal l04 1 1.0 1.2238e-03 1.0 2.09e+05 1.0 >>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 170 244 1 >>>>>>>>>>>>> 2.05e-02 2 2.53e-02 100 >>>>>>>>>>>>> PCGAMG Opt l04 1 1.0 4.1008e-04 1.0 1.77e+04 1.0 >>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 43 165 1 >>>>>>>>>>>>> 4.49e-03 1 1.19e-02 100 >>>>>>>>>>>>> PCSetUp 2 1.0 9.9632e+00 1.0 4.95e+09 1.0 >>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 5 12 0 0 0 5 12 0 0 0 496 17826 55 >>>>>>>>>>>>> 1.03e+03 45 6.54e+02 98 >>>>>>>>>>>>> PCSetUpOnBlocks 44 1.0 9.9087e-04 1.0 2.88e+03 1.0 >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> The point of lazy initialization is to make it possible to >>>>>>>>>>>>>> run a solve that doesn't use a GPU in PETSC_ARCH that supports GPUs, >>>>>>>>>>>>>> regardless of whether a GPU is actually present. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Fande Kong writes: >>>>>>>>>>>>>> >>>>>>>>>>>>>> > I spoke too soon. It seems that we have trouble creating >>>>>>>>>>>>>> cuda/kokkos vecs >>>>>>>>>>>>>> > now. Got Segmentation fault. >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > Thanks, >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > Fande >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > Program received signal SIGSEGV, Segmentation fault. 
>>>>>>>>>>>>>> > 0x00002aaab5558b11 in >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::CUPMDeviceInternal::initialize >>>>>>>>>>>>>> > (this=0x1) at >>>>>>>>>>>>>> > >>>>>>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:54 >>>>>>>>>>>>>> > 54 PetscErrorCode >>>>>>>>>>>>>> CUPMDevice::CUPMDeviceInternal::initialize() noexcept >>>>>>>>>>>>>> > Missing separate debuginfos, use: debuginfo-install >>>>>>>>>>>>>> > bzip2-libs-1.0.6-13.el7.x86_64 >>>>>>>>>>>>>> elfutils-libelf-0.176-5.el7.x86_64 >>>>>>>>>>>>>> > elfutils-libs-0.176-5.el7.x86_64 glibc-2.17-325.el7_9.x86_64 >>>>>>>>>>>>>> > libX11-1.6.7-4.el7_9.x86_64 libXau-1.0.8-2.1.el7.x86_64 >>>>>>>>>>>>>> > libattr-2.4.46-13.el7.x86_64 libcap-2.22-11.el7.x86_64 >>>>>>>>>>>>>> > libibmad-5.4.0.MLNX20190423.1d917ae-0.1.49224.x86_64 >>>>>>>>>>>>>> > libibumad-43.1.1.MLNX20200211.078947f-0.1.49224.x86_64 >>>>>>>>>>>>>> > libibverbs-41mlnx1-OFED.4.9.0.0.7.49224.x86_64 >>>>>>>>>>>>>> > libmlx4-41mlnx1-OFED.4.7.3.0.3.49224.x86_64 >>>>>>>>>>>>>> > libmlx5-41mlnx1-OFED.4.9.0.1.2.49224.x86_64 >>>>>>>>>>>>>> libnl3-3.2.28-4.el7.x86_64 >>>>>>>>>>>>>> > librdmacm-41mlnx1-OFED.4.7.3.0.6.49224.x86_64 >>>>>>>>>>>>>> > librxe-41mlnx1-OFED.4.4.2.4.6.49224.x86_64 >>>>>>>>>>>>>> libxcb-1.13-1.el7.x86_64 >>>>>>>>>>>>>> > libxml2-2.9.1-6.el7_9.6.x86_64 >>>>>>>>>>>>>> numactl-libs-2.0.12-5.el7.x86_64 >>>>>>>>>>>>>> > systemd-libs-219-78.el7_9.3.x86_64 >>>>>>>>>>>>>> xz-libs-5.2.2-1.el7.x86_64 >>>>>>>>>>>>>> > zlib-1.2.7-19.el7_9.x86_64 >>>>>>>>>>>>>> > (gdb) bt >>>>>>>>>>>>>> > #0 0x00002aaab5558b11 in >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::CUPMDeviceInternal::initialize >>>>>>>>>>>>>> > (this=0x1) at >>>>>>>>>>>>>> > >>>>>>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:54 >>>>>>>>>>>>>> > #1 0x00002aaab5558db7 in >>>>>>>>>>>>>> > Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::getDevice >>>>>>>>>>>>>> > (this=this at entry=0x2aaab7f37b70 >>>>>>>>>>>>>> > , device=0x115da00, id=-35, id at entry=-1) at >>>>>>>>>>>>>> > >>>>>>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:344 >>>>>>>>>>>>>> > #2 0x00002aaab55577de in PetscDeviceCreate (type=type at entry >>>>>>>>>>>>>> =PETSC_DEVICE_CUDA, >>>>>>>>>>>>>> > devid=devid at entry=-1, device=device at entry=0x2aaab7f37b48 >>>>>>>>>>>>>> > ) at >>>>>>>>>>>>>> > >>>>>>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:107 >>>>>>>>>>>>>> > #3 0x00002aaab5557b3a in >>>>>>>>>>>>>> PetscDeviceInitializeDefaultDevice_Internal >>>>>>>>>>>>>> > (type=type at entry=PETSC_DEVICE_CUDA, >>>>>>>>>>>>>> defaultDeviceId=defaultDeviceId at entry=-1) >>>>>>>>>>>>>> > at >>>>>>>>>>>>>> > >>>>>>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:273 >>>>>>>>>>>>>> > #4 0x00002aaab5557bf6 in PetscDeviceInitialize >>>>>>>>>>>>>> > (type=type at entry=PETSC_DEVICE_CUDA) >>>>>>>>>>>>>> > at >>>>>>>>>>>>>> > >>>>>>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:234 >>>>>>>>>>>>>> > #5 0x00002aaab5661fcd in VecCreate_SeqCUDA (V=0x115d150) at >>>>>>>>>>>>>> > >>>>>>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/impls/seq/seqcuda/veccuda.c:244 >>>>>>>>>>>>>> > #6 0x00002aaab5649b40 in VecSetType (vec=vec at entry >>>>>>>>>>>>>> =0x115d150, >>>>>>>>>>>>>> > 
method=method at entry=0x2aaab70b45b8 "seqcuda") at >>>>>>>>>>>>>> > >>>>>>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vecreg.c:93 >>>>>>>>>>>>>> > #7 0x00002aaab579c33f in VecCreate_CUDA (v=0x115d150) at >>>>>>>>>>>>>> > >>>>>>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/impls/mpi/mpicuda/ >>>>>>>>>>>>>> > mpicuda.cu:214 >>>>>>>>>>>>>> > #8 0x00002aaab5649b40 in VecSetType (vec=vec at entry >>>>>>>>>>>>>> =0x115d150, >>>>>>>>>>>>>> > method=method at entry=0x7fffffff9260 "cuda") at >>>>>>>>>>>>>> > >>>>>>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vecreg.c:93 >>>>>>>>>>>>>> > #9 0x00002aaab5648bf1 in VecSetTypeFromOptions_Private >>>>>>>>>>>>>> (vec=0x115d150, >>>>>>>>>>>>>> > PetscOptionsObject=0x7fffffff9210) at >>>>>>>>>>>>>> > >>>>>>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vector.c:1263 >>>>>>>>>>>>>> > #10 VecSetFromOptions (vec=0x115d150) at >>>>>>>>>>>>>> > >>>>>>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vector.c:1297 >>>>>>>>>>>>>> > #11 0x00002aaab02ef227 in libMesh::PetscVector::init >>>>>>>>>>>>>> > (this=0x11cd1a0, n=441, n_local=441, fast=false, >>>>>>>>>>>>>> ptype=libMesh::PARALLEL) >>>>>>>>>>>>>> > at >>>>>>>>>>>>>> > >>>>>>>>>>>>>> /home/kongf/workhome/sawtooth/moosegpu/scripts/../libmesh/installed/include/libmesh/petsc_vector.h:693 >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > On Thu, Jan 20, 2022 at 1:09 PM Fande Kong < >>>>>>>>>>>>>> fdkong.jd at gmail.com> wrote: >>>>>>>>>>>>>> > >>>>>>>>>>>>>> >> Thanks, Jed, >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> >> This worked! >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> >> Fande >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> >> On Wed, Jan 19, 2022 at 11:03 PM Jed Brown < >>>>>>>>>>>>>> jed at jedbrown.org> wrote: >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> >>> Fande Kong writes: >>>>>>>>>>>>>> >>> >>>>>>>>>>>>>> >>> > On Wed, Jan 19, 2022 at 11:39 AM Jacob Faibussowitsch < >>>>>>>>>>>>>> >>> jacob.fai at gmail.com> >>>>>>>>>>>>>> >>> > wrote: >>>>>>>>>>>>>> >>> > >>>>>>>>>>>>>> >>> >> Are you running on login nodes or compute nodes (I >>>>>>>>>>>>>> can?t seem to tell >>>>>>>>>>>>>> >>> from >>>>>>>>>>>>>> >>> >> the configure.log)? >>>>>>>>>>>>>> >>> >> >>>>>>>>>>>>>> >>> > >>>>>>>>>>>>>> >>> > I was compiling codes on login nodes, and running codes >>>>>>>>>>>>>> on compute >>>>>>>>>>>>>> >>> nodes. >>>>>>>>>>>>>> >>> > Login nodes do not have GPUs, but compute nodes do have >>>>>>>>>>>>>> GPUs. >>>>>>>>>>>>>> >>> > >>>>>>>>>>>>>> >>> > Just to be clear, the same thing (code, machine) with >>>>>>>>>>>>>> PETSc-3.16.1 >>>>>>>>>>>>>> >>> worked >>>>>>>>>>>>>> >>> > perfectly. I have this trouble with PETSc-main. >>>>>>>>>>>>>> >>> >>>>>>>>>>>>>> >>> I assume you can >>>>>>>>>>>>>> >>> >>>>>>>>>>>>>> >>> export PETSC_OPTIONS='-device_enable lazy' >>>>>>>>>>>>>> >>> >>>>>>>>>>>>>> >>> and it'll work. >>>>>>>>>>>>>> >>> >>>>>>>>>>>>>> >>> I think this should be the default. The main complaint is >>>>>>>>>>>>>> that timing the >>>>>>>>>>>>>> >>> first GPU-using event isn't accurate if it includes >>>>>>>>>>>>>> initialization, but I >>>>>>>>>>>>>> >>> think this is mostly hypothetical because you can't trust >>>>>>>>>>>>>> any timing that >>>>>>>>>>>>>> >>> doesn't preload in some form and the first GPU-using >>>>>>>>>>>>>> event will almost >>>>>>>>>>>>>> >>> always be something uninteresting so I think it will >>>>>>>>>>>>>> rarely lead to >>>>>>>>>>>>>> >>> confusion. 
Meanwhile, eager initialization is viscerally >>>>>>>>>>>>>> disruptive for >>>>>>>>>>>>>> >>> lots of people. >>>>>>>>>>>>>> >>> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> -- >>>>>>>>>>>> What most experimenters take for granted before they begin >>>>>>>>>>>> their experiments is infinitely more interesting than any results to which >>>>>>>>>>>> their experiments lead. >>>>>>>>>>>> -- Norbert Wiener >>>>>>>>>>>> >>>>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>> >>>>>>> >>>>>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Mon Jan 31 13:10:48 2022 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Mon, 31 Jan 2022 13:10:48 -0600 Subject: [petsc-users] Cannot eagerly initialize cuda, as doing so results in cuda error 35 (cudaErrorInsufficientDriver) : CUDA driver version is insufficient for CUDA runtime version In-Reply-To: References: <233D0F20-AD95-481B-B862-353D5789C556@gmail.com> <87czkn9c64.fsf@jedbrown.org> <875yqe7zip.fsf@jedbrown.org> Message-ID: On Mon, Jan 31, 2022 at 10:50 AM Fande Kong wrote: > Sorry for the confusion. I thought I explained pretty well :-) > > Good: > > PETSc was linked to /usr/lib64/libcuda for libcuda > > Bad: > > PETSc was linked > to /apps/local/spack/software/gcc-7.5.0/cuda-10.1.243-v4ymjqcrr7f72qfiuzsstuy5jiajbuey/lib64/stubs > for libcuda > > My question would be: where should I look for libcuda? > > Our HPC admin told me that I should use the one from /usr/lib64/libcuda > Your admin was correct. > > I am trying to understand why we need to link to "stubs"? > Kokkos needs libcuda.so, so we added this requirement. > Just to be clear, I am fine with PETSc-main as is since I can use a > compute node to compile PETSc. However, here I am trying really hard to > understand where I should look for the right libcuda. > I need your help to find out: Why on compute nodes, did the petsc test executable find libcuda.so at /apps/local/spack/software/ gcc-7.5.0/cuda-10.1.243-v4ymjqcrr7f72qfiuzsstuy5jiajbuey/lib64/stubs? Note this path is not in the executable's rpath. Maybe you need to login to a compute node and do 'env' to list all variables for us to have a look. > > > Thanks for your help > > Fande > > > On Mon, Jan 31, 2022 at 9:19 AM Junchao Zhang > wrote: > >> Fande, >> From your configure_main.log >> >> cuda: >> Version: 10.1 >> Includes: >> -I/apps/local/spack/software/gcc-7.5.0/cuda-10.1.243-v4ymjqcrr7f72qfiuzsstuy5jiajbuey/include >> Library: >> -Wl,-rpath,/apps/local/spack/software/gcc-7.5.0/cuda-10.1.243-v4ymjqcrr7f72qfiuzsstuy5jiajbuey/lib64 >> -L/apps/local/spack/software/gcc-7.5.0/cuda-10.1.243-v4ymjqcrr7f72qfiuzsstuy5jiajbuey/lib64 >> -L/apps/local/spack/software/gcc-7.5.0/cuda-10.1.243-v4ymjqcrr7f72qfiuzsstuy5jiajbuey/lib64/stubs >> -lcudart -lcufft -lcublas -lcusparse -lcusolver -lcurand -lcuda >> >> >> You can see the `stubs` directory is not in rpath. We took a lot of >> effort to achieve that. You need to double check the reason. >> >> --Junchao Zhang >> >> >> On Mon, Jan 31, 2022 at 9:40 AM Fande Kong wrote: >> >>> OK, >>> >>> Finally we resolved the issue. The issue was that there were two >>> libcuda libs on a GPU compute node: /usr/lib64/libcuda >>> and /apps/local/spack/software/gcc-7.5.0/cuda-10.1.243-v4ymjqcrr7f72qfiuzsstuy5jiajbuey/lib64/stubs/libcuda. 
>>> But on a login node there is one libcuda lib: >>> /apps/local/spack/software/gcc-7.5.0/cuda-10.1.243-v4ymjqcrr7f72qfiuzsstuy5jiajbuey/lib64/stubs/libcuda. >>> We can not see /usr/lib64/libcuda from a login node where I was compiling >>> the code. >>> >>> Before the Junchao's commit, we did not have "-Wl,-rpath" to force >>> PETSc take >>> /apps/local/spack/software/gcc-7.5.0/cuda-10.1.243-v4ymjqcrr7f72qfiuzsstuy5jiajbuey/lib64/stubs/libcuda. >>> A code compiled on a login node could correctly pick up the cuda lib >>> from /usr/lib64/libcuda at runtime. When with "-Wl,-rpath", the code >>> always takes the cuda lib from >>> /apps/local/spack/software/gcc-7.5.0/cuda-10.1.243-v4ymjqcrr7f72qfiuzsstuy5jiajbuey/lib64/stubs/libcuda, >>> wihch was a bad lib. >>> >>> Right now, I just compiled code on a compute node instead of a login >>> node, PETSc was able to pick up the correct lib from /usr/lib64/libcuda, >>> and everything ran fine. >>> >>> I am not sure whether or not it is a good idea to search for "stubs" >>> since the system might have the correct ones in other places. Should not I >>> do a batch compiling? >>> >>> Thanks, >>> >>> Fande >>> >>> >>> On Wed, Jan 26, 2022 at 1:49 PM Fande Kong wrote: >>> >>>> Yes, please see the attached file. >>>> >>>> Fande >>>> >>>> On Wed, Jan 26, 2022 at 11:49 AM Junchao Zhang >>>> wrote: >>>> >>>>> Do you have the configure.log with main? >>>>> >>>>> --Junchao Zhang >>>>> >>>>> >>>>> On Wed, Jan 26, 2022 at 12:26 PM Fande Kong >>>>> wrote: >>>>> >>>>>> I am on the petsc-main >>>>>> >>>>>> commit 1390d3a27d88add7d79c9b38bf1a895ae5e67af6 >>>>>> >>>>>> Merge: 96c919c d5f3255 >>>>>> >>>>>> Author: Satish Balay >>>>>> >>>>>> Date: Wed Jan 26 10:28:32 2022 -0600 >>>>>> >>>>>> >>>>>> Merge remote-tracking branch 'origin/release' >>>>>> >>>>>> >>>>>> It is still broken. >>>>>> >>>>>> Thanks, >>>>>> >>>>>> >>>>>> Fande >>>>>> >>>>>> On Wed, Jan 26, 2022 at 7:40 AM Junchao Zhang < >>>>>> junchao.zhang at gmail.com> wrote: >>>>>> >>>>>>> The good uses the compiler's default library/header path. The bad >>>>>>> searches from cuda toolkit path and uses rpath linking. >>>>>>> Though the paths look the same on the login node, they could have >>>>>>> different behavior on a compute node depending on its environment. >>>>>>> I think we fixed the issue in cuda.py (i.e., first try the >>>>>>> compiler's default, then toolkit). That's why I wanted Fande to use >>>>>>> petsc/main. >>>>>>> >>>>>>> --Junchao Zhang >>>>>>> >>>>>>> >>>>>>> On Tue, Jan 25, 2022 at 11:59 PM Barry Smith >>>>>>> wrote: >>>>>>> >>>>>>>> >>>>>>>> bad has extra >>>>>>>> >>>>>>>> -L/apps/local/spack/software/gcc-7.5.0/cuda-10.1.243-v4ymjqcrr7f72qfiuzsstuy5jiajbuey/lib64/stubs >>>>>>>> -lcuda >>>>>>>> >>>>>>>> good does not. >>>>>>>> >>>>>>>> Try removing the stubs directory and -lcuda from the bad >>>>>>>> $PETSC_ARCH/lib/petsc/conf/variables and likely the bad will start working. >>>>>>>> >>>>>>>> Barry >>>>>>>> >>>>>>>> I never liked the stubs stuff. >>>>>>>> >>>>>>>> On Jan 25, 2022, at 11:29 PM, Fande Kong >>>>>>>> wrote: >>>>>>>> >>>>>>>> Hi Junchao, >>>>>>>> >>>>>>>> I attached a "bad" configure log and a "good" configure log. >>>>>>>> >>>>>>>> The "bad" one was on produced >>>>>>>> at 246ba74192519a5f34fb6e227d1c64364e19ce2c >>>>>>>> >>>>>>>> and the "good" one at 384645a00975869a1aacbd3169de62ba40cad683 >>>>>>>> >>>>>>>> This good hash is the last good hash that is just the right before >>>>>>>> the bad one. 
>>>>>>>> >>>>>>>> I think you could do a comparison between these two logs, and >>>>>>>> check what the differences were. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Fande >>>>>>>> >>>>>>>> On Tue, Jan 25, 2022 at 8:21 PM Junchao Zhang < >>>>>>>> junchao.zhang at gmail.com> wrote: >>>>>>>> >>>>>>>>> Fande, could you send the configure.log that works (i.e., before >>>>>>>>> this offending commit)? >>>>>>>>> --Junchao Zhang >>>>>>>>> >>>>>>>>> >>>>>>>>> On Tue, Jan 25, 2022 at 8:21 PM Fande Kong >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> Not sure if this is helpful. I did "git bisect", and here was the >>>>>>>>>> result: >>>>>>>>>> >>>>>>>>>> [kongf at sawtooth2 petsc]$ git bisect bad >>>>>>>>>> 246ba74192519a5f34fb6e227d1c64364e19ce2c is the first bad commit >>>>>>>>>> commit 246ba74192519a5f34fb6e227d1c64364e19ce2c >>>>>>>>>> Author: Junchao Zhang >>>>>>>>>> Date: Wed Oct 13 05:32:43 2021 +0000 >>>>>>>>>> >>>>>>>>>> Config: fix CUDA library and header dirs >>>>>>>>>> >>>>>>>>>> :040000 040000 187c86055adb80f53c1d0565a8888704fec43a96 >>>>>>>>>> ea1efd7f594fd5e8df54170bc1bc7b00f35e4d5f M config >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Started from this commit, and GPU did not work for me on our HPC >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Fande >>>>>>>>>> >>>>>>>>>> On Tue, Jan 25, 2022 at 7:18 PM Fande Kong >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Tue, Jan 25, 2022 at 9:04 AM Jacob Faibussowitsch < >>>>>>>>>>> jacob.fai at gmail.com> wrote: >>>>>>>>>>> >>>>>>>>>>>> Configure should not have an impact here I think. The reason I >>>>>>>>>>>> had you run `cudaGetDeviceCount()` is because this is the CUDA call (and in >>>>>>>>>>>> fact the only CUDA call) in the initialization sequence that returns the >>>>>>>>>>>> error code. There should be no prior CUDA calls. Maybe this is a problem >>>>>>>>>>>> with oversubscribing GPU?s? In the runs that crash, how many ranks are >>>>>>>>>>>> using any given GPU at once? Maybe MPS is required. >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> I used one MPI rank. >>>>>>>>>>> >>>>>>>>>>> Fande >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Best regards, >>>>>>>>>>>> >>>>>>>>>>>> Jacob Faibussowitsch >>>>>>>>>>>> (Jacob Fai - booss - oh - vitch) >>>>>>>>>>>> >>>>>>>>>>>> On Jan 21, 2022, at 12:01, Fande Kong >>>>>>>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>> Thanks Jacob, >>>>>>>>>>>> >>>>>>>>>>>> On Thu, Jan 20, 2022 at 6:25 PM Jacob Faibussowitsch < >>>>>>>>>>>> jacob.fai at gmail.com> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Segfault is caused by the following check at >>>>>>>>>>>>> src/sys/objects/device/impls/cupm/cupmdevice.cxx:349 being a >>>>>>>>>>>>> PetscUnlikelyDebug() rather than just PetscUnlikely(): >>>>>>>>>>>>> >>>>>>>>>>>>> ``` >>>>>>>>>>>>> if (PetscUnlikelyDebug(_defaultDevice < 0)) { // >>>>>>>>>>>>> _defaultDevice is in fact < 0 here and uncaught >>>>>>>>>>>>> ``` >>>>>>>>>>>>> >>>>>>>>>>>>> To clarify: >>>>>>>>>>>>> >>>>>>>>>>>>> ?lazy? initialization is not that lazy after all, it still >>>>>>>>>>>>> does some 50% of the initialization that ?eager? initialization does. It >>>>>>>>>>>>> stops short initializing the CUDA runtime, checking CUDA aware MPI, >>>>>>>>>>>>> gathering device data, and initializing cublas and friends. Lazy also >>>>>>>>>>>>> importantly swallows any errors that crop up during initialization, storing >>>>>>>>>>>>> the resulting error code for later (specifically _defaultDevice = >>>>>>>>>>>>> -init_error_value;). 
>>>>>>>>>>>>> >>>>>>>>>>>>> So whether you initialize lazily or eagerly makes no >>>>>>>>>>>>> difference here, as _defaultDevice will always contain -35. >>>>>>>>>>>>> >>>>>>>>>>>>> The bigger question is why cudaGetDeviceCount() is returning >>>>>>>>>>>>> cudaErrorInsufficientDriver. Can you compile and run >>>>>>>>>>>>> >>>>>>>>>>>>> ``` >>>>>>>>>>>>> #include >>>>>>>>>>>>> >>>>>>>>>>>>> int main() >>>>>>>>>>>>> { >>>>>>>>>>>>> int ndev; >>>>>>>>>>>>> return cudaGetDeviceCount(&ndev): >>>>>>>>>>>>> } >>>>>>>>>>>>> ``` >>>>>>>>>>>>> >>>>>>>>>>>>> Then show the value of "echo $??? >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Modify your code a little to get more information. >>>>>>>>>>>> >>>>>>>>>>>> #include >>>>>>>>>>>> #include >>>>>>>>>>>> >>>>>>>>>>>> int main() >>>>>>>>>>>> { >>>>>>>>>>>> int ndev; >>>>>>>>>>>> int error = cudaGetDeviceCount(&ndev); >>>>>>>>>>>> printf("ndev %d \n", ndev); >>>>>>>>>>>> printf("error %d \n", error); >>>>>>>>>>>> return 0; >>>>>>>>>>>> } >>>>>>>>>>>> >>>>>>>>>>>> Results: >>>>>>>>>>>> >>>>>>>>>>>> $ ./a.out >>>>>>>>>>>> ndev 4 >>>>>>>>>>>> error 0 >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> I have not read the PETSc cuda initialization code yet. If I >>>>>>>>>>>> need to guess at what was happening. I will naively think that PETSc did >>>>>>>>>>>> not get correct GPU information in the configuration because the compiler >>>>>>>>>>>> node does not have GPUs, and there was no way to get any GPU device >>>>>>>>>>>> information. >>>>>>>>>>>> >>>>>>>>>>>> During the runtime on GPU nodes, PETSc might have incorrect >>>>>>>>>>>> information grabbed during configuration and had this kind of false error >>>>>>>>>>>> message. >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> >>>>>>>>>>>> Fande >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Best regards, >>>>>>>>>>>>> >>>>>>>>>>>>> Jacob Faibussowitsch >>>>>>>>>>>>> (Jacob Fai - booss - oh - vitch) >>>>>>>>>>>>> >>>>>>>>>>>>> On Jan 20, 2022, at 17:47, Matthew Knepley >>>>>>>>>>>>> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> On Thu, Jan 20, 2022 at 6:44 PM Fande Kong < >>>>>>>>>>>>> fdkong.jd at gmail.com> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, Jed >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Thu, Jan 20, 2022 at 4:34 PM Jed Brown >>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> You can't create CUDA or Kokkos Vecs if you're running on a >>>>>>>>>>>>>>> node without a GPU. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> I am running the code on compute nodes that do have GPUs. >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> If you are actually running on GPUs, why would you need lazy >>>>>>>>>>>>> initialization? It would not break with GPUs present. >>>>>>>>>>>>> >>>>>>>>>>>>> Matt >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> With PETSc-3.16.1, I got good speedup by running GAMG on >>>>>>>>>>>>>> GPUs. That might be a bug of PETSc-main. 
>>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Fande >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> KSPSetUp 13 1.0 6.4400e-01 1.0 2.02e+09 1.0 >>>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 5 0 0 0 0 5 0 0 0 3140 64630 15 >>>>>>>>>>>>>> 1.05e+02 5 3.49e+01 100 >>>>>>>>>>>>>> KSPSolve 1 1.0 1.0109e+00 1.0 3.49e+10 1.0 >>>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 87 0 0 0 0 87 0 0 0 34522 69556 4 >>>>>>>>>>>>>> 4.35e-03 1 2.38e-03 100 >>>>>>>>>>>>>> KSPGMRESOrthog 142 1.0 1.2674e-01 1.0 1.06e+10 1.0 >>>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 27 0 0 0 0 27 0 0 0 83755 87801 0 >>>>>>>>>>>>>> 0.00e+00 0 0.00e+00 100 >>>>>>>>>>>>>> SNESSolve 1 1.0 4.4402e+01 1.0 4.00e+10 1.0 >>>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 21100 0 0 0 21100 0 0 0 901 51365 57 >>>>>>>>>>>>>> 1.10e+03 52 8.78e+02 100 >>>>>>>>>>>>>> SNESSetUp 1 1.0 3.9101e-05 1.0 0.00e+00 0.0 >>>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>>>>>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>>>>>>>>> SNESFunctionEval 2 1.0 1.7097e+01 1.0 1.60e+07 1.0 >>>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 8 0 0 0 0 8 0 0 0 0 1 0 0 >>>>>>>>>>>>>> 0.00e+00 6 1.92e+02 0 >>>>>>>>>>>>>> SNESJacobianEval 1 1.0 1.6213e+01 1.0 2.80e+07 1.0 >>>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 8 0 0 0 0 8 0 0 0 0 2 0 0 >>>>>>>>>>>>>> 0.00e+00 1 3.20e+01 0 >>>>>>>>>>>>>> SNESLineSearch 1 1.0 8.5582e+00 1.0 1.24e+08 1.0 >>>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 4 0 0 0 0 4 0 0 0 0 14 64153 1 >>>>>>>>>>>>>> 3.20e+01 3 9.61e+01 94 >>>>>>>>>>>>>> PCGAMGGraph_AGG 5 1.0 3.0509e+00 1.0 8.19e+07 1.0 >>>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 27 0 5 >>>>>>>>>>>>>> 3.49e+01 9 7.43e+01 0 >>>>>>>>>>>>>> PCGAMGCoarse_AGG 5 1.0 3.8711e+00 1.0 0.00e+00 0.0 >>>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 0 0 >>>>>>>>>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>>>>>>>>> PCGAMGProl_AGG 5 1.0 7.0748e-01 1.0 0.00e+00 0.0 >>>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>>>>>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>>>>>>>>> PCGAMGPOpt_AGG 5 1.0 1.2904e+00 1.0 2.14e+09 1.0 >>>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 1 5 0 0 0 1 5 0 0 0 1661 29807 26 >>>>>>>>>>>>>> 7.15e+02 20 2.90e+02 99 >>>>>>>>>>>>>> GAMG: createProl 5 1.0 8.9489e+00 1.0 2.22e+09 1.0 >>>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 4 6 0 0 0 4 6 0 0 0 249 29666 31 >>>>>>>>>>>>>> 7.50e+02 29 3.64e+02 96 >>>>>>>>>>>>>> Graph 10 1.0 3.0478e+00 1.0 8.19e+07 1.0 >>>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 27 0 5 >>>>>>>>>>>>>> 3.49e+01 9 7.43e+01 0 >>>>>>>>>>>>>> MIS/Agg 5 1.0 4.1290e-01 1.0 0.00e+00 0.0 >>>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>>>>>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>>>>>>>>> SA: col data 5 1.0 1.9127e-02 1.0 0.00e+00 0.0 >>>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>>>>>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>>>>>>>>> SA: frmProl0 5 1.0 6.2662e-01 1.0 0.00e+00 0.0 >>>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>>>>>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>>>>>>>>> SA: smooth 5 1.0 4.9595e-01 1.0 1.21e+08 1.0 >>>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 244 2709 15 >>>>>>>>>>>>>> 1.97e+02 15 2.55e+02 90 >>>>>>>>>>>>>> GAMG: partLevel 5 1.0 4.7330e-01 1.0 6.98e+08 1.0 >>>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 1475 4120 5 >>>>>>>>>>>>>> 1.78e+02 10 2.55e+02 100 >>>>>>>>>>>>>> PCGAMG Squ l00 1 1.0 2.6027e+00 1.0 0.00e+00 0.0 >>>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 0 0 >>>>>>>>>>>>>> 0.00e+00 0 0.00e+00 0 >>>>>>>>>>>>>> PCGAMG Gal l00 1 1.0 3.8406e-01 1.0 
5.48e+08 1.0 >>>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 1426 4270 1 >>>>>>>>>>>>>> 1.48e+02 2 2.11e+02 100 >>>>>>>>>>>>>> PCGAMG Opt l00 1 1.0 2.4932e-01 1.0 7.20e+07 1.0 >>>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 289 2653 1 >>>>>>>>>>>>>> 6.41e+01 1 1.13e+02 100 >>>>>>>>>>>>>> PCGAMG Gal l01 1 1.0 6.6279e-02 1.0 1.09e+08 1.0 >>>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1645 3851 1 >>>>>>>>>>>>>> 2.40e+01 2 3.64e+01 100 >>>>>>>>>>>>>> PCGAMG Opt l01 1 1.0 2.9544e-02 1.0 7.15e+06 1.0 >>>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 242 1671 1 >>>>>>>>>>>>>> 4.84e+00 1 1.23e+01 100 >>>>>>>>>>>>>> PCGAMG Gal l02 1 1.0 1.8874e-02 1.0 3.72e+07 1.0 >>>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1974 3636 1 >>>>>>>>>>>>>> 5.04e+00 2 6.58e+00 100 >>>>>>>>>>>>>> PCGAMG Opt l02 1 1.0 7.4353e-03 1.0 2.40e+06 1.0 >>>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 323 1457 1 >>>>>>>>>>>>>> 7.71e-01 1 2.30e+00 100 >>>>>>>>>>>>>> PCGAMG Gal l03 1 1.0 2.8479e-03 1.0 4.10e+06 1.0 >>>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1440 2266 1 >>>>>>>>>>>>>> 4.44e-01 2 5.51e-01 100 >>>>>>>>>>>>>> PCGAMG Opt l03 1 1.0 8.2684e-04 1.0 2.80e+05 1.0 >>>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 339 1667 1 >>>>>>>>>>>>>> 6.72e-02 1 2.03e-01 100 >>>>>>>>>>>>>> PCGAMG Gal l04 1 1.0 1.2238e-03 1.0 2.09e+05 1.0 >>>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 170 244 1 >>>>>>>>>>>>>> 2.05e-02 2 2.53e-02 100 >>>>>>>>>>>>>> PCGAMG Opt l04 1 1.0 4.1008e-04 1.0 1.77e+04 1.0 >>>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 43 165 1 >>>>>>>>>>>>>> 4.49e-03 1 1.19e-02 100 >>>>>>>>>>>>>> PCSetUp 2 1.0 9.9632e+00 1.0 4.95e+09 1.0 >>>>>>>>>>>>>> 0.0e+00 0.0e+00 0.0e+00 5 12 0 0 0 5 12 0 0 0 496 17826 55 >>>>>>>>>>>>>> 1.03e+03 45 6.54e+02 98 >>>>>>>>>>>>>> PCSetUpOnBlocks 44 1.0 9.9087e-04 1.0 2.88e+03 1.0 >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> The point of lazy initialization is to make it possible to >>>>>>>>>>>>>>> run a solve that doesn't use a GPU in PETSC_ARCH that supports GPUs, >>>>>>>>>>>>>>> regardless of whether a GPU is actually present. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Fande Kong writes: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> > I spoke too soon. It seems that we have trouble creating >>>>>>>>>>>>>>> cuda/kokkos vecs >>>>>>>>>>>>>>> > now. Got Segmentation fault. >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > Thanks, >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > Fande >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > Program received signal SIGSEGV, Segmentation fault. 
>>>>>>>>>>>>>>> > 0x00002aaab5558b11 in Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::CUPMDeviceInternal::initialize (this=0x1)
>>>>>>>>>>>>>>> >     at /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:54
>>>>>>>>>>>>>>> > 54      PetscErrorCode CUPMDevice::CUPMDeviceInternal::initialize() noexcept
>>>>>>>>>>>>>>> > Missing separate debuginfos, use: debuginfo-install bzip2-libs-1.0.6-13.el7.x86_64 elfutils-libelf-0.176-5.el7.x86_64 elfutils-libs-0.176-5.el7.x86_64 glibc-2.17-325.el7_9.x86_64 libX11-1.6.7-4.el7_9.x86_64 libXau-1.0.8-2.1.el7.x86_64 libattr-2.4.46-13.el7.x86_64 libcap-2.22-11.el7.x86_64 libibmad-5.4.0.MLNX20190423.1d917ae-0.1.49224.x86_64 libibumad-43.1.1.MLNX20200211.078947f-0.1.49224.x86_64 libibverbs-41mlnx1-OFED.4.9.0.0.7.49224.x86_64 libmlx4-41mlnx1-OFED.4.7.3.0.3.49224.x86_64 libmlx5-41mlnx1-OFED.4.9.0.1.2.49224.x86_64 libnl3-3.2.28-4.el7.x86_64 librdmacm-41mlnx1-OFED.4.7.3.0.6.49224.x86_64 librxe-41mlnx1-OFED.4.4.2.4.6.49224.x86_64 libxcb-1.13-1.el7.x86_64 libxml2-2.9.1-6.el7_9.6.x86_64 numactl-libs-2.0.12-5.el7.x86_64 systemd-libs-219-78.el7_9.3.x86_64 xz-libs-5.2.2-1.el7.x86_64 zlib-1.2.7-19.el7_9.x86_64
>>>>>>>>>>>>>>> > (gdb) bt
>>>>>>>>>>>>>>> > #0  0x00002aaab5558b11 in Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::CUPMDeviceInternal::initialize (this=0x1) at /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:54
>>>>>>>>>>>>>>> > #1  0x00002aaab5558db7 in Petsc::CUPMDevice<(Petsc::CUPMDeviceType)0>::getDevice (this=this at entry=0x2aaab7f37b70 , device=0x115da00, id=-35, id at entry=-1) at /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:344
>>>>>>>>>>>>>>> > #2  0x00002aaab55577de in PetscDeviceCreate (type=type at entry=PETSC_DEVICE_CUDA, devid=devid at entry=-1, device=device at entry=0x2aaab7f37b48 ) at /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:107
>>>>>>>>>>>>>>> > #3  0x00002aaab5557b3a in PetscDeviceInitializeDefaultDevice_Internal (type=type at entry=PETSC_DEVICE_CUDA, defaultDeviceId=defaultDeviceId at entry=-1) at /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:273
>>>>>>>>>>>>>>> > #4  0x00002aaab5557bf6 in PetscDeviceInitialize (type=type at entry=PETSC_DEVICE_CUDA) at /home/kongf/workhome/sawtooth/moosegpu/petsc/src/sys/objects/device/interface/device.cxx:234
>>>>>>>>>>>>>>> > #5  0x00002aaab5661fcd in VecCreate_SeqCUDA (V=0x115d150) at /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/impls/seq/seqcuda/veccuda.c:244
>>>>>>>>>>>>>>> > #6  0x00002aaab5649b40 in VecSetType (vec=vec at entry=0x115d150, method=method at entry=0x2aaab70b45b8 "seqcuda") at /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vecreg.c:93
>>>>>>>>>>>>>>> > #7  0x00002aaab579c33f in VecCreate_CUDA (v=0x115d150) at /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/impls/mpi/mpicuda/mpicuda.cu:214
>>>>>>>>>>>>>>> > #8  0x00002aaab5649b40 in VecSetType (vec=vec at entry=0x115d150, method=method at entry=0x7fffffff9260 "cuda") at /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vecreg.c:93
>>>>>>>>>>>>>>> > #9  0x00002aaab5648bf1 in VecSetTypeFromOptions_Private (vec=0x115d150, PetscOptionsObject=0x7fffffff9210) at /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vector.c:1263
>>>>>>>>>>>>>>> > #10 VecSetFromOptions (vec=0x115d150) at /home/kongf/workhome/sawtooth/moosegpu/petsc/src/vec/vec/interface/vector.c:1297
>>>>>>>>>>>>>>> > #11 0x00002aaab02ef227 in libMesh::PetscVector::init (this=0x11cd1a0, n=441, n_local=441, fast=false, ptype=libMesh::PARALLEL) at /home/kongf/workhome/sawtooth/moosegpu/scripts/../libmesh/installed/include/libmesh/petsc_vector.h:693
>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>> > On Thu, Jan 20, 2022 at 1:09 PM Fande Kong <fdkong.jd at gmail.com> wrote:
>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>> >> Thanks, Jed,
>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>> >> This worked!
>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>> >> Fande
>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>> >> On Wed, Jan 19, 2022 at 11:03 PM Jed Brown <jed at jedbrown.org> wrote:
>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>> >>> Fande Kong writes:
>>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>>> >>> > On Wed, Jan 19, 2022 at 11:39 AM Jacob Faibussowitsch <jacob.fai at gmail.com> wrote:
>>>>>>>>>>>>>>> >>> >
>>>>>>>>>>>>>>> >>> >> Are you running on login nodes or compute nodes (I can't seem to tell from the configure.log)?
>>>>>>>>>>>>>>> >>> >
>>>>>>>>>>>>>>> >>> > I was compiling codes on login nodes, and running codes on compute nodes. Login nodes do not have GPUs, but compute nodes do have GPUs.
>>>>>>>>>>>>>>> >>> >
>>>>>>>>>>>>>>> >>> > Just to be clear, the same thing (code, machine) with PETSc-3.16.1 worked perfectly. I have this trouble with PETSc-main.
>>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>>> >>> I assume you can
>>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>>> >>> export PETSC_OPTIONS='-device_enable lazy'
>>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>>> >>> and it'll work.
>>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>>> >>> I think this should be the default.
>>>>>>>>>>>>>>> >>> The main complaint is that timing the first GPU-using event isn't accurate if it includes initialization, but I think this is mostly hypothetical: you can't trust any timing that doesn't preload in some form, and the first GPU-using event will almost always be something uninteresting, so I think it will rarely lead to confusion. Meanwhile, eager initialization is viscerally disruptive for lots of people.
>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
>>>>>>>>>>>>> -- Norbert Wiener
>>>>>>>>>>>>>
>>>>>>>>>>>>> https://www.cse.buffalo.edu/~knepley/
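To make the two points above concrete (lazy device initialization, and preloading before trusting timings), here is a minimal sketch. It is not taken from the thread; it only assumes a CUDA-enabled PETSc build, the standard Vec and PetscLogStage APIs, and the -device_enable / -vec_type options mentioned above. Run with, e.g., -vec_type cuda -device_enable lazy -log_view: the device is only initialized when the first CUDA vector is touched, and that cost lands in the "WarmUp" stage, so the "Timed" stage reflects only the repeated work.

    #include <petscvec.h>

    int main(int argc, char **argv)
    {
      Vec            x, y;
      PetscLogStage  warmup, timed;
      PetscReal      nrm;
      PetscInt       i;
      PetscErrorCode ierr;

      ierr = PetscInitialize(&argc, &argv, NULL, NULL); if (ierr) return ierr;

      ierr = PetscLogStageRegister("WarmUp", &warmup); CHKERRQ(ierr);
      ierr = PetscLogStageRegister("Timed", &timed); CHKERRQ(ierr);

      /* The Vec type comes from the options database (-vec_type cuda, kokkos, ...),
         the same path libMesh takes through VecSetFromOptions() in the backtrace above. */
      ierr = VecCreate(PETSC_COMM_WORLD, &x); CHKERRQ(ierr);
      ierr = VecSetSizes(x, PETSC_DECIDE, 1000000); CHKERRQ(ierr);
      ierr = VecSetFromOptions(x); CHKERRQ(ierr);
      ierr = VecDuplicate(x, &y); CHKERRQ(ierr);

      /* Warm-up stage: with -device_enable lazy the device is initialized here,
         on first use, and that one-time cost is charged to this stage. */
      ierr = PetscLogStagePush(warmup); CHKERRQ(ierr);
      ierr = VecSet(x, 1.0); CHKERRQ(ierr);
      ierr = VecSet(y, 2.0); CHKERRQ(ierr);
      ierr = VecNorm(x, NORM_2, &nrm); CHKERRQ(ierr);
      ierr = PetscLogStagePop(); CHKERRQ(ierr);

      /* Timed stage: repeated work only; this is the part of -log_view to judge. */
      ierr = PetscLogStagePush(timed); CHKERRQ(ierr);
      for (i = 0; i < 100; i++) {
        ierr = VecAXPY(y, 1.0, x); CHKERRQ(ierr);
      }
      ierr = PetscLogStagePop(); CHKERRQ(ierr);

      ierr = VecDestroy(&x); CHKERRQ(ierr);
      ierr = VecDestroy(&y); CHKERRQ(ierr);
      ierr = PetscFinalize();
      return ierr;
    }

Running the same binary without -vec_type cuda should leave the device untouched under lazy initialization, which is the behavior being argued for above.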