[petsc-users] how to run petsc in MPI mode correct?
Ji Zhang
gotofd at gmail.com
Thu Oct 27 03:52:07 CDT 2016
Thank you very much, OMP_NUM_THREADS=1 works well!
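For anyone who hits the same issue, a minimal sketch of one way to pin the thread count (the script name run.py is only an example; exporting the variables in the shell works just as well). The point is that the variables are set before numpy and petsc4py load their BLAS/LAPACK backend, so that 'mpirun -n 4' really gives 4 single-threaded ranks:

import os
# Must be set before numpy / petsc4py load a (possibly threaded)
# BLAS/LAPACK backend such as OpenBLAS or MKL.
os.environ.setdefault('OMP_NUM_THREADS', '1')
os.environ.setdefault('MKL_NUM_THREADS', '1')

import numpy as np
from petsc4py import PETSc

Equivalently, on the command line:
OMP_NUM_THREADS=1 mpirun -n 4 python run.py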
Best,
Regards,
Zhang Ji, PhD student
Beijing Computational Science Research Center
Zhongguancun Software Park II, No. 10 Dongbeiwang West Road, Haidian
District, Beijing 100193, China
On Thu, Oct 27, 2016 at 4:24 PM, Stefano Zampini <stefano.zampini at gmail.com>
wrote:
>
>
> 2016-10-27 11:11 GMT+03:00 Ji Zhang <gotofd at gmail.com>:
>
>> Dear all,
>>
>> I'm using PETSc as a solver for my project. However, in parallel mode
>> the solver creates many more processes than I expect.
>>
>> The code uses Python and petsc4py. The machine has 4 cores.
>> (a). If I run it directly, PETSc uses only 1 process to assemble the
>> matrix, but creates 4 processes to solve the equations.
>> (b). If I run it with the command 'mpirun -n 4', PETSc uses 4 processes
>> to assemble the matrix, but creates 16 processes to solve the equations.
>>
>
> What do you mean by "PETSc creates 16 processes"? PETSc does not create
> processes.
> What's the output of PETSc.COMM_WORLD.getSize()?
>
> My feeling is that you have some Python component (numpy?) or the
> BLAS/LAPACK library that is multithreaded. Rerun using OMP_NUM_THREADS=1
> (or MKL_NUM_THREADS=1).
> If this does not fix your issue, try running under strace.
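>
> For reference, a minimal sketch (assuming petsc4py is already set up) of
> a check that prints what PETSc itself sees on each rank; any extra
> "processes" visible in top/htop but not listed here are threads spawned
> by an OpenMP/BLAS runtime, not MPI ranks:
>
> from petsc4py import PETSc
>
> comm = PETSc.COMM_WORLD
> PETSc.Sys.Print('PETSc sees %d MPI rank(s)' % comm.getSize())
> PETSc.Sys.syncPrint('  this is rank %d' % comm.getRank())
> PETSc.Sys.syncFlush()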
>
>
>> I have checked my own Python code; the main part associated with
>> matrix creation is as follows:
>>
>> m = PETSc.Mat().create(comm=PETSc.COMM_WORLD)
>> m.setSizes(((None, n_vnode[0]*3), (None, n_fnode[0]*3)))
>> m.setType('dense')
>> m.setFromOptions()
>> m.setUp()
>> m_start, m_end = m.getOwnershipRange()
>> for i0 in range(m_start, m_end):
>>     delta_xi = fnodes - vnodes[i0//3]
>>     temp1 = delta_xi ** 2
>>     delta_2 = np.square(delta)               # delta_2 = e^2
>>     delta_r2 = temp1.sum(axis=1) + delta_2   # delta_r2 = r^2+e^2
>>     delta_r3 = delta_r2 * np.sqrt(delta_r2)  # delta_r3 = (r^2+e^2)^1.5
>>     temp2 = (delta_r2 + delta_2) / delta_r3  # temp2 = (r^2+2*e^2)/(r^2+e^2)^1.5
>>     if i0 % 3 == 0:    # x axis
>>         m[i0, 0::3] = (temp2 + np.square(delta_xi[:, 0]) / delta_r3) / (8 * np.pi)  # Mxx
>>         m[i0, 1::3] = delta_xi[:, 0] * delta_xi[:, 1] / delta_r3 / (8 * np.pi)      # Mxy
>>         m[i0, 2::3] = delta_xi[:, 0] * delta_xi[:, 2] / delta_r3 / (8 * np.pi)      # Mxz
>>     elif i0 % 3 == 1:  # y axis
>>         m[i0, 0::3] = delta_xi[:, 0] * delta_xi[:, 1] / delta_r3 / (8 * np.pi)      # Mxy
>>         m[i0, 1::3] = (temp2 + np.square(delta_xi[:, 1]) / delta_r3) / (8 * np.pi)  # Myy
>>         m[i0, 2::3] = delta_xi[:, 1] * delta_xi[:, 2] / delta_r3 / (8 * np.pi)      # Myz
>>     else:              # z axis
>>         m[i0, 0::3] = delta_xi[:, 0] * delta_xi[:, 2] / delta_r3 / (8 * np.pi)      # Mxz
>>         m[i0, 1::3] = delta_xi[:, 1] * delta_xi[:, 2] / delta_r3 / (8 * np.pi)      # Myz
>>         m[i0, 2::3] = (temp2 + np.square(delta_xi[:, 2]) / delta_r3) / (8 * np.pi)  # Mzz
>> m.assemble()
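>>
>> As a side note, a small sketch (using the same matrix m and ownership
>> range as above) that shows how the assembly loop is split over the
>> ranks when run under mpirun:
>>
>> rank = PETSc.COMM_WORLD.getRank()
>> PETSc.Sys.syncPrint('rank %d owns rows [%d, %d)' % (rank, m_start, m_end))
>> PETSc.Sys.syncFlush()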
>>
>>
>>
>> The main part associated with the PETSc solver is as follows:
>>
>> ksp = PETSc.KSP()
>> ksp.create(comm=PETSc.COMM_WORLD)
>> ksp.setType(solve_method)
>> ksp.getPC().setType(precondition_method)
>> ksp.setOperators(self._M_petsc)
>> ksp.setFromOptions()
>> ksp.solve(velocity_petsc, force_petsc)
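>>
>> For completeness, a short sketch (using the same ksp object as above)
>> of how the outcome of the solve can be checked afterwards:
>>
>> # Converged reasons are positive, diverged reasons are negative.
>> PETSc.Sys.Print('KSP converged reason: %d, iterations: %d'
>>                 % (ksp.getConvergedReason(), ksp.getIterationNumber()))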
>>
>> Could anyone give me some suggestions? Thanks.
>>
>> Best,
>> Regards,
>> Zhang Ji, PhD student
>> Beijing Computational Science Research Center
>> Zhongguancun Software Park II, No. 10 Dongbeiwang West Road, Haidian
>> District, Beijing 100193, China
>>
>
>
>
> --
> Stefano
>