<div dir="ltr">I have a test up and running, but hypre and GAMG are running very slowly. The test only has about 100 equations per core. Jed mentioned 20K cycles to start an OMP parallel region (really?), which would explain a lot. Do I understand that correctly, Jed? <div><br></div><div><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Jan 9, 2015 at 1:44 PM, Abhyankar, Shrirang G. <span dir="ltr"><<a href="mailto:abhyshr@mcs.anl.gov" target="_blank">abhyshr@mcs.anl.gov</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div style="word-wrap:break-word;color:rgb(0,0,0);font-size:14px;font-family:Helvetica,sans-serif">
<div>The values need to be comma separated. You can specify the affinities in different ways as described in the PetscOptionsGetIntArray docs. <a href="http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Sys/PetscOptionsGetIntArray.html" target="_blank">http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Sys/PetscOptionsGetIntArray.html</a>. </div>
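<div>A minimal sketch of why the space-separated form likely ends up as "Core affinities set = 1": the shell splits the list into separate arguments, so only the first token follows the option name. This is an illustration only, not PETSc's actual parser, and the helper name affinities_from_argv is hypothetical.</div>
<pre>
```python
def affinities_from_argv(argv, name="-threadcomm_affinities"):
    """Return the integers given in the single token that follows `name`.

    Illustration only: mimics an option parser that reads exactly one
    value token per option and splits that token on commas.
    """
    if name not in argv:
        return []
    idx = argv.index(name)
    # Slice so that a missing value yields an empty list instead of an error.
    value_tokens = argv[idx + 1 : idx + 2]
    if not value_tokens:
        return []
    return [int(part) for part in value_tokens[0].split(",") if part]


# The shell has already split the command line on whitespace.
spaced = "-threadcomm_nthreads 8 -threadcomm_affinities 0 1 2 3 4 5 6 7".split()
comma = "-threadcomm_nthreads 8 -threadcomm_affinities 0,1,2,3,4,5,6,7".split()

print(len(affinities_from_argv(spaced)))  # one value is seen
print(len(affinities_from_argv(comma)))   # all eight values are seen
```
</pre>
<div>With the comma-separated form the whole list travels as one token, so the number of affinities can match -threadcomm_nthreads.</div>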
<div><br>
</div>
<div>I'll fix the docs for threadcomm.</div>
<div><br>
</div>
<span>
<div style="font-family:Calibri;font-size:11pt;text-align:left;color:black;BORDER-BOTTOM:medium none;BORDER-LEFT:medium none;PADDING-BOTTOM:0in;PADDING-LEFT:0in;PADDING-RIGHT:0in;BORDER-TOP:#b5c4df 1pt solid;BORDER-RIGHT:medium none;PADDING-TOP:3pt">
<span style="font-weight:bold">From: </span>Mark Adams <<a href="mailto:mfadams@lbl.gov" target="_blank">mfadams@lbl.gov</a>><br>
<span style="font-weight:bold">Date: </span>Fri, 9 Jan 2015 13:28:48 -0500<br>
<span style="font-weight:bold">To: </span>Shri <<a href="mailto:abhyshr@mcs.anl.gov" target="_blank">abhyshr@mcs.anl.gov</a>><br>
<span style="font-weight:bold">Cc: </span>barry smith <<a href="mailto:bsmith@mcs.anl.gov" target="_blank">bsmith@mcs.anl.gov</a>>, petsc-dev mailing list <<a href="mailto:petsc-dev@mcs.anl.gov" target="_blank">petsc-dev@mcs.anl.gov</a>><div><div class="h5"><br>
<span style="font-weight:bold">Subject: </span>Re: [petsc-dev] configuring hypre on batch system<br>
</div></div></div><div><div class="h5">
<div><br>
</div>
<blockquote style="BORDER-LEFT:#b5c4df 5 solid;PADDING:0 0 0 5;MARGIN:0 0 0 5">
<div dir="ltr">OK, threads seem to be working as advertised but I still get this error with
<div><br>
</div>
<div>
<div>-threadcomm_nthreads 8</div>
<div>-threadcomm_affinities 0 1 2 3 4 5 6 7</div>
</div>
<div><br>
</div>
<div>I'm guessing it must do the right thing.</div>
<div><br>
</div>
<div>Thanks again,</div>
<div>Mark</div>
<div><br>
</div>
<div>
<div>SOLVER_INIT: make partitioning with 2872/ 3705 real vertices</div>
<div>[10]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------</div>
<div>[10]PETSC ERROR: Nonconforming object sizes</div>
<div>[10]PETSC ERROR: Must set affinities for all threads, Threads = 8, Core affinities set = 1</div>
<div>[10]PETSC ERROR: See <a href="http://www.mcs.anl.gov/petsc/documentation/faq.html" target="_blank">
http://www.mcs.anl.gov/petsc/documentation/faq.html</a> for trouble shooting.</div>
<div>[10]PETSC ERROR: Petsc Development GIT revision: v3.5.2-1345-g927ffcc GIT Date: 2015-01-08 16:04:39 -0700</div>
<div>[10]PETSC ERROR: ./xgc2 on a arch-titan-opt-pgi named nid02295 by adams Fri Jan 9 13:20:52 2015</div>
<div>[10]PETSC ERROR: Configure options --COPTFLAGS="-mp -fast" --CXXOPTFLAGS="-mp -fast" --FOPTFLAGS="-mp -fast" --with-threadcomm --with-pthreadclasses --with-openmp --download-hypre --download-metis --download-parmetis --with-cc=cc --with-clib-autodetect=0
--with-cxx=CC --with-cxxlib-autodetect=0 --with-fc=ftn --with-fortranlib-autodetect=0 --with-shared-libraries=0 --known-mpi-shared-libraries=1 --with-x=0 --with-debugging=0 PETSC_ARCH=arch-titan-opt-pgi PETSC_DIR=/lustre/atlas2/env003/scratch/adams/petsc2</div>
<div>[10]PETSC ERROR: #1 PetscThreadCommSetAffinities() line 431 in /lustre/atlas2/env003/scratch/adams/petsc2/src/sys/threadcomm/interface/threadcomm.c</div>
<div>[10]PETSC ERROR: #2 PetscThreadCommWorldInitialize() line 1231 in /lustre/atlas2/env003/scratch/adams/petsc2/src/sys/threadcomm/interface/threadcomm.c</div>
<div>[10]PETSC ERROR: #3 PetscGetThreadCommWorld() line 82 in /lustre/atlas2/env003/scratch/adams/petsc2/src/sys/threadcomm/interface/threadcomm.c</div>
<div>[10]PETSC ERROR: #4 PetscCommGetThreadComm() line 117 in /lustre/atlas2/env003/scratch/adams/petsc2/src/sys/threadcomm/interface/threadcomm.c</div>
<div>[10]PETSC ERROR: #5 PetscCommDuplicate() line 195 in /lustre/atlas2/env003/scratch/adams/petsc2/src/sys/objects/tagm.c</div>
<div>[10]PETSC ERROR: #6 PetscHeaderCreate_Private() line 59 in /lustre/atlas2/env003/scratch/adams/petsc2/src/sys/objects/inherit.c</div>
</div>
<div><br>
</div>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Fri, Jan 9, 2015 at 12:25 AM, Abhyankar, Shrirang G. <span dir="ltr">
<<a href="mailto:abhyshr@mcs.anl.gov" target="_blank">abhyshr@mcs.anl.gov</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div style="color:rgb(0,0,0);font-size:14px;font-family:Helvetica,sans-serif;word-wrap:break-word">
<div>Mark,</div>
<div> The input for -threadcomm_affinities is the list of processor numbers </div>
<div><br>
</div>
<div>So -threadcomm_nthreads 4 </div>
<span>
<div> -threadcomm_affinities 0 1 2 3</div>
<div><br>
</div>
</span>
<div>will pin the 4 threads to processors 0,1,2,3. Unfortunately, there is no standard mapping of processor numbers to physical and/or logical cores (it is decided by the OS, I think). For example, on one node with two quad-core CPUs (total 8 processors,
no hyperthreading), the 1st CPU may have processor numbers 0,1,3,5, while the other has 2,4,6,7. On another node with similar hardware, the processor numbers may be 0,1,2,3 on the 1st CPU and 4,5,6,7 on the second. Hence, tools like likwid or hwloc are very helpful
for getting the hardware layout. You may also obtain this info by looking at /proc/cpuinfo on Linux.</div>
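<div>As a rough sketch of the /proc/cpuinfo route, the snippet below groups processor numbers by socket. It assumes the "processor", "physical id" and "core id" fields that x86 Linux kernels usually report; other architectures may omit them. The helper names parse_cpuinfo and affinities_for_socket are hypothetical.</div>
<pre>
```python
def parse_cpuinfo(text):
    """Map each logical processor number to its physical id and core id."""
    layout = {}
    current = None
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator lines between processor entries
        key, value = key.strip(), value.strip()
        if key == "processor":
            current = int(value)
            layout[current] = {"physical id": None, "core id": None}
        elif key in ("physical id", "core id") and current is not None:
            layout[current][key] = int(value)
    return layout


def affinities_for_socket(layout, socket):
    """Return the processors on one socket as a comma-separated string."""
    procs = sorted(p for p, info in layout.items() if info["physical id"] == socket)
    return ",".join(str(p) for p in procs)
```
</pre>
<div>For example, calling parse_cpuinfo on the contents of /proc/cpuinfo and then affinities_for_socket(layout, 0) gives a string that could be passed to -threadcomm_affinities. hwloc or likwid remain the more reliable way to see the full topology.</div>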
<div><br>
</div>
<div>Shri</div>
<span>
<div style="font-family:Calibri;font-size:11pt;text-align:left;color:black;BORDER-BOTTOM:medium none;BORDER-LEFT:medium none;PADDING-BOTTOM:0in;PADDING-LEFT:0in;PADDING-RIGHT:0in;BORDER-TOP:#b5c4df 1pt solid;BORDER-RIGHT:medium none;PADDING-TOP:3pt">
<span style="font-weight:bold">From: </span>Mark Adams <<a href="mailto:mfadams@lbl.gov" target="_blank">mfadams@lbl.gov</a>><br>
<span style="font-weight:bold">Date: </span>Thu, 8 Jan 2015 21:43:30 -0500<br>
<span style="font-weight:bold">To: </span>barry smith <<a href="mailto:bsmith@mcs.anl.gov" target="_blank">bsmith@mcs.anl.gov</a>><br>
<span style="font-weight:bold">Cc: </span>petsc-dev mailing list <<a href="mailto:petsc-dev@mcs.anl.gov" target="_blank">petsc-dev@mcs.anl.gov</a>><br>
<span style="font-weight:bold">Subject: </span>Re: [petsc-dev] configuring hypre on batch system<br>
</div>
<div>
<div>
<div><br>
</div>
<blockquote style="BORDER-LEFT:#b5c4df 5 solid;PADDING:0 0 0 5;MARGIN:0 0 0 5">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<span><br>
> -threadcomm_affinities 0 1 2 3 4 5 6 7 ?????<br>
<br>
</span>I don't know what the flag is here<br>
<span><br>
</span></blockquote>
<div><br>
</div>
<div>Neither do I. The web page <a href="http://www.mcs.anl.gov/petsc/features/threads.html" target="_blank">http://www.mcs.anl.gov/petsc/features/threads.html</a> says:</div>
<div><br>
</div>
<div>
<ul style="color:rgb(0,0,0);font-family:Times;font-size:medium;background-color:rgb(213,234,255)">
<li><br>
-threadcomm_affinities &lt;list_of_affinities&gt;: Sets the core affinities of threads</li></ul>
</div>
<div>I'm not sure what to put here ...</div>
<div><br>
</div>
<div> </div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<span>> -threadcomm_type openmp<br>
><br>
> Then would I get threaded MatVec and other CG + MG stuff? I know this will not be faster but I just need data to corroborate what we all know. And I don't care about setup.<br>
<br>
</span> Depends on the smoother; we don't have any threaded SOR, but if you are using Jacobi + Chebyshev it will be threaded.<br>
<br>
</blockquote>
<div><br>
</div>
<div>Oh right, thanks,</div>
<div><br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<br>
</blockquote>
</div>
<br>
</div>
</div>
</blockquote>
</div>
</div>
</span></div>
</blockquote>
</div>
<br>
</div>
</blockquote>
</div></div></span>
</div>
</blockquote></div><br></div>