<div dir="ltr"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div>If I use, say Np = 16 processes on one node, MPI is running 16 versions of the code on a single node (which has 16 cores). How does OpenMP figure out how to fork? Does it fork a total of 16 threads/MPI process = 256 threads or is it smart to just fork a total of 16 threads/node = 1 thread/core = 16 threads? I'm a bit confused here how the job is scheduled when MPI and OpenMP are mixed? </div>
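</div></blockquote><div><br></div><div>This is exactly what OMP_NUM_THREADS is for. OpenMP knows nothing about MPI: each MPI process independently forks OMP_NUM_THREADS threads in its parallel regions, and if the variable is unset the default is implementation-defined (often one thread per core). So 16 processes on a 16-core node can end up with 256 threads unless you set it lower. If you're trying to increase the amount of memory per process, you should map one process per node and set OMP_NUM_THREADS to the number of OpenMP threads you'd like per process. There are many tutorials and even textbooks on hybrid MPI/OpenMP programming that you should consult for more information (or you could ask on <a href="http://scicomp.stackexchange.com">scicomp.stackexchange.com</a>).</div>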
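<div><br></div><div>As a rough sketch of the arithmetic, assuming every MPI process forks the same number of threads and you want about one thread per core: the helper names below are hypothetical and are not part of MPI or OpenMP, they only illustrate how the per-node thread count follows from the process count and OMP_NUM_THREADS.</div>
<pre>
```python
def threads_per_node(procs_per_node, omp_num_threads):
    """Total OpenMP threads on a node: each MPI process forks its own team."""
    return procs_per_node * omp_num_threads


def omp_threads_per_process(cores_per_node, procs_per_node):
    """Hypothetical helper: an OMP_NUM_THREADS value giving about one thread per core."""
    # Integer division; never suggest fewer than one thread per process.
    return max(1, cores_per_node // procs_per_node)


if __name__ == "__main__":
    cores = 16  # assumed core count per node, taken from the question
    for procs in (16, 4, 1):
        n = omp_threads_per_process(cores, procs)
        total = threads_per_node(procs, n)
        print(f"{procs} procs/node: OMP_NUM_THREADS={n}, {total} threads/node")
```
</pre>
<div>With 16 processes per node this suggests OMP_NUM_THREADS=1, and with one process per node it suggests 16, which is the memory-per-process case described above.</div>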
<div><br></div><div>Cheers,</div><div>Aron</div></div></div>