<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">
<div class="">I did some quick tests (with a different example) on a single KNL node and a single Haswell node, both using 4 processes. The MatMult results are below. The total running time on KNL is a bit more than twice that on Haswell.
So I think the results Justin got with SNES ex48 are reasonable, considering that KNL cores are much less powerful than Haswell cores, as Richard mentioned.</div>
<div class=""><br class="">
</div>
<pre class="">------------------------------------------------------------------------------------------------------------------------
Event                Count      Time (sec)     Flops                             --- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
MatMult(KNL)        1609 1.0 1.4044e+02 1.0 6.41e+10 1.0 1.3e+04 3.3e+04 0.0e+00 18 19 91 93  0  18 19 91 93  0  1826

MatMult(Haswell)    1609 1.0 4.4927e+01 1.0 6.41e+10 1.0 1.3e+04 3.3e+04 0.0e+00 18 19 91 93  0  18 19 91 93  0  5708
</pre>
<div class=""><br class="">
</div>
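<div class="">As a rough cross-check of the two MatMult rows above, the sketch below recomputes the aggregate flop rate and the KNL/Haswell time ratio. It assumes that the Total Mflop/s column is the sum of flops over all ranks divided by the maximum time, and that every rank does the same number of flops (the Flops ratio is 1.0). The variable and function names are made up for illustration.</div>
<pre class="">
```python
# Values copied from the MatMult rows above (4 processes on a single node each).
ranks = 4
flops_per_rank = 6.41e10    # "Flops Max" column; ratio 1.0, so assumed equal on every rank
time_knl = 1.4044e+02       # seconds, "Time Max" column, KNL
time_haswell = 4.4927e+01   # seconds, "Time Max" column, Haswell


def total_mflops(flops_per_rank, ranks, seconds):
    """Aggregate rate in Mflop/s: total flops over all ranks divided by the max time."""
    return flops_per_rank * ranks / seconds / 1.0e6


rate_knl = total_mflops(flops_per_rank, ranks, time_knl)
rate_haswell = total_mflops(flops_per_rank, ranks, time_haswell)
slowdown = time_knl / time_haswell

print(f"KNL     MatMult: {rate_knl:.0f} Mflop/s")
print(f"Haswell MatMult: {rate_haswell:.0f} Mflop/s")
print(f"KNL / Haswell MatMult time: {slowdown:.2f}x")
```
</pre>
<div class="">The recomputed rates agree with the Mflop/s column to within the rounding of the flop count, and MatMult alone is roughly 3.1 times slower on KNL in this test.</div>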
<div class="">Hong (Mr.)</div>
<div class=""><br class="">
</div>
<div>
<blockquote type="cite" class="">
<div class="">On Apr 4, 2017, at 11:05 AM, Matthew Knepley <<a href="mailto:knepley@gmail.com" class="">knepley@gmail.com</a>> wrote:</div>
<br class="Apple-interchange-newline">
<div class="">
<div dir="ltr" style="font-family: Verdana; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class="">
<div class="gmail_extra">
<div class="gmail_quote">On Tue, Apr 4, 2017 at 10:57 AM, Justin Chang<span class="Apple-converted-space"> </span><span dir="ltr" class=""><<a href="mailto:jychang48@gmail.com" target="_blank" class="">jychang48@gmail.com</a>></span><span class="Apple-converted-space"> </span>wrote:<br class="">
<blockquote class="gmail_quote" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex;">
<div class="">Thanks everyone for the helpful advice. So I tried all the suggestions including using libsci. The performance did not improve for my particular runs, which I think suggests the problem parameters chosen for my tests (SNES ex48) are not optimal
for KNL. Does anyone have example test runs I could reproduce that compare the performance between KNL and Haswell/Ivybridge/etc? </div>
</blockquote>
<div class=""><br class="">
</div>
<div class="">Let's try to see what is going on with your existing data first.</div>
<div class=""><br class="">
</div>
<div class="">First, I think the main thing is to make sure we are using MCDRAM. Everything else in KNL</div>
<div class="">is window dressing (IMHO). All we have to look at is something like MAXPY. You can get the</div>
<div class="">bandwidth estimate from the flop rate and problem size (I think), and we can at least get</div>
<div class="">bandwidth ratios between Haswell and KNL with that number.</div>
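<div class="">A minimal sketch of that bandwidth estimate, under a simplified traffic model that is an assumption here and not something PETSc reports: a MAXPY over k vectors of doubles does 2k flops per entry and streams k + 2 vectors through memory (read each x, read y, write y), with no cache reuse. The function names and the flop rates below are placeholders for illustration, not measurements.</div>
<pre class="">
```python
def maxpy_bytes_per_flop(num_vecs, bytes_per_scalar=8):
    """Bytes moved per flop for y += sum_i alpha_i * x_i under the streaming model."""
    flops_per_entry = 2 * num_vecs                       # one multiply and one add per vector
    bytes_per_entry = (num_vecs + 2) * bytes_per_scalar  # read k x's, read y, write y
    return bytes_per_entry / flops_per_entry


def bandwidth_gb_per_s(mflops, num_vecs):
    """Estimated memory bandwidth in GB/s from a measured MAXPY flop rate in Mflop/s."""
    return mflops * 1.0e6 * maxpy_bytes_per_flop(num_vecs) / 1.0e9


# Placeholder flop rates (NOT measurements): replace with the VecMAXPY
# Mflop/s values from -log_view on each machine.
example_knl_mflops = 4000.0
example_haswell_mflops = 2000.0
example_num_vecs = 4  # hypothetical number of vectors per MAXPY call

bw_knl = bandwidth_gb_per_s(example_knl_mflops, example_num_vecs)
bw_haswell = bandwidth_gb_per_s(example_haswell_mflops, example_num_vecs)
print(f"estimated bandwidth, KNL:     {bw_knl:.1f} GB/s")
print(f"estimated bandwidth, Haswell: {bw_haswell:.1f} GB/s")
print(f"bandwidth ratio KNL/Haswell:  {bw_knl / bw_haswell:.2f}")
```
</pre>
<div class="">For the same number of vectors the bytes-per-flop factor cancels, so the bandwidth ratio between the two machines is just the ratio of their MAXPY flop rates, even if the absolute estimate is off.</div>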
<div class=""><br class="">
</div>
<div class=""> Matt</div>
<div class=""> </div>
<blockquote class="gmail_quote" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex;">
<div class="HOEnZb">
<div class="h5">
<div class="">
<div class="gmail_quote">
<div class="">On Mon, Apr 3, 2017 at 3:06 PM Richard Mills <<a href="mailto:richardtmills@gmail.com" target="_blank" class="">richardtmills@gmail.com</a>> wrote:<br class="">
</div>
<blockquote class="gmail_quote" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex;">
<div class="m_-883216464682320638gmail_msg">
<div class="m_-883216464682320638gmail_msg">Yes, one should rely on MKL (or Cray LibSci, if using the Cray toolchain) on Cori. But I'm guessing that this will make no noticeable difference for what Justin is doing.<br class="m_-883216464682320638gmail_msg">
<br class="m_-883216464682320638gmail_msg">
</div>
</div>
<div class="m_-883216464682320638gmail_msg">--Richard<br class="m_-883216464682320638gmail_msg">
</div>
<div class="gmail_extra m_-883216464682320638gmail_msg"><br class="m_-883216464682320638gmail_msg">
<div class="gmail_quote m_-883216464682320638gmail_msg">On Mon, Apr 3, 2017 at 12:57 PM, murat keçeli<span class="Apple-converted-space"> </span><span class="m_-883216464682320638gmail_msg"><<a href="mailto:keceli@gmail.com" class="m_-883216464682320638gmail_msg" target="_blank">keceli@gmail.com</a>></span><span class="Apple-converted-space"> </span>wrote:<br class="m_-883216464682320638gmail_msg">
<blockquote class="gmail_quote m_-883216464682320638gmail_msg" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex;">
<div class="m_-883216464682320638gmail_msg">How about replacing <span class="m_-883216464682320638gmail_msg" style="font-size: 12.8px;">--download-<wbr class="">fblaslapack with vendor specific BLAS/LAPACK? </span><span class="m_-883216464682320638m_1223051407712377658HOEnZb m_-883216464682320638gmail_msg"><font color="#888888" class="m_-883216464682320638gmail_msg">
<div class="m_-883216464682320638gmail_msg"><span class="m_-883216464682320638gmail_msg" style="font-size: 12.8px;"><br class="m_-883216464682320638gmail_msg">
</span>
<div class="m_-883216464682320638gmail_msg"><span class="m_-883216464682320638gmail_msg" style="font-size: 12.8px;">Murat</span></div>
</div>
</font></span></div>
<div class="m_-883216464682320638m_1223051407712377658HOEnZb m_-883216464682320638gmail_msg">
<div class="m_-883216464682320638gmail_msg m_-883216464682320638m_1223051407712377658h5">
<div class="gmail_extra m_-883216464682320638gmail_msg"><br class="m_-883216464682320638gmail_msg">
<div class="gmail_quote m_-883216464682320638gmail_msg">On Mon, Apr 3, 2017 at 2:45 PM, Richard Mills<span class="Apple-converted-space"> </span><span class="m_-883216464682320638gmail_msg"><<a href="mailto:richardtmills@gmail.com" class="m_-883216464682320638gmail_msg" target="_blank">richardtmills@gmail.com</a>></span><span class="Apple-converted-space"> </span>wrote:<br class="m_-883216464682320638gmail_msg">
<blockquote class="gmail_quote m_-883216464682320638gmail_msg" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex;">
<div class="m_-883216464682320638gmail_msg">
<div class="gmail_extra m_-883216464682320638gmail_msg">
<div class="gmail_quote m_-883216464682320638gmail_msg"><span class="m_-883216464682320638gmail_msg">On Mon, Apr 3, 2017 at 12:24 PM, Zhang, Hong<span class="Apple-converted-space"> </span><span class="m_-883216464682320638gmail_msg"><<a href="mailto:hongzhang@anl.gov" class="m_-883216464682320638gmail_msg" target="_blank">hongzhang@anl.gov</a>></span><span class="Apple-converted-space"> </span>wrote:<br class="m_-883216464682320638gmail_msg">
<blockquote class="gmail_quote m_-883216464682320638gmail_msg" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex;">
<div class="m_-883216464682320638gmail_msg" style="word-wrap: break-word;">
<div class="m_-883216464682320638gmail_msg"><br class="m_-883216464682320638gmail_msg">
</div>
<div class="m_-883216464682320638gmail_msg"><span class="m_-883216464682320638gmail_msg">
<blockquote type="cite" class="m_-883216464682320638gmail_msg">
<div class="m_-883216464682320638gmail_msg">On Apr 3, 2017, at 1:44 PM, Justin Chang <<a href="mailto:jychang48@gmail.com" class="m_-883216464682320638gmail_msg" target="_blank">jychang48@gmail.com</a>> wrote:</div>
<br class="m_-883216464682320638gmail_msg m_-883216464682320638m_1223051407712377658m_3572076414804514015m_-516510004710254006m_4134330283867791398Apple-interchange-newline">
<div class="m_-883216464682320638gmail_msg">
<div class="m_-883216464682320638gmail_msg">
<div class="m_-883216464682320638gmail_msg">
<div class="m_-883216464682320638gmail_msg">
<div class="m_-883216464682320638gmail_msg">
<div class="m_-883216464682320638gmail_msg">
<div class="m_-883216464682320638gmail_msg">
<div class="m_-883216464682320638gmail_msg">Richard,<br class="m_-883216464682320638gmail_msg">
<br class="m_-883216464682320638gmail_msg">
</div>
This is what my job script looks like:<br class="m_-883216464682320638gmail_msg">
<br class="m_-883216464682320638gmail_msg">
#!/bin/bash<br class="m_-883216464682320638gmail_msg">
#SBATCH -N 16<br class="m_-883216464682320638gmail_msg">
#SBATCH -C knl,quad,flat<br class="m_-883216464682320638gmail_msg">
#SBATCH -p regular<br class="m_-883216464682320638gmail_msg">
#SBATCH -J knlflat1024<br class="m_-883216464682320638gmail_msg">
#SBATCH -L SCRATCH<br class="m_-883216464682320638gmail_msg">
#SBATCH -o knlflat1024.o%j<br class="m_-883216464682320638gmail_msg">
#SBATCH --mail-type=ALL<br class="m_-883216464682320638gmail_msg">
#SBATCH --mail-user=<a href="mailto:jychang48@gmail.com" class="m_-883216464682320638gmail_msg" target="_blank">jychang48@gmail.<wbr class="">com</a><br class="m_-883216464682320638gmail_msg">
#SBATCH -t 00:20:00<br class="m_-883216464682320638gmail_msg">
<br class="m_-883216464682320638gmail_msg">
#run the application:<br class="m_-883216464682320638gmail_msg">
cd $SCRATCH/Icesheet<br class="m_-883216464682320638gmail_msg">
sbcast --compress=lz4 ./ex48cori /tmp/ex48cori<br class="m_-883216464682320638gmail_msg">
srun -n 1024 -c 4 --cpu_bind=cores numactl -p 1 /tmp/ex48cori -M 128 -N 128 -P 16 -thi_mat_type baij -pc_type mg -mg_coarse_pc_type gamg -da_refine 1<br class="m_-883216464682320638gmail_msg">
<br class="m_-883216464682320638gmail_msg">
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</blockquote>
<div class="m_-883216464682320638gmail_msg"><br class="m_-883216464682320638gmail_msg">
</div>
</span>
<div class="m_-883216464682320638gmail_msg">Maybe it is a typo. It should be numactl -m 1.</div>
</div>
</div>
</blockquote>
<div class="m_-883216464682320638gmail_msg"><br class="m_-883216464682320638gmail_msg">
</div>
</span>
<div class="m_-883216464682320638gmail_msg">"-p 1" will also work. "-p" means to "prefer" NUMA node 1 (the MCDRAM), whereas "-m" means to use only NUMA node 1. In the former case, MCDRAM will be used for allocations until the available memory there has been
exhausted, and then things will spill over into the DRAM. One would think that "-m" would be better for doing performance studies, but on systems where the nodes have swap space enabled, you can get terrible performance if your code's working set exceeds
the size of the MCDRAM, as the system will obediently honor your wishes to not use the DRAM and go straight to the swap disk! I assume the Cori nodes don't have swap space, though I could be wrong.<br class="m_-883216464682320638gmail_msg">
<br class="m_-883216464682320638gmail_msg">
</div>
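<div class="m_-883216464682320638gmail_msg">To make the difference between "-p" and "-m" concrete, here is a toy model of where a working set ends up under each policy. It is only an illustration: the function name is hypothetical, the 16 GB MCDRAM capacity is an assumption about the node, and real placement happens page by page in the kernel.</div>
<pre class="m_-883216464682320638gmail_msg">
```python
def place_allocation(working_set_gb, policy, mcdram_gb=16.0):
    """Toy model: return (GB in MCDRAM, GB in DDR, GB that cannot be placed).

    The last value is memory that would go to swap if swap is enabled,
    or make the allocation fail if it is not.
    """
    in_mcdram = min(working_set_gb, mcdram_gb)
    overflow = working_set_gb - in_mcdram
    if policy == "preferred":   # numactl -p 1: spill over into DDR once MCDRAM is full
        return in_mcdram, overflow, 0.0
    if policy == "membind":     # numactl -m 1: never touch DDR
        return in_mcdram, 0.0, overflow
    raise ValueError("policy must be 'preferred' or 'membind'")


for policy in ("preferred", "membind"):
    mcdram, ddr, unplaced = place_allocation(20.0, policy)
    print(f"{policy:9s}: MCDRAM {mcdram} GB, DDR {ddr} GB, swap or failure {unplaced} GB")
```
</pre>
<div class="m_-883216464682320638gmail_msg">As long as the working set fits in MCDRAM, the two policies place memory identically in this model; they only differ once the working set is larger than the MCDRAM.</div>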
<span class="m_-883216464682320638gmail_msg">
<blockquote class="gmail_quote m_-883216464682320638gmail_msg" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex;">
<div class="m_-883216464682320638gmail_msg" style="word-wrap: break-word;">
<div class="m_-883216464682320638gmail_msg"><span class="m_-883216464682320638gmail_msg">
<div class="m_-883216464682320638gmail_msg"><br class="m_-883216464682320638gmail_msg">
</div>
<blockquote type="cite" class="m_-883216464682320638gmail_msg">
<div class="m_-883216464682320638gmail_msg">
<div class="m_-883216464682320638gmail_msg">
<div class="m_-883216464682320638gmail_msg">
<div class="m_-883216464682320638gmail_msg">
<div class="m_-883216464682320638gmail_msg">
<div class="m_-883216464682320638gmail_msg">The NERSC info pages say to add "numactl" when using flat mode. I previously tried cache mode, but the performance seemed unaffected.<br class="m_-883216464682320638gmail_msg">
</div>
</div>
</div>
</div>
</div>
</div>
</blockquote>
<div class="m_-883216464682320638gmail_msg"><br class="m_-883216464682320638gmail_msg">
</div>
</span>
<div class="m_-883216464682320638gmail_msg">Using cache mode should give performance similar to using flat mode with the numactl option. But both approaches should be significantly faster than using flat mode without the numactl option. I usually see over 3X
speedup. You can also do such a comparison to see if the high-bandwidth memory is working properly.</div>
<span class="m_-883216464682320638gmail_msg"><br class="m_-883216464682320638gmail_msg">
<blockquote type="cite" class="m_-883216464682320638gmail_msg">
<div class="m_-883216464682320638gmail_msg">
<div class="m_-883216464682320638gmail_msg">
<div class="m_-883216464682320638gmail_msg">
<div class="m_-883216464682320638gmail_msg">
<div class="m_-883216464682320638gmail_msg">I also compared 256 Haswell nodes vs 256 KNL nodes, and Haswell is nearly 4-5x faster. Though I suspect this drastic change has much to do with the initial coarse grid size now being extremely small.</div>
</div>
</div>
</div>
</div>
</blockquote>
</span></div>
</div>
</blockquote>
</span>
<div class="m_-883216464682320638gmail_msg">I think you may be right about why you see such a big difference. The KNL nodes need enough work to be able to use the SIMD lanes effectively. Also, if your problem gets small enough, then it's going to be able
to fit in the Haswell's L3 cache. Although KNL has MCDRAM and this delivers *a lot* more memory bandwidth than the DDR4 memory, it will deliver a lot less bandwidth than the Haswell's L3.<span class="Apple-converted-space"> </span><br class="m_-883216464682320638gmail_msg">
</div>
<span class="m_-883216464682320638gmail_msg">
<blockquote class="gmail_quote m_-883216464682320638gmail_msg" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex;">
<div class="m_-883216464682320638gmail_msg" style="word-wrap: break-word;">
<div class="m_-883216464682320638gmail_msg"><span class="m_-883216464682320638gmail_msg">
<blockquote type="cite" class="m_-883216464682320638gmail_msg">
<div class="m_-883216464682320638gmail_msg">
<div class="m_-883216464682320638gmail_msg">
<div class="m_-883216464682320638gmail_msg">
<div class="m_-883216464682320638gmail_msg"></div>
</div>
</div>
</div>
</blockquote>
<blockquote type="cite" class="m_-883216464682320638gmail_msg">
<div class="m_-883216464682320638gmail_msg">
<div class="m_-883216464682320638gmail_msg">
<div class="m_-883216464682320638gmail_msg">
<div class="m_-883216464682320638gmail_msg">I'll give the COPTFLAGS a try and see what happens<br class="m_-883216464682320638gmail_msg">
</div>
</div>
</div>
</div>
</blockquote>
<div class="m_-883216464682320638gmail_msg"><br class="m_-883216464682320638gmail_msg">
</div>
</span>
<div class="m_-883216464682320638gmail_msg">Make sure to use --with-memalign=64 for data alignment when configuring PETSc.</div>
</div>
</div>
</blockquote>
<div class="m_-883216464682320638gmail_msg"><br class="m_-883216464682320638gmail_msg">
</div>
</span>
<div class="m_-883216464682320638gmail_msg">Ah, yes, I forgot that. Thanks for mentioning it, Hong!<span class="Apple-converted-space"> </span><br class="m_-883216464682320638gmail_msg">
<br class="m_-883216464682320638gmail_msg">
</div>
<span class="m_-883216464682320638gmail_msg">
<blockquote class="gmail_quote m_-883216464682320638gmail_msg" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex;">
<div class="m_-883216464682320638gmail_msg" style="word-wrap: break-word;">
<div class="m_-883216464682320638gmail_msg">
<div class="m_-883216464682320638gmail_msg"><br class="m_-883216464682320638gmail_msg">
</div>
<div class="m_-883216464682320638gmail_msg">The option -xMIC-AVX512 would improve the vectorization performance. But it may cause problems for the MPIBAIJ format for some unknown reason. MPIAIJ should work fine with this option.</div>
</div>
</div>
</blockquote>
<div class="m_-883216464682320638gmail_msg"><br class="m_-883216464682320638gmail_msg">
</div>
</span>
<div class="m_-883216464682320638gmail_msg">Hmm. Try both, and, if you see worse performance with MPIBAIJ, let us know and I'll try to figure this out.<span class="m_-883216464682320638m_1223051407712377658m_3572076414804514015HOEnZb m_-883216464682320638gmail_msg"><font color="#888888" class="m_-883216464682320638gmail_msg"><br class="m_-883216464682320638gmail_msg">
<br class="m_-883216464682320638gmail_msg">
</font></span></div>
<span class="m_-883216464682320638m_1223051407712377658m_3572076414804514015HOEnZb m_-883216464682320638gmail_msg"><font color="#888888" class="m_-883216464682320638gmail_msg">
<div class="m_-883216464682320638gmail_msg">--Richard<br class="m_-883216464682320638gmail_msg">
<br class="m_-883216464682320638gmail_msg">
</div>
</font></span><span class="m_-883216464682320638gmail_msg">
<blockquote class="gmail_quote m_-883216464682320638gmail_msg" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex;">
<div class="m_-883216464682320638gmail_msg" style="word-wrap: break-word;">
<div class="m_-883216464682320638gmail_msg">
<div class="m_-883216464682320638gmail_msg"><br class="m_-883216464682320638gmail_msg">
</div>
<div class="m_-883216464682320638gmail_msg">Hong (Mr.)</div>
<span class="m_-883216464682320638gmail_msg"><br class="m_-883216464682320638gmail_msg">
<blockquote type="cite" class="m_-883216464682320638gmail_msg">
<div class="m_-883216464682320638gmail_msg">
<div class="m_-883216464682320638gmail_msg">
<div class="m_-883216464682320638gmail_msg">Thanks,<br class="m_-883216464682320638gmail_msg">
</div>
Justin<br class="m_-883216464682320638gmail_msg">
</div>
<div class="gmail_extra m_-883216464682320638gmail_msg"><br class="m_-883216464682320638gmail_msg">
<div class="gmail_quote m_-883216464682320638gmail_msg">On Mon, Apr 3, 2017 at 1:36 PM, Richard Mills<span class="Apple-converted-space"> </span><span class="m_-883216464682320638gmail_msg"><<a href="mailto:richardtmills@gmail.com" class="m_-883216464682320638gmail_msg" target="_blank">richardtmills@gmail.com</a>></span><span class="Apple-converted-space"> </span>wrote:<br class="m_-883216464682320638gmail_msg">
<blockquote class="gmail_quote m_-883216464682320638gmail_msg" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex;">
<div class="m_-883216464682320638gmail_msg">
<div class="m_-883216464682320638gmail_msg">
<div class="m_-883216464682320638gmail_msg">
<div class="m_-883216464682320638gmail_msg">
<div class="m_-883216464682320638gmail_msg">Hi Justin,<br class="m_-883216464682320638gmail_msg">
<br class="m_-883216464682320638gmail_msg">
</div>
How is the MCDRAM (on-package "high-bandwidth memory") configured for your KNL runs? And if it is in "flat" mode, what are you doing to ensure that you use the MCDRAM? Doing this wrong seems to be one of the most common reasons for unexpectedly poor performance
on KNL.<br class="m_-883216464682320638gmail_msg">
<br class="m_-883216464682320638gmail_msg">
</div>
<div class="m_-883216464682320638gmail_msg">I'm not that familiar with the environment on Cori, but I think that if you are building for KNL, you should add "-xMIC-AVX512" to your compiler flags to explicitly instruct the compiler to use the AVX512 instruction
set. I usually use something along the lines of<br class="m_-883216464682320638gmail_msg">
<br class="m_-883216464682320638gmail_msg">
'COPTFLAGS=-g -O3 -fp-model fast -xMIC-AVX512'<br class="m_-883216464682320638gmail_msg">
<br class="m_-883216464682320638gmail_msg">
</div>
<div class="m_-883216464682320638gmail_msg">(The "-g" just adds symbols, which make the output from performance profiling tools much more useful.)<span class="Apple-converted-space"> </span><br class="m_-883216464682320638gmail_msg">
</div>
<div class="m_-883216464682320638gmail_msg"><br class="m_-883216464682320638gmail_msg">
</div>
That said, I think that if you are comparing 1024 Haswell cores vs. 1024 KNL cores (so double the number of Haswell nodes), I'm not surprised that the simulations are almost twice as fast using the Haswell nodes. Keep in mind that individual KNL cores are
much less powerful than individual Haswell cores. You are also using roughly twice the power footprint (a dual-socket Haswell node should be roughly equivalent to a KNL node, I believe). How do things look when you compare equal numbers of nodes?<br class="m_-883216464682320638gmail_msg">
<br class="m_-883216464682320638gmail_msg">
</div>
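<div class="m_-883216464682320638gmail_msg">A small worked example of the per-node view, using the node counts quoted in this thread (1024 ranks on 32 Haswell nodes or 16 KNL nodes). The runtimes are relative placeholders that take "almost twice as fast" as exactly 2x, which is an assumption and not a measurement.</div>
<pre class="m_-883216464682320638gmail_msg">
```python
total_ranks = 1024
haswell_nodes = 32
knl_nodes = 16

# MPI ranks placed on each node in the two runs.
haswell_ranks_per_node = total_ranks // haswell_nodes
knl_ranks_per_node = total_ranks // knl_nodes

# Relative runtimes in arbitrary units (placeholders, not measured values):
# "almost twice as fast on Haswell" is rounded to exactly 2x here.
haswell_time = 1.0
knl_time = 2.0

# Cost in node-time units: how many nodes are occupied for how long.
haswell_node_time = haswell_nodes * haswell_time
knl_node_time = knl_nodes * knl_time

print(f"ranks per node: Haswell {haswell_ranks_per_node}, KNL {knl_ranks_per_node}")
print(f"node-time cost: Haswell {haswell_node_time}, KNL {knl_node_time}")
print(f"per-node cost ratio KNL/Haswell: {knl_node_time / haswell_node_time:.2f}")
```
</pre>
<div class="m_-883216464682320638gmail_msg">Under that rounding the two runs cost about the same in node-time, which is why a comparison at equal node counts is the more informative one.</div>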
Cheers,<br class="m_-883216464682320638gmail_msg">
</div>
Richard<br class="m_-883216464682320638gmail_msg">
</div>
<div class="m_-883216464682320638m_1223051407712377658m_3572076414804514015m_-516510004710254006m_4134330283867791398HOEnZb m_-883216464682320638gmail_msg">
<div class="m_-883216464682320638gmail_msg m_-883216464682320638m_1223051407712377658m_3572076414804514015m_-516510004710254006m_4134330283867791398h5">
<div class="gmail_extra m_-883216464682320638gmail_msg"><br class="m_-883216464682320638gmail_msg">
<div class="gmail_quote m_-883216464682320638gmail_msg">On Mon, Apr 3, 2017 at 11:13 AM, Justin Chang<span class="Apple-converted-space"> </span><span class="m_-883216464682320638gmail_msg"><<a href="mailto:jychang48@gmail.com" class="m_-883216464682320638gmail_msg" target="_blank">jychang48@gmail.com</a>></span><span class="Apple-converted-space"> </span>wrote:<br class="m_-883216464682320638gmail_msg">
<blockquote class="gmail_quote m_-883216464682320638gmail_msg" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex;">
<div class="m_-883216464682320638gmail_msg">Hi all,
<div class="m_-883216464682320638gmail_msg"><br class="m_-883216464682320638gmail_msg">
</div>
<div class="m_-883216464682320638gmail_msg">On NERSC's Cori I have the following configure options for PETSc:</div>
<div class="m_-883216464682320638gmail_msg"><br class="m_-883216464682320638gmail_msg">
</div>
<div class="m_-883216464682320638gmail_msg">./configure --download-fblaslapack --with-cc=cc --with-clib-autodetect=0 --with-cxx=CC --with-cxxlib-autodetect=0 --with-debugging=0 --with-fc=ftn --with-fortranlib-autodetect=0 --with-mpiexec=srun --with-64-bit-indices=1
COPTFLAGS=-O3 CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 PETSC_ARCH=arch-cori-opt</div>
<div class="m_-883216464682320638gmail_msg"><br class="m_-883216464682320638gmail_msg">
</div>
<div class="m_-883216464682320638gmail_msg">Where I swapped out the default Intel programming environment with that of Cray (e.g., 'module switch PrgEnv-intel/6.0.3 PrgEnv-cray/6.0.3'). I want to document the performance difference between Cori's Haswell and
KNL processors.</div>
<div class="m_-883216464682320638gmail_msg"><br class="m_-883216464682320638gmail_msg">
</div>
<div class="m_-883216464682320638gmail_msg">When I run a PETSc example like SNES ex48 on 1024 cores (32 Haswell and 16 KNL nodes), the simulations are almost twice as fast on the Haswell nodes, which leads me to suspect that I am not doing something right for KNL.
Does anyone know of some "optimal" configure options for running PETSc on KNL?</div>
<div class="m_-883216464682320638gmail_msg"><br class="m_-883216464682320638gmail_msg">
</div>
<div class="m_-883216464682320638gmail_msg">Thanks,</div>
<div class="m_-883216464682320638gmail_msg">Justin</div>
</div>
</blockquote>
</div>
<br class="m_-883216464682320638gmail_msg">
</div>
</div>
</div>
</blockquote>
</div>
<br class="m_-883216464682320638gmail_msg">
</div>
</div>
</blockquote>
</span></div>
<br class="m_-883216464682320638gmail_msg">
</div>
</blockquote>
</span></div>
<br class="m_-883216464682320638gmail_msg">
</div>
</div>
</blockquote>
</div>
<br class="m_-883216464682320638gmail_msg">
</div>
</div>
</div>
</blockquote>
</div>
<br class="m_-883216464682320638gmail_msg">
</div>
</blockquote>
</div>
</div>
</div>
</div>
</blockquote>
</div>
<br class="">
<br clear="all" class="">
<div class=""><br class="">
</div>
--<span class="Apple-converted-space"> </span><br class="">
<div class="gmail_signature" data-smartmail="gmail_signature">What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.<br class="">
-- Norbert Wiener</div>
</div>
</div>
</div>
</blockquote>
</div>
<br class="">
</body>
</html>