[petsc-dev] Anyone run STREAMS on an Apple M1 system?

Pierre Jolivet pierre at joliv.et
Fri Oct 29 09:14:46 CDT 2021


To address your original question in case you didn’t get an answer in private, Barry, see below for what I get on a Macmini9,1.
I can send you an update when I get my hands on the newer MBP if you don’t have a new batch of results by then.

Thanks,
Pierre 

/Users/jolivet/petsc/arch-darwin-c-opt/bin/mpicc -o MPIVersion.o -c -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector -fno-stack-check -Qunused-arguments -fvisibility=hidden -Ofast    -I/Users/jolivet/petsc/include -I/Users/jolivet/petsc/arch-darwin-c-opt/include    `pwd`/MPIVersion.c
Running streams with '/Users/jolivet/petsc/arch-darwin-c-opt/bin/mpiexec ' using 'NPMAX=8'
1  58465.2862   Rate (MB/s)
2  57883.6350   Rate (MB/s) 0.990051
3  57538.6275   Rate (MB/s) 0.98415
4  57343.4226   Rate (MB/s) 0.980811
5  53210.3840   Rate (MB/s) 0.910119
6  50956.9355   Rate (MB/s) 0.871576
7  51621.7521   Rate (MB/s) 0.882947
8  48377.2931   Rate (MB/s) 0.827453
------------------------------------------------
Traceback (most recent call last):
  File "process.py", line 89, in <module>
    process(sys.argv[1],len(sys.argv)-2)
  File "process.py", line 33, in process
    speedups[i] = triads[i]/triads[0]
TypeError: 'dict_values' object is not subscriptable
make[2]: [mpistream] Error 1 (ignored)
Traceback (most recent call last):
  File "process.py", line 89, in <module>
    process(sys.argv[1],len(sys.argv)-2)
  File "process.py", line 33, in process
    speedups[i] = triads[i]/triads[0]
TypeError: 'dict_values' object is not subscriptable
make[2]: [mpistreams] Error 1 (ignored)
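
The two tracebacks above are what Python 3 raises when a `dict.values()` view is indexed. The rates were still printed, so only the post-processing step failed. The contents of process.py are not shown here, so what follows is only a minimal sketch of a likely fix, assuming `triads` was built from `dict.values()`. The dictionary layout and the `compute_speedups` helper are hypothetical. It recomputes the ratio column from the rates printed above.

```python
# Rates (MB/s) copied from the run above, keyed by number of MPI processes.
rates = {
    1: 58465.2862,
    2: 57883.6350,
    3: 57538.6275,
    4: 57343.4226,
    5: 53210.3840,
    6: 50956.9355,
    7: 51621.7521,
    8: 48377.2931,
}


def compute_speedups(rates_by_np):
    """Return each rate divided by the single-process rate (hypothetical helper)."""
    # In Python 3, dict.values() returns a view that cannot be indexed,
    # so materialise it as a list first. Dicts keep insertion order
    # (Python 3.7+), so index 0 is the single-process rate.
    triads = list(rates_by_np.values())
    return [t / triads[0] for t in triads]


speedups = compute_speedups(rates)
for nprocs, ratio in zip(rates, speedups):
    print(f"{nprocs}  {ratio:.6f}")
```

Wrapping the view in `list(...)` is the smallest change that restores the Python 2 behaviour; whether it matches the actual structure of process.py is an assumption.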

> On 25 Oct 2021, at 10:52 PM, Jed Brown <jed at jedbrown.org> wrote:
> 
> This shows 240 GB/s using 10 cores (8 performance + 2 efficiency) and 224 GB/s with 8 cores (as you'd most likely run HPC apps). Good, but far from the theoretical 400 GB/s headline.
> 
> https://www.anandtech.com/show/17024/apple-m1-max-performance-review/2
> 
> Barry Smith <bsmith at petsc.dev> writes:
> 
>>  Thanks, presumably we'll see the new Macs there in a few days. BTW: the old streams benchmark page should point to this site; google is worthless.
>> 
>>  I get 24 on my Intel MacBook Pro and it also saturates with 1 core.
>> 
>>  Barry
>> 
>> 
>>> On Oct 18, 2021, at 4:48 PM, Jed Brown <jed at jedbrown.org> wrote:
>>> 
>>> I don't have one, but this suggests it gets about 40 GB/s and can be saturated by a single core. I believe it uses two channels of LPDDR4X-4266, which has a theoretical peak of 68 GB/s.
>>> 
>>> https://browser.geekbench.com/v3/cpu/8931693
>>> 
>>> The press release claims up to 400 GB/s on the Max using DDR5. I assume that's calculated based on 8 channels of LPDDR5-6400, which seems like a surprisingly big step and I'm skeptical of what will actually be realized.
>>> 
>>> Barry Smith <bsmith at petsc.dev> writes:
>>> 
>>>>  Can anyone who owns an Apple M1 system run the MPI streams benchmark? Make sure the -O3 (or something) optimization flags are turned on.
>>>> 
>>>> Thanks
>>>> 
>>>>   Barry
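
The theoretical peaks quoted in the thread (68 GB/s for two channels of LPDDR4X-4266, about 400 GB/s for eight channels of LPDDR5-6400) follow from transfer rate times bus width. A rough check, assuming each "channel" is 64 bits wide, which is an assumption chosen because it reproduces the quoted 68 GB/s; the function name is hypothetical:

```python
def peak_bandwidth_gb_s(mega_transfers_per_s, channels, bits_per_channel=64):
    """Theoretical peak in GB/s: transfers/s times bytes moved per transfer."""
    bytes_per_transfer = channels * bits_per_channel / 8
    # MT/s * bytes gives MB/s; divide by 1000 for GB/s (decimal units).
    return mega_transfers_per_s * bytes_per_transfer / 1000


m1_peak = peak_bandwidth_gb_s(4266, channels=2)      # about 68.3 GB/s
m1_max_peak = peak_bandwidth_gb_s(6400, channels=8)  # 409.6 GB/s

# Single-process STREAMS rate reported above for the Macmini9,1, in GB/s.
measured_m1 = 58465.2862 / 1000

print(f"M1 peak:     {m1_peak:.1f} GB/s")
print(f"M1 Max peak: {m1_max_peak:.1f} GB/s")
print(f"M1 measured / peak: {measured_m1 / m1_peak:.2f}")
```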


