[petsc-dev] Anyone run STREAMS on an Apple M1 system?
Pierre Jolivet
pierre at joliv.et
Fri Oct 29 09:14:46 CDT 2021
To address your original question in case you didn’t get an answer in private, Barry, see below what I get on a Macmini9,1.
I can send you an update when I get my hands on the newer MBP if you don’t have a new batch of results by then.
Thanks,
Pierre
/Users/jolivet/petsc/arch-darwin-c-opt/bin/mpicc -o MPIVersion.o -c -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector -fno-stack-check -Qunused-arguments -fvisibility=hidden -Ofast -I/Users/jolivet/petsc/include -I/Users/jolivet/petsc/arch-darwin-c-opt/include `pwd`/MPIVersion.c
Running streams with '/Users/jolivet/petsc/arch-darwin-c-opt/bin/mpiexec ' using 'NPMAX=8'
1 58465.2862 Rate (MB/s)
2 57883.6350 Rate (MB/s) 0.990051
3 57538.6275 Rate (MB/s) 0.98415
4 57343.4226 Rate (MB/s) 0.980811
5 53210.3840 Rate (MB/s) 0.910119
6 50956.9355 Rate (MB/s) 0.871576
7 51621.7521 Rate (MB/s) 0.882947
8 48377.2931 Rate (MB/s) 0.827453
------------------------------------------------
Traceback (most recent call last):
  File "process.py", line 89, in <module>
    process(sys.argv[1],len(sys.argv)-2)
  File "process.py", line 33, in process
    speedups[i] = triads[i]/triads[0]
TypeError: 'dict_values' object is not subscriptable
make[2]: [mpistream] Error 1 (ignored)
Traceback (most recent call last):
  File "process.py", line 89, in <module>
    process(sys.argv[1],len(sys.argv)-2)
  File "process.py", line 33, in process
    speedups[i] = triads[i]/triads[0]
TypeError: 'dict_values' object is not subscriptable
make[2]: [mpistreams] Error 1 (ignored)
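
The TypeError above is the usual Python 3 symptom of indexing the result of dict.values(), which is a view rather than a list. The rest of process.py is not shown in this thread, so the following is only a minimal sketch of the likely cause and one possible fix. The name rates and the way it is filled are assumptions made for illustration; only triads and speedups appear in the traceback. The numbers are the rates reported above.

```python
# Hypothetical stand-in for the data process.py collects: triad rate (MB/s)
# keyed by the number of MPI processes. Values are copied from the run above.
rates = {
    1: 58465.2862, 2: 57883.6350, 3: 57538.6275, 4: 57343.4226,
    5: 53210.3840, 6: 50956.9355, 7: 51621.7521, 8: 48377.2931,
}

# Python 3: dict.values() returns a view object that cannot be indexed.
try:
    rates.values()[0]
except TypeError as exc:
    print("reproduced:", exc)

# One possible fix: build a real list, ordered by process count.
triads = [rates[n] for n in sorted(rates)]

# Speedup as computed on line 33 of the traceback: rate relative to 1 process.
speedups = [t / triads[0] for t in triads]

for n, (rate, speedup) in enumerate(zip(triads, speedups), start=1):
    print(f"{n} {rate:.4f} Rate (MB/s) {speedup:.6g}")
```

If the script simply does triads = something.values(), wrapping that call in list(...) would be the smallest change, provided the dictionary was filled in order of increasing process count.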
> On 25 Oct 2021, at 10:52 PM, Jed Brown <jed at jedbrown.org> wrote:
>
> This shows 240 GB/s using 10 cores (8 performance + 2 efficiency) and 224 GB/s with 8 cores (as you'd most likely run HPC apps). Good, but far from the theoretical 400 GB/s headline.
>
> https://www.anandtech.com/show/17024/apple-m1-max-performance-review/2
>
> Barry Smith <bsmith at petsc.dev> writes:
>
>> Thanks, presumably we'll see the new Macs there in a few days. BTW: the old streams benchmark page should point to this site; Google is worthless.
>>
>> I get 24 on my Intel MacBook Pro and it also saturates with 1 core.
>>
>> Barry
>>
>>
>>> On Oct 18, 2021, at 4:48 PM, Jed Brown <jed at jedbrown.org> wrote:
>>>
>>> I don't have one, but this suggests it gets about 40 GB/s and can be saturated by a single core. I believe it uses two channels of LPDDR4X-4266, which has a theoretical peak of 68 GB/s.
>>>
>>> https://browser.geekbench.com/v3/cpu/8931693
>>>
>>> The press release claims up to 400 GB/s on the Max using DDR5. I assume that's calculated based on 8 channels of LPDDR5-6400, which seems like a surprisingly big step and I'm skeptical of what will actually be realized.
>>>
>>> Barry Smith <bsmith at petsc.dev> writes:
>>>
>>>> Can anyone who owns an Apple M1 system run the MPI streams benchmark? Make sure the -O3 (or something) optimization flags are turned on.
>>>>
>>>> Thanks
>>>>
>>>> Barry
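
The theoretical peaks quoted in this thread (68 GB/s for the M1, roughly 400 GB/s for the M1 Max) can be checked with simple arithmetic: transfer rate times bus width. The sketch below assumes a 128-bit bus for the M1 (two 64-bit channels) and a 512-bit bus for the M1 Max (eight 64-bit channels). Those widths are an interpretation of the channel counts mentioned above, not figures stated in the thread, and peak_gb_s is a hypothetical helper name.

```python
def peak_gb_s(mega_transfers_per_s, bus_width_bits):
    """Theoretical peak bandwidth in GB/s (1 GB = 1000 MB)."""
    bytes_per_transfer = bus_width_bits / 8
    return mega_transfers_per_s * bytes_per_transfer / 1000

# Assumed configurations (bus widths are assumptions, see the note above).
m1_peak = peak_gb_s(4266, 128)       # LPDDR4X-4266
m1_max_peak = peak_gb_s(6400, 512)   # LPDDR5-6400

# Measured numbers taken from the thread, converted to GB/s.
m1_measured = 58465.2862 / 1000      # 1-process rate from the Macmini9,1 run
m1_max_measured = 224.0              # 8-core figure cited from AnandTech

print(f"M1:     peak {m1_peak:.1f} GB/s, measured {m1_measured:.1f} GB/s "
      f"({m1_measured / m1_peak:.0%} of peak)")
print(f"M1 Max: peak {m1_max_peak:.1f} GB/s, measured {m1_max_measured:.1f} GB/s "
      f"({m1_max_measured / m1_max_peak:.0%} of peak)")
```

Under these assumptions the Macmini9,1 run reaches a noticeably larger fraction of its theoretical peak than the 8-core M1 Max figure does.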