[codes-ross-users] Replaying OTF2 traces

Ghosh, Sayan sayan.ghosh at pnnl.gov
Wed Sep 25 10:51:45 CDT 2019


Hi, I am trying to use CODES/ROSS to replay OTF2 traces (generated with Score-P) on network models, following the latest CODES/ROSS tutorial: https://www.mcs.anl.gov/projects/codes/files/2017/08/codes-hoti-tutorial-v4.pdf

I generated the OTF2 traces of my application using 1024 processes. Ideally, I would like to replay them on my local machine with 1024 logical processes, in serial mode and without mpiexec (I assumed the replay could run that way). Perhaps my understanding is wrong here; any help would be appreciated.

WE38523:codes-bw ghos167$ ~/builds/CODES/bin/model-net-mpi-replay --sync=3 --workload_type="online" --workload_file=scorep-measurement-tmp/ --num_net_traces=1024 -- ~/sources/CODES/tests/conf/modelnet-test-dragonfly.conf
/Users/ghos167/builds/CODES/bin/model-net-mpi-replay --sync=3 --workload_type=online --workload_file=scorep-measurement-tmp/ --num_net_traces=1024 -- /Users/ghos167/sources/CODES/tests/conf/modelnet-test-dragonfly.conf

Wed Sep 25 08:43:31 2019

ROSS Version: v7.1.1

tw_net_start: Found world size to be 1
Warning: Defaulting to Sequential Simulation, not enough PEs defined.
NIC num injection port not specified, setting to 1
NIC seq delay not specified, setting to 10.000000
NIC num copy queues not specified, setting to 1

Total nodes 72 routers 36 groups 9 radix 8
within node transfer per byte delay is 0.190476
Within-node eager limit (node_eager_limit) not specified, setting to 16000

ROSS Core Configuration:
                Total PEs                                                    1
                Total KPs                                          [Nodes (1) x KPs (16)] 16
                Total LPs                                                  180
                Simulation End Time                                3600000000000.00
                LP-to-PE Mapping                                   model defined


ROSS Event Memory Allocation:
                Model events                                             46081
                Network events                                              16
                Total events                                             46096

Assertion failed: (num_net_traces <= num_mpi_lps), function nw_test_init, file src/network-workloads/model-net-mpi-replay.c, line 2107.
Abort trap: 6
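
For reference, the assertion can be read against the counts printed above. This is only a minimal sketch: it assumes the 180 LPs split into 72 terminal LPs, 36 router LPs, and 72 MPI workload LPs, which is an inference from the log and not something the output states. The helper names are hypothetical; only num_net_traces and num_mpi_lps come from the assertion message.

```python
def workload_lps(total_lps: int, nodes: int, routers: int) -> int:
    """LPs left for MPI ranks once terminals and routers are subtracted.

    ASSUMPTION: every LP that is not a terminal or a router is a workload LP.
    """
    return total_lps - nodes - routers


def traces_fit(num_net_traces: int, num_mpi_lps: int) -> bool:
    """Same condition as the failing assertion: num_net_traces <= num_mpi_lps."""
    return num_net_traces <= num_mpi_lps


# Counts taken from the log: 180 LPs in total, 72 nodes, 36 routers.
num_mpi_lps = workload_lps(total_lps=180, nodes=72, routers=36)
print(num_mpi_lps, traces_fit(1024, num_mpi_lps))
```

If that split is right, the configured network offers far fewer workload LPs than the 1024 traces requested, which would explain the abort regardless of whether mpiexec is used.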

--
Sayan

Data Sciences
Pacific Northwest National Laboratory
