   512  1000    16.82   29.03
  1024  1000    32.84   29.74
  2048  1000    30.52   64.00
  4096  1000    35.83  109.01
  8192  1000    70.04  111.55
 16384  1000   138.96  112.44
 32768  1000   277.05  112.80
 65536   640   553.24  112.97
131072   320  1106.60  112.96
262144   160  2211.18  113.06
[proxy:0:0@wci30] requesting checkpoint
[proxy:0:1@wci31.cse.ohio-state.edu] requesting checkpoint
[proxy:0:1@wci31.cse.ohio-state.edu] checkpoint completed
Fatal error in MPI_Allreduce: Other MPI error, error stack:
MPI_Allreduce(826)........: MPI_Allreduce(sbuf=0x7fff67216aa8, rbuf=0x7fff67216aa0, count=1, MPI_DOUBLE, MPI_MAX, comm=0x84000000) failed
MPIR_Allreduce_impl(684)..:
MPIR_Allreduce_intra(363).:
MPIC_Sendrecv(189)........:
MPIC_Wait(528)............:
MPIDI_CH3I_Progress(254)..:
MPIDI_nem_ckpt_finish(459): sem_wait() failed Interrupted system call
[proxy:0:0@wci30] checkpoint completed
Fatal error in MPI_Allreduce: Other MPI error, error stack:
MPI_Allreduce(826).............: MPI_Allreduce(sbuf=0x7fff87e7ece8, rbuf=0x7fff87e7ece0, count=1, MPI_DOUBLE, MPI_MAX, comm=0x84000000) failed
MPIR_Allreduce_impl(684).......:
MPIR_Allreduce_intra(363)......:
MPIC_Sendrecv(186).............:
MPIC_Wait(528).................:
MPIDI_CH3I_Progress(335).......:
MPID_nem_mpich2_test_recv(747).:
MPID_nem_tcp_connpoll(1843)....:
state_commrdy_handler(1674)....:
MPID_nem_tcp_recv_handler(1653): Communication error with rank 1
MPID_nem_tcp_recv_handler(1554): socket closed
APPLICATION TERMINATED WITH THE EXIT STRING: Hangup (signal 1)