[petsc-users] MatPermuteSparsify hanging

Barry Smith bsmith at mcs.anl.gov
Mon Nov 29 13:33:01 CST 2010


  Since it is a Cray, it should have the TotalView debugger installed. You can run your program under TotalView and then see where it is when it "hangs".

    Barry

On Nov 29, 2010, at 1:28 PM, Kelly Fermoyle wrote:

> Barry,
> 
> I am using a Cray machine, which requires that I launch the program as
> aprun -n 1 ./prog_name, and that does not seem to work with -start_in_debugger.
> 
> Jed,
> 
> Below is the output of the full backtrace:
> 
> -Kelly
> 
> 
> [Thread debugging using libthread_db enabled]
> [New Thread 47303838703712 (LWP 30863)]
> 
> Program received signal SIGINT, Interrupt.
> [Switching to Thread 47303838703712 (LWP 30863)]
> 0x00002b05c868197f in poll () from /lib64/libc.so.6
> (gdb) backtrace full
> #0  0x00002b05c868197f in poll () from /lib64/libc.so.6
> No symbol table info available.
> #1  0x0000000000405ec6 in control_loop (argc=20, argv=0x7fffe2814538) at
> aprun.c:1137
> 	l = (List) 0x58b6d0
> 	i = 4
> 	pollRet = 0
> 	ret = 0
> 	numClients = 0
> 	currNumPoll = 4
> 	connAttempt = 0
> 	controlConnClosed = 0
> 	readFail = 0
> 	apsysIndex = 0
> 	pipeInt = 0
> 	nRead = 0
> 	dropConnCheck = 0
> 	xferCnt = 0
> 	chkpntBlocked = 1
> 	apsys_socket_temp = 0
> 	buf = '\0' <repeats 16 times>,
> "?gW\000\000\000\000\000`hW\000\000\000\000\000\220?X\000\000\000\000\000
> `W\000\000\000\000\000
> `W\000\000\000\000\000\000?W\000\000\000\000\000/usr/bin/gzip --fast -c
> -- ./SolveTest.exe", '\0' <repeats 8085 times>
> 	emsg = 0x0
> 	tcpInfoMsg = 0x0
> 	tcpInfo = {tcpi_state = 0 '\0', tcpi_ca_state = 0 '\0', tcpi_retransmits
> = 0 '\0', tcpi_probes = 0 '\0',
>  tcpi_backoff = 0 '\0', tcpi_options = 0 '\0', tcpi_snd_wscale = 0 '\0',
> tcpi_rcv_wscale = 0 '\0', tcpi_rto = 0, tcpi_ato = 0,
>  tcpi_snd_mss = 0, tcpi_rcv_mss = 0, tcpi_unacked = 0, tcpi_sacked = 0,
> tcpi_lost = 0, tcpi_retrans = 0, tcpi_fackets = 0,
>  tcpi_last_data_sent = 0, tcpi_last_ack_sent = 0, tcpi_last_data_recv =
> 0, tcpi_last_ack_recv = 5731168, tcpi_pmtu = 0,
>  tcpi_rcv_ssthresh = 1, tcpi_rtt = 1, tcpi_rttvar = 0, tcpi_snd_ssthresh
> = 0, tcpi_snd_cwnd = 0, tcpi_advmss = 0, tcpi_reordering = 0}
> 	timeNow = 0
> 	lastConnLog = 0
> 	restartLaunch = (XMLRPC_REQUEST) 0x0
> #2  0x0000000000403be7 in main (argc=20, argv=0x7fffe2814538) at aprun.c:225
> 	maxNumPoll = 1024
> 	rlim = {rlim_cur = 1024, rlim_max = 1024}
> 	launch = (XMLRPC_REQUEST) 0x576980
> 	emsg = 0x0
> (gdb)
> 
>> On Mon, Nov 29, 2010 at 20:15, Kelly Fermoyle <kjf198 at cse.psu.edu> wrote:
>> 
>>> I tried it in the debugger, and it reported that aprun was calling
>>> poll(). I don't know what that means.
>>> 
>> 
>> Type "backtrace full" and send the results.
>> 
>> Jed
>> 
> 


