<div dir="ltr"><div>Hi Mihael,</div><div><br></div>Please find the strace output from _swiftwrap attached. It still gives the same error when run with the -f switch, though.<div><br></div><div>Thanks,</div><div>Ketan</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Dec 8, 2014 at 3:36 PM, Mihael Hategan <span dir="ltr"><<a href="mailto:hategan@mcs.anl.gov" target="_blank">hategan@mcs.anl.gov</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">Again, can you put the strace call in _swiftwrap rather than bg.sh?<br>
<br>
Also, can you paste the exact line that you used to run strace? You are<br>
asking me to debug an invisible program.<br>
<br>
</span>Mihael<br>
<span class="im HOEnZb"><br>
On Mon, 2014-12-08 at 15:26 -0600, Ketan Maheshwari wrote:<br>
</span><div class="HOEnZb"><div class="h5">> Hi Mihael,<br>
><br>
> The strace command is not accepting the -f option. From the strace man<br>
> page, I see that the option relates to forked processes, which might be<br>
> why it causes an error on BG/Q. Here is the error message:<br>
><br>
> Execution failed:<br>
> Exception in strace:<br>
> Arguments: [-fo, /home/ketan/strace.f.out,<br>
> /home/ketan/SwiftApps/subjobs/bg.sh,<br>
> /soft/applications/lammps/24Apr13/lmp_bgq_xlomp, -in, input.lammps]<br>
> Host: cluster<br>
> Directory: workflow.bgq-run016/jobs/r/strace-rqnmne1m<br>
> exception @ swift-int-staging.k, line: 181<br>
> Caused by: The following output files were not created by the application:<br>
> lammps.dump<br>
><br>
> ------- Application STDERR --------<br>
> 2014-12-08 21:20:43.872 (INFO ) [0xfff7c25bde0] ibm.runjob.AbstractOptions:<br>
> using properties file /bgsys/local/etc/bg.properties<br>
> 2014-12-08 21:20:43.874 (INFO ) [0xfff7c25bde0] ibm.runjob.AbstractOptions:<br>
> max open file descriptors: 65536<br>
> 2014-12-08 21:20:43.874 (INFO ) [0xfff7c25bde0] ibm.runjob.AbstractOptions:<br>
> core file limit: 18446744073709551615<br>
> 2014-12-08 21:20:43.876 (INFO ) [0xfff7c25bde0] 27211:tatu.runjob.client:<br>
> scheduler job id is 377978<br>
> log4cxx: No appender could be found for logger (tatu.runjob.monitor).<br>
> log4cxx: Please initialize the log4cxx system properly.<br>
> 2014-12-08 21:20:43.912 (FATAL) [0xfff7c25bde0] 27211:tatu.runjob.client:<br>
> failed reading: Connection reset by peer<br>
> 2014-12-08 21:20:43.912 (FATAL) [0xfff7c25bde0] 27211:tatu.runjob.client:<br>
> protocol version exchange between the runjob client and monitor failed<br>
> -----------------------------------<br>
><br>
> Thanks,<br>
> Ketan<br>
><br>
> On Mon, Dec 8, 2014 at 3:09 PM, Mihael Hategan <<a href="mailto:hategan@mcs.anl.gov">hategan@mcs.anl.gov</a>> wrote:<br>
><br>
> > On Mon, 2014-12-08 at 14:07 -0600, Ketan Maheshwari wrote:<br>
> > > I tried to get strace output with two methods:<br>
> > ><br>
> > > stderr.txt: This was obtained by attaching the "--strace 0" switch to the<br>
> > > runjob command. It seems to be exiting normally after writing a bunch of<br>
> > > stuff.<br>
> > ><br>
> > > strace.out: This one was obtained by wrapping the app exe with strace -o<br>
> > > $HOME/strace.out ...<br>
> ><br>
> > Are you sure? It looks like you wrapped the execution of bg.sh in<br>
> > strace. This log only tells us that bg.sh starts runjob and runjob never<br>
> > completes, which we already know. You probably want to go to the lowest<br>
> > level possible. But see below (*).<br>
> ><br>
> > ><br>
> > > This one shows a stuck output with the last line as:<br>
> > ><br>
> > > waitpid(-1, %<br>
> ><br>
> > waitpid means it's waiting for a subprocess, so this isn't useful<br>
> > because we want to find out what the leaf subprocess is hanging on. You<br>
> > could use the '-f' argument to strace to make it follow subprocesses. If<br>
> > you do that, it probably won't matter (aside from noise) at what level<br>
> > you use strace (*).<br>
> ><br>
> > Mihael<br>
> ><br>
> > _______________________________________________<br>
> > Swift-devel mailing list<br>
> > <a href="mailto:Swift-devel@ci.uchicago.edu">Swift-devel@ci.uchicago.edu</a><br>
> > <a href="https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel" target="_blank">https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel</a><br>
> ><br>
<br>
<br>
</div></div></blockquote></div><br></div>