[ExM Users] debugging suggestions for non-static main-wrap segfault

Tim Armstrong tim.g.armstrong at gmail.com
Tue Jul 29 15:14:23 CDT 2014


proc just defines the functions; you need to call them for their bodies to
run. That is why only hello1, hello2 and hello5 are printed.
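
As a minimal sketch of that point, the snippet below runs under a plain tclsh. The names demo and greet are made up for illustration and are not part of leaf_main or turbine. The puts in the namespace body executes as soon as the file is evaluated, while the body of greet executes only at the explicit call near the end:

```tcl
namespace eval demo {
    # proc only registers the command demo::greet; its body is not run here.
    proc greet { name } {
        return "hello, $name"
    }
    # Statements placed directly in the namespace body run immediately.
    puts "namespace body evaluated"
}

# The proc body runs only when the command is actually invoked.
set greeting [ demo::greet world ]
puts $greeting
```

Run with tclsh, this prints "namespace body evaluated" followed by "hello, world". Removing the last two lines leaves only the first message, which matches the behaviour seen with the hello messages.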

What are we trying to achieve by running this file anyway?  This looks like
a set of library functions rather than the entry point for a script.

- Tim


On Tue, Jul 29, 2014 at 3:08 PM, Ketan Maheshwari <ketan at mcs.anl.gov> wrote:

> Here is the tcl script with puts messages:
>
> package provide leaf_main 0.0
>
> # dnl Receive USER_LEAF from environment for m4 processing
> set USER_LEAF dock_wrap
> puts hello1
>
> namespace eval leaf_main {
> puts hello2
>
>     proc leaf_main_wrap { rc A } {
>         deeprule $A 1 0 "leaf_main::leaf_main_wrap_impl $rc $A" type $::turbine::WORK
>     }
>
>     proc leaf_main_wrap_impl { rc A } {
>
>         global USER_LEAF
>
>         set length [ adlb::container_size $A ]
>         set tds [ adlb::enumerate $A dict all 0 ]
>         set argv [ list ]
>
>         puts hello3
>
>         # Fill argv with blanks
>         dict for { i v } $tds {
>             lappend argv 0
>         }
>         # Set values at ordered list positions
>         dict for { i v } $tds {
>             lset argv $i $v
>         }
>         set rc_value [ ${USER_LEAF}_extension {*}$argv ]
>         turbine::store_integer $rc $rc_value
>         puts hello4
>     }
>     puts hello5
> }
>
>
>
>
> It prints:
>
> hello1
> hello2
> hello5
>
> I see that it is not going into the proc leaf_main_wrap_impl, but I am not
> familiar enough with Tcl to understand why.
>
>
>
> On Tue, Jul 29, 2014 at 2:41 PM, Tim Armstrong <tim.g.armstrong at gmail.com>
> wrote:
>
>>  I don't see any reason why that invocation of tclsh would silently fail
>> to run the tcl script.  Have you attempted to confirm your hypothesis that
>> it's not running the script, for example by modifying the script to print
>> something at the beginning or end?
>>
>>
>> On Tue, Jul 29, 2014 at 1:42 PM, Ketan Maheshwari <ketan at mcs.anl.gov>
>> wrote:
>>
>>> I expect it to run the application or crash on segfault. Nothing
>>> happens.
>>>
>>>
>>>
>>>  On Tue, Jul 29, 2014 at 1:39 PM, Tim Armstrong <
>>> tim.g.armstrong at gmail.com> wrote:
>>>
>>>>   That looks right, it should run dock_wrap.tcl fine.  And it runs
>>>> successfully to completion with no output?  Is that what you expected it to
>>>> do?
>>>>
>>>>  Backtracking to your original problem, if you could work out which
>>>> "package require" statement was failing and provide some info about that
>>>> package it might help understand the issue.
>>>>
>>>>  - Tim
>>>>
>>>>
>>>>  On Tue, Jul 29, 2014 at 1:32 PM, Ketan Maheshwari <ketan at mcs.anl.gov>
>>>> wrote:
>>>>
>>>>>  I run tclsh as follows:
>>>>>
>>>>>  /home/ketan/tcl-install/bin/tclsh8.5 dock_wrap.tcl -i rigid.in
>>>>>
>>>>>  and
>>>>>
>>>>>  mpiexec -n 3 /home/ketan/tcl-install/bin/tclsh8.5 dock_wrap.tcl -i rigid.in
>>>>>
>>>>>
>>>>>  On Tue, Jul 29, 2014 at 1:28 PM, Tim Armstrong <
>>>>> tim.g.armstrong at gmail.com> wrote:
>>>>>
>>>>>>   I forgot to reply all earlier, re-including the list.
>>>>>>
>>>>>>  How are you running tclsh?
>>>>>>
>>>>>>
>>>>>> On Tue, Jul 29, 2014 at 11:53 AM, Ketan Maheshwari
>>>>>> <ketan at mcs.anl.gov> wrote:
>>>>>>
>>>>>>> When I try tclsh, it does not do anything. It just returns with
>>>>>>> exit status 0.
>>>>>>>
>>>>>>>
>>>>>>>  On Tue, Jul 29, 2014 at 11:02 AM, Tim Armstrong <
>>>>>>> tim.g.armstrong at gmail.com> wrote:
>>>>>>>
>>>>>>>>   You can run it directly with tclsh or mpiexec tclsh, which is
>>>>>>>> what turbine eventually does after setting up environment variables, etc.
>>>>>>>>
>>>>>>>>  - Tim
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Jul 29, 2014 at 10:57 AM, Ketan Maheshwari <
>>>>>>>> ketan at mcs.anl.gov> wrote:
>>>>>>>>
>>>>>>>>> Is it possible to run the dock_wrap.tcl outside of turbine just as
>>>>>>>>> in the case of static build?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>  On Tue, Jul 29, 2014 at 10:45 AM, Wozniak, Justin M. <
>>>>>>>>> wozniak at mcs.anl.gov> wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Ok, it's in.  The Swift/K SVN is apparently down so it's not on
>>>>>>>>>> the web yet but see the asciidoc.
>>>>>>>>>>
>>>>>>>>>> On 07/29/2014 10:21 AM, Justin M Wozniak wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I thought VALGRIND was in the manual already but it isn't.  I
>>>>>>>>>> will add it now.  I will also talk about our GDB feature.
>>>>>>>>>>
>>>>>>>>>> On 07/29/2014 10:17 AM, Ketan Maheshwari wrote:
>>>>>>>>>>
>>>>>>>>>> Thanks! The turbine script already had a placeholder for
>>>>>>>>>> Valgrind, so I tried that. From the output, it seems the Tcl libraries
>>>>>>>>>> are causing the segfault, but I may be wrong. Attached is the Valgrind output.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Tue, Jul 29, 2014 at 10:05 AM, Tim Armstrong <
>>>>>>>>>> tim.g.armstrong at gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>>  I don't have any particular insight into the cause of the
>>>>>>>>>>> segfault, I can help with the debugger though.
>>>>>>>>>>>
>>>>>>>>>>> You need to point gdb at the tclsh that is being used by turbine
>>>>>>>>>>> (which is just a shell script).  You can locate the correct tclsh by
>>>>>>>>>>> looking at TCLSH in scripts/turbine-config.sh in the turbine install
>>>>>>>>>>> directory.
>>>>>>>>>>>
>>>>>>>>>>>  - Tim
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>  On Tue, Jul 29, 2014 at 10:00 AM, Ketan Maheshwari <
>>>>>>>>>>> ketan at mcs.anl.gov> wrote:
>>>>>>>>>>>
>>>>>>>>>>>>  Hi,
>>>>>>>>>>>>
>>>>>>>>>>>>  I am trying to main-wrap the DOCK 6.6 application for ATPESC.
>>>>>>>>>>>> The build seems to succeed, but it fails at runtime with a segfault:
>>>>>>>>>>>>
>>>>>>>>>>>>  $ turbine -n 4 user-code.tcl
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> ===================================================================================
>>>>>>>>>>>> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>>>>>>>>>>>> =   EXIT CODE: 139
>>>>>>>>>>>> =   CLEANING UP REMAINING PROCESSES
>>>>>>>>>>>> =   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
>>>>>>>>>>>>
>>>>>>>>>>>> ===================================================================================
>>>>>>>>>>>> YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation
>>>>>>>>>>>> fault (signal 11)
>>>>>>>>>>>> This typically refers to a problem with your application.
>>>>>>>>>>>> Please see the FAQ page for debugging suggestions
>>>>>>>>>>>>
>>>>>>>>>>>>  This is on an MCS machine. Any suggestions for debugging this? I
>>>>>>>>>>>> tried gdb but it gives:
>>>>>>>>>>>>
>>>>>>>>>>>>   "/nfs2/ketan/exm-install/turbine/bin/turbine": not in
>>>>>>>>>>>> executable format: File format not recognized
>>>>>>>>>>>>
>>>>>>>>>>>>  With strace, I see some signs of missing files but not sure
>>>>>>>>>>>> if that is the cause of segfault. Attached is the strace output of:
>>>>>>>>>>>>
>>>>>>>>>>>>  strace -o strace.out turbine -n 4 user-code.tcl
>>>>>>>>>>>>
>>>>>>>>>>>>  The code has some MPI and pthread elements but does not use
>>>>>>>>>>>> them as far as I understand.
>>>>>>>>>>>>
>>>>>>>>>>>>  Thanks for any suggestions.
>>>>>>>>>>>>
>>>>>>>>>>>>  --
>>>>>>>>>>>> Ketan
>>>>>>>>>>>>
>>>>>>>>>>>>  _______________________________________________
>>>>>>>>>>>> ExM-user mailing list
>>>>>>>>>>>> ExM-user at lists.mcs.anl.gov
>>>>>>>>>>>> https://lists.mcs.anl.gov/mailman/listinfo/exm-user
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>   --
>>>>>>>>>> Justin M Wozniak
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>