[AG-TECH] DPPT3 Failure - Conflict description and a fix
Rick Stevens
stevens at mcs.anl.gov
Thu Oct 18 21:07:52 CDT 2001
Awesome debugging!!
At 05:57 PM 10/17/2001 -0500, Marty Hoag wrote:
> For the past few months several of us have had some strange problems
>with DPPT. Our clients would fail for no apparent reason. While the
>last message is something like "nullpointer exception", we noticed that
>a few lines up there is a message like
>
> Caught JSDT exception: name in use
>
>We had this problem on Monday during a virtual conference in genomics
>and bioinformatics for which we were the hub. Cindy Sievers at LANL
>mentioned a couple of times in the moo that her client was failing, but
>then later NDSU's failed and hers worked. This pattern was consistent
>throughout the day. We even tried rebooting our display at one point.
>Kay Gunn was great in calling out slide changes but it wasn't so fun
>for Cindy or Jim Senechal here who had to manually keep up (Jim
>was too far from the screen to read the numbers so we put binoculars
>on our AG Node large event requirements list).
>
> It struck me that this sounded like some sort of resource conflict
>between LANL and NDSU. We looked at the agserv command window and
>noticed the clients are identified by userid and host name. We
>saw that ours was coming in as ag@agdisplay. Cindy had the same
>combination! Bob Olson has verified that this is indeed a problem
>by checking the code.
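
To illustrate the conflict described above, here is a minimal Python
sketch of a server that tracks clients by a "userid@hostname" ID and
rejects duplicates. This is not the DPPT or JSDT code; the Registry
class, NameInUseError, and try_join are hypothetical names chosen only
to show why two sites with the same userid and an unqualified host name
cannot both be connected.

```python
class NameInUseError(Exception):
    """Hypothetical error raised when a client ID is already registered."""


class Registry:
    """Toy stand-in for a server that identifies clients by userid@hostname."""

    def __init__(self):
        self._clients = set()

    def join(self, userid, hostname):
        # The client ID is built only from userid and hostname, so two
        # machines that report the same short hostname look identical.
        client_id = f"{userid}@{hostname}"
        if client_id in self._clients:
            raise NameInUseError(f"name in use: {client_id}")
        self._clients.add(client_id)
        return client_id


def try_join(registry, userid, hostname):
    """Return the client ID on success, or None if the name is taken."""
    try:
        return registry.join(userid, hostname)
    except NameInUseError:
        return None
```

In this sketch, a second "ag" client on a host named "agdisplay" is
refused, while the same userid on a fully qualified name such as
agdisplay.foo.bar.edu is accepted alongside it.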
>
> While I think we had the Windows "DNS" settings set up to use both a
>hostname and domain, we did NOT have that set up in the "Net ID" field
>of Windows. We think we have found a quick fix for this to avoid any
>conflict. There could be some strange implications if you are using
>Windows networking in some way, but we made the change Monday evening
>and ran all day Tuesday with LANL and NDSU coexisting and had no other
>problems.
>
> Here is the "quick fix" we used for W2K (we are at SP2 but I don't
>think that should matter). Please note that I'm not a Windows
>expert, so my terminology might be wrong, but I think I have our
>procedures down. Use at your own risk (at least until independently
>confirmed):
>
>1) Go to Start / Settings / Control Panel / System
>
>2) Click on the Network Identification tab. If your "Full computer
>name" already is fully qualified then your machine should be
>uniquely identified (at least for this problem). If not go on...
>
>3) Click on the Properties button.
>
>4) The Computer Name is just the first element of the name. To add a
>domain to qualify this click on the More... button.
>
>5) For primary DNS suffix for this computer enter your domain. E.g.
>if the machine is agdisplay.foo.bar.edu then you'd add foo.bar.edu
>here.
>
>6) We UNCHECKED the "Change primary DNS suffix when domain membership
>changes" because we didn't understand it and didn't want to do something
>we didn't understand.
>
>7) We OKed those and, I think, had to reboot. Then when we connected to
>our agserv we saw our Client ID as ag@agdisplay.ndsu.nodak.edu.
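
Before walking through the steps above, a site may want to check
whether its machine already reports a fully qualified name. The
following Python sketch shows one way to do that. It assumes the name
returned by the operating system is what ends up in the client ID; how
the DPPT client actually obtains its host name is not confirmed here.
The helper names is_fully_qualified, qualify, and
local_hostname_is_qualified are hypothetical.

```python
import socket


def is_fully_qualified(hostname):
    """True if the name has a host part plus at least one domain label."""
    parts = hostname.rstrip(".").split(".")
    return len(parts) >= 2 and all(parts)


def qualify(hostname, dns_suffix):
    """Append dns_suffix to a short hostname; leave qualified names alone."""
    if is_fully_qualified(hostname):
        return hostname
    return f"{hostname}.{dns_suffix.strip('.')}"


def local_hostname_is_qualified():
    """Check the name this machine reports for itself."""
    return is_fully_qualified(socket.gethostname())
```

For example, qualify("agdisplay", "foo.bar.edu") gives
"agdisplay.foo.bar.edu", which matches the effect of entering
foo.bar.edu as the primary DNS suffix in step 5.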
>
> More sites are joining the AG and using more standardized software
>and documentation, which probably explains why this seems to be a
>more common problem (or maybe Cindy and NDSU are the only ones
>who are "uncreative").
>
> Given work on a future replacement for DPPT (rppt) and the apparent
>ease of fixing this (we probably need more than one site to try the
>fix to confirm that), I'm not sure it is worth spending time rewriting
>DPPT at this point. But that would be for others to decide.
>
> At least LANL and NDSU should be able to both be clients to the same
>dppt server now. ;-) And this is yet another proof of the value of
>collaboration and tools like the moo! I owe Cindy a couple breakfast
>burritos.
>
> Marty