NetBSD port
Kevin.Buckley at ecs.vuw.ac.nz
Wed Dec 16 20:17:02 CST 2009
>> That summary misses the whole point of the errors I am seeing.
>>
>> The code runs fine locally AND under Sun Grid Engine, if you only
>> spawn TWO processes but not FOUR or EIGHT.
>
> Well the 'np 2' runs could be scheduled on your local node [or a
> single SMP remote node].
Well, they "could be", yes; they are not, though.
Look, you need to trust me when I tell you things (except for version
numbers, ha ha).
I would not be bothering you if I had not looked into this to a
reasonable extent before deciding to bother you.
I am in control of where the jobs are running.
> And I suspect there is something wrong in your OpenMPI+SunGridEngine
> config that's triggering this problem.
I am happy to accept that and I even suggested that might be the case.
I am happy to go and look around the OpenMPI and SGE sources, if that
turns out to be the case.
However, I came to the PETSc list for some insight from the PETSc
error messages.
If they can confirm/reject the notion that it might be an SGE/OpenMPI
issue and not a PETSc one then I will have gained information.
> I don't know exactly how though..
So far, nothing has been confirmed either way.
> [the basic petsc examples are supposed to work in any valid
> MPI environment].
I don't doubt for a minute that they are supposed to.
I am also aware that few people are likely to be using this
software stack on NetBSD and thus there may be some gaps in
your map of "valid MPI environments".
> ok - mpi is shared. Can you confirm that the exact same version of
> openmpi is installed on all the nodes - and that there is no minor
> version differences that could trigger this?
Just take that as read.
Are you saying that the error messages PETSc is throwing out ARE
consistent with a slightly mismatched MPI, then?
I am building an OpenMPI with some debugging in at present. I'll get
back to you once I have rolled it out across the nodes and have
some more info.
In the meantime, if you can think of anything I can tickle PETSc with,
you being familiar with PETSc, so as to get some error messages that
might tell you something, do let me know.
--
Kevin M. Buckley Room: CO327
School of Engineering and Phone: +64 4 463 5971
Computer Science
Victoria University of Wellington
New Zealand