NetBSD port

Kevin.Buckley at ecs.vuw.ac.nz
Wed Dec 16 20:17:02 CST 2009


>> That summary misses the whole point of the errors I am seeing.
>>
>> The code runs fine locally AND under Sun Grid Engine, if you only
>> spawn TWO processes but not FOUR or EIGHT.
>
> Well the 'np 2' runs could be scheduled on your local node [or a
> single SMP remote node].

Well, they "could be", yes: they are not though.

Look, you need to trust me when I tell you things (except for version
numbers, ha ha).

I would not be bothering you if I had not looked into this to a
reasonable extent before deciding to bother you.

I am in control of where the jobs are running.

> And I suspect there is something wrong in your OpenMPI+SunGridEngine
> config that's triggering this problem.

I am happy to accept that and I even suggested that might be the case.

I am happy to go and look around the OpenMPI and SGE sources, if that
turns out to be the case.

However, I came to the PETSc list for some insight from the PETSc
error messages.

If they can confirm/reject the notion that it might be an SGE/OpenMPI
issue and not a PETSc one then I will have gained information.


> I don't know exactly how though..

So far, nothing has been confirmed either way.

> [the basic petsc examples are supposed to work in any valid
> MPI environment].


I don't doubt for a minute that they are supposed to.

I am also aware that few people are likely to be using this
software stack on NetBSD and thus there may be some gaps in
your map of "valid MPI environments".


> ok - mpi is shared. Can you confirm that the exact same version of
> openmpi is installed on all the nodes - and that there are no minor
> version differences that could trigger this?

Just take that as read.

Are you saying that the error messages PETSc is throwing out ARE
consistent with a slightly mismatched MPI, then?
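
For anyone wanting to rule the version question in or out mechanically,
here is a minimal sketch. It assumes the output of `mpirun --version` has
already been collected from each node by whatever means is convenient; the
host names, version strings and function names below are made up for
illustration and are not taken from the cluster under discussion.

```python
from collections import defaultdict


def group_hosts_by_version(reported):
    """Map each distinct version string to the sorted hosts reporting it.

    `reported` is a dict of host name -> raw version string (hypothetical
    input, gathered separately from each node).
    """
    groups = defaultdict(list)
    for host, version in reported.items():
        # Strip surrounding whitespace so a trailing newline is not
        # mistaken for a version difference.
        groups[version.strip()].append(host)
    return {version: sorted(hosts) for version, hosts in groups.items()}


def versions_consistent(reported):
    """Return True when every host reports the same version string."""
    return len(group_hosts_by_version(reported)) <= 1


# Made-up example data: two nodes agree, one differs in the minor version.
example = {
    "node01": "mpirun (Open MPI) 1.3.3",
    "node02": "mpirun (Open MPI) 1.3.3\n",
    "node03": "mpirun (Open MPI) 1.3.4",
}

if __name__ == "__main__":
    for version, hosts in group_hosts_by_version(example).items():
        print(f"{version}: {', '.join(hosts)}")
    print("consistent" if versions_consistent(example) else "MISMATCH")
```

A check like this only compares the strings each node reports; it says
nothing about whether a mismatch would actually produce the PETSc errors
in question.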


I am building an OpenMPI with some debugging in at present. I'll get
back to you once I have rolled it out across the nodes and have
some more info.

In the meantime, if you can think of anything I can tickle PETSc with,
you being familiar with PETSc, so as to get some error messages that
might tell you something, do let me know.

-- 
Kevin M. Buckley                                  Room:  CO327
School of Engineering and                         Phone: +64 4 463 5971
 Computer Science
Victoria University of Wellington
New Zealand



