[petsc-dev] BG hang still broken in petsc-maint!
Satish Balay
balay at mcs.anl.gov
Wed Dec 18 16:42:07 CST 2013
On Wed, 18 Dec 2013, Jed Brown wrote:
> Satish Balay <balay at mcs.anl.gov> writes:
>
> > ~/.petscrc does not get loaded for me on vesta [with maint] - as there
> > is no $HOME in the run env.
>
> We still attempt it for the current directory (./.petscrc and ./petscrc).
>
> I dug into the PAMID source, but the environment variable processing is
> not something we can mess with later without some support being added
> inside MPICH/PAMID. If PETSc calls MPI_Init(), we could use setenv()
> before, though that could be surprising.
>
> We could write our own reference MPI_Bcast over point-to-point. Our
> messages are always pretty short so this is actually a decent algorithm
> and would be best for the users (most reliable, least confusing).
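The point-to-point broadcast suggested above could be organized as a binomial tree, which is a common choice for short messages. The following is only a sketch of the send/receive schedule, written without MPI so that it compiles anywhere; the helper names `bcast_parent` and `bcast_children` are hypothetical and are not PETSc or MPICH functions.

```c
#include <assert.h>

/* Hypothetical helpers for illustration only; not PETSc or MPICH code. */

/* Rank that sends the message to `rank`, or -1 if `rank` is the root. */
int bcast_parent(int rank, int root, int size)
{
    assert(size > 0 && root >= 0 && root < size && rank >= 0 && rank < size);
    int rel = (rank - root + size) % size;      /* rank relative to the root */
    for (int mask = 1; mask < size; mask <<= 1) {
        if (rel & mask)                         /* lowest set bit of rel */
            return (rel - mask + root) % size;
    }
    return -1;                                  /* rel == 0: this is the root */
}

/* Fills `children` with the ranks that `rank` forwards to, in send order.
 * Returns the number of entries written (at most max_children). */
int bcast_children(int rank, int root, int size, int *children, int max_children)
{
    assert(size > 0 && root >= 0 && root < size && rank >= 0 && rank < size);
    int rel = (rank - root + size) % size;
    int mask = 1;
    while (mask < size && !(rel & mask))        /* find lowest set bit of rel */
        mask <<= 1;
    int n = 0;
    for (mask >>= 1; mask > 0; mask >>= 1) {    /* forward to the subtrees below it */
        if (rel + mask < size && n < max_children)
            children[n++] = (rel + mask + root) % size;
    }
    return n;
}

/*
 * In an MPI program each rank would then do, roughly:
 *   p = bcast_parent(rank, root, size);
 *   if (p >= 0) MPI_Recv(buf, count, type, p, tag, comm, MPI_STATUS_IGNORE);
 *   n = bcast_children(rank, root, size, kids, 32);
 *   for (i = 0; i < n; i++) MPI_Send(buf, count, type, kids[i], tag, comm);
 */
```

With 8 ranks and root 0, the root sends to ranks 4, 2 and 1, rank 4 forwards to 6 and 5, and so on, so every rank is reached in about log2(size) steps using only MPI_Send and MPI_Recv.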
I forced ~/.petscrc to be read by using 'runjob --env-all' and ran the
following in a loop [50*3 = 150 runs?] with the basic PETSc example
src/sys/examples/tutorials/ex1.c - and there was no hang in the whole
session.
Satish
-------
[balay at cetuslac1 tutorials]$ cat ~/reserve.sh
#!/bin/bash
echo I am going to wait for an hour
sleep 3600
echo Good bye
[balay at cetuslac1 ~]$ qsub -A BGQtools_esp -n 512 -t 60 --mode script ./reserve.sh
[balay at cetuslac1 tutorials]$ cat run.sh
#!/bin/bash
for i in `seq 1 50`;
do
echo "******** $i ***************"
runjob --env-all --np 2048 --ranks-per-node 16 --cwd $PWD --block CET-00040-33371-512 : $PWD/ex1 -log_summary
runjob --env-all --np 4096 --ranks-per-node 16 --cwd $PWD --block CET-00040-33371-512 : $PWD/ex1 -log_summary
runjob --env-all --np 8192 --ranks-per-node 16 --cwd $PWD --block CET-00040-33371-512 : $PWD/ex1 -log_summary
done
[balay at cetuslac1 tutorials]$ ./run.sh
<snip>