[petsc-users] Scalability of PETSc on vesta.alcf

Roc Wang pengxwang at hotmail.com
Tue Jan 21 10:01:45 CST 2014



> From: jed at jedbrown.org
> To: pengxwang at hotmail.com
> CC: petsc-users at mcs.anl.gov
> Subject: RE: [petsc-users] Scalability of PETSc on vesta.alcf
> Date: Mon, 20 Jan 2014 10:32:32 -0700
> 
> Roc Wang <pengxwang at hotmail.com> writes:
> >   I tried c16 for 1024 ranks and 2048 ranks, but the job cannot run
> >   successfully. It seems the job was started but the program didn't
> >   execute. Please take a look at the attached log file for 1024 with
> >   c16 mode. Is this because some environment parameters I didn't set
> >   right? Actually, the same program is only able to run with 1024
> >   ranks in c1, c2 and c32, c64 modes and 2048 ranks in c64 mode.
> 
> You have non-scalable "Generate Vector" and VecView (the latter maybe
> because you don't use MPI-IO?).  It is probably failing at this step.
> 
> | qsub -A SUGAR -t 00:10:00 -n 512 --proccount 2048 --mode script ./vesta.job
> 
> I thought you said you were trying c16?

Yes, I did. I tried it both ways: submitting the executable directly with qsub, and submitting a script. The direct command looks like this:

qsub -n 64 -t 10 --mode c16 -O p1024_c16 --env "F00=a:BAR=b" ./x.r -ksp_type bcgsl -ksp_bcgsl_ell 1 -sub_pc_type ilu -sub_pc_factor_levels 3 -sub_ksp_type preonly -my_ksp_monitor true -ksp_view -log_summary

The submission script:

#!/bin/bash

proN=1024

preName=p$proN

echo "Script JOB with Jobid COBALT_JOBID="$preName


qsub -A SUGAR -t 00:10:00 -n 64   --proccount $proN  --mode script ./vesta.job


and vesta.job:

#!/bin/sh
Nrank=1024
echo Starting Cobalt job script

LOCARGS="--block $COBALT_PARTNAME ${COBALT_CORNER:+--corner} $COBALT_CORNER ${COBALT_SHAPE:+--shape} $COBALT_SHAPE"

runjob $LOCARGS -n $Nrank -p 16 :  x.r -ksp_type bcgsl -ksp_bcgsl_ell 1 -sub_pc_type ilu -sub_pc_factor_levels 3 -sub_ksp_type preonly -my_ksp_monitor true -ksp_view -log_summary

echo End of jobscript.sh

exit 0

Neither approach ran the program successfully. In both cases the runtime log showed that the job started, but nothing was written to the stdout file.
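
As a sanity check on the node and rank counts in these commands, here is a minimal sketch that computes the ranks-per-node value implied by each submission. The helper name ranks_per_node is hypothetical, and the sketch assumes the c-mode number is simply total ranks divided by nodes (-n for qsub, matching -p for runjob).

```shell
#!/bin/sh
# Hypothetical helper, not part of Cobalt or runjob.
# Usage: ranks_per_node NODES RANKS
# Prints the ranks-per-node value implied by a node count and a total rank count.
ranks_per_node() {
    nodes=$1
    ranks=$2
    # Refuse combinations that do not divide evenly across nodes.
    if [ $((ranks % nodes)) -ne 0 ]; then
        echo "ranks ($ranks) not divisible by nodes ($nodes)" >&2
        return 1
    fi
    echo $((ranks / nodes))
}

ranks_per_node 64 1024    # script and direct c16 submissions: -n 64, 1024 ranks
ranks_per_node 16 1024    # the c64 run: -n 16, 1024 ranks
ranks_per_node 512 2048   # the quoted command: -n 512 --proccount 2048
```

Under that assumption, the first two cases give 16 and 64, while the quoted command with -n 512 and --proccount 2048 gives 4 rather than 16.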

I just ran the same program with:
qsub -n 16 -t 10 --mode c64 -O n1024_c64 --env "F00=a:BAR=b" ./x.r -ksp_type bcgsl -ksp_bcgsl_ell 1 -sub_pc_type ilu -sub_pc_factor_levels 3 -sub_ksp_type preonly -my_ksp_monitor true -ksp_view -log_summary

The job ran, and the stdout file showed all the runtime output. If the "Generate Vector" step and VecView are non-scalable (the latter maybe because I don't use MPI-IO), why is c64 mode able to run? That seems strange to me. Thanks.
 		 	   		  
-------------- next part --------------
A non-text attachment was scrubbed...
Name: n1024_c16_mode.cobaltlog
Type: application/octet-stream
Size: 6317 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20140121/435ea145/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: n1024_c16_mode.error
Type: application/octet-stream
Size: 2179 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20140121/435ea145/attachment-0001.obj>

