bombing out writing large scratch files

Satish Balay balay at mcs.anl.gov
Sat May 27 23:33:41 CDT 2006


Looks like you have direct access to all the cluster nodes. Perhaps
you have admin access? You can do either of the following:

 * If the cluster frontend/compute nodes share a common filesystem [i.e.,
 all machines can see the same file for ~/.Xauthority] and you can get
 the 'sshd' settings on the frontend changed - then:

 - configure sshd with 'X11UseLocalhost no' - this way xterms on the
   compute-nodes can connect to the 'ssh-x11' port on the frontend 
 - run the PETSc app with: '-display frontend:ssh-x11-port'

 * However, if the above is not possible - but you can ssh directly to
  all the compute nodes [perhaps from the frontend] - then you can
  cascade x11 forwarding with:

 - ssh from desktop to frontend
 - ssh from frontend to node-9 [if you know which machine is node-9
   from the machine file.]
 - If you don't know which one is node-9 - then ssh from frontend
   to all the nodes :). Most likely all nodes will get the display 'localhost:10.0'
 - so now you can run the executable with the option
       -display localhost:10.0
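
For the first option, here is a rough sketch of how the '-display' value
could be put together. With 'X11UseLocalhost no', the DISPLAY that sshd
sets in your frontend login session carries the display number the
compute nodes need, and only the host part has to name the frontend.
The helper name frontend_display is made up for illustration [it is not
part of PETSc or ssh], and the sketch assumes the compute nodes can
resolve the frontend's hostname and see the same ~/.Xauthority.

```shell
# Hypothetical helper: rewrite a DISPLAY value so that it names the frontend.
# usage: frontend_display HOST DISPLAY_VALUE
frontend_display() {
    host=$1
    disp=$2
    # Drop everything up to and including the last ':' and keep the
    # "number.screen" part (e.g. "10.0"), then put the frontend host in front.
    printf '%s:%s\n' "$host" "${disp##*:}"
}

# Possible use on the frontend, inside an 'ssh -X' session [assumption:
# hostname prints a name the compute nodes can resolve]:
#   -display "$(frontend_display "$(hostname)" "$DISPLAY")"
```

For the cascaded option none of this is needed: each hop is just
'ssh -X', and 'echo $DISPLAY' on the node shows the value to pass.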

The other alternative that might work [for interactive runs] is:

-start_in_debugger noxterm -debugger_nodes 9
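
With 'noxterm' no X display is involved: the debugger runs in the
terminal the job was started from, which is why this only suits
interactive runs. A small sketch of where the options go on the command
line follows. The wrapper name debug_cmd is hypothetical, and it only
prints the command instead of running it; the mpirun arguments in the
example are placeholders, not a recommendation for your setup.

```shell
# Hypothetical dry-run wrapper: print an mpirun command line with the
# debugger options appended after the executable and its own options.
# usage: debug_cmd RANK MPIRUN_ARGS... EXECUTABLE [APP_OPTIONS...]
debug_cmd() {
    rank=$1
    shift
    echo mpirun "$@" -start_in_debugger noxterm -debugger_nodes "$rank"
}

# Example: show the command for debugging rank 9 of a 20-process run.
debug_cmd 9 -np 20 -machinefile machines ./app
```

Whether the debugger's input actually reaches the chosen rank depends on
how your mpirun forwards stdin, so treat this as something to try rather
than a guaranteed fix.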

Satish

On Sat, 27 May 2006, Randall Mackie wrote:

> I can't seem to get the debugger to pop up on my screen.
> 
> When I'm logged into the cluster I'm working on, I can
> type xterm &, and an xterm pops up on my display. So I know
> I can get something from the remote cluster.
> 
> Now, when I try this using PETSc, I'm getting the following error
> message, for example:
> 
> ------------------------------------------------------------------------
> [17]PETSC ERROR: PETSC: Attaching gdb to
> /home/randy/d3inv/PETSC_V3.3/d3inv_3_3_petsc of pid 3628 on display
> 24.5.142.138:0.0 on machine compute-0-23.local
> ------------------------------------------------------------------------
> 
> I'm using this in my command file:
> 
> source ~/.bashrc
> time /opt/mpich/intel/bin/mpirun -np 20 -nolocal -machinefile machines \
>          /home/randy/d3inv/PETSC_V3.3/d3inv_3_3_petsc \
>          -start_in_debugger \
>          -debugger_node 1 \
>          -display 24.5.142.138:0.0 \
>          -em_ksp_type bcgs \
>          -em_sub_pc_type ilu \
>          -em_sub_pc_factor_levels 8 \
>          -em_sub_pc_factor_fill 4 \
>          -em_sub_pc_factor_reuse_ordering \
>          -em_sub_pc_factor_reuse_fill \
>          -em_sub_pc_factor_mat_ordering_type rcm \
>          -divh_ksp_type cr \
>          -divh_sub_pc_type icc \
>          -ppc_sub_pc_type ilu \
> << EOF



