bombing out writing large scratch files

Randall Mackie randy at geosystem.us
Sun May 28 09:22:49 CDT 2006


Satish,

Thanks, using method (2) worked. However, when I run a backtrace (bt) in gdb,
I get the following output:

Loaded symbols for /lib/libnss_files.so.2
0x080b2631 in d3inv_3_3 () at d3inv_3_3.F:2063
2063          call VecAssemblyBegin(xyz,ierr)
(gdb) cont
Continuing.

Program received signal SIGUSR1, User defined signal 1.
[Switching to Thread 1082952160 (LWP 23496)]
0x088cd729 in _intel_fast_memcpy.J ()
Current language:  auto; currently fortran
(gdb) bt
#0  0x088cd729 in _intel_fast_memcpy.J ()
#1  0x40620628 in for_write_dir_xmit ()
    from /opt/intel_fc_80/lib/libifcore.so.5
#2  0xbfffa6b0 in ?? ()
#3  0x00000008 in ?? ()
#4  0xbfff986c in ?? ()
#5  0xbfff9890 in ?? ()
#6  0x406873a8 in __dtors_list_end () from /opt/intel_fc_80/lib/libifcore.so.5
#7  0x00000002 in ?? ()
#8  0x00000000 in ?? ()
(gdb)

This all makes me think this is an Intel compiler bug and has nothing to
do with my code.
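
One thing I still want to rule out before blaming the compiler (this is
only a guess about our MPICH build, not something the backtrace proves):
some MPI layers, e.g. MPICH's ch_p4 device, use SIGUSR1 internally, so
gdb may just be stopping on a signal the program would otherwise handle.
A minimal sketch of telling gdb to pass it through and keep going:

    (gdb) handle SIGUSR1 nostop noprint pass
    (gdb) cont

If the run then continues (or fails somewhere more informative), the
SIGUSR1 stop was debugger noise rather than the real failure in the
scratch-file write.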

Any ideas?

Randy


Satish Balay wrote:
> Looks like you have direct access to all the cluster nodes. Perhaps
> you have admin access? You can do either of the following:
> 
>  * if the cluster frontend/compute nodes have a common filesystem [i.e.
>  all machines can see the same ~/.Xauthority file] and you can get the
>  'sshd' settings on the frontend changed - then:
> 
>  - configure sshd with 'X11UseLocalhost no' - this way xterms on the
>    compute-nodes can connect to the 'ssh-x11' port on the frontend 
>  - run the PETSc app with: '-display frontend:ssh-x11-port'
> 
>  * However, if the above is not possible - but you can ssh directly to
>   all the compute nodes [perhaps from the frontend] - then you can
>   cascade X11 forwarding with:
> 
>  - ssh from desktop to frontend
>  - ssh from the frontend to node-9 [if you know which machine is node-9
>    from the machine file]
>  - If you don't know which one is node-9 - then ssh from the frontend
>    to all the nodes :). Most likely all nodes will get a display 'localhost:10.0'
>  - so now you can run the executable with the option
>        -display localhost:10.0
> 
> The other alternative that might work [for interactive runs] is:
> 
> -start_in_debugger noxterm -debugger_nodes 9
> 
> Satish
> 
> On Sat, 27 May 2006, Randall Mackie wrote:
> 
>> I can't seem to get the debugger to pop up on my screen.
>>
>> When I'm logged into the cluster I'm working on, I can
>> type xterm &, and an xterm pops up on my display. So I know
>> I can get something from the remote cluster.
>>
>> Now, when I try this using PETSc, I'm getting the following error
>> message, for example:
>>
>> ------------------------------------------------------------------------
>> [17]PETSC ERROR: PETSC: Attaching gdb to
>> /home/randy/d3inv/PETSC_V3.3/d3inv_3_3_petsc of pid 3628 on display
>> 24.5.142.138:0.0 on machine compute-0-23.local
>> ------------------------------------------------------------------------
>>
>> I'm using this in my command file:
>>
>> source ~/.bashrc
>> time /opt/mpich/intel/bin/mpirun -np 20 -nolocal -machinefile machines \
>>          /home/randy/d3inv/PETSC_V3.3/d3inv_3_3_petsc \
>>          -start_in_debugger \
>>          -debugger_node 1 \
>>          -display 24.5.142.138:0.0 \
>>          -em_ksp_type bcgs \
>>          -em_sub_pc_type ilu \
>>          -em_sub_pc_factor_levels 8 \
>>          -em_sub_pc_factor_fill 4 \
>>          -em_sub_pc_factor_reuse_ordering \
>>          -em_sub_pc_factor_reuse_fill \
>>          -em_sub_pc_factor_mat_ordering_type rcm \
>>          -divh_ksp_type cr \
>>          -divh_sub_pc_type icc \
>>          -ppc_sub_pc_type ilu \
>> << EOF
> 
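
For reference, here is what the cascaded-forwarding suggestion above looks
like concretely on our setup (the 'frontend' name is a placeholder, and I'm
only assuming node-9 maps to compute-0-23):

    # from my desktop, with X11 forwarding
    ssh -X randy@frontend
    # from the frontend, hop to the compute node running the debugger
    ssh -X randy@compute-0-23
    echo $DISPLAY        # expect something like localhost:10.0

and then run the PETSc executable with '-display localhost:10.0', or skip
the xterm entirely with '-start_in_debugger noxterm -debugger_nodes 9'.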

-- 
Randall Mackie
GSY-USA, Inc.
PMB# 643
2261 Market St.,
San Francisco, CA 94114-1600
Tel (415) 469-8649
Fax (415) 469-5044

California Registered Geophysicist
License No. GP 1034
