[Darshan-users] darshan3.1.5 issue on Cray XC40 cle6.up05
Wadud Miah
wadud.miah at nag.co.uk
Mon Feb 26 09:23:32 CST 2018
Hi,
It doesn’t look like the code is doing any MPI itself, and it is being built with the plain compiler command (not the MPI compilation wrappers), hence the errors about undefined MPI symbols. There’s nothing conclusive that this is a Darshan issue.
Wadud.
From: Darshan-users [mailto:darshan-users-bounces at lists.mcs.anl.gov] On Behalf Of Phil Carns
Sent: 26 February 2018 15:18
To: darshan-users at lists.mcs.anl.gov
Subject: Re: [Darshan-users] darshan3.1.5 issue on Cray XC40 cle6.up05
Hi Bilel,
Thanks for the bug report. We can try to reproduce here and confirm.
In the meantime, can you tell me whether the test_scalapack.f90 code uses MPI or not?
thanks,
-Phil
On 02/24/2018 02:25 PM, Bilel Hadri wrote:
Dear Darshan colleagues,
I recently installed Darshan 3.1.5 on Shaheen, a Cray XC40; we recently upgraded the OS to CLE6.UP05 and are using the 17.12 PrgEnv.
Compiling ScaLAPACK code with Cray LibSci fails with the error shown below under all programming environments. A similar issue was observed for other codes, such as a simple PETSc code.
After some digging, it seems to be related to the Darshan 3.1.5 version recently installed on Shaheen: when Darshan is unloaded, the compilation works fine with no issue.
The error does not appear when using Darshan 3.1.4.
ftn -o exe test_scalapack.f90
/opt/cray/dmapp/default/lib64/libdmapp.a(dmapp_internal.o): In function `_dmappi_is_pure_dmapp_job':
/home/abuild/rpmbuild/BUILD/cray-dmapp-7.1.1/src/dmapp_internal.c:1401: undefined reference to `__wrap_MPI_Init'
/opt/cray/pe/libsci/17.12.1/CRAY/8.6/x86_64/lib/libsci_cray_mpi_mp.a(blacs_exit_.o): In function `blacs_exit_':
/b/worker/csml-libsci-sles/build/mp/scalapack/BLACS/SRC/blacs_exit_.c:42: undefined reference to `__wrap_MPI_Finalize'
/opt/cray/pe/libsci/17.12.1/CRAY/8.6/x86_64/lib/libsci_cray_mpi_mp.a(blacs_pinfo_.o): In function `blacs_pinfo_':
/b/worker/csml-libsci-sles/build/mp/scalapack/BLACS/SRC/blacs_pinfo_.c:18: undefined reference to `__wrap_MPI_Init'
/opt/cray/pe/libsci/17.12.1/CRAY/8.6/x86_64/lib/libsci_cray_mpi_mp.a(blacs_pinfo_.o): In function `Cblacs_pinfo':
/b/worker/csml-libsci-sles/build/mp/scalapack/BLACS/SRC/blacs_pinfo_.c:18: undefined reference to `__wrap_MPI_Init'
/opt/cray/pe/cce/8.6.5/binutils/x86_64/x86_64-pc-linux-gnu/bin/ld: link errors found, deleting executable `exe'
/usr/bin/sha1sum: exe: No such file or directory
hadrib at cdl1:~> ll /usr/bin/sha1
sha1pass sha1sum
hadrib at cdl1:~> ll /usr/bin/sha1sum
-rwxr-xr-x 1 root root 43912 Aug 6 2016 /usr/bin/sha1sum
hadrib at cdl1:~> which sha1sum
/usr/bin/sha1sum
hadrib at cdl1:~> module list
Currently Loaded Modulefiles:
1) modules/3.2.10.6 9) pmi/5.0.13 17) atp/2.1.1
2) eproxy/2.0.22-6.0.5.0_2.1__g1ebe45c.ari 10) dmapp/7.1.1-6.0.5.0_49.8__g1125556.ari 18) perftools-base/7.0.0
3) cce/8.6.5 11) gni-headers/5.0.12-6.0.5.0_2.15__g2ef1ebc.ari 19) PrgEnv-cray/6.0.4
4) craype-network-aries 12) xpmem/2.2.4-6.0.5.0_4.8__g35d5e73.ari 20) cray-mpich/7.7.0
5) craype/2.5.13 13) job/2.2.2-6.0.5.0_8.47__g3c644b5.ari 21) slurm/slurm
6) cray-libsci/17.12.1 14) dvs/2.7_2.2.52-6.0.5.2_17.6__g5170dea 22) craype-haswell
7) udreg/2.3.2-6.0.5.0_13.12__ga14955a.ari 15) alps/6.5.28-6.0.5.0_18.6__g13a91b6.ari 23) texlive/2017
8) ugni/6.0.14-6.0.5.0_16.9__g19583bb.ari 16) rca/2.2.16-6.0.5.0_15.34__g5e09e6d.ari 24) darshan/3.1.5
hadrib at cdl1:~> module swap PrgEnv-cray/6.0.4 PrgEnv-intel
hadrib at cdl1:~> ftn -o exe_i test_scalapack.f90
/opt/cray/pe/libsci/17.12.1/INTEL/16.0/x86_64/lib/libsci_intel_mpi.a(blacs_exit_.o): In function `blacs_exit_':
blacs_exit_.c:(.text+0xe9): undefined reference to `__wrap_MPI_Finalize'
/opt/cray/pe/libsci/17.12.1/INTEL/16.0/x86_64/lib/libsci_intel_mpi.a(blacs_pinfo_.o): In function `blacs_pinfo_':
blacs_pinfo_.c:(.text+0x9b): undefined reference to `__wrap_MPI_Init'
/opt/cray/pe/libsci/17.12.1/INTEL/16.0/x86_64/lib/libsci_intel_mpi.a(blacs_pinfo_.o): In function `Cblacs_pinfo':
blacs_pinfo_.c:(.text+0x9b): undefined reference to `__wrap_MPI_Init'
hadrib at cdl1:~>
hadrib at cdl1:~>
hadrib at cdl1:~> module swap PrgEnv-intel/6.0.4 PrgEnv-gnu
PrgEnv-gnu PrgEnv-gnu/6.0.4
hadrib at cdl1:~> module swap PrgEnv-intel/6.0.4 PrgEnv-gnu
hadrib at cdl1:~> ftn -o exe_i test_scalapack.f90
/opt/cray/pe/libsci/17.12.1/GNU/6.1/x86_64/lib/libsci_gnu_61_mpi.a(blacs_exit_.o): In function `blacs_exit_':
blacs_exit_.c:(.text+0xdb): undefined reference to `__wrap_MPI_Finalize'
/opt/cray/pe/libsci/17.12.1/GNU/6.1/x86_64/lib/libsci_gnu_61_mpi.a(blacs_pinfo_.o): In function `blacs_pinfo_':
blacs_pinfo_.c:(.text+0xb3): undefined reference to `__wrap_MPI_Init'
/opt/cray/pe/libsci/17.12.1/GNU/6.1/x86_64/lib/libsci_gnu_61_mpi.a(blacs_pinfo_.o): In function `Cblacs_pinfo':
blacs_pinfo_.c:(.text+0xb3): undefined reference to `__wrap_MPI_Init'
===== test_scalapack.f90 =====
implicit none
integer :: n, nb                ! problem size and block size
integer :: myunit               ! local output unit number
integer :: myArows, myAcols     ! size of local subset of global array
integer :: i, j, igrid, jgrid, iproc, jproc, myi, myj, p
real*8, dimension(:,:), allocatable :: myA, myB, myC
integer :: numroc               ! blacs routine
integer :: me, procs, icontxt, prow, pcol, myrow, mycol  ! blacs data
integer :: info                 ! scalapack return value
integer, dimension(9) :: ides_a, ides_b, ides_c  ! scalapack array desc

open(unit=1,file="ABCp.dat",status="old",form="formatted")
read(1,*) prow
read(1,*) pcol
read(1,*) n
read(1,*) nb
close(1)

if (((n/nb) < prow) .or. ((n/nb) < pcol)) then
  print *,"Problem size too small for processor set!"
  stop 100
endif

call blacs_pinfo (me, procs)
call blacs_get (0, 0, icontxt)
call blacs_gridinit(icontxt, 'R', prow, pcol)
call blacs_gridinfo(icontxt, prow, pcol, myrow, mycol)

myunit = 10+me
write(myunit,*) "--------"
write(myunit,*) "Output for processor ",me," to unit ",myunit
write(myunit,*) "Proc ",me,": myrow, mycol in p-array is ", &
                myrow, mycol

myArows = numroc(n, nb, myrow, 0, prow)
myAcols = numroc(n, nb, mycol, 0, pcol)
write(myunit,*) "Size of global array is ",n," x ",n
write(myunit,*) "Size of block is ",nb," x ",nb
write(myunit,*) "Size of local array is ",myArows," x ",myAcols

allocate(myA(myArows,myAcols))
allocate(myB(myArows,myAcols))
allocate(myC(myArows,myAcols))

do i=1,n
  call g2l(i,n,prow,nb,iproc,myi)
  if (myrow==iproc) then
    do j=1,n
      call g2l(j,n,pcol,nb,jproc,myj)
      if (mycol==jproc) then
        myA(myi,myj) = real(i+j)
        myB(myi,myj) = real(i-j)
        myC(myi,myj) = 0.d0
      endif
    enddo
  endif
enddo

! Prepare array descriptors for ScaLAPACK
ides_a(1) = 1        ! descriptor type
ides_a(2) = icontxt  ! blacs context
ides_a(3) = n        ! global number of rows
ides_a(4) = n        ! global number of columns
ides_a(5) = nb       ! row block size
ides_a(6) = nb       ! column block size
ides_a(7) = 0        ! initial process row
ides_a(8) = 0        ! initial process column
ides_a(9) = myArows  ! leading dimension of local array
do i=1,9
  ides_b(i) = ides_a(i)
  ides_c(i) = ides_a(i)
enddo

! Call ScaLAPACK library routine
call pdgemm('T','T',n,n,n,1.0d0, myA,1,1,ides_a, &
            myB,1,1,ides_b,0.d0, &
            myC,1,1,ides_c )

! Print results
call g2l(n,n,prow,nb,iproc,myi)
call g2l(n,n,pcol,nb,jproc,myj)
if ((myrow==iproc) .and. (mycol==jproc)) &
  write(*,*) 'c(',n,n,')=',myC(myi,myj)

! Deallocate the local arrays
deallocate(myA, myB, myC)

! End blacs for processors that are used
call blacs_gridexit(icontxt)
call blacs_exit(0)
end
! convert global index to local index in block-cyclic distribution
subroutine g2l(i,n,np,nb,p,il)
  implicit none
  integer :: i   ! global array index, input
  integer :: n   ! global array dimension, input
  integer :: np  ! processor array dimension, input
  integer :: nb  ! block size, input
  integer :: p   ! processor array index, output
  integer :: il  ! local array index, output
  integer :: im1
  im1 = i-1
  p = mod((im1/nb),np)
  il = (im1/(np*nb))*nb + mod(im1,nb) + 1
  return
end

! convert local index to global index in block-cyclic distribution
subroutine l2g(il,p,n,np,nb,i)
  implicit none
  integer :: il  ! local array index, input
  integer :: p   ! processor array index, input
  integer :: n   ! global array dimension, input
  integer :: np  ! processor array dimension, input
  integer :: nb  ! block size, input
  integer :: i   ! global array index, output
  integer :: ilm1
  ilm1 = il-1
  i = (((ilm1/nb) * np) + p)*nb + mod(ilm1,nb) + 1
  return
end
-------
Bilel Hadri, PhD
Computational Scientist
KAUST Supercomputing Lab
Al Khawarizmi Bldg. (1) Office 126
4700 King Abdullah University of Science and Technology
Thuwal 23955-6900
Kingdom of Saudi Arabia
Office Phone: +966 12 808 0654
Cell Phone: + 966 544 700 893