[MPICH] MPICH105 shm drops packages on SUN niagara

Darius Buntinas buntinas at mcs.anl.gov
Mon Sep 17 13:06:06 CDT 2007


It seems to be working on Linux, but we don't have a Solaris box to try 
it on.  Can you try it and let us know?

-d

On 09/17/2007 12:07 PM, chong tan wrote:
> In the 'change list' of the new 106 release, I see this item:
>  
> # Bugfix for shm and ssm channels. Added missing read and write memory 
> barriers for x86, and missing volatile in packet structure
>  
> Does this mean the problem is fixed?
>  
> thanks
> 
> 
> William Gropp <gropp at mcs.anl.gov> wrote:
> 
>     We're looking at it; I've added a variation of this to our regular
>     tests.  No solution yet, however.  My guess is that there is a
>     missing volatile or memory barrier somewhere; this should force us
>     to clean up the current code.
> 
>     Bill
> 
>     On May 16, 2007, at 12:18 PM, chong tan wrote:
> 
>>     No takers on this?  There is an identical problem on Linux; I am
>>     just not sure whether this code can reproduce it there.
>>     tan
>>
>>
>>      
>>     ----- Original Message ----
>>     From: chong tan <chong_guan_tan at yahoo.com>
>>     To: mpich-discuss at mcs.anl.gov
>>     Sent: Friday, April 27, 2007 3:24:09 PM
>>     Subject: Re: [MPICH] MPICH105 shm drops packages on SUN niagara
>>
>>     The following code reproduces the problem.  I think you may be able
>>     to reproduce the error on
>>     Linux, but I am not sure.
>>      
>>      
>>     It is best to run :
>>     mpiexec -n 8 a.out
>>     to reproduce the problem.  You will need a machine with
>>     8 CPUs/cores.  Sometimes you will need to
>>     run the code multiple times to see the error.
>>      
>>     Files named fast_mpi_?.dmp will be created, where ? is the rank of
>>     the process that wrote the file.  When MPI gets stuck,
>>     you should look at the last line of fast_mpi_0.dmp.  If it says:
>>
>>       read from child 7
>>      
>>     then you should look at the last line of fast_mpi_7.dmp; it will say:
>>       read from master
>>      
>>     Hope this helps to debug the error.
>>      
>>     thanks
>>     tan
>>
>>     ---------------------
>>     #include "stdlib.h"
>>     #include "stdio.h"
>>     #include "mpi.h"
>>      
>>     #define LOOP_COUNT  1000000
>>     #define DATA_SIZE   4
>>     #define MP_TAG      999
>>     int main( void )
>>     {
>>         int     nProc, rank ;
>>         int     argc = 0 ;
>>         int     i, j, status ;
>>         char    buf[ 128 ] ;
>>         FILE    *pf ;
>>         MPI_Init( &argc, NULL ) ;
>>         MPI_Comm_size( MPI_COMM_WORLD, &nProc ) ;
>>         MPI_Comm_rank( MPI_COMM_WORLD, &rank ) ;
>>         sprintf( buf, "fast_mpi_%d.dmp", rank ) ;
>>         pf = fopen( buf, "w" ) ;
>>         if( !rank ) {
>>            int      **psend ;
>>            int      **precv ;
>>            psend = (int**)calloc( nProc, sizeof( int *) ) ;
>>            precv = (int**)calloc( nProc, sizeof( int *) ) ;
>>            for( i = 0 ; i < nProc ; i++ ) {
>>                psend[ i ] = (int*)calloc( DATA_SIZE, sizeof( int ) ) ;
>>                precv[ i ] = (int*)calloc( DATA_SIZE, sizeof( int ) ) ;
>>            }
>>            for( i = 0 ; i < LOOP_COUNT ; i++ ) {
>>               fprintf( pf, "Master : loop %d\n", i ) ;
>>               fflush( pf ) ;
>>               for( j = 1 ; j < nProc ; j++ ) {
>>                  fprintf( pf, "  read from child %d\n", j ) ;
>>                  fflush( pf ) ;
>>                  status = MPI_Recv( precv[ j ], DATA_SIZE, MPI_LONG,
>>                                     j, MP_TAG, MPI_COMM_WORLD, MPI_STATUS_IGNORE ) ;
>>                  fprintf( pf, "  read from child %d done, status = %d\n", j, status ) ;
>>                  fflush( pf ) ;
>>               }
>>               for( j = 1 ; j < nProc ; j++ ) {
>>                  fprintf( pf, "  send to child %d\n", j ) ;
>>                  fflush( pf ) ;
>>                  status = MPI_Send( psend[ j ], DATA_SIZE - 1,
>>                                     MPI_LONG, j, MP_TAG, MPI_COMM_WORLD ) ;
>>                  fprintf( pf, "  send to child %d done, status = %d\n", j, status ) ;
>>                  fflush( pf ) ;
>>               }
>>            }
>>         } else {
>>            int  *psend ;
>>            int  *precv ;
>>            psend = (int*)calloc( DATA_SIZE, sizeof( int ) ) ;
>>            precv = (int*)calloc( DATA_SIZE, sizeof( int ) ) ;
>>            for( i = 0 ; i < LOOP_COUNT ; i++ ) {
>>                  fprintf( pf, "  send to master\n" ) ;
>>                  fflush( pf ) ;
>>                  status = MPI_Send( psend, DATA_SIZE - 1, MPI_LONG, 0,
>>                                     MP_TAG, MPI_COMM_WORLD ) ;
>>                  fprintf( pf, "  send to master done, status = %d\n", status ) ;
>>                  fflush( pf ) ;
>>                  fprintf( pf, "  read from master\n" ) ;
>>                  fflush( pf ) ;
>>                  status = MPI_Recv( precv, DATA_SIZE, MPI_LONG, 0,
>>                                     MP_TAG, MPI_COMM_WORLD, MPI_STATUS_IGNORE ) ;
>>                  fprintf( pf, "  read from master done, status = %d\n", status ) ;
>>                  fflush( pf ) ;
>>            }
>>         }
>>         fclose( pf ) ;
>>         MPI_Finalize() ;
>>     }
>>
> 
> 



