[mpich-discuss] [mvapich-discuss] How many Groups ?

Lewis Alderton lalderto at us.ibm.com
Mon Feb 27 09:23:05 CST 2012


Hi Dave - Yes, we can share communicators and are going to build a test
implementation of this model.
Thanks.
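
For reference, here is a minimal sketch of the shared-communicator model Dave
describes below - all threads reuse MPI_COMM_WORLD and keep their
point-to-point traffic separate with per-thread tags. The thread count, the
pthreads driver, and the use of the thread id as the message tag are
illustrative assumptions, not the actual job-controller code:

---------------------------------------------------------

#include <mpi.h>
#include <pthread.h>
#include <stdio.h>

#define NUM_THREADS 4   /* illustrative; the real job runs ~30 threads per process */

/* All threads share MPI_COMM_WORLD; each thread uses its own tag so that
 * point-to-point traffic stays separated without extra communicators. */
static void *worker(void *arg)
{
    int tid = *(int *) arg;          /* thread id doubles as the message tag */
    int rank, size, token = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (size < 2)
        return NULL;                 /* need at least two ranks to exchange */

    if (rank == 0) {
        token = 100 + tid;
        MPI_Send(&token, 1, MPI_INT, 1, /*tag=*/tid, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(&token, 1, MPI_INT, 0, /*tag=*/tid, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("rank 1, thread %d received %d\n", tid, token);
    }
    return NULL;
}

int main(int argc, char **argv)
{
    int provided, i, tids[NUM_THREADS];
    pthread_t threads[NUM_THREADS];

    /* MPI_THREAD_MULTIPLE is required when several threads call MPI concurrently. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    if (provided < MPI_THREAD_MULTIPLE) {
        fprintf(stderr, "MPI_THREAD_MULTIPLE not available\n");
        MPI_Abort(MPI_COMM_WORLD, 1);
    }

    for (i = 0; i < NUM_THREADS; ++i) {
        tids[i] = i;
        pthread_create(&threads[i], NULL, worker, &tids[i]);
    }
    for (i = 0; i < NUM_THREADS; ++i)
        pthread_join(threads[i], NULL);

    MPI_Finalize();
    return 0;
}

---------------------------------------------------------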



From:	Dave Goodell <goodell at mcs.anl.gov>
To:	mpich-discuss at mcs.anl.gov
Cc:	mvapich-core at cse.ohio-state.edu
Date:	02/27/2012 09:50 AM
Subject:	Re: [mpich-discuss] [mvapich-discuss] How many Groups ?
Sent by:	mpich-discuss-bounces at mcs.anl.gov



Do you actually need separate communicators so that you can invoke
collective operations?  Could you instead share the communicator among
many/all threads and separate point-to-point communication by using unique
tags for each thread?

-Dave

On Feb 27, 2012, at 8:31 AM CST, Lewis Alderton wrote:

> We were hoping to use more than 2000 communicators - on each node we want
> about 100 processes, each running about 30 threads. Creating 3000
> communicators (one per thread) seemed to be the easiest way of doing this
> (this way our job controller can broadcast a job to each thread). Is
> there a better way of doing this?
>
>
>
> From:		 Pavan Balaji <balaji at mcs.anl.gov>
> To:		 mpich-discuss at mcs.anl.gov
> Cc:		 Krishna Kandalla <kandalla at cse.ohio-state.edu>,
>            mvapich-core at cse.ohio-state.edu, Lewis
>            Alderton/Marlborough/IBM at IBMUS
> Date:		 02/26/2012 11:25 AM
> Subject:		 Re: [mpich-discuss] [mvapich-discuss] How many Groups ?
>
>
>
>
> Why are you not freeing the older communicators?  Are you really
> looking for more than 2000 *active* communicators?  The number of bits
> set aside for context IDs can be increased, but that will lose some
> internal optimizations within MPICH2 that are used when the source, tag
> and context ID all fit within 64 bits for queue searches.
> Alternatively, you can take away some number of bits from the tag space
> and give it to the context ID space.
>
> But this looks like a bad application that doesn't free its resources.
> Fixing the application seems much easier.
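
A minimal sketch of the create/use/free pattern Pavan suggests, assuming the
application can release each communicator as soon as its work is done
(MPI_Comm_dup and the per-iteration broadcast are illustrative stand-ins for
the real per-job communicator and work):

---------------------------------------------------------

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int i, my_rank, num_iters = 10000, token;
    MPI_Comm comm;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

    for (i = 0; i < num_iters; ++i) {
        /* Create a communicator for this piece of work ... */
        MPI_Comm_dup(MPI_COMM_WORLD, &comm);

        /* ... use it ... */
        token = i;
        MPI_Bcast(&token, 1, MPI_INT, 0, comm);

        /* ... and free it right away, returning its context ID to the pool
         * so the limit on simultaneously active communicators is never hit. */
        MPI_Comm_free(&comm);
    }

    if (my_rank == 0)
        printf("completed %d create/use/free cycles\n", num_iters);

    MPI_Finalize();
    return 0;
}

---------------------------------------------------------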
>
>  -- Pavan
>
> On 02/25/2012 11:43 PM, Krishna Kandalla wrote:
>> Hi,
>>    We recently received the following post and we seem to have the
>> same behavior with mpich2-1.5a1. We realize that this is because we
>> are running out of context IDs. Do you folks think it is feasible to
>> increase the range of allowable context IDs?
>>
>>
>>    The following code can be used to reproduce this behavior:
>>
>> ---------------------------------------------------------
>>
>> #include <mpi.h>
>> #include <stdlib.h>
>>
>> int main(int argc, char **argv)
>> {
>>    int num_groups = 1000, my_rank, world_size, *group_members, i;
>>    MPI_Group orig_group, *groups;
>>    MPI_Init(&argc, &argv);
>>    MPI_Comm *comms;
>>
>>    if ( argc > 1 ) num_groups=atoi(argv[1]);
>>
>>    MPI_Comm_rank( MPI_COMM_WORLD, &my_rank );
>>    MPI_Comm_size(MPI_COMM_WORLD, &world_size);
>>    MPI_Comm_group(MPI_COMM_WORLD, &orig_group);
>>
>>    group_members = calloc(sizeof(int), world_size);
>>    groups = calloc(sizeof(MPI_Group), num_groups);
>>    comms = calloc(sizeof(MPI_Comm), num_groups);
>>
>>    for ( i = 0; i < world_size; ++i ) group_members[i] = i;
>>
>>    for ( i = 0; i < num_groups; ++i )
>>    {
>>        MPI_Group_incl(orig_group, world_size, group_members, &groups[i]);
>>        MPI_Comm_create(MPI_COMM_WORLD, groups[i], &comms[i]);
>>    }
>>
>>    MPI_Finalize();
>>    return 0;
>> }
>>
>> The observed error is:
>> PMPI_Comm_create(656).........: MPI_Comm_create(MPI_COMM_WORLD, group=0xc80700f6, new_comm=0x1d1ecd4) failed
>> PMPI_Comm_create(611).........:
>> MPIR_Comm_create_intra(266)...:
>> MPIR_Get_contextid(554).......:
>> MPIR_Get_contextid_sparse(785): Too many communicators
>> [cli_0]: aborting job:
>>
>>
>>
>> Thanks,
>> Krishna
>>
>>
>> On Thu, Feb 23, 2012 at 1:32 PM, Lewis Alderton <lalderto at us.ibm.com> wrote:
>>
>>
>>    I'm using MPI_Group_incl to create many groups. There seems to be a
>>    limit of 2048 groups - any way to increase this number?
>>
>>    ( I'm using mvapich2-1.8a1p1 )
>>
>>    Thanks.
>>
>>    _______________________________________________
>>    mvapich-discuss mailing list
>>    mvapich-discuss at cse.ohio-state.edu
>>    http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>>
>>
>>
>>
>
> --
> Pavan Balaji
> http://www.mcs.anl.gov/~balaji
>
>
>

_______________________________________________
mpich-discuss mailing list     mpich-discuss at mcs.anl.gov
To manage subscription options or unsubscribe:
https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss




