[mpich-discuss] [mvapich-discuss] How many Groups ?
Pavan Balaji
balaji at mcs.anl.gov
Sun Feb 26 10:24:53 CST 2012
Why are you not freeing the older communicators? Are you really
looking for more than 2000 *active* communicators? The number of bits
set aside for context IDs can be increased, but that loses some
internal optimizations within MPICH2 that apply when the source, tag,
and context ID together fit within 64 bits for queue searches.
Alternatively, you can take away some number of bits from the tag space
and give it to the context ID space.
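A rough illustration of that queue-search optimization (the field widths
below are assumptions for the sake of the example, not MPICH2's actual
internal layout): if the whole match tuple fits in one 64-bit word, a
posted-receive lookup can be a single integer compare.

#include <stdint.h>
#include <stdio.h>

/* Hypothetical packing: 16-bit context ID, 16-bit source rank, 32-bit tag.
 * MPICH2's real field widths are internal and may differ. */
static uint64_t pack_match(uint16_t context_id, uint16_t rank, uint32_t tag)
{
    return ((uint64_t)context_id << 48) | ((uint64_t)rank << 32) | (uint64_t)tag;
}

int main(void)
{
    /* Matching an incoming message against a posted receive becomes one
     * 64-bit comparison instead of three separate field comparisons. */
    uint64_t posted   = pack_match(7, 3, 42);
    uint64_t incoming = pack_match(7, 3, 42);
    printf("match: %d\n", posted == incoming);
    return 0;
}

Widening the context-ID field beyond this 64-bit budget means the tuple
no longer fits in one word, which is the optimization referred to above.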
But this looks like a bad application that doesn't free its resources.
Fixing the application seems much easier.
-- Pavan
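A minimal sketch of the reproducer below rewritten along those lines,
freeing each communicator and group as soon as it is no longer needed so
its context ID returns to the pool (assuming each communicator is only
used inside its own loop iteration):

#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int num_groups = 100000, world_size, i, *members;
    MPI_Group world_group, group;
    MPI_Comm comm;

    MPI_Init(&argc, &argv);
    if (argc > 1) num_groups = atoi(argv[1]);

    MPI_Comm_size(MPI_COMM_WORLD, &world_size);
    MPI_Comm_group(MPI_COMM_WORLD, &world_group);

    members = calloc(world_size, sizeof(int));
    for (i = 0; i < world_size; ++i) members[i] = i;

    for (i = 0; i < num_groups; ++i) {
        MPI_Group_incl(world_group, world_size, members, &group);
        MPI_Comm_create(MPI_COMM_WORLD, group, &comm);
        /* ... use comm here ... */
        MPI_Comm_free(&comm);   /* returns the context ID to the pool */
        MPI_Group_free(&group); /* the group is no longer needed either */
    }

    MPI_Group_free(&world_group);
    free(members);
    MPI_Finalize();
    return 0;
}

With the frees in place the loop count is no longer bounded by the number
of available context IDs, since only a handful of communicators are active
at any time.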
On 02/25/2012 11:43 PM, Krishna Kandalla wrote:
> Hi,
> We recently received the following post, and we seem to hit the same
> behavior with mpich2-1.5a1. We realize that this is because we are
> running out of context IDs. Do you folks think it is feasible to
> increase the range of allowable context IDs?
>
>
> The following code can be used to reproduce this behavior:
>
> ---------------------------------------------------------
>
> #include <mpi.h>
> #include <stdlib.h>
>
> int main(int argc, char **argv)
> {
>     int num_groups = 1000, my_rank, world_size, *group_members, i;
>     MPI_Group orig_group, *groups;
>     MPI_Comm *comms;
>
>     MPI_Init(&argc, &argv);
>
>     if (argc > 1) num_groups = atoi(argv[1]);
>
>     MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
>     MPI_Comm_size(MPI_COMM_WORLD, &world_size);
>     MPI_Comm_group(MPI_COMM_WORLD, &orig_group);
>
>     group_members = calloc(world_size, sizeof(int));
>     groups = calloc(num_groups, sizeof(MPI_Group));
>     comms = calloc(num_groups, sizeof(MPI_Comm));
>
>     for (i = 0; i < world_size; ++i) group_members[i] = i;
>
>     /* The communicators are deliberately never freed, so the run
>        aborts once the context-ID pool is exhausted. */
>     for (i = 0; i < num_groups; ++i) {
>         MPI_Group_incl(orig_group, world_size, group_members, &groups[i]);
>         MPI_Comm_create(MPI_COMM_WORLD, groups[i], &comms[i]);
>     }
>
>     MPI_Finalize();
>     return 0;
> }
>
> The observed error is:
> PMPI_Comm_create(656).........:
> MPI_Comm_create(MPI_COMM_WORLD, group=0xc80700f6, new_comm=0x1d1ecd4) failed
> PMPI_Comm_create(611).........:
> MPIR_Comm_create_intra(266)...:
> MPIR_Get_contextid(554).......:
> MPIR_Get_contextid_sparse(785): Too many communicators
> [cli_0]: aborting job:
>
> Thanks,
> Krishna
>
>
> On Thu, Feb 23, 2012 at 1:32 PM, Lewis Alderton <lalderto at us.ibm.com> wrote:
>
>
> I'm using MPI_Group_incl to create many groups. There seems to be a
> limit of 2048 groups - any way to increase this number?
>
> ( I'm using mvapich2-1.8a1p1 )
>
> Thanks.
>
--
Pavan Balaji
http://www.mcs.anl.gov/~balaji