[mpich-discuss] MP_EAGER_LIMIT and MP_BUFFER_MEM question

Jeff Hammond jhammond at alcf.anl.gov
Wed Nov 7 19:31:56 CST 2012


The following program, which is only trivially modified from yours,
works for me up to 1M bytes.

I don't fully understand what you're trying to do, but I think the
program has an error in it.  Because it didn't fail for me, I didn't
look too hard, but two things stand out: the broadcasts pass
&vectorlengths, whose type is int(*)[1] rather than the int* that
MPI_Bcast expects (same address, so it happens to work), and the
character broadcast sends size() bytes without the terminating '\0',
so the receiver prints past the end of its buffer.  I've marked both
spots in the code below.

I find that MPI programs written in C++ that misbehave often work
properly once converted to C.  You might want to try that in this
case.
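
In case it helps, here is a rough sketch of what I mean in plain C,
with one extra byte reserved for the '\0' that the broadcast does not
carry.  This is my guess at the fix, not a drop-in replacement for
your code; "len" and "buf" are names I made up:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include <mpi.h>

int main(int argc, char* argv[])
{
  int myid, len = 0;
  char* buf;

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &myid);

  if (myid == 0)
    len = 6000;                 /* or read it from stdin on rank 0 */

  MPI_Bcast(&len, 1, MPI_INT, 0, MPI_COMM_WORLD);

  /* allocate one extra byte for the terminator, which is not sent */
  buf = malloc(len + 1);
  buf[len] = '\0';
  if (myid == 0)
    memset(buf, 'a', len);

  MPI_Bcast(buf, len, MPI_CHAR, 0, MPI_COMM_WORLD);

  if (myid == 1)
    printf("received %d bytes, last one is '%c'\n", len, buf[len-1]);

  free(buf);
  MPI_Finalize();
  return 0;
}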

Best,

Jeff

#include <iostream>
#include <string>
#include <vector>

#include <mpi.h>

void master(void);
void slave(void);

int main(int argc, char* argv[])
{
  int myid, numprocs;

  MPI_Init(&argc,&argv);
  MPI_Comm_size(MPI_COMM_WORLD,&numprocs);
  MPI_Comm_rank(MPI_COMM_WORLD,&myid);

  if (myid == 0)
    master();
  else
    slave();

  MPI_Finalize();

  return 0;
}

void master(void)
{
  int myid, numprocs;
  MPI_Comm_rank(MPI_COMM_WORLD,&myid);
  MPI_Comm_size(MPI_COMM_WORLD,&numprocs);

  std::cout << "I am master on process: " << myid << "\n";

  std::cout << "Enter the number of bytes to send: ";
  int numbytes;
  std::cin >> numbytes;

  std::string to_send_str = "";

  for (int i=0; i<numbytes; ++i)
    to_send_str.append("a");

  char* to_send = (char*)(to_send_str.c_str());

  int vectorlengths[1] = {0};
  vectorlengths[0] = to_send_str.size();

  MPI_Bcast(&vectorlengths,1, MPI_INT,0, MPI_COMM_WORLD);
  MPI_Bcast(&to_send[0], to_send_str.size(), MPI_CHAR, 0, MPI_COMM_WORLD);

  return;
}

void slave(void)
{
  int myid;
  MPI_Comm_rank(MPI_COMM_WORLD,&myid);

  std::cout << "I am a slave on process: " << myid << "\n";

  int vectorlengths[1] = {0};

  MPI_Bcast(vectorlengths, 1, MPI_INT, 0, MPI_COMM_WORLD);

  // Allocate one extra byte, kept at '\0': the broadcast carries only
  // the characters, not the terminator, so printing the raw buffer
  // would read past the end of the data.  A std::vector also avoids
  // the variable-length array, which is not standard C++.
  std::vector<char> sent_chars(vectorlengths[0] + 1, '\0');
  MPI_Bcast(&sent_chars[0], vectorlengths[0], MPI_CHAR, 0, MPI_COMM_WORLD);

  if (myid == 1)
    std::cout << "Sent the following character array: \n"
              << &sent_chars[0] << "\n"
              << "which is of size: vectorlengths[0] = "
              << vectorlengths[0] << "\n";

  return;
}
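
For reference, I compiled this with mpicxx and launched it with
mpiexec -n 2; mpiexec forwards stdin to rank 0, so the byte count is
typed at the usual prompt.  If your installation uses mpirun instead,
it should behave the same way.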


On Wed, Nov 7, 2012 at 7:21 PM, Matthew Niemerg
<niemerg at math.colostate.edu> wrote:
> Jeff,
>
> Here's a quick MPI program that reproduces the problem.  All you do is
> enter the number of bytes that you want to pass (to test the maximum
> size of array that can be sent).  The max on my machine is 6000 (i.e.
> 5999 works, but 6000 doesn't).  I am not sure why that would be the
> limit.  Anyway, the output on my machine indicates the null-terminating
> character isn't sent, or the entire array could not be passed.
> Regardless, I am perplexed by this issue.
>
> I am, however, able to pass larger arrays (I went up to 8381153)
> successfully, but I still get some garbage data at the very end (i.e.,
> outputting the character array to the console is fine except for the
> last byte, which prints as: ??_?).
>
> FYI, the machine I am using is a Mac running OS X 10.6.8, and I am
> using MPICH2 1.3 on the x86_64 architecture.
>
> Matt
>
> On Wed, Nov 7, 2012 at 3:10 PM, Jeff Hammond <jhammond at alcf.anl.gov> wrote:
>>
>> Unless you're on a BGQ or IBM machine with LoadLeveler, I don't see
>> how those variables can have any impact on the execution of your code.
>>
>> cc:mpich2-trunk jhammond$ grep MP_EAGER_LIMIT * -R
>> src/mpid/pamid/src/mpidi_env.c:    char* names[] = {"PAMID_EAGER",
>> "PAMID_RZV", "MP_EAGER_LIMIT", "PAMI_RVZ", "PAMI_RZV", "PAMI_EAGER",
>> NULL};
>> src/mpid/pamid/src/mpidi_env.c:    char* names[] = {"PAMID_RZV_LOCAL",
>> "PAMID_EAGER_LOCAL", "MP_EAGER_LIMIT_LOCAL", "PAMI_RVZ_LOCAL",
>> "PAMI_RZV_LOCAL", "PAMI_EAGER_LOCAL", NULL};
>> src/mpid/pamid/src/mpidi_util.c:
>> MATCHI(eager_limit,"Message Eager Limit (MP_EAGER_LIMIT/Bytes):");
>> src/pm/hydra/tools/bootstrap/external/ll_env.c:
>> "MP_DEBUG_INITIAL_STOP", "MP_DEBUG_LOG", "MP_EAGER_LIMIT",
>>
>> Can you provide a test code that exercises the problem to which you
>> allude?
>>
>> Thanks,
>>
>> Jeff
>>
>> On Wed, Nov 7, 2012 at 4:53 PM, Matthew Niemerg
>> <niemerg at math.colostate.edu> wrote:
>> > Hello,
>> >
>> > I'm developing a parallel program and have run into MPI hanging on
>> > jobs.  I've narrowed the problem down to broadcasting data to other
>> > nodes.  I am curious how to read and set the environment variables
>> > MP_EAGER_LIMIT and MP_BUFFER_MEM.  We need to be able to adjust our
>> > code based on each machine's MPI configuration, which is why I need
>> > to read these environment variables.  I tried the
>> > getenv("MP_EAGER_LIMIT") call, but that seemed to fail.  In
>> > addition, I've tried to set these environment variables using the
>> > export command in the shell before running my program, but that
>> > doesn't seem to work either.
>> >
>> > Are these environment variables set at runtime, during the
>> > configuration of mpich2, or somewhere else?  The data being passed
>> > is about 10 MB, which I wouldn't think would be an issue, but
>> > apparently is!
>> >
>> > Anyway, any help you can provide would be much appreciated.
>> >
>> > Sincerely,
>> > Matthew Niemerg
>> > Graduate Research Assistant
>> > Mathematics Department
>> > Colorado State University
>> >



-- 
Jeff Hammond
Argonne Leadership Computing Facility
University of Chicago Computation Institute
jhammond at alcf.anl.gov / (630) 252-5381
http://www.linkedin.com/in/jeffhammond
https://wiki.alcf.anl.gov/parts/index.php/User:Jhammond

