[mpich-discuss] I wonder if my mpdboot is the cause of problem...help me!
Pavan Balaji
balaji at mcs.anl.gov
Sat Jul 18 15:17:18 CDT 2009
You have a few options:
1. Check whether the processor in your 32-bit machine supports a 64-bit
operating system -- most modern processors do. If it does, just
reinstall a 64-bit OS on that machine. This is the most efficient option.
2. Pass "-m32" in CFLAGS to your MPICH2 configure -- this will build
MPICH2 in 32-bit mode even on the 64-bit platform. Applications that
you build with mpicc and friends will also be built as 32-bit
binaries. This will work, but you will not be using the 64-bit
capabilities of your 64-bit machine, so the performance will not be
optimal.
3. You could use MPICH-1 instead of MPICH2, though I wouldn't suggest
doing that. In this case MPICH will internally do the data conversion
for you, which will eat up some performance as well.
-- Pavan
On 07/18/2009 03:08 PM, Gra zeus wrote:
> Hello Rajeev,
>
> Ah, sorry about my last email; the OSes on my two machines are different.
>
> The quad-core machine is 64-bit and its OS is "Linux myquadcore_machine
> 2.6.18-128.1.1.el5 #1 SMP Tue Feb 10 11:36:29 EST 2009 x86_64 x86_64
> x86_64 GNU/Linux"
>
>
> The dual-core machine is 32-bit and its OS is: "Linux mydualcore_machine
> 2.6.18-128.1.6.el5PAE #1 SMP Wed Apr 1 07:24:39 EDT 2009 i686 i686 i386
> GNU/Linux"
>
> Is this difference the cause of my problem? Do I need to run my MPI job
> on machines of the same type (for example, both 32-bit)? Is there any
> configuration I need to set to make them work together?
>
> Thank you very much, and sorry again about the wrong OS info in my last email.
>
> regards,
> Gra
>
> --- On *Sat, 7/18/09, Rajeev Thakur /<thakur at mcs.anl.gov>/* wrote:
>
>
> From: Rajeev Thakur <thakur at mcs.anl.gov>
> Subject: Re: [mpich-discuss] I wonder if my mpdboot is the
> cause of problem...help me!
> To: mpich-discuss at mcs.anl.gov
> Date: Saturday, July 18, 2009, 8:42 AM
>
> Are the CPUs identical on them? Is one 32-bit, the other 64-bit?
>
>
> ------------------------------------------------------------------------
> *From:* mpich-discuss-bounces at mcs.anl.gov
> [mailto:mpich-discuss-bounces at mcs.anl.gov] *On Behalf Of *Gra zeus
> *Sent:* Saturday, July 18, 2009 10:27 AM
> *To:* mpich-discuss at mcs.anl.gov
> *Subject:* Re: [mpich-discuss] I wonder if my mpdboot is the
> cause of problem...help me!
>
> One of them is quad-core and the other is dual-core. However, the
> OS, account, password, and install path are all the same.
> I used this configuration, "./configure
> --prefix=/opt/localhomes/myname/mpich2-install", on both machines.
>
> --- On *Sat, 7/18/09, Rajeev Thakur /<thakur at mcs.anl.gov>/* wrote:
>
>
> From: Rajeev Thakur <thakur at mcs.anl.gov>
> Subject: Re: [mpich-discuss] I wonder if my mpdboot is the
> cause of problem...help me!
> To: mpich-discuss at mcs.anl.gov
> Date: Saturday, July 18, 2009, 7:02 AM
>
> What are the exact parameters you passed to configure when
> building MPICH2? Are the two machines identical?
>
> Rajeev
>
> ------------------------------------------------------------------------
> *From:* mpich-discuss-bounces at mcs.anl.gov
> [mailto:mpich-discuss-bounces at mcs.anl.gov] *On Behalf Of
> *Gra zeus
> *Sent:* Saturday, July 18, 2009 12:06 AM
> *To:* mpich-discuss at mcs.anl.gov
> *Subject:* [mpich-discuss] I wonder if my mpdboot is the
> cause of problem...help me!
>
> hello,
>
> Thanks for the answer yesterday.
> I tested my code on one machine (with "mpiexec -n 2 ./myprog") and
> everything works fine: my program can use MPI_Send and MPI_Recv
> without any problems.
>
> Today I set up MPICH2 on two machines. Both machines can communicate
> with each other: ssh is tested on both machines, mpd works, and
> mpdringtest works.
>
> However, when I run my program that uses MPI_Send and MPI_Recv,
> MPI_Recv blocks forever. So I wrote a new, simple program to test
> MPI_Send and MPI_Recv, like this:
>
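> #include <stdio.h>
> #include <mpi.h>
>
> int main(int argc, char **argv)
> {
>     int myrank;
>     MPI_Status status;
>     MPI_Init(&argc, &argv);
>     MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
>     if (myrank == 0)
>     {
>         /* rank 0 sends one int to rank 1 with tag 0 */
>         int senddata = 1;
>         MPI_Send(&senddata, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
>     }
>     else if (myrank == 1)
>     {
>         /* rank 1 blocks until the matching message from rank 0 arrives */
>         int recvdata = 0;
>         MPI_Recv(&recvdata, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
>         printf("received :%d:\n", recvdata);
>     }
>     MPI_Finalize();
>     return 0;
> }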
>
>
> I got this error:
>
>
> Assertion failed in file ch3_progress.c at line 489:
> pkt->type >= 0 && pkt->type < MPIDI_NEM_PKT_END
> internal ABORT - process 1
> Fatal error in MPI_Finalize: Other MPI error, error stack:
> MPI_Finalize(315)..................: MPI_Finalize failed
> MPI_Finalize(207)..................:
> MPID_Finalize(92)..................:
> PMPI_Barrier(476)..................:
> MPI_Barrier(comm=0x44000002) failed
> MPIR_Barrier(82)...................:
> MPIC_Sendrecv(164).................:
> MPIC_Wait(405).....................:
> MPIDI_CH3I_Progress(150)...........:
> MPID_nem_mpich2_blocking_recv(1074):
> MPID_nem_tcp_connpoll(1667)........:
> state_commrdy_handler(1517)........:
> MPID_nem_tcp_recv_handler(1413)....: socket closed
>
> ////////////////////////////////////////////////////////////////
>
> I also tried example/cpi, which comes with the install package. The
> example program froze without any errors. (I assume it stopped at
> MPI_Bcast().)
>
> Can anyone help me with this?
> This code and my program run smoothly when I use one machine (with
> the options -n 2, -n 4, etc.), but whenever I start mpdboot with two
> machines, the MPI processes can't communicate with each other via
> MPI_Send and MPI_Recv.
>
> thx,
> gra
>
>
>
>
>
--
Pavan Balaji
http://www.mcs.anl.gov/~balaji