[petsc-users] PETSc and Windows 10

Satish Balay balay at mcs.anl.gov
Sun Jun 28 09:19:27 CDT 2020


On Sun, 28 Jun 2020, Paolo Lampitella wrote:

> Dear PETSc users,
> 
> I’ve been a happy PETSc user since version 3.3, using it both under Ubuntu (from 14.04 up to 20.04) and CentOS (from 5 to 8).
> 
> I use it as an optional component for a parallel Fortran code (that, BTW, also uses metis) and, wherever allowed, I used to install MPI (both MPICH and OpenMPI) myself, with PETSc on top of it, without ever having any trouble (despite being as dumb as one can be at this).
> 
> I did this on top of GNU compilers and, less extensively, Intel compilers, on a range of different systems (from virtual machines, to workstations, to actual clusters).
> 
> So far so good.
> 
> Today I find myself needing to deploy my application to Windows 10 users, which means giving them a folder with all the executables and libraries needed to run it, including the MPI runtime. Unfortunately, I also have to rely on free tools (can’t afford Intel for the moment).
> 
> To the best of my knowledge, and considering also far from optimal solutions, my options would then be: virtual machines and WSL1, Cygwin, MSYS2-MinGW64, cross compiling with MinGW64 from within Linux, PGI + Visual Studio + Cygwin (not sure about this one).
> 
> I know this is largely unsupported, but I was wondering if there is, nonetheless, some general (and more official) knowledge available on the matter. What I tried so far:
> 
> 
>   1.  Virtual machines and WSL1: both work like a charm, just like in the native OS, but very far from ideal for the distribution purpose
> 
> 
>   2.  Cygwin with GNU compilers (as opposed to using Intel and Visual Studio): I was unable to compile MPI myself as I am used to on Linux, so I just went all in and let PETSc do everything for me (using static linking): download and install MPICH, BLAS, LAPACK, METIS and HYPRE. Everything just worked (so far, compiling and running trivial tests) and I am able to use everything from within a cygwin terminal (even with executables and dependencies outside cygwin). Still, even within cygwin, I can’t switch to, say, the cygwin ompi mpirun/mpiexec for an MPI program compiled with the PETSc mpich (things run, but not as expected). Some trouble starts when I try to use cmd.exe (which I pictured as the more natural way to launch on Windows). In particular, using (note that \ is used in cmd.exe, / was used in the cygwin terminal):

I don't understand. Why build with MPICH - but use mpiexec from OpenMPI?

If it is because you can easily redistribute OpenMPI - why not build PETSc with OpenMPI?
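
As a rough sketch of that second option (not a tested recipe): assuming Open MPI's compiler wrappers mpicc, mpicxx and mpif90 are installed in Cygwin and on PATH, and that the script is run from the top of the PETSc source tree in a cygwin terminal, the configure line could look like the one below. It keeps the static linking and the METIS/HYPRE downloads mentioned in the original message. RUN_CONFIGURE is a hypothetical switch introduced only for this example, so that by default the command is printed rather than executed.

```shell
#!/bin/sh
# Sketch: build PETSc against an already installed Open MPI instead of
# --download-mpich, so that the libraries and the mpiexec launcher come
# from the same MPI implementation.
# Assumption: mpicc/mpicxx/mpif90 on PATH are the Open MPI wrappers.
PETSC_OPTS="--with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 \
--with-shared-libraries=0 \
--download-fblaslapack --download-metis --download-hypre"

# RUN_CONFIGURE is a made-up switch for this example:
# unset or 0 only prints the command, 1 actually runs ./configure.
if [ "${RUN_CONFIGURE:-0}" = "1" ]; then
    # Intentionally unquoted so the options are split into separate words.
    ./configure $PETSC_OPTS
else
    echo "./configure $PETSC_OPTS"
fi
```

With this kind of build the mpiexec to use (and to redistribute) would be the Open MPI one, not mpiexec.hydra.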

You can't use Intel/MS-MPI from cygwin/gcc/gfortran

Also - even though --download-mpich works with cygwin/gcc - it's no longer supported on windows [by MPICH group].
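
The common thread in the points above is that the launcher has to come from the same MPI the executable was built with. A quick way to see which family an mpiexec on PATH belongs to is to look at its --version output, as in the sketch below. The function name classify_launcher is made up for this example, and the strings it matches ("Open MPI", "OpenRTE", "HYDRA") are assumptions based on what those launchers commonly print, so treat the result as a hint only.

```shell
#!/bin/sh
# classify_launcher: read the text printed by "mpiexec --version" on stdin
# and print a guess at the MPI family it belongs to.
# The matched strings are assumptions, not a documented interface.
classify_launcher() {
    version_text=$(cat)
    case "$version_text" in
        *"Open MPI"*|*OpenRTE*) echo "openmpi" ;;
        *HYDRA*)                echo "mpich-hydra" ;;
        *)                      echo "unknown" ;;
    esac
}

# Classify whatever mpiexec is first on PATH, if there is one.
if command -v mpiexec >/dev/null 2>&1; then
    mpiexec --version 2>&1 | classify_launcher
else
    echo "no mpiexec on PATH"
fi
```

If this reports openmpi for a my.exe that was linked against the MPICH downloaded by PETSc, the mismatch described in the quoted text is the likely result.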

> 
> .\mpiexec.hydra.exe -np 8 .\my.exe
> 
> Nothing happens unless I press Enter a second time. Things seem to work then, but if I try to run a serial executable with the command above I get the following errors (which do not occur when using the cygwin terminal):
> 
> [proxy:0:0 at Dell7540-Paolo] HYDU_sock_write (utils/sock/sock.c:286): write error (No such process)
> [proxy:0:0 at Dell7540-Paolo] HYD_pmcd_pmip_control_cmd_cb (pm/pmiserv/pmip_cb.c:935): unable to write to downstream stdin
> [proxy:0:0 at Dell7540-Paolo] HYDT_dmxu_poll_wait_for_event (tools/demux/demux_poll.c:76): callback returned error status
> [proxy:0:0 at Dell7540-Paolo] main (pm/pmiserv/pmip.c:206): demux engine error waiting for event
> [mpiexec at Dell7540-Paolo] control_cb (pm/pmiserv/pmiserv_cb.c:200): assert (!closed) failed
> [mpiexec at Dell7540-Paolo] HYDT_dmxu_poll_wait_for_event (tools/demux/demux_poll.c:76): callback returned error status
> [mpiexec at Dell7540-Paolo] HYD_pmci_wait_for_completion (pm/pmiserv/pmiserv_pmci.c:198): error waiting for event
> [mpiexec at Dell7540-Paolo] main (ui/mpich/mpiexec.c:336): process manager error waiting for completion
> 
> Just for the sake of completeness, I also tried using the Intel and Microsoft MPI redistributables, which might be more natural candidates, instead of the MPI runtime compiled by PETSc (and they are MPICH derivatives, after all). But, running with:
> 
> mpiexec -np 1 my.exe
> 
> I get the following error with Intel:
> 
> [cli_0]: write_line error; fd=440 buf=:cmd=init pmi_version=1 pmi_subversion=1
> :
> system msg for write_line failure : Bad file descriptor
> [cli_0]: Unable to write to PMI_fd
> [cli_0]: write_line error; fd=440 buf=:cmd=get_appnum
> :
> system msg for write_line failure : Bad file descriptor
> Fatal error in MPI_Init: Other MPI error, error stack:
> MPIR_Init_thread(467):
> MPID_Init(140).......: channel initialization failed
> MPID_Init(421).......: PMI_Get_appnum returned -1
> [cli_0]: aborting job:
> Fatal error in MPI_Init: Other MPI error, error stack:
> MPIR_Init_thread(467):
> MPID_Init(140).......: channel initialization failed
> MPID_Init(421).......: PMI_Get_appnum returned -1
> 
> And the following error with MS-MPI:
> 
> [unset]: unable to decode hostport from 44e5747b-d19e-4ea8-ac7a-ec2102cabb21
> Fatal error in MPI_Init: Other MPI error, error stack:
> MPIR_Init_thread(467):
> MPID_Init(140).......: channel initialization failed
> MPID_Init(403).......: PMI_Init returned -1
> [unset]: aborting job:
> Fatal error in MPI_Init: Other MPI error, error stack:
> MPIR_Init_thread(467):
> MPID_Init(140).......: channel initialization failed
> MPID_Init(403).......: PMI_Init returned -1
> 
> regardless of the number of processes, although more processes produce more copies of this. However, both Intel and MS-MPI are able to run a serial Fortran executable built with cygwin. I think I did everything correctly, and adding -localhost didn’t help (actually, it caused more problems in the interpretation of the command line arguments for mpiexec).
> 
> 
>   3.  Cygwin with MinGW64 compilers. Never managed to compile MPI, not even through PETSc.
> 
> 
> 
>   4.  MSYS2+MinGW64 compilers. I understood that MinGW is not well supported, probably because of how it handles paths, but I wanted to give it a try, because it should be more “native” and there seem to be relevant examples out there of people who managed to do it. I first tried with the msys2 MPI distribution, produced the .mod file out of the mpi.f90 file in the distribution (I tried my best with different hacks for known limitations of this file, which are also present in the official MS-MPI distribution) and tried with my code without petsc, but compilation failed with a strange MPI related error (an argument mismatch between two unrelated MPI calls in the code, which is nonsense to me). In contrast, simple MPI tests (hello world like) worked as expected. Then I decided to follow this:
> 
> 
> 
> https://doc.freefem.org/introduction/installation.html#compilation-on-windows
> 
> 
> 
> but the exact same type of error came up (the MPI calls in my code were different, but the error was the same). Trying again from scratch (i.e., without all the things I did in the beginning to compile my code), the same error came up when compiling some of the freefem dependencies (this time not even in MPI calls).
> 
> 
> 
> As a side note, there seems to be an official effort in porting petsc to msys2 (https://github.com/okhlybov/MINGW-packages/tree/whpc/mingw-w64-petsc), but it hasn’t made it into the official packages yet, which I interpret as a warning.
> 
> 
> 
>   5.  Didn’t try cross compiling with MinGW from Linux, as I thought it couldn’t be any better than doing it from MSYS2.
>   6.  Didn’t try PGI, as I actually didn’t know if I would then be able to make PETSc work.
> 
> So, here are some questions I have with respect to where I stand now and the points above:
> 
> 
>      *   I haven’t seen the MSYS2-MinGw64 toolchain mentioned at all in official documentation/discussions. Should I definitely abandon it (despite someone mentioning it as working) because of known issues?

I don't have experience with MSYS2-MinGw64. However, Pierre does - and perhaps can comment on this. I don't know how things work on the fortran side.

>      *   What about the PGI route? I don’t see it mentioned either. I guess it would require some work on win32fe

Again - no experience here.

>      *   For my Cygwin-GNU route (basically what is mentioned in the PFLOTRAN documentation), am I expected to then run from the cygwin terminal or should the Windows prompt work as well? Are the need for a second Enter and the mishandling of serial executables a sign of something wrong with the Windows prompt?

I would think Cygwin-GNU route should work. I'll have to see if I can reproduce the issues you have.

Satish

>      *   More generally, is there some known working, albeit non official, route given my constraints (free+fortran+windows+mpi+petsc)?
> 
> Thanks for your attention and your great work on PETSc
> 
> Best regards
> 
> Paolo Lampitella
> 

