#563 PYTHON_PATH in MPI modules
Closed: Fixed None Opened 3 years ago by zbyszek.

[Packaging:MPI] says that "The module file MUST ... set PYTHONPATH to %{python_sitearch}/%{name}%{?_cc_name_suffix}/". This does not work with Python 3, because %{python_sitearch} is for Python 2 only. In fact trying to load Python2 modules under Python3 can get messy.

Suggested solution: remove PYTHONPATH settings from the module files completely. Instead, export MPI_PYTHON2_SITEARCH and MPI_PYTHON3_SITEARCH variables from the module file and modify site.py to update sys.path when those variables are present.

Something like this (untested, but you should get the idea):
{{{

/etc/modulefiles/mpi/openmpi-x86_64

setenv MPI_PYTHON_SITEARCH /usr/lib64/python2.7/site-packages/openmpi
setenv MPI_PYTHON2_SITEARCH /usr/lib64/python2.7/site-packages/openmpi
setenv MPI_PYTHON3_SITEARCH /usr/lib64/python3.4/site-packages/openmpi
}}}

{{{

/usr/lib/python3.4/site.py

p = os.getenv('MPI_PYTHON3_SITEARCH')
if p:
    sys.path.insert(0, p)
}}}

{{{

/usr/lib/python2.7/site.py

p = os.getenv('MPI_PYTHON2_SITEARCH')
if p:
    sys.path.insert(0, p)
}}}
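
The site.py idea can be tried out without patching the interpreter. What follows is only a minimal sketch, not the proposed patch: the helper name `add_mpi_sitearch` is hypothetical, and picking the variable from the running interpreter's major version is an assumption made so that one function covers both cases.

```python
import os
import sys


def add_mpi_sitearch(environ=None, path=None):
    """Prepend the MPI site directory for the running interpreter, if set.

    Hypothetical helper: the ticket only proposes the variable names
    MPI_PYTHON2_SITEARCH and MPI_PYTHON3_SITEARCH.
    """
    environ = os.environ if environ is None else environ
    path = sys.path if path is None else path
    # python2 reads MPI_PYTHON2_SITEARCH, python3 reads MPI_PYTHON3_SITEARCH
    var = "MPI_PYTHON%d_SITEARCH" % sys.version_info[0]
    p = environ.get(var)
    if p and p not in path:
        path.insert(0, p)
    return path
```

With neither variable exported (no 'module load'), the function leaves the path untouched.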


Sounds reasonable. Adding the Python maintainers to get their input on whether this is fine with them and whether it could possibly be upstreamed.

Not sure we'd want to deviate from upstream on what's in site.py. However, if I'm reading this ticket right, MPI can be used with both python2 and python3? But there are separate mpi-implementing modules for each? That is unfortunate, as upstream Python rejected having a separate PYTHON3PATH environment variable a few releases ago. Perhaps this is the time to bring it back up with upstream, as this seems like a valid use case.

If the python2.7/site-packages/openmpi and python3.4/site-packages/openmpi files are the same (because the Python programs within are written to function with both python2 and python3), then another option opens up to us: put the files in a non-python-dependent location (for instance, /usr/lib64/mpi/python/openmpi). Then the setenv could be the same for both python2 and python3.

MPI can be used with both python2 and python3?
Yes. I know of pypar (which only supports python2 in Fedora), and mpi4py (which supports both). I would guess that IPython has some support too, but I don't know the details.

But there are separate mpi-implementing modules for each?
Those are binary modules, so they are linked to a specific libpython.so version.

Example:
{{{
$ module load mpi/mpich-x86_64
$ python3 -c 'import mpi4py.dl'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ImportError: dynamic module does not define init function (PyInit_dl)
$ echo $PYTHONPATH
/usr/lib64/python2.7/site-packages/mpich
}}}

If the python2.7/site-packages/openmpi and python3.4/site-packages/openmpi files are the same
They are not. E.g. pypar currently does not support python3, so putting it in a path where python3 will find it does not seem like a good idea:

{{{
$ python2 -c 'import pypar'
Pypar (version 2.1.5) initialised MPI OK with 1 processors
$ python3 -c 'import pypar'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python2.7/site-packages/mpich/pypar/__init__.py", line 34, in <module>
    from __metadata__ import __version__, __date__, __author__
ImportError: No module named '__metadata__'
}}}

(If we got past that error, /usr/lib64/python2.7/site-packages/mpich/pypar/mpiext.so is linked to libpython2.7.so.1.0, and cannot work for python3.)
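
A possible way to see why python3 even tries to load the Python 2 build: the import system recognises extension modules by file suffix, and on typical Linux builds a plain `.so` is among the accepted suffixes. The sketch below (Python 3 only; the helper name is made up for illustration) uses the standard `importlib.machinery.EXTENSION_SUFFIXES` list. The printed values depend on the interpreter and platform.

```python
import importlib.machinery
import os


def is_candidate_extension(filename):
    """Return True if this interpreter would treat filename as an extension module."""
    name = os.path.basename(filename)
    return any(name.endswith(sfx)
               for sfx in importlib.machinery.EXTENSION_SUFFIXES)


# Suffixes accepted by the running interpreter (platform dependent).
print(importlib.machinery.EXTENSION_SUFFIXES)
# The pypar extension built for python2.7, as named in this ticket.
print(is_candidate_extension(
    "/usr/lib64/python2.7/site-packages/mpich/pypar/mpiext.so"))
```

A suffix match only makes the file a candidate; loading still fails when the module lacks the `PyInit_*` entry point, as in the mpi4py traceback earlier.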

We could also produce two additional modulefiles: MPINAME-ARCH-python2 and MPINAME-ARCH-python3 with the appropriate PYTHONPATH settings. They could then load the base MPINAME-ARCH module if not already loaded.

We could perhaps also somehow detect which python the loader may be using and load that by default in the MPINAME-ARCH module file. Something like:
{{{
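# Sketch only: assumes the interpreter choice is exported as $PYTHON
if { [info exists env(PYTHON)] && $env(PYTHON) eq "python2" } {
    # set PYTHONPATH for python2
} elseif { [info exists env(PYTHON)] && $env(PYTHON) eq "python3" } {
    # set PYTHONPATH for python3
}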
}}}

Since modulefiles are Tcl code, you can do some conditional operations. These two approaches are not mutually exclusive either.

I tend to agree with Toshio in that I think we shouldn't deviate from upstream with this. Orion's idea seems sound to me; is there a reason why it wouldn't work for now?

Because the module files would behave differently depending on the PYTHON environment setting. When python2 is the default but you want to run {{{/usr/bin/python3}}} anyway, I don't see a sane solution in the module files...

Would it maybe be possible to add an {{{openmpi.pth}}}/{{{mpich.pth}}} file according to the [https://docs.python.org/3/library/site.html upstream documentation] to the site-packages directory inside a module file?

I don't see how this could work. If we added a .pth file, it will be always "on", but we want to switch between mpich / openmpi / nothing.

A solution should satisfy:
1. no effect without 'module load ...'
2. after 'module load ...' python sees mpi-implementation specific site-packages prepended
3. python2 and python3 work in the same environment
4. there's just one mpi-implementation active at any given time

Anything which uses PYTHONPATH will fail 3. Using .pth files will probably fail 1 and 4. Defining MPINAME-ARCH-python2 and MPINAME-ARCH-python3 cannot do 3 and also requires workflow changes from the user.
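
The reason PYTHONPATH cannot satisfy requirement 3 is that every interpreter started in the environment reads the same variable. A minimal sketch of how to observe that (the helper name is hypothetical; passing `/usr/bin/python2` or `/usr/bin/python3` as `interpreter` assumes those binaries exist on the machine):

```python
import os
import subprocess
import sys
import tempfile


def sys_path_with_pythonpath(entry, interpreter=sys.executable):
    """Run an interpreter with PYTHONPATH=entry and return its sys.path."""
    env = dict(os.environ, PYTHONPATH=entry)
    out = subprocess.check_output(
        [interpreter, "-c", "import sys; print('\\n'.join(sys.path))"],
        env=env,
    )
    return out.decode().splitlines()


# A throwaway directory stands in for .../python2.7/site-packages/openmpi.
with tempfile.TemporaryDirectory() as demo_dir:
    print(demo_dir in sys_path_with_pythonpath(demo_dir))
```

Calling the helper once per interpreter with the same entry shows the directory in both search paths, regardless of which Python version the directory was built for.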

I like the solution with customizing site.py because it works automatically, without any changes to the workflow. It also makes it easy to transition between python2 and python3. We could hash out the details of that solution in Fedora, and then submit it upstream.

Better to submit the problem upstream first and then work out a solution from there.

I would like to see if upstream python would re-evaluate the need for a PYTHON3PATH which is why I think that's a good way to go. However, a .pth-based solution might be able to work here as well.

Use environment-modules to add a path like /usr/share/mpi/python/{openmpi,mpich}/ to the PYTHONPATH. Inside /usr/share/mpi/python/{openmpi,mpich}, have a .pth file that detects whether python2 or python3 is being used and adds the appropriate /usr/lib64/python{2.7,3.4}/site-packages/{openmpi,mpich} directory to the path.

Note: the documentation says that .pth files are only recognized in specific site-packages directories, but I thought that PYTHONPATH directories were also legitimate. If I'm wrong, then another way to do this would be to use a .pth file in the site-packages directory (one for python2's site-packages and one for python3's site-packages) that is only enabled when an environment variable is set to a value. We then use environment-modules to set that value. The .pth file would still add the python2-specific or python3-specific path for our chosen MPI implementation.
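
The version detection such a .pth file would need can be sketched as below. This is illustrative only: the function name and the `libdir` default are assumptions, and the directory layout simply mirrors the paths quoted elsewhere in this ticket.

```python
import sys


def mpi_site_dir(impl, version_info=sys.version_info, libdir="/usr/lib64"):
    """Build the interpreter-specific directory for an MPI implementation.

    impl is e.g. "openmpi" or "mpich"; version_info defaults to the
    running interpreter, so python2 and python3 get different answers.
    """
    return "%s/python%d.%d/site-packages/%s" % (
        libdir, version_info[0], version_info[1], impl)


print(mpi_site_dir("openmpi", (2, 7)))  # /usr/lib64/python2.7/site-packages/openmpi
print(mpi_site_dir("openmpi", (3, 4)))  # /usr/lib64/python3.4/site-packages/openmpi
```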

Yeah, Orion and Toshio are right, .pth files seem to be the way to go. It can be even simpler:

{{{

/etc/modulefiles/mpi/openmpi-x86_64

setenv MPI_PYTHON2_SITEARCH /usr/lib64/python2.7/site-packages/openmpi
setenv MPI_PYTHON3_SITEARCH /usr/lib64/python3.4/site-packages/openmpi
}}}

{{{

%{python2_sitelib}/openmpi.pth

import sys, os; s = os.getenv('MPI_PYTHON2_SITEARCH'); s and (s in sys.path or sys.path.append(s))
}}}

{{{

%{python3_sitelib}/openmpi.pth

import sys, os; s = os.getenv('MPI_PYTHON3_SITEARCH'); s and (s in sys.path or sys.path.append(s))
}}}

mpich would have the same set of .pth files; they wouldn't conflict.
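
The .pth approach can be exercised in isolation. The sketch below (Python 3) is an assumption-laden demo rather than the packaged files: a temporary directory stands in for the real site-packages, `site.addsitedir()` stands in for interpreter start-up, and setting `os.environ` stands in for 'module load'. The .pth line itself is the one proposed above, with the variable name chosen for the running interpreter.

```python
import os
import site
import sys
import tempfile

# The .pth line from the proposal, keyed to the running interpreter.
VAR = "MPI_PYTHON%d_SITEARCH" % sys.version_info[0]
PTH_LINE = (
    "import sys, os; s = os.getenv(%r); "
    "s and (s in sys.path or sys.path.append(s))\n" % VAR
)


def demo(mpi_dir):
    """Write a throwaway .pth file and let site.addsitedir() process it."""
    with tempfile.TemporaryDirectory() as fake_site:
        with open(os.path.join(fake_site, "openmpi.pth"), "w") as f:
            f.write(PTH_LINE)
        os.environ[VAR] = mpi_dir    # what 'module load' would export
        site.addsitedir(fake_site)   # executes 'import' lines in *.pth files
        if fake_site in sys.path:    # drop the temporary site dir again
            sys.path.remove(fake_site)
        return mpi_dir in sys.path
```

When the variable is unset, the `s and (...)` expression short-circuits and the .pth file has no effect, which is what requirement 1 asks for.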

This looks like it should be solved outside of FPC, if you need us to do anything ask ... or just close the ticket.

I will do the changes to mpich and openmpi myself.

FPC should remove the text "The module file MUST ... set PYTHONPATH to %{python_sitearch}/%{name}%{?_cc_name_suffix}/" from Guidelines:MPI. If you want me to prepare a draft I can do that of course.

So instead of {{{PYTHONPATH}}}, each MPI must set {{{MPI_PYTHON?_SITEARCH}}} and possibly {{{MPI_PYTHON_SITEARCH}}} as an unversioned variant of it, so people could rely on "the default python" too.
It would be good to have a draft of what worked in practice after your changes have been tested with an MPI implementation.

Replying to [comment:13 tomspur]:

Yes. The version from comment 10 is what was implemented.

This was implemented as proposed in comment 10. Seems to work so far :)

Proposed changes to the Packaging:MPI guidelines:
https://fedoraproject.org/w/index.php?title=User%3AZbyszek%2FMPIPackagingDraft&diff=424068&oldid=424066

I've just updated the wiki, so this only needs an announcement.

Nice, thanks!

One nitpick: in the table with directories, "Architecture specific Python modules" could be changed to "Python modules". MPI-enabled modules are very unlikely to be noarch, but it's better to clarify that mpi-specific %{python_sitelib}s do not exist.

I went ahead and wrote in that last suggestion.

Metadata Update from @zbyszek:
- Issue assigned to tibbs

2 years ago
