#327 Python shebangs should not point to /usr/bin/python
Closed: Fixed None Opened 5 years ago by bkabrda.

Hi,
as one of the results of discussions on fedora-devel [1] and python-dev [2] about Python 3 as default on Linux distros, I'd like to propose that Fedora mandates all Python shebangs to explicitly point to either /usr/bin/python2 or /usr/bin/python3 and not to /usr/bin/python. If accepted, this will help with migration to Python 3 (and sometime in the future with pointing /usr/bin/python to Python 3).

The upstream recommendation [3] says:

In order to tolerate differences across platforms, all new code that needs to invoke the Python interpreter should not specify python, but rather should specify either python2 or python3 (or the more specific python2.x and python3.x versions; see the Migration Notes). This distinction should be made in shebangs, when invoking from a shell script, when invoking via the system() call, or when invoking in any other context.

To achieve this, I think it would be best to
1) recommend using %{__python2} to run setup.py in Python packages, so that setuptools uses /usr/bin/python2 to produce correct binaries.
2) create a script for __os_install_post that would fix all the shebangs (unless explicitly turned off) - I'm attaching the patch for rpm-redhat-macros as well as the script itself.
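As a rough illustration of what such a shebang-fixing step might do, here is a minimal sketch. It is not the attached script; the function name `fix_shebang`, the regular expression, and the choice to rewrite `#!/usr/bin/env python` as well are assumptions made for the example.

```python
import re

# Matches a first line such as "#!/usr/bin/python" or "#!/usr/bin/env python",
# optionally followed by interpreter arguments. Versioned names such as
# python2 or python3.3 are deliberately left alone by the lookahead.
UNVERSIONED = re.compile(r"^#!\s*(?:/usr/bin/env\s+python|/usr/bin/python)(?=\s|$)")


def fix_shebang(text, interpreter="/usr/bin/python2"):
    """Return text with an unversioned Python shebang pointed at interpreter.

    Hypothetical helper: only the first line is inspected, and any
    interpreter arguments after the command are preserved.
    """
    first, sep, rest = text.partition("\n")
    fixed = UNVERSIONED.sub("#!" + interpreter, first, count=1)
    return fixed + sep + rest
```

A real post-install script would also have to walk the buildroot, skip binary files, and honour the opt-out mentioned above; none of that is shown here.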

[1] https://lists.fedoraproject.org/pipermail/devel/2013-July/186098.html
[2] http://mail.python.org/pipermail/python-dev/2013-July/127516.html
[3] http://www.python.org/dev/peps/pep-0394/#recommendation


FPC would rather see this change happen during %build as opposed to manipulating files in os_install_post. We think a way forward would be to change the %{__python} macro in rpm to point to /usr/bin/python2 instead of /usr/bin/python. FPC is currently split on changing guidelines to say that shebang lines and scripts must not use /usr/bin/python. We held off on voting to see what you and the rest of the Python SIG's thoughts were on continuing down this alternative implementation.

It might be worth explicitly mentioning "#!/usr/bin/env python", assuming it would be changed to /usr/bin/python2 or /usr/bin/python3. Unless that is already handled elsewhere?

I believe that scripts which use /usr/bin/env are not going to be rewritten by setup.py, so packagers would need to manually patch them to use /usr/bin/env python2 (or /usr/bin/python2). ncoghlan has said that it probably makes sense for upstreams to ship with /usr/bin/env by default, but it makes more sense for system packages to use /usr/bin/python2. If we have to patch to get rid of an unversioned "python" anyway, we might want to change it to "/usr/bin/python2".

As I won't be able to attend the next meeting (holidays), my view is:

/usr/bin/python must remain and should point to the "default" python interpreter (which can be an arbitrary version of python), because this is what some packages expect.

What %{__python}, %{__python2}, %{__python3}, or anything else may point to is largely irrelevant.

Note that I think that we've reached agreement on the devel@fp.o mailing list that /usr/bin/python will remain and continue to point to python2 for now (currently we're thinking until PEP 394, linked to above, changes its recommendation to point to python3). However, this is not for packagers but for the benefit of users of Fedora. Users of Fedora will continue to invoke /usr/bin/python because that's what they're used to typing and because they have scripts already written that utilize /usr/bin/python instead of /usr/bin/python2 or /usr/bin/python3. Packagers, OTOH, need to start future-proofing their packages and preparing upstreams for the eventual day when /usr/bin/python no longer points to /usr/bin/python2, which has already happened in Arch Linux :-(

The goal of this ticket is to make it so that software which is in Fedora packages explicitly uses /usr/bin/python2 or /usr/bin/python3. This will future-proof the packages for the eventual switch of /usr/bin/python to no longer point to python2 (either when we switch it or if the ability to switch it is given to the system administrator or individual user).

Switching %{__python} to point to /usr/bin/python2 is one means of partially implementing this goal. Most python modules use one of two buildsystems (distutils from the python stdlib or python-setuptools; these are invoked using a script called setup.py shipped with the tarball). These two buildsystems will rewrite shebang lines to start whichever interpreter path was used to invoke setup.py. Current spec files usually have:

%{__python} setup.py build

So changing the value of %{__python} to /usr/bin/python2 will mean that shebang lines for this common case will end up pointing to /usr/bin/python2. bkabrda, in the initial ticket, mentions a different way: changing packaging guidelines to make packagers no longer use %{__python} in spec files. Instead, packagers will need to update their spec files to use %{__python2}.

I think this is a question of when work would need to be performed: Changing the value of %{__python} will allow many packagers to ignore this change for now and still have their packages obey new guidelines that invoking python must explicitly use either the python2 or python3 executable. They will have to update their packages if the value of %{__python} switches to /usr/bin/python3 at some point in the future. OTOH, forcing packagers not to use %{__python} now will mean that they must update their spec files to use %{__python2} now (or more probably, new packages will use %{__python2} and old packages will switch as python-concerned provenpackagers file bugs and apply changes to fix them).

Since I think bkabrda has more invasive changes to what python spec files should look like (in particular, making both python2 and python3 subpackages and not having a python main package; that way "python-foo" is what will be referenced in bugzilla and git checkouts but python2-foo and python3-foo will be used in dependencies and in what people ask to be installed), I think I'd rather defer the work at least until that time.

Note that this portion only partially implements our goal which is to no longer use /usr/bin/python in anything we've packaged. To reach that goal we're going to need to do the following:

  • Add the prohibition on /usr/bin/python to the Guidelines
  • Change rpmdevtools templates to use %{__python2} and %{__python3} instead of %{__python}
  • Use repoquery to determine everything that requires the python package and audit those packages for use of the python executable
  • python scripts will need to be checked for proper shebang lines (this will catch packages which don't use setuptools or distutils including the /usr/bin/env python invocation mentioned in comment:2)
  • shell scripts will need to be checked for proper invocation of the python interpreter
  • Python-SIG with FESCo blessing will probably need to organize packagers to do this auditing and provenpackagers to commit fixes.
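The shebang part of such an audit could be sketched roughly as follows. This is only an illustration under assumptions: the function name `find_unversioned_shebangs` and the regular expression are made up for the example, and it looks only at the first line of each file, so it says nothing about shell scripts or other callers that invoke python.

```python
import os
import re

# First line that names "python" with no version suffix, directly or via env.
UNVERSIONED_SHEBANG = re.compile(rb"^#!.*\bpython(?=\s|$)")


def find_unversioned_shebangs(root):
    """Yield paths under root whose first line is an unversioned Python shebang."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as handle:
                    first_line = handle.readline().rstrip(b"\r\n")
            except OSError:
                continue  # unreadable file or dangling symlink: skip it
            if UNVERSIONED_SHEBANG.match(first_line):
                yield path
```

Pointed at an unpacked package tree, this would list candidates for manual review; deciding whether each one should become python2 or python3 still needs a packager.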

Replying to [comment:5 toshio]:

> Note that I think that we've reached agreement on the devel@fp.o mailing list that /usr/bin/python will remain and continue to point to python2 for now (currently we're thinking until PEP 394, linked to above, changes its recommendation to point to python3). However, this is not for packagers but for the benefit of users of Fedora. Users of Fedora will continue to invoke /usr/bin/python because that's what they're used to typing and because they have scripts already written that utilize /usr/bin/python instead of /usr/bin/python2 or /usr/bin/python3. Packagers, OTOH, need to start future-proofing their packages and preparing upstreams for the eventual day when /usr/bin/python no longer points to /usr/bin/python2, which has already happened in Arch Linux :-(

Yes, by my proposal I didn't mean to actually alter the shebangs, just to force python2 packages to point to /usr/bin/python2. All the shebangs should stay the same.

> The goal of this ticket is to make it so that software which is in Fedora packages explicitly uses /usr/bin/python2 or /usr/bin/python3. This will future-proof the packages for the eventual switch of /usr/bin/python to no longer point to python2 (either when we switch it or if the ability to switch it is given to the system administrator or individual user).
>
> Switching %{__python} to point to /usr/bin/python2 is one means of partially implementing this goal. Most python modules use one of two buildsystems (distutils from the python stdlib or python-setuptools; these are invoked using a script called setup.py shipped with the tarball). These two buildsystems will rewrite shebang lines to start whichever interpreter path was used to invoke setup.py. Current spec files usually have:
>
> %{__python} setup.py build
>
> So changing the value of %{__python} to /usr/bin/python2 will mean that shebang lines for this common case will end up pointing to /usr/bin/python2. bkabrda, in the initial ticket, mentions a different way: changing packaging guidelines to make packagers no longer use %{__python} in spec files. Instead, packagers will need to update their spec files to use %{__python2}.
>
> I think this is a question of when work would need to be performed: Changing the value of %{__python} will allow many packagers to ignore this change for now and still have their packages obey new guidelines that invoking python must explicitly use either the python2 or python3 executable. They will have to update their packages if the value of %{__python} switches to /usr/bin/python3 at some point in the future. OTOH, forcing packagers not to use %{__python} now will mean that they must update their spec files to use %{__python2} now (or more probably, new packages will use %{__python2} and old packages will switch as python-concerned provenpackagers file bugs and apply changes to fix them).

There are two points I'd like to mention here that I didn't explicitly mention in my proposal:
- Using %{__python2} as default should be mandated by guidelines and applied for new packages.
- For old packages, the __os_install_post part will do everything needed when they're rebuilt (e.g. during mass rebuild), so no one would need to file bugs etc. - everything would be done automatically.

> Since I think bkabrda has more invasive changes to what python spec files should look like (in particular, making both python2 and python3 subpackages and not having a python main package; that way "python-foo" is what will be referenced in bugzilla and git checkouts but python2-foo and python3-foo will be used in dependencies and in what people ask to be installed), I think I'd rather defer the work at least until that time.

I was planning to propose that for F21. The reason I proposed this is that (assuming that my proposal is accepted as a whole) it will only require changing guidelines (so people may slowly start getting used to using %{__python2}), while most of the work is done automatically.

> Note that this portion only partially implements our goal which is to no longer use /usr/bin/python in anything we've packaged. To reach that goal we're going to need to do the following:
>
>   • Add the prohibition on /usr/bin/python to the Guidelines
>   • Change rpmdevtools templates to use %{__python2} and %{__python3} instead of %{__python}
>   • Use repoquery to determine everything that requires the python package and audit those packages for use of the python executable
>   • python scripts will need to be checked for proper shebang lines (this will catch packages which don't use setuptools or distutils including the /usr/bin/env python invocation mentioned in comment:2)
>   • shell scripts will need to be checked for proper invocation of the python interpreter
>   • Python-SIG with FESCo blessing will probably need to organize packagers to do this auditing and provenpackagers to commit fixes.

What do you mean by auditing/checking? IMO if we just replace /usr/bin/python by /usr/bin/python2, nothing should break (the only use case I can imagine being broken by this is some crazy corner cases relying on sys.executable being a specific string, which is not likely to happen very often).
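To make the sys.executable corner case concrete, here is a hypothetical sketch of the kind of check that could break, next to a more tolerant variant. Both function names are invented for the example, and it assumes that sys.executable typically reflects the path the interpreter was started with.

```python
import os
import sys


def fragile_is_system_python(executable=sys.executable):
    # Exact string comparison: stops matching as soon as the shebang
    # names /usr/bin/python2 instead of /usr/bin/python.
    return executable == "/usr/bin/python"


def tolerant_is_system_python(executable=sys.executable):
    # Compare the directory and the name prefix rather than the exact string,
    # so /usr/bin/python, /usr/bin/python2 and /usr/bin/python3 all pass.
    directory, name = os.path.split(executable)
    return directory == "/usr/bin" and name.startswith("python")
```

Code written like the first function is the only kind expected to notice the shebang change; the second shows one way such code could be made insensitive to it.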

Replying to [comment:6 bkabrda]:

> Yes, by my proposal I didn't mean to actually alter the shebangs, just to force python2 packages to point to /usr/bin/python2. All the shebangs should stay the same.

I don't understand this. The proposal seems to be all about altering the shebang lines, i.e. the old releases of the packages will have #!/usr/bin/python while the new versions will have #!/usr/bin/python2. Could you clarify?

> There are two points I'd like to mention here that I didn't explicitly mention in my proposal:
> - Using %{__python2} as default should be mandated by guidelines and applied for new packages.

This is a variation on the additional need I listed:
* Add the prohibition on /usr/bin/python to the Guidelines

The difference is in what the two wordings cover and the level at which they attempt to manage that for packagers. One wording only explicitly covers deprecating the %{__python} rpm macro. The other wording is aimed at removing references to /usr/bin/python from the files that packages install onto the system, which is the requirement we need to satisfy in order to future-proof our packages.

> - For old packages, the __os_install_post part will do everything needed when they're rebuilt (e.g. during mass rebuild), so no one would need to file bugs etc. - everything would be done automatically.

<nod> the FPC members present at the last meeting didn't like the os_install_post technique. That's why I propose modifying %{__python} to point to /usr/bin/python2 which would have the same effect for a large subset of applications written in python.

> >   • Use repoquery to determine everything that requires the python package and audit those packages for use of the python executable
> >   • python scripts will need to be checked for proper shebang lines (this will catch packages which don't use setuptools or distutils including the /usr/bin/env python invocation mentioned in comment:2)
>
> What do you mean by auditing/checking? IMO if we just replace /usr/bin/python by /usr/bin/python2, nothing should break (the only use case I can imagine being broken by this is some crazy corner cases relying on sys.executable being a specific string, which is not likely to happen very often).

I agree with you that I do not anticipate scripts which have shebang lines changed from #!/usr/bin/python to #!/usr/bin/python2 breaking. The auditing and checking comes into play because merely replacing those shebangs is not enough to future-proof python packages for an eventual change to what /usr/bin/python points to.
* As orion points out, there's also #!/usr/bin/env python
* There are also scripts which are installed by Makefile or other buildsystems which will not respond to /usr/bin/python2 being the sys.executable used to invoke something.
* Shell scripts can also be invoking python scripts. These would need to be changed to replace "/usr/bin/python /PATH/TO/PYFILE" or "python /PATH/TO/PYFILE" with "/usr/bin/python2 /PATH/TO/PYFILE" and "python2 /PATH/TO/PYFILE"

Replying to [comment:7 toshio]:

> Replying to [comment:6 bkabrda]:
>
> > Yes, by my proposal I didn't mean to actually alter the shebangs, just to force python2 packages to point to /usr/bin/python2. All the shebangs should stay the same.
>
> I don't understand this. The proposal seems to be all about altering the shebang lines, i.e. the old releases of the packages will have #!/usr/bin/python while the new versions will have #!/usr/bin/python2. Could you clarify?

Yeah, sorry, I was thinking about more things than I should have. What I meant was that I don't want to alter the binaries in /usr/bin, I just want to point python2 packages' shebangs to /usr/bin/python2; all the binaries /usr/bin/python{,2,3} should stay the way they are.

> > There are two points I'd like to mention here that I didn't explicitly mention in my proposal:
> > - Using %{__python2} as default should be mandated by guidelines and applied for new packages.
>
> This is a variation on the additional need I listed:
> * Add the prohibition on /usr/bin/python to the Guidelines
>
> The difference is in what the two wordings cover and the level at which they attempt to manage that for packagers. One wording only explicitly covers deprecating the %{__python} rpm macro. The other wording is aimed at removing references to /usr/bin/python from the files that packages install onto the system, which is the requirement we need to satisfy in order to future-proof our packages.

I'm fine with your wording here, no problem.

> > - For old packages, the __os_install_post part will do everything needed when they're rebuilt (e.g. during mass rebuild), so no one would need to file bugs etc. - everything would be done automatically.
>
> <nod> the FPC members present at the last meeting didn't like the os_install_post technique. That's why I propose modifying %{__python} to point to /usr/bin/python2 which would have the same effect for a large subset of applications written in python.

If this goes together with prohibiting usage of /usr/bin/python and recommendation of %{__python2} instead of %{__python}, I'm fine with it. In other words, pointing %{__python} to %{__python2} should only be there for old packages; new packages should use %{__python2}.

> > >   • Use repoquery to determine everything that requires the python package and audit those packages for use of the python executable
> > >   • python scripts will need to be checked for proper shebang lines (this will catch packages which don't use setuptools or distutils including the /usr/bin/env python invocation mentioned in comment:2)
> >
> > What do you mean by auditing/checking? IMO if we just replace /usr/bin/python by /usr/bin/python2, nothing should break (the only use case I can imagine being broken by this is some crazy corner cases relying on sys.executable being a specific string, which is not likely to happen very often).
>
> I agree with you that I do not anticipate scripts which have shebang lines changed from #!/usr/bin/python to #!/usr/bin/python2 breaking. The auditing and checking comes into play because merely replacing those shebangs is not enough to future-proof python packages for an eventual change to what /usr/bin/python points to.
> * As orion points out, there's also #!/usr/bin/env python

Which is IMO totally wrong in a distribution package. If a user compiles their own Python and puts it on $PATH before our Python, the distribution package may break. Distribution packages should always point to our Python (which they were tested with) so that we are sure everything works.
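The PATH dependence described here can be sketched with the standard library's shutil.which, which performs a similar first-match search to what env does. This is only an illustration for POSIX systems: the helper names are hypothetical, and the two temporary directories stand in for a user's own bin directory and the system one.

```python
import os
import shutil
import stat
import tempfile


def resolve_like_env(command, path_dirs):
    """Return the first executable named command found in path_dirs."""
    return shutil.which(command, path=os.pathsep.join(path_dirs))


def make_fake_interpreter(directory, name="python"):
    """Create a tiny executable file standing in for an interpreter."""
    target = os.path.join(directory, name)
    with open(target, "w") as handle:
        handle.write("#!/bin/sh\n")
    os.chmod(target, os.stat(target).st_mode | stat.S_IXUSR)
    return target


with tempfile.TemporaryDirectory() as user_bin, tempfile.TemporaryDirectory() as system_bin:
    user_python = make_fake_interpreter(user_bin)
    system_python = make_fake_interpreter(system_bin)
    # The user's directory comes first on the search path, so it is chosen
    # ahead of the "system" interpreter.
    chosen = resolve_like_env("python", [user_bin, system_bin])
    print(chosen == user_python)
```

A shebang with an absolute path such as /usr/bin/python2 bypasses this lookup entirely, which is the property being argued for.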

> * There's also scripts which are installed by Makefile or other buildsystems which will not respond to /usr/bin/python2 being the sys.executable used to invoke something.

Yep, that is true. These will probably need bug filing/patches for upstream. But we will clearly see them once we switch %{__python} to point to %{__python2}.

> * Shell scripts can also be invoking python scripts. These would need to be changed to replace "/usr/bin/python /PATH/TO/PYFILE" or "python /PATH/TO/PYFILE" with "/usr/bin/python2 /PATH/TO/PYFILE" and "python2 /PATH/TO/PYFILE"

Hrm, this may turn out to be the most problematic use-case, because we can't find these out easily... Moreover scripts like this don't necessarily have to live in /usr/bin (to take it to extreme, any package in any language can invoke "python" or "/usr/bin/python" from its code as subprocess). I don't see any general solution for this, just testing and testing and testing if everything works.
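For the shell script case specifically, a line-based heuristic can at least flag the obvious invocations. The sketch below is an assumption-laden illustration, not a general solution: the pattern and the function name are invented here, it only recognises python at the start of a command or after a few shell separators, and it cannot see invocations hidden in variables or made from other languages.

```python
import re

# "python" or "/usr/bin/python" with no version suffix, at the start of a
# command: at line start or right after ; | & ( or a backtick, optionally
# preceded by "exec".
UNVERSIONED_CALL = re.compile(
    r"(?:^|[;|&(`])\s*(?:exec\s+)?(?:/usr/bin/)?python(?=\s|$)"
)


def unversioned_python_calls(script_text):
    """Return (line_number, line) pairs that appear to invoke unversioned python."""
    hits = []
    for number, line in enumerate(script_text.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # comments and the shebang line are out of scope here
        if UNVERSIONED_CALL.search(stripped):
            hits.append((number, stripped))
    return hits
```

Anything this flags still needs a human to decide between python2 and python3, and a clean result proves nothing, so testing remains necessary.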

Broke the proposals here into three parts and voted at today's meeting:

  1. Change the guidelines and rpmdevtools-newspec to use explicit %{__python2} and %{__python3} instead of %{__python}. Mention on the Python Guidelines page that %{__python} should be considered deprecated. PASSED: (+7, 0, -1)
  2. Change the definition of %{__python} to be /usr/bin/python2. PASSED: (+8, 0, -0)
  3. Add wording to the python guidelines that /usr/bin/python should not be invoked in files we're shipping -- invoke /usr/bin/python2 or /usr/bin/python3 instead. PASSED: (+6, 1, -1)

I'll take responsibility for writing these in proper form in the proper places in the Guidelines. We'll need to coordinate with rpmdevtools and rpm-build/redhat-config-rpm as well.

Cool, thanks a lot. Let me know if there is anything I can help with.

Question: Should I deprecate %{python_sitelib} and %{python_sitearch} as well (to be replaced with %{python2_sitelib} and %{python2_sitearch})? The original macros do not need any changes to their definitions, as they reference the versioned location by their nature.

Replying to [comment:11 toshio]:

> Question: Should I deprecate %{python_sitelib} and %{python_sitearch} as well (to be replaced with %{python2_sitelib} and %{python2_sitearch})? The original macros do not need any changes to their definitions, as they reference the versioned location by their nature.

Hmm, that's a good point... Yes, I think we should use the versioned macros everywhere to stay consistent and prevent confusion.

Hey bkabrda, my FPC time of late has been mostly dealing with the scl review and redrafting. If you happen to have more time you could draft what the changes approved in comment 9 would look like and I could copy them over to the Python Guideline page. However, I know that you are also working on the scl draft so don't feel obligated to work on this -- I just wanted to update this ticket with why it's taking so long to get this pushed into the guidelines.

Okay, I took some time tonight to do a part of this. The python guidelines should all use %{__python2} and %{python2_sitelib} now. Let me know if I've missed anything there (or on any other pages; I'm sure the macros are used in some odd nooks and crannies). The other pieces still need doing:

  • change rpmdevtools-newspec to use explicit %{__python2} (and python2_sitelib/sitearch) and %{__python3} instead of %{__python}
  • Change the definition of %{__python} to be /usr/bin/python2 (Perhaps in rpm /usr/lib/rpm/macros or in an override in redhat-rpm-config)
  • Add wording to the python guidelines that /usr/bin/python should not be invoked in files we're shipping -- invoke /usr/bin/python2 or /usr/bin/python3 instead.

This never made it back to a meeting. Toshio, is there anything that you think the committee could deal with at this point?

The packaging committee is still awaiting additional information before it can properly address your request. Please provide that within the next month or this ticket will be closed. (Of course, you can always reopen it if the situation changes) Thanks!

So from the points mentioned in comment 14:
- rpmdev-newspec - I'll handle this
- changing definition of %__python to point to /usr/bin/python2 - I'll handle this
- Add wording to the guidelines about explicit /usr/bin/python2 or /usr/bin/python3 invocation - this is something that needs to be put in guidelines - I think it can be as simple as saying "RPM-shipped binaries must never use /usr/bin/python, they must always use /usr/bin/python2 or /usr/bin/python3". Could someone from FPC do this, please?

Sure, we can take care of that. I can see that this would work absolutely everywhere (Fedora since at least FC5, and RHEL5 and up) so I think this is a no-brainer.

Should I wait for these to go through before changing the guidelines?

From comment 17, it turns out that FPC actually voted on that ages ago (comment 9 point 3) and so I've written that into the initial section of the Python guidelines. However, note that what FPC voted on is a "should", not a "must". In practice there's not much of a difference (and if fedora-review flags it then it will disappear pretty quickly) but if you do want "must" instead of "should" then let us know and we'll discuss it.

I think this ticket is actually done now, but feel free to let me know if I've missed something.

Announcement text:

The Multiple Python Runtimes section of the Python packaging guidelines
was updated to indicate that packages in fedora should not reference
/usr/bin/python but instead should use either /usr/bin/python2 or
/usr/bin/python3 as appropriate.
https://fedoraproject.org/wiki/Packaging:Python#Multiple_Python_Runtimes
https://fedorahosted.org/fpc/ticket/327#comment:9

Metadata Update from @orion:
- Issue assigned to tibbs

2 years ago
