Hi folks,
I've been thinking about this for a while but was waiting to get some feedback from folks at Flock etc. before putting it up for discussion. At the moment, we (the neuro-sig) are at ~500 packages. Many of these we've inherited, others we've taken over because they're dependencies for ours, and so on. It's quite a large number, and it can get a bit tricky to keep up with them, especially with newer releases of Python etc., when a bunch of things break simultaneously.
A lot of our packages are Python libraries, but I'm seeing more and more users just rely on pip (or Anaconda etc.) to install their packages instead of using dnf-installed packages. I was therefore wondering if packaging up Python libraries from PyPI is really worth doing.
I managed to catch up with Karolina from the Python SIG here at Flock today (Miro couldn't make it, unfortunately), and I had a good chat with them about this particular issue. Karolina said that the Python SIG's priority tends to be to make sure that the core Python packages work---i.e., the interpreter, pip, things like setuptools. For libraries etc., it is really up to us packagers to see if our use cases require system packages, and I think it is unlikely that users will rely on system packages for their libraries, especially given that most upstream documentation will suggest using pip etc. directly. The other advantage of installing bits from pip is that one can install different versions of packages in isolated virtual environments. This cannot be done with dnf-installed system packages.
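For example, something like this keeps two versions of the same library side by side in separate venvs (package name and versions are purely illustrative):

```
python3 -m venv ~/venvs/analysis-old
~/venvs/analysis-old/bin/pip install 'nibabel==4.0.2'

python3 -m venv ~/venvs/analysis-new
~/venvs/analysis-new/bin/pip install 'nibabel==5.3.2'
```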
So, I was wondering if, in the future, we should consider only packaging Python packages that are not available on PyPI, and perhaps slowly dropping our packages that are already on PyPI. I.e., if something is pip installable, recommend that people use it directly from PyPI.
Pros:
Cons:
Things to keep in mind:
Questions:
What do you think? (I may also discuss this with the Fedora python community to solicit wider community feedback, but I wanted our team's views first).
PS: the Python SIG has attempted to automatically convert all of PyPI to RPMs in a COPR, but Karolina said that the quality of the packages there is not good enough to recommend their use.
As in Fedora at large, the main reason to package PyPI libraries in the neuro-sig should probably be to support packaged versions of Python-based tools and applications. If we want to maintain those, then we need their dependencies to be packaged too.
For example, python-fslpy ships a collection of command-line tools itself, plus it is a dependency for the FSLeyes application (python-fsleyes), for python-nifti-mrs (which ships its own collection of tools), and indirectly for the command-line tool spec2nii, so it has a pretty strong justification for spending time on it.
On the other hand, python-bioframe is a pure-Python library package that does not provide any command-line tools and is a leaf in Fedora, so – unless it’s a dependency for something someone is still working on packaging – it has the weakest justification.
There’s still a valid question for tools and applications: what kinds of tools and applications are most valuable for our users to have packaged directly in Fedora, versus installing them in a development virtualenv or using something like pipx or uvx/uv run?
Obviously, anyone can spend time packaging anything they want for any reason they want, but it does make sense to try to define and agree on the SIG’s goals, and to use them to prioritize the application of the limited shared pool of collaborative packaging effort.
Thanks for the quick reply, Ben.
I think our goal can be summarised as "make Fedora an excellent platform (if not the 'go-to platform') for neuroscience". It's suitably (intentionally) vague, and the implementation details are something we need to figure out. :D
I agree with your view that PyPI libraries should be included as dnf packages primarily to support packaged versions of tools/applications. I think that's a good guideline to follow. So, things like bioframe should perhaps not be packaged, since users can install them directly from PyPI.
Looking at the particular example of spec2nii, I just noticed that it is also available on PyPI, and one can install it (and its deps) using pip:
```
$ pip install spec2nii
...
Installing collected packages: pytz, tzdata, tqdm, six, pyyaml, pyparsing, pydicom, pillow, packaging, numpy, kiwisolver, fonttools, dill, cycler, scipy, python-dateutil, nibabel, h5py, contourpy, brukerapi, pandas, matplotlib, fslpy, pyMapVBVD, nifti-mrs, spec2nii
Successfully installed brukerapi-0.1.9 contourpy-1.3.2 cycler-0.12.1 dill-0.4.0 fonttools-4.58.1 fslpy-3.22.1 h5py-3.14.0 kiwisolver-1.4.8 matplotlib-3.10.3 nibabel-5.3.2 nifti-mrs-1.3.3 numpy-2.2.6 packaging-25.0 pandas-2.3.0 pillow-11.2.1 pyMapVBVD-0.6.1 pydicom-3.0.1 pyparsing-3.2.3 python-dateutil-2.9.0.post0 pytz-2025.2 pyyaml-6.0.2 scipy-1.15.3 six-1.17.0 spec2nii-0.8.6 tqdm-4.67.1 tzdata-2025.2

$ which spec2nii
~/.local/share/virtualenvs/spec2nii/bin/spec2nii

$ which fsl_abspath
~/.local/share/virtualenvs/spec2nii/bin/fsl_abspath
```
So, is there a case for keeping it installable via dnf, or is this another case where we should ask users to install it from PyPI?
I note that even NEURON can be installed using pip now---they managed to build wheels some time ago, and these do provide the various commands/libraries and interfaces (but they won't follow our compiler flags and guidelines, of course). Other simulation engines and non-Python tools like NEST/Arbor/STEPS are not pip installable, though, so they present a use case where it'll be beneficial for our users for us to make them available via dnf.
An edge case is fsleyes: fsleyes is also pip installable, but it requires gtk3-devel to build wxPython as part of the install process (wxPython doesn't provide wheels for Linux, from the looks of it):
```
Building wheel for wxpython (pyproject.toml) ... error
error: subprocess-exited-with-error
...
checking for GTK+ - version >= 3.0.0... no
```
So, we could say "we should package fsleyes", or we could say "we should clearly document how to install and run fsleyes on Fedora in docs". (or are there other solutions between these extremes?).
Looking at the particular example of spec2nii, I just noticed that it is also available on PyPI, and one can install it (and its deps) using pip: […] So, is there a case for keeping it installable via dnf, or is this another case where we should ask users to install it from PyPI?
[…]
Right, and to add complexity to that particular case, the original goal of packaging it was as a dependency for bidscoin, which now has most of its dependencies packaged but is still on the NeuroFedora wish list. But bidscoin itself is on PyPI, and it’s perfectly possible to use it directly from PyPI:
```
$ uvx bidscoin --help
[…]
usage: bidscoin [-h] [-l] [-p] [-i NAME [NAME ...]] [-u NAME [NAME ...]] [-d FOLDER] [-t [TEMPLATE]] [-b BIDSMAP] [-c OPTIONS [OPTIONS ...]] [-r] [--tracking {yes,no,show}] [-v]
[…]

$ uvx --from bidscoin bidsmapper --help
usage: bidsmapper [-h] [-b NAME] [-t NAME] [-p NAME [NAME ...]] [-n PREFIX] [-m PREFIX] [-u PATTERN] [-s] [-a] [-f] [--no-update] sourcefolder bidsfolder
[…]

$ pipx install bidscoin
[…]

$ bidsmapper --help
usage: bidsmapper [-h] [-b NAME] [-t NAME] [-p NAME [NAME ...]] [-n PREFIX] [-m PREFIX] [-u PATTERN] [-s] [-a] [-f] [--no-update] sourcefolder bidsfolder
[…]
```
There’s a lot of nuance here, and a lot of room for positions between minimalist (package nothing that is on PyPI unless absolutely forced to) and maximalist (package everything the light touches).
In general, I personally find some value in having command-line tools packaged so that people can use them without caring what language they are written in, and without having to use and understand language-specific tools and repositories. However, distribution packaging is certainly much more valuable for software that is difficult to install, with things like system-wide configuration files, daemons, and awkward dependencies, and for software that is not available directly from language-specific indexes like PyPI.
I guess the goal for us, for example, in this case is:
"Ensure that bidscoin/spec2nii/fsleyes/... can be used easily on Fedora (current Fedora releases)".
This can be achieved in multiple ways:
We know how to do the first one well, since that's how we do it now.
The second moves us more towards testing than maintaining. We know how to do it manually---install, run the command-line tools, perhaps run import checks for libraries (run unit tests?)---but I reckon it's possible to come up with a pipeline to automate this process, scale it to a large number of packages (at least Python ones to begin with), and generate a searchable list that can be published in our docs. (If it works, we could even do this for non-neuro-sig packages as a Fedora-wide thing?)
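Something along these lines is what I'm imagining for each automated check---very much a sketch, reusing bidscoin from the examples above; the package, module, and command names are just placeholders:

```bash
#!/bin/bash
# Sketch: install a package from PyPI into a throwaway venv, check that the
# module imports, and check that its CLI prints a usage message.
set -e

venv="$(mktemp -d)"
python3 -m venv "${venv}"

"${venv}/bin/pip" install --quiet bidscoin
"${venv}/bin/python" -c "import bidscoin"     # import check (module name assumed)
"${venv}/bin/bidsmapper" --help > /dev/null   # CLI smoke test, as in the transcript above

rm -rf "${venv}"
echo "bidscoin: OK"
```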
Here's an incomplete list of python packages that we maintain. It's incomplete because it's only checking for packages named python-*, so it'll miss "applications" like NEURON, but it should include all libraries:
https://pagure.io/neuro-sig/NeuroFedora/blob/main/f/package-list/python-list.txt
I wonder if a middle ground would work (leaning slightly more towards "don't package unless necessary"). Something like this:
if a package is directly installable from a forge, consider if packaging it is necessary (what is the use case for a user preferring/requiring a dnf installed version of this software?)
Whatever we come up with, they will not be rules. We'll leave it to individuals to decide where on the spectrum from "package everything" to "don't package anything that's on PyPI" they prefer to be for each package.
I think I'd personally lean more towards packaging less, and focussing on packages that do need to be dnf installable---primarily because I'm struggling with time, and I'm not necessarily seeing advantages of the "package everything" approach---at least not anecdotally.
I'd love to hear from more packagers here. Is it worth putting this on the python-sig list for feedback too?
I spoke to more folks here at the packaging workshop. Fabio pointed out that one important purpose of distribution packages is that they enable us maintainers to push updates to users in a timely manner (ideally/theoretically). When installing from forges directly, there doesn't seem to be a way for users to be notified when packages that they're using have new updates. (Users have to check themselves. For example, for Python, one needs to run pip list --outdated.)
This certainly applies to lower-level libraries such as ITK etc. (in addition to languages where everything is always statically compiled---Rust (which is where Fabio was coming from) and Go), where we will continue to provide system packages anyway. I'm uncertain whether it applies to high-level Python libraries: in the general case, users may not want these libraries to be automatically updated to begin with.
Bugs from the Python mass-rebuild have just been filed. I'll wait a few more days for feedback here before working on them.
@gui1ty : not seen you around the past few days, but would love to hear what you think, since you help with so many of the packages.
Hrm, I haven't seen @gui1ty around in the past couple of weeks. We can wait a few more days for more input.
Here's what I have in mind:
write scripts to:
orphan these packages
We'll hold on to packages that are not installable by pip.
I think one should be able to use tmt for this, and if these are found to be generally useful by the community, we can run these on the testing farm infrastructure too.
What do you think? I'll e-mail all sig members with these updates too, so everyone is aware and has a chance to chime in.
I don't think I have a strong opinion on the matter either way. As I'm not in the field of science, I have no use case myself for the software we provide as part of Neuro Fedora, nor do I know how others approach installing (Python) software in scientific environments. Let me provide some general feedback on the concerns raised.
Number of packages maintained by SIG
I think it makes sense to reduce the number of packages maintained by the SIG. With only three active members in the SIG currently, trying to stay on top of 500+ packages can prove challenging.
What to package
I agree that we should prioritize packages that are a direct dependency of some other package which cannot be directly installed from an upstream index. Packages providing common command-line tools might also be worth considering.
As to the mix of packages we currently have, I believe a good chunk are test-only dependencies for other packages.
Having a good guideline of what we want to package will be helpful.
Instructions for local installation
Maybe we should decide on a specific tool (pipx, uv) that we recommend and write instructions for when it comes to installing packages / libraries from upstream repos locally. Much like Python packaging tools, this will likely boil down to personal preferences. However, I think supporting (and understanding) one tool well is better than supporting many tools half-heartedly.
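For instance, borrowing bidscoin from the examples earlier in the thread, the two styles look roughly like this (just a sketch; I haven't verified anything beyond the commands already shown above):

```
# one-off runs without a permanent install
uvx bidscoin --help
pipx run bidscoin --help

# persistent installs into isolated environments, with commands on PATH
uv tool install bidscoin
pipx install bidscoin
```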
Testing and updating
I'm not sure how I feel about manually testing packages. As part of packaging Python software for Fedora it certainly made sense to put some effort into running tests and figuring out and reporting on issues with tests. It is our responsibility as packagers to make sure the RPM packages we ship are usable and compatible with other RPM packages. When it comes to local installation my first instinct is that the onus is with the user.
A good compromise might be providing information and recommendations in the documentation regarding running tests. From my experience the degree of difficulty may vary greatly depending on upstream documentation as well as packaging quality. I'm thinking of a clear distinction between runtime and test dependencies as well as linters.
Feedback
In addition to gathering feedback from the wider Python packaging community in Fedora, it may also be worth reaching out to other distros. For example, Debian has a neuroscience SIG as well. I would be interested to learn what their approach/motivation is as to what to package and what not.
Hi @gui1ty , thanks for your comment.
I agree with your notes. How does this sound for a general SIG packaging guideline:
Apart from this, I think we can leave individual packages up to individual maintainers. For example, if one uses a tool regularly and thinks there's value in having it as a system package, one can take it on. This ensures that we follow the general community guideline of packaging + maintaining what we use---which will also mean that we test out our packages properly and have a vested interest in maintaining them. If people think command line tools are worth packaging, by all means, they can maintain them---and the team will collectively help out as we do now.
I would like to test if packages installed from indexes also work on Fedora, but not manually. Only if I can figure out if this can be automated in some way will this be doable. Perhaps this can be a "nice to have" that we can think about in phase 2, while phase 1 can be:
I'm personally not bothered if people use uv or pip---they both pick from upstream indexes and should do the same thing at the end based on the python packaging guidelines. The difference, from what I see, is the interface/features and speed/performance. We should be able to test/document both if we want, or we can test with pip as the "default" and let users do uv on their own. (I know @music maintains uv and the python and rust sigs are involved too, so it's in good hands :P)
How does that sound?
I can post on python-devel to announce/gather feedback on our intentions too, but at the end of the day, it's our decision, since we're going to have to do the work :)
NeuroDebian: my understanding has always been that neurodebian is more neuro-imaging focussed. For example, they don't seem to have any packages related to computational modelling at all. Their package list is also relatively small compared to ours:
http://neuro.debian.net/pkglists/toc_all_pkgs.html#toc-all-pkgs
Post to python-devel: https://lists.fedoraproject.org/archives/list/python-devel@lists.fedoraproject.org/message/L6M4ISCAGCLGPWIGN3PLDG7YUIFUIYC6/
I think the big picture is clear and I'm okay with your proposal. Having had some time to ponder, I do have some practical questions. I'll jot 'em down for discussion later.
Regarding NeuroDebian: I knew it existed and I had looked into one or two of their packages at some stage for inspiration, patches or the like. I wasn't aware they are focusing on neuroimaging. I suppose their approach applies to us as well when it comes to the packages not directly installable from upstream indices: that is, the trivial Python packages are required dependencies for some other non-trivial package.
Sounds good. I'll go make a list of our Python packages so that we can see which ones can be dropped in favour of direct installation from PyPI.
Based on this script (I think it should catch most cases?)
```bash
#!/bin/bash
# Copyright 2025 Ankur Sinha
# Author: Ankur Sinha <sanjay DOT ankur AT gmail DOT com>
# File : get-python-packages.sh
#

# empty out files
echo > python-devel-reqs.txt
echo > python-devel-reqs-leaves.txt

while read package; do
    echo "** Checking ${package}"
    # dnf repoquery is incomplete because sometimes the subpackage requires python3-devel..
    re="python"
    if [[ ${package} =~ $re ]] || dnf repoquery --requires "${package}" --srpm --quiet | grep "python3-devel"
    then
        echo "Requires python"
        echo "${package}" >> python-devel-reqs.txt

        echo "Checking for leaf"
        DEPS=( $(fedrq whatrequires-src -X "${package}") )
        if [ 0 == ${#DEPS[@]} ]
        then
            echo "Yep, is leaf!"
            echo "${package}" >> python-devel-reqs-leaves.txt
        fi
    else
        echo "Does not require python"
    fi
done < list.txt
```
We get the following list of python packages that are leaf packages:
COPASI cffconvert dlib dolfin getdp moose openmeeg pydeps pyplane python-PyLEMS python-PyLink python-SALib python-airspeed python-amply python-annarchy python-autograd python-bioframe python-bioread python-bluepyopt python-chaospy python-cro python-cyipopt python-dandischema python-datrie python-devicely python-editdistance python-elephant python-ephyviewer python-exdir python-fsleyes python-git-changelog python-glfw python-glymur python-gradunwarp python-grip python-hdf5storage python-hdfs python-imbalanced-learn python-intern python-irodsclient python-klusta python-lazy-ops python-lqrt python-matplotlib-venn python-maya python-missingno python-mne-bids python-moss python-multiecho python-neatdend python-netpyne python-neurodsp python-neurom python-neurotune python-niaclass python-nipype python-nixio python-odml python-openctm python-outdated python-palettable python-pingouin python-plotnine python-probeinterface python-pyABF python-pyactivetwo python-pycatch22 python-pydapsys python-pydotplus python-pyfim python-pygiftiio python-pylatex python-pynetdicom python-pynn python-pynsgr python-pyopengltk python-pyphi python-pyriemann python-pysb python-pysdl2 python-pyspike python-pyswarms python-pytest-lazy-fixture python-pyunicorn python-pyvhacd python-pyxdf python-pyxid python-ratelimiter python-ratinabox python-read-roi python-resumable-urlretrieve python-scipy-doctest python-sciunit python-simframe python-sklearn-genetic python-sklearn-genetic-opt python-sklearn-nature-inspired-algorithms python-snakemake-executor-plugin-azure-batch python-snakemake-executor-plugin-flux python-snakemake-executor-plugin-kubernetes python-snakemake-executor-plugin-slurm python-snakemake-executor-plugin-tes python-snakemake-storage-plugin-azure python-snakemake-storage-plugin-ftp python-snakemake-storage-plugin-gcs python-snakemake-storage-plugin-webdav python-snakemake-storage-plugin-xrootd python-snakemake-storage-plugin-zenodo python-spyking-circus python-steps python-stopit python-toposort python-tvb-data python-tvb-gdist python-vascpy shybrid smoldyn spec2nii
I attempted a categorization or initial triage of the above list.
The following are not really Python packages, but (mostly C++) packages that offer Python bindings. They should not be considered under this proposed realignment.
The following are no longer leaf packages in Rawhide and should be retained:
The following are desktop/GUI tools, which may argue in favor of retaining them:
circus-artefacts
circus-folders
circus-gui-matlab
circus-gui-python
circus-multi
spyking-circus
spyking-circus-launcher
spyking-circus-subtask
The following are Python command-line tools that can be used effectively from PyPI via something like uvx, pipx, or manual pip/uv installation into a temporary virtualenv. They might be candidates for dropping/orphaning under this realignment, depending on their primary maintainers’ wishes. If there are packages where we don’t want to dedicate NeuroFedora resources but the primary maintainer still wants the package, we can ask for the neuro-sig group to be removed from the package.
The following are Python libraries that also offer command-line tools, and the command-line tools appear to be usable from PyPI via something like uvx, pipx, or manual pip/uv installation into a temporary virtualenv. They might be candidates for dropping/orphaning under this realignment, depending on their primary maintainers’ wishes. If there are packages where we don’t want to dedicate NeuroFedora resources but the primary maintainer still wants the package, we can ask for the neuro-sig group to be removed from the package. I didn’t necessarily check whether enabling extras works well (e.g. uvx --from mne-bids[full] mne_bids or pipx run mne-bids[full]), and I didn’t try anything more than printing the usage message. The details of these should probably be considered on a case-by-case basis.
pylems
salib
acq2hdf5
acq2mat
acq2txt
acq_info
acq_layout
acq_markers
git-changelog
gradient_unwarp
gradient_unwarp.py
grip
mne_bids
mecombine
nipypecli
echoscp
echoscu
findscu
getscu
movescu
qrscp
storescp
storescu
nsgr_job
nsgr_submit
pysb_export
sciunit
vascpy
The following are Python libraries that also offer command-line tools, but there may be some difficulties in using them from PyPI via something like uvx, pipx, or manual pip/uv installation into a temporary virtualenv. At least on Python 3.13, the tools might not work as expected or might fail due to obsolete imports, or there might be PyPI dependencies that do not have binary wheels and have to be compiled at install time. These probably merit closer consideration. In some cases the difficulties may indicate that these should be kept, because they are useful but difficult to install and run, and in other cases the difficulties may suggest that it might be time to drop a package because it’s not adequately maintained upstream.
bpopt_tasksdb
jp2dump
jpeg2jp2
tiff2jp2
hdfscli
hdfscli-avro
klusta
check_mni_reg
recon_status
recon_movie
recon_process_stats
ts_movie
warp_qc
compilechannels
nixio
python3-pyxdf-examples
python3 -m pyxdf.cli.print_metadata
The following is a Python library that also includes a Jupyter Notebook extension that is installed system-wide.
The following are Python libraries that don’t offer command-line tools and are leaf packages (verified in F42 to make sure we don’t miss dependencies broken in the Python 3.14 transition). If they are not part of a packaging project we are still working on, they are particularly strong candidates for dropping under this realignment.
The following packages were already at least orphaned, and in most cases retired, for F43 or earlier.
The following packages are Snakemake plugins, or were packaged to support a Snakemake plugin. They should be retained, assuming we want to keep Snakemake.
python-snakemake-storage-plugin-irods
Based on that analysis, the packages that I personally should consider dropping in the short term are python-cyipopt, python-intern, python-probeinterface, python-pycatch22, python-ratinabox, and python-pyxdf.
I’ll probably keep these for now:
python-glymur
python-multiecho
python-bioread
shybrid
Thanks for the overview, @music. I do have a question regarding the Python libraries list: What's your definition of pure Python?
I thought pure Python referred to packages only using libraries that are shipped with Python itself. There are packages on the list depending on third party libraries/modules. Hence my question.
Some other questions I wanted to ask:
How are we going to approach venvs? Will it be one venv per package? Or will it be one big venv for all things neuroscience? I suppose the latter is not feasible, because of version conflicts in dependencies, since we are no longer (or less) in control of upstream version pinning. Having one venv per package will undoubtedly increase duplication of packages as the number of installed packages increases. I'm thinking of very common large(r) dependencies like NumPy, SciPy, Pandas, Matplotlib, etc.
What Python version to use for testing? Currently the Fedora release dictates what Python version all Python modules are built against. Thanks to the work of the Python SIG and Python package maintainers, this closely follows upstream's release cycle. However, upstream maintainers are not always interested in the most recent Python releases and are sometimes slow to adapt to changes introduced in recent releases. Are we going to follow the bleeding edge here or will we stick to the oldest supported version? By that I mean the default Python version in the oldest still-supported Fedora release. Currently that would be Python 3.13. Or do we follow the default that is shipped with a particular Fedora release? That would mean we'd have to test on multiple versions.
Earlier today I looked into an issue with the latest release of mizani. I seized the opportunity to install it locally in a Python 3.14 venv. Simply running pip install . didn't work, since mizani depends on SciPy and there appears to be no binary package (yet?). Thus I couldn't use --prefer-binary. Installing SciPy as a dependency of mizani failed due to OpenBLAS not being found. In Fedora, SciPy is built with FlexiBLAS. To make this work, flexiblas-devel needs to be installed, two options need to be passed on to the build backend (-Csetup-args=-Dblas=flexiblas -Csetup-args=-Dlapack=flexiblas), and SciPy needs to be installed separately in the venv.
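For reference, the steps I ended up with look roughly like this (the venv path is just an example, and other build dependencies such as compilers may also be needed; I just happened to have those installed):

```
# FlexiBLAS headers, so SciPy builds the way Fedora's SciPy does
sudo dnf install flexiblas-devel

python3.14 -m venv ~/venvs/mizani
source ~/venvs/mizani/bin/activate

# build SciPy from source, pointing its build backend at FlexiBLAS
pip install -Csetup-args=-Dblas=flexiblas -Csetup-args=-Dlapack=flexiblas scipy

# then install mizani itself from the local checkout
pip install .
```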
Would this be part of documenting how to install mizani? Or would this provide justification to keep mizani as a system-installable RPM package? In the former case, how do we ensure that we catch all required dependencies? I tripped over OpenBLAS/FlexiBLAS, but there may be other dependencies which I just happened to have installed already.
Thanks for the overview, @music. I do have a question regarding the Python libraries list: What's your definition of pure Python? I thought pure Python referred to packages only using libraries that are shipped with Python itself. There are packages on the list depending on third party libraries/modules. Hence my question.
By pure Python, I mean (and I understand this to be the usual meaning) that a package is written solely in Python, without including anything in other languages (C, C++, Cython), particularly those that would be compiled into a shared-library extension module. I would tend to exclude those that wrap shared libraries with ctypes due to the tight coupling with a dependency outside the Python ecosystem, but this is a marginal case. Pure Python packages may certainly have dependencies on other packages that do contain compiled code, like pandas or numpy.
Some other questions I wanted to ask: How are we going to approach venvs? Will it be one venv per package? Or will it be one big venv for all things neuroscience?
We can’t and won’t ship venvs. My assumption is that we would test individual tools in their own venvs. If we wanted to make everything work together in one environment, then we might as well just package everything. At least then we would actually control it.
I suppose the latter is not feasible, because of version conflicts in dependencies since we are no longer or less in control of upstream version pinning. Having one venv per package undoubtedly will increase duplication of packages as the number of installed packages increases. I'm thinking of very common large(r) dependencies like NumPy, SciPy, Pandas, Matplotlib, etc.
The idea as I understand it is not that we ship venvs with bundled dependencies (which is problematic in many ways from a guidelines perspective, plus it runs into the technical barrier that virtualenvs are not relocatable), but that we would stop packaging some things and just tell people to install some things directly from PyPI themselves, perhaps using one of the tools like pipx or uv run/uvx that can help manage tools installed this way.
What Python version to use for testing? Currently the Fedora release dictates what Python version all Python modules are built against. Thanks to the work of the Python SIG and Python package maintainers, this closely follows upstream's release cycle. However, upstream maintainers are not always interested in the most recent Python releases and are sometimes slow to adapt to changes introduced in recent releases. Are we going to follow the bleeding edge here or will we stick to the oldest supported version? By that I mean the default Python version in the oldest still-supported Fedora release. Currently that would be Python 3.13. Or do we follow the default that is shipped with a particular Fedora release? That would mean we'd have to test on multiple versions.
Good question!
Thanks for that @music. I've started a shared doc here now, since that'll be a little bit easier than comments on pagure:
https://hackmd.io/@sanjayankur31/HJ8V-_-Qee
Please edit it as required
Venvs: yes, we won't ship these---I didn't think we could. The idea is to test that packages can be installed on Fedora installations, ideally in virtual envs. (I know some folks don't use venvs and just install everything into user directories, but as we know this is bad practice, we will not recommend it.)
What Python version(s) to use for testing is a good question indeed. The default answer could be "whatever versions are supported in Fedora", as we do follow upstream Python quite closely---the only caveat being that we include the newest version of Python before it's released:
https://devguide.python.org/versions/
From what I see, most researchers aren't really too quick to jump to the latest python because it generally takes the various packages some time to catch up. If we want to continue testing with the latest python to help/inform upstreams, we can do so, but we can assume that most of our users will not be on the latest Python (rather, they'd be on a "stable" python).
How about perhaps (and this can be tweaked as we go): "the default Python version in Fedora rawhide and the previous 4 releases"? For example, we're at Python 3.14 in Fedora (even though it hasn't been released), so we'd test on 3.10--3.14? (Or would that be too much?)
@music: did you write any scripts etc. for your thorough analysis, or was it manually for the moment?
Action items for me:
It was all manual skimming of spec files and in some cases pyproject.toml or other source files, and manual testing of command-line tools with uvx. Much of the initial categorization could have perhaps been scripted, but there were enough different things I needed to consider that I really needed to actually look at the packages this time around.
I didn't mean to suggest that we are going to ship venvs. I am probably overthinking this. But in the absence of any guidelines, questions popped up in my head when I was going to set up a new venv for testing some stuff locally. I agree, trying to force everything into one venv is gonna give us headaches of the kind we experienced when packaging for RPM. Though, in some cases, closely related software may benefit from sharing a venv. Let's cross that bridge when we get there.
With regards to the Python version, if we want to allow for mixing system packages with locally installed packages, we'll have to stick with the default Python version of any particular Fedora release. @music already alluded to that on Matrix with regards to python-steps.
My question regarding mizani remains unanswered, though.
I'll go through my packages and check on what qualifies for retirement. Some leaf packages of mine may have been in preparation for other packages. I'll have to jog my memory on that.
I've made a start on using tmt here:
https://pagure.io/neuro-sig/NeuroFedora/blob/feat-python-package-checks/f/python-package-usage-check
You should be able to test this out by installing tmt:
sudo dnf install tmt tmt+provision-container
and then running tmt run in the python-package-usage-check folder. There are tests for neuron and pyneuroml for the moment.
tmt is a little different from things like GitHub Actions. Here, one can't really define a matrix of environments to run all the tests in (or I haven't been able to figure out how to do it). It's more hierarchical and file/folder based, so a different folder is used for each Python version etc.
I also haven't figured out whether tests run (or can run) in parallel---there's nothing in the docs about it. I've been asking in the Matrix channel, but perhaps I'll also ask on the GitHub discussions to make sure I'm doing this the right way.
I'm also going to see how this can be done using python/pytest with something like this virtualenv plugin for pytest in combination with parameterization:
https://pypi.org/project/pytest-virtualenv/ https://docs.pytest.org/en/7.1.x/example/parametrize.html
We know from experience that pytest does parameterisation + parallelisation very well.
Re: venvs:
Yeah, it is unlikely that users will install all the packages into one giant venv. In most cases, people will only use a few packages---one for modelling, another for data analysis, say. Since upstream developers use similar pipelines, they do check that there are no conflicts between packages that are commonly used together. We do this for the NeuroML stack, for example:
https://github.com/NeuroML/pyNeuroML/actions/runs/15591468055/job/43911369538
Re: steps
I see that they use scikit_build as the build backend, so one should be able to run pip install . in a virtual environment, right? I.e., it should still be installable in a virtual environment? If that's the case, our job becomes to document the non-Python, system-wide packages required to build STEPS in that way (cmake etc.?). These non-Python shared libraries/tools are still available in virtual environments, and presumably the Python deps will be pulled in from PyPI?
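I.e., something roughly like this (the exact list of system build dependencies is a guess on my part):

```
# non-Python build requirements come from the system (list is a guess)
sudo dnf install cmake gcc-c++ python3-devel

python3 -m venv ~/venvs/steps
source ~/venvs/steps/bin/activate

# from a checkout of the STEPS sources; scikit_build drives CMake,
# and the Python deps get pulled in from PyPI
pip install .
```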
Have I got that right?
Re: mizani
For me, the fact that it does not install cleanly on py3.14 is not, on its own, a good enough reason to keep it as a system package---because I can't see any of our users being on py3.14 yet, even if it is the default for Fedora rawhide. If other reasons apply---it provides a CLI or a GUI---then it's worth keeping (but we leave that decision up to the primary maintainer).
I would be happy enough to have a note in our table in the docs, under the py3.14 column for mizani, that says "cannot currently be installed from PyPI as SciPy is not available as a wheel", and to report this upstream so that they are aware. The assumption is that SciPy will make a release for 3.14 once 3.14 is released: https://docs.scipy.org/doc/scipy/dev/toolchain.html#python-versions
This will serve the purpose of making users and upstream aware that this currently does not install on the latest Python version, without requiring us to jump through the various hoops of maintaining it as a system package.
What do you think?
Here is a pytest based checker:
https://pagure.io/neuro-sig/NeuroFedora/pull-request/581
Much simpler than the tmt bits. If you create a venv on a Fedora machine (for whatever Python version you want to use) and then install the requirements, running pytest -n auto -v should check the example packages in the JSON file.
This can be tweaked to improve the workflow, but this prototype already works. (We should probably have separate files for each package perhaps, so that one file doesn't get super long etc.)
Initial PR for docs (more tweaks needed there to the text, but this starts by putting a searchable table in)
https://pagure.io/neuro-sig/documentation/pull-request/26
I've only used pip install ... for the moment, but we can modify this to whatever we wish