RHEL 7 EOL is approaching (June 2024) and we still have some servers running RHEL 7.
sundries01.iad2.fedoraproject.org, sundries01.stg.iad2.fedoraproject.org, and sundries02.iad2.fedoraproject.org are among these servers, and we need to do something about them.
Metadata Update from @kevin: - Issue assigned to kevin
I plan to look at these very soon. (Help welcome!)
We need to see all the things they are still doing and decide if we should just move them to rhel9, or want to try and work things so they can just be dropped.
I know they are used to sync a lot of things to proxies. They are also used I think for geoip requests.
So, I think a good step here might be to create a sundries02.stg that is rhel9 and work through deploying it and see what breaks/needs adjustment.
We can then look at it to test whether things work.
Then we can redeploy sundries01.stg as rhel9 and make sure all the things that use that main node work.
Then we can do production.
Metadata Update from @zlopez: - Issue assigned to zlopez (was: kevin)
I will start working on that.
Running the playbook on sundries02.stg, it failed on the zanata role (missing package). Do we still need zanata? I thought it had been dead for some time.
I also noticed that the geoip-city-wsgi/app role is limited to RHEL7. Not sure if we need it.
Yeah, I don't think we use zanata anymore for anything. We use weblate now...
On geoip, I thought it was still used by anaconda to figure out timezone/locations, but that could be long since no longer the case. We will need to investigate.
Thanks @kevin I will ask anaconda about that.
I got confirmation that they are still using it, so let's see if the role will work on RHEL 9.
So for the geoip-city-wsgi/app role, the only package missing from EPEL9 is python-iso3166. I created an EPEL9 request and will wait for the maintainer to respond.
Just to test it out I tried to build the rawhide spec in EPEL9 and it works without issue.
Currently waiting for EPEL9 package update to be pushed to stable.
After python-iso3166 was pushed to EPEL9, the geoip-city-wsgi/app role deployed without issue.
However, I got stuck on the translate-toolkit package for the fedora-docs role. I tried to build it in COPR, but I ended up with tests failing on python-scikit-build. The build log can be found here
I don't currently have much time to work on this, so if anybody wants to take over, feel free to do so.
The logs seem to be missing now... perhaps the builds were too long ago? Can you refire it?
@misc you added this package with a commit of "for pocount". Is this really needed? Or could we just move forward without it for now and circle back to get it later?
@kevin Strange, the log is still in place for me. Here is the direct link.
So I had some time today to continue with the copr project and I was able to get over python-scikit-build by skipping the fortran compile test.
Now I'm stuck on python-rapidfuzz, which complains about:
error: Empty %files file /builddir/build/BUILD/rapidfuzz-3.5.2/debugsourcefiles.list
Not sure why the file is empty.
Looking at the build machine, I can see that debugfiles.list is filled in, but debugsourcefiles.list is really empty :-/
Very likely it's not building with -g (debug symbols). I see it also failing to extract debuginfo...
@kevin After some digging I found out that the CXXFLAGS were missing completely. After adding them I was able to build it.
I was able to build translate-toolkit in COPR for EPEL 9. Here are all the packages that need to be rebuilt in the epel9-infra tag to get it running. I needed to fork some of them and make small changes to get them working on EPEL 9.
I will start building them in the epel9-infra tag next week, as I will be at DevConf.CZ for the rest of this week. Hopefully this will be the last obstacle to deploying the sundries playbook on RHEL9.
After starting to build packages in epel9-infra, I got stuck on python-rapidfuzz because it has the build requirement python3-pandas or python3(x86-32). This was satisfied on COPR, which includes python3.i686 from the CodeReady Builder repository, but in Fedora koji it fails. I tried to build python3-pandas, but it has about 20 missing dependencies on EPEL9.
I also tried to just comment out the requirement, but that introduces plenty of failed tests (see https://kojipkgs.fedoraproject.org/work/tasks/473/119330473/build.log).
So I'm not sure whether we can just drop the translate-toolkit dependency from the playbook, or whether I should try to build python3-pandas in COPR and then rebuild it in the epel9-infra tag.
So, I think that requires was added when python-pandas dropped i686. According to the comment, I guess it was to keep python-rapidfuzz available on i686?
@churchyard can you recall what's going on there?
This has turned into quite the rabbit hole. ;(
python3-pandas or python3(x86-32) is actually a "clever" way to get pandas on anything but i686 but end up with the dependency available in the repo for repoquery. But it has downsides -- when multilib repo is enabled (e.g. in local mock or Copr), it fetches 32bit Python.
Anyway, pandas is optional. Drop the line if you don't have it.
The failures are caused by this relevant line in the build.log:
==========================================================================
WARNING: The C extension could not be compiled, speedups are not enabled.
Plain-Python build succeeded.
==========================================================================
Which itself is caused by:
CMake Error at /usr/share/cmake/Modules/CMakeTestCXXCompiler.cmake:60 (message):
The C++ compiler "/usr/bin/c++" is not able to compile a simple test program.
It fails with the following output:
Change Dir: /builddir/build/BUILD/rapidfuzz-3.5.2/.pyproject-builddir/pip-req-build-_szb_tpu/_cmake_test_compile/build/CMakeFiles/CMakeScratch/TryCompile-UyMMRP
Run Build Command(s):/usr/bin/cmake -E env VERBOSE=1 /usr/bin/gmake -f Makefile cmTC_11f0b/fast && /usr/bin/gmake -f CMakeFiles/cmTC_11f0b.dir/build.make CMakeFiles/cmTC_11f0b.dir/build
gmake[1]: Entering directory '/builddir/build/BUILD/rapidfuzz-3.5.2/.pyproject-builddir/pip-req-build-_szb_tpu/_cmake_test_compile/build/CMakeFiles/CMakeScratch/TryCompile-UyMMRP'
Building CXX object CMakeFiles/cmTC_11f0b.dir/testCXXCompiler.cxx.o
/usr/bin/c++ -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64-v2 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -o CMakeFiles/cmTC_11f0b.dir/testCXXCompiler.cxx.o -c /builddir/build/BUILD/rapidfuzz-3.5.2/.pyproject-builddir/pip-req-build-_szb_tpu/_cmake_test_compile/build/CMakeFiles/CMakeScratch/TryCompile-UyMMRP/testCXXCompiler.cxx
Assembler messages:
Error: unknown architecture `x86-64-v2'
Error: unrecognized option -march=x86-64-v2
cc1plus: error: unrecognized command-line option ‘-m64’
cc1plus: error: unknown value ‘x86-64-v2’ for ‘-march’
cc1plus: note: valid arguments are: armv8-a armv8.1-a armv8.2-a armv8.3-a armv8.4-a armv8.5-a armv8.6-a armv8-r native
cc1plus: error: ‘-fcf-protection=full’ is not supported for this target
gmake[1]: *** [CMakeFiles/cmTC_11f0b.dir/build.make:78: CMakeFiles/cmTC_11f0b.dir/testCXXCompiler.cxx.o] Error 1
gmake[1]: Leaving directory '/builddir/build/BUILD/rapidfuzz-3.5.2/.pyproject-builddir/pip-req-build-_szb_tpu/_cmake_test_compile/build/CMakeFiles/CMakeScratch/TryCompile-UyMMRP'
gmake: *** [Makefile:127: cmTC_11f0b/fast] Error 2
Which itself is probably caused by passing x86-specific options to CFLAGS on a noarch package built on aarch64.
Which is caused by having this in the spec:
%build
# Set the CXXFLAGS, those are not set on EPEL9
CXXFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64-v2 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection'
export CXXFLAGS
%pyproject_wheel
Naughty, naughty...
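To make the diagnosis above concrete, here is a small shell sketch that lists the options in a flags string matching the ones the aarch64 compiler rejected in the build log (-m64, -march=x86-64-v2, -fcf-protection). The function name x86_only_flags is hypothetical and not part of any packaging tool, and the pattern list covers only the options seen in this log, not every x86-specific flag.

```shell
#!/bin/sh
# Hypothetical helper (an illustration only): print each option in a flags
# string that matches one of the options rejected on aarch64 in the log above.
x86_only_flags() {
    # $1 is left unquoted on purpose so the shell splits it into words.
    for flag in $1; do
        case "$flag" in
            -m64 | -march=x86-64* | -fcf-protection*)
                printf '%s\n' "$flag"
                ;;
        esac
    done
}

# Example: an abbreviated version of the hardcoded CXXFLAGS from the spec.
x86_only_flags '-O2 -g -m64 -march=x86-64-v2 -mtune=generic -fcf-protection'
```

Any output at all means the flags string cannot be used unchanged on a non-x86 builder.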
Use a macro instead. %set_build_flags should be available in RHEL 9.
BTW the fact that %pyproject_wheel sets CFLAGS only and not CXXFLAGS is probably a bug, it only affects RHEL < 10 because Fedora has https://fedoraproject.org/wiki/Changes/SetBuildFlagsBuildCheck
The macro sets CFLAGS and LDFLAGS in a way that was copied from %py3_build which does it in a way that was copied from actual specfiles in the 2000s. If you open an RFE to pass CXXFLAGS as well, it will likely be fixed.
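As a rough sketch of the suggestion above, the %build section could call the macro instead of hardcoding the flags. This assumes, as stated above, that %set_build_flags is available on RHEL 9; the rest of the spec is omitted and this is not the actual python-rapidfuzz spec.

```spec
%build
# Let the distribution macros export the build flags (including CXXFLAGS)
# for whatever architecture the builder runs on, instead of hardcoding
# x86_64-only options in the spec.
%set_build_flags
%pyproject_wheel
```

Because the macro takes its values from the build host's own configuration, the same spec should then work on both x86_64 and aarch64 builders.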
@churchyard Thanks for the help. I saw the issue with CXXFLAGS on COPR, so I assumed it was because of the EPEL9 build, but it could be COPR specific. Could you point me to where I should open an RFE for that?
I will use %set_build_flags macro for now.
I think I found the naughty hack somewhere in packager docs, but I'm not sure and can't find it now.
You can either open a Fedora bugzilla for pyproject-rpm-macros saying that CXXFLAGS is not set when _auto_set_build_flags is disabled (which is the non-default option in Fedora, but it is still possible) or open a RHEL 9 Jira saying CXXFLAGS is not set by default. Either way, this will be fixed in Fedora first, RHEL 9 later.
The bug is now filed: https://bugzilla.redhat.com/show_bug.cgi?id=2293616
I built translate-toolkit in the epel9-infra tag, but I'm unable to install it on the sundries02.stg server.
When trying to do so, I get:
[root@sundries02 ~][STG]$ dnf install translate-toolkit
Last metadata expiration check: 7:24:40 ago on Fri 21 Jun 2024 06:52:16 AM UTC.
No match for argument: translate-toolkit
Error: Unable to find a match: translate-toolkit
But when trying to search for it, it's available:
[zlopez@sundries02 ~][STG]$ dnf search translate-toolkit
Last metadata expiration check: 0:07:21 ago on Fri 21 Jun 2024 02:08:01 PM UTC.
================================================================== Name Exactly Matched: translate-toolkit ==================================================================
translate-toolkit.noarch : Tools to assist with translation and software localization
================================================================= Name & Summary Matched: translate-toolkit =================================================================
translate-toolkit-docs.noarch : Documentation for translate-toolkit
I'm kind of puzzled what is happening here. Will look into that later.
Yeah, user and root caches for dnf are different. Your user one has it, but the root one doesn't.
I usually use --refresh with installs like this to force it to redownload metadata.
[root@sundries02 ~][STG]# dnf --refresh install translate-toolkit
Extras Packages for Enterprise Linux 9 - x86_64    87 kB/s | 4.3 kB  00:00
Fedora Infrastructure tag 9 - x86_64               49 kB/s | 3.0 kB  00:00
Fedora Infrastructure tag 9 - x86_64               53 kB/s | 3.0 kB  00:00
Fedora Infrastructure tag 9 - x86_64              234 kB/s |  41 kB  00:00
rhel9 baseos dvd                                   57 kB/s | 2.7 kB  00:00
rhel9 AppStream dvd                                60 kB/s | 2.8 kB  00:00
rhel9 BaseOS x86_64                                89 kB/s | 4.1 kB  00:00
rhel9 AppStream x86_64                             98 kB/s | 4.5 kB  00:00
rhel9 CodeReadyBuilder x86_64                      99 kB/s | 4.5 kB  00:00
Dependencies resolved.
==============================================================================================
 Package                      Arch     Version               Repository                Size
==============================================================================================
Installing:
 translate-toolkit            noarch   3.12.2-2.el9          infrastructure-tags-stg   954 k
Installing dependencies:
 enchant                      x86_64   1:1.6.0-30.el9        rhel9-dvd-AppStream        66 k
 hunspell                     x86_64   1.7.0-11.el9          rhel9-dvd-AppStream       329 k
 hunspell-en-US               noarch   0.20140811.1-20.el9   rhel9-dvd-AppStream       178 k
 hunspell-filesystem          x86_64   1.7.0-11.el9          rhel9-dvd-AppStream       9.0 k
 iso-codes                    noarch   4.16.0-3.el9          infrastructure-tags-stg   3.4 M
 python3-Levenshtein          x86_64   0.23.0-5.el9          infrastructure-tags-stg   140 k
 python3-aeidon               noarch   1.15-1.el9            epel                      187 k
 python3-charset-normalizer   noarch   2.0.10-1.el9          epel                       72 k
 python3-cheroot              noarch   8.6.0-4.el9           epel                      172 k
 python3-enchant              noarch   3.2.0-5.el9           rhel9-dvd-AppStream        90 k
 python3-jaraco               noarch   8.2.1-3.el9           epel                       11 k
 python3-jaraco-functools     noarch   3.5.0-2.el9           epel                       19 k
 python3-more-itertools       noarch   8.12.0-2.el9          epel                       79 k
 python3-phply                noarch   1.2.5-9.el9           infrastructure-tags-stg   115 k
 python3-pycountry            noarch   23.12.7-2.el9         infrastructure-tags-stg    35 k
 python3-rapidfuzz            x86_64   3.5.2-9.el9           infrastructure-tags-stg   2.1 M
 python3-ruamel-yaml          x86_64   0.16.6-7.el9.1        rhel9-AppStream           213 k
 python3-ruamel-yaml-clib     x86_64   0.2.7-3.el9           rhel9-AppStream           148 k
 python3-simplejson           x86_64   3.17.6-1.el9          epel                      264 k
 python3-toml                 noarch   0.10.2-6.el9          rhel9-dvd-AppStream        46 k
 python3-vobject              noarch   0.9.6.1-5.el9         epel                       83 k
 xml-common                   noarch   0.6.3-58.el9          rhel9-dvd-AppStream        36 k
Installing weak dependencies:
 hunspell-en                  noarch   0.20140811.1-20.el9   rhel9-dvd-AppStream       191 k
 hunspell-en-GB               noarch   0.20140811.1-20.el9   rhel9-dvd-AppStream       226 k

Transaction Summary
==============================================================================================
Install  25 Packages

Total download size: 9.1 M
Installed size: 42 M
Is this ok [y/N]: n
So it's OK that the playbook fails on the first run?
After that, the next error happened in the [nfs/client] role:
"Error mounting /srv/docs: mount.nfs: access denied by server while mounting ntap-iad2-c02-fedora01-nfs01a:/openshift_stg_docs\n"
@kevin Does this need some special permissions to be mounted?
Yes, it needs to be in the netapp export-policy for that volume.
I added it and ran the playbook and now all the nfs mounts are there.
@kevin That is awesome, that means that I can finally get to testing on sundries instead of just figuring out the issues during deployments :-)
The last thing that was missing was the reg package for the reg-server playbook. Now I was able to run the whole playbook without error. The next thing will be to check the logs for any errors.
I have now reinstalled all of them with rhel9.
We will see what, if anything, is broken... but all nagios checks are fine, and I don't see anything obviously broken yet.
Thanks for all the packaging work on it!
Metadata Update from @kevin: - Issue close_status updated to: Fixed with Explanation - Issue status updated to: Closed (was: Open)