#2650 F36 Change: libffi 3.4
Closed: Accepted 2 months ago by zbyszek. Opened 2 months ago by bcotton.

Update libffi in Fedora 36 from libffi 3.1 to libffi 3.4 (released June 28 2021), and provide a libffi3.1 compatibility package to handle the library SONAME transition.


The mass rebuild has already started. Pushing the libffi 3.4 + 3.1 updates to rawhide while things are still in-flight does not seem like a good idea to me, unless you want packages to end up in a partial transition state between libffi 3.1 and 3.4.

I see that libffi 3.4 and the libffi3.1 compat package were already committed to dist-git, but only built in f35-build-side-43171. This is the worst possible situation, since the two packages will now be rebuilt and submitted to rawhide as part of the mass rebuild, but packages will not be rebuilt against those updates in the mass rebuild ... and merging the side tag now will also cause things to end up in a partial transition state :(

Suggestion for moving forward:

Make a mini-mass-rebuild in a side tag for the <200 (?) packages that depend on libffi, after the mass rebuild side-tag has been merged into rawhide (which includes the libffi 3.4 and the libffi3.1 compat package). I think this should result in the least unwanted side effects at the earliest point in the release cycle. Reverting the libffi + libffi3.1 changes in dist-git temporarily should not be necessary in this case.

I don't think doing nothing is a good alternative. In that case, packages will transition to libffi 3.4 slowly as they are updated post-mass-rebuild. However, this could have unintended consequences (much) later in the release cycle (basically, until F35 EOL ...), as more packages are rebuilt against libffi 3.4 - which does not sound like a good idea.

According to discussions on the devel list, both libffi 3.4 and the libffi3.1 compat packages were blocked from getting built during the mass rebuild at the last minute.

@codonell How do you plan to proceed? Looks like you didn't see our comments in this ticket. Do you plan to make a mini-mass-rebuild for the ~200 packages that link against libffi after the results of the mass rebuild have been merged into rawhide?

Tagging with meeting (and removing the fast track tag) since I am not happy with some aspects of the Change proposal and I think we should talk about it.

Metadata Update from @decathorpe:
- Issue untagged with: fast track
- Issue tagged with: meeting

2 months ago

This ticket will be discussed during today's meeting: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/message/P5KOID677ZHCIW4ULTJJIE3OUWZ6IJ4L/

@codonell If you want to join, the meeting will be at 19:00 UTC in #fedora-meeting channel on libera.chat.

Based on releng recommendation in this ticket:
https://pagure.io/releng/issue/10213

The intent was to move forward with libffi 3.4.2 after the mass rebuild. The set of packages is identifiable.

I'll definitely join the meeting today.

We discussed this during today's meeting:

* #2650 Late F35 Change: libffi 3.4  (decathorpe, 19:05:10)
  * LINK:
    https://copr.fedorainfracloud.org/coprs/churchyard/libffi-3.4/builds/
    (mhroncok, 19:30:21)
  * Wait for COPR test builds against libffi 3.4, and vote in ticket
    after mhroncok gets us results  (decathorpe, 19:43:09)
  * If approved, do libffi rebuilds in an on-demand side tag, and ghc
    rebuilds in another side tag after that  (decathorpe, 19:43:31)
  * (Vote was: +7, 0, -0)  (decathorpe, 19:43:48)

We're waiting for the results of the test rebuilds in COPR.

Metadata Update from @decathorpe:
- Issue untagged with: meeting

2 months ago

https://copr.fedorainfracloud.org/coprs/churchyard/libffi-3.4/packages/

Preliminary results.

Failures:

  • cabal-install
  • cjs
  • gambas3 (also failed during the mass rebuild)
  • ghc (both 8.4 and 8.10)
  • gjs
  • jffi (also fails in Koschei)
  • ruby
  • thunderbird (Python 3.10 related, fixable)
  • xs (also fails in Koschei)

Will resubmit some of the failures later to mitigate build order issues, some of them already failed multiple times.

The remaining ~100 packages seem fine. None of them requires libffi.so.6()(64bit) after the rebuild.
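For illustration only, the "does anything still require the old SONAME" check could be sketched roughly as below. This is a minimal sketch, not the tooling actually used for the COPR rebuild: the function name, the input layout (package name mapped to its list of Requires strings) and the sample package names are all hypothetical, and `libffi.so.8` is assumed to be the SONAME shipped by libffi 3.4.

```python
# Hypothetical sketch: find rebuilt packages that still depend on the old libffi SONAME.
# The input format and package names are assumptions made for illustration.

OLD_SONAME = "libffi.so.6"  # from the ticket: libffi.so.6()(64bit)


def still_on_old_libffi(requires_by_package):
    """Return the sorted names of packages with a Requires on the old SONAME."""
    return sorted(
        pkg
        for pkg, requires in requires_by_package.items()
        if any(req.startswith(OLD_SONAME) for req in requires)
    )


# Made-up sample data; libffi.so.8 is assumed to be the libffi 3.4 SONAME.
sample = {
    "pkg-a": ["libc.so.6()(64bit)", "libffi.so.8()(64bit)"],
    "pkg-b": ["libffi.so.6()(64bit)"],
}

print(still_on_old_libffi(sample))
```

An empty result for the successfully rebuilt packages would correspond to the statement above that none of them requires libffi.so.6()(64bit) any more.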

ghc, ruby and thunderbird worry me.

All the remaining packages finished successfully. Resubmitting a final round of try-agains for the failures, but my hopes are not high.

Since this creates a risk for the ghc change, I am inclined not to approve it.

Segfaults in gjs (and cjs) test suites also don't look good, since those are core components of Workstation (gnome-shell → gjs) and the Cinnamon Spin, respectively ...

ghc failed with:

CallStack (from HasCallStack):
  error, called at libraries/Cabal/Cabal/Distribution/ReadE.hs:42:24 in main:Distribution.ReadE
utils/genapply/ghc.mk:26: utils/genapply/dist/package-data.mk: No such file or directory
make[1]: *** [utils/hsc2hs/ghc.mk:21: utils/hsc2hs/dist/package-data.mk] Error 1

Looks like a build system issue?

The big question is whether those failures also occur in rawhide without the updated libffi.

It doesn't really matter if we cannot rebuild them.

Anyway, cjs, gjs and ruby built fine during the mass rebuild.

cjs and gjs were not tracked in Koschei (which is rather weird for a core component, but whatever), so I've enabled them and bumped the priority.

ruby fails in Koschei, but only on aarch64.

ghc fails in Koschei, but only because the latest rawhide-built version does not contain this fix: https://src.fedoraproject.org/rpms/ghc/c/bad2a2b5a9ca21cbc1503fb68254c02280fe1461?branch=rawhide

Given the number of failures (and segfaults), I think it's probably better to defer this.

Proposal: Defer libffi upgrade to Fedora 36.
+1

That said, I think we should land this in Rawhide immediately after the branch.

I went through the COPR rebuild to look if anything was caused by libffi 3.4.2 / libffi3.1. I then rebuilt everything in Fedora Rawhide again. Most failures appear to be unrelated to the new version of libffi 3.4.2.

Results from a rawhide (without libffi) rebuild:

  • cabal-install (fails, identical gcc failure to libffi rebuild)
  • cjs (fails, identical 1 fail in testsuite to libffi rebuild)
  • gambas3 (fails, identical package conflict problem to libffi rebuild)
  • ghc (fails, identical package-data.mk failure to libffi rebuild)
  • gjs (fails, identical 1 fail in testsuite to libffi rebuild)
  • jffi (fails, different sourceRoot failure to libffi rebuild failure with Array.o)
  • ruby (fails, similar core dump in test suite failure to libffi rebuild)
  • thunderbird (fails, identical python problem to libffi rebuild)
  • xs (fails, identical git_date.sh problem to libffi rebuild)

If we're talking A/B testing, yes, we can't know if the fails are shadowing a possible true libffi 3.4.2 failure.

The only things to review again would be ruby and jffi which have differences in the A/B testing of Fedora Rawhide vs. COPR build with libffi 3.4.2 / libffi3.1.
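As a rough sketch of that A/B comparison, the verdicts from the list above can be written down and filtered for anything that is not an identical failure in both builds. The verdicts are transcribed from this comment; the dictionary layout and the helper name are assumptions made for illustration.

```python
# Verdicts transcribed from the rawhide-vs-libffi comparison in this comment.
# "identical": same failure with and without libffi 3.4.2.
# The data layout and helper name are illustrative assumptions.
RAWHIDE_VS_LIBFFI = {
    "cabal-install": "identical",
    "cjs": "identical",
    "gambas3": "identical",
    "ghc": "identical",
    "gjs": "identical",
    "jffi": "different",
    "ruby": "similar",
    "thunderbird": "identical",
    "xs": "identical",
}


def needs_review(comparison):
    """Return packages whose failure is not identical in both builds."""
    return sorted(
        pkg for pkg, verdict in comparison.items() if verdict != "identical"
    )


print(needs_review(RAWHIDE_VS_LIBFFI))  # ['jffi', 'ruby']
```

Note that an "identical" verdict only says the first visible failure is the same; as mentioned above, it could still be shadowing a real libffi 3.4.2 problem.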

As much as it is unfair, whether or not the failures are related to the libffi upgrade is not relevant here. If we cannot rebuild the packages, we cannot upgrade to the new version without keeping the old one around. Keeping the old one around poses a risk of accidental rebuilds of packages with the new libffi after GA, which is not desired. Practically speaking, I think it is simply too late in the release cycle to attempt this. We have deadlines and schedules for good reasons.

I agree with @sgallagh that we should do this immediately after branching, in Rawhide (Fedora 36).

If there are compelling reasons to rush this for F35, I might reconsider my vote, but I don't see them.

@churchyard Absolutely, the lack of being able to rebuild will prevent the ABI transition.

There aren't any immediately compelling reasons to do this, and Intel CET and AArch64 PAC+BTI are evolving security features. We absolutely want to do a libffi rebase, but whether it lands in F35 or F36 does not make a big difference for the user community.

@churchyard Or to put it another way, I wanted to try my best to see if we could get this into F35 because it does have value to users, but I'm leaning on fesco's experience here to help guide me with this last-minute system-wide change. I am very grateful for all the feedback and review.

I think we will need to make those packages not FTBFS anyway, because there's a planned update and for other reasons. So we can expect that ghc and others will be rebuilt.

But even if they are not, let's reconsider what happens if they are not rebuilt. They will use the compat version of libffi. As long as they continue to FTBFS, we can do nothing and it's just fine.

> As much as it is unfair, whether or not the failures are related to the libffi upgrade is not relevant here.

I think it is unfair, and we shouldn't be unfair. The update of libffi is prepared properly, and should just work. It is important to be able to push new versions of software into Fedora without delays of 6 months. Packages that FTBFS are a nuisance, and we cannot block progress in other packages because of such packages. With the current size of the distro, it's just not a feasible solution to block: there will always be some package that doesn't build.

My proposal: do a rebuild with the new libffi in a side-tag. Packages that FTBFS will continue to use the compat version. We will prioritize the effort to get all the FTBFS packages rebuilt in rawhide within a few weeks. If there are some packages which we just can't get to build but they remain installable, we'll introduce some kludge to keep them using the old libffi version for the lifetime of F35 (*).

(*) I'm not specifying how the kludge will look exactly, because it depends on the number of packages in question. I really hope that the number will be 0, and the issue will become moot. One option would be to introduce compat headers to allow building against the old libffi version. Another option would be to bundle the old libffi version in those packages. I don't think we need to or even should design a solution now, before we know if it will be needed at all. The time is better spent getting those FTBFS packages fixed.

If we can prioritize getting the packages that couldn't be rebuilt fixed, and if they can't be fixed in time, use something like "BuildRequires: libffi3.1-devel" for the lifetime of F35 ... that would work for me too.

Otherwise, postponing this change for F36 is the easiest and least disruptive solution this late in the F35 development cycle.

@zbyszek @decathorpe Given that there is a lack of consensus, which is OK, measuring and executing in the face of risk is hard. I'm going to withdraw the LIBFFI34 proposal for F35, and carry it out in F36 as soon as we branch rawhide. I do agree that FTBFS make it hard to move certain kinds of proposals forward when they don't align with the release schedule. That's normal, and I expect a certain amount of that. As owner of upstream glibc I also have to handle the 6 month time-boxed release, and I know what it's like to say "No" to late features. It always pains me when I say "No." so I empathize from both sides :-)

Metadata Update from @bcotton:
- Issue set to the milestone: Fedora Linux 36 (was: Fedora 35)

2 months ago

Proposal updated for F36. FESCo members, please re-vote on this as an F36 proposal.

+1 for F36

Thanks for working with us on this.

(FWIW I just saw this - never saw those ghc failure errors before this, strange... anyway as per the GHC 8.10 Change - I am updating ghc* this coming week in f35-build-side-43587 before branching. For this reason ghc was opted out of the F35 mass rebuild since the rawhide branch was already rebased to 8.10 before the mass rebuild, thanks.)

Formally, we need to wait until 7 days have passed.

After a week, I count the vote as
APPROVED (+7,0,-0)

Metadata Update from @bcotton:
- Issue tagged with: pending announcement

2 months ago

Metadata Update from @zbyszek:
- Issue untagged with: pending announcement
- Issue close_status updated to: Accepted
- Issue status updated to: Closed (was: Open)

2 months ago
