Bug details: https://bugzilla.redhat.com/show_bug.cgi?id=2283978
Commented but haven't voted yet: pbrobinson, lbrabec, frantisekz
The votes have been last counted at 2024-09-20 08:06 UTC and the last processed comment was #comment-933440
To learn how to vote, see: https://pagure.io/fedora-qa/blocker-review
A quick example: BetaBlocker +1 (where the tracker name is one of BetaBlocker/FinalBlocker/BetaFE/FinalFE/0Day/PreviousRelease and the vote is one of +1/0/-1)
If I understand this correctly, every time you keep the RPi4 idle for 15 minutes, it's dead, because you can't resume it. That means you can't log in, services stop working, and if you have some unsaved work, you lose it. I think that violates many of our criteria, on the condition that you keep the system idle for more than 15 minutes, e.g. this one:
A system installed with a release-blocking desktop must boot to a log in screen where it is possible to log in to a working desktop using a user account created during installation or a 'first boot' utility. A system installed without a graphical package set must boot to a state where it is possible to log in through at least one of the default virtual consoles. https://fedoraproject.org/wiki/Basic_Release_Criteria#Expected_installed_system_boot_behavior
I know we generally don't block on suspend/resume issues on x86, but this feels different. RPi hardware simply doesn't support suspend. The kernel should reflect this (and on RPi OS it does: suspend is not available at all), or at least we can disable automatic suspend from userspace (GNOME settings). This feels like a major user experience issue caused by us, on a flagship platform supported by us.
Even though we shipped it like this in earlier releases, at this moment my feeling would be towards: FinalBlocker +1
GNOME or Workstation changed these settings. I don't remember a Fedora change for them, and there was certainly no engagement with the arm SIG. I think auto suspend on battery makes sense, which is what the default used to be, but for systems on power it should be the user's choice. I would ask Workstation why this change was made.
A system installed with a release-blocking desktop must boot to a log in screen where it is possible to log in to a working desktop using a user account created during installation or a 'first boot' utility.
We don't even block on suspend for x86, and nothing there states anything about suspend. The user can also disable it.
A system installed without a graphical package set must boot to a state where it is possible to log in through at least one of the default virtual consoles. https://fedoraproject.org/wiki/Basic_Release_Criteria#Expected_installed_system_boot_behavior
The above is irrelevant in this context.
I completely disagree: 1) Why are we prepared to put non x86 architecture through more extreme requirements than x86? 2) The upstream kernel that Fedora ships and the downstream RPiOS kernel are vastly, even wildly, different. 3) They have control over the firmware, we do not.
To be clear, if this goes through I will stand down as Raspberry Pi maintainer in Fedora. I have very little time to work on this as it is, the time is my own (it's not part of my $dayjob), and I would prefer to be using the little time I have enhancing it with things users want, like making cameras work, rather than spending time chasing something that the HW vendor doesn't even support.
To be clear, if this goes through I will stand down as Raspberry Pi maintainer in Fedora.
The discussion got heated rather quickly. Peter, nobody wants that and we all understand that there is a lot on your shoulders. This is not a quarrel, just a discussion.
To my knowledge (albeit quite limited), it should be a fairly easy change to disable suspend for ARM images. Would other platforms suffer from this change? Does suspend work on SBCs other than the RPi?
So I guess this boils down to a simple yes-or-no question: is it an easy change that other platforms wouldn't suffer from? If so, then let's do this; if not, then let's all vote -1 on this issue.
I would prefer to be using the little time I have enhancing it with things users want, like making cameras work, rather than spending time chasing something that the HW vendor doesn't even support.
Yes, that is a discussion we should have. I'm not sure where, but if you start one, you'll have my full support. I'm not really sure why we block on Fedora Workstation on ARM, and at least in my perspective, those small SBCs are a domain of Minimal and Server, not Workstation.
I'm not really sure why we block on Fedora Workstation on ARM, and at least in my perspective, those small SBCs are a domain of Minimal and Server, not Workstation.
There are lots of different Arm devices: the Pinebook Pro, the Lenovo X13s, and even the RPi4 with 8GB of RAM are quite usable with Workstation.
TBH I have only randomly looked at suspend/resume on a few devices, like tablets/laptops, and even then only briefly. A small device like an RPi that's always on with low power was never really on the radar; certainly I don't remember anyone ever asking for it.
https://pagure.io/fedora-workstation/issue/360 https://discussion.fedoraproject.org/t/gnome-suspends-after-15-minutes-of-user-inactivity-even-on-ac-power/79801
I don't think anyone considered ARM. Fedora Server wasn't considered either; I had to notify them so that they could add an override in time.
It's a conditional violation which occurs on certain hardware under certain conditions.
1) Why are we prepared to put non x86 architecture through more extreme requirements than x86?
If the same issue was present on x86 (suspend advertised and enabled by default, while it always fails to resume on most hardware that people realistically use), I'd have the same opinion.
2) The upstream kernel that Fedora ships and the downstream RPiOS kernel are vastly, even wildly, different.
Do you have any technical insight whether we can make suspend not advertised as supported on RPi4?
3) They have control over the firmware, we do not.
Sure. But we should still be able to at least disable autosuspend by default, that's in our control, isn't it?
That's not what we want, but at the same time, our job is to have QA-related discussions, like this one, because it concerns a release-blocking image. You do have a vote and your opinion is very valuable as one of the main maintainers. I'd like the discussion to be technical.
I don't work with RPis too much, but perhaps that gives me a bit of a "newbie" perspective. If Fedora "froze" on my RPi4 every 15 minutes (idling), I might just go to RPi OS. People might not realize it's because of autosuspend. To my eyes, it feels like a serious UX problem. (At the same time, I haven't searched the forums to figure out how often users complain about it).
...
Well then, do you think it is better to try to support suspend in favour of those platforms? Or is it better to have a good user experience with Fedora on RPi? I'm not suggesting one side or the other. Just asking: you are the ARM expert here and you probably have the best insight into the Fedora on ARM community.
2) The upstream kernel that Fedora ships and the downstream RPiOS kernel are vastly, even wildly, different. Do you have any technical insight whether we can make suspend not advertised as supported on RPi4?
I will have to look into it. I suspect it will be a kernel change; it will need to be reviewed upstream, and once accepted we can work to get it pulled into Fedora.
3) They have control over the firmware, we do not. Sure. But we should still be able to at least disable autosuspend by default, that's in our control, isn't it?
Maybe, see point above about kernel.
Maybe people have just worked out how to disable it?
Well, I know people are looking upstream at suspend on the X13s. I am following that and will pull in fixes there when available. I did look at the Pinebook some time ago, but I don't remember where we got to. We can revisit it if people are willing to assist with testing.
Can I kindly ask you to look into it, when you have the time? Not a deep dive, just a general assessment of feasibility for a start? I think it would be great to have this solved in Fedora.
And to summarise this discussion for the vote: there is a suspend problem on RPi. Solving it would probably mean a kernel change, which we cannot realistically expect to happen in the F41 cycle, and disabling suspend generally would harm the experience on other platforms where there is a desire to have suspend working.
If my assessment is right, then let's waive this issue, create a common bug workaround, and keep this issue in mind for the time being (no matter what, this is still a horrible default).
I read Peter's reply as "suspend is possibly currently broken on all main platforms, but I'm happy to pull in fixes". If I read it right, disabling suspend for all arm boards might not be that bad an idea, and then we can gradually work on selective enable/disable.
But can we actually disable suspend (make it seem unsupported) at the kernel level for all arm boards? Is that possible, with some kernel switch or something?
The second, easier option, is to prohibit suspend in systemd.
And the third option is to just disable autosuspend in GNOME by default (when on AC, at least). We can easily do it at the edition level (Server, IoT, etc.); I wonder if we can do it per architecture/device.
For systemd or GNOME adjustment, my idea was that arm-image-installer could install a config override when burning the disk image. Ideally just for RPi4, but I'm not sure if we can detect that somehow, or for all arm devices.
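A minimal sketch of what the systemd and GNOME overrides could look like, gated on RPi4 detection. It assumes the board exposes its device-tree model string at /proc/device-tree/model and that the string starts with "Raspberry Pi 4" on the affected hardware; the drop-in file names and the helper functions are hypothetical, and nothing here is an existing Fedora mechanism. The dconf snippet would additionally need `dconf update` and a profile that includes the `local` database.

```python
from pathlib import Path

# Assumption: the device-tree model string lives here on Arm boards and
# starts with "Raspberry Pi 4" on RPi4 (this prefix also matches the Pi 400,
# but not the Compute Module 4).
MODEL_PATH = Path("/proc/device-tree/model")

# systemd reads sleep.conf drop-ins from /etc/systemd/sleep.conf.d/;
# AllowSuspend=no makes systemd refuse suspend requests.
SLEEP_DROPIN = "[Sleep]\nAllowSuspend=no\n"

# dconf system database snippet: take no action when idle on AC power.
GNOME_OVERRIDE = (
    "[org/gnome/settings-daemon/plugins/power]\n"
    "sleep-inactive-ac-type='nothing'\n"
)


def read_model(path: Path = MODEL_PATH) -> str:
    """Return the device-tree model string, or '' if it is not available."""
    try:
        return path.read_bytes().decode("utf-8", errors="replace")
    except OSError:
        return ""


def is_rpi4(model: str) -> bool:
    """Heuristic check on the model string (it is NUL-terminated on disk)."""
    return model.rstrip("\x00").strip().startswith("Raspberry Pi 4")


def planned_overrides(model: str) -> dict[str, str]:
    """Map hypothetical target paths to file contents; empty if not an RPi4."""
    if not is_rpi4(model):
        return {}
    return {
        "/etc/systemd/sleep.conf.d/10-rpi4-no-suspend.conf": SLEEP_DROPIN,
        "/etc/dconf/db/local.d/10-rpi4-no-autosuspend": GNOME_OVERRIDE,
    }


if __name__ == "__main__":
    # Dry run: print what would be written instead of touching the system.
    for target, body in planned_overrides(read_model()).items():
        print(f"# {target}\n{body}")
```

The sketch only prints the planned files, so it is independent of where such a check would actually run (image build, first boot, or elsewhere).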
(All this technical discussion should've occurred in Bugzilla, and this ticket was supposed to be just about the general decision whether it's a blocker or not, sigh, too late).
You can either reject the blocker proposal, or accept it. If you accept it, you can then waive it for being hard to fix and push it to the next release.
AGREED AcceptedBetaBlocker
Discussed during the 2024-08-19 blocker review meeting [1]:
This is accepted as a violation of 'All known bugs that can cause corruption of user data must be fixed or documented at Common Issues'. We note that fixing it may not be straightforward and it may have to be waived as hard to fix or documented and considered 'resolved' in that way.
[1] https://meetbot.fedoraproject.org/blocker-review_matrix_fedoraproject-org/2024-08-19/f41-blocker-review.2024-08-19-15.59.log.html
The following votes have been closed:
AGREED RejectedBetaBlocker
AGREED AcceptedFinalBlocker
Discussed during the 2024-09-12 Go/No-Go meeting [1]:
This is waived under "Difficult to fix blocker bugs" at https://fedoraproject.org/wiki/QA:SOP_blocker_bug_process#Exceptional_cases , on the advice of the maintainers involved (see the bug for details).
[1] https://meetbot.fedoraproject.org/meeting_matrix_fedoraproject-org/2024-09-12/f41-beta-go-no-go.2024-09-12-17.03.log.html
@adamwill Responding to https://bugzilla.redhat.com/show_bug.cgi?id=2283978#c21 :
As this is now documented as a common issue, its release blocker status under the criterion is resolved, since the criterion says "All known bugs that can cause corruption of user data must be fixed or documented at Common Issues". Unmarking as a blocker.
Your phrasing makes it sound like documenting and fixing are equivalent, and either one is sufficient in any circumstance. But I always understood it as the blocker review team deciding whether we go with fixing or documenting. Because if it wasn't our decision, it would be a super-easy way out even for the most severe data loss issues, and we would have no way to stop it.
So, was "just documenting it" an intentional decision? In the meeting log, I don't see it discussed this way, so checking.
Metadata Update from @blockerbot: - Issue status updated to: Closed (was: Open)
Release F41 is no longer tracked by BlockerBugs, closing this ticket.
Just to follow up on this.
It's not broken on all main platforms, in fact in my testing it works on a number of them just fine.
No, and it breaks suspend, which generally works.
Which used to be the case, and it changed upstream because another vendor wanted it. I see no reason why it can't be a distro choice how the default works.
That won't work. For one, on all Raspberry Pi platforms there is no need to use arm-image-installer: because all our raw images work with RPi OOTB, they can be written with Fedora Media Writer or any other option on Windows/Mac etc. Those sorts of tools are absolutely not the way to handle this.