#718 [F38] Gnome Test Week
Opened 2 months ago by sumantrom. Modified 13 days ago

This tracker is for F38 Gnome test week

GNOME 44 beta was released on 11 February and is almost in stable. Now is the perfect time to run the test week - can we organise that?

cc @amigadave @kalev

It would probably be good to do this as soon as possible now that the 44 beta update made it to stable in F38: https://bodhi.fedoraproject.org/updates/FEDORA-2023-489f1e935f

Next week maybe? I am myself gone for the first half of next week, but it's all amigadave doing the packaging work this cycle so I think it's fine if I'm not around.

Metadata Update from @kparal:
- Issue assigned to sumantrom

a month ago

Metadata Update from @kparal:
- Assignee reset

a month ago

Metadata Update from @kparal:
- Issue assigned to sumantrom

a month ago

So we have a lot of test weeks (i18n, kernel) happening from the 7th ... DNF 5 will be a test day after 14th March. Let's pick a date soon. I would have to wrap up all the announcements and re-look at the test cases!

I suggest we do two test days, one for GNOME desktop + major apps, and the second for smaller-and-often-problematic apps (calendar, photos, maps, etc).

We recently did some work to update the workstation test cases - see https://pagure.io/fedora-workstation/issue/310 . These are now divided into core desktop features and default apps.

Last cycle we had an issue with there being too many tests in each section on the test page. It would be good to avoid that this time round. We'll need to be selective about which tests to include.

It would be great to have an opportunity to review the test day results page before the event!

We concluded we'd do two test days: 1. desktop + major apps, 2. smaller problematic apps. The first one on Mar 6-8, the second one Mar 9-10. @sumantrom will create test day skeletons asap, so that we can review the sections and their test cases.

Major points:

  • We want only a few test cases per section, to make the results readable.
  • We want people to join Workstation IRC/Matrix channels instead of our general test-day channel.
  • We want to heavily emphasize that people should file upstream bugs, instead of just leaving comments in the testday app.
  • We want to emphasize exploratory testing, so that people don't just follow the instructions in that test case and call it done.
  • We (both QA and Devel) want to be more active on IRC/Matrix channels to guide people. This will reduce clutter in the results comment section, and we can also better nudge people into reporting issues upstream.

@sumantrom created the following pages:

There are way too many apps in the second test day. I'll try to help to adjust it to match the goals mentioned in my post above.

I think I've finished editing the two wiki pages. Hopefully I've improved the discussed areas.

@aday @kalev Please extend the "Who's available" section with more developer names. One or two more would be great, ideally covering the timezones a bit.

There's a section on new noteworthy features. If you think something could be mentioned, so that people are aware of it even though we don't have a special testcase for it (for example, it seems that app focusing has been redone lately in shell/mutter), please update it, otherwise we can remove that section.

I'm happy to see any other feedback as well, or feel free to directly edit it, those are your pages.

I'll continue looking at the test cases and polishing them, it seems some of them need quite a lot of touches.

For the second test day, I only included a small selection of apps which I think benefit most from testing. We can include many more, if you wish, but it might cause people to spread out their attention, and I think it would be more beneficial to make them spend more time on those likely-problematic ones.

Thanks a lot for working on that, Kamil! I'll try to take a look later as well.

@aday @kalev Hi, those sections I mentioned still haven't been edited. This is the last working day before the event goes live, please look at it, thanks. @sumantrom is going to send out announcements tomorrow.

Also, I tweaked all the test cases in the first test day, and I hope they are much more helpful now. They are updated for the latest software, I tried to include all reasonable operations which are sufficiently common and easy to test in a test day, and they include requests to do exploratory testing as well as links to the upstream bug trackers. You don't need to read through them all, but does their selection in the test day seem reasonable? Those sections, with a few test cases in each one. Anything to quickly add or remove at the last moment? Let me know.

I'll try to check and update also the test cases from the second test day today.

Thanks for working on this, @kparal ! I've made a few improvements to the wiki pages and looked over the test pages.

Can we revisit the distribution of the tests between the two test days? The first one has 17 tests, whereas the second only has 7, 3 of which require a bare metal install.

Perhaps if we were to move Software, Files, Text Editor, Document Viewer over to the apps test day?

Great, thanks! There's still one FIXME on the second wiki page, I assume you're aware of it :-)

We can do it as you like. My assumption was that we want to force people to spend more time on those problematic apps, like Calendar, which we always fight with during release. If there are more test cases, it's easier for people to spend less time on the problematic ones or skip them entirely (Nautilus, Text Editor and Evince are very unlikely to have serious issues that we wouldn't find anyway). Also, the first test day has a 3-day runtime, the second one has 2 days. And those GNOME Shell test cases are not that time consuming, and Online Accounts might get skipped by many.

But it's your test day, just tell me how to adjust it :-) It's a good point about some apps requiring a bare metal install, that certainly limits some testers (note that gnome-disks doesn't require it, though, but sure, bare metal is better).

I could perhaps leave the really major apps like Files, Software and Terminal in the first day, and move Text Editor, Evince and Help to the second one, but create a separate section for them, e.g. "Apps which require a standard amount of testing"? To make it clear that we prefer some apps to be tested more than others?

We've chatted with @aday on Matrix and adjusted the results layout. Hopefully we're prepared now.

I think I've gone through all the test cases, the instructions should be ready and reasonably good.

The test week is over. I've gone through the first test day results, tried to replicate issues where I could, and proposed some of them as F38 blockers when it made sense. I'll try to do the same thing for the second test day tomorrow.

@aday Do you have any feedback for the test week? Was it more useful than in the previous cycle? Have those small changes (people chatting in #workstation, guiding them to report problems upstream, smaller sections in testdays app, etc) helped? Is there something specific you'd like to see improved (both process-wise and testdays app ui changes - I plan to make a list of small touches which could make the experience better)?

@aday also, I can join tomorrow's meeting and hear the feedback in person if needed.
