#2114 What is the scope of Modularity?
Opened 5 months ago by churchyard. Modified 2 months ago

Hey, after recent events in Fedora and after discussions at the last two FESCo meetings, I'm worried that modularity is too "wild" and that the situation is actively hurting Fedora as a project.

Let me give a bit of context. This is how I remember things, correct me if I'm wrong:

  1. In Fedora 26, we wanted to create a "full modular" server preview (Boltron) - a very small base, everything lives in modules.
  2. In Fedora 27, we wanted to do this for real. It didn't work.
  3. In Fedora 28, we announced Add-On Modularity. All examples given talked about a leaf group of packages or replacement packages - aka we add a Django 1.6 module, but regular Django stays in Fedora. Or we package multiple nodejs streams - where basically nothing depends on nodejs, and the little that does still has a default non-modular version of nodejs to use.
  4. In Fedora 29, we say Modules for Everyone - we enable modular repos by default and say how the users won't even notice unless they want to; yet we don't really talk about what is going to be modularized or how it affects contributors.

Since we abandoned the "modularize everything" idea, every presentation about Modularity I've seen, every article I've read, every change proposal... everything made me think that modules are extra goodies that we can stick on top of Fedora - and that gives many benefits to everybody.

Instead, what we see today, is that large parts of Fedora are moving to modules only, we see that non-modular content depends on modular content (as a result of the Modules for Everyone change, this is technically possible yet I don't remember this being approved as a thing to do), we try to develop mechanisms to enable this in the buildroot.

Shouldn't we stop here and reconsider? What is the scope of modularity? Ultimately, what parts of Fedora do we wish to maintain in modules? I'm afraid that currently, the answer is: Whatever the maintainer wants.

Here's why I think this hurts Fedora:

  • The tooling for this is clearly not yet ready (see ursa-major/modules in buildroot ticket or rawhide only packages for Rust), moving stuff to modules only currently causes enormous breakage (see orphaning of 250 Java packages).
    All the FTBFS and orphaned packages and similar releng-y automation only deals with rawhide, no modules.
  • Some of the modules are intended as build deps only for other modules, keeping the power users in the dark when they want to use fedpkg local or similar.
  • It is not clearly visible what versions of what are actually delivered. In theory, my package can build against libfoo3 from nonmodular Fedora, but users get libfoo4 from the WhatnotModule and things will break; as a maintainer I have no idea when this can happen.
  • The bus factor: While 250 packages getting orphaned at once can cause a large stress, there will always be maintainers willing to take them. However if my dependency is modular only and the maintainer goes AWOL, I am not able to maintain the module unless I'm already familiarized with the procedures. My impression from the devel list is that most of our contributors are not. The same applies to occasional contributions. With pull requests, anyone can contribute to their dependencies. Modularity makes it much harder, cultivating tribal knowledge and exclusivity. E.g. how do I maintain or contribute to this? The gap between maintainers who care about modularity and those who don't is getting bigger and bigger every day, we should focus on getting it smaller - or allow maintainers not to care.

I believe that we should define clear boundaries and expectations that say:

  1. this is the kind of thing that we ultimately want to modularize
  2. this is the kind of thing that might be modularized if XYZ conditions are met
  3. this is the kind of thing that is supposed to stay in Fedora non-modular forever

We should actively work with the maintainers of the first category to make it happen, but we should check if the conditions of the second category are met and we should forbid modularizing the third category.

At the end, there should be criteria for creating modules and somebody should check them. Essentially creating a new module or moving stuff into a module should be a Fedora change (of a sort) - it should be discussed, approved, announced.

A quote from the mailing list to illustrate the current frustration:

It seems a bit crazy to me that we have packages built for Fedora that aren't available for users to install. Why wouldn't we make everything maximally available? I used to love Fedora, because I just play with all the bits. But now, a lot of those bits are going away... I have less to play with... and the focus seems more targeted towards Fedora's internal needs, and not Fedora's users needs. Contributing to Fedora is so much harder now. Do we have to make it harder by making certain packages unavailable to regular users (and casual packager-contributors like me)?


I'm afraid that currently, the answer is: Whatever the maintainer wants.

I agree on this.

rawhide only packages for Rust

Sorry, but how does it break anything? By doing this we were able to ship updates for F28 and F29 which we were not able to do with non-modular content.

It is not clearly visible what versions of what are actually delivered. In theory, my package can build against libfoo3 from nonmodular Fedora, but users get libfoo4 from the WhatnotModule and things will break; as a maintainer I have no idea when this can happen.

As long as you have proper dependencies in RPM, it should be good.
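To make the libfoo3/libfoo4 concern concrete, here is a minimal sketch of what "proper dependencies" buys you. It is not how dnf or RPM resolve dependencies; the function name and the simplified soname strings are made up for illustration. The idea is only that a package which records the soname it was built against fails dependency resolution when that soname is missing, instead of being installed next to an incompatible library.

```python
# Hypothetical illustration: names and sonames are invented, and real RPM
# provides carry more detail than these simplified strings.
def unmet_requires(requires, installed_provides):
    """Return the requirements that nothing in the installed set provides."""
    return sorted(set(requires) - set(installed_provides))


# A package built against libfoo 3 records the soname it linked to.
mypackage_requires = {"libfoo.so.3"}

# Non-modular Fedora ships libfoo3; the hypothetical WhatnotModule ships libfoo4.
nonmodular_provides = {"libfoo.so.3"}
module_provides = {"libfoo.so.4"}

print(unmet_requires(mypackage_requires, nonmodular_provides))  # []
print(unmet_requires(mypackage_requires, module_provides))      # ['libfoo.so.3']
```

In the second case the requirement is unmet, so the problem is at least visible at install time. It does not tell the maintainer in advance when a module will swap the library, which was the original complaint.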

A quote from the mailing list to illustrate the current frustration:

I feel that pain. On one side, having single repo with everything is cool. But there is no tooling to manage content over multiple releases of Fedora. I mean, it is called modularity nowadays.


I think we should be aiming to go fully-modular, but this requires a lot of work and it seems we do not have enough people to do that work (or even define it). I see future UX as being half-modular (bash for example can live in a module with a defined lifecycle while some random libfoo can be non-modular). When the default stream changes, we rebuild non-modular content in an automated way (not the way we do mass rebuilds now).

So I think whatever decision we make, we need to make sure that there are enough people who would be working on defining what needs to be done to make modularity easier for both users and packagers and then work on the tooling.

rawhide only packages for Rust

Sorry, but how does it break anything? By doing this we were able to ship updates for F28 and F29 which we were not able to do with non-modular content.

Hey Igor!

I wouldn't word it as breaking something, but I will say that I've found it difficult to understand how to manage my rpick module yaml file. I know you wrote a tool to generate that file so you don't have to manage it by hand, but it seems like this is something that the infrastructure should do for me. Since the infrastructure doesn't do that, I would say that modularity does seem to make managing packages more difficult than the traditional RPMs did, at least for Rust packages.

Wasn't modularity marketed as being good for languages like Rust, or am I misremembering that?

I wouldn't word it as breaking something, but I will say that I've found it difficult to understand how to manage my rpick module yaml file. I know you wrote a tool to generate that file so you don't have to manage it by hand, but it seems like this is something that the infrastructure should do for me. Since the infrastructure doesn't do that, I would say that modularity does seem to make managing packages more difficult than the traditional RPMs did, at least for Rust packages.

So you (we, Fedora) have 2 possibilities:

  1. Modularity way: have all crates in their latest versions in rawhide and somehow generate modulemds out of that.
  2. Traditional way: have all crates in their versions (most likely outdated, just look at F29) and not being able to build your app because they are outdated...

Conclusions from this:

  • We don't have a way to easily update multiple packages and keep them up to date in stable releases (soon, we won't be able to do this in rawhide either, due to gating);
  • We don't have a way to easily generate modulemds and somewhat automatically create modules (and it seems we don't have any people who would be working on this tooling anytime soon);

So you are right that modules are not very packagers-friendly. But maintaining 500+ packages in multiple stable releases is even more painful. And yes, I do agree that we should have much more automation. And no, it seems nobody is looking to work on that (I am willing to help as much as I can, but I can't do all work).
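For readers wondering what "generate modulemds" could look like, below is a rough sketch, not the tool mentioned earlier in this thread. The field layout follows the modulemd v2 format as I recall it and should be checked against the modulemd specification; the function name, the stream name, the platform value and the crate list are all assumptions made up for the example. The output is JSON, which YAML parsers generally accept, so no third-party library is needed.

```python
import json


def make_modulemd(name, stream, summary, crates, platform="f30"):
    """Build a modulemd-like document listing crate source packages.

    Assumption: the keys mirror modulemd v2 as remembered, not verified.
    """
    return {
        "document": "modulemd",
        "version": 2,
        "data": {
            "name": name,
            "stream": stream,
            "summary": summary,
            "description": summary,
            "license": {"module": ["MIT"]},
            "dependencies": [
                {
                    "buildrequires": {"platform": [platform]},
                    "requires": {"platform": [platform]},
                }
            ],
            "components": {
                "rpms": {
                    # "ref" names the dist-git branch to build from;
                    # "master" is assumed here to mean the rawhide branch.
                    crate: {"rationale": "Dependency of " + name, "ref": "master"}
                    for crate in sorted(crates)
                }
            },
        },
    }


# Illustrative input only: stream "0" and the crate names are placeholders.
doc = make_modulemd("rpick", "0", "Hypothetical rpick module",
                    ["rust-rand", "rust-clap"])
print(json.dumps(doc, indent=2))
```

The interesting part for automation is the input: if the crate list could be derived from the package's own dependency metadata, the file would not need to be maintained by hand.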

Problem 1: I fear there is a lack of shared vision for Modularity.

To me it sometimes feels like we're maintaining and bugfixing in a situation that still requires design thinking.

One consequence of a missing shared vision is a lack of guidance on how to use certain Modularity features, namely the module API and to some extent the buildroot-only modules. It is very easy for people to build silos of private dependencies instead of collaborating with each other. Integration is a huge part of a Linux distribution's value. Are we even sure those concepts have been designed to solve problems and help the community as a whole?

Another one is that we don't seem to agree on the extent to which non-default module streams can affect the rest of the system. As an extreme example, would a module stream only installable in a container be valuable? To me personally, yes. But do we all agree?

Problem 2: We move too slow.

I feel people tend to work on too many things at once, and some of the things get very little progress every week. I see this on myself, too.

As an example, some crucial parts of the architecture (like getting default modules to the buildroot) have been discussed for months but not implemented. And that has terrible consequences (the Java packages, for example), generating a lot of unnecessary work for people and slowing us down even more.

Moving forward?

I feel that first we need to change the way we work in order to move forward effectively. I believe fixing these two problems would mostly get us there.

On Tue, 2019-04-02 at 18:35 +0000, Igor Gnatenko wrote:

  1. Modularity way: have all crates in their latest versions in
    rawhide and somehow generate modulemds out of that.
  2. Traditional way: have all crates in their versions (most likely
    outdated, just look at F29) and not being able to build your app
    because they are outdated...

Hey Igor!

Is #2 different for Rust than it is for other languages in Fedora?

Conclusions from this:

  • We don't have a way to easily update multiple packages and keep them
    up to date in stable releases (soon, we won't be able to do this in
    rawhide either, due to gating);
  • We don't have a way to easily generate modulemds and somewhat
    automatically create modules (and it seems we don't have any people
    who would be working on this tooling anytime soon);

So you are right that modules are not very packagers-friendly. But
maintaining 500+ packages in multiple stable releases is even more
painful. And yes, I do agree that we should have much more
automation. And no, it seems nobody is looking to work on that (I am
willing to help as much as I can, but I can't do all work).

Yeah, I agree that it's not easy either way.

Is #2 different for Rust than it is for other languages in Fedora?

No. Java and Go are the same.

Metadata Update from @churchyard:
- Issue tagged with: meeting

5 months ago

OK, I think this thread has gotten a bit lost in specifics, whereas @churchyard 's initial post was much more high-level. I think the initial point is valid: Modularity is lacking some key top-level policy (and documentation for that policy). Unfortunately, as is all too-common in Open Source, we landed the technology before that policy was established. As a result, we have situations we're now reacting to, like the Java package retirement.

I think I generally agree with @churchyard on the three categories of package, but they're not of equal size.

1) The kind of thing that we ultimately want to modularize

I think this is probably the easiest case to describe. The things that make sense to always treat as a module fit all of these conditions:

  • A set of packages that together make up an "application" (using a very loose definition that may non-exclusively mean a desktop application, a web application, a system service, etc.)
  • One or the other of:
    • A release schedule that means two or more major versions of the application are useful to Fedora users within the same Fedora release.
    • Dependencies on content that cannot reasonably be expected to be part of a default stream in Fedora, such as a dependency on a custom, incompatible fork or reliance on an older supported stream of a framework than Fedora has as a default.
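The conditions above combine as "application AND (multiple useful versions OR unusual dependencies)". A tiny sketch of that reading follows; the class and field names are invented for illustration and are not part of any Fedora policy or tool.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical description of a package set being considered."""
    is_application: bool            # a set of packages forming an "application"
    useful_major_versions: int      # major versions useful within one release
    needs_non_default_deps: bool    # needs forks or non-default framework streams


def always_a_module(c: Candidate) -> bool:
    """True when the candidate fits category 1 as read from the list above."""
    return c.is_application and (
        c.useful_major_versions >= 2 or c.needs_non_default_deps
    )


print(always_a_module(Candidate(True, 2, False)))   # True
print(always_a_module(Candidate(True, 1, False)))   # False
print(always_a_module(Candidate(False, 3, True)))   # False
```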

2) The kind of thing that might be modularized if XYZ conditions are met

Realistically, this is going to be the majority of packages in Fedora, I think. (Working under the assumption that we manage a heavy focus on packager experience improvements). We need to figure out where the lines are for software to move from the traditional approach into a modular approach, and in such a way that it doesn't break those who aren't ready to do it yet.

It is worth noting that once we land the work we are doing for getting modular default stream content into the non-modular buildroots (for both mock and Koji builds), a lot of these problems become much easier. For most purposes, there will be no difference within a release as to whether their dependency was provided by a non-modular package or a modular package in a default stream.

So, the hard part now is setting these rules. Timing matters as well: if rules are set today, without the modular-buildroot changes, then I think the answer can be drawn pretty simply: Until we revise this policy, do not move a package from the non-modular set to a default module stream if you (or the appropriate SIG) are not the owners of every package known to depend on it and are moving them all to one or more module streams at the same time. It is permissible to create a new, non-default module stream for alternate versions of any package, but if your package has known non-modular dependencies, it must remain in the non-modular repo for now.
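As a sketch of how the interim rule could be checked mechanically, assuming we had a map of known dependents and package owners (both data structures and all names here are hypothetical, and this is not an existing releng tool):

```python
def may_move_to_default_stream(package, dependents, owner_of, mover, moving_together):
    """Interim rule: every package known to depend on `package` must be owned
    by the mover (or their SIG) and must move to a module stream at the same
    time. A package with no known dependents passes trivially."""
    return all(
        owner_of.get(dep) == mover and dep in moving_together
        for dep in dependents.get(package, ())
    )


# Made-up data: libfoo has two dependents with different owners.
dependents = {"libfoo": ["appa", "appb"], "leafapp": []}
owner_of = {"appa": "some-sig", "appb": "someone-else"}

# appb is owned by someone else, so libfoo has to stay non-modular for now.
print(may_move_to_default_stream(
    "libfoo", dependents, owner_of, "some-sig", {"appa"}))    # False

# Nothing depends on leafapp, so moving it cannot break anybody.
print(may_move_to_default_stream(
    "leafapp", dependents, owner_of, "some-sig", set()))      # True
```

Creating a new non-default stream is not covered by this check, since the rule above permits that for any package.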

Once we fix the buildroot issue, it should be safe to more or less open the floodgates. As long as mock/koji/dnf can properly resolve the buildrequires from things that are either part of the non-modular set or contained in a default module stream, it should be fine for most packagers. Edge-cases will likely crop up (it's possible for default module streams to conflict with one another), but those are hard to predict and should be managed on an individual basis.

3) The kind of thing that is supposed to stay in Fedora non-modular forever

This will probably be the smallest group, ultimately. This should be an identified set of packages whose API is so fundamental that it is essentially the definition of what constitutes a "Fedora Release". I think this is going to be a very minimal set of things (probably not too dissimilar to the content of a container base image plus the kernel and boot stack).

On Thu, 2019-04-04 at 05:47 +0000, Igor Gnatenko wrote:

Is #2 different for Rust than it is for other languages in Fedora?

No. Java and Go are the same.

What about languages like Python or C/C++ or others? Is this problem
unique to Rust, Java, and Go for some reason?

I think I'm not understanding why modularity is thought of as being
particularly helpful to Rust/Java/Go (i.e., why those languages and not
all languages).

On Thu, 2019-04-04 at 05:47 +0000, Igor Gnatenko wrote:

Is #2 different for Rust than it is for other languages in Fedora?

No. Java and Go are the same.

What about languages like Python or C/C++ or others? Is this problem
unique to Rust, Java, and Go for some reason?
I think I'm not understanding why modularity is thought of as being
particularly helpful to Rust/Java/Go (i.e., why those languages and not
all languages).

Golang/Rust binaries are statically linked against fast-moving libraries, which break API often. Which means if you want to build the latest version of a package, you often need to update several libraries to make it work. This is especially true in Golang, where semver is rarely used (though encouraged now) and libraries often do not make releases at all, and just assume you're building from the Git master branch (which can break compatibility anytime).
In Rust, you have a strong semver culture, but it seems a lot of devs are afraid of making a 1.0.0 release and being stable.
In Golang, we lack manpower for keeping everything up-to-date, even in Rawhide. Since libraries publishing releases are rare, you need to periodically sync with Git. No Anitya to warn you of a new release. No simple way to know if an update will break other dependent packages either. Keeping everything compatible on several branches is tricky. Sometimes your build fails because that particular library has not been updated for a while except on Rawhide.

@sgallagh @churchyard

I wouldn't actually try to answer the question "modularize or not" through a policy. We want people to get creative, and want them to find new usages for modules and develop this idea further. And as soon as the tool is there, it is open, and we shouldn't really restrict all the ways the tool can be used, however weird they might be.

But we should restrict how Fedora uses the tool. And I see Fedora's part in, and control over, Modularity as focused on those modules which we call default.

Thus I want to suggest to limit the policy to default modules only and enforce certain limitations on them. And then the question "modularize or not this thing in Fedora" will be answered based on those limitations we enforce.

So my suggestion:

A: Default modules MUST be supported and work the same way as regular rpm packages

Corollary A1: Packages provided by default modules MUST be supported in full (not as "just a dependency for this particular rpm")
Corollary A2: Default modules MUST be supported through entire lifetime of Fedora release they are default for.
Corollary A3: Default modules MUST follow Fedora release schedule and policies (Change Proposals, Freezes and Alpha/Beta releases)
Corollary A4: Default modules MUST NOT depend on non-default modules

B: Non-modular buildroot MUST NOT contain non-default modules

The reasoning is: if the user sees no visible difference between modular and non-modular rpm (which is the case for default modules), then there shouldn't be any hidden difference as well.
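Rules A4 and B are both simple set checks, so they could in principle be enforced by a script. A minimal sketch, assuming streams are written as "name:stream" strings and that the dependency data is available as a plain dict (both are assumptions for illustration, as are the module names):

```python
def a4_violations(default_streams, requires):
    """Rule A4: list (stream, dependency) pairs where a default stream
    requires a stream that is not itself a default."""
    return sorted(
        (stream, dep)
        for stream in default_streams
        for dep in requires.get(stream, ())
        if dep not in default_streams
    )


def buildroot_violations(buildroot_streams, default_streams):
    """Rule B: list streams present in the non-modular buildroot
    that are not default streams."""
    return sorted(set(buildroot_streams) - set(default_streams))


# Made-up streams: x:1 is a default but needs the non-default y:2.
defaults = {"x:1", "z:3"}
requires = {"x:1": ["y:2"], "z:3": []}

print(a4_violations(defaults, requires))               # [('x:1', 'y:2')]
print(buildroot_violations({"z:3", "y:2"}, defaults))  # ['y:2']
```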

So my suggestion:
A: Default modules MUST be supported and work the same way as regular rpm packages
Corollary A1: Packages provided by default modules MUST be supported in full (not as "just a dependency for this particular rpm")

So, in practice, you want a policy on default module streams that all produced (i.e. not filtered out) artifacts in the stream must be part of the API definition for that stream?

I'm not sure I like that. The whole purpose of the API definition is to allow for such cases (e.g. I'm supporting this library for my own purposes and don't want to own it for all possible situations). That said, this is not a hill I'm willing to die on, so if FESCo in general feels that this is a restriction we want on the default streams, I'll fall in line.

Corollary A2: Default modules MUST be supported through entire lifetime of Fedora release they are default for.

Absolutely agreed. Please note, we have just adopted a lifecycle policy that all streams of a module in Fedora must have their EOL aligned with the EOL of a Fedora release. So if you can't support a stream (default or not) through the end of a release's lifetime, you should not include that stream at all.

Corollary A3: Default modules MUST follow Fedora release schedule and policies (Change Proposals, Freezes and Alpha/Beta releases)

Agreed. We already have policy that changes to a default stream have to go through the Change Process (unless they are just moving the same version from non-modular to a default module stream). Policy suggestion: this must be complete by Beta Freeze or deferred to the next release.

Corollary A4: Default modules MUST NOT depend on non-default modules

I disagree with this. I think the ability to have this happen is a major strength of Modularity as it was designed. One of the major issues we have in Fedora today is that if two important packages depend on different versions of the same dependency, only one of them gets to be in Fedora. (As a contrived example, let's consider a desire to ship two different PHP web applications. AppA requires PHP 7.2 and can't work because 7.3 broke compatibility. AppB can only work with 7.3+.)

By requiring that default streams cannot depend on non-default streams, only one of AppA or AppB would be permissible to have a default stream.

Maybe this is less problematic if we set stronger guidelines on the stream summary and description text so it's easier to discover non-default streams, I suppose.

What problems, specifically, do you see this rule resolving? It might reduce potential stream conflicts, I suppose. But it's not as if conflicts don't exist in the non-modular RPMs too.

B: Non-modular buildroot MUST NOT contain non-default modules
The reasoning is: if the user sees no visible difference between modular and non-modular rpm (which is the case for default modules), then there shouldn't be any hidden difference as well.

I am willing to agree to this at this point in time. In the fullness of time, I'd like to see the ability to do buildroot overrides to have non-default streams in the buildroot for modular RPMs if-and-only-if it doesn't change the runtime dependencies. (E.g. I might want to build an application with a pre-release golang or an older Sphinx because the default broke compat, etc.) But that's technology we don't have and likely won't for a year or more, so let's agree to this rule for now.

So, in practice, you want a policy on default module streams that all produced (i.e. not filtered out) artifacts in the stream must be part of the API definition for that stream?
I'm not sure I like that. The whole purpose of the API definition is to allow for such cases (e.g. I'm supporting this library for my own purposes and don't want to own it for all possible situations). That said, this is not a hill I'm willing to die on, so if FESCo in general feels that this is a restriction we want on the default streams, I'll fall in line.

The issue here is that as a user running dnf install X I have no way to know that I am installing a partially supported package. What if I am building my app depending on it?

It also takes the name in the default namespace, which means it gets in the way for people who want to maintain this package properly.

It essentially makes it look like the package is available and maintained, while it is not.

Corollary A4: Default modules MUST NOT depend on non-default modules

I disagree with this. I think the ability to have this happen is a major strength of Modularity as it was designed. One of the major issues we have in Fedora today is that if two important packages depend on different versions of the same dependency, only one of them gets to be in Fedora. (As a contrived example, let's consider a desire to ship two different PHP web applications. AppA requires PHP 7.2 and can't work because 7.3 broke compatibility. AppB can only work with 7.3+.)
By requiring that default streams cannot depend on non-default streams, only one of AppA or AppB would be permissible to have a default stream.
Maybe this is less problematic if we set stronger guidelines on the stream summary and description text so it's easier to discover non-default streams, I suppose.
What problems, specifically, do you see this rule resolving? It might reduce potential stream conflicts, I suppose. But it's not as if conflicts don't exist in the non-modular RPMs too.

The problem I am trying to address is the closure for the policy. For example, I say default module X needs to follow Fedora Change process. But then if it depends on non-default module Y, and Y doesn't follow Fedora change process, then we invalidate this requirement for X. Or say, X has a promise that CVEs are going to be fixed and Y has not, then again, the CVE promise for X is invalidated.

But I see your point. I guess we should come up with another name. Like core modules, which follow the guidelines but may not be exposed as default. And then those "good enough" modules could be used as providers of dependencies for default modules.
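The closure concern can be sketched as follows: the guarantees a module can really offer are the intersection of its own promises with those of everything it transitively depends on. The module names match the X and Y used above; the guarantee labels and data layout are assumptions made up for the example.

```python
def effective_guarantees(module, requires, guarantees):
    """Intersect a module's guarantees with those of its transitive
    dependencies. Modules missing from `guarantees` promise nothing."""
    seen = set()
    stack = [module]
    effective = None
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # also protects against dependency cycles
        seen.add(current)
        own = set(guarantees.get(current, ()))
        effective = own if effective is None else effective & own
        stack.extend(requires.get(current, ()))
    return effective


requires = {"X": ["Y"]}
guarantees = {
    "X": {"change-process", "cve-fixes"},
    "Y": {"change-process"},  # Y makes no CVE promise
}

print(sorted(effective_guarantees("X", requires, guarantees)))  # ['change-process']
```

Under this reading, a "core module" would simply be any module whose own guarantees include everything required of default modules, whether or not it is exposed as a default.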

It's getting harder to contribute to Fedora with all the mass orphaning of dependencies, and I don't have time to figure it all out.

https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/message/RESI7MA5XJKNKHOK3RBGTND2LJ7DQYMN/

@churchyard Yes, the mass-orphaning of Java is turning into a disaster. I've red-flagged this internally today; I'm trying desperately to get the necessary infrastructure enhancements landed as soon as possible.

We will discuss this during the FESCo meeting on Friday at 15:00UTC in #fedora-meeting-1 on irc.freenode.net.

Correction: the meeting will be in #fedora-meeting.

AGREED: FESCo forbids the addition of new default stream settings without FESCo approval until further notice while we work out the buildroot problems. (+7, 0, -0)

ACTION: mhroncok to file a bug on the modularity Taiga describing the use cases we need to keep in mind when writing those (modularity default streams) rules

I will leave this ticket open and assigned to @churchyard as a reminder to file the Taiga card.

Metadata Update from @bowlofeggs:
- Issue untagged with: meeting
- Issue assigned to churchyard

5 months ago

@bowlofeggs for existing packages or for new ones?

@ignatenkobrain Effective immediately, the fedora-module-defaults repo will not be merging any changes without FESCo exception. We will revisit this once we have the buildroot situation worked out.

@sgallagh, reading IRC logs it seemed that we don't want to "move ursine packages to module-only", but somehow that was lost in the vote. Was that intentional?

Because that means, I can't add any more Rust applications without FESCo exception.

@ignatenkobrain That was intentional. Yes, you need FESCo approval for Rust applications.

The reasoning was that we didn't want to keep arguing over where the line of "acceptable" defaults was in the meeting. We opted to disallow all new defaults, with the option to have FESCo vote to approve some individually.

Question: can I get a vote from FESCo on lessening the proscription on defaults changes to exclude changes that do not touch the default stream?

In other words, if someone wants to add a set of default profiles for a new, non-default stream, I think it's reasonable to allow that without a FESCo ticket.

We've stopped "new default stream settings". I think this does not affect non-default streams at all. What am I not getting?

We've stopped "new default stream settings". I think this does not affect non-default streams at all. What am I not getting?

I may have misread our decision. Carry on.

@churchyard we left this open as a reminder for you to file a follow-up ticket in taiga... Any progress?

Sorry, I completely procrastinated this one off. Will try to get back to this after Czech PyCon (the week after next).

Metadata Update from @churchyard:
- Issue tagged with: stalled

2 months ago
