#70 Add history, background, requirements
Merged 4 years ago by asamalik. Opened 4 years ago by pfrields.
fedora-docs/ pfrields/modularity history  into  master

file modified
+2 -2
@@ -42,5 +42,5 @@ 

  * xref:references.adoc[References]

  * xref:archive.adoc[Archive]

  ** xref:architecture/consuming/dnf-behavior.adoc[DNF Behavior]

- 

- 

+ * xref:history.adoc[History and Background]

+ ** xref:requirements.adoc[Requirements and Use Cases]

@@ -0,0 +1,117 @@ 

+ = Modularity: History and Background

+ 

+ Fedora Modularity is a new technology that tries to solve important

+ and complex user problems. From time to time other solutions are

+ suggested to solve these problems in different ways. Often those

+ solutions fail to address one or more intended use cases. These pages

+ enumerate these cases in detail so as to serve as a common

+ reference point for the ongoing discussions.

+ 

+ It is important to note these are goals. There are numerous places

+ where the implementation of Modularity at the time of this writing is

+ not yet fully adherent to them.

+ 

+ == It’s all about the apps!

+ 

+ Very few people install a Linux distribution for its own

+ sake. Ultimately, the goal is to “scratch a particular itch” that the

+ user is experiencing. The solutions may take many forms, but

+ ultimately this user wants to deploy some software that solves a

+ problem for them.

+ 

+ This leads us to a classic problem that Linux distributions have faced:

+ the “Too Fast/Too Slow” problem. Linux distributions are traditionally

+ quite monolithic. The package collections they ship are generally

+ self-consistent, providing the latest stable major

+ release of the software at the time of the distribution release. As the

+ release ages, it will receive bugfixes and enhancements, but usually

+ will remain on the same major version.

+ 

+ This is excellent for the maintainers of the distribution, because it

+ allows them to test that everything works together as a cohesive whole.

+ It means that there’s one authoritative version to align to.

+ 

+ Users, on the other hand, are most concerned about solving their

+ problem. It matters less to them that the distribution is cohesive and

+ more that the tools they need are available to them.

+ 

+ The “Too Fast/Too Slow” problem is basically this: users want a solid,

+ stable, reliable, _unchanging_ system. They want it to stay that way for

+ the life of their application. However, they also want their application

+ to run using the set of dependencies it was designed for. If that

+ doesn’t happen to be the same version (newer or older) as the one

+ selected for the monolithic distribution, the user will now have to

+ resort to alternative means to get up and running. This may be as simple

+ as bundling a dependency or as drastic as selecting an entirely

+ different distribution that better fits their specific need.

+ 

+ == A little background

+ 

+ One of the precursors to Fedora Modularity was

+ https://www.softwarecollections.org/[Software Collections] (SCLs). This

+ was a first try at solving the “Too Fast/Too Slow” problem in the

+ Fedora/Red Hat ecosystem. SCLs provide two basic advantages: _Parallel

+ Availability_ and _Parallel Installability_.

+ 

+ _Parallel Availability_ means that more than one major release of a

+ popular software project is available for installation. For example, the

+ “Developer Toolset” SCLs provide access to newer versions of GCC and its

+ related toolchain for building software. There are Python and Ruby SCLs

+ that provide assorted runtimes for those languages and so on.

+ 

+ _Parallel Installability_ means that more than one major release of a

+ software project can be installed on the same userspace.

+ 

+ A few years back, the Product Management team inside Red Hat performed a

+ large-scale survey of customers and potential customers about the user

+ experience of Red Hat Enterprise Linux. In particular, they asked about

+ their level of satisfaction with the software available from the

+ enterprise distribution and their opinion on these Software Collections.

+ 

+ Perhaps unsurprisingly, the overwhelming majority of respondents were

+ thrilled to have supported versions of software beyond what had shipped

+ with the base operating system. What did surprise the survey team

+ was that the respondents generally did not care

+ about the parallel installability of the SCLs. For the most part, they

+ maintained individual userspaces (using bare metal, traditional

+ virtualization or containers) for each of the applications they cared

+ about.

+ 

+ The most common problem reported for Software Collections was that using

+ them required changes to the applications they wanted to run. SCLs

+ install to a separate filesystem location from more traditional RPMs and

+ applications that rely on them need to know where to look for them. (In

+ SCL parlance, this is called “activating” the collection.)

+ 

+ The consequence of this relocation on disk is that users were unable to

+ take existing applications (either FOSS or proprietary) and simply use

+ them. Instead, they had to modify the projects to first activate the

+ collections. This was a consistent pain point.

+ 

+ Given this feedback, Red Hat came to the conclusion that parallel

+ installability, while nice to have, was not a critical user requirement.

+ Instead, the focus would be on the parallel _availability_. By dropping

+ this requirement, it became possible to create a solution that allowed

+ the different versions to be swapped in and take over the standard

+ locations on the disk.

+ 

+ == Meanwhile in Fedora

+ 

+ Of course, it’s not just Red Hat — people in Fedora are also concerned

+ with solving this “Too Fast/Too Slow” problem for our users. Efforts

+ around this kicked off in earnest with the

+ https://fedoramagazine.org/fedora-present-and-future-a-fedora-next-2014-update-part-ii-whats-happening/[Fedora.next

+ initiative] and Fedora Project Leader Matthew Miller’s

+ “link:https://lwn.net/Articles/563395/[Rings]” talk at the first Flock

+ conference in 2013.

+ 

+ This led to the proposal and approval by the Fedora Council of the

+ link:https://fedoraproject.org/wiki/Objectives/Fedora_Modularization%2C_Prototype_Phase[Modularity

+ Prototype Fedora Objective] and its follow-up

+ link:https://fedoraproject.org/wiki/Objectives/Fedora_Modularization_%E2%80%94_The_Release[Modularity

+ Release Fedora Objective].

+ 

+ == Requirements and use cases

+ 

+ For more information on requirements and use cases, read on to the

+ xref:requirements.adoc[Modularity requirements and use cases] page.

@@ -0,0 +1,311 @@ 

+ = Modularity: Requirements and use cases

+ 

+ If you haven't already read it, you should check out the first part of this essay on the xref:history.adoc[history and background of modularity].

+ 

+ == Critical use cases for consumers

+ 

+ First and foremost, our primary driving goal is to make it easy for our

+ users to understand and interact with alternative software versions. In

+ any instance where the packager experience and the user

+ experience are in conflict, we elect to improve things for the user.

+ 

+ === Standard Locations

+ 

+ In order to make deployment of users’ applications simpler, we need to

+ make sure that software can be installed into the common, expected

+ locations on the system. This includes (but is not limited to):

+ 

+ * Libraries must be installed to `/usr/lib[64]`.

+ * Headers must be installed to `/usr/include`.

+ * Executables must be installed to a location in the default system

+ `$PATH`.

+ * Other -devel functionality such as pkgconfig files must be installed

+ in their standard lookup locations.

+ * Installed services may own a well-known D-Bus address.

+ * Services may own the appropriate standard TCP/UDP ports or local

+ socket paths.

+ 

+ _Requirement_: Installation must occur in the same locations as

+ traditional RPM software delivery.

+ 

+ === Don’t break the app!

+ 

+ It is very common for Fedora to update to the latest major version of

+ packages at each new semiannual release. This ensures that Fedora

+ remains at the leading edge of software development, but it can wreak

+ havoc on anyone trying to maintain a deployment on Fedora. If they are

+ running an app that is built for PostgreSQL 9.6 and Fedora switches to

+ carrying PostgreSQL 10 in the next major release, upgrading to that

+ release may break their app (possibly in ways undetectable by the

+ upgrade process).

+ 

+ However, staying on an old version of Fedora forever has its own

+ problems. Not least of these is the problem of security updates: Once a

+ release has been out for about 13 months, it stops receiving errata.

+ Moreover, new releases of the Fedora platform may have other useful

+ enhancements (better security defaults, increased performance thanks to

+ compiler improvements, etc.).

+ 

+ _Requirement_: We need to allow users to “lock” themselves onto certain

+ dependencies as long as the packager is maintaining them. These

+ dependencies must continue to receive updates.

+ 

+ _Requirement_: There must be appropriate and helpful UX for dealing with

+ when those dependencies go EOL.

+ 

+ === Support the developers

+ 

+ Developers often want to build their applications using the

+ latest-and-greatest version of their dependencies. However, that may not

+ have been released until after the most recent Fedora release. In

+ non-Modular Fedora, that means waiting up to six months to be able to

+ work on it there.

+ 

+ _Requirement_: It must be possible to gain access to newer software than

+ was available at the Fedora release GA.

+ 

+ Additionally, Dev/Ops people are rapidly switching to a new paradigm of

+ development and deployment (containers) to solve the above issue.

+ However, most containers today are retrieved from public repositories.

+ The public repositories are generally user-managed and have not been

+ verified and validated for security.

+ 

+ _Requirement_: Provide a mechanism for building *trusted* container base

+ and application images with content alternatives.

+ 

+ === Keep it updated

+ 

+ It’s not enough that other versions of software are available to

+ install. They also need to be kept up to date with bug fixes and

+ security updates. In non-Modular Fedora, users had the ability to force

+ DNF to lock to a specific RPM NEVRA, but they wouldn’t get updates from

+ it.

+ 

+ _Requirement_: Alternative software must be able to receive and

+ apply updates.

+ 

+ === Make it discoverable

+ 

+ Having alternative versions available is important but not sufficient.

+ It is also necessary for users to be able to locate these alternatives.

+ Some of our early explorations into this area failed this ease-of-use

+ test because they required the user to have knowledge of external sites

+ and then to search those sites for what they think they want.

+ 

+ _Requirement_: Users must be able to discover what alternative software

+ versions are available with tools that are shipped with the OS by

+ default. Ideally, these should be the same tools that they are already

+ comfortable with.

+ 

+ === Don’t break existing package management workflows

+ 

+ Users are slow to adapt to changes in the way they need to behave.

+ Requiring them to learn a new set of commands to interact with their

+ system will likely result in frustration and possibly exodus to other

+ distributions.

+ 

+ _Requirement_: It must remain possible to continue to operate with only

+ the package management commands used in traditional Fedora. We may

+ provide additional commands to support new functionality, but we must

+ not break the previous workflow.

+ 

+ _Requirement_: Existing automation tools such as anaconda’s kickstart

+ and Ansible must continue to work.

+ 

+ == Critical use cases for packagers

+ 

+ === Dependencies

+ 

+ Because very little software today is wholly self-contained, it must be

+ possible for Modules to depend on each other.

+ 

+ _Requirement_: There must be a mechanism for packagers to explicitly

+ list dependencies on other software, including alternative versions.

+ This mechanism must support both build-time and run-time dependencies.

+ 

+ === Alternative dependencies

+ 

+ Some software is very restrictive about which dependencies it can work

+ with. Other software may work with several different major releases of a

+ dependency. For example, a user may ship two Ruby-based web

+ applications, one which is capable of running only on Ruby 2.5 and the other

+ that can run on either Ruby 2.5 or Ruby 2.6. In non-modular Fedora, only

+ one version of Ruby would be available. If the system version was 2.5,

+ then both applications could run fine. But if in the next release of

+ Fedora the Ruby 2.6 release becomes the system copy, one of those

+ applications will have to be dropped (or patched) to work with it.

+ 

+ _Requirement_: It must be possible to build software that can be run

+ against multiple versions of its dependencies.

+ 

+ _Requirement_: The packaging process for creating software that supports

+ multiple versions of its dependencies must not be significantly more

+ difficult than packaging for a single dependency.

+ 

+ As more and more things become modules, there is concern that such

+ things will grow into an unbounded matrix. For this, we need to

+ establish policies on when the use of alternative dependencies is

+ preferable or when it is better to constrain it to a single version or

+ small set.

+ 

+ _Requirement_: Packaging guidelines need to provide advice on when to

+ use multiple alternative dependencies or to select a single one.

+ 

+ === Managing private dependencies

+ 

+ When a person decides that they want Fedora to carry a particular

+ package and decides to do the work to accomplish this, it is not

+ uncommon to discover that the package they care about has additional

+ dependencies that are not yet packaged in Fedora. Traditionally, this

+ has meant that the packager has needed to package up those dependencies

+ and then continue to maintain them for anyone who may be using them for

+ other purposes. This can sometimes be a significant investment in time

+ and energy, all to support a package they don’t necessarily care about

+ except for how it supports the primary package.

+ 

+ ==== Build-time Dependencies

+ 

+ Sometimes, a package is needed only to build the software and is not

+ required at run-time. In such cases, Modularity should offer the ability

+ to keep those build-time dependencies entirely private and not exposed

+ to the Fedora Package Collection at large.

+ 

+ _Requirement_: Build-time only dependencies for an alternative version

+ may be excluded from the installable output artifacts. These excluded

+ artifacts may be preserved by the build-system for other purposes.

+ 

+ _Requirement_: All sources used for generating alternative versions,

+ regardless of final visibility, must be available to the community for

+ purposes of modification and reproducibility.

+ 

+ ==== Defining the public API

+ 

+ Similarly, there are times when an application the packager cares about

+ depends on another package that is required at runtime, but sufficiently

+ complex that the packager would not want to maintain it for general use.

+ (For example, an application that links to a complicated library but

+ only uses a few functions.)

+ 

+ In this case, we want there to be a standard mechanism for the packager

+ to be able to indicate that some of the output artifacts are not

+ supported for use outside this module. If they are needed by others,

+ they should package it themselves and/or help maintain it in a shared

+ place.

+ 

+ _Requirement_: Packagers must be able to encode whether their output

+ artifacts are intended for use by other projects or if they are

+ effectively private to the alternative version. Packagers must also have

+ a way of finding this information out so they understand what they can

+ and cannot rely on as a dependency.

+ 

+ === Use-case-based installation

+ 

+ Since the earliest days of Linux, the “package” has been the fundamental

+ unit of installable software. If you want to have some functionality on

+ the system, you need to learn the name of the individual packages that

+ provide that functionality (not all of which are named obviously). As we

+ build modules, one of the goals is to try to focus installation around

+ use-cases rather than around upstream projects. A big piece of this is

+ that we want to have a way to install a subset of a module that supports

+ specific use-cases. A common example being “server” and “client” cases.

+ 

+ _Requirement_: It must be possible to install a subset of artifacts from

+ an alternative version. These installation groups should be easily

+ discoverable.

+ 

+ _Recommendation_: Installation groups should be named based on the

+ use-case they are intended to solve. This will provide a better user

+ experience.

+ 

+ === Lifecycle isolation

+ 

+ Another of the major issues faced by Fedora is maintaining a release

+ schedule when all of the components within it follow vastly differing

+ schedules. There are three main aspects to this problem:

+ 

+ * A major version of a popular piece of software is released just after

+ a Fedora release, so it doesn’t land in Fedora for six months.

+ * Some software does frequent major revisions (Django, Node.js, etc.)

+ and swapping them out every six months for the latest one means that

+ dependent projects are constantly needing to adapt to the new breakage

+ or find alternative mechanisms for retaining the older, working version.

+ * Some software does not handle multiple-version upgrades (Nextcloud,

+ for example). Attempting to go from version 15 to version 19 requires

+ first upgrading through 16, 17, and 18.

+ 

+ _Requirement_: It must be possible for new alternative versions of

+ software to become available to the Fedora Package Collection between

+ release dates.

+ 

+ _Requirement_: It must be possible for alternative versions of software

+ to go end-of-life during a Fedora release. This does not mean that the

+ software must disappear from the repositories, merely that an assertion

+ exists somewhere that after a certain date, the package will not receive

+ updates.

+ 

+ _Requirement_: For alternative versions whose lifecycle will continue

+ through at least part of the next Fedora release, it must be possible to

+ upgrade from one release to the next and remain with the

+ fully-compatible version.

+ 

+ === Third-party additions

+ 

+ Some third-party add-on repositories (particularly EPEL) have been

+ limited in the past by relying on the system copies of packages in the

+ base distribution of the release. In the particular case of EPEL, little

+ can be done to upgrade these system copies. In order to be able to

+ package much of the available FOSS software out there, it may be

+ necessary to override some of the content shipped in the base system

+ with packages known to work properly.

+ 

+ _Requirement_: It must be possible for third party repositories to

+ create alternative versions that override base distribution content at

+ the user’s explicit choice.

+ 

+ _Requirement_: It must be possible for third party repositories to

+ create alternative versions of software that exist in the base

+ distribution.

+ 

+ === Reduce duplication in packaging work

+ 

+ There is plenty of software out in the wild that maintains compatibility

+ over time and is therefore useful to carry in multiple releases of

+ Fedora. With traditional packaging, this means carrying and building

+ separate branches of the packages for each release of Fedora. In the

+ case of software “stacks” which are tightly bound, this means also

+ manually building each of its dependencies in each release of Fedora.

+ 

+ _Requirement_: It must be possible to build multiple component software

+ packages in the same build process.

+ 

+ _Requirement_: It must be possible for the packager to specify the order

+ in which packages must be built (and to indicate which ones can be built

+ in parallel).

+ 

+ _Requirement_: It must be possible to build for all supported

+ platforms using the same specification and with a single build command.

+ 

+ == Non-Goals

+ 

+ === Parallel installability

+ 

+ As mentioned in the Background section, the goals of Modularity are

+ specifically to *not* implement parallel-installability. We recommend

+ instead that users should rely on other mechanisms such as

+ virtualization or containerization to accomplish this. If

+ parallel-installation is unavoidable, then Modularity is not the correct

+ tool for this job.

+ 

+ === Arbitrary stream switching

+ 

+ Module streams are intended to be compatible update streams. That means

+ they must follow the same rules regarding RPM package-level updates

+ within the stream. By definition, two streams of the same module exist

+ because upgrades (or downgrades or cross-grades…) are not capable of

+ being done in a safe, automated fashion.

+ 

+ That does not mean that stream switching should be impossible, but it

+ does mean that we will not build any tools intended to handle such

+ switching in a generic manner. Stream switches should be handled on a

+ module-by-module basis and detailed instructions and/or tools written

+ for each such case.
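
As an aside on the "Reduce duplication in packaging work" requirements in the diff above (a packager-specified build order, with an indication of which packages can be built in parallel), the idea can be sketched roughly as follows. This is only an illustration and not the build system's actual implementation: the integer "order" value, the function name `build_batches`, and the component names are all assumptions made up for this example.

```python
from collections import defaultdict


def build_batches(components):
    """Group components into ordered build batches.

    `components` maps a component name to a hypothetical integer build order.
    Lower numbers build first; components that share a number have no ordering
    constraint between them, so they may be built in parallel.
    """
    by_order = defaultdict(list)
    for name, order in components.items():
        by_order[order].append(name)
    # Sort batches by their order value, and names within a batch for stable output.
    return [sorted(by_order[order]) for order in sorted(by_order)]


if __name__ == "__main__":
    # Hypothetical software stack; names and order values are illustrative only.
    stack = {"libfoo": 0, "libbar": 0, "foo-runtime": 1, "foo-app": 2}
    for step, batch in enumerate(build_batches(stack), start=1):
        print(f"step {step}: build in parallel -> {', '.join(batch)}")
```

With a single declared order per component, one specification could in principle drive the same build sequence on every supported platform, which is the intent of the last requirement in that section.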

These pages are sourced from Stephen Gallagher's excellent community blog article on the origins of modularity and its complex set of use cases and requirements.

Oh wow! Let me read through and merge!

Pull-Request has been merged by asamalik

4 years ago