#1165 Accept modulemd for scratch module builds as a parameter in the submitted JSON.
Merged a month ago by mprahl. Opened a month ago by merlinm.
merlinm/fm-orchestrator scrmod_modulemd_in_json  into  master

file modified
+9 -3

@@ -107,9 +107,15 @@ 

    existing requires on the modulemd. The expected format is ``{'platform': ['f28', 'f29']}``.

  - ``scratch`` - a boolean indicating if a scratch module build should be performed.

    Only allowed to be ``True`` if the MBS setting ``MODULES_ALLOW_SCRATCH`` is ``True``.

- - ``yaml`` - a string of the input file when submitting a YAML file directly in a

-   ``multipart/form-data`` request. The MBS setting ``YAML_SUBMIT_ALLOWED`` must be set to ``True``

-   for this to be allowed.

+ - ``yaml`` - a string of the input file when submitting a YAML modulemd file directly in a

+   ``multipart/form-data`` request. Only allowed if ``scratch`` is ``True`` or if the MBS

+   setting ``YAML_SUBMIT_ALLOWED`` is ``True``. The basename of the file will be used as

+   the module name.

+ - ``modulemd`` - a string for submitting a YAML modulemd file as a parameter in the JSON data as

+   an alternative to sending it in a ``multipart/form-data`` request. Only allowed if

+   ``scratch`` is ``True`` or if the MBS setting ``YAML_SUBMIT_ALLOWED`` is ``True``.

+ - ``module_name`` - a string to use as the module name if a scratch build is requested and the

+   YAML modulemd is submitted using the ``modulemd`` parameter.

  - ``srpms`` - an optional list of Koji upload URLs of SRPMs to include in a module scratch build.

    Only allowed if ``scratch`` is ``True``.

  - ``rebuild_strategy`` - a string of the desired rebuild strategy (affects what components get

@@ -19,7 +19,8 @@ 

  ``submit_module_build(...)``.

  

  Alternatively, if submitting a YAML modulemd file is allowed (MBS setting

- ``YAML_SUBMIT_ALLOWED`` is ``True``), the user can send a ``multipart/form-data``

+ ``YAML_SUBMIT_ALLOWED`` is ``True``), the user can include it in the JSON data

+ (called ``modulemd`` and ``module_name``) or send a ``multipart/form-data``

  POST request directly including the contents of a YAML modulemd file

  (called ``yaml``). In this case, the JSON data and YAML file are handled by

  ``views.YAMLFileHandler``, which validates the data and calls the

@@ -27,7 +28,9 @@ 

  to ``submit_module_build(...)``.

  

  If module scratch builds are allowed (MBS setting ``MODULES_ALLOW_SCRATCH`` is

- ``True``), the user can also upload one or more source RPMs uploaded to Koji

+ ``True``), the user can request a scratch module build (called ``scratch``).

+ With a scratch build request, the user can include a YAML modulemd file

+ (see above) and also upload one or more source RPMs to Koji

  via calls to Koji's ``session.uploadWrapper(..)``, and supply the list of

  upload links to MBS (called ``srpms``). Such custom SRPMs will be used to

  override the git repository source for corresponding components.

file modified
+41 -19

@@ -31,6 +31,7 @@ 

  from flask import request, url_for

  from flask.views import MethodView

  from six import string_types

+ from io import BytesIO

  

  from module_build_service import app, conf, log, models, db, version, api_version as max_api_version

  from module_build_service.utils import (

@@ -172,10 +173,11 @@ 

      # Additional POST and DELETE handlers for modules follow.

      @validate_api_version()

      def post(self, api_version):

-         if hasattr(request, 'files') and "yaml" in request.files:

-             handler = YAMLFileHandler(request)

+         data = _dict_from_request(request)

+         if "modulemd" in data or (hasattr(request, "files") and "yaml" in request.files):

+             handler = YAMLFileHandler(request, data)

          else:

-             handler = SCMHandler(request)

+             handler = SCMHandler(request, data)

  

          if conf.no_auth is True and handler.username == "anonymous" and "owner" in handler.data:

              handler.username = handler.data["owner"]

@@ -198,7 +200,7 @@ 

          try:

              r = json.loads(request.get_data().decode("utf-8"))

          except Exception:

-             log.error('Invalid JSON submitted')

+             log.exception('Invalid JSON submitted')

              raise ValidationError('Invalid JSON submitted')

  

          if "owner" in r:

@@ -312,16 +314,9 @@ 

  

  

  class BaseHandler(object):

-     def __init__(self, request):

+     def __init__(self, request, data=None):

          self.username, self.groups = module_build_service.auth.get_user(request)

-         if "multipart/form-data" in request.headers.get("Content-Type", ""):

-             self.data = request.form.to_dict()

-         else:

-             try:

-                 self.data = json.loads(request.get_data().decode("utf-8"))

-             except Exception:

-                 log.error('Invalid JSON submitted')

-                 raise ValidationError('Invalid JSON submitted')

+         self.data = data or _dict_from_request(request)

  

          # canonicalize and validate scratch option

          if 'scratch' in self.data and str_to_bool(str(self.data['scratch'])):

@@ -342,7 +337,8 @@ 

  

      @property

      def optional_params(self):

-         return {k: v for k, v in self.data.items() if k not in ["owner", "scmurl", "branch"]}

+         return {k: v for k, v in self.data.items() if k not in

+                 ["owner", "scmurl", "branch", "modulemd", "module_name"]}

  

      def _validate_dep_overrides_format(self, key):

          """

@@ -367,7 +363,13 @@ 

      def validate_optional_params(self):

          module_build_columns = set([col.key for col in models.ModuleBuild.__table__.columns])

          other_params = set([

-             'branch', 'rebuild_strategy', 'buildrequire_overrides', 'require_overrides'])

+             'branch',

+             'buildrequire_overrides',

+             'modulemd',

+             'module_name',

+             'rebuild_strategy',

+             'require_overrides',

+         ])

          valid_params = other_params | module_build_columns

  

          forbidden_params = [k for k in self.data if k not in valid_params]

@@ -427,23 +429,43 @@ 

  

  

  class YAMLFileHandler(BaseHandler):

-     def __init__(self, request):

-         super(YAMLFileHandler, self).__init__(request)

+     def __init__(self, request, data=None):

+         super(YAMLFileHandler, self).__init__(request, data)

          if not self.data['scratch'] and not conf.yaml_submit_allowed:

              raise Forbidden("YAML submission is not enabled")

  

      def validate(self):

-         if "yaml" not in request.files:

+         if ("modulemd" not in self.data and

+                 (not hasattr(request, "files") or "yaml" not in request.files)):

              log.error('Invalid file submitted')

              raise ValidationError('Invalid file submitted')

          self.validate_optional_params()

  

      def post(self):

-         handle = request.files["yaml"]

+         if "modulemd" in self.data:

+             handle = BytesIO(self.data["modulemd"].encode("utf-8"))

+             if "module_name" in self.data and self.data["module_name"]:

+                 handle.filename = self.data["module_name"]

+             else:

+                 handle.filename = "unnamed"

Instead of setting the filename to unnamed, I'd prefer if you leave it as None. Then you could raise an exception in submit_module_build_from_yaml if the modulemd doesn't have the name set and the filename is None.

I just think it'd be confusing if there were module builds called unnamed in the database.

+         else:

+             handle = request.files["yaml"]

          return submit_module_build_from_yaml(self.username, handle,

                                               optional_params=self.optional_params)

  

  

+ def _dict_from_request(request):

+     if "multipart/form-data" in request.headers.get("Content-Type", ""):

+         data = request.form.to_dict()

+     else:

+         try:

+             data = json.loads(request.get_data().decode("utf-8"))

+         except Exception:

+             log.exception('Invalid JSON submitted')

+             raise ValidationError('Invalid JSON submitted')

+     return data

+ 

+ 

  def register_api():

      """ Registers the MBS API. """

      module_view = ModuleBuildAPI.as_view('module_builds')

file modified
+32 -25

@@ -26,10 +26,9 @@ 

  import module_build_service.scm

  

  from mock import patch, PropertyMock

- from werkzeug.datastructures import FileStorage

  from shutil import copyfile

  from os import path, mkdir

- from os.path import dirname

+ from os.path import basename, dirname, splitext

  from requests.utils import quote

  import hashlib

  import pytest

@@ -1703,8 +1702,8 @@ 

      @patch('module_build_service.scm.SCM')

      @patch('module_build_service.config.Config.modules_allow_scratch',

             new_callable=PropertyMock, return_value=True)

-     def test_submit_scratch_build(self, mocked_allow_scratch, mocked_scm, mocked_get_user,

-                                   api_version):

+     def test_submit_scratch_build(

+             self, mocked_allow_scratch, mocked_scm, mocked_get_user, api_version):

          FakeSCM(mocked_scm, 'testmodule', 'testmodule.yaml',

                  '620ec77321b2ea7b0d67d82992dda3e1d67055b4')

  

@@ -1760,9 +1759,8 @@ 

      @patch('module_build_service.scm.SCM')

      @patch('module_build_service.config.Config.modules_allow_scratch',

             new_callable=PropertyMock, return_value=False)

-     def test_submit_scratch_build_not_allowed(self, mocked_allow_scratch,

-                                               mocked_scm, mocked_get_user,

-                                               api_version):

+     def test_submit_scratch_build_not_allowed(

+             self, mocked_allow_scratch, mocked_scm, mocked_get_user, api_version):

          FakeSCM(mocked_scm, 'testmodule', 'testmodule.yaml',

                  '620ec77321b2ea7b0d67d82992dda3e1d67055b4')

  

@@ -1789,21 +1787,21 @@ 

             new_callable=PropertyMock, return_value=True)

      @patch('module_build_service.config.Config.yaml_submit_allowed',

             new_callable=PropertyMock, return_value=True)

-     def test_submit_scratch_build_with_mmd(self, mocked_allow_yaml,

-                                            mocked_allow_scratch,

-                                            mocked_get_user,

-                                            api_version):

+     def test_submit_scratch_build_with_mmd(

+             self, mocked_allow_yaml, mocked_allow_scratch, mocked_get_user, api_version):

          base_dir = path.abspath(path.dirname(__file__))

          mmd_path = path.join(base_dir, '..', 'staged_data', 'testmodule.yaml')

          post_url = '/module-build-service/{0}/module-builds/'.format(api_version)

          with open(mmd_path, 'rb') as f:

-             yaml_file = FileStorage(f)

-             post_data = {

-                 'branch': 'master',

-                 'scratch': True,

-                 'yaml': yaml_file,

-             }

-             rv = self.client.post(post_url, content_type='multipart/form-data', data=post_data)

+             modulemd = f.read().decode('utf-8')

+ 

+         post_data = {

+             'branch': 'master',

+             'scratch': True,

+             'modulemd': modulemd,

+             'module_name': str(splitext(basename(mmd_path))[0]),

+         }

+         rv = self.client.post(post_url, data=json.dumps(post_data))

          data = json.loads(rv.data)

  

          if api_version >= 2:

@@ -1857,13 +1855,22 @@ 

          mmd_path = path.join(base_dir, '..', 'staged_data', 'testmodule.yaml')

          post_url = '/module-build-service/{0}/module-builds/'.format(api_version)

          with open(mmd_path, 'rb') as f:

-             yaml_file = FileStorage(f)

-             post_data = {

-                 'branch': 'master',

-                 'scratch': True,

-                 'yaml': yaml_file,

-             }

-             rv = self.client.post(post_url, content_type='multipart/form-data', data=post_data)

+             modulemd = f.read().decode('utf-8')

+ 

+         post_data = {

+             'branch': 'master',

+             'scratch': True,

+             'modulemd': modulemd,

+             'module_name': str(splitext(basename(mmd_path))[0]),

+         }

+         rv = self.client.post(post_url, data=json.dumps(post_data))

+         data = json.loads(rv.data)

+ 

+         if api_version >= 2:

+             assert isinstance(data, list)

+             assert len(data) == 1

+             data = data[0]

+ 

          # this test is the same as the previous except YAML_SUBMIT_ALLOWED is False,

          # but it should still succeed since yaml is always allowed for scratch builds

          assert rv.status_code == 201

There is a problem that occurs when submitting a scratch module build that includes a modulemd YAML file, while trying to maintain compatibility with the pre-existing multipart/form-data handling in module_build_service/views.py.

The current method of doing this from the calling side is to pass a files= argument to the python requests.request() POST including a reference to the YAML file. However, when including a files= argument, request() insists that the data= argument be a dictionary--instead of accepting a pre-encoded json.dumps()ed string like it does when there is no files= argument. And that seems to be the problem. When data= is a dictionary, request() ultimately encodes each individual argument on its own--and it corrupts any argument that consists of structured JSON.

For example, one argument in the dictionary that may need to be passed to the MBS can look like:

'buildrequire_overrides': {'platform': ['el8']}

But this gets encoded in the request as:

'buildrequire_overrides': 'platform'

This promptly causes the server side to reject the request because 'buildrequire_overrides' is not in the expected format.

Per @mprahl's suggestion, this PR allows the modulemd YAML file to be directly included in the submitted JSON as the parameter modulemd, along with the module name as the parameter module_name.

pretty please pagure-ci rebuild

a month ago

pretty please pagure-ci rebuild

a month ago

pretty please pagure-ci rebuild

a month ago

pretty please pagure-ci rebuild

a month ago

pretty please pagure-ci rebuild

a month ago

CI keeps failing for this PR after the Clone Test Suite stage hangs for eight hours. When I manually run the CI steps for that stage, I get the following:

$ git clone --single-branch --depth 1 https://pagure.io/fm-orchestrator.git
Cloning into 'fm-orchestrator'...
remote: Counting objects: 245, done.
remote: Compressing objects: 100% (195/195), done.
remote: Total 245 (delta 30), reused 187 (delta 22)
Receiving objects: 100% (245/245), 544.77 KiB | 956.00 KiB/s, done.
Resolving deltas: 100% (30/30), done.
$ cd fm-orchestrator
$ git remote add proposed https://pagure.io/forks/merlinm/fm-orchestrator.git
$ git fetch proposed
fatal: the remote end hung up unexpectedly
$

Something appears to be messed up with Pagure or my specific fork.

try to rebase and force-push it to rerun CI

rebased onto 6f4451503ea57cd3cd5954b4b9a2288feba8e02e

a month ago

@vmaljulin Interesting. Thank you for the suggestion. After rebasing and force-pushing, the CI re-run has now (quickly and successfully) made it past the Clone Test Suite stage.

pretty please pagure-ci rebuild

a month ago

Prove that I'm wrong, but I think adding modulemd and module_name here in the BaseHandler actually allows me to submit non-scratch build using SCMHandler and setting "modulemd": "modulemd_data". This would get into the utils.submit_module_build in https://pagure.io/fm-orchestrator/blob/master/f/module_build_service/utils/submit.py#_650 and although this will probably fail with SyntaxError: keyword argument repeated, it is very fragile to code changes in the submit_module_build.

These new modulemd and modulemd_name params should not get that far.

I think you might not touch optional_params at all and instead do extra check in YAMLFileHandler.

@jkaluza, perhaps I'm misunderstanding what you're saying, but this change puts modulemd and module_name in the list of parameters not included in optional_params--preventing them from ever getting through to submit_module_build()? During initial implementation of this PR, I got complaints from models.ModuleBuild.create() about modulemd and module_name not being valid columns in the data model which brought me to the necessity of adding them here.

@merlinm, yeah, you are right. I just saw the not in the optional_params. I've read it wrong in my review. Ignore my comment please :).

rebased onto 655e0bc

a month ago

@merlinm could you please add documentation on how this new key works in the read me? We haven't been as disciplined as we would like in the past about doing this, but I think this would be a good opportunity.

Optional: Perhaps log.exception would be better here

I know this isn't related to the PR, but this is quite odd. I submitted a PR to clean this up and would appreciate a review:
https://pagure.io/fm-orchestrator/pull-request/1171

I'll finish this review next week.

2 new commits added

  • Add/improve documentation for scratch module build requests,
  • Use log.exception when reporting JSON parsing exceptions.
a month ago

@merlinm could you please add documentation on how this new key works in the read me? We haven't been as disciplined as we would like in the past about doing this, but I think this would be a good opportunity.

Yup. Done in my latest commit.

Optional: Perhaps log.exception would be better here

Thanks for the suggestion. Also done. (Although this is old code I simply moved within the file! :smirk:) I also revised this in a second location that was using log.error within an exception.

This needs to be rebased to PR#1171 or we need to rebase PR#1171 if we merge this.

Otherwise, +1.

Instead of setting the filename to unnamed, I'd prefer if you leave it as None. Then you could raise an exception in submit_module_build_from_yaml if the modulemd doesn't have the name set and the filename is None.

I just think it'd be confusing if there were module builds called unnamed in the database.

rebased onto a46a4ec

a month ago

I left one comment, but since PR #1171 will cause conflicts on this PR. I'll merge it as is, then rebase #1171 and merge it, and then I'll submit the PR with the change I requested in this PR.

:thumbsup:

Pull-Request has been merged by mprahl

a month ago