#303 Handle two repos with the same relative_url in a single content set.
Merged 4 years ago by lsedlar. Opened 4 years ago by jkaluza.
jkaluza/odcs group-rmtree-errors into master

@@ -127,6 +127,11 @@
              'default': 3600,
              'desc': 'Time in seconds after which the local pungi-koji is '
                      'killed and compose is marked as failed'},
+         'mergerepo_timeout': {
+             'type': int,
+             'default': 1800,
+             'desc': 'Time in seconds after which the mergerepo_c is '
+                     'killed and compose is marked as failed'},
          'pungi_runroot_enabled': {
              'type': bool,
              'default': False,

@@ -126,6 +126,12 @@
          # Contains paths to per pulp repo pulp_repo_cache sub-directories.
          repo_paths = []
 
+         # Remove duplicated URLs from repos. It is useless to merge the same URLs and it would
+         # also break the locking code which tries to lock the same cache directory twice.
+         # The mergerepo_c can handle the case when we end up with just single repo in the `repos`.
+         repos = list(set(repos))
+         repos.sort()
+
          parsed_url = urlparse(repos[0])
          repo_prefix = "%s://%s" % (parsed_url.scheme, parsed_url.hostname)
          repo_prefix = repo_prefix.strip("/") + "/"
@@ -163,7 +169,7 @@
                  args.append("-r")
                  args.append(repo)
 
-             execute_cmd(args)
+             execute_cmd(args, timeout=conf.mergerepo_timeout)
          finally:
              for lock in locks:
                  if lock.is_locked:
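For context, the new timeout value is simply passed to the execute_cmd helper. A minimal sketch of what a timeout-aware runner could look like, assuming only Python's standard subprocess module (ODCS's real execute_cmd does more, e.g. logging and output handling, and may differ in detail):

    import subprocess

    def execute_cmd(args, timeout=None):
        # Hypothetical sketch, not ODCS's actual helper: run the command and
        # kill it once `timeout` seconds have passed, turning the timeout into
        # an error the caller can use to mark the compose as failed.
        try:
            subprocess.run(args, check=True, timeout=timeout)
        except subprocess.TimeoutExpired:
            # subprocess.run() kills the child before raising TimeoutExpired,
            # so only the error reporting is left to do here.
            raise RuntimeError("%s timed out after %s seconds" % (args[0], timeout))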

file modified
+11 -2
@@ -143,6 +143,15 @@
                      "signatures": "SIG1,SIG2",
                  },
              },
+             # Test two same relative_urls here.
+             {
+                 "notes": {
+                     "relative_url": "content/1.0/x86_64/os",
+                     "content_set": "foo-1",
+                     "arch": "x86_64",
+                     "signatures": "SIG1,SIG2",
+                 },
+             },
              {
                  "notes": {
                      "relative_url": "content/1.1/x86_64/os",
@@ -184,13 +193,13 @@
               '--repo-prefix-search', '%s/pulp_repo_cache' % conf.target_dir,
               '--repo-prefix-replace', 'http://localhost/',
               '-r', repo_prefix + "1.0/x86_64/os",
-              '-r', repo_prefix + "1.1/x86_64/os"])
+              '-r', repo_prefix + "1.1/x86_64/os"], timeout=1800)
          execute_cmd.assert_any_call(
              ['/usr/bin/mergerepo_c', '--method', 'nvr', '-o',
               c.result_repo_dir + '/foo-1/ppc64le',
               '--repo-prefix-search', '%s/pulp_repo_cache' % conf.target_dir,
               '--repo-prefix-replace', 'http://localhost/',
-              '-r', repo_prefix + "1.0/ppc64le/os"])
+              '-r', repo_prefix + "1.0/ppc64le/os"], timeout=1800)
 
          download_repodata.assert_any_call(
              repo_prefix + "1.0/x86_64/os",

Without this commit, when there are two repos in a single content set
sharing the same relative_url, ODCS tries to create and lock a single
file lock twice, because the locks are based on the relative_url. This
leads to a deadlock.

It also calls mergerepo_c with the same repo twice, which is a waste
of resources.

This commit changes that: duplicated repos are now removed before
mergerepo_c is called.

It also adds an 1800-second timeout to mergerepo_c to ensure it does not
block the whole process forever in case of issues.
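To illustrate the deadlock described above, here is a minimal sketch (hypothetical paths and helper name, not ODCS's actual locking code) showing how two repos with the same relative_url map to the same lock file, and how deduplicating the URLs first avoids acquiring that lock twice:

    import os
    from urllib.parse import urlparse

    def lock_paths(repos, cache_dir="/srv/odcs/pulp_repo_cache"):
        # One lock file per cached repo directory; duplicated URLs in `repos`
        # map to the same path and would therefore be locked twice.
        paths = []
        for repo in repos:
            relative_url = urlparse(repo).path.strip("/")
            paths.append(os.path.join(cache_dir, relative_url) + ".lock")
        return paths

    repos = [
        "http://localhost/content/1.0/x86_64/os",
        "http://localhost/content/1.0/x86_64/os",  # same relative_url twice
        "http://localhost/content/1.1/x86_64/os",
    ]
    print(lock_paths(repos))               # first lock path appears twice
    print(lock_paths(sorted(set(repos))))  # deduplicated, deterministic order

Here sorted(set(repos)) is the same deduplication the patch performs with list(set(repos)) followed by repos.sort().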

What's going to happen if there are two repos with the same relative URL but differing in the other fields? Can that even happen? Does ODCS care?

The content_set will always be the same, because that's defined by the query we do against Pulp. The arch does not have to be the same, but we group repos by arch before passing them to the MergeRepo class, so merging repos will be OK in that case.

The signatures can be an issue, but it is not a new problem introduced here. The Pulp class only returns the signatures of the first merged repo. It should probably return the union of all the signatures. But I think we do not change signing keys that often, especially in the middle of a product's lifetime, so even the current state should be OK.
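If returning the union ever becomes worthwhile, a minimal sketch could look like the following, reusing the "notes" layout from the test data above (the helper name is made up for illustration):

    def merged_signatures(pulp_repos):
        # Hypothetical helper: collect the signing keys of every repo that ends
        # up in one merged repo, instead of keeping only the first repo's keys.
        sigs = set()
        for repo in pulp_repos:
            sigs.update(repo["notes"]["signatures"].split(","))
        return ",".join(sorted(sigs))

    repos = [
        {"notes": {"relative_url": "content/1.0/x86_64/os", "signatures": "SIG1,SIG2"}},
        {"notes": {"relative_url": "content/1.0/x86_64/os", "signatures": "SIG2,SIG3"}},
    ]
    print(merged_signatures(repos))  # -> "SIG1,SIG2,SIG3"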

Pull-Request has been merged by lsedlar 4 years ago

Ok. Merged. I'll make a build and deploy on Monday.