#5 Fix some issues with repoquery processing issues
Merged 4 years ago by clime. Opened 4 years ago by praiskup.
praiskup/prunerepo less-error-prone  into  master

file modified
+8 -4
@@ -42,8 +42,9 @@ 

      '--repofrompath=prunerepo_query,'+os.path.abspath(args.path),

      '--repo=prunerepo_query',

      '--refresh',

-     '--queryformat="%{location}"',

+     '--queryformat=%{location}',

      '--quiet',

+     '--setopt=skip_if_unavailable=False',

  ]

  

  get_latest_packages_cmd = get_all_packages_cmd + [ '--latest-limit=1' ]
@@ -80,10 +81,10 @@ 

          return

      process = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)

      (stdout, stderr) = process.communicate()

+     sys.stderr.write(stderr.decode(encoding='utf-8'))

      if process.returncode != 0:

-         print(stderr.decode(encoding='utf-8'), file=sys.stderr)

          sys.exit(1)

-     return [line.strip('"') for line in stdout.decode(encoding='utf-8').split()] # NOTE: for some reason the get_all_packages_cmd gives output as b'"..."\n"..."\n', hence line.strip('"')

+     return stdout.decode(encoding='utf-8').splitlines()

  

  

  def get_package_build_time(package_path):
@@ -100,7 +101,7 @@ 

      Get paths to rpm packages in the repository according to given repoquery_cmd

      """

      stdout = run_cmd(repoquery_cmd) # returns srpms as well

-     rel_rpms_paths = [relpath.strip('"') for relpath in stdout if not is_srpm(relpath)]

+     rel_rpms_paths = [relpath for relpath in stdout if not is_srpm(relpath)]

      abs_rpms_paths = [os.path.abspath(os.path.join(args.path, relpath)) for relpath in rel_rpms_paths]

      return abs_rpms_paths

  
@@ -131,6 +132,9 @@ 

      log_info('Removing obsoleted packages...')

      was_deletion = False

      latest_rpms = get_rpms(get_latest_packages_cmd)

+     if not latest_rpms:

+         log_info("No RPMs available")

+         return was_deletion

      all_rpms = get_rpms(get_all_packages_cmd)

      to_remove_rpms = set(all_rpms) - set(latest_rpms)

      for rpm in to_remove_rpms:

One of the largest repositories got massive data loss [1] which was
probably caused by improperly calling createrepo (without
skip_if_unavailable=False).

While we are on it
- let's not trash stderr if the exit status is 0
- drop the left/right doublequote around package name in queryformat

Please take a look.

[1] https://pagure.io/copr/copr/issue/1090

1 new commit added

  • Use splitlines instead of split for repoquery parsing
4 years ago

@msuchy had good idea that we can skip prunerepo if the list of "latest" packages is empty..

Which may be good live saver for the bug-cases when dnf returns exit status 0, but still
an empty package set.

I'll add new commit hopefully later today, but I can submit new PR (so feel free to merge).

It is actually hard to tell now what happened originally, but I assume the set
of "latest" packages really was empty in case of cran project. Not everything
was removed (only about 3/4 of packages) but that's likely because the set of
kept RPMs was not yet old enough to be candidates for removal when the
bug occurred.

Hmm, locally I don't need the skip_if_unavailable=False option. interesting.

1 new commit added

  • skip prunerepo if set(latest_rpms) is empty
4 years ago

I plan to hotfix this in production:

--- /usr/bin/prunerepo  2019-04-01 18:02:57.000000000 +0000
+++ /usr/bin/prunerepo  2019-11-12 09:20:27.292341122 +0000
@@ -40,6 +40,7 @@
     '--refresh',
     '--queryformat="%{location}"',
     '--quiet',
+    '--setopt=skip_if_unavailable=False',
 ]

 get_latest_packages_cmd = get_all_packages_cmd + [ '--latest-limit=1' ]
@@ -117,6 +118,9 @@
     log_info('Removing obsoleted packages...')
     was_deletion = False
     latest_rpms = get_rpms(get_latest_packages_cmd)
+    if not latest_rpms:
+        log_info("No RPMs available")
+        return was_deletion
     all_rpms = get_rpms(get_all_packages_cmd)
     to_remove_rpms = set(all_rpms) - set(latest_rpms)
     for rpm in to_remove_rpms:

Can anyone review please?

Hello, can you, please, expand on why --setopt=skip_if_unavailable=False should be needed for dnf repoquery command when local repository is used? If this option is not used, does it mean that dnf repoquery doesn't return non-zero status code when syncing the local repo fails? Btw. by default this option is False, do you have it set to True in your environment where you run prunerepo?

If this option is not used, does it mean that dnf repoquery doesn't return non-zero status code when syncing the local repo fails?

Yes, that's what happens on copr backend. It doesn't happen on my box, and I don't
know why. But if you see man dnf.conf, the default is given by dnf:

       skip_if_unavailable
              boolean

              If enabled, DNF will continue running and disable the repository
              that  couldn't  be  synchronized  for  any  reason.  This option
              doesn't affect skipping of unavailable packages after dependency
              resolution.  To  check  inaccessibility  of repository use it in
              combination with refresh command line  option.  The  default  is
              False.

Drat, I misread the dnf man page -- the default should really be False. So we probably experienced some bug in dnf.

Ahhhh, F30 has this in man dnf.conf:

       skip_if_unavailable
              boolean

              If enabled, DNF will continue running and disable the repository
              that  couldn't  be  synchronized  for  any  reason.  This option
              doesn't affect skipping of unavailable packages after dependency
              resolution.  To  check  inaccessibility  of repository use it in
              combination with refresh command line  option.  The  default  is
              True.

See https://fedoraproject.org/wiki/Changes/Set_skip_if_unavailable_default_to_false

5 new commits added

  • skip prunerepo if set(latest_rpms) is empty
  • Use splitlines instead of split for repoquery parsing
  • Set skip_if_unavailable=False to not loose the data
  • Always dump stderr of repoquery (not only in error case)
  • Drop useless double-quote in --queryformat
4 years ago

I updated the commit message, PTAL.

Anyone can review the hotfix patch?

Anyone can review the hotfix patch?

Makes sense to me, +1

@clime, can you please take another look?

@praiskup yes, sry, i will take a look today.

Do you know what actually happened? i.e. why the local repo was unavailable?

Also...it's a small thing but could you please look at output formatting? If you run tests (tests/run.sh), output e.g. looks like this:

============================ test --nocreaterepo ============================
> [[ `listpkgsbyfs` == `listpkgsbyrepo` ]]
> sleep 1
+ /home/clime/prunerepo/tests/../prunerepo --verbose --nocreaterepo .
Removing obsoleted packages...
Executing: dnf repoquery --repofrompath=prunerepo_query,/home/clime/prunerepo/tests/base/repo-test --repo=prunerepo_query --refresh --queryformat=%{location} --quiet --setopt=skip_if_unavailable=False --latest-limit=1

Executing: dnf repoquery --repofrompath=prunerepo_query,/home/clime/prunerepo/tests/base/repo-test --repo=prunerepo_query --refresh --queryformat=%{location} --quiet --setopt=skip_if_unavailable=False

warning: /home/clime/prunerepo/tests/base/repo-test/2-secondlatestpkg/example-1.0.2-2.fc23.x86_64.rpm: Header V3 RSA/SHA1 Signature, key ID 3a1ce3e4: NOKEY


Removing: /home/clime/prunerepo/tests/base/repo-test/2-secondlatestpkg/example-1.0.2-2.fc23.src.rpm
Removing: /home/clime/prunerepo/tests/base/repo-test/2-secondlatestpkg/example-1.0.2-2.fc23.src.rpm
Removing: /home/clime/prunerepo/tests/base/repo-test/2-secondlatestpkg/example-1.0.2-2.fc23.x86_64.rpm
warning: /home/clime/prunerepo/tests/base/repo-test/0-oldestbuild/example-1.0.1-1.fc23.x86_64.rpm: Header V3 RSA/SHA1 Signature, key ID 3a1ce3e4: NOKEY


Removing: /home/clime/prunerepo/tests/base/repo-test/0-oldestbuild/example-1.0.1-1.fc23.src.rpm
Removing: /home/clime/prunerepo/tests/base/repo-test/0-oldestbuild/example-1.0.1-1.fc23.src.rpm
Removing: /home/clime/prunerepo/tests/base/repo-test/0-oldestbuild/example-1.0.1-1.fc23.x86_64.rpm
warning: /home/clime/prunerepo/tests/base/repo-test/0-oldestbuild/example-debuginfo-1.0.1-1.fc23.x86_64.rpm: Header V3 RSA/SHA1 Signature, key ID 3a1ce3e4: NOKEY


Removing: /home/clime/prunerepo/tests/base/repo-test/0-oldestbuild/example-1.0.1-1.fc23.src.rpm
Removing: /home/clime/prunerepo/tests/base/repo-test/0-oldestbuild/example-1.0.1-1.fc23.src.rpm
Removing: /home/clime/prunerepo/tests/base/repo-test/0-oldestbuild/example-debuginfo-1.0.1-1.fc23.x86_64.rpm
warning: /home/clime/prunerepo/tests/base/repo-test/2-secondlatestpkg/example-debuginfo-1.0.2-2.fc23.x86_64.rpm: Header V3 RSA/SHA1 Signature, key ID 3a1ce3e4: NOKEY


Removing: /home/clime/prunerepo/tests/base/repo-test/2-secondlatestpkg/example-1.0.2-2.fc23.src.rpm
Removing: /home/clime/prunerepo/tests/base/repo-test/2-secondlatestpkg/example-1.0.2-2.fc23.src.rpm
Removing: /home/clime/prunerepo/tests/base/repo-test/2-secondlatestpkg/example-debuginfo-1.0.2-2.fc23.x86_64.rpm
> [[ `listpkgsbyfs` == `listpkgsbyrepo` ]]
> [[ $repomdlastmodtime1 == `stat -c %Y $testrepo/repodata/repomd.xml` ]]
success.

Before it was:

============================ test --nocreaterepo ============================
> [[ `listpkgsbyfs` == `listpkgsbyrepo` ]]
> sleep 1
+ /home/clime/prunerepo/tests/../prunerepo --verbose --nocreaterepo .
Removing obsoleted packages...
Executing: dnf repoquery --repofrompath=prunerepo_query,/home/clime/prunerepo/tests/base/repo-test --repo=prunerepo_query --refresh --queryformat="%{location}" --quiet --latest-limit=1
Executing: dnf repoquery --repofrompath=prunerepo_query,/home/clime/prunerepo/tests/base/repo-test --repo=prunerepo_query --refresh --queryformat="%{location}" --quiet
Removing: /home/clime/prunerepo/tests/base/repo-test/0-oldestbuild/example-1.0.1-1.fc23.src.rpm
Removing: /home/clime/prunerepo/tests/base/repo-test/0-oldestbuild/example-1.0.1-1.fc23.src.rpm
Removing: /home/clime/prunerepo/tests/base/repo-test/0-oldestbuild/example-debuginfo-1.0.1-1.fc23.x86_64.rpm
Removing: /home/clime/prunerepo/tests/base/repo-test/2-secondlatestpkg/example-1.0.2-2.fc23.src.rpm
Removing: /home/clime/prunerepo/tests/base/repo-test/2-secondlatestpkg/example-1.0.2-2.fc23.src.rpm
Removing: /home/clime/prunerepo/tests/base/repo-test/2-secondlatestpkg/example-1.0.2-2.fc23.x86_64.rpm
Removing: /home/clime/prunerepo/tests/base/repo-test/0-oldestbuild/example-1.0.1-1.fc23.src.rpm
Removing: /home/clime/prunerepo/tests/base/repo-test/0-oldestbuild/example-1.0.1-1.fc23.src.rpm
Removing: /home/clime/prunerepo/tests/base/repo-test/0-oldestbuild/example-1.0.1-1.fc23.x86_64.rpm
Removing: /home/clime/prunerepo/tests/base/repo-test/2-secondlatestpkg/example-1.0.2-2.fc23.src.rpm
Removing: /home/clime/prunerepo/tests/base/repo-test/2-secondlatestpkg/example-1.0.2-2.fc23.src.rpm
Removing: /home/clime/prunerepo/tests/base/repo-test/2-secondlatestpkg/example-debuginfo-1.0.2-2.fc23.x86_64.rpm
> [[ `listpkgsbyfs` == `listpkgsbyrepo` ]]
> [[ $repomdlastmodtime1 == `stat -c %Y $testrepo/repodata/repomd.xml` ]]
success.

Do you know what actually happened? i.e. why the local repo was unavailable?

No, unfortunately. That's why I am proposing 5dbb28b.

could you please look at output formatting?

Sure, I missed that.

1 new commit added

  • avoid additional newlines in stderr
4 years ago

Perhaps I could squash 5dbb28b commit with the last one.

That's ok, thank you for looking into this.

Pull-Request has been merged by clime

4 years ago
Metadata