#2154 kojira: swap first_seen with latest mtime for repo
Merged 2 years ago by tkopecek. Opened 2 years ago by tkopecek.
tkopecek/koji issue2139  into  master

file modified
+15 -3
@@ -162,10 +162,22 @@ 

          # XXX - config

          if self.state != koji.REPO_INIT:

              return False

-         age = time.time() - max(self.event_ts, self.first_seen)

-         # the first_seen timestamp is also factored in because a repo can be

+         times = [self.event_ts]

+         # the mtime is also factored in because a repo can be

          # created from an older event and should not be expired based solely on

          # that event's timestamp.

+         path = self.get_path()

+         if os.path.exists(path):

+             try:

+                 times.append(os.stat(path).st_mtime)

+             except Exception:

+                 self.logger.error("Can't read mtime for %s" % path)

+                 return False

+         else:

+             times.append(self.first_seen)

+             self.logger.warning("Repo %d is in INIT state, "

+                                 "but doesn't have directory %s yet?" % (self.repo_id, path))

+         age = time.time() - max(times)

          return age > timeout

  

      def tryDelete(self):
@@ -193,7 +205,7 @@ 

                  self.logger.error("Can't stat repo directory: %s, %s" % (path, e.strerror))

                  return False

          else:

-             times = [self.event_ts, mtime, self.first_seen, self.expire_ts]

+             times = [self.event_ts, mtime, self.expire_ts]

              times = [ts for ts in times if ts is not None]

              age = time.time() - max(times)

              self.logger.debug("Repo %s (%s) age: %i sec", self.repo_id, path, age)

first_seen is measured from start of the kojira process. This is
inconsistent between runs and short-lived processes will never delete
some repos. Replace it with mtime of repo directory, it still should be
sufficient indicator of age (if it exists).

Fixes: https://pagure.io/koji/issue/2139

Metadata Update from @tkopecek:
- Pull-request tagged with: testing-ready

2 years ago

Metadata Update from @jcupova:
- Pull-request tagged with: testing-done

2 years ago

My concern here is that there might be a race where:

  • kojira somehow sees a repo in init before the directory is created
  • the repo in question is from an event in the past

Now perhaps I'm being paranoid. It might be impossible for kojira to see a repo in init before the directory is present.

If such a race is possible, then we could the first_seen data helps us avoid it. We could possibly only use that extra time if mtime is unknown.

Otoh, if it is impossible for mtime to be unknown in the calculation, then we should probably do something other than just ignore the missing dir and mtime.

If there is timeout 3600 wouldn't it be wrong anyway, if that directory still doesn't exist? Or we can raise it to 24 hours if there are concerns for long running tasks (maven?).

If there is timeout 3600 wouldn't it be wrong anyway, if that directory still doesn't exist?

Do you mean the timeout value of 36000 in the stale() function?

Without mtime and first_seen, all we have is the event timestamp, which might not be indicative of age. An admin could have just triggered an one-off repo from a very old event. A transient nfs read error could keep us from reading mtime, or perhaps the possible race I mentioned above.

Perhaps it would make sense to keep first_seen, but only use it if mtime is absent?

Or perhaps we should do something entirely different if we cannot read the directory mtime. I think that is outside the expected range of "stale repos"

I guess what I'm saying is that in the context of h

rebased onto 48a0b63

2 years ago

Ok, I've added first_seen in case directory doesn't exist. In case it exists but can't be read, error is logged and False returned. in tryDelete it shouldn't be a problem.

Commit 68f39fe fixes this pull-request

Pull-Request has been merged by tkopecek

2 years ago