#11736 recurring mirror service stop/failure
Closed: Fixed a month ago by zlopez. Opened a month ago by pgfed.

I've created a new project @ my Pagure

https://pagure.io/pgnd/authenticator-UPSTREAM-MIRROR

it's set up as a mirror of,

https://gitlab.gnome.org/World/Authenticator

It's not populating @ Pagure. This has happened a couple of times before with mirroring ...

@kevin investigated briefly, found the service was stuck/crashed, and restarted it.
iiuc, no cause was determined for the service failure.

last time, with another project/mirror that was still not populated after ~3 hrs, the restart did the trick ... mirror populated within 5 mins of the restart.

this time, ~ 3 hours after restart, still not populated.


Metadata Update from @zlopez:
- Issue priority set to: Waiting on Assignee (was: Needs Review)
- Issue tagged with: Needs investigation, medium-gain

a month ago

there's a history with mirror service, possibly relevant

https://pagure.io/fedora-infrastructure/issue/11703
https://pagure.io/fedora-infrastructure/issue/10806
https://pagure.io/fedora-infrastructure/issue/10262
https://pagure.io/pagure/issue/4983

fwiw, I'd tried, earlier, with/without ".git" suffix on source URL; no apparent change in behavior.

checking another of my mirrors,

https://pagure.io/pgnd/dkimpy-milter-UPSTREAM-MIRROR

it was "Created 21 days ago"

but still reports,

"This repo is brand new and meant to be mirrored from https://git.launchpad.net/dkimpy-milter !"

and is unpopulated

The problem was that a mirror of llvm got stuck:

git 465459 0.0 0.0 909556 6156 ? Sl Jan23 0:05 git-receive-pack /srv/git/repositories/llvm-project-mirror.git

So, we may want to see if we can set some kind of timer and kill the connection if it takes longer than that?
But that might need upstream improvements.
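
For illustration only, a minimal sketch of how a stuck process like the one above could be spotted before anyone files a ticket. It assumes a Linux host whose `ps` supports the `etimes` (elapsed seconds) column; the function names and the one-hour threshold are hypothetical and not part of pagure.

```python
import subprocess


def find_stuck(ps_output, max_seconds, needle="git-receive-pack"):
    """Return (pid, elapsed_seconds, command) for matching processes
    that have been running longer than max_seconds."""
    stuck = []
    for line in ps_output.splitlines():
        parts = line.split(None, 2)  # pid, elapsed seconds, full command line
        if len(parts) < 3 or not parts[0].isdigit() or not parts[1].isdigit():
            continue  # header line or something unexpected
        pid, elapsed, command = int(parts[0]), int(parts[1]), parts[2].strip()
        if needle in command and elapsed > max_seconds:
            stuck.append((pid, elapsed, command))
    return stuck


def list_processes():
    """Assumption: procps-ng `ps`, where `etimes` is elapsed time in seconds."""
    result = subprocess.run(
        ["ps", "-eo", "pid,etimes,args"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Hypothetical threshold: flag anything older than one hour.
    for pid, elapsed, command in find_stuck(list_processes(), max_seconds=3600):
        print(f"possibly stuck: pid={pid} elapsed={elapsed}s cmd={command}")
```

This only reports candidates; whether to kill them automatically is a separate decision.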

couple of thoughts/comments.

atm,

https://pagure.io/pgnd/dkimpy-milter-UPSTREAM-MIRROR

has populated. but,

https://pagure.io/pgnd/authenticator-UPSTREAM-MIRROR

has NOT.

the "21-days, no update" is a concern. sure, 1st -- my bad for not noticing (we can talk abt Notifications/Messaging at a later date ...)

that length of 'stuck' shouldn't be possible.
and, doesn't seem like the cause is best resolved by just restarting the service ... on timer, or otherwise.

this has happened a number of times (links above).
'something' keeps recurring.

also, fyi, i've setup mirrors of these same repos @ GitLab.
population was virtually immediate.
in other usage, I've never had a mirror @ GL not populate/update.

so far, I've only seen this here @ pagure.

https://pagure.io/pgnd/authenticator-UPSTREAM-MIRROR has populated.

It's just that the default branch is 'main' which doesn't exist. If you go into settings and set it to 'master' it should default right.
Or if you go to https://pagure.io/pgnd/authenticator-UPSTREAM-MIRROR/tree/master it should be there.

> It's just that the default branch is 'main' which doesn't exist.

checking @

https://pagure.io/pgnd/authenticator-UPSTREAM-MIRROR/settings#gitbranch-tab

it's, rather, set to

Default Branch == belelmoussaoui/camera-improvement

not something I'd selected.

checking @ upstream,

https://gitlab.gnome.org/World/Authenticator

its default branch is set to 'master'

changing @ Pagure.io

-   Default Branch == belelmoussaoui/camera-improvement
+   Default Branch == 'master'

does, in fact, do the trick. It's populated correctly now @ Pagure.
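
For anyone hitting the same mismatch, a rough sketch of checking which branch upstream advertises as its default, so the mirror's setting can be compared against it. It relies on `git ls-remote --symref <url> HEAD`; the function names are hypothetical, and this is not something pagure does for you.

```python
import subprocess


def parse_default_branch(ls_remote_output):
    """Extract the branch name from a line like 'ref: refs/heads/master<TAB>HEAD'.
    Returns None if the remote did not advertise a symbolic HEAD."""
    prefix = "refs/heads/"
    for line in ls_remote_output.splitlines():
        if line.startswith("ref: ") and line.endswith("\tHEAD"):
            ref = line[len("ref: "):-len("\tHEAD")]
            if ref.startswith(prefix):
                return ref[len(prefix):]
    return None


def upstream_default_branch(url, timeout_seconds=60):
    """Ask the remote which branch its HEAD points to."""
    result = subprocess.run(
        ["git", "ls-remote", "--symref", url, "HEAD"],
        capture_output=True, text=True, check=True, timeout=timeout_seconds,
    )
    return parse_default_branch(result.stdout)


if __name__ == "__main__":
    print(upstream_default_branch("https://gitlab.gnome.org/World/Authenticator"))
```

If the name returned here differs from the Default Branch shown in the mirror's settings, the project page can look empty even though the data has arrived.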

some user-visible notification of the problem would be useful.
That's a different issue/topic ...

That would be something for pagure to implement. I'm closing this as fixed as it's working.

Metadata Update from @zlopez:
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)

a month ago

@Zlopez

I see you closed this as 'fixed'.

What exactly is the long-term fix, to keep this from reoccurring?

"all" i'm able to see here that's been done is a restart ...

The longer-term fix is a change in the pagure_mirror service to notify the user about the wrong branch. You can file that in the pagure issue tracker.

> keep this from reoccurring

so that won't be addressed.

We can't do much else on the Fedora infra side. As I understood it, the issue in this case was the wrong branch set on the mirror, which couldn't be populated.

no, 'wrong branch' hasn't been the cause of the repeated/recurring service fails.

no worries, there are reliable options/alternatives.

Yeah, I think upstream needs to add some kind of timer here. If a mirror takes longer than some set time limit, it should stop it and log an error, but then keep going on to process the rest.

I'm not sure how hard that would be to implement, but without it, this is going to keep happening as you note.
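
A minimal sketch of that timer idea, under stated assumptions: each mirror update is an external command, and the job names, commands, and time limit here are hypothetical. This is not how pagure_mirror is actually structured; it only shows the "stop it, log an error, keep going" behaviour.

```python
import logging
import subprocess

log = logging.getLogger("mirror_sketch")


def run_with_timeout(name, command, timeout_seconds):
    """Run one mirror command; return 'ok', 'timeout', or 'failed'."""
    try:
        # subprocess.run kills the child process if the timeout expires.
        subprocess.run(command, check=True, capture_output=True,
                       timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        log.error("mirror %s exceeded %ss, stopped", name, timeout_seconds)
        return "timeout"
    except (subprocess.CalledProcessError, OSError) as exc:
        log.error("mirror %s failed: %s", name, exc)
        return "failed"
    return "ok"


def process_all(jobs, timeout_seconds):
    """Process every (name, command) job; one slow or broken mirror
    does not block the ones after it."""
    return {name: run_with_timeout(name, command, timeout_seconds)
            for name, command in jobs}
```

One caveat: the timeout kills only the direct child, so helper processes spawned by git may need separate handling (for example, running each job in its own process group).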

> add some kind of timer
> ...
> without it, this is going to keep happening as you note

i've moved these mirrors to GitLab
fwiw, you can at least get mirror status via API; e.g.,

curl -sS --header "PRIVATE-TOKEN: xxxx" "https://gitlab.com/api/v4/projects/54207353" | jq . | grep -E "import_url|updated_at|import_error|import_status"
    "import_url": "https://gitlab.gnome.org/World/Authenticator.git",
    "import_status": "finished",
    "import_error": null,
    "updated_at": "2024-01-24T16:31:01.320Z",

notification can be triggered on result.
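
As a sketch of what "triggered on result" could look like, here is the same API call in Python with a small check on the fields shown above. The function names are hypothetical, and treating `"failed"` as the error status is an assumption based on the field names, so verify against the GitLab API docs before relying on it.

```python
import json
import urllib.request


def mirror_problem(project):
    """Return a short description if the project JSON indicates a mirror
    problem, else None. Only looks at import_error and import_status."""
    error = project.get("import_error")
    if error:
        return f"import_error: {error}"
    if project.get("import_status") == "failed":  # assumed status value
        return "import_status: failed"
    return None


def fetch_project(base_url, project_id, token, timeout_seconds=30):
    """GET /api/v4/projects/<id>, same request as the curl example."""
    request = urllib.request.Request(
        f"{base_url}/api/v4/projects/{project_id}",
        headers={"PRIVATE-TOKEN": token},
    )
    with urllib.request.urlopen(request, timeout=timeout_seconds) as response:
        return json.load(response)


if __name__ == "__main__":
    project = fetch_project("https://gitlab.com", 54207353, "xxxx")
    problem = mirror_problem(project)
    if problem:
        print(f"mirror needs attention: {problem}")  # hook a notification in here
```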

so far, i've never had a mirror fail and not recover in short order without my intervention. so, can't speak to what recovery mechanism they have in place. or if it's simply due to a more robust arch.

if similar status reporting exists @

https://pagure.io/api/0/

i haven't found it yet.
