#9400 Fedora Magazine feed no longer coming to Planet?
Closed: Fixed 3 years ago by pingou. Opened 3 years ago by pfrields.

Describe what you would like us to do:

Re: https://discussion.fedoraproject.org/t/fedora-magazine-not-appearing-in-fedora-planet-feed/24051

The Magazine feed has not appeared on Planet Fedora for a while now. However, the RSS feed on the Magazine itself appears to be working OK (https://fedoramagazine.org/feed). Could the team take a look at the Planet logs to help us determine the problem?

When do you need this to be done by? (YYYY/MM/DD)

Let's say 2020/11/01 -- this is important but not urgent. Our stats appear to have been down for a few months now, and this could be part of the reason why.


We have been seeing the planet software fail on some blogs lately, and I am not sure why.

This one, however, seems to be failing with:

ERROR:planet.runner:Error 403 while updating feed http://fedoramagazine.org/?feed=rss2

A bunch of other ones are failing on:

ERROR:planet.runner:KeyError: 'published_parsed'
ERROR:planet.runner:  File "/usr/lib/python2.7/site-packages/planet/spider.py", line 513, in spiderPlanet
    writeCache(uri, feed_info, data)
ERROR:planet.runner:  File "/usr/lib/python2.7/site-packages/planet/spider.py", line 236, in writeCache
    if entry['published_parsed']:
ERROR:planet.runner:  File "/usr/lib/python2.7/site-packages/planet/vendor/feedparser.py", line 246, in __getitem__
    return UserDict.__getitem__(self, realkey)
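
For illustration only: the traceback above shows `entry['published_parsed']` raising KeyError when a feed entry has no such key. A minimal sketch of a more tolerant lookup follows. The helper name `entry_timestamp` is hypothetical, plain dicts stand in for the feedparser entry objects, and falling back to `updated_parsed` is an assumption; this is not the planet code or its actual fix.

```python
import time


def entry_timestamp(entry):
    """Return the first parsed date found on a feed entry, or None.

    `entry` is any mapping with a .get() method; here plain dicts
    stand in for the real feedparser entries (an assumption).
    """
    for key in ("published_parsed", "updated_parsed"):
        value = entry.get(key)  # .get() avoids the KeyError from entry[key]
        if value:
            return value
    return None


# An entry that carries a publication date.
dated = {"title": "has a date", "published_parsed": time.gmtime(0)}
# An entry like the ones failing above: no 'published_parsed' key.
undated = {"title": "no date"}

print(entry_timestamp(dated) is not None)
print(entry_timestamp(undated))
```

Whether an entry without any date should be skipped or given a default is a separate decision that this sketch leaves open.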

Metadata Update from @kevin:
- Issue tagged with: dev, medium-gain, medium-trouble

3 years ago

Metadata Update from @kevin:
- Issue priority set to: Waiting on Assignee (was: Needs Review)

3 years ago

@misc @duck can you check fedoramagazine.org and see if it's doing any blocking based on IP or User-Agent?

This would be coming from: 152.19.134.199 with User-Agent of 'Python-httplib2/$Rev: 227 $'

If I try that with curl on the machine, I do indeed get a 403 Forbidden:

# curl -L -A 'Python-httplib2/$Rev: 227 $'  "https://fedoramagazine.org/?feed=rss2"
<html>
<head><title>403 Forbidden</title></head>
<body>
<center><h1>403 Forbidden</h1></center>
<hr><center>nginx</center>
</body>
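
For anyone who wants to repeat the curl check above from Python, here is a rough sketch using only the standard library. The function name `probe_feed` and its `opener` parameter are made up for this example (the parameter exists so the function can be exercised without network access). When no User-Agent is given, urllib sends its own default one rather than none at all.

```python
import urllib.error
import urllib.request


def probe_feed(url, user_agent=None, opener=urllib.request.urlopen, timeout=10):
    """Fetch `url` and return the HTTP status code.

    If `user_agent` is None, urllib's default User-Agent is sent.
    `opener` is a hypothetical injection point for testing.
    """
    headers = {"User-Agent": user_agent} if user_agent else {}
    request = urllib.request.Request(url, headers=headers)
    try:
        with opener(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        # 4xx/5xx responses are raised as exceptions; report the code instead.
        return exc.code


if __name__ == "__main__":
    feed = "https://fedoramagazine.org/?feed=rss2"
    # Same User-Agent string as in the curl command above.
    print(probe_feed(feed, user_agent="Python-httplib2/$Rev: 227 $"))
    # Default urllib User-Agent, for comparison.
    print(probe_feed(feed))
```

Comparing the two status codes shows whether the block depends on the User-Agent rather than on the source IP.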

I got the same result here, and removing the User-Agent makes it work.

I dug into all the settings on wpengine, as well as the documentation, and could not find anything. I'm in a support live-chat right now.

I got them to whitelist the User-Agent; it had been blocked under "Bots which blast sites". This option is not exposed in the settings, and they have no plan to expose it because they want to keep control over it. It is nevertheless possible to request the same exception for another site via the support chat.

Yeah, we had this issue in the past when we used a proxy; it triggered their bot-blocking filter.
And yes, this can only be changed by support, since it is done on their reverse proxy.

@misc @duck could you also check the communityblog?

ERROR:planet.runner:Error 403 while updating feed http://communityblog.fedoraproject.org/?feed=rss

Metadata Update from @pingou:
- Issue untagged with: dev
- Issue tagged with: ops

3 years ago

@pingou I asked for an exception for the community blog too.

I see Fedora Magazine showing up on the planet now.

I think we'll be able to close this ticket once the communityblog has been fixed as well.

Thanks for your help @misc and @duck!

I see posts from the communityblog on the planet today. So this looks all fixed!

Thanks again @misc and @duck!

Metadata Update from @pingou:
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)

3 years ago

Metadata
Boards 1
ops Status: Done