#6087 Staging Pagure indexed
Closed: Fixed 6 years ago Opened 6 years ago by puiterwijk.

stg.pagure.io is indexed by Google (and probably other search engines).
We should add a robots.txt that tells search engines not to index it.


Metadata Update from @puiterwijk:
- Issue tagged with: easyfix

6 years ago

The robots.txt lgtm. Needs changes in ansible to make use of it.

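Actually, I lied, sorry. Per the spec, there can't be a blank line between the User-agent and Disallow lines: a blank line ends the record, so the Disallow rule would no longer belong to that User-agent.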
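For anyone who wants to check the blank-line point locally, here is a minimal sketch using Python's standard-library `urllib.robotparser`. The two robots.txt strings are illustrative only, not the file attached to this ticket, and the `allowed` helper and the `Googlebot` agent name are just for the demo. Other parsers may be more lenient than this one.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Parse robots.txt text and report whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Rule sits directly under its User-agent line: everything is blocked.
GOOD = "User-agent: *\nDisallow: /\n"

# A blank line ends the record early, so the Disallow line is orphaned.
BAD = "User-agent: *\n\nDisallow: /\n"

print(allowed(GOOD, "https://stg.pagure.io/"))  # False: crawling is blocked
print(allowed(BAD, "https://stg.pagure.io/"))   # True: the rule is ignored
```

With this parser, the version containing the blank line blocks nothing, which is the opposite of what we want for staging.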

I can look into making the changes in Ansible to use this. Can someone point me in the right direction for the next step please?

Disclaimer: I'm an apprentice and have only just started looking at the Ansible layout/configuration.

The repo can be found at https://infrastructure.fedoraproject.org/cgit/ansible.git/ (as listed on the current project overview page).

Looking at the pagure/frontend role, I'd convert the robots.txt task from a static file to a template. In the template, I'd use the 'pagure-staging' group_var to emit 'Disallow: /' for staging and keep the current Disallow/Crawl-delay rules for everything else.
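To make the proposed branching concrete, here is a rough Python sketch of the decision the template would encode. It is not the actual Jinja2 template or the patch: `render_robots_txt`, the `group_names` argument, and the `EXISTING_RULES` placeholder are all hypothetical, and the real production rules are not reproduced here.

```python
# Placeholder standing in for whatever the current robots.txt contains;
# the path and delay below are made up for illustration.
EXISTING_RULES = "User-agent: *\nCrawl-delay: 2\nDisallow: /hypothetical-path\n"


def render_robots_txt(group_names: list[str], existing_rules: str) -> str:
    """Return robots.txt content for a host, given its inventory groups."""
    if "pagure-staging" in group_names:
        # Staging: block every crawler from every path.
        return "User-agent: *\nDisallow: /\n"
    # Everything else: keep the rules that are already deployed.
    return existing_rules


print(render_robots_txt(["pagure-staging"], EXISTING_RULES))
print(render_robots_txt(["pagure"], EXISTING_RULES))
```

The point of keying on the group is that one template serves both environments, so staging and production cannot drift apart in how the file is deployed.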

Please let me know if you are no longer working on this issue and I will make the changes.

Attached is a git patch for this change. Changes made:

  • Changed roles/pagure/frontend/tasks/main.yml to deploy robots.txt via template
  • Removed (moved) roles/pagure/frontend/files/robots.txt
  • Created roles/pagure/frontend/templates/robots.txt.j2, which generates the staging robots.txt based on the pagure-staging group_var

Without (much) access, I wasn't able to test this. Please let me know if anything else is needed.
0001-disable-indexing-of-pagure-staging-converted-robots..patch

Can I get a patch review?

Looks good! Sorry for the delay.

I've pushed this in...

:guitar:

Metadata Update from @kevin:
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)

6 years ago

