Can we get a proper robots.txt deployed on fedorahosted? We're hitting several performance issues, and until we can figure out a plan for dealing with them I'd like to restrict Google a bit. Specifically, I want to disallow:
We may need to use mod_rewrite rules keyed on the user agent to force this; I'm not sure. I'm thinking about doing the same for git-web's snapshot feature for crawlers.
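For illustration only, a rough sketch of what such a mod_rewrite rule could look like; the crawler user-agent list and the assumption that gitweb snapshot requests carry a=snapshot in the query string are my own, not anything decided in this ticket:

    RewriteEngine On
    # Assumed list of common crawler user agents, matched case-insensitively
    RewriteCond %{HTTP_USER_AGENT} (googlebot|bingbot|slurp) [NC]
    # gitweb snapshot requests typically include a=snapshot in the query string
    RewriteCond %{QUERY_STRING} a=snapshot
    # Refuse the request outright (403 Forbidden)
    RewriteRule . - [F,L]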
I know we are currently frozen, but this looks relatively simple, and can be applied once we are unfrozen.
I am a newb to Fedora's puppet practices, so I attached a patch for review/comment/flame :)
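Roughly, the patch would just have puppet drop a robots.txt into the document root. A minimal sketch of that kind of resource is below; the file path and module name here are hypothetical placeholders, not what the attached patch actually uses:

    # Hypothetical example of managing robots.txt via puppet
    file { '/srv/web/fedorahosted.org/robots.txt':
      ensure => file,
      owner  => 'root',
      group  => 'root',
      mode   => '0644',
      source => 'puppet:///modules/fedorahosted/robots.txt',
    }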
As for the git-web stuff: I notice some of the other F/LOSS git-web instances are disallowing everything with robots.txt.
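In other words, something along the lines of:

    User-agent: *
    Disallow: /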
//browser/ has been robots.txt'd.
Related puppet commits: