See older issue and recent mailing list thread
The old ticket suggests a banner, and in the thread I suggested another site. But, in reading the ticket again, I am inclined to agree that this is causing more harm than good, and that if old reference material is needed, it exists on the Internet Archive.
I think the main exception is old release notes, because 1) they've got a lot of valuable history for some technical decisions that sometimes it's useful to refer to and 2) by their nature, they're less likely to get confused for current docs.
We convert the English-language release notes from their single-page html form to asciidoc programmatically, and add those to the relevant (new, even though the numbers are old!) branches of https://pagure.io/fedora-docs/release-docs-home.
The non-English release notes (where they exist — it seems pretty random) could also get this treatment but I don't want it to block Steps 1-3.
FC1, FC2, and FC3 have separate documents for 32 and 64 bit systems. For simplicity, I suggest we combine them into a single document (one right after the other).
Add the following redirect rules
RewriteRule "^(\w\w-\w\w)/Fedora(_Core)?/(\d\d?)/html/Release_Notes.*" "/$1/fedora/f$3/release-notes/" [R=301,L]
RewriteRule "^(\w\w-\w\w)/Fedora(_Core)?/.*" - [G]
Note that the match is case-sensitive: the old docs are under /en-US/Fedora.* and the new docs under /en-US/fedora/. This is kind of terrible, but there was probably a good reason. :)
This redirects the release notes pages, and returns HTTP 410 ("Gone") for the rest. That theoretically makes search engines drop the pages faster than a 404 would.
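To sanity-check what those two rules would do, here is a rough Python sketch that applies the same patterns to a few request paths. It assumes the paths arrive without a leading slash (as mod_rewrite presents them in per-directory context), and the `resolve` helper and the sample paths are made up for illustration; it is not a substitute for testing against the real server.

```python
import re

# Patterns copied from the two RewriteRule lines; Python's `re` understands
# the same \w and \d classes. Assumption: paths have no leading slash.
RELEASE_NOTES = re.compile(r"^(\w\w-\w\w)/Fedora(_Core)?/(\d\d?)/html/Release_Notes.*")
OLD_DOCS = re.compile(r"^(\w\w-\w\w)/Fedora(_Core)?/.*")


def resolve(path):
    """Return (status, location) for a request path, mimicking the two rules."""
    m = RELEASE_NOTES.match(path)
    if m:
        # First rule: permanent redirect to the new release-notes location.
        return 301, "/{}/fedora/f{}/release-notes/".format(m.group(1), m.group(3))
    if OLD_DOCS.match(path):
        # Second rule: the [G] flag answers 410 Gone.
        return 410, None
    # Neither rule matched, so the request is left alone.
    return None, None


if __name__ == "__main__":
    for sample in (
        "en-US/Fedora/25/html/Release_Notes/index.html",       # hypothetical path
        "en-US/Fedora/18/html/Installation_Guide/index.html",  # hypothetical path
        "en-US/fedora/f30/release-notes/",                     # new-style path
    ):
        print(sample, "->", resolve(sample))
```

Because the match is case-sensitive, the new-style lowercase `fedora` path falls through both rules, which is the behaviour the plan relies on.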
Optional: the language codes for the old docs are always in the form aa-AA. The new site uses that for some, but just aa for most. Add (separate) rewrite rules for all of those.
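If we do the optional per-language rules, they could be generated rather than typed by hand. This is only a sketch: the `SHORT_CODES` mapping below is hypothetical (the real list of languages that use the short form would have to be read off the new site), and the generated rules would need to sit before the generic release-notes rule so they match first.

```python
# Hypothetical mapping from old aa-AA codes to the short aa codes used on the
# new site. These two entries are examples only, not the real list.
SHORT_CODES = {"fr-FR": "fr", "ja-JP": "ja"}

# Same shape as the generic release-notes rule, but with the language code
# fixed, so the release number moves from $3 to $2.
RULE = (
    'RewriteRule "^{old}/Fedora(_Core)?/(\\d\\d?)/html/Release_Notes.*" '
    '"/{new}/fedora/f$2/release-notes/" [R=301,L]'
)


def language_rules(mapping):
    """Build one RewriteRule line per old-code/new-code pair, sorted by old code."""
    return [RULE.format(old=old, new=new) for old, new in sorted(mapping.items())]


if __name__ == "__main__":
    for line in language_rules(SHORT_CODES):
        print(line)
```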
Make a custom 410 page which explains that older documentation can be found on the Internet Archive, ideally with a link.
Also while we're at it, fix the 404 error page on the docs site.
I think this has mostly stalled for lack of anyone feeling like they have enough authority to do this. Well, let's go ahead and declare such authority. Does anyone have any objection to this overall plan? If you do, please speak up quickly. I'm going to post this to the docs list... if there's no significant opposition in, say, two weeks, let's move to working out whatever details remain and then do it.
So, as part of this I would really really love to drop the old docs from our proxies. :( It takes up about 20GB and it's a pain because we have to hardlink it to avoid filling up disk space on proxies.
The process currently is a messy set of rsyncs that combine the old docs, the new docs and docs-redirects into one tree of files that we serve.
So, hopefully we can just drop that tree from the process, but I want to make sure we do that. ;)
I'll edit that in to step 4.
Here's the old docs converted (very quickly with a script) into asciidoc: https://fedorapeople.org/~mattdm/misc/old-release-notes-adoc.tar.xz
These need to be put into branches of https://pagure.io/fedora-docs/release-notes (with the corresponding framework), and also I guess https://pagure.io/fedora-docs/release-docs-home updated?
Also those converted docs should have a "This is old material for reference only" bit added at the top. I didn't do that.
Can someone other than me volunteer to drive the next steps of this? I have too many ideas right now and not enough follow-through. :)
A couple questions here:
Not sure where they are physically stored -- but they're not part of the antora setup as I understand it.
That's the idea, yeah. If there's anything useful in there that hasn't been updated yet, time to do it. If it turns out someone was using something, we can see what we can do to get it up to date.
The old docs are in https://pagure.io/fedora-docs-web.git all 4.3GB of them.
We clone that out on sundries01, then rsync it to proxies.
It's a kind of annoying dance there. What we do is:
(this allows us to hardlink things so the old docs + docs combined doesn't take up 8.6GB + new docs)
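For anyone unfamiliar with why hardlinking helps here: two hardlinked paths share one inode, so the file's data is stored once no matter how many trees it appears in. The sketch below only illustrates that property in a temporary directory; the directory and file names are invented, and it is not the actual sync process used on the proxies.

```python
import os
import tempfile


def hardlink_demo():
    """Link one file into a second tree and report whether both share an inode."""
    with tempfile.TemporaryDirectory() as root:
        # Invented layout: an "old-docs" tree and a "combined" tree.
        src = os.path.join(root, "old-docs", "guide.html")
        dst = os.path.join(root, "combined", "guide.html")
        os.makedirs(os.path.dirname(src))
        os.makedirs(os.path.dirname(dst))
        with open(src, "w") as fh:
            fh.write("<html>old docs</html>")
        # A hardlink adds a second directory entry for the same inode, so the
        # content is not duplicated on disk.
        os.link(src, dst)
        a, b = os.stat(src), os.stat(dst)
        return a.st_ino == b.st_ino, a.st_nlink


if __name__ == "__main__":
    same_inode, link_count = hardlink_demo()
    print("same inode:", same_inode, "link count:", link_count)
```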
Metadata Update from @ryanlerch: - Issue assigned to ryanlerch
ok, i'm going to start having a crack at this -- starting at step 1.
Going to do the following steps for Fedora 25 release notes:
@ryanlerch Awesome, thanks! See https://fedorapeople.org/~mattdm/misc/old-release-notes-adoc.tar.xz for my stab at converting to asciidoc.
@ryanlerch Any progress?
I realize I'm a couple months late here, but assuming @ryanlerch hasn't put a significant amount of work into this yet:
Issue no. 1 is keeping these old docs out of search results. I see the repo has a robots.txt that excludes some old IPA docs that people were complaining about earlier. I know those are excluded from searches; is that based on this file, or is it handled somewhere else? If it is this file, why don't we just add en-US/Fedora/26}/* etc. and a few more lines for the other URLs, and fix that problem immediately?
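A quick way to check whether a given set of robots.txt lines would cover the old URLs is Python's standard `urllib.robotparser`. The rules below are hypothetical and are not copied from the file in the repo. Note that this parser does plain, case-sensitive prefix matching, so a single `Disallow` prefix covers every release underneath it; real crawlers may interpret wildcards differently.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the Disallow prefixes are illustrative and
# not taken from the actual file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /en-US/Fedora/
Disallow: /en-US/Fedora_Core/
"""


def is_crawlable(path, robots_txt=ROBOTS_TXT):
    """Return True if a generic crawler may fetch `path` under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("*", path)


if __name__ == "__main__":
    for sample in (
        "/en-US/Fedora/26/html/Release_Notes/index.html",  # hypothetical old path
        "/en-US/fedora/f30/release-notes/",                # new-style path
    ):
        print(sample, "->", is_crawlable(sample))
```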
As for reducing space taken up by old docs - each doc we published using publican back in the day has a PDF version. How about we just take all those PDFs, put them into one big tarball, put that in a normal repo as an attachment, and link to it for people who want to download the archive? Actually it doesn't even need to be one big tarball, we could just point them to a pagure repo where they could grab anything they wanted. That way we wouldn't need to worry about asciidoc conversions, and Kevin should be able to get rid of the whole rsync process since everything would be in the current system.
That makes sense to me.
Yeah, I'd even go as far as saying "let's throw them on the Internet Archive and provide links to that for anyone who needs the old docs".
We agreed today that we like the approach of dropping the old docs from the server and pointing people to the old fedora-docs-web repo when they need historical content. Before we do this, we'll wait a week for comment on the Discussion thread.
Metadata Update from @bcotton: - Issue assigned to pboy (was: ryanlerch)
Just ran across this issue again while doing some housekeeping. In my understanding this is fixed now, isn't it? The old docs are gone and if anyone wants them they can find them in https://pagure.io/fedora-docs-web - so we can finally close this, right? :)
sure
Yes, I think so... although there's something fishy on the infrastructure side still. The new docs and redirects combined are under 1GB, but the combined dir is 13GB... so some old content didn't get removed somewhere along the line. However, we can track that on the infra side.
Alright. Ping me if there's anything I can do.
Metadata Update from @pbokoc: - Issue close_status updated to: Fixed - Issue status updated to: Closed (was: Open)