#1455 transifex upgrade

Created 7 years ago by glezos

With F11 and the freezes out of the way, we can now remove the old L10n tools. Here's a task list:

  1. Remove Transifex 0.3 from /submit/
  2. Remove Damned Lies from /
  3. Move Transifex 0.5 from /tx/ to /
  4. Upgrade to Tx 0.6

About the last one, we'll need to install Tx 0.6 on one of the EL test servers in order to test and then proceed in upgrading the production instance to 0.6. This could be another ticket, though.

publictest3.fedoraproject.org is built for transifex testing now.

Note that I have applied one live patch on it for the django auth provider in python-fedora - the bug that it fixes will get fixed in the next python-fedora version.

Thanks to ricky, http://translate.fedoraproject.org/ is now running Tx 0.5.2 instead of Damned Lies.

I quickly went through the pages and the only bug I found is the logo not showing up.

http://translate.fedoraproject.org/site_media/images/tx-logo.png

As mentioned on list, the logo issue is fixed now (I forgot a ^ in the RewriteRule). By the way, I put Transifex into a puppet module today, so it's easy to run it on multiple app servers now, as long as the code supports it.
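For anyone wondering why a single missing ^ matters: a rough sketch, assuming the rewrite pattern is matched with "search anywhere" semantics. The patterns and paths below are hypothetical (the actual RewriteRule isn't quoted in this ticket), and Python's re module stands in for Apache's regex engine.

```python
import re

# Hypothetical patterns; the real RewriteRule is not shown in this ticket.
UNANCHORED = re.compile(r"/tx/(.*)")
ANCHORED = re.compile(r"^/tx/(.*)")


def matches(pattern, path):
    """Return True if the pattern matches anywhere in the path."""
    return pattern.search(path) is not None


# Hypothetical request paths: one application URL, one static file whose
# path merely contains "/tx/" somewhere in the middle.
app_path = "/tx/projects/"
static_path = "/site_media/tx/logo.png"

print(matches(UNANCHORED, static_path))  # unanchored rule also catches the static file
print(matches(ANCHORED, static_path))    # anchored rule leaves it alone
```

Without the anchor, a rule intended for one URL prefix can also rewrite unrelated paths such as static media.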

Everything up to the Transifex 0.6 upgrade is done now. This is basically blocked on getting a Transifex 0.6 package for you guys to set up on publictest3, right?

ricky, right. Is it possible to have a test server to serve Tx somewhere from the root dir /?

FYI, I believe that the Tx 0.6 RPM is in rawhide.

publictest3 is open for use whenever you're ready. Also, feel free to use the staging environment (app1.stg and proxy1.stg) to test setting it up at /.

You can get to this at translate.stg.fedoraproject.org.

Replying to [ticket:1455 glezos]:

With F11 and the freezes out of the way, we can now remove the old L10n tools. Here's a task list: 1. Upgrade to Tx 0.6

Runa just pointed out to me that this ticket requests Transifex 0.6, but of course now we should be getting Transifex 0.7 or later.

Note the bug here, too: https://bugzilla.redhat.com/show_bug.cgi?id=527327

Should I open a separate ticket for this?

OK, it seems the workaround to get publican-based docs working on Fedora's Transifex instance (0.5.3) has been a pain in the neck.

Version 0.7.3 is now available and supports publican-based docs, so I think it's well worth going ahead and upgrading our instance.

What's necessary to be done?

1 - Have all the new dependencies of Tx-0.7.3 packaged and available on EPEL-5. I guess I have done it. [http://docs.transifex.org/releases/0.7.html#dependencies 1],[https://admin.fedoraproject.org/pkgdb/users/packages/diegobz 2].

2 - Have Tx-0.7.3 packaged and available on EPEL-5. I have updated the SPEC on the Tx mainline repository, but it still needs some reviewing and testing on koji/mock. You can find the script to generate the SPEC under build-tools/ on the Tx repository.

3 - Get a testing instance somewhere at the Fedora's infra running with PostgreSQL.

4 - Take a copy of the current PostgreSQL database and make the migrations 0.5 -> 0.6 and 0.6 -> 0.7; We can't do 0.5 -> 0.7; It will need to be done with a clean 'clone' of the Tx code, so we won't need to package 0.6. More info: [http://docs.transifex.org/releases/0.6.html#upgrading-to-0-6 0.5 -> 0.6], [http://docs.transifex.org/releases/0.7.html#upgrading-to-0-7 0.6 -> 0.7]

'''IMPORTANT''': We have some backward-incompatible changes here in the Action Log feature of Transifex. We need to decide what can be done. We will probably move the related table and afterwards, if necessary, create a script to put the data back. I don't think it's that simple, though.

5 - Test it. We will need to find a way to make sure everything works fine in the testing environment, including submissions to remote VCSs.

6 - Finally, repeat everything in the production environment.

My concern is that I don't have much time available to do it, as we have gotten some features to add on Tx until the end of the year. However with some help we might be able to do this upgrade. The steps are there and must be followed in that order. :)
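To make the "no 0.5 -> 0.7 shortcut" constraint from step 4 concrete, here is a minimal sketch that works out the ordered chain of single-step migrations. The MIGRATIONS table and the upgrade_path helper are illustrative names only, not part of Transifex.

```python
# Hypothetical table of the single-step migrations mentioned in step 4:
# 0.5 -> 0.6 and 0.6 -> 0.7 exist, but there is no direct 0.5 -> 0.7.
MIGRATIONS = {"0.5": "0.6", "0.6": "0.7"}


def upgrade_path(current, target, migrations=MIGRATIONS):
    """Return the ordered list of (from, to) steps needed to reach target."""
    steps = []
    seen = set()
    while current != target:
        # Stop if there is no known next step or we would loop forever.
        if current not in migrations or current in seen:
            raise ValueError("no migration path from %s to %s" % (current, target))
        seen.add(current)
        next_version = migrations[current]
        steps.append((current, next_version))
        current = next_version
    return steps


for source, destination in upgrade_path("0.5", "0.7"):
    print("migrate %s -> %s" % (source, destination))
```

Each step would be run from a clean clone of the matching Tx code, as described above, before moving on to the next one.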

Have the newer versions of Transifex been packaged? If not I could attempt to assist with this.

I confirm that #1 is completed.

Sparks: It would be great to have it packaged. It looks like rawhide currently has 0.6. EL-5 is currently on 0.5.2. Because of the backwards incompatibilities I think we'll probably want to get rawhide up to 0.7 and then backport that into an EL-5 package that we build for the infrastructure repository.

We can use the staging environment or a publictest server for #3. I think the choice there hinges on #4 -- what is the incompatibility with the ActionLog? Do the db migration scripts handle it in some way? Do we need to move the ActionLog before 0.5->0.6, 0.6->0.7, or after the migration scripts run? If we're using staging, I can import a recent version of the transifex db there and install the new transifex once we get it packaged.

Question: Do the db migration scripts run on the db server or the app server? What db user does it run as? What permissions does that user need to have?

All 1-3 tasks have been completed. This ticket is now about "Upgrading Transifex from 0.5.x to 0.7.3".

Instructions about migrating the data can be found on the release notes: http://docs.transifex.org/releases/0.7.html#upgrading-to-0-7

The migration scripts run on the app servers. The DB user can be found in .../transifex/settings.py.

Replying to [comment:16 toshio]:

I confirm that #1 is completed.

Great!

Sparks: It would be great to have it packaged. It looks like rawhide currently has 0.6. EL-5 is currently on 0.5.2. Because of the backwards incompatibilities I think we'll probably want to get rawhide up to 0.7 and then backport that into an EL-5 package that we build for the infrastructure repository.

Actually we have an updated SPEC in our repository. For 0.7.x version see: http://code.transifex.org/index.cgi/tx-0.7.x/file/0fafc780e303/build-tools/

It needs to be reviewed and built on koji/mock to see how it goes.

We can use the staging environment or a publictest server for #3. I think the choice there hinges on #4 -- what is the incompatibility with the ActionLog? Do the db migration scripts handle it in some way? Do we need to move the ActionLog before 0.5->0.6, 0.6->0.7, or after the migration scripts run? If we're using staging, I can import a recent version of the transifex db there and install the new transifex once we get it packaged.

Question: Do the db migration scripts run on the db server or the app server? What db user does it run as? What permissions does that user need to have?

The migration of data can be done separately from the packaging. I'm attaching two files to this ticket. One of them has all the steps necessary (in order) to migrate the database. I have done it on my local box using a copy of the Fedora db that I requested from the infra team some time ago.

BTW the migration of data is the last step and surely I can help out with it. :)

I would suggest first getting 0.7.3 packaged, reviewed, and running on a staging environment (or wherever, even using SQLite) and then going on with the migration.

Toshio, the db settings (of the RPM) can be found in /etc/transifex/ on app1 IIRC.

ActionLog data migration FEDORA.sql

Replying to [comment:18 diegobz]:

Replying to [comment:16 toshio]: [..]

Sparks: It would be great to have it packaged. It looks like rawhide currently has 0.6. EL-5 is currently on 0.5.2. Because of the backwards incompatibilities I think we'll probably want to get rawhide up to 0.7 and then backport that into an EL-5 package that we build for the infrastructure repository.

Actually we have an updated SPEC in our repository. For 0.7.x version see: http://code.transifex.org/index.cgi/tx-0.7.x/file/0fafc780e303/build-tools/

[..]

Yesterday I tried using that script to generate the spec and SRPM (the idea being to bring it in line with the Fedora guidelines once generated), but I found a few missing dependencies that don't seem to be listed at http://docs.transifex.org/releases/0.7.html#dependencies: Django-south, django-piston, and ajax_select. Are these essential dependencies, or can they be circumvented?

In the meantime I will try to get a build and see if I can test it on rawhide first.

Thanks,

Replying to [comment:19 rakesh]:

Yesterday, I tried using that script to generate spec and srpm (thought about once generated get it sync with fedora guidelines). But found few missing dependencies, which does not seem to be listed in http://docs.transifex.org/releases/0.7.html#dependencies Django-south, django-piston and ajax_select . Are these essential dependencies or can be circumvented ?

Uhmm? Well, actually two deps are listed at http://docs.transifex.org/releases/0.7.html#dependencies.

  - South -> Django-south [https://admin.fedoraproject.org/pkgdb/packages/name/Django-south 1]
  - django-piston [https://admin.fedoraproject.org/pkgdb/packages/name/django-piston 2]

They are also present in the spec file [http://code.transifex.org/index.cgi/tx-0.7.x/file/0fafc780e303/build-tools/SPECS/transifex.spec.in#l27 3].

About 'ajax_select', this is a new dependency of the development branch, the tx-0.7.x branch does not require it. I guess you got a clone of the development branch, no? :)

You should get the code this way:

{{{ hg clone http://code.transifex.org/tx-0.7.x/ }}}

(Wading in here and changing the Summary to something meaningful.:)

Seem to be quite a few people looking at this but do let me know if I can assist with this.

transifex.spec file used for above build transifex.spec

Sorry for the delay. I have imported it into rawhide and will test it for some time. Meanwhile I will check a build for EPEL and test it.

FYI, http://rakesh.fedorapeople.org/misc/transifex-0.7.3-1.el5.src.rpm is the SRPM for EPEL 5. I have requested approval from the EPEL steering committee to allow me to push this update.

https://www.redhat.com/archives/epel-devel-list/2009-December/msg00040.html

Replying to [comment:25 rakesh]:

FI, http://rakesh.fedorapeople.org/misc/transifex-0.7.3-1.el5.src.rpm is srpm for EPEL 5. I have requested EPEL steering commitee for approval to allow me push this update.

https://www.redhat.com/archives/epel-devel-list/2009-December/msg00040.html

This wouldn't be needed, as the package can be pushed directly to infrastructure.fedoraproject.org! Thanks to toshio for clarifying this (on IRC :)

Latest build: transifex-0.7.3-4.el5 has been added to infrastructure.fedoraproject.org. Waiting on staging environment to test in now.

Ricky has the staging environment up. Thanks ricky!

Time to test. ping me to get started. If I don't hear from anyone, I'll do at least the db portion of the update tomorrow so that I'm (hopefully) not blocking anyone. The servers to use are:

For shell, login to bastion.fedoraproject.org. Then app01.stg.fedoraproject.org. Install upgrade, etc. sysadmin-web should be all that's needed here. db01.stg.fedoraproject.org is more locked down which is why I'll update the db if necessary.

For reaching it via the web: https://translate.stg.fedoraproject.org/ => Be careful here, I notice that transifex-0.5, at least, changes the URL back to translate.fedoraproject.org a lot. So when testing, you'll need to constantly check that you're using https://translate.stg.fp.o and not https://translate.fp.o
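Since it's easy to end up back on production without noticing, a small check like the following can help during testing. This is only a sketch: the is_staging_url helper is a made-up name, and it only looks at the scheme and hostname of whatever URL you paste in.

```python
from urllib.parse import urlsplit

# Staging hostname as given above.
STAGING_HOST = "translate.stg.fedoraproject.org"


def is_staging_url(url):
    """True only for https URLs whose host is exactly the staging host."""
    parts = urlsplit(url)
    return parts.scheme == "https" and parts.hostname == STAGING_HOST


for candidate in (
    "https://translate.stg.fedoraproject.org/projects/",
    "https://translate.fedoraproject.org/projects/",
):
    print(candidate, "->", "staging" if is_staging_url(candidate) else "NOT staging")
```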

Bad news: the database upgrade steps given in Migration_Steps-0.5.x_to_0.7.x.txt don't appear to work. I'm going to try again with a fresh dump of the database just in case the dump that was on the staging db server was stale.

Steps I'm running and errors received transifex-upgrade.txt

Diego can you take a look at transifex-upgrade.txt and see if any of those errors are unexpected?

ricky, I get a read-only file error when trying to write to this directory on app01.stg. Is that just how ACLs are set up for the staging env vs. production?

n4aphx2-3.storage.phx2.redhat.com:/vol/fedora/app/scratchdir on /var/lib/transifex/scratchdir type nfs (rw,soft,intr,addr=10.5.88.11)

OK, just talked with Toshio on IRC and apparently we solved all the weird things that happened there.

He's going to take another shot at it right now.

db portion of the upgrade completed! The new transifex app still isn't coming up though. This may be because of nfs acls preventing transifex from writing to its scratchdir. I'm going to try bind mounting a local disk after I copy the files over.

Okay, it looks like the configs need to be updated in puppet. The config has been split up into multiple files. I'm not sure if we can just move /etc/transifex/00-default.conf to 99-fedora.conf or we need to move the entries in 00-default.conf into the new config files (this is what's currently done on app01.stg)
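On the 00-default.conf vs. 99-fedora.conf question, here is a rough sketch of how numbered config fragments usually behave. It assumes the split files are Python fragments read in sorted filename order so that later files override earlier ones; that is an assumption about the packaged settings loader and should be checked against the installed settings.py. The load_conf_dir helper is illustrative only.

```python
import glob
import os


def load_conf_dir(conf_dir):
    """Execute every *.conf file in sorted filename order into one namespace."""
    settings = {}
    for path in sorted(glob.glob(os.path.join(conf_dir, "*.conf"))):
        with open(path) as handle:
            # Later files see, and may override, names set by earlier ones.
            exec(compile(handle.read(), path, "exec"), settings)
    # Drop the builtins entry that exec() adds to the namespace.
    settings.pop("__builtins__", None)
    return settings
```

If the loader does work this way, site-specific overrides in a 99-fedora.conf would win over the packaged defaults without having to edit the new files themselves.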

Also, something about the database upgrade didn't work quite right. I had to run this command before the main page would come up:

sudo -u transifex /usr/share/transifex/manage.py migrate 0003_add_anyone_submit_field --all

Last thing I noticed: I can't seem to login. Is this happening to everyone? If so there's probably some authentication middleware needed for us to login against fas that didn't get included with my changes to the config files.

I have vacation until Monday after New Years and am going to be travelling so I might not be able to continue this for a while. juhp, you have access to everything except direct db server access (you can use the django manage.py scripts, though) via sysadmin-web.

Is it possible to do something about the redirects from translate.stg.fp.o to translate.fp.o?

It currently makes testing nearly impossible I think.

Hey, I just changed the domain in the staging configuration file - hopefully this helps out a bit.

Note that you still must specify https:// yourself, so make sure to use https://translate.stg.fedoraproject.org/.

Login always seems to redirect me to translate.fp.o.

... so currently it seems only read-only testing can be done, is that right?

Replying to [comment:33 toshio]:

Last thing I noticed: I can't seem to login. Is this happening to everyone?

Ah just noticed this comment now... ok.

If so there's probably some authentication middleware needed for us to login against fas that didn't get included with my changes to the config files.

Any pointers or clues on this anyone?

IIRC Toshio worked on the authentication backend -- Toshio, any pointers?

I was able to log in, are others not able?

Replying to [comment:42 mmcgrath]:

I was able to log in, are others not able?

I can't, but I'm getting some interesting errors today. ;-)

{{{ 403 Forbidden

Cross Site Request Forgery detected. Request aborted. }}}

and the first time I got a django backtrace

{{{

Page not found (404) Request Method: POST Request URL: http://translate.stg.fedoraproject.org/accounts/login/?next=/

Using the URLconf defined in urls, Django tried these URL patterns, in this order:

  1. ^$
  2. ^projects/
  3. ^collections/
  4. ^search/$
  5. ^admin/doc/
  6. ^admin/(.*)
  7. ^contact/
  8. ^languages/
  9. ^account/
  10. ^site_media/(?P<path>.*)$

The current URL, accounts/login/, didn't match any of these. }}}

so something has changed, since before I was just getting redirected to t.fp.o...
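To see why that request 404s, the patterns from the error page can be replayed against the path. This is only an approximation of Django's resolver (it tries each pattern in order against the path without its leading slash); the first_match helper is made up for illustration.

```python
import re

# URL patterns copied from the 404 page above, in the order Django tried them.
URL_PATTERNS = [
    r"^$",
    r"^projects/",
    r"^collections/",
    r"^search/$",
    r"^admin/doc/",
    r"^admin/(.*)",
    r"^contact/",
    r"^languages/",
    r"^account/",
    r"^site_media/(?P<path>.*)$",
]


def first_match(path):
    """Return the first pattern that matches the path, or None (a 404)."""
    for pattern in URL_PATTERNS:
        if re.search(pattern, path):
            return pattern
    return None


print(first_match("accounts/login/"))  # plural "accounts/": nothing matches
print(first_match("account/login/"))   # singular "account/": matched
```

So the login form is posting to accounts/ (plural) while the URLconf only knows account/ (singular), which suggests the login URL setting and the URLconf are out of sync.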

This will adversely affect the Release Notes and all other Docs Guides if not completed by Mar 11.

Hey, I've finally gotten FAS auth working on the staging instance, so you should be able to do authenticated testing. Committing to actual modules won't work, though. We should try to get some local test repos on app01.stg to test with. diegobz, glezos - can you help get those set up?

Once this testing is done, the upgrade process should be able to go fine on production, although I need to write up some config changes that I needed to fix the previous issue.

I've just branched the repo for one of my books that can be used as a test case:

Project name: Docs :: Installation Quick Start Guide

Location: https://translate.stg.fedoraproject.org/projects/p/docs-install-quick-start-guide/

VCS address: http://git.fedorahosted.org/git/installation-quick-start-guide.git

Branch: testing

POT files: pot/*.pot

PO files: /.po

Hope this helps.

Hi, we'd prefer to avoid pushing to fedorahosted from staging - ideally, we could setup a local clone of your repo on app01.stg and test with that.

diegobz, glezos - could you take a look at setting up such a local test repo when you get a chance?

I created two components, both of them Publican docs and configured as such (this happens in the Edit Component page):

One clones from upstream; the other clones from /tmp (to which you should be able to submit normally).

The checkout fails for the following reason:

{{{ `/var/lib/transifex/scratchdir/sources/hg/': Read-only file system }}}

Once that's fixed, you can navigate to the above pages and try to Refresh stats. It should work in both cases. Then in the Edit Component one can choose "Accept translations", which will enable the Edit button, to test online editing and committing.

Hope this helps! Feel free to ping me in #transifex.

Also, please DO note that 0.7.4 has been released, which fixes a security issue. We should upgrade to that one instead.

http://docs.transifex.org/releases/0.7.html#transifex-0-7-4-xorn

Thanks for the heads up, I'm updating the package in our infrastructure repo and app01.stg right now.

The checkout probably failed because our staging env only has read-only access to the scratchdir. I'm going to just unmount it so that staging uses local storage for the scratchdir.

OK, I've tested pulling on the test repos, and they both seem to work (there was an extra space at the end of the http:// git URL which I removed manually from the database).

Is it expected that there are fewer languages listed at https://translate.stg.fedoraproject.org/projects/p/publican/c/virt-guide-upstream/ compared to https://translate.stg.fedoraproject.org/projects/p/publican/c/virt-guide-localhost/?

Also, the localhost one seems to not allow submissions. Does this have to do with the new ALLOWED_REPOSITORY_PREFIXES option in 0.7.4? Do we need to add / or file:// to this list of prefixes? (And it might be good to add git:// too if that's the case).

Ah, looking through the release notes, we should have a special dedicated path for local repos and add that path to the whitelist.

I've added /var/lib/transifex/local_repos to the whitelist - could you move the repos there and make the necessary db changes for this?

At some point, we should update our transifex SOP (http://fedoraproject.org/wiki/Translations_Infrastructure_SOP), since it seems to still be for the old TG version of transifex :-)

Added git:// too, if you want to switch the upstream repo to git://. It shouldn't really make a difference for our testing, but it's more efficient in general, and we recommend it to all fedorahosted.org users that can use it.
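For reference, a minimal sketch of what a prefix whitelist check looks like. It assumes ALLOWED_REPOSITORY_PREFIXES is a plain list of string prefixes compared against the start of the repository URL; the exact format and matching rules in 0.7.4 are not confirmed here. The "http://" entry and the is_allowed_repo helper are hypothetical, and stripping whitespace is a choice made in this sketch, not documented Transifex behaviour.

```python
# Hypothetical whitelist. "git://" and the local_repos path come from the
# comments above; "http://" is assumed for illustration.
ALLOWED_REPOSITORY_PREFIXES = [
    "http://",
    "git://",
    "/var/lib/transifex/local_repos",
]


def is_allowed_repo(url, prefixes=ALLOWED_REPOSITORY_PREFIXES):
    """True if the whitespace-stripped URL starts with a whitelisted prefix."""
    url = url.strip()
    return any(url.startswith(prefix) for prefix in prefixes)


print(is_allowed_repo("/tmp/virt-guide"))
print(is_allowed_repo("/var/lib/transifex/local_repos/virt-guide"))
```

Under these assumptions a repo cloned from /tmp would be rejected, which would match the localhost component refusing submissions until it is moved under the whitelisted path.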

Note that in the migration instructions, after FEDORA.sql is sourced, we also need to update the actionlog_logentry_id_seq sequence to match its entries:

{{{ select setval('actionlog_logentry_id_seq', (select max(id) from actionlog_logentry)); }}}

Visiting, for example this link: https://translate.stg.fedoraproject.org/projects/p/publican/c/virt-guide-localhost/l/ru-RU

When I try to download a PO file, it looks like a trailing slash '/' is being appended to the name of the individual .po file, so Django gives an error message. Removing the slash manually gets me the file I expect.

Hm, this appears to happen the first time I hit a link to download the raw file, but the problem doesn't happen the second time I hit the link. Has anyone else seen this?

Replying to [comment:57 pfrields]:

Hm, this appears to happen the first time I hit a link to download the raw file, but the problem doesn't happen the second time I hit the link. Has anyone else seen this?

Hey, I think this is because app02.stg (an app server without the upgraded transifex) got reenabled. This explains why it only happens around half of the time. I've temporarily removed it from the rotation, and will possibly test upgrading it there shortly, depending on the situation with getting a shared scratch directory in staging.

As mentioned in ticket #2028, we'll be doing the upgrade tomorrow at 21:00 UTC :-)

This is done now :-)
