#4479 Nagios - FAS/OpenID error on https://admin.fedoraproject.org/nagios/
Closed: Fixed None Opened 7 years ago by lanica.

= bug description =
After going to https://admin.fedoraproject.org/nagios/ and doing the FAS/openid login, I receive a page with the error message: There has been an error while attempting to authenticate.

I have attached screen shots.

= bug analysis =
FAS/OpenID works for me on other sides, such as this one, just fine.
Using Firefox 31 on Mac OS X 10.8.5

= fix recommendation =
Good luck?

FYI, I don't mind working this as a newbie to infrastructure, but some pointers might be helpful :)

Also tested with Firefox 31 on Fedora 20...same results. I'm assuming its not browser or OS related, but thought it wouldn't hurt to test.

Will attempt to work this ticket with puiterwijk

Understanding of the problem so far:
(04:35:23 PM) puiterwijk: the problem is that mod_auth_openid is sending a nonce in the authopenid.nonce GET field with the request, but when the request comes back it tries to get openid.modauthopenid.nonce
(04:35:47 PM) puiterwijk: so try to find where in mod_auth_openid it sets and reads those values and figure out why sometimes the reading is prepended with openid.


[Thu Aug 07 11:48:22.804235 2014] [core:error] [pid 17632] [client] Error in authentication: openid.modauthopenid.nonce: no such field, referer: https://admin.fedoraproject.org/nagios?&modauthopenid.nonce=Ee7yhRvS1U&openid.assoc_handle=%7BHMAC-SHA1%7D%7B53e366a3%7D%7B3%2BH8sQ%3D%3D%7D&openid.claimed_id=http%3A%2F%2Flanica.id.fedoraproject.org%2F&openid.identity=http%3A%2F%2Flanica.id.fedoraproject.org%2F&openid.invalidate_handle=%7BHMAC-SHA256%7D%7B53d53eea%7D%7Bn5LJzA%3D%3D%7D&openid.mode=id_res&openid.ns=http%3A%2F%2Fspecs.openid.net%2Fauth%2F2.0&openid.ns.pape=http%3A%2F%2Fspecs.openid.net%2Fextensions%2Fpape%2F1.0&openid.ns.sreg=http%3A%2F%2Fopenid.net%2Fextensions%2Fsreg%2F1.1&openid.op_endpoint=https%3A%2F%2Fid.fedoraproject.org%2Fopenid%2F&openid.pape.auth_level.nist=2&openid.pape.auth_level.ns.nist=http%3A%2F%2Fcsrc.nist.gov%2Fpublications%2Fnistpubs%2F800-63%2FSP800-63V1_0_2.pdf&openid.pape.auth_policies=http%3A%2F%2Fschemas.openid.net%2Fpape%2Fpolicies%2F2007%2F06%2Fnone&openid.pape.auth_time=2014-08-07T11%3A44%3A35Z&openid.response_nonce=2014-08-07T11%3A44%3A35ZOLXs1g&openid.return_to=https%3A%2F%2Fadmin.fedoraproject.org%2Fnagios%3F%26modauthopenid.nonce%3DEe7yhRvS1U&openid.sig=13idmVFVHhE%2BnnFiEgBm1OQ4NI8%3D&openid.signed=assoc_handle%2Cclaimed_id%2Cidentity%2Cinvalidate_handle%2Cmode%2Cns%2Cns.pape%2Cns.sreg%2Cop_endpoint%2Cpape.auth_level.nist%2Cpape.auth_level.ns.nist%2Cpape.auth_policies%2Cpape.auth_time%2Cresponse_nonce%2Creturn_to%2Csigned
[Thu Aug 07 15:04:37.860927 2014] [core:error] [pid 11886] [client] Error in authentication: openid.modauthopenid.nonce: no such field, referer: https://id.fedoraproject.org/

ACCESS LOG: - - [07/Aug/2014:07:35:29 +0000] "GET /nagios?&modauthopenid.nonce=txDwhr5zQX&openid.assoc_handle=%7BHMAC-SHA1%7D%7B53e32b0a%7D%7BnfFccA%3D%3D%7D&openid.claimed_id=http%3A%2F%2Fpuiterwijk.id.fedoraproject.org%2F&openid.identity=http%3A%2F%2Fpuiterwijk.id.fedoraproject.org%2F&openid.invalidate_handle=%7BHMAC-SHA256%7D%7B53d53eea%7D%7Bn5LJzA%3D%3D%7D&openid.mode=id_res&openid.ns=http%3A%2F%2Fspecs.openid.net%2Fauth%2F2.0&openid.ns.pape=http%3A%2F%2Fspecs.openid.net%2Fextensions%2Fpape%2F1.0&openid.ns.sreg=http%3A%2F%2Fopenid.net%2Fextensions%2Fsreg%2F1.1&openid.op_endpoint=https%3A%2F%2Fid.fedoraproject.org%2Fopenid%2F&openid.pape.auth_level.nist=2&openid.pape.auth_level.ns.nist=http%3A%2F%2Fcsrc.nist.gov%2Fpublications%2Fnistpubs%2F800-63%2FSP800-63V1_0_2.pdf&openid.pape.auth_policies=http%3A%2F%2Fschemas.openid.net%2Fpape%2Fpolicies%2F2007%2F06%2Fnone&openid.pape.auth_time=2014-08-07T07%3A30%3A14Z&openid.response_nonce=2014-08-07T07%3A30%3A18ZNhqot0&openid.return_to=https%3A%2F%2Fadmin.fedoraproject.org%2Fnagios%3F%26modauthopenid.nonce%3DtxDwhr5zQX&openid.sig=tU%2FZ7kj3xUzAZgO5%2F9KrmBgY5rY%3D&openid.signed=assoc_handle%2Cclaimed_id%2Cidentity%2Cinvalidate_handle%2Cmode%2Cns%2Cns.pape%2Cns.sreg%2Cop_endpoint%2Cpape.auth_level.nist%2Cpape.auth_level.ns.nist%2Cpape.auth_policies%2Cpape.auth_time%2Cresponse_nonce%2Creturn_to%2Csigned HTTP/1.0" 401 1734 "https://id.fedoraproject.org/openid/?transaction=86a5a7bd4f584b648c2003bab697f1d3" "Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Firefox/31.0"

Adding sysadmin-noc-members back as one of the owners.

Have a lab box set up like noc01 in my lab, with openid currently functional. Waiting for the issue to occur to troubleshoot.

The issue has been recreated in my lab. I am accessing my lab box over the same network, using http. When it reaches out to the FAS openid, it uses https, of course.

The http process has been running since Aug 11th @ 11a and I recreated the issue at 2p today (Aug 18th). The most recent test prior to that was Aug 13th @ 7:10p. That's a pretty large gap, but the important part is it has been recreated.

Here is the error message from /var/log/httpd/error_log:
[Sat Aug 16 14:07:19.890389 2014] [core:error] [pid 1228] [client] Error in authentication: openid.modauthopenid.nonce: no such field

I would attach a sosreport, but the attachment limit is 256KB...

(All times are in Central)

Upon apache restart the mod_auth_openid sqlite database is removed. It is created/updated as users attempt to authenticate. I could recreate a similar issue by moving one database from an apache server to another, however that might not be a fair test. Emptying each table one by one revealed that the fix is in emptying the "associations" table. Something must be getting corrupt and causing that table to not be updated appropriately. Here's what the table looks like (and yes, it normally only has one entry):
CREATE TABLE associations (server VARCHAR(255), handle VARCHAR(100), encryption_type VARCHAR(50), secret VARCHAR(30), expires_on INT);
INSERT INTO "associations" VALUES('https://id.fedoraproject.org/openid/','{HMAC-SHA256}{53eff2b6}{44qTIA==}','HMAC-SHA256','9xOPRq4qgQJm26b0YhUVPdSXGfe2IpKmpQMMSmHn7PU=',1409443766);

The "expires_on" field still has almost 10 days left, so it has yet to expire.

This sounds a lot like a bug mentioned 2010 at this link:

Note that the sqlite DB is at /tmp/systemd-private-XXXXXX/tmp/mod_auth_openid.db (where XXXXXX is a random character sequence, determined on httpd start). The sqlite DB is not created until the openid mod is actually invoked.

Being that I can instantly create the error by copying the "bad" sqlite DB in place, I think this might be easier to solve. I'll take a look at the source and see if I can find anything.

After digging in a bit, it is definitely the same issue. That is, a "server"'s association handle becomes invalid prior to the expiration and mod_auth_openid fails to validate it. Looks like this has to do with the different params interfaces/classes where mod_auth_openid adds "openid." prefix for get_param and has_param. The superclass has different logic and that seems to be conflicting. Talking with puiterwijk and a possible corresponding patch has not been merged into master. Below are the steps to recreate. Of course, its somewhat artificial, as the real test would be to invalidate a handle on the server side.

How to recreate mod_auth_openid issue ("openid.modauthopenid.nonce: no such field" error in logs that prevents anyone from logging in)

1. Restart apache2/httpd
2. Visit web page to get openid prompt, but do not log in or hit approve (this will generate a fresh DB)
3. Set association handle for FAS to "invalidhandle" in the sqlite DB
--systemd based OS (RHEL7, for example)
sqlite3 /tmp/systemd-private-*/tmp/mod_auth_openid.db "UPDATE ASSOCIATIONS SET handle='invalidhandle' WHERE server='https://id.fedoraproject.org/openid/';"
--sysvinit based OS (RHEL5 and RHEL5, for example)
sqlite3 /tmp/mod_auth_openid.db "UPDATE ASSOCIATIONS SET handle='invalidhandle' WHERE server='https://id.fedoraproject.org/openid/';"
(if this doesn't work, try removing the DB file, then starting over on step 1)
4. Go back to web page and log in and hit approve (if you're already logged in, it is usually ok)
5. Receive error message -- and look at messages in apache2/httpd error log

With the patch here: https://github.com/bmuller/mod_auth_openid/commit/bb1a59121b59ee06067c289e347b6fa92cbae866

the problem seems to get "better" in the fact of we do not getting a missing field error. We now receive this error:
[Mon Aug 18 09:57:38.556657 2014] [core:error] [pid 14551] [client] Error in authentication: failed to check_authentication()

https://github.com/hacker/libopkele/blob/master/lib/basic_rp.cc -- line 252 is where it throws the error. line 311 is the original function throwing the error...if it were to get to line 319, for invalidate_assoc then I would see some messages in the logs...it would tell me it was trying to invalidate it

Its possible that recreating the issue artificially is causing part of libopkele to not function as expected. The handle would not exist at all, so therefore it might not be able to locate it. I will let the server/service run for awhile and see if the issue still remains.

Set up my own instance of fedoauth and I receive the "Error in authentication: failed to check_authentication()" error message. Need to find out why libopkele is not attempting to run invalidate_assoc

Its either a problem with libopeke or FedOAuth. The result data from running direct_request within basic_RP:check_authentication has some funkyness going on. I created a function to display all of the fields that res contains, and its showing html rather than field names as it should... Again to reference this can be found at https://github.com/hacker/libopkele/blob/master/lib/basic_rp.cc ... line 315. That's where the direct_request is, and after that, res does not have valid fields... Here's an example of what the field print looks like for the same data type at a different location in the libopkele code: {{{ field print #0 assoc_type field print #1 dh_consumer_public field print #2 dh_gen field print #3 dh_modulus field print #4 mode field print #5 ns field print #6 session_type }}} Those are indeed sane field names. Here's what it looks like directly after line 315 of basic_rp.cc {{{ field print #0 <meta name="generator" content="FedOAuth 3.0.7"> </head> <body >

This is the Federated Open Authentication provider homepage.

ERROR field print #4

Ok, this appears to be a combination between mod_auth_openid and libopkele both having some a bug. mod_auth_openid has better handling for get_field and has_field functions with the patch:

For libopkele, when an association is invalid, invalidate_handle is passed, but libopkele is supposed to confirm back with the openid server that it should invalidate it. However, the process that it goes through keeps the "openid." prefix on all of the URL data, but "openid." is prefixed again (this is normal for the "direct_request" function of opkele, but it has an option to change the prefix) when it sends another request. This is a very basic fix (read: hack), the real fix may require much deeper changes to the code and/or restructuring. It would make more sense for the developers of libopkele to come out with a final fix. I have attached the patch "libopkele-invalid-association.patch" file for this basic fix. As far as I can tell if an association handle becomes invalid (other than for expiration), libopkele will not handle it appropriately. Note that the fix will present an error once to a user, but the association handle will become invalidated on the next attempt it will work just fine -- invalidated association handles should be a rare occasion so this might be appropriate.

Lets just wait for the upstream fix to land. ;)

Thanks so much for all the great detective work here and patches!

Login to comment on this ticket.