#174 Lasso causing interoperation failure with pysaml2
Closed: Invalid None Opened 3 years ago by merlinthp.

This issue came up trying to get a pysaml2-based python SP to work with Ipsilon's IdM.

pysaml2 is generating an AuthnRequest in an HTTP Redirect binding, with the appropriate XML signature as query parameters.

The signature is generated in the following code, lines 128-130:
Line 130 puts the generated signature into a python dict holding all the other parameters, and line 131 uses the urlencode() function to turn the dict into a proper urlencoded query string. The "issue" here is that python dicts don't preserve ordering, so urlencode can, in theory, produce a query string with the parameters in any order.

That's not an issue in and of itself, as the SAML2.0 binding spec (http://docs.oasis-open.org/security/saml/v2.0/saml-bindings-2.0-os.pdf) says (lines 616-619) that when verifying the signature of a message, the query parameter ordering isn't guaranteed, and the verifier must put them into the right order itself. All well and good, except Lasso has different ideas.
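For illustration, the reordering a verifier has to do can be sketched in Python as below. This is a minimal sketch, not code from pysaml2 or Lasso: the function name `signed_octets` is made up for this example, and it assumes the query string is available in its raw, still URL-encoded form, since the signature covers the encoded values as sent.

```python
def signed_octets(raw_query: str) -> str:
    """Rebuild the string the signature covers from a raw query string.

    Hypothetical helper for illustration only. The parameters may arrive
    in any order; the signed string is always
    SAMLRequest (or SAMLResponse), then RelayState if present, then SigAlg.
    """
    params = {}
    for pair in raw_query.split("&"):
        name, _, value = pair.partition("=")
        # Keep the raw (URL-encoded) value; do not decode and re-encode it.
        params.setdefault(name, value)

    message = "SAMLRequest" if "SAMLRequest" in params else "SAMLResponse"
    ordered = [message, "RelayState", "SigAlg"]
    return "&".join(f"{name}={params[name]}" for name in ordered if name in params)
```

The Signature parameter itself is left out of the rebuilt string, since it is the value being checked rather than part of the signed data.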

Lasso does the redirect binding signature verification in this code:
Lines 887-889 specifically state that Lasso is expecting the Signature parameter to be last. Line 891 splits the query string into two, the chunk before the Signature parameter, and the signature data itself (and any parameters that may be in the query string after it). Line 894 then looks in the first of the two chunks for the start of the SigAlg parameter, and if it doesn't find it, returns LASSO_DS_ERROR_INVALID_SIGALG.
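To make the failure mode concrete, here is a rough Python paraphrase of the behaviour described above. It is not a translation of the Lasso C code: the function name and the returned strings are invented for this sketch, and only the "split at Signature, then look for SigAlg in the first chunk" logic is taken from the description.

```python
def positional_check(raw_query: str) -> str:
    """Mimic a verifier that expects Signature to be the last parameter.

    Hypothetical illustration; the return values are placeholders,
    not Lasso error codes.
    """
    # Split into the chunk before the Signature parameter and the rest.
    head, sep, _tail = raw_query.partition("&Signature=")
    if not sep:
        return "no signature found"
    # SigAlg is only searched for in the chunk before Signature.
    if "&SigAlg=" not in head:
        return "invalid sigalg"
    return "ok"


print(positional_check("SAMLRequest=r&SigAlg=a&Signature=s"))
print(positional_check("SAMLRequest=r&Signature=s&SigAlg=a"))
```

The first call passes because SigAlg precedes Signature; the second reports an invalid SigAlg even though the parameter is present, which matches the error seen here.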

So, the upshot is that when ipsilon gets an AuthnRequest from pysaml2 and passes it to Lasso, Lasso generally returns a DsInvalidSigalgError back to ipsilon.

I've got a local pysaml2 patch to ensure that signature comes last, but Lasso appears to violate the SAML2.0 specs, and I suspect other implementations may run into the same issue.
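The idea behind that kind of workaround can be sketched roughly as follows. This is not the actual patch: `build_query` is a hypothetical helper, and it simply relies on `urlencode()` accepting a sequence of pairs, which keeps the order it is given.

```python
from urllib.parse import urlencode


def build_query(params: dict) -> str:
    """Encode redirect-binding parameters in a fixed order, Signature last.

    Hypothetical helper for illustration; parameter names outside the
    fixed list are ignored in this sketch.
    """
    order = ["SAMLRequest", "SAMLResponse", "RelayState", "SigAlg", "Signature"]
    # A list of (name, value) pairs keeps its order through urlencode().
    pairs = [(name, params[name]) for name in order if name in params]
    return urlencode(pairs)


print(build_query({"Signature": "s", "SigAlg": "a", "SAMLRequest": "r"}))
```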

I'm trying to set up GitLab to use ipsilon and ran into this.

Fields changed

cc: => dkelson@gurulabs.com

Trying to sort out a minimal reproducer, I can't replicate the issue on Fedora 22 with Ipsilon 1.1.0, Lasso 2.4.1, and pysaml2 3.0.0 (which is newer than the 2.2.0 I was using before). Digging deeper into the Lasso code, I think this is a pysaml2 bug after all. The lasso_query_verify_signature() function in lasso/xml/tools.c is called from lasso_provider_verify_query_signature() (lasso/id-ff/provider.c) if the provider is identified as Liberty ID-FF. If the provider is SAML2.0 then a different function (lasso_saml2_query_verify_signature) is called which handles arbitrary query parameter order. I'm currently trying to trace through the Lasso code to work out how the provider protocol is determined, so I can work out what incorrect data pysaml2 is sending.

I'm pretty familiar with how lasso dispatches on the protocol. Usually it is determined by the provider metadata. When lasso loads the metadata for a given provider it looks for the xml elements that describe SAML 2 roles. If it sees a SAML 2 role (e.g. an IDPSSODescriptor or a SPSSODescriptor) element it knows it's SAML2.
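If you want to check a metadata file yourself, something like the following Python sketch looks for those role elements. It only illustrates the idea and is not how lasso does it internally; the function name `saml2_roles` is made up, and the namespace used is the standard SAML 2.0 metadata namespace.

```python
import xml.etree.ElementTree as ET

# Standard SAML 2.0 metadata namespace.
MD_NS = "urn:oasis:names:tc:SAML:2.0:metadata"


def saml2_roles(metadata_xml: str) -> list:
    """Return the SAML 2 role descriptors found in a metadata document.

    Hypothetical helper for illustration; only the two roles mentioned
    above are checked.
    """
    root = ET.fromstring(metadata_xml)
    roles = []
    for name in ("IDPSSODescriptor", "SPSSODescriptor"):
        if root.find(f".//{{{MD_NS}}}{name}") is not None:
            roles.append(name)
    return roles
```

An empty result for an SP's metadata would suggest the file is missing its SPSSODescriptor, which is the kind of incomplete metadata that could lead to the wrong protocol being picked.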

However, this process breaks down if lasso does not know what role a provider is operating in, because then it can't determine whether it's SAML2 or not. In SAML2 a provider can support multiple roles simultaneously. So what lasso does is wait until the provider does something role-specific (e.g. send an AuthnRequest) to determine what role the provider is currently operating in (or you can explicitly assign a role to a provider).

I suspect the problem you're seeing is because lasso does not know the role or you've got bad or incomplete metadata for the provider.

I just finished debugging a very similar problem with mellon where it made a lasso call prior to lasso "automagically" determining it was acting as an SP.

If you could provide execution information it would help.

In your ipsilon.conf make sure these config items are enabled:

debug = True
tools.log_request_response.on = True
log.screen = True

Then restart ipsilon by restarting httpd. Actually the best thing to do is stop httpd, rm -f /var/log/httpd/* and start httpd. That way the error_log will contain only what we are interested in.

Do whatever provoked the problem and send me the contents of /var/log/httpd/error_log

As I mentioned on IRC, I've not been able to reproduce this. I suspect I initially had a malformed SP metadata file, but it looks like I overwrote it at some point so I can't confirm. Cheers for looking into this, but I reckon we can bin this ticket.

Fields changed

resolution: => wontfix
status: new => closed
