Ticket was cloned from Red Hat Bugzilla (product Red Hat Enterprise Linux 6): Bug 979465
Please note that this Bug is private and may not be accessible as it contains confidential Red Hat customer information.
+++ This bug was initially created as a clone of Bug #973364 +++

Description of problem:
Every 3-4 days replication will break between two IPA servers because SASL encrypted packet length exceeds maximum allowed limit. Restarting IPA fixes the problem.

Version-Release number of selected component (if applicable):
ipa-server-3.0.0-26.el6_4.2.

How reproducible:
I am not sure how to reproduce as it looks to be specific to this set up.

Actual results:

Master:
...
[13/May/2013:09:43:36 -0400] - repl5_inc_waitfor_async_results timed out waiting for responses: 373887 373980
[13/May/2013:10:09:31 -0400] - repl5_inc_waitfor_async_results timed out waiting for responses: 375614 375896
[13/May/2013:10:11:31 -0400] NSMMReplicationPlugin - agmt="cn=meToopskvlp42.snops.net" (opskvlp42:389): Warning: unable to receive endReplication extended operation response (Timed out)
[13/May/2013:10:13:30 -0400] NSMMReplicationPlugin - agmt="cn=meToopskvlp42.snops.net" (opskvlp42:389): Unable to receive the response for a startReplication extended operation to consumer (Timed out). Will retry later.
[13/May/2013:10:19:12 -0400] - repl5_inc_waitfor_async_results timed out waiting for responses: 376731 376928
[13/May/2013:10:21:12 -0400] NSMMReplicationPlugin - agmt="cn=meToopskvlp42.snops.net" (opskvlp42:389): Warning: unable to receive endReplication extended operation response (Timed out)
[13/May/2013:10:50:59 -0400] - repl5_inc_waitfor_async_results timed out waiting for responses: 379105 379111
[13/May/2013:10:56:57 -0400] - repl5_inc_waitfor_async_results timed out waiting for responses: 379638 379911
[13/May/2013:10:58:58 -0400] NSMMReplicationPlugin - agmt="cn=meToopskvlp42.snops.net" (opskvlp42:389): Warning: unable to receive endReplication extended operation response (Timed out)

Replica (errors around same time):
[13/May/2013:10:01:05 -0400] - SASL encrypted packet length exceeds maximum allowed limit (length=805634565, limit=2097152). Change the nsslapd-maxsasliosize attribute in cn=config to increase limit.
[13/May/2013:10:01:05 -0400] - SASL encrypted packet length exceeds maximum allowed limit (length=805634565, limit=2097152). Change the nsslapd-maxsasliosize attribute in cn=config to increase limit.
[13/May/2013:10:01:05 -0400] - SASL encrypted packet length exceeds maximum allowed limit (length=805634565, limit=2097152). Change the nsslapd-maxsasliosize attribute in cn=config to increase limit.
[13/May/2013:10:01:05 -0400] - SASL encrypted packet length exceeds maximum allowed limit (length=805634565, limit=2097152). Change the nsslapd-maxsasliosize attribute in cn=config to increase limit.

Expected results:
Work without setting nsslapd-maxsasliosize to 806mb

Additional info:
Looking at his logs the size is always 805634565. I could increase nsslapd-maxsasliosize but changing a default setting of 2mb to over 806mb seems a little extreme to me. The questions I have is it Ok to increase nsslapd-maxsasliosize to 805634565? What the best way to debug this?

Marc Sauton has started a KCS. https://access.redhat.com/site/solutions/374063

--- Additional comment from RHEL Product and Program Management on 2013-06-11 15:20:57 EDT ---

Since this bug report was entered in bugzilla, the release flag has been set to ? to ensure that it is properly evaluated for this release.

--- Additional comment from Eugene Keck on 2013-06-11 15:30:17 EDT ---

If needed I have logs, sosreport, tcpdumps at http://10.13.211.168/BZ973364/

--- Additional comment from Martin Kosek on 2013-06-12 04:37:45 EDT ---

Rich or Nathan, can you please advise?

--- Additional comment from Rich Megginson on 2013-06-12 14:04:35 EDT ---

There's something weird going on. To me, it looks as though the server thinks the connection should be using SASL encryption, but the client thinks it should not be, and sends over plain LDAP.
805634565 dec == 30050205 hex

For example, an LDAP UNBIND request looks like this (from the tcp dump - tcp.stream eq 325):

000000E4 30 05 02 01 02 42 00 0....B.

30 - begin LDAP request
05 - length
02 - LBER_INTEGER
01 - length
02 - value (ldap msgid 2)
42 - LDAP_REQ_UNBIND
00 - 0 length

that's an entire LDAP UNBIND request

For an encrypted SASL request, the first 4 bytes are the length of the data that follows. So if the client sent an unencrypted LDAP UNBIND request, and the server was expecting a SASL encrypted request, the server would interpret the 30 05 02 01 == 805634561 + the 4 bytes for the length == 805634565 as the length of the SASL buffer.

Why does the server think the connection should be SASL encrypted, but not the client?

--- Additional comment from Nathan Kinder on 2013-06-12 18:23:50 EDT ---

(In reply to Rich Megginson from comment #4)
> There's something weird going on. To me, it looks as though the server
> thinks the connection should be using SASL encryption, but the client thinks
> it should not be, and sends over plain LDAP.
>
> 805634565 dec == 30050205 hex
>
> For example, an LDAP UNBIND request looks like this (from the tcp dump -
> tcp.stream eq 325):
>
> 000000E4 30 05 02 01 02 42 00 0....B.
>
> 30 - begin LDAP request
> 05 - length
> 02 - LBER_INTEGER
> 01 - length
> 02 - value (ldap msgid 2)
> 42 - LDAP_REQ_UNBIND
> 00 - 0 length
>
> that's an entire LDAP UNBIND request
>
> For an encrypted SASL request, the first 4 bytes are the length of the data
> that follows. So if the client sent an unencrypted LDAP UNBIND request, and
> the server was expecting a SASL encrypted request, the server would
> interpret the 30 05 02 01 == 805634561 + the 4 bytes for the length ==
> 805634565 as the length of the SASL buffer.
>
> Why does the server think the connection should be SASL encrypted, but not
> the client?

The "client" in this case is replication itself, right? Is it only UNBIND requests that are sent unencrypted?
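The length arithmetic from comment #4 can be checked with a small standalone sketch. It reads the first four bytes of the captured UNBIND request as a big-endian length, the way a receiver expecting a SASL-wrapped buffer would, and adds the 4-byte prefix. The names `sasl_packet_length` and `ldap_unbind_msgid2` are made up for this illustration and are not taken from 389-ds-base.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Size of the length prefix in front of a SASL-wrapped buffer. */
#define SASL_LEN_PREFIX 4

/* The plain LDAP UNBIND request seen in the tcpdump (msgid 2). */
const unsigned char ldap_unbind_msgid2[] = {
    0x30, 0x05, 0x02, 0x01, 0x02, 0x42, 0x00
};

/*
 * Hypothetical helper: interpret the first four bytes of buf as a
 * big-endian (network order) 32-bit length and add the prefix size,
 * which matches the total reported in the error message above.
 */
uint32_t
sasl_packet_length(const unsigned char *buf, size_t buflen)
{
    assert(buf != NULL && buflen >= SASL_LEN_PREFIX);
    uint32_t payload = ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
                       ((uint32_t)buf[2] << 8) | (uint32_t)buf[3];
    return payload + SASL_LEN_PREFIX;
}
```

Feeding the seven UNBIND bytes to this helper gives 805634565, far above the 2097152 default for nsslapd-maxsasliosize, which is consistent with the replica log.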
--- Additional comment from Rich Megginson on 2013-06-12 19:23:26 EDT ---

I'm not sure what else is being sent unencrypted, but in the first comment every error message has 805634565 == 30 05 02 01 == UNBIND request

I'm not sure what the client is - I suppose we could work back from the access logs on the replica to find out what connections/operations are being attempted at around [13/May/2013:10:01:05 -0400]

--- Additional comment from Martin Kosek on 2013-06-13 02:51:21 EDT ---

There should be 2 replicating IPA masters, so the "client" should be the other replica. I checked the dirsrv errors log and I see slapd_ldap_sasl_interactive_bind issues caused by inability to init from keytab:

opskvlp42.snops.net:
...
[29/May/2013:09:42:48 -0400] set_krb5_creds - Could not get initial credentials for principal [ldap/opskvlp42.snops.net@SNOPS.NET] in keytab [FILE:/etc/dirsrv/ds.keytab]: -1765328228 (Cannot contact any KDC for requested realm)
[29/May/2013:09:42:48 -0400] slapd_ldap_sasl_interactive_bind - Error: could not perform interactive bind for id [] mech [GSSAPI]: LDAP error -2 (Local error) (SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure. Minor code may provide more information (Credentials cache file '/tmp/krb5cc_497' not found)) errno 0 (Success)
...
[29/May/2013:13:57:58 -0400] slapi_ldap_bind - Error: could not perform interactive bind for id [] mech [GSSAPI]: error -2 (Local error)
[29/May/2013:13:57:58 -0400] NSMMReplicationPlugin - agmt="cn=meToopskhlp41.snops.net" (opskhlp41:389): Replication bind with GSSAPI auth failed: LDAP error -2 (Local error) (SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure. Minor code may provide more information (Credentials cache file '/tmp/krb5cc_497' not found))
[29/May/2013:13:58:01 -0400] NSMMReplicationPlugin - agmt="cn=meToopskhlp41.snops.net" (opskhlp41:389): Replication bind with GSSAPI auth resumed
[29/May/2013:14:13:34 -0400] NSMMReplicationPlugin - agmt="cn=meToopskhlp41.snops.net" (opskhlp41:389): Warning: unable to receive endReplication extended operation response (Timed out)
[29/May/2013:14:13:36 -0400] NSMMReplicationPlugin - agmt="cn=meToopskhlp41.snops.net" (opskhlp41:389): Unable to receive the response for a startReplication extended operation to consumer (Can't contact LDAP server). Will retry later.

The server seems to be able to resume GSSAPI connection after that (though with errors):

[29/May/2013:13:57:58 -0400] NSMMReplicationPlugin - agmt="cn=meToopskhlp41.snops.net" (opskhlp41:389): Replication bind with GSSAPI auth failed: LDAP error -2 (Local error) (SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure. Minor code may provide more information (Credentials cache file '/tmp/krb5cc_497' not found))
[29/May/2013:13:58:01 -0400] NSMMReplicationPlugin - agmt="cn=meToopskhlp41.snops.net" (opskhlp41:389): Replication bind with GSSAPI auth resumed
[29/May/2013:14:13:34 -0400] NSMMReplicationPlugin - agmt="cn=meToopskhlp41.snops.net" (opskhlp41:389): Warning: unable to receive endReplication extended operation response (Timed out)
[29/May/2013:14:13:36 -0400] NSMMReplicationPlugin - agmt="cn=meToopskhlp41.snops.net" (opskhlp41:389): Unable to receive the response for a startReplication extended operation to consumer (Can't contact LDAP server). Will retry later.

opskvlp41.snops.net seems to also report a lot of replication errors:

[29/May/2013:09:50:06 -0400] - slapd started. Listening on All Interfaces port 389 for LDAP requests
[29/May/2013:09:50:06 -0400] - Listening on All Interfaces port 636 for LDAPS requests
[29/May/2013:09:50:06 -0400] - Listening on /var/run/slapd-SNOPS-NET.socket for LDAPI requests
[29/May/2013:09:50:10 -0400] NSMMReplicationPlugin - agmt="cn=meToopskvlp42.snops.net" (opskvlp42:389): Replication bind with GSSAPI auth resumed
[29/May/2013:09:52:50 -0400] NSMMReplicationPlugin - agmt="cn=meToopskvlp42.snops.net" (opskvlp42:389): Failed to send update operation to consumer (uniqueid 355c2f11-2f2711e2-88bac574-3c45cc45, CSN 51a606b3000000040000): Bad parameter to an ldap routine. Will retry later.
[29/May/2013:09:52:50 -0400] NSMMReplicationPlugin - agmt="cn=meToopskvlp42.snops.net" (opskvlp42:389): Consumer failed to replay change (uniqueid 873095d1-2ee611e2-88bac574-3c45cc45, CSN 51a60614000100040000): Can't contact LDAP server(-1). Will retry later.
[29/May/2013:09:52:52 -0400] NSMMReplicationPlugin - agmt="cn=meToopskvlp42.snops.net" (opskvlp42:389): Warning: unable to send endReplication extended operation (Can't contact LDAP server)
[29/May/2013:09:52:56 -0400] slapd_ldap_sasl_interactive_bind - Error: could not perform interactive bind for id [] mech [GSSAPI]: LDAP error -1 (Can't contact LDAP server) ((null)) errno 107 (Transport endpoint is not connected)
[29/May/2013:09:52:56 -0400] slapi_ldap_bind - Error: could not perform interactive bind for id [] mech [GSSAPI]: error -1 (Can't contact LDAP server)
[29/May/2013:09:53:02 -0400] NSMMReplicationPlugin - agmt="cn=meToopskvlp42.snops.net" (opskvlp42:389): Replication bind with GSSAPI auth resumed
[29/May/2013:10:31:46 -0400] - repl5_inc_waitfor_async_results timed out waiting for responses: 2998 3035
[29/May/2013:10:33:46 -0400] NSMMReplicationPlugin - agmt="cn=meToopskvlp42.snops.net" (opskvlp42:389): Warning: unable to receive endReplication extended operation response (Timed out)
[29/May/2013:11:19:58 -0400] - repl5_inc_waitfor_async_results timed out waiting for responses: 6710 7070

--- Additional comment from on 2013-06-13 11:44:09 EDT ---

Would it be possible to turn on "connection" logging on the replica? This will provide much more useful information when the problem occurs.

dn: cn=config
nsslapd-errorlog-level: 8

Then provide the error and access log.

--- Additional comment from Martin Kosek on 2013-06-14 02:37:51 EDT ---

Eugene, could you please get these extended logs from customer, as Mark requested?

--- Additional comment from Rich Megginson on 2013-06-14 16:25:12 EDT ---

I'm able to reproduce the SASL message

1) setup MMR with kerberos/GSSAPI
2) gdb one of the servers - set a break point in ldap_int_sasl_bind before the call to ldap_pvt_sasl_install at this line:
if( saslrc == SASL_OK ) {
3) run some operation to trigger this server to make an outbound replication connection
4) when it hits the break point,
(gdb) set saslrc = 99
(gdb) set rc = 80
(gdb) set ssf = 0
then step through the code and make sure it does not call ldap_pvt_sasl_install()

then you should see this server call ldap_unbind, and you should see this error in the receiving server:

[14/Jun/2013:14:07:03 -0600] - SASL encrypted packet length exceeds maximum allowed limit (length=805634565, limit=2097152). Change the nsslapd-maxsasliosize attribute in cn=config to increase limit.

The real problem is - why is the server having trouble connecting in the first place?

If I look at similar problems in https://access.redhat.com/site/solutions/374063 I notice this:

[13/May/2013:10:00:55 -0400] conn=100248 op=3 BIND dn="" method=sasl version=3 mech=GSSAPI
[13/May/2013:10:01:05 -0400] conn=100248 op=3 RESULT err=0 tag=97 nentries=0 etime=10 dn="fqdn=system1.example.com,cn=computers,cn=accounts,dc=example,dc=com"
[13/May/2013:10:01:05 -0400] conn=100248 op=-1 fd=190 closed - The value requested is too large to be stored in the data buffer provided.
There are 10 seconds between the last step of the bind being received and the result being returned to the client. 10 seconds is far, far too long. Probably the client times out the connection, as in

[29/May/2013:10:31:46 -0400] - repl5_inc_waitfor_async_results timed out waiting for responses: 2998 3035

It should recover from this, and it looks like it does:

[29/May/2013:09:53:02 -0400] NSMMReplicationPlugin - agmt="cn=meToopskvlp42.snops.net" (opskvlp42:389): Replication bind with GSSAPI auth resumed

My question is - is replication "broken"? That is, are updates not being replicated due to this issue, or is it the case that, despite these errors, the updates are still being propagated throughout the network?

--- Additional comment from on 2013-06-21 11:17:29 EDT ---

What is the version of the kerberos library?

rpm -qa | grep krb

--- Additional comment from Martin Kosek on 2013-06-25 12:12:08 EDT ---

Any update? I see Mark requested more information, but I don't see it neither in Bug nor customer Case.

--- Additional comment from Rich Megginson on 2013-06-25 12:18:15 EDT ---

(In reply to Martin Kosek from comment #12)
> Any update? I see Mark requested more information, but I don't see it
> neither in Bug nor customer Case.

One of the problems is that we don't handle the case where the server thinks the client is going to establish a SASL I/O encrypted connection, but the client fails to establish this. The server then tries to interpret all subsequent client communications as encrypted. When the client fails to establish security, this is a hard failure, and it tries to terminate the connection by sending an LDAP UNBIND request, which is unencrypted. The server interprets this as a bogus SASL buffer length. So one fix is to change 389 to see this case and have the SASL I/O layer code just interpret this as an unbind request and close the connection.

The other part of this problem is - why does the client fail to establish an encrypted connection?
We have not been able to reproduce this part of the problem. Mark has set up an IPA test environment and is trying to reproduce in a "controlled" environment.

--- Additional comment from Martin Kosek on 2013-06-28 02:10:27 EDT ---

(In reply to Rich Megginson from comment #13)
> (In reply to Martin Kosek from comment #12)
...
> One of the problems is that we don't handle the case where the server thinks
> the client is going to establish a SASL I/O encrypted connection, but the
> client fails to establish this. The server then tries to interpret all
> subsequent client communications as encrypted. When the client fails to
> establish security, this is a hard failure, and it tries to terminate the
> connection by sending an LDAP UNBIND request, which is unencrypted. The
> server interprets this as a bogus SASL buffer length. So one fix is to
> change 389 to see this case and have the SASL I/O layer code just interpret
> this as an unbind request and close the connection.

This sounds promising - also sounds like something we should consider for 6.5, this is not the first bugzilla where I see complaints about exceeded SASL limit.

> The other part of this problem is - why does the client fail to establish an
> encrypted connection? We have not been able to reproduce this part of the
> problem. Mark has set up an IPA test environment and is trying to reproduce
> in a "controlled" environment.

Any further update on this part? Should we move or clone this bug to 389-ds-base component so that you have something to attach the fix to?

--- Additional comment from Rich Megginson on 2013-06-28 10:38:17 EDT ---

(In reply to Martin Kosek from comment #14)
> (In reply to Rich Megginson from comment #13)
> > (In reply to Martin Kosek from comment #12)
> ...
> > One of the problems is that we don't handle the case where the server thinks
> > the client is going to establish a SASL I/O encrypted connection, but the
> > client fails to establish this. The server then tries to interpret all
> > subsequent client communications as encrypted. When the client fails to
> > establish security, this is a hard failure, and it tries to terminate the
> > connection by sending an LDAP UNBIND request, which is unencrypted. The
> > server interprets this as a bogus SASL buffer length. So one fix is to
> > change 389 to see this case and have the SASL I/O layer code just interpret
> > this as an unbind request and close the connection.
>
> This sounds promising - also sounds like something we should consider for
> 6.5, this is not the first bugzilla where I see complaints about exceeded
> SASL limit.
>
> > The other part of this problem is - why does the client fail to establish an
> > encrypted connection? We have not been able to reproduce this part of the
> > problem. Mark has set up an IPA test environment and is trying to reproduce
> > in a "controlled" environment.
>
> Any further update on this part?

No. Mark Reynolds has been trying to reproduce but has not been able to. Does anyone have any idea how to reliably reproduce these sorts of failures?

> Should we move or clone this bug to
> 389-ds-base component so that you have something to attach the fix to?

Yes.
I'm a bit worried if we could assume sp->encrypted_buffer is always NULL terminated. It is managed by encrypted_buffer_offset...

…
…
sasl_io_start_packet(PRFileDesc *fd, PRIntn flags, PRIntervalTime ...
263 if(!sp->send_encrypted && strlen(sp->encrypted_buffer) != 0){

The rest looks good to me.
right - you can't use strlen - you should already know exactly how many bytes you have in the buffer
Replying to [comment:4 rmeggins]:
I can't use the bytes read as it's always 4 bytes on the initial pass, but the encrypted buffer is allocated using calloc(size of 1024), so after 4 bytes, it is definitely NULL terminated.
Now the strlen from this code can be removed, and we can use total_bytes:
303 }
304 /*
305  * LDAP operations are always a minimum of 7 bytes
306  */
307 if(found_data && total_bytes >= 7){
308     struct berval bv, tmp_bv;
309     ber_len_t len = 0;
310     BerElement *ber = NULL;
311
312     bv.bv_val = sp->encrypted_buffer;
313     bv.bv_len = strlen(bv.bv_val) + 1;
I'll send out the new patch shortly, but first, do we agree that the first strlen(from line 263) is safe?
Replying to [comment:5 mreynolds]:
> Replying to [comment:4 rmeggins]:
> > right - you can't use strlen - you should already know exactly how many bytes you have in the buffer
>
> I can't use the bytes read as it's always 4 bytes on the initial pass, but the encrypted buffer is allocated using calloc(size of 1024), so after 4 bytes, it is definitely NULL terminated.
I don't understand - you are controlling when PR_Recv is called, and you know exactly how many bytes are read (the return value from PR_Recv) - so you should always know exactly how many bytes you have read that are available to be parsed. You should never have to use strlen, even if in this case it is safe. A malicious attacker could send a stream of bytes that is not null terminated.
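As a rough illustration of the point about byte counts, here is a minimal sketch that records how many bytes each receive call returned and answers "is there data?" from that counter alone. `struct pkt_buf`, `pkt_buf_append` and `pkt_buf_has_data` are hypothetical stand-ins, not the real sasl_io_private fields or any NSPR call. Note also that the 4-byte SASL length prefix of any packet smaller than 16 MB starts with a 0x00 byte, so strlen would report an empty buffer even after four bytes have arrived.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PKT_BUF_SIZE 1024

/* Hypothetical stand-in for the buffer fields of sasl_io_private. */
struct pkt_buf {
    unsigned char data[PKT_BUF_SIZE];
    size_t count; /* bytes actually received so far */
};

/*
 * Record the result of one receive call. src/nread stand in for what a
 * PR_Recv-style call produced; a non-positive nread stores nothing.
 * Returns 0 on success, -1 if the data would overflow the buffer.
 */
int
pkt_buf_append(struct pkt_buf *pb, const unsigned char *src, long nread)
{
    if (nread <= 0) {
        return 0;
    }
    if ((size_t)nread > PKT_BUF_SIZE - pb->count) {
        return -1;
    }
    memcpy(pb->data + pb->count, src, (size_t)nread);
    pb->count += (size_t)nread;
    return 0;
}

/* Emptiness check based on the counter, not on a NUL terminator. */
int
pkt_buf_has_data(const struct pkt_buf *pb)
{
    return pb->count > 0;
}
```

With a counter like this, neither strlen nor strcmp is needed, and the check behaves the same whether or not the received bytes contain a zero.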
> Now the strlen from this code can be removed, and we can use total_bytes:
>
> 303 }
> 304 /*
> 305  * LDAP operations are always a minimum of 7 bytes
> 306  */
> 307 if(found_data && total_bytes >= 7){
> 308     struct berval bv, tmp_bv;
> 309     ber_len_t len = 0;
> 310     BerElement *ber = NULL;
> 311
> 312     bv.bv_val = sp->encrypted_buffer;
> 313     bv.bv_len = strlen(bv.bv_val) + 1;
>
> I'll send out the new patch shortly, but first, do we agree that the first strlen(from line 263) is safe?
safe - possibly - but you should not use strlen in this code
Replying to [comment:6 rmeggins]:
> Replying to [comment:5 mreynolds]:
> > Replying to [comment:4 rmeggins]:
> > > right - you can't use strlen - you should already know exactly how many bytes you have in the buffer
> >
> > I can't use the bytes read as it's always 4 bytes on the initial pass, but the encrypted buffer is allocated using calloc(size of 1024), so after 4 bytes, it is definitely NULL terminated.
>
> I don't understand - you are controlling when PR_Recv is called, and you know exactly how many bytes are read (the return value from PR_Recv) - so you should always know exactly how many bytes you have read that are available to be parsed.
During a normal SASL bind PR_Recv returns 4, but the encrypted buffer is empty:
$10 = {decrypted_buffer = 0x281a9a0 "\310ˁ!\363\177", decrypted_buffer_size = 1024, decrypted_buffer_count = 0, decrypted_buffer_offset = 0, encrypted_buffer = 0x281aef0 "", encrypted_buffer_size = 1024, encrypted_buffer_count = 0, encrypted_buffer_offset = 4, conn = 0x7ff3100aa1b8, send_encrypted = 0, send_buffer = 0x0, send_size = 0, send_offset = 0}
----> encrypted_buffer = 0x281aef0 ""
I need to detect whether encrypted_buffer has any data or is "". The result from PR_Recv is always 4, so I cannot use "4" to determine if the buffer actually has any data in it. Does that clarify what I'm trying to do? What other way is there to detect this condition?
> You should never have to use strlen, even if in this case it is safe. A malicious attacker could send a stream of bytes that is not null terminated.
PR_Recv only reads 4 bytes on the first pass, so if it is an endless string of bytes, we still only read the first 4 bytes into a calloced buffer. Anyway, I have no problem getting rid of the strlen call if there is another way to safely detect whether the encrypted_buffer is "" or not. I'm assuming strcmp is just as risky as strlen.
Thanks, Mark
> > Now the strlen from this code can be removed, and we can use total_bytes:
> >
> > 303 }
> > 304 /*
> > 305  * LDAP operations are always a minimum of 7 bytes
> > 306  */
> > 307 if(found_data && total_bytes >= 7){
> > 308     struct berval bv, tmp_bv;
> > 309     ber_len_t len = 0;
> > 310     BerElement *ber = NULL;
> > 311
> > 312     bv.bv_val = sp->encrypted_buffer;
> > 313     bv.bv_len = strlen(bv.bv_val) + 1;
> >
> > I'll send out the new patch shortly, but first, do we agree that the first strlen(from line 263) is safe?
>
> safe - possibly - but you should not use strlen in this code
Actually I found a way to safely detect the empty buffer, new patch will be coming out shortly.
OK, I think I found an issue with the current buffer size calculation (when reading in more data):
292 bsize += next_size;
I think this should be "bsize -= ret;", and then it is also not reset after the buffer is resized.
New patch attached.
1) instead of reading 4 bytes to begin with, read 7 - if the packet is really an unencrypted LDAP request, it will be at least 7 bytes - if it is encrypted, then it will be at least 4 bytes + 7 bytes + encryption overhead, so the absolute minimum number of bytes that will always be available is 7
2) once you have these 7 bytes, put it in a ber:

bv.bv_val = sp->encrypted_buffer;
bv.bv_len = 7;
ber = ber_init(&bv);
then check for LDAP_TAG_MESSAGE
you'll have to roll your own get_ber_len function - I don't think there is a way in the LDAP API to get the length of the ber without having the entire PDU already - the other good thing about using 7 bytes is that you are guaranteed to be able to read a BER len (5 bytes max - 1 for the len/lenlen field, and up to 4 more for the entire ber_len_t).
Once you have the length, you'll have to check it against config_get_maxbersize()
If the length check passes, you'll have to read len more bytes. The tricky part here is if PR_Recv returns PR_WOULD_BLOCK_ERROR. You'll have to actually return and arrange to be called again - that is, you'll have to handle the case where sasl_io_start_packet() is re-entrant - connection_read_operation will have to poll() and call PR_Recv again when more data is available, which will then call sasl_io_recv -> sasl_io_start_packet - you might need to add some sort of flag to the sasl_io_private structure to keep track of this.
Then, once you have all len bytes, you can then make another BER
bv.bv_val = sp->encrypted_buffer;
bv.bv_len = fulllength;
ber = ber_init(&bv);
then check for LDAP_MSGID and LDAP_REQ_UNBIND
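The "roll your own get_ber_len" step could look roughly like the sketch below. It parses only the tag and length octets at the front of a possibly incomplete PDU, which is why 7 bytes are always enough: one tag byte plus at most five length octets. `peek_ldap_pdu_len` and `LDAP_TAG_MESSAGE_BYTE` are hypothetical names, this is not the function from the attached patch, and the caller is assumed to compare the header size plus the content length against config_get_maxbersize() before reading more.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LDAP_TAG_MESSAGE_BYTE 0x30 /* universal constructed SEQUENCE */

/*
 * Parse the tag and length at the start of a possibly incomplete LDAP PDU.
 * On success returns 0, stores the content length in *len and the number
 * of header bytes (tag + length octets) in *hdr. Returns -1 if the data
 * does not start with the LDAP message tag, uses an unsupported length
 * form, or has fewer bytes available than the header needs.
 */
int
peek_ldap_pdu_len(const unsigned char *buf, size_t avail,
                  uint32_t *len, size_t *hdr)
{
    if (avail < 2 || buf[0] != LDAP_TAG_MESSAGE_BYTE) {
        return -1;
    }
    if ((buf[1] & 0x80) == 0) { /* short form: length in the low 7 bits */
        *len = buf[1];
        *hdr = 2;
        return 0;
    }
    size_t nlen = buf[1] & 0x7f; /* long form: number of length octets */
    if (nlen == 0 || nlen > 4 || avail < 2 + nlen) {
        return -1; /* indefinite form, too large, or truncated */
    }
    uint32_t v = 0;
    for (size_t i = 0; i < nlen; i++) {
        v = (v << 8) | buf[2 + i];
    }
    *len = v;
    *hdr = 2 + nlen;
    return 0;
}
```

For the UNBIND bytes from the tcpdump this yields a content length of 5 after a 2-byte header, so the whole PDU is 7 bytes and nothing more needs to be read. A genuine SASL length prefix for a small packet starts with 0x00 and is rejected at the tag check.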
Replying to [comment:11 rmeggins]:
> 1) instead of reading 4 bytes to begin with, read 7 - if the packet is really an unencrypted LDAP request, it will be at least 7 bytes - if it is encrypted, then it will be at least 4 bytes + 7 bytes + encryption overhead, so the absolute minimum number of bytes that will always be available is 7
> 2) once you have these 7 bytes, put it in a ber:
>
> bv.bv_val = sp->encrypted_buffer;
> bv.bv_len = 7;
> ber = ber_init(&bv);
>
> then check for LDAP_TAG_MESSAGE
>
> you'll have to roll your own get_ber_len function - I don't think there is a way in the LDAP API to get the length of the ber without having the entire PDU already - the other good thing about using 7 bytes is that you are guaranteed to be able to read a BER len (5 bytes max - 1 for the len/lenlen field, and up to 4 more for the entire ber_len_t).
I just want to double check this before I write too much code. Is this something I can use the length source code from ber_peek_element()? When I used this before, I got a length of 48 for the complete ldap unbind request(PR_Recv had returned 7). When I "skipped" the tag, this code then returned a length of 7. Are either of these numbers what you are referring to? Or is this something I need to explicitly parse out using ber_* functions?
> Once you have the length, you'll have to check it against config_get_maxbersize()
>
> If the length check passes, you'll have to read len more bytes. The tricky part here is if PR_Recv returns PR_WOULD_BLOCK_ERROR. You'll have to actually return and arrange to be called again - that is, you'll have to handle the case where sasl_io_start_packet() is re-entrant - connection_read_operation will have to poll() and call PR_Recv again when more data is available, which will then call sasl_io_recv -> sasl_io_start_packet - you might need to add some sort of flag to the sasl_io_private structure to keep track of this.
>
> Then, once you have all len bytes, you can then make another BER
>
> bv.bv_val = sp->encrypted_buffer;
> bv.bv_len = fulllength;
> ber = ber_init(&bv);
>
> then check for LDAP_MSGID and LDAP_REQ_UNBIND
Replying to [comment:13 mreynolds]:
> Replying to [comment:11 rmeggins]:
> > 1) instead of reading 4 bytes to begin with, read 7 - if the packet is really an unencrypted LDAP request, it will be at least 7 bytes - if it is encrypted, then it will be at least 4 bytes + 7 bytes + encryption overhead, so the absolute minimum number of bytes that will always be available is 7
> > 2) once you have these 7 bytes, put it in a ber:
> >
> > bv.bv_val = sp->encrypted_buffer;
> > bv.bv_len = 7;
> > ber = ber_init(&bv);
> >
> > then check for LDAP_TAG_MESSAGE
> >
> > you'll have to roll your own get_ber_len function - I don't think there is a way in the LDAP API to get the length of the ber without having the entire PDU already - the other good thing about using 7 bytes is that you are guaranteed to be able to read a BER len (5 bytes max - 1 for the len/lenlen field, and up to 4 more for the entire ber_len_t).
>
> I just want to double check this before I write too much code. Is this something I can use the length source code from ber_peek_element()?
Yes.
> When I used this before, I got a length of 48 for the complete ldap unbind request (PR_Recv had returned 7). When I "skipped" the tag, this code then returned a length of 7. Are either of these numbers what you are referring to?
Yes. And you will have to make sure you skip the tag to get to the beginning of the length bytes.
> Or is this something I need to explicitly parse out using ber_* functions?
The problem with using ber_skip_tag or other ber_ api functions to get the length of the LDAP PDU is that you only have the 7 bytes initially. You have to read the length of the full PDU to know how many more bytes to read. The ber_ api functions will just return an error - they expect to have the full PDU of X bytes - since only 7 are available, they will return an error. Unless you can figure out some way to call ber_peek_tag and figure out some way to tell that when the return value is LBER_DEFAULT, you can tell if the len returned is valid or not. In general I don't think it's possible or a good idea to rely on that. And don't forget to check the length against the maxbersize.
Once you figure out the length of the LDAP PDU and read it in, then you can use the ber api functions to get the msgid tag/length/value and the unbind tag.
You'll have to read the entire LDAP PDU anyway to pass up to connection_read_operation - it will need the entire UNBIND operation to parse and process it.
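Once the whole PDU is in the buffer, the msgid and UNBIND tag check described above comes down to a few byte comparisons. The sketch below does this by hand only to stay self-contained; the actual patch may well use ber_init and the other ber_ API functions instead. `is_plain_ldap_unbind` is a hypothetical name, and the sketch assumes an UNBIND request without controls and with short-form lengths.

```c
#include <assert.h>
#include <stddef.h>

#define TAG_SEQUENCE   0x30 /* LDAPMessage envelope */
#define TAG_INTEGER    0x02 /* messageID */
#define TAG_UNBIND_REQ 0x42 /* [APPLICATION 2] UnbindRequest, primitive */

/*
 * Return 1 if pdu[0..n) is exactly one LDAP UNBIND request without
 * controls, 0 otherwise. Only short-form lengths are accepted.
 */
int
is_plain_ldap_unbind(const unsigned char *pdu, size_t n)
{
    if (n < 7 || pdu[0] != TAG_SEQUENCE || (pdu[1] & 0x80)) {
        return 0;
    }
    if ((size_t)pdu[1] != n - 2) {
        return 0; /* envelope length must cover the rest exactly */
    }
    if (pdu[2] != TAG_INTEGER) {
        return 0;
    }
    size_t idlen = pdu[3];
    if (idlen < 1 || idlen > 4 || 4 + idlen + 2 != n) {
        return 0; /* msgid must be followed by exactly tag + length */
    }
    return pdu[4 + idlen] == TAG_UNBIND_REQ && pdu[5 + idlen] == 0x00;
}
```

The 7-byte request from the tcpdump (30 05 02 01 02 42 00) passes this check; anything with a different operation tag, trailing bytes, or a truncated body does not.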
Revision #3 0001-Ticket-47416-SASL-encrypted-packet-length-exceeds-ma.patch
git merge ticket47416
Updating 708df4b..b4cdebb
Fast-forward
 ldap/servers/slapd/ldaputil.c     |  39 ++++++++
 ldap/servers/slapd/sasl_io.c      | 186 ++++++++++++++++++++++++++++++++-----
 ldap/servers/slapd/slapi-plugin.h |  10 ++
 3 files changed, 211 insertions(+), 24 deletions(-)

git push origin master
Counting objects: 15, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (8/8), done.
Writing objects: 100% (8/8), 3.97 KiB, done.
Total 8 (delta 6), reused 0 (delta 0)
To ssh://git.fedorahosted.org/git/389/ds.git
   708df4b..b4cdebb  master -> master
commit b4cdebb
Metadata Update from @mreynolds: - Issue assigned to mreynolds - Issue set to the milestone: 1.3.2 - 07/13 (July)
389-ds-base is moving from Pagure to Github. This means that new issues and pull requests will be accepted only in 389-ds-base's github repository.
This issue has been cloned to Github and is available here: - https://github.com/389ds/389-ds-base/issues/753
Metadata Update from @spichugi: - Issue close_status updated to: wontfix (was: Fixed)