#47582 agmt_count in Replica could become (PRUint64)-1
Closed: wontfix None Opened 10 years ago by nhosoi.

agmt_count in Replica can wrap around to (PRUint64)-1:
(gdb) p r->agmt_count
$2 = 18446744073709551615
(gdb) p (int)r->agmt_count
$3 = -1
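
The two gdb values above are what an unsigned 64-bit counter looks like after one decrement too many. A minimal sketch, assuming a stand-in struct (`demo_replica` and both helper functions are hypothetical and are not the server's code; only the field name `agmt_count` comes from the ticket):

```c
#include <stdint.h>

/* Hypothetical stand-in; the real field is agmt_count (PRUint64) in Replica. */
typedef struct {
    uint64_t agmt_count;
} demo_replica;

/* Unconditional decrement: unsigned arithmetic wraps, so 0 - 1 == UINT64_MAX. */
static void demo_decr_unchecked(demo_replica *r)
{
    r->agmt_count--;
}

/* Mirrors gdb's (int) cast. Converting an out-of-range value to int is
 * implementation-defined in C; common compilers truncate, which yields -1. */
static int demo_count_as_int(const demo_replica *r)
{
    return (int)r->agmt_count;
}
```

Calling `demo_decr_unchecked` on a replica whose count is 0 leaves `agmt_count` at 18446744073709551615, which reads as -1 through the `(int)` cast.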

The full Replica structure, with agmt_count == (PRUint64)-1:

(gdb) p *r
$1 = {repl_root = 0x7f702f2102e0, repl_name = 0x7f702f035cf0 "dfa8f703-464f11e3-b993ea7f-4d6a3997", new_name = 0,
updatedn_list = 0x7f702f1cef20, repl_type = REPLICA_TYPE_UPDATABLE, legacy_consumer = 0, legacy_purl = 0x0, repl_rid = 1,
repl_ruv = 0x7f702f213360, repl_ruv_dirty = 0, min_csn_pl = 0x0, csn_pl_reg_id = 0x7f702f1cc940, repl_state_flags = 0,
repl_flags = 1, repl_lock = 0x7f702f1bfe40, repl_eqcxt_rs = 0x7f702f2055d0, repl_eqcxt_tr = 0x0,
repl_csngen = 0x7f702f20fa60, repl_csn_assigned = 0, repl_purge_delay = 0, tombstone_reap_stop = 0,
tombstone_reap_active = 0, tombstone_reap_interval = 0, repl_referral = 0x0, state_update_inprogress = 0,
agmt_lock = 0x7f702f201d20, locking_purl = 0x0, protocol_timeout = 120, backoff_min = 3, backoff_max = 300,
agmt_count = 18446744073709551615}

Backtrace at the point where agmt_count became (PRUint64)-1:

(gdb) bt
#0 0x00007f70227dc438 in replica_decr_agmt_count (r=0x7f702f1e8750) at ldap/servers/plugins/replication/repl5_replica.c:3971
#1 0x00007f70227bdf1e in agmt_delete (rap=0x7f6ff37e9320) at ldap/servers/plugins/replication/repl5_agmt.c:606
#2 0x00007f70227bdd49 in agmt_new_from_entry (e=0x7f6f8c001080) at ldap/servers/plugins/replication/repl5_agmt.c:535
#3 0x00007f70227c37af in add_new_agreement (e=0x7f6f8c001080) at ldap/servers/plugins/replication/repl5_agmtlist.c:151
#4 0x00007f70227c38f9 in agmtlist_add_callback (pb=0x7f6ff37edae0, e=0x7f6f8c001080, entryAfter=0x0,
returncode=0x7f6ff37e9574, returntext=0x7f6ff37e9600 "", arg=0x0) at ldap/servers/plugins/replication/repl5_agmtlist.c:188
#5 0x00007f702c8384b8 in dse_call_callback (pdse=0x7f702ee63160, pb=0x7f6ff37edae0, operation=16, flags=1,
entryBefore=0x7f6f8c001080, entryAfter=0x0, returncode=0x7f6ff37e9574, returntext=0x7f6ff37e9600 "")
at ldap/servers/slapd/dse.c:2421
#6 0x00007f702c8378c0 in dse_add (pb=0x7f6ff37edae0) at ldap/servers/slapd/dse.c:2171
#7 0x00007f702c81b34e in op_shared_add (pb=0x7f6ff37edae0) at ldap/servers/slapd/add.c:735
#8 0x00007f702c81a274 in do_add (pb=0x7f6ff37edae0) at ldap/servers/slapd/add.c:258
#9 0x00007f702cd5b639 in connection_dispatch_operation (conn=0x7f702cbd7410, op=0x7f702f1bea20, pb=0x7f6ff37edae0)
at ldap/servers/slapd/connection.c:643
#10 0x00007f702cd5d6a4 in connection_threadmain () at ldap/servers/slapd/connection.c:2508
#11 0x00007f702ae65c76 in _pt_root (arg=0x7f702f0c9180) at ../../../nspr/pr/src/pthreads/ptthread.c:204
#12 0x00007f702a808d15 in start_thread (arg=0x7f6ff37ee700) at pthread_create.c:308
#13 0x00007f702a32553d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:114

The value 18446744073709551615 is then used to initialize smod, which requests a huge mod allocation and terminates the server. Stack trace where the server exits on the calloc failure:

#0 0x00007f427bced18c in slapi_ch_calloc (nelem=18446744073709551615, size=8) at ldap/servers/slapd/ch_malloc.c:251
#1 0x00007f427bd3d397 in slapi_mod_init (smod=0x7f4257ffebd0, initCount=-2) at ldap/servers/slapd/modutil.c:597
#2 0x00007f4271c89812 in agmt_maxcsn_to_smod (r=0x7f427d3bb750, smod=0x7f4257ffebd0)
at ldap/servers/plugins/replication/repl5_agmt.c:2808
#3 0x00007f4271ca069e in replica_write_ruv (r=0x7f427d3bb750) at ldap/servers/plugins/replication/repl5_replica.c:2608
#4 0x00007f4271ca04ae in replica_update_state (when=1383615645, arg=0x7f427d3cee50)
at ldap/servers/plugins/replication/repl5_replica.c:2559
#5 0x00007f427bd0bc74 in eq_call_all () at ldap/servers/slapd/eventq.c:312
#6 0x00007f427bd0be1e in eq_loop (arg=0x0) at ldap/servers/slapd/eventq.c:359
#7 0x00007f427a32cc76 in _pt_root (arg=0x7f427d3c69c0) at ../../../nspr/pr/src/pthreads/ptthread.c:204
#8 0x00007f4279ccfd15 in start_thread (arg=0x7f4257fff700) at pthread_create.c:308
#9 0x00007f42797ec53d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:114
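
Frames #0 and #1 above show initCount=-2 arriving as nelem=18446744073709551615 with size=8. That is consistent with the signed count plus one being converted to an unsigned type, though this reading is an assumption taken from the trace, not from the source of slapi_mod_init. A minimal sketch of the conversion and of why no allocator can satisfy the request (all `demo_` names are hypothetical):

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed conversion: a signed count plus one, stored in an unsigned size.
 * For init_count == -2 this is (size_t)-1, i.e. SIZE_MAX. */
static size_t demo_nelem_from_count(int init_count)
{
    return (size_t)(init_count + 1);
}

/* Returns 1 when nelem * size does not fit in size_t, a request that
 * a calloc-style allocator has to refuse. */
static int demo_request_overflows(size_t nelem, size_t size)
{
    return size != 0 && nelem > SIZE_MAX / size;
}
```

With the values from the trace, `demo_request_overflows(demo_nelem_from_count(-2), 8)` is true, while a small positive count passes the check.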

In the first stack trace, agmt_delete is called as part of error handling. On an error path, we probably do not want to decrement agmt_count unconditionally.

#1 0x00007f70227bdf1e in agmt_delete (rap=0x7f6ff37e9320) at ldap/servers/plugins/replication/repl5_agmt.c:606
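
One way to avoid the unconditional decrement is to record whether an agreement was ever counted and to decrement only in that case. This is a sketch of the idea only, not the change that was committed for this ticket; `guard_replica`, `guard_agmt`, the `counted` flag and both functions are hypothetical names:

```c
#include <stdint.h>

/* Hypothetical stand-ins for the replica and agreement objects. */
typedef struct {
    uint64_t agmt_count;
} guard_replica;

typedef struct {
    guard_replica *replica;
    int counted; /* 1 once this agreement has been added to agmt_count */
} guard_agmt;

/* Count the agreement at most once. */
static void guard_agmt_register(guard_agmt *a)
{
    if (!a->counted) {
        a->replica->agmt_count++;
        a->counted = 1;
    }
}

/* Safe on error paths: decrements only if the agreement was counted,
 * and never below zero. */
static void guard_agmt_delete(guard_agmt *a)
{
    if (a->counted && a->replica->agmt_count > 0) {
        a->replica->agmt_count--;
        a->counted = 0;
    }
}
```

With this shape, deleting an agreement that failed before it was registered, or deleting the same agreement twice, leaves the count unchanged.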

git merge ticket47582
Updating 8eecc43..d2aa2bd
Fast-forward
ldap/servers/plugins/replication/repl5_agmt.c | 10 ++++++++++
ldap/servers/plugins/replication/repl5_agmtlist.c | 1 -
ldap/servers/plugins/replication/repl5_replica.c | 4 +++-
3 files changed, 13 insertions(+), 2 deletions(-)

git push origin master
Counting objects: 17, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (9/9), done.
Writing objects: 100% (9/9), 1.20 KiB, done.
Total 9 (delta 7), reused 0 (delta 0)
To ssh://git.fedorahosted.org/git/389/ds.git
8eecc43..d2aa2bd master -> master

commit d2aa2bd
Author: Mark Reynolds <mreynolds@redhat.com>
Date: Thu Nov 7 16:09:21 2013 -0500

Metadata Update from @mreynolds:
- Issue assigned to mreynolds
- Issue set to the milestone: 1.3.3 - 11/13 (November)

7 years ago

389-ds-base is moving from Pagure to Github. This means that new issues and pull requests
will be accepted only in 389-ds-base's github repository.

This issue has been cloned to Github and is available here:
- https://github.com/389ds/389-ds-base/issues/919

If you want to receive further updates on the issue, please navigate to the GitHub issue
and click the Subscribe button.

Thank you for understanding. We apologize for any inconvenience.

Metadata Update from @spichugi:
- Issue close_status updated to: wontfix (was: Fixed)

3 years ago
