#49194 Decrease default ioblocktimeout
Closed: wontfix 7 years ago. Opened 7 years ago by firstyear.

Issue Description

Our default ioblocktimeout is set to 30 minutes. This is extremely high and causes issues for clients out of the box.

We should conservatively reduce this to 10 minutes, with the aim of reducing it to 5 in the future.
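
For reference, the value can also be overridden on a running instance with a plain ldapmodify against cn=config. A minimal python-ldap sketch, with placeholder host and credentials, assuming (per the documentation) that nsslapd-ioblocktimeout is expressed in milliseconds, so 10 minutes is 600000:

    import ldap

    # Placeholder connection details for a local 389-ds instance.
    conn = ldap.initialize("ldap://localhost:389")
    conn.simple_bind_s("cn=Directory Manager", "password")

    # Lower the I/O block timeout to 10 minutes; the attribute is in milliseconds.
    conn.modify_s("cn=config",
                  [(ldap.MOD_REPLACE, "nsslapd-ioblocktimeout", [b"600000"])])
    conn.unbind_s()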


Metadata Update from @firstyear:
- Issue assigned to firstyear

7 years ago

We talked about doing this a while ago. I think we all agreed about making it 30 or 60 seconds actually. Trying to find the email trail...

Metadata Update from @mreynolds:
- Custom field reviewstatus adjusted to new
- Custom field type adjusted to defect

7 years ago

Could be these?

Original:
-------- Original Message --------
Subject: Re: Reduce ioblocktimeout and idletimeout defaults
Date: Wed, 18 May 2016 17:29:49 +0200
From: Ludwig Krispenz lkrispen@redhat.com
To: Petr Vobornik pvoborni@redhat.com
CC: Thierry Bordaz tbordaz@redhat.com
Hi,
On 05/18/2016 04:53 PM, Petr Vobornik wrote:

Hi Ludwig,

we have this bz on the RHEL 7.3 RPL (release priority list):
https://bugzilla.redhat.com/show_bug.cgi?id=1271321

I'd like to assign it to someone. I think there are several questions to
answer:

Is it a good proposal?
I think the defaults are quite high, e.g. idletimeout is set to forever
and ioblocktimeout to 30 mins, so yes, it could make sense.

If yes, then what should be the defaults?
That's the difficult part:

  • An idle connection is not that bad: it blocks a slot in the connection
    table and is looked at whenever the connection list is iterated, but
    otherwise it does no harm, and we do not want to kick out bound users or an
    idle replication connection too quickly. If we want to reduce the
    default, I would not go below, say, 30 mins.
  • ioblocktimeout is more serious: a connection waiting for a response from
    a client blocks a worker thread, and if there are many unresponsive
    clients, all threads are occupied and the server seems to hang. But
    there are slow clients or clients on slow connections, so setting a very
    small value can affect some clients unnecessarily. Also, I think this is an
    indication of some "lazy" or "bad" clients, which should be
    investigated. But the default of 30 mins also seems unreasonably long, so
    the suggestion of 10 to 20 sec would make sense to me.
  • Should it be set on existing systems?
    I am not sure about this; is there an ipa command to change these
    settings, or would it require plain ldapmodify? (See the sketch after
    this message.)

When this is answered, the change itself is very trivial.

Is it easy to answer? Do you have the answers?
No, no very definite answers yet; I will discuss it in the DS team.

Regards
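
For illustration of the last question above: both attributes live on the cn=config entry, so on the DS side a plain ldapmodify (or any LDAP client) is enough to read or change them. A minimal python-ldap sketch with placeholder connection details, using the 30-minute idletimeout floor suggested above (nsslapd-idletimeout is in seconds, nsslapd-ioblocktimeout in milliseconds):

    import ldap

    # Placeholder connection details.
    conn = ldap.initialize("ldap://localhost:389")
    conn.simple_bind_s("cn=Directory Manager", "password")

    # cn=config is a single entry; a base-scope search returns both timeouts.
    dn, attrs = conn.search_s("cn=config", ldap.SCOPE_BASE, "(objectClass=*)",
                              ["nsslapd-idletimeout", "nsslapd-ioblocktimeout"])[0]
    print(attrs)

    # Set the idle timeout to 30 minutes (1800 seconds).
    conn.modify_s("cn=config",
                  [(ldap.MOD_REPLACE, "nsslapd-idletimeout", [b"1800"])])
    conn.unbind_s()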

The last one from the team discussion:
On 05/18/2016 11:43 PM, Ludwig Krispenz wrote:

On 05/19/2016 01:28 AM, William Brown wrote:

On Wed, 2016-05-18 at 09:44 -0700, Noriko Hosoi wrote:

On 05/18/2016 09:22 AM, Ludwig Krispenz wrote:

On 05/18/2016 06:18 PM, Noriko Hosoi wrote:

On 05/18/2016 08:54 AM, Mark Reynolds wrote:

On 05/18/2016 11:48 AM, Ludwig Krispenz wrote:

Hi,

Petr Vobornik from the IPA team raised the question of whether IPA should
lower the defaults for idletimeout and ioblocktimeout. I tried to
answer below, but am a bit undecided.
Do you have any strong opinion on this matter?
I have a strong opinion about the ioblocktimeout, as you do too.
It is way too long, and it has caused many support cases to be opened.
I was originally thinking 30 seconds, but really, if after
10 seconds of waiting nothing has come through the pipe, then
there is probably something wrong and the connection should be closed.
I have one question. If an application keeps a connection and uses
it in an asynchronous manner, could the connection be closed if it
is idle beyond the ioblocktimeout?
I think this would be the idletimeout closing the connection.
Thanks for the confirmation. Then I vote yes to 10 sec.
In what cases would we have >10s for a response ... That would indicate that our result set is likely way too large.

How about simple paged results searches? If they are not affected, I
have no problem with reducing it to 10 sec.
What about paged results? Or is that the idletimeout case above?
I think that's the idletimeout. What we should also look into is how idletimeout affects persistent connections like psearch and sync-repl.

Unrelated reasons why the idea was proposed, but yes, same concept.
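
To make the idletimeout vs. ioblocktimeout distinction above concrete, here is a minimal python-ldap sketch (host, credentials, and entries are hypothetical): a connection that merely sits idle between operations is governed by nsslapd-idletimeout, while nsslapd-ioblocktimeout only applies to a read or write that blocks in the middle of an operation.

    import time
    import ldap

    # Hypothetical application connection.
    conn = ldap.initialize("ldap://ldap.example.com")
    conn.simple_bind_s("uid=app,ou=people,dc=example,dc=com", "secret")

    # Asynchronous use: send the request now, collect the result later.
    msgid = conn.search("dc=example,dc=com", ldap.SCOPE_SUBTREE, "(uid=user1)")
    rtype, rdata = conn.result(msgid, all=1)

    # While the application does other work, no LDAP traffic is exchanged.
    # This idle period is what nsslapd-idletimeout limits; nsslapd-ioblocktimeout
    # is not involved because the server is not blocked mid-operation.
    time.sleep(30)

    # The same connection can be reused as long as idletimeout has not expired.
    msgid = conn.search("dc=example,dc=com", ldap.SCOPE_SUBTREE, "(uid=user2)")
    rtype, rdata = conn.result(msgid, all=1)
    conn.unbind_s()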

Metadata Update from @firstyear:
- Custom field reviewstatus adjusted to review (was: new)

7 years ago

I really think we should go lower. I don't think it should be higher than 2 minutes (preferably less).

Let me elaborate...

I think the odds of the ioblocktimeout being too aggressive (short) are significantly less than the odds of it being too long. So I think it is acceptable for those rare cases to just increase the timeout as needed. I'd rather give relief to the majority of cases (which usually require a very low timeout) than try to pick a value that "might" satisfy both.

In the customer cases we've seen, we usually end up telling the customer to go down to 30 seconds (or sometimes less). So in most cases, if the ioblocktimeout needs to be lowered, it needs to be set very low in order to solve the problem.

So my stance is that we should not pick a middle-ground value, but instead go very aggressive. Just my thoughts on the matter - I'd be interested in others' opinions.
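
For reference, the values floated in this ticket translate to nsslapd-ioblocktimeout settings (expressed in milliseconds) as in this small sketch:

    # Candidate ioblocktimeout values from this ticket, converted to the
    # milliseconds that nsslapd-ioblocktimeout expects.
    candidates_ms = {
        "30 min (old default)": 30 * 60 * 1000,  # 1800000
        "10 min": 10 * 60 * 1000,                # 600000
        "5 min": 5 * 60 * 1000,                  # 300000
        "2 min": 2 * 60 * 1000,                  # 120000
        "30 sec": 30 * 1000,                     # 30000
        "10 sec": 10 * 1000,                     # 10000
    }

    for label, ms in candidates_ms.items():
        print(f"{label:>20}: nsslapd-ioblocktimeout={ms}")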

I agree, but I contacted GSS, and they said they didn't want to rock the boat and cause too many issues too quickly, which I understand; they are the ones who have to cop the flak if we cause a mess!

So my thought was we can quickly get this into 1.3.6, and then for 1.3.7 we make the next step of 2 minutes.

I just think 10 minutes is still a really long time to block a connection - it's just as bad as 30 minutes in my opinion. How about 5 minutes? ;)

Deal. New patch shortly.

It's still set to 10 minutes...

Whew, I was worried it might have been a Pagure bug. Ack

Metadata Update from @mreynolds:
- Custom field reviewstatus adjusted to ack (was: review)

7 years ago

commit 85dcc19
To ssh://git@pagure.io/389-ds-base.git
3129a94..85dcc19 master -> master

Metadata Update from @firstyear:
- Issue close_status updated to: fixed
- Issue status updated to: Closed (was: Open)

7 years ago

389-ds-base is moving from Pagure to GitHub. This means that new issues and pull requests
will be accepted only in 389-ds-base's GitHub repository.

This issue has been cloned to GitHub and is available here:
- https://github.com/389ds/389-ds-base/issues/2253

If you want to receive further updates on the issue, please navigate to the GitHub issue
and click on the Subscribe button.

Thank you for understanding. We apologize for any inconvenience.

Metadata Update from @spichugi:
- Issue close_status updated to: wontfix (was: fixed)

3 years ago
