#51071 Add cache ratio checks into Healthcheck tool
Closed: wontfix 3 years ago by spichugi. Opened 3 years ago by mreynolds.

Issue Description

With autotuning in place, many admins no longer check whether the caches are optimally tuned. Autotuning provides a much better minimum default for the cache sizes, but it is not fully optimized. The server itself cannot do this because it doesn't know how the system is being used, so an admin needs to take manual action and adjust the sizes based on the resources actually available. Adding a "performance check" to healthcheck would be beneficial. This check would just look at the various cache hit ratios and report warnings based on these values. For example, a cache hit ratio below 80% should report a warning (something like that).

The challenge is that when you first start the server, the ratios are at zero. We really should only check the cache hit ratios once the server has been up and running for a while and/or the caches are fully primed. All this information is available in our monitors (cache stats, server uptime, etc.), but when do we say it's okay to check the ratios? After 1 hour, 6 hours? Or when the entry caches are filled? This might not be so straightforward. My point is that we need to reduce the risk of a false positive if we add this type of health check to the tool.
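
One way to picture the "when is it okay to check" question is a small gate that runs before any ratio is evaluated. This is only a sketch: the function name, its parameters, and the default values (1 hour, 90% full) are hypothetical, and treating "long enough uptime OR cache mostly filled" as sufficient is an assumption, not a decision made in this ticket.

```python
def ratios_are_meaningful(uptime_seconds, cache_used_bytes, cache_max_bytes,
                          min_uptime_seconds=3600, min_fill_fraction=0.9):
    """Return True once the cache hit ratios are worth evaluating.

    All names and defaults here are illustrative assumptions.
    """
    # A server that has been up long enough is checked even if the cache
    # never fills (for example, when the database is smaller than the cache).
    if uptime_seconds >= min_uptime_seconds:
        return True
    # Otherwise, accept an early check only if the cache is mostly primed.
    if cache_max_bytes > 0:
        return cache_used_bytes / cache_max_bytes >= min_fill_fraction
    return False
```

Using OR rather than AND avoids suppressing the check forever on a deployment whose cache is larger than its data, at the cost of possibly checking a cold cache after the uptime limit has passed.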

The other issue is deciding what cache hit ratio percentages should generate warnings. For example:

95% or higher = Green, no warning
85 - 95% = Amber
< 85% = Red

This is a bit on the high end, but what percentage should trigger a warning? This should be discussed among the team.
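
As a rough illustration of the tiers above, the mapping from a hit ratio to a severity could look like the following. The function name is hypothetical, and the cutoffs are parameters precisely because the right values are still open; the defaults simply mirror the 95/85 example, with exactly 95% counted as green.

```python
def classify_hit_ratio(ratio_percent, green_at=95.0, amber_at=85.0):
    """Map a cache hit ratio (0-100) to 'green', 'amber' or 'red'.

    The thresholds are placeholders taken from the example tiers,
    not agreed values.
    """
    if ratio_percent >= green_at:
        return "green"
    if ratio_percent >= amber_at:
        return "amber"
    return "red"
```

Other cutoffs can be tried without touching the logic, for instance `classify_hit_ratio(85, green_at=90, amber_at=80)` returns `"amber"`.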


I think there is a server uptime variable in cn=monitor we could read, and if that's less than 30 mins we can say "this may not yet be accurate" or similar?
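
A sketch of that uptime guard, under some assumptions: it supposes the monitor entry exposes a start time and a current time as UTC strings in `YYYYMMDDHHMMSSZ` form, and it takes them as plain arguments rather than reading them over LDAP. The function names and the wording of the note are hypothetical.

```python
from datetime import datetime, timezone

# Assumed timestamp layout, e.g. "20200506120000Z".
_TIME_FORMAT = "%Y%m%d%H%M%SZ"


def uptime_seconds(start_time, current_time):
    """Compute uptime in seconds from two UTC timestamp strings."""
    start = datetime.strptime(start_time, _TIME_FORMAT).replace(tzinfo=timezone.utc)
    now = datetime.strptime(current_time, _TIME_FORMAT).replace(tzinfo=timezone.utc)
    return (now - start).total_seconds()


def accuracy_note(uptime, min_uptime=30 * 60):
    """Return a caveat string if the server is too young, else None."""
    if uptime < min_uptime:
        minutes = int(min_uptime // 60)
        return ("server has been up for less than %d minutes; "
                "cache hit ratios may not yet be accurate" % minutes)
    return None
```

Returning a note instead of skipping the check keeps the ratios visible while flagging that they may still be warming up.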

I'd probably say 90% and 80% as the thresholds for green and amber? But it's hard to know what's right here, as there are many factors.

Metadata Update from @firstyear:
- Custom field origin adjusted to None
- Custom field reviewstatus adjusted to None

3 years ago

Metadata Update from @mreynolds:
- Issue set to the milestone: 1.4.3

3 years ago

389-ds-base is moving from Pagure to Github. This means that new issues and pull requests
will be accepted only in 389-ds-base's github repository.

This issue has been cloned to Github and is available here:
- https://github.com/389ds/389-ds-base/issues/4124

If you want to receive further updates on the issue, please navigate to the GitHub issue
and click the subscribe button.

Thank you for understanding. We apologize for all inconvenience.

Metadata Update from @spichugi:
- Issue close_status updated to: wontfix
- Issue status updated to: Closed (was: Open)

3 years ago
