#47968 [RFE] Send logs to journald
Closed: Fixed None Opened 4 years ago by mkosek.

Instead of writing the debug logs directly to the disk, it is possible to send them to journald, on platforms where journald is available.

The biggest benefit is that the debug logs are structured, so it's possible to decorate the debug messages with key/value pairs and later query based on these extra fields.

This feature is a required so that FreeIPA project can implement additional log processing or auditing capabilities. See upstream ticket for details: https://fedorahosted.org/freeipa/ticket/4296.


How does one "Send logs to journald"? Using the syslog(3) interface? If not, then what is the api?

AFAIU, the key is to simply send calling systemd specific function - sd_journal_send* (e.g., sssd project calls sd_journal_send_with_location, you can check specific example in [https://git.fedorahosted.org/cgit/sssd.git/commit/?id=60df591b865831a71e93eb57f4c353c4555b66cf this commit]).

The whole change that was needed for example in SSSD was not complicated, see https://fedorahosted.org/sssd/ticket/2195. I bet Jakub Hrozek or sssd-devel list could help with any specifics to journald utilization.

You can use syslog interface for plain log sending, but much better to move to sd_journal_send as mentioned by mkosek above.
http://www.freedesktop.org/software/systemd/man/systemd-journald.service.html
http://www.freedesktop.org/software/systemd/man/sd_journal_send.html

The logs already look heavily key/value focused already, but there's a few exceptions (and I don't understand enough about the logs already). Would be great to be using sd_journal_send when available to push all the data as key/value pairs.

Hi,

I have peformed a detailed investigation of this, and have concluded that while journald may offer nice interfaces, it is not fit for purpose for a serious logging solution for ns-slapd.

I have conducted a number of performance and api experiments with journald. At a previous workplace with a small-medium sized install of slapd, we would easily generate 4 million log events per hour.

Conducting a smaller test, has revealed that given ~1.5million events per hour systemd exhibits the following characteristics:

  • Background consumes 15% cpu on an i7 system.
  • Consumes 1200MB / hour for access / error logs.
  • Consumes 1740MB / hour for audit logs.

If we assume a 10:1 ration of access to audit, this means:

1374MB / hour of log consumption.

According to the systemd documentation:

"The first pair defaults to 10% and the second to 15% of the size of the respective file system, but each value is capped to 4G." [0]. This is supported by examining the code [1].

As a result, this means that even in a small installation, journald will only be able to retain logs for ~3 hours before they are rotated away. In any larger site or installation, we would see this reduce to sub 1 hour, easily.

Due to the nature of slapd, and the requirements for auditing, potentially well into the past, this is not an acceptable situation.

The lack of ACID support and the risk of corruption is of concern as well due to the need for reliable, verifiable logs in a security system.

Additionally, without reconfiguration, journald will drop nearly 85% of the events that ns-slapd generates due to rate limiting.

Reconfiguration is not a serious issue: We can notify of this in dsktune.

However, the 4G limit, is a serious issue.

Unless the systemd developers can be convinced to raise this limit to "unlimited", then I do not believe that we can consume journald as a logging solution.

Sincerely,

William

[0] http://www.freedesktop.org/software/systemd/man/journald.conf.html
[1] http://cgit.freedesktop.org/systemd/systemd/tree/src/journal/journal-file.c#n47

Despite the limitations of journald as you have pointed out, we would still like to have the option to log to journald, for those customers who will want to use it anyway.

I'll go ahead and implement it anyway then. It's good to know the limitations as we can document them so people are aware.

Updated patch for locking and safer handling of backend choices.
0001-Ticket-47968-RFE-send-logs-to-journald.2.patch

A question ... Is log_set_backend called just for the access log (i.e., logtype is always SLAPD_ACCESS_LOG?) It's not clear to me who is referring logging_backend...
{{{
393 int
394 log_set_backend(const char attrname, char value, int logtype, char errorbuf, int apply) {
...
441 /
We have a valid backend, set it /
442 /

443 * We just need to use any lock here, doesn't matter which.
444 */
445 LOG_ACCESS_LOCK_WRITE( );
446 loginfo.log_backend = backend;
447 slapi_ch_free_string(&(slapdFrontendConfig->logging_backend));
448 slapdFrontendConfig->logging_backend = slapi_ch_strdup(value);
449 LOG_ACCESS_UNLOCK_WRITE( );
}}}

And as usual, please try not to mix up tabs and spaces in the indentation... ;)

So, the reason for that isn't that the access log uses it (access, error, audit, audit fail all use this backend value), we just need a single lock around the backend update to prevent issues if two threads update the value simultaneously.

I'm going to retab both log.c and libglobs.c shortly after this, so I'm not bothering to fix these (yet) :)

Minor issue, you have: internal, syslog, and journald logging. I think "internal" can be misleading because we have other kinds of internal logging, I think it should be "server" or something similar.

Hmmm. Some options:

  • server
  • nsslapd
  • nsslapd-log
  • nsslapd-internal
  • classic
  • dirsrv
  • dirsrv-log

Any of these look good? I kind of like dirsrv-log ... But there is no reason it couldn't accept more than one possible value for this.

Replying to [comment:16 firstyear]:

Hmmm. Some options:

  • server
  • nsslapd
  • nsslapd-log
  • nsslapd-internal
  • classic
  • dirsrv
  • dirsrv-log

Any of these look good? I kind of like dirsrv-log ... But there is no reason it couldn't accept more than one possible value for this.

dirsrv-log is good for me.

Replying to [comment:17 mreynolds]:

Any of these look good? I kind of like dirsrv-log ... But there is no reason it couldn't accept more than one possible value for this.

dirsrv-log is good for me.

+1

Have you updated the DS log identifier to "dirsrv-log" yet? I don't see a new patch in ticket.

Sorry about that. Updated patch is attached with identifier of "dirsrv-log" instead of "internal".

commit aa20634

Total 27 (delta 21), reused 0 (delta 0)
To ssh://git.fedorahosted.org/git/389/ds.git
ab6501a..aa20634 master -> master

commit 7cf83ac

Fix for jenkins build.

To ssh://git.fedorahosted.org/git/389/ds.git
aa20634..7cf83ac master -> master

commit be8c063
Fix for rpm build which applies different options to our normal build.

Fixes for various faults in the implementation that were found in usage.
0001-Ticket-47968-Fix-for-issues-with-multiple-log-backen.patch

commit 42a3a68
commit ef53e0c

To ssh://git.fedorahosted.org/git/389/ds.git
b31749f..42a3a68 master -> master

We have decided that we can't support journald until some issues with it are corrected. This patch will prevent journald support from being built into DS until these issues with journald are solved.

Sorry that this is the case.

commit d779160bdb82a9d74481f821913b819e1e7ee6b0
Total 10 (delta 8), reused 0 (delta 0)
To ssh://git.fedorahosted.org/git/389/ds.git
955dc66..5ddd7b9 master -> master

Metadata Update from @firstyear:
- Issue assigned to firstyear
- Issue set to the milestone: 1.3.5 backlog

2 years ago

Login to comment on this ticket.

Metadata