#3566 syslog mesages for back ends uses invalid ident tag
Closed: cloned-to-github 3 years ago by pbrezina. Opened 6 years ago by erempel.

backend log messages apear in syslog as

2017-11-08T08:08:32.498-08:00 hostname daemon.info sssd[be[name]]: Shutting down

The ident tag in this case is sssd[be[name]] but the nested brackets violate the RFC for tag names and confuses the parsing of such messages. There is no problem logging the message, however, parsing tools such as logstach, syslog-ng and splunk have difficulties with identifying the application name as sssd.

I propose that these text within the brackets be moved to the message body resulting in a message log with a form of

sssd[PID]: be(name) Shutting down

This allows easier parsing of the backend names to use in SIEM systems or other log processing frameworks.
If it is determined that moving the test into the message body is unacceptable then please consider the format

sssd[be(name)]: Shutting down

This format can be processed by standard syslog services because it conforms to the RFC for ident tag names.

A word of caution is that since "name" is a user configured item, the total length of the ident tag can not be guaranteed to be less <= 32 characters, which is also part of the syslog RFC.


Please provide link to RFC which you mentioned few times.

BTW you might consider to get structured data from journald instead of syslog

sh# journalctl -u sssd --output=json-pretty
{
        "__CURSOR" : "s=ef1625b201c14664ae8fe904480f4c9d;i=1ea;b=7434db9268b34c5eaffb13cdba751542;m=d5e6cde25f;t=545a5aaf43c2f;x=a0d57c93673c8294",
        "__REALTIME_TIMESTAMP" : "1483952723606575",
        "__MONOTONIC_TIMESTAMP" : "918700286559",
        "_BOOT_ID" : "7434db9268b34c5eaffb13cdba751542",
        "SYSLOG_FACILITY" : "3",
        "PRIORITY" : "6",
        "_UID" : "0",
        "_GID" : "0",
        "_SYSTEMD_SLICE" : "system.slice",
        "_MACHINE_ID" : "52d5476f250e4edb9221538b3585f1b3",
        "_HOSTNAME" : "host.example.com",
        "_TRANSPORT" : "journal",
        "_CAP_EFFECTIVE" : "3fffffffff",
        "CODE_FILE" : "src/util/sss_log.c",
        "CODE_LINE" : "111",
        "CODE_FUNC" : "sss_log_internal",
        "MESSAGE" : "Starting up",
        "SSSD_DOMAIN" : "",
        "_SYSTEMD_CGROUP" : "/system.slice/sssd.service",
        "_SYSTEMD_UNIT" : "sssd.service",
        "_SYSTEMD_INVOCATION_ID" : "e55771e2a7d84ca8b2e54c0942a72b13",
        "_SELINUX_CONTEXT" : "system_u:system_r:sssd_t:s0",
        "SYSLOG_IDENTIFIER" : "sssd[be[redhat.com]]",
        "_PID" : "24510",
        "_COMM" : "sssd_be",
        "_EXE" : "/usr/libexec/sssd/sssd_be",
        "_CMDLINE" : "/usr/libexec/sssd/sssd_be --domain redhat.com --uid 974 --gid 973 --debug-to-files",
        "_SOURCE_REALTIME_TIMESTAMP" : "1483952723606551"
}
{
        "__CURSOR" : "s=ef1625b201c14664ae8fe904480f4c9d;i=1eb;b=7434db9268b34c5eaffb13cdba751542;m=d5e6d56da4;t=545a5aafbc774;x=f51b162b9b3101c0",
        "__REALTIME_TIMESTAMP" : "1483952724100980",
        "__MONOTONIC_TIMESTAMP" : "918700780964",
        "_BOOT_ID" : "7434db9268b34c5eaffb13cdba751542",
        "SYSLOG_FACILITY" : "3",
        "PRIORITY" : "6",
        "_SYSTEMD_SLICE" : "system.slice",
        "_MACHINE_ID" : "52d5476f250e4edb9221538b3585f1b3",
        "_HOSTNAME" : "host.example.com",
        "_TRANSPORT" : "journal",
        "_CAP_EFFECTIVE" : "0",
        "CODE_FILE" : "src/util/sss_log.c",
        "CODE_LINE" : "111",
        "CODE_FUNC" : "sss_log_internal",
        "MESSAGE" : "Starting up",
        "SSSD_DOMAIN" : "",
        "_SYSTEMD_CGROUP" : "/system.slice/sssd.service",
        "_SYSTEMD_UNIT" : "sssd.service",
        "_SYSTEMD_INVOCATION_ID" : "e55771e2a7d84ca8b2e54c0942a72b13",
        "_SELINUX_CONTEXT" : "system_u:system_r:sssd_t:s0",
        "SYSLOG_IDENTIFIER" : "sssd[nss]",
        "_PID" : "24511",
        "_UID" : "974",
        "_GID" : "973",
        "_COMM" : "sssd_nss",
        "_EXE" : "/usr/libexec/sssd/sssd_nss",
        "_CMDLINE" : "/usr/libexec/sssd/sssd_nss --uid 974 --gid 973 --debug-to-files",
        "_SOURCE_REALTIME_TIMESTAMP" : "1483952724100970"
}

sssd provides custom fields for journald

            "MESSAGE=%s", message,
            "PRIORITY=%i", LOG_DEBUG,
            "SSSD_DOMAIN=%s", domain,
            "SSSD_PRG_NAME=%s", debug_prg_name,
            "SSSD_DEBUG_LEVEL=%x", level,
            NULL

and we can provide more fields.

The original syslog RFC is here
https://tools.ietf.org/html/rfc3164
and conatains these limitations.

A new structured dataRFC is
https://tools.ietf.org/html/rfc5424
and given that is has specifically defined fields, there is no parsing problem for the application name.

In my case I am using Redhat 6 and it does not have journald. REdhat 6 is regular EOL in 2020 and Extended EOL until June 30, 2024 so this will be an issue for approx 6 years.

Additionally, many of the SIEM style analysis products use BSD syslog as a transport to consume system messages so this may be an issue until these commercial products support structured inputs.

Well, even if we fix this upstream, the fix won't make its way into RHEL-6, because RHEL-6 only accepts urgent fixes at this point..

Makes me sad that it won't get back to Redhat 6 ... anyhow. Looking at your journald json data

SYSLOG_IDENTIFIER" : "sssd[be[redhat.com]]"

if this were to be forwarded to a central log server or processing framework using either the new or old syslog RFC, or any other logging transport (Logstash, Flume, fluentd etc.) this identifier still needs to be parsed into its constituent parts at the receiving end. This identifier is supposed to be

APP-NAME

possibly followed by a

[PROCID]

According to RFC5424 this is "The process ID is a value that is included in the message, having no interoperable meaning, except that a change in the value indicates there has been a discontinuity in syslog reporting."

Usually this is an actual Process Id Number, but it should not include any user provided data, and though there is not supposed to be a limit on this string, in practice nesting square brakets in this field results in problems.

I would still recommend using an actual process ID inside the [] and move the be[name] data to the start of the message.

Other options are to use SYSLOG_IDENTIFIER(s) of

sssd
sssd_nss
sssd_pam
sssd_be

and move the real/domain name to the beginning of the MESSAGE

If the log messages from all of sssd, sssd_nss, sssd_pam and sssd_be are all very similar (the same) then it might make more sense to use the same SYSLOG_IDENTIFIER with the nss/pam/be portion becoming a variable component of MESSAGE.

The rational for this is that post processing of the log data stream is often broken down by PROGRAM name first, then by MESSAGE patterns. If the program is different for each of these, then the MESSAGE patterns need to be repeated for each PROGRAM name.

It really comes down to how seamlessly sssd can integrate into logging frameworks like splunk, loggly, Log Stash or any other post processing and storage retrieval framework. The SYSLOG_IDENTIFIER currently used creates a challenge for all of these frameworks.

Makes me sad that it won't get back to Redhat 6 ... anyhow. Looking at your journald json data
SYSLOG_IDENTIFIER" : "sssd[be[redhat.com]]"

There is no any reason to parseSYSLOG_IDENTIFIER. Because it shoudl be already part of SSSD_DOMAIN. But I think there is a bug (at least in 1.16.0 that it is empty). And you can use "_COMM" for identification of process.

The main point is that there is a difference between parsing plain text from rsyslog and structured data (json) from journald.

{
        "__CURSOR" : "s=d9f05c2c64cc4e03b97d739f4569ef0f;i=fcd32;b=17e10b540f644cdd81bff17960b7a8ae;m=9f8790cb17;t=54a514ed5e658;x=bc0d1779400524d0",
        "__REALTIME_TIMESTAMP" : "1489087959000664",
        "__MONOTONIC_TIMESTAMP" : "685174213399",
        "_BOOT_ID" : "17e10b540f644cdd81bff17960b7a8ae",
        "PRIORITY" : "6",
        "_TRANSPORT" : "journal",
        "_UID" : "0",
        "_GID" : "0",
        "_SYSTEMD_SLICE" : "system.slice",
        "_MACHINE_ID" : "52d5476f250e4edb9221538b3585f1b3",
        "_HOSTNAME" : "host.example.com",
        "SYSLOG_FACILITY" : "3",
        "CODE_FILE" : "src/util/sss_log.c",
        "CODE_LINE" : "111",
        "CODE_FUNC" : "sss_log_internal",
        "MESSAGE" : "Starting up",
        "SSSD_DOMAIN" : "",
        "_CAP_EFFECTIVE" : "1001a050ef",
        "_SYSTEMD_CGROUP" : "/system.slice/sssd.service",
        "_SYSTEMD_UNIT" : "sssd.service",
        "_SYSTEMD_INVOCATION_ID" : "43273d9845024726abd193715add68ae",
        "_SELINUX_CONTEXT" : "system_u:system_r:sssd_t:s0",
        "_COMM" : "sssd_be",
        "_EXE" : "/usr/libexec/sssd/sssd_be",
        "SYSLOG_IDENTIFIER" : "sssd[be[implicit_files]]",
        "_PID" : "28703",
        "_CMDLINE" : "/usr/libexec/sssd/sssd_be --domain implicit_files --uid 974 --gid 973 --debug-to-files",
        "_SOURCE_REALTIME_TIMESTAMP" : "1489087959000463"
}

A agree that there is a huge difference in parsing the data from text or structured data streams, but until such time as all of the log stream processing engines catch up, working with the sssd logs are a challenge. As you say, for structured data consumers there should not be any need to parse the SYSLOG_IDENTIFIER so there should not be any resistance to change its format. There are still a lot of non-structured data consumers and for them the content of the SYSLOG_IDENTIFIER is significant. I would hope that the sssd developers would have an interest in a smooth integration with the existing log processing frameworks.

Regardless of what I think, the documentation for the journald fields states

SYSLOG_IDENTIFIER - Syslog compatibility field containing the identifier string (i.e. "tag")

The tag that sssd is using is not syslog compatible.

I hope this is fixed since we fixed upstream #3433. Please reopen if not.

Metadata Update from @jhrozek:
- Issue close_status updated to: duplicate
- Issue status updated to: Closed (was: Open)

6 years ago

On Fedora 27 with sssd 1.16.0 the syslog lines are now of the format

2018-04-27T09:06:35.000-07:00 host daemon.info sssd[secrets][28504]: Starting up
2018-04-27T09:09:05.000-07:00 host daemon.info sssd[kcm][28540]: Starting up

indicating that the syslog ident is "sssd[kcm][28540]"

This ident is also not complient with the legacy syslog ident requirements.

Can this issue be reviewed again please?

Metadata Update from @erempel:
- Issue status updated to: Open (was: Closed)

5 years ago

Metadata Update from @pbrezina:
- Issue tagged with: Future milestone

4 years ago

SSSD is moving from Pagure to Github. This means that new issues and pull requests
will be accepted only in SSSD's github repository.

This issue has been cloned to Github and is available here:
- https://github.com/SSSD/sssd/issues/4590

If you want to receive further updates on the issue, please navigate to the github issue
and click on subscribe button.

Thank you for understanding. We apologize for all inconvenience.

Metadata Update from @pbrezina:
- Issue close_status updated to: cloned-to-github
- Issue status updated to: Closed (was: Open)

3 years ago

Login to comment on this ticket.

Metadata