#251 Obey debug_level when syslog()ing
Opened 2 months ago by rharwood. Modified a month ago
rharwood/gssproxy quiet_time  into  master

file modified
+8

@@ -150,6 +150,14 @@ 

              </varlistentry>

              <varlistentry>

                  <term>

+                     <option>--syslog-status</option>

+                 </term>

+                 <listitem>

+                     <para>Enable additional logging to syslog.</para>

+                 </listitem>

+             </varlistentry>

+             <varlistentry>

+                 <term>

                      <option>--version</option>

                  </term>

                  <listitem>

file modified
+10

@@ -371,6 +371,16 @@ 

                  </varlistentry>

  

                  <varlistentry>

+                     <term>syslog_status (boolean)</term>

+                     <listitem>

+                         <para>Enable per-call debugging output to the syslog.

+                         This may be useful for investigating problems in

+                         applications using gssproxy.</para>

+                         <para>Default: syslog_status = false</para>

+                     </listitem>

+                 </varlistentry>

+ 

+                 <varlistentry>

                      <term>trusted (boolean)</term>

                          <listitem><para>Defines whether this service is considered trusted. Use with caution, this enables impersonation.</para>

                              <para>Default: trusted = false</para>

file modified
+6

@@ -611,6 +611,12 @@ 

          goto done;

      }

  

+     ret = gp_config_get_string(ctx, "gssproxy", "syslog_status", &tmpstr);

+     if (ret == 0)

+         gp_syslog_status = gp_boolean_is_true(tmpstr);

+     else if (ret != ENOENT)

+         goto done;

+ 

      ret = gp_config_get_string(ctx, "gssproxy", "run_as_user", &tmpstr);

      if (ret == 0) {

          cfg->proxy_user = strdup(tmpstr);

file modified
+7 -2

@@ -5,6 +5,9 @@ 

  #include <stdio.h>

  #include <stdarg.h>

  

+ /* global logging switch */

+ bool gp_syslog_status = false;

+ 

  void gp_logging_init(void)

  {

      openlog("gssproxy",

@@ -55,7 +58,9 @@ 

  {

      char buf[MAX_LOG_LINE];

  

-     gp_fmt_status(mech, maj, min, buf, MAX_LOG_LINE);

+     if (!gp_syslog_status)

+         return;

  

-     GPERROR("%s\n", buf);

+     gp_fmt_status(mech, maj, min, buf, MAX_LOG_LINE);

+     syslog(LOG_DEBUG, "%s\n", buf);

  }

file modified
+3

@@ -3,9 +3,12 @@ 

  #ifndef _GP_LOG_H_

  #define _GP_LOG_H_

  

+ #include <stdbool.h>

  #include <syslog.h>

  #include <gssapi/gssapi.h>

  

+ extern bool gp_syslog_status;

+ 

  #define MAX_LOG_LINE 1024

  #define GPERROR(...) syslog(LOG_ERR, __VA_ARGS__);

  #define GPAUDIT(...) syslog(LOG_INFO, __VA_ARGS__);

file modified
+6

@@ -158,6 +158,7 @@ 

      int opt_version = 0;

      int opt_debug = 0;

      int opt_debug_level = 0;

+     int opt_syslog_status = 0;

      verto_ctx *vctx;

      verto_ev *ev;

      int wait_fd;

@@ -183,6 +184,8 @@ 

           _("Enable debugging"), NULL}, \

          {"debug-level", '\0', POPT_ARG_INT, &opt_debug_level, 0, \

           _("Set debugging level"), NULL}, \

+         {"syslog-status", '\0', POPT_ARG_NONE, &opt_syslog_status, 0, \

+          _("Enable GSSAPI status logging to syslog"), NULL}, \

          {"version", '\0', POPT_ARG_NONE, &opt_version, 0, \

           _("Print version number and exit"), NULL }, \

          POPT_TABLEEND

@@ -212,6 +215,9 @@ 

          gp_debug_toggle(opt_debug_level);

      }

  

+     if (opt_syslog_status)

+         gp_syslog_status = true;

+ 

      if (opt_daemon && opt_interactive) {

          fprintf(stderr, "Option -i|--interactive is not allowed together with -D|--daemon\n");

          poptPrintUsage(pc, stderr, 0);

Second commit is the real fix. First commit is a conceptual change to merge "logging" and "debugging".

Sorry but the reason why they are separate is that I wanted to make very clear that loggin and debugging should be two different things.
And should emit different things.

Can you give more explanation of the rationale ?

I definitely and, on purpose, recall not wanting debug going to syslog which is generally not the place for debugging logs as it may be sourced and aggregated in remote systems and debug may spit sensitive data like keys.

I guess my mistake was not to have also two completely separate files, perhaps we should fix that.

In developing the second patch (which I hope is okay?), I had a surprising amount of trouble figuring out where the messages were coming from. In normal operation, the "print" logging and the "syslog" logging get merged into one place, and if one doesn't touch the codebase frequently...

Additionally, the fix in the second patch causes the "syslog" system to depend on a variable (gp_debug) from the "debug" system, which made me think it made sense to merge them.

(I'd also be fine with merging the other direction - keep gp_log and remove gp_debug. I only did it this way for a smaller diff.)

I think the main issue is that I do not agree to merge debugging ad logging.
But perhaps I do not understand the problem, maybe ping me up for a high b/w discussion on that.

rebased onto 3b003dc

a month ago

Per out-of-band review, we agreed to use an off-by-default parameter to configure logging, and keep logging separate from debugging.

When investigating implementation, I determined that all messages I'm concerned with pass through gp_log_status(); thus, this parameter only controls those. Messages from GPERROR() are all things that an admin really should fix.

Additionally, there's only a single usage of GPAUDIT(); this isn't really a problem, but we might move it to GPERROR() for consistency.

Please double-check my variable/parameter names for clarity.

rebased onto 116618e

a month ago

So if I understand correctly, we still output errors to syslog in general, except now we make it a config option to log the status of "acquire cred" calls to syslog.
I think that is fine, am I right that debug statements still print the output at a deeper level so nothing is lost of debug purposes ?

Technically the debug logs are sufficient, but it's harder to follow without the syslogs. Because debugging and logging are separate, the two sets of logs are wholly disjoint. Here's my notes which might help explain why I wanted to tie the two systems together:

  • gp_log_status() is only called from a single place (in gp_creds.c), and will only show failures around credential acquisition. It is not called on success.
  • gp_log_failure() is also only called from a single place (in gp_rpc_indicate_mechs.c), and will only show failures around mech indication.
  • despite the name, gp_log_failure() sends output to debug, not syslog.
  • gp_log_failure() uses a prefix of "Failed with:", which no other functions do
  • gp_log_failure() will always log - it's not conditional on the value of gp_debug, unlike everything else that can send debug output.
  • GPAUDIT() is only used to indicate that a config directory couldn't be read - there are no other calls
  • almost everything logged with GPERROR() could be argued to be made fatal if we didn't have other constraints
  • GPERROR() syslogs at LOG_ERROR, while GPAUDIT() syslogs at LOG_INFO.
  • in light of that, it's confusing that GPDEBUG() sends to the debug output, rather than syslog at LOG_DEBUG.
  • there's a third logging system for the RPC layer, built on gp_debug_printf() (which always logs, but the RPC logging checks the value of gp_debug)

It's this RPC debugging layer that will log the return statuses, though they become harder to pick out, and haven't been run through gss_display_status().

Sounds all pretty confusing :-( Logging is hard.

Here are my intentions when I started:
- send logs to syslog
- send debug to a file
- audit is for when admin care only to see who does what, but no verbose logging is required

All the information is complementary, however it may make sense to replicate all logging in debug files for correlation purposes.
In particular my intention was that an admin would get enough info from logs to know what is failing and not require to crank up debugging unless directed by a support person.
Debugging can be as verbose as needed because it will never end in syslog.

If you are inclined to improve the situation in this PR I am ok with it, otherwise I am ok with the current patch for now, and to open an issue to adjust back al logging and debugging according to these guidelines if possible.

If not possible I would rather stop completely sending to syslog ever and only have debug logs.

I think the PR as it is now addresses my immediate issue (that admins don't like seeing failures here). However, it intentionally violates your assumption that "an admin would get enough info from logs to know what is failing".

I feared that, indeed.

What would you like to have happen here? Are there changes in approach or code you'd like to see from this PR? I'm happy to open such an issue, but I am trying to break one of your intentions here so I'm not sure where to go from here?

Any chance we can apply this patch downstream for now and revisit this issue in a more complete way for upstream ?

Well, this is upstream and I was hoping we could have a discussion to figure out what that would be :)

But yes, if it comes time to release and we don't have an easy solution, I can do the simple thing.