#48978 Add new logging severity level to the errors log
Closed: fixed None Opened 3 years ago by mreynolds.

There is a need to improve/clarify our error logging. It can be hard to tell when a message is an error, or something benign. Adding a new severity parameter would resolve this. The new logging type should follow https://tools.ietf.org/html/rfc3164

We don't necessarily have to use all these levels, but the levels we use should comply with this list. This can also be used if logging is redirected to journald

           0       Emergency: system is unusable
           1       Alert: action must be taken immediately
           2       Critical: critical conditions
           3       Error: error conditions
           4       Warning: warning conditions
           5       Notice: normal but significant condition
           6       Informational: informational messages
           7       Debug: debug-level messages

Example:

slapi_log_error(SLAPI_LOG_REPL, LOG_ERROR, repl_subsystem, "message\n");

Logged as:

[02/Sep/2016:10:38:11.188881144 -0400] NSMMReplicationPlugin - ERROR - message

We should also add a new logging level config setting that can filter based on the severity. This should work independently of the normal log level: trace function calls, replication, plugin, etc. So even if we don't have replication logging enabled, we will still log that replication message if the severity is set.


The latest version of the syslog rfc is https://tools.ietf.org/html/rfc5424

But in our code we should just use the macros defined in syslog.h as described in man 3 syslog:
{{{
level
This determines the importance of the message. The levels are, in
order of decreasing importance:

   LOG_EMERG      system is unusable

   LOG_ALERT      action must be taken immediately

   LOG_CRIT       critical conditions

   LOG_ERR        error conditions

   LOG_WARNING    warning conditions

   LOG_NOTICE     normal, but significant, condition

   LOG_INFO       informational message

   LOG_DEBUG      debug-level message

   The function setlogmask(3) can be used to restrict logging to specified
   levels only.

}}}

Refractoring tasks:

  • LDAPDebug() needs to be replaced with slapi_log_error()
  • 330 files will need to be updated
  • 7793 function calls need to be updated

SLAPI_LOG_FATAL\LDAP_DEBUG_ANY = 3683
SLAPI_LOG_PLUGIN = 2035
SLAPI_LOG_TRACE = 1437
SLAPI_LOG_REPL = 604
SLAPI_LOG_ACL = 232
SLAPI_LOG_FILTER = 158
SLAPI_LOG_CONNS = 96
SLAPI_LOG_BACKLDBM = 95
SLAPI_LOG_ARGS = 76
SLAPI_LOG_CONFIG = 58
SLAPI_LOG_CACHE = 35
SLAPI_LOG_HOUSE = 17
SLAPI_LOG_SHELL = 6
SLAPI_LOG_BER = 1

I also suggest we keep it to 4 severity levels:

  • LOG_ERR
  • LOG_WARNING
  • LOG_INFO
  • LOG_DEBUG

I'm going to do a mass find/replace of all the functions and add the same severity level to each logging type. So all replication logging will be LOG_DEBUG, all FATAL with be LOG_ERR, etc. Once that is committed, then we can go and fine tune each function call in batches.

Replying to [comment:3 mreynolds]:

Refractoring tasks:

  • LDAPDebug() needs to be replaced with slapi_log_error()

This will remove the ability to compile without those logging statements. This may have performance implications.

  • 330 files will need to be updated
  • 7793 function calls need to be updated

SLAPI_LOG_FATAL\LDAP_DEBUG_ANY = 3683
SLAPI_LOG_PLUGIN = 2035
SLAPI_LOG_TRACE = 1437
SLAPI_LOG_REPL = 604
SLAPI_LOG_ACL = 232
SLAPI_LOG_FILTER = 158
SLAPI_LOG_CONNS = 96
SLAPI_LOG_BACKLDBM = 95
SLAPI_LOG_ARGS = 76
SLAPI_LOG_CONFIG = 58
SLAPI_LOG_CACHE = 35
SLAPI_LOG_HOUSE = 17
SLAPI_LOG_SHELL = 6
SLAPI_LOG_BER = 1

I also suggest we keep it to 4 severity levels:

  • LOG_ERR
  • LOG_WARNING
  • LOG_INFO
  • LOG_DEBUG

Why not use all of the syslog(3) log levels and macros?

Replying to [comment:4 rmeggins]:

Replying to [comment:3 mreynolds]:

Refractoring tasks:

  • LDAPDebug() needs to be replaced with slapi_log_error()

This will remove the ability to compile without those logging statements. This may have performance implications.

Okay, well we can just update it use the new severity level then. No big deal.

I also suggest we keep it to 4 severity levels:

  • LOG_ERR
  • LOG_WARNING
  • LOG_INFO
  • LOG_DEBUG

Why not use all of the syslog(3) log levels and macros?

I was just trying to keep it simple for the end user. Less to filter on I suppose. In syslog there are 4 different types of errors - I was just trying to make it into a single error type. Less confusing for customers, but that's also dumbed down I suppose (not maximizing its potential).

We can use all of them. No problem.

Thanks for the feedback!

These first few patches are just setting the default severity levels. Anything with LDAP_DEBUG_ANY/SLAPI_LOG_FATAL gets set to LOG_ERR, otherwise its set to LOG_DEBUG. Part 2 & 3 and very large patches.

Next, add new logging level parameter to filter on severity.

After that, fine tune LOG_ERR's. For example in main.c many error messages are informational.

Your patches look good.

Sorry, to bring it up now... I've found these related bugs/tickets. They look like a dup of your effort. But it has some extra requirements. For instance, how about adding a config option to select printing the severity in a string or a digit for #48260?
Bug 1257459/Ticket #48260 - Add classification information to error logs and a way to process them in an automated fashion.
Bug 1369592 - RFE - Option to add errorlog level for specific component.

Replying to [comment:7 nhosoi]:

Your patches look good.

Sorry, to bring it up now... I've found these related bugs/tickets. They look like a dup of your effort. But it has some extra requirements. For instance, how about adding a config option to select printing the severity in a string or a digit for #48260?
Bug 1257459/Ticket #48260 - Add classification information to error logs and a way to process them in an automated fashion.
Bug 1369592 - RFE - Option to add errorlog level for specific component.

I'll look into this. I'm far from done with this ticket - I just wanted to commit things in phases.

Great. If that's the case you have my ack on
0001-Ticket-48978-Update-the-logging-function-to-accept-s.patch
0001-Ticket48978-refactor-LDAPDebug.patch​
0001-Ticket-47978-Refactor-slapi_log_error.patch​

11a5b1e..125486d master -> master
commit 125486d Part 3
commit b763121 Part 2
commit 051e0eb Part 1

Investigating ticket #48260 next...

Will this eliminate the LDAPDebug*() macros? I would be happy to see them all disappear ....

My biggest concern is how we will determine what logs to show. If you have:

{{{
slapi_log_error(SLAPI_LOG_REPL, LOG_ERROR, repl_subsystem, "message\n");
}}}

Do we show this at the normal 8192 level? Or only if repl log in enabled? I think that we could just have some better log levels, and change the default perhaps to show a few of them. Perhaps even this really is exposing the issue that we have log messages being issue to the wrong level in the first place .... The logging code is already a train wreck, I think this makes it more complex ....

Actually, reading the first patch, had I of seen this in time, I would NACK this.

This is misleading to the developer. When I put in

{{{
slapi_log_error(SLAPI_LOG_REPL, LOG_ERROR, repl_subsystem, "message\n");
}}}

I expect that message to be yelled from the rooftops as an error. Something has seriously gone wrong, LOG_ERROR!

But that's not what this patch does. If the log level is not set to recieve LOG_REPL, this will go into the ether! That's not what we want!

The real issue, is log messages being put to the wrong levels, and that we don't have a SLAPI_LOG_INFO, SLAPI_LOG_DEBUG etc. I don't like this approach I'm sorry.

I agree with William, if we label someting as error it should be always logged, if it is only logged under a specific debug level it should be a warning.

Another area. If I understand it correctly, now the startup ..; Listening ..., shutdown .., Waiting for threads all we be ERROR, but it is no error. Maybe NOTICE would be better:

LOG_NOTICE normal, but significant, condition

Replying to [comment:13 firstyear]:

Actually, reading the first patch, had I of seen this in time, I would NACK this.

This is misleading to the developer. When I put in

{{{
slapi_log_error(SLAPI_LOG_REPL, LOG_ERROR, repl_subsystem, "message\n");
}}}

Sorry this was just an example - there are currently no occurrences of this pattern.

I expect that message to be yelled from the rooftops as an error. Something has seriously gone wrong, LOG_ERROR!

But that's not what this patch does. If the log level is not set to recieve LOG_REPL, this will go into the ether! That's not what we want!

The real issue, is log messages being put to the wrong levels, and that we don't have a SLAPI_LOG_INFO, SLAPI_LOG_DEBUG etc. I don't like this approach I'm sorry.

Replying to [comment:14 lkrispen]:

I agree with William, if we label someting as error it should be always logged, if it is only logged under a specific debug level it should be a warning.

Another area. If I understand it correctly, now the startup ..; Listening ..., shutdown .., Waiting for threads all we be ERROR, but it is no error. Maybe NOTICE would be better:

LOG_NOTICE normal, but significant, condition

Sorry if I was not clear in my comments in this ticket, but I will reiterate.

These patches were the first stage of factoring the logging. I stated that we NOW need to fine tune the messages. There is much more work needed in this ticket.

After doing the refactoring, I realized that all verbose logging will be DEBUG or something similar(but never ERROR), and we need to re-evaluate all the FATAL messages. For example, I stated in comment 6, we need to fine tune the FATAL messages in main.c, etc. I plan on working on this today.

Everyone, please ignore this example below - this is will never happen.

{{{
slapi_log_error(SLAPI_LOG_REPL, LOG_ERROR, repl_subsystem, "message\n");
}}}

You can use __func__ to get the name of the current function. Looks like ipa is using that in more recent code. See http://www.delorie.com/gnu/docs/gcc/cpp_21.html

e.g. instead of
{{{
LDAPDebug(LDAP_DEBUG_ANY, LOG_ERR, "check_oid - Non numeric oid %s passed to check_oid\n",oid,0,0);
slapi_log_error(SLAPI_LOG_FATAL, LOG_ERR, "slapi_be_setentrypoint",
580 "Unknown entry point %d\n", entrypoint);
}}}
use
{{{
LDAPDebug(LDAP_DEBUG_ANY, LOG_ERR, func " - Non numeric oid %s passed to " func "\n",oid,0,0);
slapi_log_error(SLAPI_LOG_FATAL, LOG_ERR, func, ...
}}}
however, it says that "... Neither of them (func or FUNCTION) is a macro; the preprocessor does not know the name of the current function" so you might not be able to use string pasting as in the check_oid example, but you can use it as an argument to printf %s.

Replying to [comment:19 rmeggins]:

You can use __func__ to get the name of the current function. Looks like ipa is using that in more recent code. See http://www.delorie.com/gnu/docs/gcc/cpp_21.html

e.g. instead of
{{{
LDAPDebug(LDAP_DEBUG_ANY, LOG_ERR, "check_oid - Non numeric oid %s passed to check_oid\n",oid,0,0);
slapi_log_error(SLAPI_LOG_FATAL, LOG_ERR, "slapi_be_setentrypoint",
580 "Unknown entry point %d\n", entrypoint);
}}}
use
{{{
LDAPDebug(LDAP_DEBUG_ANY, LOG_ERR, func " - Non numeric oid %s passed to " func "\n",oid,0,0);
slapi_log_error(SLAPI_LOG_FATAL, LOG_ERR, func, ...
}}}
however, it says that "... Neither of them (func or FUNCTION) is a macro; the preprocessor does not know the name of the current function" so you might not be able to use string pasting as in the check_oid example, but you can use it as an argument to printf %s.

That's actually very cool, and maintenance free... However, redoing this patch might consume another day of work. Should I redo it, or just use this technique on the next round of patches?

You can use func to get the name of the current function. Looks like ipa is using that in more recent code. See ​http://www.delorie.com/gnu/docs/gcc/cpp_21.html

I don't think this is a good idea to do at this point. func will likely involve some stack magic or something, so I could see it being costly...

Everyone, please ignore this example below - this is will never happen.

{{{
slapi_log_error(SLAPI_LOG_REPL, LOG_ERROR, repl_subsystem, "message\n");
}}}

In the patch I see

{{{
slapi_log_error(SLAPI_LOG_FATAL, LOG_ERR, "process_bulk_import_op", "NULL entry\n");
}}}

So what's the point of the LOG_ERR? We already have the message at FATAL, we are just "doubly" expresing fatal.

The point of this patch apparently is to make it when you say "LOG_ERR" that it puts the word "ERROR" at the start of the line.

Why not just infer that from the SLAPI_LOG_* type? It would save changing the api, and it would have the same effect? It seems like we are double handling here ....

Replying to [comment:21 firstyear]:

You can use func to get the name of the current function. Looks like ipa is using that in more recent code. See ​http://www.delorie.com/gnu/docs/gcc/cpp_21.html

I don't think this is a good idea to do at this point. func will likely involve some stack magic or something, so I could see it being costly...

Then let's quantify that cost, and weigh that cost vs. readability/maintainability.

Everyone, please ignore this example below - this is will never happen.

{{{
slapi_log_error(SLAPI_LOG_REPL, LOG_ERROR, repl_subsystem, "message\n");
}}}

In the patch I see

{{{
slapi_log_error(SLAPI_LOG_FATAL, LOG_ERR, "process_bulk_import_op", "NULL entry\n");
}}}

So what's the point of the LOG_ERR? We already have the message at FATAL, we are just "doubly" expresing fatal.

The point of this patch apparently is to make it when you say "LOG_ERR" that it puts the word "ERROR" at the start of the line.

Why not just infer that from the SLAPI_LOG_* type? It would save changing the api, and it would have the same effect? It seems like we are double handling here ....

Replying to [comment:20 mreynolds]:

Replying to [comment:19 rmeggins]:

You can use __func__ to get the name of the current function. Looks like ipa is using that in more recent code. See http://www.delorie.com/gnu/docs/gcc/cpp_21.html

e.g. instead of
{{{
LDAPDebug(LDAP_DEBUG_ANY, LOG_ERR, "check_oid - Non numeric oid %s passed to check_oid\n",oid,0,0);
slapi_log_error(SLAPI_LOG_FATAL, LOG_ERR, "slapi_be_setentrypoint",
580 "Unknown entry point %d\n", entrypoint);
}}}
use
{{{
LDAPDebug(LDAP_DEBUG_ANY, LOG_ERR, func " - Non numeric oid %s passed to " func "\n",oid,0,0);
slapi_log_error(SLAPI_LOG_FATAL, LOG_ERR, func, ...
}}}
however, it says that "... Neither of them (func or FUNCTION) is a macro; the preprocessor does not know the name of the current function" so you might not be able to use string pasting as in the check_oid example, but you can use it as an argument to printf %s.

That's actually very cool, and maintenance free... However, redoing this patch might consume another day of work. Should I redo it, or just use this technique on the next round of patches?

Let's find out how much it actually costs.

Thank you so much for adding the severity, Mark! I really like to see LOG_EMERG, LOG_ALERT, LOG_CRIT, etc. along with the messages.

I have some requests like this.
These functions are in #ifdef FOR_DEBUGGING, better be LOG_DEBUG instead of LOG_ERR?
{{{
connection_table_dump_active_connection (Connection c)
connection_table_dump_active_connections (Connection_Table
ct)
}}}
ditto...
{{{
248 248 #ifdef FOR_DEBUGGING
249 slapi_log_error(SLAPI_LOG_FATAL, LOG_ERR, "connection", "Moving connection out of active list\n");
249 slapi_log_error(SLAPI_LOG_FATAL, LOG_ERR, "connection_table_move_connection_out_of_active_list", "Moving connection out of active list\n");
250 250 connection_table_dump_active_connection (c);
251 251 #endif
}}}
ditto...
{{{
307 309 #ifdef FOR_DEBUGGING
308 slapi_log_error(SLAPI_LOG_FATAL, LOG_ERR, "connection", "Moving connection into active list\n");
310 slapi_log_error(SLAPI_LOG_FATAL, LOG_ERR, "connection_table_move_connection_on_to_active_list",
311 "Moving connection into active list\n");
309 312 connection_table_dump_active_connection (c);
310 313 #endif
}}}
These are even further fine tunings... Since you are still working on the big transition, should we wait for the task done? Thanks!

Replying to [comment:21 firstyear]:

So what's the point of the LOG_ERR? We already have the message at FATAL, we are just "doubly" expresing fatal.

The point of this patch apparently is to make it when you say "LOG_ERR" that it puts the word "ERROR" at the start of the line.

It's actually "ERR" after we use the syslog mappings.

see /usr/include/sys/syslog.h

Why not just infer that from the SLAPI_LOG_* type? It would save changing the api, and it would have the same effect? It seems like we are double handling here ....

The whole point of all of this is that SLAPI_LOG_FATAL is too general, and there is no easy way sort/parse the errors log. We log info messages, warnings, errors, severe errors, and debug messages. all using SLAPI_LOG_FATAL. There is no easy way to determine what is serious and what is not. Customers get confused by this ALL the time. So is there a bit of double handling, of course there is, but the benefit far out weights it.

So I think you are implying that we should add new logging levels? The problem with that, is that they need to be set in the config and are not logged by default. We are also taking advantage of the syslog error codes, and moving towards a more standard way of logging errors.

Replying to [comment:24 nhosoi]:

Thank you so much for adding the severity, Mark! I really like to see LOG_EMERG, LOG_ALERT, LOG_CRIT, etc. along with the messages.

I have some requests like this.
These functions are in #ifdef FOR_DEBUGGING, better be LOG_DEBUG instead of LOG_ERR?
{{{
connection_table_dump_active_connection (Connection c)
connection_table_dump_active_connections (Connection_Table
ct)
}}}
ditto...
{{{
248 248 #ifdef FOR_DEBUGGING
249 slapi_log_error(SLAPI_LOG_FATAL, LOG_ERR, "connection", "Moving connection out of active list\n");
249 slapi_log_error(SLAPI_LOG_FATAL, LOG_ERR, "connection_table_move_connection_out_of_active_list", "Moving connection out of active list\n");
250 250 connection_table_dump_active_connection (c);
251 251 #endif
}}}
ditto...
{{{
307 309 #ifdef FOR_DEBUGGING
308 slapi_log_error(SLAPI_LOG_FATAL, LOG_ERR, "connection", "Moving connection into active list\n");
310 slapi_log_error(SLAPI_LOG_FATAL, LOG_ERR, "connection_table_move_connection_on_to_active_list",
311 "Moving connection into active list\n");
309 312 connection_table_dump_active_connection (c);
310 313 #endif
}}}
These are even further fine tunings... Since you are still working on the big transition, should we wait for the task done? Thanks!

You're right they should be LOG_DEBUG I'll go back and up this patch, and make sure I follow this moving forward. Thanks.

Replying to [comment:19 rmeggins]:

You can use __func__ to get the name of the current function. Looks like ipa is using that in more recent code. See http://www.delorie.com/gnu/docs/gcc/cpp_21.html

e.g. instead of
{{{
LDAPDebug(LDAP_DEBUG_ANY, LOG_ERR, "check_oid - Non numeric oid %s passed to check_oid\n",oid,0,0);
slapi_log_error(SLAPI_LOG_FATAL, LOG_ERR, "slapi_be_setentrypoint",
580 "Unknown entry point %d\n", entrypoint);
}}}
use
{{{
LDAPDebug(LDAP_DEBUG_ANY, LOG_ERR, func " - Non numeric oid %s passed to " func "\n",oid,0,0);
slapi_log_error(SLAPI_LOG_FATAL, LOG_ERR, func, ...
}}}
however, it says that "... Neither of them (func or FUNCTION) is a macro; the preprocessor does not know the name of the current function" so you might not be able to use string pasting as in the check_oid example, but you can use it as an argument to printf %s.

Okay, so it only works using it as a printf arg, and it only works with LDAPDebug().

{{{
LDAPDebug(LDAP_DEBUG_ANY, LOG_INFO, "%s - Message...\n", func, 0, 0);
}}}

It does not work in slapi_log_error() for the subsystem. So if we implemented this we would have inconsistencies between the two logging functions. Plus this adds an arg to the limited list of args for LDAPDebug(). Some calls already use up all the available args in LDAPDebug(). Based on these issues I'm not sure this is worth implementing after all.

Replying to [comment:4 rmeggins]:

Replying to [comment:3 mreynolds]:

Refractoring tasks:

  • LDAPDebug() needs to be replaced with slapi_log_error()

This will remove the ability to compile without those logging statements. This may have performance implications.

Can you elaborate on this? Wouldn't this just impact the compile time (not the run time)?

It would just be nice to have a single logging function, and slapi_log_error() is much better than LDAPDebug().

New patch attached. Applied Noriko's requests...

Replying to [comment:27 mreynolds]:

Replying to [comment:19 rmeggins]:

You can use __func__ to get the name of the current function. Looks like ipa is using that in more recent code. See http://www.delorie.com/gnu/docs/gcc/cpp_21.html

e.g. instead of
{{{
LDAPDebug(LDAP_DEBUG_ANY, LOG_ERR, "check_oid - Non numeric oid %s passed to check_oid\n",oid,0,0);
slapi_log_error(SLAPI_LOG_FATAL, LOG_ERR, "slapi_be_setentrypoint",
580 "Unknown entry point %d\n", entrypoint);
}}}
use
{{{
LDAPDebug(LDAP_DEBUG_ANY, LOG_ERR, func " - Non numeric oid %s passed to " func "\n",oid,0,0);
slapi_log_error(SLAPI_LOG_FATAL, LOG_ERR, func, ...
}}}
however, it says that "... Neither of them (func or FUNCTION) is a macro; the preprocessor does not know the name of the current function" so you might not be able to use string pasting as in the check_oid example, but you can use it as an argument to printf %s.

Okay, so it only works using it as a printf arg, and it only works with LDAPDebug().

{{{
LDAPDebug(LDAP_DEBUG_ANY, LOG_INFO, "%s - Message...\n", func, 0, 0);
}}}

It does not work in slapi_log_error() for the subsystem.

That's really strange. freeipa is using it in a very similar fashion. Why doesn't it work?

So if we implemented this we would have inconsistencies between the two logging functions. Plus this adds an arg to the limited list of args for LDAPDebug(). Some calls already use up all the available args in LDAPDebug(). Based on these issues I'm not sure this is worth implementing after all.

We would have to add another macro (e.g. LDAPDebugFunc) with an argument just for this.

Replying to [comment:28 mreynolds]:

Replying to [comment:4 rmeggins]:

Replying to [comment:3 mreynolds]:

Refractoring tasks:

  • LDAPDebug() needs to be replaced with slapi_log_error()

This will remove the ability to compile without those logging statements. This may have performance implications.

Can you elaborate on this? Wouldn't this just impact the compile time (not the run time)?

If LDAP_DEBUG is not defined at compile time, the LDAPDebug macros will evaluate to this:
{{{
/
debugging stuff /
/
Disable by default */

define LDAPDebug( level, fmt, arg1, arg2, arg3 )

define LDAPDebugLevelIsSet( level ) (0)

define LDAPDebug0Args( level, fmt )

define LDAPDebug1Arg( level, fmt, arg )

define LDAPDebug2Args( level, fmt, arg1, arg2 )

}}}
That is, they will evaluate to nothing. Which means they won't be called at all at runtime, because there is no code. If we have some very performance sensitive code that we want to test, we can undefine LDAP_DEBUG to completely eliminate calls to LDAPDebug*.

It would just be nice to have a single logging function, and slapi_log_error() is much better than LDAPDebug().

I agree. It is confusing to have both LDAPDebug and slapi_log_error.

Replying to [comment:31 rmeggins]:

  • LDAPDebug() needs to be replaced with slapi_log_error()

This will remove the ability to compile without those logging statements. This may have performance implications.

Can you elaborate on this? Wouldn't this just impact the compile time (not the run time)?

If LDAP_DEBUG is not defined at compile time, the LDAPDebug macros will evaluate to this:
{{{
/
debugging stuff /
/
Disable by default */

define LDAPDebug( level, fmt, arg1, arg2, arg3 )

define LDAPDebugLevelIsSet( level ) (0)

define LDAPDebug0Args( level, fmt )

define LDAPDebug1Arg( level, fmt, arg )

define LDAPDebug2Args( level, fmt, arg1, arg2 )

}}}
That is, they will evaluate to nothing. Which means they won't be called at all at runtime, because there is no code. If we have some very performance sensitive code that we want to test, we can undefine LDAP_DEBUG to completely eliminate calls to LDAPDebug*.

Ahhh okay. We can add a DEFINE to slapi_log_error() to accomplish the same thing (except it will impact all logging). If this is okay, I'd like to replace the LDAPDebug, and update slapi_log_error with a DISABLE_LOGGING define.

Replying to [comment:30 rmeggins]:

It does not work in slapi_log_error() for the subsystem.

That's really strange. freeipa is using it in a very similar fashion. Why doesn't it work?

It does work, I got a compiler warning(char vs const char), but I originally thought it was an error.

Lets create a new ticket for this to be investigated. I am also dropping the new error log severity message logging level for now. That can be another ticket if it's deemed useful to have such a new logging level - as it is now very easy just to grep the errors log for the new severity levels.

Replying to [comment:32 mreynolds]:

Replying to [comment:31 rmeggins]:

  • LDAPDebug() needs to be replaced with slapi_log_error()

This will remove the ability to compile without those logging statements. This may have performance implications.

Can you elaborate on this? Wouldn't this just impact the compile time (not the run time)?

If LDAP_DEBUG is not defined at compile time, the LDAPDebug macros will evaluate to this:
{{{
/
debugging stuff /
/
Disable by default */

define LDAPDebug( level, fmt, arg1, arg2, arg3 )

define LDAPDebugLevelIsSet( level ) (0)

define LDAPDebug0Args( level, fmt )

define LDAPDebug1Arg( level, fmt, arg )

define LDAPDebug2Args( level, fmt, arg1, arg2 )

}}}
That is, they will evaluate to nothing. Which means they won't be called at all at runtime, because there is no code. If we have some very performance sensitive code that we want to test, we can undefine LDAP_DEBUG to completely eliminate calls to LDAPDebug*.

Ahhh okay. We can add a DEFINE to slapi_log_error() to accomplish the same thing (except it will impact all logging). If this is okay, I'd like to replace the LDAPDebug, and update slapi_log_error with a DISABLE_LOGGING define.

If that works. In general, you can't have a #define replace a varadic function. You have to do something like this:
{{{

if LDAP_DEBUG

define slapi_log_error((x))

else

define slapi_log_error((x)) slapi_log_error_real((x))

endif

...

slapi_log_error((SLAPI_LOG_FATAL, "subsystem", "%s %s ..."));

}}}
The extra open and close () are significant.

It's possible there is now a C/gcc standard or extension that allows you to #define a varadic function.

Replying to [comment:33 mreynolds]:

Replying to [comment:30 rmeggins]:

It does not work in slapi_log_error() for the subsystem.

That's really strange. freeipa is using it in a very similar fashion. Why doesn't it work?

It does work, I got a compiler warning(char vs const char), but I originally thought it was an error.

Lets create a new ticket for this to be investigated. I am also dropping the new error log severity message logging level for now. That can be another ticket if it's deemed useful to have such a new logging level - as it is now very easy just to grep the errors log for the new severity levels.

+1

Replying to [comment:34 rmeggins]:

Replying to [comment:32 mreynolds]:

Replying to [comment:31 rmeggins]:

  • LDAPDebug() needs to be replaced with slapi_log_error()

This will remove the ability to compile without those logging statements. This may have performance implications.

Can you elaborate on this? Wouldn't this just impact the compile time (not the run time)?

If LDAP_DEBUG is not defined at compile time, the LDAPDebug macros will evaluate to this:
{{{
/
debugging stuff /
/
Disable by default */

define LDAPDebug( level, fmt, arg1, arg2, arg3 )

define LDAPDebugLevelIsSet( level ) (0)

define LDAPDebug0Args( level, fmt )

define LDAPDebug1Arg( level, fmt, arg )

define LDAPDebug2Args( level, fmt, arg1, arg2 )

}}}
That is, they will evaluate to nothing. Which means they won't be called at all at runtime, because there is no code. If we have some very performance sensitive code that we want to test, we can undefine LDAP_DEBUG to completely eliminate calls to LDAPDebug*.

Ahhh okay. We can add a DEFINE to slapi_log_error() to accomplish the same thing (except it will impact all logging). If this is okay, I'd like to replace the LDAPDebug, and update slapi_log_error with a DISABLE_LOGGING define.

If that works. In general, you can't have a #define replace a varadic function. You have to do something like this:
{{{

if LDAP_DEBUG

define slapi_log_error((x))

else

define slapi_log_error((x)) slapi_log_error_real((x))

endif

...

slapi_log_error((SLAPI_LOG_FATAL, "subsystem", "%s %s ..."));

}}}
The extra open and close () are significant.

It's possible there is now a C/gcc standard or extension that allows you to #define a varadic function.

Hmmm, I haven't tried this, but I thought we could do something like this:

{{{
slapi_log_error( int loglevel, int severity, char subsystem, char fmt, ... )
{

if defined LOGGING_ENABLED

/* all the logging code */

endif

return;

}
}}}

But this is still making a function call - not quite the same thing as the LDAPDebug macro. I'll do some research on the varadic function in C/gcc...

Replying to [comment:25 mreynolds]:

Replying to [comment:21 firstyear]:

So what's the point of the LOG_ERR? We already have the message at FATAL, we are just "doubly" expresing fatal.

The point of this patch apparently is to make it when you say "LOG_ERR" that it puts the word "ERROR" at the start of the line.

It's actually "ERR" after we use the syslog mappings.

see /usr/include/sys/syslog.h

So why not have a table to map SLAPI_LOG_type -> syslog type? That would be better here.

Why not just infer that from the SLAPI_LOG_* type? It would save changing the api, and it would have the same effect? It seems like we are double handling here ....

The whole point of all of this is that SLAPI_LOG_FATAL is too general, and there is no easy way sort/parse the errors log. We log info messages, warnings, errors, severe errors, and debug messages. all using SLAPI_LOG_FATAL. There is no easy way to determine what is serious and what is not. Customers get confused by this ALL the time. So is there a bit of double handling, of course there is, but the benefit far out weights it.

So I think you are implying that we should add new logging levels? The problem with that, is that they need to be set in the config and are not logged by default. We are also taking advantage of the syslog error codes, and moving towards a more standard way of logging errors.

Yes, that is my point. What is the problem we are trying to solve here: That we use SLAPI_LOG_FATAL too much because we don't have the log levels we need to express our messages, and that the default level is incorrect even if we had the other levels.

Instead, lets not make it more complex, we have to bite the bullet and just fix the default level, and the available log levels. I think we are actually moving to a more complex scenario. For example, at default log level, what happens here:

{{{
SLAPI_LOG_PLUGIN, LOG_DEBUG
SLAPI_LOG_PLUGIN, LOG_ERR
SLAPI_LOG_FATAL, LOG_INFO
SLAPI_LOG_FATAL, LOG_ERR
}}}

The fact that I have to really question what will happen to these messages, based on the combination of both flags is the immediate warning that this is the wrong approach.

So, I would suggest we add some new levels such as:

{{{
SLAPI_LOG_DEBUG
SLAPI_LOG_INFO
SLAPI_LOG_WARNING
}}}

And that we make the default error level set to show INFO, WARNING, and FATAL out of the box.

Some of the comments in this thread indicate some question about this, and perhaps backing out I'm not sure. Either way, I'm still not satisfied with this approach I'm sorry :(

If you just use slapi_log_error as a function, any arguments you pass in will still be evaluated. e.g. if you do something like this:
{{{
slapi_log_error(SLAPI_LOG_FATAL, something, "%d\n", expensive_function_that_returns_an_int());
}}}
The expensive function will be evaluated regardless. The nice thing about a macro like LDAPDebug is that you can completely compile out this code.

I'll do some research on the varadic function in C/gcc...

https://en.wikipedia.org/wiki/Variadic_macro

Replying to [comment:37 firstyear]:

Some of the comments in this thread indicate some question about this, and perhaps backing out I'm not sure. Either way, I'm still not satisfied with this approach I'm sorry :(

I don't think you've been following/aware of what's going on here. I will recap what is going on, but everyone else is on board with the changes that are being made (unless something changed over the weekend that I'm not aware of).

First, we are NOT doing any severity level log filtering. No new configurable log level is going to be used, or logging behavior changed. It isn't really needed (not yet). What this ticket is doing is just standardizing the logging format and improving error classification. The logging is going to look like:

{{{
<timestamp> - <log level=""> - <plugin function(aka="" subsystem)=""> - <message>
}}}

That's all that is going on here. This will finally allow for some basic parsing of the errors log, and clarity of what the message really means. The log levels are defined in syslog, and I have used every level in my work so far. Here are some examples:

{{{
[09/Sep/2016:11:54:42.416337604 -0400] - INFO - ldbm_back_ldbm2index - userroot: Indexed 50000 entries (99%).
[09/Sep/2016:11:54:42.430673504 -0400] - INFO - ldbm_back_ldbm2index - userroot: Finished indexing.
[10/Sep/2016:23:46:07.624057716 -0400] - NOTICE - ldbm_back_start - userroot: entry cache size 10485760 B is less than db size 102645760 B; We recommend to increase the entry cache size nsslapd-cachememsize.

Plugins will be:

[10/Sep/2016:23:46:07.624057716 -0400] - ERR - NSMMReplicationPlugin - replica_config_modify: replica already exists
[10/Sep/2016:23:46:07.624057717 -0400] - INFO - NSMMReplicationPlugin - cleanruv_task: cleaning rid (99)...
}}}

So, the only thing I did in the code was add a sev_level to the logging functions, and then cleaned up/standardized all the logging messages. I'm also like 75% complete in this task.

Moving forward I will be looking into replacing LDAPDebug as well, but this phase needs to be completed first. But, if all of this still really bothers you then please bring it up with Nathan.

Replying to [comment:38 rmeggins]:

If you just use slapi_log_error as a function, any arguments you pass in will still be evaluated. e.g. if you do something like this:
{{{
slapi_log_error(SLAPI_LOG_FATAL, something, "%d\n", expensive_function_that_returns_an_int());
}}}
The expensive function will be evaluated regardless. The nice thing about a macro like LDAPDebug is that you can completely compile out this code.

I'll do some research on the varadic function in C/gcc...

https://en.wikipedia.org/wiki/Variadic_macro

I don't want to add more macros for this. Macros cause headaches with linking to other langs(which I am especially interested in): You basically make a function which can never be exported or used, but has huge useful purpose. I would like to avoid this please.

Replying to [comment:40 firstyear]:

Replying to [comment:38 rmeggins]:

If you just use slapi_log_error as a function, any arguments you pass in will still be evaluated. e.g. if you do something like this:
{{{
slapi_log_error(SLAPI_LOG_FATAL, something, "%d\n", expensive_function_that_returns_an_int());
}}}
The expensive function will be evaluated regardless. The nice thing about a macro like LDAPDebug is that you can completely compile out this code.

I'll do some research on the varadic function in C/gcc...

https://en.wikipedia.org/wiki/Variadic_macro

I don't want to add more macros for this. Macros cause headaches with linking to other langs(which I am especially interested in): You basically make a function which can never be exported or used, but has huge useful purpose. I would like to avoid this please.

If you don't want to add more macros, what do you want to do with the nice-to-have, possibly critical-to-have-in-some-situations LDAPDebug* macros, that can't be used if we add/change the logging functions?

There is still more to do. Need to investigate the removal of LDAPDebug, etc.

Mark,

Your patch looks good. I compared with 11a5b1e, which is just before the first patch for this ticket. It gave me pretty clean diffs that mainly contains refined log level and messages with the new definitions/logging mechanisms. (Also, I believe the message part has been already reviewed in the previous patches.)

I'd like to ask one request on ssl.c. In the file, for some reason, 5 messages start with "Warning:" which were automatically set to SLAPI_LOG_WARNING. I'd think they are all errors. Could you please replace SLAPI_LOG_WARNING with SLAPI_LOG_ERR?

These look rather SLAPI_LOG_CRIT to me... :)
{{{
+ slapi_log_error(SLAPI_LOG_WARNING, "Security Initialization",
+ "warn_if_no_cert_file: certificate DB file cert8.db nor cert7.db exists in [%s] - "
"SSL initialization will likely fail\n", dir);

  • slapi_log_error(SLAPI_LOG_WARNING, "Security Initialization",
  • "warn_if_no_key_file: key DB file %s does not exist - SSL initialization will "
    "likely fail\n", filename);

  • slapi_log_error(SLAPI_LOG_WARNING, "Security Initialization",

  • "slapd_nss_init: The key/cert database directory [%s] is not writable by "
    "the server uid [%s]: initialization likely to fail.\n",
    certdir, serveruser);
    }}}
    Other than that, you have my ack.

Replying to [comment:43 nhosoi]:

I'd like to ask one request on ssl.c. In the file, for some reason, 5 messages start with "Warning:" which were automatically set to SLAPI_LOG_WARNING. I'd think they are all errors. Could you please replace SLAPI_LOG_WARNING with SLAPI_LOG_ERR?

Done, new patch attached!

Great! Thank you so much, Mark! The new log looks really cool:
{{{
[..] - INFO - slapd_extract_cert - CA CERT NAME: CA certificate
[..] - WARN - SSL alert: Sending pin request to SVRCore. You may need to run systemd-tty-ask-password-agent to provide the password.
[..] - INFO - slapd_extract_cert - SERVER CERT NAME: Server-Cert1
[..] - INFO - SSL info: Configured NSS Ciphers
[..] - INFO - SSL info: TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256: enabled
[..] - INFO - SSL info: TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256: enabled
........................
[..] - INFO - Security Initialization - slapd_ssl_init2 - Configured SSL version range: min: TLS1.1, max: TLS1.2
[..] - INFO - 389-Directory/1.3.6.0.20160920git73635f5 B2016.264.2035 starting up
[..] - INFO - slapd started. Listening on All Interfaces port 389 for LDAP requests
[..] - INFO - Listening on All Interfaces port 10637 for LDAPS requests
}}}
Do you need some more acks to push, Mark? I'm leaving the Review status to review? for now...

Replying to [comment:45 nhosoi]:

Do you need some more acks to push, Mark? I'm leaving the Review status to review? for now...

I don't think so, yours would be great :-)

2b45d58..fb9b9ef master -> master
commit fb9b9ef
Author: Mark Reynolds mreynolds@redhat.com
Date: Tue Sep 20 16:25:51 2016 -0400

I know that it's already acked and merged. This looks fantastic, exactly what I had in mind. I really appreciate all the work you put into this, and think this will make logging far better for us in the future!

New patch fixes the CI tests that failed because the error logging changed.

Thank you, Mark! I've checked and all three test suites are passed!

Replying to [comment:51 spichugi]:

Thank you, Mark! I've checked and all three test suites are passed!

Thanks Simon!

fb9b9ef..b3a1675 master -> master
commit b3a1675
Author: Mark Reynolds mreynolds@redhat.com
Date: Mon Sep 26 10:25:44 2016 -0400

1b48622..04a9c89 master -> master
commit 04a9c89
Author: Mark Reynolds mreynolds@redhat.com
Date: Wed Sep 28 14:11:03 2016 -0400

Arrghh! Too late again.

Nack to this change! Making functions as macros is going to break portability with external tools (like rust plugins) because they can't consume these macros! We should not be offering macros as a public interface.

I think this is better as either a function wrapper that just routes messages according to the defines in place instead. This way we have a function as the interface, and we can hide the defines and still have the functionality behind.

In the future, I'm going to start saying no to more macro-function additions, as they never end well IMO.

Second the addition of portable.h will break consumers of slapi-plugin.h because we don't distribute this in the -devel rpm.

Replying to [comment:56 firstyear]:

Arrghh! Too late again.

Nack to this change! Making functions as macros is going to break portability with external tools (like rust plugins) because they can't consume these macros! We should not be offering macros as a public interface.

I think this is better as either a function wrapper that just routes messages according to the defines in place instead. This way we have a function as the interface, and we can hide the defines and still have the functionality behind.

In the future, I'm going to start saying no to more macro-function additions, as they never end well IMO.

So we NEED the macro function since we are replacing the LDAPDebug functions, and we don't want to lose the functionality of being able to remove the presence of logging for performance testing. I'm sorry if this bothers you, but it's not going away. I will leave slapi_log_error() as the "real" function in slapi-plugin.h, and use the macro version in our server code. Then I can remove portable.h from slapi-plugin.h

I'm currently porting all the LDAPDebug functions to slapi_log_error() Once that's done then I will do a refactor to switch around the macro vs real function.

Replying to [comment:58 mreynolds]:

So we NEED the macro function since we are replacing the LDAPDebug functions, and we don't want to lose the functionality of being able to remove the presence of logging for performance testing. I'm sorry if this bothers you, but it's not going away. I will leave slapi_log_error() as the "real" function in slapi-plugin.h, and use the macro version in our server code.
+1

Then I can remove portable.h from slapi-plugin.h
That's good for this logging improvement project. I grepped portable.h in our code and indeed the header file is included most of the plug-ins. Is it really needed? The file looks really old...

I'm happy with this. LDAPDebug I hope can be removed completely soon, and this fixes the ABI/API issues with slapi_log_error. Ack from me, but it's a large change so maybe someone else wants to look too to be sure.

Q1. The second arg is supposed to be a plugin name?

The collation may not be following the rule?
{{{
diff --git a/ldap/servers/plugins/collation/collate.c b/ldap/servers/plugins/collation/collate.c

  • slapi_log_err(SLAPI_LOG_ERR, "collation_config",
    "%s: line %lu ignored: only %lu arguments (expected "
    "collation language country variant strength decomposition oid ...)\n",
    fname, (unsigned long)lineno, (unsigned long)cargc );
    [...]

diff --git a/ldap/servers/plugins/collation/orfilter.c b/ldap/servers/plugins/collation/orfilter.c

  • slapi_log_err(SLAPI_LOG_FILTER, "or_filter_create", "(oid %s; type %s)\n",
  • mrOID, mrTYPE);
    }}}
    although the last 2 of orfilter sets SLAPI_ORPLUGIN_NAME.
    {{{
    if ( slapi_pblock_get( pb, SLAPI_CONFIG_DIRECTORY, &cfgpath ) != 0 ) {
  • slapi_log_error(SLAPI_LOG_ERR, SLAPI_ORPLUGIN_NAME,
  • "orderingRule_init - Unable to retrieve slapd configuration pathname; using default\n" );
  • slapi_log_err(SLAPI_LOG_ERR, SLAPI_ORPLUGIN_NAME,
  • "orderingRule_ini - Unable to retrieve slapd configuration pathname; using default\n");
    cfgpath = NULL;
    }
    }}}

Replying to [comment:62 nhosoi]:

Q1. The second arg is supposed to be a plugin name?

Yes, nice catch! I also removed ldaplog.h - which is where all the LDAPDebug* macros were stored. So now LDAPDebug is completely removed. I also confirmed IPA does not use LDAPDebug() in its plugins.

New patch attached.

Replying to [comment:64 firstyear]:

Looks great to me!
+1

Thanks!

5cc301f..d80d054 master -> master
commit d80d054
Author: Mark Reynolds mreynolds@redhat.com
Date: Tue Oct 11 11:16:02 2016 -0400

Fixed build error, but on F24 this error was somehow ignored by gcc

d80d054..cbd6168 master -> master
commit cbd6168

commit 9464bc8
Writing objects: 100% (20/20), 1.97 KiB | 0 bytes/s, done.
Total 20 (delta 18), reused 0 (delta 0)
To ssh://git.fedorahosted.org/git/389/ds.git
6a6bbc8..9464bc8 master -> master

Pushed for Mark to fix CI,

Looks good in general, but if I could ask... it'd be nice for the WinSync code to have something like "WinSync plugin" for the second arg instead of individual function such as "create_subtree_pairs". That means there are more places to apply the change. So, we could fix them later separately. :)
{{{
940 940 if (NULL == spp->DSsubtree) {
941 941 slapi_log_err(SLAPI_LOG_ERR,
942 "create_subtree_pairs: "
942 "create_subtree_pairs",
943 943 "Ignoring invalid DS subtree \"%s\".\n",
944 944 p0);
945 945 continue;
… … create_subtree_pairs(char **pairs)
947 947 spp->ADsubtree = slapi_sdn_new_dn_byval(p1);
948 948 if (NULL == spp->ADsubtree) {
949 949 slapi_log_err(SLAPI_LOG_ERR,
950 "create_subtree_pairs: "
950 "create_subtree_pairs",
}}}

Replying to [comment:70 nhosoi]:

Looks good in general, but if I could ask... it'd be nice for the WinSync code to have something like "WinSync plugin" for the second arg instead of individual function such as "create_subtree_pairs".

Fixed, new patch attached.

Excellent!! Thanks a lot, Mark!!

One question... I'm wondering from where these systemd related changes came and were added to the generated files (Makefile.in and configure). Both config.h.in and configure.ac do not have the corresponding change?
{{{
1543 @with_systemd_FALSE@WITH_SYSTEMD = 0
1544 @with_systemd_TRUE@WITH_SYSTEMD = 1
}}}
Thanks!

New patch attached. Via Williams comments I also removed the logging macro from slapi-plugin.h.

As for Makefile being updated, it is because on f24 we have a newer version of autoconf (at least with the latest diff)

Looks good, appreciate this moving to slapi-private.h.

73b91d8..1e1f6fe master -> master
commit 1e1f6fe
Author: Mark Reynolds mreynolds@redhat.com
Date: Tue Oct 18 11:40:57 2016 -0400

d945aba..3a5cc4d master -> master
commit 3a5cc4d
Author: Mark Reynolds mreynolds@redhat.com
Date: Thu Oct 20 15:47:11 2016 -0400

Another refactoring error

fdf265e..0ad289c master -> master
commit 0ad289c
Author: Mark Reynolds mreynolds@redhat.com
Date: Tue Nov 1 11:57:52 2016 -0400

It's a very minor thing :), but reslimit_update_from_entry eliminated fnname, but reslimit_get_registered_attributes and slapi_reslimit_get_integer_limit didn't. I think you could remove fnname from all of them to make it simpler? I'm afraid a compiler warning might be reported for fnname as unused variable in slapi_reslimit_get_integer_limit...
{{{
… … reslimit_update_from_entry( Slapi_Connection conn, Slapi_Entry e )
347 char fnname = "reslimit_update_from_entry()";
}}}
{{{
… … static char
reslimit_get_registered_attributes(void)
482 char
fnname = "slapi_reslimit_register()";
480 char fnname = "slapi_reslimit_register";
}}}
{{{
… … slapi_reslimit_get_integer_limit( Slapi_Connection
conn, int handle,
}}}

Replying to [comment:81 nhosoi]:

It's a very minor thing :), but reslimit_update_from_entry eliminated fnname, but reslimit_get_registered_attributes and slapi_reslimit_get_integer_limit didn't. I think you could remove fnname from all of them to make it simpler?

Thanks for the review, new patch attached

https://fedorahosted.org/389/attachment/ticket/48978/0001-Ticket-48978-Fix-more-log-refactoring-issues.patch

6ea27bf..3c537b0 master -> master
commit 3c537b0
Author: Mark Reynolds mreynolds@redhat.com
Date: Mon Nov 14 11:46:58 2016 -0500

I'm happy with the commit message.

I am happy to ack this fix, but with a large warning. Please do not use systemd-journald with DS, as it is not supported, and will not be in the near future. It may be removed soon.

commit 5063d0d
Writing objects: 100% (46/46), 6.66 KiB | 0 bytes/s, done.
Total 46 (delta 36), reused 2 (delta 0)
To ssh://git.fedorahosted.org/git/389/ds.git
b1f434e..251424e master -> master

Metadata Update from @mreynolds:
- Issue assigned to mreynolds
- Issue set to the milestone: 1.3.6.0

2 years ago

Metadata Update from @mreynolds:
- Custom field reviewstatus adjusted to review (was: ack)
- Issue close_status updated to: None (was: Fixed)

2 years ago

Metadata Update from @nhosoi:
- Custom field reviewstatus adjusted to ack (was: review)

2 years ago

Thanks Noriko

9804f8a..fbf320f master -> master

Metadata Update from @mreynolds:
- Custom field reviewstatus adjusted to review (was: ack)
- Issue close_status updated to: fixed

2 years ago

Shouldn't some of these calls to slapi_log_err be at the LOG_CRIT level? Sorry to be a pain ...

While the message used to say "critical", it was in fact "emergency" by definition (server fails to start)

But they are now at LOG_ERR level?

The emergency logging functions now use LOG_EMERG. The other's that said critical were not "critical".

Login to comment on this ticket.

Metadata