Issue #57: CI pipeline: The split of stdout and stderr log is unreadable - general

fedora-ci / general

#57 CI pipeline: The split of stdout and stderr log is unreadable

Closed 4 years ago by astepano. Opened 4 years ago by churchyard.

Ie recently found out, that the logs in artifacts folder now separate stdout and stderr:
https://jenkins-continuous-infra.apps.ci.centos.org/job/fedora-f29-build-pipeline/536/artifact/package-tests/logs/PASS-str_smoke.log
https://jenkins-continuous-infra.apps.ci.centos.org/job/fedora-f29-build-pipeline/536/artifact/package-tests/logs/PASS-str_smoke-err.log

This is now not very readable. The stderr log contains all the commands, when we run sh -eux, while stdout contains all the important output. There is no way to see what output belongs to what command.

psss commented 4 years ago

I also think that having stdout & stderr together is better for easier review and asked @astepano for that but he has a different opinion.

bgoncalv commented 4 years ago

Right, I also would prefer to have a single log, but the split was discussed on this PR: https://pagure.io/standard-test-roles/pull-request/335

churchyard commented 4 years ago

I've also commented there. What should I do next to undo this?

astepano commented 4 years ago

@churchyard I don't know anything about buffered or unbuffered, but the split made the logs totally unreadable.
I think the issue is in: I don't know anything about buffered or unbuffered.

@bgoncalv : If you want to mix buffered and unbuffered . You know what to do. But, I will be against this with very strong position. You can gather people and push a PR with confirmation of other people, without my ACK.

astepano commented 4 years ago

The real issue is in test-case. Test case should care about its output. What goes to STDERR what goes to STDOUT.

bookwar commented 4 years ago

@astepano

I think the issue is in: I don't know anything about buffered or unbuffered.

This is not really a reasonable argument. If you want to make a point, please explain it.

The reason why we need logs at all, is to allow people to read and analyze them. The technical implementation details are important, but the decision must be driven by the question whether or not it is useful for the reader of the log.

Logs can be designed in various ways. The comment (https://pagure.io/standard-test-roles/pull-request/335#comment-84574) refers implicitly to the logs similar to Apache or other service environemnts, where each log line has a timestamp and much more information on where this line comes from.

This logic obviously doesn't work in the case from the top post, or in a generic bash-script case, where log lines are not structured and just work as lines of texts.

While we can argue about how logs should look like, the goal of the STR is to run various tests and test frameworks universally. Thus we can not assume the log timestamps to be available.

There could be a possibility to add timestamps to log lines, but unless we see the actual issue coming from the error logs and stdout in one file, I think it doesn't worth the trouble.

To you have an example for STR where mixing logs actually creates an issue?

astepano commented 4 years ago

refers implicitly to the logs similar to Apache or other service environemnts, where each log line has a timestamp and much more information on where this line comes from.

What is Apache service?

If this HTTPD server that it splits Error logs:

 ls /var/log/httpd/ -1
access_log
access_log-20190519
access_log-20190527
access_log-20190602
access_log-20190609
error_log
error_log-20190519
error_log-20190527
error_log-20190602
error_log-20190609

bookwar commented 4 years ago

What is Apache service?

The Apache HTTP Server, colloquially called Apache

If this HTTPD server that it splits Error logs:
ls /var/log/httpd/ -1
access_log
access_log-20190519
access_log-20190527
access_log-20190602
access_log-20190609
error_log
error_log-20190519
error_log-20190527
error_log-20190602
error_log-20190609

Yes, and it is exactly my point. Servers split error logs because
1) logs are formatted and timestamped
Each log line has additional information on where it comes from and when

2) these are logs for a service
There is no sequence of actions. All log lines are (more or less) independent from each other.

This is very different from the test logs or logs from generic bash scripts. In case of bash script we have a sequence of commands, which print certain information. The order of steps and order of lines in the log is important. If we split logs of a bash script we lose important information, the link between informational message as the execution flow enters a certain step and the error message which happened in this step.

Much closer example for STR would be mock:
https://github.com/rpm-software-management/mock/issues/266

The stdout and stderr are merged there (as they always were), while stderr now has additional label. This could be also a nice option in STR, btw.

astepano commented 4 years ago

@bookwar :

This is very different from the test logs or logs from generic bash scripts. In case of bash script we have a sequence of commands, which print certain information. The order of steps and order of lines in the log is important. If we split logs of a bash script we lose important information, the link between informational message as the execution flow enters a certain step and the error message which happened in this step.

STDERR is unbuffered.
STDERR from command will be out of sync STDOUT.
Position of STDERR text will be out of sync with commands run.

How do you want to solve situation with STDERR?

churchyard commented 4 years ago

Let me put it differently: it used to work for me, now it doesn't. From the user POV, it's a regression created to solve a problem I don't have. The logs from the tests are exactly as I want them, but they only make sense when read together. As I use bash with -eux all commands go to stderr but when their standard output goes to another log, this is totally unreadable mess.

You say the problem is I don't know about buffered or unbuffered. I say the problem is you made an unfortunate change without considering actual use cases of your users, against the advice of other developers working on the project.

Edited 4 years ago by churchyard

dperpeet commented 4 years ago

@astepano @bookwar While we can argue the technical points here, the Standard Test Roles seem to behave significantly differently than before. While @churchyard is the reporter, there are bound to be others impacted (as seen from the discussion on the changing PR).

I see a few obvious choices:
1. Open a PR quickly to merge the output (see @bookwar's example)
2. Come up with a documented explanation or technical fix that allows users to use the logs in a way they expect
3. Open a PR quickly to produce both merged and split output (e.g. -combined, -stderr, -stdout)

Let's fix the user experience instead of devolving into too many technical arguments, and please don't limit the choice to what I listed here. :)

churchyard commented 4 years ago

Thank you.

Metadata Update from @bookwar:
- Issue assigned to bookwar

4 years ago

churchyard commented 4 years ago

This appears stalled. Is it forgotten or does it wait for resources (people)?

astepano commented 4 years ago

Let's make 3 log files, STDOUT, STDERR, STDOUT+STDERR.
I will look into a bit later.

astepano commented 4 years ago

Proposed fix https://pagure.io/standard-test-roles/pull-request/374

astepano commented 4 years ago