When buildbot runs a task which prints to both stdout and stderr, the output is not chronologically correct. Often stdout and stderr interleave badly: {F53218} Or e.g. stderr is delayed by several (tens of) lines, making it hard to decipher what exactly went wrong.
One possible solution that I and @mkrizek tried was to set usePTY=True for buildbot's [[ http://docs.buildbot.net/latest/manual/cfg-buildsteps.html#using-shellcommands | ShellCommand ]]. Similarly to our paramiko usage, that fixes the problem by using a pseudoterminal in which there is no longer a separate stdout/stderr stream, but only a single stream, which guarantees correct ordering. It works, but unfortunately seems to double all newlines printed from minion (but not newlines printed from overlord):
usePTY=True
[libtaskotron:remote_exec.py:168] 2016-05-24 11:32:01 INFO Installing 2 packages on remote host... [libtaskotron:remote_exec.py:116] 2016-05-24 11:32:01 DEBUG Running command on remote host: dnf --assumeyes --cacheonly --best install libtaskotron-core libtaskotron-fedora Cache-only enabled but no cache for 'fedora', disabling. Cache-only enabled but no cache for 'infrastructure-testing', disabling. Cache-only enabled but no cache for 'infrastructure', disabling. Package libtaskotron-core-0.4.12-1.fc25.noarch is already installed, skipping. Package libtaskotron-fedora-0.4.12-1.fc25.noarch is already installed, skipping. Dependencies resolved. Nothing to do. Complete! [libtaskotron:minion.py:85] 2016-05-24 11:32:02 INFO Copying task files onto the minion... [libtaskotron:remote_exec.py:284] 2016-05-24 11:32:02 DEBUG Recursively copying /etc/taskotron to root@192.168.226.176:/etc/taskotron/ ... [libtaskotron:remote_exec.py:116] 2016-05-24 11:32:02 DEBUG Running command on remote host: rm -rf /var/tmp/taskotron/taskdir && mkdir -p /var/tmp/taskotron/taskdir [libtaskotron:remote_exec.py:284] 2016-05-24 11:32:02 DEBUG Recursively copying /home/qa11.qa-7/slave/x86_64/build to root@192.168.226.176:/var/tmp/taskotron/taskdir ... [libtaskotron:remote_exec.py:187] 2016-05-24 11:32:02 DEBUG Writing data to root@192.168.226.176:/var/tmp/taskotron/taskdir/runtask.yml ... [libtaskotron:minion.py:124] 2016-05-24 11:32:02 INFO Executing the task on the minion... [libtaskotron:remote_exec.py:116] 2016-05-24 11:32:02 DEBUG Running command on remote host: cd /var/tmp/taskotron/taskdir && runtask --arch x86_64 --item vim-7.4.1835-1.fc25 --jobid x86_64/171927 --type koji_build --uuid 1570b306-21a3-11e6-9a59-525400571835 --local runtask.yml [libtaskotron] 11:32:02 INFO Execution started at: 2016-05-24 11:32:02 UTC [libtaskotron] 11:32:02 DEBUG Using libtaskotron 0.4.12 [libtaskotron] 11:32:02 DEBUG Parsed arguments: Namespace(arch=['x86_64'], debug=False, item='vim-7.4.1835-1.fc25', jobid='x86_64/171927', libvirt=False, local=True, no_destroy=False, override=[], patch=None, ssh=None, ssh_privkey=None, task='runtask.yml', type='koji_build', uuid='1570b306-21a3-11e6-9a59-525400571835') [libtaskotron] 11:32:02 DEBUG Using config file: /etc/taskotron/taskotron.yaml [libtaskotron] 11:32:02 DEBUG Using config profile: production [libtaskotron:logger.py:184] 2016-05-24 11:32:02 DEBUG Stream logging enabled with level: DEBUG [libtaskotron:logger.py:261] 2016-05-24 11:32:02 DEBUG File logging enabled with level DEBUG into: /var/log/taskotron/taskotron-root.log [libtaskotron:overlord.py:86] 2016-05-24 11:32:02 INFO Task artifacts will be saved in: /var/lib/taskotron/artifacts/1570b306-21a3-11e6-9a59-525400571835
http://taskotron-dev.fedoraproject.org/taskmaster/builders/x86_64/builds/171927/steps/runtask/logs/stdio
I have no idea why that is, but we decided with Martin to revert the change for the moment, until we can get a proper output.
Another idea was to modify the ShellCommand to include 2>&1 bash style redirection. That would ensure we only use a single stream. However, since ShellCommand executes a command directly, we would need to wrap everything like this:
ShellCommand
2>&1
sh -c 'runtask -i ... -t ... runtask.yaml 2>&1'
That brings troubles with shell escaping and increases security risks. So we decided to not use this approach either.
The point of this ticket is come up with something that will ensure correct chronological ordering of logged messages, and will not suffer with these disadvantages. So e.g. figure out why we have doubles newlines with usePTY=True in some cases and fix it or work around it.
(Please note that this implies we will no longer have stderr messages colored. I don't think it's a problem, because currently we can't use colors to detect problems anyway - all logging messages are printed to stderr, including debug messages, making it unfit for detecting problems. It's better to detect errors by parsing log files anyway.)
I created a RFE for BuildBot to allow us to avoid splitting streams: http://trac.buildbot.net/ticket/3557
Metadata Update from @kparal: - Issue tagged with: infrastructure
Log in to comment on this ticket.