My new-package SCM request was processed a few minutes ago. In a hurry, I cloned the nodejs-he package and tried to "git push", but it gave this traceback:
[parag@f26 master]$ git push
Counting objects: 5, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (4/4), done.
Writing objects: 100% (5/5), 2.11 KiB | 0 bytes/s, done.
Total 5 (delta 0), reused 0 (delta 0)
remote: Emitting a message to the fedmsg bus.
remote: * Publishing information for 1 commits
remote: * Notifying alternative-arch people
remote: Traceback (most recent call last):
remote: File "./hooks/post-receive-chained.d/post-receive-alternativearch", line 205, in <module>
remote: run_as_post_receive_hook()
remote: File "./hooks/post-receive-chained.d/post-receive-alternativearch", line 197, in run_as_post_receive_hook
remote: full_change=full_change
remote: UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 1395: ordinal not in range(128)
To ssh://pkgs.fedoraproject.org/rpms/nodejs-he
3138f43..43232e3 master -> master
Should I worry about this?
Metadata Update from @pingou: - Issue assigned to pingou
No need to worry, but it looks like our git hook doesn't properly support UTF-8, sorry about that.
I'll look at it
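For anyone puzzled by the traceback, here is a minimal sketch of the failure, assuming the Python 2 hook ends up combining a byte string of git output with a unicode string, which triggers an implicit ASCII decode. The snippet performs that decode explicitly (written for Python 3); the sample text and the helper name decode_ascii are made up for illustration.

```python
# Hypothetical sample: text containing a right single quote (U+2019),
# whose UTF-8 encoding starts with the byte 0xe2.
raw = "Don\u2019t panic".encode("utf-8")


def decode_ascii(data):
    """Attempt an ASCII decode; return the error message, or None on success."""
    try:
        data.decode("ascii")
    except UnicodeDecodeError as exc:
        return str(exc)
    return None


# The ASCII decode fails on the first byte above 127 ...
error = decode_ascii(raw)
# ... while decoding the same bytes as UTF-8 succeeds.
text = raw.decode("utf-8")

print(error)
print(text)
```

Any commit message, author name, or diff line with a non-ASCII character would hit the same path, which would explain why only some pushes show the error.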
Okay, then I will proceed with "fedpkg build" and merging the master branch into f26. Thanks.
FWIW, UCS-2/UTF-16 also appear broken in the same way.
Also reported in https://pagure.io/fedora-infrastructure/issue/6278
Also reported in https://pagure.io/fedora-infrastructure/issue/6327
Having just seen and reported this in 6594, the error is currently at line 198, where the string full_change is referenced.
I think it may be enough to mark this variable as a unicode string. It's passed via TEXT to send_email(). The TEXT var is already a unicode string. So perhaps all that's needed is to treat full_change as unicode as well?
(It's possible that the change (and/or diffs) variable which holds the diffs which match the arch patterns also needs to be handled as unicode, though it is less likely that those lines would contain non-ascii chars.)
Attachment: 0001-git-post-receive-alternativearch-treat-diff-as-unico.patch
The one line change included in the above git patch is simply:
- full_change = ''
+ full_change = u''
It's completely untested, of course. I don't have an environment to test the full hook script and didn't have time to strip it down to a self-contained test case. :)
I trimmed down the post-receive-alternativearch script to test locally. Doing so showed me that simply treating full_change as unicode was not enough to fix the issue. Encoding the output from git in read_output() fixes the issue for my test version of the hook.
Attachment: 0001-git-post-receive-alternativearch-use-unicode-for-git.patch (Also available at https://pagure.io/fedora-ansible/c/d4d3135ab)
It may not strictly be necessary to define full_change as unicode, but it shouldn't hurt to be explicit about the intent.
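To illustrate the idea (not the actual patch, which is linked above), here is a rough sketch of a read_output()-style helper that turns the bytes coming back from a child process into text at the boundary, so everything appended to full_change is already unicode. The signature, the encoding parameter, and the errors="replace" choice are assumptions, and a small Python child process stands in for the real git command. Written for Python 3.

```python
import subprocess
import sys


def read_output(cmd, encoding="utf-8"):
    """Run cmd and return its stdout as text rather than raw bytes.

    Hypothetical stand-in for the hook's helper; the real signature may differ.
    """
    raw = subprocess.check_output(cmd)
    # errors="replace" is an assumption: undecodable bytes become U+FFFD
    # instead of raising, so one odd byte cannot break the push output.
    return raw.decode(encoding, errors="replace")


# Stand-in for a git command: a child process that writes UTF-8 bytes.
cmd = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.buffer.write('Miro Hron\\u010dok'.encode('utf-8'))",
]

full_change = ""
full_change += read_output(cmd)
print(full_change)
```

Decoding once, where the bytes enter the program, means the later string handling never has to guess an encoding.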
Counting objects: 6, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (6/6), done.
Writing objects: 100% (6/6), 776 bytes | 776.00 KiB/s, done.
Total 6 (delta 4), reused 0 (delta 0)
remote: Emitting a message to the fedmsg bus.
remote: * Publishing information for 2 commits
remote: * Notifying alternative-arch people
remote: Traceback (most recent call last):
remote: File "/usr/share/git-core/post-receive-alternativearch", line 206, in <module>
remote: run_as_post_receive_hook()
remote: File "/usr/share/git-core/post-receive-alternativearch", line 198, in run_as_post_receive_hook
remote: full_change=full_change
remote: UnicodeDecodeError: 'ascii' codec can't decode byte 0xc4 in position 66: ordinal not in range(128)
remote: Sending to redis to log activity and send commit notification emails
remote: Detailed log of new commits:
remote:
remote:
remote: * commit 27657069ff6fedfdeedc1770738261f4662213bd
remote: * Author: Miro Hrončok <miro@hroncok.cz>
remote: * Date: Wed Dec 27 15:58:25 2017 +0100
remote: *
remote: * Temporarily skip test_socket on ix86
remote: *
remote: * commit a6780ec4355a7d2691bf8441b76b77186fb1b22d
remote: * Author: Miro Hrončok <miro@hroncok.cz>
remote: * Date: Wed Dec 27 15:57:17 2017 +0100
remote: *
remote: * Enable JIT on power and s390x
To ssh://pkgs.fedoraproject.org/rpms/pypy3
2b72df5..2765706 master -> master
I suspect that the č in my name is to blame here.
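A quick check supports that suspicion: č (U+010D) encodes in UTF-8 as the two bytes 0xc4 0x8d, and 0xc4 is the byte named in the traceback. The snippet below (Python 3) only inspects the author name on its own, so the index it finds is relative to the name and not the position 66 reported by the hook.

```python
# Author name from the push output, with the č written as an escape.
name = "Miro Hron\u010dok"
encoded = name.encode("utf-8")

# UTF-8 encoding of the single character č.
offending = "\u010d".encode("utf-8")

# Index of the first byte that the ASCII codec cannot handle.
first_bad = next(i for i, byte in enumerate(encoded) if byte > 127)

print(offending)                # the two bytes that encode č
print(hex(encoded[first_bad]))  # the byte an ASCII decode would choke on
```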
May I suggest switching to Python 3 to avoid this kind of problem? :P
If no one has time to test a proper fix for this issue, can we at least add a try/except/pass to avoid the error on pushes? I manage to hit this quite frequently.
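As a sketch of that stopgap, assuming the notification step can be wrapped in a single call: the wrapper below catches Unicode errors so the push output stays clean. It logs one line to stderr instead of a bare pass, which is a deliberate deviation from the suggestion so that failures remain visible. notify_safely and broken_notify are hypothetical names, not functions from the hook. Written for Python 3.

```python
import sys


def notify_safely(notify, *args, **kwargs):
    """Call notify; on a Unicode error, log one line and return False."""
    try:
        notify(*args, **kwargs)
    except UnicodeError as exc:
        # UnicodeDecodeError and UnicodeEncodeError both derive from UnicodeError.
        sys.stderr.write("alternative-arch notification skipped: %s\n" % exc)
        return False
    return True


def broken_notify(data):
    """Stand-in for the failing step: an ASCII decode of UTF-8 bytes."""
    data.decode("ascii")


ok = notify_safely(broken_notify, b"Hron\xc4\x8dok")
print(ok)
```

This only hides the symptom; the notification is still lost for any push that contains non-ASCII text.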
I also don't think the hook should be including the full commit messages. The subject should be enough. (I imagine that most people only write a single line commit message and therefore don't see copious text spit back by this hook.) It's quite excessive, IMO. My most recent push spit 154 lines back to me, for example: https://paste.fedoraproject.org/paste/-x06Rk4qrjZfo21Vgpi5TA
Using git log --date=short --format='%h %s (%an, %ad)' <commit-ish> would provide more succinct output.
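If the hook were to adopt that format, the call might look roughly like the sketch below (Python 3). The helper names short_log_command and short_log are hypothetical, and the commit range is borrowed from the pypy3 push earlier in this ticket. Passing the arguments as a list means the format string needs no shell quoting.

```python
import subprocess


def short_log_command(commit_range):
    """Build the argv for a one-line-per-commit git log."""
    return [
        "git",
        "log",
        "--date=short",
        "--format=%h %s (%an, %ad)",
        commit_range,
    ]


def short_log(commit_range):
    """Run git log in the current repository and return its output as text."""
    raw = subprocess.check_output(short_log_command(commit_range))
    # Decode at the boundary so non-ASCII author names survive.
    return raw.decode("utf-8", errors="replace")


cmd = short_log_command("2b72df5..2765706")
print(cmd)
```

short_log() must be run inside a git repository that contains the given range; only the command construction is exercised here.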
Could someone ping me on IRC when ready to test this? I'd prefer to be around when testing the fix :)
(Me or someone else; it doesn't have to be me, but I'd prefer someone to be around when we test/add it.)
Patch applied and, from our tests in prod and staging, it is working.
Many thanks to @tmz for the patch and to everyone for your patience, sorry it took so long to fix this :(
Metadata Update from @pingou: - Issue close_status updated to: Fixed