#8824 posts submitted to Fedora Users List getting lost.
Closed: Fixed 3 years ago by kevin. Opened 4 years ago by bier.

I've tried to submit a post to the Fedora Users List (users@lists.fedoraproject.org) three times in the last 17 hours, but they're neither showing up in the HYPERKITTY, nor showing up in the digests. I'm also not receiving any rejection messages, bounce messages, or users post acknowledgement messages. The failed attempts were at 8:06 P.M. (US mountain time) yesterday, 9:04 A.M. today, and 12:06 P.M. today.


With what address are you sending to the list?

Metadata Update from @kevin:
- Issue priority set to: Waiting on Assignee (was: Needs Review)
- Issue tagged with: lists

4 years ago

I send from "mattisonw@comcast.net". I use Thunderbird. All three posts were to the thread "upgrade attempts are failing.". I have the posts in my "Sent" folder, so I can get those to you if that would help.

I have looked at the logs on all our servers and the only times this email address has shown up has been for emails to this ticket and a couple of emails sent to it from mailing lists. I do not see any email from mattisonw arriving at any of the relays.

The last email I see arriving from the email address is on 2020-04-09 (all times UTC or subtract 6 hours for mountain time.)

[root@log01 ~][PROD]# egrep 4BD3F9805A /var/log/hosts/smtp-mm-osuosl01.vpn.fedoraproject.org/2020/04/09/mail.log
Apr  9 19:03:42 smtp-mm-osuosl01 postfix/smtpd[21665]: 4BD3F9805A: client=resqmta-po-10v.sys.comcast.net[2001:558:fe16:19:96:114:154:169]
Apr  9 19:03:42 smtp-mm-osuosl01 postfix/cleanup[21811]: 4BD3F9805A: message-id=<f998a117-c350-e43c-03f8-48bd551cf89c@comcast.net>
Apr  9 19:03:42 smtp-mm-osuosl01 postfix/qmgr[22191]: 4BD3F9805A: from=<mattisonw@comcast.net>, size=6285, nrcpt=1 (queue active)
Apr  9 19:03:42 smtp-mm-osuosl01 postfix/smtp[22014]: 4BD3F9805A: to=<users@lists.fedoraproject.org>, relay=mailman01.vpn.fedoraproject.org[192.168.1.118]:25, delay=0.67, delays=0.19/0.03/0.16/0.29, dsn=2.0.0, status=sent (250 2.0.0 Ok: queued as A298F5DC48815)
Apr  9 19:03:42 smtp-mm-osuosl01 postfix/qmgr[22191]: 4BD3F9805A: removed

But then they are dying in mailman:

Apr 09 17:06:39 2020 (4789) ACCEPT: <4dfff3f3-3830-5e19-cb4d-8e4fbeb6bbc9@comcast.net> {'to_list': True, 'lang': 'en', 'envsende
r': 'noreply@lists.fedoraproject.org', 'rule_hits': [], 'version': 3, 'listid': 'users.lists.fedoraproject.org', 'received_time'
: datetime.datetime(2020, 4, 9, 17, 6, 38, 502658), '_parsemsg': False, 'original_size': 29237, 'rule_misses': ['dmarc-mitigatio
n', 'no-senders', 'approved', 'emergency', 'loop', 'banned-address', 'member-moderation', 'header-match-config-1', 'header-match
-config-2', 'header-match-config-3', 'header-match-users.lists.fedoraproject.org-0', 'header-match-users.lists.fedoraproject.org
-1', 'header-match-users.lists.fedoraproject.org-2', 'nonmember-moderation', 'administrivia', 'implicit-dest', 'max-recipients',
 'max-size', 'news-moderation', 'no-subject', 'suspicious-header']}
Apr 09 17:07:15 2020 (4791) Exception in the HyperKitty archiver: 'ascii' codec can't encode character '\ufffd' in position 124:
 ordinal not in range(128)
Apr 09 17:07:15 2020 (4791) Traceback (most recent call last):
  File "/usr/lib/python3.4/site-packages/mailman_hyperkitty/__init__.py", line 154, in _archive_message
    url = self._send_message(mlist, msg)
  File "/usr/lib/python3.4/site-packages/mailman_hyperkitty/__init__.py", line 192, in _send_message
    message_text = msg.as_string()
  File "/usr/lib64/python3.4/email/message.py", line 159, in as_string
    g.flatten(self, unixfrom=unixfrom)
  File "/usr/lib64/python3.4/email/generator.py", line 112, in flatten
    self._write(msg)
  File "/usr/lib64/python3.4/email/generator.py", line 178, in _write
    self._dispatch(msg)
  File "/usr/lib64/python3.4/email/generator.py", line 211, in _dispatch
    meth(msg)
  File "/usr/lib64/python3.4/email/generator.py", line 240, in _handle_text
    msg.set_payload(payload, charset)
  File "/usr/lib64/python3.4/email/message.py", line 316, in set_payload
    payload = payload.encode(charset.output_charset)
UnicodeEncodeError: 'ascii' codec can't encode character '\ufffd' in position 124: ordinal not in range(128)

I don't suppose any of the mails you were sending had a \ufffd in them? ie: https://www.fileformat.info/info/unicode/char/fffd/index.htm

How do I get one of those messages to you? I can't attach a non-image to this comment.

I used ordinary keyboard characters, and a link pasted in from google docs. I don't know what HYPERKITTY put in there (when I click the reply and use email software buttons), or what Thunderbird put in there, or what comcast put in there.

I used Thunderbird's "View Source" function to view the most recent of the "lost" messages. Then I screen-captured that. It's attached. I notice a few occurrences, between sentences, of what looks like an upper case 'A' with a "hat". Might those be the trouble-makers?
listmessage.png

Well, it's definitely a bug in hyperkitty not handling it, but yeah, it might be those things in the screenshot. Can you try sending a 'text only' email and see if that also gets dropped?

done. and it got through.

Maybe a clue:
This problem started happening immediately after I upgraded from f30 to f31 (Thursday, April 09). I don't know if that upgrade included an upgrade of Thunderbird.

I did more digging. Messages that I posted to the "upgrade attempts are failing." thread before my upgrade from f30 to f31 did contain A-hat characters, and those messages did not get lost or blocked. So apparently, the A-hat characters are not the cause of the problem.

A few Fedora users list members responded to my test message. Their comments might be helpful. I hope you're paying attention to those.

@bier, can you send a mail to me directly so I can take a look at the headers, encoding and whatnot? The address is my user name @redhat.com.

Anyway, these Â characters seem to trigger the error (you might want to look into why your mail program puts them in there in the first place, they don't look intentional!), but I don't think they're the culprit for the crash:

The traceback shows that they're part of the (properly decoded, i.e. a text, not a binary string) message and the exception is raised when mailman/hyperkitty tries to encode them to ASCII (presumably for archiving).
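For illustration, here is a minimal sketch of that failure mode outside of mailman. The sample text is made up; only the U+FFFD character and the ASCII target encoding come from the traceback above.

```python
# A decoded (text) string containing U+FFFD, the replacement character.
# The surrounding words are a made-up example, not the real message.
text = "Sentence one.\ufffd Sentence two."

error = None
try:
    # Same kind of call as payload.encode(charset.output_charset)
    # when the output charset is ASCII.
    text.encode("ascii")
except UnicodeEncodeError as exc:
    error = exc

# The exception records the codec, the offending position and the reason.
print(error.encoding, error.start, error.reason)

# Encoding the same text as UTF-8 works, since UTF-8 can represent U+FFFD.
utf8_bytes = text.encode("utf-8")
print(utf8_bytes)
```

So the string itself is valid text. The problem only appears once something insists on an ASCII-only byte representation.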

@kevin I haven't found anything recent in Ansible that would hint at changes in our mailman/hyperkitty setup. Do we have anything, possibly in staging, at which I can throw messages and consult the logs, even with only the puny fi-apprentice privs?

Nils,

I hope that I got your e-mail address correct.  Please let me know that
you received this.  If you see this only as an issue comment, then
please clarify what the e-mail address is to reach you directly.

Those A-hat characters most certainly are not intentional.  My e-mail
client is Thunderbird 68.6.  It's kept up-to-date with my weekly (every
Thursday) "dnf --refresh upgrade".  Those A-hat characters cannot be the
cause of the trouble.  They were in my posts to the Fedora users list
before upgrading to f-31, but those posts were getting through.  It's
only after upgrading to f31 (on April 09) that my posts started getting
lost.

I think Kevin's theory is the best (see his second comment to this
issue).  The '\ufffd' character(s?) he refers to also most certainly are
not intentional.

I hope this gives you enough to do your diagnostic work.  Let me know if
you need something more or else.

Good luck.
Bill.

Hey Bill,

I received your mail, I've attached it here.

Here's what I noticed:

  • The Â you see when viewing the source looks like an artifact of the viewer. At the place where it appears in your screenshot (after the period at the end of a sentence), the message I got had (quoted-printable encoded) =C2=A0, which is UTF-8 for the non-breaking space (U+00A0). Interpreted as ISO-8859-1 instead, =C2 is Â and =A0 is the non-breaking space.
  • The Unicode codepoint which causes the exception above is U+FFFD, the "REPLACEMENT CHARACTER". I've found mention of these in conjunction with mailman/hyperkitty on the mailman users list and as a bug report on Ubuntu. The context there is "messages with broken Unicode content, held for moderation which have been previously processed by Postfix with SMTPUTF8 enabled". I don't think all of this applies here, but the stack trace looks very similar.
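The first point can be checked with a small sketch using Python's standard `quopri` module. The sample sentence is made up; only the =C2=A0 sequence comes from the message I received.

```python
import quopri

# Quoted-printable body fragment; "=C2=A0" is the sequence found in the mail.
encoded = b"End of sentence.=C2=A0 Next sentence."

# Undo the quoted-printable transfer encoding to get the raw bytes.
raw = quopri.decodestring(encoded)

# Decoded as UTF-8 (what the mail intends), the two bytes are one character,
# the non-breaking space U+00A0.
as_utf8 = raw.decode("utf-8")

# Decoded as ISO-8859-1 (what the viewer appears to do), each byte becomes
# its own character: 0xC2 is "Â" (U+00C2) and 0xA0 is the non-breaking space.
as_latin1 = raw.decode("iso-8859-1")

print(repr(as_utf8))
print(repr(as_latin1))
```

The stray Â is therefore consistent with a display problem in the source viewer rather than with broken content in the message.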

@kevin there's a patch to mailman linked from the bug report, how do I find out what version we have installed? I don't think I can hope that there's an update for that in RHEL which we just don't have applied yet, found nothing pertinent in BZ.

Our hyperkitty/mailman3 situation is pretty sad unfortunately.

We are running it on rhel7, but with a custom set of packages from back when we were first deploying it.

mailman3-3.1.1-0.6.el7.centos.noarch
hyperkitty-1.1.5-0.1.el7.centos.noarch
mailman3-hyperkitty-1.1.1-0.2.el7.centos.noarch

We plan to move to a fedora based install with the fedora packaged versions as soon as they get through review. (mailman3 is, I am not sure what the status for hyperkitty is or what else we need).

We may be able to apply this patch to our old version, or this may have to just wait until we move to the new instance(s).

@kevin I couldn't find a review for hyperkitty or mailman3-hyperkitty in Fedora, is this something we want to pursue? Otherwise I'd just want to apply the patch to our old version, it should be simple enough.

/cc @abompard maybe you have some opinion on this, as you did the mailman3 and hyperkitty packages

@nphilipp well, we want to at some point yeah... but short term we could just patch.
Do see: https://docs.pagure.org/infra-docs/sysadmin-guide/sops/hotfix.html for how to do hotfixes in ansible...

it's basically, check in the existing/pure file, check in the changes to that file and add a task to the appropriate playbook to replace that file(s).

We do it that way so we can see the actual diff commit. If we just replaced the file it's hard to know what was changed.

@kevin ahh took me a while to spot how the ansible playbooks pull @abompard's mailman/hk repo on fedorapeople. So the hotfix would apply a patch or something directly "over" the file(s) of the deployed RPM package?

Well, there's several ways we could do it.

  1. Add a patch and rebuild the rpm and update that. The only downside to that is that the rpms depend on python34 in epel7 and I am not sure how intact that stack is anymore (since rhel released python36 in 7.6). I guess you could try some scratch rebuilds and see if it rebuilds ok unmodified?

or

  2. We could apply a hotfix as described above in ansible. This isn't as nice since it means anytime we update those rpms we have to make sure and quickly re-run the playbook to apply the patches, but should work given how infrequently we update those rpms. :)

Does that make sense?

That does make sense. I think I'll go with the latter to avoid running into problems with the old vs. new Python stack, time put into that is probably better spent on #8455. :wink:

Proposed change in Fedora-Infra/ansible#45 (only for staging ATM).

I've applied this hotfix in production now and the tracebacks no longer appear.

I'm going to close this, but please re-open it if you see any emails not getting through now.

Many thanks to @nphilipp !

Metadata Update from @kevin:
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)

3 years ago

A few minutes ago, I replied to a Fedora Users list thread the way I was doing it when I submitted this issue. I did receive the users post acknowledgement, and the reply was properly posted. I agree that the issue is fixed. I thank everyone who worked on this issue.

Bill.
