Recently I have seen various Chinese spam messages on several lists, such as:
- https://lists.fedoraproject.org/archives/list/apac@lists.fedoraproject.org/thread/R4RPTKN7AR5VBZPVL2SPLYVVRG7TPJ4B/
- https://lists.fedoraproject.org/archives/list/fudcon-planning@lists.fedoraproject.org/thread/OZRDK55KLVHSND7BUD7FHVNY3YP6FUNU/
- https://lists.fedoraproject.org/archives/list/fudcon-planning@lists.fedoraproject.org/thread/6FIMSCNWPWANVMVK2QHNQHT4H3P7BADM/
- https://lists.fedoraproject.org/archives/list/flock-planning@lists.fedoraproject.org/thread/AGMTQ2CCRQYENDBZHLMJ3QSCZP2NEXTT/
- https://lists.fedoraproject.org/archives/list/flock-planning@lists.fedoraproject.org/thread/SZJ3FSOGLIM7O5POGM5DRW6ZRQU2YUZT/
I'd like to raise this for awareness, and hopefully some action can be taken.
Some more received in:
- https://lists.fedoraproject.org/archives/list/malaysian-users@lists.fedoraproject.org/thread/4J2FY3NRJNR4SH43XT3VILZYXUAG6ZSX/
- https://lists.fedoraproject.org/archives/list/malaysian-users@lists.fedoraproject.org/thread/52VHB2YMCVIWKPGPUUKDZNNYBFHQ4TE6/
Metadata Update from @smooge: - Issue assigned to smooge
Metadata Update from @smooge: - Issue priority set to: None (was: Needs Review) - Issue tagged with: high-gain, low-trouble, security
Other lists hit:
fedocal ibus-sayura-users flockinfo eng-service irc-support-sig flock-planning flock-attendees env-and-stacks badges logistics classroom cwg fonts flockinfo
One more: https://lists.fedoraproject.org/archives/list/389-commits@lists.fedoraproject.org/2021/12/
Thanks. I have deleted the singletons. The other lists have thousands of spam messages on them, which will need a script to work through automatically.
ok, we have mostly cleaned this up.
Sadly, there's still a bunch on these 4 lists:
1360 tinykdump
1338 ibus-sayura-users
1128 fedocal
688 matahari
Clicking delete 4,000 times doesn't scale very well. ;( We need to poke the db or create a script to mass-delete these.
We need to ask @misc whether they have a script for mass-deleting archives on their mailman3, as doing it via clicks is really slow. It may have to wait until we upgrade, as I think I see some 3.3 options, which we don't have yet, that would fix it.
Metadata Update from @kevin: - Issue priority set to: Waiting on Assignee
One more spam msg on one more list: https://lists.fedoraproject.org/archives/list/campus-ambassadors@lists.fedoraproject.org/thread/TBL5W3KAVYNIFXLZ2J6ULWPOTYN5QKC5/
Also, this list seems inactive; shall we just retire it?
Are these lists set to allow posts from non-subscribers, or are they subscribing? If we do have any lists that are open-posting, I think we should change that, and ideally remove the option to change it back.
On Sun, 12 Dec 2021 at 13:07, Matthew Miller pagure@pagure.io wrote:
> Are these lists set to allow posts from non-subscribers, or are they subscribing? If we do have any lists that are open-posting, I think we should change that, and ideally remove the option to change it back.
The accounts were all created as valid users in the Fedora project system; the spammers then logged into the mailman3 web interface and posted the messages. The main lists they filled are ones which should probably be closed completely, as the only traffic on them is spam. [The fedocal and flock-planning lists had 7 years of held spam from non-members and 2000 emails from the spammers after Kevin placed the lists under emergency moderation.]
To reply, visit the link below or just reply to this email https://pagure.io/fedora-infrastructure/issue/10417
-- Stephen J Smoogen. Let us be kind to one another, for most of us are fighting a hard battle. -- Ian MacClaren
Running the Chinese text below the links through Google Translate was fascinating. It appears to be snippets of Chinese poetry randomly concatenated.
https://src.fedoraproject.org/rpms/spamassassin
simple front-end filtering script for SpamAssassin https://spamassassin.apache.org/full/3.2.x/doc/spamassassin-run.html
Although it would be ideal to use some type of open source machine learning algorithm:
Here is what I have found regarding that so far: https://awesomeopensource.com/projects/machine-learning/spam-filtering
spammy seems to be the best/fastest, but it hasn't been ported to Python 3 yet: https://release-monitoring.org/project/241648/
https://lists.fedoraproject.org/archives/list/softwarecollections@lists.fedorahosted.org/thread/NMSSMAZJZ6UMPA6EARUOTSNSIRGYAMYF/ https://lists.fedoraproject.org/archives/list/softwarecollections@lists.fedorahosted.org/thread/DBMX57Q2R6KCNTUOX3OZDRLANW76IVKH/
perl6 list:
https://lists.fedoraproject.org/archives/list/perl6@lists.fedoraproject.org/thread/TNCKMYRUPM2JU6ITIVGREXN7UGSVMGW4/ https://lists.fedoraproject.org/archives/list/perl6@lists.fedoraproject.org/thread/M74UTLNFKHC7J7RIJCWJCDL22UABVENA/
https://lists.fedoraproject.org/archives/list/mindshare-announce@lists.fedoraproject.org/thread/6BCXRKYFQ7N3AFS6QV6EXLKKNDVVQZYU/
I use pydspam, which uses the milter API for sendmail or postfix and has been ported to Python 3. The downside is that it depends on dspam, which has been dropped from Fedora, probably because upstream seems to be dead.
I like the design of dspam because it is simple and elegant:
a) tokenize input with special attention to email headers (e.g. header names are tokens)
b) database with spam/ham stats by token
c) simple Bayes calculation based on tokens of new message and stats from database
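To make the three steps concrete, here is a rough Python sketch of the idea. This is not dspam's actual code or API: the names `tokenize`, `TokenStats`, and `spam_probability` are made up for illustration, the "database" is just an in-memory counter, and the Bayes step is a simplified naive Bayes over the tokens present in the message, with add-one smoothing.

```python
import math
import re
from collections import Counter
from email import message_from_string

WORD = re.compile(r"[A-Za-z0-9']+")


def tokenize(raw):
    """Step a: split a raw message into a set of tokens.

    Header names are added as tokens of their own (hypothetical format "name:").
    """
    msg = message_from_string(raw)
    tokens = set()
    for name, value in msg.items():
        tokens.add(name.lower() + ":")
        tokens.update(w.lower() for w in WORD.findall(value))
    body = msg.get_payload()
    if isinstance(body, str):  # skip multipart bodies in this sketch
        tokens.update(w.lower() for w in WORD.findall(body))
    return tokens


class TokenStats:
    """Step b: per-token spam/ham counts (in-memory stand-in for a database)."""

    def __init__(self):
        self.spam = Counter()
        self.ham = Counter()
        self.n_spam = 0
        self.n_ham = 0

    def train(self, raw, is_spam):
        if is_spam:
            self.spam.update(tokenize(raw))
            self.n_spam += 1
        else:
            self.ham.update(tokenize(raw))
            self.n_ham += 1

    def spam_probability(self, raw):
        """Step c: simplified naive Bayes in log space with add-one smoothing."""
        total = self.n_spam + self.n_ham
        log_spam = math.log((self.n_spam + 1) / (total + 2))
        log_ham = math.log((self.n_ham + 1) / (total + 2))
        for tok in tokenize(raw):
            log_spam += math.log((self.spam[tok] + 1) / (self.n_spam + 2))
            log_ham += math.log((self.ham[tok] + 1) / (self.n_ham + 2))
        # Clamp the difference so math.exp cannot overflow on long messages.
        diff = max(min(log_ham - log_spam, 700.0), -700.0)
        return 1.0 / (1.0 + math.exp(diff))


stats = TokenStats()
stats.train("Subject: FREE prize\n\nclaim your free prize now", is_spam=True)
stats.train("Subject: meeting notes\n\nagenda for the infra meeting", is_spam=False)
score = stats.spam_probability("Subject: free prize\n\nclaim now")
```

A score above 0.5 leans spam and below 0.5 leans ham. A real deployment would need a persistent store and far more training data than two messages.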
pydspam wraps libdspam in a python API.
spammy seems to be the same idea (naive Bayes) as dspam. If it needs porting to py3, that might be something I could do.
https://github.com/tasdikrahman/spammy/issues/9
Reviewing the spammy system, it does not seem to have any special treatment for email headers. A big part of the effectiveness of dspam was that a word, e.g. "FREE", in the Subject header was a different token than the same word in another header or the message body.
I suppose that could be added later.
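For illustration, header-aware tokens could look roughly like the sketch below. The function name `scoped_tokens` and the `header*word` token format are assumptions made up for this example. They are not taken from spammy, and may not match what dspam does internally.

```python
import re
from email import message_from_string

WORD = re.compile(r"[A-Za-z0-9']+")


def scoped_tokens(raw):
    """Return tokens where header words carry the header name as a prefix.

    Body words stay unprefixed, so "FREE" in the Subject header and "FREE"
    in the body become two distinct tokens for the classifier.
    """
    msg = message_from_string(raw)
    tokens = []
    for name, value in msg.items():
        for word in WORD.findall(value):
            tokens.append(f"{name.lower()}*{word}")
    body = msg.get_payload()
    if isinstance(body, str):  # skip multipart bodies in this sketch
        tokens.extend(WORD.findall(body))
    return tokens


sample = "Subject: FREE offer\nFrom: a@example.com\n\nFREE stuff inside"
tokens = scoped_tokens(sample)
```

With this scheme the classifier keeps separate spam/ham counts for `subject*FREE` and plain `FREE`, so a word that is only suspicious in the Subject line does not penalize ordinary body text.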
I am going to close this ticket, as the major issue is 'fixed' and the longer-term ones need to be filed as a separate initiative.
Metadata Update from @smooge: - Issue close_status updated to: Fixed with Explanation - Issue status updated to: Closed (was: Open)