Training spam with doveadm
A while ago, I posted about training the SpamAssassin Bayes filter with Proxmox Mail Gateway. That's really easy when you're using Maildir, as each email message is its own file.
With Maildir, we could simply cat a file, treating the messages in a folder as plain files and ignoring the fact that they are part of an IMAP mailbox. But what happens if you use something other than Maildir, such as one of the newer mailbox formats? The same approach no longer works, as each email is probably not a single file anymore.
For example, dbox is Dovecot’s own high-performance mailbox format.
If we use mdbox (the multi-dbox variant, which stores several messages per file), we can no longer open one file per message, nor can we tell which folder is which from the on-disk layout. So we have to get smarter.
Using doveadm, we can search for messages in a mailbox and fetch them, then pass them through our previously configured script into PMG as before. The main advantage is that this works with any mail storage backend.
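To make the search-then-fetch flow concrete, here is a minimal sketch of how one line of search output maps onto a fetch command. It assumes that `doveadm search -A` prints one `user mailbox-guid uid` line per message; the function name `build_fetch_cmd` and the sample values are made up for illustration, and nothing is actually executed against Dovecot.

```shell
#!/bin/bash
# Sketch only: turn one line of `doveadm search -A` output into the
# matching `doveadm fetch` command. The command is printed, not run.
build_fetch_cmd() {
    # Deliberately unquoted: split the line into user, mailbox GUID and UID.
    set -- $1
    printf 'doveadm fetch -u %s text mailbox-guid %s uid %s\n' "$1" "$2" "$3"
}

# Hypothetical sample line; the user, GUID and UID are invented.
build_fetch_cmd "alice@example.com 5a1b2c3d4e5f60718293a4b5c6d7e8f9 42"
```

The mailbox GUID identifies the folder regardless of how it is stored on disk, which is why the same command line works for Maildir and mdbox alike.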
The following simple bash script goes through every user's Spam or INBOX/Spam folder, fetches each message, feeds it into the learning system, and then removes it from the user's mailbox.
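#!/bin/bash
# Hostname of the Proxmox Mail Gateway that runs the learning script.
MAILFILTER=my.pmg.install.example.com
shopt -s nullglob

# With -A, doveadm search prints one "user mailbox-guid uid" line per message.
doveadm search -A mailbox Spam OR mailbox INBOX/Spam | while read -r user guid uid; do
    tmpfile="/tmp/spam.$guid.$uid"
    # Fetch the full message; tail drops the leading "text:" label line.
    doveadm fetch -u "$user" text mailbox-guid "$guid" uid "$uid" | tail -n +2 > "$tmpfile"
    # Hand the message to the report command set up in the previous article.
    if ! ssh "root@$MAILFILTER" report < "$tmpfile"; then
        echo "Error running sa-learn. Aborting." >&2
        exit 1
    fi
    rm -f "$tmpfile"
    # Only remove the message from the mailbox once it has been learned.
    doveadm expunge -u "$user" mailbox-guid "$guid" uid "$uid"
done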
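The `tail` step in the script is easy to overlook, so here is a small sketch of what it does. It assumes that `doveadm fetch ... text` prints the field label `text:` on a line of its own before the raw message; the helper name `strip_fetch_header` and the sample message are invented for the example.

```shell
#!/bin/bash
# Sketch only: drop the first line of the input, which is assumed to be
# the "text:" label that doveadm fetch prints before the message itself.
strip_fetch_header() {
    tail -n +2
}

# Simulated fetch output: label line, then a tiny made-up message.
printf 'text:\nSubject: cheap pills\n\nBuy now\n' | strip_fetch_header
```

Without this step the label would be passed along as the first line of the message, ahead of the real headers.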
Use the learning script together with the scripts and general configuration from the previous article, and it should work across all mail storage formats supported by Dovecot.
Cron it to run every 5 minutes or so, and you’re done! Nice and easy.
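With a schedule as tight as `*/5 * * * *`, a run over a large spam folder might still be going when the next one starts. One possible guard, sketched here under that assumption, is a lock directory: `mkdir` is atomic, so only one run can create it. The function name `run_locked`, the exit code 75, and both paths in the usage comment are placeholders, not part of the original setup.

```shell
#!/bin/bash
# Sketch only: run a command unless a previous run still holds the lock.
run_locked() {
    lockdir=$1
    shift
    # mkdir fails if the directory already exists, i.e. a run is active.
    if ! mkdir "$lockdir" 2>/dev/null; then
        echo "Previous run still active, skipping." >&2
        return 75
    fi
    "$@"
    status=$?
    rmdir "$lockdir"
    return "$status"
}

# Hypothetical usage; both paths are placeholders:
# run_locked /var/lock/learn-spam.lock /usr/local/bin/learn-spam.sh
```

If the script is killed before it reaches `rmdir`, the lock directory stays behind and has to be removed by hand, so treat this as a starting point rather than a complete solution.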