Spam
This technique uses bogofilter to classify email as spam or ham on a server, and uses spamassassin together with manual classification on a client computer to train bogofilter. It assumes that the administrator uses the client computer to review spamassassin's classification decisions and to mark any missed messages as spam. Any unread email found in spam-samples was placed there by automatic (spamassassin or bogofilter) classification; read mail was either hand-classified or manually reviewed. The technique is effective only if the administrator's spam resembles each user's spam.
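The read/unread distinction above can be checked from the command line. The following is a minimal sketch, assuming spam-samples is an mbox file and that the mail client records read state in a `Status:` header containing `R` (as mutt normally does for mbox folders). The function name `count_unreviewed` is hypothetical and not part of any of the tools mentioned here.

```shell
#!/bin/sh
# Hypothetical helper: report how many messages in an mbox file are still
# unread, i.e. were classified automatically and not yet reviewed by hand.
# Assumption: a read message carries a "Status:" header containing "R".
count_unreviewed() {
    awk '
        # A "From " line starts a new message; close out the previous one.
        /^From / {
            if (seen) { total++; if (!isread) unread++ }
            seen = 1; inhdr = 1; isread = 0
            next
        }
        # The first blank line ends the header block.
        inhdr && /^$/ { inhdr = 0; next }
        # Only look at Status: while still inside the headers.
        inhdr && /^Status:.*R/ { isread = 1 }
        END {
            if (seen) { total++; if (!isread) unread++ }
            printf "%d of %d unread\n", unread, total
        }
    ' "$1"
}
```

Calling `count_unreviewed ~/mail/spam-samples` before retraining gives a rough idea of how much automatically classified mail is still waiting for review.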
- Install and configure Postfix and bogofilter on your mail server.
- Use spamassassin and mutt on a client machine to continuously train bogofilter:
- Configure spamassassin:
required_hits 3.5
report_safe 0
- Configure procmail to filter incoming mail using spamassassin and to move the email classified as spam to the folder spam-samples:
# Dump mail already flagged as spam by bogofilter in spam-samples.
:0:
* ^X-Bogosity: (Spam|Yes)
$MAILDIR/spam-samples
# Process with spamassassin unless too big.
:0fw: spamassassin.lock
* < 1048576
| spamassassin
# Dump spamassassin spam in spam-samples.
:0:
* ^X-Spam-Status: Yes
$MAILDIR/spam-samples
- Add a cronjob to analyze spam-samples using bogofilter and install the resulting wordlist.db:
# A crontab entry must fit on a single line.
0 0 * * * rm -f ~/mail/wordlist.db && grep -av '\(^X-Spam[^ ]*:\|^X-Bogosity:\)' ~/mail/spam-samples | bogofilter -d ~/mail -M -s && grep -av '\(^X-Spam[^ ]*:\|^X-Bogosity:\)' ~/mail/ham-samples | bogofilter -d ~/mail -M -n && scp ~/mail/wordlist.db root@example.com:/etc/bogofilter/
- Configure mutt with index colors that highlight spam and with hotkeys that manually classify email as spam or ham:
color index black brightred '~h "X-Spam-Flag: YES"' # Spamassassin.
color index black brightyellow '~h "X-Bogosity: Spam"' # Bogofilter.
macro index S "\
<enter-command>set resolve=no<enter>\
<clear-flag>N\
<enter-command>set resolve=yes<enter>\
<save-message>=spam-samples<enter><enter>" "Save to spam-samples"
macro pager S "\
<enter-command>set resolve=no<enter>\
<clear-flag>N\
<enter-command>set resolve=yes<enter>\
<save-message>=spam-samples<enter><enter>" "Save to spam-samples"
macro index H "\
<enter-command>set my_resolve=\$resolve resolve=no<enter>\
<copy-message>=ham-samples<enter><enter>\
<enter-command>set resolve=\$my_resolve<enter>" "Copy to ham-samples"
macro pager H "\
<enter-command>set my_resolve=\$resolve resolve=no<enter>\
<copy-message>=ham-samples<enter><enter>\
<enter-command>set resolve=\$my_resolve<enter>" "Copy to ham-samples"
macro index B "<shell-escape>rm -f ~/mail/wordlist.db\
&& grep -av '\\(\^X-Spam[^ ]*:\\|\^X-Bogosity:\\)' ~/mail/spam-samples | bogofilter -d ~/mail -M -s\
&& grep -av '\\(\^X-Spam[^ ]*:\\|\^X-Bogosity:\\)' ~/mail/ham-samples | bogofilter -d ~/mail -M -n\
&& scp ~/mail/wordlist.db root@www.flyn.org:/etc/bogofilter/<enter>" "Push bogofilter samples"
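The cronjob and the `B` macro run the same retraining pipeline, so it could be kept in one place. The following is a sketch of such a wrapper script, not part of the original setup: the variable names `BOGO_DIR` and `REMOTE`, the function names, and the `run` argument are all hypothetical, and the default destination is the example.com placeholder from the cronjob. The bogofilter and grep invocations are the same as above.

```shell
#!/bin/sh
# Hypothetical wrapper around the retraining pipeline shown above.
# BOGO_DIR and REMOTE are illustrative names; override them as needed.
BOGO_DIR="${BOGO_DIR:-$HOME/mail}"
REMOTE="${REMOTE:-root@example.com:/etc/bogofilter/}"

# Drop the verdict header lines added by spamassassin and bogofilter so
# that bogofilter does not train on earlier classification results.
strip_filter_headers() {
    grep -av '\(^X-Spam[^ ]*:\|^X-Bogosity:\)' "$1"
}

# Rebuild wordlist.db from both sample folders, then copy it to the server.
retrain() {
    rm -f "$BOGO_DIR/wordlist.db" &&
    strip_filter_headers "$BOGO_DIR/spam-samples" | bogofilter -d "$BOGO_DIR" -M -s &&
    strip_filter_headers "$BOGO_DIR/ham-samples" | bogofilter -d "$BOGO_DIR" -M -n &&
    scp "$BOGO_DIR/wordlist.db" "$REMOTE"
}

# Retrain only when called with "run", so the functions can also be
# sourced and tried out without touching the word list.
if [ "${1:-}" = "run" ]; then
    retrain
fi
```

If this were saved under some path of your choosing and made executable, both the crontab entry and the mutt macro could call it with the `run` argument instead of repeating the pipeline. Note that the grep pattern removes only lines that begin with the header name, so continuation lines of a folded header are left in place.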