Using CRM114 in Debian

DISCLAIMER: I am recounting the steps I took to get a more or less working CRM114 system for me. Things may break for you if you use my setup. Your firstborn may be devoured by a hairy purple monster. Follow these easy steps at your own risk!


Contents

  1. Setting up CRM114
  2. Copying the configuration files
  3. Initializing your CSS files
  4. Configuring procmail and mutt to work with CRM114
  5. Now What?

Setting up CRM114

# apt-get install crm114

There, easy, wasn't it?


Copying the configuration files

You'll want a directory, a la .spamassassin, to store all of your configuration files for CRM114. And there are quite a few. So,

joshk@influx:~> mkdir .crm114

Note that crm114 won't care which directory we use, since we have to specify it in the procmail recipe later on. Then copy these files, which the crm114 package has installed in sundry places, into your directory:

/usr/share/doc/crm114/examples/crmfilter/priolist.mfp
/usr/share/doc/crm114/examples/crmfilter/rewrites.mfp
/usr/share/doc/crm114/examples/crmfilter/scrub_mailfile_rewrites.mfp
/usr/share/doc/crm114/examples/crmfilter/test_rewrites.mfp
/usr/share/doc/crm114/examples/crmfilter/whitelist.mfp
/usr/share/doc/crm114/examples/crmfilter/blacklist.mfp.gz
/usr/share/crm114/mailfilter.cf

dh_compress compressed the large sample blacklist; you should gunzip it and use it, as it's quite comprehensive and accurate. You should edit whitelist.mfp as well. rewrites.mfp should contain accurate information about your name, email address, and mail server. (Don't ask me how these get used when actual mail processing occurs. I'm not that good yet.)
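If you'd rather not copy seven files by hand, something like the following should do it. This is only a sketch: CRM_DIR and EXAMPLES are variable names I made up for it, the paths are the ones listed above, and it assumes your version of the package still installs the files there (it complains about any that are missing instead of failing).

```shell
#!/bin/sh
# Sketch: copy the sample files listed above into the config directory.
# CRM_DIR and EXAMPLES are made-up names; adjust them to your setup.
CRM_DIR="${CRM_DIR:-$HOME/.crm114}"
EXAMPLES=/usr/share/doc/crm114/examples/crmfilter

mkdir -p "$CRM_DIR"

for f in priolist.mfp rewrites.mfp scrub_mailfile_rewrites.mfp \
         test_rewrites.mfp whitelist.mfp blacklist.mfp.gz; do
    if [ -f "$EXAMPLES/$f" ]; then
        cp "$EXAMPLES/$f" "$CRM_DIR/"
    else
        echo "missing: $EXAMPLES/$f" >&2
    fi
done

if [ -f /usr/share/crm114/mailfilter.cf ]; then
    cp /usr/share/crm114/mailfilter.cf "$CRM_DIR/"
else
    echo "missing: /usr/share/crm114/mailfilter.cf" >&2
fi

# Unpack the sample blacklist if it was copied.
if [ -f "$CRM_DIR/blacklist.mfp.gz" ]; then
    gunzip -f "$CRM_DIR/blacklist.mfp.gz"
fi
```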

OK, so now onto editing mailfilter.cf. I won't go into all the options, but here is what I have (a pseudo-delta from the default):

:spw: /********/
:add_verbose_stats: /no/
:add_extra_stuff: /no/
#:spam_flag_subject_string: /ADV:/
:log_to_allmail.txt: /no/
:log_rejections: /no/
:mime_decoder: /mimencode -u/

A brief explanation. 'spw' is some password that is used for mailfilter's 'remote control interface'; should anybody know what the hell that might be, tell me.[1] But it's worth changing to some throwaway password. Also, spam_flag_subject_string is set to ADV: by default; I just comment that out as the X-CRM114 headers are enough for procmail anyway. Next, to use the mime_decoder thing with mimencode -u, you'll have to install the metamail package.
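Before pointing mime_decoder at mimencode, it may be worth checking that the binary is really there. A minimal sketch, assuming mimencode comes from the metamail package as described above and that its -u switch decodes base64 on stdin:

```shell
#!/bin/sh
# Sketch: check that the decoder named in :mime_decoder: is installed.
if command -v mimencode >/dev/null 2>&1; then
    # "aGVsbG8=" is the base64 encoding of "hello".
    decoded=$(printf 'aGVsbG8=\n' | mimencode -u)
    echo "mimencode -u decoded: $decoded"
else
    echo "mimencode not found; install the metamail package" >&2
fi
```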

If things should go awry, enable all of the log and verbosity settings to figure out what's wrong, then turn them off again afterwards. They get *annoying*.


Initializing your CSS files

It is not too hard:

joshk@influx:~/.crm114> cssutil -r -b nonspam.css
joshk@influx:~/.crm114> cssutil -r -b spam.css

This will initialize the database of spam/nonspam knowledge for CRM114. It is generated at 'full size' - around 12 megabytes full of zeroes. Whatever. Disk is cheap.
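To convince yourself that both files were really created, a quick size check along these lines might help. It is just a sketch: css_report is a helper name I invented, and it only looks at file sizes, not at what is inside the files.

```shell
#!/bin/sh
# Sketch: report the size of the two CSS files in a given directory.
# css_report is a made-up helper name, not part of CRM114.
CRM_DIR="${CRM_DIR:-$HOME/.crm114}"

css_report() {
    for f in spam.css nonspam.css; do
        if [ -f "$1/$f" ]; then
            echo "$f: $(wc -c < "$1/$f" | tr -d ' ') bytes"
        else
            echo "$f: missing"
        fi
    done
}

css_report "$CRM_DIR"
```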

Now you'll want to feed crm114 spam messages. Whether it can take a whole mailbox at once I'm not quite sure, because I use Maildir, where feeding messages one by one is the only option. Either way, one would invoke crm114 like so:

joshk@influx:~> /usr/share/crm114/mailfilter.crm -u /home/joshk/.crm114 --learnspam < spam-message

I fed 481 spams to crm114 before it ever got to lay eyes on a ham. This is supposedly healthier for its heuristics.
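Feeding a few hundred messages one at a time gets old quickly, so here is a rough sketch of a loop over a Maildir folder. Assumptions, stated loudly: train_maildir, MAILFILTER and CRM_DIR are names I made up, the folder has the usual cur/ and new/ subdirectories, and --learnspam / --learnnonspam are the training switches (the same ones used in the mutt macros further down).

```shell
#!/bin/sh
# Sketch: train on every message in one Maildir folder.
# train_maildir, MAILFILTER and CRM_DIR are made-up names.
MAILFILTER="${MAILFILTER:-/usr/share/crm114/mailfilter.crm}"
CRM_DIR="${CRM_DIR:-$HOME/.crm114}"

# usage: train_maildir <maildir-folder> <--learnspam|--learnnonspam>
train_maildir() {
    dir=$1
    flag=$2
    count=0
    for msg in "$dir"/cur/* "$dir"/new/*; do
        # Skip unmatched globs and anything that is not a regular file.
        [ -f "$msg" ] || continue
        # Discard whatever the filter prints; stop on the first failure.
        "$MAILFILTER" -u "$CRM_DIR" "$flag" < "$msg" > /dev/null || return 1
        count=$((count + 1))
    done
    # Print how many messages were fed to the filter.
    echo "$count"
}

# Example (not run here):
#   train_maildir "$HOME/Maildir/.Junk Mail" --learnspam
```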

Anyway, you should be ready to roll by now. Let's come up with some procmail cuisine.


Configuring procmail and mutt to work with CRM114

:0fw: 0.crm114.lock
| /usr/share/crm114/mailfilter.crm -u /home/joshk/.crm114/

:0:
* ^X-CRM114-Status: SPAM.*
./Maildir/.Junk\ Mail/

OK, this needs explaining. I am using a lockfile called 0.crm114.lock (yeah, filename arbitrarily chosen). The f flag tells procmail to treat the pipe as a filter, so the filtered message falls through to the next rule, where we can catch the X-CRM114 header; the w flag makes procmail wait for the filter to finish. /usr/share/crm114/mailfilter.crm is, surprisingly, executable; its hashbang is #!/usr/bin/crm and some other stuff. The -u argument forces a chdir to /home/joshk/.crm114, where I have chosen to put all of my configuration files.
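Before trusting the recipe with live mail, you could run a saved message through the same command by hand and look at the header the second rule matches on. A small sketch follows; status_header is a name I invented, and all it does is print any X-CRM114-Status line found in the header block (everything before the first blank line) of whatever arrives on stdin.

```shell
#!/bin/sh
# Sketch: show only the X-CRM114-Status header of a filtered message.
# status_header, MAILFILTER and CRM_DIR are made-up names.
MAILFILTER="${MAILFILTER:-/usr/share/crm114/mailfilter.crm}"
CRM_DIR="${CRM_DIR:-$HOME/.crm114}"

status_header() {
    # Quit at the first blank line so body text is never matched.
    sed -n -e '/^$/q' -e '/^X-CRM114-Status:/p'
}

# Example (not run here):
#   "$MAILFILTER" -u "$CRM_DIR" < some-message | status_header
```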

As for getting mutt working, here is the excerpt from my .muttrc which lets me signal to crm114 the errors of its ways:

macro index X '| formail -I X-CRM114-Status -I X-CRM114-Action -I X-CRM114-Version | /usr/share/crm114/mailfilter.crm -u /home/joshk/.crm114/ --learnspam'
macro index H '| formail -I X-CRM114-Status -I X-CRM114-Action -I X-CRM114-Version | /usr/share/crm114/mailfilter.crm -u /home/joshk/.crm114/ --learnnonspam'
macro pager X '| formail -I X-CRM114-Status -I X-CRM114-Action -I X-CRM114-Version | /usr/share/crm114/mailfilter.crm -u /home/joshk/.crm114/ --learnspam'
macro pager H '| formail -I X-CRM114-Status -I X-CRM114-Action -I X-CRM114-Version | /usr/share/crm114/mailfilter.crm -u /home/joshk/.crm114/ --learnnonspam'

Note: formail is used to strip those headers so that CRM114 doesn't pick them up; otherwise it would train on those headers as well as on the pertinent parts of the message. (It could also misflag a message you are trying to correct, because it sees the CRM114 headers.)

So if you find a misfiltered message, hit H or X and the legwork will be done for you, and you can expect never to see that kind of message misfiltered again.
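The four macros repeat the same long pipeline, so one option is to move it into a small shell helper and have the macros pipe to that instead. This is a sketch, not what I actually run: crm_learn, MAILFILTER and CRM_DIR are invented names, while the formail and mailfilter.crm switches are copied from the macros above.

```shell
#!/bin/sh
# Sketch: strip old CRM114 headers, then train on the message from stdin.
# crm_learn, MAILFILTER and CRM_DIR are made-up names.
MAILFILTER="${MAILFILTER:-/usr/share/crm114/mailfilter.crm}"
CRM_DIR="${CRM_DIR:-$HOME/.crm114}"

# usage: crm_learn spam|nonspam < message
crm_learn() {
    case "$1" in
        spam)    flag=--learnspam ;;
        nonspam) flag=--learnnonspam ;;
        *)       echo "usage: crm_learn spam|nonspam" >&2; return 2 ;;
    esac
    # formail -I with a bare field name deletes that header field.
    formail -I X-CRM114-Status -I X-CRM114-Action -I X-CRM114-Version |
        "$MAILFILTER" -u "$CRM_DIR" "$flag"
}
```

Saved as a script that ends by calling crm_learn "$1", each macro would shrink to a pipe into that script with either spam or nonspam as its argument.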

Update: Joerg Jaspert sent me a similar mechanism that uses CRM114's remote control mechanism to allow Gnus to teach CRM114 about what's right and wrong. Find the Lisp snippet here.


Now What?

The flip side to a system like CRM114 is that you have to teach it for the best results. So, especially for the first few hours after you've installed CRM114, you will be babying your spam folder, and hitting 'H' and 'X' a lot to correct it, since even after having seeded it with lots of juicy spam it can get stuff wrong!

But the idea is that it gets better and better over time, just like a fine wine :) Enjoy CRM114, I hope this mini-HOWTO helped you get the program running on your computer...


References

[1]:
13:12 < ljlane> joshk, the remote control interface allows crm commands in 
                plain mail. Instead of mapping keys to reclassify spam or ham, 
                you forward it to yourself with your spw and commands at the 
                top of the mail.

Last updated by joshk@triplehelix.org on 3/20/2004