Tuning rspamd

For many years I’ve been running my own mailserver based on postfix and dovecot. To combat spam I used spamassassin like everybody else back in the day, but I was never quite satisfied with it. It came from a different era, and as the spammers got more sophisticated and billions of people put poorly maintained and therefore hackable computers on the internet, our trusty old friend spamassassin wasn’t keeping up.

Then in 2013 a new contender entered the scene: rspamd. I remember discovering it, probably a few moons after its initial release, and feeling quite excited. It was not written in Perl but in C, promising much better performance and offering a ton of modern features to combat spam.

When I first tried it, its default config was almost enough to get rid of most of the spam that I was struggling to filter with spamassassin. But again, over the years, as the spammers got more sophisticated, more and more spam reached my inbox, which is why I recently spent a weekend figuring out what I could do to improve the situation.

The first thing that became obvious to me was that the configuration options and format of certain modules had changed, and that certain modules were not working or not even enabled in the first place.

But that was just the beginning of renovating my rspamd config. So here are a few suggestions for you if you have too much spam in your inbox. I will assume that you are familiar with common email and spam filter terms like greylisting and the principles behind them.

Check your config

Suggestion number one is pretty straightforward: check your active configuration! You can dump the whole configuration, or just a single module, by running

rspamadm configdump

rspamadm configdump <module name>

Check if the modules and values are what you expect them to be. Rspamd has a hierarchical config override structure, and if you don’t fully understand it, it is easy to believe that what you’ve configured in the local.d folder is actually what is active. I realized that a few of my settings did not work as expected due to the aforementioned changes in the configuration format.

Deal with the repetitive spam themes first

In my case, I received a lot of similar looking spam. All German speakers have probably seen their fair share of spam mails with a subject like “Apotheke / Apo-theke / A-potheke”. There are many more “common” spam themes and topics, and this is what I tackled first, because these categories of repetitive spam were very unlikely to produce false positives if I just blacklisted them.

But if you’re unsure whether this is the right approach on a multiuser setup with varying interests, you can fall back to greylisting. To set this up you will need to edit local.d/multimap.conf and maybe take a look at the corresponding documentation: https://rspamd.com/doc/modules/multimap.html

I’d say this page is one of the most important pieces of documentation to leverage rspamd’s potential.

Subject Blocklist

The first thing in my multimap.conf file is the following block:

BAD_SUBJECT_BL {
  type = "header";
  header = "subject";
  regexp = true;
  map = "$LOCAL_CONFDIR/local.d/local_bl_subject_map.inc";
  description = "Blacklist for common spam subjects";
  score = 10; 
}

The content of that local_bl_subject_map.inc file is as follows:

/\bpsoriasis\b/i
/\bprostatitis\b/i
/\bderila\b/i
/\betf\b/i
/\bbitcoin\b/i
/\breich\b/i
/\bgeld\b/i
/\bki\b/i
/\baktien\b/i
/\bmakita\b/i
/\b(?:lotto|lottery)\b/i
/\bmubi\b/i
/\bauto\b/i
/\bantihaftbeschichtung\b/i
/.*r[:.-]*?e[:.-]*?z[:.-]*?e[:.-]*?p[:.-]*?t[:.-]*?f[:.-]*?r[:.-]*?e[:.-]*?i/i
/\br[-_]?e[-_]?zept[-_]?frei\b/i
/zeptfrei/i
/\beinkommen\b/i
/\bnubuu\b/i
/\bnuubu\b/i
/\bentgiftungsprogramm\b/i
/\bgelenkschmerzen\b/i
/\bmädchen\b/i
/\bsprachübersetzer\b/i
/\bstabilisierung.+blutdrucks\b/i
/\bmüheloses.+reinigen\b/i
/\bpapillome\b/i
/\bküchenmesser\b/i
/\brendite\b/i
/\bgewichtsverlust\b/i
/\bpreissturz\b/i
/\bchance.+kostenlos\b/i
/\bhamorrhoiden\b/i
/\bhörvermögens\b/i
/\bmuama\b/i
/\bryoko\b/i
/\bbambusseide\b/i
/\bluxusseide\b/i
/\bHondrostrong\b/i
/\btabletten.+apotheke\b/i
/\bEinlegesohlen\b/i
/\btest\syour\siq\snow\b/i
/\bzukunft.+sauberkeit\b/i
/\bcbd\b/i
/\bharninkontinenz\b/i
/\bpillen\b/i
/\btabletten\b/i

This might seem surprisingly short, but this list got rid of the majority of spam mails reaching my inbox. It’s dull, it’s simple, but it is quite effective. I rarely have to add things to it these days, and it is especially effective for those mails that don’t have a lot of suspicious content and slip past other spam identification methods.
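Before adding a new pattern to such a map, it can help to try it against a few real subjects first. The following is only a minimal sketch in Python, not part of rspamd: the helper names `parse_map_line` and `matching_patterns` are made up for illustration, it assumes every map entry has the `/pattern/flags` form, and Python’s `re` module is only an approximation of the regex engine rspamd uses. It also shows why an alternation placed between two `\b` anchors needs a group.

```python
import re

def parse_map_line(line):
    """Turn a '/pattern/flags' map entry into a compiled Python regex.

    Assumes the entry starts with '/' and ends with '/' plus optional flags.
    Blank lines and '#' comments yield None.
    """
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    body, _, flags = line[1:].rpartition("/")
    return re.compile(body, re.IGNORECASE if "i" in flags else 0)

def matching_patterns(subject, map_lines):
    """Return the source of every pattern that matches the subject."""
    compiled = [p for p in map(parse_map_line, map_lines) if p is not None]
    return [p.pattern for p in compiled if p.search(subject)]

# A few entries in the same style as the map file.
MAP_LINES = [
    r"/\bbitcoin\b/i",
    r"/\br[-_]?e[-_]?zept[-_]?frei\b/i",
    r"/\bchance.+kostenlos\b/i",
]

print(matching_patterns("Re-zept-frei bestellen", MAP_LINES))
print(matching_patterns("Meeting notes for Tuesday", MAP_LINES))

# Without a group, '|' splits the whole expression: the first branch keeps
# only the leading \b and the second branch only the trailing \b.
ungrouped = re.compile(r"\blotto|lottery\b", re.IGNORECASE)
grouped = re.compile(r"\b(?:lotto|lottery)\b", re.IGNORECASE)
print(bool(ungrouped.search("Lottozahlen")), bool(grouped.search("Lottozahlen")))
```

Running a handful of legitimate subjects from your own inbox through such a check is a cheap way to spot patterns that are broader than intended.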

Again, if you’re uncomfortable using it as a block / blacklist, you can either lower the associated score so that it stays below your global spam threshold, or you can convert this map into a prefilter and send the matching mails into greylisting, which also gets rid of 95-99% of spam mails.

TLD Blocklist

Speaking of prefilters and greylisting, let’s talk about my crudest blocklist, where I apply special treatment to mails coming from certain top level domains. Here is the corresponding entry in local.d/multimap.conf:

SENDER_TLD_FROM {
  type = "from";
  filter = 'email:domain:tld';
  prefilter = true;
  map = "$LOCAL_CONFDIR/local.d/local_bl_tld_from.map.inc";
  regexp = true;
  description = "Local tld from blacklist";
  action = "greylist";
}

And here is the list of “blocked” top level domains:

[.]tr$
[.]su$
[.]mom$
[.]mg$
[.]com\.py$
[.]af$
[.]ng$
[.]ro$
[.]ar$
[.]pro$

For whatever reason, a disproportionate amount of spam comes from those top level domains. For me personally there is, again, very little chance of false positives, but since this is even cruder than the subject based blocking, I made it a prefilter, which means that it is evaluated before all other checks. I’ve set the action to greylist, which sends matching mails directly into greylisting, and that does the job very well. If a “good” mail comes from one of those top level domains, it should make it through the greylisting and all other modules.
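The `$` anchor is what keeps these patterns from matching in the middle of a domain. Here is a small Python sketch of that matching logic, assuming the value handed to the map ends with the sender’s domain. The function names `sender_domain` and `should_greylist` are hypothetical and only mimic the idea; they are not how rspamd implements it.

```python
import re

# A subset of the patterns from the map file above.
TLD_PATTERNS = [r"[.]tr$", r"[.]su$", r"[.]com\.py$", r"[.]pro$"]

def sender_domain(address):
    """Return the lowercased domain part of an email address."""
    return address.rsplit("@", 1)[-1].strip().lower()

def should_greylist(address, patterns=TLD_PATTERNS):
    """True if the sender's domain ends with one of the listed suffixes."""
    domain = sender_domain(address)
    return any(re.search(p, domain) for p in patterns)

for sender in ("promo@shop.example.tr",
               "info@example.com.py",
               "friend@trade.example.org"):
    print(sender, should_greylist(sender))
```

The last address contains “tr” in its domain but does not end in “.tr”, so it is left alone.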

Other Blocklists

I do have a few more blocklists for display names, domains and names (the part of an email address before the @) but they are quite short. For example I get a lot of spam mails from email addresses starting with “firewall@” so again I take care of those.

The multimap blocks for those look like this:

SENDER_FROM {
  type = "header";
  header = "from";
  filter = 'email:domain';
  map = "$LOCAL_CONFDIR/local.d/local_bl_from.map.inc";
  description = "Local from blacklist";
  score = 7;
}

SENDER_USER_FROM {
  type = "header";
  header = "from";
  filter = 'email:user';
  map = "$LOCAL_CONFDIR/local.d/local_bl_user_from.map.inc";
  description = "Local user from blacklist";
  score = 7;
}

SENDER_USER_DISPLAY_FROM {
  type = "header";
  header = "from";
  filter = 'email:name';
  map = "$LOCAL_CONFDIR/local.d/local_bl_from_display.map.inc";
  description = "Local user from display name blacklist";
  regexp = true;
  score = 7;
}

As mentioned before, this takes care of a very large portion of the spam that wasn’t detected otherwise, but it is by no means the only thing you can tune.

Tuning Symbol Scores

While looking at the history tab of rspamd’s web interface, I noticed that certain symbols were being added to spam mails without carrying enough weight to push the score over the threshold, and I thought they should be weighted higher. You can also paste the source of a spam mail that has slipped through the filter into the form field in the “Scan/Learn” tab of the web interface to see what score it gets and which symbols were added. If you spot certain symbols over and over again and feel that they should count more towards the overall score, head over to the Symbols tab and add custom scores to them.

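There are so many symbols that I don’t remember which ones I changed, because I used the web interface. I should have done that in a config file right away, but it’s too late now. You can be smarter than me and keep your custom scores in a config file. Whichever file you use, confirm with rspamadm configdump afterwards that the new weights are really active: the rspamd documentation describes local.d/groups.conf as the place for symbol weight overrides, so the file name and syntax below may need adjusting for your version. My suggestion is to add a file local.d/scores.conf and add symbols and your custom scores as follows: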

ONCE_RECEIVED = 5.0; 
MANY_INVISIBLE_PARTS = 5.0;

etc etc.
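To see why raising a weight changes the outcome, it helps to remember that the final score is essentially the sum of the weights of all symbols that fired, compared against the action thresholds. The Python sketch below illustrates that arithmetic only. The thresholds, the “default” weights and the symbol SOME_OTHER_SYMBOL are made-up values for illustration, and real rspamd scoring has more nuances (group limits, dynamic symbol scores) than this shows.

```python
# Hypothetical action thresholds; look up the real ones in your own config.
ACTIONS = [("reject", 15.0), ("add header", 6.0), ("greylist", 4.0)]

def total_score(symbols, weights):
    """Sum the weights of all symbols that fired for a message."""
    return sum(weights.get(name, 0.0) for name in symbols)

def action_for(score, actions=ACTIONS):
    """Pick the action with the highest threshold that the score reaches."""
    for name, threshold in sorted(actions, key=lambda a: a[1], reverse=True):
        if score >= threshold:
            return name
    return "no action"

fired = ["ONCE_RECEIVED", "MANY_INVISIBLE_PARTS", "SOME_OTHER_SYMBOL"]

# Made-up low weights standing in for whatever was active before tuning.
before = {"ONCE_RECEIVED": 0.1, "MANY_INVISIBLE_PARTS": 1.0, "SOME_OTHER_SYMBOL": 2.0}
# The custom scores from the snippet above.
after = dict(before, ONCE_RECEIVED=5.0, MANY_INVISIBLE_PARTS=5.0)

print(action_for(total_score(fired, before)))
print(action_for(total_score(fired, after)))
```

With the made-up numbers, the same three symbols stay below every threshold before tuning and reach the “add header” threshold afterwards.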

Check/Configure the Fuzzy and Neural Modules

These modules are a cornerstone of rspamd’s effectiveness and therefore it’s worthwhile to check if they are indeed enabled and working. To do this run

rspamadm configdump neural

rspamadm configdump fuzzy_check

For recommended values check out the module documentation of both.

Ask the Mail Cow

Another great tip for getting more inspiration on how to fight spam with rspamd is to look into the repository of mailcow, a dockerized and pre-configured mail server setup. Many of their configuration choices have proven to be solid.

For example, you can take a look at their entire local.d folder and get inspiration, e.g. for tuning the fuzzy module. You might also find useful settings for your postfix and dovecot configs that might not have occurred to you. What I did was look at their configs, and when I saw options that sounded interesting and that I didn’t know, I looked them up in the postfix/dovecot/rspamd documentation to see if they’d be suitable for me as well.

I wouldn’t blindly copy all their settings, because many might not apply to your scenario, and without understanding what they do you can make your setup worse or break it entirely. Don’t change too many things at once. Make one change at a time, then test and confirm that it is working as intended. Use rspamd’s web interface to scan and check mails and to feed the fuzzy and neural modules.

Auto Learn From Users’ Spam

This is another great option for training your spam filter. There are ways to automatically scan junk folders and feed their contents to rspamd, but I am not using this, as all the previous methods already work well enough for me. Spam mails are usually quite distinguishable from “proper” mail with the methods mentioned so far. But if you have a medium to large multiuser setup with a diverse user base (region, language, age), you might be receiving very diverse spam, and auto learning from user classified spam might get you the last few percent.

You could even implement it the way gmail does, by flagging mail in user mailboxes after delivery once enough users have marked the same mail as spam. However, a lot more effort is required if you want to preserve data privacy, which means a bit of scripting, but it is possible.
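To give an idea of what such scripting could look like, here is a rough Python sketch of the counting part only. Everything in it is an assumption for illustration: the class name `SpamVotes`, the threshold of three users, and the choice to identify a mail by its Message-ID (spam campaigns often use a different ID per recipient, so a fuzzy hash of the content would probably be more robust). To keep things a bit more private, it stores only salted hashes of the user and the message identifier, never the plain values.

```python
import hashlib
from collections import defaultdict

class SpamVotes:
    """Count how many distinct users reported the same message as spam."""

    def __init__(self, threshold=3, salt=b"change-me"):
        self.threshold = threshold
        self.salt = salt
        # hashed message identifier -> set of hashed user names
        self._voters = defaultdict(set)

    def _digest(self, value):
        """Salted SHA-256, so plain user names and IDs are never stored."""
        return hashlib.sha256(self.salt + value.encode("utf-8")).hexdigest()

    def report(self, user, message_id):
        """Record one report; return True once enough distinct users agree."""
        voters = self._voters[self._digest(message_id)]
        voters.add(self._digest(user))  # repeated reports by one user count once
        return len(voters) >= self.threshold

votes = SpamVotes(threshold=2)
print(votes.report("alice", "<123@spam.example>"))
print(votes.report("alice", "<123@spam.example>"))
print(votes.report("bob", "<123@spam.example>"))
```

Once `report` returns True, a separate step (not shown here) would flag the message in the remaining mailboxes and could feed it to rspamd for learning.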

I hope that helps some of you to drastically reduce your spam. It did for me and I was surprised that some of the dullest methods were the most effective ones.

Questions?

I’m sure I haven’t answered all your questions, and it’s not easy to cover everything. The rspamd config documentation isn’t easy to digest or to understand in its entirety, and I wouldn’t claim I’ve reached the pinnacle of understanding, but what I’ve done is enough that I don’t get a single spam email in my inbox for days in a row. Whenever one slips through the cracks, I adjust one of the modules mentioned above.

Feel free to ask if you have any remaining questions in the comments or via the usual channels and let me know what things you have tuned to great effect. Sharing is caring 🙂

Oh and of course feel free to correct any errors I might have made!

Special thanks to @leah@chaos.social who saved my sanity during my config debugging session where I tried to figure out which modules are actually active and working.

SMYCK

a blog by John-Paul Bader