What is this?
Hi, my name is Tom Lazar and I'm a Plone and Zope developer based in Berlin, Germany and this is my personal and professional (no big difference, really...) website.
 

Back on Track

"Spam? What spam? There is no spam here..."

There now, that's better!

  PID USERNAME PRI   SIZE    RES    CPU COMMAND
93942 spamd     96 43560K 42836K  0.00% perl5.8.6
95119 cyrus      4 39308K  4304K  0.00% imapd
93943 spamd      4 32956K 32276K  0.00% perl5.8.6
93940 spamd      4 32364K 31628K  0.00% perl5.8.6
93941 spamd      4 30064K 29356K  0.00% perl5.8.6

See, I told ya, it's possible ;-) Actually, the strategy was pretty simple and straightforward, since we've already been doing the more obvious steps such as using spamd instead of spamassassin. Our goal was to

  • reduce the memory footprint of the spamd processes
  • use a local DNS resolver to speed up name lookups

While the latter was just a matter of getting it over and done with, the former proved to be a bit more tricky. We opted to try what happens if we limit the size of mails processed by spamd. Tests had shown that even when just one of our clients sent out a single email with a large(ish) attachment (say, 4 to 6 MB), the corresponding spamd child would eat up to 500 MB of RAM and max out the CPU.

It just turned out that our particular setup with Exim doesn't seem to be as well documented as the other two (it was only introduced with Exim 4.5.x). Basically, you compile Exim with WITH_CONTENT_SCAN=yes and can then use the spam directive in your Exim configuration, rather than setting up various transports and routers where you pipe the mail through spamc (the lightweight client to spamd).

After hunting in vain for the place where Exim's spamc parameters can be modified, I finally stumbled over the crucial bit here. I could tell that it was going to work, because it immediately seemed so obvious in hindsight. God bless Ronan McGlue ;-)

The logic behind it is this: since the occurrence of the spam directive is what actually triggers the call to spamc, simply add a condition to that ACL such as the following one:

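  # Exim evaluates ACL conditions in order, so the size check
  # has to come before "spam" to keep large mails away from spamd.
  warn  message   = X-Spam-Score: $spam_score ($spam_bar)
        condition = ${if <{$message_size}{100k}{1}{0}}
        spam      = nobody:true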
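
To make the effect of that size check easier to reason about, here is a minimal Python sketch of the same gate. It is not Exim code: should_scan and parse_size are names made up for this illustration, and it assumes that a k suffix means 1024 bytes, so 100k is 102400 bytes.

```python
# Illustration only: mimics the ACL's size gate, it is not Exim itself.
# Assumption: a "k" suffix means 1024 bytes and "m" means 1024 * 1024.

SUFFIXES = {"k": 1024, "m": 1024 * 1024}


def parse_size(text: str) -> int:
    """Turn a size such as '100k' into a byte count."""
    text = text.strip().lower()
    if text and text[-1] in SUFFIXES:
        return int(text[:-1]) * SUFFIXES[text[-1]]
    return int(text)


def should_scan(message_size: int, limit: str = "100k") -> bool:
    """True if the message is small enough to be handed to spamd."""
    return message_size < parse_size(limit)


if __name__ == "__main__":
    # A small mail, two mails right at the boundary, and a 5 MB attachment.
    for size in (2_000, 102_399, 102_400, 5 * 1024 * 1024):
        verdict = "scan" if should_scan(size) else "skip"
        print(f"{size:>8} bytes -> {verdict}")
```

Note that the comparison is strict, so a mail of exactly 100k is already skipped. Anything that is skipped simply never gets the X-Spam-Score header.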

After that everything looked dandy, i.e. the load was fine even when lots of mails were pouring in. However, the spamd children still consumed 200 to 400 MB of RAM each, which produced lots of swap activity. I opted to cut the number of connections a spamd child handles before it gets replaced from the default of 200 to 100. This means that it will respawn sooner: the freshly spawned process weighs in at just some 40 MB and then starts to grow. If this setup holds up well during the next days, I'll try increasing the number of child processes from four to six or eight.

spamd_flags="-m4 --max-conn-per-child=100 --ident-timeout=30 \
-x -d -r /var/run/spamd/spamd.pid -u spamd \
-H /var/spool/spamd"
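
Before raising -m from four to six or eight, a bit of back-of-the-envelope arithmetic helps. The sketch below uses the per-child figures mentioned in this post (roughly 40 MB for a fresh child, up to 400 MB for a grown one). The 2048 MB budget is a purely hypothetical number chosen for illustration, not a measurement from this server, so plug in your own.

```python
# Rough worst-case memory estimate for a pool of spamd children.
# Per-child figures are the ones quoted in the post; RAM_BUDGET_MB is a
# hypothetical value, not a measurement from the actual machine.

FRESH_CHILD_MB = 40
GROWN_CHILD_MB = 400
RAM_BUDGET_MB = 2048  # assumption: adjust to your machine


def worst_case_mb(children: int, per_child_mb: int = GROWN_CHILD_MB) -> int:
    """Memory used if every child has reached the given size."""
    return children * per_child_mb


def fits(children: int, budget_mb: int = RAM_BUDGET_MB) -> bool:
    """True if fully grown children still fit into the budget."""
    return worst_case_mb(children) <= budget_mb


if __name__ == "__main__":
    print("children  fresh_mb  grown_mb  fits")
    for children in (4, 6, 8):
        fresh = worst_case_mb(children, FRESH_CHILD_MB)
        grown = worst_case_mb(children)
        print(f"{children:>8}  {fresh:>8}  {grown:>8}  {fits(children)}")
```

Under these assumptions four grown children already need 1600 MB, which is why it seems sensible to watch how large the children get with the lower connection limit before adding more of them.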

Again, even though it's a pain in the ass having to deal with such nuisances, I can't deny that I actually enjoy tinkering with this kind of stuff ;-) And as usual, a big shout-out to Cryx, I really wouldn't wanna do this with anyone else ;-)