cache-fu
Mar 16, 2008
nginx and varnish on Mac OS X
Notes on how to create a local deployment setup for Zope on Mac OS X for debugging purposes.
Any non-trivial deployment of Plone usually consists of at least a web server 'sitting in front of' it, often enough accompanied by some sort of caching service. For my own purposes I have settled on nginx for the former and varnish for the latter[1].
They have proven to be a reasonably good team for me, especially in combination with recent versions of CacheFu, Plone's de facto default caching product. By 'reasonably good' I mean that varnish and CacheFu in their 'normal' setup (i.e. with the 'official' Plone .vcl config) offer a reasonable trade-off between performance and potentially stale content.
I noticed this behaviour particularly when looking at Zope's access log (i.e. var/log/primary-Z2.log) while accessing my latest Plone installation at wahlcomputer.ccc.de. There was far too much going on: in theory, no requests should reach the backend until some content has changed. In essence, the site was much slower than it needed to be, because Zope kept answering requests that varnish should have been able to satisfy on its own.
However, in order to debug the caching and purging of content in detail I needed a local Mac OS X based setup that mirrored the one on the FreeBSD box serving the site as closely as possible. I habitually took notes while setting it up on my desktop machine and am now retracing my steps as I attempt to duplicate it on my MacBook (just like a bug isn't fixable until it's reproducible, I never consider my admin work done until I have been able to do it at least twice). I'm confident that somebody out there will find the following notes useful, too (even if it's just myself six months^wweeks down the road)...
Since I'm a happy user of the macports collection already anyway, I let it do the 'heavy lifting' of actually installing nginx and varnish. In addition I provided a launchd startup item for varnish and also added a host entry for wahlcomputer to enable virtual hosting for nginx and varnish. Here it goes:
sudo port install nginx
sudo port install varnish
DNS
Before continuing the setup I first wanted to make sure that I could use the domain name wahlcomputer to access my machine. Instead of doing it 'properly' by adding a new record to OpenLDAP, and thus possibly opening up yet another can of worms, I simply edited ye good ol' /etc/hosts file and added the following line:
127.0.0.1 wahlcomputer
And sure enough, even in the days of 'OpenLDAP FTW!!' Mac OS X still honours entries made to the hosts file so I could move on.
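To double-check an entry like this without guessing how the resolver behaves, the hosts file can be parsed directly. The following is only a minimal sketch: `parse_hosts` and the sample text are made up for illustration and are not part of any Mac OS X tooling.

```python
def parse_hosts(text):
    """Map each hostname found in hosts-file text to its IP address."""
    mapping = {}
    for line in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            # Keep the first address seen for a given name.
            mapping.setdefault(name, ip)
    return mapping


# Hypothetical sample; pass open("/etc/hosts").read() to check the real file.
SAMPLE = """\
127.0.0.1 localhost
127.0.0.1 wahlcomputer  # local alias for the Plone site
"""

hosts = parse_hosts(SAMPLE)
print(hosts.get("wahlcomputer"))
```

If the name is missing from the mapping, the line was not saved where expected.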
nginx
Next, I configured the freshly installed nginx. As the port already comes with a launchd startup item, I just needed to load it:
sudo launchctl load /opt/local/etc/LaunchDaemons/org.macports.nginx/org.macports.nginx.plist
It's my practice to leave the default files as untouched as possible and put all my configurations into separate files that I simply include:
sudo mv /opt/local/etc/nginx/nginx.conf.default /opt/local/etc/nginx/nginx.conf
sudo mkdir /opt/local/etc/nginx/includes
sudo touch /opt/local/etc/nginx/includes/wahlcomputer.conf
sudo mkdir /opt/local/var/log/nginx/
To make nginx aware of those files I added the following line to nginx.conf, just before the final closing }:
include etc/nginx/includes/*.conf;
Here are the contents of wahlcomputer.conf in their full, unabridged glory (and VHM gore):
server {
    server_name wahlcomputer.ccc.de wahlcomputer;
    location / {
        proxy_pass http://127.0.0.1:8071/VirtualHostBase/http/wahlcomputer:80/foo/VirtualHostRoot/;
    }
}
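The long proxy_pass target follows the VirtualHostMonster URL scheme: backend address, then VirtualHostBase, the public scheme, the public host and port, the path to the site inside Zope, and finally VirtualHostRoot. A small sketch that assembles such a URL from its parts may make the pieces easier to see; the helper name `vhm_url` is invented for this example.

```python
def vhm_url(backend, scheme, public_host, public_port, site_path):
    """Build the backend URL that VirtualHostMonster rewrites on."""
    site = site_path.strip("/")
    return (
        f"http://{backend}/VirtualHostBase/{scheme}/"
        f"{public_host}:{public_port}/{site}/VirtualHostRoot/"
    )


# Same values as in the nginx configuration: Zope on port 8071, site id "foo".
print(vhm_url("127.0.0.1:8071", "http", "wahlcomputer", 80, "foo"))
```

Switching the backend to another port later only changes the first argument.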
For the changes to take effect, I needed to restart nginx like so (why, oh why, does launchctl with all its bells and whistles not have a simple reload or restart command?!):
sudo launchctl stop org.macports.nginx
sudo launchctl start org.macports.nginx
Now visiting http://wahlcomputer/ showed the front page of the Plone site and I could move on to varnish:
varnish
Unlike nginx, the varnish port (currently) does not come with a startup item, so I made my very own (by blatantly copying from the nginx setup):
sudo mkdir /opt/local/etc/varnish
sudo touch /opt/local/etc/varnish/default.vcl
sudo mkdir /opt/local/etc/LaunchDaemons/org.macports.varnish
sudo touch /opt/local/etc/LaunchDaemons/org.macports.varnish/org.macports.varnishd.plist
sudo chown root:wheel /opt/local/etc/LaunchDaemons/org.macports.varnish/org.macports.varnishd.plist
sudo chmod 644 /opt/local/etc/LaunchDaemons/org.macports.varnish/org.macports.varnishd.plist
sudo ln -s /opt/local/etc/LaunchDaemons/org.macports.varnish/org.macports.varnishd.plist /Library/LaunchDaemons/
sudo mkdir /opt/local/var/varnish
sudo chown -R _www /opt/local/var/varnish
Looking at varnishd's man page I saw that there are some options that must be passed in upon startup. I did this by adding them to the ProgramArguments array in the /opt/local/etc/LaunchDaemons/org.macports.varnish/org.macports.varnishd.plist file:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Disabled</key>
    <false/>
    <key>KeepAlive</key>
    <false/>
    <key>Debug</key>
    <false/>
    <key>Label</key>
    <string>varnishd</string>
    <key>OnDemand</key>
    <false/>
    <key>GroupName</key>
    <string>staff</string>
    <key>UserName</key>
    <string>_www</string>
    <key>ProgramArguments</key>
    <array>
        <string>/opt/local/sbin/varnishd</string>
        <string>-a</string>
        <string>localhost:6081</string>
        <string>-T</string>
        <string>localhost:6082</string>
        <string>-f</string>
        <string>/opt/local/etc/varnish/default.vcl</string>
    </array>
    <key>RunAtLoad</key>
    <false/>
</dict>
</plist>
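Hand-editing plist XML is error-prone, so here is a hedged alternative: the same job description as a Python dict, serialised with the standard library's plistlib. This is just a sketch of the idea; I wrote the file above by hand, and the variable names are my own.

```python
import plistlib

# Mirrors the keys and values of the hand-written plist above.
job = {
    "Disabled": False,
    "KeepAlive": False,
    "Debug": False,
    "Label": "varnishd",
    "OnDemand": False,
    "GroupName": "staff",
    "UserName": "_www",
    "ProgramArguments": [
        "/opt/local/sbin/varnishd",
        "-a", "localhost:6081",   # address varnish listens on
        "-T", "localhost:6082",   # management interface
        "-f", "/opt/local/etc/varnish/default.vcl",
    ],
    "RunAtLoad": False,
}

xml_bytes = plistlib.dumps(job)  # XML is the default output format

# Round-trip check: parsing the output gives back the same structure.
assert plistlib.loads(xml_bytes) == job
print(xml_bytes.decode("utf-8"))
```

The same `plistlib.loads` call is a quick way to confirm that a hand-edited plist is still well-formed before handing it to launchctl.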
Now all that was missing was to add some directives to the still empty file /opt/local/etc/varnish/default.vcl. It ended up being too long to quote here in its entirety, so instead I will simply point you to a nice variation of the 'official' zope-plone.vcl that I found, which sports some neat enhancements.
Finally I could start up varnish:
sudo launchctl load /opt/local/etc/LaunchDaemons/org.macports.varnish/org.macports.varnishd.plist
In the nginx wahlcomputer.conf file all I needed to change was the port number from 8071 (the Zope instance) to 6081 (varnish).
By keeping a tail on the access logs of nginx and my Zope instance and looking at the response headers with Firebug I can now start to tweak my setup to my heart's content. Stay tuned.
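Watching a tail works for spot checks; for a quick tally of what actually reaches the backend, a few lines of Python can count requests per path and status. This is a rough sketch that assumes the log lines follow the common/combined log format; the sample lines and the function name `count_backend_hits` are invented for illustration.

```python
import re
from collections import Counter

# Picks the method, path and status out of: "GET /path HTTP/1.0" 200
REQUEST = re.compile(r'"([A-Z]+) (\S+) [^"]*" (\d{3})')


def count_backend_hits(lines):
    """Count requests per (path, status) pair in access-log lines."""
    hits = Counter()
    for line in lines:
        match = REQUEST.search(line)
        if match:
            _method, path, status = match.groups()
            hits[(path, status)] += 1
    return hits


# Hypothetical sample lines; in practice iterate over open("var/log/primary-Z2.log").
SAMPLE = [
    '127.0.0.1 - Anonymous [16/Mar/2008:10:00:00 +0100] "GET /foo/front-page HTTP/1.0" 200 18234 "" "Mozilla/5.0"',
    '127.0.0.1 - Anonymous [16/Mar/2008:10:00:05 +0100] "GET /foo/front-page HTTP/1.0" 200 18234 "" "Mozilla/5.0"',
    '127.0.0.1 - Anonymous [16/Mar/2008:10:00:06 +0100] "GET /foo/logo.jpg HTTP/1.0" 304 0 "" "Mozilla/5.0"',
]

for (path, status), count in count_backend_hits(SAMPLE).most_common():
    print(count, status, path)
```

A path that shows up repeatedly with status 200 while its content has not changed is a candidate for something the cache should have answered.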
[1] A bit of an explanation is perhaps necessary as to why I even bother with two types of machinery in front of Plone. The short answer is 'separation of concerns'. While varnish could theoretically take over most of the features that I use nginx for (namely URL rewriting, logging and static content delivery), I would still need nginx for SSL/HTTPS. Also, the logging feature of varnish requires an extra varnishlog process anyway.
Oct 29, 2006
Proposal for a Zope3-based caching strategy
- Cache-Fu is entirely Five-agnostic. This means that it won't set any caching headers for Five-based templates, including basesyndication's feed templates.
- Feeds get a whole lot of repeated access, but change much more rarely than they get accessed (even on more prolific blogs than mine ;-)
- fatsyndication's default adapter has a very generic but also very expensive implementation for a feedsource's modification date (it simply queries all of its contained objects and then sorts them by modification date) because it doesn't (and shouldn't!) "know" anything about specific implementation details
- tomster.org currently contains roughly 500 blog entries.
- tomster.org/blog's feeds currently receive ca. 30,000 requests per month combined
Modified-Since headers for blog entries, the blog view, the archives and, last but not least, the feed templates. As a result, the majority of accesses to tomster.org/blog are now served from Squid and the overall CPU usage has gone down considerably (also, I like to think that the sites hosted here have become quite snappy™ ;-)
cacheable for the time being) would provide default adapters for the standard content types providing the effective modification date for any object. For non-folderish content that would be (in most cases) simply their Dublin Core modification date. For folderish content the default adapter would return the youngest modification date of its children. This modification date could be inserted into any template rendering said object, thus enabling caching – so far, so good.
cacheable product would have to register itself for creation and modification events. Each of these adapters would contain some kind of logic that would determine whether the object that triggered the event should be considered relevant to it (i.e. the interface could define something like isRelevant(request, object)). If that is the case, it would update its effective modification date annotation to now (and perhaps trigger some invalidation mechanism inside the caching tool such as squid or varnish).
isRelevant() method could check whether the object in question is one of the most recent 15 blog entries (or whatever the syndication tool tells it should be considered). If so, it would set its own effective modification date to now.
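To make the proposal a little more concrete, here is a plain-Python sketch of the two ideas: a default rule for the effective modification date, and an event handler that only bumps that date for relevant objects. None of this is existing Zope or CacheFu API. The classes `Item`, `Folder` and `FeedCacheInfo` are hypothetical stand-ins for content objects and adapters, the `created` attribute is my assumption for what "most recent" means, and the request argument of isRelevant() is left out for brevity.

```python
from datetime import datetime, timezone


class Item:
    """Stand-in for non-folderish content."""

    def __init__(self, created, modified):
        self.created = created
        self.modified = modified


class Folder(Item):
    """Stand-in for folderish content such as a weblog."""

    def __init__(self, created, modified, children=()):
        super().__init__(created, modified)
        self.children = list(children)


def effective_modified(obj):
    """Own date for plain items, youngest child date for folders."""
    children = getattr(obj, "children", None)
    if not children:
        return obj.modified
    return max(effective_modified(child) for child in children)


class FeedCacheInfo:
    """Stand-in for an adapter subscribed to creation/modification events."""

    def __init__(self, blog, limit=15):
        self.blog = blog
        self.limit = limit
        self.effective = effective_modified(blog)

    def is_relevant(self, obj):
        # Only the most recently created `limit` entries appear in the feed.
        recent = sorted(self.blog.children, key=lambda e: e.created, reverse=True)
        return any(obj is entry for entry in recent[: self.limit])

    def handle_event(self, obj, now=None):
        # Bump the effective date only when the feed output could change.
        if self.is_relevant(obj):
            self.effective = now or datetime.now(timezone.utc)
        return self.effective


def day(n):
    return datetime(2006, 10, n, tzinfo=timezone.utc)


entries = [Item(created=day(n), modified=day(n)) for n in range(1, 21)]
blog = Folder(created=day(1), modified=day(1), children=entries)
info = FeedCacheInfo(blog)

info.handle_event(entries[0], now=day(25))   # old entry: feed stays cacheable
info.handle_event(entries[-1], now=day(26))  # recent entry: date is bumped
print(info.effective)
```

The point of the design is that the expensive "query everything and sort" step happens when content changes, not on every feed request.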
Oct 01, 2006
Debugging Squid and CacheFu locally on Mac OS X
From the Real-Developers-Do-It-Locally-Department
When using Squid and CacheFu it can be quite handy to have a duplicate (or at least similar) setup on your local development machine in order to debug header settings etc. If your development platform is Mac OS X, as is mine, you might find the following how-to useful.
After some trial and error I came up with an approach that puts all the customization into the CacheFu part of the equation and thus allows for the luxury of a simple ./configure ; make install installation of Squid. (The reason we need to install Squid from source is that CacheFu explicitly requires version 2.5 and e.g. DarwinPorts has already upgraded its squid package to version 2.6.) Having said this, here are the necessary steps:
- Download and expand the Squid 2.5 sources which can be found here
- Inside the expanded package, simply issue the standard
./configure ; make ; sudo make install
- Next, we need to adapt CacheFu’s squid.cfg file to match Mac OS X’s environment and what the previous step produced:
[python]
binary: /usr/bin/python2.3
[squid]
binary: /usr/local/squid/sbin/squid
user: nobody
config_dir: /usr/local/squid/etc
log_dir: /usr/local/squid/var/logs
cache_dir: /usr/local/squid/var/cache
cache_size_mb: 1000
direct: True
port: 3128
admin_email: somebody@somewhere.org
[supported-protocols]
http: 80
https: 443
[accelerated-hosts]
HOSTNAME.local: localhost:ZOPEPORT/SITENAME/
The only changes you will need to make are the accelerated-hosts entries, where you obviously will want to replace HOSTNAME, ZOPEPORT and SITENAME with values appropriate to your setup.
- Now we can let CacheFu do its magic and generate the actual squid configuration for us (kudos to Geoff for writing this script!):
python makeconfig --config=squid.cfg --templates=templates/ --output=output/
- Before deploying we need to adjust ownerships and create a missing directory:
sudo mkdir /usr/local/squid/var/cache
sudo chown -R nobody /usr/local/squid/var
- Now we can deploy the generated setup:
cd output ; sudo python deploy
- … let squid create its necessary files and directories:
/usr/local/squid/sbin/squid -z
- … and finally start up squid:
sudo /usr/local/squid/sbin/squid
(the sudo is necessary to enable the squid process to bind to port 80)
Now you can access your Plone instance via http://HOSTNAME.local and debug the cache headers it sends.
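An easy mistake in the steps above is to run makeconfig while squid.cfg still contains the HOSTNAME, ZOPEPORT or SITENAME placeholders. The sketch below checks for that. It reads the file with Python's standard configparser, which is an assumption on my part and not necessarily how makeconfig parses it; `unfilled_placeholders` and the sample values are made up.

```python
import configparser

PLACEHOLDERS = ("HOSTNAME", "ZOPEPORT", "SITENAME")


def unfilled_placeholders(cfg_text):
    """List placeholders still present in the [accelerated-hosts] section."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep keys as written instead of lowercasing
    parser.read_string(cfg_text)
    found = []
    for host, target in parser["accelerated-hosts"].items():
        entry = f"{host}: {target}"
        found.extend(p for p in PLACEHOLDERS if p in entry)
    return found


TEMPLATE = """\
[squid]
port: 3128

[accelerated-hosts]
HOSTNAME.local: localhost:ZOPEPORT/SITENAME/
"""

# Hypothetical filled-in values for a machine called "mybook".
FILLED = TEMPLATE.replace("HOSTNAME", "mybook").replace("ZOPEPORT", "8080").replace("SITENAME", "Plone")

print(unfilled_placeholders(TEMPLATE))
print(unfilled_placeholders(FILLED))
```

An empty list means the section is ready for the makeconfig run.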
Aug 20, 2006
More Performance tweaking
From the Never-Ending-Story-Department
Well, sorry to bore you with this, but since tomster.org is the only Plone instance experiencing these issues, it is also the only instance I can use to debug them...
To recap: whenever this Plone instance is active, CPU-usage on the host is erratic and constantly jumps between almost zero and 99%. After optimizing some feed performance issues on Friday and then sticking a squid instance in front of the site for good measure and then leaving everything on for a couple of hours I got this:

The first two peaks were due to lengthy compile runs on the system (while tomster.org was off). Then from Tuesday to Wednesday I turned off the static maintenance page and rerouted all tomster.org traffic to the Plone instance. Wednesday to Friday I switched it on and off again, experimenting with CacheFu settings, until I finally installed squid on Friday night. Saturday morning, before heading off to an offline weekend in Stolzenhagen, I switched everything off...
My next approach is to deactivate the tomster.org site product and see what happens if I serve the site using the native Plone skin. It's Sunday, 10pm, I think I can risk leaving it on for a couple of hours overnight... We'll see what the morning brings...
Aug 18, 2006
Back (once again)
Serving syndicated content non-statically can be dangerous
So, tomster.org is back again... kind of... After lots of experimenting with CacheFu and the Firefox plugin HTTP LiveHeaders and looking at my Apache logs I finally found out what could have been the cause of the dismal performance of this site:
- tomster.org currently has an average of ca. 3900 pageviews per day
- 25% of which are for my atom or rss feeds
- none of which ever returned a 304 Not Modified code!
But having narrowed down the problem, I was now able to take measures. After a bit of RTFM (CacheFu's that is) and looking at its control panel I found the solution: I had to add the feed-ids to the list of cacheable templates, like so:

This was possible because the site product for tomster.org provides its own ZPT templates for the atom feeds, which take precedence over the atom.xml Five view that Quills' basesyndication product provides. Sadly, I haven't (yet) found a way to make Five views cacheable -- which is why the RSS feed of this site is still being (re-)generated for every request. Luckily, 80% of feed views access the atom feeds and not the RSS.
So, if anybody has any idea on how to make CacheFu cache Five views, please speak up!
While I was playing around with CacheFu I added the WeblogEntry content type to its 'content rule' and Weblog and WeblogArchive to its 'container rule' -- with the result that also the front page is now able to return 304s.
Now, the way I understand it, this won't speed up access for first-time visitors at all, but clicking back and forth on tomster.org has become noticeably snappier -- and, of course, all those feedreaders out there will only retrieve a feed if it has actually changed.
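For anyone unfamiliar with how a 304 comes about: the client sends the date of its cached copy in an If-Modified-Since header, and the server answers 304 Not Modified without a body if nothing has changed since. The following is a minimal sketch of that decision using only Python's standard library. It is not CacheFu's implementation, and the function name `respond` is made up.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime


def respond(last_modified, if_modified_since=None):
    """Return (status, headers) for a GET, honouring If-Modified-Since."""
    # HTTP dates only have one-second resolution.
    last_modified = last_modified.replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    if if_modified_since:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # unparsable header: send the full response
        if since is not None:
            if since.tzinfo is None:
                since = since.replace(tzinfo=timezone.utc)
            if last_modified <= since:
                return 304, headers  # client copy is still fresh
    return 200, headers


feed_changed = datetime(2006, 8, 18, 10, 0, 0, tzinfo=timezone.utc)

# First visit: no validator yet, so the full feed is sent.
status, headers = respond(feed_changed)
print(status, headers["Last-Modified"])

# Feedreader polls again with the date it was given earlier.
print(respond(feed_changed, headers["Last-Modified"])[0])

# A new entry was published in the meantime.
print(respond(feed_changed + timedelta(hours=1), headers["Last-Modified"])[0])
```

The saving comes from the second case: the feed body never has to be rendered or transferred.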
I hope we can sort this out for the Five views, though -- or else any Quills instance could quickly turn a Plone site into a snail -- and we don't want that...
Aug 13, 2006
Down for maintenance
I wish I had more interesting stuff to blog about than the status of this blog itself... This is just to inform you that, in my ongoing attempt to wrangle the performance problems I'm still experiencing on this host, I've decided to temporarily deactivate tomster.org for a day or two. All requests (except for the atom and RSS feeds) are currently detoured to a static "down-for-maintenance" page.
I then want to compare the cpu-usage of, say 48 hours during normal business days with the same period of the week before.
The good news is that meanwhile I've seriously streamlined the setup on this machine, so that currently every other site hosted here except for my own is enjoying increased performance thanks to CacheFu and Apache's mod_disk_cache ;-)
Next up: putting Squid instead of Apache in front of Zope.
