About me

1994 - 1997 <fanf2@cam.ac.uk>   comp. sci. Trinity College
1997 - 2000 <fanf@demon.net>    web server admin
2000 - 2001 <fanf@covalent.net>  Apache httpd developer
2002 - now  <fanf2@cam.ac.uk>   Computing Service

1997 . . . <dot@dotat.at>
1999 . . . <fanf@apache.org> (httpd)
2002 . . . <fanf@FreeBSD.org>
2004 . . . <fanf@exim.org>
2006 . . . <fanf@apache.org> (SpamAssassin)

The Computing Service

Established in 1937 as the Mathematical Laboratory, at which point computers were human or mechanical. It didn't get going until after the war. It ran the world's first computing service provided by a stored-program machine: EDSAC, 1949 (picture). The Maths Lab split into a semi-detached Computer Lab and Computing Service in 1970, and the two were properly separated when the Computer Lab moved to a new site in 2001.

The Computing Service


  • Cambridge-wide 33km University-owned network
  • Internet, Telephones, email, security monitoring

  • Main University web site & some departmental web sites
  • distributed workstation clusters & central file server

  • IT skills training & support for users and other IT staff
  • hardware maintenance & software licence administration

  • Not administrative computing: finance, staff & student records

Because of our academic history, administrative computing is not in our remit. The University is very federal. Departments have a lot of autonomy. The colleges are legally separate organizations from the University (which leads to silly things like us having to charge them sales tax). We also provide some services to semi-detached organizations like the local Medical Research Council units. This means we (mostly) do not have a monopoly, and institutions can deploy their own facilities if ours don't do what they want.

Hermes


central email service

Hermes is the University's central email service, run by my colleague David Carter and me. It is a standards-based service running open source software on Linux. Purely email, e.g. no calendaring. The main topic of this talk is how Hermes is put together.

Hermes statistics


  • 37600 accounts, 29000 active users
  • ~ 1 000 000 logins per day
  • ~ 500 000 messages per day
    12 per sec. at peak times
  • 1GB+ storage quota, 25MB message size limit
  • 38TB RAID10 storage
    + 8TB backup server + 4TB in old servers
  • 30 servers + 16 being decommissioned
  • 8 racks of equipment

Slightly out of date photograph. On the left are some of the message store servers that are about to be decommissioned. That kind of 2U server is typical of most of the Hermes machines. Our current message store machines are now 4U servers with 16 disks each, like the one towards the bottom right. Above it is our backup server, before it grew another shelf of disks.

In the rest of the talk I'll explain each part of this block diagram in turn.

Cyrus


message store

Cyrus is the IMAP server developed by Carnegie Mellon University, and named after the great Persian king who was supposedly the first to establish a postal service in the 6th century BC. It's a relatively complex piece of software specifically designed to scale up to large heavily-used installations.

We replaced the original UW IMAP Hermes architecture in 2003 with this architecture based on Cyrus 2.1. My colleague David Carter added some significant enhancements: delayed expunge, replication, and some performance improvements. These have been incorporated into Cyrus 2.3, the current version, which we are now migrating to.

The delayed expunge feature makes it easy for us to recover email that users have accidentally deleted. There's typically a handful of these each week.
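
A minimal sketch of the delayed expunge idea, not the Cyrus implementation: an expunged message is hidden from the user but kept for a while so that it can be restored. The `Mailbox` class and the one-week `RETENTION_SECONDS` window are assumptions made up for illustration.

```python
import time

RETENTION_SECONDS = 7 * 24 * 3600  # hypothetical retention window


class Mailbox:
    """Toy mailbox showing the delayed-expunge idea (not Cyrus code)."""

    def __init__(self):
        self.messages = {}   # uid -> message body, visible to the user
        self.expunged = {}   # uid -> (body, time of expunge), hidden

    def add(self, uid, body):
        self.messages[uid] = body

    def expunge(self, uid, now=None):
        # Hide the message from the user but keep it for later recovery.
        now = time.time() if now is None else now
        self.expunged[uid] = (self.messages.pop(uid), now)

    def unexpunge(self, uid):
        # An administrator restores an accidentally deleted message.
        body, _ = self.expunged.pop(uid)
        self.messages[uid] = body

    def purge(self, now=None):
        # Permanently remove messages expunged longer ago than the window.
        now = time.time() if now is None else now
        old = [uid for uid, (_, when) in self.expunged.items()
               if now - when > RETENTION_SECONDS]
        for uid in old:
            del self.expunged[uid]
        return old
```

Recovery is then just a matter of calling `unexpunge` before the periodic `purge` catches up with the message.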

The replication facility is illustrated in the diagram. Our message store is organized into pairs of machines: a live master and a hot spare replica which is continuously updated. This arrangement allows us to recover quickly from failures. We also use replication to copy email to the backup server for fast spooling to tape for off-site disaster recovery backups. Replication makes it easy for us to upgrade machines without disrupting service, by doing a controlled switch to the hot spare. In total we have about six copies of everyone's email (2 on master RAID, 2 on spare RAID, plus backup server and tape).
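
The pairing and the copy count can be sketched as follows. This is only an illustration under stated assumptions: the `StorePair` class and the host names in the usage note are hypothetical, and the real switch involves far more than swapping two names.

```python
from dataclasses import dataclass


@dataclass
class StorePair:
    """A message store pair: one live master plus one hot spare."""
    master: str
    replica: str

    def switch(self):
        # Controlled switch: the hot spare becomes the live master and
        # the old master becomes the replica (e.g. while it is upgraded).
        self.master, self.replica = self.replica, self.master


def copies_per_message(raid_mirrors=2, backup_server=1, tape=1):
    # Master and replica each hold the data on mirrored disks (RAID10),
    # then there is one copy on the backup server and one on tape.
    return 2 * raid_mirrors + backup_server + tape
```

For example, `StorePair("store1a", "store1b").switch()` leaves `store1b` as the master, and `copies_per_message()` reproduces the six copies counted above.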

ppswitch


mail router

The other important user-facing part of Hermes is called "ppswitch". PPswitch dates back to 1992, a year before Hermes; the name refers to PP, the software it ran to gateway between Internet email and the old JANET "grey book" protocol. We haven't run PP for over ten years but the name has stuck.

About 11,500 of our users use an MUA other than webmail: 8000 IMAP users and 3500 POP users. We need to hide from them the fact that their IMAP or POP server could be any one of dozens of computers and could change without notice. Instead of logging in to the message store directly, they log in to a proxy running on ppswitch which relays the connection to the correct back end server. The proxy was developed by David Carter, and is similar in functionality to Perdition. Cyrus also comes with a proxy which is rather more advanced.
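
The core of such a proxy is a lookup from user name to back end server. The sketch below shows only that step, and it is an assumption about the general technique rather than a description of David's code; the `BACKENDS` table, the user names and the host names are all made up.

```python
# Hypothetical user-to-back-end table; all names are invented.
BACKENDS = {
    "abc123": "cyrus-7.internal.example",
    "xyz789": "cyrus-12.internal.example",
}


def backend_for(username, table=BACKENDS):
    """Return the (host, port) of the message store holding this user's mail."""
    try:
        return table[username], 143  # 143 is the standard IMAP port
    except KeyError:
        raise LookupError(f"no such user: {username}") from None


def move_user(username, new_host, table=BACKENDS):
    # Moving a mailbox only changes the table; the user's mail client
    # keeps pointing at the proxy and never notices.
    table[username] = new_host
```

After the user has authenticated, the proxy would connect to `backend_for(username)` and relay traffic in both directions.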

As well as handling IMAP and POP connections, ppswitch is the University's central email hub.

Exim


  • Top-4 MTA, alongside
    Sendmail, Postfix, Exchange
  • Lots of features
  • Very configurable
  • Less mad than sendmail

  • PCRE - Perl-compatible
    regular expressions

Exim and PCRE were both written by my colleague Philip Hazel, who retired just over a month ago.

PPswitch functions


  • 260,000 messages from the University per day
  • 300,000 legit messages from outside per day
  • 5,500,000 spam messages filtered per day

  • 198 virtual domains
  • 63 departmental email servers

  • SSL/TLS and authentication

Most of the spam is dealt with by Exim using DNS blacklists and various email address validity checks. We also use MailScanner (developed at the University of Southampton) which is a framework for running SpamAssassin and anti-virus software.
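
For readers unfamiliar with DNS blacklists, the lookup convention is to reverse the octets of the sender's IP address, append the blacklist's zone, and see whether the resulting name resolves. Exim does this itself from its configuration; the Python below only illustrates the convention, and the zone `dnsbl.example` in the usage note is a placeholder rather than a list we use.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name to look up for an IPv4 address in a blacklist zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip, zone):
    # A listed address resolves (conventionally to 127.0.0.x).
    # Any lookup failure is treated here as "not listed".
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except OSError:
        return False
```

So a connection from 192.0.2.99 checked against `dnsbl.example` leads to a lookup of `99.2.0.192.dnsbl.example`.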

Prayer


webmail software

Prayer is our webmail software, also developed by David Carter. It is named in honor of the "WING" webmail software developed at Oxford University, and because of the hopeful nature of the project. Hence "a wing and a prayer".

Prayer was developed in 2000-2001, when we were still running UW IMAP. Therefore it was specifically designed to put as little stress on the IMAP server as possible, mainly by using persistent IMAP connections instead of re-connecting and re-scanning the mailbox for every web request, as other webmail servers usually do.
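
The persistent connection idea can be sketched as a cache of open back end connections keyed by webmail session. Prayer itself is written in C, so this Python is only an illustration of the technique; `SessionCache` and its `connect` callback are hypothetical names.

```python
class SessionCache:
    """Keep one open back-end connection per logged-in webmail session."""

    def __init__(self, connect):
        self.connect = connect      # function: username -> open connection
        self.connections = {}       # session id -> open connection

    def get(self, session_id, username):
        # Reuse the connection opened by an earlier web request if there
        # is one; only the first request of a session pays for the login
        # and mailbox scan.
        conn = self.connections.get(session_id)
        if conn is None:
            conn = self.connect(username)
            self.connections[session_id] = conn
        return conn

    def logout(self, session_id):
        # Close and forget the connection when the session ends.
        conn = self.connections.pop(session_id, None)
        if conn is not None:
            conn.close()
```

With this arrangement a session that makes a hundred web requests opens one IMAP connection, not a hundred.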

Prayer itself is also very lightweight, since it is written in C and has a built-in custom web server - instead of being based on Apache and PHP like other webmail systems. We run it on a single dual-CPU machine with 8GB RAM.

Prayer's user interface is also very lightweight, with no JavaScript and few images. It's rather old-fashioned now. David is hoping to overhaul it next year, and add important missing features like international character set support.

About 25,000 of our users use webmail, and 18,000 use it exclusively. It handles about 135,000 logins each day, and about 5,000,000 web hits.

Although we haven't made much effort to publicize it, Prayer is used by a few other universities, including York and Queen's University, Belfast.

Mailman


mailing lists

We've slightly adapted Mailman to work with the University's central web authentication service, but it's otherwise fairly vanilla. There are about 3600 mailing lists on Mailman, and 5000 on the obsolete lists system (many of which are no longer used).

Future work


I've already mentioned the webmail overhaul. We're on a rolling upgrade path for the message store, and we aim to have multi-gigabyte quotas as standard next year. The anti-spam setup also needs work, to replace MailScanner with Exim's improved anti-spam features.

Tony Finch <fanf2@cam.ac.uk>
8 Nov 2007: Email at Cambridge