Report from my talk to the Computer Lab Security Group
on email forgery protection
======================================================

$Cambridge: hermes/doc/antiforgery/csgtalk.txt,v 1.2 2004/09/22 14:50:24 fanf2 Exp $

These notes are from the perspective of my aim as the giver of the
talk to get some useful advice from some experts. I'm glad to say
that this was a success. The preparation of the talk was also useful
for clarifying my thoughts, and some of these points are also
included below.

The slides for the talk can be viewed at
http://www.cus.cam.ac.uk/~fanf2/hermes/doc/talks/2004-08-cl-csg/

Other documents on this topic can be found at
http://www.cus.cam.ac.uk/~fanf2/hermes/doc/antiforgery/

There are two general approaches to the design: open and closed.

Open:
        uses a tagged syntax to expose as many semantics as possible
        to the recipient
                e.g. expiry time, message data hash, pubkey sig
        allows the recipient of a message to participate in
        verification
                e.g. any of the above invalid
        to reduce the load on the verification servers
        requires standardized extensions

Closed:
        uses only the barest of syntaxes to allow addresses to be
        unsigned by recipients
        all verification must be done by the home site
        possibly allows better address use/abuse data collection
        less support for heterogeneous implementations?

Cryptography -- public key signatures:

There was little enthusiasm in the audience for public key crypto.
Perhaps I could have explained better the way it would be used, which
is similar to DomainKeys -- per-domain public keys advertised in the
DNS.

It was noted that the size problem that makes pubkey sigs difficult
to put in the local part of an address is much less severe if the
signature is put in the domain part.

Cryptography -- re-keying:

There was general agreement from the audience that frequent re-keying
(whether by PRNG or by distribution of new random keys) is an
excessively complicated technique that only serves to obfuscate the
timestamp.
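
For contrast with re-keying, here is a minimal Python sketch of a
signed address that uses one long-lived key and keeps the expiry time
in the clear, in the spirit of the open design above. The
"+x<day>.<tag>" syntax, the MASTER_KEY value, the truncated
HMAC-SHA256 tag and the 14-day default lifetime are all assumptions
made for illustration; none of them was proposed at the talk.

```python
import hashlib
import hmac

# Hypothetical long-lived site key; this sketch never rotates it.
MASTER_KEY = b"example-master-key"
TAG_BYTES = 8  # truncated MAC length, an arbitrary choice for the sketch
DAY = 86400


def _tag(local: str, domain: str, expiry: int) -> str:
    """MAC over the unsigned address and the expiry day number."""
    msg = f"{local}@{domain}|{expiry}".encode("utf-8")
    mac = hmac.new(MASTER_KEY, msg, hashlib.sha256).digest()
    return mac[:TAG_BYTES].hex()


def sign_address(address: str, now: int, lifetime_days: int = 14) -> str:
    """Append an assumed '+x<day>.<tag>' signature to the local part."""
    local, domain = address.rsplit("@", 1)
    expiry = now // DAY + lifetime_days
    return f"{local}+x{expiry}.{_tag(local, domain, expiry)}@{domain}"


def verify_address(signed: str, now: int):
    """Return the unsigned address if valid and unexpired, else None."""
    if "@" not in signed:
        return None
    local_part, domain = signed.rsplit("@", 1)
    local, sep, rest = local_part.rpartition("+x")
    if not sep:
        return None
    expiry_text, dot, tag = rest.partition(".")
    if not dot or not (expiry_text.isascii() and expiry_text.isdigit()):
        return None
    expiry = int(expiry_text)
    # The expiry is in the clear, so this check needs no key.
    if now // DAY > expiry:
        return None
    expected = _tag(local, domain, expiry)
    # Constant-time comparison; this check does need the key.
    if not hmac.compare_digest(tag.encode("utf-8"), expected.encode("ascii")):
        return None
    return f"{local}@{domain}"
```

With the expiry visible, a recipient could reject a stale address
without contacting the home site; only the tag check requires the
key, so it stays with the home site's verification service.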

Cryptography -- alternative techniques:

A more compact signature format than we have so far considered was
suggested. It is suitable for use in a closed design.

A message ID (IDm) including at least a timestamp is created. A
session key (Ks) is created by hashing the system's master key (Km)
with the unsigned address of the sender (Au).

        Ks = HMAC(Km, Au)

The signature (S) is the message ID encrypted with the session key
(a little extra framing may be necessary to ensure that we know a
correct plaintext has been obtained after decryption).

        S = E_Ks(IDm)

The signed address (As) is the unsigned address with the signature
inserted. The message ID can be recovered at verification time
because As includes Au in the clear, and Km is available to the
verification process.

Verification -- address lifetimes:

It was noted that even a two week lifetime may be too short: Demon
Internet spools email for customers for up to a month.

Verification -- rate limiting:

The standard list of techniques for physical intrusion detection is

        deter - detect - delay - respond

In our situation the deterrence is the use of signed addresses, the
detection is through monitoring of the verification service, and the
response is to invalidate the address early. Computer security only
rarely includes the delay step (examples include honeypots).

It was noted that before we completely disable a compromised address,
we are able to limit use of it, e.g. if we are not yet sure that a
spammer is abusing it. All the verification services we are
considering have the possibility of returning an indeterminate
result, e.g. a 450 response to a callback, or even no response at
all. This even allows us to gradually fade out an address over time
rather than going from full-on to completely off.

Verification -- signatures in the domain part:

This allows us to get a much more comprehensive data feed about the
use of addresses than any of the other verification techniques.
Callback verification is rare.
A custom verification service will never be popular. Verification by
recipients gives us no data at all.

However it was noted that putting signatures in the domain part will
be detrimental to the cacheability of email domains, and might even
be viewed as imposing the cost of callback verification -- i.e. the
delay in looking up data from a remote site's DNS server rather than
a local cache, and the consequent increase in the number of
concurrent SMTP connections.

Human factors -- location of the signature:

If a site supports sub-addressing, for example

        abc123+subaddress@dept.cam.ac.uk

and if the standard signature format places the signature towards the
start of the email address, then users may be fooled into thinking
after a glance that

        abc123+d41d8cd98f00b204e9800998ecf8427e.xyz456@dept.cam.ac.uk

is from xyz456, not abc123.

-- end --