Report from my talk to the Computer Lab Security Group
on email forgery protection
======================================================

$Cambridge: hermes/doc/antiforgery/csgtalk.txt,v 1.2 2004/09/22 14:50:24 fanf2 Exp $


These notes are written from my perspective as the speaker: my aim
was to get some useful advice from some experts, and I'm glad to say
that this was a success. Preparing the talk was also useful for
clarifying my thoughts, and some of the resulting points are included
below.

The slides for the talk can be viewed at
http://www.cus.cam.ac.uk/~fanf2/hermes/doc/talks/2004-08-cl-csg/

Other documents on this topic can be found at
http://www.cus.cam.ac.uk/~fanf2/hermes/doc/antiforgery/


There are two general approaches to the design: open and closed.

Open:
	uses a tagged syntax to expose as many semantics as possible
	to the recipient
		e.g. expiry time, message data hash, pubkey sig

	allows the recipient of a message to participate in
	verification, to reduce the load on the verification servers
		e.g. rejecting if any of the above is invalid

	requires standardized extensions
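
As a rough illustration of recipient-side participation in an open
design, the sketch below parses a tagged local part and rejects an
address whose expiry time has passed, without contacting the home
site. The tag syntax (`+` then `.`-separated `key=value` fields), the
tag names `x` (expiry, Unix time) and `s` (signature), and the
function names are all assumptions made for this example; no such
syntax has been agreed.

```python
def parse_tags(address):
    """Split a tagged address into (user, domain, tags).

    Hypothetical syntax: user+key=value.key=value@domain
    """
    local, _, domain = address.rpartition("@")
    user, sep, tagged = local.partition("+")
    tags = {}
    if sep:
        for field in tagged.split("."):
            key, eq, value = field.partition("=")
            if eq:
                tags[key] = value
    return user, domain, tags


def recipient_check(address, now):
    """Return "invalid" if the exposed expiry time has passed.

    Anything the recipient cannot decide locally is "unknown" and
    would still need checking by some other means (e.g. the home site).
    """
    _, _, tags = parse_tags(address)
    try:
        expiry = int(tags["x"])  # "x" is an assumed tag name
    except (KeyError, ValueError):
        return "unknown"
    return "invalid" if expiry < now else "unknown"
```

Only the expired case is settled locally here; a recipient that also
held the domain's public key could check the signature tag in the same
way.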

Closed:
	uses only the barest of syntaxes to allow addresses to be
	unsigned by recipients

	all verification must be done by the home site
		possibly allows better address use/abuse data collection

	less support for heterogeneous implementations?


Cryptography -- public key signatures:

There was little enthusiasm in the audience for public key crypto.
Perhaps I could have explained better the way it would be used, which
is similar to DomainKeys -- per-domain public keys advertised in the
DNS.

It was noted that the size problem that makes pubkey sigs difficult to
put in the local part of an address is much less severe if the
signature is put in the domain part.


Cryptography -- re-keying:

There was general agreement from the audience that frequent re-keying
(whether by PRNG or by distribution of new random keys) is an
excessively complicated technique that only serves to obfuscate the
timestamp.


Cryptography -- alternative techniques:

A more compact signature format than we have so far considered was
suggested. It is suitable for use in a closed design.

A message ID (IDm) including at least a timestamp is created.

A session key (Ks) is created by hashing the system's master key (Km)
with the unsigned address of the sender (Au).
	Ks = HMAC(Km, Au)

The signature (S) is the message ID encrypted with the session key
(a little extra framing may be necessary to ensure that we know a
correct plaintext has been obtained after decryption)
	S = E_Ks(IDm)

The signed address (As) is the unsigned address with the signature
inserted.

The message ID can be recovered at verification time because As
includes Au in the clear, and Km is available to the verification
process.
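
The scheme above can be sketched as follows. This is a minimal
illustration under several assumptions that were not part of the
suggestion: the message ID is a 4-byte timestamp plus a 2-byte
sequence number, the "extra framing" is a 2-byte constant, the cipher
E is a toy 4-round Feistel network built from HMAC-SHA256 (standing in
for a real block cipher), and the signed address has the made-up form
local=signature@domain. The names sign, verify and MAGIC are
hypothetical.

```python
import hashlib
import hmac
import struct

MAGIC = b"\xa5\x5a"  # assumed framing bytes; 16 bits is for illustration only


def session_key(km, au):
    """Ks = HMAC(Km, Au)."""
    return hmac.new(km, au.encode("utf-8"), hashlib.sha256).digest()


def _round(ks, i, half):
    # Feistel round function: 4 bytes of HMAC over round number and half-block.
    return hmac.new(ks, bytes([i]) + half, hashlib.sha256).digest()[:4]


def _xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt_block(ks, block):
    """Toy 8-byte block cipher (stand-in for E_Ks)."""
    left, right = block[:4], block[4:]
    for i in range(4):
        left, right = right, _xor(left, _round(ks, i, right))
    return left + right


def decrypt_block(ks, block):
    left, right = block[:4], block[4:]
    for i in reversed(range(4)):
        left, right = _xor(right, _round(ks, i, left)), left
    return left + right


def sign(km, au, timestamp, seq):
    """Build As from Au: S = E_Ks(IDm), inserted into the local part."""
    local, domain = au.rsplit("@", 1)
    idm = struct.pack(">IH", timestamp, seq) + MAGIC  # IDm plus framing
    s = encrypt_block(session_key(km, au), idm).hex()
    return f"{local}={s}@{domain}"


def verify(km, signed, now, lifetime):
    """Return (Au, timestamp, seq) if As is valid and unexpired, else None."""
    local_sig, domain = signed.rsplit("@", 1)
    local, sep, s = local_sig.rpartition("=")
    try:
        block = bytes.fromhex(s)
    except ValueError:
        return None
    if not sep or len(block) != 8:
        return None
    au = f"{local}@{domain}"  # Au is carried in the clear inside As
    idm = decrypt_block(session_key(km, au), block)
    if idm[6:] != MAGIC:  # framing check: did we get a correct plaintext?
        return None
    timestamp, seq = struct.unpack(">IH", idm[:6])
    if not 0 <= now - timestamp <= lifetime:
        return None
    return au, timestamp, seq
```

With these sizes the signature adds 17 characters to the local part.
Because Ks depends on Au, moving a signature onto a different address
makes the framing check fail except by chance.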


Verification -- address lifetimes:

It was noted that even a two week lifetime may be too short: Demon
Internet spools email for customers for up to a month.


Verification -- rate limiting:

The standard list of techniques for physical intrusion detection is
	deter - detect - delay - respond

In our situation the deterrence is the use of signed addresses, the
detection is through monitoring of the verification service, and the
response is to invalidate the address early. Computer security only
rarely includes the delay step (examples include honeypots).

It was noted that before we completely disable a compromised address,
we are able to limit use of it e.g. if we are not yet sure that a
spammer is abusing it. All the verification services we are
considering have the possibility of returning an indeterminate result,
e.g. a 450 response to a callback, or even no response at all.

This even allows us to gradually fade out an address over time rather
than going from full-on to completely off.
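
One way the fade-out could work is sketched below: between two times
the verifier answers with a temporary failure increasingly often, and
after the second time it rejects outright. The linear ramp, the
function name fade_verdict, and the use of SMTP-style reply codes
(250 accept, 450 indeterminate, 550 reject) as return values are
assumptions made for the example, not a design we have settled on.

```python
import random


def fade_verdict(now, fade_start, fade_end, rng=random.random):
    """Pick a verification result for an address that is being faded out.

    Before fade_start the address is fully valid; after fade_end it is
    fully invalid; in between, the chance of an indeterminate answer
    rises linearly from 0 to 1.
    """
    if now <= fade_start:
        return 250
    if now >= fade_end:
        return 550
    fraction_elapsed = (now - fade_start) / (fade_end - fade_start)
    return 450 if rng() < fraction_elapsed else 250
```

The rng parameter is there only so that the behaviour can be tested
with a fixed value in place of a random one.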


Verification -- signatures in the domain part:

This allows us to get a much more comprehensive data feed about the
use of addresses than any of the other verification techniques.
Callback verification is rare. A custom verification service will
never be popular. Verification by recipients gives us no data at all.

However it was noted that it will be detrimental to the cacheability
of email domains, and might even be viewed as imposing the cost of
callback verification -- i.e. the delay in looking up data from a
remote site's DNS server rather than a local cache, and the consequent
increase in the number of concurrent SMTP connections.


Human factors -- location of the signature:

If a site supports sub-addressing, for example
	abc123+subaddress@dept.cam.ac.uk

and if the standard signature format prepends the signature to the
email address, then users may be fooled at a glance into thinking
that
	abc123+d41d8cd98f00b204e9800998ecf8427e.xyz456@dept.cam.ac.uk
is from xyz456 not abc123.


-- end --