The following are some very early fragments of a draft specification
related to signed addresses and email address verification. They need
to be split up into separate documents and greatly fleshed out, and are
not usable in their current state. I'm keeping them for reference purposes.

$Cambridge: hermes/doc/antiforgery/oldstuff.txt,v 1.1 2004/08/16 16:24:34 fanf2 Exp $


Abstract
--------

Signed sender addresses are unforgeable email addresses which can be
used to verify that a message comes from where it claims to. A site
that uses signed reverse paths on all outgoing messages can detect
collateral spam without any special co-operation with other Internet
sites. An MTA that is receiving a message allegedly from a site that
implements signed reverse paths can detect that it is forged using
existing address verification techniques. Signed addresses can also be
used to verify the legitimacy of the "From" and "Sender" addresses in
a message header.


Contents
--------

0. . . . . . Introduction and overview
1. . . . . . Signed Sender Addresses
1.1. . . . . Constructing SSAs
1.2. . . . . Interoperable SSA format
2. . . . . . Email address verification
2.1. . . . . Address verification authority
2.2. . . . . General verification requirements
2.3. . . . . Local address verification and SSAs
2.4. . . . . Remote address verification using callouts
2.5. . . . . Call-back verification and SSAs
2.6. . . . . Call-forward verification and SSAs
3. . . . . . Signed envelope sender
3.1. . . . . Detecting collateral spam
3.2. . . . . Detecting forged messages
4. . . . . . MUAs and SSAs
4.1. . . . . Message submission
4.2. . . . . SSAs in the message header
4.3. . . . . An SMTP extension for creating SSAs
5. . . . . . Compatibility considerations
5.1. . . . . Old callout implementations
5.2. . . . . Teergrubing and callouts
5.3. . . . . Greylisting and SSAs
5.4. . . . . Forwarding and signed reverse paths
5.5. . . . . The SMTP VRFY command
6. . . . . . Security considerations
R. . . . . . References


0. Introduction
---------------

XXX


1. Signed sender addresses
--------------------------

For the purpose of this specification, email addresses are divided
into the following categories:

"Recipient addresses" are advertised for use as the destination of
email messages. They appear as the arguments to SMTP RCPT TO commands
in the message envelope. Most of the addresses in the message header
are also recipient addresses; in particular note that addresses in
From: and Sender: headers are recipient addresses because they may be
used as the destination of replies.

"Sender addresses" are used to identify the source of a message. They
appear as the argument to the SMTP MAIL FROM command in the message
envelope, which is placed into the message's Return-Path: header on
final delivery. They are also used as the destination of delivery
status notifications ("bounces"). They tend to be less visible
to users.

Past practice is to use the same set of addresses as recipients and
senders. This is insecure since recipient addresses are widely
advertised and easily guessed, which means that it is trivial to
misrepresent the source of a message, or to send someone bounces
unrelated to any message they sent.

"Signed sender addresses" (abbreviated "SSAs") are intended to solve
this problem. They have the following properties:

(1) They MUST be distinct from all recipient addresses, and MUST NOT
be used as a recipient address as defined above.

(2) They SHALL only be creatable by systems acting legitimately on
behalf of the address's user, as described in the next subsection.

(3) They MUST be verifiable by systems that receive email for
addresses at that domain as described in section 2.

(4) They MUST be further protected against replay attacks, i.e. use of
harvested SSAs for illegitimate purposes:

(4a) A signed sender address MUST be used for a limited number of
messages.

(4b) It MUST have a limited lifetime.

(4c) It SHOULD NOT be stored persistently, e.g. in mailboxes; instead
the result of verifying the address before it expires SHOULD be
stored (see section 4).

(4d) Even when they are stored they certainly SHOULD NOT be published,
e.g. in mailing list archives.

(5) SMTP implementations SHOULD provide a mechanism for revoking SSAs
before the end of their usual lifetime, to deal with abuse in case the
anti-replay protections fail. This mechanism MAY be (partially)
automated.

It is not possible to be completely protected against replay attacks
on SSAs because email delivery is open-loop: the source and
destination of a message do not communicate directly with each other.
In fact there may be an arbitrary number of intermediate systems which
the sender will often not be able to predict, especially when the
message is forwarded by the initial recipient. Thus the sender cannot
practically restrict an SSA's validity to a particular set of hosts.

In addition to that, although message delivery is usually very quick
it may take a long time if the destination mailbox has quota problems
or is only intermittently connected to the Internet. Therefore it
is not possible to place tight bounds on the lifetime of an SSA.

There are similar problems with restricting the number of times an
SSA is validated or the number of times it receives a bounce.  An
SSA may be tested by the MTA at each hop on the message's outward
journey, so there will be multiple partial bounce attempts to a
given SSA. There may even be multiple complete bounce messages, for
example if a recipient of a message has quota problems there can
be one or more delay notifications before the final non-delivery
notification. Despite the existence of a standard format for DSNs
[RFC3464], bounces are still notoriously difficult to make any sense
of automatically.

Therefore a combination of techniques is used to protect SSAs from
abuse. The most important are (4c) and (4d) which prevent SSAs from
being harvested in the first place by viruses or spammers. If despite
this an address is harvested, (4b) limits the time in which it can be
used, and (4a) means that it can be revoked without causing problems
for uncompromised messages.

Signed sender addresses are not entirely new. A weak form is already
used by some mailing list managers for automated bounce handling.
A different form of signed reverse path from the one defined here
has been suggested as a work-around for the incompatibility between
forwarding and the proposed use of MTA authorization records in the
DNS. Making a distinction between sender addresses and recipient
addresses is also not new: Some systems are already refusing to accept
bounces to addresses that are never used to send email, such as
mailing list management contact addresses.

The next subsection describes various possible ways of meeting the
requirements listed above, and the one after it specifies a family
of SSA formats which is RECOMMENDED for maximum interoperability
between software from different vendors.


1.1. Constructing SSAs
----------------------

xxx


1.2. Interoperable SSA format
-----------------------------

xxx

This section specifies two signed sender address formats. These
formats are not necessary for communication across the public
Internet, since the verification of SSAs only requires co-operation
between senders and the systems that host their mailboxes, i.e. within
the same organization. However an interoperable SSA format is
necessary so that organizations can expect software from multiple
vendors to work together.

The first format is intended to indicate to ISSA-aware software that
the address is an interoperable SSA (an "ISSA") without stating
anything about how the address was constructed. The second format
does specify how the address was
constructed. It is intended to fulfil the requirements in the
introduction to this section while being simple to configure. The
formats are specified in ABNF [RFC2234], based on the syntax rules in
[RFC2821] section 4.1.2.

ISSAs are "based on" a recipient address, which indicates the mailbox
that will receive any bounces. This address is prefixed with a string
containing the additional verification information that makes it
secure. There are two styles of prefixing, depending on whether the
local-part of the basis address is quoted or not.


    Mailbox		=	Local-part "@" Domain

    Local-part		=	Dot-string / Quoted-string

    Quoted-string	=	DQUOTE *qcontent DQUOTE

    Mailbox		=/	ISSA-Mailbox

    ISSA-Mailbox	=	ISSA-Local-part "@" Domain

    ISSA-Local-part	=	ISSA-Dot-string / ISSA-Quoted-string

    ISSA-Dot-string	=	ISSA-Prefix Dot-string

    ISSA-Quoted-string	=	DQUOTE ISSA-Prefix *qcontent DQUOTE


The Dot-string or *qcontent of an ISSA-Mailbox is the same as that of
the Local-part of the Mailbox it is based on.


    ISSA-Prefix		=	ISSA-Tag "." ISSA-Data "."

    ISSA-Tag		=	"SSA" ISSA-Version

    ISSA-Version	=	1*DIGIT

    ISSA-Data		=	Atom


The ISSA-Tag is used by ISSA-aware software to distinguish between
ISSAs and other addresses (which might be SSAs but not compliant with
this format). ISSA-aware software MAY use the syntax described so far
to recover the Mailbox that the ISSA-Mailbox is based on. ISSA-aware
software that creates ISSAs that are not compliant with the rest of
this subsection MUST use an ISSA-Version of "0".
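
As an illustration of recovering the basis Mailbox from the generic
syntax above, here is a minimal Python sketch. The function name
split_issa and the regular expression are assumptions made for this
example and are not part of the format. The atext character class is
the one used by Atom in the RFC 2821 grammar.

```python
import re

# atext, the character set that makes up an Atom.
ATEXT = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]"

# ISSA-Prefix = ISSA-Tag "." ISSA-Data "." followed by the basis local part.
# ABNF literals are case-insensitive, so the "SSA" tag is matched that way.
ISSA_BODY = re.compile(
    rf"(?i:SSA)(?P<version>[0-9]+)\.(?P<data>{ATEXT}+)\.(?P<basis>.+)",
    re.DOTALL,
)


def split_issa(mailbox):
    """Return (version, data, basis_mailbox), or None if not ISSA-shaped."""
    local, at, domain = mailbox.rpartition("@")
    if not at or not local or not domain:
        return None
    quoted = len(local) >= 2 and local.startswith('"') and local.endswith('"')
    body = local[1:-1] if quoted else local
    match = ISSA_BODY.fullmatch(body)
    if match is None:
        return None
    basis_local = match["basis"]
    if quoted:
        # ISSA-Quoted-string: the prefix sits inside the quotes.
        basis_local = '"' + basis_local + '"'
    return int(match["version"]), match["data"], basis_local + "@" + domain
```

Note that the sketch only recognizes the shape of the address. It says
nothing about whether the signature is valid, and an ordinary
recipient address that happens to start with "SSA1." would also match.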


    ISSA-Tag		=/	ISSA1-Tag

    ISSA-Data		=/	ISSA1-Data

    ISSA1-Tag		=	"SSA1"

    ISSA1-Data		=	ISSA1-Time "-" ISSA1-ID "-" ISSA1-Hash

    ISSA1-Time		=	3base32

    ISSA1-ID		=	1*base32

    ISSA1-Hash		=	26base32

    base32		=	%x41-5a / %x32-37
				; A-Z / 2-7


ISSA1-Data is encoded in case-insensitive base32 [RFC3548] for
robustness. (Though base64 encoding fits within an Atom's character
set, it is vulnerable to case-smashing and includes the "/" character
which is liable to cause trouble.)

The data consists of three parts: a timestamp, a message ID, and a
hash. The timestamp is an unsigned 15 bit number that counts the days
since the start of 1970 (so 1 Jan 1970 is 0). The message ID is an
arbitrary number chosen by the creator of the address. The hash is
computed as follows: a preliminary address is first constructed
according to the ISSA1 syntax except with a secret in place of the
hash, then the MD5 [RFC1321] of this address is computed and encoded
in base32 to form the hash of the final ISSA1 address. The secret is
shared between the creator of the ISSA1 address and the final delivery
system hosting the mailbox of the basis address, and is the same for
all addresses for which the creator is authorized according to site
policy.
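
The construction described above might be sketched in Python as
follows. Several details are assumptions made for this example
because the text does not fix them: the timestamp is encoded most
significant bits first, the message ID is supplied already encoded in
base32, the preliminary address is hashed as UTF-8 bytes, the base32
padding is dropped so that the hash is 26 characters, and the
max_age_days lifetime check is a hypothetical local policy. Only
unquoted local parts are handled.

```python
import base64
import hashlib
import hmac
from datetime import date

BASE32 = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"
EPOCH = date(1970, 1, 1)


def encode_days(days):
    """Encode a 15-bit day count as 3 base32 characters (ISSA1-Time)."""
    if not 0 <= days < 2 ** 15:
        raise ValueError("day count does not fit in 15 bits")
    # Assumption: most significant 5-bit group first.
    return "".join(BASE32[(days >> shift) & 0x1F] for shift in (10, 5, 0))


def decode_days(text):
    """Inverse of encode_days; raises ValueError on non-base32 input."""
    days = 0
    for ch in text.upper():
        days = (days << 5) | BASE32.index(ch)
    return days


def issa1_hash(time_part, id_part, secret, basis_local, domain):
    """MD5 of the preliminary address that carries the secret, in base32."""
    preliminary = f"SSA1.{time_part}-{id_part}-{secret}.{basis_local}@{domain}"
    digest = hashlib.md5(preliminary.encode("utf-8")).digest()
    # 16 bytes encode to 26 base32 characters once "=" padding is dropped.
    return base64.b32encode(digest).decode("ascii").rstrip("=")


def make_issa1(basis_local, domain, message_id, secret, today):
    """Build an ISSA1 address based on basis_local@domain."""
    time_part = encode_days((today - EPOCH).days)
    hash_part = issa1_hash(time_part, message_id, secret, basis_local, domain)
    return f"SSA1.{time_part}-{message_id}-{hash_part}.{basis_local}@{domain}"


def check_issa1(address, secret, today, max_age_days):
    """Recompute the hash and apply an assumed maximum age in days."""
    local, _, domain = address.rpartition("@")
    try:
        tag, data, basis_local = local.split(".", 2)
        time_part, id_part, hash_part = data.split("-")
        age = (today - EPOCH).days - decode_days(time_part)
    except ValueError:
        return False
    if tag.upper() != "SSA1" or len(time_part) != 3:
        return False
    expected = issa1_hash(time_part.upper(), id_part.upper(), secret,
                          basis_local, domain)
    same = hmac.compare_digest(expected.encode("utf-8"),
                               hash_part.upper().encode("utf-8"))
    return same and 0 <= age <= max_age_days
```

A 15 bit day count runs out roughly 89 years after 1970, so a real
implementation would also need to decide how to handle wrap-around.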


2. Email address verification
-----------------------------

Email address verification is simply ensuring that messages can be
delivered to the address successfully. In the case of a recipient
address these are normal messages, and for a sender address they
are bounces. In addition to that, signed sender addresses must have
a valid signature.

The advantages of SSAs depend on a mechanism for anyone to verify
their validity. Fortunately there are existing techniques for address
verification that work with SMTP as it is currently deployed. This
specification's effectiveness depends on these techniques being
used extensively across the Internet. The following subsections
clarify existing practice as a guide for new implementations.

The first part of this section defines some general terms used when
talking about address verification. The second part explains how
address verification translates into the responses to the SMTP
commands that form a message's envelope. The third subsection
describes how a system that implements signed sender addresses must
verify at SMTP time addresses at domains that are local to the system.
The next part describes how remote addresses are verified, and the
last two subsections explain how this basic technique differs for
sender and recipient addresses.


2.1. Address verification authority
-----------------------------------

The SMTP servers for a domain are advertised in the DNS using MX
records, or in their absence A and AAAA records, possibly indirected
via CNAME records. From the point of view of email address verification
on the public Internet these servers are considered to be authoritative,
in a similar way that the authoritative name servers for a zone are
advertised using NS records. This specification does not distinguish
between primary and secondary (fallback) MX hosts; they SHOULD all
be able to accurately verify addresses at domains for which they
are authoritative.

A "remote domain" is one for which an SMTP server cannot verify
addresses directly. Instead the server SHOULD contact another system
to do the verification. Systems that are not MTAs (in particular
MUAs) verify all addresses in this manner.

Conversely, an SMTP server's "local domains" are the domains for which
it can verify addresses without reference to other systems. (Clustered
servers are considered as single systems for this purpose.) This
usually implies that the server is the "delivery system" for addresses
in that domain, i.e. it puts email in a message store rather than
sending it on to another MTA.

However addresses at a local domain MAY be configured to "forward"
email to one or more other addresses; in this case the relayed message
retains its SMTP reverse path, though its envelope recipient addresses
are different. Contrast this with re-sending a message (e.g. from a
mailing list server, or from an MUA) which involves the message
re-entering the transport service environment with a new reverse path.

A domain might not be local to its advertised authoritative servers, for
example when the MX host is a firewall MTA behind which the delivery
system is hidden. This is similar to the "hidden master" configuration
of a domain's name servers. This specification refers to these servers
as "gateway systems".


2.2. General verification requirements
--------------------------------------

An SMTP server MUST verify addresses in the message envelope while
processing the MAIL FROM and RCPT TO commands, and reflect the result
of verification in the responses it makes to those commands: 25x
("accept") for a valid address, 55x ("reject") for an invalid address,
or 45x ("defer") for some temporary failure. As well as being
necessary for this specification to work, it avoids the creation of
collateral spam as explained in section 3.

However a MAIL FROM command SHOULD NOT provoke a 45x or 55x response;
instead the following RCPT TO commands should all be deferred or
rejected. This is because messages to the special postmaster mailbox
SHOULD always be accepted, so that system administrators can get
help resolving communication problems: it is not possible to implement
a more relaxed policy at RCPT TO time if a stricter policy has
already rejected MAIL FROM. Even if the system always applies a
strict verification policy, problems are easier to debug if the
SMTP server has logged a record of both the sender and recipient
addresses, even if the message is not accepted.
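
The following Python sketch shows one way a server might defer its
verdict on the sender until RCPT TO time, as described above. The
class name EnvelopePolicy, the two injected verification callbacks
and the reply texts are hypothetical; the callbacks are assumed to
return "valid", "invalid" or "tempfail".

```python
REPLIES = {
    "valid": "250 Accepted",
    "invalid": "550 Address rejected",
    "tempfail": "450 Verification deferred",
}


class EnvelopePolicy:
    """Replies for one mail transaction; verification is injected."""

    def __init__(self, verify_sender, verify_recipient):
        self.verify_sender = verify_sender
        self.verify_recipient = verify_recipient
        self.sender_verdict = None

    def mail_from(self, reverse_path):
        # Record the verdict but always accept, so that a later
        # RCPT TO for postmaster can still succeed.
        self.sender_verdict = self.verify_sender(reverse_path)
        return "250 OK"

    def rcpt_to(self, recipient):
        if self.sender_verdict is None:
            return "503 Bad sequence of commands"
        local_part = recipient.rpartition("@")[0] or recipient
        if local_part.lower() == "postmaster":
            return "250 Accepted"
        verdict = self.sender_verdict
        if verdict == "valid":
            # Only a good sender lets the recipient's own verdict show.
            verdict = self.verify_recipient(recipient)
        return REPLIES[verdict]
```

Because the sender and every recipient pass through the same object,
it is also a natural place to log both addresses for a transaction
that ends up being refused.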

An SMTP server MAY of course reject a RCPT TO command for policy
reasons even if the address is valid. For example a server SHOULD do
this to prevent itself from being used as an open relay. The client
may get the wrong idea about the validity of the address, but this is
either the server's intention or because the client is asking the
wrong server.

Temporary problems with address verification SHOULD cause the SMTP
server to give a 45x response to the relevant RCPT TO command(s).
However, as a matter of local policy a site MAY decide to return a 25x
response in order to try to keep email working in the presence of
breakage.

An SMTP server MAY verify addresses in the message header at SMTP
time, and reflect the results of verification in the response it gives
to <CRLF>.<CRLF>. However this is risky because of the synchronization
problem described in [RFC1047], so the requirement in section 6.1 of
[RFC2821] remains: an SMTP server MUST seek to minimize the time
required to respond to the final <CRLF>.<CRLF> end of data indicator.


2.3. Local address verification and SSAs
-----------------------------------------

An SMTP server that implements signed sender addresses for a local
domain MUST verify addresses in those domains as follows:

If the reverse path (MAIL FROM argument) is an address at this domain,
then it MUST be a valid SSA address.

If the reverse path is null, then any RCPT TO argument at this domain
MUST be a valid SSA address. In this case there MUST be only one RCPT
TO command, since a message has only one reverse path address and SSA
addresses cannot be forwarded.

If the reverse path is not null, then any RCPT TO argument at this
domain MUST NOT be an SSA address and MUST have a valid local part.

If such a recipient address forwards to another address then the SMTP
server SHOULD verify the target address in order to properly verify
the recipient, recursively if necessary. If this results in a remote
address the SMTP server SHOULD perform remote address verification as
described in the following subsections.

If a recipient address forwards to multiple addresses, then the SMTP
server MAY cease verification with a successful result.
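
The envelope rules in this subsection can be sketched as a single
check. This is only an illustration: the function name
check_envelope and its four predicate parameters are hypothetical
stand-ins for site configuration and for the SSA verification code,
and forwarding is not modelled. A null reverse path is represented
by the empty string.

```python
def check_envelope(reverse_path, recipients, *, is_local, is_ssa,
                   is_valid_ssa, has_valid_local_part):
    """Return a list of problems; an empty list means the envelope passes."""
    problems = []
    local_rcpts = [r for r in recipients if is_local(r)]
    if reverse_path == "":
        # Null reverse path: the message is a bounce (or a call-back).
        if local_rcpts and len(recipients) != 1:
            problems.append("bounce must have exactly one recipient")
        for rcpt in local_rcpts:
            if not is_valid_ssa(rcpt):
                problems.append(f"{rcpt}: bounce recipient is not a valid SSA")
    else:
        if is_local(reverse_path) and not is_valid_ssa(reverse_path):
            problems.append(f"{reverse_path}: local sender is not a valid SSA")
        for rcpt in local_rcpts:
            if is_ssa(rcpt):
                problems.append(f"{rcpt}: SSA used as an ordinary recipient")
            elif not has_valid_local_part(rcpt):
                problems.append(f"{rcpt}: unknown local part")
    return problems
```

Returning the reasons rather than a single boolean makes it easier to
log why a transaction was refused.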


2.4. Remote address verification using callouts
-----------------------------------------------

A callout is a partial SMTP mail transaction which is used for the
side-effect of verifying the envelope addresses, rather than to
transfer a message. It's necessary to phrase the callout so that the
address to be verified is the argument to a RCPT TO command, because
MAIL FROM commands are always accepted. Thus the general form of a
callout conversation is:

	S: 220 mx.example.com ESMTP service ready
	C: EHLO relay.example.net
	S: 250-mx.example.com Hello relay.example.net [192.0.2.130]
	S: 250 HELP
	C: MAIL FROM:<...callout.sender...>
	S: 250 OK
	C: RCPT TO:<...to.be.verified...>
	S: 250 Accepted
	C: QUIT
	S: 221 mx.example.com closing connection

An SMTP client SHOULD determine the target host for a callout using
the same algorithm it would use for routing a message whose recipient
is the address to be verified. Thus an MTA on the public Internet will
contact an authoritative SMTP server for the domain as described in
[RFC974].

If an SMTP client encounters a temporary failure during a callout, it
SHOULD re-try with the other SMTP servers (if any) for that domain in
the usual sequence. If the verification process takes too long or does
not return a definitive answer, the result of the callout as a whole
is a temporary failure.
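
A callout client along these lines could be sketched with Python's
standard smtplib module. The function names, the verdict strings and
the 30 second socket timeout are assumptions made for this example.
The list of hosts is taken as a parameter because finding the MX
hosts needs a DNS library that is outside the standard library, and
there is no overall time limit across hosts. A refused MAIL FROM is
treated here as a temporary failure, which is one of several possible
policies.

```python
import smtplib


def classify(code):
    """Map an SMTP reply code to a verification verdict."""
    if 200 <= code <= 299:
        return "valid"
    if 500 <= code <= 599:
        return "invalid"
    return "tempfail"


def callout_one(host, sender, address, helo_name, timeout):
    """Run one partial transaction against one host."""
    smtp = smtplib.SMTP(host, 25, local_hostname=helo_name, timeout=timeout)
    try:
        smtp.ehlo()
        code, _ = smtp.mail(sender)      # sender "" gives MAIL FROM:<>
        if classify(code) != "valid":
            return "tempfail"
        code, _ = smtp.rcpt(address)
        return classify(code)
    finally:
        try:
            smtp.quit()                  # never send DATA, just leave
        except (OSError, smtplib.SMTPException):
            pass


def callout(hosts, sender, address, helo_name, timeout=30.0):
    """Try each host in order; return "valid", "invalid" or "tempfail"."""
    for host in hosts:
        try:
            verdict = callout_one(host, sender, address, helo_name, timeout)
        except (OSError, smtplib.SMTPException):
            verdict = "tempfail"
        if verdict != "tempfail":
            return verdict
    return "tempfail"
```

Only a definitive 2xx or 5xx reply to RCPT TO stops the loop; anything
else moves on to the next host, and running out of hosts yields a
temporary failure for the callout as a whole.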

An SMTP server cannot tell the difference between a callout and a full
mail transaction until it is too late; therefore it may in turn perform
callouts as part of its verification procedures. This means that in
most cases there will be a chain of callouts all the way to the
ultimate delivery system to get the answer. There may also be callouts
for verifying the sender address from each host along this chain.
Because a full callout chain is a lot of work to repeat for each
message transfer, SMTP servers SHOULD cache the results of callouts.
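
A callout cache might look like the following sketch. The class name
CalloutCache, the lifetimes and the choice not to cache temporary
failures are all assumptions; this document does not specify them.
Entries are keyed by the kind of address ("sender" or "recipient") as
well as the address, because the two kinds are verified differently.

```python
import time


class CalloutCache:
    """Remember definitive callout results for a limited time."""

    def __init__(self, valid_ttl=7200.0, invalid_ttl=3600.0,
                 clock=time.monotonic):
        self._ttl = {"valid": valid_ttl, "invalid": invalid_ttl}
        self._clock = clock
        self._entries = {}

    def put(self, kind, address, verdict):
        # Temporary failures are not definitive, so they are not cached.
        if verdict in self._ttl:
            expires = self._clock() + self._ttl[verdict]
            self._entries[(kind, address)] = (verdict, expires)

    def get(self, kind, address):
        """Return the cached verdict, or None if absent or expired."""
        entry = self._entries.get((kind, address))
        if entry is None:
            return None
        verdict, expires = entry
        if self._clock() >= expires:
            del self._entries[(kind, address)]
            return None
        return verdict
```

Since an SSA has a limited lifetime, a cached result for a sender
address should not be kept longer than the address itself can remain
valid.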

A callout to an SMTP server that works according to the specification
in section 2.2 above will not reject MAIL FROM commands, but a callout
implementation MUST be prepared for this to happen. The SMTP server
has rejected the transaction before it has seen the address that the
client is trying to verify, which implies that it is probably
unwilling to communicate with the client. The client MAY consider the
address to be invalid, or it MAY consider this to be a temporary error
and handle it as above.

Because callouts follow normal message routing, and because they chase
the answer as far as possible, they can be used for address
verification across firewalls etc. without special support. This works
at both the message submission end and at the message delivery end
which both frequently involve firewalls and/or message routing that
doesn't follow [RFC974].

This form of remote address verification works well with the SMTP
PIPELINING extension described in [RFC2920]. An SMTP server that
receives a message envelope as one pipelined command group can
call out to the next hop to verify all the addresses using one
pipelined command group. The start of the message transfer might look
like this:

	R: 220 relay.example.com Simple Mail Transfer Service ready
	C: EHLO client.example.com
	R: 250-relay.example.com Hello client.example.com [192.0.2.15]
	R: 250-PIPELINING
	R: 250 HELP
	C: MAIL FROM:<SSA1.ADAA.u4hmRUy8be9oTv0QDOyFQA==.alpha@example.com>
	C: RCPT TO:<bravo@example.com>
	C: RCPT TO:<charlie@example.com>
	C: DATA

The immediately following callout might look like this:

	S: 220 mx.example.com ESMTP service ready
	R: EHLO relay.example.com
	S: 250-mx.example.com Hello relay.example.com [192.0.2.2]
	S: 250-PIPELINING
	S: 250 HELP
	R: MAIL FROM:<>
	R: RCPT TO:<SSA1.ADAA.u4hmRUy8be9oTv0QDOyFQA==.alpha@example.com>
	R: RSET
	R: MAIL FROM:<SSA1.ADAA.5DlXQ2URBOI/2tVb+L2Bvg==.postmaster@example.com>
	R: RCPT TO:<bravo@example.com>
	R: RCPT TO:<charlie@example.com>
	R: RSET
	S: 250 OK
	S: 250 Accepted
	S: 250 Reset OK
	S: 250 OK
	S: 550 Unknown user
	S: 250 Accepted
	S: 250 Reset OK
	R: QUIT
	S: 221 mx.example.com closing connection

This results in the following response to the client:

	R: 250 OK
	R: 550 Unknown user
	R: 250 Accepted
	R: 354 Enter message, ending with "." on a line by itself
	C: ...

The above exchange illustrates that there are two kinds of callout:
call-back verification and call-forward verification. These are
described further in the following subsections.
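
The pipelined callout shown in the transcripts above can be sketched
as two small Python functions: one that builds the command group and
one that maps the callout's replies back onto the replies given to
the original client. The function names are hypothetical. The sketch
assumes a non-null reverse path, that both MAIL FROM commands in the
callout are accepted, and that replies arrive one per command in
order.

```python
def callout_commands(reverse_path, recipients, callout_sender):
    """Build one pipelined group: a call-back, then a call-forward."""
    commands = ["MAIL FROM:<>", f"RCPT TO:<{reverse_path}>", "RSET"]
    commands.append(f"MAIL FROM:<{callout_sender}>")
    commands.extend(f"RCPT TO:<{rcpt}>" for rcpt in recipients)
    commands.append("RSET")
    return commands


def replies_for_client(callout_replies, recipient_count):
    """Turn callout replies into replies to MAIL FROM and each RCPT TO."""
    sender_reply = callout_replies[1]    # reply to RCPT TO:<reverse path>
    rcpt_replies = callout_replies[4:4 + recipient_count]
    if not sender_reply.startswith("2"):
        # MAIL FROM is still accepted; the recipients fail instead.
        rcpt_replies = [sender_reply] * recipient_count
    return ["250 OK"] + rcpt_replies
```

Failing every recipient when the sender does not verify follows the
recommendation in section 2.2 that MAIL FROM itself should not be
refused.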


2.5. Call-back verification and SSAs
------------------------------------

Call-back verification is used to verify sender addresses (signed or
otherwise). The name refers to the fact that it is a callout along the
message's reverse path when verifying the envelope sender address. If
the address is valid as the destination of a bounce, then it is a
valid sender address. The server knows that a sender address is being
verified because the preceding MAIL FROM command has a null argument,
as described in section 2.3.

Thus the general form of a call-back conversation is:

	S: 220 mx.example.com ESMTP service ready
	C: EHLO relay.example.net
	S: 250-mx.example.com Hello relay.example.net [192.0.2.130]
	S: 250 HELP
	C: MAIL FROM:<>
	S: 250 OK
	C: RCPT TO:<...address...>
	S: 250 Accepted
	C: QUIT
	S: 221 mx.example.com closing connection


2.6. Call-forward verification and SSAs
---------------------------------------

Call-forward verification is used to verify recipient addresses, hence
it's the opposite of call-back verification. The difference is that
the MAIL FROM address in call-forward verification is not null.
Instead, the SMTP client SHOULD generate a signed sender address that
it knows to be valid and use that for the callout. It is RECOMMENDED
that this address be based on postmaster at the SMTP client's primary
mail domain. If the callout is being done on behalf of a user rather
than an MTA, the MAIL FROM address SHOULD be based on that user's
address.

(Another plausible address for an MTA to use is the reverse path from
the message that triggered this callout. This is NOT RECOMMENDED
because policy restrictions unrelated to address verification may
cause the callout to give a result that is not correct for other
reverse paths, and therefore not cacheable for use by other messages.)

Thus the general form of a call-forward conversation is:

	S: 220 mx.example.com ESMTP service ready
	C: EHLO relay.example.net
	S: 250-mx.example.com Hello relay.example.net [192.0.2.130]
	S: 250 HELP
	C: MAIL FROM:<SSA1.ADAA.n8eQujipEeFFkuMHv5LqfA==.postmaster@example.net>
	S: 250 OK
	C: RCPT TO:<...address...>
	S: 250 Accepted
	C: QUIT
	S: 221 mx.example.com closing connection


5.5. The SMTP VRFY command
--------------------------

Readers of section 2 of this specification may wonder why the SMTP
VRFY command is not used for remote address verification. The main
reasons are as follows:

This specification depends on a clear distinction between sender
addresses and recipient addresses. It must be clear which kind of
address is being verified otherwise a forged message with a valid
recipient address in the reverse path will appear to be legitimate.
The VRFY command does not make such a distinction.

It has been common practice for a number of years now to disable the
VRFY command to protect against address harvesting and dictionary
attacks, and many SMTP server implementations ship with it off in the
default configuration. However, envelope address verification is
encouraged nowadays in order to reduce collateral spam. Therefore
callout verification is much more effective in the real world.

Finally, the VRFY command cannot be pipelined so it can only verify
one address per round trip.


R. References
-------------

R.1. Normative references
-------------------------

[RFC974]  Mail Routing and the Domain System.
	  C. Partridge. January 1986.
[RFC1123] Requirements for Internet Hosts - Application and Support.
	  R. Braden, Ed.. October 1989.
[RFC1321] The MD5 Message-Digest Algorithm.
	  R. Rivest. April 1992.
[RFC1924] A Compact Representation of IPv6 Addresses.
	  R. Elz. April 1996.
[RFC2104] HMAC: Keyed-Hashing for Message Authentication.
	  H. Krawczyk, M. Bellare, R. Canetti. February 1997.
[RFC2119] Key words for use in RFCs to Indicate Requirement Levels.
	  S. Bradner. March 1997.
[RFC2234] Augmented BNF for Syntax Specifications: ABNF.
	  D. Crocker, Ed., P. Overell. November 1997.
[RFC2821] Simple Mail Transfer Protocol.
	  J. Klensin, Ed.. April 2001.
[RFC2822] Internet Message Format.
	  P. Resnick, Ed.. April 2001.
[RFC2920] SMTP Service Extension for Command Pipelining.
	  N. Freed. September 2000.
[RFC3548] The Base16, Base32, and Base64 Data Encodings.
	  S. Josefsson, Ed.. July 2003.

R.2. Informative references
---------------------------

[RFC1047] Duplicate Messages and SMTP.
	  C. Partridge. February 1988.
[RFC1711] Classifications in E-mail Routing.
	  J. Houttuin. October 1994.
[RFC2045] Multipurpose Internet Mail Extensions (MIME) Part One:
	  Format of Internet Message Bodies.
	  N. Freed, N. Borenstein. November 1996.
[RFC3464] An Extensible Message Format for Delivery Status Notifications.
	  K. Moore, G. Vaudreuil. January 2003.

btoa	  Another base 85 format.

R.3. To read...
---------------

[RFC2360] Guide for Internet Standards Writers.
	  G. Scott. June 1998.
[RFC2434] Guidelines for Writing an IANA Considerations Section in RFCs.
	  T. Narten, H. Alvestrand. October 1998.
[RFC2505] Anti-Spam Recommendations for SMTP MTAs.
	  G. Lindberg. February 1999.
[RFC3013] Recommended Internet Service Provider
	  Security Services and Procedures.
	  T. Killalea. November 2000.
[RFC3365] Strong Security Requirements for
	  Internet Engineering Task Force Standard Protocols.
	  J. Schiller. August 2002.
[RFC3552] Guidelines for Writing RFC Text on Security Considerations.
	  E. Rescorla, B. Korver. July 2003.
[RFC3692] Assigning Experimental and Testing Numbers Considered Useful.
	  T. Narten. January 2004.