The following are some very early fragments of a draft specification related to signed addresses and email address verification. It needs to be split up into separate documents and greatly fleshed out, and is not usable in its current state. I'm keeping it for reference purposes.

$Cambridge: hermes/doc/antiforgery/oldstuff.txt,v 1.1 2004/08/16 16:24:34 fanf2 Exp $

Abstract
--------

Signed sender addresses are unforgeable email addresses which can be used to verify that a message comes from where it claims to. A site that uses signed reverse paths on all outgoing messages can detect collateral spam without any special co-operation with other Internet sites. An MTA that is receiving a message allegedly from a site that implements signed reverse paths can detect that it is forged using existing address verification techniques. Signed addresses can also be used to verify the legitimacy of the "From" and "Sender" addresses in a message header.

Contents
--------

0. . . . . . Introduction and overview
1. . . . . . Signed Sender Addresses
1.1. . . . . Constructing SSAs
1.2. . . . . Interoperable SSA format
2. . . . . . Email address verification
2.1. . . . . Address verification authority
2.2. . . . . Local address verification
2.3. . . . . Remote address verification
2.4. . . . . Call-back verification
2.5. . . . . Call-forward verification
3. . . . . . Signed envelope sender
3.1. . . . . Detecting collateral spam
3.2. . . . . Detecting forged messages
4. . . . . . MUAs and SSAs
4.1. . . . . Message submission
4.2. . . . . SSAs in the message header
4.3. . . . . An SMTP extension for creating SSAs
5. . . . . . Compatibility considerations
5.1. . . . . Old callout implementations
5.2. . . . . Teergrubing and callouts
5.3. . . . . Greylisting and SSAs
5.4. . . . . Forwarding and signed reverse paths
5.5. . . . . The SMTP VRFY command
6. . . . . . Security considerations
R. . . . . . References

0. Introduction
---------------

XXX

1. Signed sender addresses
--------------------------

For the purpose of this specification, email addresses are divided into the following categories:

"Recipient addresses" are advertised for use as the destination of email messages. They appear as the arguments to SMTP RCPT TO commands in the message envelope. Most of the addresses in the message header are also recipient addresses; in particular note that addresses in From: and Sender: headers are recipient addresses because they may be used as the destination of replies.

"Sender addresses" are used to identify the source of a message. They appear as the argument to the SMTP MAIL FROM command in the message envelope, which is placed into the message's Return-Path: header on final delivery. They are also used as the destination of delivery status notifications ("bounces"). They tend to be less visible to users.

Past practice is to use the same set of addresses as recipients and senders. This is insecure since recipient addresses are widely advertised and easily guessed, which means that it is trivial to misrepresent the source of a message, or to send someone bounces unrelated to any message they sent.

"Signed sender addresses" (abbreviated "SSAs") are intended to solve this problem. They have the following properties:

(1) They MUST be distinct from all recipient addresses, and MUST NOT be used as a recipient address as defined above.

(2) They SHALL only be creatable by systems acting legitimately on behalf of the address's user, as described in the next subsection.

(3) They MUST be verifiable by systems that receive email for addresses at that domain as described in section 2.

(4) They MUST be further protected against replay attacks, i.e. use of harvested SSAs for illegitimate purposes:

    (4a) A signed sender address MUST be used for a limited number of messages.

    (4b) It MUST have a limited lifetime.

    (4c) It SHOULD NOT be stored persistently, e.g. in mailboxes; instead the result of verifying the address before it expires SHOULD be stored (see section 4).

    (4d) Even when they are stored they certainly SHOULD NOT be published, e.g. in mailing list archives.

(5) SMTP implementations SHOULD provide a mechanism for revoking SSAs before the end of their usual lifetime, to deal with abuse in case the anti-replay protections fail. This mechanism MAY be (partially) automated.

It is not possible to be completely protected against replay attacks on SSAs because email delivery is open-loop; that is, the source and destination of a message do not communicate directly with each other. In fact there may be an arbitrary number of intermediate systems which the sender will often not be able to predict, especially when the message is forwarded by the initial recipient. Thus the sender cannot practically restrict an SSA's validity to a particular set of hosts.

In addition to that, although message delivery is usually very quick it may take a long time if the destination mailbox has quota problems or is only intermittently connected to the Internet. Therefore it is not possible to place tight bounds on the lifetime of an SSA.

There are similar problems with restricting the number of times an SSA is validated or the number of times it receives a bounce. An SSA may be tested by the MTA at each hop on the message's outward journey, so there will be multiple partial bounce attempts to a given SSA. There may even be multiple complete bounce messages, for example if a recipient of a message has quota problems there can be one or more delay notifications before the final non-delivery notification. Despite the existence of a standard format for DSNs [RFC3464], bounces are still notoriously difficult to make any sense of automatically.

Therefore a combination of techniques is used to protect SSAs from abuse. The most important are (4c) and (4d) which prevent SSAs from being harvested in the first place by viruses or spammers.
If despite this an address is harvested, (4b) limits the time in which it can be used, and (4a) means that it can be revoked without causing problems for uncompromised messages.

Signed sender addresses are not entirely new. A weak form is already used by some mailing list managers for automated bounce handling. A different form of signed reverse path from the one defined here has been suggested as a work-around for the incompatibility between forwarding and the proposed use of MTA authorization records in the DNS. Making a distinction between sender addresses and recipient addresses is also not new: some systems already refuse to accept bounces to addresses that are never used to send email, such as mailing list management contact addresses.

The next subsection describes various possible ways of meeting the requirements listed above, and the following ones specify a family of SSA formats which is RECOMMENDED for maximum interoperability between software from different vendors.

1.1. Constructing SSAs
----------------------

xxx

1.2. Interoperable SSA format
-----------------------------

xxx

This section specifies two signed sender address formats. These formats are not necessary for communication across the public Internet, since the verification of SSAs only requires co-operation between senders and the systems that host their mailboxes, i.e. within the same organization. However an interoperable SSA format is necessary so that organizations can expect software from multiple vendors to work together.

The first format is intended to indicate to ISSA-aware software that the address is an interoperable SSA (an "ISSA"), without stating anything about how the address was constructed.

The second format does specify how the address was constructed. It is intended to fulfil the requirements in the introduction to this section while being simple to configure.

The formats are specified in ABNF [RFC2234], based on the syntax rules in [RFC2821] section 4.1.2.
ISSAs are "based on" a recipient address, which indicates the mailbox that will receive any bounces. This address is prefixed with a string containing the additional verification information that makes it secure. There are two styles of prefixing, depending on whether the local-part of the basis address is quoted or not.

    Mailbox = Local-part "@" Domain
    Local-part = Dot-string / Quoted-string
    Quoted-string = DQUOTE *qcontent DQUOTE

    Mailbox =/ ISSA-Mailbox
    ISSA-Mailbox = ISSA-Local-part "@" Domain
    ISSA-Local-part = ISSA-Dot-string / ISSA-Quoted-string
    ISSA-Dot-string = ISSA-Prefix Dot-string
    ISSA-Quoted-string = DQUOTE ISSA-Prefix *qcontent DQUOTE

The Dot-string or *qcontent of an ISSA-Mailbox are the same as the Local-part of the Mailbox it is based on.

    ISSA-Prefix = ISSA-Tag "." ISSA-Data "."
    ISSA-Tag = "SSA" ISSA-Version
    ISSA-Version = 1*DIGIT
    ISSA-Data = Atom

The ISSA-Tag is used by ISSA-aware software to distinguish between ISSAs and other addresses (which might be SSAs but not compliant with this format). ISSA-aware software MAY use the syntax described so far to recover the Mailbox that the ISSA-Mailbox is based on.

ISSA-aware software that creates ISSAs that are not compliant with the rest of this subsection MUST use an ISSA-Version of "0".

    ISSA-Tag =/ ISSA1-Tag
    ISSA-Data =/ ISSA1-Data
    ISSA1-Tag = "SSA1"
    ISSA1-Data = ISSA1-Time "-" ISSA1-ID "-" ISSA1-Hash
    ISSA1-Time = 3base32
    ISSA1-ID = 1*base32
    ISSA1-Hash = 26base32
    base32 = %x41-5a / %x32-37 ; A-Z / 2-7

ISSA1-Data is encoded in case-insensitive base32 [RFC3548] for robustness. (Though base64 encoding fits within an Atom's character set, it is vulnerable to case-smashing and includes the "/" character which is liable to cause trouble.)

The data consists of three parts: a timestamp, a message ID, and a hash. The timestamp is an unsigned 15 bit number that counts the days since the start of 1970 (so 1 Jan 1970 is 0). The message ID is an arbitrary number chosen by the creator of the address.
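
As an illustration of the ISSA1 syntax and timestamp encoding described above, the following minimal Python sketch splits an unquoted ISSA1 local part into its fields and converts between a date and the three-character base32 timestamp. The names `encode_time`, `decode_time` and `parse_issa1`, the regular expression, and the most-significant-digit-first order of the timestamp are assumptions made for this example; the draft does not state the digit order. The hash field is only checked for its length here, not verified.

```python
import re
from datetime import date, timedelta

# Base32 alphabet from [RFC3548]: A-Z followed by 2-7.
B32 = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"
EPOCH = date(1970, 1, 1)

# Unquoted form: "SSA1" "." ISSA1-Time "-" ISSA1-ID "-" ISSA1-Hash "." Dot-string
# Matching is case-insensitive because ISSA1-Data must survive case-smashing.
ISSA1_RE = re.compile(
    r"^SSA1\.([A-Z2-7]{3})-([A-Z2-7]+)-([A-Z2-7]{26})\.(.+)$",
    re.IGNORECASE,
)


def encode_time(day: date) -> str:
    """Encode days since 1 Jan 1970 as a 15 bit number in 3 base32 digits.

    Assumption: the most significant digit comes first.
    """
    days = (day - EPOCH).days
    if not 0 <= days < 2 ** 15:
        raise ValueError("date does not fit in 15 bits")
    return B32[days >> 10] + B32[(days >> 5) & 31] + B32[days & 31]


def decode_time(text: str) -> date:
    """Inverse of encode_time(); accepts either letter case."""
    days = 0
    for ch in text.upper():
        days = days * 32 + B32.index(ch)
    return EPOCH + timedelta(days=days)


def parse_issa1(local_part: str):
    """Return (date, message ID, hash, basis local part), or None if the
    local part does not follow the unquoted ISSA1 syntax."""
    match = ISSA1_RE.match(local_part)
    if match is None:
        return None
    time, msg_id, digest, basis = match.groups()
    return decode_time(time), msg_id.upper(), digest.upper(), basis
```

For example, `parse_issa1("SSA1." + encode_time(date.today()) + "-AB-" + "A" * 26 + ".user")` recovers today's date and the basis local part `user`; the message ID `AB`, the all-`A` hash and the mailbox name are placeholders.
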
The hash is computed as follows: a preliminary address is first constructed according to the ISSA1 syntax except with a secret in place of the hash, then the MD5 [RFC1321] of this address is computed and encoded in base32 to form the hash of the final ISSA1 address. The secret is shared between the creator of the ISSA1 address and the final delivery system hosting the mailbox of the basis address, and is the same for all addresses for which the creator is authorized according to site policy.

2. Email address verification
-----------------------------

Email address verification is simply ensuring that messages can be delivered to the address successfully. In the case of a recipient address these are normal messages, and for a sender address they are bounces. In addition to that, signed sender addresses must have a valid signature.

The advantages of SSAs depend on a mechanism for anyone to verify their validity. Fortunately there are existing techniques for address verification that work with SMTP as it is currently deployed. This specification's effectiveness depends on these techniques being used extensively across the Internet. The following subsections clarify existing practice as a guide for new implementations.

The first part of this section defines some general terms used when talking about address verification. The second part explains how address verification translates into the responses to the SMTP commands that form a message's envelope. The third subsection describes how a system that implements signed sender addresses must verify at SMTP time addresses at domains that are local to the system. The next part describes how remote addresses are verified, and the last two subsections explain how this basic technique differs for sender and recipient addresses.

2.1. Address verification authority
-----------------------------------

The SMTP servers for a domain are advertised in the DNS using MX records, or in their absence A and AAAA records, possibly indirected via CNAME records. From the point of view of email address verification on the public Internet these servers are considered to be authoritative, in a similar way that the authoritative name servers for a zone are advertised using NS records. This specification does not distinguish between primary and secondary (fallback) MX hosts; they SHOULD all be able to accurately verify addresses at domains for which they are authoritative.

A "remote domain" is one for which an SMTP server cannot verify addresses directly. Instead the server SHOULD contact another system to do the verification. Systems that are not MTAs (in particular MUAs) verify all addresses in this manner.

Conversely, an SMTP server's "local domains" are the domains for which it can verify addresses without reference to other systems. (Clustered servers are considered as single systems for this purpose.) This usually implies that the server is the "delivery system" for addresses in that domain, i.e. it puts email in a message store rather than sending it on to another MTA.

However addresses at a local domain MAY be configured to "forward" email to one or more other addresses; in this case the relayed message retains its SMTP reverse path, though its envelope recipient addresses are different. Contrast this with re-sending a message (e.g. from a mailing list server, or from an MUA) which involves the message re-entering the transport service environment with a new reverse path.

A domain might not be local to its advertised authoritative servers, for example when the MX host is a firewall MTA behind which the delivery system is hidden. This is similar to the "hidden master" configuration of a domain's name servers. This specification refers to these servers as "gateway systems".

2.2. General verification requirements
--------------------------------------

An SMTP server MUST verify addresses in the message envelope while processing the MAIL FROM and RCPT TO commands, and reflect the result of verification in the responses it makes to those commands: 25x ("accept") for a valid address, 55x ("reject") for an invalid address, or 45x ("defer") for some temporary failure. As well as being necessary for this specification to work, it avoids the creation of collateral spam as explained in section 3.

However a MAIL FROM command SHOULD NOT provoke a 45x or 55x response; instead the following RCPT TO commands should all be deferred or rejected. This is because messages to the special postmaster mailbox SHOULD always be accepted, so that system administrators can get help resolving communication problems: it is not possible to implement a more relaxed policy at RCPT TO time if a stricter policy has already rejected MAIL FROM. Even if the system always applies a strict verification policy, problems are easier to debug if the SMTP server has logged a record of both the sender and recipient addresses, even if the message is not accepted.

An SMTP server MAY of course reject a RCPT TO command for policy reasons even if the address is valid. For example a server SHOULD do this to prevent itself from being used as an open relay. The client may get the wrong idea about the validity of the address, but this is either the server's intention or because the client is asking the wrong server.

Temporary problems with address verification SHOULD cause the SMTP server to give a 45x response to the relevant RCPT TO command(s). However, as a matter of local policy a site MAY decide to return a 25x response in order to try to keep email working in the presence of breakage.

An SMTP server MAY verify addresses in the message header at SMTP time, and reflect the results of verification in the response it gives to <CRLF>.<CRLF>.
However this is risky because of the synchronization problem described in [RFC1047], so the requirement in section 6.1 of [RFC2821] remains: an SMTP server MUST seek to minimize the time required to respond to the final <CRLF>.<CRLF> end of data indicator.

2.3. Local address verification and SSAs
-----------------------------------------

An SMTP server that implements signed sender addresses for a local domain MUST verify addresses in those domains as follows:

If the reverse path (MAIL FROM argument) is an address at this domain, then it MUST be a valid SSA.

If the reverse path is null, then any RCPT TO argument at this domain MUST be a valid SSA. In this case there MUST be only one RCPT TO command, since a message has only one reverse path address and SSAs cannot be forwarded.

If the reverse path is not null, then any RCPT TO argument at this domain MUST NOT be an SSA and MUST have a valid local part. If such a recipient address forwards to another address then the SMTP server SHOULD verify the target address in order to properly verify the recipient, recursively if necessary. If this results in a remote address the SMTP server SHOULD perform remote address verification as described in the following subsections. If a recipient address forwards to multiple addresses, then the SMTP server MAY cease verification with a successful result.

2.4. Remote address verification using callouts
-----------------------------------------------

A callout is a partial SMTP mail transaction which is used for the side-effect of verifying the envelope addresses, rather than to transfer a message. It's necessary to phrase the callout so that the address to be verified is the argument to a RCPT TO command, because MAIL FROM commands are always accepted.
Thus the general form of a callout conversation is:

    S: 220 mx.example.com ESMTP service ready
    C: EHLO relay.example.net
    S: 250-mx.example.com Hello relay.example.net [192.0.2.130]
    S: 250 HELP
    C: MAIL FROM:<...callout.sender...>
    S: 250 OK
    C: RCPT TO:<...to.be.verified...>
    S: 250 Accepted
    C: QUIT
    S: 221 mx.example.com closing connection

An SMTP client SHOULD determine the target host for a callout using the same algorithm it would use for routing a message whose recipient is the address to be verified. Thus an MTA on the public Internet will contact an authoritative SMTP server for the domain as described in [RFC974].

If an SMTP client encounters a temporary failure during a callout, it SHOULD re-try with the other SMTP servers (if any) for that domain in the usual sequence. If the verification process takes too long or does not return a definitive answer, the result of the callout as a whole is a temporary failure.

An SMTP server cannot tell the difference between a callout and a full mail transaction until too late; therefore it may in turn perform callouts as part of its verification procedures. This means that in most cases there will be a chain of callouts all the way to the ultimate delivery system to get the answer. There may also be callouts for verifying the sender address from each host along this chain. Because a full callout chain is a lot of work to repeat for each message transfer, SMTP servers SHOULD cache the results of callouts.

A callout to an SMTP server that works according to the specification in section 2.2 above will not reject MAIL FROM commands, but a callout implementation MUST be prepared for this to happen. The SMTP server has rejected the transaction before it has seen the address that the client is trying to verify, which implies that it is probably unwilling to communicate with the client. The client MAY consider the address to be invalid, or it MAY consider this to be a temporary error and handle it as above.
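
The callout procedure described in this subsection can be sketched in Python as follows. This is a minimal illustration rather than a definitive implementation: the function names `classify` and `callout` are invented for this example, the connection object is assumed to behave like the standard library's `smtplib.SMTP`, and treating a rejected MAIL FROM as a temporary failure is just one of the two options the text allows. Routing (finding the target host), retrying other servers, and caching are left out.

```python
import smtplib


def classify(code: int) -> str:
    """Map an SMTP reply code onto the three outcomes of section 2.2."""
    if 200 <= code < 300:
        return "accept"
    if 500 <= code < 600:
        return "reject"
    # 4xx, and anything unexpected, is treated as a temporary failure.
    return "defer"


def callout(conn, helo_name: str, sender: str, address: str) -> str:
    """Run a partial mail transaction on an open connection.

    `conn` must provide ehlo(), mail(), rcpt() and quit() returning
    (code, message) pairs, as smtplib.SMTP does. Returns "accept",
    "reject" or "defer" for `address`; no message is transferred.
    """
    try:
        code, _ = conn.ehlo(helo_name)
        if classify(code) != "accept":
            return "defer"
        code, _ = conn.mail(sender)
        if classify(code) != "accept":
            # The server refused before seeing the address to be verified.
            # Assumption: this sketch picks the "temporary error" option.
            return "defer"
        code, _ = conn.rcpt(address)
        return classify(code)
    except (smtplib.SMTPException, OSError):
        # Dropped connections and timeouts give no definitive answer.
        return "defer"
    finally:
        try:
            conn.quit()
        except (smtplib.SMTPException, OSError):
            pass
```

With a real server this might be invoked as `callout(smtplib.SMTP("mx.example.com", timeout=30), "relay.example.net", sender, address)`, using the host names from the transcript above. Note that constructing `smtplib.SMTP` with a host name connects immediately and can itself raise an exception, which a caller would also treat as a temporary failure.
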
Because callouts follow normal message routing, and because they chase the answer as far as possible, they can be used for address verification across firewalls etc. without special support. This works at both the message submission end and at the message delivery end which both frequently involve firewalls and/or message routing that doesn't follow [RFC974].

This form of remote address verification works well with the SMTP PIPELINING extension described in [RFC2920]. An SMTP server that receives a message envelope as one pipelined command group can call out to the next hop to verify all the addresses using one pipelined command group. The start of the message transfer might look like this:

    R: 220 relay.example.com Simple Mail Transfer Service ready
    C: EHLO client.example.com
    R: 250-relay.example.com Hello client.example.com [192.0.2.15]
    R: 250-PIPELINING
    R: 250 HELP
    C: MAIL FROM:
    C: RCPT TO:
    C: RCPT TO:
    C: DATA

The immediately following callout might look like this:

    S: 220 mx.example.com ESMTP service ready
    R: EHLO relay.example.com
    S: 250-mx.example.com Hello relay.example.com [192.0.2.2]
    S: 250-PIPELINING
    S: 250 HELP
    R: MAIL FROM:<>
    R: RCPT TO:
    R: RSET
    R: MAIL FROM:
    R: RCPT TO:
    R: RCPT TO:
    R: RSET
    S: 250 OK
    S: 250 Accepted
    S: 250 Reset OK
    S: 250 OK
    S: 550 Unknown user
    S: 250 Accepted
    S: 250 Reset OK
    R: QUIT
    S: 221 mx.example.com closing connection

Resulting in the subsequent response to the client:

    R: 250 OK
    R: 550 Unknown user
    R: 250 Accepted
    R: 354 Enter message, ending with "." on a line by itself
    C: ...

The above exchange illustrates that there are two kinds of callout: call-back verification and call-forward verification. These are described further in the following subsections.

2.5. Call-back verification and SSAs
------------------------------------

Call-back verification is used to verify sender addresses (signed or otherwise). The name refers to the fact that it is a callout along the message's reverse path when verifying the envelope sender address.
If the address is valid as the destination of a bounce, then it is a valid sender address. The server knows that a sender address is being verified because the preceding MAIL FROM command has a null argument, as described in section 2.3.

Thus the general form of a call-back conversation is:

    S: 220 mx.example.com ESMTP service ready
    C: EHLO relay.example.net
    S: 250-mx.example.com Hello relay.example.net [192.0.2.130]
    S: 250 HELP
    C: MAIL FROM:<>
    S: 250 OK
    C: RCPT TO:<...address...>
    S: 250 Accepted
    C: QUIT
    S: 221 mx.example.com closing connection

2.6. Call-forward verification and SSAs
---------------------------------------

Call-forward verification is used to verify recipient addresses, hence it's the opposite of call-back verification.

The difference is that the MAIL FROM address in call-forward verification is not null. Instead, the SMTP client SHOULD generate a signed sender address that it knows to be valid and use that for the callout. It is RECOMMENDED that this address be based on postmaster at the SMTP client's primary mail domain. If the callout is being done on behalf of a user rather than an MTA, the MAIL FROM address SHOULD be based on that user's address.

(Another plausible address for an MTA to use is the reverse path from the message that triggered this callout. This is NOT RECOMMENDED because policy restrictions unrelated to address verification may cause the callout to give a result that is not correct for other reverse paths, and therefore not cacheable for use by other messages.)

Thus the general form of a call-forward conversation is:

    S: 220 mx.example.com ESMTP service ready
    C: EHLO relay.example.net
    S: 250-mx.example.com Hello relay.example.net [192.0.2.130]
    S: 250 HELP
    C: MAIL FROM:
    S: 250 OK
    C: RCPT TO:<...address...>
    S: 250 Accepted
    C: QUIT
    S: 221 mx.example.com closing connection

5.5. The SMTP VRFY command
--------------------------

Readers of section 2 of this specification may wonder why the SMTP VRFY command is not used for remote address verification. The main reasons are as follows:

This specification depends on a clear distinction between sender addresses and recipient addresses. It must be clear which kind of address is being verified, otherwise a forged message with a valid recipient address in the reverse path will appear to be legitimate. The VRFY command does not make such a distinction.

It has been common practice for a number of years now to disable the VRFY command to protect against address harvesting and dictionary attacks, and many SMTP server implementations ship with it off in the default configuration. However, envelope address verification is encouraged nowadays in order to reduce collateral spam. Therefore callout verification is much more effective in the real world.

Finally, the VRFY command cannot be pipelined so it can only verify one address per round trip.

R. References
-------------

R.1. Normative references
-------------------------

[RFC974] Mail Routing and the Domain System. C. Partridge. January 1986.
[RFC1123] Requirements for Internet Hosts - Application and Support. R. Braden, Ed. October 1989.
[RFC1321] The MD5 Message-Digest Algorithm. R. Rivest. April 1992.
[RFC1924] A Compact Representation of IPv6 Addresses. R. Elz. April 1996.
[RFC2104] HMAC: Keyed-Hashing for Message Authentication. H. Krawczyk, M. Bellare, R. Canetti. February 1997.
[RFC2119] Key words for use in RFCs to Indicate Requirement Levels. S. Bradner. March 1997.
[RFC2234] Augmented BNF for Syntax Specifications: ABNF. D. Crocker, Ed., P. Overell. November 1997.
[RFC2821] Simple Mail Transfer Protocol. J. Klensin, Ed. April 2001.
[RFC2822] Internet Message Format. P. Resnick, Ed. April 2001.
[RFC2920] SMTP Service Extension for Command Pipelining. N. Freed. September 2000.
[RFC3548] The Base16, Base32, and Base64 Data Encodings. S. Josefsson, Ed. July 2003.

R.2. Informative references
---------------------------

[RFC1047] Duplicate Messages and SMTP. C. Partridge. February 1988.
[RFC1711] Classifications in E-mail Routing. J. Houttuin. October 1994.
[RFC2045] Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies. N. Freed, N. Borenstein. November 1996.
btoa Another base 85 format.

R.3. To read...
---------------

[RFC2360] Guide for Internet Standards Writers. G. Scott. June 1998.
[RFC2434] Guidelines for Writing an IANA Considerations Section in RFCs. T. Narten, H. Alvestrand. October 1998.
[RFC2505] Anti-Spam Recommendations for SMTP MTAs. G. Lindberg. February 1999.
[RFC3013] Recommended Internet Service Provider Security Services and Procedures. T. Killalea. November 2000.
[RFC3365] Strong Security Requirements for Internet Engineering Task Force Standard Protocols. J. Schiller. August 2002.
[RFC3552] Guidelines for Writing RFC Text on Security Considerations. E. Rescorla, B. Korver. July 2003.
[RFC3692] Assigning Experimental and Testing Numbers Considered Useful. T. Narten. January 2004.