Jutta Degener from Sendmail brought her colleague Murray Kucherawy to OSAF to talk about sender-authentication techniques; we also roped Jutta into talking for a few minutes about Sieve, a (mostly) server-side filtering language. (Think "easier procmail".)
Murray works on mail filtering and performance at Sendmail and has been involved with the IETF MARID group, which recently dissolved for lack of consensus and some patent issues.
Anti-phishing schemes
Murray said:
Taxonomy of anti-phishing solutions
There are basically two different anti-phishing approaches:
- verifying that the originating server is allowed to send for the sender's domain
- examples are RMX, SPF, and Sender-ID. The ill-fated MARID project was in this category as well.
- cryptographic signing
- full user-to-user verification does the job, but is more work for the users.
- domain-based verification is much easier; examples are Identified Internet Mail (IIM) from Cisco and Domain Keys from Yahoo
Particular anti-forgery proposals
SPF
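SPF uses a DNS TXT record to list the IP addresses that are allowed to send messages on behalf of a domain. The receiving MTA looks up the TXT record for the sender's domain and checks whether the connecting server's IP address is listed. This unfortunately breaks down for forwarded messages and mailing lists, because the forwarding server is not on the original domain's list.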
SPF is still alive, but an IETF orphan.
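As a rough illustration of the check described above, here is a minimal Python sketch. It assumes the TXT record has already been fetched from DNS, and it understands only the ip4:/ip6: mechanisms and a final "-all"; a real SPF evaluator handles many more mechanisms. The record and IP addresses are made up for the example.

```python
import ipaddress

def spf_check(txt_record, client_ip):
    """Return 'pass', 'fail', or 'unknown' for a connecting IP address.

    Simplified: only ip4:/ip6: mechanisms and a final '-all' are
    understood; every other term is ignored.
    """
    terms = txt_record.split()
    if not terms or terms[0] != "v=spf1":
        return "unknown"  # no usable SPF record published
    ip = ipaddress.ip_address(client_ip)
    for term in terms[1:]:
        if term.startswith(("ip4:", "ip6:")):
            network = ipaddress.ip_network(term[4:], strict=False)
            if ip in network:
                return "pass"
        elif term == "-all":
            return "fail"
    return "unknown"

# Hypothetical record for the sender's domain.
record = "v=spf1 ip4:192.0.2.0/24 -all"
direct_result = spf_check(record, "192.0.2.25")       # the domain's own server
forwarded_result = spf_check(record, "198.51.100.7")  # a forwarder or list server
```

The second call shows the forwarding problem: the message is legitimate, but the forwarder's address is not in the original domain's list, so the check fails.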
SRS
SRS is, as Murray puts it, an "enormous abomination" that puts a huge mass of stuff in the headers.
Sender-ID
Sender-ID works by deriving the "purported responsible address" from the headers to determine who sent the message. If a message is forwarded (e.g. by a mailing list manager), then the mailing list manager adds its own information.
Sender-ID is being ignored for the moment because Microsoft applied for a patent that covers it. They say that they don't mean to restrict its use; they just want to protect their own interests. However, the possibility of Sender-ID being covered by a patent has been enough to cause the anti-phishing community to drop it completely.
MASS
There are a few crypto techniques being covered by the IETF Message Authentication Signature Standards (MASS) group.
Domain Keys
Domain Keys takes information from the headers, computes a SHA1 hash, signs it with a private key, and adds the result to the header in this form:
DomainKey-Signature: d=osafoundation.org; s=fromble; c=simple; q=dns; b=stuff
- d: domain
- s: which of your list of keys to use
- c: canonical form -- which of the various canonical forms to use
- q: what type of query (usually "dns")
- b: a base64 encoding of a bunch of data (sorry, missed exactly what was encoded in it); typically three or four lines worth of base64.
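To make the tag list concrete, here is a small Python sketch that splits a signature value into its tags and builds the DNS name where the matching public key is published (selector, then "_domainkey", then the domain). The function names are made up for this example, and the parser is deliberately forgiving: it accepts tags separated by semicolons, whitespace, or both.

```python
def parse_tags(text):
    """Parse 'tag=value' pairs separated by semicolons and/or whitespace."""
    tags = {}
    for item in text.replace(";", " ").split():
        name, sep, value = item.partition("=")
        if sep:  # skip anything that is not tag=value
            tags[name] = value
    return tags

def key_record_name(signature_value):
    """Build the DNS name whose TXT record holds the public key."""
    tags = parse_tags(signature_value)
    return f"{tags['s']}._domainkey.{tags['d']}"

signature = "d=osafoundation.org s=fromble c=simple q=dns b=stuff"
lookup_name = key_record_name(signature)
```

Here lookup_name comes out as fromble._domainkey.osafoundation.org, which is the name the receiving MTA would query.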
When the receiving MTA gets the message, it looks up the public key in DNS and uses it to check that the signature matches a SHA1 hash of the headers as received. He described this as "S/MIME at the domain level".
Note that modifying the headers (as mailing list managers do when they prepend the name of the list, e.g. [Dev]) or even reordering headers messes up the hash.
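The fragility is easy to see with just the hashing step. This Python sketch hashes a list of header lines in the order given; it leaves out the signing step and uses a made-up canonical form, so it shows the idea rather than the actual Domain Keys algorithm. The addresses and subjects are invented for the example.

```python
import hashlib

def header_digest(headers):
    """SHA-1 over 'Name: value' lines, in the order they are given."""
    data = "".join(f"{name}: {value}\r\n" for name, value in headers)
    return hashlib.sha1(data.encode("utf-8")).hexdigest()

original = [("From", "someone@osafoundation.org"), ("Subject", "Meeting notes")]

# A list manager prepends the list name to the subject.
list_tagged = [("From", "someone@osafoundation.org"), ("Subject", "[Dev] Meeting notes")]

# Same headers, different order.
reordered = list(reversed(original))

tag_breaks_hash = header_digest(original) != header_digest(list_tagged)
reorder_breaks_hash = header_digest(original) != header_digest(reordered)
```

Both comparisons come out True: either change gives the receiver a different hash from the one that was signed.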
The public key is stored in a domain based on the "s" field of the signature, e.g.
fromble._domainkey.osafoundation.org
in the TXT record, in the form
"k=rsa; p=public key"
IIM
Cisco's IIM differs in two important ways:
- The line they add to the header includes all the headers, making it more robust in the face of reordered headers.
- They include a message-length, so you can add on footers. However, this leaves an opening for spammers -- they can then add any payload they want at the end of the message. (One partial solution to this would be requiring a certain minimum length before you're allowed to sign a message.)
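The trade-off with a signed message length can be sketched in Python. This is not IIM's actual algorithm; it only illustrates hashing the first N bytes of a body, with made-up message text.

```python
import hashlib

def body_digest(body, signed_length):
    """Hash only the first signed_length bytes of the body."""
    return hashlib.sha1(body[:signed_length]).hexdigest()

signed_body = b"See you at the meeting.\r\n"
signed_length = len(signed_body)
reference = body_digest(signed_body, signed_length)

# A mailing list appends a footer: the covered part is unchanged.
with_footer = signed_body + b"-- \r\nDev mailing list\r\n"
footer_ok = body_digest(with_footer, signed_length) == reference

# A spammer appends a payload: the covered part is also unchanged.
with_payload = signed_body + b"BUY CHEAP WATCHES\r\n"
payload_ok = body_digest(with_payload, signed_length) == reference
```

Both footer_ok and payload_ok are True, which is exactly the opening described above: the check cannot tell a harmless footer from an appended payload.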
Because there won't be a day when suddenly everybody is using a domain authentication scheme, for practical purposes, people won't be able to just reject messages that aren't signed. Thus there really needs to be MUA support for these forms of authentication. The receiving MTA can give an indication of pass/fail/unknown, but the user really needs a way to see what the MTA's opinion was. (And it needs to be more user-friendly than the little padlock that Web browsers have.)
Misc
Murray mentioned that he has submitted a draft to the IETF about how the MTA and the MUA should talk to each other.
There was some discussion about holes and dangers. He said that one of his biggest fears is that this puts big bullseyes on DNS servers. Murray was of the opinion that taking down a DNS server would prevent a legitimate user from sending messages, but would also prevent an illegitimate user from sending messages as that domain.
This is not a defense against compromised systems, but a defense against forging. Note that it will help slow the spread of some viruses. Viruses sent as someone else in the infected user's address book will be blocked, so it will be easier to track down who is actually infected.
SIEVE
Jutta said:
SIEVE was originally designed to let people script e-mail processing without shooting themselves in the foot. However, people wanted to be able to do more, so extensions showed up. Those extensions, of course, now allow you to shoot yourself in the foot (well, maybe not shoot... you can elbow yourself in the foot, we're working on minor abrasions), with the disadvantage that the language is very crufty.
Extensions
Extensions have to be declared at the beginning of the program, e.g.
require "fileinto";
require "variables";
require "envelope";
Common actions
Common actions include:
- fileinto "important"; (file into folder)
- redirect "foo@example.com";
- keep;
- discard;
Less common actions:
- addheader "Name" "Value";
- deleteheader "Name";
- notify :method ... :options ...;
- vacation "I'm on vacation! ...";
Conditions
The conditions are of the form:
- if header :matches ["from", "sender"] "*osafoundation.org" (if the header ends in "osafoundation.org" ...)
- Logical operators anyof(cond, cond, cond) ("or") and allof(cond, cond, cond) ("and")
It turns out that looking for a string in the from or sender lines is difficult because the address can have all kinds of uncooperative formats (like comments or surrounding punctuation), so an alternative is "address". "Address" does a better job, but people don't usually think to use it.
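The difference between matching the raw header text and matching the parsed address can be shown outside Sieve. This Python sketch uses the standard library's email.utils.parseaddr as a stand-in for what an "address" test does; the helper names and sample headers are made up for the example.

```python
from email.utils import parseaddr

def naive_match(header_value, domain):
    """Treat the header as plain text and test how it ends."""
    return header_value.lower().endswith(domain)

def address_domain(header_value):
    """Parse the header as an address and return the part after the @."""
    _display_name, address = parseaddr(header_value)
    return address.rpartition("@")[2].lower()

samples = [
    "Bob <bob@osafoundation.org>",                   # trailing ">"
    "bob@osafoundation.org (Bob at work)",           # trailing comment
    '"bob@osafoundation.org" <mallory@example.com>'  # domain only in the display name
]

naive_results = [naive_match(s, "osafoundation.org") for s in samples]
parsed_domains = [address_domain(s) for s in samples]
```

The plain-text test misses the first two headers even though they come from the right domain, while the parsed form gets them right. The third sample shows the opposite risk: the domain appears in the header text, but the real address is somewhere else.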
Other things to look for matches in include
- envelope
- body (that's an extension)
- variable values
Matches can be modified by match-types and comparators.
For example, one match type is the :regex matching operator.
Variables are allowed, with the syntax "${n}" for the n-th element matched. This gives lines like:
redirect "${1}";
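To show where the numbered values come from, here is a Python sketch of wildcard matching with captures. It is an approximation under stated assumptions: "*" matches as little as possible, "?" matches one character, matching ignores case, backslash escapes in patterns are not handled, and index 0 holds the whole matched string. The function names are made up for the example.

```python
import re

def wildcard_match(pattern, text):
    """Match a Sieve-style wildcard pattern; return captures or None.

    Index 0 is the whole matched string, index n the text matched
    by the n-th wildcard.
    """
    regex = "".join(
        "(.*?)" if ch == "*" else "(.)" if ch == "?" else re.escape(ch)
        for ch in pattern
    )
    m = re.fullmatch(regex, text, re.IGNORECASE | re.DOTALL)
    if m is None:
        return None
    return [m.group(0), *m.groups()]

def expand(template, captures):
    """Replace ${n} references; out-of-range numbers become empty strings."""
    def substitute(m):
        index = int(m.group(1))
        return captures[index] if index < len(captures) else ""
    return re.sub(r"\$\{(\d+)\}", substitute, template)

captures = wildcard_match("*@*", "list-owner@example.com")
first = expand("${1}", captures)
second = expand("${2}", captures)
```

With the pattern "*@*", ${1} is the part before the @ and ${2} the part after it.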
Unfortunately, as SIEVE has gotten more powerful, users can write scripts that do things like match regular expressions in message bodies and unwittingly open their servers up to denial-of-service attacks.
Implicit Keep
One illustration of how user-level simplicity can backfire is the implicit "keep". There's an implicit "keep" command that gets silently cleared by the fileinto, redirect, keep, and discard commands. Alas, this can confuse users, especially when they use front-ends that don't have mechanisms for dealing explicitly with the implicit keep. In particular, it's hard to always file or redirect a copy of a message without affecting what happens to it afterwards.
(As a way around that, Jutta wrote an RFC that adds the :copy keyword.)
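The bookkeeping behind the implicit keep can be modeled in a few lines of Python. This is a toy model, not how any Sieve implementation is written: a script is reduced to a list of (command, argument, copy) tuples, and the result is the list of places the message ends up. The names are made up for the example.

```python
def run_actions(actions):
    """Simulate the implicit keep for (command, argument, copy) actions."""
    cancels_keep = {"fileinto", "redirect", "keep", "discard"}
    implicit_keep = True
    deliveries = []
    for command, argument, copy in actions:
        if command == "fileinto":
            deliveries.append(f"folder:{argument}")
        elif command == "redirect":
            deliveries.append(f"forward:{argument}")
        elif command == "keep":
            deliveries.append("inbox")
        # With :copy, the command leaves the implicit keep alone.
        if command in cancels_keep and not copy:
            implicit_keep = False
    if implicit_keep:
        deliveries.append("inbox")  # the keep nobody wrote
    return deliveries

plain_redirect = run_actions([("redirect", "alias@example.com", False)])
copy_redirect = run_actions([("redirect", "alias@example.com", True)])
empty_script = run_actions([])
```

A plain redirect sends the message away and nothing lands in the inbox; the same redirect with copy set forwards it and still keeps it.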
Afterwards, Jutta emailed Ducky with a brief "intro to Sieve" that she suggested including.
SIEVE 101
=========
Sieve RFC 3028:
- Script is a sequence of commands
- Each command has the form
identifier arguments... ;
or identifier arguments... {
commands...
}
- Special commands that affect control flow or do other
things not directly related to mail processing:
REQUIRE -- announces extensions a script uses
require "name";
require ["name1", "name2", "name3", ... "nameN"];
Builtin commands you can use without "require":
- Controls: if, require, stop;
- Actions: redirect, discard, keep.
- Tests: address, allof, anyof,
exists, false, header, not, size, true
Extensions that require "require", even though they're
in the base specification:
- fileinto, reject, envelope
IF -- control flow
if <condition>
{
...
}
[ elsif <condition>
{
...
} ]
[ else
{
...
}
]
- Note how the "if" ends with a { ... } ? That's the
main reason that this form of command exists.
- But if we wanted to, we could write extensions that
also take blocks as an argument! ...
formime "application/binary"
{
if header ...
{
discard;
}
}
...
STOP -- control flow (jumps to the end)
stop;
- When you're looking at something in a Sieve script, it is
either:
- a string "foo"
- an identifier fileinto
- a tag (or "keyword") :matches
- a number 1234
- a comment /* jam that spam */
# yeah.
- a list of strings [ string1, string2, ..., stringN ]
- a list of tests ( test1, test2, ..., testN )
- Some non-obvious stuff about strings
- Strings can contain anything but \0, including
any combination of CR and LF.
- To include " in a string, write \". To include
\ in a string, write \\. To include a newline
in a string -- just write the frickin' newline!
- Everything in sieve is UTF-8. No need to specify
a charset, and don't worry about 8-bit chars, they're
allowed.
- Unlike C, \n isn't a newline, it's an n.
- There's a special syntax for multi-line strings
that may contain unescaped quotes.
text: [/* comments can go here */]
text text text text
text text text text
text text text text
.
So, it's text:, followed by white space (that may
include comments) and then a dot-stuffed paragraph
terminated with a . on a single line.
Dot-stuffing means:
- If there are two . starting a line, one of them is
removed.
That's rare, so for most purposes, you should be able
to just cut-and-paste a whole e-mail message after a
text:, add a ., and that's it.
- If the "variables" extension is included with
require, ${word} in strings acquires special meaning
and is replaced. It doesn't matter how those texts
were written, whether with \ or text: .
- Some non-obvious stuff about numbers:
- Numbers can be suffixed with K, M, or G to
multiply by 1,024; 1,048,576; or 1,073,741,824.
- However, sieve implementations need only provide at
least 31 bits of magnitude.
- They're pretty much useless! So far, numbers are used
only with the "size" operator. Everything else may look
like a number with quotes around it, but is really a string.
- Boolean tests:
- Boolean operators on multiple, single, and no tests:
allof (test / test-list)
anyof (test / test-list)
not test
- combine other boolean tests
true
false
- constants
- identifier [arguments...]
- related to the contents of the message or the
environment, the real "meat"
- address, exists, header, envelope, size
- Commands
- Look the same as tests: identifier [arguments...]
- evaluated for their side effects; no results
- Arguments
- Two kinds:
- positional
- always there
- always in the same order
- tagged
- CommonLISP-style ":prefix"
(Because as everybody knows, Common Lisp
is extremely intuitive and widely spoken
among people who receive e-mail.)
- Syntax of the specific tag defines number
and meaning of its arguments.
- In practice, these are pretty much
always optional.
- Ordering of positional vs tagged arguments:
- tagged arguments go first (before positional
arguments),
- but may appear in any order relative to each other
- this is a lot like flags to a Unix command.
- Comparators
- Special kind of tagged argument, usually accepted
whenever strings are compared to other strings
(e.g., "spamtest")
- Controls how two strings are processed before they
are byte-for-byte compared.
- For example, "case-mapping" comparators map the two
strings both to the same case before comparing them.
- Syntax:
:comparator "i;octet"
:comparator "i;ascii-casemap"
Usually, " ; - ". The "i"
locale stands for international.
- Affects the way characters are compared, but doesn't
change the fact that the sieve scripts are written
in UTF-8.
- Two built-in comparators "i;octet" and
"i;ascii-casemap".
- For the rest, "require" statements are needed
:comparator "foo"
->
require "comparator-foo";
- Match Types
- Similar to comparators, but often asymmetrical.
- Often don't have parameters, although they sound
as if they would; instead they just describe the
relationship between two implicit or explicit strings.
header :contains ["from", "to"] "bob"
header :matches "Content-Type" "text/*"
- Working with Variables:
- To "pick up" a variable value, match the string
you want against *, then use the "set" command to
assign it to a named variable.
require [ "variables" ];
if header :matches "subject" "*"
{
set "subject" "${1}";
}
# now you can use ${subject} in strings.
- To use a variable value in a string, write
${varname}
- To compare against a variable value, use the
"string" test:
if string :matches "${subject}" "Hi!"
{
# May be spam.
}
- Working with addresses
- Use the test "address" or "envelope", not "header"!
- Basic syntax of "address":
address <header-list> <key-list>
For example
if address ["from", "sender", "reply-to"]
[ "bill@friend.com", "ted@friend.com" ]
- Testing against only part of an address
- use "address parts"
:domain (right side of the @)
:localpart (left side of the @, unquoted)
:all (the default)
require "subaddress";
:user (left side of the +, unquoted)
:detail (right side of the +, unquoted)
- You and Your Implicit Keep
- Implicit keep - a flag that's cleared by
keep
discard
fileinto
redirect
reject
quarantine
but not by
vacation
notify denotify
addheader deleteheader
set setdate
spanset virusset
As long as it's set, the message hasn't really been stored
anywhere, and it would be sad if it got dropped.
- If the "implicit keep" flag is set at the end of the
script, a
keep;
is performed implicitly. Hence the name.
- If you're in the middle of a sequence of scripts,
"keep;" means that the next script gets to run.
- If you're executing the last in a sequence of
scripts, it means that the message is filed in the
user's inbox.
- Once the keep is cleared, it can't be set.
- But you can avoid clearing it with
fileinto
redirect
quarantine
by using the "copy" extension:
require "copy";
# send a copy to alias@domain.com,
# but our main mailbox is still here,
# so don't let it cancel the implicit keep.
redirect :copy "alias@domain.com";
- If you want, read ":copy" as
":but-dont-cancel-implicit-keep".
It doesn't restore an implicit keep if some other
command took it away!
--
DuckySherwood - 29 Oct 2004
Ducky, thanks for this summary. Many thanks to Jutta and Murray for sharing their expertise with us and to Lisa for setting it up.
--
GrantBowman - 30 Oct 2004