Sendmail can be a little scary. If the 1,000+ page O’Reilly reference doesn’t give you pause, the cryptic configuration files probably will. But actually, if you can put up with a little pain to get past the basics, Sendmail really isn’t all that difficult. It is complicated, but a few “rules of the road” will allow you to understand it.
What Sendmail does (and doesn’t) do
First, let’s clear up some possible misconceptions. Sendmail is a Mail Transport Agent (MTA), not a Mail User Agent (MUA). An MUA is what you typically use to read and compose mail, like “mail” or Netscape or Eudora or “pine”. Sendmail is the behind-the-scenes workhorse that reads and acts upon alias files, .forward files and its own configuration files to decide how (and where) to deliver mail. You don’t (generally) use Sendmail directly. Your MUA is what passes mail to Sendmail for disposition.
There are other MTAs: qmail, smail and many, many more. Sendmail is just the most common.
Sendmail isn’t POP or IMAP either. You may get your mail from a POP or IMAP server: that’s a totally separate program that has nothing to do with Sendmail. Sendmail just delivers the mail, and (as we’ll see later) it doesn’t even really do the final part of that by itself; it will call upon other programs (like /bin/mail) to do the actual work of writing files. Sendmail’s job is to transport the mail.
Why is it so complicated?
If that’s all it does, why is the configuration file such a complex mess?
Well, part of the answer is that it’s harder than it looks. Mail addresses come in many different formats, and Sendmail has to understand all of them. Sendmail has to understand alias files, and users’ .forward files. It has to figure out which other machines are allowed to talk to it and which are not. It has to know how it is going to deliver mail that belongs to other machines. These details and more are all spelled out in the configuration file.
Another thing that complicates configuration files is the need to understand other mailing systems. If you were only using Sendmail within your local network of Unix or Linux machines, the Sendmail configuration file could be very simple. If you could further force your users to only use one standard for addresses (for example, never include a host name and never include extraneous comments), the configuration would be even simpler. Unfortunately, the real world is more complex, and therefore so is the typical Sendmail configuration.
What configuration file?
The actual configuration file that Sendmail reads is probably going to be “/etc/sendmail.cf”. Technically it doesn’t have to be that, but it’s rare (other than when testing) for it not to be. On many systems now, /etc/sendmail.cf will be a link pointing to /etc/mail/sendmail.cf but you aren’t likely to find any other variance. That is the actual configuration file, but many sites configure sendmail without ever looking at that file by using “m4”. That’s a macro processor that can read and act upon macro definition files that will produce a sendmail.cf file. Using these macros is a bit easier than writing your own configuration files, but it’s still rather obscure for the uninitiated. I’ll talk about m4 macros later, but for now we’re going to dive right into the real stuff.
Rules, rules, rules
OK, take a deep breath and trust me: this isn’t as bad as it looks.
The sendmail configuration file consists mostly of definitions and rules. Every line in the file starts with a letter that tells you what you are looking at. For example, every rule begins with an “R”. Here are a few rules:
R$*	$: $1 <@>	mark addresses
R$* < $* > $* <@>	$: $1 < $2 > $3	unmark <addr>
R@ $* <@>	$: @ $1	unmark @host:...
R$* :: $* <@>	$: $1 :: $2	unmark node::addr
R:include: $* <@>	$: :include: $1	unmark :include:...
R$* [ IPv6 : $+ ] <@>	$: $1 [ IPv6 : $2 ]	unmark IPv6 addr
R$* : $* [ $* ]	$: $1 : $2 [ $3 ] <@>	remark if leading colon
R$* : $* <@>	$: $2	strip colon if marked
R$* <@>	$: $1	unmark
Looks pretty awful, doesn’t it? Don’t panic yet; it is not as bad as it seems.
For now, just accept that those rules tell Sendmail how to do something.
Tell it how to do what?
The main functions of rules are to validate addresses and to select a delivery agent. Rules may also rewrite addresses either for internal convenience or because that’s what you want: you may want mail from “firstname.lastname@example.org” to appear to come from “email@example.com”, for example.
Don’t worry for now about HOW the rules work. For now, just accept that they rewrite addresses and help make decisions. How they do that is unimportant right now.
To accomplish these things in a sensible manner, rules are divided into “sets”. There may be dozens of rule sets in your configuration file, but it’s very easy to follow the flow because there is always a specific starting point and Sendmail will always follow a particular path through the sets. Before we get to that, though, you need to be able to recognize a rule set.
This used to be easy. Rule sets start with an “S” and were followed by a number. So you’d just see:
Well, rule sets still can be written that way, and they still do start with an “S”, but now they
can also be words rather than just numbers. Here are a few of the rule sets from my sendmail.cf:
Scanonify=3
SCanonify2=96
Sfinal=4
SRecurse=97
Sparse=0
SParse0
SParse1
SLocal_localaddr
Slocaladdr=5
Notice that some of them have an “=” followed by a number? There’s a reason for that. Sendmail
internally identifies all rule sets by number, and certain numbers are “special”. For example, all
addresses (both sender and recipients) first pass through rule set 3. That’s “canonify” above, and the
“=3” tells Sendmail where to find rule set 3. Sendmail doesn’t work in order through the configuration
file; it will always start with rule set 3 no matter where in the file that is located.
Most of what S3 (the shorthand way of referring to it) does is put addresses in a convenient form for the other rules to work with. Because of the many different forms of mail that have existed, S3 can be pretty complex. On my machine, S3 runs to 55 lines of rules (counting a subroutine rule set that it can call).
Remember, for the moment we don’t care about the details of S3 or any other set. Here we are taking a bird’s eye view of the overall process, and that process starts by running S3 for the sender address and each recipient.
The next thing Sendmail does is call S0 to select a “delivery agent”. This figures out where the mail is going: should it be handed over to /bin/mail for local delivery, or will it be delivered directly to the host it is addressed to? If it is directed to a local user, does their .forward file cause the mail to go somewhere else? Perhaps the address is an alias? Maybe it needs to be handed to a UUCP gateway (rather rare nowadays, but still possible).
Rule set S0 also makes another selection. In addition to the delivery agent, it selects two rule sets that will be used later on. We’ll get into the details of that later, but for the moment, here’s an “M” (delivery agent) line from my file:
Mlocal, P=/usr/bin/procmail, F=lsDFMAw5:/|@qSPfhn9, S=EnvFromL/HdrFromL, R=EnvToL/HdrToL,
Just notice the S= and R= part of this for the moment. These are the rule sets that will be
used if the “local” delivery agent has been selected.
Also note that “procmail” will be what actually delivers the message.
After running S0, Sendmail runs other rules. The sender’s address is run through S1, and then through whichever “sender” rule set was selected by S0. Recipient addresses are sent through S2, and then through whichever “recipient” rule set was selected by S0. Finally, both sender and recipient addresses are put through S4. The purpose of S4 is just to undo any convenience or special marking rewriting that was done by S3. For example, S3 might rewrite a uucp style address to what looks like a standard mail address, but mark it as “uucp”. Set S4 needs to undo that work so that uucp (which will have been selected as the delivery agent) can see the address in a format it understands.
As any rule can call another rule set, there can be lots more rules than these basic sets, but the flow always works this way.
S3 rewrites and validates addresses.
S0 selects a delivery agent and the rule sets that follow S1 and S2.
S1 processes sender addresses.
Sender addresses are then processed by whichever sender rule set was selected by S0.
Sender addresses are finally processed by S4.
S2 processes each recipient address, and then the addresses are passed through the set selected by S0.
Recipient addresses are finally processed by S4.
Finally, the mail is passed to the delivery agent selected by S0. Headers may have been added and addresses may have been rewritten (and probably were).
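The flow above can be sketched in a few lines of Python. This is only an illustration of the ordering, not of anything Sendmail actually runs: the function name and the dictionary layout are my own inventions, and the rule set names come from the “M” line shown earlier. S0 itself is left out of the returned path because its job here is to pick the agent, not to rewrite the address.

```python
def ruleset_path(role, agent):
    """Return the rule sets an address visits, in order.

    role is "sender" or "recipient"; agent is a dict holding the S= and R=
    rule set names picked by S0 (this dict layout is hypothetical).
    """
    if role == "sender":
        return ["S3", "S1", agent["S"], "S4"]
    if role == "recipient":
        return ["S3", "S2", agent["R"], "S4"]
    raise ValueError("role must be 'sender' or 'recipient'")


# Envelope rule sets for the "local" delivery agent, as named on the M line.
local_agent = {"S": "EnvFromL", "R": "EnvToL"}

print(ruleset_path("sender", local_agent))     # ['S3', 'S1', 'EnvFromL', 'S4']
print(ruleset_path("recipient", local_agent))  # ['S3', 'S2', 'EnvToL', 'S4']
```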
If you looked closely at a real .cf file, you may have noticed that both the S= and
R= from the delivery agent actually seem to refer to two separate rule sets.
That’s true: S= refers to both EnvFromL and HdrFromL. This is to allow different processing
of header lines and the envelope. The distinction here is this: the envelope is information
that has to do with delivery. Envelope information is passed to other programs, for example
to tell them who to deliver to. It isn’t part of the mail itself. While a header line might show
multiple recipients, a delivery program might only be told about one of them (because the
other recipients belong to other delivery agents). Thus the (possible) need for different rule
sets to process each part.
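If the envelope/header split seems abstract, here is a small Python sketch of the idea. It has nothing to do with Sendmail’s internals; it just builds a message whose header names two recipients while the envelope for one delivery names only one. All of the addresses are made up for the illustration.

```python
from email.message import EmailMessage

# The headers are part of the mail itself and list everyone.
msg = EmailMessage()
msg["From"] = "sender@example.com"
msg["To"] = "alice@example.com, bob@example.net"
msg["Subject"] = "Envelope versus header"
msg.set_content("Both recipients appear in the header.")

# The envelope travels alongside the message, not inside it. This delivery
# is told about only one of the two header recipients.
envelope_from = "sender@example.com"
envelope_to = ["alice@example.com"]

print("header To:  ", msg["To"])
print("envelope to:", envelope_to)
```

Python’s own smtplib keeps the same separation: SMTP.send_message() accepts from_addr and to_addrs arguments for the envelope that need not agree with the headers.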
But how do these rules work?
That’s actually the simple part. Rules are very easy, and Sendmail conveniently provides a nice way to test rules to see exactly what they do. Understanding WHY a rule has to do what it does may lie buried in the mists of some long forgotten mail system, but the way the rules work is pretty straightforward.
Some overall concepts first:
Each rule consists of a Left Hand Side (LHS) and a Right Hand Side (RHS) separated by tabs. An optional comment (tab separated again) may follow the RHS.
The LHS is the pattern matching side. If the address being examined matches the LHS, it will be transformed by the RHS.
Rules are re-read and re-executed if the LHS again matches what the RHS produced. In other words, rules are recursive to themselves. The two exceptions to this are if you see $: or $@ at the BEGINNING of the RHS of the rule. It’s possible to write a rule that would get stuck forever, looping back upon itself. Sendmail will recognize simple cases like this and stop the recursion, but it can be fooled by complex situations.
The $: at the beginning of the RHS stops the recursion after one pass. $@ makes one pass, but then does not execute any more rules in the set.
Unless you see $@ at the beginning of the RHS, Sendmail falls through to the next rule whenever the LHS no longer matches the address it is working on. That next rule will see that address AS REWRITTEN by the previous rule.
The RHS only executes (and $@ only returns) if the LHS matches.
$* : matches zero or more tokens
$+ : matches one or more tokens
$- : matches exactly one token
$@ : matches exactly zero tokens
$= : matches any token in a class
$~ : matches any single token NOT in a class
Great. What’s a token and what’s a class?
A token is part of an address. All addresses are broken into tokens: tony@aplawrence.com becomes the 5 tokens:
tony @ aplawrence . com
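To make the splitting concrete, here is a rough Python sketch of a tokenizer. It is a simplification: real Sendmail takes its operator characters from the configuration file, so the OPERATORS set below is an assumption picked for the illustration, and quoting and comments are ignored entirely.

```python
# Assumed operator characters. Sendmail reads its real list from the
# .cf file, so treat this set as illustrative only.
OPERATORS = set(".:%@!^/[]+<>")


def tokenize(address):
    """Split an address into tokens.

    Each operator character becomes a token of its own; runs of other
    characters stay together; whitespace only separates.
    """
    tokens, word = [], ""
    for ch in address:
        if ch in OPERATORS or ch.isspace():
            if word:
                tokens.append(word)
                word = ""
            if ch in OPERATORS:
                tokens.append(ch)
        else:
            word += ch
    if word:
        tokens.append(word)
    return tokens


print(tokenize("tony@aplawrence.com"))
# ['tony', '@', 'aplawrence', '.', 'com']
```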
A class is a list of words that could match (or not match) tokens. We’ll come
back to that later; let’s work with the simple stuff first.
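Before going on, it may help to see the simple operators at work. The Python sketch below matches a pattern against a list of tokens and reports what each wildcard picked up. It handles only $*, $+ and $- (no classes), it assumes each wildcard takes as few tokens as it can get away with, and the patterns in the usage lines are my own examples rather than anything from a real configuration file.

```python
def match(pattern, tokens):
    """Return what each wildcard matched, or None if the pattern fails.

    pattern and tokens are lists of strings. The whole token list must be
    consumed for the match to succeed.
    """
    if not pattern:
        return [] if not tokens else None
    head, rest = pattern[0], pattern[1:]
    if head in ("$*", "$+", "$-"):
        shortest = 0 if head == "$*" else 1
        longest = min(1, len(tokens)) if head == "$-" else len(tokens)
        # Try the shortest candidate first and grow only if the rest fails.
        for n in range(shortest, longest + 1):
            tail = match(rest, tokens[n:])
            if tail is not None:
                return [tokens[:n]] + tail
        return None
    # Anything else is a literal token that must appear as-is.
    if tokens and tokens[0] == head:
        return match(rest, tokens[1:])
    return None


address = "tony @ aplawrence . com".split()

print(match("$+ @ $+".split(), address))
# [['tony'], ['aplawrence', '.', 'com']]
print(match("$- @ $- . $-".split(), address))
# [['tony'], ['aplawrence'], ['com']]
print(match("$- @ $-".split(), address))
# None
```

The last pattern fails because $- can only take one token, which leaves “. com” unaccounted for.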
How could we match an address like “email@example.com”? This LHS would match:
and so would this:
Let’s look more closely at certain rule sets.
# strip angle brackets -- note RFC733 heuristic to get innermost item
R$*	$: < $1 >	housekeeping <>
R$+ < $* >	< $2 >	strip excess on left
R< $* > $+	< $1 >	strip excess on right
R<>	$@ < @ >	MAIL FROM:<> case
R< $+ >	$: $1	remove housekeeping <>
This is the part of S3 that removes angle brackets, for example from “Fred Thompson <firstname.lastname@example.org>”.
Notice that the first and last of these have the “$:” at the beginning of the RHS; these rules will execute only once.
Now for the fun part. We’re going to have Sendmail itself show us just how these rules work. (If you are not root, you probably can still do this). First, put these lines into /tmp/testrules:
# Add S3 at top of file
S3
R$*	$: < $1 >	housekeeping <>
R$+ < $* >	< $2 >	strip excess on left
R< $* > $+	< $1 >	strip excess on right
R<>	$@ < @ >	MAIL FROM:<> case
R< $+ >	$: $1	remove housekeeping <>
And then run:
echo "3 Fred Thompson <fred@somewhere.com>" | sendmail -C/tmp/testrules -d21.12 -bt
You may need to specify a full path to sendmail like “/usr/sbin/sendmail” if you are not root.
You should get something like this:
Warning: .cf file is out of date: sendmail 8.11.6 supports version 9, .cf file is version 0
No local mailer defined
ADDRESS TEST MODE (ruleset 3 NOT automatically invoked)
Enter <ruleset> <address>
> 3 input: Fred Thompson < fred@somewhere . com >
-----trying rule: $*
-----rule matches: $: < $1 >
rewritten as: < Fred Thompson < fred@somewhere . com > >
-----trying rule: $+ < $* >
-----rule matches: < $2 >
rewritten as: < fred@somewhere . com > >
-----trying rule: $+ < $* >
----- rule fails
-----trying rule: < $* > $+
-----rule matches: < $1 >
rewritten as: < fred@somewhere . com >
-----trying rule: < $* > $+
----- rule fails
-----trying rule: < >
----- rule fails
-----trying rule: < $+ >
-----rule matches: $: $1
rewritten as: fred@somewhere . com
3 returns: fred@somewhere . com
If you want to play with more addresses, just do:
sendmail -C/tmp/testrules -d21.12 -bt
You can then enter “3” followed by addresses until you get tired of it.
If you leave off the “-d21.12” you can still test rules; you just don’t get the details of each line.
Let’s look at these rules again:
R$*	$: < $1 >	housekeeping <>
The first rule is going to match anything and put brackets around it. The $1 on the RHS stands for whatever matched the first (and only) wildcard on the LHS (wildcards are numbered left to right). That’s how our input got rewritten as < Fred Thompson < fred@somewhere . com > >
This rule needs $: to keep from getting stuck continually adding brackets. Remember that the next rule sees the rewritten result.
R$+ < $* >	< $2 >	strip excess on left
Next, we match one or more tokens followed by a left bracket, followed by any number of tokens and a right bracket. Our new input matches, so it again gets rewritten, this time keeping only what the second wildcard (the $*) matched. It now looks like this: < fred@somewhere . com > >
That gets rerun through the rule again (no $: or $@ to stop it) but this time it fails.
R< $* > $+	< $1 >	strip excess on right
This strips off the excess > , tries itself again, and then fails.
R<>	$@ < @ >	MAIL FROM:<> case
This rule doesn’t match. If it did, we’d return immediately ($@).
R< $+ >	$: $1	remove housekeeping <>
Finally, this cleans off the remaining brackets and leaves us with a simple address. Why all
the extra fuss to do this? Because an address like this is legal:
Tom Jones <<<firstname.lastname@example.org>>>
Try that against /tmp/testrules.
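If you don’t have Sendmail handy, the same walk-through can be imitated with a toy rewriting engine in Python. Treat this strictly as a sketch: the tokenizer knows only a handful of operator characters, the matcher handles only $* and $+ and assumes shortest-first matching, and the loop limit is an arbitrary guard of my own. The five rules are the bracket-stripping rules from above, written as (LHS, RHS) pairs.

```python
import re


def tokenize(text):
    # Assumption: only these operator characters split tokens in this sketch.
    return re.findall(r"[<>@.:]|[^\s<>@.:]+", text)


def match(pattern, tokens):
    """Return wildcard bindings if pattern consumes all tokens, else None."""
    if not pattern:
        return [] if not tokens else None
    head, rest = pattern[0], pattern[1:]
    if head in ("$*", "$+"):
        for n in range(0 if head == "$*" else 1, len(tokens) + 1):
            tail = match(rest, tokens[n:])
            if tail is not None:
                return [tokens[:n]] + tail
        return None
    if tokens and tokens[0] == head:
        return match(rest, tokens[1:])
    return None


def substitute(rhs, bound):
    """Build the rewritten token list, replacing $1, $2, ... with bindings."""
    out = []
    for item in rhs:
        if re.fullmatch(r"\$[1-9]", item):
            out.extend(bound[int(item[1]) - 1])
        else:
            out.append(item)
    return out


RULES = [
    ("$*",        "$: < $1 >"),  # housekeeping <>
    ("$+ < $* >", "< $2 >"),     # strip excess on left
    ("< $* > $+", "< $1 >"),     # strip excess on right
    ("< >",       "$@ < @ >"),   # MAIL FROM:<> case
    ("< $+ >",    "$: $1"),      # remove housekeeping <>
]


def run_ruleset(rules, address, limit=100):
    tokens = tokenize(address)
    for lhs, rhs in rules:
        lhs, rhs = lhs.split(), rhs.split()
        prefix = rhs[0] if rhs[0] in ("$:", "$@") else None
        body = rhs[1:] if prefix else rhs
        for _ in range(limit):          # guard against a rule that never stops
            bound = match(lhs, tokens)
            if bound is None:
                break                   # LHS no longer matches: next rule
            tokens = substitute(body, bound)
            if prefix == "$@":
                return " ".join(tokens) # rewrite once and leave the set
            if prefix == "$:":
                break                   # rewrite once, then next rule
    return " ".join(tokens)


print(run_ruleset(RULES, "Fred Thompson <fred@somewhere.com>"))
# fred @ somewhere . com
print(run_ruleset(RULES, "Tom Jones <<<firstname.lastname@example.org>>>"))
# firstname . lastname @ example . org
print(run_ruleset(RULES, "<>"))
# < @ >
```

The triple-bracket address shows why the second rule has no $: in front of it: it has to run three times before all the excess on the left is gone.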
One other thing you need to know when looking at rule sets: Sendmail can define macros. For example,
DA:include:
defines a macro called “A”. That macro can be used in a LHS (it’s unusual to do so, but it could be done). If it were used, you might miscount the $1, $2 replacement characters on the RHS, thinking that the macro would be one of them. For example (and a nonsensical and artificial example indeed):
R$A $- . $*	$1
Here the $1 would NOT be replaced with “:include:”. It would be replaced with the token that matched the $-; $2 would be the $* match.
If that makes your head hurt, don’t worry too much; you are very unlikely to see anything like that on the LHS.
Armed with this knowledge, you can now examine your real configuration file. Just use “sendmail -bt” or “sendmail -d21.12 -bt” and play. You can list the rules in a set with “=S” followed by its number or its name. If you see “$>” in the RHS of a rule, that’s calling another rule set; the number or name that follows is the set being called. You won’t understand everything just from reading this, but this will get you started.
What about M4?
You will probably find at least one “.mc” file on your machine, and you may find dozens. M4 is a general purpose macro processor, and sendmail distributions generally provide an appropriate series of macros that can generate a working sendmail.cf file. All you have to do is pick an appropriate .mc file and run:
m4 /etc/mail/sendmail.mc > /etc/sendmail.cf
If you don’t have the .mc file you need, you can probably find it on the Internet.
As the .mc files are generally at least somewhat commented, picking and editing an appropriate file isn’t usually too difficult. There are some things that aren’t obvious, though:
What’s with the “dnl”s?
M4 macros generate blank lines even when they don’t generate anything else. The dnl (“delete through newline”) just stops unwanted blank lines.
Why all the diverts?
Nothing for you to worry about; it’s just a way to put things that belong together in the output without necessarily having them together in the .mc file.
What do all these things mean?
Ah, that is the rub. These files are only a little bit less confusing than the .cf file itself. And it may seem that you only have one file to work with. For example, RedHat systems only put one sendmail.mc in /etc/mail.
However, you will probably find a line like this in that file:
If you look one directory up (/usr/share/sendmail-cf) from that, you’ll find a very helpful README
that explains everything very well and all the rest of the samples you might want to look at.
A.P. Lawrence provides SCO Unix and Linux consulting services http://www.pcunix.com