NAME

proctut7 - Anatomy of a Procmail Recipe, Part III

SYNOPSIS

In Anatomy of a Procmail Recipe, Part II we explained what a delivering recipe is; this tutorial describes the various forms of non-delivering recipes and what effect they have on procmail's mail processing.

DESCRIPTION

In an earlier tutorial we discussed procmail's two kinds of recipes: delivering recipes and non-delivering recipes. As we did before, we cite a passage from the procmailrc(5) manpage:

    If a non-delivering recipe is found to match, processing of the
    rcfile will continue after the action line of this recipe has
    been executed.

This is important to know because if you think a recipe is a delivering recipe when it is really a non-delivering recipe (or vice versa), your mail will likely pass through additional recipes that you wish it hadn't (or likely won't pass through recipes you wish it had). We continue quoting procmailrc(5):

    Non-delivering recipes are: those that cause the output of a
    program or filter to be captured back by procmail or those that
    start a nesting block.

    You can tell procmail to treat a delivering recipe as if it
    were a non-delivering recipe by specifying the `c' flag on such
    a recipe. This will make procmail generate a carbon copy of the
    mail by delivering it to this recipe, yet continue processing
    the rcfile.

Thus, delivering recipes are those that cause procmail to stop processing after the recipe's action line. Non-delivering recipes do not stop procmail: after the action is executed, the message continues on through subsequent recipes.

Why would we want to do that? Consider the case where we want to do something that otherwise would stop procmail, but we want it to continue on. Here's the standard "copy and forward" scenario:

    :0 c:
    copy_of_mail

    :0
    ! me@somewhere.tld

Here we save our message to the copy_of_mail file (nothing new here); but because this delivering recipe has a c flag (described below), procmail executes the recipe action (delivers to copy_of_mail) and then continues to the next recipe (instead of ceasing) and forwards a copy of our message to me@somewhere.tld.

Normally, a recipe that delivers mail to a mailbox is a delivering recipe, and procmail would cease processing after executing its action. The little 'c' on the flags line of the first recipe changes that: it tells procmail to "copy" the message[1] and keep it flowing through the remaining recipes (until it hits another delivering recipe, of course).

Now consider that we might want to add an autoreply to our chain of events, so we have to add another 'c' to the second recipe (since it is also a delivering recipe):

    :0 c:
    copy_of_mail

    :0 c
    ! me@somewhere.tld

    :0
    | /usr/bin/autoreply

We'll discuss the 'c' flag in more detail next, along with a few other ways to make a non-delivering recipe. One form of non-delivering recipe we will not discuss in this tutorial is the filter, even though filters make up a large part of non-delivering recipes. Filters (and pipes) are just as important as the forms covered here, but they will have the stage all to themselves in the next tutorial because of some additional complexities associated with their correct execution.

We now introduce some common ways to create non-delivering recipes.

Copies

You might have come across a recipe with the c flag in it (our example above, for instance):

    :0 c:
    mailbox

The c flag tells procmail to create a copy of this email. From the procmailrc(5) manpage:

    Generate a carbon copy of this mail.  This only makes sense on
    delivering recipes.

The above recipe will cause procmail to deliver a copy of the message to mailbox and cease processing, as a normal delivering recipe. Once procmail processes the c flag on the recipe, however, it will fork a copy of itself[1]; the forked copy of procmail will skip the current recipe (the one with the carbon copy 'c' flag) and continue on through any remaining recipes.

If that explanation is more complex than you had hoped for, here's a summary: if you want to use a delivering recipe but want procmail to "keep going", add the 'c' flag to the delivering recipe to make it behave as a non-delivering recipe.
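
As a rough illustration (the header pattern, the reports mailbox, and the boss@example.tld address are made up for this sketch), here is the same pair of recipes with and without the flag. Without 'c', a matching message stops at the first recipe and the forward never happens:

    ## delivering: a matching message stops here
    :0:
    * ^Subject:.*report
    reports

    ## never reached for messages that matched above
    :0
    * ^Subject:.*report
    ! boss@example.tld

With 'c' on the first recipe, the message is saved and then also forwarded:

    ## 'c' lets a copy continue past this delivery
    :0 c:
    * ^Subject:.*report
    reports

    :0
    * ^Subject:.*report
    ! boss@example.tld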

When the procmailrc manpage says "this only makes sense on delivering recipes," it means it. If you use the copy flag on a non-delivering recipe:

    :0 c
    { }

You've now got essentially two messages running through your procmailrc file, and you will likely receive two identical messages at the final destination. This is rarely useful, and almost always confusing. It is also the reason why it is important to be able to tell a delivering recipe from a non-delivering recipe.

The author must admit that even recently (though rather late at night, to his credit) he sent in a recipe as a suggested solution to a problem on a procmail mailing list that looked like this:

    :0 c
    * ^To:(.*[^-a-zA-Z0-9_.])?\/.*
    { TO = $MATCH }

    :0 c
    * ^Cc:(.*[^-a-zA-Z0-9_.])?\/.*
    { CC = $MATCH }

    :0 c
    * ^Bcc:(.*[^-a-zA-Z0-9_.])?\/.*
    { BCC = $MATCH }

This created three copies of the message, for a total of four duplicate messages! You see, his recipe actions consist of variable assignments (which are non-delivering, as we shall see later). Procmail would have continued through each of these recipes without the 'c' flag anyway. The correct solution is:

    :0
    * ^To:(.*[^-a-zA-Z0-9_.])?\/.*
    { TO = $MATCH }

    :0
    * ^Cc:(.*[^-a-zA-Z0-9_.])?\/.*
    { CC = $MATCH }

    :0
    * ^Bcc:(.*[^-a-zA-Z0-9_.])?\/.*
    { BCC = $MATCH }

In short, use the 'c' flag to make a copy of the mail on a delivering recipe where you want procmail to continue processing the message. Don't use it on non-delivering recipes or you'll end up with multiple copies and plenty of confusion.

Nested Blocks

Procmail recipes, if you recall from our second tutorial, have a flags line, zero or more condition lines, and exactly one action. We're going to broaden our notion of "action" in this section as we discuss "nested blocks."

At the end of a recipe, procmail is expecting to be told what to do, based on any matching conditions. For example:

    :0
    * ^Subject:.*foo
    foomail

This recipe will drop mail with the word 'foo' in the subject line into a mailbox called foomail. The action for this recipe is:

    foomail

which tells procmail that if the conditions are met (i.e., the Subject: line contains the word 'foo'), the message should be delivered to the foomail file. If the conditions are not met, the action does not occur.

Procmail also allows the action of a recipe to be what is called a "block". A minimal block is simply a pair of curly braces:

    { }

This is what is called in programming parlance a "noop" (pronounced "no op") or "no operation". It tells procmail to simply do nothing. Where would you use an empty block?

Often an empty block is used in conjunction with an 'E' ("else") recipe:

    :0
    * condition 1
    { }

    :0 E
    some_action

This is really the same as:

    :0
    * ! condition 1
    some_action

but the meaning changes when you have multiple conditions[2][3]:

    :0
    * condition 1
    * condition 2
    * condition 3
    { }

    :0 E
    some_action

Now, 'some_action' occurs if at least one of the conditions fails, which is not the same as:

    :0
    * ! condition 1
    * ! condition 2
    * ! condition 3
    some_action

since this means that all conditions must be false for 'some_action' to trigger.
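
To make the difference concrete, here is a small sketch with invented conditions and an invented not_urgent mailbox. Mail that is from the boss and marked urgent falls through the empty block untouched; everything else, meaning any message that fails at least one of the two conditions, is filed by the 'E' recipe:

    ## both conditions hold: do nothing and keep going
    :0
    * ^From:.*boss@example\.tld
    * ^Subject:.*urgent
    { }

    ## at least one condition failed: file the message
    :0 E:
    not_urgent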

You will likely come across empty blocks as you review other people's procmail recipes, but now you know what they're for.

What else can go in a block? Well, anything can go in a block. It could be as little as nothing (the empty block) or a simple variable assignment, or a single recipe, or a dozen variable assignments and recipes together; it may be as if you were including an entire procmailrc file inside the block. You can even put nested blocks inside nested blocks. Here's a simple example of a nested block:

    :0
    * condition 1
    {
        VARIABLE = "some value"

        :0
        * condition 2
        some_action

        :0
        other_action
    }

Here we have a single recipe whose action consists of a variable assignment and two recipes. The catch is that the variable assignment (VARIABLE = "some value") and the recipes will only trigger if condition 1 is met. If condition 1 is not met, procmail skips the action for this recipe in the same way it skips any normal (read: "simple") action.
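
As a sketch of how this might look with real conditions (the list address and mailbox names are hypothetical), the block below files mail from one mailing list. The assignment and both inner recipes are only considered when the outer List-Id condition matches:

    :0
    * ^List-Id:.*widgets\.example\.tld
    {
        LISTBOX = "widgets"

        ## announcements get their own mailbox
        :0:
        * ^Subject:.*\[ANNOUNCE\]
        ${LISTBOX}-announce

        ## everything else from the list
        :0:
        $LISTBOX
    }

Because the last inner recipe has no conditions, every message that enters this block is delivered somewhere inside it.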

Why would you do this? Take the example of virus scanning. In the good old days, you could count on an email-borne virus to be rather large. Because virus scanning is an expensive activity (in terms of server resources such as memory and CPU), you wouldn't want to fire up the virus scanner for every little message. This would be a good time to use a nested block:

    :0
    * > 20000
    {
        :0 fw
        | virus_scanner

        :0:
        * ^X-Virus-Scanner: infected
        /dev/quarantine
    }

This recipe checks to see if the message is larger than 20000 bytes. If it's smaller (technically, it could also be equal to, but with arbitrary size limits this is rarely important), the entire virus scanning block is skipped. Otherwise (that is, if the message is larger than 20000 bytes), the message is scanned. If the scan indicates that the message is infected (our scanner might add a special "X-Virus-Scanner:" header to the mail message), we toss it in the quarantine for later examination. Otherwise, the message hits the end of the block and continues on outside of the block.

To summarize, a recipe that opens a nested block is non-delivering: procmail continues past the block unless a delivering recipe inside the block matches and delivers the message. If no delivering recipe inside the block is triggered, the mail continues on to the recipes after the block.

Variable Assignment

The final way to create a non-delivering recipe we'll discuss in this tutorial (don't miss the next tutorial where we discuss filters and pipes) is to make a variable assignment for the action.

At any time outside of a procmail recipe you may make a variable assignment (this is also true of piped variable assignment, which we'll cover in a moment). You may make an assignment inside of a recipe also, which will be shown below.

Variables are simply strings; when you assign to a variable, it's simply a matter of:

    VARNAME = "something"

Now the variable 'VARNAME' contains the string "something" (space around the equals sign is optional for normal variable assignment). When you need to use the variable again, you need to give it a dollar sign so procmail knows to replace the name of the variable with its value:

    ## make a variable assignment
    SOME_SUBJECT = "The gerbils must go"

    ## look for Subject lines containing "The gerbils must go"
    :0
    * $ ^Subject: $SOME_SUBJECT
    /dev/null

Notice we assign 'SOME_SUBJECT' without a dollar sign, but when we want to use SOME_SUBJECT, we prefix it with a dollar sign ('$SOME_SUBJECT'). We also tell procmail to do shell substitutions on that line (that is, to replace all the variable names it finds with their corresponding values) by putting a dollar sign all by itself near the beginning of the condition line. The net effect is that procmail will see this:

    :0
    * ^Subject: The gerbils must go
    /dev/null
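
One thing to keep in mind is that the substituted value is then read as part of a regular expression, so characters such as '.' keep their special meaning. The procmailrc(5) manpage describes a $\name form that substitutes the value with its regular expression characters quoted. A minimal sketch, assuming a hypothetical list address and mailbox name:

    LISTADDR = "widgets.list@example.tld"

    ## $\LISTADDR matches the dots in the address literally
    :0:
    * $ ^To:.*$\LISTADDR
    widgets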

The next sections discuss two ways to assign a variable, and when you can assign them.

Simple Variable Assignment

This section should really be called "Variable Assignment", but we distinguish it from "piped variable assignment" because a piped variable assignment can stand by itself as the action of a recipe (though we will demonstrate a technique that lets a simple variable assignment serve as the action of a recipe too).

Simple variable assignment is just as we have demonstrated above: type a variable name, some optional whitespace, an equals sign, some more optional whitespace, and the value you want to assign to the variable surrounded by quotes:

    SPACE = " "
    SUBJ  = "Subject:"
    FOO="bar"

To use a variable in a recipe or in another variable assignment, you must prefix the variable name with a dollar sign to tell procmail you want its value:

    H_SUBJ = "$SUBJ$SPACE"

Were we to print out the contents of H_SUBJ, we would have the word "Subject:" followed by a space ("Subject: " without the quotes).
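
One way to check a value like this is to write it to procmail's log. The sketch below assumes a log file at $HOME/procmail.log (any writable path will do) and uses the LOG variable described in procmailrc(5), which appends whatever is assigned to it to the log file. The closing quote sits on its own line so that the log entry ends with a newline:

    LOGFILE = "$HOME/procmail.log"

    SPACE  = " "
    SUBJ   = "Subject:"
    H_SUBJ = "$SUBJ$SPACE"

    ## brackets make the trailing space visible in the log
    LOG = "H_SUBJ is [$H_SUBJ]
    "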

You can also assign the output of a program to a variable. This is done by using special quotation marks called backticks (or backquotes). They look like this:

    `program_name -optional -arguments`

The backticks tell procmail to execute the program program_name with some optional arguments. Variable assignment with backticks is done like this:

    DATE = `date`

This will run the Unix date program and assign the current date (the output of the 'date' program) to the DATE variable. Backticks may be used with any program to "capture" or save its output into a variable. If the program emits more than one line, all lines will be captured (including newlines) into the variable.

We read in the procmailrc(5) manpage:

    Any program in backquotes started by procmail will have the
    entire mail at its stdin.

In the Unix world and its derivatives, this means[4] that the effect is roughly this:

    OUTPUT = `cat message | program_name -optional -arguments`

where message is the mail message procmail is currently processing.

Backticks allow you to send your message on a little trip outside of procmail to another program for processing, and to capture the results of that trip, if any.
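
For instance, formail(1), which ships with procmail, can extract a header field from the message it reads on standard input. A short sketch (the variable name is arbitrary):

    ## formail reads the message on stdin and prints the
    ## contents of the Subject: field
    SUBJECT = `formail -xSubject:`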

Piped Variable Assignment

This final section discusses a special way to assign a value to a variable called "piped variable assignment." Were we to read the procmailrc manpage carefully, we'd see this entry under the pipe symbol ('|'):

    Starts the specified program ... You can optionally prepend this
    pipe symbol with variable=, which will cause stdout of the program
    to be captured in the environment variable (procmail will not
    terminate processing the rcfile at this point).

In a nutshell, this says that you can do this:

    SUBJECT=| egrep '^Subject:' -

This will assign the Subject: line of the mail message to a variable named 'SUBJECT', which we can use later.

Piped variable assignment is semantically equivalent to using backticks[5]: the message is passed to a program on its standard input and the output of the program is captured into the variable. This means that for most purposes, piped variable assignment and backticks are the same.

Unlike simple variable assignment, whitespace matters in using this pipe variable assignment; there must not be a space between the variable and the equals sign (e.g., "SUBJECT=" works but not "SUBJECT ="), but you may have spaces after the pipe symbol for legibility. You'll often see piped variable assignments mashed together like this:

    SUBJECT=|egrep '^Subject:' -

but it isn't necessary (and it hurts readability). A nicer alternative:

    SUBJECT=| egrep '^Subject:' -

is a good compromise because "SUBJECT=|" stands out a little bit (i.e., you can immediately recognize that it is a variable assignment via a pipe) and the whitespace keeps things fairly readable.

Piped variable assignments may occur anywhere in the procmailrc file[6]. Since they're variables, they don't have to appear in a recipe. On the other hand, because they're also pipes, they may appear as recipe actions (simple variable assignment may not be the action of a recipe directly). Here is a piped variable assignment as a recipe action:

    :0
    * ! ^From: my@friend\.tld
    SCAN=| spam_checker

If the mail is not from our friend, we pass it along to our spam checker program. We can then test the value of the SCAN variable to see what it said about the message:

    :0
    * SCAN ?? ^^spam^^
    /dev/spam

The equivalent using backticks must be wrapped in a nested block:

    :0
    * ! ^From: my@friend\.tld
    { SCAN = `spam_checker` }

That a piped variable assignment may appear as the action of a recipe is the primary difference between it and an assignment using backticks, yet we have seen that we can achieve the same effect with backticks by using a nested block.
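
Putting the pieces together, here is a sketch that captures a header with a piped variable assignment and then tests the captured value in a later recipe. It assumes formail(1) is on the PATH; the gerbils mailbox is invented for the example:

    ## non-delivering: capture the Subject: contents and keep going
    :0
    * ^Subject:
    SUBJECT=| formail -xSubject:

    ## test the variable rather than the message itself
    :0:
    * SUBJECT ?? gerbil
    gerbils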

SUMMARY

Procmail has two types of recipes: delivering and non-delivering. It is essential to distinguish between them to avoid incorrect recipe processing (either too many or too few recipes could be triggered).

Non-delivering recipes process the mail in some way but procmail does not cease processing; it continues to the next recipe for further processing.

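Types of non-delivering recipes covered in this tutorial include delivering recipes given the 'c' flag, recipes whose action is a nested block, and recipes whose action is a variable assignment (a piped variable assignment, or a simple assignment wrapped in a nested block).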

Non-delivering recipes using pipes and filters will be discussed in a future tutorial.

NOTES

Note 1

Actually, procmail forks or clones itself, so now there are two procmail processes running around. The first one exits after completing its action line, and the second procmail process (which has inherited all the same settings, state, etc. as the original procmail process) skips the recipe with the 'c' flag and begins processing the next recipe. I offer a humble ASCII diagram for your visualization pleasure:

           procmail
              |
              V
      +---------------+
      | :0 c:         |------+
      | * condition 1 |      |
      | copy_of_mail  |      | A copy of procmail is
      +---------------+      | forked; this clone skips
              |              | the recipe with the 'c'
              V              | flag and continues pro-
    [procmail terminates]    | cessing the next recipe
                             |
              +--------------+
              |
              V
      +--------------------+ This recipe is processed
      | :0                 | by the forked (or cloned)
      | ! me@somewhere.tld | copy of procmail
      +--------------------+
              |
              V
  [copy of procmail terminates]

Note 2

This is one way to "or" conditions together, instead of "and"ing them. By default, procmail requires that all conditions must be satisfied before the recipe action triggers. What we have here is a fairly common procmail-ism, which allows us to use logic rules to convert "and" conditions to "or" conditions by negating the sense of the tests and inverting the action.

The "classic" method used here is called DeMorgan's method, named after DeMorgan's laws of logic. DeMorgan's laws state that these two statements are logically equivalent:

    not(A or B)

    (not A) and (not B)

and that these two statements are also logically equivalent:

    not(A and B)

    (not A) or (not B)

What this means for procmail people like us is that we can achieve "or" relationships between conditions like this:

    :0
    * ! condition 1
    * ! condition 2
    { }

    :0 E
    some_action

In a real situation, let's say we want to save mail to the mom mailbox if it comes from Mom, or if it has the phrase "new recipe" in the subject line. We can do this:

    :0
    * ! ^From:.*mom
    * ! ^Subject:.*new recipe
    { }

    :0 E:
    mom

In English we read:

    If the mail is not from mom AND the mail does not contain "new
    recipe" in the subject line, do nothing (and, since this is a
    nesting block, we continue to the next set of recipes after the
    'E' recipe).

    Otherwise (that is, at least one of the conditions was satisfied),
    we save the mail in the "mom" mailbox.

DeMorgan's Laws come in handy when you only have primitive ways to express logical relationships, such as this. See Note 3 for yet another way to achieve "or" conditions.

Note 3

For the impatient, you can "or" conditions together using "scoring", an advanced yet essential feature of procmail:

    :0
    * 1^0 condition 1
    * 1^0 condition 2
    * 1^0 condition 3
    some_action

You can read more about scoring in the procmailsc(5) manpage and in a future installment in this series.
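
As a sketch of what that looks like in practice, the "mom" example from Note 2 could be written with scoring as follows. Each condition that matches adds 1 to the score, and the action triggers if the final score is positive:

    :0:
    * 1^0 ^From:.*mom
    * 1^0 ^Subject:.*new recipe
    mom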

Note 4

The notions of stdin ("standard in"), stdout ("standard out"), and stderr ("standard error") are grounded in Unix I/O (input/output). Nearly all Unix system programs read from stdin and write to stdout. ls -l | less is a good example: ls -l normally sends its output to your terminal (screen or virtual terminal, whatever you want to call it). We can redirect its output (called stdout) through a pipe to another program (in this case, less). less receives the data on its own stdin and displays it to us (via its stdout) a screenful at a time.

Note 5

Piped variable assignment is problematic on certain platforms; this is a known bug in procmail but has not been fixed as of version 3.22.

If you find that piped variable assignment is problematic on your platform, you can always substitute it with backticks. If you are using the piped variable assignment as your recipe action:

    :0
    * some condition
    VARIABLE=|some_program

the equivalent backtick version uses nested blocks:

    :0
    * some condition
    { VARIABLE = `some_program` }

Note 6

Simple variable assignments may not occur as the action of a recipe; piped variable assignments are the exception. You may put a regular variable assignment in a nested block, as seen in Note 5 above, and accomplish the same effect.

PREVIOUS

Anatomy of a Procmail Recipe, Part II

NEXT

proctut8

SEE ALSO

procmail(1), procmailrc(5), procmailex(5)

AUTHOR

Scott Wiersdorf <scott@perlcode.org>

COPYRIGHT

Copyright (c) 2003 Scott Wiersdorf. All rights reserved.

REVISION

$Id: proctut7.pod,v 1.4 2004/05/26 06:06:59 scott Exp $