NAME

proctip2 - how to "or" recipe conditions

SYNOPSIS

Procmail recipe conditions are "and" conditions: all of the conditions must be satisfied for the recipe to be "true". Often, we want to "or" our conditions instead; that is, satisfying any one of the conditions should be enough for the recipe to be "true".

This can be accomplished in at least four ways: multiple recipes, DeMorgan's Laws applied to recipes, regular expression grouping in the conditions, and condition scoring.

Multiple Recipes

The simplest way of achieving "or" results is by using multiple recipes:

    :0:
    * ^Subject:.*results
    /var/mail/temp

    :0:
    * ^Subject:.*mlm
    /var/mail/temp

The result is straightforward: if the first recipe's condition matches, its action triggers and the message is delivered. Otherwise, processing moves on to the second recipe, and so on.

DeMorgan's Laws

DeMorgan's Laws allow us to logically invert a recipe, effectively achieving an "or" relationship between conditions. To achieve this, we write the recipe as if we were requiring all conditions to be true (i.e., an "and" relationship), but invert each condition test with an exclamation point. Then we assign an empty action to that recipe; a second recipe with the procmail E flag (mnemonic: "else") triggers the action we want only if the previous recipe did not match (i.e., at least one of the inverted conditions was false).

    :0
    * ! ^Subject:.*results
    * ! ^Subject:.*mlm
    { }

    :0 E:
    /var/mail/temp

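The first recipe matches only when neither subject pattern is present, so the second recipe triggers whenever at least one of them is: "not (not A and not B)" is the same as "A or B".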

Regular Expression Grouping

Procmail regular expressions allow us to "or" conditions together under some circumstances:

    :0:
    * ^Subject:.*(results|mlm)
    /var/mail/temp

This is often the most convenient and succinct way to "or" a condition, but can become unwieldy if many terms are required.
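
When the list of terms grows, one way to keep the condition itself short is to hold the alternation in a variable and let procmail expand it. This is only a sketch: TERMS is a made-up variable name, and "standings" is an extra term added purely for illustration.

    TERMS="(results|mlm|standings)"

    :0:
    * $ ^Subject:.*$TERMS
    /var/mail/temp

The dollar sign after the asterisk asks procmail to substitute variables in the condition before it is used as a regular expression.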

Condition Scoring

The procmailsc(5) manpage introduces us to another way to effectively "or" conditions together. By assigning a "score" to each condition as a mail message is scanned, we can determine if any of the conditions were met:

    :0:
    * 1^0 ^Subject:.*results
    * 1^0 ^Subject:.*mlm
    /var/mail/temp

The action line will trigger only if at least one of the conditions was satisfied. Here 1^0 adds 1 to the score the first time its condition matches. A recipe whose total score is positive once the action line is reached triggers its action (we can also use negative scoring, according to procmailsc(5)).
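
Negative weights make it possible to cancel a match. As a sketch (the "test" pattern is invented for illustration), the following recipe files the message if either of the first two conditions matches, unless the subject also contains "test":

    :0:
    * 1^0 ^Subject:.*results
    * 1^0 ^Subject:.*mlm
    * -2^0 ^Subject:.*test
    /var/mail/temp

The most the first two conditions can score is 2, so a matching third condition always brings the total to 0 or below, and the action line does not trigger.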

A more efficient way to do this is to use the procmail "supremum" score, which is 2147483647 (that is, (2^31)-1). procmailsc(5) tells us that once a recipe has hit this score (or higher), subsequent conditions will be skipped. We can use this to our advantage to achieve efficient and readable recipes. Because 2147483647 is a little hard to remember, however, we often simply count backwards from 9 and use the result, which is larger than 2147483647, as our supremum score: 9876543210.

    :0:
    * 9876543210^0 ^Subject:.*results
    * 9876543210^0 ^Subject:.*mlm
    /var/mail/temp

Or, we can use a variable to make our recipe look tidier:

    SPR=9876543210
    :0:
    * $ $SPR^0 ^Subject:.*results
    * $ $SPR^0 ^Subject:.*mlm
    /var/mail/temp

The dollar sign after the asterisk tells procmail to expand variables in the condition, which turns "$SPR" into 9876543210. Using the supremum score is much more efficient than 1^0: with 1^0, procmail continues to try each condition even if the first one has been met, whereas with the supremum score, procmail immediately skips the remaining conditions and triggers the action line. This can make a big difference when we have dozens of conditions to check.
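
If we would rather not rely on an oversized number, the variable can hold the exact value given above instead. A minimal sketch, where MAXSCORE is just an illustrative name:

    MAXSCORE=2147483647

    :0:
    * $ $MAXSCORE^0 ^Subject:.*results
    * $ $MAXSCORE^0 ^Subject:.*mlm
    /var/mail/temp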

DISCUSSION

Each of these methods has its advantages and disadvantages. The multiple recipe method is simple to understand and read, but makes for large recipe files and expensive overhead if there are many recipes. Each recipe sets a lockfile, opens and scans the message, and then removes the lockfile.

Applying DeMorgan's Laws to a recipe is probably the most efficient method as far as internal procmail processing goes. Since we still use native "and" relationships between conditions, the first false condition causes the recipe to fail immediately. The downside of using DeMorgan's Laws on conditions is that sometimes it isn't easy to invert a condition, and once we have, it can be very hard to understand what we've done when we look at it months later. DeMorgan's Laws make otherwise straightforward recipes a little awkward.

Regular expression grouping is efficient for the most part since we reduce the number of conditions to scan for. However, some of this efficiency is countered by increasingly complex regular expressions. Alternation (grouping with parentheses and pipe characters) can be an expensive operation under some circumstances. Regular expressions also break down after just a dozen or so conditions, at which point they become hard to manage and read.

Scoring conditions with supremum scoring is probably the best all-around solution: it scales well without adding readability problems, is efficient since it exits quickly once a condition has been satisfied (i.e., it does not continue processing remaining conditions), and is relatively simple (compared to DeMorgan's Laws) to implement. The drawbacks to using supremum scoring are that it is a bit of a stretch for beginners to grasp, requires string interpolation when the score is kept in a variable (the leading dollar sign causes procmail to scan the condition for variable-looking strings and convert them), and is a bit messy when there are only a few conditions.

SUMMARY

We have at least four decent ways to "or" recipe conditions together: multiple recipes, DeMorgan's Laws, regular expression grouping, and condition scoring; each has its advantages depending on the number of conditions, efficiency considerations, and readability considerations.

SEE ALSO

procmailsc(5)

AUTHOR

Scott Wiersdorf <scott@perlcode.org>

COPYRIGHT

Copyright (c) 2003 Scott Wiersdorf. All rights reserved.

REVISION

$Id: proctip2.pod,v 1.1 2003/10/10 04:48:44 deep Exp $