
5 The MFL Library Functions

This chapter describes library functions available in Mailfromd version 8.7. For simplicity, we use the word ‘boolean’ to indicate variables of numeric type that are used as boolean values. For such variables, the term ‘False’ stands for the numeric 0, and ‘True’ for any non-zero value.

5.1 Sendmail Macro Access Functions

Built-in Function: string getmacro (string macro)

Returns the value of Sendmail macro macro. If macro is not defined, raises the e_macroundef exception.

Calling getmacro(name) is completely equivalent to referencing ${name}, except that it lets you construct macro names programmatically, e.g.:

  if getmacro("auth_%var") = "foo"
    …
  fi
Built-in Function: boolean macro_defined (string name)

Return true if Sendmail macro name is defined.

Note that if your MTA supports macro name negotiation, you will have to export the macro names used by these two functions using the ‘#pragma miltermacros’ construct. Consider this example:

func authcheck(string name)
do
  string macname "auth_%name"
  if macro_defined(macname)
    if getmacro(macname)
      …
    fi
  fi
done

#pragma miltermacros envfrom auth_authen

prog envfrom
do
  authcheck("authen")
done

In this case, the parser cannot deduce that the envfrom handler will attempt to reference the ‘auth_authen’ macro, therefore the ‘#pragma miltermacros’ is used to help it.

5.2 String Manipulation Functions

Built-in Function: string escape (string str, [string chars])

Returns a copy of str with the characters from chars escaped, i.e. prefixed with a backslash. If chars is not specified, ‘\"’ is assumed.

escape('"a\tstr"ing') ⇒ '\"a\\tstr\"ing'
escape('new "value"', '\" ') ⇒ 'new\ \"value\"'
Built-in Function: string unescape (string str)

Performs the reverse of ‘escape’, i.e. removes any prefix backslash characters.

unescape('a \"quoted\" string') ⇒ 'a "quoted" string'
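
The behaviour of escape and unescape can be modelled outside MFL. The following Python sketch is not part of mailfromd; the function names mirror the MFL ones for illustration only. It assumes that unescape simply drops an escaping backslash and keeps whatever character follows it, a detail the description above does not spell out.

```python
def escape(s, chars='\\"'):
    """Prefix every character of s that occurs in chars with a backslash."""
    return "".join("\\" + c if c in chars else c for c in s)


def unescape(s):
    """Drop each escaping backslash and keep the character after it."""
    out = []
    i = 0
    while i < len(s):
        if s[i] == "\\" and i + 1 < len(s):
            i += 1  # skip the backslash itself
        out.append(s[i])
        i += 1
    return "".join(out)


# Python raw strings stand in for MFL single-quoted literals.
print(escape(r'"a\tstr"ing'))            # \"a\\tstr\"ing
print(escape('new "value"', '\\" '))     # new\ \"value\"
print(unescape(r'a \"quoted\" string'))  # a "quoted" string
```

In this model, unescape(escape(s)) returns s for any input, which is the round-trip property the two functions are meant to provide.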
Built-in Function: string unescape (string str, [string chars])
Built-in Function: string domainpart (string str)

Returns the domain part of str, if it is a valid email address, otherwise returns str itself.

domainpart("gray") ⇒ "gray"
domainpart("gray@gnu.org.ua") ⇒ "gnu.org.ua"
Built-in Function: number index (string s, string t)
Built-in Function: number index (string s, string t, number start)

Returns the index of the first occurrence of the string t in the string s, or -1 if t is not present.

index("string of rings", "ring") ⇒ 2

Optional argument start, if supplied, indicates the position in string where to start searching.

index("string of rings", "ring", 3) ⇒ 10

To find the last occurrence of a substring, use the function rindex (see rindex).

Built-in Function: number interval (string str)

Converts str, which should be a valid time interval specification (see time interval specification), to seconds.

Built-in Function: number length (string str)

Returns the length of the string str in bytes.

length("string") ⇒ 6  
Built-in Function: string dequote (string str)

Removes ‘<’ and ‘>’ surrounding str. If str is not enclosed by angle brackets or these are unbalanced, the argument is returned unchanged:

dequote("<root@gnu.org.ua>") ⇒ "root@gnu.org.ua"
dequote("root@gnu.org.ua") ⇒ "root@gnu.org.ua"
dequote("there>") ⇒ "there>"
Built-in Function: string localpart (string str)

Returns the local part of str if it is a valid email address, otherwise returns str unchanged.

localpart("gray") ⇒ "gray"
localpart("gray@gnu.org.ua") ⇒ "gray"
Built-in Function: string replstr (string s, number n)

Replicate a string, i.e. return a string, consisting of s repeated n times:

replstr("12", 3) ⇒ "121212"
Built-in Function: string revstr (string s)

Returns the string composed of the characters from s in reversed order:

revstr("foobar") ⇒ "raboof"
Built-in Function: number rindex (string s, string t)
Built-in Function: number rindex (string s, string t, number start)

Returns the index of the last occurrence of the string t in the string s, or -1 if t is not present.

rindex("string of rings", "ring") ⇒ 10

Optional argument start, if supplied, indicates the position in string where to start searching. E.g.:

rindex("string of rings", "ring", 10) ⇒ 2

See also String manipulation.

Built-in Function: string substr (string str, number start)
Built-in Function: string substr (string str, number start, number length)

Returns a substring of str that starts at start and is at most length characters long. If length is omitted, the rest of str is used.

If length is greater than the actual length of the string, the e_range exception is signalled.

substr("mailfrom", 4) ⇒ "from"
substr("mailfrom", 4, 2) ⇒ "fr" 
Built-in Function: string substring (string str, number start, number end)

Returns a substring of str between offsets start and end, inclusive. Negative end means offset from the end of the string. In other words, to obtain a substring from start to the end of the string, use substring(str, start, -1):

substring("mailfrom", 0, 3) ⇒ "mail"
substring("mailfrom", 2, 5) ⇒ "ilfr" 
substring("mailfrom", 4, -1) ⇒ "from"
substring("mailfrom", 4, length("mailfrom") - 1) ⇒ "from"
substring("mailfrom", 4, -2) ⇒ "fro"

This function signals the e_range exception if either start or end is outside the string length.
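
The inclusive offsets and the negative end are easy to get wrong, so here is a small Python model of the rules stated above. It is an illustration only, not mailfromd code: IndexError stands in for the e_range exception, and the handling of an end that precedes start is not covered by the text and is left unspecified here.

```python
def substring(s, start, end):
    """Model of MFL substring: both offsets are 0-based and inclusive."""
    if end < 0:
        end += len(s)  # -1 names the last character, -2 the one before it
    if not (0 <= start < len(s)) or not (0 <= end < len(s)):
        raise IndexError("e_range")  # stands in for the MFL exception
    return s[start:end + 1]


print(substring("mailfrom", 0, 3))   # mail
print(substring("mailfrom", 4, -1))  # from
print(substring("mailfrom", 4, -2))  # fro
```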

Built-in Function: string tolower (string str)

Returns a copy of the string str, with all the upper-case characters translated to their corresponding lower-case counterparts. Non-alphabetic characters are left unchanged.

tolower("MAIL") ⇒ "mail"
Built-in Function: string toupper (string str)

Returns a copy of the string str, with all the lower-case characters translated to their corresponding upper-case counterparts. Non-alphabetic characters are left unchanged.

toupper("mail") ⇒ "MAIL"
Built-in Function: string ltrim (string str [, string cset])

Returns a copy of the input string str with any leading characters present in cset removed. If the latter is not given, white space is removed (spaces, tabs, newlines, carriage returns, and line feeds).

ltrim("  a string") ⇒ "a string"
ltrim("089", "0") ⇒ "89"

Note the last example. It shows how ltrim can be used to convert a decimal number whose string representation begins with ‘0’. Normally such strings are treated as octal numbers. If they are in fact decimal, use ltrim to strip the leading zeros, e.g.:

set dayofyear ltrim(strftime('%j', time()), "0")
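
Why the leading zero matters can be seen in any language that has an octal reading of numbers. The Python sketch below is an illustration only: Python is not MFL, int(text, 8) merely stands in for an octal interpretation, and both helper names are hypothetical. It also handles one corner case explicitly: a string made only of zeros becomes empty once the zeros are stripped.

```python
def decimal_from_padded(text):
    """Parse a zero-padded decimal string such as '089' (a day of year)."""
    stripped = text.lstrip("0")  # same effect as ltrim(text, "0") in MFL
    return int(stripped) if stripped else 0


def is_valid_octal(text):
    """Report whether text can be read as an octal number."""
    try:
        int(text, 8)
        return True
    except ValueError:
        return False


print(is_valid_octal("089"))       # False: 8 and 9 are not octal digits
print(decimal_from_padded("089"))  # 89
print(int("017", 8), decimal_from_padded("017"))  # 15 17
```

The last line shows the quieter failure mode: ‘017’ is a valid octal number, so an octal reading would not fail but would silently yield 15 instead of 17.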
Built-in Function: string rtrim (string str [, string cset])

Returns a copy of the input string str with any trailing characters present in cset removed. If the latter is not given, white space is removed (spaces, tabs, newlines, carriage returns, and line feeds).

Built-in Function: number vercmp (string a, string b)

Compares two strings as mailfromd version numbers. The result is negative if b precedes a, zero if they refer to the same version, and positive if b follows a:

vercmp("5.0", "5.1") ⇒ 1
vercmp("4.4", "4.3") ⇒ -1
vercmp("4.3.1", "4.3") ⇒ -1
vercmp("8.0", "8.0") ⇒ 0
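
The sign convention is the reverse of what C's strcmp would suggest, so a model may help. The Python sketch below reproduces the four examples under the assumption that version strings consist of dot-separated decimal numbers only; how the real function treats other forms is not stated here, and the real function may return magnitudes other than 1.

```python
def vercmp(a, b):
    """Model of MFL vercmp for dotted, purely numeric version strings."""
    pa = [int(part) for part in a.split(".")]
    pb = [int(part) for part in b.split(".")]
    # Python compares lists element by element; a list that is a prefix
    # of a longer one sorts first, so "4.3" precedes "4.3.1".
    if pb > pa:
        return 1   # b follows a
    if pb < pa:
        return -1  # b precedes a
    return 0


print(vercmp("5.0", "5.1"))    # 1
print(vercmp("4.3.1", "4.3"))  # -1
```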
Library Function: string sa_format_score (number code, number prec)

Format code as a floating-point number with prec decimal digits:

sa_format_score(5000, 3) ⇒ "5.000"

This function is convenient for formatting SpamAssassin scores for use in message headers and textual reports. It is defined in module sa.mf.

See SpamAssassin, for examples of its use.

Library Function: string sa_format_report_header (string text)

Format a SpamAssassin report text in order to include it in an RFC 822 header. This function selects the score listing from text, and prefixes each line with ‘* ’. Its result looks like:

*  0.2 NO_REAL_NAME           From: does not include a real name
*  0.1 HTML_MESSAGE           BODY: HTML included in message

See SpamAssassin, for examples of its use.

Library Function: string strip_domain_part (string domain, number n)

Returns at most n last components of the domain name domain. If n is 0 the function returns domain.

This function is defined in the module strip_domain_part.mf (see Modules).

Examples:

require strip_domain_part
strip_domain_part("puszcza.gnu.org.ua", 2) ⇒ "org.ua"
strip_domain_part("puszcza.gnu.org.ua", 0) ⇒ "puszcza.gnu.org.ua"
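
As an illustration of the "at most n last components" rule, here is a Python model of the function. It is a sketch, not the code from strip_domain_part.mf, and it assumes that components are separated by single dots.

```python
def strip_domain_part(domain, n):
    """Keep at most the last n dot-separated components of domain."""
    if n == 0:
        return domain  # 0 means "return the name unchanged"
    return ".".join(domain.split(".")[-n:])


print(strip_domain_part("puszcza.gnu.org.ua", 2))  # org.ua
print(strip_domain_part("gnu.org.ua", 5))          # gnu.org.ua
```

The second call shows the "at most" part: asking for more components than the name has returns the whole name.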
Library Function: boolean is_ip (string str)

Returns ‘true’ if str is a valid IPv4 address. This function is defined in the module is_ip.mf (see Modules).

For example:

require is_ip

is_ip("1.2.3.4") ⇒ 1
is_ip("1.2.3.x") ⇒ 0
is_ip("blah") ⇒ 0
is_ip("255.255.255.255") ⇒ 1
is_ip("0.0.0.0") ⇒ 1
Library Function: string revip (string ip)

Reverses octets in ip, which must be a valid string representation of an IPv4 address.

Example:

revip("127.0.0.1") ⇒ "1.0.0.127"
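
The two functions is_ip and revip are often used together, since revip expects a valid address. The Python model below is a sketch under one assumption that the text does not confirm: that a valid IPv4 address is four dot-separated decimal numbers, each in the range 0 to 255.

```python
def is_ip(s):
    """Return 1 if s looks like a dotted-quad IPv4 address, else 0."""
    parts = s.split(".")
    ok = len(parts) == 4 and all(
        p.isascii() and p.isdigit() and int(p) <= 255 for p in parts
    )
    return int(ok)


def revip(ip):
    """Reverse the order of the octets in a dotted-quad address."""
    return ".".join(reversed(ip.split(".")))


print(is_ip("1.2.3.4"), is_ip("1.2.3.x"))  # 1 0
print(revip("127.0.0.1"))                  # 1.0.0.127
```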

Library Function: string verp_extract_user (string email, string domain)

If email is a valid VERP-style email address for domain, this function returns the user name corresponding to that email. Otherwise, it returns an empty string.

verp_extract_user("gray=gnu.org.ua@tuhs.org", 'gnu\..*')
  ⇒ "gray"

5.3 String formatting

Built-in Function: string sprintf (string format, …)

The function sprintf formats its arguments according to format (see below) and returns the resulting string. It takes a varying number of parameters, the only mandatory one being format.

Format string

The format string is a simplified version of the format argument to C printf-family functions.

The format string is composed of zero or more directives: ordinary characters (not ‘%’), which are copied unchanged to the output stream; and conversion specifications, each of which results in fetching zero or more subsequent arguments. Each conversion specification is introduced by the character ‘%’, and ends with a conversion specifier. In between there may be (in this order) zero or more flags, an optional minimum field width, and an optional precision.

Note that, in practice, this means you should enclose format arguments in single quotes, to protect conversion specifications from being recognized as variable references (see single-vs-double).

No type conversion is done on arguments, so it is important that the supplied arguments match their corresponding conversion specifiers. By default, the arguments are used in the order given, where each ‘*’ and each conversion specifier asks for the next argument. If too few arguments are given, sprintf raises the ‘e_range’ exception. One can also specify explicitly which argument is taken, at each place where an argument is required, by writing ‘%m$’ instead of ‘%’ and ‘*m$’ instead of ‘*’, where the decimal integer m denotes the position in the argument list of the desired argument, indexed starting from 1. Thus,

    sprintf('%*d', width, num);

and

    sprintf('%2$*1$d', width, num);

are equivalent. The second style allows repeated references to the same argument.
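
Readers who know Python can try the same idea there. This is an analogy only, not MFL: Python's % operator accepts ‘*’ for the width but has no ‘%m$’ notation, so str.format with explicit argument indexes plays the comparable role. Note that Python counts arguments from 0, whereas sprintf counts from 1.

```python
width, num = 6, 42

# Width taken from the next argument, like sprintf('%*d', width, num).
sequential = "%*d" % (width, num)

# Argument 1 printed with the width held in argument 0,
# like sprintf('%2$*1$d', width, num).
positional = "{1:{0}d}".format(width, num)

print(repr(sequential))  # '    42'
print(repr(positional))  # '    42'

# Explicit positions also allow one argument to be used twice.
print("{0}/{0}".format(num))  # 42/42
```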

Flag characters

The character ‘%’ is followed by zero or more of the following flags:

#

The value should be converted to an alternate form. For ‘o’ conversions, the first character of the output string is made zero (by prefixing a ‘0’ if it was not zero already). For ‘x’ and ‘X’ conversions, a non-zero result has the string ‘0x’ (or ‘0X’ for ‘X’ conversions) prepended to it. Other conversions are not affected by this flag.

0

The value should be zero padded. For ‘d’, ‘i’, ‘o’, ‘u’, ‘x’, and ‘X’ conversions, the converted value is padded on the left with zeros rather than blanks. If the ‘0’ and ‘-’ flags both appear, the ‘0’ flag is ignored. If a precision is given, the ‘0’ flag is ignored. Other conversions are not affected by this flag.

-

The converted value is to be left adjusted on the field boundary. (The default is right justification.) The converted value is padded on the right with blanks, rather than on the left with blanks or zeros. A ‘-’ overrides a ‘0’ if both are given.

' ' (a space)

A blank should be left before a positive number (or empty string) produced by a signed conversion.

+

A sign (‘+’ or ‘-’) is always placed before a number produced by a signed conversion. By default a sign is used only for negative numbers. A ‘+’ overrides a space if both are used.

Field width

An optional decimal digit string (with nonzero first digit) specifying a minimum field width. If the converted value has fewer characters than the field width, it will be padded with spaces on the left (or right, if the left-adjustment flag has been given). Instead of a decimal digit string one may write ‘*’ or ‘*m$’ (for some decimal integer m) to specify that the field width is given in the next argument, or in the m-th argument, respectively, which must be of numeric type. A negative field width is taken as a ‘-’ flag followed by a positive field width. In no case does a non-existent or small field width cause truncation of a field; if the result of a conversion is wider than the field width, the field is expanded to contain the conversion result.

Precision

An optional precision, in the form of a period (‘.’) followed by an optional decimal digit string. Instead of a decimal digit string one may write ‘*’ or ‘*m$’ (for some decimal integer m) to specify that the precision is given in the next argument, or in the m-th argument, respectively, which must be of numeric type. If the precision is given as just ‘.’, or the precision is negative, the precision is taken to be zero. This gives the minimum number of digits to appear for ‘d’, ‘i’, ‘o’, ‘u’, ‘x’, and ‘X’ conversions, or the maximum number of characters to be printed from a string for the ‘s’ conversion.

Conversion specifier

A character that specifies the type of conversion to be applied. The conversion specifiers and their meanings are:

d
i

The numeric argument is converted to signed decimal notation. The precision, if any, gives the minimum number of digits that must appear; if the converted value requires fewer digits, it is padded on the left with zeros. The default precision is ‘1’. When ‘0’ is printed with an explicit precision ‘0’, the output is empty.

o
u
x
X

The numeric argument is converted to unsigned octal (‘o’), unsigned decimal (‘u’), or unsigned hexadecimal (‘x’ and ‘X’) notation. The letters ‘abcdef’ are used for ‘x’ conversions; the letters ‘ABCDEF’ are used for ‘X’ conversions. The precision, if any, gives the minimum number of digits that must appear; if the converted value requires fewer digits, it is padded on the left with zeros. The default precision is ‘1’. When ‘0’ is printed with an explicit precision 0, the output is empty.

s

The string argument is written to the output. If a precision is specified, no more than the number specified of characters are written.

%

A ‘%’ is written. No argument is converted. The complete conversion specification is ‘%%’.

5.4 Character Type

These functions check whether all characters of str fall into a certain character class according to the ‘C’ (‘POSIX’) locale. ‘True’ (1) is returned if they do, ‘false’ (0) is returned otherwise. In the latter case, the global variable ctype_mismatch is set to the index of the first character that is outside of the character class (characters are indexed from 0).

Built-in Function: boolean isalnum (string str)

Checks for alphanumeric characters:

  isalnum("a123") ⇒ 1
  isalnum("a.123") ⇒ 0 (ctype_mismatch = 1)
Built-in Function: boolean isalpha (string str)

Checks for alphabetic characters:

  isalpha("abc") ⇒ 1
  isalpha("a123") ⇒ 0
Built-in Function: boolean isascii (string str)

Checks whether all characters in str are 7-bit ones, that fit into the ASCII character set.

  isascii("abc") ⇒ 1
  isascii("ab\0200") ⇒ 0
Built-in Function: boolean isblank (string str)

Checks if str contains only blank characters; that is, spaces or tabs.

Built-in Function: boolean iscntrl (string str)

Checks for control characters.

Built-in Function: boolean isdigit (string str)

Checks for digits (0 through 9).

Built-in Function: boolean isgraph (string str)

Checks for any printable characters except spaces.

Built-in Function: boolean islower (string str)

Checks for lower-case characters.

Built-in Function: boolean isprint (string str)

Checks for printable characters including space.

Built-in Function: boolean ispunct (string str)

Checks for any printable characters that are not spaces or alphanumeric characters.

Built-in Function: boolean isspace (string str)

Checks for white-space characters, i.e.: space, form-feed (‘\f’), newline (‘\n’), carriage return (‘\r’), horizontal tab (‘\t’), and vertical tab (‘\v’).

Built-in Function: boolean isupper (string str)

Checks for uppercase letters.

Built-in Function: boolean isxdigit (string str)

Checks for hexadecimal digits, i.e. one of ‘0’, ‘1’, ‘2’, ‘3’, ‘4’, ‘5’, ‘6’, ‘7’, ‘8’, ‘9’, ‘a’, ‘b’, ‘c’, ‘d’, ‘e’, ‘f’, ‘A’, ‘B’, ‘C’, ‘D’, ‘E’, ‘F’.

5.5 Email processing functions

Built-in Function: number email_map (string email)

Parses email and returns a bitmap, consisting of zero or more of the following flags:

EMAIL_MULTIPLE

email has more than one email address.

EMAIL_COMMENTS

email has comment parts.

EMAIL_PERSONAL

email has personal part.

EMAIL_LOCAL

email has local part.

EMAIL_DOMAIN

email has domain part.

EMAIL_ROUTE

email has route part.

These constants are declared in the email.mf module. The function email_map returns 0 if its argument is not a valid email address.

Library Function: boolean email_valid (string email)

Returns ‘True’ (1) if email is a valid email address, consisting of local and domain parts only. E.g.:

email_valid("gray@gnu.org") ⇒ 1
email_valid("gray") ⇒ 0
email_valid('"Sergey Poznyakoff" <gray@gnu.org>') ⇒ 0

This function is defined in email.mf (see Modules).

5.6 Envelope Modification Functions

Envelope modification functions set the sender and add or delete recipient addresses from the message envelope. This allows MFL scripts to redirect messages to other addresses.

Built-in Function: void set_from (string email [, string args])

Sets envelope sender address to email, which must be a valid email address. Optional args supply arguments to ESMTP ‘MAIL FROM’ command.

Built-in Function: void rcpt_add (string address)

Add the e-mail address to the envelope.

Built-in Function: void rcpt_delete (string address)

Remove address from the envelope.

The following example code uses these functions to implement a simple alias-like capability:

prog envrcpt
do
   string alias dbget(aliasdb, $1, "NULL", 1)
   if alias != "NULL"
     rcpt_delete($1)
     rcpt_add(alias)
   fi
done

5.7 Header Modification Functions

There are two ways to modify message headers in an MFL script. The first is to use header actions, described in Actions, and the second is to use message modification functions. Compared with the actions, the functions offer a number of advantages. For example, using functions you can construct the name of the header to operate upon (e.g. by concatenating several arguments), something which is impossible when using actions. Moreover, apart from the three basic operations (add, modify and remove) supported by header actions, header functions allow you to insert a new header at a particular place.

Built-in Function: void header_add (string name, string value [, number idx])

Adds a header ‘name: value’ to the message. If idx is given, it specifies a 0-based index in the header list where to insert this header.

If idx is not supplied, the header is appended to the end of the header list.

In contrast to the add action, this function allows you to construct the header name using arbitrary MFL expressions.

Built-in Function: void header_insert (string name, string value, number idx)

This function is equivalent to header_add with three arguments, i.e. it inserts a header ‘name: value’ at the idxth header position in the message.

Built-in Function: void header_delete (string name [, number index])

Delete header name from the message. If index is given, delete the indexth instance of the header name.

Notice the differences between this function and the delete action:

  1. It allows you to construct the header name, whereas delete requires it to be a literal string.
  2. The optional index argument allows you to select a particular header instance to delete.
Built-in Function: void header_replace (string name, string value [, number index])

Replace the value of the header name with value. If index is given, replace indexth instance of header name.

Notice the differences between this function and the replace action:

  1. It allows you to construct the header name, whereas replace requires it to be a literal string.
  2. The optional index argument allows you to select a particular header instance to replace.
Built-in Function: void header_delete_nth (number n)

Deletes nth header. Headers are numbered from 1.

Built-in Function: void header_replace_nth (number n, string name, string value)

Replaces nth header with ‘name: value’.

Library Function: void header_rename (string name, string newname[, number idx])

Defined in the module header_rename.mf.
Available only in the ‘eom’ handler.

Renames the idxth instance of header name to newname. If idx is not given, assumes 1.

The example below renames ‘Subject’ header to ‘X-Old-Subject’:

require 'header_rename'

prog eom
do
  header_rename("Subject", "X-Old-Subject")
done
Library Function: void header_prefix_all (string name [, string prefix])

Defined in the module header_rename.mf.
Available only in the ‘eom’ handler.

If prefix is given, rename all headers named name to ‘prefix-name’. Otherwise, remove all such headers.

Library Function: void header_prefix_pattern (string pattern, string prefix)

Defined in the module header_rename.mf.
Available only in the ‘eom’ handler.

If prefix is given, rename all headers whose names match pattern (in the sense of fnmatch, see fnmatches) to ‘prefix-name’. Otherwise, remove them.

For example, to prefix all headers whose names begin with ‘X-Spamd-’ with an additional ‘X-’:

require 'header_rename'

prog eom
do
  header_prefix_pattern("X-Spamd-*", "X-")
done

5.8 Body Modification Functions

Body modification is an experimental feature of MFL. Version 8.7 provides two functions for that purpose.

Built-in Function: void replbody (string text)

Replace the body of the message with text. Note that text must not contain RFC 822 headers. See the previous section if you want to manipulate message headers.

Example:

  replbody("Body of this message has been removed by the mail filter.")

No restrictions are imposed on the format of text.

Built-in Function: void replbody_fd (number fd)

Replaces the body of the message with the content of the stream fd. Use this function if the body is very big, or if it is returned by an external program.

Notice that this function starts reading from the current position in fd. Use rewind if you wish to read from the beginning of the stream.

The example below shows how to preprocess the body of the message using an external program, /usr/bin/mailproc, which is supposed to read the body from its standard input and write the processed text to its standard output:

number fd   # Temporary file descriptor

prog data
do
  # Open the temporary file
  set fd tempfile()
done  

prog body
do
  # Write the body to it.
  write_body(fd, $1, $2)
done

prog eom
do
  # Use the resulting stream as the stdin to the mailproc
  # command and read the new body from its standard output.
  rewind(fd)
  replbody_fd(spawn("</usr/bin/mailproc", fd))
done

5.9 Message Modification Queue

Message modification functions described in the previous subsections do not take effect immediately, at the moment they are called. Instead, they store the requested changes in the internal message modification queue. These changes are applied at the end of processing, before the ‘eom’ stage finishes (see Figure 3.1).

One important consequence of this mode of operation is that calling any MTA action (see Actions) causes all prior modifications to the message to be ignored. That is because, after receiving the action command, the MTA will not call the filter for that message any more. In particular, the ‘eom’ handler will not be called, and the message modification queue will not be flushed. While this is logical for such actions as reject or tempfail, it may be quite confusing for accept. Consider, for example, the following code:

prog envfrom
do
  if $1 == ""
    header_add("X-Filter", "foo")
    accept
  fi
done

Obviously, the intention was to add a ‘X-Filter’ header and accept the message if it was sent from the null address. What happens in reality, however, is a bit different: the message is accepted, but no header is added to it. If you need to accept the message and retain any modifications you have done to it, you need to use an auxiliary variable, e.g.:

number accepted 0
prog envfrom
do
  if $1 == ""
    header_add("X-Filter", "foo")
    set accepted 1
  fi
done

Then, test this variable for non-zero value at the beginning of each subsequent handler, e.g.:

prog data
do
  if accepted
    continue
  fi
  ...
done

To help you trace such problematic usages of accept, mailfromd emits the following warning:

RUNTIME WARNING near /etc/mailfromd.mf:36: `accept' causes previous
message modification commands to be ignored; call mmq_purge() prior
to `accept', to suppress this warning

If it is OK to lose all modifications, call mmq_purge, as suggested in this message.

Built-in Function: void mmq_purge ()

Remove all modification requests from the queue. This function undoes the effect of any of the following functions, if they had been called previously: rcpt_add, rcpt_delete, header_add, header_insert, header_delete, header_replace, replbody, quarantine.
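
The queueing behaviour described in this section can be summarized with a toy model. The Python class below is a sketch for illustration only: the class, its attributes and the finish method are hypothetical and do not correspond to mailfromd internals. It only mirrors the rule that queued changes are applied when processing reaches the end of ‘eom’ and are lost otherwise.

```python
class Message:
    """Toy model of a message with a modification queue."""

    def __init__(self):
        self.headers = []
        self.queue = []  # pending modification requests

    def header_add(self, name, value):
        # Nothing is changed yet; the request is only recorded.
        self.queue.append((name, value))

    def mmq_purge(self):
        # Forget every pending request.
        self.queue.clear()

    def finish(self, reached_eom):
        # The queue is flushed only if processing reaches the end of 'eom'.
        if reached_eom:
            self.headers.extend(self.queue)
        self.queue.clear()
        return self.headers


early = Message()
early.header_add("X-Filter", "foo")
print(early.finish(reached_eom=False))    # [] : accept in envfrom loses the header

deferred = Message()
deferred.header_add("X-Filter", "foo")
print(deferred.finish(reached_eom=True))  # [('X-Filter', 'foo')]
```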

5.10 Mail Header Functions

Built-in Function: string message_header_encode (string text, [string enc, string charset])

Encode text in accordance with RFC 2047. Optional arguments:

enc

Encoding to use. Valid values are ‘quoted-printable’, or ‘Q’ (the default) and ‘base64’, or ‘B’.

charset

Character set. By default ‘UTF-8’.

If the function is unable to encode the string, it raises the exception e_failure.

For example:

set string "Keld Jørn Simonsen <keld@dkuug.dk>"
message_header_encode(string, "Q", "ISO-8859-1")
  ⇒ "=?ISO-8859-1?Q?Keld_J=F8rn_Simonsen?= <keld@dkuug.dk>" 
Built-in Function: string message_header_decode (string text, [string charset])

text must be a header value encoded in accordance with RFC 2047. The function returns the decoded string. If the decoding fails, it raises the e_failure exception. The optional argument charset specifies the character set to use (default: ‘UTF-8’).

set string "=?ISO-8859-1?Q?Keld_J=F8rn_Simonsen?= <keld@dkuug.dk>"
message_header_decode(string)
 ⇒ "Keld Jørn Simonsen <keld@dkuug.dk>"
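
RFC 2047 encoding is not specific to mailfromd, so the encoded-word format can be explored with other tools. The Python sketch below uses the standard email.header module. It demonstrates the format only and is not mailfromd's implementation; details such as the letter case of the charset and encoding tags may differ between implementations.

```python
from email.header import Header, decode_header

name = "Keld Jørn Simonsen"

# Encode: ISO-8859-1 text becomes a 'Q' (quoted-printable) encoded word.
encoded = Header(name, charset="iso-8859-1").encode()
print(encoded)

# Decode: each encoded word yields raw bytes plus the charset it names.
raw, charset = decode_header("=?ISO-8859-1?Q?Keld_J=F8rn_Simonsen?=")[0]
print(raw.decode(charset))  # Keld Jørn Simonsen
```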
Built-in Function: string unfold (string text)

If text is a “folded” multi-line RFC 2822 header value, unfold it. If text is a single-line string, return its unchanged copy.

For example, suppose that the message being processed contained the following header:

List-Id: Sent bugreports to
  <some-address@some.net>

Then, applying unfold to its value will produce:

Sent bugreports to <some-address@some.net>
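
The effect shown above can be modelled with a single regular expression. The Python sketch below is an assumption-laden illustration, not the mailfromd implementation: it treats a fold as a line break followed by one or more spaces or tabs and replaces the whole fold with one space, which is what the example output suggests.

```python
import re


def unfold(text):
    """Join folded header lines, replacing each fold with a single space."""
    return re.sub(r"\r?\n[ \t]+", " ", text)


folded = "Sent bugreports to\n  <some-address@some.net>"
print(unfold(folded))  # Sent bugreports to <some-address@some.net>
```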

5.11 Mail Body Functions

Built-in Function: string body_string (pointer text, number count)

Converts first count bytes from the memory location pointed to by text into a regular string.

This function is intended to convert the $1 argument passed to a body handler to a regular MFL string. For more information about its use, see body handler.

Built-in Function: boolean body_has_nulls (pointer text, number count)

Returns ‘True’ if first count bytes of the string pointed to by text contain ASCII NUL characters.

Example:

prog body
do
  if body_has_nulls($1, $2)
    reject
  fi
done

5.12 EOM Functions

The following function is available only in the ‘eom’ handler:

Built-in Function: void progress ()

Notify the MTA that the filter is still processing the message. This causes the MTA to restart its timeouts and allows additional time for execution of ‘eom’.

Use this function if your ‘eom’ handler needs additional time for processing the message (e.g. for scanning a very big MIME message). You may call it several times if need be, although such usage is not recommended.

5.13 Current Message Functions

Built-in Function: number current_message ()

This function can be used in eom handlers only. It returns a message descriptor referring to the current message. See Message functions, for a description of functions for accessing messages.

The functions below access the headers from the current message. They are available in the following handlers: eoh, body, eom.

Built-in Function: number current_header_count ([string name])

Return number of headers in the current message. If name is specified, return number of headers that have this name.

  current_header_count() ⇒ 6
  current_header_count("Subject") ⇒ 1
Built-in Function: string current_header_nth_name (number n)

Return the name of the nth header. The index n is 1-based.

Built-in Function: string current_header_nth_value (number n)

Return the value of the nth header. The index n is 1-based.

Built-in Function: string current_header (string name [, number n])

Return the value of the named header, e.g.:

  set s current_header("Subject")

The optional second argument specifies the header instance, if there is more than one header of the same name, e.g.:

  set s current_header("Received", 2)

Header indices are 1-based.

All current_header functions raise the e_not_found exception if the requested header was not found.

5.14 Mailbox Functions

A set of functions is provided for accessing mailboxes and messages within them. In this subsection we describe the functions for accessing mailboxes.

A mailbox is opened using mailbox_open function:

Built-in Function: number mailbox_open (string url [, string mode, string perms])

Open a mailbox identified by url. Return a mailbox descriptor: a unique numeric identifier that can subsequently be used to access this mailbox.

The optional mode argument specifies the access mode for the mailbox. Its valid values are:

Value   Meaning
r       Open mailbox for reading. This is the default.
w       Open mailbox for writing. If the mailbox does not exist, it is created.
rw      Open mailbox for reading and writing. If the mailbox does not exist, it is created.
wr      Same as ‘rw’.
w+      Open mailbox for reading and writing. If the mailbox does not exist, it is created.
a       Open mailbox for appending messages to it. If the mailbox does not exist, an exception is signalled.
a+      Open mailbox for appending messages to it. If the mailbox does not exist, it is created.

The optional perms argument specifies the permissions to use in case a new file (or files) is created. It is a comma-separated list of:

[go](+|=)[wr]+

The initial letter controls which users’ access is to be set: users in the file’s group (‘g’) or other users not in the file’s group (‘o’). The following character controls whether the permissions are added to the default ones (‘+’) or applied instead of them (‘=’). The remaining letters specify the permissions: ‘r’ for read access and ‘w’ for write access. For example:

g=rw,o+r
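
To make the grammar concrete, the following Python sketch splits a perms string into its items and checks each one against the pattern given above. It is an illustration only: parse_perms is a hypothetical helper, not a mailfromd function, and it says nothing about how mailfromd applies the resulting permissions.

```python
import re

# One item of the comma-separated list: who, operator, permission letters.
ITEM = re.compile(r"([go])([+=])([wr]+)")


def parse_perms(spec):
    """Split a perms string into (who, operator, permissions) triples."""
    result = []
    for item in spec.split(","):
        m = ITEM.fullmatch(item)
        if m is None:
            raise ValueError("bad permission item: %r" % item)
        result.append(m.groups())
    return result


print(parse_perms("g=rw,o+r"))  # [('g', '=', 'rw'), ('o', '+', 'r')]
```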

The number of mailbox descriptors available for simultaneous opening is 64. This value can be changed using the max-open-mailboxes runtime configuration statement (see max-open-mailboxes).

Built-in Function: number mailbox_messages_count (number nmbx)

Return the number of messages in the mailbox. The argument nmbx is a valid mailbox descriptor as returned by a previous call to mailbox_open.

Built-in Function: number mailbox_get_message (number mbx, number n)

Retrieve nth message from the mailbox identified by descriptor mbx. On success, the function returns a message descriptor, an integer number that can subsequently be used to access that message (see Message functions). On error, an exception is raised.

Messages in a mailbox are numbered starting from 1.

Built-in Function: void mailbox_close (number nmbx)

Close a mailbox previously opened by mailbox_open.

Built-in Function: void mailbox_append_message (number nmbx, number nmsg)

Append message nmsg to mailbox nmbx. The message descriptor nmsg must be obtained from a previous call to mailbox_get_message or current_message (see current_message).

5.15 Message Functions

The functions described below retrieve information from RFC822 messages. The message to operate upon is identified by its descriptor, an integer number returned by a previous call to the mailbox_get_message (see mailbox_get_message) or current_message (see current_message) function. The maximum number of message descriptors is limited to 1024. You can change this limit using the max-open-messages runtime configuration statement (see max-open-messages).

Built-in Function: number message_size (number nmsg)

Return the size of the message nmsg, in bytes. Note that if nmsg refers to the current message (see current_message), the returned value is less than the size seen by the MTA, because mailfromd recodes CR-LF sequences to LF, i.e. removes carriage returns (ASCII 13) occurring before line feeds (ASCII 10). To obtain the actual message length as seen by the MTA, add the number of lines in the message:

  set actual_length message_size(nmsg) + message_lines(nmsg)
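
The arithmetic behind this correction can be checked with a small Python sketch. It is an illustration only: the sample message is made up, and it assumes that every line of the original ends with CR-LF, so that exactly one carriage return is removed per line.

```python
# A made-up message as the MTA would see it, with CR-LF line endings.
raw = b"Subject: test\r\n\r\nfirst line\r\nsecond line\r\n"

# Model of the recoding described above: CR-LF becomes LF.
recoded = raw.replace(b"\r\n", b"\n")

size = len(recoded)            # the value message_size would report
lines = recoded.count(b"\n")   # the value message_lines would report
actual_length = size + lines   # one carriage return restored per line

print(actual_length == len(raw))
```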
Built-in Function: boolean message_body_is_empty (number nmsg)

Returns true if the body of message nmsg has zero size or contains only whitespace characters. If the ‘Content-Transfer-Encoding’ header is present, it is used to decode body before processing.

Built-in Function: void message_close (number nmsg)

Close the message identified by descriptor nmsg.

Built-in Function: number message_lines (number nmsg)

Return total number of lines in message nmsg. The following relation holds true:

message_lines(x) = message_body_lines(x)
                         + message_header_lines(x) + 1

Built-in Function: string message_read_line (number nmsg)

Read and return next line from the message nmsg. If there are no more lines to read, raise the eof exception.

Use message_rewind to rewind the message stream and read its contents again.

Built-in Function: void message_rewind (number nmsg)

Rewind the stream associated with message referred to by descriptor nmsg.

Built-in Function: number message_from_stream (number fd; string filter_chain)

Converts contents of the stream identified by fd to a mail message. Returns identifier of the created message.

Optional filter_chain supplies the name of a Mailutils filter chain, through which the data will be passed before converting. See http://mailutils.org/wiki/Filter_chain, for a description of filter chains.

Built-in Function: void message_to_stream (number fd, number nmsg; string filter_chain)

Copies message nmsg to stream descriptor fd. The descriptor must be obtained by a previous call to open.

Optional filter_chain supplies the name of a Mailutils filter chain, through which the data will be passed before writing them to fd. See http://mailutils.org/wiki/Filter_chain, for a description of filter chains.

5.15.1 Header functions

Built-in Function: number message_header_size (number nmsg)

Return the size, in bytes, of the headers of message nmsg. See the note on message_size, above.

Built-in Function: number message_header_lines (number nmsg)

Return number of lines occupied by headers in message nmsg.

Built-in Function: number message_header_count (number nmsg, [string name])

Return number of headers in message nmsg.

If name is supplied, count only headers with that name.

Built-in Function: string message_find_header (number nmsg, string name [, number idx])

Return value of header name from the message nmsg. If the message contains several headers with the same name, optional parameter idx may be used to select one of them. Headers are numbered from ‘1’.

If no matching header is found, the not_found exception is raised. If another error occurs, the failure exception is raised.

The returned string is a verbatim copy of the message contents (except for a possible CR-LF -> LF translation, see above). You might need to apply the unfold function to it (see unfold).

Built-in Function: string message_nth_header_name (number nmsg, number n)

Returns the name of the nth header in message nmsg. If there is no such header, e_range exception is raised.

Built-in Function: string message_nth_header_value (number nmsg, number n)

Returns the value of the nth header in message nmsg. If there is no such header, e_range exception is raised.

Built-in Function: boolean message_has_header (number nmsg, string name [, number idx])

Return true if message nmsg contains header with the given name. If there are several headers with the same name, optional parameter idx may be used to select one of them.

5.15.2 Message body functions

Built-in Function: number message_body_size (number nmsg)

Return the size, in bytes, of the body of message nmsg. See the note to the message_size, above.

Built-in Function: number message_body_lines (number nmsg)

Return number of lines in the body of message referred to by descriptor nmsg.

Built-in Function: void message_body_rewind (number nmsg)

Rewind the stream associated with the body of message referred to by descriptor nmsg.

A call to message_read_body_line (see below) after calling this function will return the first line from the message body.

Built-in Function: string message_read_body_line (number nmsg)

Read and return next line from the body of the message nmsg. If there are no more lines to read, raise the eof exception.

Use message_body_rewind (see above) to rewind the body stream and read its contents again.

Built-in Function: void message_body_to_stream (number fd, number nmsg; string filter_chain)

Copies the body of the message nmsg to stream descriptor fd. The descriptor must be obtained by a previous call to open.

Optional filter_chain supplies the name of a Mailutils filter chain, through which the data will be passed before writing them to fd. See http://mailutils.org/wiki/Filter_chain, for a description of filter chains.

5.15.3 MIME functions

Built-in Function: boolean message_is_multipart (number nmsg)

Return true if message nmsg is a multipart (MIME) message.

Built-in Function: number message_count_parts (number nmsg)

Return number of parts in message nmsg, if it is a multipart (MIME) message. If it is not, return ‘1’.

Use message_is_multipart to check whether the message is a multipart one.

Built-in Function: number message_get_part (number nmsg, number n)

Extract nth part from the multipart message nmsg. Numeration of parts begins from ‘1’. Return message descriptor referring to the extracted part. Message parts are regarded as messages, so any message functions can be applied to them.

5.15.4 Message digest functions

Message digests are specially formatted messages that contain a number of mail messages, encapsulated using the method described in RFC 934. Such digests are often used in mailing lists to reduce the frequency of sending mails. Messages of this format are also produced by the forward function in most MUAs.

The usual way to handle a message digest in MFL is to convert it first to a MIME message, and then to use functions for accessing its parts (see MIME functions).

Built-in Function: number message_burst (number nmsg ; number flags)

Converts the message identified by the descriptor nmsg to a multi-part message. Returns a descriptor of the created message.

Optional argument flags controls the behavior of the bursting agent. It is a bitwise OR of error action and bursting flags.

Error action defines what to do if a part of the digest is not in RFC822 message format. If it is ‘BURST_ERR_FAIL’ (the default), the function will raise the ‘e_format’ exception. If it is ‘BURST_ERR_IGNORE’, the improperly formatted part will be ignored. Finally, the value ‘BURST_ERR_BODY’ instructs message_burst to create a replacement part with empty headers and the text of the offending part as its body.

Bursting flags control various aspects of the agent behavior. Currently only one flag is defined, ‘BURST_DECODE’, which instructs the agent to decode any MIME parts (according to the ‘Content-Transfer-Encoding’ header) it encounters while bursting the message.

Parts of a message digest are separated by so-called encapsulation boundaries, which are in essence lines beginning with at least one dash followed by a non-whitespace character. A dash followed by whitespace serves as a byte-stuffing sequence, a sort of escape for lines which begin with a dash themselves. Unfortunately, there are mail agents which do not follow byte-stuffing rules and pass lines beginning with dashes unmodified into resulting digests. To help handle such cases, a global variable is provided which controls how many dashes a line must begin with to be recognized as an encapsulation boundary.

Built-in variable: number burst_eb_min_length

Minimal number of consecutive dashes an encapsulation boundary must begin with.

The default is 2.
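
The effect of this variable can be modelled with a few lines of Python. This is only a sketch of the rule as stated above; the function name is_boundary is hypothetical and the actual test performed by mailfromd may differ in details.

```python
def is_boundary(line, min_dashes=2):
    """Model of encapsulation-boundary detection (an assumption, see text)."""
    dashes = len(line) - len(line.lstrip("-"))   # number of leading dashes
    if dashes < max(min_dashes, 1):
        return False
    # RFC 934: the first dash must be followed by a non-whitespace character.
    return len(line) > 1 and not line[1].isspace()

print(is_boundary("------- Forwarded Message"))   # a typical boundary
print(is_boundary("- --byte-stuffed line"))       # escaped by "- "
print(is_boundary("-signature", min_dashes=1))    # accepted only with minimum 1
```

With the minimum set to 2, a line such as ‘-signature’ that was not byte-stuffed by a careless agent is no longer mistaken for a boundary.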

The following example shows a function which saves all parts of a digest message to separate disk files. The argument orig is a message descriptor. The resulting files are named by concatenating the string supplied by the stem argument and the ordinal number (1-based) of the message part.

func burst_digest(number orig, string stem)
do
  number msg message_burst(orig)
  number nparts message_count_parts(msg)
  
  loop for number i 1,
       while i <= nparts,
       set i i + 1
  do
    number part message_get_part(msg, i)
    number out open(sprintf('>%s%02d', stem, i))
    message_to_stream(out, part)
    close(out)
  done
  message_close(msg)
done

5.16 Quarantine Functions

Built-in Function: void quarantine (string text)

Place the message in the quarantine queue, using text as explanatory reason.

5.17 SMTP Callout Functions

Library Function: number callout_open (string url)

Opens connection to the callout server listening at url. Returns the descriptor of the connection.

Library Function: void callout_close (number fd)

Closes the connection. fd is the file descriptor returned by the previous call to callout_open.

Library Function: number callout_do (number fd, string email [, string rest])

Instructs the callout server identified by fd (a file descriptor returned by a previous call to callout_open) to verify the validity of the email. Optional rest argument supplies additional parameters for the server.

Possible return values:

0

Success. The email is found to be valid.

e_not_found

email does not exist.

e_temp_failure

The email validity cannot be determined right now, e.g. because remote SMTP server returned temporary failure. The caller should retry verification later.

e_failure

Some error occurred.

The function will throw the e_callout_proto exception if the remote host doesn’t speak the correct callout protocol.

Upon return, callout_do modifies the following variables:

last_poll_host

Host name or IP address of the last polled SMTP server.

last_poll_greeting

Initial SMTP reply from the last polled host.

last_poll_helo

The reply to the HELO (EHLO) command, received from the last polled host.

last_poll_sent

Last SMTP command sent to the polled host. If nothing was sent, last_poll_sent contains the string ‘nothing’.

last_poll_recv

Last SMTP reply received from the remote host. In case of multi-line replies, only the first line is stored. If nothing was received the variable contains the string ‘nothing’.

The default callout server is defined by the callout-url statement in the configuration file, or by the callout statement in the server milter section (see configuring default callout server). The following functions operate on that server.

Built-in Function: string default_callout_server_url ()

Returns URL of the default callout server.

Library Function: number callout (string email)

Verifies the validity of the email using the default callout server.

5.18 Compatibility Callout Functions

The following functions are wrappers over the callout functions described in the previous section. They are provided for backward compatibility.

These functions are defined in the module poll.mf, which you must require prior to using any of them.

Library Function: boolean _pollhost (string ip, string email, string domain, string mailfrom)

Poll SMTP host ip for email address email, using domain as EHLO domain and mailfrom as MAIL FROM. Returns 0 or 1 depending on the result of the test. In contrast to the strictpoll function, this function does not use cache database and does not fall back to polling MX servers if the main poll tempfails. The function can throw one of the following exceptions: e_failure, e_temp_failure.

Library Function: boolean _pollmx (string ip, string email, string domain, string mailfrom)

Poll MXs of the domain for email address email, using domain as EHLO domain and mailfrom as MAIL FROM address. Returns 0 or 1 depending on the result of the test. In contrast to the stdpoll function, _pollmx does not use cache database and does not fall back to polling the ip if the poll fails. The function can throw one of the following exceptions: e_failure, e_temp_failure.

Library Function: boolean stdpoll (string email, string domain, string mailfrom)

Performs standard poll for email, using domain as EHLO domain and mailfrom as MAIL FROM address. Returns 0 or 1 depending on the result of the test. Can raise one of the following exceptions: e_failure, e_temp_failure.

In on statement context, it is synonymous to poll without explicit host.

Library Function: boolean strictpoll (string host, string email, string domain, string mailfrom)

Performs strict poll for email on host host. See the description of stdpoll for the detailed information.

In on context, it is synonymous to poll host host.

The mailfrom argument can be a comma-separated list of email addresses, which can be useful for servers that are unusually picky about sender addresses. It is advised, however, that this list always contain the ‘<>’ address. For example:

_pollhost($client_addr, $f, "domain", "postmaster@my.net,<>")

See also mail-from-address.

Before returning, all described functions set the following built-in variables:

Variable            Contains
last_poll_host      Host name or IP address of the last polled host.
last_poll_sent      Last SMTP command, sent to this host. If nothing was sent, it contains literal string ‘nothing’.
last_poll_recv      Last SMTP reply received from this host. In case of multi-line replies, only the first line is stored. If nothing was received the variable contains the string ‘nothing’.
cache_used          1 if cached data were used instead of polling, 0 otherwise. This variable is set by stdpoll and strictpoll. If it equals 1, none of the above variables are modified. See cache_used example, for an example.

Table 5.1: Variables set by polling functions

5.19 Internet address manipulation functions

The following functions operate on IPv4 addresses and CIDRs.

Built-in Function: number ntohl (number n)

Converts the number n from network to host byte order. The argument n is treated as an unsigned 32-bit number.

Built-in Function: number htonl (number n)

Converts the number n from host to network byte order. The argument n is treated as an unsigned 32-bit number.

Built-in Function: number ntohs (number n)

The argument n is treated as an unsigned 16-bit number. The function converts this number from network to host order.

Built-in Function: number htons (number n)

The argument n is treated as an unsigned 16-bit number. The function converts this number from host to network order.
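
The four conversions mirror the C library functions of the same names, which Python also exposes through its socket module. The following sketch uses those Python functions (not the MFL built-ins) to show the round-trip property; the sample values are arbitrary.

```python
import socket

n = 2130706433                       # 127.0.0.1 as a 32-bit number
wire = socket.htonl(n)               # host order -> network order
assert socket.ntohl(wire) == n       # network order -> host order undoes it

port = 25
assert socket.ntohs(socket.htons(port)) == port

# Network byte order is big-endian, whatever the host uses:
print(list(n.to_bytes(4, "big")))    # the four octets of 127.0.0.1
```

On a big-endian host all four functions return their argument unchanged; on a little-endian host they swap the bytes.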

Built-in Function: number inet_aton (string s)

Converts the Internet host address s from the standard numbers-and-dots notation into the equivalent integer in host byte order.

inet_aton("127.0.0.1") ⇒ 2130706433

The numeric data type in MFL is signed, therefore on machines with 32 bit integers, this conversion can result in a negative number:

inet_aton("255.255.255.255") ⇒ -1

However, this does not affect arithmetical operations on IP addresses.
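
Why the sign does not matter can be illustrated in Python, which has unbounded integers, by reinterpreting the value explicitly. The helper to_signed32 is hypothetical and only models a signed 32-bit numeric type.

```python
import ipaddress

def to_signed32(n):
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    return n - (1 << 32) if n >= (1 << 31) else n

addr = int(ipaddress.IPv4Address("255.255.255.255"))
print(to_signed32(addr))

# Bitwise operations and arithmetic modulo 2**32 give the same 32 bits
# for either representation:
mask = 0xFFFFFF00
print((to_signed32(addr) & mask) == (addr & mask))
```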

Built-in Function: string inet_ntoa (number n)

Converts the Internet host address n, given in host byte order, to a string in standard numbers-and-dots notation:

inet_ntoa(2130706433) ⇒ "127.0.0.1"

Built-in Function: number len_to_netmask (number n)

Convert number of masked bits n to IPv4 netmask:

inet_ntoa(len_to_netmask(24)) ⇒ 255.255.255.0
inet_ntoa(len_to_netmask(7)) ⇒ 254.0.0.0

If n is greater than 32 the function raises e_range exception.

Built-in Function: number netmask_to_len (number mask)

Convert IPv4 netmask mask into netmask length (number of bits preserved by the mask):

netmask_to_len(inet_aton("255.255.255.0")) ⇒ 24
netmask_to_len(inet_aton("254.0.0.0")) ⇒ 7
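
The bit manipulation behind these two conversions can be sketched in Python. The functions below are models that reuse the MFL names for readability; dotted is a hypothetical helper, and netmask_to_len assumes that the mask consists of contiguous leading one bits.

```python
def len_to_netmask(n):
    """Return a 32-bit mask with the n most significant bits set."""
    if not 0 <= n <= 32:
        raise ValueError("prefix length out of range")
    return (0xFFFFFFFF << (32 - n)) & 0xFFFFFFFF

def netmask_to_len(mask):
    """Count the bits set in the mask (assumes a contiguous mask)."""
    return bin(mask & 0xFFFFFFFF).count("1")

def dotted(n):
    """Format a 32-bit number in numbers-and-dots notation."""
    return ".".join(str((n >> shift) & 0xFF) for shift in (24, 16, 8, 0))

print(dotted(len_to_netmask(24)))
print(netmask_to_len(len_to_netmask(7)))
```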
Library Function: boolean match_cidr (string ip, string cidr)

This function is defined in the module match_cidr.mf (see Modules).

It returns true if the IP address ip belongs to the IP range cidr. The first argument, ip, is a string representation of an IP address. The second argument, cidr, is a string representation of an IP range in CIDR notation, i.e. "A.B.C.D/N", where A.B.C.D is an IPv4 address and N specifies the prefix length – the number of shared initial bits, counting from the left side of the address.

The following example will reject the mail if the IP address of the sending machine does not belong to the block 10.10.1.0/19:

if not match_cidr(${client_addr}, "10.10.1.0/19")
  reject
fi
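
To see which addresses such a test lets through, the range can be examined with Python's ipaddress module. The sketch assumes that bits beyond the prefix length are ignored when comparing (hence strict=False); match_cidr_model is a hypothetical name and is not the MFL implementation.

```python
import ipaddress

def match_cidr_model(ip, cidr):
    """Return True if ip falls within the range given in CIDR notation."""
    # strict=False masks off the host bits, so 10.10.1.0/19 is accepted
    # and treated as the network 10.10.0.0/19.
    net = ipaddress.ip_network(cidr, strict=False)
    return ipaddress.ip_address(ip) in net

net = ipaddress.ip_network("10.10.1.0/19", strict=False)
print(net[0], net[-1])   # first and last address of the block
print(match_cidr_model("10.10.31.255", "10.10.1.0/19"))
print(match_cidr_model("10.10.32.1", "10.10.1.0/19"))
```

A /19 prefix leaves 13 host bits, so under this assumption the block spans 10.10.0.0 through 10.10.31.255.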

5.20 DNS Functions

The functions are implemented in two layers: primitive built-in functions which raise exceptions if the lookup fails, and library calls that are guaranteed to always return a meaningful value without throwing exceptions.

The built-in layer is always available. The library calls become available after requesting the dns module (see Modules):

require dns

Built-in Function: string dns_getaddr (string domain)

Returns a whitespace-separated list of IP addresses (A records) for domain.

This function does not use the DNS cache.

Built-in Function: string dns_getname (string ipstr)

Returns a whitespace-separated list of domain names (PTR records) for the IPv4 address ipstr.

This function does not use the DNS cache.

Built-in Function: string getmx (string domain [, boolean ip])

Returns a whitespace-separated list of ‘MX’ names (if ip is not given or if it is 0) or ‘MX’ IP addresses (if ip is not 0) for domain. Within the returned string, items are sorted in order of increasing ‘MX’ priority. If domain has no ‘MX’ records, an empty string is returned. If the DNS query fails, getmx raises an appropriate exception.

Examples:

getmx("mafra.cz") ⇒ "smtp1.mafra.cz smtp2.mafra.cz relay.iol.cz"
getmx("idnes.cz") ⇒ "smtp1.mafra.cz smtp2.mafra.cz relay.iol.cz"
getmx("gnu.org")  ⇒ "mx10.gnu.org mx20.gnu.org"
getmx("org.pl") ⇒ ""

Note:

  1. The number of items returned by getmx(domain) can differ from that obtained from getmx(domain, 1), e.g.:
    getmx("aol.com")
      ⇒ mailin-01.mx.aol.com mailin-02.mx.aol.com
                mailin-03.mx.aol.com mailin-04.mx.aol.com
    getmx("aol.com", 1)
      ⇒ 64.12.137.89 64.12.137.168 64.12.137.184
                64.12.137.249 64.12.138.57 64.12.138.88
                64.12.138.120 64.12.138.185 205.188.155.89
                205.188.156.185 205.188.156.249 205.188.157.25
                205.188.157.217 205.188.158.121 205.188.159.57
                205.188.159.217
    
  2. This interface will change in future releases, when array data types are implemented.

Built-in Function: boolean primitive_hasmx (string domain)

Returns true if the domain name given by its argument has any ‘MX’ records.

If the DNS query fails, this function throws failure or temp_failure.

Library Function: boolean hasmx (string domain)

Returns true if the domain name given by its argument has any ‘MX’ records.

Otherwise, if domain has no ‘MX’s or if the DNS query fails, hasmx returns false.

Built-in Function: string primitive_hostname (string ip)

The ip argument should be a string representing an IP address in dotted-quad notation. The function returns the canonical name of the host with this IP address obtained from DNS lookup. For example

primitive_hostname (${client_addr})

returns the fully qualified domain name of the host represented by Sendmail variable ‘client_addr’.

If there is no ‘PTR’ record for ip, primitive_hostname raises the exception e_not_found.

If DNS query fails, the function raises failure or temp_failure, depending on the character of the failure.

Library Function: string hostname (string ip)

The ip argument should be a string representing an IP address in dotted-quad notation. The function returns the canonical name of the host with this IP address obtained from DNS lookup.

If there is no ‘PTR’ record for ip, or if the lookup fails, the function returns ip unchanged.

The previous mailfromd versions used the following paradigm to check if an IP address resolves:

  if hostname(ip) != ip
    ...

Built-in Function: boolean primitive_ismx (string domain, string host)

The domain argument is any valid domain name, the host is a host name or IP address.

The function returns true if host is one of the ‘MX’ records for the domain.

If domain has no ‘MX’ records, primitive_ismx raises exception e_not_found.

If DNS query fails, the function raises failure or temp_failure, depending on the character of the failure.

Library Function: boolean ismx (string domain, string host)

The domain argument is any valid domain name, the host is a host name or IP address.

The function returns true if host is one of the ‘MX’ records for the domain. Otherwise it returns false.

If domain has no ‘MX’ records, or if the DNS query fails, the function returns false.

Built-in Function: string primitive_resolve (string host, [string domain])

Reverse of primitive_hostname. The primitive_resolve function returns the IP address for the host name specified by host argument. If host has no A records, the function raises the exception e_not_found.

If DNS lookup fails, the function raises failure or temp_failure, depending on the character of the failure.

If the optional domain argument is given, it will be appended to host (with an intermediate dot), before querying the DNS. For example, the following two expressions will return the same value:

primitive_resolve("puszcza.gnu.org.ua")
primitive_resolve("puszcza", "gnu.org.ua")

There is a considerable internal difference between one-argument and two-argument forms of primitive_resolve: the former queries DNS for an ‘A’ record, whereas the latter queries it for any record matching host in the domain domain and then selects the most appropriate one. For example, the following two calls are equivalent:

primitive_hostname("213.130.0.22")
primitive_resolve("22.0.130.213", "in-addr.arpa")

This makes it possible to use primitive_resolve for querying DNS black listing domains. See match_dnsbl, for a working example of this approach. See also match_rhsbl, for another practical example of the use of the two-argument form.
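
The host part used in such queries is simply the address with its octets reversed. A Python sketch of building it follows; the helper name reversed_octets and the zone ‘bl.example.org’ are hypothetical.

```python
def reversed_octets(ip):
    """Return the dotted-quad address with its octets in reverse order."""
    octets = ip.split(".")
    if len(octets) != 4:
        raise ValueError("not a dotted-quad IPv4 address: %r" % ip)
    return ".".join(reversed(octets))

host = reversed_octets("213.130.0.22")
print(host)                          # host argument for the two-argument form
print(host + ".in-addr.arpa")        # full name queried for a PTR lookup
print(host + ".bl.example.org")      # full name queried in a hypothetical DNSBL zone
```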

Library Function: string resolve (string host, [string domain])

Reverse of hostname. The resolve function returns IP address for the host name specified by host argument. If the host name cannot be resolved, or a DNS failure occurs, the function returns ‘"0"’.

This function is entirely equivalent to primitive_resolve (see above), except that it never raises exceptions.

Built-in Function: boolean ptr_validate (string ip)

Tests whether the DNS reverse-mapping for ip exists and correctly points to a domain name within a particular domain.

First, it obtains all PTR records for ip. Then, for each record returned, a look up for A records is performed and IP addresses of each record are compared against ip. The function returns true if a matching A record is found.
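
The procedure can be summarized in Python. To keep the sketch self-contained, the DNS lookups are passed in as functions backed by made-up tables; ptr_validate_model and the sample records are hypothetical.

```python
def ptr_validate_model(ip, ptr_lookup, a_lookup):
    """Return True if some PTR name of ip resolves back to ip."""
    for name in ptr_lookup(ip):        # every PTR record for the address
        if ip in a_lookup(name):       # A records of that name
            return True
    return False

# Made-up DNS data standing in for real queries.
PTR = {"192.0.2.10": ["mail.example.org"], "192.0.2.20": ["fake.example.org"]}
A = {"mail.example.org": ["192.0.2.10"], "fake.example.org": ["198.51.100.7"]}

def ptr_lookup(ip):
    return PTR.get(ip, [])

def a_lookup(name):
    return A.get(name, [])

print(ptr_validate_model("192.0.2.10", ptr_lookup, a_lookup))
print(ptr_validate_model("192.0.2.20", ptr_lookup, a_lookup))
```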

Built-in Function: boolean primitive_hasns (string domain)

Returns ‘True’ if the domain domain has at least one ‘NS’ record. Throws exception if DNS lookup fails.

Library Function: boolean hasns (string domain)

Returns ‘True’ if the domain domain has at least one ‘NS’ record. Returns ‘False’ if there are no ‘NS’ records or if the DNS lookup fails.

Built-in Function: string getns (string domain ; boolean resolve, boolean sort)

Returns a whitespace-separated list of all the ‘NS’ records for the domain domain. Optional parameters resolve and sort control the formatting. If resolve is 0 (the default), the resulting string will contain IP addresses of the NS servers. If resolve is not 0, hostnames will be returned instead. If sort is 1, the returned items will be sorted.

If the DNS query fails, getns raises an appropriate exception.

5.21 Geolocation functions

The geolocation functions allow you to identify the country where the given IP address or host name is located. These functions are available only if the ‘GeoIP’ library is installed and mailfromd is compiled with the ‘GeoIP’ support. The m4 macro ‘WITH_GEOIP’ is defined if it is so.

The GeoIP is a geolocational package distributed by ‘MaxMind’ under the terms of the GNU Lesser General Public License. The library is available from http://www.maxmind.com/app/c.

Built-in Function: string geoip_country_code_by_addr ( string ip [, bool tlc])

Look up the ‘ISO 3166-1’ country code corresponding to the IP address ip. If tlc is given and is not zero, return the 3 letter code, otherwise return the 2 letter code.

Built-in Function: string geoip_country_code_by_name ( string name [, bool tlc])

Look up the ‘ISO 3166-1’ country code corresponding to the host name name. If tlc is given and is not zero, return the 3 letter code, otherwise return the 2 letter code.

If it is impossible to locate the country, both functions raise the e_not_found exception. If an error internal to the ‘GeoIP’ library occurs, they raise the e_failure exception.

Applications may test whether the GeoIP support is present and enable corresponding code blocks conditionally by testing if the ‘WITH_GEOIP’ m4 macro is defined. For example, the following code adds to the message the ‘X-Originator-Country’ header, containing the 2 letter code of the country where the client machine is located. If mailfromd is compiled without ‘GeoIP’ support, it does nothing:

m4_ifdef(`WITH_GEOIP',`
  header_add("X-Originator-Country", geoip_country_code_by_addr($client_addr))
')

5.22 Database Functions

The functions described below provide a user interface to DBM databases.

Each DBM database is a separate disk file that keeps key/value pairs. The interface allows you to retrieve the value corresponding to a given key. Both ‘key’ and ‘value’ are null-terminated character strings. To look up a key, it is important to know whether its length includes the terminating null byte. By default, it is assumed that it does not.
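
Why the terminating null byte matters can be shown with a Python dictionary standing in for a DBM file. This is a model only: the sample record and the lookup helper are made up, and the null parameter merely imitates the property described above.

```python
# A database whose writer counted the terminating null byte in keys and values.
db = {b"user@example.org\0": b"OK\0"}

def lookup(db, key, null=False):
    """Look up key, optionally including the terminating null in its length."""
    raw = key.encode() + (b"\0" if null else b"")
    value = db.get(raw)
    return None if value is None else value.rstrip(b"\0").decode()

print(lookup(db, "user@example.org"))             # key bytes differ: not found
print(lookup(db, "user@example.org", null=True))  # same bytes: found
```

The two forms of the key differ by one byte, so a lookup made with the wrong assumption never matches.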

Another important database property is the file mode of the database file. The default file mode is ‘640’ (i.e. ‘rw-r-----’, in symbolic notation).

Both properties can be configured using the dbprop pragma:

#pragma dbprop pattern prop [prop]

The pattern is the database name or shell-style globbing pattern. Properties defined by that pragma apply to each database whose name matches this pattern. If several dbprop pragmas match the database name, the one that matches exactly is preferred.

The rest of arguments define properties for that database. The valid values for prop are:

  1. The word ‘null’, meaning that the terminating null byte is included in the key length.

    Setting the ‘null’ property is necessary for databases created with the makemap -N hash command.

  2. File mode for the disk file. It can be either an octal number, or a symbolic mode specification in ls-like format. E.g., the following two formats are equivalent:
    640
    rw-r-----
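
The correspondence between the two notations can be verified with a short Python sketch. The function symbolic_to_octal is hypothetical; it handles only the nine ls-style permission characters for user, group and others.

```python
def symbolic_to_octal(sym):
    """Convert an ls-like mode string such as 'rw-r-----' to a number."""
    if len(sym) != 9:
        raise ValueError("expected 9 permission characters: %r" % sym)
    mode = 0
    for i, ch in enumerate(sym):
        if ch == "rwx"[i % 3]:
            mode |= 1 << (8 - i)      # leftmost character is the highest bit
        elif ch != "-":
            raise ValueError("unexpected character %r" % ch)
    return mode

print(oct(symbolic_to_octal("rw-r-----")))
```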
    

For example, consider the following pragma:

#pragma dbprop /etc/mail/whitelist.db 640

It states that the database file whitelist.db has privileges ‘640’ and does not include the terminating null in the key length.

Similarly, the following pragma:

#pragma dbprop `/etc/mail/*.db' null 600

declares that all database files in directory /etc/mail have privileges ‘600’ and include the null terminator in the key length. Notice the use of m4 quoting characters in the example above. Without them, the sequence ‘/*’ would have been taken as the beginning of a comment.
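
The way a database name is matched against several dbprop pragmas can be modelled in Python with shell-style globbing. This is a sketch under assumptions: select_props is a hypothetical name, and choosing the first matching pattern when no exact match exists is a guess, since the text above specifies only that an exact match is preferred.

```python
from fnmatch import fnmatchcase

def select_props(dbname, pragmas):
    """Pick the properties for dbname from (pattern, props) pairs."""
    fallback = None
    for pattern, props in pragmas:
        if pattern == dbname:
            return props                  # an exact match always wins
        if fallback is None and fnmatchcase(dbname, pattern):
            fallback = props              # assumption: first glob match
    return fallback

pragmas = [
    ("/etc/mail/*.db", ("null", "600")),
    ("/etc/mail/whitelist.db", ("640",)),
]
print(select_props("/etc/mail/whitelist.db", pragmas))
print(select_props("/etc/mail/access.db", pragmas))
```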

Additionally, for compatibility with previous versions (up to 5.0), the terminating null property can be requested via an optional argument to the database functions (in description below, marked as null).

Built-in Function: boolean dbmap (string db, string key, [boolean null])

Looks up key in the DBM file db and returns true if it is found.

See above for the meaning of null.

See whitelisting, for an example of using this function.

Built-in Function: string dbget (string db, string key [, string default, boolean null])

Looks up key in the database db and returns the value associated with it. If the key is not found returns default, if specified, or empty string otherwise.

See above for the meaning of null.

Built-in Function: void dbput (string db, string key, string value [, boolean null, number mode ])

Inserts in the database a record with the given key and value. If a record with the given key already exists, its value is replaced with the supplied one.

See above for the meaning of null. Optional mode allows you to explicitly specify the file mode for this database. See also #pragma dbprop, described above.

Built-in Function: void dbinsert (string db, string key, string value [, boolean replace, boolean null, number mode ])

This is an improved variant of dbput, which provides better control over the actions to take if the key already exists in the database. Namely, if replace is ‘True’, the old value is replaced with the new one. Otherwise, the ‘e_exists’ exception is thrown.

Built-in Function: void dbdel (string db, string key [, boolean null, number mode])

Delete from the database the record with the given key. If there is no such record, return without signalling an error.

If the optional null argument is given and is not zero, the terminating null character will be included in key length.

Optional mode allows you to explicitly specify the file mode for this database. See also #pragma dbprop, described above.

The functions above also have corresponding exception-safe interfaces, which return cleanly if the ‘e_dbfailure’ exception occurs. To use these interfaces, request the safedb module:

require safedb

The exception-safe interfaces are:

Library Function: string safedbmap (string db, string key [, string default, boolean null])

This is an exception-safe interface to dbmap. If a database error occurs while attempting to retrieve the record, safedbmap returns default or ‘0’, if it is not defined.

Library Function: string safedbget (string db, string key [, string default, boolean null])

This is an exception-safe interface to dbget. If a database error occurs while attempting to retrieve the record, safedbget returns default or empty string, if it is not defined.

Library Function: void safedbput (string db, string key, string value [, boolean null])

This is an exception-safe interface to dbput. If a database error occurs while attempting to store the record, the function returns without raising an exception.

Library Function: void safedbdel (string db, string key [, boolean null])

This is an exception-safe interface to dbdel. If a database error occurs while attempting to delete the record, the function returns without raising exception.

The verbosity of ‘safedb’ interfaces in case of database error is controlled by the value of safedb_verbose variable. If it is ‘0’, these functions return silently. This is the default behavior. Otherwise, if safedb_verbose is not ‘0’, these functions log the detailed diagnostics about the database error and return.

The following functions provide sequential access to the contents of a DBM database:

Built-in Function: number dbfirst (string name)

Start sequential access to the database name. The return value is an opaque identifier, which is used by the remaining sequential access functions. This number is ‘0’ if the database is empty.

Built-in Function: number dbnext (number dn)

Select the next record from the database. The argument dn is the access identifier, returned by a previous call to dbfirst or dbnext.

Returns new access identifier. This number is ‘0’ if all records in the database have been visited.

The usual approach for iterating over all records in a database dbname is:

  loop for number dbn dbfirst(dbname)
  do
    …
  done while dbnext(dbn)

The following two functions can be used to access values of the currently selected database record. Their argument, dn, is the access identifier, returned by a previous call to dbfirst or dbnext.

Built-in Function: string dbkey (number dn)

Return the key from the selected database record.

Built-in Function: string dbvalue (number dn)

Return the value from the selected database record.
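The sequential access functions can be combined as in the following sketch, which prints every record of a database. It follows the iteration idiom shown above and assumes that the database is not empty; the function name dump_db is hypothetical:

```mfl
func dump_db(string dbname)
do
  loop for number dbn dbfirst(dbname)
  do
    /* Print the currently selected record as "key => value". */
    echo dbkey(dbn) . " => " . dbvalue(dbn)
  done while dbnext(dbn)
done
```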

Built-in Function: number db_expire_interval (string fmt)

The fmt argument is a database format identifier (see Database Formats). If it is valid, the function returns the expiration interval for that format. Otherwise, db_expire_interval raises the e_not_found exception.

Built-in Function: string db_name (string fmtid)

The fmtid argument is a database format identifier (see Database Formats). The function returns the file name for that format. If fmtid does not match any known format, db_name raises the e_not_found exception.

Built-in Function: number db_get_active (string fmtid)

Returns the flag indicating whether the cache database fmtid is currently enabled. If fmtid does not match any known format, db_get_active raises the e_not_found exception.

Built-in Function: void db_set_active (string fmtid, boolean enable)

Enables the cache database fmtid if enable is ‘True’, or disables it otherwise. For example, to disable DNS caching, do:

db_set_active("dns", 0)

Built-in Function: boolean relayed (string domain)

Returns true if the string domain is found in one of the relayed domain files (see relayed-domain-file). The usual construct is:

if relayed(hostname(${client_addr}))
  …
fi

which yields true if the IP address from Sendmail variable ‘client_addr’ is relayed by the local machine.

5.23 I/O functions

MFL provides a set of functions for writing to disk files, pipes or sockets and reading from them. The idea behind them is the same as in most other programming languages: first you open the resource with a call to open, which returns a descriptor, i.e. an integer number uniquely identifying the resource. Then you can write to it or read from it using this descriptor. Finally, when the resource is no longer needed, you can close it with a call to close.

The number of available resource descriptors is limited. The default limit is 1024. You can tailor it to your needs using the max-streams runtime configuration statement. See max-streams, for a detailed description.

Built-in Function: number open (string name)

The name argument specifies the name of a resource to open and the access rights you need to have on it. The function returns a descriptor of the opened stream, which can subsequently be used as an argument to other I/O operations.

The first characters of name determine the type of the resource to be opened and the access mode:

>

The rest of name is a name of a file. Open the file for read-write access. If the file exists, truncate it to zero length, otherwise create the file.

>>

The rest of name is a name of a file. Open the file for appending (writing at end of file). The file is created if it does not exist.

|

Treat the rest of name as the command name and its arguments. Run this command and open its standard input for writing. The standard error is closed before launching the program. This can be altered by using the following versions of this construct:

|2>null: command

Standard error is redirected to /dev/null.

|2>file:name command

Execute command with its standard error redirected to the file name. If the file exists, it will be truncated.

|2>>file:name command

Standard error of the command is appended to the file name. If file does not exist, it will be created.

The ‘|2>null:’ construct described above is a shortcut for

|2>>file:/dev/null command

|2>syslog:facility[.priority] command

Standard error is redirected to the given syslog facility and, optionally, priority. If the latter is omitted, ‘LOG_ERR’ is assumed.

Valid values for facility are: ‘user’, ‘daemon’, ‘auth’, ‘authpriv’, ‘mail’, and ‘local0’ through ‘local7’. Valid values for priority are: ‘emerg’, ‘alert’, ‘crit’, ‘err’, ‘warning’, ‘notice’, ‘info’, ‘debug’. Both facility and priority may be given in upper, lower or mixed cases.

Notice that no whitespace characters are allowed between ‘|’ and ‘2>’.

|<

Treat the rest of name as the command name and its arguments. Run this command with its stdin closed and stdout open for reading.

The standard error is treated as described above (see ‘|’).

|&

Treat the rest of name as the command name and its arguments. Run this command and set up two-way communication with it, i.e. writes to the descriptor returned by open will send data to the program’s standard input, and reads from the descriptor will get data from the program’s standard output.

The standard error is treated as described above (see ‘|’). For example, the following redirects it to syslog ‘mail.debug’:

|&2>syslog:mail.debug command

@

Treat the rest of name as the URL of a socket to connect to. Valid URL forms are described in milter port specification.

If none of these prefixes is used, name is treated as a name of an existing file and open will attempt to open this file for reading.

The open function will signal exception e_failure if it is unable to open the resource or get the required access to it.
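A minimal sketch of the open, write, close sequence described above is shown below. It appends one line to a log file using the ‘>>’ prefix. The function name log_sender and the file name are assumptions made for the sake of the example:

```mfl
func log_sender(string addr)
do
  number fd

  /* ">>" opens the file for appending, creating it if necessary. */
  set fd open(">>/var/log/mailfromd-senders.log")
  write(fd, "%addr\n")
  close(fd)
done
```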

Built-in Function: number spawn (string cmd [, number in, number out, number err])

Runs the supplied command cmd. The syntax of cmd is the same as for the name argument to open (see above) when it begins with ‘|’, except that the ‘|’ sign is optional. That is:

spawn("/bin/cat")

has exactly the same effect as

open("|/bin/cat")

Optional arguments specify file stream descriptors to be used for the program’s standard input, output and error streams, respectively. If supplied, these should be the values returned by a previous call to open or tempfile. The value ‘-1’ means no redirection.

The example below starts the awk program with a simple expression as its argument and redirects the content of the file /etc/passwd to its standard input. The returned stream descriptor is bound to the command’s standard output (see the description of ‘|<’ prefix above). The standard error is closed:

number fd spawn("<awk -F: '{print $1}'", open("/etc/passwd"))

Built-in Function: void close (number rd)

The argument rd is a resource descriptor returned by a previous call to open. The function close closes the resource and deallocates any memory associated with it.

close will signal e_range exception if rd lies outside of allowed range of resource descriptors. See max-streams.

Notice that you are not required to close resources opened by open. Any unclosed resource will be closed automatically upon the termination of the filtering program.

Built-in Function: void shutdown (number rd, number how)

This function causes all or part of a full-duplex connection to be closed. The rd must be either a socket descriptor (returned by open(@...)) or a two-way pipe socket descriptor (returned by open(|&...)), otherwise the call to shutdown is completely equivalent to close.

The how argument identifies which part of the connection to shut down:

SHUT_RD

Read connection. All further receptions will be disallowed.

SHUT_WR

Write connection. All further transmissions will be disallowed.

SHUT_RDWR

Shut down both read and write parts.
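The following sketch shows a typical use of shutdown with a two-way pipe: the write half is closed so that the command sees end of input, after which its output can still be read. The function name first_sorted and the path to sort are hypothetical, and the example assumes that the SHUT_WR constant is available in the current scope:

```mfl
func first_sorted(string text) returns string
do
  number fd
  string line

  /* "|&" sets up two-way communication with the command. */
  set fd open("|&/usr/bin/sort")
  write(fd, text)

  /* Close the write half so that sort sees the end of its input. */
  shutdown(fd, SHUT_WR)

  set line getline(fd)
  close(fd)
  return line
done
```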

Built-in Function: number tempfile ([string tmpdir])

Creates a nameless temporary file and returns its descriptor. The optional tmpdir argument supplies the directory in which to create the file, instead of the default /tmp.

Built-in Function: void rewind (number rd)

Rewinds the stream identified by rd to its beginning.

Built-in Function: number copy (number dst, number src)

Copies all data from the stream src to dst. Returns number of bytes copied.
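The three functions above are often used together. The sketch below saves the content of an already opened stream in a temporary file and rewinds the copy so that it can be read from the start. The function name spool and the directory /var/tmp are assumptions:

```mfl
func spool(number src) returns number
do
  number tmp
  number n

  set tmp tempfile("/var/tmp")
  set n copy(tmp, src)
  echo "copied %n bytes"

  /* Reposition the temporary stream so that it can be read back. */
  rewind(tmp)
  return tmp
done
```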

The following functions provide basic read/write capabilities.

Built-in Function: void write (number rd, string str [, number size])

Writes the string str to the resource descriptor rd. If the size argument is given, writes this number of bytes.

The function will signal e_range exception if rd lies outside of allowed range of resource descriptors, and e_io exception if an I/O error occurs.

Built-in Function: void write_body (number rd, pointer bp, number size)

Write the body segment of length size from pointer bp to the stream rd. This function can be used only in prog body (see body handler). Its second and third arguments correspond exactly to the parameters of the body handler, so the following construct writes the message body to the resource fd, which should have been opened prior to invoking the body handler:

prog body
do
  write_body(fd, $1, $2)
done

Built-in Function: string read (number rd, number n)

Read and return n bytes from the resource descriptor rd.

The function may signal the following exceptions:

e_range

rd lies outside of allowed range of resource descriptors.

e_eof

End of file encountered.

e_io

An I/O error occurred.

Built-in Function: string getdelim (number rd, string delim)

Read and return the next string terminated by delim from the resource descriptor rd.

The terminating delim string will be removed from the return value.

The function may signal the following exceptions:

e_range

rd lies outside of allowed range of resource descriptors.

e_eof

End of file encountered.

e_io

An I/O error occurred.

Built-in Function: string getline (number rd)

Read and return the next line from the resource descriptor rd. A line is any sequence of characters terminated with the default line delimiter. The default delimiter is a property of rd, i.e. different descriptors can have different line delimiters. The default value is ‘\n’ (ASCII 10), and can be changed using the fd_set_delimiter function (see below).

The function may signal the following exceptions:

e_range

rd lies outside of allowed range of resource descriptors.

e_eof

End of file encountered.

e_io

An I/O error occurred.
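Since the read functions report end of file through the e_eof exception, a reading loop is normally wrapped in a try statement. The sketch below counts the lines in a file this way. The function name count_lines is hypothetical:

```mfl
func count_lines(string file) returns number
do
  number fd
  number n
  string line

  /* Without a prefix, open opens an existing file for reading. */
  set fd open(file)
  set n 0
  try
  do
    loop
    do
      set line getline(fd)
      set n n + 1
    done
  done
  catch e_eof
  do
    /* End of file reached: nothing more to read. */
    pass
  done
  close(fd)
  return n
done
```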

Built-in Function: void fd_set_delimiter (number fd, string delim)

Set new line delimiter for the descriptor fd, which must be in opened state.

Default delimiter is a newline character (ASCII 10). The following example shows how to change it to CRLF sequence:

fd_set_delimiter(fd, "\r\n")

Built-in Function: string fd_delimiter (number fd)

Returns the line delimiter string for fd.

The following example shows how mailfromd I/O functions can be used to automatically add IP addresses to an RBL zone:

set nsupdate_cmd
  "/usr/bin/nsupdate -k /etc/bind/Kmail.+157+14657.private"
  
func block_address(string addr)
do
  number fd
  string domain

  set fd open("|%nsupdate_cmd")

  set domain revip(addr) . ".rbl.myzone.come"
  write(fd, "prereq nxrrset %domain A\n"
            "update add %domain 86400 A %addr\n\n")
done

The function revip is defined in revip.

5.24 System functions

Built-in Function: boolean access (string pathname, number mode)

Checks whether the calling process can access the file pathname. If pathname is a symbolic link, it is dereferenced. The function returns ‘True’ if the file can be accessed and ‘False’ otherwise.

Symbolic values for mode are provided in module status:

F_OK

Tests for the existence of the file.

R_OK

Tests whether the file exists and grants read permission.

W_OK

Tests whether the file exists and grants write permission.

X_OK

Tests whether the file exists and grants execute permission.
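For instance, a filter could defer all mail while a flag file exists. This is only a sketch: the flag file name and the reply text are assumptions.

```mfl
require status

prog connect
do
  /* F_OK tests only for the existence of the file. */
  if access("/etc/mail/maintenance", F_OK)
    tempfail 451 4.3.2 "System maintenance in progress, try again later"
  fi
done
```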

Built-in Function: string getenv (string name)

Searches the environment list for the variable name and returns its value. If the variable is not defined, the function raises the exception ‘e_not_found’.
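Because getenv raises an exception for an undefined variable, callers that can live with a fallback value may wrap it as in the sketch below. The function name env_or_default and its second parameter are hypothetical:

```mfl
func env_or_default(string name, string dflt) returns string
do
  string val

  set val dflt
  try
  do
    set val getenv(name)
  done
  catch e_not_found
  do
    /* Variable is not defined: keep the fallback value. */
    pass
  done
  return val
done
```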

Built-in Function: string gethostname ([bool fqn])

Return the host name of this machine.

If the optional fqn is given and is ‘true’, the function will attempt to return the fully-qualified host name by resolving it using DNS.

Built-in Function: string getdomainname ()

Return the domain name of this machine. Note that it does not necessarily coincide with the actual machine name in DNS.

Depending on the underlying ‘libc’ implementation, this call may return an empty string or the string ‘(none)’. Do not rely on it to get the real domain name of the box mailfromd runs on; use localdomain (see below) instead.

Library Function: string localdomain ()

Return the local domain name of this machine.

This function first uses getdomainname to make a first guess. If it does not return a meaningful value, localdomain calls gethostname(1) to determine the fully qualified host name of the machine, and returns its domain part.

To use this function, require the localdomain module (see Modules), e.g.: require localdomain.

Built-in Function: number time ()

Return the time since the Epoch (00:00:00 UTC, January 1, 1970), measured in seconds.

Built-in Function: string strftime (string fmt, number timestamp)
Built-in Function: string strftime (string fmt, number timestamp, boolean gmt)

Formats the time timestamp (seconds since the Epoch) according to the format specification fmt. Ordinary characters placed in the format string are copied to the output without conversion. Conversion specifiers are introduced by a ‘%’ character. See Time and Date Formats, for a detailed description of the conversion specifiers. We recommend using single quotes around fmt to prevent ‘%’ specifiers from being interpreted as Mailfromd variables (see Literals, for a discussion of quoted literals and variable interpretation within them).

The timestamp argument can be a return value of time function (see above).

For example:

strftime('%Y-%m-%d %H:%M:%S %Z', 1164477564)
 ⇒ 2006-11-25 19:59:24 EET
strftime('%Y-%m-%d %H:%M:%S %Z', 1164477564, 1)
 ⇒ 2006-11-25 17:59:24 GMT

Built-in Function: string uname (string format)

This function returns system information formatted according to the format specification format. Ordinary characters placed in the format string are copied to the output without conversion. Conversion specifiers are introduced by a ‘%’ character. The following conversions are defined:

%s

Name of this system.

%n

Name of this node within the communications network to which this node is attached. Note, that it does not necessarily coincide with the actual machine name in DNS.

%r

Kernel release.

%v

Kernel version.

%m

Name of the hardware type on which the system is running.

For example:

  uname('%n runs %s, release %r on %m')
    ⇒ "Trurl runs Linux, release 2.6.26 on i686"

Notice the use of single quotes.

Built-in Function: void unlink (string name)

Unlinks (deletes) the file name. On error, throws the e_failure exception.

Built-in Function: number system (string str)

The function system executes the command specified in str by calling /bin/sh -c str, and returns -1 on error or the return status of the command otherwise.
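A short sketch of checking the result of system follows. It relies only on the convention that a successful command yields status 0. The function name nologin_present is hypothetical:

```mfl
func nologin_present() returns number
do
  /* test(1) exits with status 0 if the file exists. */
  return system("test -e /etc/nologin") = 0
done
```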

Built-in Function: void sleep (number secs [, number usec])

Sleep for secs seconds. If the optional usec argument is given, it specifies an additional number of microseconds to wait. For example, to suspend execution of the filter for 1.5 seconds:

  sleep(1,500000)

This function is intended mostly for debugging and experimental purposes.

Built-in Function: number umask (number mask)

Set the umask to mask & 0777. Return the previous value of the mask.

5.25 System User Database

Built-in Function: string getpwnam (string name)
Built-in Function: string getpwuid (number uid)

Look for the user name (getpwnam) or user ID uid (getpwuid) in the system password database and return the corresponding record, if found. If not found, raise the ‘e_not_found’ exception.

The returned record consists of seven fields, separated by colons:

uname:passwd:uid:gid:gecos:dir:shell

Field    Meaning
uname    user name
passwd   user password
uid      user ID
gid      group ID
gecos    real name
dir      home directory
shell    shell program

For example:

getpwnam("gray")
⇒ "gray:x:1000:1000:Sergey Poznyakoff:/home/gray:/bin/bash"

The following two functions can be used to test for the existence of a key in the user database:

Built-in Function: boolean mappwnam (string name)
Built-in Function: boolean mappwuid (number uid)

Return ‘true’ if name (or uid) is found in the system user database.
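As a sketch, a recipient check built on mappwnam might look as follows. The function name require_local_user and the reply text are assumptions, and the caller is expected to pass the local part of the recipient address:

```mfl
func require_local_user(string name)
do
  /* Refuse the recipient if there is no such system account. */
  if not mappwnam(name)
    reject 550 5.1.1 "User unknown"
  fi
done
```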

5.26 Sieve Interface

‘Sieve’ is a powerful mail filtering language, defined in RFC 3028. Mailfromd supports an extended form of this language. For a description of the language and available extensions, see Sieve Language in GNU Mailutils Manual.

Built-in Function: boolean sieve (number msg, string script [, number flags, string file, number line])

Compile the Sieve program script and execute it over the message identified by the descriptor msg.

Optional flags modify the behavior of the function. It is a bit-mask field, consisting of a bitwise or of one or more of the following flags, defined in sieve.mf:

MF_SIEVE_FILE

The script argument specifies the name of a Sieve program file. This is the default.

MF_SIEVE_TEXT

The script argument is a string containing entire Sieve program. Optional arguments file and line can be used to fix source locations in Sieve diagnostic messages (see below).

MF_SIEVE_LOG

Log every executed ‘Sieve’ action.

MF_SIEVE_DEBUG_TRACE

Trace execution of ‘Sieve’ tests.

MF_SIEVE_DEBUG_INSTR

Log every instruction, executed in the compiled ‘Sieve’ code. This produces huge amounts of output and is rarely useful, unless you suspect some bug in ‘Sieve’ implementation and wish to trace it.

For example, MF_SIEVE_LOG|MF_SIEVE_DEBUG_TRACE enables logging ‘Sieve’ actions and tests.

The sieve function returns true if the message was accepted by the script program, and false otherwise. Here, the word accepted means that some form of ‘KEEP’ action (see Actions in GNU Mailutils Manual) was executed over the message.

While executing the Sieve script, Sieve environment (RFC 5183) is initialized as follows:

domain

The domain name of the server Sieve is running on.

host

Host name of the server Sieve is running on.

location

The string ‘MTA’.

name

The string ‘GNU Mailutils’.

phase

The string ‘pre’.

remote-host

Defined to the value of ‘client_ptr’ macro, if it was required.

remote-ip

Defined to the value of ‘client_addr’ macro, if it was required.

version

The version of GNU Mailutils.

The following example discards each message not accepted by the ‘Sieve’ program /etc/mail/filter.siv:

require 'sieve'

prog eom
do
  if not sieve(current_message(), "/etc/mail/filter.siv", MF_SIEVE_LOG)
     discard
  fi
done

The Sieve program can be embedded in the MFL filter, as shown in the example below:

require 'sieve'

prog eom
do
  if not sieve(current_message(),
               "require \"fileinto\";\n"
               "fileinto \"/tmp/sieved.mbox\";",
               MF_SIEVE_TEXT | MF_SIEVE_LOG)
     discard
  fi
done

In such cases, any Sieve diagnostics (error messages, traces, etc.) will be marked with the locations relative to the line where the call to sieve appears. For example, the above program produces the following in the log:

prog.mf:7: FILEINTO; delivering into /tmp/sieved.mbox

Notice that the line number correctly refers to the line where the fileinto action appears in the source. However, there are cases where the reported line number is incorrect. This happens, for instance, if script is a string variable defined elsewhere. To handle such cases, sieve accepts two optional parameters which are used to compute the location in the Sieve program. The file parameter specifies the name of the file where the definition of the program appears, and the line parameter gives the number of the line in that file where the program begins. For example:

require 'sieve'

const sieve_prog_line __line__ + 2
string sieve_prog <<EOT
require "fileinto";
fileinto "/tmp/sieved.mbox";
EOT

prog eom
do
  if not sieve(current_message(),
               sieve_prog, MF_SIEVE_TEXT | MF_SIEVE_LOG,
               __file__, sieve_prog_line)
     discard
  fi
done

The actual Sieve program begins two lines below the sieve_prog_line constant definition, which is reflected in its initialization.

5.27 Interfaces to Third-Party Programs

A set of functions is defined for interfacing with other filters via TCP. Currently implemented are interfaces with SpamAssassin spamd daemon and with ClamAV anti-virus.

Both interfaces work much the same way: the remote filter is connected and the message is passed to it. If the remote filter confirms that the message matches its requirements, the function returns true. Notice that in practice that means that such a message should be rejected or deferred.

The address of the remote filter is supplied as the second argument in the form of a standard URL:

proto://path[:port]

The proto part specifies the connection protocol. It should be ‘tcp’ for the TCP connection and ‘file’ or ‘socket’ for the connection via UNIX socket. In the latter case the proto part can be omitted. When using TCP connection, the path part gives the remote host name or IP address and the optional port specifies the port number or service name to use. For example:

# connect to ‘remote.filter.net’ on port 3314:
tcp://remote.filter.net:3314

# the same, using symbolic service name (must be defined in
# /etc/services):
tcp://remote.filter.net:spamd

# Connect via a local UNIX socket (equivalent forms):
/var/run/filter.sock
file:///var/run/filter.sock
socket:///var/run/filter.sock

The description of the interface functions follows.

5.27.1 SpamAssassin

Built-in Function: boolean spamc (number msg, string url, number prec, number command)

Send the message msg to the SpamAssassin daemon (spamd) listening on the given url. The command argument identifies what kind of processing is needed for the message. Allowed values are:

SA_SYMBOLS

Process the message and return 1 or 0 depending on whether it is diagnosed as spam or not. Store SpamAssassin keywords in the global variable sa_keywords (see below).

SA_REPORT

Process the message and return 1 or 0 depending on whether it is diagnosed as spam or not. Store entire SpamAssassin report in the global variable sa_keywords.

SA_LEARN_SPAM

Learn the supplied message as spam.

SA_LEARN_HAM

Learn the supplied message as ham.

SA_FORGET

Forget any prior classification of the message.

The third argument, prec, gives the precision, in decimal digits, to be used when converting SpamAssassin diagnostic data and storing them into mailfromd variables.

The floating point SpamAssassin data are converted to the integer mailfromd variables using the following relation:

var = int(sa-var * 10**prec)

where sa-var stands for the SpamAssassin value and var stands for the corresponding mailfromd one. int() means taking the integer part and ‘**’ denotes the exponentiation operator.
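To make the relation concrete, the sketch below annotates a call with prec set to 2, for a hypothetical message that spamd scores at 5.5 against a threshold of 5.0. The spamd address is an assumption:

```mfl
prog eom
do
  if spamc(current_message(), "tcp://127.0.0.1:783", 2, SA_SYMBOLS)
    /* With prec = 2 and the figures assumed above:
         sa_score     = int(5.5 * 10**2) = 550
         sa_threshold = int(5.0 * 10**2) = 500  */
    echo "score=%sa_score threshold=%sa_threshold"
  fi
done
```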

The function returns additional information via the following variables:

sa_score

The spam score, converted to integer as described above. To convert it to a floating-point representation, use sa_format_score function (see sa_format_score). See also the example below.

sa_threshold

The threshold, converted to integer form.

sa_keywords

If command is ‘SA_SYMBOLS’, this variable contains a string of comma-separated SpamAssassin keywords identifying this message, e.g.:

ADVANCE_FEE_1,AWL,BAYES_99

If command is ‘SA_REPORT’, the value of this variable is a spam report message. It is a multi-line textual message, containing detailed description of spam scores in a tabular form. It consists of the following parts:

  1. A preamble.
  2. Content preview.

    The words ‘Content preview’, followed by a colon and an excerpt of the message body.

  3. Content analysis details.

    It has the following form:

    Content analysis details: (score points, max required)
    

    where score and max are spam score and threshold in floating point.

  4. Score table.

    The score table is formatted in three columns:

    pts

    The score, as a floating point number with one decimal digit.

    rule name

    SpamAssassin rule name that contributed this score.

    description

    Textual description of the rule

    The score table can be extracted from sa_keywords using sa_format_report_header function (see sa_format_report_header), as illustrated in the example below.

The value of this variable is undefined if command is ‘SA_LEARN_SPAM’, ‘SA_LEARN_HAM’ or ‘SA_FORGET’.

The spamc function can signal the following exceptions: e_failure if the connection fails, e_url if the supplied URL is invalid and e_range if the supplied port number is out of the range 1–65535.

An example of using this function:

prog eom
do
  if spamc(current_message(), "tcp://192.168.10.1:3333", 3,
           SA_SYMBOLS)
    reject 550 5.7.0
         "Spam detected, score %sa_score with threshold %sa_threshold"
  fi
done

Here is a more advanced example:

prog eom
do
  set prec 3
  if spamc(current_message(),
           "tcp://192.168.10.1:3333", prec, SA_REPORT)
    add "X-Spamd-Status" "SPAM"
  else
    add "X-Spamd-Status" "OK"
  fi
  add "X-Spamd-Score" sa_format_score(sa_score, prec)
  add "X-Spamd-Threshold" sa_format_score(sa_threshold, prec)
  add "X-Spamd-Keywords" sa_format_report_header(sa_keywords)
done

Library Function: boolean sa (string url, number prec; number command)

Additional interface to the spamc function, provided for backward compatibility. It is equivalent to

spamc(current_message(), url, prec, command)

If command is not supplied, ‘SA_SYMBOLS’ is used.

5.27.2 DSPAM

DSPAM is a statistical spam filter distributed under the terms of the GNU General Public License. It is available from http://dspam.sourceforge.net.

MFL provides an interface to DSPAM functionality if the libdspam library is installed and mailfromd is linked with it. The m4 macro ‘WITH_DSPAM’ is defined if it is so.

The DSPAM functions and definitions become available after requiring the ‘dspam’ module:

require 'dspam'

Built-in Function: number dspam (number msg, number mode_flags; number class_source)

Analyze a message using DSPAM. The message is identified by its descriptor, passed in the msg argument.

The mode_flags argument controls the function behavior. Its value is a bitwise OR of operation mode, flag, tokenizer and training mode. Operation mode defines what dspam is supposed to do with the message. Its value is either ‘DSM_PROCESS’ if full processing of the message is intended (the default), or ‘DSM_CLASSIFY’, if the message must only be classified.

Optional flag bits turn on additional functionality. The ‘DSF_SIGNATURE’ bit instructs dspam to create a signature for the message – a unique string which can subsequently be used to identify that particular message. Upon return from the function, the signature is stored in the dspam_signature variable.

The ‘DSF_NOISE’ bit enables Bayesian noise reduction, and ‘DSF_WHITELIST’ enables automatic whitelisting.

Additional flags are available for defining the algorithm to split the message into tokens (tokenizer) and training mode. See flags-dspam, for a complete list of these. All these are optional, any missing values will be read from the DSPAM configuration file.

The configuration file must always be present. Its full file name must be stored in the global variable dspam_config. There is no default value, so make sure this variable is initialized. If a specific profile section should be read, store the name of that profile in the variable dspam_profile.

When called to process or classify the message, dspam returns an integer code of the class of the message. The value ‘DSR_ISSPAM’ means that this message was classified as spam. The value ‘DSR_ISINNOCENT’ means it is a clean (“ham”) message.

The probability and confidence values are returned in global variables dspam_probability and dspam_confidence. Since MFL lacks floating-point data type, both variables keep integers, obtained from the corresponding floating point values by shifting the decimal point dspam_prec digits to the right and rounding the resulting value to the nearest integer. The same method is used in spamc function (see sa-floating-point-conversion). The default value for dspam_prec variable is 3. You can use the sa_format_score function to convert these values to strings representing floating point numbers, e.g.:

require 'dspam'
require 'sa'

prog eom
do
  if dspam(current_message(), DSM_PROCESS | DSF_SIGNATURE)
       == DSR_ISSPAM
    header_add("X-DSPAM-Result", "Spam")
  else
    header_add("X-DSPAM-Result", "Innocent")
  fi
  header_add("X-DSPAM-Probability",
             sa_format_score(dspam_probability, dspam_prec))
  header_add("X-DSPAM-Confidence",
             sa_format_score(dspam_confidence, dspam_prec))
  header_add("X-DSPAM-Signature", dspam_signature)
done

Optional class_source argument is used when training the DSPAM classifier. It is a bitwise OR of the message class and message source values. Message class specifies the class this message belongs to. Possible values are ‘DSR_ISSPAM’, for spam messages, and ‘DSR_ISINNOCENT’, for clean messages. Message source informs DSPAM where this message comes from. The value ‘DSS_ERROR’ means the message was previously misclassified by DSPAM. The value ‘DSS_CORPUS’ indicates the message comes from a corpus feed. Finally, the value ‘DSS_INOCULATION’ means that the message is in pristine form, and should be trained as an inoculation. Inoculation is a more intense mode of training, usually used on honeypots.

The following example calls dspam to train the classifier on the current message if it was sent to a honeypot address, and uses dspam to analyze the message class otherwise. The honeypot variable is supposed to be set elsewhere in the code (e.g. in the ‘envrcpt’ handler):

prog eom
do
  number res
  if honeypot
    set res dspam(current_message(), DSM_PROCESS,
                  DSR_ISSPAM | DSS_INOCULATION)
    discard
  else
    if dspam(current_message(), DSM_PROCESS | DSF_SIGNATURE)
             == DSR_ISSPAM
      header_add("X-DSPAM-Result", "Spam")
    else
      header_add("X-DSPAM-Result", "Innocent")
    fi
    header_add("X-DSPAM-Probability",
               sa_format_score(dspam_probability, dspam_prec))
    header_add("X-DSPAM-Confidence",
               sa_format_score(dspam_confidence, dspam_prec))
    header_add("X-DSPAM-Signature", dspam_signature)
  fi
done

5.27.2.1 DSPAM Operation Modes and Flags

The tables below summarize flags which can be used in the mode_flags argument to dspam function. The argument is a bitwise OR of operation mode, flags, tokenizer and training mode bits. Only one operation mode bit can be used. Flags, tokenizer and training mode are optional. Any number of flags, but no more than one tokenizer type and one training mode bit are allowed. Missing values will be supplied from the configuration file.

Mode            Meaning
DSM_PROCESS     Process message
DSM_CLASSIFY    Classify message only (do not write changes)

Table 5.2: DSPAM Operation modes

Flag             Meaning
DSF_SIGNATURE    Create a signature
DSF_NOISE        Use Bayesian Noise Reduction
DSF_WHITELIST    Use Automatic Whitelisting

Table 5.3: DSPAM flags

Constant     Meaning
DSZ_WORD     Use WORD tokenizer
DSZ_CHAIN    Use CHAIN tokenizer
DSZ_SBPH     Use SBPH tokenizer
DSZ_OSB      Use OSB tokenizer

Table 5.4: DSPAM Tokenizer bits

Mode        Meaning
DST_TEFT    Train Everything
DST_TOE     Train-on-Error
DST_TUM     Train-until-Mature

Table 5.5: DSPAM Training Modes

5.27.2.2 DSPAM Class and Source Bits

The tables below summarize flags which can be used in the class_source argument to dspam function. The argument is a bitwise OR of classification and source bits. At most one classification and one source bit can be given. If not supplied, ‘DSR_NONE|DSS_NONE’ is used.

The classification flags are also used as the return code, as shown in the following table.

Mode              As return value        As argument
DSR_NONE          N/A                    Classify message
DSR_ISSPAM        Message is spam        Learn as spam
DSR_ISINNOCENT    Message is innocent    Learn as innocent

Table 5.6: DSPAM Classification

Source             Meaning
DSS_NONE           No classification source (use only with DSR_NONE)
DSS_ERROR          Misclassification by libdspam
DSS_CORPUS         Message came from a corpus feed
DSS_INOCULATION    Message inoculation

Table 5.7: DSPAM Source

5.27.2.3 DSPAM Global Variables

The following global variables affect the behavior of the dspam function:

Built-in variable: string dspam_config

Name of the DSPAM configuration file. You must set this variable prior to calling dspam. There is no default value.

Built-in variable: string dspam_profile

Name of the configuration profile to be used. If empty (the default), use global configuration settings.

Built-in variable: string dspam_user

Name of the user on behalf of which dspam is called. Default is empty (no user).

Built-in variable: string dspam_group

Name of the user group on behalf of which dspam is called. Default is empty (no group).

Built-in variable: number dspam_prec

Number of decimal digits to retain in the dspam_probability and dspam_confidence values. See dspam probability and confidence, for more information and examples.

Before returning, dspam stores additional information in the following variables:

Built-in variable: string dspam_signature

Signature of the classified message. This variable is initialized if the ‘DSF_SIGNATURE’ bit is set in the mode_flags argument (see dspam classify example).

Built-in variable: number dspam_probability

Spam probability value converted to integer by shifting decimal point dspam_prec positions to the right and rounding the resulting number. See dspam probability and confidence, for more information and examples.

Built-in variable: number dspam_confidence

Spam confidence converted to integer using the same algorithm as for dspam_probability. See dspam probability and confidence, for more information and examples.
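As a rough illustration of the integer conversion described for dspam_probability and dspam_confidence, the following Python sketch shifts the decimal point and rounds. The helper names to_fixed and from_fixed are hypothetical and are not part of MFL. The half-up rounding rule is also an assumption, since the text above only says that the value is rounded.

```python
from decimal import Decimal, ROUND_HALF_UP


def to_fixed(value: str, prec: int) -> int:
    """Shift the decimal point prec positions to the right, then round.

    Half-up rounding is an assumption made for this sketch.
    """
    shifted = Decimal(value).scaleb(prec)
    return int(shifted.quantize(Decimal(1), rounding=ROUND_HALF_UP))


def from_fixed(number: int, prec: int) -> str:
    """Turn the stored integer back into a decimal string with prec digits."""
    return str(Decimal(number).scaleb(-prec))


if __name__ == "__main__":
    stored = to_fixed("0.9876", 3)   # what a precision of 3 would keep
    print(stored, from_fixed(stored, 3))
```

With a precision of 3, a probability of 0.9876 would be stored as 988 under these assumptions, and formatting it back yields the string "0.988".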

5.27.3 ClamAV

Built-in Function: boolean clamav (number msg, string url)

Passes the message msg to the ClamAV daemon at url. Returns true if the daemon detects a virus in it. In that case, the virus name is stored in the clamav_virus_name global variable.

The clamav function can signal the following exceptions: e_failure if it fails to connect to the server, e_url if the supplied URL is invalid, and e_range if the supplied port number is out of the range 1–65535.

An example usage:

prog eom
do
  if clamav(current_message(), "tcp://192.168.10.1:6300")
    reject 550 5.7.0 "Infected with %clamav_virus_name"
  fi
done

5.28 Rate limiting functions

Built-in Function: number rate (string key, number sample-interval, [number min-samples, number threshold])

Returns the mail sending rate for key per sample-interval. Optional min-samples, if supplied, specifies the minimal number of mails needed to obtain the statistics. The default is 2. Optional threshold controls rate database updates. If the observed rate (per sample-interval seconds) is higher than the threshold, the hit counters for that key are not incremented and the database is not updated. Although the threshold argument is optional22, its use is strongly encouraged. Normally, the value of threshold equals the value compared with the return from rate, as in:

  if rate("$f-$client_addr", rate_interval, 4, maxrate) > maxrate
    tempfail 450 4.7.0 "Mail sending rate exceeded.  Try again later"
  fi

This function is a low-level interface. Instead of using it directly, we advise using the rateok function, described below.

Library Function: boolean rateok (string key, number sample-interval, number threshold, [number min-samples])

To use this function, require the rateok module (see Modules), e.g.: require rateok.

The rateok function returns ‘True’ if the mail sending rate for key, computed for the interval of sample-interval seconds is less than the threshold. Optional min-samples parameter supplies the minimal number of mails needed to obtain the statistics. It defaults to 4.

See Sending Rate, for a detailed description of the rateok and its use. The interval function (see interval) is often used in the second argument to rateok or rate.

Built-in Function: boolean tbf_rate (string key, number cost, number sample-interval, number burst-size)

This function implements a classical token bucket filter algorithm. Tokens are added to the bucket identified by the key at constant rate of 1 token per sample-interval microseconds, to a maximum of burst-size tokens. If no bucket is found for the specified key, a new bucket is created and initialized to contain burst-size tokens. If the bucket contains cost or more tokens, cost tokens are removed from it and tbf_rate returns ‘True’. Otherwise, the function returns ‘False’.

For a detailed description of the Token Bucket Algorithm and its use to limit mail rates, see TBF.
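The bucket behaviour described for tbf_rate can be sketched in Python as follows. This is only an illustration of the algorithm as stated above, not the mailfromd implementation: the class name TokenBuckets, the in-memory dictionary (mailfromd keeps buckets in a database), and the explicit now_usec argument are all assumptions made so that the example is self-contained and deterministic.

```python
class TokenBuckets:
    """In-memory token buckets keyed by string (illustrative only)."""

    def __init__(self):
        # key -> (tokens currently in the bucket, time of last refill in usec)
        self.buckets = {}

    def tbf_rate(self, key, cost, interval_usec, burst_size, now_usec):
        # A missing bucket is created full, i.e. with burst_size tokens.
        tokens, last = self.buckets.get(key, (burst_size, now_usec))

        # One token is added per elapsed interval, up to burst_size.
        earned = (now_usec - last) // interval_usec
        if earned > 0:
            tokens = min(burst_size, tokens + earned)
            last += earned * interval_usec

        # Spend cost tokens if the bucket holds enough of them.
        allowed = tokens >= cost
        if allowed:
            tokens -= cost
        self.buckets[key] = (tokens, last)
        return allowed


if __name__ == "__main__":
    tb = TokenBuckets()
    second = 1_000_000  # the interval is expressed in microseconds
    for t in (0, 0, 0, second):
        print(t, tb.tbf_rate("client", 1, second, 2, t))
```

With a burst size of 2 and an interval of one second, the first two calls at time 0 succeed, the third fails, and a call one second later succeeds again because one token has been added in the meantime.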

5.29 Greylisting functions

Built-in Function: boolean greylist (string key, number interval)

Returns ‘True’ if the key is found in the greylist database (controlled by database greylist configuration file statement, see conf-database). The argument interval gives the greylisting interval in seconds. The function stores the number of seconds left to the end of greylisting period in the global variable greylist_seconds_left. See Greylisting, for a detailed explanation.

The function greylist can signal e_dbfailure exception.

Built-in Function: boolean is_greylisted (string key)

Returns ‘True’ if the key is still greylisted. If ‘True’ is returned, the function also stores the number of seconds left to the end of greylisting period in the global variable greylist_seconds_left.

This function is available only if Con Tassios implementation of greylisting is used. See greylisting types, for a discussion of available greylisting implementations. See greylist, for a way to switch to Con Tassios implementation.

5.30 Special Test Functions

Library Function: boolean portprobe (string host, [number port])
Library Function: boolean listens (string host, [number port])

Returns true if the IP address or host name given by host argument listens on the port number port (default 25).

This function is defined in the module portprobe.

Built-in Function: boolean validuser (string name)

Returns true if authenticity of the user name is confirmed using mailutils authentication system. See Local Account Verification, for more details.

Library Function: boolean valid_domain (string domain)

Returns true if the domain name domain has a corresponding A record or if it has any ‘MX’ records, i.e. if it is possible to send mail to it.

To use this function, require the valid_domain module (see Modules):

require valid_domain

Library Function: number heloarg_test (string arg, string remote_ip, string local_ip)

Verify if an argument of ‘HELO’ (‘EHLO’) command is valid. To use this function, require the heloarg_test module (see Modules).

Arguments:

arg

‘HELO’ (‘EHLO’) argument. Typically, the value of $s Sendmail macro;

remote_ip

IP address of the remote client. Typically, the value of $client_addr Sendmail macro;

local_ip

IP address of this SMTP server;

The function returns a number describing the result of the test, as described in the following table.

Code                 Meaning
HELO_SUCCESS         arg successfully passes all tests.
HELO_MYIP            arg is our IP address.
HELO_IPNOMATCH       arg is an IP, but it does not match the remote party IP address.
HELO_ARGNORESOLVE    arg is an IP, but it does not resolve.
HELO_ARGNOIP         arg is in square brackets, but it is not an IP address.
HELO_ARGINVALID      arg is not an IP address and does not resolve to one.
HELO_MYSERVERIP      arg resolves to our server IP.
HELO_IPMISMATCH      arg does not resolve to the remote client IP address.

5.31 Mail Sending Functions

The mail sending functions are new interfaces, introduced in version 3.1.

The underlying mechanism for sending mail, called mailer, is specified by --mailer command line option. This global setting can be overridden using the last optional argument to a particular function. In any case, the mailer is specified in the form of a URL.

Mailer URL begins with a protocol specification. Two protocol specifications are currently supported: ‘sendmail’ and ‘smtp’. The former means to use a sendmail-compatible program to send mails. Such a program must be able to read mail from its standard input and must support the following options:

-oi

Do not treat ‘.’ as message terminator.

-f addr

Use addr as the address of the sender.

-t

Get recipient addresses from the message.

These conditions are met by most existing MTA programs, such as exim or postfix (to say nothing of sendmail itself).

Following the protocol specification is the mailer location, which is separated from it with a colon. For the ‘sendmail’ protocol, the mailer location sets the full file name of the Sendmail-compatible MTA binary, for example:

sendmail:/usr/sbin/sendmail

A special form of a sendmail URL, consisting of protocol specification only (‘sendmail:’) is also allowed. It means “use the sendmail binary from the _PATH_SENDMAIL macro in your /usr/include/paths.h file”. This is the default mailer.

The ‘smtp’ protocol means to use an SMTP server directly. In this case the mailer location consists of two slashes, followed by the IP address or host name of the SMTP server, and, optionally, the port number. If the port number is present, it is separated from the rest of URL by a colon. For example:

smtp://remote.server.net
smtp://remote.server.net:24
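
To make the two mailer URL forms concrete, here is a small Python sketch that splits a mailer URL into its parts. It is not how mailfromd parses the URL. The function name parse_mailer and the returned dictionary layout are hypothetical, and a missing value is reported as None rather than replaced by a default.

```python
from urllib.parse import urlsplit


def parse_mailer(url: str) -> dict:
    """Split a 'sendmail:' or 'smtp://' mailer URL into its components."""
    parts = urlsplit(url)
    if parts.scheme == "sendmail":
        # A bare "sendmail:" carries no path: the default binary is meant.
        return {"protocol": "sendmail", "binary": parts.path or None}
    if parts.scheme == "smtp":
        if not parts.hostname:
            raise ValueError("smtp mailer URL needs a host: %r" % url)
        # The port is None when the URL does not specify one.
        return {"protocol": "smtp", "host": parts.hostname, "port": parts.port}
    raise ValueError("unsupported mailer protocol in %r" % url)


if __name__ == "__main__":
    for u in ("sendmail:", "sendmail:/usr/sbin/sendmail",
              "smtp://remote.server.net", "smtp://remote.server.net:24"):
        print(u, "->", parse_mailer(u))
```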

Built-in Function: void send_mail (string msg [, string to, string from, string mailer])

Sends message msg to the email address to. The value of msg must be a valid RFC 2822 message, consisting of headers and body. Optional argument to can contain several email addresses. In this case the message will be sent to each recipient specified in to. If it is not specified, recipient addresses will be obtained from the message headers.

Other optional arguments are:

from

Sets the sender address. By default ‘<>’ is used.

mailer

The URL of the mailer to use.

Sample usage:

    set message <<- EOT
          Subject: Test message
          To: Postmaster <postmaster@gnu.org.ua>
          From: Mailfromd <devnull@gnu.org.ua>
          X-Agent: %__package__ (%__version__)

          Dear postmaster,
          
          This is to notify you that our /etc/mailfromd.mf
          needs a revision.
          --
          Mailfromd filter administrator
    EOT
    send_mail(message, "postmaster@gnu.org.ua")

Built-in Function: void send_text (string text, string headers [, string to, string from, string mailer])

A more complex interface to mail sending functions.

Mandatory arguments:

text

Text of the message to be sent.

headers

Headers for the message.

Optional arguments:

to

Recipient email addresses.

from

Sender email address.

mailer

URL of the mailer to use.

The above example can be rewritten using send_text as follows:

    set headers <<- EOT
          Subject: Test message
          To: Postmaster <postmaster@gnu.org.ua>
          From: Mailfromd <devnull@gnu.org.ua>
          X-Agent: %__package__ (%__version__)
    EOT
    set text <<- EOT
          Dear postmaster,
          
          This is to notify you that our /etc/mailfromd.mf
          needs a revision.
          --
          Mailfromd filter administrator
    EOT
    send_text(text, headers, "postmaster@gnu.org.ua")

Built-in Function: void send_message (number msg [, string to, string from, string mailer])

Send the message identified by descriptor msg (see Message functions).

Optional arguments are:

to

Recipient email addresses.

from

Sender email address.

mailer

URL of the mailer to use.

Built-in Function: void send_dsn (string to, string sender, string rcpt, string text [, string headers, string from, string mailer])

This is an experimental interface which will change in the future versions. It sends a message disposition notification (RFC 2298, RFC 1894), of type ‘deleted’ to the email address to. Arguments are:

to

Recipient email address.

sender

Original sender email address.

rcpt

Original recipient email address.

text

Notification text.

Optional arguments:

headers

Message headers.

from

Sender address.

mailer

URL of the mailer to use.

Built-in Function: number create_dsn (string sender, string rcpt, string text [, string headers, string from])

Creates DSN message and returns its descriptor. Arguments are:

sender

Original sender email address.

rcpt

Original recipient email address.

text

Notification text.

headers

Message headers.

from

Sender address.

5.32 Blacklisting Functions

The functions described in this subsection check whether a given IP address is listed in a particular DNS blacklist zone.

Library Function: boolean match_dnsbl (string address, string zone, string range)

This function looks up address in the DNS blacklist zone zone and checks if the return falls into the given range of IP addresses.

It is intended as a replacement for the Sendmail macros ‘dnsbl’ and ‘enhdnsbl’.

To use match_dnsbl, require the match_dnsbl module (see Modules).

Arguments:

address

IP address of the SMTP server to be tested.

zone

FQDN of the DNSbl zone to test against.

range

The range of IP addresses in CIDR notation or the word ‘ANY’, which stands for ‘127.0.0.0/8’.

The function returns true if the DNS lookup for address in the zone zone yields an IP address that falls within range. Otherwise, it returns false.

This function raises the following exceptions: e_invip if address is invalid and e_invcidr if range is invalid.
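
The two steps of such a check can be sketched in Python as below. The sketch performs no DNS query. It only builds the name that would be looked up and tests a returned address against the range. The reversed-octet query name follows the usual DNSBL convention, which is an assumption here because the text above does not spell it out. The function names dnsbl_query_name and reply_in_range and the zone bl.example.org are hypothetical.

```python
import ipaddress


def dnsbl_query_name(address: str, zone: str) -> str:
    """Build the DNS name conventionally queried for an IPv4 address."""
    # IPv4Address() raises ValueError for a malformed address.
    octets = str(ipaddress.IPv4Address(address)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def reply_in_range(reply: str, cidr_range: str) -> bool:
    """Check whether the address returned by the lookup falls into the range."""
    if cidr_range == "ANY":
        # 'ANY' stands for 127.0.0.0/8, as stated in the argument list.
        cidr_range = "127.0.0.0/8"
    network = ipaddress.IPv4Network(cidr_range)
    return ipaddress.IPv4Address(reply) in network


if __name__ == "__main__":
    print(dnsbl_query_name("192.0.2.99", "bl.example.org"))
    print(reply_in_range("127.0.0.2", "ANY"))
    print(reply_in_range("127.0.0.2", "127.0.0.4/30"))
```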

Library Function: boolean match_rhsbl (string email, string zone, string range)

This function checks if the IP address, corresponding to the domain part of email is listed in the RHS DNS blacklist zone zone, and if so, whether its record falls into the given range of IP addresses range.

It is intended as a replacement for the Sendmail macro ‘rhsbl’ by Derek J. Balling.

To use this function, require the match_rhsbl module (see Modules).

Arguments:

email

E-mail address whose domain name should be tested (usually, it is $f).

zone

Domain name of the RHS DNS blacklist zone.

range

The range of IP addresses in CIDR notation.

5.33 SPF Functions

Sender Policy Framework, or SPF for short, is an extension to the SMTP protocol that makes it possible to detect forged identities supplied with the MAIL FROM and HELO commands. The framework is explained in detail in RFC 4408 (http://tools.ietf.org/html/rfc4408) and on the SPF Project Site. The following description is a short introduction only, and users are encouraged to refer to the original specification for a detailed description of the framework.

The domain holder publishes an SPF record – a special DNS resource record that contains a set of rules declaring which hosts are, and are not, authorized to use a domain name for HELO and MAIL FROM identities. This resource record is usually of type TXT.23

The MFL script can verify if the identity matches the published SPF record by calling check_host function and analyzing its return code. The function can be called either in helo or in envfrom handler. Its arguments are:

ip

The IP address of the SMTP client that is emitting the mail. Usually it is $client_addr.

domain

The domain that provides the sought-after authorization information. Normally it is the domain portion of the MAIL FROM or HELO identity.

sender

The MAIL FROM identity.

helo_domain

The HELO identity.

my_domain

The SMTP domain served by the local server.

The function returns a numeric result code. For convenience, all possible return values are defined as macros in module spf.mf. The table below describes each value along with the recommended actions for it:

None

A result of None means that no records were published by the domain or that no checkable sender domain could be determined from the given identity. The checking software cannot ascertain whether or not the client host is authorized. Such a message can be subject to further checks that will decide about its fate.

Neutral

The domain owner has explicitly stated that he cannot or does not want to assert whether or not the IP address is authorized. This result must be treated exactly like None; the distinction between them exists only for informational purposes.

Pass

The client is authorized to send mail with the given identity. The message can be subject to further policy checks with confidence in the legitimate use of the identity or it can be accepted in the absence of such checks.

Fail

The client is not authorized to use the domain in the given identity. The proper action in this case can be to mark the message with a header explicitly stating it is spam, or to reject it outright.

If you choose to reject such mails, we suggest using reject 550 5.7.1, as recommended by RFC 4408. The reject can return either a default explanation string, or the one supplied by the domain that published the SPF records, as in the example below:

  reject 550 5.7.1 "SPF check failed:\n%spf_explanation"

(for the description of spf_explanation, see spf_explanation)

SoftFail

The domain believes the host is not authorized but is not willing to make that strong of a statement. This result code should be treated as somewhere in between a Fail and a Neutral. It is not recommended to reject the message based solely on this result.

TempError

A transient error occurred while performing the SPF check. The proper action in this case is to accept or temporarily reject the message. If you choose the latter, we suggest using the SMTP reply code ‘451’ and the DSN code ‘4.4.3’, for example:

  tempfail 451 4.4.3
           "Transient error while performing SPF verification"

PermError

This result means that the domain’s published records could not be correctly interpreted. This signals an error condition that requires manual intervention to be resolved, as opposed to the TempError result.

The following example illustrates the use of SPF verification in envfrom handler:

#include_once <status.mfh>
require spf

prog envfrom
do
  switch check_host($client_addr, domainpart($f), $f, $s)
  do
  case Fail:
    string text ""
    if spf_explanation != ""
      set text "%text\n%spf_explanation"
    fi
    reject 550 5.7.1 "SPF MAIL FROM check failed: %text"

  case Pass:
    accept

  case TempError:
    tempfail 451 4.4.3
             "Transient error while performing SPF verification"

  default:
    on poll $f do
    when success:
      accept
    when not_found or failure:
      reject 550 5.1.0 "Sender validity not confirmed"
    when temp_failure:
      tempfail 450 4.7.0 "Temporary failure during sender verification"
    done
  done
done  

The SPF support is implemented in MFL in two layers: a built-in layer that provides basic support, and a library layer that provides a convenience wrapper over the built-in function.

The library layer is implemented in the module spf.mf (see Modules).

The rest of this node describes available SPF functions and variables.

Built-in Function: number spf_check_host (string ip, string domain, string sender, string helo_domain, string my_domain)

This function is the basic implementation of the check_host function, defined in RFC 4408, chapter 4. It fetches SPF records, parses them, and evaluates them to determine whether a particular host (ip) is or is not permitted to send mail from a given email address (sender). The function returns an SPF result code.

Arguments are:

ip

The IP address of the SMTP client that is emitting the mail. Usually it is $client_addr.

domain

The domain that provides the sought-after authorization information. Normally it is the domain portion of the MAIL FROM or HELO identity.

sender

The MAIL FROM identity.

helo_domain

The HELO identity.

my_domain

The SMTP domain served by the local server.

Before returning, the spf_check_host function stores additional information in global variables:

spf_explanation

If the result code is Fail, this variable contains the explanation string as returned by the publishing domain, prefixed with the value of the global variable spf_explanation_prefix.

For example, if spf_explanation_prefix contains ‘The domain %{o} explains: ’, and the publishing domain ‘example.com’ returns the explanation string ‘Please see http://www.example.com/mailpolicy.html’, then the value of spf_explanation will be:

The domain example.com explains:
Please see http://www.example.com/mailpolicy.html

(see RFC 4408, chapter 8, for the description of SPF macro facility).

spf_mechanism

Name of the SPF mechanism that decided about the result code of the SPF record. If one or more ‘include’ or ‘redirect’ mechanisms were traversed before arriving at that mechanism, their values are appended in the reverse order.

Built-in Function: number spf_test_record (string record, string ip, string domain, string sender, string helo_domain, string my_domain)

Evaluate SPF record record as if it were published by domain. The rest of arguments are the same as for spf_check_host above.

This function is designed primarily for testing and debugging purposes. You would hardly need to use it.

The spf_test_record function sets the same global variables as spf_check_host.

Library Function: number check_host (string ip, string domain, string sender, string helo)

This function implements the check_host function, defined in RFC 4408, chapter 4. It fetches SPF records, parses them, and evaluates them to determine whether a particular host (ip) is or is not permitted to send mail from a given email address (sender). The function returns an SPF result code.

This function is a wrapper over the built-in spf_check_host.

The arguments are:

ip

The IP address of the SMTP client that is emitting the mail. Usually it is the same as the value of $client_addr.

domain

The domain that provides the sought-after authorization information. Normally it is the domain portion of the MAIL FROM or HELO identity.

sender

The MAIL FROM identity.

helo

The HELO identity.

Library Function: string spf_status_string (number code)

Converts numeric SPF result code to its string representation.

Built-in variable: string spf_explanation

If check_host (or spf_check_host or spf_test_record) returned Fail, this variable contains the explanation string as returned by the publishing domain, prefixed with the value of the global variable spf_explanation_prefix.

For example, if spf_explanation_prefix contains ‘The domain %{o} explains: ’, and the publishing domain ‘example.com’ returns the explanation string ‘Please see http://www.example.com/mailpolicy.html’, then the value of spf_explanation will be:

The domain example.com explains:
Please see http://www.example.com/mailpolicy.html

Built-in variable: string spf_mechanism

Set to the name of the SPF mechanism that determined the result code of the SPF record.

Built-in variable: string spf_explanation_prefix

The prefix to be prepended to the explanation string before storing it in the spf_explanation variable. This string can contain valid SPF macros (see RFC 4408, chapter 8), for example:

set spf_explanation_prefix "%{o} explains: "

The default value is ‘""’ (an empty string).

5.34 Sockmap Functions

Socket map (sockmap for short) is a special type of database used in Sendmail and MeTA1. It uses a simple server/client protocol over INET or UNIX stream sockets. The server listens on a socket for queries. The client connects to the server and sends it a query, consisting of a map name and a key separated by a single space. Both map name and key are sequences of non-whitespace characters. The map name serves to identify the type of the query. The server replies with a response consisting of a status indicator and result, separated by a single space. The result part is optional.

For example, the following is the query for key ‘news’ in map ‘aliases’:

12:aliases news,

A possible reply is:

18:OK root@domain.net,

This reply means that the key ‘news’ was found in the map, and the value corresponding to that key is ‘root@domain.net’.

The following reply means the key was not found:

8:NOTFOUND,

For a detailed description of the sockmap protocol, see Protocol in Smap manual.
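
The length prefix and trailing comma seen in the replies above are netstring framing, which the socket map protocol uses on the wire. As a minimal sketch of that framing (not of the MFL primitives), the Python helpers below encode a payload and split a reply into its status indicator and optional result. The function names are hypothetical.

```python
def encode_netstring(payload: str) -> bytes:
    """Frame a payload as '<length>:<payload>,' with the length in bytes."""
    data = payload.encode("utf-8")
    return str(len(data)).encode("ascii") + b":" + data + b","


def decode_netstring(frame: bytes) -> str:
    """Return the payload of a single netstring, validating its framing."""
    length, sep, rest = frame.partition(b":")
    if not sep or not length.isdigit():
        raise ValueError("missing or malformed length prefix")
    size = int(length)
    if len(rest) != size + 1 or rest[size:] != b",":
        raise ValueError("length does not match payload or comma is missing")
    return rest[:size].decode("utf-8")


def parse_reply(frame: bytes):
    """Split a reply into (status, result); result is '' when absent."""
    status, _, result = decode_netstring(frame).partition(" ")
    return status, result


if __name__ == "__main__":
    print(parse_reply(b"18:OK root@domain.net,"))
    print(parse_reply(b"8:NOTFOUND,"))
```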

The MFL library provides two primitives for dealing with sockmaps. Both primitives become available after requiring the sockmap module.

Library Function: string sockmap_lookup (number fd, string map, string key)

This function looks up the key in the map. The fd refers to the sockmap to use. It must be obtained as a result of a previous call to open with the URL of the sockmap as its first argument (see open). For example:

  number fd open("@ unix:///var/spool/meta1/smap/socket")
  string ret sockmap_lookup(fd, "aliases", $rcpt_to)
  if ret matches "OK (.+)"
    set alias \1
  fi
  close(fd)

Library Function: string sockmap_single_lookup (string url, string map, string key)

This function connects to the sockmap identified by the url, queries for key in map and closes the connection. It is useful when you need to perform only a single lookup on the sockmap.

5.35 National Language Support Functions

The National Language Support functions allow you to write your scripts in such a way, that any textual messages they display are automatically translated to your native language, or, more precisely, to the language required by your current locale.

This section assumes the reader is familiar with the concepts of program internationalization and localization. If not, please refer to The Purpose of GNU gettext in GNU gettext manual, before reading further.

In general, internationalization of any MFL script follows the same rules as described in the GNU gettext manual. First of all, you select the program message domain, i.e. the identifier of the set of translatable messages your script contains. This identifier is then used to select the appropriate translation. The message domain is set using the textdomain function. For the purposes of this section, let’s suppose the domain name is ‘myfilter’. All NLS functions are provided in the nls module, which you need to require prior to using any of them.

To find translations of textual message to the current locale, the underlying gettext mechanism will look for file dirname/locale/LC_MESSAGES/domainname.mo, where dirname is the message catalog hierarchy name, locale is the locale name, and domainname is the name of the message domain. By default dirname is /usr/local/share/locale, but you may change it using bindtextdomain function. The right place for this initial NLS setup is in the ‘begin’ block (see begin/end). To summarize all the above, the usual NLS setup will look like:

require nls

begin
do
  textdomain("myfilter")
  bindtextdomain("myfilter", "/usr/share/locale")
done  

For example, given the settings above, and supposing the environment variable LC_ALL is set to ‘pl’, translations will be looked up in the file /usr/share/locale/pl/LC_MESSAGES/myfilter.mo.

Once this preparatory work is done, you can request each message to be translated by using gettext function, or _ (underscore) macro. For example, the following statement will produce translated textual description for ‘450’ response:

tempfail 450 4.1.0 _("Try again later")

Of course, this assumes that the appropriate myfilter.mo file already exists. If it does not, nothing bad happens: in this case the macro _ (as well as the gettext function) will simply return its argument unchanged, so that the remote party will get the textual message in English.

The ‘mo’ files are binary files created from ‘po’ source files using msgfmt utility, as described in Producing Binary MO Files in GNU gettext manual. In turn, the format of ‘po’ files is described in The Format of PO Files in GNU gettext manual.

Built-in Function: string bindtextdomain (string domain, string dirname)

This function sets the base directory of the hierarchy containing message catalogs for a given message domain.

domain is a string identifying the textual domain. If it is not empty, the base directory for message catalogs belonging to domain domain is set to dirname. It is important that dirname be an absolute pathname; otherwise it cannot be guaranteed that the message catalogs will be found.

If dirname is ‘""’, bindtextdomain returns the previously set base directory for domain domain.

The rest of this section describes the NLS functions supplied in the nls module.

Built-in Function: string dgettext (string domain, string msgid)

dgettext attempts to translate the string msgid into the currently active locale, according to the settings of the textual domain domain. If there is no translation available, dgettext returns msgid unchanged.

Built-in Function: string dngettext (string domain, string msgid, string msgid_plural, number n)

The dngettext function attempts to translate a text string into the language specified by the current locale, by looking up the appropriate singular or plural form of the translation in a message catalog, set for the textual domain domain.

See Additional functions for plural forms in GNU gettext utilities, for a discussion of the plural form handling in different languages.

Library Function: string textdomain (string domain)

The textdomain function sets the current message domain to domain, if it is not empty. In any case the function returns the current message domain. The current domain is ‘mailfromd’ initially. For example, the following sequence of textdomain invocations will yield:

textdomain("") ⇒ "mailfromd"
textdomain("myfilter") ⇒ "myfilter"
textdomain("") ⇒ "myfilter"

Library Function: string gettext (string msgid)

gettext attempts to translate the string msgid into the currently active locale, according to the settings of the current textual domain (set using textdomain function). If there is no translation available, gettext returns msgid unchanged.

Library Function: string ngettext (string msgid, string msgid_plural, number n)

The ngettext function attempts to translate a text string into the language specified by the current locale, by looking up the appropriate singular or plural form of the translation in a message catalog, set for the current textual domain.

See Additional functions for plural forms in GNU gettext utilities, for a discussion of the plural form handling in different languages.

5.36 Syslog Interface

The basic means for outputting diagnostic messages is the ‘echo’ instruction (see Echo), which sends its arguments to the currently established logging channel. In daemon mode, the latter is normally connected to syslog, so any echoed messages are sent there with the facility selected in mailfromd configuration and priority ‘info’.

If you want to send a message to another facility and/or priority, use the ‘syslog’ function:

Built-in Function: void syslog (number priority, string text)

Sends text to syslog. The priority argument is formed by ORing the facility and the level values (explained below). The facility is optional. If not supplied, the currently selected logging facility is used.

The facility specifies what type of program is logging the message, and the level indicates its relative severity. The following symbolic facility values are declared in the syslog module: ‘LOG_KERN’, ‘LOG_USER’, ‘LOG_MAIL’, ‘LOG_DAEMON’, ‘LOG_AUTH’, ‘LOG_SYSLOG’, ‘LOG_LPR’, ‘LOG_NEWS’, ‘LOG_UUCP’, ‘LOG_CRON’, ‘LOG_AUTHPRIV’, ‘LOG_FTP’ and ‘LOG_LOCAL0’ through ‘LOG_LOCAL7’.

The declared severity levels are: ‘LOG_EMERG’, ‘LOG_ALERT’, ‘LOG_CRIT’, ‘LOG_ERR’, ‘LOG_WARNING’, ‘LOG_NOTICE’, ‘LOG_INFO’ and ‘LOG_DEBUG’.

5.37 Debugging Functions

These functions are designed for debugging the MFL programs.

Built-in Function: void debug (string spec)

Enable debugging. The value of spec sets the debugging level. See debugging level specification, for a description of its format.

For compatibility with previous versions, this function is also available under the name ‘mailutils_set_debug_level’.

Built-in Function: number debug_level ([string srcname])

This function returns the debugging level currently in effect for the source module srcname, or the global debugging level, if called without arguments.

For example, if the program was started with the --debug='all.trace5;engine.trace8' option, then:

debug_level() ⇒ 127
debug_level("engine") ⇒ 1023
debug_level("db") ⇒ 0
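
The returned value can be used to make additional diagnostics conditional. A minimal sketch, assuming the ‘engine’ category from the example above; the handler and the message text are arbitrary:

    prog envfrom
    do
      /* Report the sender only if some debugging is enabled for engine */
      if debug_level("engine") > 0
        echo "engine debugging is on; sender: " . $f
      fi
    done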
Built-in Function: boolean callout_transcript ([boolean value])

Returns the current state of the callout SMTP transcript. The result is 1 if the transcript is enabled and 0 otherwise. The transcript is normally enabled either by the use of the --transcript command line option (see SMTP transcript) or via the ‘transcript’ configuration statement (see transcript).

The optional value argument supplies the new state of the SMTP transcript. Thus, calling ‘callout_transcript(0)’ disables the transcript.

This function can be used in a bracket-like fashion to enable the transcript for a certain part of an MFL program, e.g.:

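/* Enable the transcript, keeping the value returned by the call */
number xstate callout_transcript(1)
on poll $f do
  …
done
/* Disable the transcript again */
set xstate callout_transcript(0)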

Note that the use of this function (as well as of the --transcript option) makes sense only if callouts are performed by the mailfromd daemon itself. It will not work if a dedicated callout server is used for that purpose (see calloutd).

Built-in Function: string debug_spec ([string catnames, bool showunset])

Returns the current debugging level specification, as given by the --debug command line option or by the debug configuration statement (see conf-debug).

If the argument catnames is specified, it is treated as a semicolon-separated list of categories for which the debugging specification is to be returned.

For example, if mailfromd was started with --debug='all.trace5;spf.trace1;engine.trace8;db.trace0', then:

debug_spec() ⇒ "all.trace5,engine.trace8"
debug_spec("all;engine") ⇒ "all.trace5,engine.trace8"
debug_spec("engine;db") ⇒ "db.trace0;engine.trace8"
debug_spec("prog") ⇒ ""

When called without arguments, debug_spec returns only those categories which have been set, as shown in the first example above.

The optional showunset parameter controls whether unset module specifications are returned. To print all debugging specifications, whether set or not, use

debug_spec("", 1)

The debug, debug_level and debug_spec functions are intended to complement each other. Calls to debug can be placed around a piece of code you wish to debug, to enable specific debugging information for that fragment only. For example:

    /* Save debugging level for dns.c source */
    set dlev debug_spec("dns", 1)
    /* Set new debugging level */
    debug("dns.trace8")
    .
    .
    .
    /* Restore previous level */
    debug(dlev)
Built-in Function: void program_trace (string module)

Enable tracing for the set of modules given in the module argument. See --trace-program, for a description of its format.

Built-in Function: void cancel_program_trace (string module)

Disable tracing for the given modules.

This pair of functions is also designed to be used together in a bracket-like fashion. They are useful for debugging mailfromd, but their use is not advised otherwise, since tracing slows down execution considerably.

Built-in Function: void stack_trace ()

Generate a stack trace at this point. See tracing runtime errors, for a detailed description of stack traces.

The functions below are intended mainly for debugging the MFL run-time engine and for use in the mailfromd testsuite. You will hardly need to use them in your programs.

Built-in Function: void _expand_dataseg (number n)

Expands the run-time data segment by at least n words.

Built-in Function: number _reg (number r)

Returns the value of the register r at the moment of the call. Symbolic names for run-time registers are provided in the module _register:

Name           Register
REG_PC         Program counter
REG_TOS        Top of stack
REG_TOH        Top of heap
REG_BASE       Frame base
REG_REG        General-purpose accumulator
REG_MATCHSTR   Last matched string pointer
Built-in Function: void _wd ([number n])

Enters a time-consuming loop and waits there for n seconds (indefinitely, by default). The intention is to facilitate attaching to mailfromd with the debugger. Before entering the loop, a diagnostic message is logged with the ‘crit’ priority, informing about the PID of the process and suggesting the command to be used to attach to it, e.g.:

mailfromd: process 21831 is waiting for debug
mailfromd: to attach: gdb -ex 'set variable mu_wd::_count_down=0'
 /usr/sbin/mailfromd 21831

Footnotes

(18)

That is, if it supports Milter protocol version 6 or newer. Sendmail 8.14.0 and newer and Postfix 2.6 and newer do. MeTA1 (via pmult) does as well. See MTA Configuration, for more details.

(19)

Support for other locales is planned for future versions.

(20)

For example:

prog header
do
  echo unfold($2)
done

(21)

Note that the return code is inverted with respect to the system function ‘access(2)’.

(22)

It is made optional in order to provide backward compatibility with the releases of mailfromd prior to 5.0.93.

(23)

Although RFC 4408 introduces a special SPF record type for this purpose, it is not yet widely used. As of version 8.7, MFL does not support SPF DNS records.
