

3.12 Sending Rate

The notion of mail sending rate was introduced in Rate Limit. Mailfromd keeps the computed rates in a special rate database (see Databases). Each record in this database consists of a key, for which the rate is computed, and the rate value, a double-precision floating-point number representing the average number of messages per second sent by this key within the last sampling interval. In the simplest case, the sender email address can be used as the key. However, we recommend using the combination email-sender_ip instead, so that the actual owner of the address is not blocked because of some spammer abusing it.

Two functions are provided to control and update sending rates. The rateok function takes three mandatory arguments:

  bool rateok(string key, number interval, number threshold)

The meaning of key is described above. The interval argument is the sampling interval, that is, the number of seconds to which the actual sending rate is scaled. Remember that the rate is stored internally as a floating-point number and thus cannot be used directly in mailfromd filters, which operate only on integers. To use the rate value, it is first converted to messages per given interval, which is an integer. For example, the rate 0.138888 scaled to a 1-hour interval (3600 seconds) gives approximately 500 messages per hour.

When the rateok function is called, it recomputes the rate record for the given key. If the new rate value, converted to messages per given interval, is less than threshold, the function updates the database and returns True. Otherwise it returns False and does not update the database.
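
As a rough illustration of the conversion and comparison described above, the following Python sketch scales a per-second rate to a given interval and checks it against a threshold. It is not mailfromd code: the function names are hypothetical, rounding to the nearest integer is an assumption, and the recomputation of the stored rate is not modelled.

```python
def per_interval(rate_per_second, interval):
    """Scale a per-second rate to messages per `interval` seconds.

    Rounding to the nearest integer is an assumption of this sketch.
    """
    return round(rate_per_second * interval)


def below_threshold(rate_per_second, interval, threshold):
    """Mimic the final comparison: True while the scaled rate is below the threshold."""
    return per_interval(rate_per_second, interval) < threshold


print(per_interval(0.138888, 3600))           # the example rate scaled to one hour
print(below_threshold(0.138888, 3600, 180))   # compared with a limit of 180 per hour
```

With these assumptions, the example rate of 0.138888 messages per second is well above a limit of 180 per hour, so the comparison fails.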

This function must be required prior to use, by placing the following statement somewhere at the beginning of your script:

require rateok

For example, the following code limits the mail sending rate for each ‘email address’-‘IP’ combination to 180 per hour. If the actual rate reaches or exceeds this limit, the sender receives a temporary failure response:

require rateok

prog envfrom
do
  if not rateok($f . "-" . ${client_addr}, 3600, 180)
    tempfail 450 4.7.0 "Mail sending rate exceeded.  Try again later"
  fi
done

Notice the string concatenation used to produce the key.

It is often inconvenient to specify intervals in seconds, therefore a special interval function is provided. It converts its argument, a string describing a time interval in English, to the corresponding number of seconds. Using this function, the rateok invocation from the example above becomes:

     rateok($f . "-" . ${client_addr}, interval("1 hour"), 180)

The interval function is described in interval, and time intervals are discussed in time interval specification.

The rateok function begins computing the rate as soon as it has collected enough data. By default, it needs at least four mails. Since this may lead to a large number of false positives (i.e. overestimated rates) at the beginning of the sampling interval, you can specify the minimum number of samples rateok must collect before it actually starts computing rates. This number is given as the optional fourth argument to the function. For example, the following call always returns True for the first 10 mails, no matter what the actual rate is:

     rateok($f . "-" . ${client_addr}, interval("1 hour"), 180, 10)

The tbf_rate function gives finer control over mail rates. It implements the token bucket filter (TBF) algorithm.

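The token bucket controls when data can be transmitted, based on the presence of abstract entities called tokens in a container called the bucket. Each token represents some amount of data. The algorithm works as follows:

   * A token is added to the bucket at a constant rate of 1 token per t microseconds.
   * A bucket can hold at most m tokens. If a token arrives when the bucket is full, that token is discarded.
   * When n items of data arrive (e.g. n mails), n tokens are removed from the bucket and the data are accepted.
   * If fewer than n tokens are available, no tokens are removed from the bucket and the data are not accepted.

This algorithm keeps the data traffic at a constant rate of one item per t microseconds, with bursts of up to m data items. Such bursts can occur when no data have arrived for m*t or more microseconds.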
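
The behaviour of a token bucket can be tried out with a small simulation. The Python sketch below is only a toy model, not mailfromd's implementation: the class and method names are hypothetical, time is passed in explicitly as microseconds, and it assumes that a new bucket starts full and that tokens are credited lazily when the bucket is consulted.

```python
class TokenBucket:
    """Toy token bucket: one token per `t` microseconds, at most `m` tokens."""

    def __init__(self, t, m, now=0):
        self.t = t            # microseconds needed to earn one token
        self.m = m            # bucket capacity (maximum burst size)
        self.tokens = m       # assumption: a new bucket starts full
        self.stamp = now      # time (in microseconds) of the last refill

    def accept(self, n, now):
        """Return True and remove n tokens if at least n are available."""
        earned = (now - self.stamp) // self.t   # whole tokens earned since last refill
        if earned > 0:
            self.tokens = min(self.m, self.tokens + earned)
            self.stamp += earned * self.t       # keep the unused fraction of an interval
        if n <= self.tokens:
            self.tokens -= n
            return True
        return False                            # not enough tokens: nothing is removed


bucket = TokenBucket(t=10_000_000, m=20)        # one token per 10 s, burst of 20
burst = [bucket.accept(1, now=0) for _ in range(21)]
print(burst.count(True))                        # accepted items in the initial burst
print(bucket.accept(1, now=10_000_000))         # one more token earned after 10 s
```

In this model the first 20 items pass at once, the 21st is refused, and one further item is accepted after each additional t microseconds.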

Mailfromd keeps buckets in the ‘tbf’ database. Each bucket is identified by a unique key. The tbf_rate function is defined as follows:

 bool tbf_rate(string key, number n, number t, number m)

The key identifies the bucket to operate upon. The remaining arguments are described above. The tbf_rate function returns ‘True’ if the algorithm allows the data to be accepted and ‘False’ otherwise.

Depending on how the actual arguments are chosen, the tbf_rate function can be used to control various types of flow rates. For example, to control the mail sending rate, set n to the number of mails and t to the control interval in microseconds:

prog envfrom
do
  if not tbf_rate($f . "-" . $client_addr, 1, 10000000, 20)
    tempfail 450 4.7.0 "Mail sending rate exceeded.  Try again later"
  fi
done

The example above permits at most one mail every 10 seconds on average. The burst size is set to 20.
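
To pick t for a desired average rate, divide the length of the period in microseconds by the number of items allowed in that period. The Python helper below is a hypothetical illustration of this arithmetic, not part of mailfromd; it uses integer division because t is passed as a whole number of microseconds.

```python
MICROSECONDS_PER_SECOND = 1_000_000


def token_interval(items, seconds):
    """Microseconds per token for an average rate of `items` per `seconds`."""
    return seconds * MICROSECONDS_PER_SECOND // items


print(token_interval(1, 10))      # one mail every 10 seconds, as in the example above
print(token_interval(180, 3600))  # 180 mails per hour on average
```

For instance, an average of 180 mails per hour corresponds to t = 20000000.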

Another use for the tbf_rate function is to limit the total size of mail delivered per given interval of time. To do so, the function must be used in the prog eom handler, because it is the only handler where the entire size of the message is known. The n argument must contain the number of bytes in the email (or email bytes * number of recipients), and t must be set to the number of microseconds per byte that a given user is allowed to send. The m argument must be large enough to accommodate a couple of large emails. E.g.:

  prog eom
  do
    if not tbf_rate("$f-$client_addr",
                    message_size(current_message()),
                    1000000/10240,  # About 97 microseconds per byte, i.e. roughly 10 KB/sec
                    10*1024*1024)   # Bursts of up to 10 MB
      tempfail 450 4.7.0 "Data sending rate exceeded.  Try again later"
    fi
  done

See Rate limiting functions, for more information about the rateok and tbf_rate functions.

