
Amavisd-new Update Version 2.6.1

                  amavisd-new-2.6.1 release notes  

BUG FIXES

- avoid a bounce-killer’s false positive when a message is multipart/mixed

with an attached message/rfc822 (looking like a qmail or a MSN bounce)

and having attached a message with a foreign Message-ID – by restricting

the check to messages with an empty sender address or a ‘postmaster’ or

‘MAILER-DAEMON’ author address;

- privileges were dropped too early when chrooting, causing chroot to fail

(a workaround was to specify a jail directory through a command line

option -R); reported by Helmut Schneider;

- fix unwarranted ‘run_av error: Exceeded allowed time’ error when using

a virus scanner Mail::ClamAV; reported by Chaminda Indrajith;

- fix a bug in helper-progs/amavis-milter.c where atoi could be reading

from a non-null terminated string which could result in wrong milter

return status, or even cause a read-access violation;

reported by Shin-ichi Nagamura;

- dsn_cutoff_level was ignored if SpamAssassin was not invoked (e.g. on

large messages) even if recip_score_boost was nonzero, causing a DSN

not to be suppressed for internally generated large score values;

reported by Bernd Probst;

- add back the 'Ok, id=…, from MTA(…):' prefix to MTA status responses

on forwarded mail when generating own SMTP status response (it was lost

in code transition from 2.5.4 to 2.6.0); reported by Thomas Gelf;

- replaced ‘-ErrFile=>*STDOUT’ with ‘-ErrFile=>\*STDOUT’ in a call to

BerkeleyDB::Env::new in amavisd-nanny and amavisd-agent; seems it

was failing in some setups (even though it was in accordance with

a BerkeleyDB module documentation); reported by Leo Baltus;

- README.sql-mysql: fixed a SQL data type mismatch between maddr.id (used as

a foreign key) and msgs.sid & msgrcpt.rid; they all should be of the same

type, either integer unsigned or bigint unsigned; a schema as published

in README.sql-mysql could not be built because of a conflict in a data

type; reported by Leonardo Rodrigues Magalhães and Zhang Huangbin;

NEW FEATURES

- recognize an additional place-holder %P in a template used to build

a file name in file-based quarantining, for example:

  $spam_quarantine_method = 'local:Week%P/spam/%m.gz';

A %P is replaced by a current partition tag, which makes it easier to

better organize a file-based quarantine by including a partition tag

(e.g. an ISO week number) in a file name or a file path.

For the record, here is a complete list of place-holders currently

recognized in filename templates:

  %P  =>  $msginfo->partition_tag
  %b  =>  $msginfo->body_digest
  %m  =>  $msginfo->mail_id
  %n  =>  $msginfo->log_id
  %i  =>  iso8601 timestamp of a message reception time by amavisd
  %%  =>  %

The following example organizes spam quarantine into weekly subdirectories:

  cd /var/virusmails
  mkdir -p W23/spam W24/spam W25/spam …       (weeks 01..53)
  chown -R vscan:vscan W23 W24 W25 …          (weeks 01..53)

amavisd.conf:
  $spam_quarantine_method = 'local:W%P/spam/%m.gz';
  $sql_partition_tag =
    sub { my($msginfo)=@_; sprintf("%02d",iso8601_week($msginfo->rx_time)) };

- add a macro %P as a synonym for a macro ‘partition_tag’, mainly for

completeness with the added place-holder %P in a file name template;
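
The place-holder substitution described above can be illustrated with a small Python sketch. It is not the amavisd implementation (which is written in Perl); the function name expand_quarantine_template, its keyword arguments, the compact timestamp format used for %i, and the choice to leave unknown place-holders untouched are all assumptions made for this example.

```python
import re
from datetime import datetime, timezone


def expand_quarantine_template(template, *, partition_tag, body_digest,
                               mail_id, log_id, rx_time):
    """Expand %P, %b, %m, %n, %i and %% in a quarantine file name template.

    rx_time is a POSIX timestamp. The exact %i format is an assumption.
    """
    values = {
        "P": str(partition_tag),
        "b": body_digest,
        "m": mail_id,
        "n": log_id,
        "i": datetime.fromtimestamp(rx_time, tz=timezone.utc)
                     .strftime("%Y%m%dT%H%M%SZ"),
        "%": "%",
    }
    # Assumption: a place-holder that is not in the list is kept verbatim.
    return re.sub(r"%(.)",
                  lambda m: values.get(m.group(1), m.group(0)),
                  template)
```

With partition_tag set to "25" and mail_id set to "aX3C4f6btXgX", the template 'W%P/spam/%m.gz' expands to W25/spam/aX3C4f6btXgX.gz.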

OTHER

- disabled a do_ascii decoder in the default @decoders list:

  # ['asc',  \&Amavis::Unpackers::do_ascii],
  # ['uue',  \&Amavis::Unpackers::do_ascii],
  # ['hqx',  \&Amavis::Unpackers::do_ascii],
  # ['ync',  \&Amavis::Unpackers::do_ascii],

The do_ascii is invoking a module Convert::UUlib which in turn calls

a troublesome library uulib, which has a history of security problems

and on occasion misinterprets a text file as some encoded text, causing

false positives (e.g. making it look like an executable);

recent false positive on base64-decoding reported by Jeffrey Arbuckle;

recent DoS (looping in uulib) reported by Thomas Ritterbach;

- added a rule into $map_full_type_to_short_type_re to cope with another

example of misclassification by a file(1) utility, where a plain text

file is considered a DOS executable:

[qr/^DOS executable \(COM\)/ => 'asc'], # misclassified?

An example was provided by Leonardo Rodrigues Magalhães;

- until the issue is better understood, revert the use of ‘my_require’

and go back to the standard but less informative ‘require’; some people

were reporting problems with my_require (loading of some Perl modules can

fail, apparently depending on a current directory where amavisd is started

from); reports by Tuomo Soini, Max Matslofva, Bill Landry;

- use the $myproduct_name value in generated Received header field

instead of a hard-wired ‘amavisd-new’; suggested by Thomas Gelf;

- added missing required header fields to some test mail messages in a

directory test-messages to quench down a complaint about a bad header;

- changed SQL default clauses in %sql_clause (upd_msg, sel_quar, sel_penpals)

to always join tables using both the partition_tag and the mail_id fields,

previously just the mail_id field was used in a join. The change has no

particular effect (and is not really necessary) on existing 2.6.0 databases

where a primary key is mail_id (it is just a redundant extra condition),

but saves a day when a primary key is a composite: (partition_tag,mail_id),

which may be a requirement of a SQL partitioning mechanism.

Thanks to Thomas Gelf for his testing of MySQL partitioning, reporting

deficiency in amavisd SQL schema (primary keys) which did not meet MySQL

requirements for partitioning, and suggestions;

- an AM.PDP release request can specify an additional optional attribute:

partition_tag=xx

where a requester can supply a partition_tag value of a message to be

released. This helps to uniquely identify a message in case where an SQL

database did not enforce a mail_id field to be unique (as may be necessary

with some partitioning schemes).

If a partition_tag information is readily available to a requester, it

is advised that the attribute is included in a request even if mail_id

is known to be unique. This may expedite a search and provide a double

check to a validity of a request. For backwards compatibility amavisd

performs a query on msgs.mail_id for a partition_tag value if it is

missing from a request; the query uses an SQL clause in a new entry

$sql_clause{'sel_msg'}. If exactly one record matches, then everything

is fine, and releasing may proceed. If multiple records with the same

mail_id exist the release request is aborted with a message asking user

to supply a disambiguating partition_tag=xx attribute;

- a quarantine id for an SQL-quarantined message as logged in a main

log entry is changed from:

quarantine: aX3C4f6btXgX

to:

quarantine: aX3C4f6btXgX[25]

i.e. a partition_tag in brackets is appended to mail_id.

Correspondingly the amavisd-release is also changed to be able to parse

'aX3C4f6btXgX[25]', splitting it into mail_id and partition_tag, and

providing each as a separate attribute in an AM.PDP release request;

- README.sql-mysql: changed SQL datatype VARCHAR into VARBINARY for

data fields mail_id, secret_id and quar_loc, and CHAR into BINARY for

msgs.content and msgs.quar_type to preserve case sensitivity on string

comparison operators; suggested by Thomas Gelf;

The same change should eventually be done on README.sql-pg too, but as

PostgreSQL is more picky than MySQL on matching a field data type to a

supplied data value, the change of a data type would need to be reflected

in SQL calls in amavisd. This will have to wait until some future version

of amavisd-new, having to undergo more testing than I have available

before the 2.6.1 release.
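
The logged quarantine id format 'mail_id[partition_tag]' described above lends itself to a simple split. The following Python sketch shows one way a release tool could separate the two parts and turn them into request attributes. It is only an illustration and not the code used by amavisd-release; the names split_quarantine_id and release_attributes are hypothetical.

```python
import re

# mail_id, optionally followed by a partition tag in square brackets
_QUARANTINE_ID = re.compile(r"^([^\[\]]+)(?:\[([^\[\]]*)\])?$")


def split_quarantine_id(quar_id):
    """Return (mail_id, partition_tag); partition_tag is None if absent."""
    m = _QUARANTINE_ID.match(quar_id)
    if m is None:
        raise ValueError("unrecognized quarantine id: %r" % (quar_id,))
    return m.group(1), m.group(2)


def release_attributes(quar_id):
    """Build the attributes of a release request as a dict (hypothetical)."""
    mail_id, partition_tag = split_quarantine_id(quar_id)
    attrs = {"mail_id": mail_id}
    if partition_tag is not None:
        # supplied as a separate attribute, as described in the notes
        attrs["partition_tag"] = partition_tag
    return attrs
```

An id without a bracketed part yields only a mail_id attribute, which corresponds to the backwards-compatible case where amavisd has to look up the partition_tag by itself.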

Background information on UNIQUE constraint in SQL table msgs

Amavisd does not know and need not be aware of what is a primary

key or what are UNIQUE constraints in SQL table msgs. When generating

a mail_id for a message being processed, amavisd tries to INSERT

a record with a randomly generated mail_id into table msgs (using

SQL clause in $sql_clause{'ins_msg'}). If the operation fails,

another mail_id is generated and the attempt is repeated, until it eventually

succeeds. Thus it depends entirely on SQL’s decision whether a

particular record is allowed or would break some UNIQUE constraint.

So, by only changing a declaration on table msgs (PRIMARY KEY or

adding a CONSTRAINT), it changes what keys amavisd will be allowed

to insert and what kind of duplicates would be allowed.

Classically the msgs.mail_id is a PRIMARY KEY and as such it is unique.

This was a requirement for versions of amavisd up to and including 2.6.0.

Starting with 2.6.1 the JOINs have been tightened to include a

partition_tag besides mail_id in a relation, which makes it possible

to loosen a unique requirement on msgs.mail_id and only require a

pair (partition_tag,mail_id) to be unique. In other words, this way

the mail_id is only needed to be unique within each partition tag value.

This change allows a partitioning scheme to meet requirements on

MySQL partitioning. For non-partitioned databases the change shouldn’t

make any difference, and one is free to choose between having mail_id

unique across the entire table or just within each partition_tag value.

Changing a primary key to (partition_tag,mail_id) brings consequences

to quarantining, in particular to releasing from a SQL quarantine,

where it no longer suffices to specify mail_id=xxx in AM.PDP request,

but may be necessary to specify also a partition_tag=xx to distinguish

between SQL-quarantined messages which happen to have the same mail_id.
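
The insert-and-retry behaviour described in this background section can be tried out with a small Python sketch using an in-memory SQLite database. This is only an illustration under stated assumptions: amavisd itself is written in Perl, the table below carries just the two key columns, and the helper names insert_with_fresh_mail_id and demo are hypothetical. The point is that the table declaration alone decides which duplicates are rejected.

```python
import secrets
import sqlite3


def insert_with_fresh_mail_id(conn, partition_tag,
                              new_id=lambda: secrets.token_urlsafe(9),
                              max_tries=100):
    """Insert a msgs record, generating new mail_id values until SQL accepts one."""
    for _ in range(max_tries):
        mail_id = new_id()
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute(
                    "INSERT INTO msgs (partition_tag, mail_id) VALUES (?, ?)",
                    (partition_tag, mail_id))
            return mail_id
        except sqlite3.IntegrityError:
            continue  # the database rejected a duplicate key, try another id
    raise RuntimeError("could not find an unused mail_id")


def demo():
    conn = sqlite3.connect(":memory:")
    # composite primary key: mail_id only needs to be unique per partition tag
    conn.execute("CREATE TABLE msgs (partition_tag INTEGER NOT NULL,"
                 " mail_id TEXT NOT NULL,"
                 " PRIMARY KEY (partition_tag, mail_id))")
    ids = iter(["aaa", "aaa", "bbb"])
    first = insert_with_fresh_mail_id(conn, 25, new_id=lambda: next(ids))
    # "aaa" collides within partition 25, so the retry picks "bbb"
    second = insert_with_fresh_mail_id(conn, 25, new_id=lambda: next(ids))
    # the same mail_id is accepted under a different partition tag
    third = insert_with_fresh_mail_id(conn, 26, new_id=lambda: "aaa")
    return first, second, third
```

Declaring mail_id alone as the primary key in the CREATE TABLE statement would make the third insert fail as well, which corresponds to the classical pre-2.6.1 schema.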

---------------------------------------------------------------------------

April 23, 2008

amavisd-new-2.6.0 release notes

MAIN NEW FEATURES SUMMARY

- integrated DKIM signing and verification; see section

A QUICK START TO DKIM SIGNING by the end of this release note;

- loading of policy banks based on valid DKIM-signed author’s address

can be used for reliable whitelisting, for bypassing banned checks, etc.

- bounce killer feature: uses a pen pals SQL lookup to check inbound DSN;

- SQL logging and quarantining tables have a new field ‘partition_tag’;

- captures SpamAssassin logging, more flexibility specifying SA log areas;

- collects and logs SpamAssassin timing breakdown report (requires SA 3.3);

- releasing from a quarantine can push a released message to an attachment;

- new experimental code for abuse reporting using formats: ARF/attach/plain;

- TLS support on the SMTP client and server side;

- connection caching by a SMTP client;

- amavisd-nanny and amavisd-agent now re-open a database on amavisd restarts;

- amavisd-nanny and amavisd-agent new command line option: -c count;

- updated p0f-analyzer.pl to support source port number in queries;

- amavisd can send queries either to p0f-analyzer.pl or directly to p0f;

COMPATIBILITY WITH 2.5.4

- when using SQL for logging (e.g. for a pen pals feature) or for

quarantining, SQL tables maddr, msgs, msgrcpt and quarantine need

to be extended by a new field ‘partition_tag’; see below for details;

- when SQL logging (pen pals) or SQL lookups are used, one can choose a

binary or a character data type for fields users.email, mailaddr.email,

and maddr.email; now may be a good opportunity to change a data type

to binary (string of bytes); see below for details;

- when using SQL for logging, a default for $sql_clause{'upd_msg'}

has changed, so if a configuration file replaces this SQL clause

by a non-default setting, it needs to be updated;

- perl module Mail::DKIM is now required when DKIM verification or signing

is enabled or when spam checking by SpamAssassin is used and a DKIM plugin

is enabled; a required version of this module is 0.31 (or later);

- because privileges are now dropped sooner, pid and lock files as

generated by Net::Server can no longer be located in a directory which

is not writable by UID under which amavisd is running (e.g. /var/run).

A location of these files is controlled by $pid_file and $lock_file

settings, and by default are placed in $MYHOME, which still satisfies

the new requirement;

- white and blacklisting now takes into account both the SMTP envelope

sender address, as well as the author address from a header section

(address(es) in a ‘From:’ header field). Note that whitelisting

based only on a sender-specified address is mostly useless nowadays.

For a reliable whitelisting see @author_to_policy_bank_maps below,

as well as a set of whitelisting possibilities in SpamAssassin (based

on DKIM, SPF, or on Received header fields);

- if using custom hooks, some of the internal functions have changed,

in particular the semantics of a method orig_header_fields – use new

functions get_header_field() or get_header_field_body() instead;

see updated sample code amavisd-custom.conf, and see entries labeled

‘internal’ below;

- a configuration variable $append_header_fields_to_bottom is now obsolete;

the variable is still declared for compatibility with old configuration

files, but its value is ignored: new header fields are always prepended,

i.e. added to the top of a header section;

- semantics of a command line option ‘debug-sa’ has changed due to a merge

of SpamAssassin logging with a mainstream amavisd logging mechanism.

A command ‘amavisd debug-sa’ is now equivalent to ‘amavisd -d all’ with

an implied redirection of all logging to stderr. Previously it only rerouted

SpamAssassin logging to stderr but did not affect normal amavisd logging,

which still followed the usual $DO_SYSLOG and $LOGFILE settings.

Also, a SpamAssassin log level ‘info’ is now turned on by default (as was

previously achievable by a command line option ‘-d info’), and shows merged

with a normal amavisd logging at level 1 or higher.

The following table shows mapping of SpamAssassin log levels to amavisd

log levels, and for completeness also shows mapping of amavisd log levels

to syslog priorities (which has not changed since previous version):

  SA     amavisd  syslog
  -----  -------  -----------
            -3    LOG_CRIT
            -2    LOG_ERR
  error     -1    LOG_WARNING
  warn       0    LOG_NOTICE
  info       1    LOG_INFO
             2    LOG_INFO
  dbg        3    LOG_DEBUG
             4    LOG_DEBUG
             5    LOG_DEBUG

- an additional requirement for loading a policy bank ‘MYUSERS’ is that

‘originating’ flag must be on, which typically means that mail must

be coming from internal networks or from authenticated roaming users

to be able to load a policy bank ‘MYUSERS’;
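
The mapping of SpamAssassin log levels to amavisd log levels and syslog priorities given in the compatibility notes above can be written down as two lookup tables. The Python sketch below is merely a restatement of that table; the names are hypothetical, syslog priorities are represented as plain strings, and clamping of out-of-range levels is an assumption of the sketch.

```python
# SpamAssassin log level -> amavisd log level (from the table in the notes)
SA_TO_AMAVISD_LEVEL = {"error": -1, "warn": 0, "info": 1, "dbg": 3}

# amavisd log level -> syslog priority (from the table in the notes)
AMAVISD_LEVEL_TO_SYSLOG = {
    -3: "LOG_CRIT",
    -2: "LOG_ERR",
    -1: "LOG_WARNING",
    0: "LOG_NOTICE",
    1: "LOG_INFO",
    2: "LOG_INFO",
    3: "LOG_DEBUG",
    4: "LOG_DEBUG",
    5: "LOG_DEBUG",
}


def syslog_priority(amavisd_level):
    """Map an amavisd log level to a syslog priority name."""
    # Assumption: levels outside -3..5 are clamped to the nearest table entry.
    clamped = max(-3, min(5, amavisd_level))
    return AMAVISD_LEVEL_TO_SYSLOG[clamped]


def sa_syslog_priority(sa_level):
    """Map a SpamAssassin log level name to a syslog priority name."""
    return syslog_priority(SA_TO_AMAVISD_LEVEL[sa_level])
```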

BUG FIXES

- run_av: limit the number of filenames given as arguments to a command

line scanner to stay within a safe (POSIX) program argument space limit,

run a command line scanner multiple times if necessary. This required

a larger change in the program (run_av, ask_av) which is why the fix

was listed for a long time on a TODO list and not implemented so far.

The problem affected command line virus scanners which are unable to

traverse a directory by themselves and need a list of filenames as

arguments (such as KasperskyLab’s aveclient and kavscanner, MkS_Vir mks,

and CyberSoft VFind). Actual problem reported by Danny Richter;
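
The idea behind this fix (splitting a long list of file names over several scanner invocations) can be sketched in Python as follows. This is not the run_av code; the function name batch_filenames and its parameters are hypothetical, the size estimate counts each argument plus a terminating NUL byte, and a single over-long name is simply placed in a batch of its own.

```python
def batch_filenames(filenames, base_command, max_bytes):
    """Yield lists of file names so that each command line fits in max_bytes.

    base_command is the list of fixed arguments (program name and options)
    that is repeated on every invocation.
    """
    def size(arg):
        return len(arg.encode()) + 1  # argument bytes plus NUL terminator

    base = sum(size(a) for a in base_command)
    batch, used = [], base
    for name in filenames:
        need = size(name)
        if batch and used + need > max_bytes:
            yield batch          # current batch is full, start a new one
            batch, used = [], base
        batch.append(name)
        used += need
    if batch:
        yield batch
```

A caller would then run the scanner once per yielded batch and combine the results of all runs.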

NEW FEATURES

- DKIM signing and verification – see below: A QUICK START TO DKIM SIGNING.

Remember to upgrade Mail::DKIM to 0.31 (or later) and to add the

following to amavisd.conf:

$enable_dkim_verification = 1;

$enable_dkim_signing = 1;

- SQL tables maddr, msgs, msgrcpt and quarantine are extended by

a new field ‘partition_tag’. When amavisd creates new records in these

tables, a current value of a configuration variable $sql_partition_tag

(or its value from policy banks) is written into ‘partition_tag’ fields.

An undefined value translates to 0. The ‘partition_tag’ field is usually

declared in a schema as an integer, but in principle could be any data

type, such as a string.

A value of ‘partition_tag’ field may be used to speed up purging of

old records by using partitioned tables (MySQL 5.1 +, PostgreSQL 8.1 +).

A sensible value is a week number of a year, or some other slowly changing

value, allowing to quickly drop old table partitions without wasting

time on deleting individual records. Records in all tables carrying the

‘partition_tag’ field are self-contained within each value of a field.

In other words, foreign keys never reference a record in a subordinate

table with a value of a ‘partition_tag’ field different from the referencing

record. Consequently, mail addresses in table maddr are also self-contained

within a partition tag, implying that the same mail address may appear in

more than one maddr partition (using different ‘id’s), and that tables

msgs and msgrcpt are guaranteed to reference a maddr.id within their own

partition tag. Too fine a granularity of partition tags (e.g. changing a

value daily) wastes space in table maddr by storing multiple copies of

the same mail address.

The $sql_partition_tag may be a scalar (usually an integer or a string),

or a reference to a subroutine, which will be called with an object of

type Amavis::In::Message as argument (giving access to information about

a message being processed), and its result will be used as a partition

tag value. Possible/typical usage (in amavisd.conf):

$sql_partition_tag =

sub { my($msginfo)=@_; iso8601_week($msginfo->rx_time) };

yields an ISO 8601 week number (1..53) corresponding to a mail reception

timestamp in a local time zone.

Another possible use of ‘partition_tag’ field is to let a policy bank set

its specific value (a fixed value or a subroutine) for $sql_partition_tag.

This would allow for example labeling of SQL records for mail originating

from inside with a different partition_tag value, compared to entries for

incoming mail, and consequently let them be stored in a separate partition

if desired.

Amavisd process itself does not use the ‘partition_tag’ field for its

own purposes, all records regardless of their ‘partition_tag’ value

are available for example to pen pals lookups, as before. The field is

provided only as a convenience to SQL database maintenance, and can be

ignored by smaller sites where current practice of database maintenance

is fast enough. If SQL partitioning is not in use (or not intended to

be used in a near future), it is more economical to use a fixed value

(such as 0, which is a default) for the $sql_partition_tag. Using week

numbers as partition tags adds about 50 % to the number of records in

table maddr, the exact number depends on retention period and a ratio

of regular vs. infrequent mail addresses observed.

To convert tables of an existing database, please use ALTER command.

Here is a conversion example (MySQL or PostgreSQL, probably others):

ALTER TABLE maddr ADD partition_tag integer DEFAULT 0;

ALTER TABLE msgs ADD partition_tag integer DEFAULT 0;

ALTER TABLE msgrcpt ADD partition_tag integer DEFAULT 0;

ALTER TABLE quarantine ADD partition_tag integer DEFAULT 0;

As the maddr.email is no longer guaranteed to be unique, but a pair

of (maddr.partition_tag, maddr.email) is unique, the constraint and

an associated index needs to be changed:

=> PostgreSQL:

ALTER TABLE maddr

DROP CONSTRAINT maddr_email_key,

ADD CONSTRAINT maddr_email_key UNIQUE (partition_tag,email);

=> MySQL:

ALTER TABLE maddr

DROP KEY email,

ADD UNIQUE KEY part_email (partition_tag,email);

Should a need arise to revert to amavisd-new-2.5.4 while keeping the new

partition_tag field, the ‘SELECT id FROM maddr …’ may become slow due to

dropped index on a field maddr.email, which is replaced by an index on a

pair (maddr.partition_tag, maddr.email). The following change to amavisd

2.5.4 solves the problem:

@@ -901,2 +901,2 @@
   'sel_adr' =>
-    'SELECT id FROM maddr WHERE email=?',
+    'SELECT id FROM maddr WHERE partition_tag=0 AND email=?',

The use of partitioned tables to speed up purging of old records was

suggested by Robert Pelletier.
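
For readers who want to check what a week-based partition tag looks like for a given reception time, here is a small Python sketch of the computation. It only mirrors the idea of the Perl iso8601_week() example above; the function names are hypothetical, and the time zone is passed explicitly (None means the local time zone, as in the notes).

```python
from datetime import datetime, timezone


def iso8601_week_tag(rx_time, tz=None):
    """Return the ISO 8601 week number (1..53) of a POSIX timestamp.

    tz=None uses the local time zone; pass timezone.utc for UTC.
    """
    return datetime.fromtimestamp(rx_time, tz=tz).isocalendar()[1]


def padded_week_tag(rx_time, tz=None):
    """Two-digit form, as produced by sprintf("%02d", ...) in amavisd.conf."""
    return "%02d" % iso8601_week_tag(rx_time, tz)
```

Note that the first days of January can belong to week 52 or 53 of the previous ISO year, so a tag derived this way wraps around slightly off the calendar year boundary.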

- when SQL logging (pen pals) or SQL lookups are used, one can choose a

binary or a character data type for fields users.email, mailaddr.email,

and maddr.email; now may be a good opportunity to change a data type

to binary (string of arbitrary bytes, no character set associated).

Background: values of these fields come from SMTP envelope or from a

mail header section of processed mail. Even though RFC 2821 and RFC 2822

restrict these addresses to 7-bit ASCII, there is nothing preventing

a malicious or misguided sender from supplying any 8-bit byte values.

If SQL fields are declared as VARCHAR or CHAR, a character set is

associated with data and its rules apply, e.g. control characters may

not be permitted, or UTF-8 byte sequences are validated, or a restriction

to codes below 128 apply. Depending on strictness of an SQL server on

validating data, a violation of character set rules may lead to aborting

an SQL operation and failing of mail processing. Even though new standards

for e-mail addresses are being negotiated allowing for UTF-8 encoding, an

actual e-mail address may still supply arbitrary bytes, which may violate

UTF-8 byte sequence rules.

A new configuration variable $sql_allow_8bit_address now controls how

amavisd passes e-mail addresses to SQL.

If a value is true, then it is expected that SQL tables will accept

strings of arbitrary bytes for these fields, without associating a

character set with data. No data sanitation is done by amavisd. An

appropriate SQL data type is ‘VARBINARY’ or with PostgreSQL a ‘BYTEA’.

If a value of $sql_allow_8bit_address is false (which is a default for

compatibility) then amavisd performs sanitation before passing data to SQL:

control characters and characters with codes above 127 are converted to ‘?’,

which brings strings within ASCII character set restrictions. A suitable

SQL data type is VARCHAR or CHAR. Note that some information is lost in

this case.

The following clauses can convert pre-2.6.0 tables into the now preferred

and more universal form:

MySQL:

ALTER TABLE users CHANGE email email varbinary(255);

ALTER TABLE mailaddr CHANGE email email varbinary(255);

ALTER TABLE maddr CHANGE email email varbinary(255);

PostgreSQL:

ALTER TABLE users ALTER email TYPE bytea USING decode(email,'escape');

ALTER TABLE mailaddr ALTER email TYPE bytea USING decode(email,'escape');

ALTER TABLE maddr ALTER email TYPE bytea USING decode(email,'escape');

If a binary data type is chosen for these three fields, the setting

$sql_allow_8bit_address MUST be set to true to let the amavisd program

use the appropriate data type in SQL commands, otherwise PostgreSQL will

complain with:

‘types bytea and character varying cannot be matched’

when amavisd tries to execute SQL commands. MySQL is more forgiving and

does not complain about a data type mismatch, so one may get away with a

mismatch, although it is appropriate to eventually make it right.

If a change of a data type of these fields is chosen while using some

third-party management interface to SQL data set (e.g. MailZu), make sure

the management interface supports the changed data type. This is primarily

a concern with PostgreSQL which is more strict in requiring a match

between field data types in tables and data in SQL clauses.

The need for a change was pointed out by Xavier Romero, reporting that

PostgreSQL SQL lookups with pre-2.6.0 versions of amavisd can fail when

8-bit data appears in SMTP envelope addresses:

lookup_sql: sql exec: err=7, 22021, DBD::Pg::st execute failed: ERROR:

invalid byte sequence for encoding "UTF8"
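
The sanitation applied when $sql_allow_8bit_address is false (control characters and codes above 127 replaced by '?') can be sketched in Python as follows. This illustrates the rule as stated in the notes and is not the amavisd code; the function name sanitize_address is hypothetical, and treating code 127 (DEL) as a control character is an assumption.

```python
def sanitize_address(addr: bytes, allow_8bit: bool = False) -> bytes:
    """Return the address bytes as they would be handed to SQL.

    allow_8bit=True  passes arbitrary bytes through (binary SQL column).
    allow_8bit=False replaces control characters and bytes above 127
                     with '?', leaving printable ASCII only.
    """
    if allow_8bit:
        return addr
    return bytes(b if 32 <= b <= 126 else ord("?") for b in addr)
```

As the notes point out, the replacement loses information: two different 8-bit addresses can map to the same sanitized string.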

- bounce killer feature: uses a pen pals SQL lookup to check inbound DSN,

attempting to match it with a previous outbound message. If a Message-ID

found in an attachment of the inbound DSN matches a Message-ID of a

message previously sent from our system by a current recipient of the DSN,

the DSN message is spared, otherwise it receives $bounce_killer_score

spam score points (0 by default, i.e. disabled) and can be blocked as

spam (although technically it is just a misdirected bounce, not spam).

A received delivery status notification is parsed looking for attached

header section of an original message in an attempt to find a Message-ID.

A standard DSN structure (RFC 3462, RFC 3464) is recognized, as well as

a few nonstandard but common formats. Other automatic reports and bounces

with unknown structure and no attached header section are ignored for

this purpose (are subject to other regular checks). Unfortunately there

are still many nonstandard mailers around (12+ years after DSN format

standardization) and many ad-hoc filtering solutions which do not supply

the essential information.

If a Message-ID can be found in an SQL log database matching a previous

message sent by a local user (which is now a recipient of a DSN),

using a normal pen pals lookup (no extra SQL operations are necessary),

or if domain part of the Message-ID is one of local domains, then the

DSN message is considered a genuine bounce, is unaffected by this check

and passes normally (subject to other checks).

On the other hand, if the attached DSN header does supply a Message-ID

but it does not meet the above pen pals matching criteria, then it is

assumed that the message is a backscatter to a faked address belonging

to our local domains, and $bounce_killer_score spam score points are

added, so the message can be treated as spam (subject to spam kill level

and other spam settings).

The only user-configurable setting is $bounce_killer_score (also member

of policy banks), its default value is 0. To activate the bounce killer

feature set the $bounce_killer_score to a positive number, e.g. 100.

A pre-requisite is a working SQL logging database (pen pals).

A couple of SNMP-like counters are added to facilitate assessing

effectiveness of the feature (e.g. viewed by amavisd-agent utility):

  InMsgsBounce                      21310   333/h    9.9 % (InMsgs)
  InMsgsBounceKilled                19967   312/h   93.7 % (InMsgsBounce)
  InMsgsBounceRescuedByDomain           7     0/h    0.0 % (InMsgsBounce)
  InMsgsBounceRescuedByOriginating    242     4/h    1.1 % (InMsgsBounce)
  InMsgsBounceRescuedByPenPals         67     1/h    0.3 % (InMsgsBounce)
  InMsgsBounceUnverifiable           1027    16/h    4.8 % (InMsgsBounce)

More information on operations can be obtained from a log, search for:

inspect_dsn:

bounce killed

bounce rescued by penpals

bounce rescued by domain

bounce unverifiable

The feature was suggested by Scott F. Crosby.

See also http://www.postfix.org/BACKSCATTER_README.html,

http://wiki.apache.org/spamassassin/VBounceRuleset and

a SpamAssassin man page Mail::SpamAssassin::Plugin::VBounce

for additional ideas on fighting joe-jobbed backscatter mail.
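
The decision steps of the bounce killer described above can be summarized in a short Python sketch. It is a simplification under stated assumptions, not the inspect_dsn code: the pen pals SQL lookup is replaced by a plain set of Message-IDs previously sent by the DSN recipient, the domain test is an exact match, the 'rescued by originating' case is not modelled, and the names classify_dsn and bounce_score are hypothetical.

```python
def classify_dsn(attached_message_id, recipient_sent_ids, local_domains):
    """Classify an inbound DSN by the Message-ID found in its attachment.

    attached_message_id  Message-ID from the attached header, or None
    recipient_sent_ids   Message-IDs previously sent by the DSN recipient
                         (stands in for the pen pals SQL lookup)
    local_domains        set of lowercase local domain names
    """
    if attached_message_id is None:
        return "unverifiable"            # unknown structure, not judged here
    if attached_message_id in recipient_sent_ids:
        return "rescued by penpals"      # matches an earlier outbound message
    domain = attached_message_id.strip("<>").rpartition("@")[2].lower()
    if domain in local_domains:
        return "rescued by domain"       # Message-ID was generated locally
    return "killed"                      # presumed backscatter


def bounce_score(verdict, bounce_killer_score):
    """Spam score points contributed by the bounce killer check."""
    return bounce_killer_score if verdict == "killed" else 0
```

With the default $bounce_killer_score of 0 the check contributes nothing, which corresponds to the feature being disabled.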

- a new configuration variable @author_to_policy_bank_maps (also a member

of policy banks) is a list of lookup tables (typically only a hash-type

lookup table is used), which maps author address(es) (each address in

a ‘From:’ header field – typically only one) in a mail header section

to one or more policy bank names (a comma-separated list of names).

A match can only occur if a valid DKIM author signature or a valid

DKIM third-party signature is found, so in as much as one can trust the

signing domain, loading of arbitrary policy banks can be safe, offering

a flexibility of whitelisting against spam (absolute or just contributing

score points), bypassing of checks (banned, virus, bad-header), using

less restrictive banned rules for certain senders, by-sender routing,

turning quarantining/archiving on/off, and other tricks offered by the

existing policy bank loading mechanisms.

When a message has a valid DKIM (or DomainKeys) author signature (i.e.

when a ‘From:’ address matches a signing identity according to DKIM

(RFC 4871) or DomainKeys (RFC 4870) rules), a lookup key is an unchanged

author address and the usual lookup rules apply (README.lookups – hash

lookups).

When a valid third-party signature is found, a lookup key is extended

by a ‘/@’ and a lowercased signing domain, as shown in the example below.

The semantics is very similar to a whitelist_from_dkim feature in

SpamAssassin, but is more flexible as it allows any dynamic amavisd

setting to be changed depending on author address, not just skipping

of spam checks.

A few examples of a SpamAssassin’s whitelist_from_dkim (as in local.cf)

along with equivalent amavisd @author_to_policy_bank_maps entries follow.

To whitelist any From address with a domain example.com when a message

has a valid author signature (i.e. a signature by the same domain):

  SA:  whitelist_from_dkim  *@example.com
  am:  'example.com' => 'WHITELIST',

which is equivalent to a lengthy but redundant:

  SA:  whitelist_from_dkim  *@example.com  example.com
  am:  'example.com/@example.com' => 'WHITELIST',

Similar to above, but applies to subdomains of example.com carrying

a valid author signature (i.e. signature BY THE SAME SUBDOMAIN):

  SA:  whitelist_from_dkim  *@*.example.com
  am:  '.example.com' => 'WHITELIST',

Note that in amavisd hash lookups a ‘.example.com’ implies a parent

domain ‘example.com’ too, while in SpamAssassin and in Postfix maps

a parent domain needs its own entry if desired.

To whitelist From addresses from subdomains of example.com which carry

a valid third-party signature of its parent domain:

  SA:  whitelist_from_dkim  *@*.example.com  example.com
  am:  '.example.com/@example.com' => 'WHITELIST',

To whitelist any From address as long as a message has a valid DKIM

or DomainKeys signature by example.com, i.e. a third-party signature.

Typical for mailing lists or discussion groups which sign postings.

  SA:  whitelist_from_dkim  *@*  example.com
  am:  './@example.com' => 'WHITELIST',

Here is a complete example to be included in amavisd.conf:

@author_to_policy_bank_maps = ( {
  # 'friends.example.net'           => 'WHITELIST,NOBANNEDCHECK',
  # 'user1@cust.example.net'        => 'WHITELIST,NOBANNEDCHECK',
  '.ebay.com'                       => 'WHITELIST',
  '.ebay.co.uk'                     => 'WHITELIST',
  'ebay.at'                         => 'WHITELIST',
  'ebay.ca'                         => 'WHITELIST',
  'ebay.de'                         => 'WHITELIST',
  'ebay.fr'                         => 'WHITELIST',
  '.paypal.co.uk'                   => 'WHITELIST',
  '.paypal.com'                     => 'WHITELIST',  # author signatures
  './@paypal.com'                   => 'WHITELIST',  # 3rd-party sign. by paypal.com
  'alert.bankofamerica.com'         => 'WHITELIST',
  'amazon.com'                      => 'WHITELIST',
  'cisco.com'                       => 'WHITELIST',
  '.cnn.com'                        => 'WHITELIST',
  'skype.net'                       => 'WHITELIST',
  'welcome.skype.com'               => 'WHITELIST',
  'cc.yahoo-inc.com'                => 'WHITELIST',
  'cc.yahoo-inc.com/@yahoo-inc.com' => 'WHITELIST',
  'google.com'                      => 'MILD_WHITELIST',
  'googlemail.com'                  => 'MILD_WHITELIST',
  './@googlegroups.com'             => 'MILD_WHITELIST',
  './@yahoogroups.com'              => 'MILD_WHITELIST',
  './@yahoogroups.co.uk'            => 'MILD_WHITELIST',
  './@yahoogroupes.fr'              => 'MILD_WHITELIST',
  'yousendit.com'                   => 'MILD_WHITELIST',
  'meetup.com'                      => 'MILD_WHITELIST',
  'dailyhoroscope@astrology.com'    => 'MILD_WHITELIST',
} );

$policy_bank{'MILD_WHITELIST'} = {
  score_sender_maps => [ { '.' => [-1.8] } ],
};

$policy_bank{'WHITELIST'} = {
  bypass_spam_checks_maps => [1],
  spam_lovers_maps => [1],
};

$policy_bank{'NOVIRUSCHECK'} = {
  bypass_decode_parts => 1,
  bypass_virus_checks_maps => [1],
  virus_lovers_maps => [1],
};

$policy_bank{'NOBANNEDCHECK'} = {
  bypass_banned_checks_maps => [1],
  banned_files_lovers_maps => [1],
};
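
To make the lookup key rules for @author_to_policy_bank_maps more tangible, here is a rough Python sketch of how candidate keys could be derived from an author address and a signing domain. It is a simplified approximation and not amavisd's hash lookup (see README.lookups for the real search order): the functions lookup_keys and policy_banks_for are hypothetical, the order of candidates is an assumption, and the caller is assumed to have already verified the DKIM signature.

```python
def lookup_keys(author, signing_domain=None):
    """Candidate hash keys for an author address with a VALID signature.

    signing_domain=None  means a valid author signature: keys are unchanged.
    Otherwise (third-party signature) each key is extended by '/@' and the
    lowercased signing domain.
    """
    local, _, domain = author.rpartition("@")
    domain = domain.lower()
    suffix = "" if signing_domain is None else "/@" + signing_domain.lower()
    labels = domain.split(".")
    keys = ["%s@%s" % (local, domain), domain]
    # '.example.com' covers example.com itself as well as its subdomains
    keys += ["." + ".".join(labels[i:]) for i in range(len(labels))]
    keys.append(".")  # catch-all entry
    return [k + suffix for k in keys]


def policy_banks_for(author, signing_domain, table):
    """Return the list of policy bank names of the first matching key."""
    for key in lookup_keys(author, signing_domain):
        if key in table:
            return [name.strip() for name in table[key].split(",")]
    return []
```

For example, a message from anyone@elsewhere.test carrying a valid third-party signature by lists.example.org would produce the candidate './@lists.example.org', matching an entry of that form in the table.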

- smtp client connection caching is a new feature which allows smtp client

code in amavisd to keep a SMTP session to MTA open after forwarding a

message or a notification, so that a next mail message that needs to be

sent by this child process can avoid re-establishing a session and the

initial greeting/EHLO (and TLS) handshake.

A current value of a global setting $smtp_connection_cache_enable

controls whether a session will be retained after forwarding a message

or not. Its default initial value is true.

A global setting $smtp_connection_cache_on_demand controls whether amavisd

is allowed to dynamically change the $smtp_connection_cache_enable setting

according to its estimate of the message frequency. The heuristics is

currently very simple: if time interval between a previous task completion

by this child process and the arrival of a current message is 5 seconds

or less, the $smtp_connection_cache_enable is turned on (which will affect

the next message); if the interval is 15 seconds or more, it is turned off.

The default value of the $smtp_connection_cache_on_demand is true, thus

enabling the adaptive behaviour.
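
The adaptive rule described above can be sketched as a small stand-alone Perl routine. This is only an illustration, not the code used inside amavisd: the sub name and its arguments are hypothetical, and only the 5 and 15 second thresholds come from the text. The text does not say what happens between the two thresholds; the sketch assumes the setting is left unchanged.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of amavisd): given the current state of the
# cache-enable flag and the idle time in seconds between the previous task
# completion and the arrival of the current message, return the new state.
sub adjust_cache_enable {
  my ($enabled, $idle_seconds) = @_;
  return 1 if $idle_seconds <= 5;    # busy: turn caching on
  return 0 if $idle_seconds >= 15;   # idle: turn caching off
  return $enabled;                   # in between: assumed to stay unchanged
}

my $enabled = 1;   # the default initial value is true
for my $idle (2, 10, 20, 8, 3) {
  $enabled = adjust_cache_enable($enabled, $idle);
  printf "idle %2d s -> caching %s\n", $idle, $enabled ? 'on' : 'off';
}
```

Since the new state only takes effect for the next message, a single long pause costs at most one re-established session.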

On a busy server the connection caching can save some processing time.

Savings are substantial if client-side TLS is enabled, otherwise just a

few milliseconds are saved. On an idle server the feature may unnecessarily

keep sessions to MTA open (until MTA times them out), so one can disable

the feature by setting both controls to false (to 0 or undef).
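
As an amavisd.conf fragment, disabling the feature as described might look like this (a minimal sketch using only the two settings named above):

```perl
# Disable SMTP client connection caching entirely
$smtp_connection_cache_enable    = 0;  # do not keep sessions to MTA open
$smtp_connection_cache_on_demand = 0;  # do not let amavisd turn it back on

# Alternative, for a permanently busy server: always cache, never adapt
# $smtp_connection_cache_enable    = 1;
# $smtp_connection_cache_on_demand = 0;
```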

To monitor the connection caching effectiveness, some SNMP-like counters

were added, so amavisd-agent may display something like:

  OutConnNew               2764   319/h   98.2 % (OutMsgs)
  OutConnQuit              2521   291/h   89.5 % (OutMsgs)
  OutConnReuseFail            7     1/h    0.2 % (OutMsgs)
  OutConnReuseRecent         21     2/h    0.7 % (OutMsgs)
  OutConnReuseRefreshed      31     4/h    1.1 % (OutMsgs)
  OutConnTransact          2816   325/h  100.0 % (OutMsgs)

- client-side TLS support is added, i.e. on forwarding a passed mail back

to MTA. Currently only encryption is supported, no client certificates

are offered. A $tls_security_level_out is a per-policy-bank setting which

controls client-side TLS; its value is either undefined (default), or a

string:

  undef ..... client-side TLS is disabled (a default setting);
  'may' ..... TLS is used if MTA offers a STARTTLS capability (RFC 3207),
              otherwise a plain text SMTP session is established;
  'encrypt' . TLS is used if MTA offers a STARTTLS capability, otherwise
              amavisd refuses to forward a message.

The client-side TLS imposes some performance penalty on passing a message

back to MTA, although it is still reasonably fast: a benchmark indicates

a drop in transfer rate by about a factor of 2, from 22 MB/s (no TLS)

to 9 MB/s (with TLS). The smtp client connection caching (see previous item)

should preferably be left enabled (permanently or opportunistically), as

TLS negotiation adds significantly to the initial SMTP handshake time.
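
A configuration combining opportunistic client-side TLS with connection caching might look like the following amavisd.conf fragment. It is a sketch only: the choice of 'may' is an example, and the caching lines simply restate the defaults described in the previous item.

```perl
# Use TLS towards the MTA when it offers STARTTLS, otherwise plain text
$tls_security_level_out = 'may';

# Leave connection caching available, so that the cost of a TLS
# handshake is spread over several forwarded messages
$smtp_connection_cache_enable    = 1;
$smtp_connection_cache_on_demand = 1;
```

With 'encrypt' in place of 'may', mail would not be forwarded at all to an MTA that does not offer STARTTLS, so that value is only appropriate when the MTA side is known to support it.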

- server-side TLS support is added, i.e. on accepting mail from MTA.

Encryption is supported, server (i.e. amavisd) offers its certificate,

but client certificates are not verified. A $tls_security_level_in is

a per-policy-bank setting which controls server-side TLS; its value

is either undefined (default), or a string:

  undef ..... server-side TLS is disabled, STARTTLS capability is
              not offered;
  'may' ..... STARTTLS capability is offered by amavisd, but client is
              not required to enter TLS, plain text sessions are permitted;
  'encrypt' . STARTTLS capability is offered and enforced by amavisd,
              any SMTP command other than STARTTLS, NOOP, EHLO or QUIT
              is rejected.

If $tls_security_level_in is enabled (any value other than undef or ‘none’),

amavisd offers a certificate to a connecting client requesting TLS, so a

path to a certificate and to its private key must be provided through two

global settings: $smtpd_tls_cert_file and $smtpd_tls_key_file, e.g.:

  $smtpd_tls_cert_file = "$MYHOME/cert/amavisd-cert.pem";
  $smtpd_tls_key_file  = "$MYHOME/cert/amavisd-key.pem";

The private key should be guarded as secret (not world-readable).

A self-signed certificate is acceptable to most mailers.

Server-side TLS imposes a significant performance penalty on accepting

a message from MTA. A benchmark indicates a drop in transfer rate by a

factor of 10, from about 10 MB/s (no TLS) to 1 MB/s (using TLS), so it

should only be enabled with a good reason or for experimentation.

- enhanced a subroutine delivery_status_notification (along with

dispatch_from_quarantine and msg_from_quarantine) to produce a message

in one of several formats (derived from a message being processed, or

from a quarantined message). Its new arguments can be strings as follows:

  $request_type:  dsn, release, requeue, report
  $msg_format:    dsn, arf, attach, plain, resend
  $feedback_type: abuse, fraud, miscategorized, not-spam, opt-out,
                  opt-out-list, virus, other  (according to ARF draft)

Per-policy settings $report_format and $release_format control the format

of a generated message. Their value can be one of the following strings,

although not all combinations make sense:

  'arf' ..... an abuse report is generated according to
              draft-shafranovich-feedback-report-04: "An Extensible
              Format for Email Feedback Reports"; a plain-text part
              contains text from a template;
  'attach' .. generates a report message as plain text according to
              a template, with an original message attached;
  'plain' ... generates a simple (flat) mail with a single MIME part
              containing a text from a template, followed inline by
              original message (some service providers can't handle
              abuse reports with attachments, e.g. Yahoo!);
  'resend' .. original message is forwarded unchanged, except for an
              addition of header fields Resent-From, Resent-Sender,
              Resent-To, Resent-Date and Resent-Message-ID;
  'dsn' ..... (for internal use) a delivery status notification is
              generated according to RFC 3462, RFC 3464 and RFC 3461;

When a request_type is ‘release’ or ‘requeue’, the format of a generated

message is governed by a per-policy setting $release_format according to

the table above. Only the ‘attach’, ‘plain’ and ‘resend’ values are useful.

A default setting is:

  $release_format = 'resend';  # with alternatives: attach, plain, resend

A plain-text part (if used) is taken from a $notify_release_templ template

and a sending address is obtained from %hdrfrom_notify_release_by_ccat.

When a request_type is ‘report’, the format of a generated message is

governed by a per-policy setting $report_format according to the table

above. Only the following settings are useful: arf, attach, plain, resend.

A default setting is:

  $report_format = 'arf';  # alternatives: arf, attach, plain, resend

A plain-text part (if used) is obtained from a $notify_report_templ

template, and a sending address from %hdrfrom_notify_report_by_ccat.
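
As an illustration of non-default choices, the two settings could be combined in amavisd.conf as follows. The particular values are just an example picked from the table above, not a recommendation.

```perl
# Released mail is delivered as a message/rfc822 attachment to a cover note
$release_format = 'attach';

# Abuse reports are sent as flat text, for recipients whose abuse desk
# cannot handle attachments
$report_format = 'plain';
```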

It is possible to automatically generate abuse reports from custom hooks

by calling delivery_status_notification() and mail_dispatch(). Extreme

care must be taken to only produce legitimate abuse reports (about true

fraud and true spam), sent only to parties that are truly responsible for

a message being reported. Non-repudiation is a key factor here – trust

only header fields covered by a valid DKIM signature, or generated by

your own MX MTA (such as an IP address of the last hop), and only report

messages received from a network which officially belongs to the party

(according to whois). Rate-limiting should be used, and abuse reports on

the same abuser should only be sent once in a time interval of several

hours. A SQL database can be used to maintain a list of recently reported

abusers, thus preventing excessive reports.
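
The rate-limiting idea can be sketched in a few lines of stand-alone Perl. Everything here is hypothetical and independent of amavisd internals: the sub name, the six-hour interval and the in-memory hash are assumptions made for illustration. Because each amavisd child process has its own memory, a real setup would keep this table in a shared store such as the SQL database mentioned above.

```perl
use strict;
use warnings;

my $min_interval = 6 * 3600;  # assumed: one report per abuser per 6 hours
my %last_reported;            # abuser key (e.g. signing domain) => epoch time

# Hypothetical helper: returns true and records the time if a report about
# $abuser may be sent at time $now; returns false if the previous report
# about the same abuser was sent less than $min_interval seconds ago.
sub may_report {
  my ($abuser, $now) = @_;
  my $key  = lc $abuser;
  my $last = $last_reported{$key};
  return 0 if defined $last && $now - $last < $min_interval;
  $last_reported{$key} = $now;
  return 1;
}

my $t0 = 1_000_000;  # arbitrary starting time for the demonstration
for my $offset (0, 3600, 7 * 3600) {
  printf "t+%5d s: %s\n", $offset,
    may_report('Example.COM', $t0 + $offset) ? 'send report' : 'suppress';
}
```

A suppressed attempt does not refresh the timestamp, so a steady stream of mail from one abuser still yields one report per interval rather than none.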

- introduced a variation of a message release from a quarantine, allowing

a releaser to send an abuse report based on a quarantined message.

It is implemented by:

* enhancing a subroutine delivery_status_notification as described

in the previous item;

* extending AM.PDP protocol with a ‘request=report’ attribute

which can be used in place of a ‘request=release’;

* enhancing the ‘amavisd-release’ utility program to allow sending an

attribute ‘request=release’ or ‘request=requeue’ or ‘request=report’

based on its program name. By making a soft or hard link named

‘amavisd-report’ linking to ‘amavisd-release’, the utility will

send a ‘request=report’ in place of the usual ‘request=release’,

e.g.:

  # ln -s amavisd-release amavisd-report
  # ln -s amavisd-release amavisd-requeue
  $ amavisd-report spam/k/kg2P0rP9Lpu3.gz '' abuse@example.com
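
Choosing the request type from the name under which a program was invoked is a common technique; a minimal sketch of it is shown here. This is not the code from amavisd-release itself, and the sub name is hypothetical.

```perl
use strict;
use warnings;
use File::Basename qw(basename);

# Hypothetical helper: map the name a program was started under to the
# AM.PDP request type it should send.
sub request_type_for {
  my ($program_name) = @_;
  my $name = basename($program_name);
  return 'report'  if $name eq 'amavisd-report';
  return 'requeue' if $name eq 'amavisd-requeue';
  return 'release';  # default, also used for 'amavisd-release'
}

# $0 holds the name the running script was invoked as
print request_type_for($0), "\n";
```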

- releasing from a quarantine can push a released message to an attachment

(Content-Type: message/rfc822), with a configurable template for a header

section and the plain-text part; select by: $release_format='attach';

suggested by Patrick Ben Koetter;

- detect and save a new attribute SOURCE from an XFORWARD smtp command;

the value is also accepted as AM.PDP protocol attribute ‘client_source’.

Possible values are: ‘LOCAL’, ‘REMOTE’, or ‘[UNAVAILABLE]’; the information

corresponds to ‘local_header_rewrite_clients’ postfix setting and is

not supposed to be used for security decisions according to Postfix

documentation (which makes it less interesting for our purposes);

- added client and server support for a PORT attribute of an XFORWARD command,

allowing MTA to pass a TCP port number of a remote client to a content

filter (and back if necessary); the PORT attribute is made available

with Postfix version 2.5 (20071004); a source port number is also

accepted as an AM.PDP protocol attribute ‘client_port’;

- updated p0f-analyzer.pl now supports a source port number information in

queries while preserving backwards compatibility with previous versions

of amavisd-new. Version 2.6.0 of amavisd requires a new version of

p0f-analyzer.pl (supplied in the 2.6.0 distribution) if operating system

fingerprinting is enabled. A source port number information in a query

allows p0f-analyzer.pl to locate a matching entry in its cache faster and

also more accurately when multiple connections are present from clients

behind NAT using the same IP address. The source port number is made

available to a content filter since Postfix version 2.5 (20071004);

- besides the ability to send queries to p0f-analyzer.pl, amavisd now also

supports sending queries directly to a p0f program over a Unix socket

using a p0f query protocol. There is a bug in p0f-2.0.8 (and probably in

earlier versions) which makes it send back incorrect results at times, i.e.

results belonging to some other unrelated session, so a patch to p0f-2.0.8

MUST be applied in order to use a direct querying mechanism – author has

been notified. The patch is supplied: p0f-patch.

There are currently no advantages (and some disadvantages) in choosing

direct queries to p0f, compared to sending queries to p0f-analyzer.pl,

so this new method is not currently recommended. Disadvantages are:

* p0f uses a linear search over its list of recent sessions, whereas

p0f-analyzer.pl uses a fast hash lookup method;

* p0f keeps a relatively small list of recent sessions which is limited

by the number of slots (size can be specified on a command line, but

is limited by a linear search time), whereas p0f-analyzer.pl expires

old entries according to time since entered and is thus independent

of a current mail rate;

* a direct p0f query protocol uses packed binary data and its on-the-wire

representation may depend on a compiler used, so it may be incompatible

with queries sent by amavisd, whereas the p0f-analyzer.pl queries and

replies use a more environment-independent textual representation.

To let amavisd send queries directly to p0f, specify a p0f socket path:

  $os_fingerprint_method = 'p0f:/var/amavis/home/p0f.sock';

and specify an IP address and a port number on which MTA is listening:

  $os_fingerprint_dst_ip_and_port = '[192.0.2.3]:25';

because p0f requires this information in a query and the information

is not made available to a content filter via XFORWARD command

(the p0f-analyzer.pl does not need this information).

To send queries to p0f-analyzer.pl (traditional and recommended), use:

  $os_fingerprint_method = 'p0f:127.0.0.1:2345';

as before. The $os_fingerprint_dst_ip_and_port in this case is not needed

and is ignored.

- usually a sending address in spam messages is faked and it is desirable

to suppress most if not all bounces by keeping $sa_dsn_cutoff_level low,

but sometimes it may be possible to be more certain of the validity of

a sending address, and when such mail is considered spam, it may still be

desirable to send a non-delivery notification, knowing that a notification

will most likely be addressed to a genuine sender.

Two new settings are provided for this purpose:

@spam_crediblefrom_dsn_cutoff_level_bysender_maps and

@spam_crediblefrom_dsn_cutoff_level_maps

(with their default being $sa_crediblefrom_dsn_cutoff_level),

complementing the existing @spam_dsn_cutoff_level_bysender_maps and

@spam_dsn_cutoff_level_maps.

It is expected that $sa_crediblefrom_dsn_cutoff_level would be set somewhat

higher than $sa_dsn_cutoff_level, allowing for more bounces to be generated

for spam from likely-to-be-genuine senders (possibly false positives).
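
In amavisd.conf the two default levels could be set side by side as in the fragment below. The numeric values are assumptions chosen only to show the intended ordering; suitable values depend on the site's spam scoring.

```perl
# Baseline: no bounce for spam scoring at or above this level
# (sender address is probably faked)
$sa_dsn_cutoff_level = 10;              # illustrative value

# Higher cutoff applied when the sender address is considered credible,
# so more of such mail still gets a non-delivery notification
$sa_crediblefrom_dsn_cutoff_level = 18; # illustrative value
```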

The choice between taking a cutoff value from one or the other pair of

settings depends on an attribute $msginfo->sender_credible – when it is

true (e.g. some nonempty string) the *spam_crediblefrom_* settings will

be used instead of the baseline @spam_dsn_cutoff_level_*maps.

An initial value of a sender_credible attribute as provided by amavisd

is true if either the ‘originating’ flag is true (e.g. mail from inside),

or if dkim_envsender_sig attribute is true, e.g. a domain of a valid

DKIM signature matches envelope sender address, otherwise it is false.

A user-provided custom hook code is free to change the value of

sender_credible attribute. An exact value does not matter (it is only

interpreted as a boolean), but serves for logging purposes. Heuristics

may be based on some tests provided by SpamAssassin, on DKIM signatures,

on p0f results, on policy banks, etc.