Archive for the Troubleshooting Category

Emergency: Cleaning the Active Queue

Posted by Filed Under Troubleshooting with Comments Off

What do you do when you massive overload of unwanted mail that is coming to your server?  Recently I saw a Postfix Mail server that had gone down some months ago but had 10s of thousands of email ready to  be sent.  When it was repaired it began sending tons of old mail….mail that was simply not needed to send any longer. This is a situation where you need to solve things fast.

The mailq command showed many thousands of mail being sent out.  When the deferred queue was examined it was clear….all the mail was in the active queue.  It looked something like this:

[root@mail postfix]# ls active/
001072081CF  289032080DA  50C8B2081EE  7D8B6208169  A3B14208235  CAD4F20823A
007D9208259  28D1E2080F5  50F662081BA  7DEEB20813A  A3D0920813F  CBC242081AE
009D82080EF  2A7C3208199  51FD020822B  7E24A208283  A3D1C208251  CC806208109
00CD9208274  2B02520812C  549E7208134  7EE6F208266  A4137208104  CCC81208125
01ED5208222  2B1B420825F  54B4A208183  7EEF3208213  A48A02081FA  CD05E2080EC
02BB420815D  2B2BF208209  551A42080FA  7F5492081F4  A58B1208120  CE523208191
0341F208110  2BB272081E9  569992081EF  7FFA8208150  A627D20816C  CE564208145
034C4208174  2CABC208296  56C1520814A  802FC2081D9  A6BC0208288  CEAE2208256
0388C20823F  2CFC820817B  56C4E208249  8209A2081F5  A6F7F208218  CF33320826F
041CA208194  2D11F207BC3  56C7820819F  826D52081A4  A7B112081C5  CFDA2208144
046AC208195  2D6B8208279  56F792080E0  82D05208100  A82832081AA  CFEB72081D2
04A792081B4  2ED842080DB  5757520829A  8305F208231  A8F4B208140  D00B920828D
05366208205  2F2FB2080F6  57EE520822C  83304208188  AA6B52081FB  D05C520821D
05BC320812B  2F32420817C  57FBF208165  84BB4208151  AA6EA20818C  D2D632081AF
064F4208147  2FCE520820A  58B372081BB  8536620811C  AA9C92080E9  D35ED2080A7
065782080D7  30C4B20812F  5995E2081F0  853B3208267  AAB0E208252  D3603208200
066922081D0  32145208245  59C5C20827F  859CD207BB6  AAD5B208219  D4CBB20810A
066A020825A  324CF208260  5D28E20822D  85C312080E4  AB15E2080CD  D4FF02080ED
067F7208273  324DB20817D  5D6A1208264  861D6208214  AB2CD208289  D55AA2081DF
075722080F0  3285220819A  5DC22208280  862C220813B  AB5AF208121  D5805208159
09D6120815E  3291320820B  5DE27208118  86C6E2081C0  AC639208105  D5950207BC2
09DB6208293  32AF3208149  5DE4B2081D5  877C620824E  ACE222081C6  D63E7208126

The solution, though it is dangerous, was to delete all mail in the active queue as it was considered safer to get rid of all mail than to send thousands of old email to other servers.

rm -f /var/spool/postfix/active/*

This should only be considered a last resort, but …nice to have the option when needed.

Troubleshooting Amavisd

Posted by Filed Under Troubleshooting with Comments Off

Recently my mail server just stopped.  No indication of problems just no mail delivered.  So what do you do in this situation?  Since I have set up so I receive an email summary of mail each day, when that does not come, it tells me I have trouble.

1. Check the Logs
This seems so obvious, but in reality it is easy to do a lot of other things before you check the logs…check the logs first.  So what I find in the logs is this:

May 12 19:28:44 ns amavis[10218]: (10218-01-3) (!!)TROUBLE in process_request: Can’t create file /var/vscan/tmp/amavis-20090512T192844-10218/email.txt: File exists at ./amavisd line 5119, <GEN12> line 341.

This will tell you that the problem is amavisd but it is misleading from there on out.  In fact, if you see this error and search google you will find that the recommendation is to re-queue your messages with:

postsupper -r  ALL

But you will also find that this does not solve the issue…the issue is amavisd.  So on to step two of the analysis.

2. What Changed?
After 20 years of troubleshooting computers/servers the most powerful tool that I have found for troubleshooting is this question.  One of the most unfortunate solutions that people immediately perform is that they begin the search on google for an answer and begin applying possible solutions before asking this question. Really this stuff is not magic, there has to be a cause and effect relationship.  When you ask yourself this question first, then you will not apply those solutions which suggest that you start modifying the or the files….because they have not changed…do not make modifications for the solution.  Instead, think about the issues that relate to amavisd, things that do change over the course to time.  The first answer you should get is that it works with clamav, actually clamd, to scan for virus activity.  So you know that one thing that does change is that virus signatures will be updated and when they are updated clamd must be notifiied of the changes.

As you investigate the logs in detail you will find a number of lines that indicate a problem with amavisd but no clues as to what you need to fix.  Again, if you did not change it …look elsewhere.

May 12 18:05:18 ns amavis[5424]: (05424-07-2) (!!)TROUBLE in check_mail: parts_decode_ext FAILED: file(1) utility (/usr/bin/file)
error: run_command (open pipe): Can’t fork at /usr/lib/perl5/5.8.8/i386-linux-thread-multi/IO/ line 188, <GEN55> line 113
. at ./amavisd line 2873, <GEN55> line 113.

3. Is amavisd Running?
Typically when amavisd has a problem it stops, and when it stops it will also disable virus checking so mail cannot be delivered because it cannot be scanned. So if you find that amavisd is not running, restart clamd, spamassassin and amavisd.  You will be amazed at how many times restarting these three will solve problems, and it certiainly solved the one mentioned above.

4. Debug Mode
If this does not solve the issue run amavisd in debug mode as that will provide a wealth of information that you can use.  Read carefully the data that is delivered and it will usually point right to the problem.

amavisd debug

When you are using amavisd, always remember that it is closely tied to clamd and spamassassin, any changes (including virus signature updates) can create problems so restart these three first.

Amavisd: Fixing Broken Perl Modules

Posted by Filed Under Troubleshooting with Comments Off

Recently, working on amavisd-new and restarting it broke several perl modules that are required.  The results of this was that the mail server quit sending mail because it could not process mail due to the broken modules.  Here is the log file info that helped me locate the problem.

Mar 20 03:23:33 ns postfix/smtp[20284]: 992BB207DE0: to=<>, relay=[]:10024, delay=1.7, delays=1.7/0/0/0.07, dsn=4.5.0, status=deferred (host[] said: 451 4.5.0 Error in processing, id=19738-02, mime_decode-1 FAILED: Can’t locate object method “seek” via package “File::Temp” at /usr/lib/perl5/site_perl/5.8.8/MIME/ line 816, <GEN13> line 65. (in reply to end of DATA command))

So once you know it is broken you can do something about it but first of all when you have a problem like this there are two external reminders.  First, my mail was not delivered.  This is a no brainer, but sometimes you just think, well….maybe things are slow.  If you have not received mail, check your logs, find the problem, verify it is all working.  Second, the pflogsumm did not arrive on schedule.  Each day I receive a log summary which provides detailed information on what has happened on the Postfix Mail Server.    Here is what was in the log summary, though it was not delivered.

smtp (total: 579)
369   Temp” at /usr/lib/perl5/site_perl/5.8.8/MIME/ line 81…
210   4.3.2 Service shutting down, closing channel (in reply to …

Fixing the Problem:

Once you recognize perl modules are broken you need to replace them.  As root execute the CPAN shell.  and then install the two modules listed.

perl -MCPAN -e shell

cpan> install File::Temp

cpan> install IO::File

Once this is done restart amavisd-new and you will be set.

tcp_wrappers Problems

Posted by Filed Under Troubleshooting with Comments Off

Mail Does Not Send
A common problem is finding that mail is not sending correctly and that the /var/spool/clientqueue is filling up with files. This directory can actually shut down your server if you do not have a separate directory for /var when this happen. The speed at which this happens is determined by the amount of mail that is being sent.
Cause of the Problem:
The cause of the problem is most often a mis configured firewall or a mis configured tcp_wrappers.
The firewall should allow connections on port 25 for Postfix, port 143 for IMAP (web based email) and port 110 for POP3. Check your firewall that these are open.
Here is an example of a tcp_wrappers hosts.allow from a working Mail Server. Note that the is mail server does not allow POP3 but allows IMAP, and POSTFIX .
# hosts.allow   This file describes the names of the hosts which are
#               allowed to use the local INET services, as decided
#               by the ‘/usr/sbin/tcpd’ server.

Cannot Create Mailboxes
Even if cyrus IMAP is set up correctly there are times when the user cyrus cannot login to create a mailbox. If the password is correct the problem is probably related to /etc/hosts.allow/
Add this line to etc/hosts.allow

Savemail Panic
If you see the error “Losing.lqfj54koqLM019255:savmail panic”
“savemail: cannot save rejected email anywhere”
Add this line to etc/hosts.allow

Using dig to check MX Records

Posted by Filed Under Troubleshooting with Comments Off

It is important to verify that MX records can be seen.  To check use the dig command with the -t option which will allow you to do an MX lookup.  This example shows the MX records for the domain which is using the server as the mail server.

dig -t MX

; <<>> DiG 9.2.4 <<>> -t MX
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 4499
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 1, ADDITIONAL: 1

;               IN      MX

;; ANSWER SECTION:        86400   IN      MX      10

;; AUTHORITY SECTION:        86400   IN      NS

;; ADDITIONAL SECTION:       86400   IN      A

;; Query time: 79 msec
;; WHEN: Sun Dec 17 12:08:39 2006
;; MSG SIZE  rcvd: 95

Postfix Troublehooting Errors

Posted by Filed Under Troubleshooting with 1 Comment

One of the best ways to begin troubleshooting is to check your logs.  Here is a script that you can create that provides a timestamp of the current problems and then will email you that data to a a chosen email account.

# Check Logs, Mail Output with Timestamp
TIMESTAMP=`date +%Y%m%d_%H%M%S`;
egrep  ‘(warning|error|fatal|panic):’ /var/log/maillog > /root/mail_$TIMESTAMP
cat /root/mail_$TIMESTAMP | /usr/sbin/sendmail -f root

Warning -  This is an error that will not stop Postfix but that should be fixed.  However, there may be warning errors that are a problem but not specific to Postfix, like DNS errors.

Error -  Again this will not prevent Postfix from starting but after 13 errors, Postfix is designed to not start, so pay attention to these errors.
Fatal – These are errors that involve a bad configuration, incorrect permissions or a software issue that could be fixed.  These are errors that you need to do some research on to solve the problem.
Panic – These are errors with the software  that likely you cannot fix unless you recompile the program differently.  Postfix will not start until these are fixed.

Postfix: Cannot Receive Mail

Posted by Filed Under Troubleshooting with Comments Off

Problem: Cannot Receive Mail

In this example the user can send mail and it all works correctly but the user is not able to receive any mail. However, viewing the /var/log/

Sep 17 13:25:50 ns postfix/smtpd[17971]: >[]: 250 2.0.0 Ok: queued as 4A454207D8D

Sep 17 13:25:50 ns postfix/virtual[17976]: 4A454207D8D: to=<>, relay=virtual, delay=1.8, delays=1.8/0/0/0.01, dsn=4.2.0, status=deferred (delivery failed to mailbox /var/spool/vhosts// cannot open file: Is a directory)

Problem – no trailing slash on user’s mailbox in virtual_mailbox_recipients

This is incorrect, no trailing slash.

This is the correct setting

Postfix Logs

Posted by Filed Under Troubleshooting with Comments Off

Checking Logs

The file /etc/syslog.conf contains a line that is used to copy all mail by syslogd (CentOS). Note that Suse uses syslog-ng and so that config file is found at /etc/syslog-ng/syslog-ng.conf (see below).

Here is that line(CentOS):

# Log all the mail messages in one place.
mail.*                  -/var/log/maillog

So the location of the maillog is listed and you can use this command to locate the most notable issues with Postfix.

# egrep ' (reject|warning|error|fatal|panic):' /var/log/maillog

Here is an example.

# egrep ' (reject|warning|error|fatal|panic):' /var/log/maillog

Jun 6 17:53:09 admin131 postfix[5195]: warning: My hostname admin131 is not a f ully qualified name – set myhostname or mydomain in /etc/postfix/

Jun 6 17:53:10 admin131 postfix/master[5196]: warning: My hostname admin131 is not a fully qualified name – set myhostname or mydomain in /etc/postfix/

Jun 6 17:53:10 admin131 postfix/postfix-script: warning: My hostname admin131 i s not a fully qualified name – set myhostname or mydomain in /etc/postfix/main.c f

Jun 6 17:53:10 admin131 postfix/postfix-script: fatal: the Postfix mail system is not running

Jun 6 18:38:52 admin131 postfix/master[5570]: fatal: pipe: Cannot allocate memo ry

Jun 6 18:59:38 admin131 postfix/postmap[5712]: warning: /etc/postfix/canonical, line 227: expected format: key whitespace value

Jun 6 19:00:18 admin131 postfix/postmap[5743]: warning: /etc/postfix/canonical, line 227: expected format: key whitespace value

Jun 6 18:38:52 admin131 postfix/master[5570]: fatal: pipe: Cannot allocate memory

Jun 6 18:59:38 admin131 postfix/postmap[5712]: warning: /etc/postfix/canonical, line 227: expected format: key whitespace value

Jun 6 19:00:18 admin131 postfix/postmap[5743]: warning: /etc/postfix/canonical, line 227: expected format: key whitespace value

Copyright CyberMontana Inc. and
All rights reserved. Cannot be reproduced without written permission. Box 1262 Trout Creek, MT 59874