Archive for the Troubleshooting Category
Posted by mike Filed Under Troubleshooting with Comments Off
What do you do when you massive overload of unwanted mail that is coming to your server? Recently I saw a Postfix Mail server that had gone down some months ago but had 10s of thousands of email ready to be sent. When it was repaired it began sending tons of old mail….mail that was simply not needed to send any longer. This is a situation where you need to solve things fast.
The mailq command showed many thousands of mail being sent out. When the deferred queue was examined it was clear….all the mail was in the active queue. It looked something like this:
[root@mail postfix]# ls active/
001072081CF 289032080DA 50C8B2081EE 7D8B6208169 A3B14208235 CAD4F20823A
007D9208259 28D1E2080F5 50F662081BA 7DEEB20813A A3D0920813F CBC242081AE
009D82080EF 2A7C3208199 51FD020822B 7E24A208283 A3D1C208251 CC806208109
00CD9208274 2B02520812C 549E7208134 7EE6F208266 A4137208104 CCC81208125
01ED5208222 2B1B420825F 54B4A208183 7EEF3208213 A48A02081FA CD05E2080EC
02BB420815D 2B2BF208209 551A42080FA 7F5492081F4 A58B1208120 CE523208191
0341F208110 2BB272081E9 569992081EF 7FFA8208150 A627D20816C CE564208145
034C4208174 2CABC208296 56C1520814A 802FC2081D9 A6BC0208288 CEAE2208256
0388C20823F 2CFC820817B 56C4E208249 8209A2081F5 A6F7F208218 CF33320826F
041CA208194 2D11F207BC3 56C7820819F 826D52081A4 A7B112081C5 CFDA2208144
046AC208195 2D6B8208279 56F792080E0 82D05208100 A82832081AA CFEB72081D2
04A792081B4 2ED842080DB 5757520829A 8305F208231 A8F4B208140 D00B920828D
05366208205 2F2FB2080F6 57EE520822C 83304208188 AA6B52081FB D05C520821D
05BC320812B 2F32420817C 57FBF208165 84BB4208151 AA6EA20818C D2D632081AF
064F4208147 2FCE520820A 58B372081BB 8536620811C AA9C92080E9 D35ED2080A7
065782080D7 30C4B20812F 5995E2081F0 853B3208267 AAB0E208252 D3603208200
066922081D0 32145208245 59C5C20827F 859CD207BB6 AAD5B208219 D4CBB20810A
066A020825A 324CF208260 5D28E20822D 85C312080E4 AB15E2080CD D4FF02080ED
067F7208273 324DB20817D 5D6A1208264 861D6208214 AB2CD208289 D55AA2081DF
075722080F0 3285220819A 5DC22208280 862C220813B AB5AF208121 D5805208159
09D6120815E 3291320820B 5DE27208118 86C6E2081C0 AC639208105 D5950207BC2
09DB6208293 32AF3208149 5DE4B2081D5 877C620824E ACE222081C6 D63E7208126
The solution, though it is dangerous, was to delete all mail in the active queue as it was considered safer to get rid of all mail than to send thousands of old email to other servers.
rm -f /var/spool/postfix/active/*
This should only be considered a last resort, but …nice to have the option when needed.
Posted by mike Filed Under Troubleshooting with Comments Off
Recently my mail server just stopped. No indication of problems just no mail delivered. So what do you do in this situation? Since I have set up so I receive an email summary of mail each day, when that does not come, it tells me I have trouble.
1. Check the Logs
This seems so obvious, but in reality it is easy to do a lot of other things before you check the logs…check the logs first. So what I find in the logs is this:
May 12 19:28:44 ns amavis[10218]: (10218-01-3) (!!)TROUBLE in process_request: Can’t create file /var/vscan/tmp/amavis-20090512T192844-10218/email.txt: File exists at ./amavisd line 5119, <GEN12> line 341.
This will tell you that the problem is amavisd but it is misleading from there on out. In fact, if you see this error and search google you will find that the recommendation is to re-queue your messages with:
postsupper -r ALL
But you will also find that this does not solve the issue…the issue is amavisd. So on to step two of the analysis.
2. What Changed?
After 20 years of troubleshooting computers/servers the most powerful tool that I have found for troubleshooting is this question. One of the most unfortunate solutions that people immediately perform is that they begin the search on google for an answer and begin applying possible solutions before asking this question. Really this stuff is not magic, there has to be a cause and effect relationship. When you ask yourself this question first, then you will not apply those solutions which suggest that you start modifying the master.cf or the main.cf files….because they have not changed…do not make modifications for the solution. Instead, think about the issues that relate to amavisd, things that do change over the course to time. The first answer you should get is that it works with clamav, actually clamd, to scan for virus activity. So you know that one thing that does change is that virus signatures will be updated and when they are updated clamd must be notifiied of the changes.
As you investigate the logs in detail you will find a number of lines that indicate a problem with amavisd but no clues as to what you need to fix. Again, if you did not change it …look elsewhere.
May 12 18:05:18 ns amavis[5424]: (05424-07-2) (!!)TROUBLE in check_mail: parts_decode_ext FAILED: file(1) utility (/usr/bin/file)
error: run_command (open pipe): Can’t fork at /usr/lib/perl5/5.8.8/i386-linux-thread-multi/IO/File.pm line 188, <GEN55> line 113
. at ./amavisd line 2873, <GEN55> line 113.
3. Is amavisd Running?
Typically when amavisd has a problem it stops, and when it stops it will also disable virus checking so mail cannot be delivered because it cannot be scanned. So if you find that amavisd is not running, restart clamd, spamassassin and amavisd. You will be amazed at how many times restarting these three will solve problems, and it certiainly solved the one mentioned above.
4. Debug Mode
If this does not solve the issue run amavisd in debug mode as that will provide a wealth of information that you can use. Read carefully the data that is delivered and it will usually point right to the problem.
amavisd debug
Summary:
When you are using amavisd, always remember that it is closely tied to clamd and spamassassin, any changes (including virus signature updates) can create problems so restart these three first.
Posted by mike Filed Under Troubleshooting with Comments Off
Recently, working on amavisd-new and restarting it broke several perl modules that are required. The results of this was that the mail server quit sending mail because it could not process mail due to the broken modules. Here is the log file info that helped me locate the problem.
Mar 20 03:23:33 ns postfix/smtp[20284]: 992BB207DE0: to=<user@example.com>, relay=127.0.0.1[127.0.0.1]:10024, delay=1.7, delays=1.7/0/0/0.07, dsn=4.5.0, status=deferred (host 127.0.0.1[127.0.0.1] said: 451 4.5.0 Error in processing, id=19738-02, mime_decode-1 FAILED: Can’t locate object method “seek” via package “File::Temp” at /usr/lib/perl5/site_perl/5.8.8/MIME/Parser.pm line 816, <GEN13> line 65. (in reply to end of DATA command))
So once you know it is broken you can do something about it but first of all when you have a problem like this there are two external reminders. First, my mail was not delivered. This is a no brainer, but sometimes you just think, well….maybe things are slow. If you have not received mail, check your logs, find the problem, verify it is all working. Second, the pflogsumm did not arrive on schedule. Each day I receive a log summary which provides detailed information on what has happened on the Postfix Mail Server. Here is what was in the log summary, though it was not delivered.
smtp (total: 579)
369 Temp” at /usr/lib/perl5/site_perl/5.8.8/MIME/Parser.pm line 81…
210 4.3.2 Service shutting down, closing channel (in reply to …
Fixing the Problem:
Once you recognize perl modules are broken you need to replace them. As root execute the CPAN shell. and then install the two modules listed.
perl -MCPAN -e shell
cpan> install File::Temp
…
cpan> install IO::File
Once this is done restart amavisd-new and you will be set.
Posted by mike Filed Under Troubleshooting with Comments Off
Mail Does Not Send
A common problem is finding that mail is not sending correctly and that the /var/spool/clientqueue is filling up with files. This directory can actually shut down your server if you do not have a separate directory for /var when this happen. The speed at which this happens is determined by the amount of mail that is being sent.
Cause of the Problem:
The cause of the problem is most often a mis configured firewall or a mis configured tcp_wrappers.
The firewall should allow connections on port 25 for Postfix, port 143 for IMAP (web based email) and port 110 for POP3. Check your firewall that these are open.
Here is an example of a tcp_wrappers hosts.allow from a working Mail Server. Note that the is mail server does not allow POP3 but allows IMAP, and POSTFIX .
#
# hosts.allow This file describes the names of the hosts which are
# allowed to use the local INET services, as decided
# by the ‘/usr/sbin/tcpd’ server.
#
POSTFIX: ALL
SMTP: ALL
IMAP: ALL
Cannot Create Mailboxes
Even if cyrus IMAP is set up correctly there are times when the user cyrus cannot login to create a mailbox. If the password is correct the problem is probably related to /etc/hosts.allow/
Add this line to etc/hosts.allow
ALL: 127.0.0.1
Savemail Panic
If you see the error “Losing.lqfj54koqLM019255:savmail panic”
or
“savemail: cannot save rejected email anywhere”
Add this line to etc/hosts.allow
ALL: 127.0.0.1
Posted by mike Filed Under Troubleshooting with Comments Off
It is important to verify that MX records can be seen. To check use the dig command with the -t option which will allow you to do an MX lookup. This example shows the MX records for the myexample.com domain which is using the ns.example.org server as the mail server.
dig -t MX myexample.com
; <<>> DiG 9.2.4 <<>> -t MX myexample.com
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 4499
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 1, ADDITIONAL: 1
;; QUESTION SECTION:
;myexample.com. IN MX
;; ANSWER SECTION:
myexample.com. 86400 IN MX 10 ns.example.org.
;; AUTHORITY SECTION:
myexample.com. 86400 IN NS ns.example.org.
;; ADDITIONAL SECTION:
ns.example.org. 86400 IN A 21.14.226.132
;; Query time: 79 msec
;; SERVER: 21.14.226.132#53(21.14.226.132)
;; WHEN: Sun Dec 17 12:08:39 2006
;; MSG SIZE rcvd: 95
Posted by mike Filed Under Troubleshooting with 1 Comment
One of the best ways to begin troubleshooting is to check your logs. Here is a script that you can create that provides a timestamp of the current problems and then will email you that data to a a chosen email account.
#!/bin/sh
# Check Logs, Mail Output with Timestamp
TIMESTAMP=`date +%Y%m%d_%H%M%S`;
echo $TIMESTAMP
egrep ‘(warning|error|fatal|panic):’ /var/log/maillog > /root/mail_$TIMESTAMP
cat /root/mail_$TIMESTAMP | /usr/sbin/sendmail -f root someuser@example.com
Warning - This is an error that will not stop Postfix but that should be fixed. However, there may be warning errors that are a problem but not specific to Postfix, like DNS errors.
Error - Again this will not prevent Postfix from starting but after 13 errors, Postfix is designed to not start, so pay attention to these errors.
Fatal – These are errors that involve a bad configuration, incorrect permissions or a software issue that could be fixed. These are errors that you need to do some research on to solve the problem.
Panic – These are errors with the software that likely you cannot fix unless you recompile the program differently. Postfix will not start until these are fixed.
Posted by mike Filed Under Troubleshooting with Comments Off
Problem: Cannot Receive Mail
In this example the user can send mail and it all works correctly but the user is not able to receive any mail. However, viewing the /var/log/
Sep 17 13:25:50 ns postfix/smtpd[17971]: > rn-out-0910.google.com[64.233.170.190]: 250 2.0.0 Ok: queued as 4A454207D8D
Sep 17 13:25:50 ns postfix/virtual[17976]: 4A454207D8D: to=<tsmith@example.com>, relay=virtual, delay=1.8, delays=1.8/0/0/0.01, dsn=4.2.0, status=deferred (delivery failed to mailbox /var/spool/vhosts//example.com/tsmith: cannot open file: Is a directory)
Problem – no trailing slash on user’s mailbox in virtual_mailbox_recipients
This is incorrect, no trailing slash.
tsmith@example.com example.com/tsmith
This is the correct setting
tsmith@example.com example.com/tsmith/
Posted by mike Filed Under Troubleshooting with Comments Off
Checking Logs
The file /etc/syslog.conf contains a line that is used to copy all mail by syslogd (CentOS). Note that Suse uses syslog-ng and so that config file is found at /etc/syslog-ng/syslog-ng.conf (see below).
Here is that line(CentOS):
# Log all the mail messages in one place.
mail.* -/var/log/maillog
So the location of the maillog is listed and you can use this command to locate the most notable issues with Postfix.
# egrep ' (reject|warning|error|fatal|panic):' /var/log/maillog
Here is an example.
# egrep ' (reject|warning|error|fatal|panic):' /var/log/maillog
Jun 6 17:53:09 admin131 postfix[5195]: warning: My hostname admin131 is not a f ully qualified name – set myhostname or mydomain in /etc/postfix/main.cf
Jun 6 17:53:10 admin131 postfix/master[5196]: warning: My hostname admin131 is not a fully qualified name – set myhostname or mydomain in /etc/postfix/main.cf
Jun 6 17:53:10 admin131 postfix/postfix-script: warning: My hostname admin131 i s not a fully qualified name – set myhostname or mydomain in /etc/postfix/main.c f
Jun 6 17:53:10 admin131 postfix/postfix-script: fatal: the Postfix mail system is not running
Jun 6 18:38:52 admin131 postfix/master[5570]: fatal: pipe: Cannot allocate memo ry
Jun 6 18:59:38 admin131 postfix/postmap[5712]: warning: /etc/postfix/canonical, line 227: expected format: key whitespace value
Jun 6 19:00:18 admin131 postfix/postmap[5743]: warning: /etc/postfix/canonical, line 227: expected format: key whitespace value
Jun 6 18:38:52 admin131 postfix/master[5570]: fatal: pipe: Cannot allocate memory
Jun 6 18:59:38 admin131 postfix/postmap[5712]: warning: /etc/postfix/canonical, line 227: expected format: key whitespace value
Jun 6 19:00:18 admin131 postfix/postmap[5743]: warning: /etc/postfix/canonical, line 227: expected format: key whitespace value
Copyright CyberMontana Inc. and Postfixmail.com
All rights reserved. Cannot be reproduced without written permission. Box 1262 Trout Creek, MT 59874