Diffstat (limited to 'README_FILES/DEBUG_README')
-rw-r--r-- | README_FILES/DEBUG_README | 402 |
1 files changed, 402 insertions, 0 deletions
diff --git a/README_FILES/DEBUG_README b/README_FILES/DEBUG_README
new file mode 100644
index 0000000..a277d96
--- /dev/null
+++ b/README_FILES/DEBUG_README
@@ -0,0 +1,402 @@
+Postfix Debugging Howto
+
+-------------------------------------------------------------------------------
+
+Purpose of this document
+
+This document describes how to debug parts of the Postfix mail system when
+things do not work according to expectation. The methods vary from making
+Postfix log a lot of detail, to running some daemon processes under control of
+a call tracer or debugger.
+
+The text assumes that the Postfix main.cf and master.cf configuration files are
+stored in directory /etc/postfix. You can use the command "postconf
+config_directory" to find out the actual location of this directory on your
+machine.
+
+Listed in order of increasing invasiveness, the debugging techniques are as
+follows:
+
+  * Look for obvious signs of trouble
+  * Debugging Postfix from inside
+  * Try turning off chroot operation in master.cf
+  * Verbose logging for specific SMTP connections
+  * Record the SMTP session with a network sniffer
+  * Making Postfix daemon programs more verbose
+  * Manually tracing a Postfix daemon process
+  * Automatically tracing a Postfix daemon process
+  * Running daemon programs with the interactive ddd debugger
+  * Running daemon programs with the interactive gdb debugger
+  * Running daemon programs under a non-interactive debugger
+  * Unreasonable behavior
+  * Reporting problems to postfix-users@postfix.org
+
+Look for obvious signs of trouble
+
+Postfix logs all failed and successful deliveries to a logfile.
+
+  * When Postfix uses syslog logging (the default), the file is usually called
+    /var/log/maillog, /var/log/mail, or something similar; the exact pathname
+    is configured in a file called /etc/syslog.conf, /etc/rsyslog.conf, or
+    something similar.
+
+  * When Postfix uses its own logging system (see MAILLOG_README), the location
+    of the logfile is configured with the Postfix maillog_file parameter.
+
+When Postfix does not receive or deliver mail, the first order of business is
+to look for errors that prevent Postfix from working properly:
+
+    % egrep '(warning|error|fatal|panic):' /some/log/file | more
+
+Note: the most important message is near the BEGINNING of the output. Error
+messages that come later are less useful.
+
+The nature of each problem is indicated as follows:
+
+  * "panic" indicates a problem in the software itself that only a programmer
+    can fix. Postfix cannot proceed until this is fixed.
+
+  * "fatal" is the result of missing files, incorrect permissions, incorrect
+    configuration file settings that you can fix. Postfix cannot proceed until
+    this is fixed.
+
+  * "error" reports an error condition. For safety reasons, a Postfix process
+    will terminate when more than 13 of these happen.
+
+  * "warning" indicates a non-fatal error. These are problems that you may not
+    be able to fix (such as a broken DNS server elsewhere on the network) but
+    may also indicate local configuration errors that could become a problem
+    later.
+
+Debugging Postfix from inside
+
+Postfix version 2.1 and later can produce mail delivery reports for debugging
+purposes. These reports not only show sender/recipient addresses after address
+rewriting and alias expansion or forwarding, they also show information about
+delivery to mailbox, delivery to non-Postfix command, responses from remote
+SMTP servers, and so on.
+
+Postfix can produce two types of mail delivery reports for debugging:
+
+  * What-if: report what would happen, but do not actually deliver mail. This
+    mode of operation is requested with:
+
+        % /usr/sbin/sendmail -bv address...
+        Mail Delivery Status Report will be mailed to <your login name>.
+
+  * What happened: deliver mail and report successes and/or failures, including
+    replies from remote SMTP servers. This mode of operation is requested with:
+
+        % /usr/sbin/sendmail -v address...
+        Mail Delivery Status Report will be mailed to <your login name>.
+
+These reports contain information that is generated by Postfix delivery agents.
+Since these run as daemon processes that cannot interact with users directly,
+the result is sent as mail to the sender of the test message. The format of
+these reports is practically identical to that of ordinary non-delivery
+notifications.
+
+For a detailed example of a mail delivery status report, see the debugging
+section at the end of the ADDRESS_REWRITING_README document.
+
+Try turning off chroot operation in master.cf
+
+A common mistake is to turn on chroot operation in the master.cf file without
+going through all the necessary steps to set up a chroot environment. This
+causes Postfix daemon processes to fail due to all kinds of missing files.
+
+The example below shows an SMTP server that is configured with chroot turned
+off:
+
+    /etc/postfix/master.cf:
+        # =============================================================
+        # service type  private unpriv  chroot  wakeup  maxproc command
+        #               (yes)   (yes)   (yes)   (never) (100)
+        # =============================================================
+        smtp      inet  n       -       n       -       -       smtpd
+
+Inspect master.cf for any processes that have chroot operation not turned off.
+If you find any, save a copy of the master.cf file, and edit the entries in
+question. After executing the command "postfix reload", see if the problem has
+gone away.
+
+If turning off chrooted operation made the problem go away, then
+congratulations. Leaving Postfix running in this way is adequate for most
+sites. If you prefer chrooted operation, see the Postfix
+BASIC_CONFIGURATION_README file for information about how to prepare Postfix
+for chrooted operation.
+
+Verbose logging for specific SMTP connections
+
+In /etc/postfix/main.cf, list the remote site name or address in the
+debug_peer_list parameter. For example, in order to make the software log a lot
+of information to the syslog daemon for connections from or to the loopback
+interface:
+
+    /etc/postfix/main.cf:
+        debug_peer_list = 127.0.0.1
+
+You can specify one or more hosts, domains, addresses or net/masks. To make the
+change effective immediately, execute the command "postfix reload".
+
+Record the SMTP session with a network sniffer
+
+This example uses tcpdump. In order to record a conversation you need to
+specify a large enough buffer with the "-s" option or else you will miss some
+or all of the packet payload.
+
+    # tcpdump -w /file/name -s 0 host example.com and port 25
+
+Older tcpdump versions don't support "-s 0"; in that case, use "-s 2000"
+instead.
+
+Run this for a while, stop with Ctrl-C when done. To view the data use a binary
+viewer, ethereal, or good old less.
+
+Making Postfix daemon programs more verbose
+
+Append one or more "-v" options to selected daemon definitions in /etc/postfix/
+master.cf and type "postfix reload". This will cause a lot of activity to be
+logged to the syslog daemon. For example, to make the Postfix SMTP server
+process more verbose:
+
+    /etc/postfix/master.cf:
+        smtp      inet  n       -       n       -       -       smtpd -v
+
+To diagnose problems with address rewriting specify a "-v" option for the
+cleanup(8) and/or trivial-rewrite(8) daemon, and to diagnose problems with mail
+delivery specify a "-v" option for the qmgr(8) or oqmgr(8) queue manager, or
+for the lmtp(8), local(8), pipe(8), smtp(8), or virtual(8) delivery agent.
+
+Manually tracing a Postfix daemon process
+
+Many systems allow you to inspect a running process with a system call tracer.
+For example:
+
+    # trace -p process-id (SunOS 4)
+    # strace -p process-id (Linux and many others)
+    # truss -p process-id (Solaris, FreeBSD)
+    # ktrace -p process-id (generic 4.4BSD)
+
+Even more informative are traces of system library calls. Examples:
+
+    # ltrace -p process-id (Linux, also ported to FreeBSD and BSD/OS)
+    # sotruss -p process-id (Solaris)
+
+See your system documentation for details.
+
+Tracing a running process can give valuable information about what a process is
+attempting to do. This is as much information as you can get without running an
+interactive debugger program, as described in a later section.
+
+Automatically tracing a Postfix daemon process
+
+Postfix can attach a call tracer whenever a daemon process starts. Call tracers
+come in several kinds.
+
+ 1. System call tracers such as trace, truss, strace, or ktrace. These show the
+    communication between the process and the kernel.
+
+ 2. Library call tracers such as sotruss and ltrace. These show calls of
+    library routines, and give a better idea of what is going on within the
+    process.
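
Which of the call tracers listed above is installed differs from system to
system. As a rough sketch only, the shell function below prints the first
command from its argument list that is found in the command search path. The
name first_available is a hypothetical helper, not part of Postfix.

```shell
# Hypothetical helper (not part of Postfix): print the first command from
# the argument list that exists in the command search path.
first_available() {
    for candidate in "$@"; do
        if command -v "$candidate" >/dev/null 2>&1; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    # None of the candidates was found.
    return 1
}

# Pick a system call tracer from the names listed above, if any is installed.
tracer=$(first_available strace truss ktrace trace) || tracer=""
echo "call tracer found: ${tracer:-none}"
```

The result could then be used when writing the debugger_command setting by
hand; the sketch itself does not change any Postfix configuration.
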
+
+Append a -D option to the suspect command in /etc/postfix/master.cf, for
+example:
+
+    /etc/postfix/master.cf:
+        smtp      inet  n       -       n       -       -       smtpd -D
+
+Edit the debugger_command definition in /etc/postfix/main.cf so that it invokes
+the call tracer of your choice, for example:
+
+    /etc/postfix/main.cf:
+        debugger_command =
+             PATH=/bin:/usr/bin:/usr/local/bin;
+             (truss -p $process_id 2>&1 | logger -p mail.info) & sleep 5
+
+Type "postfix reload" and watch the logfile.
+
+Running daemon programs with the interactive ddd debugger
+
+If you have X Windows installed on the Postfix machine, then an interactive
+debugger such as ddd can be convenient.
+
+Edit the debugger_command definition in /etc/postfix/main.cf so that it invokes
+ddd:
+
+    /etc/postfix/main.cf:
+        debugger_command =
+             PATH=/bin:/usr/bin:/usr/local/bin:/usr/X11R6/bin
+             ddd $daemon_directory/$process_name $process_id & sleep 5
+
+Be sure that gdb is in the command search path, and export XAUTHORITY so that X
+access control works, for example:
+
+    % setenv XAUTHORITY ~/.Xauthority (csh syntax)
+    $ export XAUTHORITY=$HOME/.Xauthority (sh syntax)
+
+Append a -D option to the suspect daemon definition in /etc/postfix/master.cf,
+for example:
+
+    /etc/postfix/master.cf:
+        smtp      inet  n       -       n       -       -       smtpd -D
+
+Stop and start the Postfix system. This is necessary so that Postfix runs with
+the proper XAUTHORITY and DISPLAY settings.
+
+Whenever the suspect daemon process is started, a debugger window pops up and
+you can watch in detail what happens.
+
+Running daemon programs with the interactive gdb debugger
+
+If you have the screen command installed on the Postfix machine, then you can
+run an interactive debugger such as gdb as follows.
+
+Edit the debugger_command definition in /etc/postfix/main.cf so that it runs
+gdb inside a detached screen session:
+
+    /etc/postfix/main.cf:
+        debugger_command =
+            PATH=/bin:/usr/bin:/sbin:/usr/sbin; export PATH; HOME=/root;
+            export HOME; screen -e^tt -dmS $process_name gdb
+            $daemon_directory/$process_name $process_id & sleep 2
+
+Be sure that gdb is in the command search path.
+
+Append a -D option to the suspect daemon definition in /etc/postfix/master.cf,
+for example:
+
+    /etc/postfix/master.cf:
+        smtp      inet  n       -       n       -       -       smtpd -D
+
+Execute the command "postfix reload" and wait until a daemon process is started
+(you can see this in the maillog file).
+
+Then attach to the screen, and debug away:
+
+    # HOME=/root screen -r
+    gdb) continue
+    gdb) where
+
+Running daemon programs under a non-interactive debugger
+
+If you do not have X Windows installed on the Postfix machine, or if you are
+not familiar with interactive debuggers, then you can try to run gdb in non-
+interactive mode, and have it print a stack trace when the process crashes.
+
+Edit the debugger_command definition in /etc/postfix/main.cf so that it invokes
+the gdb debugger:
+
+    /etc/postfix/main.cf:
+        debugger_command =
+            PATH=/bin:/usr/bin:/usr/local/bin; export PATH; (echo cont; echo
+            where; sleep 8640000) | gdb $daemon_directory/$process_name
+            $process_id 2>&1
+            >$config_directory/$process_name.$process_id.log & sleep 5
+
+Append a -D option to the suspect daemon in /etc/postfix/master.cf, for
+example:
+
+    /etc/postfix/master.cf:
+        smtp      inet  n       -       n       -       -       smtpd -D
+
+Type "postfix reload" to make the configuration changes effective.
+
+Whenever a suspect daemon process is started, an output file is created, named
+after the daemon and process ID (for example, smtpd.12345.log). When the
+process crashes, a stack trace (with output from the "where" command) is
+written to its logfile.
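
To find out which of these per-process logfiles actually record a crash, one
possible sketch is shown below. It assumes the file naming from the example
above (daemon name, process ID, ".log") and assumes that gdb stack frames
start with "#0" at the beginning of a line. The name logs_with_backtrace is a
hypothetical helper, not a Postfix command.

```shell
# Hypothetical helper: list per-process debugger logs in a directory that
# contain a gdb stack frame line ("#0 ..."), i.e. logs with a stack trace.
logs_with_backtrace() {
    dir=$1
    for f in "$dir"/*.[0-9]*.log; do
        # Skip the unexpanded pattern when no logfile matches.
        [ -f "$f" ] || continue
        if grep -q '^#0 ' "$f"; then
            printf '%s\n' "$f"
        fi
    done
}
```

With the debugger_command shown above, the logfiles are written to the
configuration directory, so a possible invocation is:

    logs_with_backtrace "$(postconf -h config_directory)"
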
+
+Unreasonable behavior
+
+Sometimes the behavior exhibited by Postfix just does not match the source
+code. Why can a program deviate from the instructions given by its author?
+There are two possibilities.
+
+  * The compiler has erred. This rarely happens.
+
+  * The hardware has erred. Does the machine have ECC memory?
+
+In both cases, the program being executed is not the program that was supposed
+to be executed, so anything could happen.
+
+There is a third possibility:
+
+  * Bugs in system software (kernel or libraries).
+
+Hardware-related failures usually do not reproduce in exactly the same way
+after power cycling and rebooting the system. There's little Postfix can do
+about bad hardware. Be sure to use hardware that at the very least can detect
+memory errors. Otherwise, Postfix will just be waiting to be hit by a bit
+error. Critical systems deserve real hardware.
+
+When a compiler makes an error, the problem can be reproduced whenever the
+resulting program is run. Compiler errors are most likely to happen in the code
+optimizer. If a problem is reproducible across power cycles and system reboots,
+it can be worthwhile to rebuild Postfix with optimization disabled, and to see
+if optimization makes a difference.
+
+In order to compile Postfix with optimizations turned off:
+
+    % make tidy
+    % make makefiles OPT=
+
+This produces a set of Makefiles that do not request compiler optimization.
+
+Once the makefiles are set up, build the software:
+
+    % make
+    % su
+    Password:
+    # make install
+
+If the problem goes away, then it is time to ask your vendor for help.
+
+Reporting problems to postfix-users@postfix.org
+
+The people who participate on postfix-users@postfix.org are very helpful,
+especially if YOU provide them with sufficient information. Remember, these
+volunteers are willing to help, but their time is limited.
+
+When reporting a problem, be sure to include the following information.
+
+  * A summary of the problem. Please do not just send some logging without
+    explanation of what YOU believe is wrong.
+
+  * Complete error messages. Please use cut-and-paste, or use attachments,
+    instead of reciting information from memory.
+
+  * Postfix logging. See the text at the top of the DEBUG_README document to
+    find out where logging is stored. Please do not frustrate the helpers by
+    word wrapping the logging. If the logging is more than a few kbytes of
+    text, consider posting an URL on a web or ftp site.
+
+  * Consider using a test email address so that you don't have to reveal email
+    addresses or passwords of innocent people.
+
+  * If you can't use a test email address, please anonymize email addresses and
+    host names consistently. Replace each letter by "A", each digit by "D" so
+    that the helpers can still recognize syntactical errors.
+
+  * Command output from:
+
+      o "postconf -n". Please do not send your main.cf file, or 1000+ lines of
+        postconf command output.
+
+      o "postconf -Mf" (Postfix 2.9 or later).
+
+  * Better, provide output from the postfinger tool. This can be found at http:
+    //ftp.wl0.org/SOURCES/postfinger.
+
+  * If the problem is SASL related, consider including the output from the
+    saslfinger tool. This can be found at http://postfix.state-of-mind.de/
+    patrick.koetter/saslfinger/.
+
+  * If the problem is about too much mail in the queue, consider including
+    output from the qshape tool, as described in the QSHAPE_README file.
+
+  * If the problem is protocol related (connections time out, or an SMTP server
+    complains about syntax errors etc.) consider recording a session with
+    tcpdump, as described in the DEBUG_README document.
+
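
The anonymization rule from the list above (each letter becomes "A", each
digit becomes "D") can be sketched with sed. The function name anonymize is
hypothetical, and the sketch assumes ASCII addresses and host names. It is
meant for individual addresses or host names, not for whole logfile lines.

```shell
# Hypothetical helper: anonymize one email address or host name while
# keeping its punctuation, so that syntax errors stay recognizable.
anonymize() {
    # Letters first, then digits: the reverse order would turn every "D"
    # produced by the digit rule into "A".
    printf '%s\n' "$1" | LC_ALL=C sed -e 's/[A-Za-z]/A/g' -e 's/[0-9]/D/g'
}

anonymize 'user42@example.com'
```

For the address above this prints AAAADD@AAAAAAA.AAA, which still shows the
position of the "@" and "." characters.
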