diff options
Diffstat (limited to '')
-rw-r--r-- | README_FILES/QSHAPE_README | 739 |
1 files changed, 739 insertions, 0 deletions
diff --git a/README_FILES/QSHAPE_README b/README_FILES/QSHAPE_README new file mode 100644 index 0000000..311797a --- /dev/null +++ b/README_FILES/QSHAPE_README @@ -0,0 +1,739 @@ +PPoossttffiixx BBoottttlleenneecckk AAnnaallyyssiiss + +------------------------------------------------------------------------------- + +PPuurrppoossee ooff tthhiiss ddooccuummeenntt + +This document is an introduction to Postfix queue congestion analysis. It +explains how the qshape(1) program can help to track down the reason for queue +congestion. qshape(1) is bundled with Postfix 2.1 and later source code, under +the "auxiliary" directory. This document describes qshape(1) as bundled with +Postfix 2.4. + +This document covers the following topics: + + * Introducing the qshape tool + * Trouble shooting with qshape + * Example 1: Healthy queue + * Example 2: Deferred queue full of dictionary attack bounces + * Example 3: Congestion in the active queue + * Example 4: High volume destination backlog + * Postfix queue directories + + o The "maildrop" queue + o The "hold" queue + o The "incoming" queue + o The "active" queue + o The "deferred" queue + + * Credits + +IInnttrroodduucciinngg tthhee qqsshhaappee ttooooll + +When mail is draining slowly or the queue is unexpectedly large, run qshape(1) +as the super-user (root) to help zero in on the problem. The qshape(1) program +displays a tabular view of the Postfix queue contents. + + * On the horizontal axis, it displays the queue age with fine granularity for + recent messages and (geometrically) less fine granularity for older + messages. + + * The vertical axis displays the destination (or with the "-s" switch the + sender) domain. Domains with the most messages are listed first. + +For example, in the output below we see the top 10 lines of the (mostly forged) +sender domain distribution for captured spam in the "hold" queue: + + $ qshape -s hold | head + T 5 10 20 40 80 160 320 640 1280 1280+ + TOTAL 486 0 0 1 0 0 2 4 20 40 419 + yahoo.com 14 0 0 1 0 0 0 0 1 0 12 + extremepricecuts.net 13 0 0 0 0 0 0 0 2 0 11 + ms35.hinet.net 12 0 0 0 0 0 0 0 0 1 11 + winnersdaily.net 12 0 0 0 0 0 0 0 2 0 10 + hotmail.com 11 0 0 0 0 0 0 0 0 1 10 + worldnet.fr 6 0 0 0 0 0 0 0 0 0 6 + ms41.hinet.net 6 0 0 0 0 0 0 0 0 0 6 + osn.de 5 0 0 0 0 0 1 0 0 0 4 + + * The "T" column shows the total (in this case sender) count for each domain. + The columns with numbers above them, show counts for messages aged fewer + than that many minutes, but not younger than the age limit for the previous + column. The row labeled "TOTAL" shows the total count for all domains. + + * In this example, there are 14 messages allegedly from yahoo.com, 1 between + 10 and 20 minutes old, 1 between 320 and 640 minutes old and 12 older than + 1280 minutes (1440 minutes in a day). + +When the output is a terminal intermediate results showing the top 20 domains +(-n option) are displayed after every 1000 messages (-N option) and the final +output also shows only the top 20 domains. This makes qshape useful even when +the deferred queue is very large and it may otherwise take prohibitively long +to read the entire deferred queue. + +By default, qshape shows statistics for the union of both the incoming and +active queues which are the most relevant queues to look at when analyzing +performance. + +One can request an alternate list of queues: + + $ qshape deferred + $ qshape incoming active deferred + +this will show the age distribution of the deferred queue or the union of the +incoming active and deferred queues. + +Command line options control the number of display "buckets", the age limit for +the smallest bucket, display of parent domain counts and so on. The "-h" option +outputs a summary of the available switches. + +TTrroouubbllee sshhoooottiinngg wwiitthh qqsshhaappee + +Large numbers in the qshape output represent a large number of messages that +are destined to (or alleged to come from) a particular domain. It should be +possible to tell at a glance which domains dominate the queue sender or +recipient counts, approximately when a burst of mail started, and when it +stopped. + +The problem destinations or sender domains appear near the top left corner of +the output table. Remember that the active queue can accommodate up to 20000 +($qmgr_message_active_limit) messages. To check whether this limit has been +reached, use: + + $ qshape -s active (show sender statistics) + +If the total sender count is below 20000 the active queue is not yet saturated, +any high volume sender domains show near the top of the output. + +With oqmgr(8) the active queue is also limited to at most 20000 recipient +addresses ($qmgr_message_recipient_limit). To check for exhaustion of this +limit use: + + $ qshape active (show recipient statistics) + +Having found the high volume domains, it is often useful to search the logs for +recent messages pertaining to the domains in question. + + # Find deliveries to example.com + # + $ tail -10000 /var/log/maillog | + egrep -i ': to=<.*@example\.com>,' | + less + + # Find messages from example.com + # + $ tail -10000 /var/log/maillog | + egrep -i ': from=<.*@example\.com>,' | + less + +You may want to drill in on some specific queue ids: + + # Find all messages for a specific queue id. + # + $ tail -10000 /var/log/maillog | egrep ': 2B2173FF68: ' + +Also look for queue manager warning messages in the log. These warnings can +suggest strategies to reduce congestion. + + $ egrep 'qmgr.*(panic|fatal|error|warning):' /var/log/maillog + +When all else fails try the Postfix mailing list for help, but please don't +forget to include the top 10 or 20 lines of qshape(1) output. + +EExxaammppllee 11:: HHeeaalltthhyy qquueeuuee + +When looking at just the incoming and active queues, under normal conditions +(no congestion) the incoming and active queues are nearly empty. Mail leaves +the system almost as quickly as it comes in or is deferred without congestion +in the active queue. + + $ qshape (show incoming and active queue status) + + T 5 10 20 40 80 160 320 640 1280 1280+ + TOTAL 5 0 0 0 1 0 0 0 1 1 2 + meri.uwasa.fi 5 0 0 0 1 0 0 0 1 1 2 + +If one looks at the two queues separately, the incoming queue is empty or +perhaps briefly has one or two messages, while the active queue holds more +messages and for a somewhat longer time: + + $ qshape incoming + + T 5 10 20 40 80 160 320 640 1280 1280+ + TOTAL 0 0 0 0 0 0 0 0 0 0 0 + + $ qshape active + + T 5 10 20 40 80 160 320 640 1280 1280+ + TOTAL 5 0 0 0 1 0 0 0 1 1 2 + meri.uwasa.fi 5 0 0 0 1 0 0 0 1 1 2 + +EExxaammppllee 22:: DDeeffeerrrreedd qquueeuuee ffuullll ooff ddiiccttiioonnaarryy aattttaacckk bboouunncceess + +This is from a server where recipient validation is not yet available for some +of the hosted domains. Dictionary attacks on the unvalidated domains result in +bounce backscatter. The bounces dominate the queue, but with proper tuning they +do not saturate the incoming or active queues. The high volume of deferred mail +is not a direct cause for alarm. + + $ qshape deferred | head + + T 5 10 20 40 80 160 320 640 1280 1280+ + TOTAL 2234 4 2 5 9 31 57 108 201 464 1353 + heyhihellothere.com 207 0 0 1 1 6 6 8 25 68 92 + pleazerzoneprod.com 105 0 0 0 0 0 0 0 5 44 56 + groups.msn.com 63 2 1 2 4 4 14 14 14 8 0 + orion.toppoint.de 49 0 0 0 1 0 2 4 3 16 23 + kali.com.cn 46 0 0 0 0 1 0 2 6 12 25 + meri.uwasa.fi 44 0 0 0 0 1 0 2 8 11 22 + gjr.paknet.com.pk 43 1 0 0 1 1 3 3 6 12 16 + aristotle.algonet.se 41 0 0 0 0 0 1 2 11 12 15 + +The domains shown are mostly bulk-mailers and all the volume is the tail end of +the time distribution, showing that short term arrival rates are moderate. +Larger numbers and lower message ages are more indicative of current trouble. +Old mail still going nowhere is largely harmless so long as the active and +incoming queues are short. We can also see that the groups.msn.com +undeliverables are low rate steady stream rather than a concentrated dictionary +attack that is now over. + + $ qshape -s deferred | head + + T 5 10 20 40 80 160 320 640 1280 1280+ + TOTAL 2193 4 4 5 8 33 56 104 205 465 1309 + MAILER-DAEMON 1709 4 4 5 8 33 55 101 198 452 849 + example.com 263 0 0 0 0 0 0 0 0 2 261 + example.org 209 0 0 0 0 0 1 3 6 11 188 + example.net 6 0 0 0 0 0 0 0 0 0 6 + example.edu 3 0 0 0 0 0 0 0 0 0 3 + example.gov 2 0 0 0 0 0 0 0 1 0 1 + example.mil 1 0 0 0 0 0 0 0 0 0 1 + +Looking at the sender distribution, we see that as expected most of the +messages are bounces. + +EExxaammppllee 33:: CCoonnggeessttiioonn iinn tthhee aaccttiivvee qquueeuuee + +This example is taken from a Feb 2004 discussion on the Postfix Users list. +Congestion was reported with the active and incoming queues large and not +shrinking despite very large delivery agent process limits. The thread is +archived at: http://groups.google.com/ +groups?threadm=c0b7js$2r65$1@FreeBSD.csie.NCTU.edu.tw and http:// +archives.neohapsis.com/archives/postfix/2004-02/thread.html#1371 + +Using an older version of qshape(1) it was quickly determined that all the +messages were for just a few destinations: + + $ qshape (show incoming and active queue status) + + T A 5 10 20 40 80 160 320 320+ + TOTAL 11775 9996 0 0 1 1 42 94 221 1420 + user.sourceforge.net 7678 7678 0 0 0 0 0 0 0 0 + lists.sourceforge.net 2313 2313 0 0 0 0 0 0 0 0 + gzd.gotdns.com 102 0 0 0 0 0 0 0 2 100 + +The "A" column showed the count of messages in the active queue, and the +numbered columns showed totals for the deferred queue. At 10000 messages +(Postfix 1.x active queue size limit) the active queue is full. The incoming +was growing rapidly. + +With the trouble destinations clearly identified, the administrator quickly +found and fixed the problem. It is substantially harder to glean the same +information from the logs. While a careful reading of mailq(1) output should +yield similar results, it is much harder to gauge the magnitude of the problem +by looking at the queue one message at a time. + +EExxaammppllee 44:: HHiigghh vvoolluummee ddeessttiinnaattiioonn bbaacckklloogg + +When a site you send a lot of email to is down or slow, mail messages will +rapidly build up in the deferred queue, or worse, in the active queue. The +qshape output will show large numbers for the destination domain in all age +buckets that overlap the starting time of the problem: + + $ qshape deferred | head + + T 5 10 20 40 80 160 320 640 1280 1280+ + TOTAL 5000 200 200 400 800 1600 1000 200 200 200 200 + highvolume.com 4000 160 160 320 640 1280 1440 0 0 0 0 + ... + +Here the "highvolume.com" destination is continuing to accumulate deferred +mail. The incoming and active queues are fine, but the deferred queue started +growing some time between 1 and 2 hours ago and continues to grow. + +If the high volume destination is not down, but is instead slow, one might see +similar congestion in the active queue. Active queue congestion is a greater +cause for alarm; one might need to take measures to ensure that the mail is +deferred instead or even add an access(5) rule asking the sender to try again +later. + +If a high volume destination exhibits frequent bursts of consecutive +connections refused by all MX hosts or "421 Server busy errors", it is possible +for the queue manager to mark the destination as "dead" despite the transient +nature of the errors. The destination will be retried again after the +expiration of a $minimal_backoff_time timer. If the error bursts are frequent +enough it may be that only a small quantity of email is delivered before the +destination is again marked "dead". In some cases enabling static (not on +demand) connection caching by listing the appropriate nexthop domain in a table +included in "smtp_connection_cache_destinations" may help to reduce the error +rate, because most messages will re-use existing connections. + +The MTA that has been observed most frequently to exhibit such bursts of errors +is Microsoft Exchange, which refuses connections under load. Some proxy virus +scanners in front of the Exchange server propagate the refused connection to +the client as a "421" error. + +Note that it is now possible to configure Postfix to exhibit similarly erratic +behavior by misconfiguring the anvil(8) service. Do not use anvil(8) for +steady-state rate limiting, its purpose is (unintentional) DoS prevention and +the rate limits set should be very generous! + +If one finds oneself needing to deliver a high volume of mail to a destination +that exhibits frequent brief bursts of errors and connection caching does not +solve the problem, there is a subtle workaround. + + * Postfix version 2.5 and later: + + o In master.cf set up a dedicated clone of the "smtp" transport for the + destination in question. In the example below we will call it + "fragile". + + o In master.cf configure a reasonable process limit for the cloned smtp + transport (a number in the 10-20 range is typical). + + o IMPORTANT!!! In main.cf configure a large per-destination pseudo-cohort + failure limit for the cloned smtp transport. + + /etc/postfix/main.cf: + transport_maps = hash:/etc/postfix/transport + fragile_destination_concurrency_failed_cohort_limit = 100 + fragile_destination_concurrency_limit = 20 + + /etc/postfix/transport: + example.com fragile: + + /etc/postfix/master.cf: + # service type private unpriv chroot wakeup maxproc command + fragile unix - - n - 20 smtp + + See also the documentation for + default_destination_concurrency_failed_cohort_limit and + default_destination_concurrency_limit. + + * Earlier Postfix versions: + + o In master.cf set up a dedicated clone of the "smtp" transport for the + destination in question. In the example below we will call it + "fragile". + + o In master.cf configure a reasonable process limit for the transport (a + number in the 10-20 range is typical). + + o IMPORTANT!!! In main.cf configure a very large initial and destination + concurrency limit for this transport (say 2000). + + /etc/postfix/main.cf: + transport_maps = hash:/etc/postfix/transport + initial_destination_concurrency = 2000 + fragile_destination_concurrency_limit = 2000 + + /etc/postfix/transport: + example.com fragile: + + /etc/postfix/master.cf: + # service type private unpriv chroot wakeup maxproc command + fragile unix - - n - 20 smtp + + See also the documentation for default_destination_concurrency_limit. + +The effect of this configuration is that up to 2000 consecutive errors are +tolerated without marking the destination dead, while the total concurrency +remains reasonable (10-20 processes). This trick is only for a very specialized +situation: high volume delivery into a channel with multi-error bursts that is +capable of high throughput, but is repeatedly throttled by the bursts of +errors. + +When a destination is unable to handle the load even after the Postfix process +limit is reduced to 1, a desperate measure is to insert brief delays between +delivery attempts. + + * Postfix version 2.5 and later: + + o In master.cf set up a dedicated clone of the "smtp" transport for the + problem destination. In the example below we call it "slow". + + o In main.cf configure a short delay between deliveries to the same + destination. + + /etc/postfix/main.cf: + transport_maps = hash:/etc/postfix/transport + slow_destination_rate_delay = 1 + slow_destination_concurrency_failed_cohort_limit = 100 + + /etc/postfix/transport: + example.com slow: + + /etc/postfix/master.cf: + # service type private unpriv chroot wakeup maxproc command + slow unix - - n - - smtp + + See also the documentation for default_destination_rate_delay. + + This solution forces the Postfix smtp(8) client to wait for + $slow_destination_rate_delay seconds between deliveries to the same + destination. + + IMPORTANT!! The large slow_destination_concurrency_failed_cohort_limit + value is needed. This prevents Postfix from deferring all mail for the same + destination after only one connection or handshake error (the reason for + this is that non-zero slow_destination_rate_delay forces a per-destination + concurrency of 1). + + * Earlier Postfix versions: + + o In the transport map entry for the problem destination, specify a dead + host as the primary nexthop. + + o In the master.cf entry for the transport specify the problem + destination as the fallback_relay and specify a small + smtp_connect_timeout value. + + /etc/postfix/main.cf: + transport_maps = hash:/etc/postfix/transport + + /etc/postfix/transport: + example.com slow:[dead.host] + + /etc/postfix/master.cf: + # service type private unpriv chroot wakeup maxproc command + slow unix - - n - 1 smtp + -o fallback_relay=problem.example.com + -o smtp_connect_timeout=1 + -o smtp_connection_cache_on_demand=no + + This solution forces the Postfix smtp(8) client to wait for + $smtp_connect_timeout seconds between deliveries. The connection caching + feature is disabled to prevent the client from skipping over the dead host. + +PPoossttffiixx qquueeuuee ddiirreeccttoorriieess + +The following sections describe Postfix queues: their purpose, what normal +behavior looks like, and how to diagnose abnormal behavior. + +TThhee ""mmaaiillddrroopp"" qquueeuuee + +Messages that have been submitted via the Postfix sendmail(1) command, but not +yet brought into the main Postfix queue by the pickup(8) service, await +processing in the "maildrop" queue. Messages can be added to the "maildrop" +queue even when the Postfix system is not running. They will begin to be +processed once Postfix is started. + +The "maildrop" queue is drained by the single threaded pickup(8) service +scanning the queue directory periodically or when notified of new message +arrival by the postdrop(1) program. The postdrop(1) program is a setgid helper +that allows the unprivileged Postfix sendmail(1) program to inject mail into +the "maildrop" queue and to notify the pickup(8) service of its arrival. + +All mail that enters the main Postfix queue does so via the cleanup(8) service. +The cleanup service is responsible for envelope and header rewriting, header +and body regular expression checks, automatic bcc recipient processing, milter +content processing, and reliable insertion of the message into the Postfix +"incoming" queue. + +In the absence of excessive CPU consumption in cleanup(8) header or body +regular expression checks or other software consuming all available CPU +resources, Postfix performance is disk I/O bound. The rate at which the pickup +(8) service can inject messages into the queue is largely determined by disk +access times, since the cleanup(8) service must commit the message to stable +storage before returning success. The same is true of the postdrop(1) program +writing the message to the "maildrop" directory. + +As the pickup service is single threaded, it can only deliver one message at a +time at a rate that does not exceed the reciprocal disk I/O latency (+ CPU if +not negligible) of the cleanup service. + +Congestion in this queue is indicative of an excessive local message submission +rate or perhaps excessive CPU consumption in the cleanup(8) service due to +excessive body_checks, or (Postfix >= 2.3) high latency milters. + +Note, that once the active queue is full, the cleanup service will attempt to +slow down message injection by pausing $in_flow_delay for each message. In this +case "maildrop" queue congestion may be a consequence of congestion downstream, +rather than a problem in its own right. + +Note, you should not attempt to deliver large volumes of mail via the pickup(8) +service. High volume sites should avoid using "simple" content filters that re- +inject scanned mail via Postfix sendmail(1) and postdrop(1). + +A high arrival rate of locally submitted mail may be an indication of an +uncaught forwarding loop, or a run-away notification program. Try to keep the +volume of local mail injection to a moderate level. + +The "postsuper -r" command can place selected messages into the "maildrop" +queue for reprocessing. This is most useful for resetting any stale +content_filter settings. Requeuing a large number of messages using "postsuper +-r" can clearly cause a spike in the size of the "maildrop" queue. + +TThhee ""hhoolldd"" qquueeuuee + +The administrator can define "smtpd" access(5) policies, or cleanup(8) header/ +body checks that cause messages to be automatically diverted from normal +processing and placed indefinitely in the "hold" queue. Messages placed in the +"hold" queue stay there until the administrator intervenes. No periodic +delivery attempts are made for messages in the "hold" queue. The postsuper(1) +command can be used to manually release messages into the "deferred" queue. + +Messages can potentially stay in the "hold" queue longer than +$maximal_queue_lifetime. If such "old" messages need to be released from the +"hold" queue, they should typically be moved into the "maildrop" queue using +"postsuper -r", so that the message gets a new timestamp and is given more than +one opportunity to be delivered. Messages that are "young" can be moved +directly into the "deferred" queue using "postsuper -H". + +The "hold" queue plays little role in Postfix performance, and monitoring of +the "hold" queue is typically more closely motivated by tracking spam and +malware, than by performance issues. + +TThhee ""iinnccoommiinngg"" qquueeuuee + +All new mail entering the Postfix queue is written by the cleanup(8) service +into the "incoming" queue. New queue files are created owned by the "postfix" +user with an access bitmask (or mode) of 0600. Once a queue file is ready for +further processing the cleanup(8) service changes the queue file mode to 0700 +and notifies the queue manager of new mail arrival. The queue manager ignores +incomplete queue files whose mode is 0600, as these are still being written by +cleanup. + +The queue manager scans the incoming queue bringing any new mail into the +"active" queue if the active queue resource limits have not been exceeded. By +default, the active queue accommodates at most 20000 messages. Once the active +queue message limit is reached, the queue manager stops scanning the incoming +(and deferred, see below) queue. + +Under normal conditions the incoming queue is nearly empty (has only mode 0600 +files), with the queue manager able to import new messages into the active +queue as soon as they become available. + +The incoming queue grows when the message input rate spikes above the rate at +which the queue manager can import messages into the active queue. The main +factors slowing down the queue manager are disk I/O and lookup queries to the +trivial-rewrite service. If the queue manager is routinely not keeping up, +consider not using "slow" lookup services (MySQL, LDAP, ...) for transport +lookups or speeding up the hosts that provide the lookup service. If the +problem is I/O starvation, consider striping the queue over more disks, faster +controllers with a battery write cache, or other hardware improvements. At the +very least, make sure that the queue directory is mounted with the "noatime" +option if applicable to the underlying filesystem. + +The in_flow_delay parameter is used to clamp the input rate when the queue +manager starts to fall behind. The cleanup(8) service will pause for +$in_flow_delay seconds before creating a new queue file if it cannot obtain a +"token" from the queue manager. + +Since the number of cleanup(8) processes is limited in most cases by the SMTP +server concurrency, the input rate can exceed the output rate by at most "SMTP +connection count" / $in_flow_delay messages per second. + +With a default process limit of 100, and an in_flow_delay of 1s, the coupling +is strong enough to limit a single run-away injector to 1 message per second, +but is not strong enough to deflect an excessive input rate from many sources +at the same time. + +If a server is being hammered from multiple directions, consider raising the +in_flow_delay to 10 seconds, but only if the incoming queue is growing even +while the active queue is not full and the trivial-rewrite service is using a +fast transport lookup mechanism. + +TThhee ""aaccttiivvee"" qquueeuuee + +The queue manager is a delivery agent scheduler; it works to ensure fast and +fair delivery of mail to all destinations within designated resource limits. + +The active queue is somewhat analogous to an operating system's process run +queue. Messages in the active queue are ready to be sent (runnable), but are +not necessarily in the process of being sent (running). + +While most Postfix administrators think of the "active" queue as a directory on +disk, the real "active" queue is a set of data structures in the memory of the +queue manager process. + +Messages in the "maildrop", "hold", "incoming" and "deferred" queues (see +below) do not occupy memory; they are safely stored on disk waiting for their +turn to be processed. The envelope information for messages in the "active" +queue is managed in memory, allowing the queue manager to do global scheduling, +allocating available delivery agent processes to an appropriate message in the +active queue. + +Within the active queue, (multi-recipient) messages are broken up into groups +of recipients that share the same transport/nexthop combination; the group size +is capped by the transport's recipient concurrency limit. + +Multiple recipient groups (from one or more messages) are queued for delivery +grouped by transport/nexthop combination. The ddeessttiinnaattiioonn concurrency limit for +the transports caps the number of simultaneous delivery attempts for each +nexthop. Transports with a rreecciippiieenntt concurrency limit of 1 are special: these +are grouped by the actual recipient address rather than the nexthop, yielding +per-recipient concurrency limits rather than per-domain concurrency limits. +Per-recipient limits are appropriate when performing final delivery to +mailboxes rather than when relaying to a remote server. + +Congestion occurs in the active queue when one or more destinations drain +slower than the corresponding message input rate. + +Input into the active queue comes both from new mail in the "incoming" queue, +and retries of mail in the "deferred" queue. Should the "deferred" queue get +really large, retries of old mail can dominate the arrival rate of new mail. +Systems with more CPU, faster disks and more network bandwidth can deal with +larger deferred queues, but as a rule of thumb the deferred queue scales to +somewhere between 100,000 and 1,000,000 messages with good performance unlikely +above that "limit". Systems with queues this large should typically stop +accepting new mail, or put the backlog "on hold" until the underlying issue is +fixed (provided that there is enough capacity to handle just the new mail). + +When a destination is down for some time, the queue manager will mark it dead, +and immediately defer all mail for the destination without trying to assign it +to a delivery agent. In this case the messages will quickly leave the active +queue and end up in the deferred queue (with Postfix < 2.4, this is done +directly by the queue manager, with Postfix >= 2.4 this is done via the "retry" +delivery agent). + +When the destination is instead simply slow, or there is a problem causing an +excessive arrival rate the active queue will grow and will become dominated by +mail to the congested destination. + +The only way to reduce congestion is to either reduce the input rate or +increase the throughput. Increasing the throughput requires either increasing +the concurrency or reducing the latency of deliveries. + +For high volume sites a key tuning parameter is the number of "smtp" delivery +agents allocated to the "smtp" and "relay" transports. High volume sites tend +to send to many different destinations, many of which may be down or slow, so a +good fraction of the available delivery agents will be blocked waiting for slow +sites. Also mail destined across the globe will incur large SMTP command- +response latencies, so high message throughput can only be achieved with more +concurrent delivery agents. + +The default "smtp" process limit of 100 is good enough for most sites, and may +even need to be lowered for sites with low bandwidth connections (no use +increasing concurrency once the network pipe is full). When one finds that the +queue is growing on an "idle" system (CPU, disk I/O and network not exhausted) +the remaining reason for congestion is insufficient concurrency in the face of +a high average latency. If the number of outbound SMTP connections (either +ESTABLISHED or SYN_SENT) reaches the process limit, mail is draining slowly and +the system and network are not loaded, raise the "smtp" and/or "relay" process +limits! + +When a high volume destination is served by multiple MX hosts with typically +low delivery latency, performance can suffer dramatically when one of the MX +hosts is unresponsive and SMTP connections to that host timeout. For example, +if there are 2 equal weight MX hosts, the SMTP connection timeout is 30 seconds +and one of the MX hosts is down, the average SMTP connection will take +approximately 15 seconds to complete. With a default per-destination +concurrency limit of 20 connections, throughput falls to just over 1 message +per second. + +The best way to avoid bottlenecks when one or more MX hosts is non-responsive +is to use connection caching. Connection caching was introduced with Postfix +2.2 and is by default enabled on demand for destinations with a backlog of mail +in the active queue. When connection caching is in effect for a particular +destination, established connections are re-used to send additional messages, +this reduces the number of connections made per message delivery and maintains +good throughput even in the face of partial unavailability of the destination's +MX hosts. + +If connection caching is not available (Postfix < 2.2) or does not provide a +sufficient latency reduction, especially for the "relay" transport used to +forward mail to "your own" domains, consider setting lower than default SMTP +connection timeouts (1-5 seconds) and higher than default destination +concurrency limits. This will further reduce latency and provide more +concurrency to maintain throughput should latency rise. + +Setting high concurrency limits to domains that are not your own may be viewed +as hostile by the receiving system, and steps may be taken to prevent you from +monopolizing the destination system's resources. The defensive measures may +substantially reduce your throughput or block access entirely. Do not set +aggressive concurrency limits to remote domains without coordinating with the +administrators of the target domain. + +If necessary, dedicate and tune custom transports for selected high volume +destinations. The "relay" transport is provided for forwarding mail to domains +for which your server is a primary or backup MX host. These can make up a +substantial fraction of your email traffic. Use the "relay" and not the "smtp" +transport to send email to these domains. Using the "relay" transport allocates +a separate delivery agent pool to these destinations and allows separate tuning +of timeouts and concurrency limits. + +Another common cause of congestion is unwarranted flushing of the entire +deferred queue. The deferred queue holds messages that are likely to fail to be +delivered and are also likely to be slow to fail delivery (time out). As a +result the most common reaction to a large deferred queue (flush it!) is more +than likely counter-productive, and typically makes the congestion worse. Do +not flush the deferred queue unless you expect that most of its content has +recently become deliverable (e.g. relayhost back up after an outage)! + +Note that whenever the queue manager is restarted, there may already be +messages in the active queue directory, but the "real" active queue in memory +is empty. In order to recover the in-memory state, the queue manager moves all +the active queue messages back into the incoming queue, and then uses its +normal incoming queue scan to refill the active queue. The process of moving +all the messages back and forth, redoing transport table (trivial-rewrite(8) +resolve service) lookups, and re-importing the messages back into memory is +expensive. At all costs, avoid frequent restarts of the queue manager (e.g. via +frequent execution of "postfix reload"). + +TThhee ""ddeeffeerrrreedd"" qquueeuuee + +When all the deliverable recipients for a message are delivered, and for some +recipients delivery failed for a transient reason (it might succeed later), the +message is placed in the deferred queue. + +The queue manager scans the deferred queue periodically. The scan interval is +controlled by the queue_run_delay parameter. While a deferred queue scan is in +progress, if an incoming queue scan is also in progress (ideally these are +brief since the incoming queue should be short), the queue manager alternates +between looking for messages in the "incoming" queue and in the "deferred" +queue. This "round-robin" strategy prevents starvation of either the incoming +or the deferred queues. + +Each deferred queue scan only brings a fraction of the deferred queue back into +the active queue for a retry. This is because each message in the deferred +queue is assigned a "cool-off" time when it is deferred. This is done by time- +warping the modification time of the queue file into the future. The queue file +is not eligible for a retry if its modification time is not yet reached. + +The "cool-off" time is at least $minimal_backoff_time and at most +$maximal_backoff_time. The next retry time is set by doubling the message's age +in the queue, and adjusting up or down to lie within the limits. This means +that young messages are initially retried more often than old messages. + +If a high volume site routinely has large deferred queues, it may be useful to +adjust the queue_run_delay, minimal_backoff_time and maximal_backoff_time to +provide short enough delays on first failure (Postfix >= 2.4 has a sensibly low +minimal backoff time by default), with perhaps longer delays after multiple +failures, to reduce the retransmission rate of old messages and thereby reduce +the quantity of previously deferred mail in the active queue. If you want a +really low minimal_backoff_time, you may also want to lower queue_run_delay, +but understand that more frequent scans will increase the demand for disk I/O. + +One common cause of large deferred queues is failure to validate recipients at +the SMTP input stage. Since spammers routinely launch dictionary attacks from +unrepliable sender addresses, the bounces for invalid recipient addresses clog +the deferred queue (and at high volumes proportionally clog the active queue). +Recipient validation is strongly recommended through use of the +local_recipient_maps and relay_recipient_maps parameters. Even when bounces +drain quickly they inundate innocent victims of forgery with unwanted email. To +avoid this, do not accept mail for invalid recipients. + +When a host with lots of deferred mail is down for some time, it is possible +for the entire deferred queue to reach its retry time simultaneously. This can +lead to a very full active queue once the host comes back up. The phenomenon +can repeat approximately every maximal_backoff_time seconds if the messages are +again deferred after a brief burst of congestion. Perhaps, a future Postfix +release will add a random offset to the retry time (or use a combination of +strategies) to reduce the odds of repeated complete deferred queue flushes. + +CCrreeddiittss + +The qshape(1) program was developed by Victor Duchovni of Morgan Stanley, who +also wrote the initial version of this document. + |