summaryrefslogtreecommitdiffstats
path: root/proto/QSHAPE_README.html
diff options
context:
space:
mode:
Diffstat (limited to '')
-rw-r--r--proto/QSHAPE_README.html939
1 files changed, 939 insertions, 0 deletions
diff --git a/proto/QSHAPE_README.html b/proto/QSHAPE_README.html
new file mode 100644
index 0000000..27a9a17
--- /dev/null
+++ b/proto/QSHAPE_README.html
@@ -0,0 +1,939 @@
+<!doctype html public "-//W3C//DTD HTML 4.01 Transitional//EN"
+ "http://www.w3.org/TR/html4/loose.dtd">
+
+<html>
+
+<head>
+
+<title>Postfix Bottleneck Analysis</title>
+
+<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
+<link rel='stylesheet' type='text/css' href='postfix-doc.css'>
+
+</head>
+
+<body>
+
+<h1><img src="postfix-logo.jpg" width="203" height="98" ALT="">Postfix Bottleneck Analysis</h1>
+
+<hr>
+
+<h2>Purpose of this document </h2>
+
+<p> This document is an introduction to Postfix queue congestion analysis.
+It explains how the qshape(1) program can help to track down the
+reason for queue congestion. qshape(1) is bundled with Postfix
+2.1 and later source code, under the "auxiliary" directory. This
+document describes qshape(1) as bundled with Postfix 2.4. </p>
+
+<p> This document covers the following topics: </p>
+
+<ul>
+
+<li><a href="#qshape">Introducing the qshape tool</a>
+
+<li><a href="#trouble_shooting">Trouble shooting with qshape</a>
+
+<li><a href="#healthy">Example 1: Healthy queue</a>
+
+<li><a href="#dictionary_bounce">Example 2: Deferred queue full of
+dictionary attack bounces</a></li>
+
+<li><a href="#active_congestion">Example 3: Congestion in the active
+queue</a></li>
+
+<li><a href="#backlog">Example 4: High volume destination backlog</a>
+
+<li><a href="#queues">Postfix queue directories</a>
+
+<ul>
+
+<li> <a href="#maildrop_queue"> The "maildrop" queue </a>
+
+<li> <a href="#hold_queue"> The "hold" queue </a>
+
+<li> <a href="#incoming_queue"> The "incoming" queue </a>
+
+<li> <a href="#active_queue"> The "active" queue </a>
+
+<li> <a href="#deferred_queue"> The "deferred" queue </a>
+
+</ul>
+
+<li><a href="#credits">Credits</a>
+
+</ul>
+
+<h2><a name="qshape">Introducing the qshape tool</a></h2>
+
+<p> When mail is draining slowly or the queue is unexpectedly large,
+run qshape(1) as the super-user (root) to help zero in on the problem.
+The qshape(1) program displays a tabular view of the Postfix queue
+contents. </p>
+
+<ul>
+
+<li> <p> On the horizontal axis, it displays the queue age with
+fine granularity for recent messages and (geometrically) less fine
+granularity for older messages. </p>
+
+<li> <p> The vertical axis displays the destination (or with the
+"-s" switch the sender) domain. Domains with the most messages are
+listed first. </p>
+
+</ul>
+
+<p> For example, in the output below we see the top 10 lines of
+the (mostly forged) sender domain distribution for captured spam
+in the "hold" queue: </p>
+
+<blockquote>
+<pre>
+$ qshape -s hold | head
+ T 5 10 20 40 80 160 320 640 1280 1280+
+ TOTAL 486 0 0 1 0 0 2 4 20 40 419
+ yahoo.com 14 0 0 1 0 0 0 0 1 0 12
+ extremepricecuts.net 13 0 0 0 0 0 0 0 2 0 11
+ ms35.hinet.net 12 0 0 0 0 0 0 0 0 1 11
+ winnersdaily.net 12 0 0 0 0 0 0 0 2 0 10
+ hotmail.com 11 0 0 0 0 0 0 0 0 1 10
+ worldnet.fr 6 0 0 0 0 0 0 0 0 0 6
+ ms41.hinet.net 6 0 0 0 0 0 0 0 0 0 6
+ osn.de 5 0 0 0 0 0 1 0 0 0 4
+</pre>
+</blockquote>
+
+<ul>
+
+<li> <p> The "T" column shows the total (in this case sender) count
+for each domain. The columns with numbers above them, show counts
+for messages aged fewer than that many minutes, but not younger
+than the age limit for the previous column. The row labeled "TOTAL"
+shows the total count for all domains. </p>
+
+<li> <p> In this example, there are 14 messages allegedly from
+yahoo.com, 1 between 10 and 20 minutes old, 1 between 320 and 640
+minutes old and 12 older than 1280 minutes (1440 minutes in a day).
+</p>
+
+</ul>
+
+<p> When the output is a terminal intermediate results showing the top 20
+domains (-n option) are displayed after every 1000 messages (-N option)
+and the final output also shows only the top 20 domains. This makes
+qshape useful even when the "deferred" queue is very large and it may
+otherwise take prohibitively long to read the entire "deferred" queue. </p>
+
+<p> By default, qshape shows statistics for the union of both the
+"incoming" and "active" queues which are the most relevant queues to
+look at when analyzing performance. </p>
+
+<p> One can request an alternate list of queues: </p>
+
+<blockquote>
+<pre>
+$ qshape deferred
+$ qshape incoming active deferred
+</pre>
+</blockquote>
+
+<p> this will show the age distribution of the "deferred" queue or
+the union of the "incoming", "active" and "deferred" queues. </p>
+
+<p> Command line options control the number of display "buckets",
+the age limit for the smallest bucket, display of parent domain
+counts and so on. The "-h" option outputs a summary of the available
+switches. </p>
+
+<h2><a name="trouble_shooting">Trouble shooting with qshape</a>
+</h2>
+
+<p> Large numbers in the qshape output represent a large number of
+messages that are destined to (or alleged to come from) a particular
+domain. It should be possible to tell at a glance which domains
+dominate the queue sender or recipient counts, approximately when
+a burst of mail started, and when it stopped. </p>
+
+<p> The problem destinations or sender domains appear near the top
+left corner of the output table. Remember that the "active" queue
+can accommodate up to 20000 ($qmgr_message_active_limit) messages.
+To check whether this limit has been reached, use: </p>
+
+<blockquote>
+<pre>
+$ qshape -s active <i>(show sender statistics)</i>
+</pre>
+</blockquote>
+
+<p> If the total sender count is below 20000 the "active" queue is
+not yet saturated, any high volume sender domains show near the
+top of the output.
+
+<p> With oqmgr(8) the "active" queue is also limited to at most 20000
+recipient addresses ($qmgr_message_recipient_limit). To check for
+exhaustion of this limit use: </p>
+
+<blockquote>
+<pre>
+$ qshape active <i>(show recipient statistics)</i>
+</pre>
+</blockquote>
+
+<p> Having found the high volume domains, it is often useful to
+search the logs for recent messages pertaining to the domains in
+question. </p>
+
+<blockquote>
+<pre>
+# Find deliveries to example.com
+#
+$ tail -10000 /var/log/maillog |
+ grep -E -i ': to=&lt;.*@example\.com&gt;,' |
+ less
+
+# Find messages from example.com
+#
+$ tail -10000 /var/log/maillog |
+ grep -E -i ': from=&lt;.*@example\.com&gt;,' |
+ less
+</pre>
+</blockquote>
+
+<p> You may want to drill in on some specific queue ids: </p>
+
+<blockquote>
+<pre>
+# Find all messages for a specific queue id.
+#
+$ tail -10000 /var/log/maillog | grep -E ': 2B2173FF68: '
+</pre>
+</blockquote>
+
+<p> Also look for queue manager warning messages in the log. These
+warnings can suggest strategies to reduce congestion. </p>
+
+<blockquote>
+<pre>
+$ grep -E 'qmgr.*(panic|fatal|error|warning):' /var/log/maillog
+</pre>
+</blockquote>
+
+<p> When all else fails try the Postfix mailing list for help, but
+please don't forget to include the top 10 or 20 lines of qshape(1)
+output. </p>
+
+<h2><a name="healthy">Example 1: Healthy queue</a></h2>
+
+<p> When looking at just the "incoming" and "active" queues, under
+normal conditions (no congestion) the "incoming" and "active" queues
+are nearly empty. Mail leaves the system almost as quickly as it
+comes in or is deferred without congestion in the "active" queue.
+</p>
+
+<blockquote>
+<pre>
+$ qshape <i>(show "incoming" and "active" queue status)</i>
+
+ T 5 10 20 40 80 160 320 640 1280 1280+
+ TOTAL 5 0 0 0 1 0 0 0 1 1 2
+ meri.uwasa.fi 5 0 0 0 1 0 0 0 1 1 2
+</pre>
+</blockquote>
+
+<p> If one looks at the two queues separately, the "incoming" queue
+is empty or perhaps briefly has one or two messages, while the
+"active" queue holds more messages and for a somewhat longer time:
+</p>
+
+<blockquote>
+<pre>
+$ qshape incoming
+
+ T 5 10 20 40 80 160 320 640 1280 1280+
+ TOTAL 0 0 0 0 0 0 0 0 0 0 0
+
+$ qshape active
+
+ T 5 10 20 40 80 160 320 640 1280 1280+
+ TOTAL 5 0 0 0 1 0 0 0 1 1 2
+ meri.uwasa.fi 5 0 0 0 1 0 0 0 1 1 2
+</pre>
+</blockquote>
+
+<h2><a name="dictionary_bounce">Example 2: Deferred queue full of
+dictionary attack bounces</a></h2>
+
+<p> This is from a server where recipient validation is not yet
+available for some of the hosted domains. Dictionary attacks on
+the unvalidated domains result in bounce backscatter. The bounces
+dominate the queue, but with proper tuning they do not saturate the
+"incoming" or "active" queues. The high volume of deferred mail is not
+a direct cause for alarm. </p>
+
+<blockquote>
+<pre>
+$ qshape deferred | head
+
+ T 5 10 20 40 80 160 320 640 1280 1280+
+ TOTAL 2234 4 2 5 9 31 57 108 201 464 1353
+ heyhihellothere.com 207 0 0 1 1 6 6 8 25 68 92
+ pleazerzoneprod.com 105 0 0 0 0 0 0 0 5 44 56
+ groups.msn.com 63 2 1 2 4 4 14 14 14 8 0
+ orion.toppoint.de 49 0 0 0 1 0 2 4 3 16 23
+ kali.com.cn 46 0 0 0 0 1 0 2 6 12 25
+ meri.uwasa.fi 44 0 0 0 0 1 0 2 8 11 22
+ gjr.paknet.com.pk 43 1 0 0 1 1 3 3 6 12 16
+ aristotle.algonet.se 41 0 0 0 0 0 1 2 11 12 15
+</pre>
+</blockquote>
+
+<p> The domains shown are mostly bulk-mailers and all the volume
+is the tail end of the time distribution, showing that short term
+arrival rates are moderate. Larger numbers and lower message ages
+are more indicative of current trouble. Old mail still going nowhere
+is largely harmless so long as the "active" and "incoming" queues are
+short. We can also see that the groups.msn.com undeliverables are
+low rate steady stream rather than a concentrated dictionary attack
+that is now over. </p>
+
+<blockquote>
+<pre>
+$ qshape -s deferred | head
+
+ T 5 10 20 40 80 160 320 640 1280 1280+
+ TOTAL 2193 4 4 5 8 33 56 104 205 465 1309
+ MAILER-DAEMON 1709 4 4 5 8 33 55 101 198 452 849
+ example.com 263 0 0 0 0 0 0 0 0 2 261
+ example.org 209 0 0 0 0 0 1 3 6 11 188
+ example.net 6 0 0 0 0 0 0 0 0 0 6
+ example.edu 3 0 0 0 0 0 0 0 0 0 3
+ example.gov 2 0 0 0 0 0 0 0 1 0 1
+ example.mil 1 0 0 0 0 0 0 0 0 0 1
+</pre>
+</blockquote>
+
+<p> Looking at the sender distribution, we see that as expected
+most of the messages are bounces. </p>
+
+<h2><a name="active_congestion">Example 3: Congestion in the active
+queue</a></h2>
+
+<p> This example is taken from a Feb 2004 discussion on the Postfix
+Users list. Congestion was reported with the
+"active" and "incoming" queues
+large and not shrinking despite very large delivery agent
+process limits. The thread is archived at:
+http://groups.google.com/groups?threadm=c0b7js$2r65$1@FreeBSD.csie.NCTU.edu.tw
+and
+http://archives.neohapsis.com/archives/postfix/2004-02/thread.html#1371
+</p>
+
+<p> Using an older version of qshape(1) it was quickly determined
+that all the messages were for just a few destinations: </p>
+
+<blockquote>
+<pre>
+$ qshape <i>(show "incoming" and "active" queue status)</i>
+
+ T A 5 10 20 40 80 160 320 320+
+ TOTAL 11775 9996 0 0 1 1 42 94 221 1420
+ user.sourceforge.net 7678 7678 0 0 0 0 0 0 0 0
+ lists.sourceforge.net 2313 2313 0 0 0 0 0 0 0 0
+ gzd.gotdns.com 102 0 0 0 0 0 0 0 2 100
+</pre>
+</blockquote>
+
+<p> The "A" column showed the count of messages in the "active" queue,
+and the numbered columns showed totals for the "deferred" queue. At
+10000 messages (Postfix 1.x "active" queue size limit) the "active" queue
+is full. The "incoming" queue was growing rapidly. </p>
+
+<p> With the trouble destinations clearly identified, the administrator
+quickly found and fixed the problem. It is substantially harder to
+glean the same information from the logs. While a careful reading
+of mailq(1) output should yield similar results, it is much harder
+to gauge the magnitude of the problem by looking at the queue
+one message at a time. </p>
+
+<h2><a name="backlog">Example 4: High volume destination backlog</a></h2>
+
+<p> When a site you send a lot of email to is down or slow, mail
+messages will rapidly build up in the "deferred" queue, or worse, in
+the "active" queue. The qshape output will show large numbers for
+the destination domain in all age buckets that overlap the starting
+time of the problem: </p>
+
+<blockquote>
+<pre>
+$ qshape deferred | head
+
+ T 5 10 20 40 80 160 320 640 1280 1280+
+ TOTAL 5000 200 200 400 800 1600 1000 200 200 200 200
+ highvolume.com 4000 160 160 320 640 1280 1440 0 0 0 0
+ ...
+</pre>
+</blockquote>
+
+<p> Here the "highvolume.com" destination is continuing to accumulate
+deferred mail. The "incoming" and "active" queues are fine, but the
+"deferred" queue started growing some time between 1 and 2 hours ago
+and continues to grow. </p>
+
+<p> If the high volume destination is not down, but is instead
+slow, one might see similar congestion in the "active" queue.
+"Active" queue congestion is a greater cause for alarm; one might need to
+take measures to ensure that the mail is deferred instead or even
+add an access(5) rule asking the sender to try again later. </p>
+
+<p> If a high volume destination exhibits frequent bursts of consecutive
+connections refused by all MX hosts or "421 Server busy errors", it
+is possible for the queue manager to mark the destination as "dead"
+despite the transient nature of the errors. The destination will be
+retried again after the expiration of a $minimal_backoff_time timer.
+If the error bursts are frequent enough it may be that only a small
+quantity of email is delivered before the destination is again marked
+"dead". In some cases enabling static (not on demand) connection
+caching by listing the appropriate nexthop domain in a table included in
+"smtp_connection_cache_destinations" may help to reduce the error rate,
+because most messages will re-use existing connections. </p>
+
+<p> The MTA that has been observed most frequently to exhibit such
+bursts of errors is Microsoft Exchange, which refuses connections
+under load. Some proxy virus scanners in front of the Exchange
+server propagate the refused connection to the client as a "421"
+error. </p>
+
+<p> Note that it is now possible to configure Postfix to exhibit similarly
+erratic behavior by misconfiguring the anvil(8) service. Do not use
+anvil(8) for steady-state rate limiting, its purpose is (unintentional)
+DoS prevention and the rate limits set should be very generous! </p>
+
+<p> If one finds oneself needing to deliver a high volume of mail to a
+destination that exhibits frequent brief bursts of errors and connection
+caching does not solve the problem, there is a subtle workaround. </p>
+
+<ul>
+
+<li> <p> Postfix version 2.5 and later: </p>
+
+<ul>
+
+<li> <p> In master.cf set up a dedicated clone of the "smtp" transport
+for the destination in question. In the example below we will call
+it "fragile". </p>
+
+<li> <p> In master.cf configure a reasonable process limit for the
+cloned smtp transport (a number in the 10-20 range is typical). </p>
+
+<li> <p> IMPORTANT!!! In main.cf configure a large per-destination
+pseudo-cohort failure limit for the cloned smtp transport. </p>
+
+<pre>
+/etc/postfix/main.cf:
+ transport_maps = hash:/etc/postfix/transport
+ fragile_destination_concurrency_failed_cohort_limit = 100
+ fragile_destination_concurrency_limit = 20
+
+/etc/postfix/transport:
+ example.com fragile:
+
+/etc/postfix/master.cf:
+ # service type private unpriv chroot wakeup maxproc command
+ fragile unix - - n - 20 smtp
+</pre>
+
+<p> See also the documentation for
+default_destination_concurrency_failed_cohort_limit and
+default_destination_concurrency_limit. </p>
+
+</ul>
+
+<li> <p> Earlier Postfix versions: </p>
+
+<ul>
+
+<li> <p> In master.cf set up a dedicated clone of the "smtp"
+transport for the destination in question. In the example below
+we will call it "fragile". </p>
+
+<li> <p> In master.cf configure a reasonable process limit for the
+transport (a number in the 10-20 range is typical). </p>
+
+<li> <p> IMPORTANT!!! In main.cf configure a very large initial
+and destination concurrency limit for this transport (say 2000). </p>
+
+<pre>
+/etc/postfix/main.cf:
+ transport_maps = hash:/etc/postfix/transport
+ initial_destination_concurrency = 2000
+ fragile_destination_concurrency_limit = 2000
+
+/etc/postfix/transport:
+ example.com fragile:
+
+/etc/postfix/master.cf:
+ # service type private unpriv chroot wakeup maxproc command
+ fragile unix - - n - 20 smtp
+</pre>
+
+<p> See also the documentation for default_destination_concurrency_limit.
+</p>
+
+</ul>
+
+</ul>
+
+<p> The effect of this configuration is that up to 2000
+consecutive errors are tolerated without marking the destination
+dead, while the total concurrency remains reasonable (10-20
+processes). This trick is only for a very specialized situation:
+high volume delivery into a channel with multi-error bursts
+that is capable of high throughput, but is repeatedly throttled by
+the bursts of errors. </p>
+
+<p> When a destination is unable to handle the load even after the
+Postfix process limit is reduced to 1, a desperate measure is to
+insert brief delays between delivery attempts. </p>
+
+<ul>
+
+<li> <p> Postfix version 2.5 and later: </p>
+
+<ul>
+
+<li> <p> In master.cf set up a dedicated clone of the "smtp" transport
+for the problem destination. In the example below we call it "slow".
+</p>
+
+<li> <p> In main.cf configure a short delay between deliveries to
+the same destination. </p>
+
+<pre>
+/etc/postfix/main.cf:
+ transport_maps = hash:/etc/postfix/transport
+ slow_destination_rate_delay = 1
+ slow_destination_concurrency_failed_cohort_limit = 100
+
+/etc/postfix/transport:
+ example.com slow:
+
+/etc/postfix/master.cf:
+ # service type private unpriv chroot wakeup maxproc command
+ slow unix - - n - - smtp
+</pre>
+
+</ul>
+
+<p> See also the documentation for default_destination_rate_delay. </p>
+
+<p> This solution forces the Postfix smtp(8) client to wait for
+$slow_destination_rate_delay seconds between deliveries to the same
+destination. </p>
+
+<p> IMPORTANT!! The large slow_destination_concurrency_failed_cohort_limit
+value is needed. This prevents Postfix from deferring all mail for
+the same destination after only one connection or handshake error
+(the reason for this is that non-zero slow_destination_rate_delay
+forces a per-destination concurrency of 1). </p>
+
+<li> <p> Earlier Postfix versions: </p>
+
+<ul>
+
+<li> <p> In the transport map entry for the problem destination,
+specify a dead host as the primary nexthop. </p>
+
+<li> <p> In the master.cf entry for the transport specify the
+problem destination as the fallback_relay and specify a small
+smtp_connect_timeout value. </p>
+
+<pre>
+/etc/postfix/main.cf:
+ transport_maps = hash:/etc/postfix/transport
+
+/etc/postfix/transport:
+ example.com slow:[dead.host]
+
+/etc/postfix/master.cf:
+ # service type private unpriv chroot wakeup maxproc command
+ slow unix - - n - 1 smtp
+ -o fallback_relay=problem.example.com
+ -o smtp_connect_timeout=1
+ -o smtp_connection_cache_on_demand=no
+</pre>
+
+</ul>
+
+<p> This solution forces the Postfix smtp(8) client to wait for
+$smtp_connect_timeout seconds between deliveries. The connection
+caching feature is disabled to prevent the client from skipping
+over the dead host. </p>
+
+</ul>
+
+<h2><a name="queues">Postfix queue directories</a></h2>
+
+<p> The following sections describe Postfix queues: their purpose,
+what normal behavior looks like, and how to diagnose abnormal
+behavior. </p>
+
+<h3> <a name="maildrop_queue"> The "maildrop" queue </a> </h3>
+
+<p> Messages that have been submitted via the Postfix sendmail(1)
+command, but not yet brought into the main Postfix queue by the
+pickup(8) service, await processing in the "maildrop" queue. Messages
+can be added to the "maildrop" queue even when the Postfix system
+is not running. They will begin to be processed once Postfix is
+started. </p>
+
+<p> The "maildrop" queue is drained by the single threaded pickup(8)
+service scanning the queue directory periodically or when notified
+of new message arrival by the postdrop(1) program. The postdrop(1)
+program is a setgid helper that allows the unprivileged Postfix
+sendmail(1) program to inject mail into the "maildrop" queue and
+to notify the pickup(8) service of its arrival. </p>
+
+<p> All mail that enters the main Postfix queue does so via the
+cleanup(8) service. The cleanup service is responsible for envelope
+and header rewriting, header and body regular expression checks,
+automatic bcc recipient processing, milter content processing, and
+reliable insertion of the message into the Postfix "incoming" queue. </p>
+
+<p> In the absence of excessive CPU consumption in cleanup(8) header
+or body regular expression checks or other software consuming all
+available CPU resources, Postfix performance is disk I/O bound.
+The rate at which the pickup(8) service can inject messages into
+the queue is largely determined by disk access times, since the
+cleanup(8) service must commit the message to stable storage before
+returning success. The same is true of the postdrop(1) program
+writing the message to the "maildrop" directory. </p>
+
+<p> As the pickup service is single threaded, it can only deliver
+one message at a time at a rate that does not exceed the reciprocal
+disk I/O latency (+ CPU if not negligible) of the cleanup service.
+</p>
+
+<p> Congestion in this queue is indicative of an excessive local message
+submission rate or perhaps excessive CPU consumption in the cleanup(8)
+service due to excessive body_checks, or (Postfix &ge; 2.3) high latency
+milters. </p>
+
+<p> Note, that once the "active" queue is full, the cleanup service
+will attempt to slow down message injection by pausing $in_flow_delay
+for each message. In this case "maildrop" queue congestion may be
+a consequence of congestion downstream, rather than a problem in
+its own right. </p>
+
+<p> Note, you should not attempt to deliver large volumes of mail via
+the pickup(8) service. High volume sites should avoid using "simple"
+content filters that re-inject scanned mail via Postfix sendmail(1)
+and postdrop(1). </p>
+
+<p> A high arrival rate of locally submitted mail may be an indication
+of an uncaught forwarding loop, or a run-away notification program.
+Try to keep the volume of local mail injection to a moderate level.
+</p>
+
+<p> The "postsuper -r" command can place selected messages into
+the "maildrop" queue for reprocessing. This is most useful for
+resetting any stale content_filter settings. Requeuing a large number
+of messages using "postsuper -r" can clearly cause a spike in the
+size of the "maildrop" queue. </p>
+
+<h3> <a name="hold_queue"> The "hold" queue </a> </h3>
+
+<p> The administrator can define "smtpd" access(5) policies, or
+cleanup(8) header/body checks that cause messages to be automatically
+diverted from normal processing and placed indefinitely in the
+"hold" queue. Messages placed in the "hold" queue stay there until
+the administrator intervenes. No periodic delivery attempts are
+made for messages in the "hold" queue. The postsuper(1) command
+can be used to manually release messages into the "deferred" queue.
+</p>
+
+<p> Messages can potentially stay in the "hold" queue longer than
+$maximal_queue_lifetime. If such "old" messages need to be released from
+the "hold" queue, they should typically be moved into the "maildrop" queue
+using "postsuper -r", so that the message gets a new timestamp and
+is given more than one opportunity to be delivered. Messages that are
+"young" can be moved directly into the "deferred" queue using
+"postsuper -H". </p>
+
+<p> The "hold" queue plays little role in Postfix performance, and
+monitoring of the "hold" queue is typically more closely motivated
+by tracking spam and malware, than by performance issues. </p>
+
+<h3> <a name="incoming_queue"> The "incoming" queue </a> </h3>
+
+<p> All new mail entering the Postfix queue is written by the
+cleanup(8) service into the "incoming" queue. New queue files are
+created owned by the "postfix" user with an access bitmask (or
+mode) of 0600. Once a queue file is ready for further processing
+the cleanup(8) service changes the queue file mode to 0700 and
+notifies the queue manager of new mail arrival. The queue manager
+ignores incomplete queue files whose mode is 0600, as these are
+still being written by cleanup. </p>
+
+<p> The queue manager scans the "incoming" queue bringing any new
+mail into the "active" queue if the "active" queue resource limits
+have not been exceeded. By default, the "active" queue accommodates
+at most 20000 messages. Once the "active" queue message limit is
+reached, the queue manager stops scanning the "incoming" queue
+(and the "deferred" queue, see below). </p>
+
+<p> Under normal conditions the "incoming" queue is nearly empty (has
+only mode 0600 files), with the queue manager able to import new
+messages into the "active" queue as soon as they become available.
+</p>
+
+<p> The "incoming" queue grows when the message input rate spikes
+above the rate at which the queue manager can import messages into
+the "active" queue. The main factors slowing down the queue manager
+are disk I/O and lookup queries to the trivial-rewrite service. If the queue
+manager is routinely not keeping up, consider not using "slow"
+lookup services (MySQL, LDAP, ...) for transport lookups or speeding
+up the hosts that provide the lookup service. If the problem is I/O
+starvation, consider striping the queue over more disks, faster controllers
+with a battery write cache, or other hardware improvements. At the very
+least, make sure that the queue directory is mounted with the "noatime"
+option if applicable to the underlying filesystem. </p>
+
+<p> The in_flow_delay parameter is used to clamp the input rate
+when the queue manager starts to fall behind. The cleanup(8) service
+will pause for $in_flow_delay seconds before creating a new queue
+file if it cannot obtain a "token" from the queue manager. </p>
+
+<p> Since the number of cleanup(8) processes is limited in most
+cases by the SMTP server concurrency, the input rate can exceed
+the output rate by at most "SMTP connection count" / $in_flow_delay
+messages per second. </p>
+
+<p> With a default process limit of 100, and an in_flow_delay of
+1s, the coupling is strong enough to limit a single run-away injector
+to 1 message per second, but is not strong enough to deflect an
+excessive input rate from many sources at the same time. </p>
+
+<p> If a server is being hammered from multiple directions, consider
+raising the in_flow_delay to 10 seconds, but only if the "incoming" queue
+is growing even while the "active" queue is not full and the
+trivial-rewrite service is using a fast transport lookup mechanism.
+</p>
+
+<h3> <a name="active_queue"> The "active" queue </a> </h3>
+
+<p> The queue manager is a delivery agent scheduler; it works to
+ensure fast and fair delivery of mail to all destinations within
+designated resource limits. </p>
+
+<p> The "active" queue is somewhat analogous to an operating system's
+process run queue. Messages in the "active" queue are ready to be
+sent (runnable), but are not necessarily in the process of being
+sent (running). </p>
+
+<p> While most Postfix administrators think of the "active" queue
+as a directory on disk, the real "active" queue is a set of data
+structures in the memory of the queue manager process. </p>
+
+<p> Messages in the "maildrop", "hold", "incoming" and "deferred" queues
+(see below) do not occupy memory; they are safely stored on
+disk waiting for their turn to be processed. The envelope information
+for messages in the "active" queue is managed in memory, allowing
+the queue manager to do global scheduling, allocating available
+delivery agent processes to an appropriate message in the "active" queue. </p>
+
+<p> Within the "active" queue, (multi-recipient) messages are broken
+up into groups of recipients that share the same transport/nexthop
+combination; the group size is capped by the transport's recipient
+concurrency limit. </p>
+
+<p> Multiple recipient groups (from one or more messages) are queued
+for delivery grouped by transport/nexthop combination. The
+<b>destination</b> concurrency limit for the transports caps the number
+of simultaneous delivery attempts for each nexthop. Transports with
+a <b>recipient</b> concurrency limit of 1 are special: these are grouped
+by the actual recipient address rather than the nexthop, yielding
+per-recipient concurrency limits rather than per-domain
+concurrency limits. Per-recipient limits are appropriate when
+performing final delivery to mailboxes rather than when relaying
+to a remote server. </p>
+
+<p> Congestion occurs in the "active" queue when one or more destinations
+drain slower than the corresponding message input rate. </p>
+
+<p> Input into the "active" queue comes both from new mail in the "incoming" queue,
+and retries of mail in the "deferred" queue. Should the "deferred" queue
+get really large, retries of old mail can dominate the arrival
+rate of new mail. Systems with more CPU, faster disks and more network
+bandwidth can deal with larger "deferred" queues, but as a rule of thumb
+the "deferred" queue scales to somewhere between 100,000 and 1,000,000
+messages with good performance unlikely above that "limit". Systems with
+queues this large should typically stop accepting new mail, or put the
+backlog "on hold" until the underlying issue is fixed (provided that
+there is enough capacity to handle just the new mail). </p>
+
+<p> When a destination is down for some time, the queue manager will
+mark it dead, and immediately defer all mail for the destination without
+trying to assign it to a delivery agent. In this case the messages
+will quickly leave the "active" queue and end up in the "deferred" queue
+(with Postfix &lt; 2.4, this is done directly by the queue manager,
+with Postfix &ge; 2.4 this is done via the "retry" delivery agent). </p>
+
+<p> When the destination is instead simply slow, or there is a problem
+causing an excessive arrival rate the "active" queue will grow and will
+become dominated by mail to the congested destination. </p>
+
+<p> The only way to reduce congestion is to either reduce the input
+rate or increase the throughput. Increasing the throughput requires
+either increasing the concurrency or reducing the latency of
+deliveries. </p>
+
+<p> For high volume sites a key tuning parameter is the number of
+"smtp" delivery agents allocated to the "smtp" and "relay" transports.
+High volume sites tend to send to many different destinations, many
+of which may be down or slow, so a good fraction of the available
+delivery agents will be blocked waiting for slow sites. Also mail
+destined across the globe will incur large SMTP command-response
+latencies, so high message throughput can only be achieved with
+more concurrent delivery agents. </p>
+
+<p> The default "smtp" process limit of 100 is good enough for most
+sites, and may even need to be lowered for sites with low bandwidth
+connections (no use increasing concurrency once the network pipe
+is full). When one finds that the queue is growing on an "idle"
+system (CPU, disk I/O and network not exhausted) the remaining
+reason for congestion is insufficient concurrency in the face of
+a high average latency. If the number of outbound SMTP connections
+(either ESTABLISHED or SYN_SENT) reaches the process limit, mail
+is draining slowly and the system and network are not loaded, raise
+the "smtp" and/or "relay" process limits! </p>
+
+<p> When a high volume destination is served by multiple MX hosts with
+typically low delivery latency, performance can suffer dramatically when
+one of the MX hosts is unresponsive and SMTP connections to that host
+timeout. For example, if there are 2 equal weight MX hosts, the SMTP
+connection timeout is 30 seconds and one of the MX hosts is down, the
+average SMTP connection will take approximately 15 seconds to complete.
+With a default per-destination concurrency limit of 20 connections,
+throughput falls to just over 1 message per second. </p>
+
+<p> The best way to avoid bottlenecks when one or more MX hosts is
+non-responsive is to use connection caching. Connection caching was
+introduced with Postfix 2.2 and is by default enabled on demand for
+destinations with a backlog of mail in the "active" queue. When connection
+caching is in effect for a particular destination, established connections
+are re-used to send additional messages, this reduces the number of
+connections made per message delivery and maintains good throughput even
+in the face of partial unavailability of the destination's MX hosts. </p>
+
+<p> If connection caching is not available (Postfix &lt; 2.2) or does
+not provide a sufficient latency reduction, especially for the "relay"
+transport used to forward mail to "your own" domains, consider setting
+lower than default SMTP connection timeouts (1-5 seconds) and higher
+than default destination concurrency limits. This will further reduce
+latency and provide more concurrency to maintain throughput should
+latency rise. </p>
+
+<p> Setting high concurrency limits to domains that are not your own may
+be viewed as hostile by the receiving system, and steps may be taken
+to prevent you from monopolizing the destination system's resources.
+The defensive measures may substantially reduce your throughput or block
+access entirely. Do not set aggressive concurrency limits to remote
+domains without coordinating with the administrators of the target
+domain. </p>
+
+<p> If necessary, dedicate and tune custom transports for selected high
+volume destinations. The "relay" transport is provided for forwarding mail
+to domains for which your server is a primary or backup MX host. These can
+make up a substantial fraction of your email traffic. Use the "relay" and
+not the "smtp" transport to send email to these domains. Using the "relay"
+transport allocates a separate delivery agent pool to these destinations
+and allows separate tuning of timeouts and concurrency limits. </p>
+
+<p> Another common cause of congestion is unwarranted flushing of the
+entire "deferred" queue. The "deferred" queue holds messages that are likely
+to fail to be delivered and are also likely to be slow to fail delivery
+(time out). As a result the most common reaction to a large "deferred" queue
+(flush it!) is more than likely counter-productive, and typically makes
+the congestion worse. Do not flush the "deferred" queue unless you expect
+that most of its content has recently become deliverable (e.g. relayhost
+back up after an outage)! </p>
+
+<p> Note that whenever the queue manager is restarted, there may
+already be messages in the "active" queue directory, but the "real"
+"active" queue in memory is empty. In order to recover the in-memory
+state, the queue manager moves all the "active" queue messages
+back into the "incoming" queue, and then uses its normal "incoming" queue
+scan to refill the "active" queue. The process of moving all
+the messages back and forth, redoing transport table (trivial-rewrite(8)
+resolve service) lookups, and re-importing the messages back into
+memory is expensive. At all costs, avoid frequent restarts of the
+queue manager (e.g. via frequent execution of "postfix reload"). </p>
+
+<h3> <a name="deferred_queue"> The "deferred" queue </a> </h3>
+
+<p> When all the deliverable recipients for a message are delivered,
+and for some recipients delivery failed for a transient reason (it
+might succeed later), the message is placed in the "deferred" queue.
+</p>
+
+<p> The queue manager scans the "deferred" queue periodically. The scan
+interval is controlled by the queue_run_delay parameter. While a "deferred" queue
+scan is in progress, if an "incoming" queue scan is also in progress
+(ideally these are brief since the "incoming" queue should be short), the
+queue manager alternates between looking for messages in the "incoming" queue
+and in the "deferred" queue. This "round-robin" strategy prevents
+starvation of either the "incoming" or the "deferred" queues. </p>
+
+<p> Each "deferred" queue scan only brings a fraction of the "deferred" queue
+back into the "active" queue for a retry. This is because each
+message in the "deferred" queue is assigned a "cool-off" time when
+it is deferred. This is done by time-warping the modification
+time of the queue file into the future. The queue file is not
+eligible for a retry if its modification time is not yet reached.
+</p>
+
+<p> The "cool-off" time is at least $minimal_backoff_time and at
+most $maximal_backoff_time. The next retry time is set by doubling
+the message's age in the queue, and adjusting up or down to lie
+within the limits. This means that young messages are initially
+retried more often than old messages. </p>
+
+<p> If a high volume site routinely has large "deferred" queues, it
+may be useful to adjust the queue_run_delay, minimal_backoff_time and
+maximal_backoff_time to provide short enough delays on first failure
+(Postfix &ge; 2.4 has a sensibly low minimal backoff time by default),
+with perhaps longer delays after multiple failures, to reduce the
+retransmission rate of old messages and thereby reduce the quantity
+of previously deferred mail in the "active" queue. If you want a really
+low minimal_backoff_time, you may also want to lower queue_run_delay,
+but understand that more frequent scans will increase the demand for
+disk I/O. </p>
+
+<p> One common cause of large "deferred" queues is failure to validate
+recipients at the SMTP input stage. Since spammers routinely launch
+dictionary attacks from unrepliable sender addresses, the bounces
+for invalid recipient addresses clog the "deferred" queue (and at high
+volumes proportionally clog the "active" queue). Recipient validation
+is strongly recommended through use of the local_recipient_maps and
+relay_recipient_maps parameters. Even when bounces drain quickly they
+inundate innocent victims of forgery with unwanted email. To avoid
+this, do not accept mail for invalid recipients. </p>
+
+<p> When a host with lots of deferred mail is down for some time,
+it is possible for the entire "deferred" queue to reach its retry
+time simultaneously. This can lead to a very full "active" queue once
+the host comes back up. The phenomenon can repeat approximately
+every maximal_backoff_time seconds if the messages are again deferred
+after a brief burst of congestion. Perhaps, a future Postfix release
+will add a random offset to the retry time (or use a combination
+of strategies) to reduce the odds of repeated complete "deferred" queue
+flushes. </p>
+
+<h2><a name="credits">Credits</a></h2>
+
+<p> The qshape(1) program was developed by Victor Duchovni of Morgan
+Stanley, who also wrote the initial version of this document. </p>
+
+</body>
+
+</html>