summaryrefslogtreecommitdiffstats
path: root/doc/src/sgml/html/tutorial-agg.html
blob: 29aaebe52bb855b70379739730c58ef1c3734441 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>2.7. Aggregate Functions</title><link rel="stylesheet" type="text/css" href="stylesheet.css" /><link rev="made" href="pgsql-docs@lists.postgresql.org" /><meta name="generator" content="DocBook XSL Stylesheets Vsnapshot" /><link rel="prev" href="tutorial-join.html" title="2.6. Joins Between Tables" /><link rel="next" href="tutorial-update.html" title="2.8. Updates" /></head><body id="docContent" class="container-fluid col-10"><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="5" align="center">2.7. Aggregate Functions</th></tr><tr><td width="10%" align="left"><a accesskey="p" href="tutorial-join.html" title="2.6. Joins Between Tables">Prev</a> </td><td width="10%" align="left"><a accesskey="u" href="tutorial-sql.html" title="Chapter 2. The SQL Language">Up</a></td><th width="60%" align="center">Chapter 2. The <acronym class="acronym">SQL</acronym> Language</th><td width="10%" align="right"><a accesskey="h" href="index.html" title="PostgreSQL 16.3 Documentation">Home</a></td><td width="10%" align="right"> <a accesskey="n" href="tutorial-update.html" title="2.8. Updates">Next</a></td></tr></table><hr /></div><div class="sect1" id="TUTORIAL-AGG"><div class="titlepage"><div><div><h2 class="title" style="clear: both">2.7. Aggregate Functions <a href="#TUTORIAL-AGG" class="id_link">#</a></h2></div></div></div><a id="id-1.4.4.8.2" class="indexterm"></a><p>
    Like  most  other relational database products,
    <span class="productname">PostgreSQL</span> supports
    <em class="firstterm">aggregate functions</em>.
    An aggregate function computes a single result from multiple input rows.
    For example, there are aggregates to compute the
    <code class="function">count</code>, <code class="function">sum</code>,
    <code class="function">avg</code> (average), <code class="function">max</code> (maximum) and
    <code class="function">min</code> (minimum) over a set of rows.
   </p><p>
    As an example, we can find the highest low-temperature reading anywhere
    with:

</p><pre class="programlisting">
SELECT max(temp_lo) FROM weather;
</pre><p>

</p><pre class="screen">
 max
-----
  46
(1 row)
</pre><p>
   </p><p>
    <a id="id-1.4.4.8.5.1" class="indexterm"></a>

    If we wanted to know what city (or cities) that reading occurred in,
    we might try:

</p><pre class="programlisting">
SELECT city FROM weather WHERE temp_lo = max(temp_lo);     <em class="lineannotation"><span class="lineannotation">WRONG</span></em>
</pre><p>

    but this will not work since the aggregate
    <code class="function">max</code> cannot be used in the
    <code class="literal">WHERE</code> clause.  (This restriction exists because
    the <code class="literal">WHERE</code> clause determines which rows will be
    included in the aggregate calculation; so obviously it has to be evaluated
    before aggregate functions are computed.)
    However, as is often the case
    the query can be restated to accomplish the desired result, here
    by using a <em class="firstterm">subquery</em>:

</p><pre class="programlisting">
SELECT city FROM weather
    WHERE temp_lo = (SELECT max(temp_lo) FROM weather);
</pre><p>

</p><pre class="screen">
     city
---------------
 San Francisco
(1 row)
</pre><p>

    This is OK because the subquery is an independent computation
    that computes its own aggregate separately from what is happening
    in the outer query.
   </p><p>
    <a id="id-1.4.4.8.6.1" class="indexterm"></a>
    <a id="id-1.4.4.8.6.2" class="indexterm"></a>

    Aggregates are also very useful in combination with <code class="literal">GROUP
    BY</code> clauses.  For example, we can get the number of readings
    and the maximum low temperature observed in each city with:

</p><pre class="programlisting">
SELECT city, count(*), max(temp_lo)
    FROM weather
    GROUP BY city;
</pre><p>

</p><pre class="screen">
     city      | count | max
---------------+-------+-----
 Hayward       |     1 |  37
 San Francisco |     2 |  46
(2 rows)
</pre><p>

    which gives us one output row per city.  Each aggregate result is
    computed over the table rows matching that city.
    We can filter these grouped
    rows using <code class="literal">HAVING</code>:

</p><pre class="programlisting">
SELECT city, count(*), max(temp_lo)
    FROM weather
    GROUP BY city
    HAVING max(temp_lo) &lt; 40;
</pre><p>

</p><pre class="screen">
  city   | count | max
---------+-------+-----
 Hayward |     1 |  37
(1 row)
</pre><p>

    which gives us the same results for only the cities that have all
    <code class="structfield">temp_lo</code> values below 40.  Finally, if we only care about
    cities whose
    names begin with <span class="quote"><span class="quote"><code class="literal">S</code></span></span>, we might do:

</p><pre class="programlisting">
SELECT city, count(*), max(temp_lo)
    FROM weather
    WHERE city LIKE 'S%'            -- <span id="co.tutorial-agg-like"></span>(1)
    GROUP BY city;
</pre><p>

</p><pre class="screen">
     city      | count | max
---------------+-------+-----
 San Francisco |     2 |  46
(1 row)
</pre><p>
   </p><div class="calloutlist"><table border="0" summary="Callout list"><tr><td width="5%" valign="top" align="left"><p><a href="#co.tutorial-agg-like">(1)</a> </p></td><td valign="top" align="left"><p>
      The <code class="literal">LIKE</code> operator does pattern matching and
      is explained in <a class="xref" href="functions-matching.html" title="9.7. Pattern Matching">Section 9.7</a>.
     </p></td></tr></table></div><p>
   </p><p>
    It is important to understand the interaction between aggregates and
    <acronym class="acronym">SQL</acronym>'s <code class="literal">WHERE</code> and <code class="literal">HAVING</code> clauses.
    The fundamental difference between <code class="literal">WHERE</code> and
    <code class="literal">HAVING</code> is this: <code class="literal">WHERE</code> selects
    input rows before groups and aggregates are computed (thus, it controls
    which rows go into the aggregate computation), whereas
    <code class="literal">HAVING</code> selects group rows after groups and
    aggregates are computed.  Thus, the
    <code class="literal">WHERE</code> clause must not contain aggregate functions;
    it makes no sense to try to use an aggregate to determine which rows
    will be inputs to the aggregates.  On the other hand, the
    <code class="literal">HAVING</code> clause always contains aggregate functions.
    (Strictly speaking, you are allowed to write a <code class="literal">HAVING</code>
    clause that doesn't use aggregates, but it's seldom useful. The same
    condition could be used more efficiently at the <code class="literal">WHERE</code>
    stage.)
   </p><p>
    In the previous example, we can apply the city name restriction in
    <code class="literal">WHERE</code>, since it needs no aggregate.  This is
    more efficient than adding the restriction to <code class="literal">HAVING</code>,
    because we avoid doing the grouping and aggregate calculations
    for all rows that fail the <code class="literal">WHERE</code> check.
   </p><p>
    Another way to select the rows that go into an aggregate
    computation is to use <code class="literal">FILTER</code>, which is a
    per-aggregate option:

</p><pre class="programlisting">
SELECT city, count(*) FILTER (WHERE temp_lo &lt; 45), max(temp_lo)
    FROM weather
    GROUP BY city;
</pre><p>

</p><pre class="screen">
     city      | count | max
---------------+-------+-----
 Hayward       |     1 |  37
 San Francisco |     1 |  46
(2 rows)
</pre><p>

    <code class="literal">FILTER</code> is much like <code class="literal">WHERE</code>,
    except that it removes rows only from the input of the particular
    aggregate function that it is attached to.
    Here, the <code class="literal">count</code> aggregate counts only
    rows with <code class="literal">temp_lo</code> below 45; but the
    <code class="literal">max</code> aggregate is still applied to all rows,
    so it still finds the reading of 46.
   </p></div><div class="navfooter"><hr /><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="tutorial-join.html" title="2.6. Joins Between Tables">Prev</a> </td><td width="20%" align="center"><a accesskey="u" href="tutorial-sql.html" title="Chapter 2. The SQL Language">Up</a></td><td width="40%" align="right"> <a accesskey="n" href="tutorial-update.html" title="2.8. Updates">Next</a></td></tr><tr><td width="40%" align="left" valign="top">2.6. Joins Between Tables </td><td width="20%" align="center"><a accesskey="h" href="index.html" title="PostgreSQL 16.3 Documentation">Home</a></td><td width="40%" align="right" valign="top"> 2.8. Updates</td></tr></table></div></body></html>