summaryrefslogtreecommitdiffstats
path: root/doc/src/sgml/html/tutorial-agg.html
diff options
context:
space:
mode:
authorDaniel Baumann <daniel.baumann@progress-linux.org>2024-04-13 13:44:03 +0000
committerDaniel Baumann <daniel.baumann@progress-linux.org>2024-04-13 13:44:03 +0000
commit293913568e6a7a86fd1479e1cff8e2ecb58d6568 (patch)
treefc3b469a3ec5ab71b36ea97cc7aaddb838423a0c /doc/src/sgml/html/tutorial-agg.html
parentInitial commit. (diff)
downloadpostgresql-16-293913568e6a7a86fd1479e1cff8e2ecb58d6568.tar.xz
postgresql-16-293913568e6a7a86fd1479e1cff8e2ecb58d6568.zip
Adding upstream version 16.2.upstream/16.2
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'doc/src/sgml/html/tutorial-agg.html')
-rw-r--r--doc/src/sgml/html/tutorial-agg.html172
1 files changed, 172 insertions, 0 deletions
diff --git a/doc/src/sgml/html/tutorial-agg.html b/doc/src/sgml/html/tutorial-agg.html
new file mode 100644
index 0000000..c2c1f9a
--- /dev/null
+++ b/doc/src/sgml/html/tutorial-agg.html
@@ -0,0 +1,172 @@
+<?xml version="1.0" encoding="UTF-8" standalone="no"?>
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>2.7. Aggregate Functions</title><link rel="stylesheet" type="text/css" href="stylesheet.css" /><link rev="made" href="pgsql-docs@lists.postgresql.org" /><meta name="generator" content="DocBook XSL Stylesheets Vsnapshot" /><link rel="prev" href="tutorial-join.html" title="2.6. Joins Between Tables" /><link rel="next" href="tutorial-update.html" title="2.8. Updates" /></head><body id="docContent" class="container-fluid col-10"><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="5" align="center">2.7. Aggregate Functions</th></tr><tr><td width="10%" align="left"><a accesskey="p" href="tutorial-join.html" title="2.6. Joins Between Tables">Prev</a> </td><td width="10%" align="left"><a accesskey="u" href="tutorial-sql.html" title="Chapter 2. The SQL Language">Up</a></td><th width="60%" align="center">Chapter 2. The <acronym class="acronym">SQL</acronym> Language</th><td width="10%" align="right"><a accesskey="h" href="index.html" title="PostgreSQL 16.2 Documentation">Home</a></td><td width="10%" align="right"> <a accesskey="n" href="tutorial-update.html" title="2.8. Updates">Next</a></td></tr></table><hr /></div><div class="sect1" id="TUTORIAL-AGG"><div class="titlepage"><div><div><h2 class="title" style="clear: both">2.7. Aggregate Functions <a href="#TUTORIAL-AGG" class="id_link">#</a></h2></div></div></div><a id="id-1.4.4.8.2" class="indexterm"></a><p>
+ Like most other relational database products,
+ <span class="productname">PostgreSQL</span> supports
+ <em class="firstterm">aggregate functions</em>.
+ An aggregate function computes a single result from multiple input rows.
+ For example, there are aggregates to compute the
+ <code class="function">count</code>, <code class="function">sum</code>,
+ <code class="function">avg</code> (average), <code class="function">max</code> (maximum) and
+ <code class="function">min</code> (minimum) over a set of rows.
+ </p><p>
+ As an example, we can find the highest low-temperature reading anywhere
+ with:
+
+</p><pre class="programlisting">
+SELECT max(temp_lo) FROM weather;
+</pre><p>
+
+</p><pre class="screen">
+ max
+-----
+ 46
+(1 row)
+</pre><p>
+ </p><p>
+ <a id="id-1.4.4.8.5.1" class="indexterm"></a>
+
+ If we wanted to know what city (or cities) that reading occurred in,
+ we might try:
+
+</p><pre class="programlisting">
+SELECT city FROM weather WHERE temp_lo = max(temp_lo); <em class="lineannotation"><span class="lineannotation">WRONG</span></em>
+</pre><p>
+
+ but this will not work since the aggregate
+ <code class="function">max</code> cannot be used in the
+ <code class="literal">WHERE</code> clause. (This restriction exists because
+ the <code class="literal">WHERE</code> clause determines which rows will be
+ included in the aggregate calculation; so obviously it has to be evaluated
+ before aggregate functions are computed.)
+ However, as is often the case
+ the query can be restated to accomplish the desired result, here
+ by using a <em class="firstterm">subquery</em>:
+
+</p><pre class="programlisting">
+SELECT city FROM weather
+ WHERE temp_lo = (SELECT max(temp_lo) FROM weather);
+</pre><p>
+
+</p><pre class="screen">
+ city
+---------------
+ San Francisco
+(1 row)
+</pre><p>
+
+ This is OK because the subquery is an independent computation
+ that computes its own aggregate separately from what is happening
+ in the outer query.
+ </p><p>
+ <a id="id-1.4.4.8.6.1" class="indexterm"></a>
+ <a id="id-1.4.4.8.6.2" class="indexterm"></a>
+
+ Aggregates are also very useful in combination with <code class="literal">GROUP
+ BY</code> clauses. For example, we can get the number of readings
+ and the maximum low temperature observed in each city with:
+
+</p><pre class="programlisting">
+SELECT city, count(*), max(temp_lo)
+ FROM weather
+ GROUP BY city;
+</pre><p>
+
+</p><pre class="screen">
+ city | count | max
+---------------+-------+-----
+ Hayward | 1 | 37
+ San Francisco | 2 | 46
+(2 rows)
+</pre><p>
+
+ which gives us one output row per city. Each aggregate result is
+ computed over the table rows matching that city.
+ We can filter these grouped
+ rows using <code class="literal">HAVING</code>:
+
+</p><pre class="programlisting">
+SELECT city, count(*), max(temp_lo)
+ FROM weather
+ GROUP BY city
+ HAVING max(temp_lo) &lt; 40;
+</pre><p>
+
+</p><pre class="screen">
+ city | count | max
+---------+-------+-----
+ Hayward | 1 | 37
+(1 row)
+</pre><p>
+
+ which gives us the same results for only the cities that have all
+ <code class="structfield">temp_lo</code> values below 40. Finally, if we only care about
+ cities whose
+ names begin with <span class="quote">“<span class="quote"><code class="literal">S</code></span>”</span>, we might do:
+
+</p><pre class="programlisting">
+SELECT city, count(*), max(temp_lo)
+ FROM weather
+ WHERE city LIKE 'S%' -- <span id="co.tutorial-agg-like"></span>(1)
+ GROUP BY city;
+</pre><p>
+
+</p><pre class="screen">
+ city | count | max
+---------------+-------+-----
+ San Francisco | 2 | 46
+(1 row)
+</pre><p>
+ </p><div class="calloutlist"><table border="0" summary="Callout list"><tr><td width="5%" valign="top" align="left"><p><a href="#co.tutorial-agg-like">(1)</a> </p></td><td valign="top" align="left"><p>
+ The <code class="literal">LIKE</code> operator does pattern matching and
+ is explained in <a class="xref" href="functions-matching.html" title="9.7. Pattern Matching">Section 9.7</a>.
+ </p></td></tr></table></div><p>
+ </p><p>
+ It is important to understand the interaction between aggregates and
+ <acronym class="acronym">SQL</acronym>'s <code class="literal">WHERE</code> and <code class="literal">HAVING</code> clauses.
+ The fundamental difference between <code class="literal">WHERE</code> and
+ <code class="literal">HAVING</code> is this: <code class="literal">WHERE</code> selects
+ input rows before groups and aggregates are computed (thus, it controls
+ which rows go into the aggregate computation), whereas
+ <code class="literal">HAVING</code> selects group rows after groups and
+ aggregates are computed. Thus, the
+ <code class="literal">WHERE</code> clause must not contain aggregate functions;
+ it makes no sense to try to use an aggregate to determine which rows
+ will be inputs to the aggregates. On the other hand, the
+ <code class="literal">HAVING</code> clause always contains aggregate functions.
+ (Strictly speaking, you are allowed to write a <code class="literal">HAVING</code>
+ clause that doesn't use aggregates, but it's seldom useful. The same
+ condition could be used more efficiently at the <code class="literal">WHERE</code>
+ stage.)
+ </p><p>
+ In the previous example, we can apply the city name restriction in
+ <code class="literal">WHERE</code>, since it needs no aggregate. This is
+ more efficient than adding the restriction to <code class="literal">HAVING</code>,
+ because we avoid doing the grouping and aggregate calculations
+ for all rows that fail the <code class="literal">WHERE</code> check.
+ </p><p>
+ Another way to select the rows that go into an aggregate
+ computation is to use <code class="literal">FILTER</code>, which is a
+ per-aggregate option:
+
+</p><pre class="programlisting">
+SELECT city, count(*) FILTER (WHERE temp_lo &lt; 45), max(temp_lo)
+ FROM weather
+ GROUP BY city;
+</pre><p>
+
+</p><pre class="screen">
+ city | count | max
+---------------+-------+-----
+ Hayward | 1 | 37
+ San Francisco | 1 | 46
+(2 rows)
+</pre><p>
+
+ <code class="literal">FILTER</code> is much like <code class="literal">WHERE</code>,
+ except that it removes rows only from the input of the particular
+ aggregate function that it is attached to.
+ Here, the <code class="literal">count</code> aggregate counts only
+ rows with <code class="literal">temp_lo</code> below 45; but the
+ <code class="literal">max</code> aggregate is still applied to all rows,
+ so it still finds the reading of 46.
+ </p></div><div class="navfooter"><hr /><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="tutorial-join.html" title="2.6. Joins Between Tables">Prev</a> </td><td width="20%" align="center"><a accesskey="u" href="tutorial-sql.html" title="Chapter 2. The SQL Language">Up</a></td><td width="40%" align="right"> <a accesskey="n" href="tutorial-update.html" title="2.8. Updates">Next</a></td></tr><tr><td width="40%" align="left" valign="top">2.6. Joins Between Tables </td><td width="20%" align="center"><a accesskey="h" href="index.html" title="PostgreSQL 16.2 Documentation">Home</a></td><td width="40%" align="right" valign="top"> 2.8. Updates</td></tr></table></div></body></html> \ No newline at end of file