diff options
Diffstat (limited to 'doc/src/sgml/intagg.sgml')
-rw-r--r-- | doc/src/sgml/intagg.sgml | 131 |
1 files changed, 131 insertions, 0 deletions
diff --git a/doc/src/sgml/intagg.sgml b/doc/src/sgml/intagg.sgml new file mode 100644 index 0000000..c410f64 --- /dev/null +++ b/doc/src/sgml/intagg.sgml @@ -0,0 +1,131 @@ +<!-- doc/src/sgml/intagg.sgml --> + +<sect1 id="intagg" xreflabel="intagg"> + <title>intagg</title> + + <indexterm zone="intagg"> + <primary>intagg</primary> + </indexterm> + + <para> + The <filename>intagg</filename> module provides an integer aggregator and an + enumerator. <filename>intagg</filename> is now obsolete, because there + are built-in functions that provide a superset of its capabilities. + However, the module is still provided as a compatibility wrapper around + the built-in functions. + </para> + + <sect2> + <title>Functions</title> + + <indexterm> + <primary>int_array_aggregate</primary> + </indexterm> + + <indexterm> + <primary>array_agg</primary> + </indexterm> + + <para> + The aggregator is an aggregate function + <function>int_array_aggregate(integer)</function> + that produces an integer array + containing exactly the integers it is fed. + This is a wrapper around <function>array_agg</function>, + which does the same thing for any array type. + </para> + + <indexterm> + <primary>int_array_enum</primary> + </indexterm> + + <para> + The enumerator is a function + <function>int_array_enum(integer[])</function> + that returns <type>setof integer</type>. It is essentially the reverse + operation of the aggregator: given an array of integers, expand it + into a set of rows. This is a wrapper around <function>unnest</function>, + which does the same thing for any array type. + </para> + + </sect2> + + <sect2> + <title>Sample Uses</title> + + <para> + Many database systems have the notion of a one to many table. Such a table + usually sits between two indexed tables, for example: + +<programlisting> +CREATE TABLE left (id INT PRIMARY KEY, ...); +CREATE TABLE right (id INT PRIMARY KEY, ...); +CREATE TABLE one_to_many(left INT REFERENCES left, right INT REFERENCES right); +</programlisting> + + It is typically used like this: + +<programlisting> +SELECT right.* from right JOIN one_to_many ON (right.id = one_to_many.right) + WHERE one_to_many.left = <replaceable>item</replaceable>; +</programlisting> + + This will return all the items in the right hand table for an entry + in the left hand table. This is a very common construct in SQL. + </para> + + <para> + Now, this methodology can be cumbersome with a very large number of + entries in the <structname>one_to_many</structname> table. Often, + a join like this would result in an index scan + and a fetch for each right hand entry in the table for a particular + left hand entry. If you have a very dynamic system, there is not much you + can do. However, if you have some data which is fairly static, you can + create a summary table with the aggregator. + +<programlisting> +CREATE TABLE summary AS + SELECT left, int_array_aggregate(right) AS right + FROM one_to_many + GROUP BY left; +</programlisting> + + This will create a table with one row per left item, and an array + of right items. Now this is pretty useless without some way of using + the array; that's why there is an array enumerator. You can do + +<programlisting> +SELECT left, int_array_enum(right) FROM summary WHERE left = <replaceable>item</replaceable>; +</programlisting> + + The above query using <function>int_array_enum</function> produces the same results + as + +<programlisting> +SELECT left, right FROM one_to_many WHERE left = <replaceable>item</replaceable>; +</programlisting> + + The difference is that the query against the summary table has to get + only one row from the table, whereas the direct query against + <structname>one_to_many</structname> must index scan and fetch a row for each entry. + </para> + + <para> + On one system, an <command>EXPLAIN</command> showed a query with a cost of 8488 was + reduced to a cost of 329. The original query was a join involving the + <structname>one_to_many</structname> table, which was replaced by: + +<programlisting> +SELECT right, count(right) FROM + ( SELECT left, int_array_enum(right) AS right + FROM summary JOIN (SELECT left FROM left_table WHERE left = <replaceable>item</replaceable>) AS lefts + ON (summary.left = lefts.left) + ) AS list + GROUP BY right + ORDER BY count DESC; +</programlisting> + </para> + + </sect2> + +</sect1> |