1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
|
<!-- doc/src/sgml/intagg.sgml -->
<sect1 id="intagg" xreflabel="intagg">
<title>intagg</title>
<indexterm zone="intagg">
<primary>intagg</primary>
</indexterm>
<para>
The <filename>intagg</filename> module provides an integer aggregator and an
enumerator. <filename>intagg</filename> is now obsolete, because there
are built-in functions that provide a superset of its capabilities.
However, the module is still provided as a compatibility wrapper around
the built-in functions.
</para>
<sect2>
<title>Functions</title>
<indexterm>
<primary>int_array_aggregate</primary>
</indexterm>
<indexterm>
<primary>array_agg</primary>
</indexterm>
<para>
The aggregator is an aggregate function
<function>int_array_aggregate(integer)</function>
that produces an integer array
containing exactly the integers it is fed.
This is a wrapper around <function>array_agg</function>,
which does the same thing for any array type.
</para>
<indexterm>
<primary>int_array_enum</primary>
</indexterm>
<para>
The enumerator is a function
<function>int_array_enum(integer[])</function>
that returns <type>setof integer</type>. It is essentially the reverse
operation of the aggregator: given an array of integers, expand it
into a set of rows. This is a wrapper around <function>unnest</function>,
which does the same thing for any array type.
</para>
</sect2>
<sect2>
<title>Sample Uses</title>
<para>
Many database systems have the notion of a one to many table. Such a table
usually sits between two indexed tables, for example:
<programlisting>
CREATE TABLE left (id INT PRIMARY KEY, ...);
CREATE TABLE right (id INT PRIMARY KEY, ...);
CREATE TABLE one_to_many(left INT REFERENCES left, right INT REFERENCES right);
</programlisting>
It is typically used like this:
<programlisting>
SELECT right.* from right JOIN one_to_many ON (right.id = one_to_many.right)
WHERE one_to_many.left = <replaceable>item</replaceable>;
</programlisting>
This will return all the items in the right hand table for an entry
in the left hand table. This is a very common construct in SQL.
</para>
<para>
Now, this methodology can be cumbersome with a very large number of
entries in the <structname>one_to_many</structname> table. Often,
a join like this would result in an index scan
and a fetch for each right hand entry in the table for a particular
left hand entry. If you have a very dynamic system, there is not much you
can do. However, if you have some data which is fairly static, you can
create a summary table with the aggregator.
<programlisting>
CREATE TABLE summary AS
SELECT left, int_array_aggregate(right) AS right
FROM one_to_many
GROUP BY left;
</programlisting>
This will create a table with one row per left item, and an array
of right items. Now this is pretty useless without some way of using
the array; that's why there is an array enumerator. You can do
<programlisting>
SELECT left, int_array_enum(right) FROM summary WHERE left = <replaceable>item</replaceable>;
</programlisting>
The above query using <function>int_array_enum</function> produces the same results
as
<programlisting>
SELECT left, right FROM one_to_many WHERE left = <replaceable>item</replaceable>;
</programlisting>
The difference is that the query against the summary table has to get
only one row from the table, whereas the direct query against
<structname>one_to_many</structname> must index scan and fetch a row for each entry.
</para>
<para>
On one system, an <command>EXPLAIN</command> showed a query with a cost of 8488 was
reduced to a cost of 329. The original query was a join involving the
<structname>one_to_many</structname> table, which was replaced by:
<programlisting>
SELECT right, count(right) FROM
( SELECT left, int_array_enum(right) AS right
FROM summary JOIN (SELECT left FROM left_table WHERE left = <replaceable>item</replaceable>) AS lefts
ON (summary.left = lefts.left)
) AS list
GROUP BY right
ORDER BY count DESC;
</programlisting>
</para>
</sect2>
</sect1>
|