tablefunctablefunc
The tablefunc module includes various functions that return
tables (that is, multiple rows). These functions are useful both in their
own right and as examples of how to write C functions that return
multiple rows.
This module is considered trusted, that is, it can be
installed by non-superusers who have CREATE privilege
on the current database.
Functions Provided summarizes the functions provided
by the tablefunc module.
tablefunc Functions
Function
Description
normal_rand ( numvalsinteger, meanfloat8, stddevfloat8 )
setof float8
Produces a set of normally distributed random values.
crosstab ( sqltext )
setof record
Produces a pivot table containing
row names plus N value columns, where
N is determined by the row type specified
in the calling query.
crosstabN ( sqltext )
setof table_crosstab_N
Produces a pivot table containing
row names plus N value columns.
crosstab2, crosstab3, and
crosstab4 are predefined, but you can create additional
crosstabN functions as described below.
crosstab ( source_sqltext, category_sqltext )
setof record
Produces a pivot table
with the value columns specified by a second query.
crosstab ( sqltext, Ninteger )
setof record
Obsolete version of crosstab(text).
The parameter N is now ignored, since the
number of value columns is always determined by the calling query.
connectbyconnectby ( relnametext, keyid_fldtext, parent_keyid_fldtext, orderby_fldtext, start_withtext, max_depthinteger, branch_delimtext )
setof record
Produces a representation of a hierarchical tree structure.
normal_randnormal_rand
normal_rand(int numvals, float8 mean, float8 stddev) returns setof float8
normal_rand produces a set of normally distributed random
values (Gaussian distribution).
numvals is the number of values to be returned
from the function. mean is the mean of the normal
distribution of values and stddev is the standard
deviation of the normal distribution of values.
For example, this call requests 1000 values with a mean of 5 and a
standard deviation of 3:
test=# SELECT * FROM normal_rand(1000, 5, 3);
normal_rand
----------------------
1.56556322244898
9.10040991424657
5.36957140345079
-0.369151492880995
0.283600703686639
.
.
.
4.82992125404908
9.71308014517282
2.49639286969028
(1000 rows)
crosstab(text)crosstab
crosstab(text sql)
crosstab(text sql, int N)
The crosstab function is used to produce pivot
displays, wherein data is listed across the page rather than down.
For example, we might have data like
row1 val11
row1 val12
row1 val13
...
row2 val21
row2 val22
row2 val23
...
which we wish to display like
row1 val11 val12 val13 ...
row2 val21 val22 val23 ...
...
The crosstab function takes a text parameter that is an SQL
query producing raw data formatted in the first way, and produces a table
formatted in the second way.
The sql parameter is an SQL statement that produces
the source set of data. This statement must return one
row_name column, one
category column, and one
value column. N is an
obsolete parameter, ignored if supplied (formerly this had to match the
number of output value columns, but now that is determined by the
calling query).
For example, the provided query might produce a set something like:
row_name cat value
----------+-------+-------
row1 cat1 val1
row1 cat2 val2
row1 cat3 val3
row1 cat4 val4
row2 cat1 val5
row2 cat2 val6
row2 cat3 val7
row2 cat4 val8
The crosstab function is declared to return setof
record, so the actual names and types of the output columns must be
defined in the FROM clause of the calling SELECT
statement, for example:
SELECT * FROM crosstab('...') AS ct(row_name text, category_1 text, category_2 text);
This example produces a set something like:
<== value columns ==>
row_name category_1 category_2
----------+------------+------------
row1 val1 val2
row2 val5 val6
The FROM clause must define the output as one
row_name column (of the same data type as the first result
column of the SQL query) followed by N value columns
(all of the same data type as the third result column of the SQL query).
You can set up as many output value columns as you wish. The names of the
output columns are up to you.
The crosstab function produces one output row for each
consecutive group of input rows with the same
row_name value. It fills the output
value columns, left to right, with the
value fields from these rows. If there
are fewer rows in a group than there are output value
columns, the extra output columns are filled with nulls; if there are
more rows, the extra input rows are skipped.
In practice the SQL query should always specify ORDER BY 1,2
to ensure that the input rows are properly ordered, that is, values with
the same row_name are brought together and
correctly ordered within the row. Notice that crosstab
itself does not pay any attention to the second column of the query
result; it's just there to be ordered by, to control the order in which
the third-column values appear across the page.
Here is a complete example:
CREATE TABLE ct(id SERIAL, rowid TEXT, attribute TEXT, value TEXT);
INSERT INTO ct(rowid, attribute, value) VALUES('test1','att1','val1');
INSERT INTO ct(rowid, attribute, value) VALUES('test1','att2','val2');
INSERT INTO ct(rowid, attribute, value) VALUES('test1','att3','val3');
INSERT INTO ct(rowid, attribute, value) VALUES('test1','att4','val4');
INSERT INTO ct(rowid, attribute, value) VALUES('test2','att1','val5');
INSERT INTO ct(rowid, attribute, value) VALUES('test2','att2','val6');
INSERT INTO ct(rowid, attribute, value) VALUES('test2','att3','val7');
INSERT INTO ct(rowid, attribute, value) VALUES('test2','att4','val8');
SELECT *
FROM crosstab(
'select rowid, attribute, value
from ct
where attribute = ''att2'' or attribute = ''att3''
order by 1,2')
AS ct(row_name text, category_1 text, category_2 text, category_3 text);
row_name | category_1 | category_2 | category_3
----------+------------+------------+------------
test1 | val2 | val3 |
test2 | val6 | val7 |
(2 rows)
You can avoid always having to write out a FROM clause to
define the output columns, by setting up a custom crosstab function that
has the desired output row type wired into its definition. This is
described in the next section. Another possibility is to embed the
required FROM clause in a view definition.
See also the \crosstabview
command in psql, which provides functionality similar
to crosstab().
crosstabN(text)crosstab
crosstabN(text sql)
The crosstabN functions are examples of how
to set up custom wrappers for the general crosstab function,
so that you need not write out column names and types in the calling
SELECT query. The tablefunc module includes
crosstab2, crosstab3, and
crosstab4, whose output row types are defined as
CREATE TYPE tablefunc_crosstab_N AS (
row_name TEXT,
category_1 TEXT,
category_2 TEXT,
.
.
.
category_N TEXT
);
Thus, these functions can be used directly when the input query produces
row_name and value columns of type
text, and you want 2, 3, or 4 output values columns.
In all other ways they behave exactly as described above for the
general crosstab function.
For instance, the example given in the previous section would also
work as
SELECT *
FROM crosstab3(
'select rowid, attribute, value
from ct
where attribute = ''att2'' or attribute = ''att3''
order by 1,2');
These functions are provided mostly for illustration purposes. You
can create your own return types and functions based on the
underlying crosstab() function. There are two ways
to do it:
Create a composite type describing the desired output columns,
similar to the examples in
contrib/tablefunc/tablefunc--1.0.sql.
Then define a
unique function name accepting one text parameter and returning
setof your_type_name, but linking to the same underlying
crosstab C function. For example, if your source data
produces row names that are text, and values that are
float8, and you want 5 value columns:
CREATE TYPE my_crosstab_float8_5_cols AS (
my_row_name text,
my_category_1 float8,
my_category_2 float8,
my_category_3 float8,
my_category_4 float8,
my_category_5 float8
);
CREATE OR REPLACE FUNCTION crosstab_float8_5_cols(text)
RETURNS setof my_crosstab_float8_5_cols
AS '$libdir/tablefunc','crosstab' LANGUAGE C STABLE STRICT;
Use OUT parameters to define the return type implicitly.
The same example could also be done this way:
CREATE OR REPLACE FUNCTION crosstab_float8_5_cols(
IN text,
OUT my_row_name text,
OUT my_category_1 float8,
OUT my_category_2 float8,
OUT my_category_3 float8,
OUT my_category_4 float8,
OUT my_category_5 float8)
RETURNS setof record
AS '$libdir/tablefunc','crosstab' LANGUAGE C STABLE STRICT;
crosstab(text, text)crosstab
crosstab(text source_sql, text category_sql)
The main limitation of the single-parameter form of crosstab
is that it treats all values in a group alike, inserting each value into
the first available column. If you want the value
columns to correspond to specific categories of data, and some groups
might not have data for some of the categories, that doesn't work well.
The two-parameter form of crosstab handles this case by
providing an explicit list of the categories corresponding to the
output columns.
source_sql is an SQL statement that produces the
source set of data. This statement must return one
row_name column, one
category column, and one
value column. It may also have one or more
extra columns.
The row_name column must be first. The
category and value
columns must be the last two columns, in that order. Any columns between
row_name and
category are treated as extra.
The extra columns are expected to be the same for all rows
with the same row_name value.
For example, source_sql might produce a set
something like:
SELECT row_name, extra_col, cat, value FROM foo ORDER BY 1;
row_name extra_col cat value
----------+------------+-----+---------
row1 extra1 cat1 val1
row1 extra1 cat2 val2
row1 extra1 cat4 val4
row2 extra2 cat1 val5
row2 extra2 cat2 val6
row2 extra2 cat3 val7
row2 extra2 cat4 val8
category_sql is an SQL statement that produces
the set of categories. This statement must return only one column.
It must produce at least one row, or an error will be generated.
Also, it must not produce duplicate values, or an error will be
generated. category_sql might be something like:
SELECT DISTINCT cat FROM foo ORDER BY 1;
cat
-------
cat1
cat2
cat3
cat4
The crosstab function is declared to return setof
record, so the actual names and types of the output columns must be
defined in the FROM clause of the calling SELECT
statement, for example:
SELECT * FROM crosstab('...', '...')
AS ct(row_name text, extra text, cat1 text, cat2 text, cat3 text, cat4 text);
This will produce a result something like:
<== value columns ==>
row_name extra cat1 cat2 cat3 cat4
---------+-------+------+------+------+------
row1 extra1 val1 val2 val4
row2 extra2 val5 val6 val7 val8
The FROM clause must define the proper number of output
columns of the proper data types. If there are N
columns in the source_sql query's result, the first
N-2 of them must match up with the first
N-2 output columns. The remaining output columns
must have the type of the last column of the source_sql
query's result, and there must be exactly as many of them as there
are rows in the category_sql query's result.
The crosstab function produces one output row for each
consecutive group of input rows with the same
row_name value. The output
row_name column, plus any extra
columns, are copied from the first row of the group. The output
value columns are filled with the
value fields from rows having matching
category values. If a row's category
does not match any output of the category_sql
query, its value is ignored. Output
columns whose matching category is not present in any input row
of the group are filled with nulls.
In practice the source_sql query should always
specify ORDER BY 1 to ensure that values with the same
row_name are brought together. However,
ordering of the categories within a group is not important.
Also, it is essential to be sure that the order of the
category_sql query's output matches the specified
output column order.
Here are two complete examples:
create table sales(year int, month int, qty int);
insert into sales values(2007, 1, 1000);
insert into sales values(2007, 2, 1500);
insert into sales values(2007, 7, 500);
insert into sales values(2007, 11, 1500);
insert into sales values(2007, 12, 2000);
insert into sales values(2008, 1, 1000);
select * from crosstab(
'select year, month, qty from sales order by 1',
'select m from generate_series(1,12) m'
) as (
year int,
"Jan" int,
"Feb" int,
"Mar" int,
"Apr" int,
"May" int,
"Jun" int,
"Jul" int,
"Aug" int,
"Sep" int,
"Oct" int,
"Nov" int,
"Dec" int
);
year | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec
------+------+------+-----+-----+-----+-----+-----+-----+-----+-----+------+------
2007 | 1000 | 1500 | | | | | 500 | | | | 1500 | 2000
2008 | 1000 | | | | | | | | | | |
(2 rows)
CREATE TABLE cth(rowid text, rowdt timestamp, attribute text, val text);
INSERT INTO cth VALUES('test1','01 March 2003','temperature','42');
INSERT INTO cth VALUES('test1','01 March 2003','test_result','PASS');
INSERT INTO cth VALUES('test1','01 March 2003','volts','2.6987');
INSERT INTO cth VALUES('test2','02 March 2003','temperature','53');
INSERT INTO cth VALUES('test2','02 March 2003','test_result','FAIL');
INSERT INTO cth VALUES('test2','02 March 2003','test_startdate','01 March 2003');
INSERT INTO cth VALUES('test2','02 March 2003','volts','3.1234');
SELECT * FROM crosstab
(
'SELECT rowid, rowdt, attribute, val FROM cth ORDER BY 1',
'SELECT DISTINCT attribute FROM cth ORDER BY 1'
)
AS
(
rowid text,
rowdt timestamp,
temperature int4,
test_result text,
test_startdate timestamp,
volts float8
);
rowid | rowdt | temperature | test_result | test_startdate | volts
-------+--------------------------+-------------+-------------+--------------------------+--------
test1 | Sat Mar 01 00:00:00 2003 | 42 | PASS | | 2.6987
test2 | Sun Mar 02 00:00:00 2003 | 53 | FAIL | Sat Mar 01 00:00:00 2003 | 3.1234
(2 rows)
You can create predefined functions to avoid having to write out
the result column names and types in each query. See the examples
in the previous section. The underlying C function for this form
of crosstab is named crosstab_hash.
connectbyconnectby
connectby(text relname, text keyid_fld, text parent_keyid_fld
[, text orderby_fld ], text start_with, int max_depth
[, text branch_delim ])
The connectby function produces a display of hierarchical
data that is stored in a table. The table must have a key field that
uniquely identifies rows, and a parent-key field that references the
parent (if any) of each row. connectby can display the
sub-tree descending from any row.
explains the
parameters.
connectby ParametersParameterDescriptionrelnameName of the source relationkeyid_fldName of the key fieldparent_keyid_fldName of the parent-key fieldorderby_fldName of the field to order siblings by (optional)start_withKey value of the row to start atmax_depthMaximum depth to descend to, or zero for unlimited depthbranch_delimString to separate keys with in branch output (optional)
The key and parent-key fields can be any data type, but they must be
the same type. Note that the start_with value must be
entered as a text string, regardless of the type of the key field.
The connectby function is declared to return setof
record, so the actual names and types of the output columns must be
defined in the FROM clause of the calling SELECT
statement, for example:
SELECT * FROM connectby('connectby_tree', 'keyid', 'parent_keyid', 'pos', 'row2', 0, '~')
AS t(keyid text, parent_keyid text, level int, branch text, pos int);
The first two output columns are used for the current row's key and
its parent row's key; they must match the type of the table's key field.
The third output column is the depth in the tree and must be of type
integer. If a branch_delim parameter was
given, the next output column is the branch display and must be of type
text. Finally, if an orderby_fld
parameter was given, the last output column is a serial number, and must
be of type integer.
The branch output column shows the path of keys taken to
reach the current row. The keys are separated by the specified
branch_delim string. If no branch display is
wanted, omit both the branch_delim parameter
and the branch column in the output column list.
If the ordering of siblings of the same parent is important,
include the orderby_fld parameter to
specify which field to order siblings by. This field can be of any
sortable data type. The output column list must include a final
integer serial-number column, if and only if
orderby_fld is specified.
The parameters representing table and field names are copied as-is
into the SQL queries that connectby generates internally.
Therefore, include double quotes if the names are mixed-case or contain
special characters. You may also need to schema-qualify the table name.
In large tables, performance will be poor unless there is an index on
the parent-key field.
It is important that the branch_delim string
not appear in any key values, else connectby may incorrectly
report an infinite-recursion error. Note that if
branch_delim is not provided, a default value
of ~ is used for recursion detection purposes.
Here is an example:
CREATE TABLE connectby_tree(keyid text, parent_keyid text, pos int);
INSERT INTO connectby_tree VALUES('row1',NULL, 0);
INSERT INTO connectby_tree VALUES('row2','row1', 0);
INSERT INTO connectby_tree VALUES('row3','row1', 0);
INSERT INTO connectby_tree VALUES('row4','row2', 1);
INSERT INTO connectby_tree VALUES('row5','row2', 0);
INSERT INTO connectby_tree VALUES('row6','row4', 0);
INSERT INTO connectby_tree VALUES('row7','row3', 0);
INSERT INTO connectby_tree VALUES('row8','row6', 0);
INSERT INTO connectby_tree VALUES('row9','row5', 0);
-- with branch, without orderby_fld (order of results is not guaranteed)
SELECT * FROM connectby('connectby_tree', 'keyid', 'parent_keyid', 'row2', 0, '~')
AS t(keyid text, parent_keyid text, level int, branch text);
keyid | parent_keyid | level | branch
-------+--------------+-------+---------------------
row2 | | 0 | row2
row4 | row2 | 1 | row2~row4
row6 | row4 | 2 | row2~row4~row6
row8 | row6 | 3 | row2~row4~row6~row8
row5 | row2 | 1 | row2~row5
row9 | row5 | 2 | row2~row5~row9
(6 rows)
-- without branch, without orderby_fld (order of results is not guaranteed)
SELECT * FROM connectby('connectby_tree', 'keyid', 'parent_keyid', 'row2', 0)
AS t(keyid text, parent_keyid text, level int);
keyid | parent_keyid | level
-------+--------------+-------
row2 | | 0
row4 | row2 | 1
row6 | row4 | 2
row8 | row6 | 3
row5 | row2 | 1
row9 | row5 | 2
(6 rows)
-- with branch, with orderby_fld (notice that row5 comes before row4)
SELECT * FROM connectby('connectby_tree', 'keyid', 'parent_keyid', 'pos', 'row2', 0, '~')
AS t(keyid text, parent_keyid text, level int, branch text, pos int);
keyid | parent_keyid | level | branch | pos
-------+--------------+-------+---------------------+-----
row2 | | 0 | row2 | 1
row5 | row2 | 1 | row2~row5 | 2
row9 | row5 | 2 | row2~row5~row9 | 3
row4 | row2 | 1 | row2~row4 | 4
row6 | row4 | 2 | row2~row4~row6 | 5
row8 | row6 | 3 | row2~row4~row6~row8 | 6
(6 rows)
-- without branch, with orderby_fld (notice that row5 comes before row4)
SELECT * FROM connectby('connectby_tree', 'keyid', 'parent_keyid', 'pos', 'row2', 0)
AS t(keyid text, parent_keyid text, level int, pos int);
keyid | parent_keyid | level | pos
-------+--------------+-------+-----
row2 | | 0 | 1
row5 | row2 | 1 | 2
row9 | row5 | 2 | 3
row4 | row2 | 1 | 4
row6 | row4 | 2 | 5
row8 | row6 | 3 | 6
(6 rows)
Author
Joe Conway