summaryrefslogtreecommitdiffstats
path: root/doc/src/sgml/rowtypes.sgml
diff options
context:
space:
mode:
Diffstat (limited to 'doc/src/sgml/rowtypes.sgml')
-rw-r--r--doc/src/sgml/rowtypes.sgml540
1 files changed, 540 insertions, 0 deletions
diff --git a/doc/src/sgml/rowtypes.sgml b/doc/src/sgml/rowtypes.sgml
new file mode 100644
index 0000000..417ccb0
--- /dev/null
+++ b/doc/src/sgml/rowtypes.sgml
@@ -0,0 +1,540 @@
+<!-- doc/src/sgml/rowtypes.sgml -->
+
+<sect1 id="rowtypes">
+ <title>Composite Types</title>
+
+ <indexterm>
+ <primary>composite type</primary>
+ </indexterm>
+
+ <indexterm>
+ <primary>row type</primary>
+ </indexterm>
+
+ <para>
+ A <firstterm>composite type</firstterm> represents the structure of a row or record;
+ it is essentially just a list of field names and their data types.
+ <productname>PostgreSQL</productname> allows composite types to be
+ used in many of the same ways that simple types can be used. For example, a
+ column of a table can be declared to be of a composite type.
+ </para>
+
+ <sect2 id="rowtypes-declaring">
+ <title>Declaration of Composite Types</title>
+
+ <para>
+ Here are two simple examples of defining composite types:
+<programlisting>
+CREATE TYPE complex AS (
+ r double precision,
+ i double precision
+);
+
+CREATE TYPE inventory_item AS (
+ name text,
+ supplier_id integer,
+ price numeric
+);
+</programlisting>
+ The syntax is comparable to <command>CREATE TABLE</command>, except that only
+ field names and types can be specified; no constraints (such as <literal>NOT
+ NULL</literal>) can presently be included. Note that the <literal>AS</literal> keyword
+ is essential; without it, the system will think a different kind
+ of <command>CREATE TYPE</command> command is meant, and you will get odd syntax
+ errors.
+ </para>
+
+ <para>
+ Having defined the types, we can use them to create tables:
+
+<programlisting>
+CREATE TABLE on_hand (
+ item inventory_item,
+ count integer
+);
+
+INSERT INTO on_hand VALUES (ROW('fuzzy dice', 42, 1.99), 1000);
+</programlisting>
+
+ or functions:
+
+<programlisting>
+CREATE FUNCTION price_extension(inventory_item, integer) RETURNS numeric
+AS 'SELECT $1.price * $2' LANGUAGE SQL;
+
+SELECT price_extension(item, 10) FROM on_hand;
+</programlisting>
+
+ </para>
+
+ <para>
+ Whenever you create a table, a composite type is also automatically
+ created, with the same name as the table, to represent the table's
+ row type. For example, had we said:
+<programlisting>
+CREATE TABLE inventory_item (
+ name text,
+ supplier_id integer REFERENCES suppliers,
+ price numeric CHECK (price &gt; 0)
+);
+</programlisting>
+ then the same <literal>inventory_item</literal> composite type shown above would
+ come into being as a
+ byproduct, and could be used just as above. Note however an important
+ restriction of the current implementation: since no constraints are
+ associated with a composite type, the constraints shown in the table
+ definition <emphasis>do not apply</emphasis> to values of the composite type
+ outside the table. (To work around this, create a
+ <glossterm linkend="glossary-domain">domain</glossterm> over the composite
+ type, and apply the desired constraints as <literal>CHECK</literal>
+ constraints of the domain.)
+ </para>
+ </sect2>
+
+ <sect2>
+ <title>Constructing Composite Values</title>
+
+ <indexterm>
+ <primary>composite type</primary>
+ <secondary>constant</secondary>
+ </indexterm>
+
+ <para>
+ To write a composite value as a literal constant, enclose the field
+ values within parentheses and separate them by commas. You can put double
+ quotes around any field value, and must do so if it contains commas or
+ parentheses. (More details appear <link
+ linkend="rowtypes-io-syntax">below</link>.) Thus, the general format of
+ a composite constant is the following:
+<synopsis>
+'( <replaceable>val1</replaceable> , <replaceable>val2</replaceable> , ... )'
+</synopsis>
+ An example is:
+<programlisting>
+'("fuzzy dice",42,1.99)'
+</programlisting>
+ which would be a valid value of the <literal>inventory_item</literal> type
+ defined above. To make a field be NULL, write no characters at all
+ in its position in the list. For example, this constant specifies
+ a NULL third field:
+<programlisting>
+'("fuzzy dice",42,)'
+</programlisting>
+ If you want an empty string rather than NULL, write double quotes:
+<programlisting>
+'("",42,)'
+</programlisting>
+ Here the first field is a non-NULL empty string, the third is NULL.
+ </para>
+
+ <para>
+ (These constants are actually only a special case of
+ the generic type constants discussed in <xref
+ linkend="sql-syntax-constants-generic"/>. The constant is initially
+ treated as a string and passed to the composite-type input conversion
+ routine. An explicit type specification might be necessary to tell
+ which type to convert the constant to.)
+ </para>
+
+ <para>
+ The <literal>ROW</literal> expression syntax can also be used to
+ construct composite values. In most cases this is considerably
+ simpler to use than the string-literal syntax since you don't have
+ to worry about multiple layers of quoting. We already used this
+ method above:
+<programlisting>
+ROW('fuzzy dice', 42, 1.99)
+ROW('', 42, NULL)
+</programlisting>
+ The ROW keyword is actually optional as long as you have more than one
+ field in the expression, so these can be simplified to:
+<programlisting>
+('fuzzy dice', 42, 1.99)
+('', 42, NULL)
+</programlisting>
+ The <literal>ROW</literal> expression syntax is discussed in more detail in <xref
+ linkend="sql-syntax-row-constructors"/>.
+ </para>
+ </sect2>
+
+ <sect2 id="rowtypes-accessing">
+ <title>Accessing Composite Types</title>
+
+ <para>
+ To access a field of a composite column, one writes a dot and the field
+ name, much like selecting a field from a table name. In fact, it's so
+ much like selecting from a table name that you often have to use parentheses
+ to keep from confusing the parser. For example, you might try to select
+ some subfields from our <literal>on_hand</literal> example table with something
+ like:
+
+<programlisting>
+SELECT item.name FROM on_hand WHERE item.price &gt; 9.99;
+</programlisting>
+
+ This will not work since the name <literal>item</literal> is taken to be a table
+ name, not a column name of <literal>on_hand</literal>, per SQL syntax rules.
+ You must write it like this:
+
+<programlisting>
+SELECT (item).name FROM on_hand WHERE (item).price &gt; 9.99;
+</programlisting>
+
+ or if you need to use the table name as well (for instance in a multitable
+ query), like this:
+
+<programlisting>
+SELECT (on_hand.item).name FROM on_hand WHERE (on_hand.item).price &gt; 9.99;
+</programlisting>
+
+ Now the parenthesized object is correctly interpreted as a reference to
+ the <literal>item</literal> column, and then the subfield can be selected from it.
+ </para>
+
+ <para>
+ Similar syntactic issues apply whenever you select a field from a composite
+ value. For instance, to select just one field from the result of a function
+ that returns a composite value, you'd need to write something like:
+
+<programlisting>
+SELECT (my_func(...)).field FROM ...
+</programlisting>
+
+ Without the extra parentheses, this will generate a syntax error.
+ </para>
+
+ <para>
+ The special field name <literal>*</literal> means <quote>all fields</quote>, as
+ further explained in <xref linkend="rowtypes-usage"/>.
+ </para>
+ </sect2>
+
+ <sect2>
+ <title>Modifying Composite Types</title>
+
+ <para>
+ Here are some examples of the proper syntax for inserting and updating
+ composite columns.
+ First, inserting or updating a whole column:
+
+<programlisting>
+INSERT INTO mytab (complex_col) VALUES((1.1,2.2));
+
+UPDATE mytab SET complex_col = ROW(1.1,2.2) WHERE ...;
+</programlisting>
+
+ The first example omits <literal>ROW</literal>, the second uses it; we
+ could have done it either way.
+ </para>
+
+ <para>
+ We can update an individual subfield of a composite column:
+
+<programlisting>
+UPDATE mytab SET complex_col.r = (complex_col).r + 1 WHERE ...;
+</programlisting>
+
+ Notice here that we don't need to (and indeed cannot)
+ put parentheses around the column name appearing just after
+ <literal>SET</literal>, but we do need parentheses when referencing the same
+ column in the expression to the right of the equal sign.
+ </para>
+
+ <para>
+ And we can specify subfields as targets for <command>INSERT</command>, too:
+
+<programlisting>
+INSERT INTO mytab (complex_col.r, complex_col.i) VALUES(1.1, 2.2);
+</programlisting>
+
+ Had we not supplied values for all the subfields of the column, the
+ remaining subfields would have been filled with null values.
+ </para>
+ </sect2>
+
+ <sect2 id="rowtypes-usage">
+ <title>Using Composite Types in Queries</title>
+
+ <para>
+ There are various special syntax rules and behaviors associated with
+ composite types in queries. These rules provide useful shortcuts,
+ but can be confusing if you don't know the logic behind them.
+ </para>
+
+ <para>
+ In <productname>PostgreSQL</productname>, a reference to a table name (or alias)
+ in a query is effectively a reference to the composite value of the
+ table's current row. For example, if we had a table
+ <structname>inventory_item</structname> as shown
+ <link linkend="rowtypes-declaring">above</link>, we could write:
+<programlisting>
+SELECT c FROM inventory_item c;
+</programlisting>
+ This query produces a single composite-valued column, so we might get
+ output like:
+<programlisting>
+ c
+------------------------
+ ("fuzzy dice",42,1.99)
+(1 row)
+</programlisting>
+ Note however that simple names are matched to column names before table
+ names, so this example works only because there is no column
+ named <structfield>c</structfield> in the query's tables.
+ </para>
+
+ <para>
+ The ordinary qualified-column-name
+ syntax <replaceable>table_name</replaceable><literal>.</literal><replaceable>column_name</replaceable>
+ can be understood as applying <link linkend="field-selection">field
+ selection</link> to the composite value of the table's current row.
+ (For efficiency reasons, it's not actually implemented that way.)
+ </para>
+
+ <para>
+ When we write
+<programlisting>
+SELECT c.* FROM inventory_item c;
+</programlisting>
+ then, according to the SQL standard, we should get the contents of the
+ table expanded into separate columns:
+<programlisting>
+ name | supplier_id | price
+------------+-------------+-------
+ fuzzy dice | 42 | 1.99
+(1 row)
+</programlisting>
+ as if the query were
+<programlisting>
+SELECT c.name, c.supplier_id, c.price FROM inventory_item c;
+</programlisting>
+ <productname>PostgreSQL</productname> will apply this expansion behavior to
+ any composite-valued expression, although as shown <link
+ linkend="rowtypes-accessing">above</link>, you need to write parentheses
+ around the value that <literal>.*</literal> is applied to whenever it's not a
+ simple table name. For example, if <function>myfunc()</function> is a function
+ returning a composite type with columns <structfield>a</structfield>,
+ <structfield>b</structfield>, and <structfield>c</structfield>, then these two queries have the
+ same result:
+<programlisting>
+SELECT (myfunc(x)).* FROM some_table;
+SELECT (myfunc(x)).a, (myfunc(x)).b, (myfunc(x)).c FROM some_table;
+</programlisting>
+ </para>
+
+ <tip>
+ <para>
+ <productname>PostgreSQL</productname> handles column expansion by
+ actually transforming the first form into the second. So, in this
+ example, <function>myfunc()</function> would get invoked three times per row
+ with either syntax. If it's an expensive function you may wish to
+ avoid that, which you can do with a query like:
+<programlisting>
+SELECT m.* FROM some_table, LATERAL myfunc(x) AS m;
+</programlisting>
+ Placing the function in
+ a <literal>LATERAL</literal> <literal>FROM</literal> item keeps it from
+ being invoked more than once per row. <literal>m.*</literal> is still
+ expanded into <literal>m.a, m.b, m.c</literal>, but now those variables
+ are just references to the output of the <literal>FROM</literal> item.
+ (The <literal>LATERAL</literal> keyword is optional here, but we show it
+ to clarify that the function is getting <structfield>x</structfield>
+ from <structname>some_table</structname>.)
+ </para>
+ </tip>
+
+ <para>
+ The <replaceable>composite_value</replaceable><literal>.*</literal> syntax results in
+ column expansion of this kind when it appears at the top level of
+ a <link linkend="queries-select-lists"><command>SELECT</command> output
+ list</link>, a <link linkend="dml-returning"><literal>RETURNING</literal>
+ list</link> in <command>INSERT</command>/<command>UPDATE</command>/<command>DELETE</command>,
+ a <link linkend="queries-values"><literal>VALUES</literal> clause</link>, or
+ a <link linkend="sql-syntax-row-constructors">row constructor</link>.
+ In all other contexts (including when nested inside one of those
+ constructs), attaching <literal>.*</literal> to a composite value does not
+ change the value, since it means <quote>all columns</quote> and so the
+ same composite value is produced again. For example,
+ if <function>somefunc()</function> accepts a composite-valued argument,
+ these queries are the same:
+
+<programlisting>
+SELECT somefunc(c.*) FROM inventory_item c;
+SELECT somefunc(c) FROM inventory_item c;
+</programlisting>
+
+ In both cases, the current row of <structname>inventory_item</structname> is
+ passed to the function as a single composite-valued argument.
+ Even though <literal>.*</literal> does nothing in such cases, using it is good
+ style, since it makes clear that a composite value is intended. In
+ particular, the parser will consider <literal>c</literal> in <literal>c.*</literal> to
+ refer to a table name or alias, not to a column name, so that there is
+ no ambiguity; whereas without <literal>.*</literal>, it is not clear
+ whether <literal>c</literal> means a table name or a column name, and in fact
+ the column-name interpretation will be preferred if there is a column
+ named <literal>c</literal>.
+ </para>
+
+ <para>
+ Another example demonstrating these concepts is that all these queries
+ mean the same thing:
+<programlisting>
+SELECT * FROM inventory_item c ORDER BY c;
+SELECT * FROM inventory_item c ORDER BY c.*;
+SELECT * FROM inventory_item c ORDER BY ROW(c.*);
+</programlisting>
+ All of these <literal>ORDER BY</literal> clauses specify the row's composite
+ value, resulting in sorting the rows according to the rules described
+ in <xref linkend="composite-type-comparison"/>. However,
+ if <structname>inventory_item</structname> contained a column
+ named <structfield>c</structfield>, the first case would be different from the
+ others, as it would mean to sort by that column only. Given the column
+ names previously shown, these queries are also equivalent to those above:
+<programlisting>
+SELECT * FROM inventory_item c ORDER BY ROW(c.name, c.supplier_id, c.price);
+SELECT * FROM inventory_item c ORDER BY (c.name, c.supplier_id, c.price);
+</programlisting>
+ (The last case uses a row constructor with the key word <literal>ROW</literal>
+ omitted.)
+ </para>
+
+ <para>
+ Another special syntactical behavior associated with composite values is
+ that we can use <firstterm>functional notation</firstterm> for extracting a field
+ of a composite value. The simple way to explain this is that
+ the notations <literal><replaceable>field</replaceable>(<replaceable>table</replaceable>)</literal>
+ and <literal><replaceable>table</replaceable>.<replaceable>field</replaceable></literal>
+ are interchangeable. For example, these queries are equivalent:
+
+<programlisting>
+SELECT c.name FROM inventory_item c WHERE c.price &gt; 1000;
+SELECT name(c) FROM inventory_item c WHERE price(c) &gt; 1000;
+</programlisting>
+
+ Moreover, if we have a function that accepts a single argument of a
+ composite type, we can call it with either notation. These queries are
+ all equivalent:
+
+<programlisting>
+SELECT somefunc(c) FROM inventory_item c;
+SELECT somefunc(c.*) FROM inventory_item c;
+SELECT c.somefunc FROM inventory_item c;
+</programlisting>
+ </para>
+
+ <para>
+ This equivalence between functional notation and field notation
+ makes it possible to use functions on composite types to implement
+ <quote>computed fields</quote>.
+ <indexterm>
+ <primary>computed field</primary>
+ </indexterm>
+ <indexterm>
+ <primary>field</primary>
+ <secondary>computed</secondary>
+ </indexterm>
+ An application using the last query above wouldn't need to be directly
+ aware that <literal>somefunc</literal> isn't a real column of the table.
+ </para>
+
+ <tip>
+ <para>
+ Because of this behavior, it's unwise to give a function that takes a
+ single composite-type argument the same name as any of the fields of
+ that composite type. If there is ambiguity, the field-name
+ interpretation will be chosen if field-name syntax is used, while the
+ function will be chosen if function-call syntax is used. However,
+ <productname>PostgreSQL</productname> versions before 11 always chose the
+ field-name interpretation, unless the syntax of the call required it to
+ be a function call. One way to force the function interpretation in
+ older versions is to schema-qualify the function name, that is, write
+ <literal><replaceable>schema</replaceable>.<replaceable>func</replaceable>(<replaceable>compositevalue</replaceable>)</literal>.
+ </para>
+ </tip>
+ </sect2>
+
+ <sect2 id="rowtypes-io-syntax">
+ <title>Composite Type Input and Output Syntax</title>
+
+ <para>
+ The external text representation of a composite value consists of items that
+ are interpreted according to the I/O conversion rules for the individual
+ field types, plus decoration that indicates the composite structure.
+ The decoration consists of parentheses (<literal>(</literal> and <literal>)</literal>)
+ around the whole value, plus commas (<literal>,</literal>) between adjacent
+ items. Whitespace outside the parentheses is ignored, but within the
+ parentheses it is considered part of the field value, and might or might not be
+ significant depending on the input conversion rules for the field data type.
+ For example, in:
+<programlisting>
+'( 42)'
+</programlisting>
+ the whitespace will be ignored if the field type is integer, but not if
+ it is text.
+ </para>
+
+ <para>
+ As shown previously, when writing a composite value you can write double
+ quotes around any individual field value.
+ You <emphasis>must</emphasis> do so if the field value would otherwise
+ confuse the composite-value parser. In particular, fields containing
+ parentheses, commas, double quotes, or backslashes must be double-quoted.
+ To put a double quote or backslash in a quoted composite field value,
+ precede it with a backslash. (Also, a pair of double quotes within a
+ double-quoted field value is taken to represent a double quote character,
+ analogously to the rules for single quotes in SQL literal strings.)
+ Alternatively, you can avoid quoting and use backslash-escaping to
+ protect all data characters
+ that would otherwise be taken as composite syntax.
+ </para>
+
+ <para>
+ A completely empty field value (no characters at all between the commas
+ or parentheses) represents a NULL. To write a value that is an empty
+ string rather than NULL, write <literal>""</literal>.
+ </para>
+
+ <para>
+ The composite output routine will put double quotes around field values
+ if they are empty strings or contain parentheses, commas,
+ double quotes, backslashes, or white space. (Doing so for white space
+ is not essential, but aids legibility.) Double quotes and backslashes
+ embedded in field values will be doubled.
+ </para>
+
+ <note>
+ <para>
+ Remember that what you write in an SQL command will first be interpreted
+ as a string literal, and then as a composite. This doubles the number of
+ backslashes you need (assuming escape string syntax is used).
+ For example, to insert a <type>text</type> field
+ containing a double quote and a backslash in a composite
+ value, you'd need to write:
+<programlisting>
+INSERT ... VALUES ('("\"\\")');
+</programlisting>
+ The string-literal processor removes one level of backslashes, so that
+ what arrives at the composite-value parser looks like
+ <literal>("\"\\")</literal>. In turn, the string
+ fed to the <type>text</type> data type's input routine
+ becomes <literal>"\</literal>. (If we were working
+ with a data type whose input routine also treated backslashes specially,
+ <type>bytea</type> for example, we might need as many as eight backslashes
+ in the command to get one backslash into the stored composite field.)
+ Dollar quoting (see <xref linkend="sql-syntax-dollar-quoting"/>) can be
+ used to avoid the need to double backslashes.
+ </para>
+ </note>
+
+ <tip>
+ <para>
+ The <literal>ROW</literal> constructor syntax is usually easier to work with
+ than the composite-literal syntax when writing composite values in SQL
+ commands.
+ In <literal>ROW</literal>, individual field values are written the same way
+ they would be written when not members of a composite.
+ </para>
+ </tip>
+ </sect2>
+
+</sect1>