summaryrefslogtreecommitdiffstats
path: root/doc/src/sgml/typeconv.sgml
diff options
context:
space:
mode:
Diffstat (limited to 'doc/src/sgml/typeconv.sgml')
-rw-r--r--doc/src/sgml/typeconv.sgml1262
1 files changed, 1262 insertions, 0 deletions
diff --git a/doc/src/sgml/typeconv.sgml b/doc/src/sgml/typeconv.sgml
new file mode 100644
index 0000000..2874874
--- /dev/null
+++ b/doc/src/sgml/typeconv.sgml
@@ -0,0 +1,1262 @@
+<!-- doc/src/sgml/typeconv.sgml -->
+
+<chapter id="typeconv">
+<title>Type Conversion</title>
+
+<indexterm zone="typeconv">
+ <primary>data type</primary>
+ <secondary>conversion</secondary>
+</indexterm>
+
+<para>
+<acronym>SQL</acronym> statements can, intentionally or not, require
+the mixing of different data types in the same expression.
+<productname>PostgreSQL</productname> has extensive facilities for
+evaluating mixed-type expressions.
+</para>
+
+<para>
+In many cases a user does not need
+to understand the details of the type conversion mechanism.
+However, implicit conversions done by <productname>PostgreSQL</productname>
+can affect the results of a query. When necessary, these results
+can be tailored by using <emphasis>explicit</emphasis> type conversion.
+</para>
+
+<para>
+This chapter introduces the <productname>PostgreSQL</productname>
+type conversion mechanisms and conventions.
+Refer to the relevant sections in <xref linkend="datatype"/> and <xref linkend="functions"/>
+for more information on specific data types and allowed functions and
+operators.
+</para>
+
+<sect1 id="typeconv-overview">
+<title>Overview</title>
+
+<para>
+<acronym>SQL</acronym> is a strongly typed language. That is, every data item
+has an associated data type which determines its behavior and allowed usage.
+<productname>PostgreSQL</productname> has an extensible type system that is
+more general and flexible than other <acronym>SQL</acronym> implementations.
+Hence, most type conversion behavior in <productname>PostgreSQL</productname>
+is governed by general rules rather than by ad hoc
+heuristics. This allows the use of mixed-type expressions even with
+user-defined types.
+</para>
+
+<para>
+The <productname>PostgreSQL</productname> scanner/parser divides lexical
+elements into five fundamental categories: integers, non-integer numbers,
+strings, identifiers, and key words. Constants of most non-numeric types are
+first classified as strings. The <acronym>SQL</acronym> language definition
+allows specifying type names with strings, and this mechanism can be used in
+<productname>PostgreSQL</productname> to start the parser down the correct
+path. For example, the query:
+
+<screen>
+SELECT text 'Origin' AS "label", point '(0,0)' AS "value";
+
+ label | value
+--------+-------
+ Origin | (0,0)
+(1 row)
+</screen>
+
+has two literal constants, of type <type>text</type> and <type>point</type>.
+If a type is not specified for a string literal, then the placeholder type
+<type>unknown</type> is assigned initially, to be resolved in later
+stages as described below.
+</para>
+
+<para>
+There are four fundamental <acronym>SQL</acronym> constructs requiring
+distinct type conversion rules in the <productname>PostgreSQL</productname>
+parser:
+
+<variablelist>
+<varlistentry>
+<term>
+Function calls
+</term>
+<listitem>
+<para>
+Much of the <productname>PostgreSQL</productname> type system is built around a
+rich set of functions. Functions can have one or more arguments.
+Since <productname>PostgreSQL</productname> permits function
+overloading, the function name alone does not uniquely identify the function
+to be called; the parser must select the right function based on the data
+types of the supplied arguments.
+</para>
+</listitem>
+</varlistentry>
+<varlistentry>
+<term>
+Operators
+</term>
+<listitem>
+<para>
+<productname>PostgreSQL</productname> allows expressions with
+prefix (one-argument) operators,
+as well as infix (two-argument) operators. Like functions, operators can
+be overloaded, so the same problem of selecting the right operator
+exists.
+</para>
+</listitem>
+</varlistentry>
+<varlistentry>
+<term>
+Value Storage
+</term>
+<listitem>
+<para>
+<acronym>SQL</acronym> <command>INSERT</command> and <command>UPDATE</command> statements place the results of
+expressions into a table. The expressions in the statement must be matched up
+with, and perhaps converted to, the types of the target columns.
+</para>
+</listitem>
+</varlistentry>
+<varlistentry>
+<term>
+<literal>UNION</literal>, <literal>CASE</literal>, and related constructs
+</term>
+<listitem>
+<para>
+Since all query results from a unionized <command>SELECT</command> statement
+must appear in a single set of columns, the types of the results of each
+<command>SELECT</command> clause must be matched up and converted to a uniform set.
+Similarly, the result expressions of a <literal>CASE</literal> construct must be
+converted to a common type so that the <literal>CASE</literal> expression as a whole
+has a known output type. Some other constructs, such
+as <literal>ARRAY[]</literal> and the <function>GREATEST</function>
+and <function>LEAST</function> functions, likewise require determination of a
+common type for several subexpressions.
+</para>
+</listitem>
+</varlistentry>
+</variablelist>
+</para>
+
+<para>
+The system catalogs store information about which conversions, or
+<firstterm>casts</firstterm>, exist between which data types, and how to
+perform those conversions. Additional casts can be added by the user
+with the <xref linkend="sql-createcast"/>
+command. (This is usually
+done in conjunction with defining new data types. The set of casts
+between built-in types has been carefully crafted and is best not
+altered.)
+</para>
+
+<indexterm>
+ <primary>data type</primary>
+ <secondary>category</secondary>
+</indexterm>
+
+<para>
+An additional heuristic provided by the parser allows improved determination
+of the proper casting behavior among groups of types that have implicit casts.
+Data types are divided into several basic <firstterm>type
+categories</firstterm>, including <type>boolean</type>, <type>numeric</type>,
+<type>string</type>, <type>bitstring</type>, <type>datetime</type>,
+<type>timespan</type>, <type>geometric</type>, <type>network</type>, and
+user-defined. (For a list see <xref linkend="catalog-typcategory-table"/>;
+but note it is also possible to create custom type categories.) Within each
+category there can be one or more <firstterm>preferred types</firstterm>, which
+are preferred when there is a choice of possible types. With careful selection
+of preferred types and available implicit casts, it is possible to ensure that
+ambiguous expressions (those with multiple candidate parsing solutions) can be
+resolved in a useful way.
+</para>
+
+<para>
+All type conversion rules are designed with several principles in mind:
+
+<itemizedlist>
+<listitem>
+<para>
+Implicit conversions should never have surprising or unpredictable outcomes.
+</para>
+</listitem>
+
+<listitem>
+<para>
+There should be no extra overhead in the parser or executor
+if a query does not need implicit type conversion.
+That is, if a query is well-formed and the types already match, then the query should execute
+without spending extra time in the parser and without introducing unnecessary implicit conversion
+calls in the query.
+</para>
+</listitem>
+
+<listitem>
+<para>
+Additionally, if a query usually requires an implicit conversion for a function, and
+if then the user defines a new function with the correct argument types, the parser
+should use this new function and no longer do implicit conversion to use the old function.
+</para>
+</listitem>
+</itemizedlist>
+</para>
+
+</sect1>
+
+<sect1 id="typeconv-oper">
+<title>Operators</title>
+
+<indexterm zone="typeconv-oper">
+ <primary>operator</primary>
+ <secondary>type resolution in an invocation</secondary>
+</indexterm>
+
+ <para>
+ The specific operator that is referenced by an operator expression
+ is determined using the following procedure.
+ Note that this procedure is indirectly affected
+ by the precedence of the operators involved, since that will determine
+ which sub-expressions are taken to be the inputs of which operators.
+ See <xref linkend="sql-precedence"/> for more information.
+ </para>
+
+<procedure>
+<title>Operator Type Resolution</title>
+
+<step id="op-resol-select" performance="required">
+<para>
+Select the operators to be considered from the
+<classname>pg_operator</classname> system catalog. If a non-schema-qualified
+operator name was used (the usual case), the operators
+considered are those with the matching name and argument count that are
+visible in the current search path (see <xref linkend="ddl-schemas-path"/>).
+If a qualified operator name was given, only operators in the specified
+schema are considered.
+</para>
+
+<substeps>
+<step performance="optional">
+<para>
+If the search path finds multiple operators with identical argument types,
+only the one appearing earliest in the path is considered. Operators with
+different argument types are considered on an equal footing regardless of
+search path position.
+</para>
+</step>
+</substeps>
+</step>
+
+<step id="op-resol-exact-match" performance="required">
+<para>
+Check for an operator accepting exactly the input argument types.
+If one exists (there can be only one exact match in the set of
+operators considered), use it. Lack of an exact match creates a security
+hazard when calling, via qualified name
+ <footnote id="op-qualified-security">
+ <!-- If you edit this, consider editing func-qualified-security. -->
+ <para>
+ The hazard does not arise with a non-schema-qualified name, because a
+ search path containing schemas that permit untrusted users to create
+ objects is not a <link linkend="ddl-schemas-patterns">secure schema usage
+ pattern</link>.
+ </para>
+ </footnote>
+(not typical), any operator found in a schema that permits untrusted users to
+create objects. In such situations, cast arguments to force an exact match.
+</para>
+
+<substeps>
+<step id="op-resol-exact-unknown" performance="optional">
+<para>
+If one argument of a binary operator invocation is of the <type>unknown</type> type,
+then assume it is the same type as the other argument for this check.
+Invocations involving two <type>unknown</type> inputs, or a prefix operator
+with an <type>unknown</type> input, will never find a match at this step.
+</para>
+</step>
+<step id="op-resol-exact-domain" performance="optional">
+<para>
+If one argument of a binary operator invocation is of the <type>unknown</type>
+type and the other is of a domain type, next check to see if there is an
+operator accepting exactly the domain's base type on both sides; if so, use it.
+</para>
+</step>
+</substeps>
+</step>
+
+<step id="op-resol-best-match" performance="required">
+<para>
+Look for the best match.
+</para>
+<substeps>
+<step performance="required">
+<para>
+Discard candidate operators for which the input types do not match
+and cannot be converted (using an implicit conversion) to match.
+<type>unknown</type> literals are
+assumed to be convertible to anything for this purpose. If only one
+candidate remains, use it; else continue to the next step.
+</para>
+</step>
+<step performance="required">
+<para>
+If any input argument is of a domain type, treat it as being of the
+domain's base type for all subsequent steps. This ensures that domains
+act like their base types for purposes of ambiguous-operator resolution.
+</para>
+</step>
+<step performance="required">
+<para>
+Run through all candidates and keep those with the most exact matches
+on input types. Keep all candidates if none have exact matches.
+If only one candidate remains, use it; else continue to the next step.
+</para>
+</step>
+<step performance="required">
+<para>
+Run through all candidates and keep those that accept preferred types (of the
+input data type's type category) at the most positions where type conversion
+will be required.
+Keep all candidates if none accept preferred types.
+If only one candidate remains, use it; else continue to the next step.
+</para>
+</step>
+<step performance="required">
+<para>
+If any input arguments are <type>unknown</type>, check the type
+categories accepted at those argument positions by the remaining
+candidates. At each position, select the <type>string</type> category
+if any
+candidate accepts that category. (This bias towards string is appropriate
+since an unknown-type literal looks like a string.) Otherwise, if
+all the remaining candidates accept the same type category, select that
+category; otherwise fail because the correct choice cannot be deduced
+without more clues. Now discard
+candidates that do not accept the selected type category. Furthermore,
+if any candidate accepts a preferred type in that category,
+discard candidates that accept non-preferred types for that argument.
+Keep all candidates if none survive these tests.
+If only one candidate remains, use it; else continue to the next step.
+</para>
+</step>
+<step id="op-resol-last-unknown" performance="required">
+<para>
+If there are both <type>unknown</type> and known-type arguments, and all
+the known-type arguments have the same type, assume that the
+<type>unknown</type> arguments are also of that type, and check which
+candidates can accept that type at the <type>unknown</type>-argument
+positions. If exactly one candidate passes this test, use it.
+Otherwise, fail.
+</para>
+</step>
+</substeps>
+</step>
+</procedure>
+
+<para>
+Some examples follow.
+</para>
+
+<example>
+<title>Square Root Operator Type Resolution</title>
+
+<para>
+There is only one square root operator (prefix <literal>|/</literal>)
+defined in the standard catalog, and it takes an argument of type
+<type>double precision</type>.
+The scanner assigns an initial type of <type>integer</type> to the argument
+in this query expression:
+<screen>
+SELECT |/ 40 AS "square root of 40";
+ square root of 40
+-------------------
+ 6.324555320336759
+(1 row)
+</screen>
+
+So the parser does a type conversion on the operand and the query
+is equivalent to:
+
+<screen>
+SELECT |/ CAST(40 AS double precision) AS "square root of 40";
+</screen>
+</para>
+</example>
+
+<example>
+<title>String Concatenation Operator Type Resolution</title>
+
+<para>
+A string-like syntax is used for working with string types and for
+working with complex extension types.
+Strings with unspecified type are matched with likely operator candidates.
+</para>
+
+<para>
+An example with one unspecified argument:
+<screen>
+SELECT text 'abc' || 'def' AS "text and unknown";
+
+ text and unknown
+------------------
+ abcdef
+(1 row)
+</screen>
+</para>
+
+<para>
+In this case the parser looks to see if there is an operator taking <type>text</type>
+for both arguments. Since there is, it assumes that the second argument should
+be interpreted as type <type>text</type>.
+</para>
+
+<para>
+Here is a concatenation of two values of unspecified types:
+<screen>
+SELECT 'abc' || 'def' AS "unspecified";
+
+ unspecified
+-------------
+ abcdef
+(1 row)
+</screen>
+</para>
+
+<para>
+In this case there is no initial hint for which type to use, since no types
+are specified in the query. So, the parser looks for all candidate operators
+and finds that there are candidates accepting both string-category and
+bit-string-category inputs. Since string category is preferred when available,
+that category is selected, and then the
+preferred type for strings, <type>text</type>, is used as the specific
+type to resolve the unknown-type literals as.
+</para>
+</example>
+
+<example>
+<title>Absolute-Value and Negation Operator Type Resolution</title>
+
+<para>
+The <productname>PostgreSQL</productname> operator catalog has several
+entries for the prefix operator <literal>@</literal>, all of which implement
+absolute-value operations for various numeric data types. One of these
+entries is for type <type>float8</type>, which is the preferred type in
+the numeric category. Therefore, <productname>PostgreSQL</productname>
+will use that entry when faced with an <type>unknown</type> input:
+<screen>
+SELECT @ '-4.5' AS "abs";
+ abs
+-----
+ 4.5
+(1 row)
+</screen>
+Here the system has implicitly resolved the unknown-type literal as type
+<type>float8</type> before applying the chosen operator. We can verify that
+<type>float8</type> and not some other type was used:
+<screen>
+SELECT @ '-4.5e500' AS "abs";
+
+ERROR: "-4.5e500" is out of range for type double precision
+</screen>
+</para>
+
+<para>
+On the other hand, the prefix operator <literal>~</literal> (bitwise negation)
+is defined only for integer data types, not for <type>float8</type>. So, if we
+try a similar case with <literal>~</literal>, we get:
+<screen>
+SELECT ~ '20' AS "negation";
+
+ERROR: operator is not unique: ~ "unknown"
+HINT: Could not choose a best candidate operator. You might need to add
+explicit type casts.
+</screen>
+This happens because the system cannot decide which of the several
+possible <literal>~</literal> operators should be preferred. We can help
+it out with an explicit cast:
+<screen>
+SELECT ~ CAST('20' AS int8) AS "negation";
+
+ negation
+----------
+ -21
+(1 row)
+</screen>
+</para>
+</example>
+
+<example>
+<title>Array Inclusion Operator Type Resolution</title>
+
+<para>
+Here is another example of resolving an operator with one known and one
+unknown input:
+<screen>
+SELECT array[1,2] &lt;@ '{1,2,3}' as "is subset";
+
+ is subset
+-----------
+ t
+(1 row)
+</screen>
+The <productname>PostgreSQL</productname> operator catalog has several
+entries for the infix operator <literal>&lt;@</literal>, but the only two that
+could possibly accept an integer array on the left-hand side are
+array inclusion (<type>anyarray</type> <literal>&lt;@</literal> <type>anyarray</type>)
+and range inclusion (<type>anyelement</type> <literal>&lt;@</literal> <type>anyrange</type>).
+Since none of these polymorphic pseudo-types (see <xref
+linkend="datatype-pseudo"/>) are considered preferred, the parser cannot
+resolve the ambiguity on that basis.
+However, <xref linkend="op-resol-last-unknown"/> tells
+it to assume that the unknown-type literal is of the same type as the other
+input, that is, integer array. Now only one of the two operators can match,
+so array inclusion is selected. (Had range inclusion been selected, we would
+have gotten an error, because the string does not have the right format to be
+a range literal.)
+</para>
+</example>
+
+<example>
+<title>Custom Operator on a Domain Type</title>
+
+<para>
+Users sometimes try to declare operators applying just to a domain type.
+This is possible but is not nearly as useful as it might seem, because the
+operator resolution rules are designed to select operators applying to the
+domain's base type. As an example consider
+<screen>
+CREATE DOMAIN mytext AS text CHECK(...);
+CREATE FUNCTION mytext_eq_text (mytext, text) RETURNS boolean AS ...;
+CREATE OPERATOR = (procedure=mytext_eq_text, leftarg=mytext, rightarg=text);
+CREATE TABLE mytable (val mytext);
+
+SELECT * FROM mytable WHERE val = 'foo';
+</screen>
+This query will not use the custom operator. The parser will first see if
+there is a <type>mytext</type> <literal>=</literal> <type>mytext</type> operator
+(<xref linkend="op-resol-exact-unknown"/>), which there is not;
+then it will consider the domain's base type <type>text</type>, and see if
+there is a <type>text</type> <literal>=</literal> <type>text</type> operator
+(<xref linkend="op-resol-exact-domain"/>), which there is;
+so it resolves the <type>unknown</type>-type literal as <type>text</type> and
+uses the <type>text</type> <literal>=</literal> <type>text</type> operator.
+The only way to get the custom operator to be used is to explicitly cast
+the literal:
+<screen>
+SELECT * FROM mytable WHERE val = text 'foo';
+</screen>
+so that the <type>mytext</type> <literal>=</literal> <type>text</type> operator is found
+immediately according to the exact-match rule. If the best-match rules
+are reached, they actively discriminate against operators on domain types.
+If they did not, such an operator would create too many ambiguous-operator
+failures, because the casting rules always consider a domain as castable
+to or from its base type, and so the domain operator would be considered
+usable in all the same cases as a similarly-named operator on the base type.
+</para>
+</example>
+
+</sect1>
+
+<sect1 id="typeconv-func">
+<title>Functions</title>
+
+<indexterm zone="typeconv-func">
+ <primary>function</primary>
+ <secondary>type resolution in an invocation</secondary>
+</indexterm>
+
+ <para>
+ The specific function that is referenced by a function call
+ is determined using the following procedure.
+ </para>
+
+<procedure>
+<title>Function Type Resolution</title>
+
+<step performance="required">
+<para>
+Select the functions to be considered from the
+<classname>pg_proc</classname> system catalog. If a non-schema-qualified
+function name was used, the functions
+considered are those with the matching name and argument count that are
+visible in the current search path (see <xref linkend="ddl-schemas-path"/>).
+If a qualified function name was given, only functions in the specified
+schema are considered.
+</para>
+
+<substeps>
+<step performance="optional">
+<para>
+If the search path finds multiple functions of identical argument types,
+only the one appearing earliest in the path is considered. Functions of
+different argument types are considered on an equal footing regardless of
+search path position.
+</para>
+</step>
+<step performance="optional">
+<para>
+If a function is declared with a <literal>VARIADIC</literal> array parameter, and
+the call does not use the <literal>VARIADIC</literal> keyword, then the function
+is treated as if the array parameter were replaced by one or more occurrences
+of its element type, as needed to match the call. After such expansion the
+function might have effective argument types identical to some non-variadic
+function. In that case the function appearing earlier in the search path is
+used, or if the two functions are in the same schema, the non-variadic one is
+preferred.
+</para>
+<para>
+This creates a security hazard when calling, via qualified name
+ <footnote id="func-qualified-security">
+ <!-- If you edit this, consider editing op-qualified-security. -->
+ <para>
+ The hazard does not arise with a non-schema-qualified name, because a
+ search path containing schemas that permit untrusted users to create
+ objects is not a <link linkend="ddl-schemas-patterns">secure schema usage
+ pattern</link>.
+ </para>
+ </footnote>,
+a variadic function found in a schema that permits untrusted users to create
+objects. A malicious user can take control and execute arbitrary SQL
+functions as though you executed them. Substitute a call bearing
+the <literal>VARIADIC</literal> keyword, which bypasses this hazard. Calls
+populating <literal>VARIADIC "any"</literal> parameters often have no
+equivalent formulation containing the <literal>VARIADIC</literal> keyword. To
+issue those calls safely, the function's schema must permit only trusted users
+to create objects.
+</para>
+</step>
+<step performance="optional">
+<para>
+Functions that have default values for parameters are considered to match any
+call that omits zero or more of the defaultable parameter positions. If more
+than one such function matches a call, the one appearing earliest in the
+search path is used. If there are two or more such functions in the same
+schema with identical parameter types in the non-defaulted positions (which is
+possible if they have different sets of defaultable parameters), the system
+will not be able to determine which to prefer, and so an <quote>ambiguous
+function call</quote> error will result if no better match to the call can be
+found.
+</para>
+<para>
+This creates an availability hazard when calling, via qualified
+name<footnoteref linkend="func-qualified-security"/>, any function found in a
+schema that permits untrusted users to create objects. A malicious user can
+create a function with the name of an existing function, replicating that
+function's parameters and appending novel parameters having default values.
+This precludes new calls to the original function. To forestall this hazard,
+place functions in schemas that permit only trusted users to create objects.
+</para>
+</step>
+</substeps>
+</step>
+
+<step performance="required">
+<para>
+Check for a function accepting exactly the input argument types.
+If one exists (there can be only one exact match in the set of
+functions considered), use it. Lack of an exact match creates a security
+hazard when calling, via qualified
+name<footnoteref linkend="func-qualified-security"/>, a function found in a
+schema that permits untrusted users to create objects. In such situations,
+cast arguments to force an exact match. (Cases involving <type>unknown</type>
+will never find a match at this step.)
+</para>
+</step>
+
+<step performance="required">
+<para>
+If no exact match is found, see if the function call appears
+to be a special type conversion request. This happens if the function call
+has just one argument and the function name is the same as the (internal)
+name of some data type. Furthermore, the function argument must be either
+an unknown-type literal, or a type that is binary-coercible to the named
+data type, or a type that could be converted to the named data type by
+applying that type's I/O functions (that is, the conversion is either to or
+from one of the standard string types). When these conditions are met,
+the function call is treated as a form of <literal>CAST</literal> specification.
+ <footnote>
+ <para>
+ The reason for this step is to support function-style cast specifications
+ in cases where there is not an actual cast function. If there is a cast
+ function, it is conventionally named after its output type, and so there
+ is no need to have a special case. See
+ <xref linkend="sql-createcast"/>
+ for additional commentary.
+ </para>
+ </footnote>
+</para>
+</step>
+<step performance="required">
+<para>
+Look for the best match.
+</para>
+<substeps>
+<step performance="required">
+<para>
+Discard candidate functions for which the input types do not match
+and cannot be converted (using an implicit conversion) to match.
+<type>unknown</type> literals are
+assumed to be convertible to anything for this purpose. If only one
+candidate remains, use it; else continue to the next step.
+</para>
+</step>
+<step performance="required">
+<para>
+If any input argument is of a domain type, treat it as being of the
+domain's base type for all subsequent steps. This ensures that domains
+act like their base types for purposes of ambiguous-function resolution.
+</para>
+</step>
+<step performance="required">
+<para>
+Run through all candidates and keep those with the most exact matches
+on input types. Keep all candidates if none have exact matches.
+If only one candidate remains, use it; else continue to the next step.
+</para>
+</step>
+<step performance="required">
+<para>
+Run through all candidates and keep those that accept preferred types (of the
+input data type's type category) at the most positions where type conversion
+will be required.
+Keep all candidates if none accept preferred types.
+If only one candidate remains, use it; else continue to the next step.
+</para>
+</step>
+<step performance="required">
+<para>
+If any input arguments are <type>unknown</type>, check the type categories
+accepted
+at those argument positions by the remaining candidates. At each position,
+select the <type>string</type> category if any candidate accepts that category.
+(This bias towards string
+is appropriate since an unknown-type literal looks like a string.)
+Otherwise, if all the remaining candidates accept the same type category,
+select that category; otherwise fail because
+the correct choice cannot be deduced without more clues.
+Now discard candidates that do not accept the selected type category.
+Furthermore, if any candidate accepts a preferred type in that category,
+discard candidates that accept non-preferred types for that argument.
+Keep all candidates if none survive these tests.
+If only one candidate remains, use it; else continue to the next step.
+</para>
+</step>
+<step performance="required">
+<para>
+If there are both <type>unknown</type> and known-type arguments, and all
+the known-type arguments have the same type, assume that the
+<type>unknown</type> arguments are also of that type, and check which
+candidates can accept that type at the <type>unknown</type>-argument
+positions. If exactly one candidate passes this test, use it.
+Otherwise, fail.
+</para>
+</step>
+</substeps>
+</step>
+</procedure>
+
+<para>
+Note that the <quote>best match</quote> rules are identical for operator and
+function type resolution.
+Some examples follow.
+</para>
+
+<example>
+<title>Rounding Function Argument Type Resolution</title>
+
+<para>
+There is only one <function>round</function> function that takes two
+arguments; it takes a first argument of type <type>numeric</type> and
+a second argument of type <type>integer</type>.
+So the following query automatically converts
+the first argument of type <type>integer</type> to
+<type>numeric</type>:
+
+<screen>
+SELECT round(4, 4);
+
+ round
+--------
+ 4.0000
+(1 row)
+</screen>
+
+That query is actually transformed by the parser to:
+<screen>
+SELECT round(CAST (4 AS numeric), 4);
+</screen>
+</para>
+
+<para>
+Since numeric constants with decimal points are initially assigned the
+type <type>numeric</type>, the following query will require no type
+conversion and therefore might be slightly more efficient:
+<screen>
+SELECT round(4.0, 4);
+</screen>
+</para>
+</example>
+
+<example>
+<title>Variadic Function Resolution</title>
+
+<para>
+<screen>
+CREATE FUNCTION public.variadic_example(VARIADIC numeric[]) RETURNS int
+ LANGUAGE sql AS 'SELECT 1';
+CREATE FUNCTION
+</screen>
+
+This function accepts, but does not require, the VARIADIC keyword. It
+tolerates both integer and numeric arguments:
+
+<screen>
+SELECT public.variadic_example(0),
+ public.variadic_example(0.0),
+ public.variadic_example(VARIADIC array[0.0]);
+ variadic_example | variadic_example | variadic_example
+------------------+------------------+------------------
+ 1 | 1 | 1
+(1 row)
+</screen>
+
+However, the first and second calls will prefer more-specific functions, if
+available:
+
+<screen>
+CREATE FUNCTION public.variadic_example(numeric) RETURNS int
+ LANGUAGE sql AS 'SELECT 2';
+CREATE FUNCTION
+
+CREATE FUNCTION public.variadic_example(int) RETURNS int
+ LANGUAGE sql AS 'SELECT 3';
+CREATE FUNCTION
+
+SELECT public.variadic_example(0),
+ public.variadic_example(0.0),
+ public.variadic_example(VARIADIC array[0.0]);
+ variadic_example | variadic_example | variadic_example
+------------------+------------------+------------------
+ 3 | 2 | 1
+(1 row)
+</screen>
+
+Given the default configuration and only the first function existing, the
+first and second calls are insecure. Any user could intercept them by
+creating the second or third function. By matching the argument type exactly
+and using the <literal>VARIADIC</literal> keyword, the third call is secure.
+</para>
+</example>
+
+<example>
+<title>Substring Function Type Resolution</title>
+
+<para>
+There are several <function>substr</function> functions, one of which
+takes types <type>text</type> and <type>integer</type>. If called
+with a string constant of unspecified type, the system chooses the
+candidate function that accepts an argument of the preferred category
+<literal>string</literal> (namely of type <type>text</type>).
+
+<screen>
+SELECT substr('1234', 3);
+
+ substr
+--------
+ 34
+(1 row)
+</screen>
+</para>
+
+<para>
+If the string is declared to be of type <type>varchar</type>, as might be the case
+if it comes from a table, then the parser will try to convert it to become <type>text</type>:
+<screen>
+SELECT substr(varchar '1234', 3);
+
+ substr
+--------
+ 34
+(1 row)
+</screen>
+
+This is transformed by the parser to effectively become:
+<screen>
+SELECT substr(CAST (varchar '1234' AS text), 3);
+</screen>
+</para>
+<para>
+<note>
+<para>
+The parser learns from the <structname>pg_cast</structname> catalog that
+<type>text</type> and <type>varchar</type>
+are binary-compatible, meaning that one can be passed to a function that
+accepts the other without doing any physical conversion. Therefore, no
+type conversion call is really inserted in this case.
+</para>
+</note>
+</para>
+
+<para>
+And, if the function is called with an argument of type <type>integer</type>,
+the parser will try to convert that to <type>text</type>:
+<screen>
+SELECT substr(1234, 3);
+ERROR: function substr(integer, integer) does not exist
+HINT: No function matches the given name and argument types. You might need
+to add explicit type casts.
+</screen>
+
+This does not work because <type>integer</type> does not have an implicit cast
+to <type>text</type>. An explicit cast will work, however:
+<screen>
+SELECT substr(CAST (1234 AS text), 3);
+
+ substr
+--------
+ 34
+(1 row)
+</screen>
+</para>
+</example>
+
+</sect1>
+
+<sect1 id="typeconv-query">
+<title>Value Storage</title>
+
+ <para>
+ Values to be inserted into a table are converted to the destination
+ column's data type according to the
+ following steps.
+ </para>
+
+<procedure>
+<title>Value Storage Type Conversion</title>
+
+<step performance="required">
+<para>
+Check for an exact match with the target.
+</para>
+</step>
+
+<step performance="required">
+<para>
+Otherwise, try to convert the expression to the target type. This is possible
+if an <firstterm>assignment cast</firstterm> between the two types is registered in the
+<structname>pg_cast</structname> catalog (see <xref linkend="sql-createcast"/>).
+Alternatively, if the expression is an unknown-type literal, the contents of
+the literal string will be fed to the input conversion routine for the target
+type.
+</para>
+</step>
+
+<step performance="required">
+<para>
+Check to see if there is a sizing cast for the target type. A sizing
+cast is a cast from that type to itself. If one is found in the
+<structname>pg_cast</structname> catalog, apply it to the expression before storing
+into the destination column. The implementation function for such a cast
+always takes an extra parameter of type <type>integer</type>, which receives
+the destination column's <structfield>atttypmod</structfield> value (typically its
+declared length, although the interpretation of <structfield>atttypmod</structfield>
+varies for different data types), and it may take a third <type>boolean</type>
+parameter that says whether the cast is explicit or implicit. The cast
+function
+is responsible for applying any length-dependent semantics such as size
+checking or truncation.
+</para>
+</step>
+
+</procedure>
+
+<example>
+<title><type>character</type> Storage Type Conversion</title>
+
+<para>
+For a target column declared as <type>character(20)</type> the following
+statement shows that the stored value is sized correctly:
+
+<screen>
+CREATE TABLE vv (v character(20));
+INSERT INTO vv SELECT 'abc' || 'def';
+SELECT v, octet_length(v) FROM vv;
+
+ v | octet_length
+----------------------+--------------
+ abcdef | 20
+(1 row)
+</screen>
+</para>
+
+<para>
+What has really happened here is that the two unknown literals are resolved
+to <type>text</type> by default, allowing the <literal>||</literal> operator
+to be resolved as <type>text</type> concatenation. Then the <type>text</type>
+result of the operator is converted to <type>bpchar</type> (<quote>blank-padded
+char</quote>, the internal name of the <type>character</type> data type) to match the target
+column type. (Since the conversion from <type>text</type> to
+<type>bpchar</type> is binary-coercible, this conversion does
+not insert any real function call.) Finally, the sizing function
+<literal>bpchar(bpchar, integer, boolean)</literal> is found in the system catalog
+and applied to the operator's result and the stored column length. This
+type-specific function performs the required length check and addition of
+padding spaces.
+</para>
+</example>
+</sect1>
+
+<sect1 id="typeconv-union-case">
+<title><literal>UNION</literal>, <literal>CASE</literal>, and Related Constructs</title>
+
+<indexterm zone="typeconv-union-case">
+ <primary>UNION</primary>
+ <secondary>determination of result type</secondary>
+</indexterm>
+
+<indexterm zone="typeconv-union-case">
+ <primary>CASE</primary>
+ <secondary>determination of result type</secondary>
+</indexterm>
+
+<indexterm zone="typeconv-union-case">
+ <primary>ARRAY</primary>
+ <secondary>determination of result type</secondary>
+</indexterm>
+
+<indexterm zone="typeconv-union-case">
+ <primary>VALUES</primary>
+ <secondary>determination of result type</secondary>
+</indexterm>
+
+<indexterm zone="typeconv-union-case">
+ <primary>GREATEST</primary>
+ <secondary>determination of result type</secondary>
+</indexterm>
+
+<indexterm zone="typeconv-union-case">
+ <primary>LEAST</primary>
+ <secondary>determination of result type</secondary>
+</indexterm>
+
+<para>
+SQL <literal>UNION</literal> constructs must match up possibly dissimilar
+types to become a single result set. The resolution algorithm is
+applied separately to each output column of a union query. The
+<literal>INTERSECT</literal> and <literal>EXCEPT</literal> constructs resolve
+dissimilar types in the same way as <literal>UNION</literal>.
+Some other constructs, including
+<literal>CASE</literal>, <literal>ARRAY</literal>, <literal>VALUES</literal>,
+and the <function>GREATEST</function> and <function>LEAST</function>
+functions, use the identical
+algorithm to match up their component expressions and select a result
+data type.
+</para>
+
+<procedure>
+<title>Type Resolution for <literal>UNION</literal>, <literal>CASE</literal>,
+and Related Constructs</title>
+
+<step performance="required">
+<para>
+If all inputs are of the same type, and it is not <type>unknown</type>,
+resolve as that type.
+</para>
+</step>
+
+<step performance="required">
+<para>
+If any input is of a domain type, treat it as being of the
+domain's base type for all subsequent steps.
+ <footnote>
+ <para>
+ Somewhat like the treatment of domain inputs for operators and
+ functions, this behavior allows a domain type to be preserved through
+ a <literal>UNION</literal> or similar construct, so long as the user is
+ careful to ensure that all inputs are implicitly or explicitly of that
+ exact type. Otherwise the domain's base type will be used.
+ </para>
+ </footnote>
+</para>
+</step>
+
+<step performance="required">
+<para>
+If all inputs are of type <type>unknown</type>, resolve as type
+<type>text</type> (the preferred type of the string category).
+Otherwise, <type>unknown</type> inputs are ignored for the purposes
+of the remaining rules.
+</para>
+</step>
+
+<step performance="required">
+<para>
+If the non-unknown inputs are not all of the same type category, fail.
+</para>
+</step>
+
+<step performance="required">
+<para>
+Select the first non-unknown input type as the candidate type,
+then consider each other non-unknown input type, left to right.
+ <footnote>
+ <para>
+ For historical reasons, <literal>CASE</literal> treats
+ its <literal>ELSE</literal> clause (if any) as the <quote>first</quote>
+ input, with the <literal>THEN</literal> clauses(s) considered after
+ that. In all other cases, <quote>left to right</quote> means the order
+ in which the expressions appear in the query text.
+ </para>
+ </footnote>
+If the candidate type can be implicitly converted to the other type,
+but not vice-versa, select the other type as the new candidate type.
+Then continue considering the remaining inputs. If, at any stage of this
+process, a preferred type is selected, stop considering additional
+inputs.
+</para>
+</step>
+
+<step performance="required">
+<para>
+Convert all inputs to the final candidate type. Fail if there is not an
+implicit conversion from a given input type to the candidate type.
+</para>
+</step>
+</procedure>
+
+<para>
+Some examples follow.
+</para>
+
+<example>
+<title>Type Resolution with Underspecified Types in a Union</title>
+
+<para>
+<screen>
+SELECT text 'a' AS "text" UNION SELECT 'b';
+
+ text
+------
+ a
+ b
+(2 rows)
+</screen>
+Here, the unknown-type literal <literal>'b'</literal> will be resolved to type <type>text</type>.
+</para>
+</example>
+
+<example>
+<title>Type Resolution in a Simple Union</title>
+
+<para>
+<screen>
+SELECT 1.2 AS "numeric" UNION SELECT 1;
+
+ numeric
+---------
+ 1
+ 1.2
+(2 rows)
+</screen>
+The literal <literal>1.2</literal> is of type <type>numeric</type>,
+and the <type>integer</type> value <literal>1</literal> can be cast implicitly to
+<type>numeric</type>, so that type is used.
+</para>
+</example>
+
+<example>
+<title>Type Resolution in a Transposed Union</title>
+
+<para>
+<screen>
+SELECT 1 AS "real" UNION SELECT CAST('2.2' AS REAL);
+
+ real
+------
+ 1
+ 2.2
+(2 rows)
+</screen>
+Here, since type <type>real</type> cannot be implicitly cast to <type>integer</type>,
+but <type>integer</type> can be implicitly cast to <type>real</type>, the union
+result type is resolved as <type>real</type>.
+</para>
+</example>
+
+<example>
+<title>Type Resolution in a Nested Union</title>
+
+<para>
+<screen>
+SELECT NULL UNION SELECT NULL UNION SELECT 1;
+
+ERROR: UNION types text and integer cannot be matched
+</screen>
+This failure occurs because <productname>PostgreSQL</productname> treats
+multiple <literal>UNION</literal>s as a nest of pairwise operations;
+that is, this input is the same as
+<screen>
+(SELECT NULL UNION SELECT NULL) UNION SELECT 1;
+</screen>
+The inner <literal>UNION</literal> is resolved as emitting
+type <type>text</type>, according to the rules given above. Then the
+outer <literal>UNION</literal> has inputs of types <type>text</type>
+and <type>integer</type>, leading to the observed error. The problem
+can be fixed by ensuring that the leftmost <literal>UNION</literal>
+has at least one input of the desired result type.
+</para>
+
+<para>
+<literal>INTERSECT</literal> and <literal>EXCEPT</literal> operations are
+likewise resolved pairwise. However, the other constructs described in this
+section consider all of their inputs in one resolution step.
+</para>
+</example>
+</sect1>
+
+<sect1 id="typeconv-select">
+<title><literal>SELECT</literal> Output Columns</title>
+
+<indexterm zone="typeconv-select">
+ <primary>SELECT</primary>
+ <secondary>determination of result type</secondary>
+</indexterm>
+
+<para>
+The rules given in the preceding sections will result in assignment
+of non-<type>unknown</type> data types to all expressions in an SQL query,
+except for unspecified-type literals that appear as simple output
+columns of a <command>SELECT</command> command. For example, in
+
+<screen>
+SELECT 'Hello World';
+</screen>
+
+there is nothing to identify what type the string literal should be
+taken as. In this situation <productname>PostgreSQL</productname> will fall back
+to resolving the literal's type as <type>text</type>.
+</para>
+
+<para>
+When the <command>SELECT</command> is one arm of a <literal>UNION</literal>
+(or <literal>INTERSECT</literal> or <literal>EXCEPT</literal>) construct, or when it
+appears within <command>INSERT ... SELECT</command>, this rule is not applied
+since rules given in preceding sections take precedence. The type of an
+unspecified-type literal can be taken from the other <literal>UNION</literal> arm
+in the first case, or from the destination column in the second case.
+</para>
+
+<para>
+<literal>RETURNING</literal> lists are treated the same as <command>SELECT</command>
+output lists for this purpose.
+</para>
+
+<note>
+ <para>
+ Prior to <productname>PostgreSQL</productname> 10, this rule did not exist, and
+ unspecified-type literals in a <command>SELECT</command> output list were
+ left as type <type>unknown</type>. That had assorted bad consequences,
+ so it's been changed.
+ </para>
+</note>
+
+</sect1>
+</chapter>