1 files changed, 302 insertions, 0 deletions
diff --git a/doc/src/sgml/html/xtypes.html b/doc/src/sgml/html/xtypes.html
new file mode 100644
index 0000000..c370c6f
--- /dev/null
+++ b/doc/src/sgml/html/xtypes.html
@@ -0,0 +1,302 @@
+<?xml version="1.0" encoding="UTF-8" standalone="no"?>
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>38.13. User-Defined Types</title><link rel="stylesheet" type="text/css" href="stylesheet.css" /><link rev="made" href="pgsql-docs@lists.postgresql.org" /><meta name="generator" content="DocBook XSL Stylesheets Vsnapshot" /><link rel="prev" href="xaggr.html" title="38.12. User-Defined Aggregates" /><link rel="next" href="xoper.html" title="38.14. User-Defined Operators" /></head><body id="docContent" class="container-fluid col-10"><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="5" align="center">38.13. User-Defined Types</th></tr><tr><td width="10%" align="left"><a accesskey="p" href="xaggr.html" title="38.12. User-Defined Aggregates">Prev</a> </td><td width="10%" align="left"><a accesskey="u" href="extend.html" title="Chapter 38. Extending SQL">Up</a></td><th width="60%" align="center">Chapter 38. Extending <acronym class="acronym">SQL</acronym></th><td width="10%" align="right"><a accesskey="h" href="index.html" title="PostgreSQL 15.5 Documentation">Home</a></td><td width="10%" align="right"> <a accesskey="n" href="xoper.html" title="38.14. User-Defined Operators">Next</a></td></tr></table><hr /></div><div class="sect1" id="XTYPES"><div class="titlepage"><div><div><h2 class="title" style="clear: both">38.13. User-Defined Types</h2></div></div></div><div class="toc"><dl class="toc"><dt><span class="sect2"><a href="xtypes.html#XTYPES-TOAST">38.13.1. TOAST Considerations</a></span></dt></dl></div><a id="id-1.8.3.16.2" class="indexterm"></a><p>
+   As described in <a class="xref" href="extend-type-system.html" title="38.2. The PostgreSQL Type System">Section 38.2</a>,
+   <span class="productname">PostgreSQL</span> can be extended to support new
+   data types.  This section describes how to define new base types,
+   which are data types defined below the level of the <acronym class="acronym">SQL</acronym>
+   language.  Creating a new base type requires implementing functions
+   to operate on the type in a low-level language, usually C.
+  </p><p>
+   The examples in this section can be found in
+   <code class="filename">complex.sql</code> and <code class="filename">complex.c</code>
+   in the <code class="filename">src/tutorial</code> directory of the source distribution.
+   See the <code class="filename">README</code> file in that directory for instructions
+   about running the examples.
+  </p><p>
+  <a id="id-1.8.3.16.5.1" class="indexterm"></a>
+  <a id="id-1.8.3.16.5.2" class="indexterm"></a>
+  A user-defined type must always have input and output functions.
+  These functions determine how the type appears in strings (for input
+  by the user and output to the user) and how the type is organized in
+  memory.  The input function takes a null-terminated character string
+  as its argument and returns the internal (in memory) representation
+  of the type.  The output function takes the internal representation
+  of the type as argument and returns a null-terminated character
+  string.  If we want to do anything more with the type than merely
+  store it, we must provide additional functions to implement whatever
+  operations we'd like to have for the type.
+ </p><p>
+  Suppose we want to define a type <code class="type">complex</code> that represents
+  complex numbers. A natural way to represent a complex number in
+  memory would be the following C structure:
+
+</p><pre class="programlisting">
+typedef struct Complex {
+    double      x;
+    double      y;
+} Complex;
+</pre><p>
+
+  We will need to make this a pass-by-reference type, since it's too
+  large to fit into a single <code class="type">Datum</code> value.
+ </p><p>
+  As the external string representation of the type, we choose a
+  string of the form <code class="literal">(x,y)</code>.
+ </p><p>
+  The input and output functions are usually not hard to write,
+  especially the output function.  But when defining the external
+  string representation of the type, remember that you must eventually
+  write a complete and robust parser for that representation as your
+  input function.  For instance:
+
+</p><pre class="programlisting">
+PG_FUNCTION_INFO_V1(complex_in);
+
+Datum
+complex_in(PG_FUNCTION_ARGS)
+{
+    char       *str = PG_GETARG_CSTRING(0);
+    double      x,
+                y;
+    Complex    *result;
+
+    if (sscanf(str, " ( %lf , %lf )", &amp;x, &amp;y) != 2)
+        ereport(ERROR,
+                (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+                 errmsg("invalid input syntax for type %s: \"%s\"",
+                        "complex", str)));
+
+    result = (Complex *) palloc(sizeof(Complex));
+    result-&gt;x = x;
+    result-&gt;y = y;
+    PG_RETURN_POINTER(result);
+}
+
+</pre><p>
+
+  The output function can simply be:
+
+</p><pre class="programlisting">
+PG_FUNCTION_INFO_V1(complex_out);
+
+Datum
+complex_out(PG_FUNCTION_ARGS)
+{
+    Complex    *complex = (Complex *) PG_GETARG_POINTER(0);
+    char       *result;
+
+    result = psprintf("(%g,%g)", complex-&gt;x, complex-&gt;y);
+    PG_RETURN_CSTRING(result);
+}
+
+</pre><p>
+ </p><p>
+  You should be careful to make the input and output functions inverses of
+  each other.  If you do not, you will have severe problems when you
+  need to dump your data into a file and then read it back in.  This
+  is a particularly common problem when floating-point numbers are
+  involved.
+ </p><p>
+  Optionally, a user-defined type can provide binary input and output
+  routines.  Binary I/O is normally faster but less portable than textual
+  I/O.  As with textual I/O, it is up to you to define exactly what the
+  external binary representation is.  Most of the built-in data types
+  try to provide a machine-independent binary representation.  For
+  <code class="type">complex</code>, we will piggy-back on the binary I/O converters
+  for type <code class="type">float8</code>:
+
+</p><pre class="programlisting">
+PG_FUNCTION_INFO_V1(complex_recv);
+
+Datum
+complex_recv(PG_FUNCTION_ARGS)
+{
+    StringInfo  buf = (StringInfo) PG_GETARG_POINTER(0);
+    Complex    *result;
+
+    result = (Complex *) palloc(sizeof(Complex));
+    result-&gt;x = pq_getmsgfloat8(buf);
+    result-&gt;y = pq_getmsgfloat8(buf);
+    PG_RETURN_POINTER(result);
+}
+
+PG_FUNCTION_INFO_V1(complex_send);
+
+Datum
+complex_send(PG_FUNCTION_ARGS)
+{
+    Complex    *complex = (Complex *) PG_GETARG_POINTER(0);
+    StringInfoData buf;
+
+    pq_begintypsend(&amp;buf);
+    pq_sendfloat8(&amp;buf, complex-&gt;x);
+    pq_sendfloat8(&amp;buf, complex-&gt;y);
+    PG_RETURN_BYTEA_P(pq_endtypsend(&amp;buf));
+}
+
+</pre><p>
+ </p><p>
+  Once we have written the I/O functions and compiled them into a shared
+  library, we can define the <code class="type">complex</code> type in SQL.
+  First we declare it as a shell type:
+
+</p><pre class="programlisting">
+CREATE TYPE complex;
+</pre><p>
+
+  This serves as a placeholder that allows us to reference the type while
+  defining its I/O functions.  Now we can define the I/O functions:
+
+</p><pre class="programlisting">
+CREATE FUNCTION complex_in(cstring)
+    RETURNS complex
+    AS '<em class="replaceable"><code>filename</code></em>'
+    LANGUAGE C IMMUTABLE STRICT;
+
+CREATE FUNCTION complex_out(complex)
+    RETURNS cstring
+    AS '<em class="replaceable"><code>filename</code></em>'
+    LANGUAGE C IMMUTABLE STRICT;
+
+CREATE FUNCTION complex_recv(internal)
+   RETURNS complex
+   AS '<em class="replaceable"><code>filename</code></em>'
+   LANGUAGE C IMMUTABLE STRICT;
+
+CREATE FUNCTION complex_send(complex)
+   RETURNS bytea
+   AS '<em class="replaceable"><code>filename</code></em>'
+   LANGUAGE C IMMUTABLE STRICT;
+</pre><p>
+ </p><p>
+  Finally, we can provide the full definition of the data type:
+</p><pre class="programlisting">
+CREATE TYPE complex (
+   internallength = 16,
+   input = complex_in,
+   output = complex_out,
+   receive = complex_recv,
+   send = complex_send,
+   alignment = double
+);
+</pre><p>
+ </p><p>
+  <a id="id-1.8.3.16.13.1" class="indexterm"></a>
+  When you define a new base type,
+  <span class="productname">PostgreSQL</span> automatically provides support
+  for arrays of that type.  The array type typically
+  has the same name as the base type with the underscore character
+  (<code class="literal">_</code>) prepended.
+ </p><p>
+  Once the data type exists, we can declare additional functions to
+  provide useful operations on the data type.  Operators can then be
+  defined atop the functions, and if needed, operator classes can be
+  created to support indexing of the data type.  These additional
+  layers are discussed in following sections.
+ </p><p>
+  If the internal representation of the data type is variable-length, the
+  internal representation must follow the standard layout for variable-length
+  data: the first four bytes must be a <code class="type">char[4]</code> field which is
+  never accessed directly (customarily named <code class="structfield">vl_len_</code>). You
+  must use the <code class="function">SET_VARSIZE()</code> macro to store the total
+  size of the datum (including the length field itself) in this field
+  and <code class="function">VARSIZE()</code> to retrieve it.  (These macros exist
+  because the length field may be encoded depending on platform.)
+ </p><p>
+  For further details see the description of the
+  <a class="xref" href="sql-createtype.html" title="CREATE TYPE"><span class="refentrytitle">CREATE TYPE</span></a> command.
+ </p><div class="sect2" id="XTYPES-TOAST"><div class="titlepage"><div><div><h3 class="title">38.13.1. TOAST Considerations</h3></div></div></div><a id="id-1.8.3.16.17.2" class="indexterm"></a><p>
+  If the values of your data type vary in size (in internal form), it's
+  usually desirable to make the data type <acronym class="acronym">TOAST</acronym>-able (see <a class="xref" href="storage-toast.html" title="73.2. TOAST">Section 73.2</a>). You should do this even if the values are always
+  too small to be compressed or stored externally, because
+  <acronym class="acronym">TOAST</acronym> can save space on small data too, by reducing header
+  overhead.
+ </p><p>
+  To support <acronym class="acronym">TOAST</acronym> storage, the C functions operating on the data
+  type must always be careful to unpack any toasted values they are handed
+  by using <code class="function">PG_DETOAST_DATUM</code>.  (This detail is customarily hidden
+  by defining type-specific <code class="function">GETARG_DATATYPE_P</code> macros.)
+  Then, when running the <code class="command">CREATE TYPE</code> command, specify the
+  internal length as <code class="literal">variable</code> and select some appropriate storage
+  option other than <code class="literal">plain</code>.
+ </p><p>
+  If data alignment is unimportant (either just for a specific function or
+  because the data type specifies byte alignment anyway) then it's possible
+  to avoid some of the overhead of <code class="function">PG_DETOAST_DATUM</code>. You can use
+  <code class="function">PG_DETOAST_DATUM_PACKED</code> instead (customarily hidden by
+  defining a <code class="function">GETARG_DATATYPE_PP</code> macro) and using the macros
+  <code class="function">VARSIZE_ANY_EXHDR</code> and <code class="function">VARDATA_ANY</code> to access
+  a potentially-packed datum.
+  Again, the data returned by these macros is not aligned even if the data
+  type definition specifies an alignment. If the alignment is important you
+  must go through the regular <code class="function">PG_DETOAST_DATUM</code> interface.
+ </p><div class="note"><h3 class="title">Note</h3><p>
+   Older code frequently declares <code class="structfield">vl_len_</code> as an
+   <code class="type">int32</code> field instead of <code class="type">char[4]</code>.  This is OK as long as
+   the struct definition has other fields that have at least <code class="type">int32</code>
+   alignment.  But it is dangerous to use such a struct definition when
+   working with a potentially unaligned datum; the compiler may take it as
+   license to assume the datum actually is aligned, leading to core dumps on
+   architectures that are strict about alignment.
+  </p></div><p>
+  Another feature that's enabled by <acronym class="acronym">TOAST</acronym> support is the
+  possibility of having an <em class="firstterm">expanded</em> in-memory data
+  representation that is more convenient to work with than the format that
+  is stored on disk.  The regular or <span class="quote">“<span class="quote">flat</span>”</span> varlena storage format
+  is ultimately just a blob of bytes; it cannot for example contain
+  pointers, since it may get copied to other locations in memory.
+  For complex data types, the flat format may be quite expensive to work
+  with, so <span class="productname">PostgreSQL</span> provides a way to <span class="quote">“<span class="quote">expand</span>”</span>
+  the flat format into a representation that is more suited to computation,
+  and then pass that format in-memory between functions of the data type.
+ </p><p>
+  To use expanded storage, a data type must define an expanded format that
+  follows the rules given in <code class="filename">src/include/utils/expandeddatum.h</code>,
+  and provide functions to <span class="quote">“<span class="quote">expand</span>”</span> a flat varlena value into
+  expanded format and <span class="quote">“<span class="quote">flatten</span>”</span> the expanded format back to the
+  regular varlena representation.  Then ensure that all C functions for
+  the data type can accept either representation, possibly by converting
+  one into the other immediately upon receipt.  This does not require fixing
+  all existing functions for the data type at once, because the standard
+  <code class="function">PG_DETOAST_DATUM</code> macro is defined to convert expanded inputs
+  into regular flat format.  Therefore, existing functions that work with
+  the flat varlena format will continue to work, though slightly
+  inefficiently, with expanded inputs; they need not be converted until and
+  unless better performance is important.
+ </p><p>
+  C functions that know how to work with an expanded representation
+  typically fall into two categories: those that can only handle expanded
+  format, and those that can handle either expanded or flat varlena inputs.
+  The former are easier to write but may be less efficient overall, because
+  converting a flat input to expanded form for use by a single function may
+  cost more than is saved by operating on the expanded format.
+  When only expanded format need be handled, conversion of flat inputs to
+  expanded form can be hidden inside an argument-fetching macro, so that
+  the function appears no more complex than one working with traditional
+  varlena input.
+  To handle both types of input, write an argument-fetching function that
+  will detoast external, short-header, and compressed varlena inputs, but
+  not expanded inputs.  Such a function can be defined as returning a
+  pointer to a union of the flat varlena format and the expanded format.
+  Callers can use the <code class="function">VARATT_IS_EXPANDED_HEADER()</code> macro to
+  determine which format they received.
+ </p><p>
+  The <acronym class="acronym">TOAST</acronym> infrastructure not only allows regular varlena
+  values to be distinguished from expanded values, but also
+  distinguishes <span class="quote">“<span class="quote">read-write</span>”</span> and <span class="quote">“<span class="quote">read-only</span>”</span> pointers to
+  expanded values.  C functions that only need to examine an expanded
+  value, or will only change it in safe and non-semantically-visible ways,
+  need not care which type of pointer they receive.  C functions that
+  produce a modified version of an input value are allowed to modify an
+  expanded input value in-place if they receive a read-write pointer, but
+  must not modify the input if they receive a read-only pointer; in that
+  case they have to copy the value first, producing a new value to modify.
+  A C function that has constructed a new expanded value should always
+  return a read-write pointer to it.  Also, a C function that is modifying
+  a read-write expanded value in-place should take care to leave the value
+  in a sane state if it fails partway through.
+ </p><p>
+  For examples of working with expanded values, see the standard array
+  infrastructure, particularly
+  <code class="filename">src/backend/utils/adt/array_expanded.c</code>.
+ </p></div></div><div class="navfooter"><hr /><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="xaggr.html" title="38.12. User-Defined Aggregates">Prev</a> </td><td width="20%" align="center"><a accesskey="u" href="extend.html" title="Chapter 38. Extending SQL">Up</a></td><td width="40%" align="right"> <a accesskey="n" href="xoper.html" title="38.14. User-Defined Operators">Next</a></td></tr><tr><td width="40%" align="left" valign="top">38.12. User-Defined Aggregates </td><td width="20%" align="center"><a accesskey="h" href="index.html" title="PostgreSQL 15.5 Documentation">Home</a></td><td width="40%" align="right" valign="top"> 38.14. User-Defined Operators</td></tr></table></div></body></html>
+\ No newline at end of file