author | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-05-04 12:15:05 +0000 |
---|---|---|
committer | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-05-04 12:15:05 +0000 |
commit | 46651ce6fe013220ed397add242004d764fc0153 (patch) | |
tree | 6e5299f990f88e60174a1d3ae6e48eedd2688b2b /doc/src/sgml/html/xfunc-c.html | |
parent | Initial commit. (diff) | |
Adding upstream version 14.5. (refs: upstream/14.5, upstream)
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'doc/src/sgml/html/xfunc-c.html')
-rw-r--r-- | doc/src/sgml/html/xfunc-c.html | 1386 |
1 files changed, 1386 insertions, 0 deletions
diff --git a/doc/src/sgml/html/xfunc-c.html b/doc/src/sgml/html/xfunc-c.html new file mode 100644 index 0000000..b28a3be --- /dev/null +++ b/doc/src/sgml/html/xfunc-c.html @@ -0,0 +1,1386 @@ +<?xml version="1.0" encoding="UTF-8" standalone="no"?> +<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>38.10. C-Language Functions</title><link rel="stylesheet" type="text/css" href="stylesheet.css" /><link rev="made" href="pgsql-docs@lists.postgresql.org" /><meta name="generator" content="DocBook XSL Stylesheets Vsnapshot" /><link rel="prev" href="xfunc-internal.html" title="38.9. Internal Functions" /><link rel="next" href="xfunc-optimization.html" title="38.11. Function Optimization Information" /></head><body id="docContent" class="container-fluid col-10"><div xmlns="http://www.w3.org/TR/xhtml1/transitional" class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="5" align="center">38.10. C-Language Functions</th></tr><tr><td width="10%" align="left"><a accesskey="p" href="xfunc-internal.html" title="38.9. Internal Functions">Prev</a> </td><td width="10%" align="left"><a accesskey="u" href="extend.html" title="Chapter 38. Extending SQL">Up</a></td><th width="60%" align="center">Chapter 38. Extending <acronym xmlns="http://www.w3.org/1999/xhtml" class="acronym">SQL</acronym></th><td width="10%" align="right"><a accesskey="h" href="index.html" title="PostgreSQL 14.5 Documentation">Home</a></td><td width="10%" align="right"> <a accesskey="n" href="xfunc-optimization.html" title="38.11. Function Optimization Information">Next</a></td></tr></table><hr></hr></div><div class="sect1" id="XFUNC-C"><div class="titlepage"><div><div><h2 class="title" style="clear: both">38.10. 
C-Language Functions</h2></div></div></div><div class="toc"><dl class="toc"><dt><span class="sect2"><a href="xfunc-c.html#XFUNC-C-DYNLOAD">38.10.1. Dynamic Loading</a></span></dt><dt><span class="sect2"><a href="xfunc-c.html#XFUNC-C-BASETYPE">38.10.2. Base Types in C-Language Functions</a></span></dt><dt><span class="sect2"><a href="xfunc-c.html#id-1.8.3.13.7">38.10.3. Version 1 Calling Conventions</a></span></dt><dt><span class="sect2"><a href="xfunc-c.html#id-1.8.3.13.8">38.10.4. Writing Code</a></span></dt><dt><span class="sect2"><a href="xfunc-c.html#DFUNC">38.10.5. Compiling and Linking Dynamically-Loaded Functions</a></span></dt><dt><span class="sect2"><a href="xfunc-c.html#id-1.8.3.13.10">38.10.6. Composite-Type Arguments</a></span></dt><dt><span class="sect2"><a href="xfunc-c.html#id-1.8.3.13.11">38.10.7. Returning Rows (Composite Types)</a></span></dt><dt><span class="sect2"><a href="xfunc-c.html#XFUNC-C-RETURN-SET">38.10.8. Returning Sets</a></span></dt><dt><span class="sect2"><a href="xfunc-c.html#id-1.8.3.13.13">38.10.9. Polymorphic Arguments and Return Types</a></span></dt><dt><span class="sect2"><a href="xfunc-c.html#XFUNC-SHARED-ADDIN">38.10.10. Shared Memory and LWLocks</a></span></dt><dt><span class="sect2"><a href="xfunc-c.html#EXTEND-CPP">38.10.11. Using C++ for Extensibility</a></span></dt></dl></div><a id="id-1.8.3.13.2" class="indexterm"></a><p> + User-defined functions can be written in C (or a language that can + be made compatible with C, such as C++). Such functions are + compiled into dynamically loadable objects (also called shared + libraries) and are loaded by the server on demand. The dynamic + loading feature is what distinguishes <span class="quote">“<span class="quote">C language</span>”</span> functions + from <span class="quote">“<span class="quote">internal</span>”</span> functions — the actual coding conventions + are essentially the same for both. 
(Hence, the standard internal + function library is a rich source of coding examples for user-defined + C functions.) + </p><p> + Currently only one calling convention is used for C functions + (<span class="quote">“<span class="quote">version 1</span>”</span>). Support for that calling convention is + indicated by writing a <code class="literal">PG_FUNCTION_INFO_V1()</code> macro + call for the function, as illustrated below. + </p><div class="sect2" id="XFUNC-C-DYNLOAD"><div class="titlepage"><div><div><h3 class="title">38.10.1. Dynamic Loading</h3></div></div></div><a id="id-1.8.3.13.5.2" class="indexterm"></a><p> + The first time a user-defined function in a particular + loadable object file is called in a session, + the dynamic loader loads that object file into memory so that the + function can be called. The <code class="command">CREATE FUNCTION</code> + for a user-defined C function must therefore specify two pieces of + information for the function: the name of the loadable + object file, and the C name (link symbol) of the specific function to call + within that object file. If the C name is not explicitly specified then + it is assumed to be the same as the SQL function name. + </p><p> + The following algorithm is used to locate the shared object file + based on the name given in the <code class="command">CREATE FUNCTION</code> + command: + + </p><div class="orderedlist"><ol class="orderedlist" type="1"><li class="listitem"><p> + If the name is an absolute path, the given file is loaded. 
+ </p></li><li class="listitem"><p> + If the name starts with the string <code class="literal">$libdir</code>, + that part is replaced by the <span class="productname">PostgreSQL</span> package + library directory + name, which is determined at build time.<a id="id-1.8.3.13.5.4.2.2.1.3" class="indexterm"></a> + </p></li><li class="listitem"><p> + If the name does not contain a directory part, the file is + searched for in the path specified by the configuration variable + <a class="xref" href="runtime-config-client.html#GUC-DYNAMIC-LIBRARY-PATH">dynamic_library_path</a>.<a id="id-1.8.3.13.5.4.2.3.1.2" class="indexterm"></a> + </p></li><li class="listitem"><p> + Otherwise (the file was not found in the path, or it contains a + non-absolute directory part), the dynamic loader will try to + take the name as given, which will most likely fail. (It is + unreliable to depend on the current working directory.) + </p></li></ol></div><p> + + If this sequence does not work, the platform-specific shared + library file name extension (often <code class="filename">.so</code>) is + appended to the given name and this sequence is tried again. If + that fails as well, the load will fail. + </p><p> + It is recommended to locate shared libraries either relative to + <code class="literal">$libdir</code> or through the dynamic library path. + This simplifies version upgrades if the new installation is at a + different location. The actual directory that + <code class="literal">$libdir</code> stands for can be found out with the + command <code class="literal">pg_config --pkglibdir</code>. + </p><p> + The user ID the <span class="productname">PostgreSQL</span> server runs + as must be able to traverse the path to the file you intend to + load. Making the file or a higher-level directory not readable + and/or not executable by the <span class="systemitem">postgres</span> + user is a common mistake. 
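+ </p><p>
+ As a rough illustration of the lookup rules listed earlier in this section, the following self-contained C sketch classifies a library name the way the numbered steps describe and expands a leading <code class="literal">$libdir</code>. The names <code class="function">classify_library_name</code>, <code class="function">expand_libdir</code> and <code class="type">LibNameKind</code> are hypothetical and are not part of <span class="productname">PostgreSQL</span>. The sketch assumes Unix-style paths, does not touch the file system, and omits the retry with the platform-specific extension.
+ </p>

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical classification of the library name given to CREATE FUNCTION. */
typedef enum
{
    LIB_ABSOLUTE,       /* absolute path: the given file is loaded */
    LIB_LIBDIR_MACRO,   /* starts with $libdir: substitute the package library directory */
    LIB_SEARCH_PATH,    /* no directory part: search dynamic_library_path */
    LIB_AS_GIVEN        /* relative name with a directory part: tried as given */
} LibNameKind;

/* Assumption: Unix-style paths, so "absolute" means a leading '/'. */
static LibNameKind
classify_library_name(const char *name)
{
    assert(name != NULL);

    if (name[0] == '/')
        return LIB_ABSOLUTE;
    if (strncmp(name, "$libdir", strlen("$libdir")) == 0)
        return LIB_LIBDIR_MACRO;
    if (strchr(name, '/') == NULL)
        return LIB_SEARCH_PATH;
    return LIB_AS_GIVEN;
}

/*
 * Replace a leading "$libdir" with pkglibdir (the directory that
 * "pg_config --pkglibdir" reports).  Names without the macro are copied
 * unchanged.  Returns 0 on success, -1 if the result does not fit in out.
 */
static int
expand_libdir(const char *name, const char *pkglibdir,
              char *out, size_t outsz)
{
    const size_t macro_len = strlen("$libdir");
    int          n;

    assert(name != NULL && pkglibdir != NULL && out != NULL);

    if (strncmp(name, "$libdir", macro_len) == 0)
        n = snprintf(out, outsz, "%s%s", pkglibdir, name + macro_len);
    else
        n = snprintf(out, outsz, "%s", name);

    return (n >= 0 && (size_t) n < outsz) ? 0 : -1;
}
```

+ <p>
+ With these helpers, <code class="literal">'funcs'</code> would be classified for a search of the dynamic library path, while <code class="literal">'subdir/funcs'</code> falls into the last, unreliable case that depends on the current working directory.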
+ </p><p> + In any case, the file name that is given in the + <code class="command">CREATE FUNCTION</code> command is recorded literally + in the system catalogs, so if the file needs to be loaded again + the same procedure is applied. + </p><div class="note"><h3 class="title">Note</h3><p> + <span class="productname">PostgreSQL</span> will not compile a C function + automatically. The object file must be compiled before it is referenced + in a <code class="command">CREATE + FUNCTION</code> command. See <a class="xref" href="xfunc-c.html#DFUNC" title="38.10.5. Compiling and Linking Dynamically-Loaded Functions">Section 38.10.5</a> for additional + information. + </p></div><a id="id-1.8.3.13.5.9" class="indexterm"></a><p> + To ensure that a dynamically loaded object file is not loaded into an + incompatible server, <span class="productname">PostgreSQL</span> checks that the + file contains a <span class="quote">“<span class="quote">magic block</span>”</span> with the appropriate contents. + This allows the server to detect obvious incompatibilities, such as code + compiled for a different major version of + <span class="productname">PostgreSQL</span>. To include a magic block, + write this in one (and only one) of the module source files, after having + included the header <code class="filename">fmgr.h</code>: + +</p><pre class="programlisting"> +PG_MODULE_MAGIC; +</pre><p> + </p><p> + After it is used for the first time, a dynamically loaded object + file is retained in memory. Future calls in the same session to + the function(s) in that file will only incur the small overhead of + a symbol table lookup. If you need to force a reload of an object + file, for example after recompiling it, begin a fresh session. 
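+ </p><p>
+ To make the purpose of the magic block described earlier in this section more concrete, here is a simplified, self-contained model of such a compatibility check. It is only a sketch: <code class="type">ModelMagic</code> and <code class="function">model_magic_compatible</code> are hypothetical names, the choice of fields is illustrative, and real modules should simply write <code class="literal">PG_MODULE_MAGIC;</code> rather than build anything like this by hand.
+ </p>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the build-time facts a magic block records. */
typedef struct
{
    int len;            /* size of this struct, guards against layout changes */
    int major_version;  /* major version the code was compiled for */
    int funcmaxargs;    /* build-time limit on function arguments */
    int namedatalen;    /* build-time limit on identifier length */
} ModelMagic;

/*
 * Return true only if the module was built with the same settings as the
 * server.  A NULL module pointer models a library with no magic block.
 */
static bool
model_magic_compatible(const ModelMagic *module, const ModelMagic *server)
{
    assert(server != NULL);

    if (module == NULL)
        return false;

    return module->len == server->len &&
           module->major_version == server->major_version &&
           module->funcmaxargs == server->funcmaxargs &&
           module->namedatalen == server->namedatalen;
}
```

+ <p>
+ In this model, a library compiled for a different major version, or one that carries no block at all, is rejected before any of its functions can be called.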
+ </p><a id="id-1.8.3.13.5.12" class="indexterm"></a><a id="id-1.8.3.13.5.13" class="indexterm"></a><a id="id-1.8.3.13.5.14" class="indexterm"></a><a id="id-1.8.3.13.5.15" class="indexterm"></a><p> + Optionally, a dynamically loaded file can contain initialization and + finalization functions. If the file includes a function named + <code class="function">_PG_init</code>, that function will be called immediately after + loading the file. The function receives no parameters and should + return void. If the file includes a function named + <code class="function">_PG_fini</code>, that function will be called immediately before + unloading the file. Likewise, the function receives no parameters and + should return void. Note that <code class="function">_PG_fini</code> will only be called + during an unload of the file, not during process termination. + (Presently, unloads are disabled and will never occur, but this may + change in the future.) + </p></div><div class="sect2" id="XFUNC-C-BASETYPE"><div class="titlepage"><div><div><h3 class="title">38.10.2. Base Types in C-Language Functions</h3></div></div></div><a id="id-1.8.3.13.6.2" class="indexterm"></a><p> + To know how to write C-language functions, you need to know how + <span class="productname">PostgreSQL</span> internally represents base + data types and how they can be passed to and from functions. + Internally, <span class="productname">PostgreSQL</span> regards a base + type as a <span class="quote">“<span class="quote">blob of memory</span>”</span>. The user-defined + functions that you define over a type in turn define the way that + <span class="productname">PostgreSQL</span> can operate on it. That + is, <span class="productname">PostgreSQL</span> will only store and + retrieve the data from disk and use your user-defined functions + to input, process, and output the data. 
+ </p><p> + Base types can have one of three internal formats: + + </p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p> + pass by value, fixed-length + </p></li><li class="listitem"><p> + pass by reference, fixed-length + </p></li><li class="listitem"><p> + pass by reference, variable-length + </p></li></ul></div><p> + </p><p> + By-value types can only be 1, 2, or 4 bytes in length + (also 8 bytes, if <code class="literal">sizeof(Datum)</code> is 8 on your machine). + You should be careful to define your types such that they will be the + same size (in bytes) on all architectures. For example, the + <code class="literal">long</code> type is dangerous because it is 4 bytes on some + machines and 8 bytes on others, whereas the <code class="type">int</code> type is 4 bytes + on most Unix machines. A reasonable implementation of the + <code class="type">int4</code> type on Unix machines might be: + +</p><pre class="programlisting"> +/* 4-byte integer, passed by value */ +typedef int int4; +</pre><p> + + (The actual PostgreSQL C code calls this type <code class="type">int32</code>, because + it is a convention in C that <code class="type">int<em class="replaceable"><code>XX</code></em></code> + means <em class="replaceable"><code>XX</code></em> <span class="emphasis"><em>bits</em></span>. Note + therefore also that the C type <code class="type">int8</code> is 1 byte in size. The + SQL type <code class="type">int8</code> is called <code class="type">int64</code> in C. See also + <a class="xref" href="xfunc-c.html#XFUNC-C-TYPE-TABLE" title="Table 38.2. Equivalent C Types for Built-in SQL Types">Table 38.2</a>.) + </p><p> + On the other hand, fixed-length types of any size can + be passed by-reference. 
For example, here is a sample + implementation of a <span class="productname">PostgreSQL</span> type: + +</p><pre class="programlisting"> +/* 16-byte structure, passed by reference */ +typedef struct +{ + double x, y; +} Point; +</pre><p> + + Only pointers to such types can be used when passing + them in and out of <span class="productname">PostgreSQL</span> functions. + To return a value of such a type, allocate the right amount of + memory with <code class="literal">palloc</code>, fill in the allocated memory, + and return a pointer to it. (Also, if you just want to return the + same value as one of your input arguments that's of the same data type, + you can skip the extra <code class="literal">palloc</code> and just return the + pointer to the input value.) + </p><p> + Finally, all variable-length types must also be passed + by reference. All variable-length types must begin + with an opaque length field of exactly 4 bytes, which will be set + by <code class="symbol">SET_VARSIZE</code>; never set this field directly! All data to + be stored within that type must be located in the memory + immediately following that length field. The + length field contains the total length of the structure, + that is, it includes the size of the length field + itself. + </p><p> + Another important point is to avoid leaving any uninitialized bits + within data type values; for example, take care to zero out any + alignment padding bytes that might be present in structs. Without + this, logically-equivalent constants of your data type might be + seen as unequal by the planner, leading to inefficient (though not + incorrect) plans. + </p><div class="warning"><h3 class="title">Warning</h3><p> + <span class="emphasis"><em>Never</em></span> modify the contents of a pass-by-reference input + value. If you do so you are likely to corrupt on-disk data, since + the pointer you are given might point directly into a disk buffer. 
+ The sole exception to this rule is explained in + <a class="xref" href="xaggr.html" title="38.12. User-Defined Aggregates">Section 38.12</a>. + </p></div><p> + As an example, we can define the type <code class="type">text</code> as + follows: + +</p><pre class="programlisting"> +typedef struct { + int32 length; + char data[FLEXIBLE_ARRAY_MEMBER]; +} text; +</pre><p> + + The <code class="literal">[FLEXIBLE_ARRAY_MEMBER]</code> notation means that the actual + length of the data part is not specified by this declaration. + </p><p> + When manipulating + variable-length types, we must be careful to allocate + the correct amount of memory and set the length field correctly. + For example, if we wanted to store 40 bytes in a <code class="structname">text</code> + structure, we might use a code fragment like this: + +</p><pre class="programlisting"> +#include "postgres.h" +... +char buffer[40]; /* our source data */ +... +text *destination = (text *) palloc(VARHDRSZ + 40); +SET_VARSIZE(destination, VARHDRSZ + 40); +memcpy(destination->data, buffer, 40); +... + +</pre><p> + + <code class="literal">VARHDRSZ</code> is the same as <code class="literal">sizeof(int32)</code>, but + it's considered good style to use the macro <code class="literal">VARHDRSZ</code> + to refer to the size of the overhead for a variable-length type. + Also, the length field <span class="emphasis"><em>must</em></span> be set using the + <code class="literal">SET_VARSIZE</code> macro, not by simple assignment. + </p><p> + <a class="xref" href="xfunc-c.html#XFUNC-C-TYPE-TABLE" title="Table 38.2. Equivalent C Types for Built-in SQL Types">Table 38.2</a> shows the C types + corresponding to many of the built-in SQL data types + of <span class="productname">PostgreSQL</span>. + The <span class="quote">“<span class="quote">Defined In</span>”</span> column gives the header file that + needs to be included to get the type definition. 
(The actual + definition might be in a different file that is included by the + listed file. It is recommended that users stick to the defined + interface.) Note that you should always include + <code class="filename">postgres.h</code> first in any source file of server + code, because it declares a number of things that you will need + anyway, and because including other headers first can cause + portability issues. + </p><div class="table" id="XFUNC-C-TYPE-TABLE"><p class="title"><strong>Table 38.2. Equivalent C Types for Built-in SQL Types</strong></p><div class="table-contents"><table class="table" summary="Equivalent C Types for Built-in SQL Types" border="1"><colgroup><col class="col1" /><col class="col2" /><col class="col3" /></colgroup><thead><tr><th> + SQL Type + </th><th> + C Type + </th><th> + Defined In + </th></tr></thead><tbody><tr><td><code class="type">boolean</code></td><td><code class="type">bool</code></td><td><code class="filename">postgres.h</code> (maybe compiler built-in)</td></tr><tr><td><code class="type">box</code></td><td><code class="type">BOX*</code></td><td><code class="filename">utils/geo_decls.h</code></td></tr><tr><td><code class="type">bytea</code></td><td><code class="type">bytea*</code></td><td><code class="filename">postgres.h</code></td></tr><tr><td><code class="type">"char"</code></td><td><code class="type">char</code></td><td>(compiler built-in)</td></tr><tr><td><code class="type">character</code></td><td><code class="type">BpChar*</code></td><td><code class="filename">postgres.h</code></td></tr><tr><td><code class="type">cid</code></td><td><code class="type">CommandId</code></td><td><code class="filename">postgres.h</code></td></tr><tr><td><code class="type">date</code></td><td><code class="type">DateADT</code></td><td><code class="filename">utils/date.h</code></td></tr><tr><td><code class="type">float4</code> (<code class="type">real</code>)</td><td><code class="type">float4</code></td><td><code 
class="filename">postgres.h</code></td></tr><tr><td><code class="type">float8</code> (<code class="type">double precision</code>)</td><td><code class="type">float8</code></td><td><code class="filename">postgres.h</code></td></tr><tr><td><code class="type">int2</code> (<code class="type">smallint</code>)</td><td><code class="type">int16</code></td><td><code class="filename">postgres.h</code></td></tr><tr><td><code class="type">int4</code> (<code class="type">integer</code>)</td><td><code class="type">int32</code></td><td><code class="filename">postgres.h</code></td></tr><tr><td><code class="type">int8</code> (<code class="type">bigint</code>)</td><td><code class="type">int64</code></td><td><code class="filename">postgres.h</code></td></tr><tr><td><code class="type">interval</code></td><td><code class="type">Interval*</code></td><td><code class="filename">datatype/timestamp.h</code></td></tr><tr><td><code class="type">lseg</code></td><td><code class="type">LSEG*</code></td><td><code class="filename">utils/geo_decls.h</code></td></tr><tr><td><code class="type">name</code></td><td><code class="type">Name</code></td><td><code class="filename">postgres.h</code></td></tr><tr><td><code class="type">numeric</code></td><td><code class="type">Numeric</code></td><td><code class="filename">utils/numeric.h</code></td></tr><tr><td><code class="type">oid</code></td><td><code class="type">Oid</code></td><td><code class="filename">postgres.h</code></td></tr><tr><td><code class="type">oidvector</code></td><td><code class="type">oidvector*</code></td><td><code class="filename">postgres.h</code></td></tr><tr><td><code class="type">path</code></td><td><code class="type">PATH*</code></td><td><code class="filename">utils/geo_decls.h</code></td></tr><tr><td><code class="type">point</code></td><td><code class="type">POINT*</code></td><td><code class="filename">utils/geo_decls.h</code></td></tr><tr><td><code class="type">regproc</code></td><td><code 
class="type">RegProcedure</code></td><td><code class="filename">postgres.h</code></td></tr><tr><td><code class="type">text</code></td><td><code class="type">text*</code></td><td><code class="filename">postgres.h</code></td></tr><tr><td><code class="type">tid</code></td><td><code class="type">ItemPointer</code></td><td><code class="filename">storage/itemptr.h</code></td></tr><tr><td><code class="type">time</code></td><td><code class="type">TimeADT</code></td><td><code class="filename">utils/date.h</code></td></tr><tr><td><code class="type">time with time zone</code></td><td><code class="type">TimeTzADT</code></td><td><code class="filename">utils/date.h</code></td></tr><tr><td><code class="type">timestamp</code></td><td><code class="type">Timestamp</code></td><td><code class="filename">datatype/timestamp.h</code></td></tr><tr><td><code class="type">timestamp with time zone</code></td><td><code class="type">TimestampTz</code></td><td><code class="filename">datatype/timestamp.h</code></td></tr><tr><td><code class="type">varchar</code></td><td><code class="type">VarChar*</code></td><td><code class="filename">postgres.h</code></td></tr><tr><td><code class="type">xid</code></td><td><code class="type">TransactionId</code></td><td><code class="filename">postgres.h</code></td></tr></tbody></table></div></div><br class="table-break" /><p> + Now that we've gone over all of the possible structures + for base types, we can show some examples of real functions. + </p></div><div class="sect2" id="id-1.8.3.13.7"><div class="titlepage"><div><div><h3 class="title">38.10.3. Version 1 Calling Conventions</h3></div></div></div><p> + The version-1 calling convention relies on macros to suppress most + of the complexity of passing arguments and results. 
The C declaration + of a version-1 function is always: +</p><pre class="programlisting"> +Datum funcname(PG_FUNCTION_ARGS) +</pre><p> + In addition, the macro call: +</p><pre class="programlisting"> +PG_FUNCTION_INFO_V1(funcname); +</pre><p> + must appear in the same source file. (Conventionally, it's + written just before the function itself.) This macro call is not + needed for <code class="literal">internal</code>-language functions, since + <span class="productname">PostgreSQL</span> assumes that all internal functions + use the version-1 convention. It is, however, required for + dynamically-loaded functions. + </p><p> + In a version-1 function, each actual argument is fetched using a + <code class="function">PG_GETARG_<em class="replaceable"><code>xxx</code></em>()</code> + macro that corresponds to the argument's data type. (In non-strict + functions, the argument must first be checked for nullness + using <code class="function">PG_ARGISNULL()</code>; see below.) + The result is returned using a + <code class="function">PG_RETURN_<em class="replaceable"><code>xxx</code></em>()</code> + macro for the return type. + <code class="function">PG_GETARG_<em class="replaceable"><code>xxx</code></em>()</code> + takes as its argument the number of the function argument to + fetch, where the count starts at 0. + <code class="function">PG_RETURN_<em class="replaceable"><code>xxx</code></em>()</code> + takes as its argument the actual value to return. 
+ </p><p> + Here are some examples using the version-1 calling convention: + </p><pre class="programlisting"> +#include "postgres.h" +#include <string.h> +#include "fmgr.h" +#include "utils/geo_decls.h" + +PG_MODULE_MAGIC; + +/* by value */ + +PG_FUNCTION_INFO_V1(add_one); + +Datum +add_one(PG_FUNCTION_ARGS) +{ + int32 arg = PG_GETARG_INT32(0); + + PG_RETURN_INT32(arg + 1); +} + +/* by reference, fixed length */ + +PG_FUNCTION_INFO_V1(add_one_float8); + +Datum +add_one_float8(PG_FUNCTION_ARGS) +{ + /* The macros for FLOAT8 hide its pass-by-reference nature. */ + float8 arg = PG_GETARG_FLOAT8(0); + + PG_RETURN_FLOAT8(arg + 1.0); +} + +PG_FUNCTION_INFO_V1(makepoint); + +Datum +makepoint(PG_FUNCTION_ARGS) +{ + /* Here, the pass-by-reference nature of Point is not hidden. */ + Point *pointx = PG_GETARG_POINT_P(0); + Point *pointy = PG_GETARG_POINT_P(1); + Point *new_point = (Point *) palloc(sizeof(Point)); + + new_point->x = pointx->x; + new_point->y = pointy->y; + + PG_RETURN_POINT_P(new_point); +} + +/* by reference, variable length */ + +PG_FUNCTION_INFO_V1(copytext); + +Datum +copytext(PG_FUNCTION_ARGS) +{ + text *t = PG_GETARG_TEXT_PP(0); + + /* + * VARSIZE_ANY_EXHDR is the size of the struct in bytes, minus the + * VARHDRSZ or VARHDRSZ_SHORT of its header. Construct the copy with a + * full-length header. + */ + text *new_t = (text *) palloc(VARSIZE_ANY_EXHDR(t) + VARHDRSZ); + SET_VARSIZE(new_t, VARSIZE_ANY_EXHDR(t) + VARHDRSZ); + + /* + * VARDATA is a pointer to the data region of the new struct. The source + * could be a short datum, so retrieve its data through VARDATA_ANY. 
+ */ + memcpy((void *) VARDATA(new_t), /* destination */ + (void *) VARDATA_ANY(t), /* source */ + VARSIZE_ANY_EXHDR(t)); /* how many bytes */ + PG_RETURN_TEXT_P(new_t); +} + +PG_FUNCTION_INFO_V1(concat_text); + +Datum +concat_text(PG_FUNCTION_ARGS) +{ + text *arg1 = PG_GETARG_TEXT_PP(0); + text *arg2 = PG_GETARG_TEXT_PP(1); + int32 arg1_size = VARSIZE_ANY_EXHDR(arg1); + int32 arg2_size = VARSIZE_ANY_EXHDR(arg2); + int32 new_text_size = arg1_size + arg2_size + VARHDRSZ; + text *new_text = (text *) palloc(new_text_size); + + SET_VARSIZE(new_text, new_text_size); + memcpy(VARDATA(new_text), VARDATA_ANY(arg1), arg1_size); + memcpy(VARDATA(new_text) + arg1_size, VARDATA_ANY(arg2), arg2_size); + PG_RETURN_TEXT_P(new_text); +} + +</pre><p> + Supposing that the above code has been prepared in file + <code class="filename">funcs.c</code> and compiled into a shared object, + we could define the functions to <span class="productname">PostgreSQL</span> + with commands like this: + </p><pre class="programlisting"> +CREATE FUNCTION add_one(integer) RETURNS integer + AS '<em class="replaceable"><code>DIRECTORY</code></em>/funcs', 'add_one' + LANGUAGE C STRICT; + +-- note overloading of SQL function name "add_one" +CREATE FUNCTION add_one(double precision) RETURNS double precision + AS '<em class="replaceable"><code>DIRECTORY</code></em>/funcs', 'add_one_float8' + LANGUAGE C STRICT; + +CREATE FUNCTION makepoint(point, point) RETURNS point + AS '<em class="replaceable"><code>DIRECTORY</code></em>/funcs', 'makepoint' + LANGUAGE C STRICT; + +CREATE FUNCTION copytext(text) RETURNS text + AS '<em class="replaceable"><code>DIRECTORY</code></em>/funcs', 'copytext' + LANGUAGE C STRICT; + +CREATE FUNCTION concat_text(text, text) RETURNS text + AS '<em class="replaceable"><code>DIRECTORY</code></em>/funcs', 'concat_text' + LANGUAGE C STRICT; +</pre><p> + Here, <em class="replaceable"><code>DIRECTORY</code></em> stands for the + directory of the shared library file (for instance the + <span 
class="productname">PostgreSQL</span> tutorial directory, which + contains the code for the examples used in this section). + (Better style would be to use just <code class="literal">'funcs'</code> in the + <code class="literal">AS</code> clause, after having added + <em class="replaceable"><code>DIRECTORY</code></em> to the search path. In any + case, we can omit the system-specific extension for a shared + library, commonly <code class="literal">.so</code>.) + </p><p> + Notice that we have specified the functions as <span class="quote">“<span class="quote">strict</span>”</span>, + meaning that + the system should automatically assume a null result if any input + value is null. By doing this, we avoid having to check for null inputs + in the function code. Without this, we'd have to check for null values + explicitly, using <code class="function">PG_ARGISNULL()</code>. + </p><p> + The macro <code class="function">PG_ARGISNULL(<em class="replaceable"><code>n</code></em>)</code> + allows a function to test whether each input is null. (Of course, doing + this is only necessary in functions not declared <span class="quote">“<span class="quote">strict</span>”</span>.) + As with the + <code class="function">PG_GETARG_<em class="replaceable"><code>xxx</code></em>()</code> macros, + the input arguments are counted beginning at zero. Note that one + should refrain from executing + <code class="function">PG_GETARG_<em class="replaceable"><code>xxx</code></em>()</code> until + one has verified that the argument isn't null. + To return a null result, execute <code class="function">PG_RETURN_NULL()</code>; + this works in both strict and nonstrict functions. + </p><p> + At first glance, the version-1 coding conventions might appear + to be just pointless obscurantism, compared to using + plain <code class="literal">C</code> calling conventions. 
They do however allow + us to deal with <code class="literal">NULL</code>able arguments/return values, + and <span class="quote">“<span class="quote">toasted</span>”</span> (compressed or out-of-line) values. + </p><p> + Other options provided by the version-1 interface are two + variants of the + <code class="function">PG_GETARG_<em class="replaceable"><code>xxx</code></em>()</code> + macros. The first of these, + <code class="function">PG_GETARG_<em class="replaceable"><code>xxx</code></em>_COPY()</code>, + guarantees to return a copy of the specified argument that is + safe for writing into. (The normal macros will sometimes return a + pointer to a value that is physically stored in a table, which + must not be written to. Using the + <code class="function">PG_GETARG_<em class="replaceable"><code>xxx</code></em>_COPY()</code> + macros guarantees a writable result.) + The second variant consists of the + <code class="function">PG_GETARG_<em class="replaceable"><code>xxx</code></em>_SLICE()</code> + macros which take three arguments. The first is the number of the + function argument (as above). The second and third are the offset and + length of the segment to be returned. Offsets are counted from + zero, and a negative length requests that the remainder of the + value be returned. These macros provide more efficient access to + parts of large values in the case where they have storage type + <span class="quote">“<span class="quote">external</span>”</span>. (The storage type of a column can be specified using + <code class="literal">ALTER TABLE <em class="replaceable"><code>tablename</code></em> ALTER + COLUMN <em class="replaceable"><code>colname</code></em> SET STORAGE + <em class="replaceable"><code>storagetype</code></em></code>. <em class="replaceable"><code>storagetype</code></em> is one of + <code class="literal">plain</code>, <code class="literal">external</code>, <code class="literal">extended</code>, + or <code class="literal">main</code>.) 
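+ </p><p>
+ The offset and length semantics described for the slice macros can be sketched in plain C as follows. <code class="function">model_slice</code> is a hypothetical helper operating on an ordinary byte buffer, not a <span class="productname">PostgreSQL</span> function; clamping a request that runs past the end of the value is an assumption of this sketch.
+ </p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy the slice of src[0 .. src_len) that starts at
 * offset (counted from zero) and spans length bytes into dst.  A negative
 * length requests everything from offset to the end of the value.
 * Returns the number of bytes copied.
 */
static size_t
model_slice(const char *src, size_t src_len,
            size_t offset, long length,
            char *dst, size_t dst_size)
{
    size_t avail;
    size_t n;

    assert(src != NULL && dst != NULL);

    if (offset >= src_len)
        return 0;                   /* slice starts past the end */

    avail = src_len - offset;
    n = (length < 0 || (size_t) length > avail) ? avail : (size_t) length;
    if (n > dst_size)
        n = dst_size;               /* never overrun the caller's buffer */

    memcpy(dst, src + offset, n);
    return n;
}
```

+ <p>
+ For the eight-byte value <code class="literal">abcdefgh</code>, an offset of 2 with a length of 3 selects <code class="literal">cde</code>, and an offset of 5 with a negative length selects the remainder, <code class="literal">fgh</code>.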
+ </p><p> + Finally, the version-1 function call conventions make it possible + to return set results (<a class="xref" href="xfunc-c.html#XFUNC-C-RETURN-SET" title="38.10.8. Returning Sets">Section 38.10.8</a>) and + implement trigger functions (<a class="xref" href="triggers.html" title="Chapter 39. Triggers">Chapter 39</a>) and + procedural-language call handlers (<a class="xref" href="plhandler.html" title="Chapter 56. Writing a Procedural Language Handler">Chapter 56</a>). For more details + see <code class="filename">src/backend/utils/fmgr/README</code> in the + source distribution. + </p></div><div class="sect2" id="id-1.8.3.13.8"><div class="titlepage"><div><div><h3 class="title">38.10.4. Writing Code</h3></div></div></div><p> + Before we turn to the more advanced topics, we should discuss + some coding rules for <span class="productname">PostgreSQL</span> + C-language functions. While it might be possible to load functions + written in languages other than C into + <span class="productname">PostgreSQL</span>, this is usually difficult + (when it is possible at all) because other languages, such as + C++, FORTRAN, or Pascal often do not follow the same calling + convention as C. That is, other languages do not pass argument + and return values between functions in the same way. For this + reason, we will assume that your C-language functions are + actually written in C. + </p><p> + The basic rules for writing and building C functions are as follows: + + </p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p> + Use <code class="literal">pg_config + --includedir-server</code><a id="id-1.8.3.13.8.3.1.1.1.2" class="indexterm"></a> + to find out where the <span class="productname">PostgreSQL</span> server header + files are installed on your system (or the system that your + users will be running on). 
+ </p></li><li class="listitem"><p> + Compiling and linking your code so that it can be dynamically + loaded into <span class="productname">PostgreSQL</span> always + requires special flags. See <a class="xref" href="xfunc-c.html#DFUNC" title="38.10.5. Compiling and Linking Dynamically-Loaded Functions">Section 38.10.5</a> for a + detailed explanation of how to do it for your particular + operating system. + </p></li><li class="listitem"><p> + Remember to define a <span class="quote">“<span class="quote">magic block</span>”</span> for your shared library, + as described in <a class="xref" href="xfunc-c.html#XFUNC-C-DYNLOAD" title="38.10.1. Dynamic Loading">Section 38.10.1</a>. + </p></li><li class="listitem"><p> + When allocating memory, use the + <span class="productname">PostgreSQL</span> functions + <code class="function">palloc</code><a id="id-1.8.3.13.8.3.1.4.1.3" class="indexterm"></a> and <code class="function">pfree</code><a id="id-1.8.3.13.8.3.1.4.1.5" class="indexterm"></a> + instead of the corresponding C library functions + <code class="function">malloc</code> and <code class="function">free</code>. + The memory allocated by <code class="function">palloc</code> will be + freed automatically at the end of each transaction, preventing + memory leaks. + </p></li><li class="listitem"><p> + Always zero the bytes of your structures using <code class="function">memset</code> + (or allocate them with <code class="function">palloc0</code> in the first place). + Even if you assign to each field of your structure, there might be + alignment padding (holes in the structure) that contain + garbage values. Without this, it's difficult to + support hash indexes or hash joins, as you must pick out only + the significant bits of your data structure to compute a hash. + The planner also sometimes relies on comparing constants via + bitwise equality, so you can get undesirable planning results if + logically-equivalent values aren't bitwise equal. 
+ </p></li><li class="listitem"><p> + Most of the internal <span class="productname">PostgreSQL</span> + types are declared in <code class="filename">postgres.h</code>, while + the function manager interfaces + (<code class="symbol">PG_FUNCTION_ARGS</code>, etc.) are in + <code class="filename">fmgr.h</code>, so you will need to include at + least these two files. For portability reasons it's best to + include <code class="filename">postgres.h</code> <span class="emphasis"><em>first</em></span>, + before any other system or user header files. Including + <code class="filename">postgres.h</code> will also include + <code class="filename">elog.h</code> and <code class="filename">palloc.h</code> + for you. + </p></li><li class="listitem"><p> + Symbol names defined within object files must not conflict + with each other or with symbols defined in the + <span class="productname">PostgreSQL</span> server executable. You + will have to rename your functions or variables if you get + error messages to this effect. + </p></li></ul></div><p> + </p></div><div class="sect2" id="DFUNC"><div class="titlepage"><div><div><h3 class="title">38.10.5. Compiling and Linking Dynamically-Loaded Functions</h3></div></div></div><p> + Before you are able to use your + <span class="productname">PostgreSQL</span> extension functions written in + C, they must be compiled and linked in a special way to produce a + file that can be dynamically loaded by the server. To be precise, a + <em class="firstterm">shared library</em> needs to be + created.<a id="id-1.8.3.13.9.2.3" class="indexterm"></a> + + </p><p> + For information beyond what is contained in this section + you should read the documentation of your + operating system, in particular the manual pages for the C compiler, + <code class="command">cc</code>, and the link editor, <code class="command">ld</code>. 
+ In addition, the <span class="productname">PostgreSQL</span> source code + contains several working examples in the + <code class="filename">contrib</code> directory. If you rely on these + examples you will make your modules dependent on the availability + of the <span class="productname">PostgreSQL</span> source code, however. + </p><p> + Creating shared libraries is generally analogous to linking + executables: first the source files are compiled into object files, + then the object files are linked together. The object files need to + be created as <em class="firstterm">position-independent code</em> + (<acronym class="acronym">PIC</acronym>),<a id="id-1.8.3.13.9.4.3" class="indexterm"></a> which + conceptually means that they can be placed at an arbitrary location + in memory when they are loaded by the executable. (Object files + intended for executables are usually not compiled that way.) The + command to link a shared library contains special flags to + distinguish it from linking an executable (at least in theory + — on some systems the practice is much uglier). + </p><p> + In the following examples we assume that your source code is in a + file <code class="filename">foo.c</code> and we will create a shared library + <code class="filename">foo.so</code>. The intermediate object file will be + called <code class="filename">foo.o</code> unless otherwise noted. A shared + library can contain more than one object file, but we only use one + here. + </p><div class="variablelist"><dl class="variablelist"><dt><span class="term"> + <span class="systemitem">FreeBSD</span> + <a id="id-1.8.3.13.9.6.1.1.2" class="indexterm"></a> + </span></dt><dd><p> + The compiler flag to create <acronym class="acronym">PIC</acronym> is + <code class="option">-fPIC</code>. To create shared libraries the compiler + flag is <code class="option">-shared</code>. 
+</p><pre class="programlisting"> +gcc -fPIC -c foo.c +gcc -shared -o foo.so foo.o +</pre><p> + This is applicable as of version 3.0 of + <span class="systemitem">FreeBSD</span>. + </p></dd><dt><span class="term"> + <span class="systemitem">HP-UX</span> + <a id="id-1.8.3.13.9.6.2.1.2" class="indexterm"></a> + </span></dt><dd><p> + The compiler flag of the system compiler to create + <acronym class="acronym">PIC</acronym> is <code class="option">+z</code>. When using + <span class="application">GCC</span> it's <code class="option">-fPIC</code>. The + linker flag for shared libraries is <code class="option">-b</code>. So: +</p><pre class="programlisting"> +cc +z -c foo.c +</pre><p> + or: +</p><pre class="programlisting"> +gcc -fPIC -c foo.c +</pre><p> + and then: +</p><pre class="programlisting"> +ld -b -o foo.sl foo.o +</pre><p> + <span class="systemitem">HP-UX</span> uses the extension + <code class="filename">.sl</code> for shared libraries, unlike most other + systems. + </p></dd><dt><span class="term"> + <span class="systemitem">Linux</span> + <a id="id-1.8.3.13.9.6.3.1.2" class="indexterm"></a> + </span></dt><dd><p> + The compiler flag to create <acronym class="acronym">PIC</acronym> is + <code class="option">-fPIC</code>. + The compiler flag to create a shared library is + <code class="option">-shared</code>. A complete example looks like this: +</p><pre class="programlisting"> +cc -fPIC -c foo.c +cc -shared -o foo.so foo.o +</pre><p> + </p></dd><dt><span class="term"> + <span class="systemitem">macOS</span> + <a id="id-1.8.3.13.9.6.4.1.2" class="indexterm"></a> + </span></dt><dd><p> + Here is an example. It assumes the developer tools are installed. 
+</p><pre class="programlisting"> +cc -c foo.c +cc -bundle -flat_namespace -undefined suppress -o foo.so foo.o +</pre><p> + </p></dd><dt><span class="term"> + <span class="systemitem">NetBSD</span> + <a id="id-1.8.3.13.9.6.5.1.2" class="indexterm"></a> + </span></dt><dd><p> + The compiler flag to create <acronym class="acronym">PIC</acronym> is + <code class="option">-fPIC</code>. For <acronym class="acronym">ELF</acronym> systems, the + compiler with the flag <code class="option">-shared</code> is used to link + shared libraries. On the older non-ELF systems, <code class="literal">ld + -Bshareable</code> is used. +</p><pre class="programlisting"> +gcc -fPIC -c foo.c +gcc -shared -o foo.so foo.o +</pre><p> + </p></dd><dt><span class="term"> + <span class="systemitem">OpenBSD</span> + <a id="id-1.8.3.13.9.6.6.1.2" class="indexterm"></a> + </span></dt><dd><p> + The compiler flag to create <acronym class="acronym">PIC</acronym> is + <code class="option">-fPIC</code>. <code class="literal">ld -Bshareable</code> is + used to link shared libraries. +</p><pre class="programlisting"> +gcc -fPIC -c foo.c +ld -Bshareable -o foo.so foo.o +</pre><p> + </p></dd><dt><span class="term"> + <span class="systemitem">Solaris</span> + <a id="id-1.8.3.13.9.6.7.1.2" class="indexterm"></a> + </span></dt><dd><p> + The compiler flag to create <acronym class="acronym">PIC</acronym> is + <code class="option">-KPIC</code> with the Sun compiler and + <code class="option">-fPIC</code> with <span class="application">GCC</span>. To + link shared libraries, the compiler option is + <code class="option">-G</code> with either compiler or alternatively + <code class="option">-shared</code> with <span class="application">GCC</span>. 
+</p><pre class="programlisting"> +cc -KPIC -c foo.c +cc -G -o foo.so foo.o +</pre><p> + or +</p><pre class="programlisting"> +gcc -fPIC -c foo.c +gcc -G -o foo.so foo.o +</pre><p> + </p></dd></dl></div><div class="tip"><h3 class="title">Tip</h3><p> + If this is too complicated for you, you should consider using + <a class="ulink" href="https://www.gnu.org/software/libtool/" target="_top"> + <span class="productname">GNU Libtool</span></a>, + which hides the platform differences behind a uniform interface. + </p></div><p> + The resulting shared library file can then be loaded into + <span class="productname">PostgreSQL</span>. When specifying the file name + to the <code class="command">CREATE FUNCTION</code> command, one must give it + the name of the shared library file, not the intermediate object file. + Note that the system's standard shared-library extension (usually + <code class="literal">.so</code> or <code class="literal">.sl</code>) can be omitted from + the <code class="command">CREATE FUNCTION</code> command, and normally should + be omitted for best portability. + </p><p> + Refer back to <a class="xref" href="xfunc-c.html#XFUNC-C-DYNLOAD" title="38.10.1. Dynamic Loading">Section 38.10.1</a> about where the + server expects to find the shared library files. + </p></div><div class="sect2" id="id-1.8.3.13.10"><div class="titlepage"><div><div><h3 class="title">38.10.6. Composite-Type Arguments</h3></div></div></div><p> + Composite types do not have a fixed layout like C structures. + Instances of a composite type can contain null fields. In + addition, composite types that are part of an inheritance + hierarchy can have different fields than other members of the + same inheritance hierarchy. Therefore, + <span class="productname">PostgreSQL</span> provides a function + interface for accessing fields of composite types from C. 
+ </p><p> + Suppose we want to write a function to answer the query: + +</p><pre class="programlisting"> +SELECT name, c_overpaid(emp, 1500) AS overpaid + FROM emp + WHERE name = 'Bill' OR name = 'Sam'; +</pre><p> + + Using the version-1 calling conventions, we can define + <code class="function">c_overpaid</code> as: + +</p><pre class="programlisting"> +#include "postgres.h" +#include "executor/executor.h" /* for GetAttributeByName() */ + +PG_MODULE_MAGIC; + +PG_FUNCTION_INFO_V1(c_overpaid); + +Datum +c_overpaid(PG_FUNCTION_ARGS) +{ + HeapTupleHeader t = PG_GETARG_HEAPTUPLEHEADER(0); + int32 limit = PG_GETARG_INT32(1); + bool isnull; + Datum salary; + + salary = GetAttributeByName(t, "salary", &isnull); + if (isnull) + PG_RETURN_BOOL(false); + /* Alternatively, we might prefer to do PG_RETURN_NULL() for null salary. */ + + PG_RETURN_BOOL(DatumGetInt32(salary) > limit); +} + +</pre><p> + </p><p> + <code class="function">GetAttributeByName</code> is the + <span class="productname">PostgreSQL</span> system function that + returns attributes out of the specified row. It has + three arguments: the argument of type <code class="type">HeapTupleHeader</code> passed + into + the function, the name of the desired attribute, and a + return parameter that tells whether the attribute + is null. <code class="function">GetAttributeByName</code> returns a <code class="type">Datum</code> + value that you can convert to the proper data type by using the + appropriate <code class="function">DatumGet<em class="replaceable"><code>XXX</code></em>()</code> + macro. Note that the return value is meaningless if the null flag is + set; always check the null flag before trying to do anything with the + result. + </p><p> + There is also <code class="function">GetAttributeByNum</code>, which selects + the target attribute by column number instead of name. 
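</p><p>
    As a hedged sketch, the same test can be written with
    <code class="function">GetAttributeByNum</code>. The function name
    <code class="function">c_overpaid_bynum</code> is hypothetical, and the
    attribute number used below is an assumption: it presumes that
    <code class="literal">salary</code> is the second column of
    <code class="literal">emp</code> and has type <code class="type">integer</code>.
    Attribute numbers are counted from one.
</p><pre class="programlisting">
#include "postgres.h"
#include "fmgr.h"
#include "executor/executor.h"  /* for GetAttributeByNum() */

/* PG_MODULE_MAGIC is assumed to appear once elsewhere in this module. */

PG_FUNCTION_INFO_V1(c_overpaid_bynum);  /* hypothetical example function */

Datum
c_overpaid_bynum(PG_FUNCTION_ARGS)
{
    HeapTupleHeader t = PG_GETARG_HEAPTUPLEHEADER(0);
    int32       limit = PG_GETARG_INT32(1);
    bool        isnull;
    Datum       salary;

    /* ASSUMPTION: salary is column number 2 of the row type. */
    salary = GetAttributeByNum(t, 2, &isnull);

    /* The returned Datum is meaningless when the null flag is set. */
    if (isnull)
        PG_RETURN_BOOL(false);

    PG_RETURN_BOOL(DatumGetInt32(salary) > limit);
}
</pre><p>
    Selecting by number avoids the name lookup, but it ties the function to
    the column order of the row type, so selecting by name is usually the
    more robust choice.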
+ </p><p> + The following command declares the function + <code class="function">c_overpaid</code> in SQL: + +</p><pre class="programlisting"> +CREATE FUNCTION c_overpaid(emp, integer) RETURNS boolean + AS '<em class="replaceable"><code>DIRECTORY</code></em>/funcs', 'c_overpaid' + LANGUAGE C STRICT; +</pre><p> + + Notice we have used <code class="literal">STRICT</code> so that we did not have to + check whether the input arguments were NULL. + </p></div><div class="sect2" id="id-1.8.3.13.11"><div class="titlepage"><div><div><h3 class="title">38.10.7. Returning Rows (Composite Types)</h3></div></div></div><p> + To return a row or composite-type value from a C-language + function, you can use a special API that provides macros and + functions to hide most of the complexity of building composite + data types. To use this API, the source file must include: +</p><pre class="programlisting"> +#include "funcapi.h" +</pre><p> + </p><p> + There are two ways you can build a composite data value (henceforth + a <span class="quote">“<span class="quote">tuple</span>”</span>): you can build it from an array of Datum values, + or from an array of C strings that can be passed to the input + conversion functions of the tuple's column data types. In either + case, you first need to obtain or construct a <code class="structname">TupleDesc</code> + descriptor for the tuple structure. When working with Datums, you + pass the <code class="structname">TupleDesc</code> to <code class="function">BlessTupleDesc</code>, + and then call <code class="function">heap_form_tuple</code> for each row. When working + with C strings, you pass the <code class="structname">TupleDesc</code> to + <code class="function">TupleDescGetAttInMetadata</code>, and then call + <code class="function">BuildTupleFromCStrings</code> for each row. In the case of a + function returning a set of tuples, the setup steps can all be done + once during the first call of the function. 
+ </p><p> + Several helper functions are available for setting up the needed + <code class="structname">TupleDesc</code>. The recommended way to do this in most + functions returning composite values is to call: +</p><pre class="programlisting"> +TypeFuncClass get_call_result_type(FunctionCallInfo fcinfo, + Oid *resultTypeId, + TupleDesc *resultTupleDesc) +</pre><p> + passing the same <code class="literal">fcinfo</code> struct passed to the calling function + itself. (This of course requires that you use the version-1 + calling conventions.) <code class="varname">resultTypeId</code> can be specified + as <code class="literal">NULL</code> or as the address of a local variable to receive the + function's result type OID. <code class="varname">resultTupleDesc</code> should be the + address of a local <code class="structname">TupleDesc</code> variable. Check that the + result is <code class="literal">TYPEFUNC_COMPOSITE</code>; if so, + <code class="varname">resultTupleDesc</code> has been filled with the needed + <code class="structname">TupleDesc</code>. (If it is not, you can report an error along + the lines of <span class="quote">“<span class="quote">function returning record called in context that + cannot accept type record</span>”</span>.) + </p><div class="tip"><h3 class="title">Tip</h3><p> + <code class="function">get_call_result_type</code> can resolve the actual type of a + polymorphic function result; so it is useful in functions that return + scalar polymorphic results, not only functions that return composites. + The <code class="varname">resultTypeId</code> output is primarily useful for functions + returning polymorphic scalars. + </p></div><div class="note"><h3 class="title">Note</h3><p> + <code class="function">get_call_result_type</code> has a sibling + <code class="function">get_expr_result_type</code>, which can be used to resolve the + expected output type for a function call represented by an expression + tree. 
This can be used when trying to determine the result type from + outside the function itself. There is also + <code class="function">get_func_result_type</code>, which can be used when only the + function's OID is available. However these functions are not able + to deal with functions declared to return <code class="structname">record</code>, and + <code class="function">get_func_result_type</code> cannot resolve polymorphic types, + so you should preferentially use <code class="function">get_call_result_type</code>. + </p></div><p> + Older, now-deprecated functions for obtaining + <code class="structname">TupleDesc</code>s are: +</p><pre class="programlisting"> +TupleDesc RelationNameGetTupleDesc(const char *relname) +</pre><p> + to get a <code class="structname">TupleDesc</code> for the row type of a named relation, + and: +</p><pre class="programlisting"> +TupleDesc TypeGetTupleDesc(Oid typeoid, List *colaliases) +</pre><p> + to get a <code class="structname">TupleDesc</code> based on a type OID. This can + be used to get a <code class="structname">TupleDesc</code> for a base or + composite type. It will not work for a function that returns + <code class="structname">record</code>, however, and it cannot resolve polymorphic + types. + </p><p> + Once you have a <code class="structname">TupleDesc</code>, call: +</p><pre class="programlisting"> +TupleDesc BlessTupleDesc(TupleDesc tupdesc) +</pre><p> + if you plan to work with Datums, or: +</p><pre class="programlisting"> +AttInMetadata *TupleDescGetAttInMetadata(TupleDesc tupdesc) +</pre><p> + if you plan to work with C strings. If you are writing a function + returning set, you can save the results of these functions in the + <code class="structname">FuncCallContext</code> structure — use the + <code class="structfield">tuple_desc</code> or <code class="structfield">attinmeta</code> field + respectively. 
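</p><p>
    The Datum-based path can be sketched for a function that returns a single
    row. The function name <code class="function">make_pair</code> is
    hypothetical. The sketch assumes that the function is declared in SQL with
    two <code class="type">integer</code> inputs and a result row of two
    <code class="type">integer</code> columns (for example through
    <code class="literal">OUT</code> parameters), and that it is
    <code class="literal">STRICT</code>. It uses
    <code class="function">heap_form_tuple</code> and
    <code class="function">HeapTupleGetDatum</code>, which are described in the
    remainder of this section.
</p><pre class="programlisting">
#include "postgres.h"
#include "fmgr.h"
#include "funcapi.h"
#include "access/htup_details.h"    /* for heap_form_tuple() */

/* PG_MODULE_MAGIC is assumed to appear once elsewhere in this module. */

PG_FUNCTION_INFO_V1(make_pair);     /* hypothetical example function */

Datum
make_pair(PG_FUNCTION_ARGS)
{
    TupleDesc   tupdesc;
    Datum       values[2];
    bool        nulls[2] = {false, false};
    HeapTuple   tuple;

    /* Look up the row type this call is expected to return. */
    if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
        ereport(ERROR,
                (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                 errmsg("function returning record called in context "
                        "that cannot accept type record")));

    /* Required before building tuples from Datums. */
    tupdesc = BlessTupleDesc(tupdesc);

    /* One Datum and one null flag per result column. */
    values[0] = Int32GetDatum(PG_GETARG_INT32(0));
    values[1] = Int32GetDatum(PG_GETARG_INT32(1));

    tuple = heap_form_tuple(tupdesc, values, nulls);

    PG_RETURN_DATUM(HeapTupleGetDatum(tuple));
}
</pre><p>
    Because this function returns only one row, nothing needs to be saved in a
    <code class="structname">FuncCallContext</code>; a set-returning function
    would instead do the descriptor lookup once, during its first call.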
+ </p><p> + When working with Datums, use: +</p><pre class="programlisting"> +HeapTuple heap_form_tuple(TupleDesc tupdesc, Datum *values, bool *isnull) +</pre><p> + to build a <code class="structname">HeapTuple</code> given user data in Datum form. + </p><p> + When working with C strings, use: +</p><pre class="programlisting"> +HeapTuple BuildTupleFromCStrings(AttInMetadata *attinmeta, char **values) +</pre><p> + to build a <code class="structname">HeapTuple</code> given user data + in C string form. <em class="parameter"><code>values</code></em> is an array of C strings, + one for each attribute of the return row. Each C string should be in + the form expected by the input function of the attribute data + type. In order to return a null value for one of the attributes, + the corresponding pointer in the <em class="parameter"><code>values</code></em> array + should be set to <code class="symbol">NULL</code>. This function will need to + be called again for each row you return. + </p><p> + Once you have built a tuple to return from your function, it + must be converted into a <code class="type">Datum</code>. Use: +</p><pre class="programlisting"> +HeapTupleGetDatum(HeapTuple tuple) +</pre><p> + to convert a <code class="structname">HeapTuple</code> into a valid Datum. This + <code class="type">Datum</code> can be returned directly if you intend to return + just a single row, or it can be used as the current return value + in a set-returning function. + </p><p> + An example appears in the next section. + </p></div><div class="sect2" id="XFUNC-C-RETURN-SET"><div class="titlepage"><div><div><h3 class="title">38.10.8. Returning Sets</h3></div></div></div><p> + C-language functions have two options for returning sets (multiple + rows). 
In one method, called <em class="firstterm">ValuePerCall</em> + mode, a set-returning function is called repeatedly (passing the same + arguments each time) and it returns one new row on each call, until + it has no more rows to return and signals that by returning NULL. + The set-returning function (<acronym class="acronym">SRF</acronym>) must therefore + save enough state across calls to remember what it was doing and + return the correct next item on each call. + In the other method, called <em class="firstterm">Materialize</em> mode, + an SRF fills and returns a tuplestore object containing its + entire result; then only one call occurs for the whole result, and + no inter-call state is needed. + </p><p> + When using ValuePerCall mode, it is important to remember that the + query is not guaranteed to be run to completion; that is, due to + options such as <code class="literal">LIMIT</code>, the executor might stop + making calls to the set-returning function before all rows have been + fetched. This means it is not safe to perform cleanup activities in + the last call, because that might not ever happen. It's recommended + to use Materialize mode for functions that need access to external + resources, such as file descriptors. + </p><p> + The remainder of this section documents a set of helper macros that + are commonly used (though not required to be used) for SRFs using + ValuePerCall mode. Additional details about Materialize mode can be + found in <code class="filename">src/backend/utils/fmgr/README</code>. Also, + the <code class="filename">contrib</code> modules in + the <span class="productname">PostgreSQL</span> source distribution contain + many examples of SRFs using both ValuePerCall and Materialize mode. + </p><p> + To use the ValuePerCall support macros described here, + include <code class="filename">funcapi.h</code>. 
These macros work with a + structure <code class="structname">FuncCallContext</code> that contains the + state that needs to be saved across calls. Within the calling + SRF, <code class="literal">fcinfo->flinfo->fn_extra</code> is used to + hold a pointer to <code class="structname">FuncCallContext</code> across + calls. The macros automatically fill that field on first use, + and expect to find the same pointer there on subsequent uses. +</p><pre class="programlisting"> +typedef struct FuncCallContext +{ + /* + * Number of times we've been called before + * + * call_cntr is initialized to 0 for you by SRF_FIRSTCALL_INIT(), and + * incremented for you every time SRF_RETURN_NEXT() is called. + */ + uint64 call_cntr; + + /* + * OPTIONAL maximum number of calls + * + * max_calls is here for convenience only and setting it is optional. + * If not set, you must provide alternative means to know when the + * function is done. + */ + uint64 max_calls; + + /* + * OPTIONAL pointer to miscellaneous user-provided context information + * + * user_fctx is for use as a pointer to your own data to retain + * arbitrary context information between calls of your function. + */ + void *user_fctx; + + /* + * OPTIONAL pointer to struct containing attribute type input metadata + * + * attinmeta is for use when returning tuples (i.e., composite data types) + * and is not used when returning base data types. It is only needed + * if you intend to use BuildTupleFromCStrings() to create the return + * tuple. + */ + AttInMetadata *attinmeta; + + /* + * memory context used for structures that must live for multiple calls + * + * multi_call_memory_ctx is set by SRF_FIRSTCALL_INIT() for you, and used + * by SRF_RETURN_DONE() for cleanup. It is the most appropriate memory + * context for any memory that is to be reused across multiple calls + * of the SRF. 
+ */ + MemoryContext multi_call_memory_ctx; + + /* + * OPTIONAL pointer to struct containing tuple description + * + * tuple_desc is for use when returning tuples (i.e., composite data types) + * and is only needed if you are going to build the tuples with + * heap_form_tuple() rather than with BuildTupleFromCStrings(). Note that + * the TupleDesc pointer stored here should usually have been run through + * BlessTupleDesc() first. + */ + TupleDesc tuple_desc; + +} FuncCallContext; +</pre><p> + </p><p> + The macros to be used by an <acronym class="acronym">SRF</acronym> using this + infrastructure are: +</p><pre class="programlisting"> +SRF_IS_FIRSTCALL() +</pre><p> + Use this to determine if your function is being called for the first or a + subsequent time. On the first call (only), call: +</p><pre class="programlisting"> +SRF_FIRSTCALL_INIT() +</pre><p> + to initialize the <code class="structname">FuncCallContext</code>. On every function call, + including the first, call: +</p><pre class="programlisting"> +SRF_PERCALL_SETUP() +</pre><p> + to set up for using the <code class="structname">FuncCallContext</code>. + </p><p> + If your function has data to return in the current call, use: +</p><pre class="programlisting"> +SRF_RETURN_NEXT(funcctx, result) +</pre><p> + to return it to the caller. (<code class="literal">result</code> must be of type + <code class="type">Datum</code>, either a single value or a tuple prepared as + described above.) Finally, when your function is finished + returning data, use: +</p><pre class="programlisting"> +SRF_RETURN_DONE(funcctx) +</pre><p> + to clean up and end the <acronym class="acronym">SRF</acronym>. + </p><p> + The memory context that is current when the <acronym class="acronym">SRF</acronym> is called is + a transient context that will be cleared between calls. 
This means + that you do not need to call <code class="function">pfree</code> on everything + you allocated using <code class="function">palloc</code>; it will go away anyway. However, if you want to allocate + any data structures to live across calls, you need to put them somewhere + else. The memory context referenced by + <code class="structfield">multi_call_memory_ctx</code> is a suitable location for any + data that needs to survive until the <acronym class="acronym">SRF</acronym> is finished running. In most + cases, this means that you should switch into + <code class="structfield">multi_call_memory_ctx</code> while doing the + first-call setup. + Use <code class="literal">funcctx->user_fctx</code> to hold a pointer to + any such cross-call data structures. + (Data you allocate + in <code class="structfield">multi_call_memory_ctx</code> will go away + automatically when the query ends, so it is not necessary to free + that data manually, either.) + </p><div class="warning"><h3 class="title">Warning</h3><p> + While the actual arguments to the function remain unchanged between + calls, if you detoast the argument values (which is normally done + transparently by the + <code class="function">PG_GETARG_<em class="replaceable"><code>xxx</code></em></code> macro) + in the transient context then the detoasted copies will be freed on + each cycle. Accordingly, if you keep references to such values in + your <code class="structfield">user_fctx</code>, you must either copy them into the + <code class="structfield">multi_call_memory_ctx</code> after detoasting, or ensure + that you detoast the values only in that context. 
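</p><p>
    One way to follow this rule is sketched below. The function name
    <code class="function">bytea_bytes</code> is hypothetical; the sketch
    assumes a <code class="literal">STRICT</code> function declared to take
    one <code class="type">bytea</code> argument and to return
    <code class="literal">SETOF integer</code>. The argument is detoasted and
    copied while <code class="structfield">multi_call_memory_ctx</code> is the
    current context, so the pointer kept in
    <code class="structfield">user_fctx</code> stays valid on later calls.
</p><pre class="programlisting">
#include "postgres.h"
#include "fmgr.h"
#include "funcapi.h"

/* PG_MODULE_MAGIC is assumed to appear once elsewhere in this module. */

PG_FUNCTION_INFO_V1(bytea_bytes);   /* hypothetical example function */

Datum
bytea_bytes(PG_FUNCTION_ARGS)
{
    FuncCallContext *funcctx;
    bytea      *data;

    if (SRF_IS_FIRSTCALL())
    {
        MemoryContext oldcontext;

        funcctx = SRF_FIRSTCALL_INIT();
        oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);

        /* Detoast and copy while the long-lived context is current. */
        data = PG_GETARG_BYTEA_P_COPY(0);
        funcctx->user_fctx = data;
        funcctx->max_calls = VARSIZE_ANY_EXHDR(data);

        MemoryContextSwitchTo(oldcontext);
    }

    funcctx = SRF_PERCALL_SETUP();
    data = (bytea *) funcctx->user_fctx;

    if (funcctx->call_cntr < funcctx->max_calls)
    {
        /* Read the byte before SRF_RETURN_NEXT() increments call_cntr. */
        unsigned char b = (unsigned char) VARDATA_ANY(data)[funcctx->call_cntr];

        SRF_RETURN_NEXT(funcctx, Int32GetDatum((int32) b));
    }
    else
    {
        SRF_RETURN_DONE(funcctx);
    }
}
</pre><p>
    The <code class="literal">_COPY</code> variant is used here so that the
    saved value is always a fresh allocation in the long-lived context, even
    when the argument was not toasted.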
+ </p></div><p> + A complete pseudo-code example looks like the following: +</p><pre class="programlisting"> +Datum +my_set_returning_function(PG_FUNCTION_ARGS) +{ + FuncCallContext *funcctx; + Datum result; + <em class="replaceable"><code>further declarations as needed</code></em> + + if (SRF_IS_FIRSTCALL()) + { + MemoryContext oldcontext; + + funcctx = SRF_FIRSTCALL_INIT(); + oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx); + /* One-time setup code appears here: */ + <em class="replaceable"><code>user code</code></em> + <em class="replaceable"><code>if returning composite</code></em> + <em class="replaceable"><code>build TupleDesc, and perhaps AttInMetadata</code></em> + <em class="replaceable"><code>endif returning composite</code></em> + <em class="replaceable"><code>user code</code></em> + MemoryContextSwitchTo(oldcontext); + } + + /* Each-time setup code appears here: */ + <em class="replaceable"><code>user code</code></em> + funcctx = SRF_PERCALL_SETUP(); + <em class="replaceable"><code>user code</code></em> + + /* this is just one way we might test whether we are done: */ + if (funcctx->call_cntr < funcctx->max_calls) + { + /* Here we want to return another item: */ + <em class="replaceable"><code>user code</code></em> + <em class="replaceable"><code>obtain result Datum</code></em> + SRF_RETURN_NEXT(funcctx, result); + } + else + { + /* Here we are done returning items, so just report that fact. */ + /* (Resist the temptation to put cleanup code here.) 
*/ + SRF_RETURN_DONE(funcctx); + } +} +</pre><p> + </p><p> + A complete example of a simple <acronym class="acronym">SRF</acronym> returning a composite type + looks like: +</p><pre class="programlisting"> +PG_FUNCTION_INFO_V1(retcomposite); + +Datum +retcomposite(PG_FUNCTION_ARGS) +{ + FuncCallContext *funcctx; + int call_cntr; + int max_calls; + TupleDesc tupdesc; + AttInMetadata *attinmeta; + + /* stuff done only on the first call of the function */ + if (SRF_IS_FIRSTCALL()) + { + MemoryContext oldcontext; + + /* create a function context for cross-call persistence */ + funcctx = SRF_FIRSTCALL_INIT(); + + /* switch to memory context appropriate for multiple function calls */ + oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx); + + /* total number of tuples to be returned */ + funcctx->max_calls = PG_GETARG_UINT32(0); + + /* Build a tuple descriptor for our result type */ + if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE) + ereport(ERROR, + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("function returning record called in context " + "that cannot accept type record"))); + + /* + * generate attribute metadata needed later to produce tuples from raw + * C strings + */ + attinmeta = TupleDescGetAttInMetadata(tupdesc); + funcctx->attinmeta = attinmeta; + + MemoryContextSwitchTo(oldcontext); + } + + /* stuff done on every call of the function */ + funcctx = SRF_PERCALL_SETUP(); + + call_cntr = funcctx->call_cntr; + max_calls = funcctx->max_calls; + attinmeta = funcctx->attinmeta; + + if (call_cntr < max_calls) /* do when there is more left to send */ + { + char **values; + HeapTuple tuple; + Datum result; + + /* + * Prepare a values array for building the returned tuple. + * This should be an array of C strings which will + * be processed later by the type input functions. 
+ */ + values = (char **) palloc(3 * sizeof(char *)); + values[0] = (char *) palloc(16 * sizeof(char)); + values[1] = (char *) palloc(16 * sizeof(char)); + values[2] = (char *) palloc(16 * sizeof(char)); + + snprintf(values[0], 16, "%d", 1 * PG_GETARG_INT32(1)); + snprintf(values[1], 16, "%d", 2 * PG_GETARG_INT32(1)); + snprintf(values[2], 16, "%d", 3 * PG_GETARG_INT32(1)); + + /* build a tuple */ + tuple = BuildTupleFromCStrings(attinmeta, values); + + /* make the tuple into a datum */ + result = HeapTupleGetDatum(tuple); + + /* clean up (this is not really necessary) */ + pfree(values[0]); + pfree(values[1]); + pfree(values[2]); + pfree(values); + + SRF_RETURN_NEXT(funcctx, result); + } + else /* do when there is no more left */ + { + SRF_RETURN_DONE(funcctx); + } +} + +</pre><p> + + One way to declare this function in SQL is: +</p><pre class="programlisting"> +CREATE TYPE __retcomposite AS (f1 integer, f2 integer, f3 integer); + +CREATE OR REPLACE FUNCTION retcomposite(integer, integer) + RETURNS SETOF __retcomposite + AS '<em class="replaceable"><code>filename</code></em>', 'retcomposite' + LANGUAGE C IMMUTABLE STRICT; +</pre><p> + A different way is to use OUT parameters: +</p><pre class="programlisting"> +CREATE OR REPLACE FUNCTION retcomposite(IN integer, IN integer, + OUT f1 integer, OUT f2 integer, OUT f3 integer) + RETURNS SETOF record + AS '<em class="replaceable"><code>filename</code></em>', 'retcomposite' + LANGUAGE C IMMUTABLE STRICT; +</pre><p> + Notice that in this method the output type of the function is formally + an anonymous <code class="structname">record</code> type. + </p></div><div class="sect2" id="id-1.8.3.13.13"><div class="titlepage"><div><div><h3 class="title">38.10.9. Polymorphic Arguments and Return Types</h3></div></div></div><p> + C-language functions can be declared to accept and + return the polymorphic types described in <a class="xref" href="extend-type-system.html#EXTEND-TYPES-POLYMORPHIC" title="38.2.5. 
Polymorphic Types">Section 38.2.5</a>. + When a function's arguments or return types + are defined as polymorphic types, the function author cannot know + in advance what data type the function will be called with, or + what type it will need to return. Two routines are provided in <code class="filename">fmgr.h</code> + to allow a version-1 C function to discover the actual data types + of its arguments and the type it is expected to return. The routines are + called <code class="literal">get_fn_expr_rettype(FmgrInfo *flinfo)</code> and + <code class="literal">get_fn_expr_argtype(FmgrInfo *flinfo, int argnum)</code>. + They return the result or argument type OID, or <code class="symbol">InvalidOid</code> if the + information is not available. + The structure <code class="literal">flinfo</code> is normally accessed as + <code class="literal">fcinfo->flinfo</code>. The parameter <code class="literal">argnum</code> + is zero-based. <code class="function">get_call_result_type</code> can also be used + as an alternative to <code class="function">get_fn_expr_rettype</code>. + There is also <code class="function">get_fn_expr_variadic</code>, which can be used to + find out whether variadic arguments have been merged into an array. + This is primarily useful for <code class="literal">VARIADIC "any"</code> functions, + since such merging will always have occurred for variadic functions + taking ordinary array types. 
+ </p><p> + For example, suppose we want to write a function to accept a single + element of any type, and return a one-dimensional array of that type: + +</p><pre class="programlisting"> +PG_FUNCTION_INFO_V1(make_array); +Datum +make_array(PG_FUNCTION_ARGS) +{ + ArrayType *result; + Oid element_type = get_fn_expr_argtype(fcinfo->flinfo, 0); + Datum element; + bool isnull; + int16 typlen; + bool typbyval; + char typalign; + int ndims; + int dims[MAXDIM]; + int lbs[MAXDIM]; + + if (!OidIsValid(element_type)) + elog(ERROR, "could not determine data type of input"); + + /* get the provided element, being careful in case it's NULL */ + isnull = PG_ARGISNULL(0); + if (isnull) + element = (Datum) 0; + else + element = PG_GETARG_DATUM(0); + + /* we have one dimension */ + ndims = 1; + /* and one element */ + dims[0] = 1; + /* and lower bound is 1 */ + lbs[0] = 1; + + /* get required info about the element type */ + get_typlenbyvalalign(element_type, &typlen, &typbyval, &typalign); + + /* now build the array */ + result = construct_md_array(&element, &isnull, ndims, dims, lbs, + element_type, typlen, typbyval, typalign); + + PG_RETURN_ARRAYTYPE_P(result); +} +</pre><p> + </p><p> + The following command declares the function + <code class="function">make_array</code> in SQL: + +</p><pre class="programlisting"> +CREATE FUNCTION make_array(anyelement) RETURNS anyarray + AS '<em class="replaceable"><code>DIRECTORY</code></em>/funcs', 'make_array' + LANGUAGE C IMMUTABLE; +</pre><p> + </p><p> + There is a variant of polymorphism that is only available to C-language + functions: they can be declared to take parameters of type + <code class="literal">"any"</code>. (Note that this type name must be double-quoted, + since it's also an SQL reserved word.) This works like + <code class="type">anyelement</code> except that it does not constrain different + <code class="literal">"any"</code> arguments to be the same type, nor do they help + determine the function's result type. 
A C-language function can also + declare its final parameter to be <code class="literal">VARIADIC "any"</code>. This will + match one or more actual arguments of any type (not necessarily the same + type). These arguments will <span class="emphasis"><em>not</em></span> be gathered into an array + as happens with normal variadic functions; they will just be passed to + the function separately. The <code class="function">PG_NARGS()</code> macro and the + methods described above must be used to determine the number of actual + arguments and their types when using this feature. Also, users of such + a function might wish to use the <code class="literal">VARIADIC</code> keyword in their + function call, with the expectation that the function would treat the + array elements as separate arguments. The function itself must implement + that behavior if wanted, after using <code class="function">get_fn_expr_variadic</code> to + detect that the actual argument was marked with <code class="literal">VARIADIC</code>. + </p></div><div class="sect2" id="XFUNC-SHARED-ADDIN"><div class="titlepage"><div><div><h3 class="title">38.10.10. Shared Memory and LWLocks</h3></div></div></div><p> + Add-ins can reserve LWLocks and an allocation of shared memory on server + startup. The add-in's shared library must be preloaded by specifying + it in + <a class="xref" href="runtime-config-client.html#GUC-SHARED-PRELOAD-LIBRARIES">shared_preload_libraries</a><a id="id-1.8.3.13.14.2.2" class="indexterm"></a>. + Shared memory is reserved by calling: +</p><pre class="programlisting"> +void RequestAddinShmemSpace(int size) +</pre><p> + from your <code class="function">_PG_init</code> function. + </p><p> + LWLocks are reserved by calling: +</p><pre class="programlisting"> +void RequestNamedLWLockTranche(const char *tranche_name, int num_lwlocks) +</pre><p> + from <code class="function">_PG_init</code>. 
This will ensure that an array of + <code class="literal">num_lwlocks</code> LWLocks is available under the name + <code class="literal">tranche_name</code>. Use <code class="function">GetNamedLWLockTranche</code> + to get a pointer to this array. + </p><p> + To avoid possible race conditions, each backend should hold the LWLock + <code class="function">AddinShmemInitLock</code> when connecting to and initializing + its allocation of shared memory, as shown here: +</p><pre class="programlisting"> +static mystruct *ptr = NULL; + +if (!ptr) +{ + bool found; + + LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE); + ptr = ShmemInitStruct("my struct name", size, &found); + if (!found) + { + /* initialize contents of shmem area here */ + /* acquire any requested LWLocks using: */ + ptr->locks = GetNamedLWLockTranche("my tranche name"); + } + LWLockRelease(AddinShmemInitLock); +} +</pre><p> + </p></div><div class="sect2" id="EXTEND-CPP"><div class="titlepage"><div><div><h3 class="title">38.10.11. Using C++ for Extensibility</h3></div></div></div><a id="id-1.8.3.13.15.2" class="indexterm"></a><p> + Although the <span class="productname">PostgreSQL</span> backend is written in + C, it is possible to write extensions in C++ if these guidelines are + followed: + + </p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p> + All functions accessed by the backend must present a C interface + to the backend; these C functions can then call C++ functions. + For example, <code class="literal">extern "C"</code> linkage is required for + backend-accessed functions. This is also necessary for any + functions that are passed as pointers between the backend and + C++ code. + </p></li><li class="listitem"><p> + Free memory using the appropriate deallocation method. For example, + most backend memory is allocated using <code class="function">palloc()</code>, so use + <code class="function">pfree()</code> to free it. 
Using C++ + <code class="function">delete</code> in such cases will fail. + </p></li><li class="listitem"><p> + Prevent exceptions from propagating into the C code (use a catch-all + block at the top level of all <code class="literal">extern "C"</code> functions). This + is necessary even if the C++ code does not explicitly throw any + exceptions, because events like out-of-memory can still throw + exceptions. Any exceptions must be caught and appropriate errors + passed back to the C interface. If possible, compile C++ with + <code class="option">-fno-exceptions</code> to eliminate exceptions entirely; in such + cases, you must check for failures in your C++ code, e.g., check for + NULL returned by <code class="function">new()</code>. + </p></li><li class="listitem"><p> + If calling backend functions from C++ code, be sure that the + C++ call stack contains only plain old data structures + (<acronym class="acronym">POD</acronym>). This is necessary because backend errors + generate a distant <code class="function">longjmp()</code> that does not properly + unwind a C++ call stack with non-POD objects. + </p></li></ul></div><p> + </p><p> + In summary, it is best to place C++ code behind a wall of + <code class="literal">extern "C"</code> functions that interface to the backend, + and avoid exception, memory, and call stack leakage. + </p></div></div><div xmlns="http://www.w3.org/TR/xhtml1/transitional" class="navfooter"><hr></hr><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="xfunc-internal.html" title="38.9. Internal Functions">Prev</a> </td><td width="20%" align="center"><a accesskey="u" href="extend.html" title="Chapter 38. Extending SQL">Up</a></td><td width="40%" align="right"> <a accesskey="n" href="xfunc-optimization.html" title="38.11. Function Optimization Information">Next</a></td></tr><tr><td width="40%" align="left" valign="top">38.9. 
Internal Functions </td><td width="20%" align="center"><a accesskey="h" href="index.html" title="PostgreSQL 14.5 Documentation">Home</a></td><td width="40%" align="right" valign="top"> 38.11. Function Optimization Information</td></tr></table></div></body></html>
\ No newline at end of file |