diff options
Diffstat (limited to 'doc/src/sgml/plperl.sgml')
-rw-r--r-- | doc/src/sgml/plperl.sgml | 1584 |
1 files changed, 1584 insertions, 0 deletions
diff --git a/doc/src/sgml/plperl.sgml b/doc/src/sgml/plperl.sgml new file mode 100644 index 0000000..01f9870 --- /dev/null +++ b/doc/src/sgml/plperl.sgml @@ -0,0 +1,1584 @@ +<!-- doc/src/sgml/plperl.sgml --> + + <chapter id="plperl"> + <title>PL/Perl — Perl Procedural Language</title> + + <indexterm zone="plperl"> + <primary>PL/Perl</primary> + </indexterm> + + <indexterm zone="plperl"> + <primary>Perl</primary> + </indexterm> + + <para> + PL/Perl is a loadable procedural language that enables you to write + <productname>PostgreSQL</productname> functions and procedures in the + <ulink url="https://www.perl.org">Perl programming language</ulink>. + </para> + + <para> + The main advantage to using PL/Perl is that this allows use, + within stored functions and procedures, of the manyfold <quote>string + munging</quote> operators and functions available for Perl. Parsing + complex strings might be easier using Perl than it is with the + string functions and control structures provided in PL/pgSQL. + </para> + + <para> + To install PL/Perl in a particular database, use + <literal>CREATE EXTENSION plperl</literal>. + </para> + + <tip> + <para> + If a language is installed into <literal>template1</literal>, all subsequently + created databases will have the language installed automatically. + </para> + </tip> + + <note> + <para> + Users of source packages must specially enable the build of + PL/Perl during the installation process. (Refer to <xref + linkend="installation"/> for more information.) Users of + binary packages might find PL/Perl in a separate subpackage. + </para> + </note> + + <sect1 id="plperl-funcs"> + <title>PL/Perl Functions and Arguments</title> + + <para> + To create a function in the PL/Perl language, use the standard + <xref linkend="sql-createfunction"/> + syntax: + +<programlisting> +CREATE FUNCTION <replaceable>funcname</replaceable> (<replaceable>argument-types</replaceable>) +RETURNS <replaceable>return-type</replaceable> +-- function attributes can go here +AS $$ + # PL/Perl function body goes here +$$ LANGUAGE plperl; +</programlisting> + + The body of the function is ordinary Perl code. In fact, the PL/Perl + glue code wraps it inside a Perl subroutine. A PL/Perl function is + called in a scalar context, so it can't return a list. You can return + non-scalar values (arrays, records, and sets) by returning a reference, + as discussed below. + </para> + + <para> + In a PL/Perl procedure, any return value from the Perl code is ignored. + </para> + + <para> + PL/Perl also supports anonymous code blocks called with the + <xref linkend="sql-do"/> statement: + +<programlisting> +DO $$ + # PL/Perl code +$$ LANGUAGE plperl; +</programlisting> + + An anonymous code block receives no arguments, and whatever value it + might return is discarded. Otherwise it behaves just like a function. + </para> + + <note> + <para> + The use of named nested subroutines is dangerous in Perl, especially if + they refer to lexical variables in the enclosing scope. Because a PL/Perl + function is wrapped in a subroutine, any named subroutine you place inside + one will be nested. In general, it is far safer to create anonymous + subroutines which you call via a coderef. For more information, see the + entries for <literal>Variable "%s" will not stay shared</literal> and + <literal>Variable "%s" is not available</literal> in the + <citerefentry><refentrytitle>perldiag</refentrytitle></citerefentry> man page, or + search the Internet for <quote>perl nested named subroutine</quote>. + </para> + </note> + + <para> + The syntax of the <command>CREATE FUNCTION</command> command requires + the function body to be written as a string constant. It is usually + most convenient to use dollar quoting (see <xref + linkend="sql-syntax-dollar-quoting"/>) for the string constant. + If you choose to use escape string syntax <literal>E''</literal>, + you must double any single quote marks (<literal>'</literal>) and backslashes + (<literal>\</literal>) used in the body of the function + (see <xref linkend="sql-syntax-strings"/>). + </para> + + <para> + Arguments and results are handled as in any other Perl subroutine: + arguments are passed in <varname>@_</varname>, and a result value + is returned with <literal>return</literal> or as the last expression + evaluated in the function. + </para> + + <para> + For example, a function returning the greater of two integer values + could be defined as: + +<programlisting> +CREATE FUNCTION perl_max (integer, integer) RETURNS integer AS $$ + if ($_[0] > $_[1]) { return $_[0]; } + return $_[1]; +$$ LANGUAGE plperl; +</programlisting> + </para> + + <note> + <para> + Arguments will be converted from the database's encoding to UTF-8 + for use inside PL/Perl, and then converted from UTF-8 back to the + database encoding upon return. + </para> + </note> + + <para> + If an SQL null value<indexterm><primary>null value</primary><secondary + sortas="PL/Perl">in PL/Perl</secondary></indexterm> is passed to a function, + the argument value will appear as <quote>undefined</quote> in Perl. The + above function definition will not behave very nicely with null + inputs (in fact, it will act as though they are zeroes). We could + add <literal>STRICT</literal> to the function definition to make + <productname>PostgreSQL</productname> do something more reasonable: + if a null value is passed, the function will not be called at all, + but will just return a null result automatically. Alternatively, + we could check for undefined inputs in the function body. For + example, suppose that we wanted <function>perl_max</function> with + one null and one nonnull argument to return the nonnull argument, + rather than a null value: + +<programlisting> +CREATE FUNCTION perl_max (integer, integer) RETURNS integer AS $$ + my ($x, $y) = @_; + if (not defined $x) { + return undef if not defined $y; + return $y; + } + return $x if not defined $y; + return $x if $x > $y; + return $y; +$$ LANGUAGE plperl; +</programlisting> + As shown above, to return an SQL null value from a PL/Perl + function, return an undefined value. This can be done whether the + function is strict or not. + </para> + + <para> + Anything in a function argument that is not a reference is + a string, which is in the standard <productname>PostgreSQL</productname> + external text representation for the relevant data type. In the case of + ordinary numeric or text types, Perl will just do the right thing and + the programmer will normally not have to worry about it. However, in + other cases the argument will need to be converted into a form that is + more usable in Perl. For example, the <function>decode_bytea</function> + function can be used to convert an argument of + type <type>bytea</type> into unescaped binary. + </para> + + <para> + Similarly, values passed back to <productname>PostgreSQL</productname> + must be in the external text representation format. For example, the + <function>encode_bytea</function> function can be used to + escape binary data for a return value of type <type>bytea</type>. + </para> + + <para> + One case that is particularly important is boolean values. As just + stated, the default behavior for <type>bool</type> values is that they + are passed to Perl as text, thus either <literal>'t'</literal> + or <literal>'f'</literal>. This is problematic, since Perl will not + treat <literal>'f'</literal> as false! It is possible to improve matters + by using a <quote>transform</quote> (see + <xref linkend="sql-createtransform"/>). Suitable transforms are provided + by the <filename>bool_plperl</filename> extension. To use it, install + the extension: +<programlisting> +CREATE EXTENSION bool_plperl; -- or bool_plperlu for PL/PerlU +</programlisting> + Then use the <literal>TRANSFORM</literal> function attribute for a + PL/Perl function that takes or returns <type>bool</type>, for example: +<programlisting> +CREATE FUNCTION perl_and(bool, bool) RETURNS bool +TRANSFORM FOR TYPE bool +AS $$ + my ($a, $b) = @_; + return $a && $b; +$$ LANGUAGE plperl; +</programlisting> + When this transform is applied, <type>bool</type> arguments will be seen + by Perl as being <literal>1</literal> or empty, thus properly true or + false. If the function result is type <type>bool</type>, it will be true + or false according to whether Perl would evaluate the returned value as + true. + Similar transformations are also performed for boolean query arguments + and results of SPI queries performed inside the function + (<xref linkend="plperl-database"/>). + </para> + + <para> + Perl can return <productname>PostgreSQL</productname> arrays as + references to Perl arrays. Here is an example: + +<programlisting> +CREATE OR REPLACE function returns_array() +RETURNS text[][] AS $$ + return [['a"b','c,d'],['e\\f','g']]; +$$ LANGUAGE plperl; + +select returns_array(); +</programlisting> + </para> + + <para> + Perl passes <productname>PostgreSQL</productname> arrays as a blessed + <type>PostgreSQL::InServer::ARRAY</type> object. This object may be treated as an array + reference or a string, allowing for backward compatibility with Perl + code written for <productname>PostgreSQL</productname> versions below 9.1 to + run. For example: + +<programlisting> +CREATE OR REPLACE FUNCTION concat_array_elements(text[]) RETURNS TEXT AS $$ + my $arg = shift; + my $result = ""; + return undef if (!defined $arg); + + # as an array reference + for (@$arg) { + $result .= $_; + } + + # also works as a string + $result .= $arg; + + return $result; +$$ LANGUAGE plperl; + +SELECT concat_array_elements(ARRAY['PL','/','Perl']); +</programlisting> + + <note> + <para> + Multidimensional arrays are represented as references to + lower-dimensional arrays of references in a way common to every Perl + programmer. + </para> + </note> + </para> + + <para> + Composite-type arguments are passed to the function as references + to hashes. The keys of the hash are the attribute names of the + composite type. Here is an example: + +<programlisting> +CREATE TABLE employee ( + name text, + basesalary integer, + bonus integer +); + +CREATE FUNCTION empcomp(employee) RETURNS integer AS $$ + my ($emp) = @_; + return $emp->{basesalary} + $emp->{bonus}; +$$ LANGUAGE plperl; + +SELECT name, empcomp(employee.*) FROM employee; +</programlisting> + </para> + + <para> + A PL/Perl function can return a composite-type result using the same + approach: return a reference to a hash that has the required attributes. + For example: + +<programlisting> +CREATE TYPE testrowperl AS (f1 integer, f2 text, f3 text); + +CREATE OR REPLACE FUNCTION perl_row() RETURNS testrowperl AS $$ + return {f2 => 'hello', f1 => 1, f3 => 'world'}; +$$ LANGUAGE plperl; + +SELECT * FROM perl_row(); +</programlisting> + + Any columns in the declared result data type that are not present in the + hash will be returned as null values. + </para> + + <para> + Similarly, output arguments of procedures can be returned as a hash + reference: + +<programlisting> +CREATE PROCEDURE perl_triple(INOUT a integer, INOUT b integer) AS $$ + my ($a, $b) = @_; + return {a => $a * 3, b => $b * 3}; +$$ LANGUAGE plperl; + +CALL perl_triple(5, 10); +</programlisting> + </para> + + <para> + PL/Perl functions can also return sets of either scalar or + composite types. Usually you'll want to return rows one at a + time, both to speed up startup time and to keep from queuing up + the entire result set in memory. You can do this with + <function>return_next</function> as illustrated below. Note that + after the last <function>return_next</function>, you must put + either <literal>return</literal> or (better) <literal>return + undef</literal>. + +<programlisting> +CREATE OR REPLACE FUNCTION perl_set_int(int) +RETURNS SETOF INTEGER AS $$ + foreach (0..$_[0]) { + return_next($_); + } + return undef; +$$ LANGUAGE plperl; + +SELECT * FROM perl_set_int(5); + +CREATE OR REPLACE FUNCTION perl_set() +RETURNS SETOF testrowperl AS $$ + return_next({ f1 => 1, f2 => 'Hello', f3 => 'World' }); + return_next({ f1 => 2, f2 => 'Hello', f3 => 'PostgreSQL' }); + return_next({ f1 => 3, f2 => 'Hello', f3 => 'PL/Perl' }); + return undef; +$$ LANGUAGE plperl; +</programlisting> + + For small result sets, you can return a reference to an array that + contains either scalars, references to arrays, or references to + hashes for simple types, array types, and composite types, + respectively. Here are some simple examples of returning the entire + result set as an array reference: + +<programlisting> +CREATE OR REPLACE FUNCTION perl_set_int(int) RETURNS SETOF INTEGER AS $$ + return [0..$_[0]]; +$$ LANGUAGE plperl; + +SELECT * FROM perl_set_int(5); + +CREATE OR REPLACE FUNCTION perl_set() RETURNS SETOF testrowperl AS $$ + return [ + { f1 => 1, f2 => 'Hello', f3 => 'World' }, + { f1 => 2, f2 => 'Hello', f3 => 'PostgreSQL' }, + { f1 => 3, f2 => 'Hello', f3 => 'PL/Perl' } + ]; +$$ LANGUAGE plperl; + +SELECT * FROM perl_set(); +</programlisting> + </para> + + <para> + If you wish to use the <literal>strict</literal> pragma with your code you + have a few options. For temporary global use you can <command>SET</command> + <literal>plperl.use_strict</literal> to true. + This will affect subsequent compilations of <application>PL/Perl</application> + functions, but not functions already compiled in the current session. + For permanent global use you can set <literal>plperl.use_strict</literal> + to true in the <filename>postgresql.conf</filename> file. + </para> + + <para> + For permanent use in specific functions you can simply put: +<programlisting> +use strict; +</programlisting> + at the top of the function body. + </para> + + <para> + The <literal>feature</literal> pragma is also available to <function>use</function> if your Perl is version 5.10.0 or higher. + </para> + + </sect1> + + <sect1 id="plperl-data"> + <title>Data Values in PL/Perl</title> + + <para> + The argument values supplied to a PL/Perl function's code are + simply the input arguments converted to text form (just as if they + had been displayed by a <command>SELECT</command> statement). + Conversely, the <function>return</function> and <function>return_next</function> + commands will accept any string that is acceptable input format + for the function's declared return type. + </para> + + <para> + If this behavior is inconvenient for a particular case, it can be + improved by using a transform, as already illustrated + for <type>bool</type> values. Several examples of transform modules + are included in the <productname>PostgreSQL</productname> distribution. + </para> + </sect1> + + <sect1 id="plperl-builtins"> + <title>Built-in Functions</title> + + <sect2 id="plperl-database"> + <title>Database Access from PL/Perl</title> + + <para> + Access to the database itself from your Perl function can be done + via the following functions: + </para> + + <variablelist> + <varlistentry> + <term> + <literal><function>spi_exec_query</function>(<replaceable>query</replaceable> [, <replaceable>max-rows</replaceable>])</literal> + <indexterm> + <primary>spi_exec_query</primary> + <secondary>in PL/Perl</secondary> + </indexterm> + </term> + <listitem> + <para> + <literal>spi_exec_query</literal> executes an SQL command and +returns the entire row set as a reference to an array of hash +references. <emphasis>You should only use this command when you know +that the result set will be relatively small.</emphasis> Here is an +example of a query (<command>SELECT</command> command) with the +optional maximum number of rows: + +<programlisting> +$rv = spi_exec_query('SELECT * FROM my_table', 5); +</programlisting> + This returns up to 5 rows from the table + <literal>my_table</literal>. If <literal>my_table</literal> + has a column <literal>my_column</literal>, you can get that + value from row <literal>$i</literal> of the result like this: +<programlisting> +$foo = $rv->{rows}[$i]->{my_column}; +</programlisting> + The total number of rows returned from a <command>SELECT</command> + query can be accessed like this: +<programlisting> +$nrows = $rv->{processed} +</programlisting> + </para> + + <para> + Here is an example using a different command type: +<programlisting> +$query = "INSERT INTO my_table VALUES (1, 'test')"; +$rv = spi_exec_query($query); +</programlisting> + You can then access the command status (e.g., + <literal>SPI_OK_INSERT</literal>) like this: +<programlisting> +$res = $rv->{status}; +</programlisting> + To get the number of rows affected, do: +<programlisting> +$nrows = $rv->{processed}; +</programlisting> + </para> + + <para> + Here is a complete example: +<programlisting> +CREATE TABLE test ( + i int, + v varchar +); + +INSERT INTO test (i, v) VALUES (1, 'first line'); +INSERT INTO test (i, v) VALUES (2, 'second line'); +INSERT INTO test (i, v) VALUES (3, 'third line'); +INSERT INTO test (i, v) VALUES (4, 'immortal'); + +CREATE OR REPLACE FUNCTION test_munge() RETURNS SETOF test AS $$ + my $rv = spi_exec_query('select i, v from test;'); + my $status = $rv->{status}; + my $nrows = $rv->{processed}; + foreach my $rn (0 .. $nrows - 1) { + my $row = $rv->{rows}[$rn]; + $row->{i} += 200 if defined($row->{i}); + $row->{v} =~ tr/A-Za-z/a-zA-Z/ if (defined($row->{v})); + return_next($row); + } + return undef; +$$ LANGUAGE plperl; + +SELECT * FROM test_munge(); +</programlisting> + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term> + <literal><function>spi_query(<replaceable>command</replaceable>)</function></literal> + <indexterm> + <primary>spi_query</primary> + <secondary>in PL/Perl</secondary> + </indexterm> + </term> + <term> + <literal><function>spi_fetchrow(<replaceable>cursor</replaceable>)</function></literal> + <indexterm> + <primary>spi_fetchrow</primary> + <secondary>in PL/Perl</secondary> + </indexterm> + </term> + <term> + <literal><function>spi_cursor_close(<replaceable>cursor</replaceable>)</function></literal> + <indexterm> + <primary>spi_cursor_close</primary> + <secondary>in PL/Perl</secondary> + </indexterm> + </term> + + <listitem> + <para> + <literal>spi_query</literal> and <literal>spi_fetchrow</literal> + work together as a pair for row sets which might be large, or for cases + where you wish to return rows as they arrive. + <literal>spi_fetchrow</literal> works <emphasis>only</emphasis> with + <literal>spi_query</literal>. The following example illustrates how + you use them together: + +<programlisting> +CREATE TYPE foo_type AS (the_num INTEGER, the_text TEXT); + +CREATE OR REPLACE FUNCTION lotsa_md5 (INTEGER) RETURNS SETOF foo_type AS $$ + use Digest::MD5 qw(md5_hex); + my $file = '/usr/share/dict/words'; + my $t = localtime; + elog(NOTICE, "opening file $file at $t" ); + open my $fh, '<', $file # ooh, it's a file access! + or elog(ERROR, "cannot open $file for reading: $!"); + my @words = <$fh>; + close $fh; + $t = localtime; + elog(NOTICE, "closed file $file at $t"); + chomp(@words); + my $row; + my $sth = spi_query("SELECT * FROM generate_series(1,$_[0]) AS b(a)"); + while (defined ($row = spi_fetchrow($sth))) { + return_next({ + the_num => $row->{a}, + the_text => md5_hex($words[rand @words]) + }); + } + return; +$$ LANGUAGE plperlu; + +SELECT * from lotsa_md5(500); +</programlisting> + </para> + + <para> + Normally, <function>spi_fetchrow</function> should be repeated until it + returns <literal>undef</literal>, indicating that there are no more + rows to read. The cursor returned by <literal>spi_query</literal> + is automatically freed when + <function>spi_fetchrow</function> returns <literal>undef</literal>. + If you do not wish to read all the rows, instead call + <function>spi_cursor_close</function> to free the cursor. + Failure to do so will result in memory leaks. + </para> + + </listitem> + </varlistentry> + + <varlistentry> + <term> + <literal><function>spi_prepare(<replaceable>command</replaceable>, <replaceable>argument types</replaceable>)</function></literal> + <indexterm> + <primary>spi_prepare</primary> + <secondary>in PL/Perl</secondary> + </indexterm> + </term> + <term> + <literal><function>spi_query_prepared(<replaceable>plan</replaceable>, <replaceable>arguments</replaceable>)</function></literal> + <indexterm> + <primary>spi_query_prepared</primary> + <secondary>in PL/Perl</secondary> + </indexterm> + </term> + <term> + <literal><function>spi_exec_prepared(<replaceable>plan</replaceable> [, <replaceable>attributes</replaceable>], <replaceable>arguments</replaceable>)</function></literal> + <indexterm> + <primary>spi_exec_prepared</primary> + <secondary>in PL/Perl</secondary> + </indexterm> + </term> + <term> + <literal><function>spi_freeplan(<replaceable>plan</replaceable>)</function></literal> + <indexterm> + <primary>spi_freeplan</primary> + <secondary>in PL/Perl</secondary> + </indexterm> + </term> + + <listitem> + <para> + <literal>spi_prepare</literal>, <literal>spi_query_prepared</literal>, <literal>spi_exec_prepared</literal>, + and <literal>spi_freeplan</literal> implement the same functionality but for prepared queries. + <literal>spi_prepare</literal> accepts a query string with numbered argument placeholders ($1, $2, etc) + and a string list of argument types: +<programlisting> +$plan = spi_prepare('SELECT * FROM test WHERE id > $1 AND name = $2', + 'INTEGER', 'TEXT'); +</programlisting> + Once a query plan is prepared by a call to <literal>spi_prepare</literal>, the plan can be used instead + of the string query, either in <literal>spi_exec_prepared</literal>, where the result is the same as returned + by <literal>spi_exec_query</literal>, or in <literal>spi_query_prepared</literal> which returns a cursor + exactly as <literal>spi_query</literal> does, which can be later passed to <literal>spi_fetchrow</literal>. + The optional second parameter to <literal>spi_exec_prepared</literal> is a hash reference of attributes; + the only attribute currently supported is <literal>limit</literal>, which sets the maximum number of rows returned by a query. + </para> + + <para> + The advantage of prepared queries is that is it possible to use one prepared plan for more + than one query execution. After the plan is not needed anymore, it can be freed with + <literal>spi_freeplan</literal>: +<programlisting> +CREATE OR REPLACE FUNCTION init() RETURNS VOID AS $$ + $_SHARED{my_plan} = spi_prepare('SELECT (now() + $1)::date AS now', + 'INTERVAL'); +$$ LANGUAGE plperl; + +CREATE OR REPLACE FUNCTION add_time( INTERVAL ) RETURNS TEXT AS $$ + return spi_exec_prepared( + $_SHARED{my_plan}, + $_[0] + )->{rows}->[0]->{now}; +$$ LANGUAGE plperl; + +CREATE OR REPLACE FUNCTION done() RETURNS VOID AS $$ + spi_freeplan( $_SHARED{my_plan}); + undef $_SHARED{my_plan}; +$$ LANGUAGE plperl; + +SELECT init(); +SELECT add_time('1 day'), add_time('2 days'), add_time('3 days'); +SELECT done(); + + add_time | add_time | add_time +------------+------------+------------ + 2005-12-10 | 2005-12-11 | 2005-12-12 +</programlisting> + Note that the parameter subscript in <literal>spi_prepare</literal> is defined via + $1, $2, $3, etc, so avoid declaring query strings in double quotes that might easily + lead to hard-to-catch bugs. + </para> + + <para> + Another example illustrates usage of an optional parameter in <literal>spi_exec_prepared</literal>: +<programlisting> +CREATE TABLE hosts AS SELECT id, ('192.168.1.'||id)::inet AS address + FROM generate_series(1,3) AS id; + +CREATE OR REPLACE FUNCTION init_hosts_query() RETURNS VOID AS $$ + $_SHARED{plan} = spi_prepare('SELECT * FROM hosts + WHERE address << $1', 'inet'); +$$ LANGUAGE plperl; + +CREATE OR REPLACE FUNCTION query_hosts(inet) RETURNS SETOF hosts AS $$ + return spi_exec_prepared( + $_SHARED{plan}, + {limit => 2}, + $_[0] + )->{rows}; +$$ LANGUAGE plperl; + +CREATE OR REPLACE FUNCTION release_hosts_query() RETURNS VOID AS $$ + spi_freeplan($_SHARED{plan}); + undef $_SHARED{plan}; +$$ LANGUAGE plperl; + +SELECT init_hosts_query(); +SELECT query_hosts('192.168.1.0/30'); +SELECT release_hosts_query(); + + query_hosts +----------------- + (1,192.168.1.1) + (2,192.168.1.2) +(2 rows) +</programlisting> + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term> + <literal><function>spi_commit()</function></literal> + <indexterm> + <primary>spi_commit</primary> + <secondary>in PL/Perl</secondary> + </indexterm> + </term> + <term> + <literal><function>spi_rollback()</function></literal> + <indexterm> + <primary>spi_rollback</primary> + <secondary>in PL/Perl</secondary> + </indexterm> + </term> + <listitem> + <para> + Commit or roll back the current transaction. This can only be called + in a procedure or anonymous code block (<command>DO</command> command) + called from the top level. (Note that it is not possible to run the + SQL commands <command>COMMIT</command> or <command>ROLLBACK</command> + via <function>spi_exec_query</function> or similar. It has to be done + using these functions.) After a transaction is ended, a new + transaction is automatically started, so there is no separate function + for that. + </para> + + <para> + Here is an example: +<programlisting> +CREATE PROCEDURE transaction_test1() +LANGUAGE plperl +AS $$ +foreach my $i (0..9) { + spi_exec_query("INSERT INTO test1 (a) VALUES ($i)"); + if ($i % 2 == 0) { + spi_commit(); + } else { + spi_rollback(); + } +} +$$; + +CALL transaction_test1(); +</programlisting> + </para> + </listitem> + </varlistentry> + </variablelist> + </sect2> + + <sect2 id="plperl-utility-functions"> + <title>Utility Functions in PL/Perl</title> + + <variablelist> + <varlistentry> + <term> + <literal><function>elog(<replaceable>level</replaceable>, <replaceable>msg</replaceable>)</function></literal> + <indexterm> + <primary>elog</primary> + <secondary>in PL/Perl</secondary> + </indexterm> + </term> + <listitem> + <para> + Emit a log or error message. Possible levels are + <literal>DEBUG</literal>, <literal>LOG</literal>, <literal>INFO</literal>, + <literal>NOTICE</literal>, <literal>WARNING</literal>, and <literal>ERROR</literal>. + <literal>ERROR</literal> + raises an error condition; if this is not trapped by the surrounding + Perl code, the error propagates out to the calling query, causing + the current transaction or subtransaction to be aborted. This + is effectively the same as the Perl <literal>die</literal> command. + The other levels only generate messages of different + priority levels. + Whether messages of a particular priority are reported to the client, + written to the server log, or both is controlled by the + <xref linkend="guc-log-min-messages"/> and + <xref linkend="guc-client-min-messages"/> configuration + variables. See <xref linkend="runtime-config"/> for more + information. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term> + <literal><function>quote_literal(<replaceable>string</replaceable>)</function></literal> + <indexterm> + <primary>quote_literal</primary> + <secondary>in PL/Perl</secondary> + </indexterm> + </term> + <listitem> + <para> + Return the given string suitably quoted to be used as a string literal in an SQL + statement string. Embedded single-quotes and backslashes are properly doubled. + Note that <function>quote_literal</function> returns undef on undef input; if the argument + might be undef, <function>quote_nullable</function> is often more suitable. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term> + <literal><function>quote_nullable(<replaceable>string</replaceable>)</function></literal> + <indexterm> + <primary>quote_nullable</primary> + <secondary>in PL/Perl</secondary> + </indexterm> + </term> + <listitem> + <para> + Return the given string suitably quoted to be used as a string literal in an SQL + statement string; or, if the argument is undef, return the unquoted string "NULL". + Embedded single-quotes and backslashes are properly doubled. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term> + <literal><function>quote_ident(<replaceable>string</replaceable>)</function></literal> + <indexterm> + <primary>quote_ident</primary> + <secondary>in PL/Perl</secondary> + </indexterm> + </term> + <listitem> + <para> + Return the given string suitably quoted to be used as an identifier in + an SQL statement string. Quotes are added only if necessary (i.e., if + the string contains non-identifier characters or would be case-folded). + Embedded quotes are properly doubled. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term> + <literal><function>decode_bytea(<replaceable>string</replaceable>)</function></literal> + <indexterm> + <primary>decode_bytea</primary> + <secondary>in PL/Perl</secondary> + </indexterm> + </term> + <listitem> + <para> + Return the unescaped binary data represented by the contents of the given string, + which should be <type>bytea</type> encoded. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term> + <literal><function>encode_bytea(<replaceable>string</replaceable>)</function></literal> + <indexterm> + <primary>encode_bytea</primary> + <secondary>in PL/Perl</secondary> + </indexterm> + </term> + <listitem> + <para> + Return the <type>bytea</type> encoded form of the binary data contents of the given string. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term> + <literal><function>encode_array_literal(<replaceable>array</replaceable>)</function></literal> + <indexterm> + <primary>encode_array_literal</primary> + <secondary>in PL/Perl</secondary> + </indexterm> + </term> + <term> + <literal><function>encode_array_literal(<replaceable>array</replaceable>, <replaceable>delimiter</replaceable>)</function></literal> + </term> + <listitem> + <para> + Returns the contents of the referenced array as a string in array literal format + (see <xref linkend="arrays-input"/>). + Returns the argument value unaltered if it's not a reference to an array. + The delimiter used between elements of the array literal defaults to "<literal>, </literal>" + if a delimiter is not specified or is undef. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term> + <literal><function>encode_typed_literal(<replaceable>value</replaceable>, <replaceable>typename</replaceable>)</function></literal> + <indexterm> + <primary>encode_typed_literal</primary> + <secondary>in PL/Perl</secondary> + </indexterm> + </term> + <listitem> + <para> + Converts a Perl variable to the value of the data type passed as a + second argument and returns a string representation of this value. + Correctly handles nested arrays and values of composite types. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term> + <literal><function>encode_array_constructor(<replaceable>array</replaceable>)</function></literal> + <indexterm> + <primary>encode_array_constructor</primary> + <secondary>in PL/Perl</secondary> + </indexterm> + </term> + <listitem> + <para> + Returns the contents of the referenced array as a string in array constructor format + (see <xref linkend="sql-syntax-array-constructors"/>). + Individual values are quoted using <function>quote_nullable</function>. + Returns the argument value, quoted using <function>quote_nullable</function>, + if it's not a reference to an array. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term> + <literal><function>looks_like_number(<replaceable>string</replaceable>)</function></literal> + <indexterm> + <primary>looks_like_number</primary> + <secondary>in PL/Perl</secondary> + </indexterm> + </term> + <listitem> + <para> + Returns a true value if the content of the given string looks like a + number, according to Perl, returns false otherwise. + Returns undef if the argument is undef. Leading and trailing space is + ignored. <literal>Inf</literal> and <literal>Infinity</literal> are regarded as numbers. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term> + <literal><function>is_array_ref(<replaceable>argument</replaceable>)</function></literal> + <indexterm> + <primary>is_array_ref</primary> + <secondary>in PL/Perl</secondary> + </indexterm> + </term> + <listitem> + <para> + Returns a true value if the given argument may be treated as an + array reference, that is, if ref of the argument is <literal>ARRAY</literal> or + <literal>PostgreSQL::InServer::ARRAY</literal>. Returns false otherwise. + </para> + </listitem> + </varlistentry> + + </variablelist> + </sect2> + </sect1> + + <sect1 id="plperl-global"> + <title>Global Values in PL/Perl</title> + + <para> + You can use the global hash <varname>%_SHARED</varname> to store + data, including code references, between function calls for the + lifetime of the current session. + </para> + + <para> + Here is a simple example for shared data: +<programlisting> +CREATE OR REPLACE FUNCTION set_var(name text, val text) RETURNS text AS $$ + if ($_SHARED{$_[0]} = $_[1]) { + return 'ok'; + } else { + return "cannot set shared variable $_[0] to $_[1]"; + } +$$ LANGUAGE plperl; + +CREATE OR REPLACE FUNCTION get_var(name text) RETURNS text AS $$ + return $_SHARED{$_[0]}; +$$ LANGUAGE plperl; + +SELECT set_var('sample', 'Hello, PL/Perl! How''s tricks?'); +SELECT get_var('sample'); +</programlisting> + </para> + + <para> + Here is a slightly more complicated example using a code reference: + +<programlisting> +CREATE OR REPLACE FUNCTION myfuncs() RETURNS void AS $$ + $_SHARED{myquote} = sub { + my $arg = shift; + $arg =~ s/(['\\])/\\$1/g; + return "'$arg'"; + }; +$$ LANGUAGE plperl; + +SELECT myfuncs(); /* initializes the function */ + +/* Set up a function that uses the quote function */ + +CREATE OR REPLACE FUNCTION use_quote(TEXT) RETURNS text AS $$ + my $text_to_quote = shift; + my $qfunc = $_SHARED{myquote}; + return &$qfunc($text_to_quote); +$$ LANGUAGE plperl; +</programlisting> + + (You could have replaced the above with the one-liner + <literal>return $_SHARED{myquote}->($_[0]);</literal> + at the expense of readability.) + </para> + + <para> + For security reasons, PL/Perl executes functions called by any one SQL role + in a separate Perl interpreter for that role. This prevents accidental or + malicious interference by one user with the behavior of another user's + PL/Perl functions. Each such interpreter has its own value of the + <varname>%_SHARED</varname> variable and other global state. Thus, two + PL/Perl functions will share the same value of <varname>%_SHARED</varname> + if and only if they are executed by the same SQL role. In an application + wherein a single session executes code under multiple SQL roles (via + <literal>SECURITY DEFINER</literal> functions, use of <command>SET ROLE</command>, etc) + you may need to take explicit steps to ensure that PL/Perl functions can + share data via <varname>%_SHARED</varname>. To do that, make sure that + functions that should communicate are owned by the same user, and mark + them <literal>SECURITY DEFINER</literal>. You must of course take care that + such functions can't be used to do anything unintended. + </para> + </sect1> + + <sect1 id="plperl-trusted"> + <title>Trusted and Untrusted PL/Perl</title> + + <indexterm zone="plperl-trusted"> + <primary>trusted</primary> + <secondary>PL/Perl</secondary> + </indexterm> + + <para> + Normally, PL/Perl is installed as a <quote>trusted</quote> programming + language named <literal>plperl</literal>. In this setup, certain Perl + operations are disabled to preserve security. In general, the + operations that are restricted are those that interact with the + environment. This includes file handle operations, + <literal>require</literal>, and <literal>use</literal> (for + external modules). There is no way to access internals of the + database server process or to gain OS-level access with the + permissions of the server process, + as a C function can do. Thus, any unprivileged database user can + be permitted to use this language. + </para> + + <para> + Here is an example of a function that will not work because file + system operations are not allowed for security reasons: +<programlisting> +CREATE FUNCTION badfunc() RETURNS integer AS $$ + my $tmpfile = "/tmp/badfile"; + open my $fh, '>', $tmpfile + or elog(ERROR, qq{could not open the file "$tmpfile": $!}); + print $fh "Testing writing to a file\n"; + close $fh or elog(ERROR, qq{could not close the file "$tmpfile": $!}); + return 1; +$$ LANGUAGE plperl; +</programlisting> + The creation of this function will fail as its use of a forbidden + operation will be caught by the validator. + </para> + + <para> + Sometimes it is desirable to write Perl functions that are not + restricted. For example, one might want a Perl function that sends + mail. To handle these cases, PL/Perl can also be installed as an + <quote>untrusted</quote> language (usually called + <application>PL/PerlU</application><indexterm><primary>PL/PerlU</primary></indexterm>). + In this case the full Perl language is available. When installing the + language, the language name <literal>plperlu</literal> will select + the untrusted PL/Perl variant. + </para> + + <para> + The writer of a <application>PL/PerlU</application> function must take care that the function + cannot be used to do anything unwanted, since it will be able to do + anything that could be done by a user logged in as the database + administrator. Note that the database system allows only database + superusers to create functions in untrusted languages. + </para> + + <para> + If the above function was created by a superuser using the language + <literal>plperlu</literal>, execution would succeed. + </para> + + <para> + In the same way, anonymous code blocks written in Perl can use + restricted operations if the language is specified as + <literal>plperlu</literal> rather than <literal>plperl</literal>, but the caller + must be a superuser. + </para> + + <note> + <para> + While <application>PL/Perl</application> functions run in a separate Perl + interpreter for each SQL role, all <application>PL/PerlU</application> functions + executed in a given session run in a single Perl interpreter (which is + not any of the ones used for <application>PL/Perl</application> functions). + This allows <application>PL/PerlU</application> functions to share data freely, + but no communication can occur between <application>PL/Perl</application> and + <application>PL/PerlU</application> functions. + </para> + </note> + + <note> + <para> + Perl cannot support multiple interpreters within one process unless + it was built with the appropriate flags, namely either + <literal>usemultiplicity</literal> or <literal>useithreads</literal>. + (<literal>usemultiplicity</literal> is preferred unless you actually need + to use threads. For more details, see the + <citerefentry><refentrytitle>perlembed</refentrytitle></citerefentry> man page.) + If <application>PL/Perl</application> is used with a copy of Perl that was not built + this way, then it is only possible to have one Perl interpreter per + session, and so any one session can only execute either + <application>PL/PerlU</application> functions, or <application>PL/Perl</application> functions + that are all called by the same SQL role. + </para> + </note> + + </sect1> + + <sect1 id="plperl-triggers"> + <title>PL/Perl Triggers</title> + + <para> + PL/Perl can be used to write trigger functions. In a trigger function, + the hash reference <varname>$_TD</varname> contains information about the + current trigger event. <varname>$_TD</varname> is a global variable, + which gets a separate local value for each invocation of the trigger. + The fields of the <varname>$_TD</varname> hash reference are: + + <variablelist> + <varlistentry> + <term><literal>$_TD->{new}{foo}</literal></term> + <listitem> + <para> + <literal>NEW</literal> value of column <literal>foo</literal> + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><literal>$_TD->{old}{foo}</literal></term> + <listitem> + <para> + <literal>OLD</literal> value of column <literal>foo</literal> + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><literal>$_TD->{name}</literal></term> + <listitem> + <para> + Name of the trigger being called + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><literal>$_TD->{event}</literal></term> + <listitem> + <para> + Trigger event: <literal>INSERT</literal>, <literal>UPDATE</literal>, + <literal>DELETE</literal>, <literal>TRUNCATE</literal>, or <literal>UNKNOWN</literal> + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><literal>$_TD->{when}</literal></term> + <listitem> + <para> + When the trigger was called: <literal>BEFORE</literal>, + <literal>AFTER</literal>, <literal>INSTEAD OF</literal>, or + <literal>UNKNOWN</literal> + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><literal>$_TD->{level}</literal></term> + <listitem> + <para> + The trigger level: <literal>ROW</literal>, <literal>STATEMENT</literal>, or <literal>UNKNOWN</literal> + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><literal>$_TD->{relid}</literal></term> + <listitem> + <para> + OID of the table on which the trigger fired + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><literal>$_TD->{table_name}</literal></term> + <listitem> + <para> + Name of the table on which the trigger fired + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><literal>$_TD->{relname}</literal></term> + <listitem> + <para> + Name of the table on which the trigger fired. This has been deprecated, + and could be removed in a future release. + Please use $_TD->{table_name} instead. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><literal>$_TD->{table_schema}</literal></term> + <listitem> + <para> + Name of the schema in which the table on which the trigger fired, is + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><literal>$_TD->{argc}</literal></term> + <listitem> + <para> + Number of arguments of the trigger function + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><literal>@{$_TD->{args}}</literal></term> + <listitem> + <para> + Arguments of the trigger function. Does not exist if <literal>$_TD->{argc}</literal> is 0. + </para> + </listitem> + </varlistentry> + + </variablelist> + </para> + + <para> + Row-level triggers can return one of the following: + + <variablelist> + <varlistentry> + <term><literal>return;</literal></term> + <listitem> + <para> + Execute the operation + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><literal>"SKIP"</literal></term> + <listitem> + <para> + Don't execute the operation + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><literal>"MODIFY"</literal></term> + <listitem> + <para> + Indicates that the <literal>NEW</literal> row was modified by + the trigger function + </para> + </listitem> + </varlistentry> + </variablelist> + </para> + + <para> + Here is an example of a trigger function, illustrating some of the + above: +<programlisting> +CREATE TABLE test ( + i int, + v varchar +); + +CREATE OR REPLACE FUNCTION valid_id() RETURNS trigger AS $$ + if (($_TD->{new}{i} >= 100) || ($_TD->{new}{i} <= 0)) { + return "SKIP"; # skip INSERT/UPDATE command + } elsif ($_TD->{new}{v} ne "immortal") { + $_TD->{new}{v} .= "(modified by trigger)"; + return "MODIFY"; # modify row and execute INSERT/UPDATE command + } else { + return; # execute INSERT/UPDATE command + } +$$ LANGUAGE plperl; + +CREATE TRIGGER test_valid_id_trig + BEFORE INSERT OR UPDATE ON test + FOR EACH ROW EXECUTE FUNCTION valid_id(); +</programlisting> + </para> + </sect1> + + <sect1 id="plperl-event-triggers"> + <title>PL/Perl Event Triggers</title> + + <para> + PL/Perl can be used to write event trigger functions. In an event trigger + function, the hash reference <varname>$_TD</varname> contains information + about the current trigger event. <varname>$_TD</varname> is a global variable, + which gets a separate local value for each invocation of the trigger. The + fields of the <varname>$_TD</varname> hash reference are: + + <variablelist> + <varlistentry> + <term><literal>$_TD->{event}</literal></term> + <listitem> + <para> + The name of the event the trigger is fired for. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><literal>$_TD->{tag}</literal></term> + <listitem> + <para> + The command tag for which the trigger is fired. + </para> + </listitem> + </varlistentry> + </variablelist> + </para> + + <para> + The return value of the trigger function is ignored. + </para> + + <para> + Here is an example of an event trigger function, illustrating some of the + above: +<programlisting> +CREATE OR REPLACE FUNCTION perlsnitch() RETURNS event_trigger AS $$ + elog(NOTICE, "perlsnitch: " . $_TD->{event} . " " . $_TD->{tag} . " "); +$$ LANGUAGE plperl; + +CREATE EVENT TRIGGER perl_a_snitch + ON ddl_command_start + EXECUTE FUNCTION perlsnitch(); +</programlisting> + </para> + </sect1> + + <sect1 id="plperl-under-the-hood"> + <title>PL/Perl Under the Hood</title> + + <sect2 id="plperl-config"> + <title>Configuration</title> + + <para> + This section lists configuration parameters that affect <application>PL/Perl</application>. + </para> + + <variablelist> + + <varlistentry id="guc-plperl-on-init" xreflabel="plperl.on_init"> + <term> + <varname>plperl.on_init</varname> (<type>string</type>) + <indexterm> + <primary><varname>plperl.on_init</varname> configuration parameter</primary> + </indexterm> + </term> + <listitem> + <para> + Specifies Perl code to be executed when a Perl interpreter is first + initialized, before it is specialized for use by <literal>plperl</literal> or + <literal>plperlu</literal>. + The SPI functions are not available when this code is executed. + If the code fails with an error it will abort the initialization of + the interpreter and propagate out to the calling query, causing the + current transaction or subtransaction to be aborted. + </para> + <para> + The Perl code is limited to a single string. Longer code can be placed + into a module and loaded by the <literal>on_init</literal> string. + Examples: +<programlisting> +plperl.on_init = 'require "plperlinit.pl"' +plperl.on_init = 'use lib "/my/app"; use MyApp::PgInit;' +</programlisting> + </para> + <para> + Any modules loaded by <literal>plperl.on_init</literal>, either directly or + indirectly, will be available for use by <literal>plperl</literal>. This may + create a security risk. To see what modules have been loaded you can use: +<programlisting> +DO 'elog(WARNING, join ", ", sort keys %INC)' LANGUAGE plperl; +</programlisting> + </para> + <para> + Initialization will happen in the postmaster if the <literal>plperl</literal> library is + included in <xref linkend="guc-shared-preload-libraries"/>, in which + case extra consideration should be given to the risk of destabilizing + the postmaster. The principal reason for making use of this feature + is that Perl modules loaded by <literal>plperl.on_init</literal> need be + loaded only at postmaster start, and will be instantly available + without loading overhead in individual database sessions. However, + keep in mind that the overhead is avoided only for the first Perl + interpreter used by a database session — either PL/PerlU, or + PL/Perl for the first SQL role that calls a PL/Perl function. Any + additional Perl interpreters created in a database session will have + to execute <literal>plperl.on_init</literal> afresh. Also, on Windows there + will be no savings whatsoever from preloading, since the Perl + interpreter created in the postmaster process does not propagate to + child processes. + </para> + <para> + This parameter can only be set in the <filename>postgresql.conf</filename> file or on the server command line. + </para> + </listitem> + </varlistentry> + + <varlistentry id="guc-plperl-on-plperl-init" xreflabel="plperl.on_plperl_init"> + <term> + <varname>plperl.on_plperl_init</varname> (<type>string</type>) + <indexterm> + <primary><varname>plperl.on_plperl_init</varname> configuration parameter</primary> + </indexterm> + </term> + <term> + <varname>plperl.on_plperlu_init</varname> (<type>string</type>) + <indexterm> + <primary><varname>plperl.on_plperlu_init</varname> configuration parameter</primary> + </indexterm> + </term> + <listitem> + <para> + These parameters specify Perl code to be executed when a Perl + interpreter is specialized for <literal>plperl</literal> or + <literal>plperlu</literal> respectively. This will happen when a PL/Perl or + PL/PerlU function is first executed in a database session, or when + an additional interpreter has to be created because the other language + is called or a PL/Perl function is called by a new SQL role. This + follows any initialization done by <literal>plperl.on_init</literal>. + The SPI functions are not available when this code is executed. + The Perl code in <literal>plperl.on_plperl_init</literal> is executed after + <quote>locking down</quote> the interpreter, and thus it can only perform + trusted operations. + </para> + <para> + If the code fails with an error it will abort the initialization and + propagate out to the calling query, causing the current transaction or + subtransaction to be aborted. Any actions already done within Perl + won't be undone; however, that interpreter won't be used again. + If the language is used again the initialization will be attempted + again within a fresh Perl interpreter. + </para> + <para> + Only superusers can change these settings. Although these settings + can be changed within a session, such changes will not affect Perl + interpreters that have already been used to execute functions. + </para> + </listitem> + </varlistentry> + + <varlistentry id="guc-plperl-use-strict" xreflabel="plperl.use_strict"> + <term> + <varname>plperl.use_strict</varname> (<type>boolean</type>) + <indexterm> + <primary><varname>plperl.use_strict</varname> configuration parameter</primary> + </indexterm> + </term> + <listitem> + <para> + When set true subsequent compilations of PL/Perl functions will have + the <literal>strict</literal> pragma enabled. This parameter does not affect + functions already compiled in the current session. + </para> + </listitem> + </varlistentry> + + </variablelist> +</sect2> + + <sect2 id="plperl-missing"> + <title>Limitations and Missing Features</title> + + <para> + The following features are currently missing from PL/Perl, but they + would make welcome contributions. + + <itemizedlist> + <listitem> + <para> + PL/Perl functions cannot call each other directly. + </para> + </listitem> + + <listitem> + <para> + SPI is not yet fully implemented. + </para> + </listitem> + + <listitem> + <para> + If you are fetching very large data sets using + <literal>spi_exec_query</literal>, you should be aware that + these will all go into memory. You can avoid this by using + <literal>spi_query</literal>/<literal>spi_fetchrow</literal> as + illustrated earlier. + </para> + <para> + A similar problem occurs if a set-returning function passes a + large set of rows back to PostgreSQL via <literal>return</literal>. You + can avoid this problem too by instead using + <literal>return_next</literal> for each row returned, as shown + previously. + </para> + </listitem> + + <listitem> + <para> + When a session ends normally, not due to a fatal error, any + <literal>END</literal> blocks that have been defined are executed. + Currently no other actions are performed. Specifically, + file handles are not automatically flushed and objects are + not automatically destroyed. + </para> + </listitem> + </itemizedlist> + </para> + </sect2> + + </sect1> + +</chapter> |