diff options
author | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-05-04 12:17:33 +0000 |
---|---|---|
committer | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-05-04 12:17:33 +0000 |
commit | 5e45211a64149b3c659b90ff2de6fa982a5a93ed (patch) | |
tree | 739caf8c461053357daa9f162bef34516c7bf452 /doc/src/sgml/html/xaggr.html | |
parent | Initial commit. (diff) | |
download | postgresql-15-5e45211a64149b3c659b90ff2de6fa982a5a93ed.tar.xz postgresql-15-5e45211a64149b3c659b90ff2de6fa982a5a93ed.zip |
Adding upstream version 15.5.upstream/15.5
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'doc/src/sgml/html/xaggr.html')
-rw-r--r-- | doc/src/sgml/html/xaggr.html | 528 |
1 files changed, 528 insertions, 0 deletions
diff --git a/doc/src/sgml/html/xaggr.html b/doc/src/sgml/html/xaggr.html new file mode 100644 index 0000000..a1d582e --- /dev/null +++ b/doc/src/sgml/html/xaggr.html @@ -0,0 +1,528 @@ +<?xml version="1.0" encoding="UTF-8" standalone="no"?> +<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>38.12. User-Defined Aggregates</title><link rel="stylesheet" type="text/css" href="stylesheet.css" /><link rev="made" href="pgsql-docs@lists.postgresql.org" /><meta name="generator" content="DocBook XSL Stylesheets Vsnapshot" /><link rel="prev" href="xfunc-optimization.html" title="38.11. Function Optimization Information" /><link rel="next" href="xtypes.html" title="38.13. User-Defined Types" /></head><body id="docContent" class="container-fluid col-10"><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="5" align="center">38.12. User-Defined Aggregates</th></tr><tr><td width="10%" align="left"><a accesskey="p" href="xfunc-optimization.html" title="38.11. Function Optimization Information">Prev</a> </td><td width="10%" align="left"><a accesskey="u" href="extend.html" title="Chapter 38. Extending SQL">Up</a></td><th width="60%" align="center">Chapter 38. Extending <acronym class="acronym">SQL</acronym></th><td width="10%" align="right"><a accesskey="h" href="index.html" title="PostgreSQL 15.5 Documentation">Home</a></td><td width="10%" align="right"> <a accesskey="n" href="xtypes.html" title="38.13. User-Defined Types">Next</a></td></tr></table><hr /></div><div class="sect1" id="XAGGR"><div class="titlepage"><div><div><h2 class="title" style="clear: both">38.12. User-Defined Aggregates</h2></div></div></div><div class="toc"><dl class="toc"><dt><span class="sect2"><a href="xaggr.html#XAGGR-MOVING-AGGREGATES">38.12.1. Moving-Aggregate Mode</a></span></dt><dt><span class="sect2"><a href="xaggr.html#XAGGR-POLYMORPHIC-AGGREGATES">38.12.2. Polymorphic and Variadic Aggregates</a></span></dt><dt><span class="sect2"><a href="xaggr.html#XAGGR-ORDERED-SET-AGGREGATES">38.12.3. Ordered-Set Aggregates</a></span></dt><dt><span class="sect2"><a href="xaggr.html#XAGGR-PARTIAL-AGGREGATES">38.12.4. Partial Aggregation</a></span></dt><dt><span class="sect2"><a href="xaggr.html#XAGGR-SUPPORT-FUNCTIONS">38.12.5. Support Functions for Aggregates</a></span></dt></dl></div><a id="id-1.8.3.15.2" class="indexterm"></a><p> + Aggregate functions in <span class="productname">PostgreSQL</span> + are defined in terms of <em class="firstterm">state values</em> + and <em class="firstterm">state transition functions</em>. + That is, an aggregate operates using a state value that is updated + as each successive input row is processed. + To define a new aggregate + function, one selects a data type for the state value, + an initial value for the state, and a state transition + function. The state transition function takes the previous state + value and the aggregate's input value(s) for the current row, and + returns a new state value. + A <em class="firstterm">final function</em> + can also be specified, in case the desired result of the aggregate + is different from the data that needs to be kept in the running + state value. The final function takes the ending state value + and returns whatever is wanted as the aggregate result. + In principle, the transition and final functions are just ordinary + functions that could also be used outside the context of the + aggregate. (In practice, it's often helpful for performance reasons + to create specialized transition functions that can only work when + called as part of an aggregate.) + </p><p> + Thus, in addition to the argument and result data types seen by a user + of the aggregate, there is an internal state-value data type that + might be different from both the argument and result types. + </p><p> + If we define an aggregate that does not use a final function, + we have an aggregate that computes a running function of + the column values from each row. <code class="function">sum</code> is an + example of this kind of aggregate. <code class="function">sum</code> starts at + zero and always adds the current row's value to + its running total. For example, if we want to make a <code class="function">sum</code> + aggregate to work on a data type for complex numbers, + we only need the addition function for that data type. + The aggregate definition would be: + +</p><pre class="programlisting"> +CREATE AGGREGATE sum (complex) +( + sfunc = complex_add, + stype = complex, + initcond = '(0,0)' +); +</pre><p> + + which we might use like this: + +</p><pre class="programlisting"> +SELECT sum(a) FROM test_complex; + + sum +----------- + (34,53.9) +</pre><p> + + (Notice that we are relying on function overloading: there is more than + one aggregate named <code class="function">sum</code>, but + <span class="productname">PostgreSQL</span> can figure out which kind + of sum applies to a column of type <code class="type">complex</code>.) + </p><p> + The above definition of <code class="function">sum</code> will return zero + (the initial state value) if there are no nonnull input values. + Perhaps we want to return null in that case instead — the SQL standard + expects <code class="function">sum</code> to behave that way. We can do this simply by + omitting the <code class="literal">initcond</code> phrase, so that the initial state + value is null. Ordinarily this would mean that the <code class="literal">sfunc</code> + would need to check for a null state-value input. But for + <code class="function">sum</code> and some other simple aggregates like + <code class="function">max</code> and <code class="function">min</code>, + it is sufficient to insert the first nonnull input value into + the state variable and then start applying the transition function + at the second nonnull input value. <span class="productname">PostgreSQL</span> + will do that automatically if the initial state value is null and + the transition function is marked <span class="quote">“<span class="quote">strict</span>”</span> (i.e., not to be called + for null inputs). + </p><p> + Another bit of default behavior for a <span class="quote">“<span class="quote">strict</span>”</span> transition function + is that the previous state value is retained unchanged whenever a + null input value is encountered. Thus, null values are ignored. If you + need some other behavior for null inputs, do not declare your + transition function as strict; instead code it to test for null inputs and + do whatever is needed. + </p><p> + <code class="function">avg</code> (average) is a more complex example of an aggregate. + It requires + two pieces of running state: the sum of the inputs and the count + of the number of inputs. The final result is obtained by dividing + these quantities. Average is typically implemented by using an + array as the state value. For example, + the built-in implementation of <code class="function">avg(float8)</code> + looks like: + +</p><pre class="programlisting"> +CREATE AGGREGATE avg (float8) +( + sfunc = float8_accum, + stype = float8[], + finalfunc = float8_avg, + initcond = '{0,0,0}' +); +</pre><p> + </p><div class="note"><h3 class="title">Note</h3><p> + <code class="function">float8_accum</code> requires a three-element array, not just + two elements, because it accumulates the sum of squares as well as + the sum and count of the inputs. This is so that it can be used for + some other aggregates as well as <code class="function">avg</code>. + </p></div><p> + Aggregate function calls in SQL allow <code class="literal">DISTINCT</code> + and <code class="literal">ORDER BY</code> options that control which rows are fed + to the aggregate's transition function and in what order. These + options are implemented behind the scenes and are not the concern + of the aggregate's support functions. + </p><p> + For further details see the + <a class="xref" href="sql-createaggregate.html" title="CREATE AGGREGATE"><span class="refentrytitle">CREATE AGGREGATE</span></a> + command. + </p><div class="sect2" id="XAGGR-MOVING-AGGREGATES"><div class="titlepage"><div><div><h3 class="title">38.12.1. Moving-Aggregate Mode</h3></div></div></div><a id="id-1.8.3.15.12.2" class="indexterm"></a><a id="id-1.8.3.15.12.3" class="indexterm"></a><p> + Aggregate functions can optionally support <em class="firstterm">moving-aggregate + mode</em>, which allows substantially faster execution of aggregate + functions within windows with moving frame starting points. + (See <a class="xref" href="tutorial-window.html" title="3.5. Window Functions">Section 3.5</a> + and <a class="xref" href="sql-expressions.html#SYNTAX-WINDOW-FUNCTIONS" title="4.2.8. Window Function Calls">Section 4.2.8</a> for information about use of + aggregate functions as window functions.) + The basic idea is that in addition to a normal <span class="quote">“<span class="quote">forward</span>”</span> + transition function, the aggregate provides an <em class="firstterm">inverse + transition function</em>, which allows rows to be removed from the + aggregate's running state value when they exit the window frame. + For example a <code class="function">sum</code> aggregate, which uses addition as the + forward transition function, would use subtraction as the inverse + transition function. Without an inverse transition function, the window + function mechanism must recalculate the aggregate from scratch each time + the frame starting point moves, resulting in run time proportional to the + number of input rows times the average frame length. With an inverse + transition function, the run time is only proportional to the number of + input rows. + </p><p> + The inverse transition function is passed the current state value and the + aggregate input value(s) for the earliest row included in the current + state. It must reconstruct what the state value would have been if the + given input row had never been aggregated, but only the rows following + it. This sometimes requires that the forward transition function keep + more state than is needed for plain aggregation mode. Therefore, the + moving-aggregate mode uses a completely separate implementation from the + plain mode: it has its own state data type, its own forward transition + function, and its own final function if needed. These can be the same as + the plain mode's data type and functions, if there is no need for extra + state. + </p><p> + As an example, we could extend the <code class="function">sum</code> aggregate given above + to support moving-aggregate mode like this: + +</p><pre class="programlisting"> +CREATE AGGREGATE sum (complex) +( + sfunc = complex_add, + stype = complex, + initcond = '(0,0)', + msfunc = complex_add, + minvfunc = complex_sub, + mstype = complex, + minitcond = '(0,0)' +); +</pre><p> + + The parameters whose names begin with <code class="literal">m</code> define the + moving-aggregate implementation. Except for the inverse transition + function <code class="literal">minvfunc</code>, they correspond to the plain-aggregate + parameters without <code class="literal">m</code>. + </p><p> + The forward transition function for moving-aggregate mode is not allowed + to return null as the new state value. If the inverse transition + function returns null, this is taken as an indication that the inverse + function cannot reverse the state calculation for this particular input, + and so the aggregate calculation will be redone from scratch for the + current frame starting position. This convention allows moving-aggregate + mode to be used in situations where there are some infrequent cases that + are impractical to reverse out of the running state value. The inverse + transition function can <span class="quote">“<span class="quote">punt</span>”</span> on these cases, and yet still come + out ahead so long as it can work for most cases. As an example, an + aggregate working with floating-point numbers might choose to punt when + a <code class="literal">NaN</code> (not a number) input has to be removed from the running + state value. + </p><p> + When writing moving-aggregate support functions, it is important to be + sure that the inverse transition function can reconstruct the correct + state value exactly. Otherwise there might be user-visible differences + in results depending on whether the moving-aggregate mode is used. + An example of an aggregate for which adding an inverse transition + function seems easy at first, yet where this requirement cannot be met + is <code class="function">sum</code> over <code class="type">float4</code> or <code class="type">float8</code> inputs. A + naive declaration of <code class="function">sum(<code class="type">float8</code>)</code> could be + +</p><pre class="programlisting"> +CREATE AGGREGATE unsafe_sum (float8) +( + stype = float8, + sfunc = float8pl, + mstype = float8, + msfunc = float8pl, + minvfunc = float8mi +); +</pre><p> + + This aggregate, however, can give wildly different results than it would + have without the inverse transition function. For example, consider + +</p><pre class="programlisting"> +SELECT + unsafe_sum(x) OVER (ORDER BY n ROWS BETWEEN CURRENT ROW AND 1 FOLLOWING) +FROM (VALUES (1, 1.0e20::float8), + (2, 1.0::float8)) AS v (n,x); +</pre><p> + + This query returns <code class="literal">0</code> as its second result, rather than the + expected answer of <code class="literal">1</code>. The cause is the limited precision of + floating-point values: adding <code class="literal">1</code> to <code class="literal">1e20</code> results + in <code class="literal">1e20</code> again, and so subtracting <code class="literal">1e20</code> from that + yields <code class="literal">0</code>, not <code class="literal">1</code>. Note that this is a limitation + of floating-point arithmetic in general, not a limitation + of <span class="productname">PostgreSQL</span>. + </p></div><div class="sect2" id="XAGGR-POLYMORPHIC-AGGREGATES"><div class="titlepage"><div><div><h3 class="title">38.12.2. Polymorphic and Variadic Aggregates</h3></div></div></div><a id="id-1.8.3.15.13.2" class="indexterm"></a><a id="id-1.8.3.15.13.3" class="indexterm"></a><p> + Aggregate functions can use polymorphic + state transition functions or final functions, so that the same functions + can be used to implement multiple aggregates. + See <a class="xref" href="extend-type-system.html#EXTEND-TYPES-POLYMORPHIC" title="38.2.5. Polymorphic Types">Section 38.2.5</a> + for an explanation of polymorphic functions. + Going a step further, the aggregate function itself can be specified + with polymorphic input type(s) and state type, allowing a single + aggregate definition to serve for multiple input data types. + Here is an example of a polymorphic aggregate: + +</p><pre class="programlisting"> +CREATE AGGREGATE array_accum (anycompatible) +( + sfunc = array_append, + stype = anycompatiblearray, + initcond = '{}' +); +</pre><p> + + Here, the actual state type for any given aggregate call is the array type + having the actual input type as elements. The behavior of the aggregate + is to concatenate all the inputs into an array of that type. + (Note: the built-in aggregate <code class="function">array_agg</code> provides similar + functionality, with better performance than this definition would have.) + </p><p> + Here's the output using two different actual data types as arguments: + +</p><pre class="programlisting"> +SELECT attrelid::regclass, array_accum(attname) + FROM pg_attribute + WHERE attnum > 0 AND attrelid = 'pg_tablespace'::regclass + GROUP BY attrelid; + + attrelid | array_accum +---------------+--------------------------------------- + pg_tablespace | {spcname,spcowner,spcacl,spcoptions} +(1 row) + +SELECT attrelid::regclass, array_accum(atttypid::regtype) + FROM pg_attribute + WHERE attnum > 0 AND attrelid = 'pg_tablespace'::regclass + GROUP BY attrelid; + + attrelid | array_accum +---------------+--------------------------- + pg_tablespace | {name,oid,aclitem[],text[]} +(1 row) +</pre><p> + </p><p> + Ordinarily, an aggregate function with a polymorphic result type has a + polymorphic state type, as in the above example. This is necessary + because otherwise the final function cannot be declared sensibly: it + would need to have a polymorphic result type but no polymorphic argument + type, which <code class="command">CREATE FUNCTION</code> will reject on the grounds that + the result type cannot be deduced from a call. But sometimes it is + inconvenient to use a polymorphic state type. The most common case is + where the aggregate support functions are to be written in C and the + state type should be declared as <code class="type">internal</code> because there is + no SQL-level equivalent for it. To address this case, it is possible to + declare the final function as taking extra <span class="quote">“<span class="quote">dummy</span>”</span> arguments + that match the input arguments of the aggregate. Such dummy arguments + are always passed as null values since no specific value is available when the + final function is called. Their only use is to allow a polymorphic + final function's result type to be connected to the aggregate's input + type(s). For example, the definition of the built-in + aggregate <code class="function">array_agg</code> is equivalent to + +</p><pre class="programlisting"> +CREATE FUNCTION array_agg_transfn(internal, anynonarray) + RETURNS internal ...; +CREATE FUNCTION array_agg_finalfn(internal, anynonarray) + RETURNS anyarray ...; + +CREATE AGGREGATE array_agg (anynonarray) +( + sfunc = array_agg_transfn, + stype = internal, + finalfunc = array_agg_finalfn, + finalfunc_extra +); +</pre><p> + + Here, the <code class="literal">finalfunc_extra</code> option specifies that the final + function receives, in addition to the state value, extra dummy + argument(s) corresponding to the aggregate's input argument(s). + The extra <code class="type">anynonarray</code> argument allows the declaration + of <code class="function">array_agg_finalfn</code> to be valid. + </p><p> + An aggregate function can be made to accept a varying number of arguments + by declaring its last argument as a <code class="literal">VARIADIC</code> array, in much + the same fashion as for regular functions; see + <a class="xref" href="xfunc-sql.html#XFUNC-SQL-VARIADIC-FUNCTIONS" title="38.5.6. SQL Functions with Variable Numbers of Arguments">Section 38.5.6</a>. The aggregate's transition + function(s) must have the same array type as their last argument. The + transition function(s) typically would also be marked <code class="literal">VARIADIC</code>, + but this is not strictly required. + </p><div class="note"><h3 class="title">Note</h3><p> + Variadic aggregates are easily misused in connection with + the <code class="literal">ORDER BY</code> option (see <a class="xref" href="sql-expressions.html#SYNTAX-AGGREGATES" title="4.2.7. Aggregate Expressions">Section 4.2.7</a>), + since the parser cannot tell whether the wrong number of actual arguments + have been given in such a combination. Keep in mind that everything to + the right of <code class="literal">ORDER BY</code> is a sort key, not an argument to the + aggregate. For example, in +</p><pre class="programlisting"> +SELECT myaggregate(a ORDER BY a, b, c) FROM ... +</pre><p> + the parser will see this as a single aggregate function argument and + three sort keys. However, the user might have intended +</p><pre class="programlisting"> +SELECT myaggregate(a, b, c ORDER BY a) FROM ... +</pre><p> + If <code class="literal">myaggregate</code> is variadic, both these calls could be + perfectly valid. + </p><p> + For the same reason, it's wise to think twice before creating aggregate + functions with the same names and different numbers of regular arguments. + </p></div></div><div class="sect2" id="XAGGR-ORDERED-SET-AGGREGATES"><div class="titlepage"><div><div><h3 class="title">38.12.3. Ordered-Set Aggregates</h3></div></div></div><a id="id-1.8.3.15.14.2" class="indexterm"></a><p> + The aggregates we have been describing so far are <span class="quote">“<span class="quote">normal</span>”</span> + aggregates. <span class="productname">PostgreSQL</span> also + supports <em class="firstterm">ordered-set aggregates</em>, which differ from + normal aggregates in two key ways. First, in addition to ordinary + aggregated arguments that are evaluated once per input row, an + ordered-set aggregate can have <span class="quote">“<span class="quote">direct</span>”</span> arguments that are + evaluated only once per aggregation operation. Second, the syntax + for the ordinary aggregated arguments specifies a sort ordering + for them explicitly. An ordered-set aggregate is usually + used to implement a computation that depends on a specific row + ordering, for instance rank or percentile, so that the sort ordering + is a required aspect of any call. For example, the built-in + definition of <code class="function">percentile_disc</code> is equivalent to: + +</p><pre class="programlisting"> +CREATE FUNCTION ordered_set_transition(internal, anyelement) + RETURNS internal ...; +CREATE FUNCTION percentile_disc_final(internal, float8, anyelement) + RETURNS anyelement ...; + +CREATE AGGREGATE percentile_disc (float8 ORDER BY anyelement) +( + sfunc = ordered_set_transition, + stype = internal, + finalfunc = percentile_disc_final, + finalfunc_extra +); +</pre><p> + + This aggregate takes a <code class="type">float8</code> direct argument (the percentile + fraction) and an aggregated input that can be of any sortable data type. + It could be used to obtain a median household income like this: + +</p><pre class="programlisting"> +SELECT percentile_disc(0.5) WITHIN GROUP (ORDER BY income) FROM households; + percentile_disc +----------------- + 50489 +</pre><p> + + Here, <code class="literal">0.5</code> is a direct argument; it would make no sense + for the percentile fraction to be a value varying across rows. + </p><p> + Unlike the case for normal aggregates, the sorting of input rows for + an ordered-set aggregate is <span class="emphasis"><em>not</em></span> done behind the scenes, + but is the responsibility of the aggregate's support functions. + The typical implementation approach is to keep a reference to + a <span class="quote">“<span class="quote">tuplesort</span>”</span> object in the aggregate's state value, feed the + incoming rows into that object, and then complete the sorting and + read out the data in the final function. This design allows the + final function to perform special operations such as injecting + additional <span class="quote">“<span class="quote">hypothetical</span>”</span> rows into the data to be sorted. + While normal aggregates can often be implemented with support + functions written in <span class="application">PL/pgSQL</span> or another + PL language, ordered-set aggregates generally have to be written in + C, since their state values aren't definable as any SQL data type. + (In the above example, notice that the state value is declared as + type <code class="type">internal</code> — this is typical.) + Also, because the final function performs the sort, it is not possible + to continue adding input rows by executing the transition function again + later. This means the final function is not <code class="literal">READ_ONLY</code>; + it must be declared in <a class="link" href="sql-createaggregate.html" title="CREATE AGGREGATE"><code class="command">CREATE AGGREGATE</code></a> + as <code class="literal">READ_WRITE</code>, or as <code class="literal">SHAREABLE</code> if + it's possible for additional final-function calls to make use of the + already-sorted state. + </p><p> + The state transition function for an ordered-set aggregate receives + the current state value plus the aggregated input values for + each row, and returns the updated state value. This is the + same definition as for normal aggregates, but note that the direct + arguments (if any) are not provided. The final function receives + the last state value, the values of the direct arguments if any, + and (if <code class="literal">finalfunc_extra</code> is specified) null values + corresponding to the aggregated input(s). As with normal + aggregates, <code class="literal">finalfunc_extra</code> is only really useful if the + aggregate is polymorphic; then the extra dummy argument(s) are needed + to connect the final function's result type to the aggregate's input + type(s). + </p><p> + Currently, ordered-set aggregates cannot be used as window functions, + and therefore there is no need for them to support moving-aggregate mode. + </p></div><div class="sect2" id="XAGGR-PARTIAL-AGGREGATES"><div class="titlepage"><div><div><h3 class="title">38.12.4. Partial Aggregation</h3></div></div></div><a id="id-1.8.3.15.15.2" class="indexterm"></a><p> + Optionally, an aggregate function can support <em class="firstterm">partial + aggregation</em>. The idea of partial aggregation is to run the aggregate's + state transition function over different subsets of the input data + independently, and then to combine the state values resulting from those + subsets to produce the same state value that would have resulted from + scanning all the input in a single operation. This mode can be used for + parallel aggregation by having different worker processes scan different + portions of a table. Each worker produces a partial state value, and at + the end those state values are combined to produce a final state value. + (In the future this mode might also be used for purposes such as combining + aggregations over local and remote tables; but that is not implemented + yet.) + </p><p> + To support partial aggregation, the aggregate definition must provide + a <em class="firstterm">combine function</em>, which takes two values of the + aggregate's state type (representing the results of aggregating over two + subsets of the input rows) and produces a new value of the state type, + representing what the state would have been after aggregating over the + combination of those sets of rows. It is unspecified what the relative + order of the input rows from the two sets would have been. This means + that it's usually impossible to define a useful combine function for + aggregates that are sensitive to input row order. + </p><p> + As simple examples, <code class="literal">MAX</code> and <code class="literal">MIN</code> aggregates can be + made to support partial aggregation by specifying the combine function as + the same greater-of-two or lesser-of-two comparison function that is used + as their transition function. <code class="literal">SUM</code> aggregates just need an + addition function as combine function. (Again, this is the same as their + transition function, unless the state value is wider than the input data + type.) + </p><p> + The combine function is treated much like a transition function that + happens to take a value of the state type, not of the underlying input + type, as its second argument. In particular, the rules for dealing + with null values and strict functions are similar. Also, if the aggregate + definition specifies a non-null <code class="literal">initcond</code>, keep in mind that + that will be used not only as the initial state for each partial + aggregation run, but also as the initial state for the combine function, + which will be called to combine each partial result into that state. + </p><p> + If the aggregate's state type is declared as <code class="type">internal</code>, it is + the combine function's responsibility that its result is allocated in + the correct memory context for aggregate state values. This means in + particular that when the first input is <code class="literal">NULL</code> it's invalid + to simply return the second input, as that value will be in the wrong + context and will not have sufficient lifespan. + </p><p> + When the aggregate's state type is declared as <code class="type">internal</code>, it is + usually also appropriate for the aggregate definition to provide a + <em class="firstterm">serialization function</em> and a <em class="firstterm">deserialization + function</em>, which allow such a state value to be copied from one process + to another. Without these functions, parallel aggregation cannot be + performed, and future applications such as local/remote aggregation will + probably not work either. + </p><p> + A serialization function must take a single argument of + type <code class="type">internal</code> and return a result of type <code class="type">bytea</code>, which + represents the state value packaged up into a flat blob of bytes. + Conversely, a deserialization function reverses that conversion. It must + take two arguments of types <code class="type">bytea</code> and <code class="type">internal</code>, and + return a result of type <code class="type">internal</code>. (The second argument is unused + and is always zero, but it is required for type-safety reasons.) The + result of the deserialization function should simply be allocated in the + current memory context, as unlike the combine function's result, it is not + long-lived. + </p><p> + Worth noting also is that for an aggregate to be executed in parallel, + the aggregate itself must be marked <code class="literal">PARALLEL SAFE</code>. The + parallel-safety markings on its support functions are not consulted. + </p></div><div class="sect2" id="XAGGR-SUPPORT-FUNCTIONS"><div class="titlepage"><div><div><h3 class="title">38.12.5. Support Functions for Aggregates</h3></div></div></div><a id="id-1.8.3.15.16.2" class="indexterm"></a><p> + A function written in C can detect that it is being called as an + aggregate support function by calling + <code class="function">AggCheckCallContext</code>, for example: +</p><pre class="programlisting"> +if (AggCheckCallContext(fcinfo, NULL)) +</pre><p> + One reason for checking this is that when it is true, the first input + must be a temporary state value and can therefore safely be modified + in-place rather than allocating a new copy. + See <code class="function">int8inc()</code> for an example. + (While aggregate transition functions are always allowed to modify + the transition value in-place, aggregate final functions are generally + discouraged from doing so; if they do so, the behavior must be declared + when creating the aggregate. See <a class="xref" href="sql-createaggregate.html" title="CREATE AGGREGATE"><span class="refentrytitle">CREATE AGGREGATE</span></a> + for more detail.) + </p><p> + The second argument of <code class="function">AggCheckCallContext</code> can be used to + retrieve the memory context in which aggregate state values are being kept. + This is useful for transition functions that wish to use <span class="quote">“<span class="quote">expanded</span>”</span> + objects (see <a class="xref" href="xtypes.html#XTYPES-TOAST" title="38.13.1. TOAST Considerations">Section 38.13.1</a>) as their state values. + On first call, the transition function should return an expanded object + whose memory context is a child of the aggregate state context, and then + keep returning the same expanded object on subsequent calls. See + <code class="function">array_append()</code> for an example. (<code class="function">array_append()</code> + is not the transition function of any built-in aggregate, but it is written + to behave efficiently when used as transition function of a custom + aggregate.) + </p><p> + Another support routine available to aggregate functions written in C + is <code class="function">AggGetAggref</code>, which returns the <code class="literal">Aggref</code> + parse node that defines the aggregate call. This is mainly useful + for ordered-set aggregates, which can inspect the substructure of + the <code class="literal">Aggref</code> node to find out what sort ordering they are + supposed to implement. Examples can be found + in <code class="filename">orderedsetaggs.c</code> in the <span class="productname">PostgreSQL</span> + source code. + </p></div></div><div class="navfooter"><hr /><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="xfunc-optimization.html" title="38.11. Function Optimization Information">Prev</a> </td><td width="20%" align="center"><a accesskey="u" href="extend.html" title="Chapter 38. Extending SQL">Up</a></td><td width="40%" align="right"> <a accesskey="n" href="xtypes.html" title="38.13. User-Defined Types">Next</a></td></tr><tr><td width="40%" align="left" valign="top">38.11. Function Optimization Information </td><td width="20%" align="center"><a accesskey="h" href="index.html" title="PostgreSQL 15.5 Documentation">Home</a></td><td width="40%" align="right" valign="top"> 38.13. User-Defined Types</td></tr></table></div></body></html>
\ No newline at end of file |