38.12. User-Defined Aggregates

38.12. User-Defined Aggregates
Prev	Up	Chapter 38. Extending SQL	Home	Next

38.12. User-Defined Aggregates #

38.12.1. Moving-Aggregate Mode
38.12.2. Polymorphic and Variadic Aggregates
38.12.3. Ordered-Set Aggregates
38.12.4. Partial Aggregation
38.12.5. Support Functions for Aggregates

+ Aggregate functions in PostgreSQL + are defined in terms of state values + and state transition functions. + That is, an aggregate operates using a state value that is updated + as each successive input row is processed. + To define a new aggregate + function, one selects a data type for the state value, + an initial value for the state, and a state transition + function. The state transition function takes the previous state + value and the aggregate's input value(s) for the current row, and + returns a new state value. + A final function + can also be specified, in case the desired result of the aggregate + is different from the data that needs to be kept in the running + state value. The final function takes the ending state value + and returns whatever is wanted as the aggregate result. + In principle, the transition and final functions are just ordinary + functions that could also be used outside the context of the + aggregate. (In practice, it's often helpful for performance reasons + to create specialized transition functions that can only work when + called as part of an aggregate.) +

+ Thus, in addition to the argument and result data types seen by a user + of the aggregate, there is an internal state-value data type that + might be different from both the argument and result types. +

+ If we define an aggregate that does not use a final function, + we have an aggregate that computes a running function of + the column values from each row. sum is an + example of this kind of aggregate. sum starts at + zero and always adds the current row's value to + its running total. For example, if we want to make a sum + aggregate to work on a data type for complex numbers, + we only need the addition function for that data type. + The aggregate definition would be: + +

+CREATE AGGREGATE sum (complex)
+(
+    sfunc = complex_add,
+    stype = complex,
+    initcond = '(0,0)'
+);
+

+ + which we might use like this: + +

+SELECT sum(a) FROM test_complex;
+
+   sum
+-----------
+ (34,53.9)
+

+ + (Notice that we are relying on function overloading: there is more than + one aggregate named sum, but + PostgreSQL can figure out which kind + of sum applies to a column of type complex.) +

+ The above definition of sum will return zero + (the initial state value) if there are no nonnull input values. + Perhaps we want to return null in that case instead — the SQL standard + expects sum to behave that way. We can do this simply by + omitting the initcond phrase, so that the initial state + value is null. Ordinarily this would mean that the sfunc + would need to check for a null state-value input. But for + sum and some other simple aggregates like + max and min, + it is sufficient to insert the first nonnull input value into + the state variable and then start applying the transition function + at the second nonnull input value. PostgreSQL + will do that automatically if the initial state value is null and + the transition function is marked “strict” (i.e., not to be called + for null inputs). +

+ Another bit of default behavior for a “strict” transition function + is that the previous state value is retained unchanged whenever a + null input value is encountered. Thus, null values are ignored. If you + need some other behavior for null inputs, do not declare your + transition function as strict; instead code it to test for null inputs and + do whatever is needed. +

+ avg (average) is a more complex example of an aggregate. + It requires + two pieces of running state: the sum of the inputs and the count + of the number of inputs. The final result is obtained by dividing + these quantities. Average is typically implemented by using an + array as the state value. For example, + the built-in implementation of avg(float8) + looks like: + +

+CREATE AGGREGATE avg (float8)
+(
+    sfunc = float8_accum,
+    stype = float8[],
+    finalfunc = float8_avg,
+    initcond = '{0,0,0}'
+);
+

Note

+ float8_accum requires a three-element array, not just + two elements, because it accumulates the sum of squares as well as + the sum and count of the inputs. This is so that it can be used for + some other aggregates as well as avg. +

+ Aggregate function calls in SQL allow DISTINCT + and ORDER BY options that control which rows are fed + to the aggregate's transition function and in what order. These + options are implemented behind the scenes and are not the concern + of the aggregate's support functions. +

+ For further details see the + CREATE AGGREGATE + command. +

38.12.1. Moving-Aggregate Mode #

+ Aggregate functions can optionally support moving-aggregate + mode, which allows substantially faster execution of aggregate + functions within windows with moving frame starting points. + (See Section 3.5 + and Section 4.2.8 for information about use of + aggregate functions as window functions.) + The basic idea is that in addition to a normal “forward” + transition function, the aggregate provides an inverse + transition function, which allows rows to be removed from the + aggregate's running state value when they exit the window frame. + For example a sum aggregate, which uses addition as the + forward transition function, would use subtraction as the inverse + transition function. Without an inverse transition function, the window + function mechanism must recalculate the aggregate from scratch each time + the frame starting point moves, resulting in run time proportional to the + number of input rows times the average frame length. With an inverse + transition function, the run time is only proportional to the number of + input rows. +

+ The inverse transition function is passed the current state value and the + aggregate input value(s) for the earliest row included in the current + state. It must reconstruct what the state value would have been if the + given input row had never been aggregated, but only the rows following + it. This sometimes requires that the forward transition function keep + more state than is needed for plain aggregation mode. Therefore, the + moving-aggregate mode uses a completely separate implementation from the + plain mode: it has its own state data type, its own forward transition + function, and its own final function if needed. These can be the same as + the plain mode's data type and functions, if there is no need for extra + state. +

+ As an example, we could extend the sum aggregate given above + to support moving-aggregate mode like this: + +

+CREATE AGGREGATE sum (complex)
+(
+    sfunc = complex_add,
+    stype = complex,
+    initcond = '(0,0)',
+    msfunc = complex_add,
+    minvfunc = complex_sub,
+    mstype = complex,
+    minitcond = '(0,0)'
+);
+

+ + The parameters whose names begin with m define the + moving-aggregate implementation. Except for the inverse transition + function minvfunc, they correspond to the plain-aggregate + parameters without m. +

+ The forward transition function for moving-aggregate mode is not allowed + to return null as the new state value. If the inverse transition + function returns null, this is taken as an indication that the inverse + function cannot reverse the state calculation for this particular input, + and so the aggregate calculation will be redone from scratch for the + current frame starting position. This convention allows moving-aggregate + mode to be used in situations where there are some infrequent cases that + are impractical to reverse out of the running state value. The inverse + transition function can “punt” on these cases, and yet still come + out ahead so long as it can work for most cases. As an example, an + aggregate working with floating-point numbers might choose to punt when + a NaN (not a number) input has to be removed from the running + state value. +

+ When writing moving-aggregate support functions, it is important to be + sure that the inverse transition function can reconstruct the correct + state value exactly. Otherwise there might be user-visible differences + in results depending on whether the moving-aggregate mode is used. + An example of an aggregate for which adding an inverse transition + function seems easy at first, yet where this requirement cannot be met + is sum over float4 or float8 inputs. A + naive declaration of sum(float8) could be + +

+CREATE AGGREGATE unsafe_sum (float8)
+(
+    stype = float8,
+    sfunc = float8pl,
+    mstype = float8,
+    msfunc = float8pl,
+    minvfunc = float8mi
+);
+

+ + This aggregate, however, can give wildly different results than it would + have without the inverse transition function. For example, consider + +

+SELECT
+  unsafe_sum(x) OVER (ORDER BY n ROWS BETWEEN CURRENT ROW AND 1 FOLLOWING)
+FROM (VALUES (1, 1.0e20::float8),
+             (2, 1.0::float8)) AS v (n,x);
+

+ + This query returns 0 as its second result, rather than the + expected answer of 1. The cause is the limited precision of + floating-point values: adding 1 to 1e20 results + in 1e20 again, and so subtracting 1e20 from that + yields 0, not 1. Note that this is a limitation + of floating-point arithmetic in general, not a limitation + of PostgreSQL. +

38.12.2. Polymorphic and Variadic Aggregates #

+ Aggregate functions can use polymorphic + state transition functions or final functions, so that the same functions + can be used to implement multiple aggregates. + See Section 38.2.5 + for an explanation of polymorphic functions. + Going a step further, the aggregate function itself can be specified + with polymorphic input type(s) and state type, allowing a single + aggregate definition to serve for multiple input data types. + Here is an example of a polymorphic aggregate: + +

+CREATE AGGREGATE array_accum (anycompatible)
+(
+    sfunc = array_append,
+    stype = anycompatiblearray,
+    initcond = '{}'
+);
+

+ + Here, the actual state type for any given aggregate call is the array type + having the actual input type as elements. The behavior of the aggregate + is to concatenate all the inputs into an array of that type. + (Note: the built-in aggregate array_agg provides similar + functionality, with better performance than this definition would have.) +

+ Here's the output using two different actual data types as arguments: + +

+SELECT attrelid::regclass, array_accum(attname)
+    FROM pg_attribute
+    WHERE attnum > 0 AND attrelid = 'pg_tablespace'::regclass
+    GROUP BY attrelid;
+
+   attrelid    |              array_accum
+---------------+---------------------------------------
+ pg_tablespace | {spcname,spcowner,spcacl,spcoptions}
+(1 row)
+
+SELECT attrelid::regclass, array_accum(atttypid::regtype)
+    FROM pg_attribute
+    WHERE attnum > 0 AND attrelid = 'pg_tablespace'::regclass
+    GROUP BY attrelid;
+
+   attrelid    |        array_accum
+---------------+---------------------------
+ pg_tablespace | {name,oid,aclitem[],text[]}
+(1 row)
+

+ Ordinarily, an aggregate function with a polymorphic result type has a + polymorphic state type, as in the above example. This is necessary + because otherwise the final function cannot be declared sensibly: it + would need to have a polymorphic result type but no polymorphic argument + type, which CREATE FUNCTION will reject on the grounds that + the result type cannot be deduced from a call. But sometimes it is + inconvenient to use a polymorphic state type. The most common case is + where the aggregate support functions are to be written in C and the + state type should be declared as internal because there is + no SQL-level equivalent for it. To address this case, it is possible to + declare the final function as taking extra “dummy” arguments + that match the input arguments of the aggregate. Such dummy arguments + are always passed as null values since no specific value is available when the + final function is called. Their only use is to allow a polymorphic + final function's result type to be connected to the aggregate's input + type(s). For example, the definition of the built-in + aggregate array_agg is equivalent to + +

+CREATE FUNCTION array_agg_transfn(internal, anynonarray)
+  RETURNS internal ...;
+CREATE FUNCTION array_agg_finalfn(internal, anynonarray)
+  RETURNS anyarray ...;
+
+CREATE AGGREGATE array_agg (anynonarray)
+(
+    sfunc = array_agg_transfn,
+    stype = internal,
+    finalfunc = array_agg_finalfn,
+    finalfunc_extra
+);
+

+ + Here, the finalfunc_extra option specifies that the final + function receives, in addition to the state value, extra dummy + argument(s) corresponding to the aggregate's input argument(s). + The extra anynonarray argument allows the declaration + of array_agg_finalfn to be valid. +

+ An aggregate function can be made to accept a varying number of arguments + by declaring its last argument as a VARIADIC array, in much + the same fashion as for regular functions; see + Section 38.5.6. The aggregate's transition + function(s) must have the same array type as their last argument. The + transition function(s) typically would also be marked VARIADIC, + but this is not strictly required. +

Note

+ Variadic aggregates are easily misused in connection with + the ORDER BY option (see Section 4.2.7), + since the parser cannot tell whether the wrong number of actual arguments + have been given in such a combination. Keep in mind that everything to + the right of ORDER BY is a sort key, not an argument to the + aggregate. For example, in +

+SELECT myaggregate(a ORDER BY a, b, c) FROM ...
+

+ the parser will see this as a single aggregate function argument and + three sort keys. However, the user might have intended +

+SELECT myaggregate(a, b, c ORDER BY a) FROM ...
+

+ If myaggregate is variadic, both these calls could be + perfectly valid. +

+ For the same reason, it's wise to think twice before creating aggregate + functions with the same names and different numbers of regular arguments. +

38.12.3. Ordered-Set Aggregates #

+ The aggregates we have been describing so far are “normal” + aggregates. PostgreSQL also + supports ordered-set aggregates, which differ from + normal aggregates in two key ways. First, in addition to ordinary + aggregated arguments that are evaluated once per input row, an + ordered-set aggregate can have “direct” arguments that are + evaluated only once per aggregation operation. Second, the syntax + for the ordinary aggregated arguments specifies a sort ordering + for them explicitly. An ordered-set aggregate is usually + used to implement a computation that depends on a specific row + ordering, for instance rank or percentile, so that the sort ordering + is a required aspect of any call. For example, the built-in + definition of percentile_disc is equivalent to: + +

+CREATE FUNCTION ordered_set_transition(internal, anyelement)
+  RETURNS internal ...;
+CREATE FUNCTION percentile_disc_final(internal, float8, anyelement)
+  RETURNS anyelement ...;
+
+CREATE AGGREGATE percentile_disc (float8 ORDER BY anyelement)
+(
+    sfunc = ordered_set_transition,
+    stype = internal,
+    finalfunc = percentile_disc_final,
+    finalfunc_extra
+);
+

+ + This aggregate takes a float8 direct argument (the percentile + fraction) and an aggregated input that can be of any sortable data type. + It could be used to obtain a median household income like this: + +

+SELECT percentile_disc(0.5) WITHIN GROUP (ORDER BY income) FROM households;
+ percentile_disc
+-----------------
+           50489
+

+ + Here, 0.5 is a direct argument; it would make no sense + for the percentile fraction to be a value varying across rows. +

+ Unlike the case for normal aggregates, the sorting of input rows for + an ordered-set aggregate is not done behind the scenes, + but is the responsibility of the aggregate's support functions. + The typical implementation approach is to keep a reference to + a “tuplesort” object in the aggregate's state value, feed the + incoming rows into that object, and then complete the sorting and + read out the data in the final function. This design allows the + final function to perform special operations such as injecting + additional “hypothetical” rows into the data to be sorted. + While normal aggregates can often be implemented with support + functions written in PL/pgSQL or another + PL language, ordered-set aggregates generally have to be written in + C, since their state values aren't definable as any SQL data type. + (In the above example, notice that the state value is declared as + type internal — this is typical.) + Also, because the final function performs the sort, it is not possible + to continue adding input rows by executing the transition function again + later. This means the final function is not READ_ONLY; + it must be declared in CREATE AGGREGATE + as READ_WRITE, or as SHAREABLE if + it's possible for additional final-function calls to make use of the + already-sorted state. +

+ The state transition function for an ordered-set aggregate receives + the current state value plus the aggregated input values for + each row, and returns the updated state value. This is the + same definition as for normal aggregates, but note that the direct + arguments (if any) are not provided. The final function receives + the last state value, the values of the direct arguments if any, + and (if finalfunc_extra is specified) null values + corresponding to the aggregated input(s). As with normal + aggregates, finalfunc_extra is only really useful if the + aggregate is polymorphic; then the extra dummy argument(s) are needed + to connect the final function's result type to the aggregate's input + type(s). +

+ Currently, ordered-set aggregates cannot be used as window functions, + and therefore there is no need for them to support moving-aggregate mode. +

38.12.4. Partial Aggregation #

+ Optionally, an aggregate function can support partial + aggregation. The idea of partial aggregation is to run the aggregate's + state transition function over different subsets of the input data + independently, and then to combine the state values resulting from those + subsets to produce the same state value that would have resulted from + scanning all the input in a single operation. This mode can be used for + parallel aggregation by having different worker processes scan different + portions of a table. Each worker produces a partial state value, and at + the end those state values are combined to produce a final state value. + (In the future this mode might also be used for purposes such as combining + aggregations over local and remote tables; but that is not implemented + yet.) +

+ To support partial aggregation, the aggregate definition must provide + a combine function, which takes two values of the + aggregate's state type (representing the results of aggregating over two + subsets of the input rows) and produces a new value of the state type, + representing what the state would have been after aggregating over the + combination of those sets of rows. It is unspecified what the relative + order of the input rows from the two sets would have been. This means + that it's usually impossible to define a useful combine function for + aggregates that are sensitive to input row order. +

+ As simple examples, MAX and MIN aggregates can be + made to support partial aggregation by specifying the combine function as + the same greater-of-two or lesser-of-two comparison function that is used + as their transition function. SUM aggregates just need an + addition function as combine function. (Again, this is the same as their + transition function, unless the state value is wider than the input data + type.) +

+ The combine function is treated much like a transition function that + happens to take a value of the state type, not of the underlying input + type, as its second argument. In particular, the rules for dealing + with null values and strict functions are similar. Also, if the aggregate + definition specifies a non-null initcond, keep in mind that + that will be used not only as the initial state for each partial + aggregation run, but also as the initial state for the combine function, + which will be called to combine each partial result into that state. +

+ If the aggregate's state type is declared as internal, it is + the combine function's responsibility that its result is allocated in + the correct memory context for aggregate state values. This means in + particular that when the first input is NULL it's invalid + to simply return the second input, as that value will be in the wrong + context and will not have sufficient lifespan. +

+ When the aggregate's state type is declared as internal, it is + usually also appropriate for the aggregate definition to provide a + serialization function and a deserialization + function, which allow such a state value to be copied from one process + to another. Without these functions, parallel aggregation cannot be + performed, and future applications such as local/remote aggregation will + probably not work either. +

+ A serialization function must take a single argument of + type internal and return a result of type bytea, which + represents the state value packaged up into a flat blob of bytes. + Conversely, a deserialization function reverses that conversion. It must + take two arguments of types bytea and internal, and + return a result of type internal. (The second argument is unused + and is always zero, but it is required for type-safety reasons.) The + result of the deserialization function should simply be allocated in the + current memory context, as unlike the combine function's result, it is not + long-lived. +

+ Worth noting also is that for an aggregate to be executed in parallel, + the aggregate itself must be marked PARALLEL SAFE. The + parallel-safety markings on its support functions are not consulted. +

38.12.5. Support Functions for Aggregates #

+ A function written in C can detect that it is being called as an + aggregate support function by calling + AggCheckCallContext, for example: +

+if (AggCheckCallContext(fcinfo, NULL))
+

+ One reason for checking this is that when it is true, the first input + must be a temporary state value and can therefore safely be modified + in-place rather than allocating a new copy. + See int8inc() for an example. + (While aggregate transition functions are always allowed to modify + the transition value in-place, aggregate final functions are generally + discouraged from doing so; if they do so, the behavior must be declared + when creating the aggregate. See CREATE AGGREGATE + for more detail.) +

+ The second argument of AggCheckCallContext can be used to + retrieve the memory context in which aggregate state values are being kept. + This is useful for transition functions that wish to use “expanded” + objects (see Section 38.13.1) as their state values. + On first call, the transition function should return an expanded object + whose memory context is a child of the aggregate state context, and then + keep returning the same expanded object on subsequent calls. See + array_append() for an example. (array_append() + is not the transition function of any built-in aggregate, but it is written + to behave efficiently when used as transition function of a custom + aggregate.) +

+ Another support routine available to aggregate functions written in C + is AggGetAggref, which returns the Aggref + parse node that defines the aggregate call. This is mainly useful + for ordered-set aggregates, which can inspect the substructure of + the Aggref node to find out what sort ordering they are + supposed to implement. Examples can be found + in orderedsetaggs.c in the PostgreSQL + source code. +

Prev	Up	Next
38.11. Function Optimization Information	Home	38.13. User-Defined Types