From 5e45211a64149b3c659b90ff2de6fa982a5a93ed Mon Sep 17 00:00:00 2001 From: Daniel Baumann Date: Sat, 4 May 2024 14:17:33 +0200 Subject: Adding upstream version 15.5. Signed-off-by: Daniel Baumann --- doc/src/sgml/advanced.sgml | 720 +++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 720 insertions(+) create mode 100644 doc/src/sgml/advanced.sgml (limited to 'doc/src/sgml/advanced.sgml') diff --git a/doc/src/sgml/advanced.sgml b/doc/src/sgml/advanced.sgml new file mode 100644 index 0000000..755c9f1 --- /dev/null +++ b/doc/src/sgml/advanced.sgml @@ -0,0 +1,720 @@ + + + + Advanced Features + + + Introduction + + + In the previous chapter we have covered the basics of using + SQL to store and access your data in + PostgreSQL. We will now discuss some + more advanced features of SQL that simplify + management and prevent loss or corruption of your data. Finally, + we will look at some PostgreSQL + extensions. + + + + This chapter will on occasion refer to examples found in to change or improve them, so it will be + useful to have read that chapter. Some examples from + this chapter can also be found in + advanced.sql in the tutorial directory. This + file also contains some sample data to load, which is not + repeated here. (Refer to for + how to use the file.) + + + + + + Views + + + view + + + + Refer back to the queries in . + Suppose the combined listing of weather records and city location + is of particular interest to your application, but you do not want + to type the query each time you need it. You can create a + view over the query, which gives a name to + the query that you can refer to like an ordinary table: + + +CREATE VIEW myview AS + SELECT name, temp_lo, temp_hi, prcp, date, location + FROM weather, cities + WHERE city = name; + +SELECT * FROM myview; + + + + + Making liberal use of views is a key aspect of good SQL database + design. Views allow you to encapsulate the details of the + structure of your tables, which might change as your application + evolves, behind consistent interfaces. + + + + Views can be used in almost any place a real table can be used. + Building views upon other views is not uncommon. + + + + + + Foreign Keys + + + foreign key + + + + referential integrity + + + + Recall the weather and + cities tables from . Consider the following problem: You + want to make sure that no one can insert rows in the + weather table that do not have a matching + entry in the cities table. This is called + maintaining the referential integrity of + your data. In simplistic database systems this would be + implemented (if at all) by first looking at the + cities table to check if a matching record + exists, and then inserting or rejecting the new + weather records. This approach has a + number of problems and is very inconvenient, so + PostgreSQL can do this for you. + + + + The new declaration of the tables would look like this: + + +CREATE TABLE cities ( + name varchar(80) primary key, + location point +); + +CREATE TABLE weather ( + city varchar(80) references cities(name), + temp_lo int, + temp_hi int, + prcp real, + date date +); + + + Now try inserting an invalid record: + + +INSERT INTO weather VALUES ('Berkeley', 45, 53, 0.0, '1994-11-28'); + + + +ERROR: insert or update on table "weather" violates foreign key constraint "weather_city_fkey" +DETAIL: Key (city)=(Berkeley) is not present in table "cities". + + + + + The behavior of foreign keys can be finely tuned to your + application. We will not go beyond this simple example in this + tutorial, but just refer you to + for more information. Making correct use of + foreign keys will definitely improve the quality of your database + applications, so you are strongly encouraged to learn about them. + + + + + + Transactions + + + transaction + + + + Transactions are a fundamental concept of all database + systems. The essential point of a transaction is that it bundles + multiple steps into a single, all-or-nothing operation. The intermediate + states between the steps are not visible to other concurrent transactions, + and if some failure occurs that prevents the transaction from completing, + then none of the steps affect the database at all. + + + + For example, consider a bank database that contains balances for various + customer accounts, as well as total deposit balances for branches. + Suppose that we want to record a payment of $100.00 from Alice's account + to Bob's account. Simplifying outrageously, the SQL commands for this + might look like: + + +UPDATE accounts SET balance = balance - 100.00 + WHERE name = 'Alice'; +UPDATE branches SET balance = balance - 100.00 + WHERE name = (SELECT branch_name FROM accounts WHERE name = 'Alice'); +UPDATE accounts SET balance = balance + 100.00 + WHERE name = 'Bob'; +UPDATE branches SET balance = balance + 100.00 + WHERE name = (SELECT branch_name FROM accounts WHERE name = 'Bob'); + + + + + The details of these commands are not important here; the important + point is that there are several separate updates involved to accomplish + this rather simple operation. Our bank's officers will want to be + assured that either all these updates happen, or none of them happen. + It would certainly not do for a system failure to result in Bob + receiving $100.00 that was not debited from Alice. Nor would Alice long + remain a happy customer if she was debited without Bob being credited. + We need a guarantee that if something goes wrong partway through the + operation, none of the steps executed so far will take effect. Grouping + the updates into a transaction gives us this guarantee. + A transaction is said to be atomic: from the point of + view of other transactions, it either happens completely or not at all. + + + + We also want a + guarantee that once a transaction is completed and acknowledged by + the database system, it has indeed been permanently recorded + and won't be lost even if a crash ensues shortly thereafter. + For example, if we are recording a cash withdrawal by Bob, + we do not want any chance that the debit to his account will + disappear in a crash just after he walks out the bank door. + A transactional database guarantees that all the updates made by + a transaction are logged in permanent storage (i.e., on disk) before + the transaction is reported complete. + + + + Another important property of transactional databases is closely + related to the notion of atomic updates: when multiple transactions + are running concurrently, each one should not be able to see the + incomplete changes made by others. For example, if one transaction + is busy totalling all the branch balances, it would not do for it + to include the debit from Alice's branch but not the credit to + Bob's branch, nor vice versa. So transactions must be all-or-nothing + not only in terms of their permanent effect on the database, but + also in terms of their visibility as they happen. The updates made + so far by an open transaction are invisible to other transactions + until the transaction completes, whereupon all the updates become + visible simultaneously. + + + + In PostgreSQL, a transaction is set up by surrounding + the SQL commands of the transaction with + BEGIN and COMMIT commands. So our banking + transaction would actually look like: + + +BEGIN; +UPDATE accounts SET balance = balance - 100.00 + WHERE name = 'Alice'; +-- etc etc +COMMIT; + + + + + If, partway through the transaction, we decide we do not want to + commit (perhaps we just noticed that Alice's balance went negative), + we can issue the command ROLLBACK instead of + COMMIT, and all our updates so far will be canceled. + + + + PostgreSQL actually treats every SQL statement as being + executed within a transaction. If you do not issue a BEGIN + command, + then each individual statement has an implicit BEGIN and + (if successful) COMMIT wrapped around it. A group of + statements surrounded by BEGIN and COMMIT + is sometimes called a transaction block. + + + + + Some client libraries issue BEGIN and COMMIT + commands automatically, so that you might get the effect of transaction + blocks without asking. Check the documentation for the interface + you are using. + + + + + It's possible to control the statements in a transaction in a more + granular fashion through the use of savepoints. Savepoints + allow you to selectively discard parts of the transaction, while + committing the rest. After defining a savepoint with + SAVEPOINT, you can if needed roll back to the savepoint + with ROLLBACK TO. All the transaction's database changes + between defining the savepoint and rolling back to it are discarded, but + changes earlier than the savepoint are kept. + + + + After rolling back to a savepoint, it continues to be defined, so you can + roll back to it several times. Conversely, if you are sure you won't need + to roll back to a particular savepoint again, it can be released, so the + system can free some resources. Keep in mind that either releasing or + rolling back to a savepoint + will automatically release all savepoints that were defined after it. + + + + All this is happening within the transaction block, so none of it + is visible to other database sessions. When and if you commit the + transaction block, the committed actions become visible as a unit + to other sessions, while the rolled-back actions never become visible + at all. + + + + Remembering the bank database, suppose we debit $100.00 from Alice's + account, and credit Bob's account, only to find later that we should + have credited Wally's account. We could do it using savepoints like + this: + + +BEGIN; +UPDATE accounts SET balance = balance - 100.00 + WHERE name = 'Alice'; +SAVEPOINT my_savepoint; +UPDATE accounts SET balance = balance + 100.00 + WHERE name = 'Bob'; +-- oops ... forget that and use Wally's account +ROLLBACK TO my_savepoint; +UPDATE accounts SET balance = balance + 100.00 + WHERE name = 'Wally'; +COMMIT; + + + + + This example is, of course, oversimplified, but there's a lot of control + possible in a transaction block through the use of savepoints. + Moreover, ROLLBACK TO is the only way to regain control of a + transaction block that was put in aborted state by the + system due to an error, short of rolling it back completely and starting + again. + + + + + + + Window Functions + + + window function + + + + A window function performs a calculation across a set of + table rows that are somehow related to the current row. This is comparable + to the type of calculation that can be done with an aggregate function. + However, window functions do not cause rows to become grouped into a single + output row like non-window aggregate calls would. Instead, the + rows retain their separate identities. Behind the scenes, the window + function is able to access more than just the current row of the query + result. + + + + Here is an example that shows how to compare each employee's salary + with the average salary in his or her department: + + +SELECT depname, empno, salary, avg(salary) OVER (PARTITION BY depname) FROM empsalary; + + + + depname | empno | salary | avg +-----------+-------+--------+----------------------- + develop | 11 | 5200 | 5020.0000000000000000 + develop | 7 | 4200 | 5020.0000000000000000 + develop | 9 | 4500 | 5020.0000000000000000 + develop | 8 | 6000 | 5020.0000000000000000 + develop | 10 | 5200 | 5020.0000000000000000 + personnel | 5 | 3500 | 3700.0000000000000000 + personnel | 2 | 3900 | 3700.0000000000000000 + sales | 3 | 4800 | 4866.6666666666666667 + sales | 1 | 5000 | 4866.6666666666666667 + sales | 4 | 4800 | 4866.6666666666666667 +(10 rows) + + + The first three output columns come directly from the table + empsalary, and there is one output row for each row in the + table. The fourth column represents an average taken across all the table + rows that have the same depname value as the current row. + (This actually is the same function as the non-window avg + aggregate, but the OVER clause causes it to be + treated as a window function and computed across the window frame.) + + + + A window function call always contains an OVER clause + directly following the window function's name and argument(s). This is what + syntactically distinguishes it from a normal function or non-window + aggregate. The OVER clause determines exactly how the + rows of the query are split up for processing by the window function. + The PARTITION BY clause within OVER + divides the rows into groups, or partitions, that share the same + values of the PARTITION BY expression(s). For each row, + the window function is computed across the rows that fall into the + same partition as the current row. + + + + You can also control the order in which rows are processed by + window functions using ORDER BY within OVER. + (The window ORDER BY does not even have to match the + order in which the rows are output.) Here is an example: + + +SELECT depname, empno, salary, + rank() OVER (PARTITION BY depname ORDER BY salary DESC) +FROM empsalary; + + + + depname | empno | salary | rank +-----------+-------+--------+------ + develop | 8 | 6000 | 1 + develop | 10 | 5200 | 2 + develop | 11 | 5200 | 2 + develop | 9 | 4500 | 4 + develop | 7 | 4200 | 5 + personnel | 2 | 3900 | 1 + personnel | 5 | 3500 | 2 + sales | 1 | 5000 | 1 + sales | 4 | 4800 | 2 + sales | 3 | 4800 | 2 +(10 rows) + + + As shown here, the rank function produces a numerical rank + for each distinct ORDER BY value in the current row's + partition, using the order defined by the ORDER BY clause. + rank needs no explicit parameter, because its behavior + is entirely determined by the OVER clause. + + + + The rows considered by a window function are those of the virtual + table produced by the query's FROM clause as filtered by its + WHERE, GROUP BY, and HAVING clauses + if any. For example, a row removed because it does not meet the + WHERE condition is not seen by any window function. + A query can contain multiple window functions that slice up the data + in different ways using different OVER clauses, but + they all act on the same collection of rows defined by this virtual table. + + + + We already saw that ORDER BY can be omitted if the ordering + of rows is not important. It is also possible to omit PARTITION + BY, in which case there is a single partition containing all rows. + + + + There is another important concept associated with window functions: + for each row, there is a set of rows within its partition called its + window frame. Some window functions act only + on the rows of the window frame, rather than of the whole partition. + By default, if ORDER BY is supplied then the frame consists of + all rows from the start of the partition up through the current row, plus + any following rows that are equal to the current row according to the + ORDER BY clause. When ORDER BY is omitted the + default frame consists of all rows in the partition. + + + There are options to define the window frame in other ways, but + this tutorial does not cover them. See + for details. + + + Here is an example using sum: + + + +SELECT salary, sum(salary) OVER () FROM empsalary; + + + + salary | sum +--------+------- + 5200 | 47100 + 5000 | 47100 + 3500 | 47100 + 4800 | 47100 + 3900 | 47100 + 4200 | 47100 + 4500 | 47100 + 4800 | 47100 + 6000 | 47100 + 5200 | 47100 +(10 rows) + + + + Above, since there is no ORDER BY in the OVER + clause, the window frame is the same as the partition, which for lack of + PARTITION BY is the whole table; in other words each sum is + taken over the whole table and so we get the same result for each output + row. But if we add an ORDER BY clause, we get very different + results: + + + +SELECT salary, sum(salary) OVER (ORDER BY salary) FROM empsalary; + + + + salary | sum +--------+------- + 3500 | 3500 + 3900 | 7400 + 4200 | 11600 + 4500 | 16100 + 4800 | 25700 + 4800 | 25700 + 5000 | 30700 + 5200 | 41100 + 5200 | 41100 + 6000 | 47100 +(10 rows) + + + + Here the sum is taken from the first (lowest) salary up through the + current one, including any duplicates of the current one (notice the + results for the duplicated salaries). + + + + Window functions are permitted only in the SELECT list + and the ORDER BY clause of the query. They are forbidden + elsewhere, such as in GROUP BY, HAVING + and WHERE clauses. This is because they logically + execute after the processing of those clauses. Also, window functions + execute after non-window aggregate functions. This means it is valid to + include an aggregate function call in the arguments of a window function, + but not vice versa. + + + + If there is a need to filter or group rows after the window calculations + are performed, you can use a sub-select. For example: + + +SELECT depname, empno, salary, enroll_date +FROM + (SELECT depname, empno, salary, enroll_date, + rank() OVER (PARTITION BY depname ORDER BY salary DESC, empno) AS pos + FROM empsalary + ) AS ss +WHERE pos < 3; + + + The above query only shows the rows from the inner query having + rank less than 3. + + + + When a query involves multiple window functions, it is possible to write + out each one with a separate OVER clause, but this is + duplicative and error-prone if the same windowing behavior is wanted + for several functions. Instead, each windowing behavior can be named + in a WINDOW clause and then referenced in OVER. + For example: + + +SELECT sum(salary) OVER w, avg(salary) OVER w + FROM empsalary + WINDOW w AS (PARTITION BY depname ORDER BY salary DESC); + + + + + More details about window functions can be found in + , + , + , and the + reference page. + + + + + + Inheritance + + + inheritance + + + + Inheritance is a concept from object-oriented databases. It opens + up interesting new possibilities of database design. + + + + Let's create two tables: A table cities + and a table capitals. Naturally, capitals + are also cities, so you want some way to show the capitals + implicitly when you list all cities. If you're really clever you + might invent some scheme like this: + + +CREATE TABLE capitals ( + name text, + population real, + elevation int, -- (in ft) + state char(2) +); + +CREATE TABLE non_capitals ( + name text, + population real, + elevation int -- (in ft) +); + +CREATE VIEW cities AS + SELECT name, population, elevation FROM capitals + UNION + SELECT name, population, elevation FROM non_capitals; + + + This works OK as far as querying goes, but it gets ugly when you + need to update several rows, for one thing. + + + + A better solution is this: + + +CREATE TABLE cities ( + name text, + population real, + elevation int -- (in ft) +); + +CREATE TABLE capitals ( + state char(2) UNIQUE NOT NULL +) INHERITS (cities); + + + + + In this case, a row of capitals + inherits all columns (name, + population, and elevation) from its + parent, cities. The + type of the column name is + text, a native PostgreSQL + type for variable length character strings. The + capitals table has + an additional column, state, which shows its + state abbreviation. In + PostgreSQL, a table can inherit from + zero or more other tables. + + + + For example, the following query finds the names of all cities, + including state capitals, that are located at an elevation + over 500 feet: + + +SELECT name, elevation + FROM cities + WHERE elevation > 500; + + + which returns: + + + name | elevation +-----------+----------- + Las Vegas | 2174 + Mariposa | 1953 + Madison | 845 +(3 rows) + + + + + On the other hand, the following query finds + all the cities that are not state capitals and + are situated at an elevation over 500 feet: + + +SELECT name, elevation + FROM ONLY cities + WHERE elevation > 500; + + + + name | elevation +-----------+----------- + Las Vegas | 2174 + Mariposa | 1953 +(2 rows) + + + + + Here the ONLY before cities + indicates that the query should be run over only the + cities table, and not tables below + cities in the inheritance hierarchy. Many + of the commands that we have already discussed — + SELECT, UPDATE, and + DELETE — support this ONLY + notation. + + + + + Although inheritance is frequently useful, it has not been integrated + with unique constraints or foreign keys, which limits its usefulness. + See for more detail. + + + + + + + Conclusion + + + PostgreSQL has many features not + touched upon in this tutorial introduction, which has been + oriented toward newer users of SQL. These + features are discussed in more detail in the remainder of this + book. + + + + If you feel you need more introductory material, please visit the PostgreSQL + web site + for links to more resources. + + + -- cgit v1.2.3