diff options
Diffstat (limited to 'doc/src/sgml/html/xml2.html')
-rw-r--r-- | doc/src/sgml/html/xml2.html | 274 |
1 files changed, 274 insertions, 0 deletions
diff --git a/doc/src/sgml/html/xml2.html b/doc/src/sgml/html/xml2.html new file mode 100644 index 0000000..d6ea359 --- /dev/null +++ b/doc/src/sgml/html/xml2.html @@ -0,0 +1,274 @@ +<?xml version="1.0" encoding="UTF-8" standalone="no"?> +<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>F.50. xml2</title><link rel="stylesheet" type="text/css" href="stylesheet.css" /><link rev="made" href="pgsql-docs@lists.postgresql.org" /><meta name="generator" content="DocBook XSL Stylesheets Vsnapshot" /><link rel="prev" href="uuid-ossp.html" title="F.49. uuid-ossp" /><link rel="next" href="contrib-prog.html" title="Appendix G. Additional Supplied Programs" /></head><body id="docContent" class="container-fluid col-10"><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="5" align="center">F.50. xml2</th></tr><tr><td width="10%" align="left"><a accesskey="p" href="uuid-ossp.html" title="F.49. uuid-ossp">Prev</a> </td><td width="10%" align="left"><a accesskey="u" href="contrib.html" title="Appendix F. Additional Supplied Modules">Up</a></td><th width="60%" align="center">Appendix F. Additional Supplied Modules</th><td width="10%" align="right"><a accesskey="h" href="index.html" title="PostgreSQL 15.4 Documentation">Home</a></td><td width="10%" align="right"> <a accesskey="n" href="contrib-prog.html" title="Appendix G. Additional Supplied Programs">Next</a></td></tr></table><hr /></div><div class="sect1" id="XML2"><div class="titlepage"><div><div><h2 class="title" style="clear: both">F.50. xml2</h2></div></div></div><div class="toc"><dl class="toc"><dt><span class="sect2"><a href="xml2.html#id-1.11.7.59.4">F.50.1. Deprecation Notice</a></span></dt><dt><span class="sect2"><a href="xml2.html#id-1.11.7.59.5">F.50.2. Description of Functions</a></span></dt><dt><span class="sect2"><a href="xml2.html#id-1.11.7.59.6">F.50.3. <code class="literal">xpath_table</code></a></span></dt><dt><span class="sect2"><a href="xml2.html#id-1.11.7.59.7">F.50.4. XSLT Functions</a></span></dt><dt><span class="sect2"><a href="xml2.html#id-1.11.7.59.8">F.50.5. Author</a></span></dt></dl></div><a id="id-1.11.7.59.2" class="indexterm"></a><p> + The <code class="filename">xml2</code> module provides XPath querying and + XSLT functionality. + </p><div class="sect2" id="id-1.11.7.59.4"><div class="titlepage"><div><div><h3 class="title">F.50.1. Deprecation Notice</h3></div></div></div><p> + From <span class="productname">PostgreSQL</span> 8.3 on, there is XML-related + functionality based on the SQL/XML standard in the core server. + That functionality covers XML syntax checking and XPath queries, + which is what this module does, and more, but the API is + not at all compatible. It is planned that this module will be + removed in a future version of PostgreSQL in favor of the newer standard API, so + you are encouraged to try converting your applications. If you + find that some of the functionality of this module is not + available in an adequate form with the newer API, please explain + your issue to <code class="email"><<a class="email" href="mailto:pgsql-hackers@lists.postgresql.org">pgsql-hackers@lists.postgresql.org</a>></code> so that the deficiency + can be addressed. + </p></div><div class="sect2" id="id-1.11.7.59.5"><div class="titlepage"><div><div><h3 class="title">F.50.2. Description of Functions</h3></div></div></div><p> + <a class="xref" href="xml2.html#XML2-FUNCTIONS-TABLE" title="Table F.34. xml2 Functions">Table F.34</a> shows the functions provided by this module. + These functions provide straightforward XML parsing and XPath queries. + </p><div class="table" id="XML2-FUNCTIONS-TABLE"><p class="title"><strong>Table F.34. <code class="filename">xml2</code> Functions</strong></p><div class="table-contents"><table class="table" summary="xml2 Functions" border="1"><colgroup><col /></colgroup><thead><tr><th class="func_table_entry"><p class="func_signature"> + Function + </p> + <p> + Description + </p></th></tr></thead><tbody><tr><td class="func_table_entry"><p class="func_signature"> + <code class="function">xml_valid</code> ( <em class="parameter"><code>document</code></em> <code class="type">text</code> ) + → <code class="returnvalue">boolean</code> + </p> + <p> + Parses the given document and returns true if the + document is well-formed XML. (Note: this is an alias for the standard + PostgreSQL function <code class="function">xml_is_well_formed()</code>. The + name <code class="function">xml_valid()</code> is technically incorrect since validity + and well-formedness have different meanings in XML.) + </p></td></tr><tr><td class="func_table_entry"><p class="func_signature"> + <code class="function">xpath_string</code> ( <em class="parameter"><code>document</code></em> <code class="type">text</code>, <em class="parameter"><code>query</code></em> <code class="type">text</code> ) + → <code class="returnvalue">text</code> + </p> + <p> + Evaluates the XPath query on the supplied document, and + casts the result to <code class="type">text</code>. + </p></td></tr><tr><td class="func_table_entry"><p class="func_signature"> + <code class="function">xpath_number</code> ( <em class="parameter"><code>document</code></em> <code class="type">text</code>, <em class="parameter"><code>query</code></em> <code class="type">text</code> ) + → <code class="returnvalue">real</code> + </p> + <p> + Evaluates the XPath query on the supplied document, and + casts the result to <code class="type">real</code>. + </p></td></tr><tr><td class="func_table_entry"><p class="func_signature"> + <code class="function">xpath_bool</code> ( <em class="parameter"><code>document</code></em> <code class="type">text</code>, <em class="parameter"><code>query</code></em> <code class="type">text</code> ) + → <code class="returnvalue">boolean</code> + </p> + <p> + Evaluates the XPath query on the supplied document, and + casts the result to <code class="type">boolean</code>. + </p></td></tr><tr><td class="func_table_entry"><p class="func_signature"> + <code class="function">xpath_nodeset</code> ( <em class="parameter"><code>document</code></em> <code class="type">text</code>, <em class="parameter"><code>query</code></em> <code class="type">text</code>, <em class="parameter"><code>toptag</code></em> <code class="type">text</code>, <em class="parameter"><code>itemtag</code></em> <code class="type">text</code> ) + → <code class="returnvalue">text</code> + </p> + <p> + Evaluates the query on the document and wraps the result in XML + tags. If the result is multivalued, the output will look like: +</p><pre class="synopsis"> +<toptag> +<itemtag>Value 1 which could be an XML fragment</itemtag> +<itemtag>Value 2....</itemtag> +</toptag> +</pre><p> + If either <em class="parameter"><code>toptag</code></em> + or <em class="parameter"><code>itemtag</code></em> is an empty string, the relevant tag + is omitted. + </p></td></tr><tr><td class="func_table_entry"><p class="func_signature"> + <code class="function">xpath_nodeset</code> ( <em class="parameter"><code>document</code></em> <code class="type">text</code>, <em class="parameter"><code>query</code></em> <code class="type">text</code>, <em class="parameter"><code>itemtag</code></em> <code class="type">text</code> ) + → <code class="returnvalue">text</code> + </p> + <p> + Like <code class="function">xpath_nodeset(document, query, toptag, itemtag)</code> but result omits <em class="parameter"><code>toptag</code></em>. + </p></td></tr><tr><td class="func_table_entry"><p class="func_signature"> + <code class="function">xpath_nodeset</code> ( <em class="parameter"><code>document</code></em> <code class="type">text</code>, <em class="parameter"><code>query</code></em> <code class="type">text</code> ) + → <code class="returnvalue">text</code> + </p> + <p> + Like <code class="function">xpath_nodeset(document, query, toptag, itemtag)</code> but result omits both tags. + </p></td></tr><tr><td class="func_table_entry"><p class="func_signature"> + <code class="function">xpath_list</code> ( <em class="parameter"><code>document</code></em> <code class="type">text</code>, <em class="parameter"><code>query</code></em> <code class="type">text</code>, <em class="parameter"><code>separator</code></em> <code class="type">text</code> ) + → <code class="returnvalue">text</code> + </p> + <p> + Evaluates the query on the document and returns multiple values + separated by the specified separator, for example <code class="literal">Value + 1,Value 2,Value 3</code> if <em class="parameter"><code>separator</code></em> + is <code class="literal">,</code>. + </p></td></tr><tr><td class="func_table_entry"><p class="func_signature"> + <code class="function">xpath_list</code> ( <em class="parameter"><code>document</code></em> <code class="type">text</code>, <em class="parameter"><code>query</code></em> <code class="type">text</code> ) + → <code class="returnvalue">text</code> + </p> + <p> + This is a wrapper for the above function that uses <code class="literal">,</code> + as the separator. + </p></td></tr></tbody></table></div></div><br class="table-break" /></div><div class="sect2" id="id-1.11.7.59.6"><div class="titlepage"><div><div><h3 class="title">F.50.3. <code class="literal">xpath_table</code></h3></div></div></div><a id="id-1.11.7.59.6.2" class="indexterm"></a><pre class="synopsis"> +xpath_table(text key, text document, text relation, text xpaths, text criteria) returns setof record +</pre><p> + <code class="function">xpath_table</code> is a table function that evaluates a set of XPath + queries on each of a set of documents and returns the results as a + table. The primary key field from the original document table is returned + as the first column of the result so that the result set + can readily be used in joins. The parameters are described in + <a class="xref" href="xml2.html#XML2-XPATH-TABLE-PARAMETERS" title="Table F.35. xpath_table Parameters">Table F.35</a>. + </p><div class="table" id="XML2-XPATH-TABLE-PARAMETERS"><p class="title"><strong>Table F.35. <code class="function">xpath_table</code> Parameters</strong></p><div class="table-contents"><table class="table" summary="xpath_table Parameters" border="1"><colgroup><col class="col1" /><col class="col2" /></colgroup><thead><tr><th>Parameter</th><th>Description</th></tr></thead><tbody><tr><td><em class="parameter"><code>key</code></em></td><td> + <p> + the name of the <span class="quote">“<span class="quote">key</span>”</span> field — this is just a field to be used as + the first column of the output table, i.e., it identifies the record from + which each output row came (see note below about multiple values) + </p> + </td></tr><tr><td><em class="parameter"><code>document</code></em></td><td> + <p> + the name of the field containing the XML document + </p> + </td></tr><tr><td><em class="parameter"><code>relation</code></em></td><td> + <p> + the name of the table or view containing the documents + </p> + </td></tr><tr><td><em class="parameter"><code>xpaths</code></em></td><td> + <p> + one or more XPath expressions, separated by <code class="literal">|</code> + </p> + </td></tr><tr><td><em class="parameter"><code>criteria</code></em></td><td> + <p> + the contents of the WHERE clause. This cannot be omitted, so use + <code class="literal">true</code> or <code class="literal">1=1</code> if you want to + process all the rows in the relation + </p> + </td></tr></tbody></table></div></div><br class="table-break" /><p> + These parameters (except the XPath strings) are just substituted + into a plain SQL SELECT statement, so you have some flexibility — the + statement is + </p><p> + <code class="literal"> + SELECT <key>, <document> FROM <relation> WHERE <criteria> + </code> + </p><p> + so those parameters can be <span class="emphasis"><em>anything</em></span> valid in those particular + locations. The result from this SELECT needs to return exactly two + columns (which it will unless you try to list multiple fields for key + or document). Beware that this simplistic approach requires that you + validate any user-supplied values to avoid SQL injection attacks. + </p><p> + The function has to be used in a <code class="literal">FROM</code> expression, with an + <code class="literal">AS</code> clause to specify the output columns; for example +</p><pre class="programlisting"> +SELECT * FROM +xpath_table('article_id', + 'article_xml', + 'articles', + '/article/author|/article/pages|/article/title', + 'date_entered > ''2003-01-01'' ') +AS t(article_id integer, author text, page_count integer, title text); +</pre><p> + The <code class="literal">AS</code> clause defines the names and types of the columns in the + output table. The first is the <span class="quote">“<span class="quote">key</span>”</span> field and the rest correspond + to the XPath queries. + If there are more XPath queries than result columns, + the extra queries will be ignored. If there are more result columns + than XPath queries, the extra columns will be NULL. + </p><p> + Notice that this example defines the <code class="structname">page_count</code> result + column as an integer. The function deals internally with string + representations, so when you say you want an integer in the output, it will + take the string representation of the XPath result and use PostgreSQL input + functions to transform it into an integer (or whatever type the <code class="type">AS</code> + clause requests). An error will result if it can't do this — for + example if the result is empty — so you may wish to just stick to + <code class="type">text</code> as the column type if you think your data has any problems. + </p><p> + The calling <code class="command">SELECT</code> statement doesn't necessarily have to be + just <code class="literal">SELECT *</code> — it can reference the output + columns by name or join them to other tables. The function produces a + virtual table with which you can perform any operation you wish (e.g., + aggregation, joining, sorting etc.). So we could also have: +</p><pre class="programlisting"> +SELECT t.title, p.fullname, p.email +FROM xpath_table('article_id', 'article_xml', 'articles', + '/article/title|/article/author/@id', + 'xpath_string(article_xml,''/article/@date'') > ''2003-03-20'' ') + AS t(article_id integer, title text, author_id integer), + tblPeopleInfo AS p +WHERE t.author_id = p.person_id; +</pre><p> + as a more complicated example. Of course, you could wrap all + of this in a view for convenience. + </p><div class="sect3" id="id-1.11.7.59.6.12"><div class="titlepage"><div><div><h4 class="title">F.50.3.1. Multivalued Results</h4></div></div></div><p> + The <code class="function">xpath_table</code> function assumes that the results of each XPath query + might be multivalued, so the number of rows returned by the function + may not be the same as the number of input documents. The first row + returned contains the first result from each query, the second row the + second result from each query. If one of the queries has fewer values + than the others, null values will be returned instead. + </p><p> + In some cases, a user will know that a given XPath query will return + only a single result (perhaps a unique document identifier) — if used + alongside an XPath query returning multiple results, the single-valued + result will appear only on the first row of the result. The solution + to this is to use the key field as part of a join against a simpler + XPath query. As an example: + +</p><pre class="programlisting"> +CREATE TABLE test ( + id int PRIMARY KEY, + xml text +); + +INSERT INTO test VALUES (1, '<doc num="C1"> +<line num="L1"><a>1</a><b>2</b><c>3</c></line> +<line num="L2"><a>11</a><b>22</b><c>33</c></line> +</doc>'); + +INSERT INTO test VALUES (2, '<doc num="C2"> +<line num="L1"><a>111</a><b>222</b><c>333</c></line> +<line num="L2"><a>111</a><b>222</b><c>333</c></line> +</doc>'); + +SELECT * FROM + xpath_table('id','xml','test', + '/doc/@num|/doc/line/@num|/doc/line/a|/doc/line/b|/doc/line/c', + 'true') + AS t(id int, doc_num varchar(10), line_num varchar(10), val1 int, val2 int, val3 int) +WHERE id = 1 ORDER BY doc_num, line_num + + id | doc_num | line_num | val1 | val2 | val3 +----+---------+----------+------+------+------ + 1 | C1 | L1 | 1 | 2 | 3 + 1 | | L2 | 11 | 22 | 33 +</pre><p> + </p><p> + To get <code class="literal">doc_num</code> on every line, the solution is to use two invocations + of <code class="function">xpath_table</code> and join the results: + +</p><pre class="programlisting"> +SELECT t.*,i.doc_num FROM + xpath_table('id', 'xml', 'test', + '/doc/line/@num|/doc/line/a|/doc/line/b|/doc/line/c', + 'true') + AS t(id int, line_num varchar(10), val1 int, val2 int, val3 int), + xpath_table('id', 'xml', 'test', '/doc/@num', 'true') + AS i(id int, doc_num varchar(10)) +WHERE i.id=t.id AND i.id=1 +ORDER BY doc_num, line_num; + + id | line_num | val1 | val2 | val3 | doc_num +----+----------+------+------+------+--------- + 1 | L1 | 1 | 2 | 3 | C1 + 1 | L2 | 11 | 22 | 33 | C1 +(2 rows) +</pre><p> + </p></div></div><div class="sect2" id="id-1.11.7.59.7"><div class="titlepage"><div><div><h3 class="title">F.50.4. XSLT Functions</h3></div></div></div><p> + The following functions are available if libxslt is installed: + </p><div class="sect3" id="id-1.11.7.59.7.3"><div class="titlepage"><div><div><h4 class="title">F.50.4.1. <code class="literal">xslt_process</code></h4></div></div></div><a id="id-1.11.7.59.7.3.2" class="indexterm"></a><pre class="synopsis"> +xslt_process(text document, text stylesheet, text paramlist) returns text +</pre><p> + This function applies the XSL stylesheet to the document and returns + the transformed result. The <code class="literal">paramlist</code> is a list of parameter + assignments to be used in the transformation, specified in the form + <code class="literal">a=1,b=2</code>. Note that the + parameter parsing is very simple-minded: parameter values cannot + contain commas! + </p><p> + There is also a two-parameter version of <code class="function">xslt_process</code> which + does not pass any parameters to the transformation. + </p></div></div><div class="sect2" id="id-1.11.7.59.8"><div class="titlepage"><div><div><h3 class="title">F.50.5. Author</h3></div></div></div><p> + John Gray <code class="email"><<a class="email" href="mailto:jgray@azuli.co.uk">jgray@azuli.co.uk</a>></code> + </p><p> + Development of this module was sponsored by Torchbox Ltd. (www.torchbox.com). + It has the same BSD license as PostgreSQL. + </p></div></div><div class="navfooter"><hr /><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="uuid-ossp.html" title="F.49. uuid-ossp">Prev</a> </td><td width="20%" align="center"><a accesskey="u" href="contrib.html" title="Appendix F. Additional Supplied Modules">Up</a></td><td width="40%" align="right"> <a accesskey="n" href="contrib-prog.html" title="Appendix G. Additional Supplied Programs">Next</a></td></tr><tr><td width="40%" align="left" valign="top">F.49. uuid-ossp </td><td width="20%" align="center"><a accesskey="h" href="index.html" title="PostgreSQL 15.4 Documentation">Home</a></td><td width="40%" align="right" valign="top"> Appendix G. Additional Supplied Programs</td></tr></table></div></body></html>
\ No newline at end of file |