summaryrefslogtreecommitdiffstats
path: root/doc/src/sgml/html/xml2.html
diff options
context:
space:
mode:
Diffstat (limited to 'doc/src/sgml/html/xml2.html')
-rw-r--r--doc/src/sgml/html/xml2.html274
1 files changed, 274 insertions, 0 deletions
diff --git a/doc/src/sgml/html/xml2.html b/doc/src/sgml/html/xml2.html
new file mode 100644
index 0000000..b1e5b33
--- /dev/null
+++ b/doc/src/sgml/html/xml2.html
@@ -0,0 +1,274 @@
+<?xml version="1.0" encoding="UTF-8" standalone="no"?>
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>F.45. xml2</title><link rel="stylesheet" type="text/css" href="stylesheet.css" /><link rev="made" href="pgsql-docs@lists.postgresql.org" /><meta name="generator" content="DocBook XSL Stylesheets V1.79.1" /><link rel="prev" href="uuid-ossp.html" title="F.44. uuid-ossp" /><link rel="next" href="contrib-prog.html" title="Appendix G. Additional Supplied Programs" /></head><body id="docContent" class="container-fluid col-10"><div xmlns="http://www.w3.org/TR/xhtml1/transitional" class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="5" align="center">F.45. xml2</th></tr><tr><td width="10%" align="left"><a accesskey="p" href="uuid-ossp.html" title="F.44. uuid-ossp">Prev</a> </td><td width="10%" align="left"><a accesskey="u" href="contrib.html" title="Appendix F. Additional Supplied Modules">Up</a></td><th width="60%" align="center">Appendix F. Additional Supplied Modules</th><td width="10%" align="right"><a accesskey="h" href="index.html" title="PostgreSQL 13.4 Documentation">Home</a></td><td width="10%" align="right"> <a accesskey="n" href="contrib-prog.html" title="Appendix G. Additional Supplied Programs">Next</a></td></tr></table><hr></hr></div><div class="sect1" id="XML2"><div class="titlepage"><div><div><h2 class="title" style="clear: both">F.45. xml2</h2></div></div></div><div class="toc"><dl class="toc"><dt><span class="sect2"><a href="xml2.html#id-1.11.7.54.4">F.45.1. Deprecation Notice</a></span></dt><dt><span class="sect2"><a href="xml2.html#id-1.11.7.54.5">F.45.2. Description of Functions</a></span></dt><dt><span class="sect2"><a href="xml2.html#id-1.11.7.54.6">F.45.3. <code class="literal">xpath_table</code></a></span></dt><dt><span class="sect2"><a href="xml2.html#id-1.11.7.54.7">F.45.4. XSLT Functions</a></span></dt><dt><span class="sect2"><a href="xml2.html#id-1.11.7.54.8">F.45.5. Author</a></span></dt></dl></div><a id="id-1.11.7.54.2" class="indexterm"></a><p>
+ The <code class="filename">xml2</code> module provides XPath querying and
+ XSLT functionality.
+ </p><div class="sect2" id="id-1.11.7.54.4"><div class="titlepage"><div><div><h3 class="title">F.45.1. Deprecation Notice</h3></div></div></div><p>
+ From <span class="productname">PostgreSQL</span> 8.3 on, there is XML-related
+ functionality based on the SQL/XML standard in the core server.
+ That functionality covers XML syntax checking and XPath queries,
+ which is what this module does, and more, but the API is
+ not at all compatible. It is planned that this module will be
+ removed in a future version of PostgreSQL in favor of the newer standard API, so
+ you are encouraged to try converting your applications. If you
+ find that some of the functionality of this module is not
+ available in an adequate form with the newer API, please explain
+ your issue to <code class="email">&lt;<a class="email" href="mailto:pgsql-hackers@lists.postgresql.org">pgsql-hackers@lists.postgresql.org</a>&gt;</code> so that the deficiency
+ can be addressed.
+ </p></div><div class="sect2" id="id-1.11.7.54.5"><div class="titlepage"><div><div><h3 class="title">F.45.2. Description of Functions</h3></div></div></div><p>
+ <a class="xref" href="xml2.html#XML2-FUNCTIONS-TABLE" title="Table F.34. xml2 Functions">Table F.34</a> shows the functions provided by this module.
+ These functions provide straightforward XML parsing and XPath queries.
+ </p><div class="table" id="XML2-FUNCTIONS-TABLE"><p class="title"><strong>Table F.34. <code class="filename">xml2</code> Functions</strong></p><div class="table-contents"><table class="table" summary="xml2 Functions" border="1"><colgroup><col /></colgroup><thead><tr><th class="func_table_entry"><p class="func_signature">
+ Function
+ </p>
+ <p>
+ Description
+ </p></th></tr></thead><tbody><tr><td class="func_table_entry"><p class="func_signature">
+ <code class="function">xml_valid</code> ( <em class="parameter"><code>document</code></em> <code class="type">text</code> )
+ → <code class="returnvalue">boolean</code>
+ </p>
+ <p>
+ Parses the given document and returns true if the
+ document is well-formed XML. (Note: this is an alias for the standard
+ PostgreSQL function <code class="function">xml_is_well_formed()</code>. The
+ name <code class="function">xml_valid()</code> is technically incorrect since validity
+ and well-formedness have different meanings in XML.)
+ </p></td></tr><tr><td class="func_table_entry"><p class="func_signature">
+ <code class="function">xpath_string</code> ( <em class="parameter"><code>document</code></em> <code class="type">text</code>, <em class="parameter"><code>query</code></em> <code class="type">text</code> )
+ → <code class="returnvalue">text</code>
+ </p>
+ <p>
+ Evaluates the XPath query on the supplied document, and
+ casts the result to <code class="type">text</code>.
+ </p></td></tr><tr><td class="func_table_entry"><p class="func_signature">
+ <code class="function">xpath_number</code> ( <em class="parameter"><code>document</code></em> <code class="type">text</code>, <em class="parameter"><code>query</code></em> <code class="type">text</code> )
+ → <code class="returnvalue">real</code>
+ </p>
+ <p>
+ Evaluates the XPath query on the supplied document, and
+ casts the result to <code class="type">real</code>.
+ </p></td></tr><tr><td class="func_table_entry"><p class="func_signature">
+ <code class="function">xpath_bool</code> ( <em class="parameter"><code>document</code></em> <code class="type">text</code>, <em class="parameter"><code>query</code></em> <code class="type">text</code> )
+ → <code class="returnvalue">boolean</code>
+ </p>
+ <p>
+ Evaluates the XPath query on the supplied document, and
+ casts the result to <code class="type">boolean</code>.
+ </p></td></tr><tr><td class="func_table_entry"><p class="func_signature">
+ <code class="function">xpath_nodeset</code> ( <em class="parameter"><code>document</code></em> <code class="type">text</code>, <em class="parameter"><code>query</code></em> <code class="type">text</code>, <em class="parameter"><code>toptag</code></em> <code class="type">text</code>, <em class="parameter"><code>itemtag</code></em> <code class="type">text</code> )
+ → <code class="returnvalue">text</code>
+ </p>
+ <p>
+ Evaluates the query on the document and wraps the result in XML
+ tags. If the result is multivalued, the output will look like:
+</p><pre class="synopsis">
+&lt;toptag&gt;
+&lt;itemtag&gt;Value 1 which could be an XML fragment&lt;/itemtag&gt;
+&lt;itemtag&gt;Value 2....&lt;/itemtag&gt;
+&lt;/toptag&gt;
+</pre><p>
+ If either <em class="parameter"><code>toptag</code></em>
+ or <em class="parameter"><code>itemtag</code></em> is an empty string, the relevant tag
+ is omitted.
+ </p></td></tr><tr><td class="func_table_entry"><p class="func_signature">
+ <code class="function">xpath_nodeset</code> ( <em class="parameter"><code>document</code></em> <code class="type">text</code>, <em class="parameter"><code>query</code></em> <code class="type">text</code>, <em class="parameter"><code>itemtag</code></em> <code class="type">text</code> )
+ → <code class="returnvalue">text</code>
+ </p>
+ <p>
+ Like <code class="function">xpath_nodeset(document, query, toptag, itemtag)</code> but result omits <em class="parameter"><code>toptag</code></em>.
+ </p></td></tr><tr><td class="func_table_entry"><p class="func_signature">
+ <code class="function">xpath_nodeset</code> ( <em class="parameter"><code>document</code></em> <code class="type">text</code>, <em class="parameter"><code>query</code></em> <code class="type">text</code> )
+ → <code class="returnvalue">text</code>
+ </p>
+ <p>
+ Like <code class="function">xpath_nodeset(document, query, toptag, itemtag)</code> but result omits both tags.
+ </p></td></tr><tr><td class="func_table_entry"><p class="func_signature">
+ <code class="function">xpath_list</code> ( <em class="parameter"><code>document</code></em> <code class="type">text</code>, <em class="parameter"><code>query</code></em> <code class="type">text</code>, <em class="parameter"><code>separator</code></em> <code class="type">text</code> )
+ → <code class="returnvalue">text</code>
+ </p>
+ <p>
+ Evaluates the query on the document and returns multiple values
+ separated by the specified separator, for example <code class="literal">Value
+ 1,Value 2,Value 3</code> if <em class="parameter"><code>separator</code></em>
+ is <code class="literal">,</code>.
+ </p></td></tr><tr><td class="func_table_entry"><p class="func_signature">
+ <code class="function">xpath_list</code> ( <em class="parameter"><code>document</code></em> <code class="type">text</code>, <em class="parameter"><code>query</code></em> <code class="type">text</code> )
+ → <code class="returnvalue">text</code>
+ </p>
+ <p>
+ This is a wrapper for the above function that uses <code class="literal">,</code>
+ as the separator.
+ </p></td></tr></tbody></table></div></div><br class="table-break" /></div><div class="sect2" id="id-1.11.7.54.6"><div class="titlepage"><div><div><h3 class="title">F.45.3. <code class="literal">xpath_table</code></h3></div></div></div><a id="id-1.11.7.54.6.2" class="indexterm"></a><pre class="synopsis">
+xpath_table(text key, text document, text relation, text xpaths, text criteria) returns setof record
+</pre><p>
+ <code class="function">xpath_table</code> is a table function that evaluates a set of XPath
+ queries on each of a set of documents and returns the results as a
+ table. The primary key field from the original document table is returned
+ as the first column of the result so that the result set
+ can readily be used in joins. The parameters are described in
+ <a class="xref" href="xml2.html#XML2-XPATH-TABLE-PARAMETERS" title="Table F.35. xpath_table Parameters">Table F.35</a>.
+ </p><div class="table" id="XML2-XPATH-TABLE-PARAMETERS"><p class="title"><strong>Table F.35. <code class="function">xpath_table</code> Parameters</strong></p><div class="table-contents"><table class="table" summary="xpath_table Parameters" border="1"><colgroup><col class="col1" /><col class="col2" /></colgroup><thead><tr><th>Parameter</th><th>Description</th></tr></thead><tbody><tr><td><em class="parameter"><code>key</code></em></td><td>
+ <p>
+ the name of the <span class="quote">“<span class="quote">key</span>”</span> field — this is just a field to be used as
+ the first column of the output table, i.e., it identifies the record from
+ which each output row came (see note below about multiple values)
+ </p>
+ </td></tr><tr><td><em class="parameter"><code>document</code></em></td><td>
+ <p>
+ the name of the field containing the XML document
+ </p>
+ </td></tr><tr><td><em class="parameter"><code>relation</code></em></td><td>
+ <p>
+ the name of the table or view containing the documents
+ </p>
+ </td></tr><tr><td><em class="parameter"><code>xpaths</code></em></td><td>
+ <p>
+ one or more XPath expressions, separated by <code class="literal">|</code>
+ </p>
+ </td></tr><tr><td><em class="parameter"><code>criteria</code></em></td><td>
+ <p>
+ the contents of the WHERE clause. This cannot be omitted, so use
+ <code class="literal">true</code> or <code class="literal">1=1</code> if you want to
+ process all the rows in the relation
+ </p>
+ </td></tr></tbody></table></div></div><br class="table-break" /><p>
+ These parameters (except the XPath strings) are just substituted
+ into a plain SQL SELECT statement, so you have some flexibility — the
+ statement is
+ </p><p>
+ <code class="literal">
+ SELECT &lt;key&gt;, &lt;document&gt; FROM &lt;relation&gt; WHERE &lt;criteria&gt;
+ </code>
+ </p><p>
+ so those parameters can be <span class="emphasis"><em>anything</em></span> valid in those particular
+ locations. The result from this SELECT needs to return exactly two
+ columns (which it will unless you try to list multiple fields for key
+ or document). Beware that this simplistic approach requires that you
+ validate any user-supplied values to avoid SQL injection attacks.
+ </p><p>
+ The function has to be used in a <code class="literal">FROM</code> expression, with an
+ <code class="literal">AS</code> clause to specify the output columns; for example
+</p><pre class="programlisting">
+SELECT * FROM
+xpath_table('article_id',
+ 'article_xml',
+ 'articles',
+ '/article/author|/article/pages|/article/title',
+ 'date_entered &gt; ''2003-01-01'' ')
+AS t(article_id integer, author text, page_count integer, title text);
+</pre><p>
+ The <code class="literal">AS</code> clause defines the names and types of the columns in the
+ output table. The first is the <span class="quote">“<span class="quote">key</span>”</span> field and the rest correspond
+ to the XPath queries.
+ If there are more XPath queries than result columns,
+ the extra queries will be ignored. If there are more result columns
+ than XPath queries, the extra columns will be NULL.
+ </p><p>
+ Notice that this example defines the <code class="structname">page_count</code> result
+ column as an integer. The function deals internally with string
+ representations, so when you say you want an integer in the output, it will
+ take the string representation of the XPath result and use PostgreSQL input
+ functions to transform it into an integer (or whatever type the <code class="type">AS</code>
+ clause requests). An error will result if it can't do this — for
+ example if the result is empty — so you may wish to just stick to
+ <code class="type">text</code> as the column type if you think your data has any problems.
+ </p><p>
+ The calling <code class="command">SELECT</code> statement doesn't necessarily have to be
+ just <code class="literal">SELECT *</code> — it can reference the output
+ columns by name or join them to other tables. The function produces a
+ virtual table with which you can perform any operation you wish (e.g.,
+ aggregation, joining, sorting etc). So we could also have:
+</p><pre class="programlisting">
+SELECT t.title, p.fullname, p.email
+FROM xpath_table('article_id', 'article_xml', 'articles',
+ '/article/title|/article/author/@id',
+ 'xpath_string(article_xml,''/article/@date'') &gt; ''2003-03-20'' ')
+ AS t(article_id integer, title text, author_id integer),
+ tblPeopleInfo AS p
+WHERE t.author_id = p.person_id;
+</pre><p>
+ as a more complicated example. Of course, you could wrap all
+ of this in a view for convenience.
+ </p><div class="sect3" id="id-1.11.7.54.6.12"><div class="titlepage"><div><div><h4 class="title">F.45.3.1. Multivalued Results</h4></div></div></div><p>
+ The <code class="function">xpath_table</code> function assumes that the results of each XPath query
+ might be multivalued, so the number of rows returned by the function
+ may not be the same as the number of input documents. The first row
+ returned contains the first result from each query, the second row the
+ second result from each query. If one of the queries has fewer values
+ than the others, null values will be returned instead.
+ </p><p>
+ In some cases, a user will know that a given XPath query will return
+ only a single result (perhaps a unique document identifier) — if used
+ alongside an XPath query returning multiple results, the single-valued
+ result will appear only on the first row of the result. The solution
+ to this is to use the key field as part of a join against a simpler
+ XPath query. As an example:
+
+</p><pre class="programlisting">
+CREATE TABLE test (
+ id int PRIMARY KEY,
+ xml text
+);
+
+INSERT INTO test VALUES (1, '&lt;doc num="C1"&gt;
+&lt;line num="L1"&gt;&lt;a&gt;1&lt;/a&gt;&lt;b&gt;2&lt;/b&gt;&lt;c&gt;3&lt;/c&gt;&lt;/line&gt;
+&lt;line num="L2"&gt;&lt;a&gt;11&lt;/a&gt;&lt;b&gt;22&lt;/b&gt;&lt;c&gt;33&lt;/c&gt;&lt;/line&gt;
+&lt;/doc&gt;');
+
+INSERT INTO test VALUES (2, '&lt;doc num="C2"&gt;
+&lt;line num="L1"&gt;&lt;a&gt;111&lt;/a&gt;&lt;b&gt;222&lt;/b&gt;&lt;c&gt;333&lt;/c&gt;&lt;/line&gt;
+&lt;line num="L2"&gt;&lt;a&gt;111&lt;/a&gt;&lt;b&gt;222&lt;/b&gt;&lt;c&gt;333&lt;/c&gt;&lt;/line&gt;
+&lt;/doc&gt;');
+
+SELECT * FROM
+ xpath_table('id','xml','test',
+ '/doc/@num|/doc/line/@num|/doc/line/a|/doc/line/b|/doc/line/c',
+ 'true')
+ AS t(id int, doc_num varchar(10), line_num varchar(10), val1 int, val2 int, val3 int)
+WHERE id = 1 ORDER BY doc_num, line_num
+
+ id | doc_num | line_num | val1 | val2 | val3
+----+---------+----------+------+------+------
+ 1 | C1 | L1 | 1 | 2 | 3
+ 1 | | L2 | 11 | 22 | 33
+</pre><p>
+ </p><p>
+ To get <code class="literal">doc_num</code> on every line, the solution is to use two invocations
+ of <code class="function">xpath_table</code> and join the results:
+
+</p><pre class="programlisting">
+SELECT t.*,i.doc_num FROM
+ xpath_table('id', 'xml', 'test',
+ '/doc/line/@num|/doc/line/a|/doc/line/b|/doc/line/c',
+ 'true')
+ AS t(id int, line_num varchar(10), val1 int, val2 int, val3 int),
+ xpath_table('id', 'xml', 'test', '/doc/@num', 'true')
+ AS i(id int, doc_num varchar(10))
+WHERE i.id=t.id AND i.id=1
+ORDER BY doc_num, line_num;
+
+ id | line_num | val1 | val2 | val3 | doc_num
+----+----------+------+------+------+---------
+ 1 | L1 | 1 | 2 | 3 | C1
+ 1 | L2 | 11 | 22 | 33 | C1
+(2 rows)
+</pre><p>
+ </p></div></div><div class="sect2" id="id-1.11.7.54.7"><div class="titlepage"><div><div><h3 class="title">F.45.4. XSLT Functions</h3></div></div></div><p>
+ The following functions are available if libxslt is installed:
+ </p><div class="sect3" id="id-1.11.7.54.7.3"><div class="titlepage"><div><div><h4 class="title">F.45.4.1. <code class="literal">xslt_process</code></h4></div></div></div><a id="id-1.11.7.54.7.3.2" class="indexterm"></a><pre class="synopsis">
+xslt_process(text document, text stylesheet, text paramlist) returns text
+</pre><p>
+ This function applies the XSL stylesheet to the document and returns
+ the transformed result. The <code class="literal">paramlist</code> is a list of parameter
+ assignments to be used in the transformation, specified in the form
+ <code class="literal">a=1,b=2</code>. Note that the
+ parameter parsing is very simple-minded: parameter values cannot
+ contain commas!
+ </p><p>
+ There is also a two-parameter version of <code class="function">xslt_process</code> which
+ does not pass any parameters to the transformation.
+ </p></div></div><div class="sect2" id="id-1.11.7.54.8"><div class="titlepage"><div><div><h3 class="title">F.45.5. Author</h3></div></div></div><p>
+ John Gray <code class="email">&lt;<a class="email" href="mailto:jgray@azuli.co.uk">jgray@azuli.co.uk</a>&gt;</code>
+ </p><p>
+ Development of this module was sponsored by Torchbox Ltd. (www.torchbox.com).
+ It has the same BSD license as PostgreSQL.
+ </p></div></div><div xmlns="http://www.w3.org/TR/xhtml1/transitional" class="navfooter"><hr></hr><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="uuid-ossp.html" title="F.44. uuid-ossp">Prev</a> </td><td width="20%" align="center"><a accesskey="u" href="contrib.html" title="Appendix F. Additional Supplied Modules">Up</a></td><td width="40%" align="right"> <a accesskey="n" href="contrib-prog.html" title="Appendix G. Additional Supplied Programs">Next</a></td></tr><tr><td width="40%" align="left" valign="top">F.44. uuid-ossp </td><td width="20%" align="center"><a accesskey="h" href="index.html" title="PostgreSQL 13.4 Documentation">Home</a></td><td width="40%" align="right" valign="top"> Appendix G. Additional Supplied Programs</td></tr></table></div></body></html> \ No newline at end of file