summaryrefslogtreecommitdiffstats
path: root/docs/html/parser-grddl.html
diff options
context:
space:
mode:
Diffstat (limited to 'docs/html/parser-grddl.html')
-rw-r--r--docs/html/parser-grddl.html102
1 files changed, 102 insertions, 0 deletions
diff --git a/docs/html/parser-grddl.html b/docs/html/parser-grddl.html
new file mode 100644
index 0000000..cd429cf
--- /dev/null
+++ b/docs/html/parser-grddl.html
@@ -0,0 +1,102 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
+<html>
+<head>
+<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
+<title>GRDDL parser (name grddl): Raptor RDF Syntax Library Manual</title>
+<meta name="generator" content="DocBook XSL Stylesheets Vsnapshot">
+<link rel="home" href="index.html" title="Raptor RDF Syntax Library Manual">
+<link rel="up" href="raptor-parsers.html" title="Parsers in Raptor (syntax to triples)">
+<link rel="prev" href="raptor-parsers.html" title="Parsers in Raptor (syntax to triples)">
+<link rel="next" href="parser-guess.html" title="Guess parser (name guess)">
+<meta name="generator" content="GTK-Doc V1.33.1 (XML mode)">
+<link rel="stylesheet" href="style.css" type="text/css">
+</head>
+<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF">
+<table class="navigation" id="top" width="100%" summary="Navigation header" cellpadding="2" cellspacing="5"><tr valign="middle">
+<td width="100%" align="left" class="shortcuts"></td>
+<td><a accesskey="h" href="index.html"><img src="home.png" width="16" height="16" border="0" alt="Home"></a></td>
+<td><a accesskey="u" href="raptor-parsers.html"><img src="up.png" width="16" height="16" border="0" alt="Up"></a></td>
+<td><a accesskey="p" href="raptor-parsers.html"><img src="left.png" width="16" height="16" border="0" alt="Prev"></a></td>
+<td><a accesskey="n" href="parser-guess.html"><img src="right.png" width="16" height="16" border="0" alt="Next"></a></td>
+</tr></table>
+<div class="section">
+<div class="titlepage"><div><div><h2 class="title" style="clear: both">
+<a name="parser-grddl"></a>GRDDL parser (name <code class="literal">grddl</code>)</h2></div></div></div>
+<p>A parser for the
+<a class="ulink" href="http://www.w3.org/TR/2007/PR-grddl-20070716/" target="_top">Gleaning Resource Descriptions from Dialects of Languages (GRDDL)</a>,
+W3C Proposed Recommendation of 2007-07-16 which allows reading XHTML
+and XML as RDF triples by using profiles in the document that declare
+XSLT transforms from the XHTML or XML content into RDF/XML or other
+RDF syntax which can then be parsed.</p>
+<p>The GRDDL parser is rather complex and different from the other
+parsers in that it retrieves URIs, reads HTML documents (possibly
+with errors), transforms the documents with XSLT and turns the result
+into a single graph. The default configuration of the GRDDL parser
+also reads microformats (hcard, hcalendar) and follows &lt;link&gt;
+tags that point to RDF/XML. Parts of the GRDDL process can be
+altered by configuration, which are describe below.
+</p>
+<p>The GRDDL parser defines 'base', 'Base' and 'url' XSLT parameters
+with the value of the base URI to allow some XSLT sheets to work. These
+set of parameters cannot be disabled.
+</p>
+<p>If the XSLT transform returns an empty string, no further
+processing of the result is done, and a warning is generated. The
+xsl:output method is mapped to result document mime types as follows:
+'text' to text/plain; 'xml' to application/xml and 'html' to text/html.
+Any result that is of type 'application/xml' or unknown mime type
+is assumed to be RDF/XML.
+</p>
+<p>The URIs that are processed during GRDDL operations can be checked
+and skipped if required using a handler set with the
+<a class="link" href="raptor2-section-parser.html#raptor-parser-set-uri-filter" title="raptor_parser_set_uri_filter ()"><code class="function">raptor_parser_set_uri_filter()</code></a>
+function. If the handler returns non-0, the URI is rejected.
+This uses
+<a class="link" href="raptor2-section-www.html#raptor-www-set-uri-filter" title="raptor_www_set_uri_filter ()"><code class="function">raptor_www_set_uri_filter()</code></a>
+internally.
+</p>
+<p>If the value of option
+<a class="link" href="raptor2-section-option.html#RAPTOR-OPTION-WWW-TIMEOUT:CAPS"><code class="literal">RAPTOR_OPTION_WWW_TIMEOUT</code></a>
+if set to a number &gt;0, it is used as the timeout in seconds
+for retrieving of URIs during GRDDL processing.
+This uses
+<a class="link" href="raptor2-section-www.html#raptor-www-set-connection-timeout" title="raptor_www_set_connection_timeout ()"><code class="function">raptor_www_set_connection_timeout()</code></a>
+internally.
+</p>
+<p>The hardcoded support for hcard and hcalendar
+microformats can be disabled by setting parser option
+<a class="link" href="raptor2-section-option.html#RAPTOR-OPTION-MICROFORMATS:CAPS"><code class="literal">RAPTOR_OPTION_MICROFORMATS</code></a>
+to 0
+or using
+<a class="link" href="raptor2-section-parser.html#raptor-parser-set-option" title="raptor_parser_set_option ()"><code class="function">raptor_parser_set_option()</code></a>
+with option
+<a class="link" href="raptor2-section-option.html#RAPTOR-OPTION-STRICT:CAPS"><code class="literal">RAPTOR_OPTION_STRICT</code></a>
+and a boolean value of 1.
+</p>
+<p>The GRDDL parser by default will try an XML parser on the
+content followed by a lax HTML parser. This can be disabled by
+setting parser option
+<a class="link" href="raptor2-section-option.html#RAPTOR-OPTION-HTML-TAG-SOUP:CAPS"><code class="literal">RAPTOR_OPTION_HTML_TAG_SOUP</code></a>
+to 0
+or using
+<a class="link" href="raptor2-section-parser.html#raptor-parser-set-option" title="raptor_parser_set_option ()"><code class="function">raptor_parser_set_option()</code></a>
+with option
+<a class="link" href="raptor2-section-option.html#RAPTOR-OPTION-STRICT:CAPS"><code class="literal">RAPTOR_OPTION_STRICT</code></a>
+and a boolean value of 1.
+</p>
+<p>The GRDDL parser by default will try to look for an HTML
+&lt;link&gt; tag that points to RDF/XML. This can be disabled by
+setting parser option
+<a class="link" href="raptor2-section-option.html#RAPTOR-OPTION-HTML-LINK:CAPS"><code class="literal">RAPTOR_OPTION_HTML_LINK</code></a>
+to 0
+or using
+<a class="link" href="raptor2-section-parser.html#raptor-parser-set-option" title="raptor_parser_set_option ()"><code class="function">raptor_parser_set_option()</code></a>
+with option
+<a class="link" href="raptor2-section-option.html#RAPTOR-OPTION-STRICT:CAPS"><code class="literal">RAPTOR_OPTION_STRICT</code></a>
+and a boolean value of 1.
+</p>
+</div>
+<div class="footer">
+<hr>Generated by GTK-Doc V1.33.1</div>
+</body>
+</html> \ No newline at end of file