diff options
Diffstat (limited to 'docs/html/parser-grddl.html')
-rw-r--r-- | docs/html/parser-grddl.html | 102 |
1 files changed, 102 insertions, 0 deletions
diff --git a/docs/html/parser-grddl.html b/docs/html/parser-grddl.html new file mode 100644 index 0000000..cd429cf --- /dev/null +++ b/docs/html/parser-grddl.html @@ -0,0 +1,102 @@ +<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> +<html> +<head> +<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"> +<title>GRDDL parser (name grddl): Raptor RDF Syntax Library Manual</title> +<meta name="generator" content="DocBook XSL Stylesheets Vsnapshot"> +<link rel="home" href="index.html" title="Raptor RDF Syntax Library Manual"> +<link rel="up" href="raptor-parsers.html" title="Parsers in Raptor (syntax to triples)"> +<link rel="prev" href="raptor-parsers.html" title="Parsers in Raptor (syntax to triples)"> +<link rel="next" href="parser-guess.html" title="Guess parser (name guess)"> +<meta name="generator" content="GTK-Doc V1.33.1 (XML mode)"> +<link rel="stylesheet" href="style.css" type="text/css"> +</head> +<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"> +<table class="navigation" id="top" width="100%" summary="Navigation header" cellpadding="2" cellspacing="5"><tr valign="middle"> +<td width="100%" align="left" class="shortcuts"></td> +<td><a accesskey="h" href="index.html"><img src="home.png" width="16" height="16" border="0" alt="Home"></a></td> +<td><a accesskey="u" href="raptor-parsers.html"><img src="up.png" width="16" height="16" border="0" alt="Up"></a></td> +<td><a accesskey="p" href="raptor-parsers.html"><img src="left.png" width="16" height="16" border="0" alt="Prev"></a></td> +<td><a accesskey="n" href="parser-guess.html"><img src="right.png" width="16" height="16" border="0" alt="Next"></a></td> +</tr></table> +<div class="section"> +<div class="titlepage"><div><div><h2 class="title" style="clear: both"> +<a name="parser-grddl"></a>GRDDL parser (name <code class="literal">grddl</code>)</h2></div></div></div> +<p>A parser for the +<a class="ulink" href="http://www.w3.org/TR/2007/PR-grddl-20070716/" target="_top">Gleaning Resource Descriptions from Dialects of Languages (GRDDL)</a>, +W3C Proposed Recommendation of 2007-07-16 which allows reading XHTML +and XML as RDF triples by using profiles in the document that declare +XSLT transforms from the XHTML or XML content into RDF/XML or other +RDF syntax which can then be parsed.</p> +<p>The GRDDL parser is rather complex and different from the other +parsers in that it retrieves URIs, reads HTML documents (possibly +with errors), transforms the documents with XSLT and turns the result +into a single graph. The default configuration of the GRDDL parser +also reads microformats (hcard, hcalendar) and follows <link> +tags that point to RDF/XML. Parts of the GRDDL process can be +altered by configuration, which are describe below. +</p> +<p>The GRDDL parser defines 'base', 'Base' and 'url' XSLT parameters +with the value of the base URI to allow some XSLT sheets to work. These +set of parameters cannot be disabled. +</p> +<p>If the XSLT transform returns an empty string, no further +processing of the result is done, and a warning is generated. The +xsl:output method is mapped to result document mime types as follows: +'text' to text/plain; 'xml' to application/xml and 'html' to text/html. +Any result that is of type 'application/xml' or unknown mime type +is assumed to be RDF/XML. +</p> +<p>The URIs that are processed during GRDDL operations can be checked +and skipped if required using a handler set with the +<a class="link" href="raptor2-section-parser.html#raptor-parser-set-uri-filter" title="raptor_parser_set_uri_filter ()"><code class="function">raptor_parser_set_uri_filter()</code></a> +function. If the handler returns non-0, the URI is rejected. +This uses +<a class="link" href="raptor2-section-www.html#raptor-www-set-uri-filter" title="raptor_www_set_uri_filter ()"><code class="function">raptor_www_set_uri_filter()</code></a> +internally. +</p> +<p>If the value of option +<a class="link" href="raptor2-section-option.html#RAPTOR-OPTION-WWW-TIMEOUT:CAPS"><code class="literal">RAPTOR_OPTION_WWW_TIMEOUT</code></a> +if set to a number >0, it is used as the timeout in seconds +for retrieving of URIs during GRDDL processing. +This uses +<a class="link" href="raptor2-section-www.html#raptor-www-set-connection-timeout" title="raptor_www_set_connection_timeout ()"><code class="function">raptor_www_set_connection_timeout()</code></a> +internally. +</p> +<p>The hardcoded support for hcard and hcalendar +microformats can be disabled by setting parser option +<a class="link" href="raptor2-section-option.html#RAPTOR-OPTION-MICROFORMATS:CAPS"><code class="literal">RAPTOR_OPTION_MICROFORMATS</code></a> +to 0 +or using +<a class="link" href="raptor2-section-parser.html#raptor-parser-set-option" title="raptor_parser_set_option ()"><code class="function">raptor_parser_set_option()</code></a> +with option +<a class="link" href="raptor2-section-option.html#RAPTOR-OPTION-STRICT:CAPS"><code class="literal">RAPTOR_OPTION_STRICT</code></a> +and a boolean value of 1. +</p> +<p>The GRDDL parser by default will try an XML parser on the +content followed by a lax HTML parser. This can be disabled by +setting parser option +<a class="link" href="raptor2-section-option.html#RAPTOR-OPTION-HTML-TAG-SOUP:CAPS"><code class="literal">RAPTOR_OPTION_HTML_TAG_SOUP</code></a> +to 0 +or using +<a class="link" href="raptor2-section-parser.html#raptor-parser-set-option" title="raptor_parser_set_option ()"><code class="function">raptor_parser_set_option()</code></a> +with option +<a class="link" href="raptor2-section-option.html#RAPTOR-OPTION-STRICT:CAPS"><code class="literal">RAPTOR_OPTION_STRICT</code></a> +and a boolean value of 1. +</p> +<p>The GRDDL parser by default will try to look for an HTML +<link> tag that points to RDF/XML. This can be disabled by +setting parser option +<a class="link" href="raptor2-section-option.html#RAPTOR-OPTION-HTML-LINK:CAPS"><code class="literal">RAPTOR_OPTION_HTML_LINK</code></a> +to 0 +or using +<a class="link" href="raptor2-section-parser.html#raptor-parser-set-option" title="raptor_parser_set_option ()"><code class="function">raptor_parser_set_option()</code></a> +with option +<a class="link" href="raptor2-section-option.html#RAPTOR-OPTION-STRICT:CAPS"><code class="literal">RAPTOR_OPTION_STRICT</code></a> +and a boolean value of 1. +</p> +</div> +<div class="footer"> +<hr>Generated by GTK-Doc V1.33.1</div> +</body> +</html>
\ No newline at end of file |