summaryrefslogtreecommitdiffstats
path: root/docs/raptor-tutorial-serializing.xml
diff options
context:
space:
mode:
Diffstat (limited to '')
-rw-r--r--docs/raptor-tutorial-serializing.xml402
1 files changed, 402 insertions, 0 deletions
diff --git a/docs/raptor-tutorial-serializing.xml b/docs/raptor-tutorial-serializing.xml
new file mode 100644
index 0000000..f27b7f5
--- /dev/null
+++ b/docs/raptor-tutorial-serializing.xml
@@ -0,0 +1,402 @@
+<!DOCTYPE refentry PUBLIC "-//OASIS//DTD DocBook XML V4.3//EN"
+ "http://www.oasis-open.org/docbook/xml/4.3/docbookx.dtd">
+<chapter id="tutorial-serializing" xmlns:xi="http://www.w3.org/2003/XInclude">
+<title>Serializing RDF triples to a syntax</title>
+
+<section id="tutorial-serializing-intro">
+<title>Introduction</title>
+
+<para>
+The typical sequence of operations to serialize is to create a
+serializer object, set various callback and features, start the
+serializing, send some RDF triples to the serializer object,
+finish the serializing and destroy the serializer object.
+</para>
+
+</section>
+
+
+<section id="tutorial-serializer-create">
+<title>Create the Serializer object</title>
+
+<para>The serializer can be created directly from a known name using
+<link linkend="raptor-new-serializer"><function>raptor_new_serializer()</function></link>
+such as <literal>rdfxml</literal> for the W3C Recommendation RDF/XML syntax:
+<programlisting>
+ raptor_serializer* rdf_serializer;
+
+ rdf_serializer = raptor_new_serializer(world, "rdfxml");
+</programlisting>
+or the name can be discovered from an <emphasis>description</emphasis>
+as discussed in
+<link linkend="tutorial-querying-functionality">Querying Functionality</link>
+</para>
+</section>
+
+
+<section id="tutorial-serializer-features">
+<title>Serializer options</title>
+
+<para>There are several options that can be set on serializers.
+The exact list of options can be found at run time via the
+<link linkend="tutorial-querying-functionality">Querying Functionality</link>
+or in the API reference for
+<link linkend="raptor-option"><literal>raptor_option</literal></link>.
+</para>
+
+<para>Options are integer enumerations of the
+<link linkend="raptor-option"><type>raptor_option</type></link> enum and have
+values that are either booleans, integers or strings.
+The function that sets options for serializers is:
+<link linkend="raptor-serializer-set-option">raptor_serializer_set_option()</link>
+used as follows:
+<programlisting>
+ /* Set a boolean or integer valued option to value 1 */
+ raptor_serializer_set_option(rdf_serializer, option, NULL, 1);
+
+ /* Set a string valued option to value "abc" */
+ raptor_serializer_set_option(rdf_serializer, option, "abc", -1);
+</programlisting>
+</para>
+
+<para>
+There is a corresponding function for reading the values of serializer
+option
+<link linkend="raptor-serializer-get-option"><function>raptor_serializer_get_option()</function></link>
+which takes the option enumeration parameter and returns the boolean /
+integer or string value correspondingly into the appropriate pointer
+argument.
+<programlisting>
+ /* Get a boolean or integer option value */
+ int int_var;
+ raptor_serializer_get_option(rdf_serializer, option, NULL, &amp;int_var);
+
+ /* Get a string option value */
+ char* string_var;
+ raptor_serializer_get_option(rdf_serializer, option, &amp;string_var, NULL);
+</programlisting>
+</para>
+
+</section>
+
+
+<section id="tutorial-serializer-declare-namespace">
+<title>Declare namespaces</title>
+
+<para>Raptor can use namespace prefix/URIs to abbreviate syntax
+in some syntaxes such as Turtle or any XML syntax including RDF/XML,
+RSS1.0 and Atom 1.0. These are declared
+with <link linkend="raptor-serializer-set-namespace"><function>raptor_serializer_set_namespace()</function></link>
+using a prefix and URI argument pair like this:
+<programlisting>
+ const unsigned char* prefix = "ex";
+ raptor_uri* uri = raptor_new_uri(world, "http://example.org");
+
+ raptor_serializer_set_namespace(rdf_serializer, prefix, uri);
+</programlisting>
+</para>
+
+<para>or
+<link linkend="raptor-serializer-set-namespace-from-namespace"><function>raptor_serializer_set_namespace_from_namespace()</function></link>
+from an existing namespace. This can be useful when connected up the
+the namespace declarations that are generated from a parser via a
+namespace handler set with
+<link linkend="raptor-parser-set-namespace-handler"><function>raptor_parser_set_namespace_handler()</function></link>
+</para>
+like this:
+<programlisting>
+ static void
+ relay_namespaces(void* user_data, raptor_namespace *nspace)
+ {
+ raptor_serializer_set_namespace_from_namespace(rdf_serializer, nspace);
+ }
+
+ ...
+
+ rdf_parser = raptor_new_parser(world, syntax_name);
+ raptor_parser_set_namespace_handler(rdf_parser, rdf_serializer, relay_namespaces);
+</programlisting>
+
+</section>
+
+
+<section id="tutorial-serializer-set-error-warning-handlers">
+<title>Set error and warning handlers</title>
+
+<para>Any time before serializing is started, a log handler can be set
+on the world object via the
+<link linkend="raptor-world-set-log-handler"><function>raptor_world_set_log_handler()</function></link>
+method to report errors and warnings from parsing.
+The method takes a user data argument plus a handler callback of type
+<link linkend="raptor-log-handler"><type>raptor_log_handler</type></link>
+with a signature that looks like this:
+<programlisting>
+void
+message_handler(void *user_data, raptor_log_message* message)
+{
+ /* do something with the message */
+}
+</programlisting>
+The handler gets the user data pointer as well as a
+<link linkend="raptor-log-message"><type>raptor_log_handler</type></link>
+pointer that includes associated location information, such as the
+log level,
+<link linkend="raptor-locator"><type>raptor_locator</type></link>,
+and the log message itself. The <emphasis>locator</emphasis>
+structure contains full information on the details of where in the
+file or URI the message occurred.
+</para>
+
+</section>
+
+
+<section id="tutorial-serializer-to-destination">
+<title>Provide a destination for the serialized syntax</title>
+
+<para>The operation of turning RDF triples into a syntax has several
+alternatives from functions that do most of the work writing to a file
+or string to functions that allow passing in a
+<link linkend="raptor-iostream"><type>raptor_iostream</type></link>
+which can be entirely user-constructed.</para>
+
+<section id="serialize-to-filename">
+<title>Serialize to a filename (<link linkend="raptor-serializer-start-to-filename"><function>raptor_serializer_start_to_filename()</function></link>)</title>
+
+<para>Serialize to a new filename
+(using <link linkend="raptor-new-iostream-to-filename"><function>raptor_new_iostream_to_filename()</function></link> internally)
+and uses asf base URI, the file's URI.
+<programlisting>
+ const char *filename = "raptor.rdf";
+ raptor_serializer_start_to_filename(rdf_serializer, filename);
+</programlisting>
+</para>
+</section>
+
+<section id="serialize-to-string">
+<title>Serialize to a string (<link linkend="raptor-serializer-start-to-string"><function>raptor_serializer_start_to_string()</function></link>)</title>
+
+<para>Serialize to a string that is allocated by the serializer
+(using <link linkend="raptor-new-iostream-to-string"><function>raptor_new_iostream_to_string()</function></link> internally). The
+resulting string is only constructed after <link linkend="raptor-serializer-serialize-end"><function>raptor_serializer_serialize_end()</function></link> is called and at that
+point it is assigned to the string pointer passed in, with the length
+written to the optional length pointer. This function
+takes an optional base URI but may be required by some serializers.
+<programlisting>
+ raptor_uri* uri = raptor_new_uri(world, "http://example.org/base");
+ void *string; /* destination for string */
+ size_t length; /* length of constructed string */
+ raptor_serializer* rdf_serializer = /* serializer created by some means */ ;
+
+ raptor_serializer_start_to_string(rdf_serializer, uri,
+ &amp;string, &amp;length);
+</programlisting>
+</para>
+
+</section>
+
+
+<section id="serialize-to-filehandle">
+<title>Serialize to a FILE* file handle (<link linkend="raptor-serializer-start-to-file-handle"><function>raptor_serializer_start_to_file_handle()</function></link>)</title>
+
+<para>Serialize to an existing open C FILE* file handle
+(using <link linkend="raptor-new-iostream-to-file-handle"><function>raptor_new_iostream_to_file_handle()</function></link> internally). The handle is not closed after serializing is finished. This function
+takes an optional base URI but may be required by some serializers.
+<programlisting>
+ raptor_uri* uri = raptor_new_uri(world, "http://example.org/base");
+ FILE* fh = fopen("raptor.rdf", "wb");
+ raptor_serializer* rdf_serializer = /* serializer created by some means */ ;
+
+ raptor_serializer_start_to_file_handle(rdf_serializer, uri, fh);
+</programlisting>
+</para>
+</section>
+
+<section id="serialize-to-iostream">
+<title>Serialize to an <link linkend="raptor-iostream"><type>raptor_iostream</type></link> (<link linkend="raptor-serializer-start-to-iostream"><function>raptor_serializer_start_to_iostream()</function></link>)</title>
+
+<para>This is the most flexible serializing method as it allows
+writing to any
+<link linkend="raptor-iostream"><type>raptor_iostream</type></link>
+which can be constructed to build any form of user-generated structure
+via callbacks. The iostream remains owned by the caller that can continue
+to write to it after the serializing is finished (after
+<link linkend="raptor-serializer-serialize-end"><function>raptor_serializer_serialize_end()</function></link>) is called).
+<programlisting>
+ raptor_uri* uri = raptor_new_uri(world, "http://example.org/base");
+ raptor_iostream* iostream = /* iostream initialized by some means */ ;
+ raptor_serializer* rdf_serializer = /* serializer created by some means */ ;
+
+ raptor_serializer_start_to_iostream(rdf_serializer, uri, iostream);
+
+ while( /* got RDF triples */ ) {
+ raptor_statement* triple = /* ... triple made from somewhere ... */ ;
+ raptor_serializer_serialize_statement(rdf_serializer, triple);
+ }
+ raptor_serializer_serialize_end(rdf_serializer);
+
+ raptor_free_serializer(rdf_serializer);
+
+ /* ... write other stuff to iostream ... */
+
+ raptor_free_iostream(iostream);
+</programlisting>
+</para>
+</section>
+
+
+</section>
+
+
+<section id="tutorial-serializer-get-triples">
+<title>Get or construct RDF Statements (Triples)</title>
+<para>
+An <link linkend="raptor-statement"><type>raptor_statement</type></link>
+containing the triple terms and optional graph term
+can be made either by receiving them from a
+<link linkend="raptor-parser"><type>raptor_parser</type></link>
+via parsing or can be constructed by hand.</para>
+
+<para>When constructing by hand,
+the <link linkend="raptor-statement"><type>raptor_statement</type></link>
+structure should be allocated by the application and the fields
+filled in. Each statement has three triple terms (subject,
+predicate, object) and an optional graph term. The subject can be a
+URI or blank node, the predicate can only be a URI and the object can
+be a URI, blank node or RDF literal. RDF literals can have either
+just a Unicode string, a Unicode string and a language or a Unicode
+string and a datatype URI.</para>
+
+<para>The statement terms are all instances of
+<link linkend="raptor-term"><type>raptor_term</type></link>
+objects constructed with the appropriate constructor for
+the URI, blank node or rdf literal types. The graph term
+of the statement is typically a URI or blank node.
+</para>
+
+<example id="raptor-example-rdfserialize">
+<title><filename>rdfserialize.c</filename>: Serialize 1 triple to RDF/XML (Abbreviated)</title>
+<programlisting>
+<xi:include href="rdfserialize.c" parse="text"/>
+</programlisting>
+
+<para>Compile it like this:
+<screen>
+$ gcc -o rdfserialize rdfserialize.c `pkg-config raptor2 --cflags --libs`
+</screen>
+and run it with an optional base URI argument
+<screen>
+$ ./rdfserialize
+&lt;?xml version="1.0" encoding="utf-8"?&gt;
+&lt;rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"&gt;
+ &lt;rdf:Description rdf:about="http://example.org/subject"&gt;
+ &lt;ns0:predicate xmlns:ns0="http://example.org/" xml:lang="en"&gt;An example&lt;/ns0:predicate&gt;
+ &lt;/rdf:Description&gt;
+&lt;/rdf:RDF&gt;
+</screen>
+</para>
+
+</example>
+
+</section>
+
+<section id="tutorial-serializer-send-triples">
+<title>Send RDF Triples to serializer</title>
+
+<para>
+Once the serializer has been started, RDF triples can be sent to it
+via the
+<link linkend="raptor-serializer-serialize-statement"><function>raptor_serializer_serialize_statement()</function></link>
+function with a
+<link linkend="raptor-statement"><type>raptor_statement</type></link>
+value.
+</para>
+
+<para>Once all triples are sent, the serializing must be finished
+with a call to
+<link linkend="raptor-serializer-serialize-end"><function>raptor_serializer_serialize_end()</function></link>.
+In particular, only at this point does the
+<link linkend="raptor-iostream"><type>raptor_iostream</type></link>
+get flushed or any string constructed for
+<link linkend="raptor-serializer-start-to-string"><function>raptor_serializer_start_to_string()</function></link>.
+<programlisting>
+ /* start the serializing somehow */
+ while( /* got RDF triples */ ) {
+ raptor_serializer_serialize_statement(rdf_serializer, triple);
+ }
+ raptor_serializer_serialize_end(rdf_serializer);
+ /* now can use the serializing result (FILE, string, raptor_iostream) */
+</programlisting>
+
+</para>
+</section>
+
+
+<section id="tutorial-serializer-runtime-info">
+<title>Querying serializer run-time information</title>
+
+<para>
+<link linkend="raptor-serializer-get-iostream"><function>raptor_serializer_get_iostream()</function></link>
+gets the current serializer's raptor_iostream.
+</para>
+
+<para>
+<link linkend="raptor-serializer-get-locator"><function>raptor_serializer_get_locator()</function></link>
+returns the <link linkend="raptor-locator"><type>raptor_locator</type></link>
+for the current position in the output stream. The <emphasis>locator</emphasis>
+structure contains full information on the details of where in the
+file or URI the current serializer has reached.
+</para>
+</section>
+
+
+<section id="tutorial-serializer-destroy">
+<title>Destroy the serializer</title>
+
+<para>
+To tidy up, delete the serializer object as follows:
+<programlisting>
+ raptor_free_serializer(rdf_serializer);
+</programlisting>
+</para>
+
+</section>
+
+<section id="tutorial-serializer-example">
+<title>Serializing example code</title>
+
+<example id="raptor-example-rdfcat">
+<title><filename>rdfcat.c</filename>: Read any RDF syntax and serialize to RDF/XML (Abbreviated)</title>
+<programlisting>
+<xi:include href="rdfcat.c" parse="text"/>
+</programlisting>
+
+<para>Compile it like this:
+<screen>
+$ gcc -o rdfcat rdfcat.c `pkg-config raptor2 --cflags --libs`
+</screen>
+and run it on an RDF file as:
+<screen>
+$ ./rdfcat raptor.rdf
+&lt;?xml version="1.0" encoding="utf-8"?&gt;
+&lt;rdf:RDF xmlns:foaf="http://xmlns.com/foaf/0.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://usefulinc.com/ns/doap#"&gt;
+ &lt;rdf:Description rdf:about=""&gt;
+ &lt;foaf:maker&gt;
+ &lt;foaf:Person&gt;
+ &lt;foaf:name&gt;Dave Beckett&lt;/foaf:name&gt;
+...
+</screen>
+</para>
+
+</example>
+
+</section>
+
+</chapter>
+
+
+<!--
+Local variables:
+mode: sgml
+sgml-parent-document: ("raptor-docs.xml" "book" "part")
+End:
+-->