diff options
Diffstat (limited to 'docs/raptor-tutorial-serializing.xml')
-rw-r--r-- | docs/raptor-tutorial-serializing.xml | 402 |
1 files changed, 402 insertions, 0 deletions
diff --git a/docs/raptor-tutorial-serializing.xml b/docs/raptor-tutorial-serializing.xml new file mode 100644 index 0000000..f27b7f5 --- /dev/null +++ b/docs/raptor-tutorial-serializing.xml @@ -0,0 +1,402 @@ +<!DOCTYPE refentry PUBLIC "-//OASIS//DTD DocBook XML V4.3//EN" + "http://www.oasis-open.org/docbook/xml/4.3/docbookx.dtd"> +<chapter id="tutorial-serializing" xmlns:xi="http://www.w3.org/2003/XInclude"> +<title>Serializing RDF triples to a syntax</title> + +<section id="tutorial-serializing-intro"> +<title>Introduction</title> + +<para> +The typical sequence of operations to serialize is to create a +serializer object, set various callback and features, start the +serializing, send some RDF triples to the serializer object, +finish the serializing and destroy the serializer object. +</para> + +</section> + + +<section id="tutorial-serializer-create"> +<title>Create the Serializer object</title> + +<para>The serializer can be created directly from a known name using +<link linkend="raptor-new-serializer"><function>raptor_new_serializer()</function></link> +such as <literal>rdfxml</literal> for the W3C Recommendation RDF/XML syntax: +<programlisting> + raptor_serializer* rdf_serializer; + + rdf_serializer = raptor_new_serializer(world, "rdfxml"); +</programlisting> +or the name can be discovered from an <emphasis>description</emphasis> +as discussed in +<link linkend="tutorial-querying-functionality">Querying Functionality</link> +</para> +</section> + + +<section id="tutorial-serializer-features"> +<title>Serializer options</title> + +<para>There are several options that can be set on serializers. +The exact list of options can be found at run time via the +<link linkend="tutorial-querying-functionality">Querying Functionality</link> +or in the API reference for +<link linkend="raptor-option"><literal>raptor_option</literal></link>. +</para> + +<para>Options are integer enumerations of the +<link linkend="raptor-option"><type>raptor_option</type></link> enum and have +values that are either booleans, integers or strings. +The function that sets options for serializers is: +<link linkend="raptor-serializer-set-option">raptor_serializer_set_option()</link> +used as follows: +<programlisting> + /* Set a boolean or integer valued option to value 1 */ + raptor_serializer_set_option(rdf_serializer, option, NULL, 1); + + /* Set a string valued option to value "abc" */ + raptor_serializer_set_option(rdf_serializer, option, "abc", -1); +</programlisting> +</para> + +<para> +There is a corresponding function for reading the values of serializer +option +<link linkend="raptor-serializer-get-option"><function>raptor_serializer_get_option()</function></link> +which takes the option enumeration parameter and returns the boolean / +integer or string value correspondingly into the appropriate pointer +argument. +<programlisting> + /* Get a boolean or integer option value */ + int int_var; + raptor_serializer_get_option(rdf_serializer, option, NULL, &int_var); + + /* Get a string option value */ + char* string_var; + raptor_serializer_get_option(rdf_serializer, option, &string_var, NULL); +</programlisting> +</para> + +</section> + + +<section id="tutorial-serializer-declare-namespace"> +<title>Declare namespaces</title> + +<para>Raptor can use namespace prefix/URIs to abbreviate syntax +in some syntaxes such as Turtle or any XML syntax including RDF/XML, +RSS1.0 and Atom 1.0. These are declared +with <link linkend="raptor-serializer-set-namespace"><function>raptor_serializer_set_namespace()</function></link> +using a prefix and URI argument pair like this: +<programlisting> + const unsigned char* prefix = "ex"; + raptor_uri* uri = raptor_new_uri(world, "http://example.org"); + + raptor_serializer_set_namespace(rdf_serializer, prefix, uri); +</programlisting> +</para> + +<para>or +<link linkend="raptor-serializer-set-namespace-from-namespace"><function>raptor_serializer_set_namespace_from_namespace()</function></link> +from an existing namespace. This can be useful when connected up the +the namespace declarations that are generated from a parser via a +namespace handler set with +<link linkend="raptor-parser-set-namespace-handler"><function>raptor_parser_set_namespace_handler()</function></link> +</para> +like this: +<programlisting> + static void + relay_namespaces(void* user_data, raptor_namespace *nspace) + { + raptor_serializer_set_namespace_from_namespace(rdf_serializer, nspace); + } + + ... + + rdf_parser = raptor_new_parser(world, syntax_name); + raptor_parser_set_namespace_handler(rdf_parser, rdf_serializer, relay_namespaces); +</programlisting> + +</section> + + +<section id="tutorial-serializer-set-error-warning-handlers"> +<title>Set error and warning handlers</title> + +<para>Any time before serializing is started, a log handler can be set +on the world object via the +<link linkend="raptor-world-set-log-handler"><function>raptor_world_set_log_handler()</function></link> +method to report errors and warnings from parsing. +The method takes a user data argument plus a handler callback of type +<link linkend="raptor-log-handler"><type>raptor_log_handler</type></link> +with a signature that looks like this: +<programlisting> +void +message_handler(void *user_data, raptor_log_message* message) +{ + /* do something with the message */ +} +</programlisting> +The handler gets the user data pointer as well as a +<link linkend="raptor-log-message"><type>raptor_log_handler</type></link> +pointer that includes associated location information, such as the +log level, +<link linkend="raptor-locator"><type>raptor_locator</type></link>, +and the log message itself. The <emphasis>locator</emphasis> +structure contains full information on the details of where in the +file or URI the message occurred. +</para> + +</section> + + +<section id="tutorial-serializer-to-destination"> +<title>Provide a destination for the serialized syntax</title> + +<para>The operation of turning RDF triples into a syntax has several +alternatives from functions that do most of the work writing to a file +or string to functions that allow passing in a +<link linkend="raptor-iostream"><type>raptor_iostream</type></link> +which can be entirely user-constructed.</para> + +<section id="serialize-to-filename"> +<title>Serialize to a filename (<link linkend="raptor-serializer-start-to-filename"><function>raptor_serializer_start_to_filename()</function></link>)</title> + +<para>Serialize to a new filename +(using <link linkend="raptor-new-iostream-to-filename"><function>raptor_new_iostream_to_filename()</function></link> internally) +and uses asf base URI, the file's URI. +<programlisting> + const char *filename = "raptor.rdf"; + raptor_serializer_start_to_filename(rdf_serializer, filename); +</programlisting> +</para> +</section> + +<section id="serialize-to-string"> +<title>Serialize to a string (<link linkend="raptor-serializer-start-to-string"><function>raptor_serializer_start_to_string()</function></link>)</title> + +<para>Serialize to a string that is allocated by the serializer +(using <link linkend="raptor-new-iostream-to-string"><function>raptor_new_iostream_to_string()</function></link> internally). The +resulting string is only constructed after <link linkend="raptor-serializer-serialize-end"><function>raptor_serializer_serialize_end()</function></link> is called and at that +point it is assigned to the string pointer passed in, with the length +written to the optional length pointer. This function +takes an optional base URI but may be required by some serializers. +<programlisting> + raptor_uri* uri = raptor_new_uri(world, "http://example.org/base"); + void *string; /* destination for string */ + size_t length; /* length of constructed string */ + raptor_serializer* rdf_serializer = /* serializer created by some means */ ; + + raptor_serializer_start_to_string(rdf_serializer, uri, + &string, &length); +</programlisting> +</para> + +</section> + + +<section id="serialize-to-filehandle"> +<title>Serialize to a FILE* file handle (<link linkend="raptor-serializer-start-to-file-handle"><function>raptor_serializer_start_to_file_handle()</function></link>)</title> + +<para>Serialize to an existing open C FILE* file handle +(using <link linkend="raptor-new-iostream-to-file-handle"><function>raptor_new_iostream_to_file_handle()</function></link> internally). The handle is not closed after serializing is finished. This function +takes an optional base URI but may be required by some serializers. +<programlisting> + raptor_uri* uri = raptor_new_uri(world, "http://example.org/base"); + FILE* fh = fopen("raptor.rdf", "wb"); + raptor_serializer* rdf_serializer = /* serializer created by some means */ ; + + raptor_serializer_start_to_file_handle(rdf_serializer, uri, fh); +</programlisting> +</para> +</section> + +<section id="serialize-to-iostream"> +<title>Serialize to an <link linkend="raptor-iostream"><type>raptor_iostream</type></link> (<link linkend="raptor-serializer-start-to-iostream"><function>raptor_serializer_start_to_iostream()</function></link>)</title> + +<para>This is the most flexible serializing method as it allows +writing to any +<link linkend="raptor-iostream"><type>raptor_iostream</type></link> +which can be constructed to build any form of user-generated structure +via callbacks. The iostream remains owned by the caller that can continue +to write to it after the serializing is finished (after +<link linkend="raptor-serializer-serialize-end"><function>raptor_serializer_serialize_end()</function></link>) is called). +<programlisting> + raptor_uri* uri = raptor_new_uri(world, "http://example.org/base"); + raptor_iostream* iostream = /* iostream initialized by some means */ ; + raptor_serializer* rdf_serializer = /* serializer created by some means */ ; + + raptor_serializer_start_to_iostream(rdf_serializer, uri, iostream); + + while( /* got RDF triples */ ) { + raptor_statement* triple = /* ... triple made from somewhere ... */ ; + raptor_serializer_serialize_statement(rdf_serializer, triple); + } + raptor_serializer_serialize_end(rdf_serializer); + + raptor_free_serializer(rdf_serializer); + + /* ... write other stuff to iostream ... */ + + raptor_free_iostream(iostream); +</programlisting> +</para> +</section> + + +</section> + + +<section id="tutorial-serializer-get-triples"> +<title>Get or construct RDF Statements (Triples)</title> +<para> +An <link linkend="raptor-statement"><type>raptor_statement</type></link> +containing the triple terms and optional graph term +can be made either by receiving them from a +<link linkend="raptor-parser"><type>raptor_parser</type></link> +via parsing or can be constructed by hand.</para> + +<para>When constructing by hand, +the <link linkend="raptor-statement"><type>raptor_statement</type></link> +structure should be allocated by the application and the fields +filled in. Each statement has three triple terms (subject, +predicate, object) and an optional graph term. The subject can be a +URI or blank node, the predicate can only be a URI and the object can +be a URI, blank node or RDF literal. RDF literals can have either +just a Unicode string, a Unicode string and a language or a Unicode +string and a datatype URI.</para> + +<para>The statement terms are all instances of +<link linkend="raptor-term"><type>raptor_term</type></link> +objects constructed with the appropriate constructor for +the URI, blank node or rdf literal types. The graph term +of the statement is typically a URI or blank node. +</para> + +<example id="raptor-example-rdfserialize"> +<title><filename>rdfserialize.c</filename>: Serialize 1 triple to RDF/XML (Abbreviated)</title> +<programlisting> +<xi:include href="rdfserialize.c" parse="text"/> +</programlisting> + +<para>Compile it like this: +<screen> +$ gcc -o rdfserialize rdfserialize.c `pkg-config raptor2 --cflags --libs` +</screen> +and run it with an optional base URI argument +<screen> +$ ./rdfserialize +<?xml version="1.0" encoding="utf-8"?> +<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"> + <rdf:Description rdf:about="http://example.org/subject"> + <ns0:predicate xmlns:ns0="http://example.org/" xml:lang="en">An example</ns0:predicate> + </rdf:Description> +</rdf:RDF> +</screen> +</para> + +</example> + +</section> + +<section id="tutorial-serializer-send-triples"> +<title>Send RDF Triples to serializer</title> + +<para> +Once the serializer has been started, RDF triples can be sent to it +via the +<link linkend="raptor-serializer-serialize-statement"><function>raptor_serializer_serialize_statement()</function></link> +function with a +<link linkend="raptor-statement"><type>raptor_statement</type></link> +value. +</para> + +<para>Once all triples are sent, the serializing must be finished +with a call to +<link linkend="raptor-serializer-serialize-end"><function>raptor_serializer_serialize_end()</function></link>. +In particular, only at this point does the +<link linkend="raptor-iostream"><type>raptor_iostream</type></link> +get flushed or any string constructed for +<link linkend="raptor-serializer-start-to-string"><function>raptor_serializer_start_to_string()</function></link>. +<programlisting> + /* start the serializing somehow */ + while( /* got RDF triples */ ) { + raptor_serializer_serialize_statement(rdf_serializer, triple); + } + raptor_serializer_serialize_end(rdf_serializer); + /* now can use the serializing result (FILE, string, raptor_iostream) */ +</programlisting> + +</para> +</section> + + +<section id="tutorial-serializer-runtime-info"> +<title>Querying serializer run-time information</title> + +<para> +<link linkend="raptor-serializer-get-iostream"><function>raptor_serializer_get_iostream()</function></link> +gets the current serializer's raptor_iostream. +</para> + +<para> +<link linkend="raptor-serializer-get-locator"><function>raptor_serializer_get_locator()</function></link> +returns the <link linkend="raptor-locator"><type>raptor_locator</type></link> +for the current position in the output stream. The <emphasis>locator</emphasis> +structure contains full information on the details of where in the +file or URI the current serializer has reached. +</para> +</section> + + +<section id="tutorial-serializer-destroy"> +<title>Destroy the serializer</title> + +<para> +To tidy up, delete the serializer object as follows: +<programlisting> + raptor_free_serializer(rdf_serializer); +</programlisting> +</para> + +</section> + +<section id="tutorial-serializer-example"> +<title>Serializing example code</title> + +<example id="raptor-example-rdfcat"> +<title><filename>rdfcat.c</filename>: Read any RDF syntax and serialize to RDF/XML (Abbreviated)</title> +<programlisting> +<xi:include href="rdfcat.c" parse="text"/> +</programlisting> + +<para>Compile it like this: +<screen> +$ gcc -o rdfcat rdfcat.c `pkg-config raptor2 --cflags --libs` +</screen> +and run it on an RDF file as: +<screen> +$ ./rdfcat raptor.rdf +<?xml version="1.0" encoding="utf-8"?> +<rdf:RDF xmlns:foaf="http://xmlns.com/foaf/0.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://usefulinc.com/ns/doap#"> + <rdf:Description rdf:about=""> + <foaf:maker> + <foaf:Person> + <foaf:name>Dave Beckett</foaf:name> +... +</screen> +</para> + +</example> + +</section> + +</chapter> + + +<!-- +Local variables: +mode: sgml +sgml-parent-document: ("raptor-docs.xml" "book" "part") +End: +--> |