summaryrefslogtreecommitdiffstats
path: root/docs/html/tutorial-parser-content.html
blob: fc7828b3e356a96d936874b69664e9bdafe84f1a (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>Provide syntax content to parse: Raptor RDF Syntax Library Manual</title>
<meta name="generator" content="DocBook XSL Stylesheets Vsnapshot">
<link rel="home" href="index.html" title="Raptor RDF Syntax Library Manual">
<link rel="up" href="tutorial-parsing.html" title="Parsing syntaxes to RDF Triples">
<link rel="prev" href="tutorial-parse-strictness.html" title="Set the parsing strictness">
<link rel="next" href="restrict-parser-network-access.html" title="Restrict parser network access">
<meta name="generator" content="GTK-Doc V1.33.1 (XML mode)">
<link rel="stylesheet" href="style.css" type="text/css">
</head>
<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF">
<table class="navigation" id="top" width="100%" summary="Navigation header" cellpadding="2" cellspacing="5"><tr valign="middle">
<td width="100%" align="left" class="shortcuts"></td>
<td><a accesskey="h" href="index.html"><img src="home.png" width="16" height="16" border="0" alt="Home"></a></td>
<td><a accesskey="u" href="tutorial-parsing.html"><img src="up.png" width="16" height="16" border="0" alt="Up"></a></td>
<td><a accesskey="p" href="tutorial-parse-strictness.html"><img src="left.png" width="16" height="16" border="0" alt="Prev"></a></td>
<td><a accesskey="n" href="restrict-parser-network-access.html"><img src="right.png" width="16" height="16" border="0" alt="Next"></a></td>
</tr></table>
<div class="section">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="tutorial-parser-content"></a>Provide syntax content to parse</h2></div></div></div>
<p>The operation of turning syntax into RDF triples has several
alternatives from functions that do most of the work starting from a
URI to functions that allow passing in data buffers.</p>
<div class="note">
<h3 class="title">Parsing and MIME Types</h3> 
The mime type of the retrieved content is not used to choose
a parser unless the parser is of type <code class="literal">guess</code>.
The guess parser will send an <code class="literal">Accept:</code> header
for all known parser syntax mime types (if a URI request is made)
and based on the response, including the identifiers used,
pick the appropriate parser to execute.  See
<a class="link" href="raptor2-section-world.html#raptor-world-guess-parser-name" title="raptor_world_guess_parser_name ()"><code class="function">raptor_world_guess_parser_name()</code></a>
for a full discussion of the inputs to the guessing.
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="parse-from-uri"></a>Parse the content from a URI (<a class="link" href="raptor2-section-parser.html#raptor-parser-parse-uri" title="raptor_parser_parse_uri ()"><code class="function">raptor_parser_parse_uri()</code></a>)</h3></div></div></div>
<p>The URI is resolved and the content read from it and passed to
the parser:
</p>
<pre class="programlisting">
  raptor_parser_parse_uri(rdf_parser, uri, base_uri);
</pre>
<p>
The <span class="emphasis"><em>base_uri</em></span> is optional (can be
<code class="literal">NULL</code>) and will default to the
<span class="emphasis"><em>uri</em></span>.
</p>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="parse-from-www"></a>Parse the content of a URI using an existing WWW connection (<a class="link" href="raptor2-section-parser.html#raptor-parser-parse-uri-with-connection" title="raptor_parser_parse_uri_with_connection ()"><code class="function">raptor_parser_parse_uri_with_connection()</code></a>)</h3></div></div></div>
<p>The URI is resolved using an existing WWW connection (for
example a libcurl CURL handle) to allow for any existing
WWW configuration to be reused.  See
<a class="link" href="raptor2-section-www.html#raptor-new-www-with-connection" title="raptor_new_www_with_connection ()"><code class="function">raptor_new_www_with_connection</code></a>
for full details of how this works.   The content is then read from the
result of resolving the URI:
</p>
<pre class="programlisting">
  raptor_parser_parse_uri_with_connection(rdf_parser, uri, base_uri, connection);
</pre>
<p>
The <span class="emphasis"><em>base_uri</em></span> is optional (can be
<code class="literal">NULL</code>) and will default to the
<span class="emphasis"><em>uri</em></span>.
</p>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="parse-from-filehandle"></a>Parse the content of a C <code class="literal">FILE*</code> (<a class="link" href="raptor2-section-parser.html#raptor-parser-parse-file-stream" title="raptor_parser_parse_file_stream ()"><code class="function">raptor_parser_parse_file_stream()</code></a>)</h3></div></div></div>
<p>Parsing can read from a C STDIO file handle:
</p>
<pre class="programlisting">
  stream = fopen(filename, "rb");
  raptor_parser_parse_file_stream(rdf_parser, stream, filename, base_uri);
  fclose(stream);
</pre>
<p>
This function can use take an optional <span class="emphasis"><em>filename</em></span> which
is used in locator error messages.
The <span class="emphasis"><em>base_uri</em></span> may be required by some parsers
and if <code class="literal">NULL</code> will cause the parsing to fail.
This requirement can be checked by looking at the flags in
the parser description using
<a class="link" href="raptor2-section-world.html#raptor-world-get-parser-description" title="raptor_world_get_parser_description ()"><code class="function">raptor_world_get_parser_description()</code></a>.
</p>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="parse-from-file-uri"></a>Parse the content of a file URI (<a class="link" href="raptor2-section-parser.html#raptor-parser-parse-file" title="raptor_parser_parse_file ()"><code class="function">raptor_parser_parse_file()</code></a>)</h3></div></div></div>
<p>Parsing can read from a URI known to be a <code class="literal">file:</code> URI:
</p>
<pre class="programlisting">
  raptor_parser_parse_file(rdf_parser, file_uri, base_uri);
</pre>
<p>
This function requires that the <span class="emphasis"><em>file_uri</em></span> is
a file URI, that is 
<code class="literal">raptor_uri_uri_string_is_file_uri( raptor_uri_as_string( file_uri) )</code>
must be true.
The <span class="emphasis"><em>base_uri</em></span> may be required by some parsers
and if <code class="literal">NULL</code> will cause the parsing to fail.
</p>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="parse-from-chunks"></a>Parse chunks of syntax content provided by the application  (<a class="link" href="raptor2-section-parser.html#raptor-parser-parse-start" title="raptor_parser_parse_start ()"><code class="function">raptor_parser_parse_start()</code></a> and <a class="link" href="raptor2-section-parser.html#raptor-parser-parse-chunk" title="raptor_parser_parse_chunk ()"><code class="function">raptor_parser_parse_chunk()</code></a>)</h3></div></div></div>
<p>
</p>
<pre class="programlisting">
  raptor_parser_parse_start(rdf_parser, base_uri);
  while(/* not finished getting content */) {
    unsigned char *buffer;
    size_t buffer_len;

    /* ... obtain some syntax content in buffer of size buffer_len bytes ... */

    raptor_parser_parse_chunk(rdf_parser, buffer, buffer_len, 0);
  }
  raptor_parser_parse_chunk(rdf_parser, NULL, 0, 1); /* no data and is_end = 1 */
</pre>
<p>
The <span class="emphasis"><em>base_uri</em></span> argument to 
<a class="link" href="raptor2-section-parser.html#raptor-parser-parse-start" title="raptor_parser_parse_start ()"><code class="function">raptor_parser_parse_start()</code></a>
may be required by some parsers
and if <code class="literal">NULL</code> will cause the parsing to fail.
</p>
<p>On the last
<a class="link" href="raptor2-section-parser.html#raptor-parser-parse-chunk" title="raptor_parser_parse_chunk ()"><code class="function">raptor_parser_parse_chunk()</code></a>
call, or after the loop is ended, the <code class="literal">is_end</code>
parameter must be set to non-0.  Content can be passed with the
final call.  If no content is present at the end (such as in
some kind of <span class="quote"><span class="quote">end of file</span></span> situation), then a 0-length
buffer_len or NULL buffer can be used.</p>
<p>The minimal case is an entire parse in one chunk as follows:</p>
<pre class="programlisting">
  raptor_parser_parse_start(rdf_parser, base_uri);
  raptor_parser_parse_chunk(rdf_parser, buffer, buffer_len, 1); /* is_end = 1 */
</pre>
</div>
</div>
<div class="footer">
<hr>Generated by GTK-Doc V1.33.1</div>
</body>
</html>