summaryrefslogtreecommitdiffstats
path: root/www/lemon.html
blob: 30bdb8d4b6b0a64178d592234dc2b7e36633f4f3 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
<!DOCTYPE html>
<html><head>
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<link href="sqlite.css" rel="stylesheet">
<title>The Lemon LALR(1) Parser Generator</title>
<!-- path= -->
</head>
<body>
<div class=nosearch>
<a href="index.html">
<img class="logo" src="images/sqlite370_banner.gif" alt="SQLite" border="0">
</a>
<div><!-- IE hack to prevent disappearing logo --></div>
<div class="tagline desktoponly">
Small. Fast. Reliable.<br>Choose any three.
</div>
<div class="menu mainmenu">
<ul>
<li><a href="index.html">Home</a>
<li class='mobileonly'><a href="javascript:void(0)" onclick='toggle_div("submenu")'>Menu</a>
<li class='wideonly'><a href='about.html'>About</a>
<li class='desktoponly'><a href="docs.html">Documentation</a>
<li class='desktoponly'><a href="download.html">Download</a>
<li class='wideonly'><a href='copyright.html'>License</a>
<li class='desktoponly'><a href="support.html">Support</a>
<li class='desktoponly'><a href="prosupport.html">Purchase</a>
<li class='search' id='search_menubutton'>
<a href="javascript:void(0)" onclick='toggle_search()'>Search</a>
</ul>
</div>
<div class="menu submenu" id="submenu">
<ul>
<li><a href='about.html'>About</a>
<li><a href='docs.html'>Documentation</a>
<li><a href='download.html'>Download</a>
<li><a href='support.html'>Support</a>
<li><a href='prosupport.html'>Purchase</a>
</ul>
</div>
<div class="searchmenu" id="searchmenu">
<form method="GET" action="search">
<select name="s" id="searchtype">
<option value="d">Search Documentation</option>
<option value="c">Search Changelog</option>
</select>
<input type="text" name="q" id="searchbox" value="">
<input type="submit" value="Go">
</form>
</div>
</div>
<script>
function toggle_div(nm) {
var w = document.getElementById(nm);
if( w.style.display=="block" ){
w.style.display = "none";
}else{
w.style.display = "block";
}
}
function toggle_search() {
var w = document.getElementById("searchmenu");
if( w.style.display=="block" ){
w.style.display = "none";
} else {
w.style.display = "block";
setTimeout(function(){
document.getElementById("searchbox").focus()
}, 30);
}
}
function div_off(nm){document.getElementById(nm).style.display="none";}
window.onbeforeunload = function(e){div_off("submenu");}
/* Disable the Search feature if we are not operating from CGI, since */
/* Search is accomplished using CGI and will not work without it. */
if( !location.origin || !location.origin.match || !location.origin.match(/http/) ){
document.getElementById("search_menubutton").style.display = "none";
}
/* Used by the Hide/Show button beside syntax diagrams, to toggle the */
function hideorshow(btn,obj){
var x = document.getElementById(obj);
var b = document.getElementById(btn);
if( x.style.display!='none' ){
x.style.display = 'none';
b.innerHTML='show';
}else{
x.style.display = '';
b.innerHTML='hide';
}
return false;
}
var antiRobot = 0;
function antiRobotGo(){
if( antiRobot!=3 ) return;
antiRobot = 7;
var j = document.getElementById("mtimelink");
if(j && j.hasAttribute("data-href")) j.href=j.getAttribute("data-href");
}
function antiRobotDefense(){
document.body.onmousedown=function(){
antiRobot |= 2;
antiRobotGo();
document.body.onmousedown=null;
}
document.body.onmousemove=function(){
antiRobot |= 2;
antiRobotGo();
document.body.onmousemove=null;
}
setTimeout(function(){
antiRobot |= 1;
antiRobotGo();
}, 100)
antiRobotGo();
}
antiRobotDefense();
</script>
<div class=fancy>
<div class=nosearch>
<div class="fancy_title">
The Lemon LALR(1) Parser Generator
</div>
<div class="fancy_toc">
<a onclick="toggle_toc()">
<span class="fancy_toc_mark" id="toc_mk">&#x25ba;</span>
Table Of Contents
</a>
<div id="toc_sub"><div class="fancy-toc1"><a href="#overview">1. Overview</a></div>
<div class="fancy-toc2"><a href="#lemon_source_files_and_documentation">1.1. Lemon Source Files And Documentation</a></div>
<div class="fancy-toc1"><a href="#advantages_of_lemon">2. Advantages of Lemon</a></div>
<div class="fancy-toc2"><a href="#use_of_lemon_within_sqlite">2.1. Use of Lemon Within SQLite</a></div>
<div class="fancy-toc2"><a href="#lemon_customizations_especially_for_sqlite">2.2. Lemon Customizations Especially For SQLite</a></div>
<div class="fancy-toc1"><a href="#history_of_lemon">3. History Of Lemon</a></div>
</div>
</div>
<script>
function toggle_toc(){
var sub = document.getElementById("toc_sub")
var mk = document.getElementById("toc_mk")
if( sub.style.display!="block" ){
sub.style.display = "block";
mk.innerHTML = "&#x25bc;";
} else {
sub.style.display = "none";
mk.innerHTML = "&#x25ba;";
}
}
</script>
</div>





<h1 id="overview"><span>1. </span>Overview</h1>

<p>The SQL language parser for SQLite is generated using a code-generator
program called "Lemon".  The Lemon program reads a grammar of the input
language and emits C-code to implement a parser for that language.


</p><h2 id="lemon_source_files_and_documentation"><span>1.1. </span>Lemon Source Files And Documentation</h2>

<p>Lemon does not have its own source repository.  Rather, Lemon consists
of a few files in the SQLite source tree:

</p><ul>
<li><p>
     <a href="https://sqlite.org/src/doc/trunk/doc/lemon.html">lemon.html</a> &rarr;
     The original detailed usage documentation and programmers reference
     for Lemon.
</p></li><li><p>
     <a href="https://sqlite.org/src/file/tool/lemon.c">lemon.c</a> &rarr; The source code
     for the utility program that reads a grammar file and generates 
     corresponding parser C-code.
</p></li><li><p>
     <a href="https://sqlite.org/src/file/tool/lempar.c">lempar.c</a> &rarr; A template
     for the generated parser C-code.  The "lemon" utility program reads this
     template and inserts additional code in order to generate a parser.
</p></li></ul>

<h1 id="advantages_of_lemon"><span>2. </span>Advantages of Lemon</h1>

<p>Lemon generates an LALR(1) parser.  Its operation is similar to the
more familiar tools <a href="https://en.wikipedia.org/wiki/Yacc">Yacc</a> and
<a href="https://en.wikipedia.org/wiki/GNU_bison">Bison</a>, but Lemon adds important
improvements, including:

</p><ul>
<li><p>
     The grammar syntax is less error prone - using symbolic names for
     semantic values rather that the "$1"-style positional notation
     of Yacc.
</p></li><li><p>
     In Lemon, the tokenizer calls the parser.  Yacc operates the other
     way around, with the parser calling the tokenizer.  The Lemon
     approach is reentrant and threadsafe, whereas Yacc uses global 
     variables and is therefore neither.  Reentrancy is especially
     important for SQLite since some SQL statements make recursive calls
     to the parser.  For example, when parsing a CREATE TABLE statement,
     SQLite invokes the parser recursively to generate an INSERT statement
     to make a new entry in the <a href="schematab.html">sqlite_schema</a> table.
</p></li><li><p>
     Lemon has the concept of a non-terminal destructor that can be
     used to reclaim memory or other resources following a syntax error
     or other aborted parse.
</p></li></ul>

<h2 id="use_of_lemon_within_sqlite"><span>2.1. </span>Use of Lemon Within SQLite</h2>

<p>Lemon is used in two places in SQLite.

</p><p>The primary use of Lemon is to create the SQL language parser.
A grammar file (<a href="https://sqlite.org/src/file/src/parse.y">parse.y</a>) is
compiled by Lemon into parse.c and parse.h.  The parse.c file is
incorporated into the <a href="amalgamation.html">amalgamation</a> without further modification.

</p><p>Lemon is also used to generate the parser for the query pattern
expressions in the <a href="fts5.html">FTS5</a> extension.  In this case, the input grammar
file is <a href="https://sqlite.org/src/file/ext/fts5/fts5parse.y">fts5parse.y</a>.

</p><h2 id="lemon_customizations_especially_for_sqlite"><span>2.2. </span>Lemon Customizations Especially For SQLite</h2>

<p>One of the advantages of hosting code generator tools as part of
the project is that the tools can be optimized to serve specific needs of
the overall project.  Lemon has benefited from this effect. Over the years,
the Lemon parser generator has been extended and enhanced to provide
new capabilities and improved performance to SQLite.  A few of the
specific enhancements to Lemon that are specifically designed for use
by SQLite include:

</p><ul>
<li><p>
Lemon has the concept of a "fallback" token.
The SQL language contains a large number of keywords and these keywords
have the potential to collide with identifier names.
Lemon has the ability to designate some keywords as being able to
"fallback" to an identifier.  If the keyword appears in the input token
stream in a context that would otherwise be a syntax error, the token
is automatically transformed into its fallback before the syntax error
is raised.  This feature allows the parser to be very forgiving of
reserved words used as identifiers, which is a problem that comes up
frequently in the SQL language.

</p></li><li><p>
In support of the <a href="testing.html#mcdc">100% MC/DC testing</a> goal for SQLite, 
the parser code generated by Lemon has no unreachable branches,
and contains extra (compile-time selected) instrumentation useful
for measuring test coverage.

</p></li><li><p>
Lemon supports conditional compilation of grammar file rules, so that
a different parser can be generated depending on compile-time options.

</p></li><li><p>
As a performance optimization, reduce actions in the Lemon input grammar
are allowed to contain comments of the form "/*A-overwrites-Z*/" to indicate
that the semantic value "A" on the right-hand side of the rule is allowed
to directly overwrite the semantic value "Z" on the left-hand side.
This simple optimization reduces the number of stack operations in the
push-down automaton used to parse the input grammar, and thus improve
performance of the parser.  It also makes the generated code a little smaller.
</p></li></ul>

<p>The parsing of SQL statements is a significant consumer of CPU cycles 
in any SQL database engine.  On-going efforts to optimize SQLite have caused
the developers to spend a lot of time tweaking Lemon to generate faster
parsers.  These efforts have benefited all users of the Lemon parser generator,
not just SQLite.  But if Lemon had been a separately maintained tool, it
would have been more difficult to make coordinated changes to both SQLite
and Lemon, and as a result not as much optimization would have been
accomplished.  Hence, the fact that the parser generator tool is included
in the source tree for SQLite has turned out to be a net benefit for both
the tool itself and for SQLite.

</p><h1 id="history_of_lemon"><span>3. </span>History Of Lemon</h1>

<p>Lemon was originally written by D. Richard Hipp (also the creator of SQLite)
while he was in graduate school at Duke University between 1987 and 1992.
The original creation date of Lemon has been lost, but was probably sometime
around 1990.  Lemon generates an LALR(1) parser.  There was a companion 
LL(1) parser generator tool named "Lime", but the source code for Lime
has been lost.

</p><p>The Lemon source code was originally written as separate source files,
and only later merged into a single "lemon.c" source file.

</p><p>The author of Lemon and SQLite (Hipp) reports that his C programming
skills were greatly enhanced by studying John Ousterhout's original
source code to Tcl.  Hipp discovered and studied Tcl in 1993.  Lemon
was written before then, and SQLite afterwards.  There is a clear
difference in the coding styles of these two products, with SQLite seeming
to be cleaner, more readable, and easier to maintain.
</p><p align="center"><small><i>This page last modified on  <a href="https://sqlite.org/docsrc/honeypot" id="mtimelink"  data-href="https://sqlite.org/docsrc/finfo/pages/lemon.in?m=b11f8f15fdaf8c816">2021-08-21 20:50:48</a> UTC </small></i></p>