summaryrefslogtreecommitdiffstats
path: root/www/affcase1.html
diff options
context:
space:
mode:
authorDaniel Baumann <daniel.baumann@progress-linux.org>2024-05-05 17:28:19 +0000
committerDaniel Baumann <daniel.baumann@progress-linux.org>2024-05-05 17:28:19 +0000
commit18657a960e125336f704ea058e25c27bd3900dcb (patch)
tree17b438b680ed45a996d7b59951e6aa34023783f2 /www/affcase1.html
parentInitial commit. (diff)
downloadsqlite3-upstream.tar.xz
sqlite3-upstream.zip
Adding upstream version 3.40.1.upstream/3.40.1upstream
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'www/affcase1.html')
-rw-r--r--www/affcase1.html686
1 files changed, 686 insertions, 0 deletions
diff --git a/www/affcase1.html b/www/affcase1.html
new file mode 100644
index 0000000..d1c8e80
--- /dev/null
+++ b/www/affcase1.html
@@ -0,0 +1,686 @@
+<!DOCTYPE html>
+<html><head>
+<meta name="viewport" content="width=device-width, initial-scale=1.0">
+<meta http-equiv="content-type" content="text/html; charset=UTF-8">
+<link href="sqlite.css" rel="stylesheet">
+<title>What If OpenDocument Used SQLite?</title>
+<!-- path= -->
+</head>
+<body>
+<div class=nosearch>
+<a href="index.html">
+<img class="logo" src="images/sqlite370_banner.gif" alt="SQLite" border="0">
+</a>
+<div><!-- IE hack to prevent disappearing logo --></div>
+<div class="tagline desktoponly">
+Small. Fast. Reliable.<br>Choose any three.
+</div>
+<div class="menu mainmenu">
+<ul>
+<li><a href="index.html">Home</a>
+<li class='mobileonly'><a href="javascript:void(0)" onclick='toggle_div("submenu")'>Menu</a>
+<li class='wideonly'><a href='about.html'>About</a>
+<li class='desktoponly'><a href="docs.html">Documentation</a>
+<li class='desktoponly'><a href="download.html">Download</a>
+<li class='wideonly'><a href='copyright.html'>License</a>
+<li class='desktoponly'><a href="support.html">Support</a>
+<li class='desktoponly'><a href="prosupport.html">Purchase</a>
+<li class='search' id='search_menubutton'>
+<a href="javascript:void(0)" onclick='toggle_search()'>Search</a>
+</ul>
+</div>
+<div class="menu submenu" id="submenu">
+<ul>
+<li><a href='about.html'>About</a>
+<li><a href='docs.html'>Documentation</a>
+<li><a href='download.html'>Download</a>
+<li><a href='support.html'>Support</a>
+<li><a href='prosupport.html'>Purchase</a>
+</ul>
+</div>
+<div class="searchmenu" id="searchmenu">
+<form method="GET" action="search">
+<select name="s" id="searchtype">
+<option value="d">Search Documentation</option>
+<option value="c">Search Changelog</option>
+</select>
+<input type="text" name="q" id="searchbox" value="">
+<input type="submit" value="Go">
+</form>
+</div>
+</div>
+<script>
+function toggle_div(nm) {
+var w = document.getElementById(nm);
+if( w.style.display=="block" ){
+w.style.display = "none";
+}else{
+w.style.display = "block";
+}
+}
+function toggle_search() {
+var w = document.getElementById("searchmenu");
+if( w.style.display=="block" ){
+w.style.display = "none";
+} else {
+w.style.display = "block";
+setTimeout(function(){
+document.getElementById("searchbox").focus()
+}, 30);
+}
+}
+function div_off(nm){document.getElementById(nm).style.display="none";}
+window.onbeforeunload = function(e){div_off("submenu");}
+/* Disable the Search feature if we are not operating from CGI, since */
+/* Search is accomplished using CGI and will not work without it. */
+if( !location.origin || !location.origin.match || !location.origin.match(/http/) ){
+document.getElementById("search_menubutton").style.display = "none";
+}
+/* Used by the Hide/Show button beside syntax diagrams, to toggle the */
+function hideorshow(btn,obj){
+var x = document.getElementById(obj);
+var b = document.getElementById(btn);
+if( x.style.display!='none' ){
+x.style.display = 'none';
+b.innerHTML='show';
+}else{
+x.style.display = '';
+b.innerHTML='hide';
+}
+return false;
+}
+var antiRobot = 0;
+function antiRobotGo(){
+if( antiRobot!=3 ) return;
+antiRobot = 7;
+var j = document.getElementById("mtimelink");
+if(j && j.hasAttribute("data-href")) j.href=j.getAttribute("data-href");
+}
+function antiRobotDefense(){
+document.body.onmousedown=function(){
+antiRobot |= 2;
+antiRobotGo();
+document.body.onmousedown=null;
+}
+document.body.onmousemove=function(){
+antiRobot |= 2;
+antiRobotGo();
+document.body.onmousemove=null;
+}
+setTimeout(function(){
+antiRobot |= 1;
+antiRobotGo();
+}, 100)
+antiRobotGo();
+}
+antiRobotDefense();
+</script>
+
+
+
+<h1 align="center">
+What If OpenDocument Used SQLite?</h1>
+
+<h2>Introduction</h2>
+
+<p>Suppose the
+<a href="http://en.wikipedia.org/wiki/OpenDocument">OpenDocument</a> file format,
+and specifically the "ODP" OpenDocument Presentation format, were
+built around SQLite. Benefits would include:
+<ul>
+<li>Smaller documents
+<li>Faster File/Save times
+<li>Faster startup times
+<li>Less memory used
+<li>Document versioning
+<li>A better user experience
+</ul>
+
+<p>
+Note that this is only a thought experiment.
+We are not suggesting that OpenDocument be changed.
+Nor is this article a criticism of the current OpenDocument
+design. The point of this essay is to suggest ways to improve
+future file format designs.
+
+<h2>About OpenDocument And OpenDocument Presentation</h2>
+
+<p>
+The OpenDocument file format is used for office applications:
+word processors, spreadsheets, and presentations. It was originally
+designed for the OpenOffice suite but has since been incorporated into
+other desktop application suites. The OpenOffice application has been
+forked and renamed a few times. This author's primary use for OpenDocument is
+building slide presentations with either
+<a href="https://www.neooffice.org/neojava/en/index.php">NeoOffice</a> on Mac, or
+<a href="http://www.libreoffice.org/">LibreOffice</a> on Linux and Windows.
+
+<p>
+An OpenDocument Presentation or "ODP" file is a
+<a href="http://en.wikipedia.org/wiki/Zip_%28file_format%29">ZIP archive</a> containing
+XML files describing presentation slides and separate image files for the
+various images that are included as part of the presentation.
+(OpenDocument word processor and spreadsheet files are similarly
+structured but are not considered by this article.) The reader can
+easily see the content of an ODP file by using the "zip -l" command.
+For example, the following is the "zip -l" output from a 49-slide presentation
+about SQLite from the 2014
+<a href="http://southeastlinuxfest.org/">SouthEast LinuxFest</a>
+conference:
+
+<blockquote><pre>
+Archive: self2014.odp
+ Length Date Time Name
+--------- ---------- ----- ----
+ 47 2014-06-21 12:34 mimetype
+ 0 2014-06-21 12:34 Configurations2/statusbar/
+ 0 2014-06-21 12:34 Configurations2/accelerator/current.xml
+ 0 2014-06-21 12:34 Configurations2/floater/
+ 0 2014-06-21 12:34 Configurations2/popupmenu/
+ 0 2014-06-21 12:34 Configurations2/progressbar/
+ 0 2014-06-21 12:34 Configurations2/menubar/
+ 0 2014-06-21 12:34 Configurations2/toolbar/
+ 0 2014-06-21 12:34 Configurations2/images/Bitmaps/
+ 54702 2014-06-21 12:34 Pictures/10000000000001F40000018C595A5A3D.png
+ 46269 2014-06-21 12:34 Pictures/100000000000012C000000A8ED96BFD9.png
+<i>... 58 other pictures omitted...</i>
+ 13013 2014-06-21 12:34 Pictures/10000000000000EE0000004765E03BA8.png
+ 1005059 2014-06-21 12:34 Pictures/10000000000004760000034223EACEFD.png
+ 211831 2014-06-21 12:34 content.xml
+ 46169 2014-06-21 12:34 styles.xml
+ 1001 2014-06-21 12:34 meta.xml
+ 9291 2014-06-21 12:34 Thumbnails/thumbnail.png
+ 38705 2014-06-21 12:34 Thumbnails/thumbnail.pdf
+ 9664 2014-06-21 12:34 settings.xml
+ 9704 2014-06-21 12:34 META-INF/manifest.xml
+--------- -------
+ 10961006 78 files
+</pre></blockquote>
+
+<p>
+The ODP ZIP archive contains four different XML files:
+content.xml, styles.xml, meta.xml, and settings.xml. Those four files
+define the slide layout, text content, and styling. This particular
+presentation contains 62 images, ranging from full-screen pictures to
+tiny icons, each stored as a separate file in the Pictures
+folder. The "mimetype" file contains a single line of text that says:
+
+<blockquote><pre>
+application/vnd.oasis.opendocument.presentation
+</pre></blockquote>
+
+<p>The purpose of the other files and folders is presently
+unknown to the author but is probably not difficult to figure out.
+
+<h2>Limitations Of The OpenDocument Presentation Format</h2>
+
+<p>
+The use of a ZIP archive to encapsulate XML files plus resources is an
+elegant approach to an application file format.
+It is clearly superior to a custom binary file format.
+But using an SQLite database as the
+container, instead of ZIP, would be more elegant still.
+
+<p>A ZIP archive is basically a key/value database, optimized for
+the case of write-once/read-many and for a relatively small number
+of distinct keys (a few hundred to a few thousand) each with a large BLOB
+as its value. A ZIP archive can be viewed as a "pile-of-files"
+database. This works, but it has some shortcomings relative to an
+SQLite database, as follows:
+
+<ol>
+<li><p><b>Incremental update is hard.</b>
+<p>
+It is difficult to update individual entries in a ZIP archive.
+It is especially difficult to update individual entries in a ZIP
+archive in a way that does not destroy
+the entire document if the computer loses power and/or crashes
+in the middle of the update. It is not impossible to do this, but
+it is sufficiently difficult that nobody actually does it. Instead, whenever
+the user selects "File/Save", the entire ZIP archive is rewritten.
+Hence, "File/Save" takes longer than it ought, especially on
+older hardware. Newer machines are faster, but it is still bothersome
+that changing a single character in a 50 megabyte presentation causes one
+to burn through 50 megabytes of the finite write life on the SSD.
+
+<li><p><b>Startup is slow.</b>
+<p>
+In keeping with the pile-of-files theme, OpenDocument stores all slide
+content in a single big XML file named "content.xml".
+LibreOffice reads and parses this entire file just to display
+the first slide.
+LibreOffice also seems to
+read all images into memory as well, which makes sense seeing as when
+the user does "File/Save" it is going to have to write them all back out
+again, even though none of them changed. The net effect is that
+start-up is slow. Double-clicking an OpenDocument file brings up a
+progress bar rather than the first slide.
+This results in a bad user experience.
+The situation grows ever more annoying as
+the document size increases.
+
+<li><p><b>More memory is required.</b>
+<p>
+Because ZIP archives are optimized for storing big chunks of content, they
+encourage a style of programming where the entire document is read into
+memory at startup, all editing occurs in memory, then the entire document
+is written to disk during "File/Save". OpenOffice and its descendants
+embrace that pattern.
+
+<p>
+One might argue that it is ok, in this era of multi-gigabyte desktops, to
+read the entire document into memory.
+But it is not ok.
+For one, the amount of memory used far exceeds the (compressed) file size
+on disk. So a 50MB presentation might take 200MB or more RAM.
+That still is not a problem if one only edits a single document at a time.
+But when working on a talk, this author will typically have 10 or 15 different
+presentations up all at the same
+time (to facilitate copy/paste of slides from past presentation) and so
+gigabytes of memory are required.
+Add in an open web browser or two and a few other
+desktop apps, and suddenly the disk is whirling and the machine is swapping.
+And even having just a single document is a problem when working
+on an inexpensive Chromebook retrofitted with Ubuntu.
+Using less memory is always better.
+</p>
+
+<li><p><b>Crash recovery is difficult.</b>
+<p>
+The descendants of OpenOffice tend to segfault more often than commercial
+competitors. Perhaps for this reason, the OpenOffice forks make
+periodic backups of their in-memory documents so that users do not lose
+all pending edits when the inevitable application crash does occur.
+This causes frustrating pauses in the application for the few seconds
+while each backup is being made.
+After restarting from a crash, the user is presented with a dialog box
+that walks them through the recovery process. Managing the crash
+recovery this way involves lots of extra application logic and is
+generally an annoyance to the user.
+
+<li><p><b>Content is inaccessible.</b>
+<p>
+One cannot easily view, change, or extract the content of an
+OpenDocument presentation using generic tools.
+The only reasonable way to view or edit an OpenDocument document is to open
+it up using an application that is specifically designed to read or write
+OpenDocument (read: LibreOffice or one of its cousins). The situation
+could be worse. One can extract and view individual images (say) from
+a presentation using just the "zip" archiver tool. But it is not reasonable
+try to extract the text from a slide. Remember that all content is stored
+in a single "context.xml" file. That file is XML, so it is a text file.
+But it is not a text file that can be managed with an ordinary text
+editor. For the example presentation above, the content.xml file
+consist of exactly two lines. The first line of the file is just:
+
+<blockquote><pre>
+&lt;?xml version="1.0" encoding="UTF-8"?&gt;
+</pre></blockquote>
+
+<p>The second line of the file contains 211792 characters of
+impenetrable XML. Yes, 211792 characters all on one line.
+This file is a good stress-test for a text editor.
+Thankfully, the file is not some obscure
+binary format, but in terms of accessibility, it might as well be
+written in Klingon.
+</ol>
+
+<h2>First Improvement: Replace ZIP with SQLite</h2>
+
+<p>
+Let us suppose that instead of using a ZIP archive to store its files,
+OpenDocument used a very simple SQLite database with the following
+single-table schema:
+
+<blockquote><pre>
+CREATE TABLE OpenDocTree(
+ filename TEXT PRIMARY KEY, -- Name of file
+ filesize BIGINT, -- Size of file after decompression
+ content BLOB -- Compressed file content
+);
+</pre></blockquote>
+
+<p>
+For this first experiment, nothing else about the file format is changed.
+The OpenDocument is still a pile-of-files, only now each file is a row
+in an SQLite database rather than an entry in a ZIP archive.
+This simple change does not use the power of a relational
+database. Even so, this simple change shows some improvements.
+
+<a name="smaller"></a>
+
+<p>
+Surprisingly, using SQLite in place of ZIP makes the presentation
+file smaller. Really. One would think that a relational database file
+would be larger than a ZIP archive, but at least in the case of NeoOffice
+that is not so. The following is an actual screen-scrape showing
+the sizes of the same NeoOffice presentation, both in its original
+ZIP archive format as generated by NeoOffice (self2014.odp), and
+as repacked as an SQLite database using the
+<a href="http://www.sqlite.org/sqlar/doc/trunk/README.md">SQLAR</a> utility:
+
+<blockquote><pre>
+-rw-r--r-- 1 drh staff 10514994 Jun 8 14:32 self2014.odp
+-rw-r--r-- 1 drh staff 10464256 Jun 8 14:37 self2014.sqlar
+-rw-r--r-- 1 drh staff 10416644 Jun 8 14:40 zip.odp
+</pre></blockquote>
+
+<p>
+The SQLite database file ("self2014.sqlar") is about a
+half percent smaller than the equivalent ODP file! How can this be?
+Apparently the ZIP archive generator logic in NeoOffice
+is not as efficient as it could be, because when the same pile-of-files
+is recompressed using the command-line "zip" utility, one gets a file
+("zip.odp") that is smaller still, by another half percent, as seen
+in the third line above. So, a well-written ZIP archive
+can be slightly smaller than the equivalent SQLite database, as one would
+expect. But the difference is slight. The key take-away is that an
+SQLite database is size-competitive with a ZIP archive.
+
+<p>
+The other advantage to using SQLite in place of
+ZIP is that the document can now be updated incrementally, without risk
+of corrupting the document if a power loss or other crash occurs in the
+middle of the update. (Remember that writes to
+<a href="atomiccommit.html">SQLite databases are atomic</a>.) True, all the
+content is still kept in a single big XML file ("content.xml") which must
+be completely rewritten if so much as a single character changes. But
+with SQLite, only that one file needs to change. The other 77 files in the
+repository can remain unaltered. They do not all have to be rewritten,
+which in turn makes "File/Save" run much faster and saves wear on SSDs.
+
+<h2>Second Improvement: Split content into smaller pieces</h2>
+
+<p>
+A pile-of-files encourages content to be stored in a few large chunks.
+In the case of ODP, there are just four XML files that define the layout
+off all slides in a presentation. An SQLite database allows storing
+information in a few large chunks, but SQLite is also adept and efficient
+at storing information in numerous smaller pieces.
+
+<p>
+So then, instead of storing all content for all slides in a single
+oversized XML file ("content.xml"), suppose there was a separate table
+for storing the content of each slide separately. The table schema
+might look something like this:
+
+<blockquote><pre>
+CREATE TABLE slide(
+ pageNumber INTEGER, -- The slide page number
+ slideContent TEXT -- Slide content as XML or JSON
+);
+CREATE INDEX slide_pgnum ON slide(pageNumber); -- Optional
+</pre></blockquote>
+
+<p>The content of each slide could still be stored as compressed XML.
+But now each page is stored separately. So when opening a new document,
+the application could simply run:
+
+<blockquote><pre>
+SELECT slideContent FROM slide WHERE pageNumber=1;
+</pre></blockquote>
+
+<p>This query will quickly and efficiently return the content of the first
+slide, which could then be speedily parsed and displayed to the user.
+Only one page needs to be read and parsed in order render the first screen,
+which means that the first screen appears much faster and
+there is no longer a need for an annoying progress bar.
+
+<p>If the application wanted
+to keep all content in memory, it could continue reading and parsing the
+other pages using a background thread after drawing the first page. Or,
+since reading from SQLite is so efficient, the application might
+instead choose to reduce its memory footprint and only keep a single
+slide in memory at a time. Or maybe it keeps the current slide and the
+next slide in memory, to facility rapid transitions to the next slide.
+
+<p>
+Notice that dividing up the content into smaller pieces using an SQLite
+table gives flexibility to the implementation. The application can choose
+to read all content into memory at startup. Or it can read just a
+few pages into memory and keep the rest on disk. Or it can read just
+single page into memory at a time. And different versions of the application
+can make different choices without having to make any changes to the
+file format. Such options are not available when all content is in
+a single big XML file in a ZIP archive.
+
+<p>
+Splitting content into smaller pieces also helps File/Save operations
+to go faster. Instead of having to write back the content of all pages
+when doing a File/Save, the application only has to write back those
+pages that have actually changed.
+
+<p>
+One minor downside of splitting content into smaller pieces is that
+compression does not work as well on shorter texts and so the size of
+the document might increase. But as the bulk of the document space
+is used to store images, a small reduction in the compression efficiency
+of the text content will hardly be noticeable, and is a small price
+to pay for an improved user experience.
+
+<h2>Third Improvement: Versioning</h2>
+
+<p>
+Once one is comfortable with the concept of storing each slide separately,
+it is a small step to support versioning of the presentation. Consider
+the following schema:
+
+<blockquote><pre>
+CREATE TABLE slide(
+ slideId INTEGER PRIMARY KEY,
+ derivedFrom INTEGER REFERENCES slide,
+ content TEXT -- XML or JSON or whatever
+);
+CREATE TABLE version(
+ versionId INTEGER PRIMARY KEY,
+ priorVersion INTEGER REFERENCES version,
+ checkinTime DATETIME, -- When this version was saved
+ comment TEXT, -- Description of this version
+ manifest TEXT -- List of integer slideIds
+);
+</pre></blockquote>
+
+<p>
+In this schema, instead of each slide having a page number that determines
+its order within the presentation, each slide has a unique
+integer identifier that is unrelated to where it occurs in sequence.
+The order of slides in the presentation is determined by a list of
+slideIds, stored as a text string in the MANIFEST column of the VERSION
+table.
+Since multiple entries are allowed in the VERSION table, that means that
+multiple presentations can be stored in the same document.
+
+<p>
+On startup, the application first decides which version it
+wants to display. Since the versionId will naturally increase in time
+and one would normally want to see the latest version, an appropriate
+query might be:
+
+<blockquote><pre>
+SELECT manifest, versionId FROM version ORDER BY versionId DESC LIMIT 1;
+</pre></blockquote>
+
+<p>
+Or perhaps the application would rather use the
+most recent checkinTime:
+
+<blockquote><pre>
+SELECT manifest, versionId, max(checkinTime) FROM version;
+</pre></blockquote>
+
+<p>
+Using a single query such as the above, the application obtains a list
+of the slideIds for all slides in the presentation. The application then
+queries for the content of the first slide, and parses and displays that
+content, as before.
+
+<p>(Aside: Yes, that second query above that uses "max(checkinTime)"
+really does work and really does return a well-defined answer in SQLite.
+Such a query either returns an undefined answer or generates an error
+in many other SQL database engines, but in SQLite it does what you would
+expect: it returns the manifest and versionId of the entry that has the
+maximum checkinTime.)
+
+<p>When the user does a "File/Save", instead of overwriting the modified
+slides, the application can now make new entries in the SLIDE table for
+just those slides that have been added or altered. Then it creates a
+new entry in the VERSION table containing the revised manifest.
+
+<p>The VERSION table shown above has columns to record a check-in comment
+(presumably supplied by the user) and the time and date at which the File/Save
+action occurred. It also records the parent version to record the history
+of changes. Perhaps the manifest could be stored as a delta from the
+parent version, though typically the manifest will be small enough that
+storing a delta might be more trouble than it is worth. The SLIDE table
+also contains a derivedFrom column which could be used for delta encoding
+if it is determined that saving the slide content as a delta from its
+previous version is a worthwhile optimization.
+
+<p>So with this simple change, the ODP file now stores not just the most
+recent edit to the presentation, but a history of all historic edits. The
+user would normally want to see just the most recent edition of the
+presentation, but if desired, the user can now go backwards in time to
+see historical versions of the same presentation.
+
+<p>Or, multiple presentations could be stored within the same document.
+
+<p>With such a schema, the application would no longer need to make
+periodic backups of the unsaved changes to a separate file to avoid lost
+work in the event of a crash. Instead, a special "pending" version could
+be allocated and unsaved changes could be written into the pending version.
+Because only changes would need to be written, not the entire document,
+saving the pending changes would only involve writing a few kilobytes of
+content, not multiple megabytes, and would take milliseconds instead of
+seconds, and so it could be done frequently and silently in the background.
+Then when a crash occurs and the user reboots, all (or almost all)
+of their work is retained. If the user decides to discard unsaved changes,
+they simply go back to the previous version.
+
+<p>
+There are details to fill in here.
+Perhaps a screen can be provided that displays a history changes
+(perhaps with a graph) allowing the user to select which version they
+want to view or edit. Perhaps some facility can be provided to merge
+forks that might occur in the version history. And perhaps the
+application should provide a means to purge old and unwanted versions.
+The key point is that using an SQLite database to store the content,
+rather than a ZIP archive, makes all of these features much, much easier
+to implement, which increases the possibility that they will eventually
+get implemented.
+
+<h2>And So Forth...</h2>
+
+<p>
+In the previous sections, we have seen how moving from a key/value
+store implemented as a ZIP archive to a simple SQLite database
+with just three tables can add significant capabilities to an application
+file format.
+We could continue to enhance the schema with new tables, with indexes
+added for performance, with triggers and views for programming convenience,
+and constraints to enforce consistency of content even in the face of
+programming errors. Further enhancement ideas include:
+<ul>
+<li> Store an <a href="undoredo.html">automated undo/redo stack</a> in a database table so that
+ Undo could go back into prior edit sessions.
+<li> Add <a href="fts3.html#fts4">full text search</a> capabilities to the slide deck, or across
+ multiple slide decks.
+<li> Decompose the "settings.xml" file into an SQL table that
+ is more easily viewed and edited by separate applications.
+<li> Break out the "Presentor Notes" from each slide into a separate
+ table, for easier access from third-party applications and/or scripts.
+<li> Enhance the presentation concept beyond the simple linear sequence of
+ slides to allow for side-tracks and excursions to be taken depending on
+ how the audience is responding.
+</ul>
+
+<p>
+An SQLite database has a lot of capability, which
+this essay has only begun to touch upon. But hopefully this quick glimpse
+has convinced some readers that using an SQL database as an application
+file format is worth a second look.
+
+<p>
+Some readers might resist using SQLite as an application
+file format due to prior exposure to enterprise SQL databases and
+the caveats and limitations of those other systems.
+For example, many enterprise database
+engines advise against storing large strings or BLOBs in the database
+and instead suggest that large strings and BLOBs be stored as separate
+files and the filename stored in the database. But SQLite
+is not like that. Any column of an SQLite database can hold
+a string or BLOB up to about a gigabyte in size. And for strings and
+BLOBs of 100 kilobytes or less,
+<a href="intern-v-extern-blob.html">I/O performance is better</a> than using separate
+files.
+
+<p>
+Some readers might be reluctant to consider SQLite as an application
+file format because they have been inculcated with the idea that all
+SQL database schemas must be factored into third normal form and store
+only small primitive data types such as strings and integers. Certainly
+relational theory is important and designers should strive to understand
+it. But, as demonstrated above, it is often quite acceptable to store
+complex information as XML or JSON in text fields of a database.
+Do what works, not what your database professor said you ought to do.
+
+<h2>Review Of The Benefits Of Using SQLite</h2>
+
+<p>
+In summary,
+the claim of this essay is that using SQLite as a container for an application
+file format like OpenDocument
+and storing lots of smaller objects in that container
+works out much better than using a ZIP archive holding a few larger objects.
+To wit:
+
+<ol>
+<li><p>
+An SQLite database file is approximately the same size, and in some cases
+smaller, than a ZIP archive holding the same information.
+
+<li><p>
+The <a href="atomiccommit.html">atomic update capabilities</a>
+of SQLite allow small incremental changes
+to be safely written into the document. This reduces total disk I/O
+and improves File/Save performance, enhancing the user experience.
+
+<li><p>
+Startup time is reduced by allowing the application to read in only the
+content shown for the initial screen. This largely eliminates the
+need to show a progress bar when opening a new document. The document
+just pops up immediately, further enhancing the user experience.
+
+<li><p>
+The memory footprint of the application can be dramatically reduced by
+only loading content that is relevant to the current display and keeping
+the bulk of the content on disk. The fast query capability of SQLite
+make this a viable alternative to keeping all content in memory at all times.
+And when applications use less memory, it makes the entire computer more
+responsive, further enhancing the user experience.
+
+<li><p>
+The schema of an SQL database is able to represent information more directly
+and succinctly than a key/value database such as a ZIP archive. This makes
+the document content more accessible to third-party applications and scripts
+and facilitates advanced features such as built-in document versioning, and
+incremental saving of work in progress for recovery after a crash.
+</ol>
+
+<p>
+These are just a few of the benefits of using SQLite as an application file
+format &mdash; the benefits that seem most likely to improve the user
+experience for applications like OpenOffice. Other applications might
+benefit from SQLite in different ways. See the <a href="appfileformat.html">Application File Format</a>
+document for additional ideas.
+
+<p>
+Finally, let us reiterate that this essay is a thought experiment.
+The OpenDocument format is well-established and already well-designed.
+Nobody really believes that OpenDocument should be changed to use SQLite
+as its container instead of ZIP. Nor is this article a criticism of
+OpenDocument for not choosing SQLite as its container since OpenDocument
+predates SQLite. Rather, the point of this article is to use OpenDocument
+as a concrete example of how SQLite can be used to build better
+application file formats for future projects.
+<p align="center"><small><i>This page last modified on <a href="https://sqlite.org/docsrc/honeypot" id="mtimelink" data-href="https://sqlite.org/docsrc/finfo/pages/affcase1.in?m=cbee5540a91bc6487">2018-11-30 21:04:18</a> UTC </small></i></p>
+