diff options
author | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-04-13 14:07:11 +0000 |
---|---|---|
committer | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-04-13 14:07:11 +0000 |
commit | 63847496f14c813a5d80efd5b7de0f1294ffe1e3 (patch) | |
tree | 01c7571c7c762ceee70638549a99834fdd7c411b /www/affcase1.html | |
parent | Initial commit. (diff) | |
download | sqlite3-63847496f14c813a5d80efd5b7de0f1294ffe1e3.tar.xz sqlite3-63847496f14c813a5d80efd5b7de0f1294ffe1e3.zip |
Adding upstream version 3.45.1.upstream/3.45.1
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'www/affcase1.html')
-rw-r--r-- | www/affcase1.html | 686 |
1 files changed, 686 insertions, 0 deletions
diff --git a/www/affcase1.html b/www/affcase1.html new file mode 100644 index 0000000..3d69dc1 --- /dev/null +++ b/www/affcase1.html @@ -0,0 +1,686 @@ +<!DOCTYPE html> +<html><head> +<meta name="viewport" content="width=device-width, initial-scale=1.0"> +<meta http-equiv="content-type" content="text/html; charset=UTF-8"> +<link href="sqlite.css" rel="stylesheet"> +<title>What If OpenDocument Used SQLite?</title> +<!-- path= --> +</head> +<body> +<div class=nosearch> +<a href="index.html"> +<img class="logo" src="images/sqlite370_banner.gif" alt="SQLite" border="0"> +</a> +<div><!-- IE hack to prevent disappearing logo --></div> +<div class="tagline desktoponly"> +Small. Fast. Reliable.<br>Choose any three. +</div> +<div class="menu mainmenu"> +<ul> +<li><a href="index.html">Home</a> +<li class='mobileonly'><a href="javascript:void(0)" onclick='toggle_div("submenu")'>Menu</a> +<li class='wideonly'><a href='about.html'>About</a> +<li class='desktoponly'><a href="docs.html">Documentation</a> +<li class='desktoponly'><a href="download.html">Download</a> +<li class='wideonly'><a href='copyright.html'>License</a> +<li class='desktoponly'><a href="support.html">Support</a> +<li class='desktoponly'><a href="prosupport.html">Purchase</a> +<li class='search' id='search_menubutton'> +<a href="javascript:void(0)" onclick='toggle_search()'>Search</a> +</ul> +</div> +<div class="menu submenu" id="submenu"> +<ul> +<li><a href='about.html'>About</a> +<li><a href='docs.html'>Documentation</a> +<li><a href='download.html'>Download</a> +<li><a href='support.html'>Support</a> +<li><a href='prosupport.html'>Purchase</a> +</ul> +</div> +<div class="searchmenu" id="searchmenu"> +<form method="GET" action="search"> +<select name="s" id="searchtype"> +<option value="d">Search Documentation</option> +<option value="c">Search Changelog</option> +</select> +<input type="text" name="q" id="searchbox" value=""> +<input type="submit" value="Go"> +</form> +</div> +</div> +<script> +function toggle_div(nm) { +var w = document.getElementById(nm); +if( w.style.display=="block" ){ +w.style.display = "none"; +}else{ +w.style.display = "block"; +} +} +function toggle_search() { +var w = document.getElementById("searchmenu"); +if( w.style.display=="block" ){ +w.style.display = "none"; +} else { +w.style.display = "block"; +setTimeout(function(){ +document.getElementById("searchbox").focus() +}, 30); +} +} +function div_off(nm){document.getElementById(nm).style.display="none";} +window.onbeforeunload = function(e){div_off("submenu");} +/* Disable the Search feature if we are not operating from CGI, since */ +/* Search is accomplished using CGI and will not work without it. */ +if( !location.origin || !location.origin.match || !location.origin.match(/http/) ){ +document.getElementById("search_menubutton").style.display = "none"; +} +/* Used by the Hide/Show button beside syntax diagrams, to toggle the */ +function hideorshow(btn,obj){ +var x = document.getElementById(obj); +var b = document.getElementById(btn); +if( x.style.display!='none' ){ +x.style.display = 'none'; +b.innerHTML='show'; +}else{ +x.style.display = ''; +b.innerHTML='hide'; +} +return false; +} +var antiRobot = 0; +function antiRobotGo(){ +if( antiRobot!=3 ) return; +antiRobot = 7; +var j = document.getElementById("mtimelink"); +if(j && j.hasAttribute("data-href")) j.href=j.getAttribute("data-href"); +} +function antiRobotDefense(){ +document.body.onmousedown=function(){ +antiRobot |= 2; +antiRobotGo(); +document.body.onmousedown=null; +} +document.body.onmousemove=function(){ +antiRobot |= 2; +antiRobotGo(); +document.body.onmousemove=null; +} +setTimeout(function(){ +antiRobot |= 1; +antiRobotGo(); +}, 100) +antiRobotGo(); +} +antiRobotDefense(); +</script> + + + +<h1 align="center"> +What If OpenDocument Used SQLite?</h1> + +<h2>Introduction</h2> + +<p>Suppose the +<a href="http://en.wikipedia.org/wiki/OpenDocument">OpenDocument</a> file format, +and specifically the "ODP" OpenDocument Presentation format, were +built around SQLite. Benefits would include: +<ul> +<li>Smaller documents +<li>Faster File/Save times +<li>Faster startup times +<li>Less memory used +<li>Document versioning +<li>A better user experience +</ul> + +<p> +Note that this is only a thought experiment. +We are not suggesting that OpenDocument be changed. +Nor is this article a criticism of the current OpenDocument +design. The point of this essay is to suggest ways to improve +future file format designs. + +<h2>About OpenDocument And OpenDocument Presentation</h2> + +<p> +The OpenDocument file format is used for office applications: +word processors, spreadsheets, and presentations. It was originally +designed for the OpenOffice suite but has since been incorporated into +other desktop application suites. The OpenOffice application has been +forked and renamed a few times. This author's primary use for OpenDocument is +building slide presentations with either +<a href="https://www.neooffice.org/neojava/en/index.php">NeoOffice</a> on Mac, or +<a href="http://www.libreoffice.org/">LibreOffice</a> on Linux and Windows. + +<p> +An OpenDocument Presentation or "ODP" file is a +<a href="http://en.wikipedia.org/wiki/Zip_%28file_format%29">ZIP archive</a> containing +XML files describing presentation slides and separate image files for the +various images that are included as part of the presentation. +(OpenDocument word processor and spreadsheet files are similarly +structured but are not considered by this article.) The reader can +easily see the content of an ODP file by using the "zip -l" command. +For example, the following is the "zip -l" output from a 49-slide presentation +about SQLite from the 2014 +<a href="http://southeastlinuxfest.org/">SouthEast LinuxFest</a> +conference: + +<blockquote><pre> +Archive: self2014.odp + Length Date Time Name +--------- ---------- ----- ---- + 47 2014-06-21 12:34 mimetype + 0 2014-06-21 12:34 Configurations2/statusbar/ + 0 2014-06-21 12:34 Configurations2/accelerator/current.xml + 0 2014-06-21 12:34 Configurations2/floater/ + 0 2014-06-21 12:34 Configurations2/popupmenu/ + 0 2014-06-21 12:34 Configurations2/progressbar/ + 0 2014-06-21 12:34 Configurations2/menubar/ + 0 2014-06-21 12:34 Configurations2/toolbar/ + 0 2014-06-21 12:34 Configurations2/images/Bitmaps/ + 54702 2014-06-21 12:34 Pictures/10000000000001F40000018C595A5A3D.png + 46269 2014-06-21 12:34 Pictures/100000000000012C000000A8ED96BFD9.png +<i>... 58 other pictures omitted...</i> + 13013 2014-06-21 12:34 Pictures/10000000000000EE0000004765E03BA8.png + 1005059 2014-06-21 12:34 Pictures/10000000000004760000034223EACEFD.png + 211831 2014-06-21 12:34 content.xml + 46169 2014-06-21 12:34 styles.xml + 1001 2014-06-21 12:34 meta.xml + 9291 2014-06-21 12:34 Thumbnails/thumbnail.png + 38705 2014-06-21 12:34 Thumbnails/thumbnail.pdf + 9664 2014-06-21 12:34 settings.xml + 9704 2014-06-21 12:34 META-INF/manifest.xml +--------- ------- + 10961006 78 files +</pre></blockquote> + +<p> +The ODP ZIP archive contains four different XML files: +content.xml, styles.xml, meta.xml, and settings.xml. Those four files +define the slide layout, text content, and styling. This particular +presentation contains 62 images, ranging from full-screen pictures to +tiny icons, each stored as a separate file in the Pictures +folder. The "mimetype" file contains a single line of text that says: + +<blockquote><pre> +application/vnd.oasis.opendocument.presentation +</pre></blockquote> + +<p>The purpose of the other files and folders is presently +unknown to the author but is probably not difficult to figure out. + +<h2>Limitations Of The OpenDocument Presentation Format</h2> + +<p> +The use of a ZIP archive to encapsulate XML files plus resources is an +elegant approach to an application file format. +It is clearly superior to a custom binary file format. +But using an SQLite database as the +container, instead of ZIP, would be more elegant still. + +<p>A ZIP archive is basically a key/value database, optimized for +the case of write-once/read-many and for a relatively small number +of distinct keys (a few hundred to a few thousand) each with a large BLOB +as its value. A ZIP archive can be viewed as a "pile-of-files" +database. This works, but it has some shortcomings relative to an +SQLite database, as follows: + +<ol> +<li><p><b>Incremental update is hard.</b> +<p> +It is difficult to update individual entries in a ZIP archive. +It is especially difficult to update individual entries in a ZIP +archive in a way that does not destroy +the entire document if the computer loses power and/or crashes +in the middle of the update. It is not impossible to do this, but +it is sufficiently difficult that nobody actually does it. Instead, whenever +the user selects "File/Save", the entire ZIP archive is rewritten. +Hence, "File/Save" takes longer than it ought, especially on +older hardware. Newer machines are faster, but it is still bothersome +that changing a single character in a 50 megabyte presentation causes one +to burn through 50 megabytes of the finite write life on the SSD. + +<li><p><b>Startup is slow.</b> +<p> +In keeping with the pile-of-files theme, OpenDocument stores all slide +content in a single big XML file named "content.xml". +LibreOffice reads and parses this entire file just to display +the first slide. +LibreOffice also seems to +read all images into memory as well, which makes sense seeing as when +the user does "File/Save" it is going to have to write them all back out +again, even though none of them changed. The net effect is that +start-up is slow. Double-clicking an OpenDocument file brings up a +progress bar rather than the first slide. +This results in a bad user experience. +The situation grows ever more annoying as +the document size increases. + +<li><p><b>More memory is required.</b> +<p> +Because ZIP archives are optimized for storing big chunks of content, they +encourage a style of programming where the entire document is read into +memory at startup, all editing occurs in memory, then the entire document +is written to disk during "File/Save". OpenOffice and its descendants +embrace that pattern. + +<p> +One might argue that it is ok, in this era of multi-gigabyte desktops, to +read the entire document into memory. +But it is not ok. +For one, the amount of memory used far exceeds the (compressed) file size +on disk. So a 50MB presentation might take 200MB or more RAM. +That still is not a problem if one only edits a single document at a time. +But when working on a talk, this author will typically have 10 or 15 different +presentations up all at the same +time (to facilitate copy/paste of slides from past presentation) and so +gigabytes of memory are required. +Add in an open web browser or two and a few other +desktop apps, and suddenly the disk is whirling and the machine is swapping. +And even having just a single document is a problem when working +on an inexpensive Chromebook retrofitted with Ubuntu. +Using less memory is always better. +</p> + +<li><p><b>Crash recovery is difficult.</b> +<p> +The descendants of OpenOffice tend to segfault more often than commercial +competitors. Perhaps for this reason, the OpenOffice forks make +periodic backups of their in-memory documents so that users do not lose +all pending edits when the inevitable application crash does occur. +This causes frustrating pauses in the application for the few seconds +while each backup is being made. +After restarting from a crash, the user is presented with a dialog box +that walks them through the recovery process. Managing the crash +recovery this way involves lots of extra application logic and is +generally an annoyance to the user. + +<li><p><b>Content is inaccessible.</b> +<p> +One cannot easily view, change, or extract the content of an +OpenDocument presentation using generic tools. +The only reasonable way to view or edit an OpenDocument document is to open +it up using an application that is specifically designed to read or write +OpenDocument (read: LibreOffice or one of its cousins). The situation +could be worse. One can extract and view individual images (say) from +a presentation using just the "zip" archiver tool. But it is not reasonable +try to extract the text from a slide. Remember that all content is stored +in a single "context.xml" file. That file is XML, so it is a text file. +But it is not a text file that can be managed with an ordinary text +editor. For the example presentation above, the content.xml file +consist of exactly two lines. The first line of the file is just: + +<blockquote><pre> +<?xml version="1.0" encoding="UTF-8"?> +</pre></blockquote> + +<p>The second line of the file contains 211792 characters of +impenetrable XML. Yes, 211792 characters all on one line. +This file is a good stress-test for a text editor. +Thankfully, the file is not some obscure +binary format, but in terms of accessibility, it might as well be +written in Klingon. +</ol> + +<h2>First Improvement: Replace ZIP with SQLite</h2> + +<p> +Let us suppose that instead of using a ZIP archive to store its files, +OpenDocument used a very simple SQLite database with the following +single-table schema: + +<blockquote><pre> +CREATE TABLE OpenDocTree( + filename TEXT PRIMARY KEY, -- Name of file + filesize BIGINT, -- Size of file after decompression + content BLOB -- Compressed file content +); +</pre></blockquote> + +<p> +For this first experiment, nothing else about the file format is changed. +The OpenDocument is still a pile-of-files, only now each file is a row +in an SQLite database rather than an entry in a ZIP archive. +This simple change does not use the power of a relational +database. Even so, this simple change shows some improvements. + +<a name="smaller"></a> + +<p> +Surprisingly, using SQLite in place of ZIP makes the presentation +file smaller. Really. One would think that a relational database file +would be larger than a ZIP archive, but at least in the case of NeoOffice +that is not so. The following is an actual screen-scrape showing +the sizes of the same NeoOffice presentation, both in its original +ZIP archive format as generated by NeoOffice (self2014.odp), and +as repacked as an SQLite database using the +<a href="https://www.sqlite.org/sqlar/doc/trunk/README.md">SQLAR</a> utility: + +<blockquote><pre> +-rw-r--r-- 1 drh staff 10514994 Jun 8 14:32 self2014.odp +-rw-r--r-- 1 drh staff 10464256 Jun 8 14:37 self2014.sqlar +-rw-r--r-- 1 drh staff 10416644 Jun 8 14:40 zip.odp +</pre></blockquote> + +<p> +The SQLite database file ("self2014.sqlar") is about a +half percent smaller than the equivalent ODP file! How can this be? +Apparently the ZIP archive generator logic in NeoOffice +is not as efficient as it could be, because when the same pile-of-files +is recompressed using the command-line "zip" utility, one gets a file +("zip.odp") that is smaller still, by another half percent, as seen +in the third line above. So, a well-written ZIP archive +can be slightly smaller than the equivalent SQLite database, as one would +expect. But the difference is slight. The key take-away is that an +SQLite database is size-competitive with a ZIP archive. + +<p> +The other advantage to using SQLite in place of +ZIP is that the document can now be updated incrementally, without risk +of corrupting the document if a power loss or other crash occurs in the +middle of the update. (Remember that writes to +<a href="atomiccommit.html">SQLite databases are atomic</a>.) True, all the +content is still kept in a single big XML file ("content.xml") which must +be completely rewritten if so much as a single character changes. But +with SQLite, only that one file needs to change. The other 77 files in the +repository can remain unaltered. They do not all have to be rewritten, +which in turn makes "File/Save" run much faster and saves wear on SSDs. + +<h2>Second Improvement: Split content into smaller pieces</h2> + +<p> +A pile-of-files encourages content to be stored in a few large chunks. +In the case of ODP, there are just four XML files that define the layout +off all slides in a presentation. An SQLite database allows storing +information in a few large chunks, but SQLite is also adept and efficient +at storing information in numerous smaller pieces. + +<p> +So then, instead of storing all content for all slides in a single +oversized XML file ("content.xml"), suppose there was a separate table +for storing the content of each slide separately. The table schema +might look something like this: + +<blockquote><pre> +CREATE TABLE slide( + pageNumber INTEGER, -- The slide page number + slideContent TEXT -- Slide content as XML or JSON +); +CREATE INDEX slide_pgnum ON slide(pageNumber); -- Optional +</pre></blockquote> + +<p>The content of each slide could still be stored as compressed XML. +But now each page is stored separately. So when opening a new document, +the application could simply run: + +<blockquote><pre> +SELECT slideContent FROM slide WHERE pageNumber=1; +</pre></blockquote> + +<p>This query will quickly and efficiently return the content of the first +slide, which could then be speedily parsed and displayed to the user. +Only one page needs to be read and parsed in order render the first screen, +which means that the first screen appears much faster and +there is no longer a need for an annoying progress bar. + +<p>If the application wanted +to keep all content in memory, it could continue reading and parsing the +other pages using a background thread after drawing the first page. Or, +since reading from SQLite is so efficient, the application might +instead choose to reduce its memory footprint and only keep a single +slide in memory at a time. Or maybe it keeps the current slide and the +next slide in memory, to facility rapid transitions to the next slide. + +<p> +Notice that dividing up the content into smaller pieces using an SQLite +table gives flexibility to the implementation. The application can choose +to read all content into memory at startup. Or it can read just a +few pages into memory and keep the rest on disk. Or it can read just +single page into memory at a time. And different versions of the application +can make different choices without having to make any changes to the +file format. Such options are not available when all content is in +a single big XML file in a ZIP archive. + +<p> +Splitting content into smaller pieces also helps File/Save operations +to go faster. Instead of having to write back the content of all pages +when doing a File/Save, the application only has to write back those +pages that have actually changed. + +<p> +One minor downside of splitting content into smaller pieces is that +compression does not work as well on shorter texts and so the size of +the document might increase. But as the bulk of the document space +is used to store images, a small reduction in the compression efficiency +of the text content will hardly be noticeable, and is a small price +to pay for an improved user experience. + +<h2>Third Improvement: Versioning</h2> + +<p> +Once one is comfortable with the concept of storing each slide separately, +it is a small step to support versioning of the presentation. Consider +the following schema: + +<blockquote><pre> +CREATE TABLE slide( + slideId INTEGER PRIMARY KEY, + derivedFrom INTEGER REFERENCES slide, + content TEXT -- XML or JSON or whatever +); +CREATE TABLE version( + versionId INTEGER PRIMARY KEY, + priorVersion INTEGER REFERENCES version, + checkinTime DATETIME, -- When this version was saved + comment TEXT, -- Description of this version + manifest TEXT -- List of integer slideIds +); +</pre></blockquote> + +<p> +In this schema, instead of each slide having a page number that determines +its order within the presentation, each slide has a unique +integer identifier that is unrelated to where it occurs in sequence. +The order of slides in the presentation is determined by a list of +slideIds, stored as a text string in the MANIFEST column of the VERSION +table. +Since multiple entries are allowed in the VERSION table, that means that +multiple presentations can be stored in the same document. + +<p> +On startup, the application first decides which version it +wants to display. Since the versionId will naturally increase in time +and one would normally want to see the latest version, an appropriate +query might be: + +<blockquote><pre> +SELECT manifest, versionId FROM version ORDER BY versionId DESC LIMIT 1; +</pre></blockquote> + +<p> +Or perhaps the application would rather use the +most recent checkinTime: + +<blockquote><pre> +SELECT manifest, versionId, max(checkinTime) FROM version; +</pre></blockquote> + +<p> +Using a single query such as the above, the application obtains a list +of the slideIds for all slides in the presentation. The application then +queries for the content of the first slide, and parses and displays that +content, as before. + +<p>(Aside: Yes, that second query above that uses "max(checkinTime)" +really does work and really does return a well-defined answer in SQLite. +Such a query either returns an undefined answer or generates an error +in many other SQL database engines, but in SQLite it does what you would +expect: it returns the manifest and versionId of the entry that has the +maximum checkinTime.) + +<p>When the user does a "File/Save", instead of overwriting the modified +slides, the application can now make new entries in the SLIDE table for +just those slides that have been added or altered. Then it creates a +new entry in the VERSION table containing the revised manifest. + +<p>The VERSION table shown above has columns to record a check-in comment +(presumably supplied by the user) and the time and date at which the File/Save +action occurred. It also records the parent version to record the history +of changes. Perhaps the manifest could be stored as a delta from the +parent version, though typically the manifest will be small enough that +storing a delta might be more trouble than it is worth. The SLIDE table +also contains a derivedFrom column which could be used for delta encoding +if it is determined that saving the slide content as a delta from its +previous version is a worthwhile optimization. + +<p>So with this simple change, the ODP file now stores not just the most +recent edit to the presentation, but a history of all historic edits. The +user would normally want to see just the most recent edition of the +presentation, but if desired, the user can now go backwards in time to +see historical versions of the same presentation. + +<p>Or, multiple presentations could be stored within the same document. + +<p>With such a schema, the application would no longer need to make +periodic backups of the unsaved changes to a separate file to avoid lost +work in the event of a crash. Instead, a special "pending" version could +be allocated and unsaved changes could be written into the pending version. +Because only changes would need to be written, not the entire document, +saving the pending changes would only involve writing a few kilobytes of +content, not multiple megabytes, and would take milliseconds instead of +seconds, and so it could be done frequently and silently in the background. +Then when a crash occurs and the user reboots, all (or almost all) +of their work is retained. If the user decides to discard unsaved changes, +they simply go back to the previous version. + +<p> +There are details to fill in here. +Perhaps a screen can be provided that displays a history changes +(perhaps with a graph) allowing the user to select which version they +want to view or edit. Perhaps some facility can be provided to merge +forks that might occur in the version history. And perhaps the +application should provide a means to purge old and unwanted versions. +The key point is that using an SQLite database to store the content, +rather than a ZIP archive, makes all of these features much, much easier +to implement, which increases the possibility that they will eventually +get implemented. + +<h2>And So Forth...</h2> + +<p> +In the previous sections, we have seen how moving from a key/value +store implemented as a ZIP archive to a simple SQLite database +with just three tables can add significant capabilities to an application +file format. +We could continue to enhance the schema with new tables, with indexes +added for performance, with triggers and views for programming convenience, +and constraints to enforce consistency of content even in the face of +programming errors. Further enhancement ideas include: +<ul> +<li> Store an <a href="undoredo.html">automated undo/redo stack</a> in a database table so that + Undo could go back into prior edit sessions. +<li> Add <a href="fts3.html#fts4">full text search</a> capabilities to the slide deck, or across + multiple slide decks. +<li> Decompose the "settings.xml" file into an SQL table that + is more easily viewed and edited by separate applications. +<li> Break out the "Presentor Notes" from each slide into a separate + table, for easier access from third-party applications and/or scripts. +<li> Enhance the presentation concept beyond the simple linear sequence of + slides to allow for side-tracks and excursions to be taken depending on + how the audience is responding. +</ul> + +<p> +An SQLite database has a lot of capability, which +this essay has only begun to touch upon. But hopefully this quick glimpse +has convinced some readers that using an SQL database as an application +file format is worth a second look. + +<p> +Some readers might resist using SQLite as an application +file format due to prior exposure to enterprise SQL databases and +the caveats and limitations of those other systems. +For example, many enterprise database +engines advise against storing large strings or BLOBs in the database +and instead suggest that large strings and BLOBs be stored as separate +files and the filename stored in the database. But SQLite +is not like that. Any column of an SQLite database can hold +a string or BLOB up to about a gigabyte in size. And for strings and +BLOBs of 100 kilobytes or less, +<a href="intern-v-extern-blob.html">I/O performance is better</a> than using separate +files. + +<p> +Some readers might be reluctant to consider SQLite as an application +file format because they have been inculcated with the idea that all +SQL database schemas must be factored into third normal form and store +only small primitive data types such as strings and integers. Certainly +relational theory is important and designers should strive to understand +it. But, as demonstrated above, it is often quite acceptable to store +complex information as XML or JSON in text fields of a database. +Do what works, not what your database professor said you ought to do. + +<h2>Review Of The Benefits Of Using SQLite</h2> + +<p> +In summary, +the claim of this essay is that using SQLite as a container for an application +file format like OpenDocument +and storing lots of smaller objects in that container +works out much better than using a ZIP archive holding a few larger objects. +To wit: + +<ol> +<li><p> +An SQLite database file is approximately the same size, and in some cases +smaller, than a ZIP archive holding the same information. + +<li><p> +The <a href="atomiccommit.html">atomic update capabilities</a> +of SQLite allow small incremental changes +to be safely written into the document. This reduces total disk I/O +and improves File/Save performance, enhancing the user experience. + +<li><p> +Startup time is reduced by allowing the application to read in only the +content shown for the initial screen. This largely eliminates the +need to show a progress bar when opening a new document. The document +just pops up immediately, further enhancing the user experience. + +<li><p> +The memory footprint of the application can be dramatically reduced by +only loading content that is relevant to the current display and keeping +the bulk of the content on disk. The fast query capability of SQLite +make this a viable alternative to keeping all content in memory at all times. +And when applications use less memory, it makes the entire computer more +responsive, further enhancing the user experience. + +<li><p> +The schema of an SQL database is able to represent information more directly +and succinctly than a key/value database such as a ZIP archive. This makes +the document content more accessible to third-party applications and scripts +and facilitates advanced features such as built-in document versioning, and +incremental saving of work in progress for recovery after a crash. +</ol> + +<p> +These are just a few of the benefits of using SQLite as an application file +format — the benefits that seem most likely to improve the user +experience for applications like OpenOffice. Other applications might +benefit from SQLite in different ways. See the <a href="appfileformat.html">Application File Format</a> +document for additional ideas. + +<p> +Finally, let us reiterate that this essay is a thought experiment. +The OpenDocument format is well-established and already well-designed. +Nobody really believes that OpenDocument should be changed to use SQLite +as its container instead of ZIP. Nor is this article a criticism of +OpenDocument for not choosing SQLite as its container since OpenDocument +predates SQLite. Rather, the point of this article is to use OpenDocument +as a concrete example of how SQLite can be used to build better +application file formats for future projects. + + |