diff options
Diffstat (limited to 'www/useovernet.html')
-rw-r--r-- | www/useovernet.html | 365 |
1 files changed, 365 insertions, 0 deletions
diff --git a/www/useovernet.html b/www/useovernet.html new file mode 100644 index 0000000..228d7bf --- /dev/null +++ b/www/useovernet.html @@ -0,0 +1,365 @@ +<!DOCTYPE html> +<html><head> +<meta name="viewport" content="width=device-width, initial-scale=1.0"> +<meta http-equiv="content-type" content="text/html; charset=UTF-8"> +<link href="sqlite.css" rel="stylesheet"> +<title>SQLite Over a Network, Caveats and Considerations</title> +<!-- path= --> +</head> +<body> +<div class=nosearch> +<a href="index.html"> +<img class="logo" src="images/sqlite370_banner.gif" alt="SQLite" border="0"> +</a> +<div><!-- IE hack to prevent disappearing logo --></div> +<div class="tagline desktoponly"> +Small. Fast. Reliable.<br>Choose any three. +</div> +<div class="menu mainmenu"> +<ul> +<li><a href="index.html">Home</a> +<li class='mobileonly'><a href="javascript:void(0)" onclick='toggle_div("submenu")'>Menu</a> +<li class='wideonly'><a href='about.html'>About</a> +<li class='desktoponly'><a href="docs.html">Documentation</a> +<li class='desktoponly'><a href="download.html">Download</a> +<li class='wideonly'><a href='copyright.html'>License</a> +<li class='desktoponly'><a href="support.html">Support</a> +<li class='desktoponly'><a href="prosupport.html">Purchase</a> +<li class='search' id='search_menubutton'> +<a href="javascript:void(0)" onclick='toggle_search()'>Search</a> +</ul> +</div> +<div class="menu submenu" id="submenu"> +<ul> +<li><a href='about.html'>About</a> +<li><a href='docs.html'>Documentation</a> +<li><a href='download.html'>Download</a> +<li><a href='support.html'>Support</a> +<li><a href='prosupport.html'>Purchase</a> +</ul> +</div> +<div class="searchmenu" id="searchmenu"> +<form method="GET" action="search"> +<select name="s" id="searchtype"> +<option value="d">Search Documentation</option> +<option value="c">Search Changelog</option> +</select> +<input type="text" name="q" id="searchbox" value=""> +<input type="submit" value="Go"> +</form> +</div> +</div> +<script> +function toggle_div(nm) { +var w = document.getElementById(nm); +if( w.style.display=="block" ){ +w.style.display = "none"; +}else{ +w.style.display = "block"; +} +} +function toggle_search() { +var w = document.getElementById("searchmenu"); +if( w.style.display=="block" ){ +w.style.display = "none"; +} else { +w.style.display = "block"; +setTimeout(function(){ +document.getElementById("searchbox").focus() +}, 30); +} +} +function div_off(nm){document.getElementById(nm).style.display="none";} +window.onbeforeunload = function(e){div_off("submenu");} +/* Disable the Search feature if we are not operating from CGI, since */ +/* Search is accomplished using CGI and will not work without it. */ +if( !location.origin || !location.origin.match || !location.origin.match(/http/) ){ +document.getElementById("search_menubutton").style.display = "none"; +} +/* Used by the Hide/Show button beside syntax diagrams, to toggle the */ +function hideorshow(btn,obj){ +var x = document.getElementById(obj); +var b = document.getElementById(btn); +if( x.style.display!='none' ){ +x.style.display = 'none'; +b.innerHTML='show'; +}else{ +x.style.display = ''; +b.innerHTML='hide'; +} +return false; +} +var antiRobot = 0; +function antiRobotGo(){ +if( antiRobot!=3 ) return; +antiRobot = 7; +var j = document.getElementById("mtimelink"); +if(j && j.hasAttribute("data-href")) j.href=j.getAttribute("data-href"); +} +function antiRobotDefense(){ +document.body.onmousedown=function(){ +antiRobot |= 2; +antiRobotGo(); +document.body.onmousedown=null; +} +document.body.onmousemove=function(){ +antiRobot |= 2; +antiRobotGo(); +document.body.onmousemove=null; +} +setTimeout(function(){ +antiRobot |= 1; +antiRobotGo(); +}, 100) +antiRobotGo(); +} +antiRobotDefense(); +</script> + + + + +<h1 align="center">SQLite Over a Network,<br/>Caveats and Considerations</h1> + + +<h1>Introduction</h1> +<p> + Users of the SQLite library, particularly application developers, + who want to access a SQLite database + from different systems connected by a network are often + tempted to simply open a <a href="c3ref/sqlite3.html">database connection</a> by specifying + a filename which references a database file somewhere within + a network filesystem. ("remote database" here) + This "file" is then accessed by means of + OS API's which permit the illusion of I/O from/to a local file. + The illusion is good but imperfect in important ways. +</p><p> + This simple, "remote database" approach is usually not the best way + to use a single SQLite database from multiple systems, + (even if it appears to "work"), + as it often leads to various kinds of trouble and grief. + Because these problems are inevitable with some usages, + but not frequent or repeatable, + it behooves application developers to not rely + on early testing success to decide + that their remote database use will work as desired. +</p> +<h1>Issues Arising with Remote Database Files</h1> +<p> +This diagram shows components and their linkages +for reference in the discussion following: +</p> + +<div class="center imgcontainer"> +<div style="max-width:705px;"><svg xmlns='http://www.w3.org/2000/svg' class="pikchr" width="705" height="84" viewBox="0 0 940.32 112.32"> +<path d="M2,110L182,110L182,2L2,2Z" style="fill:none;stroke-width:2.16;stroke:rgb(0,0,0);" /> +<text x="92" y="39" text-anchor="middle" fill="rgb(0,0,0)" font-size="140%" dominant-baseline="central">Client</text> +<text x="92" y="73" text-anchor="middle" fill="rgb(0,0,0)" font-size="140%" dominant-baseline="central">Application</text> +<path d="M380,110L560,110L560,2L380,2Z" style="fill:none;stroke-width:2.16;stroke:rgb(0,0,0);" /> +<text x="470" y="22" text-anchor="middle" fill="rgb(0,0,0)" font-size="140%" dominant-baseline="central">SQLite</text> +<text x="470" y="56" text-anchor="middle" fill="rgb(0,0,0)" font-size="140%" dominant-baseline="central">Database</text> +<text x="470" y="90" text-anchor="middle" fill="rgb(0,0,0)" font-size="140%" dominant-baseline="central">Engine</text> +<path d="M758,12L758,99A90 10 0 0 0 938 99L938,12A90 10 0 0 0 758 12A90 10 0 0 0 938 12" style="fill:none;stroke-width:2.16;stroke:rgb(0,0,0);" /> +<text x="848" y="47" text-anchor="middle" fill="rgb(0,0,0)" font-size="140%" dominant-baseline="central">Database</text> +<text x="848" y="81" text-anchor="middle" fill="rgb(0,0,0)" font-size="140%" dominant-baseline="central">File(s)</text> +<polygon points="182,56 193,51 193,60" style="fill:rgb(0,0,0)"/> +<polygon points="380,56 368,60 368,51" style="fill:rgb(0,0,0)"/> +<path d="M187,56L374,56" style="fill:none;stroke-width:2.16;stroke:rgb(0,0,0);" /> +<text x="281" y="37" text-anchor="middle" fill="rgb(0,0,0)" font-size="140%" dominant-baseline="central">SQLite API</text> +<text x="281" y="74" text-anchor="middle" fill="rgb(0,0,0)" font-size="140%" dominant-baseline="central">Calls</text> +<polygon points="560,56 571,51 571,60" style="fill:rgb(0,0,0)"/> +<polygon points="758,56 746,60 746,51" style="fill:rgb(0,0,0)"/> +<path d="M565,56L752,56" style="fill:none;stroke-width:2.16;stroke:rgb(0,0,0);" /> +<text x="659" y="37" text-anchor="middle" fill="rgb(0,0,0)" font-size="140%" dominant-baseline="central">DB Engine</text> +<text x="659" y="74" text-anchor="middle" fill="rgb(0,0,0)" font-size="140%" dominant-baseline="central">File I/O</text> +</svg> +</div> +</div> +<p> + The issues arise from the properties and utilization + of the two data/control channels between the above three blocks. +</p> + +<h2>Channel Traffic Volume</h2> +<p> + The "API Call" channel carries less information + than the "File I/O" channel. + API calls to submit queries or specify data modification + normally require substantially fewer bits to be passed + back and forth than are transferred to/from + the database file to store or find the data. + Query result retrieval will normally require much more file traffic + than API traffic because the data to be returned is rarely + to be found without reading unrequested data. +</p> +<h2>Channel Bandwidth</h2> +<p> + The API Call channel operates at processor main memory speeds + (Giga-words/second), with data often passed by reference (and so not copied.) + In contrast, even the fastest File I/O channels are slower. + They require the data to be copied, usually over a medium + requiring bit-serialization. For spinning magnetic media, + transfers await platter rotation and head movement, then + are limited by spin velocity. +</p> +<p> + When the File I/O channel includes a network connection, + (in addition to some genuine File I/O at its far end), + additional slowness is imposed. Even where raw transfer + rate does not limit bandwidth, the traffic must still be + packetized and buffered at both ends. + Additional layers of I/O handlers add scheduling delays. + However, slowed transfers are the least significant + issue with network filesystems. +</p> +<h2>Channel Reliability</h2> +<p> + The "API Call" channel is highly reliable, to the extent + that error rates are unstated and ignored as negligible. + The channel fails only when the system loses power + (excepting meteorites, etc.) +</p> +<p> + The "File I/O" channel, when it directly reaches a local storage device, + is also highly reliable. + (Spinning storage MTBF exceeds 1 million hours, + and NVRAM lasts longer.) + Local devices also have a characteristic + which is critical for enabling database management software + to be designed to ensure <a href="transactional.html">ACID</a> behavior: + When all process writes to the device have completed, + (when POSIX fsync() or Windows FlushFileBuffers() calls return), + the filesystem then either has + stored the "written" data or will do so + before storing any subsequently written data. +</p> +<p> + When network filesystem apparatus and software layers are interposed + between filesystem clients and a filesystem over an actual storage device, + significant sources of failure and misbehavior are introduced. + While network data transfers are error-checked well, transfer packets + do not all reliably arrive at their destination once sent. + Some packets are clobbered by other packets and must be resent. + Under packet clobbering conditions, repeated retries + can impose delays exceeding + what is needed for similar data to reach local storage. + Some portions of what a client writes can end up stored + out of time order relative to other portions written. +</p> +<p> + Because of the disordering and outright data loss + which occur in network filesystem writes, it is critical + that sets of file writes can be accurately known to be done + before a subsequent set of file writes begins. + This assurance is obtained by use of robustly designed + and correctly implemented fsync() (or equivalent) OS functions. + Unfortunately for some applications, network filesystem sync + operation can be less robust than local filesystem sync. + Attaining robust sync in the face of network packet transport errors + is hard, and safeguards are sometimes relaxed in favor of performance. +</p> +<p> + A similar hazard arises with file locking in network filesystems. + SQLite relies on exclusive locks for write operations, and those have + been known to operate incorrectly for some network filesystems. This + has led to database corruption. That may happen again as the designers + of such change their implementation to suit more common use cases. +</p> +<p> + The bottom line is that network filesystem sync and locking reliability + vary among implementations and installations. The design + assumptions upon which it relies may hold more true where + an application is tested than where it is relied upon. + <b>Rely upon it at your (and your customers') peril.</b> + See <a href="lockingv3.html#how_to_corrupt">How To Corrupt Your Database Files</a>. +</p> + +<h1>Performance and Reliability Issues</h1> +<p> + From the above diagram and discussion, it is obvious that + performance (aka "speed") is degraded by insertion + of a network link into one of the two channels. + Consideration of relative traffic volumes between + the API Call channel and the File I/O channel + reveals that such insertion will have less performance + impact at the API Call channel. +</p> +<p> + Consideration of reliability impact is easier, with a clearer outcome: + Inserting a network link into the API Call channel may also result + in call failures at times. But if the Client Application + has bothered to use SQL/SQLite transactions properly, + such failures will only cause a transaction to fail + and be rolled back, without compromising the integrity + of the data. In contrast, if the network link is + inserted into the File I/O channel, transactions may fail + (as for the API Call insertion) but with the additional + effect that the remote database is corrupted. +</p> +<p> + These network unreliability issues can be mitigated, + completely or to an acceptable degree, + by using SQLite in rollback mode. + However, the SQLite library is not tested in across-a-network + scenarios, nor is that reasonably possible. + Hence, use of a remote database is done <b>at the user's risk</b>. +</p> + +<h1>Recommendations</h1> +<p> + Generally, if your data is separated from the application + by a network, you want to use a client/server database. + This is due to the fact that the database engine acts + as a bandwidth-reducing filter on the database traffic. +</p><p> + If your data is separated from the application by a network, + you want the low-traffic link to be across the network, + not the high-traffic link. This means that the database engine + needs to be on the same machine as the database itself. + Such is the case with a client/server database like PostgreSQL. + SQLite is different in that the database engine runs on + the same machine as the application, which forces the + higher-traffic link to traverse the network in remote + database scenarios. That normally results in lower performance. +</p><p> + Network filesystems do not support the ability to do + simultaneous reads and writes while at the same time + keeping the database consistent. + So if you have multiple clients on multiple different + machines which need to do simultaneous database + reads and writes, you have these choices: +</p><p> + 1. Use a client/server database engine. + <a href=https://postgresql.org/>PostgreSQL</a> + is an excellent choice. A variation of this is: +</p><p> + 2. Host an SQLite database in <a href="wal.html">WAL mode</a>, but do + all reads and writes from processes on the same machine + that stores the database file. + Implement a proxy that runs on the database machine that + relays read/write requests from remote machines. +</p><p> + 3. Use SQLite in <a href="isolation.html">rollback mode</a>. + This means you can have multiple simultaneous readers or one writer, + but not simultaneous readers and writers. +</p><p> + Application programmers should be cognizant of the possibility + that their application's users will elect to use a remote database + if they can do so. Unless one of the above choices + has been effected, or one at a time, exclusive access is used, + a programmer should consider blocking that + election unless reliability is of little importance. +</p> +<h1>Summary</h1> +<p> + Choose the technology that is right for you and your customers. + If your data lives on a different machine from your application, + then you should consider a client/server database. + SQLite is designed for situations where the data and application + coexist on the same machine. + SQLite can still be made to work in many remote database + situations, but a client/server solution will usually work + better in that scenario. +</p> +<p align="center"><small><i>This page last modified on <a href="https://sqlite.org/docsrc/honeypot" id="mtimelink" data-href="https://sqlite.org/docsrc/finfo/pages/useovernet.in?m=9c50e45c371c9589d">2022-06-22 21:14:29</a> UTC </small></i></p> + |