summaryrefslogtreecommitdiffstats
path: root/www/psow.html
blob: 8062135de8d9b2a26b76800fe68a68b5dd4d7956 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
<!DOCTYPE html>
<html><head>
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<link href="sqlite.css" rel="stylesheet">
<title>Powersafe Overwrite</title>
<!-- path= -->
</head>
<body>
<div class=nosearch>
<a href="index.html">
<img class="logo" src="images/sqlite370_banner.gif" alt="SQLite" border="0">
</a>
<div><!-- IE hack to prevent disappearing logo --></div>
<div class="tagline desktoponly">
Small. Fast. Reliable.<br>Choose any three.
</div>
<div class="menu mainmenu">
<ul>
<li><a href="index.html">Home</a>
<li class='mobileonly'><a href="javascript:void(0)" onclick='toggle_div("submenu")'>Menu</a>
<li class='wideonly'><a href='about.html'>About</a>
<li class='desktoponly'><a href="docs.html">Documentation</a>
<li class='desktoponly'><a href="download.html">Download</a>
<li class='wideonly'><a href='copyright.html'>License</a>
<li class='desktoponly'><a href="support.html">Support</a>
<li class='desktoponly'><a href="prosupport.html">Purchase</a>
<li class='search' id='search_menubutton'>
<a href="javascript:void(0)" onclick='toggle_search()'>Search</a>
</ul>
</div>
<div class="menu submenu" id="submenu">
<ul>
<li><a href='about.html'>About</a>
<li><a href='docs.html'>Documentation</a>
<li><a href='download.html'>Download</a>
<li><a href='support.html'>Support</a>
<li><a href='prosupport.html'>Purchase</a>
</ul>
</div>
<div class="searchmenu" id="searchmenu">
<form method="GET" action="search">
<select name="s" id="searchtype">
<option value="d">Search Documentation</option>
<option value="c">Search Changelog</option>
</select>
<input type="text" name="q" id="searchbox" value="">
<input type="submit" value="Go">
</form>
</div>
</div>
<script>
function toggle_div(nm) {
var w = document.getElementById(nm);
if( w.style.display=="block" ){
w.style.display = "none";
}else{
w.style.display = "block";
}
}
function toggle_search() {
var w = document.getElementById("searchmenu");
if( w.style.display=="block" ){
w.style.display = "none";
} else {
w.style.display = "block";
setTimeout(function(){
document.getElementById("searchbox").focus()
}, 30);
}
}
function div_off(nm){document.getElementById(nm).style.display="none";}
window.onbeforeunload = function(e){div_off("submenu");}
/* Disable the Search feature if we are not operating from CGI, since */
/* Search is accomplished using CGI and will not work without it. */
if( !location.origin || !location.origin.match || !location.origin.match(/http/) ){
document.getElementById("search_menubutton").style.display = "none";
}
/* Used by the Hide/Show button beside syntax diagrams, to toggle the */
function hideorshow(btn,obj){
var x = document.getElementById(obj);
var b = document.getElementById(btn);
if( x.style.display!='none' ){
x.style.display = 'none';
b.innerHTML='show';
}else{
x.style.display = '';
b.innerHTML='hide';
}
return false;
}
var antiRobot = 0;
function antiRobotGo(){
if( antiRobot!=3 ) return;
antiRobot = 7;
var j = document.getElementById("mtimelink");
if(j && j.hasAttribute("data-href")) j.href=j.getAttribute("data-href");
}
function antiRobotDefense(){
document.body.onmousedown=function(){
antiRobot |= 2;
antiRobotGo();
document.body.onmousedown=null;
}
document.body.onmousemove=function(){
antiRobot |= 2;
antiRobotGo();
document.body.onmousemove=null;
}
setTimeout(function(){
antiRobot |= 1;
antiRobotGo();
}, 100)
antiRobotGo();
}
antiRobotDefense();
</script>



<h1 align="center">Powersafe Overwrite</h1>

<p>"Powersafe overwrite" is a term used by the SQLite team to describe
a behavior of some filesystems and disk-controllers related to
data preservation during a power loss.  Powersafe overwrite
is a boolean property: either the storage system has it or it does not.

<p>We say that a system has the powersafe overwrite property if the following
statement is true:

<blockquote>
  <b>When an application writes a range of bytes in a file, no
  bytes outside of that range will change, even if the write occurs
  just before a crash or power failure.</b>
</blockquote>

<p>The powersafe overwrite property says nothing about the state of the
bytes that were written.  Those bytes might contain their old values,
their new values, random values, or some combination of these.  The powersafe
overwrite property merely states that writes cannot change bytes outside
of the range of bytes written.

<p>In other words, powersafe overwrite means that there is no "collateral
damage" when a power loss occurs while writing.  Only those bytes actually
being written might be damaged.

<p>In practical terms, what the powersafe write property means is that when
the disk controller detects an impending power loss, it finishes writing
whatever sector it is working on prior to parking the heads.  It means that
individual sector writes will complete once started, even if
there is a power loss.

<p>Consider what would happen if disk sector writes are interrupted
by a power loss.  If an application writes two or three bytes in the middle
of some file, the operating system will implement this by first reading
the entire sector containing those bytes, making the change to the
sector in memory, then writing the entire sector back to the disk.  If a power
loss occurs during the writeback and the sector was not completely written,
then on the next read after reboot, error correcting codes
in the sector will probably detect irreparable damage and the disk 
controller will read out the sector as all zeros or all ones.  Thus
values will have changed outside of the range of the two or three bytes 
that were written at the application level - a violation of the powersafe
overwrite property.

<h2>SQLite Assumptions About Powersafe Overwrite</h2>

<p>All versions of SQLite up to and including <a href="releaselog/3_7_9.html">version 3.7.9</a>
(2011-11-01) assume that
the filesystem does <u>not</u> provide powersafe overwrite.  SQLite 
has traditionally assumed that when any one byte of a file changes, all
other bytes within the same sector of that byte have the potential of
being corrupted on a power loss.  When writing, SQLite has made sure
to journal all bytes in the same sector of any modifications
and it pads journal files out to the next sector boundary so that
subsequent appends to that journal cannot damage prior records.
SQLite understands the sector size to be the value returned by the
xSectorSize method in the <a href="vfs.html">VFS</a>.  The SQLite team has often referred
to the value returned by xSectorSize as the "blast radius" of a write,
since it expresses the range of bytes that might be damaged if a power
loss occurs during the write.
The default <a href="vfs.html">VFSes</a> for unix and windows have always returned 512 as 
the sector size (or blast radius) for all versions of SQLite up to
and including version 3.7.9.

<p>Newer disk drives have begun using 4096 byte sectors however.  Beginning
with SQLite <a href="releaselog/3_7_10.html">version 3.7.10</a> (2012-01-16), 
the SQLite development team experimented with 
changes xSectorSize to report 4096 bytes as the blast radius.
This had the effect of increasing write overhead on
many databases.  For a database with a <a href="pragma.html#pragma_page_size">PRAGMA page_size</a> of 1024
(a very common choice) making a change to a single page in the database
now requires SQLite to backup three other adjacent pages to the rollback
journal, whereas formerly it only had to backup the one page that was
changing.  In <a href="wal.html">WAL mode</a>, each transaction had to be padded out to the
next 4096-byte boundary in the WAL file, rather than the next 512-byte
boundary, resulting in thousands of extra bytes being written
per transaction.

<p>The extra write overhead prompted a reexamination of assumptions about
powersafe overwrite.  With modern disk drives, the capacity has become
so large and the data density so great that a single sector is very
small and writing a single sector takes very little time.  We know that
disk drives can detect an impending power loss and continue
to operate for some small amount of time on residual energy because those
drives are able to park their heads before spinning down.  And
so if an impending power loss is detectable by the disk controller, it
seems reasonable that the controller will finish writing
whatever sector it is current working on when the imminent power loss 
is first detected, prior to parking the heads, as long as doing so
does not take too long, which it should not with
small and dense sectors.  Hence it seems reasonable
to assume powersafe overwrite for modern disks.  Indeed, BerkeleyDB has
made this assumption for decades, we are told.  Caution is advised
though. As Roger Binns noted on the SQLite developers mailing list:
"'poorly written' should be the main assumption about drive firmware."

<a name="tornpage"></a>

<h2>Torn Pages</h2>

<p>A torn page occurs when a database page is larger than a disk sector,
the database page is written to disk, but a power loss occurs prior to
all sectors of the database page being written.  Then, upon recovery, part of
the database page will have the old content while some other parts of the
page will have the new content.  Some database engines assume that 
page writes are atomic and hence a torn page is an unrecoverable error.
</p>

<p>SQLite never assumes that database page writes are atomic,
regardless of the PSOW setting.<sup>(1)</sup>
And hence SQLite is always able to automatically recover from torn pages
induced by a crash.  Enabling PSOW does not decrease SQLite's ability
to recover from a torn page.</p>

<h2>Changes In SQLite Version 3.7.10</h2>

<p>The <a href="vfs.html">VFS</a> for SQLite <a href="releaselog/3_7_10.html">version 3.7.10</a> (2012-01-16)
adds a new device characteristic 
named <a href="c3ref/c_iocap_atomic.html">SQLITE_IOCAP_POWERSAFE_OVERWRITE</a>.  Database files that report this
characteristic are assumed to reside on storage systems that have the
powersafe overwrite property.
The default unix and windows <a href="vfs.html">VFSes</a> now report
<a href="c3ref/c_iocap_atomic.html">SQLITE_IOCAP_POWERSAFE_OVERWRITE</a> if SQLite is compiled with
<a href="compile.html#powersafe_overwrite">-DSQLITE_POWERSAFE_OVERWRITE=1</a> or they
make the legacy assumption that storage does not have the powersafe
overwrite property if compiled with
<a href="compile.html#powersafe_overwrite">-DSQLITE_POWERSAFE_OVERWRITE=0</a>.
For now, the default is for powersafe overwrite to be turned on, though
we may revisit this in the future and default it off.

<p>The powersafe overwrite property for individual databases can be
specified as the database is opened using the "psow" query parameter
with a <a href="uri.html">URI filename</a>.  For example, to always assume powersafe
overwrite for a file (perhaps to ensure maximum write performance), 
open it as

<blockquote>
   file:somefile.db?psow=1
</blockquote>

<p>Or to be extra safe with a database and to force SQLite to assume the
database lacks powersafe overwrite, open it using

<blockquote>
   file:somefile.db?psow=0
</blockquote>

<p>There is also a new <a href="c3ref/c_fcntl_begin_atomic_write.html#sqlitefcntlpowersafeoverwrite">SQLITE_FCNTL_POWERSAFE_OVERWRITE</a> opcode for
the <a href="c3ref/file_control.html">sqlite3_file_control()</a> that allows
an application to query the powersafe overwrite property for a database
file.

<hr>
<h2>Notes:</h2>
<ol><li value=1><p>
SQLite never assumes atomic page writes <em>in its default configurations</em>.
But a custom <a href="vfs.html">VFS</a> can set one of the 
<a href="c3ref/c_iocap_atomic.html">SQLITE_IOCAP_ATOMIC</a> bits in the result of the xDeviceCharacteristic()
method and then SQLite will assume that page writes are atomic.  The
application must supply a custom VFS to accomplish this, however, since
none of the standard VFSes will ever set any of the atomic bits in the
xDeviceCharacteristics() vector.
</ol>
<p align="center"><small><i>This page last modified on  <a href="https://sqlite.org/docsrc/honeypot" id="mtimelink"  data-href="https://sqlite.org/docsrc/finfo/pages/psow.in?m=7c9c05107b">2022-01-08 05:02:57</a> UTC </small></i></p>