1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
|
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>19.2. Creating a Database Cluster</title><link rel="stylesheet" type="text/css" href="stylesheet.css" /><link rev="made" href="pgsql-docs@lists.postgresql.org" /><meta name="generator" content="DocBook XSL Stylesheets Vsnapshot" /><link rel="prev" href="postgres-user.html" title="19.1. The PostgreSQL User Account" /><link rel="next" href="server-start.html" title="19.3. Starting the Database Server" /></head><body id="docContent" class="container-fluid col-10"><div xmlns="http://www.w3.org/TR/xhtml1/transitional" class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="5" align="center">19.2. Creating a Database Cluster</th></tr><tr><td width="10%" align="left"><a accesskey="p" href="postgres-user.html" title="19.1. The PostgreSQL User Account">Prev</a> </td><td width="10%" align="left"><a accesskey="u" href="runtime.html" title="Chapter 19. Server Setup and Operation">Up</a></td><th width="60%" align="center">Chapter 19. Server Setup and Operation</th><td width="10%" align="right"><a accesskey="h" href="index.html" title="PostgreSQL 14.5 Documentation">Home</a></td><td width="10%" align="right"> <a accesskey="n" href="server-start.html" title="19.3. Starting the Database Server">Next</a></td></tr></table><hr></hr></div><div class="sect1" id="CREATING-CLUSTER"><div class="titlepage"><div><div><h2 class="title" style="clear: both">19.2. Creating a Database Cluster</h2></div></div></div><div class="toc"><dl class="toc"><dt><span class="sect2"><a href="creating-cluster.html#CREATING-CLUSTER-MOUNT-POINTS">19.2.1. Use of Secondary File Systems</a></span></dt><dt><span class="sect2"><a href="creating-cluster.html#CREATING-CLUSTER-FILESYSTEM">19.2.2. File Systems</a></span></dt></dl></div><a id="id-1.6.6.5.2" class="indexterm"></a><a id="id-1.6.6.5.3" class="indexterm"></a><p>
Before you can do anything, you must initialize a database storage
area on disk. We call this a <em class="firstterm">database cluster</em>.
(The <acronym class="acronym">SQL</acronym> standard uses the term catalog cluster.) A
database cluster is a collection of databases that is managed by a
single instance of a running database server. After initialization, a
database cluster will contain a database named <code class="literal">postgres</code>,
which is meant as a default database for use by utilities, users and third
party applications. The database server itself does not require the
<code class="literal">postgres</code> database to exist, but many external utility
programs assume it exists. Another database created within each cluster
during initialization is called
<code class="literal">template1</code>. As the name suggests, this will be used
as a template for subsequently created databases; it should not be
used for actual work. (See <a class="xref" href="managing-databases.html" title="Chapter 23. Managing Databases">Chapter 23</a> for
information about creating new databases within a cluster.)
</p><p>
In file system terms, a database cluster is a single directory
under which all data will be stored. We call this the <em class="firstterm">data
directory</em> or <em class="firstterm">data area</em>. It is
completely up to you where you choose to store your data. There is no
default, although locations such as
<code class="filename">/usr/local/pgsql/data</code> or
<code class="filename">/var/lib/pgsql/data</code> are popular.
The data directory must be initialized before being used, using the program
<a class="xref" href="app-initdb.html" title="initdb"><span class="refentrytitle"><span class="application">initdb</span></span></a><a id="id-1.6.6.5.5.6" class="indexterm"></a>
which is installed with <span class="productname">PostgreSQL</span>.
</p><p>
If you are using a pre-packaged version
of <span class="productname">PostgreSQL</span>, it may well have a specific
convention for where to place the data directory, and it may also
provide a script for creating the data directory. In that case you
should use that script in preference to
running <code class="command">initdb</code> directly.
Consult the package-level documentation for details.
</p><p>
To initialize a database cluster manually,
run <code class="command">initdb</code> and specify the desired
file system location of the database cluster with the
<code class="option">-D</code> option, for example:
</p><pre class="screen">
<code class="prompt">$</code> <strong class="userinput"><code>initdb -D /usr/local/pgsql/data</code></strong>
</pre><p>
Note that you must execute this command while logged into the
<span class="productname">PostgreSQL</span> user account, which is
described in the previous section.
</p><div class="tip"><h3 class="title">Tip</h3><p>
As an alternative to the <code class="option">-D</code> option, you can set
the environment variable <code class="envar">PGDATA</code>.
<a id="id-1.6.6.5.8.1.3" class="indexterm"></a>
</p></div><p>
Alternatively, you can run <code class="command">initdb</code> via
the <a class="xref" href="app-pg-ctl.html" title="pg_ctl"><span class="refentrytitle"><span class="application">pg_ctl</span></span></a>
program<a id="id-1.6.6.5.9.3" class="indexterm"></a> like so:
</p><pre class="screen">
<code class="prompt">$</code> <strong class="userinput"><code>pg_ctl -D /usr/local/pgsql/data initdb</code></strong>
</pre><p>
This may be more intuitive if you are
using <code class="command">pg_ctl</code> for starting and stopping the
server (see <a class="xref" href="server-start.html" title="19.3. Starting the Database Server">Section 19.3</a>), so
that <code class="command">pg_ctl</code> would be the sole command you use
for managing the database server instance.
</p><p>
<code class="command">initdb</code> will attempt to create the directory you
specify if it does not already exist. Of course, this will fail if
<code class="command">initdb</code> does not have permissions to write in the
parent directory. It's generally recommendable that the
<span class="productname">PostgreSQL</span> user own not just the data
directory but its parent directory as well, so that this should not
be a problem. If the desired parent directory doesn't exist either,
you will need to create it first, using root privileges if the
grandparent directory isn't writable. So the process might look
like this:
</p><pre class="screen">
root# <strong class="userinput"><code>mkdir /usr/local/pgsql</code></strong>
root# <strong class="userinput"><code>chown postgres /usr/local/pgsql</code></strong>
root# <strong class="userinput"><code>su postgres</code></strong>
postgres$ <strong class="userinput"><code>initdb -D /usr/local/pgsql/data</code></strong>
</pre><p>
</p><p>
<code class="command">initdb</code> will refuse to run if the data directory
exists and already contains files; this is to prevent accidentally
overwriting an existing installation.
</p><p>
Because the data directory contains all the data stored in the
database, it is essential that it be secured from unauthorized
access. <code class="command">initdb</code> therefore revokes access
permissions from everyone but the
<span class="productname">PostgreSQL</span> user, and optionally, group.
Group access, when enabled, is read-only. This allows an unprivileged
user in the same group as the cluster owner to take a backup of the
cluster data or perform other operations that only require read access.
</p><p>
Note that enabling or disabling group access on an existing cluster requires
the cluster to be shut down and the appropriate mode to be set on all
directories and files before restarting
<span class="productname">PostgreSQL</span>. Otherwise, a mix of modes might
exist in the data directory. For clusters that allow access only by the
owner, the appropriate modes are <code class="literal">0700</code> for directories
and <code class="literal">0600</code> for files. For clusters that also allow
reads by the group, the appropriate modes are <code class="literal">0750</code>
for directories and <code class="literal">0640</code> for files.
</p><p>
However, while the directory contents are secure, the default
client authentication setup allows any local user to connect to the
database and even become the database superuser. If you do not
trust other local users, we recommend you use one of
<code class="command">initdb</code>'s <code class="option">-W</code>, <code class="option">--pwprompt</code>
or <code class="option">--pwfile</code> options to assign a password to the
database superuser.<a id="id-1.6.6.5.14.5" class="indexterm"></a>
Also, specify <code class="option">-A scram-sha-256</code>
so that the default <code class="literal">trust</code> authentication
mode is not used; or modify the generated <code class="filename">pg_hba.conf</code>
file after running <code class="command">initdb</code>, but
<span class="emphasis"><em>before</em></span> you start the server for the first time. (Other
reasonable approaches include using <code class="literal">peer</code> authentication
or file system permissions to restrict connections. See <a class="xref" href="client-authentication.html" title="Chapter 21. Client Authentication">Chapter 21</a> for more information.)
</p><p>
<code class="command">initdb</code> also initializes the default
locale<a id="id-1.6.6.5.15.2" class="indexterm"></a> for the database cluster.
Normally, it will just take the locale settings in the environment
and apply them to the initialized database. It is possible to
specify a different locale for the database; more information about
that can be found in <a class="xref" href="locale.html" title="24.1. Locale Support">Section 24.1</a>. The default sort order used
within the particular database cluster is set by
<code class="command">initdb</code>, and while you can create new databases using
different sort order, the order used in the template databases that initdb
creates cannot be changed without dropping and recreating them.
There is also a performance impact for using locales
other than <code class="literal">C</code> or <code class="literal">POSIX</code>. Therefore, it is
important to make this choice correctly the first time.
</p><p>
<code class="command">initdb</code> also sets the default character set encoding
for the database cluster. Normally this should be chosen to match the
locale setting. For details see <a class="xref" href="multibyte.html" title="24.3. Character Set Support">Section 24.3</a>.
</p><p>
Non-<code class="literal">C</code> and non-<code class="literal">POSIX</code> locales rely on the
operating system's collation library for character set ordering.
This controls the ordering of keys stored in indexes. For this reason,
a cluster cannot switch to an incompatible collation library version,
either through snapshot restore, binary streaming replication, a
different operating system, or an operating system upgrade.
</p><div class="sect2" id="CREATING-CLUSTER-MOUNT-POINTS"><div class="titlepage"><div><div><h3 class="title">19.2.1. Use of Secondary File Systems</h3></div></div></div><a id="id-1.6.6.5.18.2" class="indexterm"></a><p>
Many installations create their database clusters on file systems
(volumes) other than the machine's <span class="quote">“<span class="quote">root</span>”</span> volume. If you
choose to do this, it is not advisable to try to use the secondary
volume's topmost directory (mount point) as the data directory.
Best practice is to create a directory within the mount-point
directory that is owned by the <span class="productname">PostgreSQL</span>
user, and then create the data directory within that. This avoids
permissions problems, particularly for operations such
as <span class="application">pg_upgrade</span>, and it also ensures clean failures if
the secondary volume is taken offline.
</p></div><div class="sect2" id="CREATING-CLUSTER-FILESYSTEM"><div class="titlepage"><div><div><h3 class="title">19.2.2. File Systems</h3></div></div></div><p>
Generally, any file system with POSIX semantics can be used for
PostgreSQL. Users prefer different file systems for a variety of reasons,
including vendor support, performance, and familiarity. Experience
suggests that, all other things being equal, one should not expect major
performance or behavior changes merely from switching file systems or
making minor file system configuration changes.
</p><div class="sect3" id="CREATING-CLUSTER-NFS"><div class="titlepage"><div><div><h4 class="title">19.2.2.1. NFS</h4></div></div></div><a id="id-1.6.6.5.19.3.2" class="indexterm"></a><p>
It is possible to use an <acronym class="acronym">NFS</acronym> file system for storing
the <span class="productname">PostgreSQL</span> data directory.
<span class="productname">PostgreSQL</span> does nothing special for
<acronym class="acronym">NFS</acronym> file systems, meaning it assumes
<acronym class="acronym">NFS</acronym> behaves exactly like locally-connected drives.
<span class="productname">PostgreSQL</span> does not use any functionality that
is known to have nonstandard behavior on <acronym class="acronym">NFS</acronym>, such as
file locking.
</p><p>
The only firm requirement for using <acronym class="acronym">NFS</acronym> with
<span class="productname">PostgreSQL</span> is that the file system is mounted
using the <code class="literal">hard</code> option. With the
<code class="literal">hard</code> option, processes can <span class="quote">“<span class="quote">hang</span>”</span>
indefinitely if there are network problems, so this configuration will
require a careful monitoring setup. The <code class="literal">soft</code> option
will interrupt system calls in case of network problems, but
<span class="productname">PostgreSQL</span> will not repeat system calls
interrupted in this way, so any such interruption will result in an I/O
error being reported.
</p><p>
It is not necessary to use the <code class="literal">sync</code> mount option. The
behavior of the <code class="literal">async</code> option is sufficient, since
<span class="productname">PostgreSQL</span> issues <code class="literal">fsync</code>
calls at appropriate times to flush the write caches. (This is analogous
to how it works on a local file system.) However, it is strongly
recommended to use the <code class="literal">sync</code> export option on the NFS
<span class="emphasis"><em>server</em></span> on systems where it exists (mainly Linux).
Otherwise, an <code class="literal">fsync</code> or equivalent on the NFS client is
not actually guaranteed to reach permanent storage on the server, which
could cause corruption similar to running with the parameter <a class="xref" href="runtime-config-wal.html#GUC-FSYNC">fsync</a> off. The defaults of these mount and export
options differ between vendors and versions, so it is recommended to
check and perhaps specify them explicitly in any case to avoid any
ambiguity.
</p><p>
In some cases, an external storage product can be accessed either via NFS
or a lower-level protocol such as iSCSI. In the latter case, the storage
appears as a block device and any available file system can be created on
it. That approach might relieve the DBA from having to deal with some of
the idiosyncrasies of NFS, but of course the complexity of managing
remote storage then happens at other levels.
</p></div></div></div><div xmlns="http://www.w3.org/TR/xhtml1/transitional" class="navfooter"><hr></hr><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="postgres-user.html" title="19.1. The PostgreSQL User Account">Prev</a> </td><td width="20%" align="center"><a accesskey="u" href="runtime.html" title="Chapter 19. Server Setup and Operation">Up</a></td><td width="40%" align="right"> <a accesskey="n" href="server-start.html" title="19.3. Starting the Database Server">Next</a></td></tr><tr><td width="40%" align="left" valign="top">19.1. The <span xmlns="http://www.w3.org/1999/xhtml" class="productname">PostgreSQL</span> User Account </td><td width="20%" align="center"><a accesskey="h" href="index.html" title="PostgreSQL 14.5 Documentation">Home</a></td><td width="40%" align="right" valign="top"> 19.3. Starting the Database Server</td></tr></table></div></body></html>
|