Diffstat (limited to '')
-rw-r--r--doc/src/sgml/html/creating-cluster.html203
1 files changed, 203 insertions, 0 deletions
diff --git a/doc/src/sgml/html/creating-cluster.html b/doc/src/sgml/html/creating-cluster.html
new file mode 100644
index 0000000..d3fa3cb
--- /dev/null
+++ b/doc/src/sgml/html/creating-cluster.html
@@ -0,0 +1,203 @@
+<?xml version="1.0" encoding="UTF-8" standalone="no"?>
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>19.2. Creating a Database Cluster</title><link rel="stylesheet" type="text/css" href="stylesheet.css" /><link rev="made" href="pgsql-docs@lists.postgresql.org" /><meta name="generator" content="DocBook XSL Stylesheets Vsnapshot" /><link rel="prev" href="postgres-user.html" title="19.1. The PostgreSQL User Account" /><link rel="next" href="server-start.html" title="19.3. Starting the Database Server" /></head><body id="docContent" class="container-fluid col-10"><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="5" align="center">19.2. Creating a Database Cluster</th></tr><tr><td width="10%" align="left"><a accesskey="p" href="postgres-user.html" title="19.1. The PostgreSQL User Account">Prev</a> </td><td width="10%" align="left"><a accesskey="u" href="runtime.html" title="Chapter 19. Server Setup and Operation">Up</a></td><th width="60%" align="center">Chapter 19. Server Setup and Operation</th><td width="10%" align="right"><a accesskey="h" href="index.html" title="PostgreSQL 16.2 Documentation">Home</a></td><td width="10%" align="right"> <a accesskey="n" href="server-start.html" title="19.3. Starting the Database Server">Next</a></td></tr></table><hr /></div><div class="sect1" id="CREATING-CLUSTER"><div class="titlepage"><div><div><h2 class="title" style="clear: both">19.2. Creating a Database Cluster <a href="#CREATING-CLUSTER" class="id_link">#</a></h2></div></div></div><div class="toc"><dl class="toc"><dt><span class="sect2"><a href="creating-cluster.html#CREATING-CLUSTER-MOUNT-POINTS">19.2.1. Use of Secondary File Systems</a></span></dt><dt><span class="sect2"><a href="creating-cluster.html#CREATING-CLUSTER-FILESYSTEM">19.2.2. 
File Systems</a></span></dt></dl></div><a id="id-1.6.6.5.2" class="indexterm"></a><a id="id-1.6.6.5.3" class="indexterm"></a><p>
+ Before you can do anything, you must initialize a database storage
+ area on disk. We call this a <em class="firstterm">database cluster</em>.
+ (The <acronym class="acronym">SQL</acronym> standard uses the term catalog cluster.) A
+ database cluster is a collection of databases that is managed by a
+ single instance of a running database server. After initialization, a
+ database cluster will contain a database named <code class="literal">postgres</code>,
+ which is meant as a default database for use by utilities, users, and
+ third-party applications. The database server itself does not require the
+ <code class="literal">postgres</code> database to exist, but many external utility
+ programs assume it exists. There are two more databases created within
+ each cluster during initialization, named <code class="literal">template1</code>
+ and <code class="literal">template0</code>. As the names suggest, these will be
+ used as templates for subsequently-created databases; they should not be
+ used for actual work. (See <a class="xref" href="managing-databases.html" title="Chapter 23. Managing Databases">Chapter 23</a> for
+ information about creating new databases within a cluster.)
+ </p><p>
+ In file system terms, a database cluster is a single directory
+ under which all data will be stored. We call this the <em class="firstterm">data
+ directory</em> or <em class="firstterm">data area</em>. It is
+ completely up to you where you choose to store your data. There is no
+ default, although locations such as
+ <code class="filename">/usr/local/pgsql/data</code> or
+ <code class="filename">/var/lib/pgsql/data</code> are popular.
+ The data directory must be initialized before being used, using the program
+ <a class="xref" href="app-initdb.html" title="initdb"><span class="refentrytitle"><span class="application">initdb</span></span></a><a id="id-1.6.6.5.5.6" class="indexterm"></a>
+ which is installed with <span class="productname">PostgreSQL</span>.
+ </p><p>
+ If you are using a pre-packaged version
+ of <span class="productname">PostgreSQL</span>, it may well have a specific
+ convention for where to place the data directory, and it may also
+ provide a script for creating the data directory. In that case you
+ should use that script in preference to
+ running <code class="command">initdb</code> directly.
+ Consult the package-level documentation for details.
+ </p><p>
+ To initialize a database cluster manually,
+ run <code class="command">initdb</code> and specify the desired
+ file system location of the database cluster with the
+ <code class="option">-D</code> option, for example:
+</p><pre class="screen">
+<code class="prompt">$</code> <strong class="userinput"><code>initdb -D /usr/local/pgsql/data</code></strong>
+</pre><p>
+ Note that you must execute this command while logged in to the
+ <span class="productname">PostgreSQL</span> user account, which is
+ described in the previous section.
+ </p><div class="tip"><h3 class="title">Tip</h3><p>
+ As an alternative to the <code class="option">-D</code> option, you can set
+ the environment variable <code class="envar">PGDATA</code>.
+ <a id="id-1.6.6.5.8.1.3" class="indexterm"></a>
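+ As a minimal sketch, assuming the same example data directory as above
+ and a Bourne-style shell:
+</p><pre class="screen">
+<code class="prompt">$</code> <strong class="userinput"><code>export PGDATA=/usr/local/pgsql/data</code></strong>
+<code class="prompt">$</code> <strong class="userinput"><code>initdb</code></strong>
+</pre><p>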
+ </p></div><p>
+ Alternatively, you can run <code class="command">initdb</code> via
+ the <a class="xref" href="app-pg-ctl.html" title="pg_ctl"><span class="refentrytitle"><span class="application">pg_ctl</span></span></a>
+ program<a id="id-1.6.6.5.9.3" class="indexterm"></a> like so:
+</p><pre class="screen">
+<code class="prompt">$</code> <strong class="userinput"><code>pg_ctl -D /usr/local/pgsql/data initdb</code></strong>
+</pre><p>
+ This may be more intuitive if you are
+ using <code class="command">pg_ctl</code> for starting and stopping the
+ server (see <a class="xref" href="server-start.html" title="19.3. Starting the Database Server">Section 19.3</a>), so
+ that <code class="command">pg_ctl</code> would be the sole command you use
+ for managing the database server instance.
+ </p><p>
+ <code class="command">initdb</code> will attempt to create the directory you
+ specify if it does not already exist. Of course, this will fail if
+ <code class="command">initdb</code> does not have permissions to write in the
+ parent directory. It is generally recommended that the
+ <span class="productname">PostgreSQL</span> user own not just the data
+ directory but its parent directory as well, so that this is not
+ a problem. If the desired parent directory doesn't exist either,
+ you will need to create it first, using root privileges if the
+ grandparent directory isn't writable. So the process might look
+ like this:
+</p><pre class="screen">
+root# <strong class="userinput"><code>mkdir /usr/local/pgsql</code></strong>
+root# <strong class="userinput"><code>chown postgres /usr/local/pgsql</code></strong>
+root# <strong class="userinput"><code>su postgres</code></strong>
+postgres$ <strong class="userinput"><code>initdb -D /usr/local/pgsql/data</code></strong>
+</pre><p>
+ </p><p>
+ <code class="command">initdb</code> will refuse to run if the data directory
+ exists and already contains files; this is to prevent accidentally
+ overwriting an existing installation.
+ </p><p>
+ Because the data directory contains all the data stored in the
+ database, it is essential that it be secured from unauthorized
+ access. <code class="command">initdb</code> therefore revokes access
+ permissions from everyone but the
+ <span class="productname">PostgreSQL</span> user, and optionally, group.
+ Group access, when enabled, is read-only. This allows an unprivileged
+ user in the same group as the cluster owner to take a backup of the
+ cluster data or perform other operations that only require read access.
+ </p><p>
+ Note that enabling or disabling group access on an existing cluster requires
+ the cluster to be shut down and the appropriate mode to be set on all
+ directories and files before restarting
+ <span class="productname">PostgreSQL</span>. Otherwise, a mix of modes might
+ exist in the data directory. For clusters that allow access only by the
+ owner, the appropriate modes are <code class="literal">0700</code> for directories
+ and <code class="literal">0600</code> for files. For clusters that also allow
+ reads by the group, the appropriate modes are <code class="literal">0750</code>
+ for directories and <code class="literal">0640</code> for files.
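+ </p><p>
+ As a rough sketch of the mode change described above, and not a
+ substitute for your platform's own procedure, group read access could be
+ enabled on an existing cluster with the standard
+ <code class="command">find</code> and <code class="command">chmod</code> utilities.
+ The example assumes the data directory is
+ <code class="filename">/usr/local/pgsql/data</code> and that the commands are
+ run as the <span class="productname">PostgreSQL</span> user:
+</p><pre class="screen">
+postgres$ <strong class="userinput"><code>pg_ctl -D /usr/local/pgsql/data stop</code></strong>
+postgres$ <strong class="userinput"><code>find /usr/local/pgsql/data -type d -exec chmod 0750 {} +</code></strong>
+postgres$ <strong class="userinput"><code>find /usr/local/pgsql/data -type f -exec chmod 0640 {} +</code></strong>
+postgres$ <strong class="userinput"><code>pg_ctl -D /usr/local/pgsql/data start</code></strong>
+</pre><p>
+ To revert to owner-only access, the same commands would use
+ <code class="literal">0700</code> and <code class="literal">0600</code> instead.
+ For a new cluster, the <code class="option">--allow-group-access</code> option
+ of <code class="command">initdb</code> sets the group-readable modes at creation
+ time.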
+ </p><p>
+ However, while the directory contents are secure, the default
+ client authentication setup allows any local user to connect to the
+ database and even become the database superuser. If you do not
+ trust other local users, we recommend you use one of
+ <code class="command">initdb</code>'s <code class="option">-W</code>, <code class="option">--pwprompt</code>
+ or <code class="option">--pwfile</code> options to assign a password to the
+ database superuser.<a id="id-1.6.6.5.14.5" class="indexterm"></a>
+ Also, specify <code class="option">-A scram-sha-256</code>
+ so that the default <code class="literal">trust</code> authentication
+ mode is not used; or modify the generated <code class="filename">pg_hba.conf</code>
+ file after running <code class="command">initdb</code>, but
+ <span class="emphasis"><em>before</em></span> you start the server for the first time. (Other
+ reasonable approaches include using <code class="literal">peer</code> authentication
+ or file system permissions to restrict connections. See <a class="xref" href="client-authentication.html" title="Chapter 21. Client Authentication">Chapter 21</a> for more information.)
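+ </p><p>
+ For illustration only, one possible invocation that combines these
+ options, assuming the example data directory used earlier in this
+ section, might look like this:
+</p><pre class="screen">
+postgres$ <strong class="userinput"><code>initdb -D /usr/local/pgsql/data -A scram-sha-256 --pwprompt</code></strong>
+</pre><p>
+ With <code class="option">--pwprompt</code>, <code class="command">initdb</code>
+ prompts for the password to assign to the database superuser.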
+ </p><p>
+ <code class="command">initdb</code> also initializes the default
+ locale<a id="id-1.6.6.5.15.2" class="indexterm"></a> for the database cluster.
+ Normally, it will just take the locale settings in the environment
+ and apply them to the initialized database. It is possible to
+ specify a different locale for the database; more information about
+ that can be found in <a class="xref" href="locale.html" title="24.1. Locale Support">Section 24.1</a>. The default sort order used
+ within the particular database cluster is set by
+ <code class="command">initdb</code>, and while you can create new databases using
+ a different sort order, the order used in the template databases that
+ <code class="command">initdb</code> creates cannot be changed without dropping and recreating them.
+ There is also a performance impact for using locales
+ other than <code class="literal">C</code> or <code class="literal">POSIX</code>. Therefore, it is
+ important to make this choice correctly the first time.
+ </p><p>
+ <code class="command">initdb</code> also sets the default character set encoding
+ for the database cluster. Normally this should be chosen to match the
+ locale setting. For details see <a class="xref" href="multibyte.html" title="24.3. Character Set Support">Section 24.3</a>.
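+ </p><p>
+ As an illustrative sketch, the locale and encoding can be given
+ explicitly instead of being inherited from the environment. The locale
+ name <code class="literal">en_US.UTF-8</code> is only an assumption here; the
+ names actually available depend on the operating system:
+</p><pre class="screen">
+postgres$ <strong class="userinput"><code>initdb -D /usr/local/pgsql/data --locale=en_US.UTF-8 --encoding=UTF8</code></strong>
+</pre><p>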
+ </p><p>
+ Non-<code class="literal">C</code> and non-<code class="literal">POSIX</code> locales rely on the
+ operating system's collation library for character set ordering.
+ This controls the ordering of keys stored in indexes. For this reason,
+ a cluster cannot switch to an incompatible collation library version,
+ whether through snapshot restore, binary streaming replication, a move
+ to a different operating system, or an operating system upgrade.
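+ </p><p>
+ One way a version mismatch might be spotted, shown here as a hedged
+ sketch rather than a complete procedure, is to compare the collation
+ version recorded for each database with the version the operating system
+ currently provides. The example assumes a running server and a
+ connection to the <code class="literal">postgres</code> database:
+</p><pre class="screen">
+<code class="prompt">$</code> <strong class="userinput"><code>psql -d postgres -c "SELECT datname, datcollversion, pg_database_collation_actual_version(oid) FROM pg_database"</code></strong>
+</pre><p>
+ Rows in which the two version columns differ would warrant a closer
+ look. On platforms where no version information is available, the
+ columns can be null.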
+ </p><div class="sect2" id="CREATING-CLUSTER-MOUNT-POINTS"><div class="titlepage"><div><div><h3 class="title">19.2.1. Use of Secondary File Systems <a href="#CREATING-CLUSTER-MOUNT-POINTS" class="id_link">#</a></h3></div></div></div><a id="id-1.6.6.5.18.2" class="indexterm"></a><p>
+ Many installations create their database clusters on file systems
+ (volumes) other than the machine's <span class="quote">“<span class="quote">root</span>”</span> volume. If you
+ choose to do this, it is not advisable to try to use the secondary
+ volume's topmost directory (mount point) as the data directory.
+ Best practice is to create a directory within the mount-point
+ directory that is owned by the <span class="productname">PostgreSQL</span>
+ user, and then create the data directory within that. This avoids
+ permissions problems, particularly for operations such
+ as <span class="application">pg_upgrade</span>, and it also ensures clean failures if
+ the secondary volume is taken offline.
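+ </p><p>
+ The layout described above might be set up as in the following sketch.
+ The mount point <code class="filename">/mnt/pgdata</code> and the
+ subdirectory name <code class="filename">postgres</code> are hypothetical
+ and chosen only for illustration:
+</p><pre class="screen">
+root# <strong class="userinput"><code>mkdir /mnt/pgdata/postgres</code></strong>
+root# <strong class="userinput"><code>chown postgres /mnt/pgdata/postgres</code></strong>
+root# <strong class="userinput"><code>su postgres</code></strong>
+postgres$ <strong class="userinput"><code>initdb -D /mnt/pgdata/postgres/data</code></strong>
+</pre><p>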
+ </p></div><div class="sect2" id="CREATING-CLUSTER-FILESYSTEM"><div class="titlepage"><div><div><h3 class="title">19.2.2. File Systems <a href="#CREATING-CLUSTER-FILESYSTEM" class="id_link">#</a></h3></div></div></div><p>
+ Generally, any file system with POSIX semantics can be used for
+ PostgreSQL. Users prefer different file systems for a variety of reasons,
+ including vendor support, performance, and familiarity. Experience
+ suggests that, all other things being equal, one should not expect major
+ performance or behavior changes merely from switching file systems or
+ making minor file system configuration changes.
+ </p><div class="sect3" id="CREATING-CLUSTER-NFS"><div class="titlepage"><div><div><h4 class="title">19.2.2.1. NFS <a href="#CREATING-CLUSTER-NFS" class="id_link">#</a></h4></div></div></div><a id="id-1.6.6.5.19.3.2" class="indexterm"></a><p>
+ It is possible to use an <acronym class="acronym">NFS</acronym> file system for storing
+ the <span class="productname">PostgreSQL</span> data directory.
+ <span class="productname">PostgreSQL</span> does nothing special for
+ <acronym class="acronym">NFS</acronym> file systems, meaning it assumes
+ <acronym class="acronym">NFS</acronym> behaves exactly like locally connected drives.
+ <span class="productname">PostgreSQL</span> does not use any functionality that
+ is known to have nonstandard behavior on <acronym class="acronym">NFS</acronym>, such as
+ file locking.
+ </p><p>
+ The only firm requirement for using <acronym class="acronym">NFS</acronym> with
+ <span class="productname">PostgreSQL</span> is that the file system is mounted
+ using the <code class="literal">hard</code> option. With the
+ <code class="literal">hard</code> option, processes can <span class="quote">“<span class="quote">hang</span>”</span>
+ indefinitely if there are network problems, so this configuration will
+ require a careful monitoring setup. The <code class="literal">soft</code> option
+ will interrupt system calls in case of network problems, but
+ <span class="productname">PostgreSQL</span> will not repeat system calls
+ interrupted in this way, so any such interruption will result in an I/O
+ error being reported.
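+ </p><p>
+ A minimal sketch of a manual mount that requests the
+ <code class="literal">hard</code> option explicitly follows. The server name
+ <code class="literal">nfsserver</code>, the export path, and the mount point
+ are hypothetical, and the exact option syntax can vary between operating
+ systems:
+</p><pre class="screen">
+root# <strong class="userinput"><code>mount -t nfs -o hard nfsserver:/export/pgdata /mnt/pgdata</code></strong>
+</pre><p>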
+ </p><p>
+ It is not necessary to use the <code class="literal">sync</code> mount option. The
+ behavior of the <code class="literal">async</code> option is sufficient, since
+ <span class="productname">PostgreSQL</span> issues <code class="literal">fsync</code>
+ calls at appropriate times to flush the write caches. (This is analogous
+ to how it works on a local file system.) However, it is strongly
+ recommended to use the <code class="literal">sync</code> export option on the NFS
+ <span class="emphasis"><em>server</em></span> on systems where it exists (mainly Linux).
+ Otherwise, an <code class="literal">fsync</code> or equivalent on the NFS client is
+ not actually guaranteed to reach permanent storage on the server, which
+ could cause corruption similar to running with the parameter <a class="xref" href="runtime-config-wal.html#GUC-FSYNC">fsync</a> off. The defaults of these mount and export
+ options differ between vendors and versions, so it is recommended to
+ check and perhaps specify them explicitly in any case to avoid any
+ ambiguity.
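+ </p><p>
+ On a Linux NFS server, an entry in <code class="filename">/etc/exports</code>
+ that states the <code class="literal">sync</code> export option explicitly
+ might look like the following sketch. The export path and the client
+ host name <code class="literal">pgclient</code> are hypothetical:
+</p><pre class="programlisting">
+/export/pgdata  pgclient(rw,sync)
+</pre><p>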
+ </p><p>
+ In some cases, an external storage product can be accessed either via NFS
+ or a lower-level protocol such as iSCSI. In the latter case, the storage
+ appears as a block device and any available file system can be created on
+ it. That approach might relieve the DBA from having to deal with some of
+ the idiosyncrasies of NFS, but of course the complexity of managing
+ remote storage then happens at other levels.
+ </p></div></div></div><div class="navfooter"><hr /><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="postgres-user.html" title="19.1. The PostgreSQL User Account">Prev</a> </td><td width="20%" align="center"><a accesskey="u" href="runtime.html" title="Chapter 19. Server Setup and Operation">Up</a></td><td width="40%" align="right"> <a accesskey="n" href="server-start.html" title="19.3. Starting the Database Server">Next</a></td></tr><tr><td width="40%" align="left" valign="top">19.1. The <span class="productname">PostgreSQL</span> User Account </td><td width="20%" align="center"><a accesskey="h" href="index.html" title="PostgreSQL 16.2 Documentation">Home</a></td><td width="40%" align="right" valign="top"> 19.3. Starting the Database Server</td></tr></table></div></body></html> \ No newline at end of file