diff options
Diffstat (limited to 'doc/src/sgml/html/creating-cluster.html')
-rw-r--r-- | doc/src/sgml/html/creating-cluster.html | 203 |
1 files changed, 203 insertions, 0 deletions
diff --git a/doc/src/sgml/html/creating-cluster.html b/doc/src/sgml/html/creating-cluster.html new file mode 100644 index 0000000..d3fa3cb --- /dev/null +++ b/doc/src/sgml/html/creating-cluster.html @@ -0,0 +1,203 @@ +<?xml version="1.0" encoding="UTF-8" standalone="no"?> +<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>19.2. Creating a Database Cluster</title><link rel="stylesheet" type="text/css" href="stylesheet.css" /><link rev="made" href="pgsql-docs@lists.postgresql.org" /><meta name="generator" content="DocBook XSL Stylesheets Vsnapshot" /><link rel="prev" href="postgres-user.html" title="19.1. The PostgreSQL User Account" /><link rel="next" href="server-start.html" title="19.3. Starting the Database Server" /></head><body id="docContent" class="container-fluid col-10"><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="5" align="center">19.2. Creating a Database Cluster</th></tr><tr><td width="10%" align="left"><a accesskey="p" href="postgres-user.html" title="19.1. The PostgreSQL User Account">Prev</a> </td><td width="10%" align="left"><a accesskey="u" href="runtime.html" title="Chapter 19. Server Setup and Operation">Up</a></td><th width="60%" align="center">Chapter 19. Server Setup and Operation</th><td width="10%" align="right"><a accesskey="h" href="index.html" title="PostgreSQL 16.2 Documentation">Home</a></td><td width="10%" align="right"> <a accesskey="n" href="server-start.html" title="19.3. Starting the Database Server">Next</a></td></tr></table><hr /></div><div class="sect1" id="CREATING-CLUSTER"><div class="titlepage"><div><div><h2 class="title" style="clear: both">19.2. Creating a Database Cluster <a href="#CREATING-CLUSTER" class="id_link">#</a></h2></div></div></div><div class="toc"><dl class="toc"><dt><span class="sect2"><a href="creating-cluster.html#CREATING-CLUSTER-MOUNT-POINTS">19.2.1. Use of Secondary File Systems</a></span></dt><dt><span class="sect2"><a href="creating-cluster.html#CREATING-CLUSTER-FILESYSTEM">19.2.2. File Systems</a></span></dt></dl></div><a id="id-1.6.6.5.2" class="indexterm"></a><a id="id-1.6.6.5.3" class="indexterm"></a><p> + Before you can do anything, you must initialize a database storage + area on disk. We call this a <em class="firstterm">database cluster</em>. + (The <acronym class="acronym">SQL</acronym> standard uses the term catalog cluster.) A + database cluster is a collection of databases that is managed by a + single instance of a running database server. After initialization, a + database cluster will contain a database named <code class="literal">postgres</code>, + which is meant as a default database for use by utilities, users and third + party applications. The database server itself does not require the + <code class="literal">postgres</code> database to exist, but many external utility + programs assume it exists. There are two more databases created within + each cluster during initialization, named <code class="literal">template1</code> + and <code class="literal">template0</code>. As the names suggest, these will be + used as templates for subsequently-created databases; they should not be + used for actual work. (See <a class="xref" href="managing-databases.html" title="Chapter 23. Managing Databases">Chapter 23</a> for + information about creating new databases within a cluster.) + </p><p> + In file system terms, a database cluster is a single directory + under which all data will be stored. We call this the <em class="firstterm">data + directory</em> or <em class="firstterm">data area</em>. It is + completely up to you where you choose to store your data. There is no + default, although locations such as + <code class="filename">/usr/local/pgsql/data</code> or + <code class="filename">/var/lib/pgsql/data</code> are popular. + The data directory must be initialized before being used, using the program + <a class="xref" href="app-initdb.html" title="initdb"><span class="refentrytitle"><span class="application">initdb</span></span></a><a id="id-1.6.6.5.5.6" class="indexterm"></a> + which is installed with <span class="productname">PostgreSQL</span>. + </p><p> + If you are using a pre-packaged version + of <span class="productname">PostgreSQL</span>, it may well have a specific + convention for where to place the data directory, and it may also + provide a script for creating the data directory. In that case you + should use that script in preference to + running <code class="command">initdb</code> directly. + Consult the package-level documentation for details. + </p><p> + To initialize a database cluster manually, + run <code class="command">initdb</code> and specify the desired + file system location of the database cluster with the + <code class="option">-D</code> option, for example: +</p><pre class="screen"> +<code class="prompt">$</code> <strong class="userinput"><code>initdb -D /usr/local/pgsql/data</code></strong> +</pre><p> + Note that you must execute this command while logged into the + <span class="productname">PostgreSQL</span> user account, which is + described in the previous section. + </p><div class="tip"><h3 class="title">Tip</h3><p> + As an alternative to the <code class="option">-D</code> option, you can set + the environment variable <code class="envar">PGDATA</code>. + <a id="id-1.6.6.5.8.1.3" class="indexterm"></a> + </p></div><p> + Alternatively, you can run <code class="command">initdb</code> via + the <a class="xref" href="app-pg-ctl.html" title="pg_ctl"><span class="refentrytitle"><span class="application">pg_ctl</span></span></a> + program<a id="id-1.6.6.5.9.3" class="indexterm"></a> like so: +</p><pre class="screen"> +<code class="prompt">$</code> <strong class="userinput"><code>pg_ctl -D /usr/local/pgsql/data initdb</code></strong> +</pre><p> + This may be more intuitive if you are + using <code class="command">pg_ctl</code> for starting and stopping the + server (see <a class="xref" href="server-start.html" title="19.3. Starting the Database Server">Section 19.3</a>), so + that <code class="command">pg_ctl</code> would be the sole command you use + for managing the database server instance. + </p><p> + <code class="command">initdb</code> will attempt to create the directory you + specify if it does not already exist. Of course, this will fail if + <code class="command">initdb</code> does not have permissions to write in the + parent directory. It's generally recommendable that the + <span class="productname">PostgreSQL</span> user own not just the data + directory but its parent directory as well, so that this should not + be a problem. If the desired parent directory doesn't exist either, + you will need to create it first, using root privileges if the + grandparent directory isn't writable. So the process might look + like this: +</p><pre class="screen"> +root# <strong class="userinput"><code>mkdir /usr/local/pgsql</code></strong> +root# <strong class="userinput"><code>chown postgres /usr/local/pgsql</code></strong> +root# <strong class="userinput"><code>su postgres</code></strong> +postgres$ <strong class="userinput"><code>initdb -D /usr/local/pgsql/data</code></strong> +</pre><p> + </p><p> + <code class="command">initdb</code> will refuse to run if the data directory + exists and already contains files; this is to prevent accidentally + overwriting an existing installation. + </p><p> + Because the data directory contains all the data stored in the + database, it is essential that it be secured from unauthorized + access. <code class="command">initdb</code> therefore revokes access + permissions from everyone but the + <span class="productname">PostgreSQL</span> user, and optionally, group. + Group access, when enabled, is read-only. This allows an unprivileged + user in the same group as the cluster owner to take a backup of the + cluster data or perform other operations that only require read access. + </p><p> + Note that enabling or disabling group access on an existing cluster requires + the cluster to be shut down and the appropriate mode to be set on all + directories and files before restarting + <span class="productname">PostgreSQL</span>. Otherwise, a mix of modes might + exist in the data directory. For clusters that allow access only by the + owner, the appropriate modes are <code class="literal">0700</code> for directories + and <code class="literal">0600</code> for files. For clusters that also allow + reads by the group, the appropriate modes are <code class="literal">0750</code> + for directories and <code class="literal">0640</code> for files. + </p><p> + However, while the directory contents are secure, the default + client authentication setup allows any local user to connect to the + database and even become the database superuser. If you do not + trust other local users, we recommend you use one of + <code class="command">initdb</code>'s <code class="option">-W</code>, <code class="option">--pwprompt</code> + or <code class="option">--pwfile</code> options to assign a password to the + database superuser.<a id="id-1.6.6.5.14.5" class="indexterm"></a> + Also, specify <code class="option">-A scram-sha-256</code> + so that the default <code class="literal">trust</code> authentication + mode is not used; or modify the generated <code class="filename">pg_hba.conf</code> + file after running <code class="command">initdb</code>, but + <span class="emphasis"><em>before</em></span> you start the server for the first time. (Other + reasonable approaches include using <code class="literal">peer</code> authentication + or file system permissions to restrict connections. See <a class="xref" href="client-authentication.html" title="Chapter 21. Client Authentication">Chapter 21</a> for more information.) + </p><p> + <code class="command">initdb</code> also initializes the default + locale<a id="id-1.6.6.5.15.2" class="indexterm"></a> for the database cluster. + Normally, it will just take the locale settings in the environment + and apply them to the initialized database. It is possible to + specify a different locale for the database; more information about + that can be found in <a class="xref" href="locale.html" title="24.1. Locale Support">Section 24.1</a>. The default sort order used + within the particular database cluster is set by + <code class="command">initdb</code>, and while you can create new databases using + different sort order, the order used in the template databases that initdb + creates cannot be changed without dropping and recreating them. + There is also a performance impact for using locales + other than <code class="literal">C</code> or <code class="literal">POSIX</code>. Therefore, it is + important to make this choice correctly the first time. + </p><p> + <code class="command">initdb</code> also sets the default character set encoding + for the database cluster. Normally this should be chosen to match the + locale setting. For details see <a class="xref" href="multibyte.html" title="24.3. Character Set Support">Section 24.3</a>. + </p><p> + Non-<code class="literal">C</code> and non-<code class="literal">POSIX</code> locales rely on the + operating system's collation library for character set ordering. + This controls the ordering of keys stored in indexes. For this reason, + a cluster cannot switch to an incompatible collation library version, + either through snapshot restore, binary streaming replication, a + different operating system, or an operating system upgrade. + </p><div class="sect2" id="CREATING-CLUSTER-MOUNT-POINTS"><div class="titlepage"><div><div><h3 class="title">19.2.1. Use of Secondary File Systems <a href="#CREATING-CLUSTER-MOUNT-POINTS" class="id_link">#</a></h3></div></div></div><a id="id-1.6.6.5.18.2" class="indexterm"></a><p> + Many installations create their database clusters on file systems + (volumes) other than the machine's <span class="quote">“<span class="quote">root</span>”</span> volume. If you + choose to do this, it is not advisable to try to use the secondary + volume's topmost directory (mount point) as the data directory. + Best practice is to create a directory within the mount-point + directory that is owned by the <span class="productname">PostgreSQL</span> + user, and then create the data directory within that. This avoids + permissions problems, particularly for operations such + as <span class="application">pg_upgrade</span>, and it also ensures clean failures if + the secondary volume is taken offline. + </p></div><div class="sect2" id="CREATING-CLUSTER-FILESYSTEM"><div class="titlepage"><div><div><h3 class="title">19.2.2. File Systems <a href="#CREATING-CLUSTER-FILESYSTEM" class="id_link">#</a></h3></div></div></div><p> + Generally, any file system with POSIX semantics can be used for + PostgreSQL. Users prefer different file systems for a variety of reasons, + including vendor support, performance, and familiarity. Experience + suggests that, all other things being equal, one should not expect major + performance or behavior changes merely from switching file systems or + making minor file system configuration changes. + </p><div class="sect3" id="CREATING-CLUSTER-NFS"><div class="titlepage"><div><div><h4 class="title">19.2.2.1. NFS <a href="#CREATING-CLUSTER-NFS" class="id_link">#</a></h4></div></div></div><a id="id-1.6.6.5.19.3.2" class="indexterm"></a><p> + It is possible to use an <acronym class="acronym">NFS</acronym> file system for storing + the <span class="productname">PostgreSQL</span> data directory. + <span class="productname">PostgreSQL</span> does nothing special for + <acronym class="acronym">NFS</acronym> file systems, meaning it assumes + <acronym class="acronym">NFS</acronym> behaves exactly like locally-connected drives. + <span class="productname">PostgreSQL</span> does not use any functionality that + is known to have nonstandard behavior on <acronym class="acronym">NFS</acronym>, such as + file locking. + </p><p> + The only firm requirement for using <acronym class="acronym">NFS</acronym> with + <span class="productname">PostgreSQL</span> is that the file system is mounted + using the <code class="literal">hard</code> option. With the + <code class="literal">hard</code> option, processes can <span class="quote">“<span class="quote">hang</span>”</span> + indefinitely if there are network problems, so this configuration will + require a careful monitoring setup. The <code class="literal">soft</code> option + will interrupt system calls in case of network problems, but + <span class="productname">PostgreSQL</span> will not repeat system calls + interrupted in this way, so any such interruption will result in an I/O + error being reported. + </p><p> + It is not necessary to use the <code class="literal">sync</code> mount option. The + behavior of the <code class="literal">async</code> option is sufficient, since + <span class="productname">PostgreSQL</span> issues <code class="literal">fsync</code> + calls at appropriate times to flush the write caches. (This is analogous + to how it works on a local file system.) However, it is strongly + recommended to use the <code class="literal">sync</code> export option on the NFS + <span class="emphasis"><em>server</em></span> on systems where it exists (mainly Linux). + Otherwise, an <code class="literal">fsync</code> or equivalent on the NFS client is + not actually guaranteed to reach permanent storage on the server, which + could cause corruption similar to running with the parameter <a class="xref" href="runtime-config-wal.html#GUC-FSYNC">fsync</a> off. The defaults of these mount and export + options differ between vendors and versions, so it is recommended to + check and perhaps specify them explicitly in any case to avoid any + ambiguity. + </p><p> + In some cases, an external storage product can be accessed either via NFS + or a lower-level protocol such as iSCSI. In the latter case, the storage + appears as a block device and any available file system can be created on + it. That approach might relieve the DBA from having to deal with some of + the idiosyncrasies of NFS, but of course the complexity of managing + remote storage then happens at other levels. + </p></div></div></div><div class="navfooter"><hr /><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="postgres-user.html" title="19.1. The PostgreSQL User Account">Prev</a> </td><td width="20%" align="center"><a accesskey="u" href="runtime.html" title="Chapter 19. Server Setup and Operation">Up</a></td><td width="40%" align="right"> <a accesskey="n" href="server-start.html" title="19.3. Starting the Database Server">Next</a></td></tr><tr><td width="40%" align="left" valign="top">19.1. The <span class="productname">PostgreSQL</span> User Account </td><td width="20%" align="center"><a accesskey="h" href="index.html" title="PostgreSQL 16.2 Documentation">Home</a></td><td width="40%" align="right" valign="top"> 19.3. Starting the Database Server</td></tr></table></div></body></html>
\ No newline at end of file |