Adding upstream version 14.5.

Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
author: Daniel Baumann <daniel.baumann@progress-linux.org> 2024-05-04 12:15:05 +0000
committer: Daniel Baumann <daniel.baumann@progress-linux.org> 2024-05-04 12:15:05 +0000
commit: 46651ce6fe013220ed397add242004d764fc0153 (patch)
tree: 6e5299f990f88e60174a1d3ae6e48eedd2688b2b /doc/src/sgml/sources.sgml
parent: Initial commit. (diff)
1 files changed, 1029 insertions, 0 deletions
diff --git a/doc/src/sgml/sources.sgml b/doc/src/sgml/sources.sgml
new file mode 100644
index 0000000..e6ae02f
--- /dev/null
+++ b/doc/src/sgml/sources.sgml
@@ -0,0 +1,1029 @@
+<!-- doc/src/sgml/sources.sgml -->
+
+ <chapter id="source">
+  <title>PostgreSQL Coding Conventions</title>
+
+  <sect1 id="source-format">
+   <title>Formatting</title>
+
+   <para>
+    Source code formatting uses 4 column tab spacing, with
+    tabs preserved (i.e., tabs are not expanded to spaces).
+    Each logical indentation level is one additional tab stop.
+   </para>
+
+   <para>
+    Layout rules (brace positioning, etc.) follow BSD conventions.  In
+    particular, curly braces for the controlled blocks of <literal>if</literal>,
+    <literal>while</literal>, <literal>switch</literal>, etc., go on their own lines.
+   </para>
+
+   <para>
+    Limit line lengths so that the code is readable in an 80-column window.
+    (This doesn't mean that you must never go past 80 columns.  For instance,
+    breaking a long error message string in arbitrary places just to keep the
+    code within 80 columns is probably not a net gain in readability.)
+   </para>
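+
+   <para>
+    As a rough illustration of how 4-column tab stops interact with the
+    80-column guideline, the following sketch computes the display width of
+    a source line.  The helper name <function>display_width</function> is
+    hypothetical; it is not part of the <productname>PostgreSQL</productname>
+    tree or of <application>pgindent</application>.
+   </para>

```c
#include <stddef.h>

/*
 * Return the display width of one line of text when tab stops occur every
 * "tabstop" columns.  Hypothetical helper, for illustration only.
 */
size_t
display_width(const char *line, size_t tabstop)
{
    size_t      col = 0;

    for (; *line != '\0' && *line != '\n'; line++)
    {
        if (*line == '\t')
            col += tabstop - (col % tabstop);   /* jump to the next tab stop */
        else
            col++;                              /* ordinary character */
    }
    return col;
}
```

+   <para>
+    With a tab stop of 4, a line whose computed width exceeds 80 is a
+    candidate for rewrapping, subject to the exceptions described above.
+   </para>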
+
+   <para>
+    To maintain a consistent coding style, do not use C++ style comments
+    (<literal>//</literal> comments).  <application>pgindent</application>
+    will replace them with <literal>/* ... */</literal>.
+   </para>
+
+   <para>
+    The preferred style for multi-line comment blocks is
+<programlisting>
+/*
+ * comment text begins here
+ * and continues here
+ */
+</programlisting>
+    Note that comment blocks that begin in column 1 will be preserved as-is
+    by <application>pgindent</application>, but it will re-flow indented comment blocks
+    as though they were plain text.  If you want to preserve the line breaks
+    in an indented block, add dashes like this:
+<programlisting>
+    /*----------
+     * comment text begins here
+     * and continues here
+     *----------
+     */
+</programlisting>
+   </para>
+
+   <para>
+    While submitted patches do not absolutely have to follow these formatting
+    rules, it's a good idea to do so.  Your code will get run through
+    <application>pgindent</application> before the next release, so there's no point in
+    making it look nice under some other set of formatting conventions.
+    A good rule of thumb for patches is <quote>make the new code look like
+    the existing code around it</quote>.
+   </para>
+
+   <para>
+    The <filename>src/tools</filename> directory contains sample settings
+    files that can be used with the <productname>emacs</productname>,
+    <productname>xemacs</productname> or <productname>vim</productname>
+    editors to help ensure that they format code according to these
+    conventions.
+   </para>
+
+   <para>
+    The text browsing tools <application>more</application> and
+    <application>less</application> can be invoked as:
+<programlisting>
+more -x4
+less -x4
+</programlisting>
+    to make them show tabs appropriately.
+   </para>
+  </sect1>
+
+  <sect1 id="error-message-reporting">
+   <title>Reporting Errors Within the Server</title>
+
+   <indexterm>
+    <primary>ereport</primary>
+   </indexterm>
+   <indexterm>
+    <primary>elog</primary>
+   </indexterm>
+
+   <para>
+    Error, warning, and log messages generated within the server code
+    should be created using <function>ereport</function>, or its older cousin
+    <function>elog</function>.  The use of these functions is complex enough to
+    require some explanation.
+   </para>
+
+   <para>
+    There are two required elements for every message: a severity level
+    (ranging from <literal>DEBUG</literal> to <literal>PANIC</literal>) and a primary
+    message text.  In addition there are optional elements, the most
+    common of which is an error identifier code that follows the SQL spec's
+    SQLSTATE conventions.
+    <function>ereport</function> itself is just a shell macro that exists
+    mainly for the syntactic convenience of making message generation
+    look like a single function call in the C source code.  The only parameter
+    accepted directly by <function>ereport</function> is the severity level.
+    The primary message text and any optional message elements are
+    generated by calling auxiliary functions, such as <function>errmsg</function>,
+    within the <function>ereport</function> call.
+   </para>
+
+   <para>
+    A typical call to <function>ereport</function> might look like this:
+<programlisting>
+ereport(ERROR,
+        errcode(ERRCODE_DIVISION_BY_ZERO),
+        errmsg("division by zero"));
+</programlisting>
+    This specifies error severity level <literal>ERROR</literal> (a run-of-the-mill
+    error).  The <function>errcode</function> call specifies the SQLSTATE error code
+    using a macro defined in <filename>src/include/utils/errcodes.h</filename>.  The
+    <function>errmsg</function> call provides the primary message text.
+   </para>
+
+   <para>
+    You will also frequently see this older style, with an extra set of
+    parentheses surrounding the auxiliary function calls:
+<programlisting>
+ereport(ERROR,
+        (errcode(ERRCODE_DIVISION_BY_ZERO),
+         errmsg("division by zero")));
+</programlisting>
+    The extra parentheses were required
+    before <productname>PostgreSQL</productname> version 12, but are now
+    optional.
+   </para>
+
+   <para>
+    Here is a more complex example:
+<programlisting>
+ereport(ERROR,
+        errcode(ERRCODE_AMBIGUOUS_FUNCTION),
+        errmsg("function %s is not unique",
+               func_signature_string(funcname, nargs,
+                                     NIL, actual_arg_types)),
+        errhint("Unable to choose a best candidate function. "
+                "You might need to add explicit typecasts."));
+</programlisting>
+    This illustrates the use of format codes to embed run-time values into
+    a message text.  Also, an optional <quote>hint</quote> message is provided.
+    The auxiliary function calls can be written in any order, but
+    conventionally <function>errcode</function>
+    and <function>errmsg</function> appear first.
+   </para>
+
+   <para>
+    If the severity level is <literal>ERROR</literal> or higher,
+    <function>ereport</function> aborts execution of the current query
+    and does not return to the caller. If the severity level is
+    lower than <literal>ERROR</literal>, <function>ereport</function> returns normally.
+   </para>
+
+   <para>
+    The available auxiliary routines for <function>ereport</function> are:
+  <itemizedlist>
+   <listitem>
+    <para>
+     <function>errcode(sqlerrcode)</function> specifies the SQLSTATE error identifier
+     code for the condition.  If this routine is not called, the error
+     identifier defaults to
+     <literal>ERRCODE_INTERNAL_ERROR</literal> when the error severity level is
+     <literal>ERROR</literal> or higher, <literal>ERRCODE_WARNING</literal> when the
+     error level is <literal>WARNING</literal>, otherwise (for <literal>NOTICE</literal>
+     and below) <literal>ERRCODE_SUCCESSFUL_COMPLETION</literal>.
+     While these defaults are often convenient, always consider whether they
+     are appropriate before omitting the <function>errcode()</function> call.
+    </para>
+   </listitem>
+   <listitem>
+    <para>
+     <function>errmsg(const char *msg, ...)</function> specifies the primary error
+     message text, and possibly run-time values to insert into it.  Insertions
+     are specified by <function>sprintf</function>-style format codes.  In addition to
+     the standard format codes accepted by <function>sprintf</function>, the format
+     code <literal>%m</literal> can be used to insert the error message returned
+     by <function>strerror</function> for the current value of <literal>errno</literal>.
+     <footnote>
+      <para>
+       That is, the value that was current when the <function>ereport</function> call
+       was reached; changes of <literal>errno</literal> within the auxiliary reporting
+       routines will not affect it.  That would not be true if you were to
+       write <literal>strerror(errno)</literal> explicitly in <function>errmsg</function>'s
+       parameter list; accordingly, do not do so.
+      </para>
+     </footnote>
+     <literal>%m</literal> does not require any
+     corresponding entry in the parameter list for <function>errmsg</function>.
+     Note that the message string will be run through <function>gettext</function>
+     for possible localization before format codes are processed.
+    </para>
+   </listitem>
+   <listitem>
+    <para>
+     <function>errmsg_internal(const char *msg, ...)</function> is the same as
+     <function>errmsg</function>, except that the message string will not be
+     translated nor included in the internationalization message dictionary.
+     This should be used for <quote>cannot happen</quote> cases that are probably
+     not worth expending translation effort on.
+    </para>
+   </listitem>
+   <listitem>
+    <para>
+     <function>errmsg_plural(const char *fmt_singular, const char *fmt_plural,
+     unsigned long n, ...)</function> is like <function>errmsg</function>, but with
+     support for various plural forms of the message.
+     <replaceable>fmt_singular</replaceable> is the English singular format,
+     <replaceable>fmt_plural</replaceable> is the English plural format,
+     <replaceable>n</replaceable> is the integer value that determines which plural
+     form is needed, and the remaining arguments are formatted according
+     to the selected format string.  For more information see
+     <xref linkend="nls-guidelines"/>.
+    </para>
+   </listitem>
+   <listitem>
+    <para>
+     <function>errdetail(const char *msg, ...)</function> supplies an optional
+     <quote>detail</quote> message; this is to be used when there is additional
+     information that seems inappropriate to put in the primary message.
+     The message string is processed in just the same way as for
+     <function>errmsg</function>.
+    </para>
+   </listitem>
+   <listitem>
+    <para>
+     <function>errdetail_internal(const char *msg, ...)</function> is the same
+     as <function>errdetail</function>, except that the message string will not be
+     translated nor included in the internationalization message dictionary.
+     This should be used for detail messages that are not worth expending
+     translation effort on, for instance because they are too technical to be
+     useful to most users.
+    </para>
+   </listitem>
+   <listitem>
+    <para>
+     <function>errdetail_plural(const char *fmt_singular, const char *fmt_plural,
+     unsigned long n, ...)</function> is like <function>errdetail</function>, but with
+     support for various plural forms of the message.
+     For more information see <xref linkend="nls-guidelines"/>.
+    </para>
+   </listitem>
+   <listitem>
+    <para>
+     <function>errdetail_log(const char *msg, ...)</function> is the same as
+     <function>errdetail</function> except that this string goes only to the server
+     log, never to the client.  If both <function>errdetail</function> (or one of
+     its equivalents above) and
+     <function>errdetail_log</function> are used then one string goes to the client
+     and the other to the log.  This is useful for error details that are
+     too security-sensitive or too bulky to include in the report
+     sent to the client.
+    </para>
+   </listitem>
+   <listitem>
+    <para>
+     <function>errdetail_log_plural(const char *fmt_singular, const char
+     *fmt_plural, unsigned long n, ...)</function> is like
+     <function>errdetail_log</function>, but with support for various plural forms of
+     the message.
+     For more information see <xref linkend="nls-guidelines"/>.
+    </para>
+   </listitem>
+   <listitem>
+    <para>
+     <function>errhint(const char *msg, ...)</function> supplies an optional
+     <quote>hint</quote> message; this is to be used when offering suggestions
+     about how to fix the problem, as opposed to factual details about
+     what went wrong.
+     The message string is processed in just the same way as for
+     <function>errmsg</function>.
+    </para>
+   </listitem>
+   <listitem>
+    <para>
+     <function>errhint_plural(const char *fmt_singular, const char *fmt_plural,
+     unsigned long n, ...)</function> is like <function>errhint</function>, but with
+     support for various plural forms of the message.
+     For more information see <xref linkend="nls-guidelines"/>.
+    </para>
+   </listitem>
+   <listitem>
+    <para>
+     <function>errcontext(const char *msg, ...)</function> is not normally called
+     directly from an <function>ereport</function> message site; rather it is used
+     in <literal>error_context_stack</literal> callback functions to provide
+     information about the context in which an error occurred, such as the
+     current location in a PL function.
+     The message string is processed in just the same way as for
+     <function>errmsg</function>.  Unlike the other auxiliary functions, this can
+     be called more than once per <function>ereport</function> call; the successive
+     strings thus supplied are concatenated with separating newlines.
+    </para>
+   </listitem>
+   <listitem>
+    <para>
+     <function>errposition(int cursorpos)</function> specifies the textual location
+     of an error within a query string.  Currently it is only useful for
+     errors detected in the lexical and syntactic analysis phases of
+     query processing.
+    </para>
+   </listitem>
+   <listitem>
+    <para>
+     <function>errtable(Relation rel)</function> specifies a relation whose
+     name and schema name should be included as auxiliary fields in the error
+     report.
+    </para>
+   </listitem>
+   <listitem>
+    <para>
+     <function>errtablecol(Relation rel, int attnum)</function> specifies
+     a column whose name, table name, and schema name should be included as
+     auxiliary fields in the error report.
+    </para>
+   </listitem>
+   <listitem>
+    <para>
+     <function>errtableconstraint(Relation rel, const char *conname)</function>
+     specifies a table constraint whose name, table name, and schema name
+     should be included as auxiliary fields in the error report.  Indexes
+     should be considered to be constraints for this purpose, whether or
+     not they have an associated <structname>pg_constraint</structname> entry.  Be
+     careful to pass the underlying heap relation, not the index itself, as
+     <literal>rel</literal>.
+    </para>
+   </listitem>
+   <listitem>
+    <para>
+     <function>errdatatype(Oid datatypeOid)</function> specifies a data
+     type whose name and schema name should be included as auxiliary fields
+     in the error report.
+    </para>
+   </listitem>
+   <listitem>
+    <para>
+     <function>errdomainconstraint(Oid datatypeOid, const char *conname)</function>
+     specifies a domain constraint whose name, domain name, and schema name
+     should be included as auxiliary fields in the error report.
+    </para>
+   </listitem>
+   <listitem>
+    <para>
+     <function>errcode_for_file_access()</function> is a convenience function that
+     selects an appropriate SQLSTATE error identifier for a failure in a
+     file-access-related system call.  It uses the saved
+     <literal>errno</literal> to determine which error code to generate.
+     Usually this should be used in combination with <literal>%m</literal> in the
+     primary error message text.
+    </para>
+   </listitem>
+   <listitem>
+    <para>
+     <function>errcode_for_socket_access()</function> is a convenience function that
+     selects an appropriate SQLSTATE error identifier for a failure in a
+     socket-related system call.
+    </para>
+   </listitem>
+   <listitem>
+    <para>
+     <function>errhidestmt(bool hide_stmt)</function> can be called to specify
+     suppression of the <literal>STATEMENT:</literal> portion of a message in the
+     postmaster log.  Generally this is appropriate if the message text
+     includes the current statement already.
+    </para>
+   </listitem>
+   <listitem>
+    <para>
+     <function>errhidecontext(bool hide_ctx)</function> can be called to
+     specify suppression of the <literal>CONTEXT:</literal> portion of a message in
+     the postmaster log.  This should only be used for verbose debugging
+     messages where the repeated inclusion of context would bloat the log
+     too much.
+    </para>
+   </listitem>
+  </itemizedlist>
+   </para>
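+
+   <para>
+    The reason for preferring <literal>%m</literal> over an explicit
+    <literal>strerror(errno)</literal> can be sketched outside the server in
+    plain C.  The example below is only an analogy: the names
+    <function>noisy_helper</function> and
+    <function>format_open_failure</function> are hypothetical, and the code
+    does not use or reproduce the real <function>ereport</function>
+    machinery.  It shows the general technique of capturing
+    <literal>errno</literal> before calling anything that might change it.
+   </para>

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for an auxiliary routine that happens to overwrite
 * errno as a side effect.
 */
void
noisy_helper(void)
{
    errno = 0;
}

/*
 * Build a "could not open file" message using the errno value that was
 * current on entry, in the spirit of %m.
 */
void
format_open_failure(char *buf, size_t buflen, const char *path)
{
    int         saved_errno = errno;    /* capture before anything else runs */

    noisy_helper();                     /* a live errno would be lost here */
    snprintf(buf, buflen, "could not open file \"%s\": %s",
             path, strerror(saved_errno));
}
```

+   <para>
+    Had the function called <literal>strerror(errno)</literal> after
+    <function>noisy_helper</function>, the message would describe the wrong
+    error.
+   </para>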
+
+   <note>
+    <para>
+     At most one of the functions <function>errtable</function>,
+     <function>errtablecol</function>, <function>errtableconstraint</function>,
+     <function>errdatatype</function>, or <function>errdomainconstraint</function> should
+     be used in an <function>ereport</function> call.  These functions exist to
+     allow applications to extract the name of a database object associated
+     with the error condition without having to examine the
+     potentially-localized error message text.
+     These functions should be used in error reports for which it's likely
+     that applications would wish to have automatic error handling.  As of
+     <productname>PostgreSQL</productname> 9.3, complete coverage exists only for
+     errors in SQLSTATE class 23 (integrity constraint violation), but this
+     is likely to be expanded in future.
+    </para>
+   </note>
+
+   <para>
+    There is an older function <function>elog</function> that is still heavily used.
+    An <function>elog</function> call:
+<programlisting>
+elog(level, "format string", ...);
+</programlisting>
+    is exactly equivalent to:
+<programlisting>
+ereport(level, errmsg_internal("format string", ...));
+</programlisting>
+    Notice that the SQLSTATE error code is always defaulted, and the message
+    string is not subject to translation.
+    Therefore, <function>elog</function> should be used only for internal errors and
+    low-level debug logging.  Any message that is likely to be of interest to
+    ordinary users should go through <function>ereport</function>.  Nonetheless,
+    there are enough internal <quote>cannot happen</quote> error checks in the
+    system that <function>elog</function> is still widely used; it is preferred for
+    those messages for its notational simplicity.
+   </para>
+
+   <para>
+    Advice about writing good error messages can be found in
+    <xref linkend="error-style-guide"/>.
+   </para>
+  </sect1>
+
+  <sect1 id="error-style-guide">
+   <title>Error Message Style Guide</title>
+
+   <para>
+    This style guide is offered in the hope of maintaining a consistent,
+    user-friendly style throughout all the messages generated by
+    <productname>PostgreSQL</productname>.
+   </para>
+
+  <simplesect>
+   <title>What Goes Where</title>
+
+   <para>
+    The primary message should be short, factual, and avoid reference to
+    implementation details such as specific function names.
+    <quote>Short</quote> means <quote>should fit on one line under normal
+    conditions</quote>.  Use a detail message if needed to keep the primary
+    message short, or if you feel a need to mention implementation details
+    such as the particular system call that failed. Both primary and detail
+    messages should be factual.  Use a hint message for suggestions about what
+    to do to fix the problem, especially if the suggestion might not always be
+    applicable.
+   </para>
+
+   <para>
+    For example, instead of:
+<programlisting>
+IpcMemoryCreate: shmget(key=%d, size=%u, 0%o) failed: %m
+(plus a long addendum that is basically a hint)
+</programlisting>
+    write:
+<programlisting>
+Primary:    could not create shared memory segment: %m
+Detail:     Failed syscall was shmget(key=%d, size=%u, 0%o).
+Hint:       the addendum
+</programlisting>
+   </para>
+
+   <para>
+    Rationale: keeping the primary message short helps keep it to the point,
+    and lets clients lay out screen space on the assumption that one line is
+    enough for error messages.  Detail and hint messages can be relegated to a
+    verbose mode, or perhaps a pop-up error-details window.  Also, details and
+    hints would normally be suppressed from the server log to save
+    space. Reference to implementation details is best avoided since users
+    aren't expected to know the details.
+   </para>
+
+  </simplesect>
+
+  <simplesect>
+   <title>Formatting</title>
+
+   <para>
+    Don't put any specific assumptions about formatting into the message
+    texts.  Expect clients and the server log to wrap lines to fit their own
+    needs.  In long messages, newline characters (\n) can be used to indicate
+    suggested paragraph breaks.  Don't end a message with a newline.  Don't
+    use tabs or other formatting characters.  (In error context displays,
+    newlines are automatically added to separate levels of context such as
+    function calls.)
+   </para>
+
+   <para>
+    Rationale: Messages are not necessarily displayed on terminal-type
+    displays.  In GUI displays or browsers these formatting instructions are
+    at best ignored.
+   </para>
+
+  </simplesect>
+
+  <simplesect>
+   <title>Quotation Marks</title>
+
+   <para>
+    English text should use double quotes when quoting is appropriate.
+    Text in other languages should consistently use one kind of quotes that is
+    consistent with publishing customs and computer output of other programs.
+   </para>
+
+   <para>
+    Rationale: The choice of double quotes over single quotes is somewhat
+    arbitrary, but tends to be the preferred usage.  Some have suggested
+    choosing the kind of quotes depending on the type of object according to
+    SQL conventions (namely, strings single quoted, identifiers double
+    quoted).  But this is a language-internal technical issue that many users
+    aren't even familiar with, it won't scale to other kinds of quoted terms,
+    it doesn't translate to other languages, and it's pretty pointless, too.
+   </para>
+
+  </simplesect>
+
+  <simplesect>
+   <title>Use of Quotes</title>
+
+   <para>
+    Always use quotes to delimit file names, user-supplied identifiers, and
+    other variables that might contain words.  Do not use them to mark up
+    variables that will not contain words (for example, operator names).
+   </para>
+
+   <para>
+    There are functions in the backend that will double-quote their own output
+    as needed (for example, <function>format_type_be()</function>).  Do not put
+    additional quotes around the output of such functions.
+   </para>
+
+   <para>
+    Rationale: Objects can have names that create ambiguity when embedded in a
+    message.  Be consistent about denoting where a plugged-in name starts and
+    ends.  But don't clutter messages with unnecessary or duplicate quote
+    marks.
+   </para>
+
+  </simplesect>
+
+  <simplesect>
+   <title>Grammar and Punctuation</title>
+
+   <para>
+    The rules are different for primary error messages and for detail/hint
+    messages:
+   </para>
+
+   <para>
+    Primary error messages: Do not capitalize the first letter.  Do not end a
+    message with a period.  Do not even think about ending a message with an
+    exclamation point.
+   </para>
+
+   <para>
+    Detail and hint messages: Use complete sentences, and end each with
+    a period.  Capitalize the first word of sentences.  Put two spaces after
+    the period if another sentence follows (for English text; might be
+    inappropriate in other languages).
+   </para>
+
+   <para>
+    Error context strings: Do not capitalize the first letter and do
+    not end the string with a period.  Context strings should normally
+    not be complete sentences.
+   </para>
+
+   <para>
+    Rationale: Avoiding punctuation makes it easier for client applications to
+    embed the message into a variety of grammatical contexts.  Often, primary
+    messages are not grammatically complete sentences anyway.  (And if they're
+    long enough to be more than one sentence, they should be split into
+    primary and detail parts.)  However, detail and hint messages are longer
+    and might need to include multiple sentences.  For consistency, they should
+    follow complete-sentence style even when there's only one sentence.
+   </para>
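+
+   <para>
+    The punctuation rules for primary messages are mechanical enough to
+    check automatically.  The following is a minimal sketch of such a check;
+    the function name <function>primary_message_style_ok</function> is
+    hypothetical and no such tool is implied to exist in the
+    <productname>PostgreSQL</productname> tree.  It is a crude heuristic: a
+    message that legitimately begins with an upper-case SQL key word would
+    be flagged.
+   </para>

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical lint helper: return true if "msg" follows the primary-message
 * rules (no leading capital letter; no trailing period, exclamation point,
 * or newline).
 */
bool
primary_message_style_ok(const char *msg)
{
    size_t      len = strlen(msg);
    char        last;

    if (len == 0)
        return false;           /* an empty message is never acceptable */
    if (isupper((unsigned char) msg[0]))
        return false;           /* first letter must not be capitalized */

    last = msg[len - 1];
    if (last == '.' || last == '!' || last == '\n')
        return false;           /* no terminating punctuation or newline */

    return true;
}
```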
+
+  </simplesect>
+
+  <simplesect>
+   <title>Upper Case vs. Lower Case</title>
+
+   <para>
+    Use lower case for message wording, including the first letter of a
+    primary error message.  Use upper case for SQL commands and key words if
+    they appear in the message.
+   </para>
+
+   <para>
+    Rationale: It's easier to make everything look more consistent this
+    way, since some messages are complete sentences and some are not.
+   </para>
+
+  </simplesect>
+
+  <simplesect>
+   <title>Avoid Passive Voice</title>
+
+   <para>
+    Use the active voice.  Use complete sentences when there is an acting
+    subject (<quote>A could not do B</quote>).  Use telegram style without
+    subject if the subject would be the program itself; do not use
+    <quote>I</quote> for the program.
+   </para>
+
+   <para>
+    Rationale: The program is not human.  Don't pretend otherwise.
+   </para>
+
+  </simplesect>
+
+  <simplesect>
+   <title>Present vs. Past Tense</title>
+
+   <para>
+    Use past tense if an attempt to do something failed, but could perhaps
+    succeed next time (perhaps after fixing some problem).  Use present tense
+    if the failure is certainly permanent.
+   </para>
+
+   <para>
+    There is a nontrivial semantic difference between sentences of the form:
+<programlisting>
+could not open file "%s": %m
+</programlisting>
+and:
+<programlisting>
+cannot open file "%s"
+</programlisting>
+    The first one means that the attempt to open the file failed.  The
+    message should give a reason, such as <quote>disk full</quote> or
+    <quote>file doesn't exist</quote>.  The past tense is appropriate because
+    next time the disk might not be full anymore or the file in question might
+    exist.
+   </para>
+
+   <para>
+    The second form indicates that the functionality of opening the named file
+    does not exist at all in the program, or that it's conceptually
+    impossible.  The present tense is appropriate because the condition will
+    persist indefinitely.
+   </para>
+
+   <para>
+    Rationale: Granted, the average user will not be able to draw great
+    conclusions merely from the tense of the message, but since the language
+    provides us with a grammar we should use it correctly.
+   </para>
+
+  </simplesect>
+
+  <simplesect>
+   <title>Type of the Object</title>
+
+   <para>
+    When citing the name of an object, state what kind of object it is.
+   </para>
+
+   <para>
+    Rationale: Otherwise no one will know what <quote>foo.bar.baz</quote>
+    refers to.
+   </para>
+
+  </simplesect>
+
+  <simplesect>
+   <title>Brackets</title>
+
+   <para>
+    Square brackets are only to be used (1) in command synopses to denote
+    optional arguments, or (2) to denote an array subscript.
+   </para>
+
+   <para>
+    Rationale: Anything else does not correspond to widely known customary
+    usage and will confuse people.
+   </para>
+
+  </simplesect>
+
+  <simplesect>
+   <title>Assembling Error Messages</title>
+
+   <para>
+    When a message includes text that is generated elsewhere, embed it in
+    this style:
+<programlisting>
+could not open file %s: %m
+</programlisting>
+   </para>
+
+   <para>
+    Rationale: It would be difficult to account for all possible error codes
+    to paste this into a single smooth sentence, so some sort of punctuation
+    is needed.  Putting the embedded text in parentheses has also been
+    suggested, but it's unnatural if the embedded text is likely to be the
+    most important part of the message, as is often the case.
+   </para>
+
+  </simplesect>
+
+  <simplesect>
+   <title>Reasons for Errors</title>
+
+   <para>
+    Messages should always state the reason why an error occurred.
+    For example:
+<programlisting>
+BAD:    could not open file %s
+BETTER: could not open file %s (I/O failure)
+</programlisting>
+    If no reason is known, you had better fix the code.
+   </para>
+
+  </simplesect>
+
+  <simplesect>
+   <title>Function Names</title>
+
+   <para>
+    Don't include the name of the reporting routine in the error text. We have
+    other mechanisms for finding that out when needed, and for most users it's
+    not helpful information.  If the error text doesn't make as much sense
+    without the function name, reword it.
+<programlisting>
+BAD:    pg_strtoint32: error in "z": cannot parse "z"
+BETTER: invalid input syntax for type integer: "z"
+</programlisting>
+   </para>
+
+   <para>
+    Avoid mentioning called function names, either; instead say what the code
+    was trying to do:
+<programlisting>
+BAD:    open() failed: %m
+BETTER: could not open file %s: %m
+</programlisting>
+    If it really seems necessary, mention the system call in the detail
+    message.  (In some cases, providing the actual values passed to the
+    system call might be appropriate information for the detail message.)
+   </para>
+
+   <para>
+    Rationale: Users don't know what all those functions do.
+   </para>
+
+  </simplesect>
+
+  <simplesect>
+   <title>Tricky Words to Avoid</title>
+
+  <formalpara>
+    <title>Unable</title>
+   <para>
+    <quote>Unable</quote> is nearly the passive voice.  It is better to use
+    <quote>cannot</quote> or <quote>could not</quote>, as appropriate.
+   </para>
+  </formalpara>
+
+  <formalpara>
+    <title>Bad</title>
+   <para>
+    Error messages like <quote>bad result</quote> are really hard to interpret
+    intelligently.  It's better to write why the result is <quote>bad</quote>,
+    e.g., <quote>invalid format</quote>.
+   </para>
+  </formalpara>
+
+  <formalpara>
+    <title>Illegal</title>
+   <para>
+    <quote>Illegal</quote> stands for a violation of the law; everything else is
+    <quote>invalid</quote>.  Better yet, say why it's invalid.
+   </para>
+  </formalpara>
+
+  <formalpara>
+    <title>Unknown</title>
+   <para>
+    Try to avoid <quote>unknown</quote>.  Consider <quote>error: unknown
+    response</quote>.  If you don't know what the response is, how do you know
+    it's erroneous? <quote>Unrecognized</quote> is often a better choice.
+    Also, be sure to include the value being complained of.
+<programlisting>
+BAD:    unknown node type
+BETTER: unrecognized node type: 42
+</programlisting>
+   </para>
+  </formalpara>
+
+  <formalpara>
+    <title>Find vs. Exists</title>
+   <para>
+    If the program uses a nontrivial algorithm to locate a resource (e.g., a
+    path search) and that algorithm fails, it is fair to say that the program
+    couldn't <quote>find</quote> the resource.  If, on the other hand, the
+    expected location of the resource is known but the program cannot access
+    it there then say that the resource doesn't <quote>exist</quote>.  Using
+    <quote>find</quote> in this case sounds weak and confuses the issue.
+   </para>
+  </formalpara>
+
+  <formalpara>
+    <title>May vs. Can vs. Might</title>
+   <para>
+    <quote>May</quote> suggests permission (e.g., "You may borrow my rake."),
+    and has little use in documentation or error messages.
+    <quote>Can</quote> suggests ability (e.g., "I can lift that log."),
+    and <quote>might</quote> suggests possibility (e.g., "It might rain
+    today.").  Using the proper word clarifies meaning and assists
+    translation.
+   </para>
+  </formalpara>
+
+  <formalpara>
+    <title>Contractions</title>
+   <para>
+    Avoid contractions, like <quote>can't</quote>; use
+    <quote>cannot</quote> instead.
+   </para>
+  </formalpara>
+
+  <formalpara>
+    <title>Non-negative</title>
+   <para>
+    Avoid <quote>non-negative</quote> as it is ambiguous
+    about whether it accepts zero.  It's better to use
+    <quote>greater than zero</quote> or
+    <quote>greater than or equal to zero</quote>.
+   </para>
+  </formalpara>
+
+  </simplesect>
+
+  <simplesect>
+   <title>Proper Spelling</title>
+
+   <para>
+    Spell out words in full.  For instance, avoid:
+  <itemizedlist>
+   <listitem>
+    <para>
+     spec
+    </para>
+   </listitem>
+   <listitem>
+    <para>
+     stats
+    </para>
+   </listitem>
+   <listitem>
+    <para>
+     parens
+    </para>
+   </listitem>
+   <listitem>
+    <para>
+     auth
+    </para>
+   </listitem>
+   <listitem>
+    <para>
+     xact
+    </para>
+   </listitem>
+  </itemizedlist>
+   </para>
+
+   <para>
+    Rationale: This will improve consistency.
+   </para>
+
+  </simplesect>
+
+  <simplesect>
+   <title>Localization</title>
+
+   <para>
+    Keep in mind that error message texts need to be translated into other
+    languages.  Follow the guidelines in <xref linkend="nls-guidelines"/>
+    to avoid making life difficult for translators.
+   </para>
+  </simplesect>
+
+  </sect1>
+
+  <sect1 id="source-conventions">
+   <title>Miscellaneous Coding Conventions</title>
+
+   <simplesect>
+    <title>C Standard</title>
+    <para>
+     Code in <productname>PostgreSQL</productname> should only rely on language
+     features available in the C99 standard. That means a conforming
+     C99 compiler has to be able to compile postgres, at least aside
+     from a few platform dependent pieces.
+    </para>
+    <para>
+     A few features included in the C99 standard are, at this time, not
+     permitted to be used in core <productname>PostgreSQL</productname>
+     code. This currently includes variable length arrays, intermingled
+     declarations and code, <literal>//</literal> comments, and universal
+     character names. The reasons include portability and historical
+     practice.
+    </para>
+    <para>
+     Features from later revisions of the C standard or compiler-specific
+     features can be used if a fallback is provided.
+    </para>
+    <para>
+     For example, <literal>_Static_assert()</literal> and
+     <literal>__builtin_constant_p</literal> are currently used, even though
+     they come from a newer revision of the C standard and from a
+     <productname>GCC</productname> extension, respectively. Where
+     <literal>_Static_assert()</literal> is not available, we fall back to a
+     C99-compatible replacement that performs the same checks but emits
+     rather cryptic messages; where <literal>__builtin_constant_p</literal>
+     is not available, we simply do not use it.
+    </para>
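+    <para>
+     The fallback idea can be sketched as follows.  This is only an
+     illustration under stated assumptions: the macro names
+     (<literal>EXAMPLE_STATIC_ASSERT</literal>, <literal>EXAMPLE_CONCAT</literal>)
+     and the helper function are hypothetical and are not the definitions
+     used in the <productname>PostgreSQL</productname> tree.  The C99
+     branch relies on the fact that an array type with a negative size
+     does not compile.
```c
#include <limits.h>

/*
 * Hypothetical macro, for illustration only.  Use the C11 keyword when the
 * compiler advertises C11; otherwise fall back to a C99-compatible trick.
 */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L
#define EXAMPLE_STATIC_ASSERT(cond, msg) _Static_assert(cond, msg)
#else
/* Build a unique typedef name per line so the macro can be used repeatedly. */
#define EXAMPLE_CONCAT_(a, b) a##b
#define EXAMPLE_CONCAT(a, b) EXAMPLE_CONCAT_(a, b)
/* A negative array size is rejected, but msg is lost: the error is cryptic. */
#define EXAMPLE_STATIC_ASSERT(cond, msg) \
    typedef char EXAMPLE_CONCAT(example_static_assert_line_, __LINE__)[(cond) ? 1 : -1]
#endif

/* Checked at compile time with either branch of the macro. */
EXAMPLE_STATIC_ASSERT(CHAR_BIT == 8, "char must be 8 bits wide");

/* Hypothetical helper: number of bits in an int. */
static int
example_int_bits(void)
{
    return (int) (sizeof(int) * CHAR_BIT);
}
```
+     With either branch a false condition stops the build; only the
+     quality of the diagnostic differs, which is the trade-off described
+     above.
+    </para>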
+   </simplesect>
+
+   <simplesect>
+    <title>Function-Like Macros and Inline Functions</title>
+    <para>
+     Both macros with arguments and <literal>static inline</literal>
+     functions may be used. The latter are preferable if there are
+     multiple-evaluation hazards when written as a macro, as is the
+     case with, e.g.,
+<programlisting>
+#define Max(x, y)       ((x) > (y) ? (x) : (y))
+</programlisting>
+     or when the macro would be very long. In other cases it is only
+     possible, or at least easier, to use macros, for example because
+     expressions of various types need to be passed to the macro.
+    </para>
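+    <para>
+     The multiple-evaluation hazard can be made concrete with a small
+     sketch.  The <literal>Max</literal> macro is the one shown above;
+     everything prefixed with <literal>example_</literal> is hypothetical
+     and exists only for this illustration.  The argument expression has a
+     side effect (it increments a counter), so the number of evaluations
+     becomes observable.
```c
/* Same definition as in the text above. */
#define Max(x, y)       ((x) > (y) ? (x) : (y))

/* Hypothetical inline replacement; works for int arguments only. */
static inline int
example_max_int(int x, int y)
{
    return x > y ? x : y;
}

/* Hypothetical helper with a side effect: counts how often it is called. */
static int example_calls = 0;

static int
example_next_value(void)
{
    example_calls++;
    return 10;
}

/* Returns how many times the argument expression ran under the macro. */
static int
example_calls_with_macro(void)
{
    example_calls = 0;
    (void) Max(example_next_value(), 5);
    return example_calls;
}

/* Returns how many times it ran when passed to the inline function. */
static int
example_calls_with_inline(void)
{
    example_calls = 0;
    (void) example_max_int(example_next_value(), 5);
    return example_calls;
}
```
+     Here <literal>example_calls_with_macro()</literal> returns 2, because
+     the macro expands its first argument once in the comparison and once
+     in the result, while <literal>example_calls_with_inline()</literal>
+     returns 1.  The price of the inline version is that it is tied to a
+     single argument type.
+    </para>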
+    <para>
+     When the definition of an inline function references symbols
+     (i.e., variables, functions) that are only available as part of the
+     backend, the function must not be visible when included from frontend
+     code.
+<programlisting>
+#ifndef FRONTEND
+static inline MemoryContext
+MemoryContextSwitchTo(MemoryContext context)
+{
+    MemoryContext old = CurrentMemoryContext;
+
+    CurrentMemoryContext = context;
+    return old;
+}
+#endif   /* FRONTEND */
+</programlisting>
+     In this example, <literal>CurrentMemoryContext</literal>, which is only
+     available in the backend, is referenced, so the function is
+     hidden with <literal>#ifndef FRONTEND</literal>. This rule
+     exists because some compilers emit references to symbols
+     contained in inline functions even if the function is not used.
+    </para>
+   </simplesect>
+
+   <simplesect>
+    <title>Writing Signal Handlers</title>
+    <para>
+     To be suitable to run inside a signal handler, code has to be
+     written very carefully. The fundamental problem is that, unless
+     blocked, a signal handler can interrupt code at any time. If code
+     inside the signal handler uses the same state as code outside,
+     chaos can ensue. As an example, consider what happens if a signal
+     handler tries to acquire a lock that's already held in the
+     interrupted code.
+    </para>
+    <para>
+     Barring special arrangements, code in signal handlers may only
+     call async-signal-safe functions (as defined in POSIX) and access
+     variables of type <literal>volatile sig_atomic_t</literal>. A few
+     functions in <command>postgres</command> are also deemed signal safe,
+     most importantly <function>SetLatch()</function>.
+    </para>
+    <para>
+     In most cases signal handlers should do nothing more than note
+     that a signal has arrived, and wake up code running outside of
+     the handler using a latch. An example of such a handler is the
+     following:
+<programlisting>
+static void
+handle_sighup(SIGNAL_ARGS)
+{
+    int         save_errno = errno;
+
+    got_SIGHUP = true;
+    SetLatch(MyLatch);
+
+    errno = save_errno;
+}
+</programlisting>
+     <varname>errno</varname> is saved and restored because
+     <function>SetLatch()</function> might change it. If that were not done,
+     interrupted code that is currently inspecting <varname>errno</varname>
+     might see the wrong value.
+    </para>
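+    <para>
+     The other half of this pattern, the code outside the handler that
+     reacts to the flag, might look roughly like the sketch below.  It is
+     written against plain POSIX so that it stands alone: the latch
+     wake-up is omitted, and all names prefixed with
+     <literal>example_</literal> are hypothetical rather than taken from
+     the <productname>PostgreSQL</productname> source.
```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <errno.h>
#include <signal.h>
#include <string.h>

/* Flag shared between the handler and normal code. */
static volatile sig_atomic_t example_got_sighup = 0;

/* Handler: only records that the signal arrived, preserving errno. */
static void
example_handle_sighup(int signo)
{
    int         save_errno = errno;

    (void) signo;
    example_got_sighup = 1;

    errno = save_errno;
}

/* Installs the handler; returns 0 on success, -1 on failure. */
static int
example_install_handler(void)
{
    struct sigaction act;

    memset(&act, 0, sizeof(act));
    act.sa_handler = example_handle_sighup;
    sigemptyset(&act.sa_mask);
    act.sa_flags = SA_RESTART;
    return sigaction(SIGHUP, &act, NULL);
}

/*
 * Called from the main loop, outside the handler.  Returns 1 and clears
 * the flag if a SIGHUP arrived since the last call, otherwise returns 0.
 */
static int
example_consume_sighup(void)
{
    if (!example_got_sighup)
        return 0;
    example_got_sighup = 0;
    /* A real program would do the actual work here, e.g. reload its config. */
    return 1;
}
```
+     All nontrivial work happens in
+     <literal>example_consume_sighup()</literal>, where ordinary functions
+     can be called safely.  The flag is cleared before the work is done, so
+     a signal arriving during the work is noticed on the next call.
+    </para>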
+   </simplesect>
+
+   <simplesect>
+    <title>Calling Function Pointers</title>
+
+    <para>
+     For clarity, it is preferred to explicitly dereference a function pointer
+     when calling the pointed-to function if the pointer is a simple variable,
+     for example:
+<programlisting>
+(*emit_log_hook) (edata);
+</programlisting>
+     (even though <literal>emit_log_hook(edata)</literal> would also work).
+     When the function pointer is part of a structure, then the extra
+     punctuation can and usually should be omitted, for example:
+<programlisting>
+paramInfo->paramFetch(paramInfo, paramId);
+</programlisting>
+    </para>
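+    <para>
+     Both call styles are shown side by side in the following sketch.  The
+     hook variable, the structure, and the functions are hypothetical names
+     chosen for illustration; they only mirror the shape of the two
+     examples above.
```c
#include <stddef.h>

/* Hypothetical hook type and hook variable (a simple pointer variable). */
typedef int (*example_hook_type) (int value);

static example_hook_type example_hook = NULL;

/* Hypothetical structure carrying a function pointer as a member. */
typedef struct ExampleCallbacks
{
    int         (*transform) (int value);
} ExampleCallbacks;

static int
example_double(int value)
{
    return value * 2;
}

/* Simple variable: dereference explicitly to show it is a pointer call. */
static int
example_call_hook(int value)
{
    if (example_hook)
        return (*example_hook) (value);
    return value;
}

/* Structure member: the arrow already makes the indirection obvious. */
static int
example_call_member(const ExampleCallbacks *callbacks, int value)
{
    return callbacks->transform(value);
}
```
+     The explicit dereference in <literal>example_call_hook()</literal>
+     tells the reader that <literal>example_hook</literal> is a variable
+     that might be unset or replaced, not an ordinary function.
+    </para>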
+   </simplesect>
+  </sect1>
+ </chapter>