author | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-05-04 12:15:05 +0000 |
---|---|---|
committer | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-05-04 12:15:05 +0000 |
commit | 46651ce6fe013220ed397add242004d764fc0153 (patch) | |
tree | 6e5299f990f88e60174a1d3ae6e48eedd2688b2b /doc/src/sgml/sources.sgml | |
parent | Initial commit. (diff) | |
Adding upstream version 14.5. (upstream/14.5, upstream)
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'doc/src/sgml/sources.sgml')
-rw-r--r-- | doc/src/sgml/sources.sgml | 1029 |
1 files changed, 1029 insertions, 0 deletions
diff --git a/doc/src/sgml/sources.sgml b/doc/src/sgml/sources.sgml new file mode 100644 index 0000000..e6ae02f --- /dev/null +++ b/doc/src/sgml/sources.sgml @@ -0,0 +1,1029 @@ +<!-- doc/src/sgml/sources.sgml --> + + <chapter id="source"> + <title>PostgreSQL Coding Conventions</title> + + <sect1 id="source-format"> + <title>Formatting</title> + + <para> + Source code formatting uses 4 column tab spacing, with + tabs preserved (i.e., tabs are not expanded to spaces). + Each logical indentation level is one additional tab stop. + </para> + + <para> + Layout rules (brace positioning, etc) follow BSD conventions. In + particular, curly braces for the controlled blocks of <literal>if</literal>, + <literal>while</literal>, <literal>switch</literal>, etc go on their own lines. + </para> + + <para> + Limit line lengths so that the code is readable in an 80-column window. + (This doesn't mean that you must never go past 80 columns. For instance, + breaking a long error message string in arbitrary places just to keep the + code within 80 columns is probably not a net gain in readability.) + </para> + + <para> + To maintain a consistent coding style, do not use C++ style comments + (<literal>//</literal> comments). <application>pgindent</application> + will replace them with <literal>/* ... */</literal>. + </para> + + <para> + The preferred style for multi-line comment blocks is +<programlisting> +/* + * comment text begins here + * and continues here + */ +</programlisting> + Note that comment blocks that begin in column 1 will be preserved as-is + by <application>pgindent</application>, but it will re-flow indented comment blocks + as though they were plain text. 
If you want to preserve the line breaks + in an indented block, add dashes like this: +<programlisting> + /*---------- + * comment text begins here + * and continues here + *---------- + */ +</programlisting> + </para> + + <para> + While submitted patches do not absolutely have to follow these formatting + rules, it's a good idea to do so. Your code will get run through + <application>pgindent</application> before the next release, so there's no point in + making it look nice under some other set of formatting conventions. + A good rule of thumb for patches is <quote>make the new code look like + the existing code around it</quote>. + </para> + + <para> + The <filename>src/tools</filename> directory contains sample settings + files that can be used with the <productname>emacs</productname>, + <productname>xemacs</productname> or <productname>vim</productname> + editors to help ensure that they format code according to these + conventions. + </para> + + <para> + The text browsing tools <application>more</application> and + <application>less</application> can be invoked as: +<programlisting> +more -x4 +less -x4 +</programlisting> + to make them show tabs appropriately. + </para> + </sect1> + + <sect1 id="error-message-reporting"> + <title>Reporting Errors Within the Server</title> + + <indexterm> + <primary>ereport</primary> + </indexterm> + <indexterm> + <primary>elog</primary> + </indexterm> + + <para> + Error, warning, and log messages generated within the server code + should be created using <function>ereport</function>, or its older cousin + <function>elog</function>. The use of this function is complex enough to + require some explanation. + </para> + + <para> + There are two required elements for every message: a severity level + (ranging from <literal>DEBUG</literal> to <literal>PANIC</literal>) and a primary + message text. 
In addition there are optional elements, the most + common of which is an error identifier code that follows the SQL spec's + SQLSTATE conventions. + <function>ereport</function> itself is just a shell macro that exists + mainly for the syntactic convenience of making message generation + look like a single function call in the C source code. The only parameter + accepted directly by <function>ereport</function> is the severity level. + The primary message text and any optional message elements are + generated by calling auxiliary functions, such as <function>errmsg</function>, + within the <function>ereport</function> call. + </para> + + <para> + A typical call to <function>ereport</function> might look like this: +<programlisting> +ereport(ERROR, + errcode(ERRCODE_DIVISION_BY_ZERO), + errmsg("division by zero")); +</programlisting> + This specifies error severity level <literal>ERROR</literal> (a run-of-the-mill + error). The <function>errcode</function> call specifies the SQLSTATE error code + using a macro defined in <filename>src/include/utils/errcodes.h</filename>. The + <function>errmsg</function> call provides the primary message text. + </para> + + <para> + You will also frequently see this older style, with an extra set of + parentheses surrounding the auxiliary function calls: +<programlisting> +ereport(ERROR, + (errcode(ERRCODE_DIVISION_BY_ZERO), + errmsg("division by zero"))); +</programlisting> + The extra parentheses were required + before <productname>PostgreSQL</productname> version 12, but are now + optional. + </para> + + <para> + Here is a more complex example: +<programlisting> +ereport(ERROR, + errcode(ERRCODE_AMBIGUOUS_FUNCTION), + errmsg("function %s is not unique", + func_signature_string(funcname, nargs, + NIL, actual_arg_types)), + errhint("Unable to choose a best candidate function. 
" + "You might need to add explicit typecasts.")); +</programlisting> + This illustrates the use of format codes to embed run-time values into + a message text. Also, an optional <quote>hint</quote> message is provided. + The auxiliary function calls can be written in any order, but + conventionally <function>errcode</function> + and <function>errmsg</function> appear first. + </para> + + <para> + If the severity level is <literal>ERROR</literal> or higher, + <function>ereport</function> aborts execution of the current query + and does not return to the caller. If the severity level is + lower than <literal>ERROR</literal>, <function>ereport</function> returns normally. + </para> + + <para> + The available auxiliary routines for <function>ereport</function> are: + <itemizedlist> + <listitem> + <para> + <function>errcode(sqlerrcode)</function> specifies the SQLSTATE error identifier + code for the condition. If this routine is not called, the error + identifier defaults to + <literal>ERRCODE_INTERNAL_ERROR</literal> when the error severity level is + <literal>ERROR</literal> or higher, <literal>ERRCODE_WARNING</literal> when the + error level is <literal>WARNING</literal>, otherwise (for <literal>NOTICE</literal> + and below) <literal>ERRCODE_SUCCESSFUL_COMPLETION</literal>. + While these defaults are often convenient, always think whether they + are appropriate before omitting the <function>errcode()</function> call. + </para> + </listitem> + <listitem> + <para> + <function>errmsg(const char *msg, ...)</function> specifies the primary error + message text, and possibly run-time values to insert into it. Insertions + are specified by <function>sprintf</function>-style format codes. In addition to + the standard format codes accepted by <function>sprintf</function>, the format + code <literal>%m</literal> can be used to insert the error message returned + by <function>strerror</function> for the current value of <literal>errno</literal>. 
+ <footnote> + <para> + That is, the value that was current when the <function>ereport</function> call + was reached; changes of <literal>errno</literal> within the auxiliary reporting + routines will not affect it. That would not be true if you were to + write <literal>strerror(errno)</literal> explicitly in <function>errmsg</function>'s + parameter list; accordingly, do not do so. + </para> + </footnote> + <literal>%m</literal> does not require any + corresponding entry in the parameter list for <function>errmsg</function>. + Note that the message string will be run through <function>gettext</function> + for possible localization before format codes are processed. + </para> + </listitem> + <listitem> + <para> + <function>errmsg_internal(const char *msg, ...)</function> is the same as + <function>errmsg</function>, except that the message string will not be + translated nor included in the internationalization message dictionary. + This should be used for <quote>cannot happen</quote> cases that are probably + not worth expending translation effort on. + </para> + </listitem> + <listitem> + <para> + <function>errmsg_plural(const char *fmt_singular, const char *fmt_plural, + unsigned long n, ...)</function> is like <function>errmsg</function>, but with + support for various plural forms of the message. + <replaceable>fmt_singular</replaceable> is the English singular format, + <replaceable>fmt_plural</replaceable> is the English plural format, + <replaceable>n</replaceable> is the integer value that determines which plural + form is needed, and the remaining arguments are formatted according + to the selected format string. For more information see + <xref linkend="nls-guidelines"/>. + </para> + </listitem> + <listitem> + <para> + <function>errdetail(const char *msg, ...)</function> supplies an optional + <quote>detail</quote> message; this is to be used when there is additional + information that seems inappropriate to put in the primary message. 
+ The message string is processed in just the same way as for + <function>errmsg</function>. + </para> + </listitem> + <listitem> + <para> + <function>errdetail_internal(const char *msg, ...)</function> is the same + as <function>errdetail</function>, except that the message string will not be + translated nor included in the internationalization message dictionary. + This should be used for detail messages that are not worth expending + translation effort on, for instance because they are too technical to be + useful to most users. + </para> + </listitem> + <listitem> + <para> + <function>errdetail_plural(const char *fmt_singular, const char *fmt_plural, + unsigned long n, ...)</function> is like <function>errdetail</function>, but with + support for various plural forms of the message. + For more information see <xref linkend="nls-guidelines"/>. + </para> + </listitem> + <listitem> + <para> + <function>errdetail_log(const char *msg, ...)</function> is the same as + <function>errdetail</function> except that this string goes only to the server + log, never to the client. If both <function>errdetail</function> (or one of + its equivalents above) and + <function>errdetail_log</function> are used then one string goes to the client + and the other to the log. This is useful for error details that are + too security-sensitive or too bulky to include in the report + sent to the client. + </para> + </listitem> + <listitem> + <para> + <function>errdetail_log_plural(const char *fmt_singular, const char + *fmt_plural, unsigned long n, ...)</function> is like + <function>errdetail_log</function>, but with support for various plural forms of + the message. + For more information see <xref linkend="nls-guidelines"/>. 
+ </para> + </listitem> + <listitem> + <para> + <function>errhint(const char *msg, ...)</function> supplies an optional + <quote>hint</quote> message; this is to be used when offering suggestions + about how to fix the problem, as opposed to factual details about + what went wrong. + The message string is processed in just the same way as for + <function>errmsg</function>. + </para> + </listitem> + <listitem> + <para> + <function>errhint_plural(const char *fmt_singular, const char *fmt_plural, + unsigned long n, ...)</function> is like <function>errhint</function>, but with + support for various plural forms of the message. + For more information see <xref linkend="nls-guidelines"/>. + </para> + </listitem> + <listitem> + <para> + <function>errcontext(const char *msg, ...)</function> is not normally called + directly from an <function>ereport</function> message site; rather it is used + in <literal>error_context_stack</literal> callback functions to provide + information about the context in which an error occurred, such as the + current location in a PL function. + The message string is processed in just the same way as for + <function>errmsg</function>. Unlike the other auxiliary functions, this can + be called more than once per <function>ereport</function> call; the successive + strings thus supplied are concatenated with separating newlines. + </para> + </listitem> + <listitem> + <para> + <function>errposition(int cursorpos)</function> specifies the textual location + of an error within a query string. Currently it is only useful for + errors detected in the lexical and syntactic analysis phases of + query processing. + </para> + </listitem> + <listitem> + <para> + <function>errtable(Relation rel)</function> specifies a relation whose + name and schema name should be included as auxiliary fields in the error + report. 
+ </para> + </listitem> + <listitem> + <para> + <function>errtablecol(Relation rel, int attnum)</function> specifies + a column whose name, table name, and schema name should be included as + auxiliary fields in the error report. + </para> + </listitem> + <listitem> + <para> + <function>errtableconstraint(Relation rel, const char *conname)</function> + specifies a table constraint whose name, table name, and schema name + should be included as auxiliary fields in the error report. Indexes + should be considered to be constraints for this purpose, whether or + not they have an associated <structname>pg_constraint</structname> entry. Be + careful to pass the underlying heap relation, not the index itself, as + <literal>rel</literal>. + </para> + </listitem> + <listitem> + <para> + <function>errdatatype(Oid datatypeOid)</function> specifies a data + type whose name and schema name should be included as auxiliary fields + in the error report. + </para> + </listitem> + <listitem> + <para> + <function>errdomainconstraint(Oid datatypeOid, const char *conname)</function> + specifies a domain constraint whose name, domain name, and schema name + should be included as auxiliary fields in the error report. + </para> + </listitem> + <listitem> + <para> + <function>errcode_for_file_access()</function> is a convenience function that + selects an appropriate SQLSTATE error identifier for a failure in a + file-access-related system call. It uses the saved + <literal>errno</literal> to determine which error code to generate. + Usually this should be used in combination with <literal>%m</literal> in the + primary error message text. + </para> + </listitem> + <listitem> + <para> + <function>errcode_for_socket_access()</function> is a convenience function that + selects an appropriate SQLSTATE error identifier for a failure in a + socket-related system call. 
+ </para> + </listitem> + <listitem> + <para> + <function>errhidestmt(bool hide_stmt)</function> can be called to specify + suppression of the <literal>STATEMENT:</literal> portion of a message in the + postmaster log. Generally this is appropriate if the message text + includes the current statement already. + </para> + </listitem> + <listitem> + <para> + <function>errhidecontext(bool hide_ctx)</function> can be called to + specify suppression of the <literal>CONTEXT:</literal> portion of a message in + the postmaster log. This should only be used for verbose debugging + messages where the repeated inclusion of context would bloat the log + too much. + </para> + </listitem> + </itemizedlist> + </para> + + <note> + <para> + At most one of the functions <function>errtable</function>, + <function>errtablecol</function>, <function>errtableconstraint</function>, + <function>errdatatype</function>, or <function>errdomainconstraint</function> should + be used in an <function>ereport</function> call. These functions exist to + allow applications to extract the name of a database object associated + with the error condition without having to examine the + potentially-localized error message text. + These functions should be used in error reports for which it's likely + that applications would wish to have automatic error handling. As of + <productname>PostgreSQL</productname> 9.3, complete coverage exists only for + errors in SQLSTATE class 23 (integrity constraint violation), but this + is likely to be expanded in future. + </para> + </note> + + <para> + There is an older function <function>elog</function> that is still heavily used. 
+ An <function>elog</function> call: +<programlisting> +elog(level, "format string", ...); +</programlisting> + is exactly equivalent to: +<programlisting> +ereport(level, errmsg_internal("format string", ...)); +</programlisting> + Notice that the SQLSTATE error code is always defaulted, and the message + string is not subject to translation. + Therefore, <function>elog</function> should be used only for internal errors and + low-level debug logging. Any message that is likely to be of interest to + ordinary users should go through <function>ereport</function>. Nonetheless, + there are enough internal <quote>cannot happen</quote> error checks in the + system that <function>elog</function> is still widely used; it is preferred for + those messages for its notational simplicity. + </para> + + <para> + Advice about writing good error messages can be found in + <xref linkend="error-style-guide"/>. + </para> + </sect1> + + <sect1 id="error-style-guide"> + <title>Error Message Style Guide</title> + + <para> + This style guide is offered in the hope of maintaining a consistent, + user-friendly style throughout all the messages generated by + <productname>PostgreSQL</productname>. + </para> + + <simplesect> + <title>What Goes Where</title> + + <para> + The primary message should be short, factual, and avoid reference to + implementation details such as specific function names. + <quote>Short</quote> means <quote>should fit on one line under normal + conditions</quote>. Use a detail message if needed to keep the primary + message short, or if you feel a need to mention implementation details + such as the particular system call that failed. Both primary and detail + messages should be factual. Use a hint message for suggestions about what + to do to fix the problem, especially if the suggestion might not always be + applicable. 
+ </para> + + <para> + For example, instead of: +<programlisting> +IpcMemoryCreate: shmget(key=%d, size=%u, 0%o) failed: %m +(plus a long addendum that is basically a hint) +</programlisting> + write: +<programlisting> +Primary: could not create shared memory segment: %m +Detail: Failed syscall was shmget(key=%d, size=%u, 0%o). +Hint: the addendum +</programlisting> + </para> + + <para> + Rationale: keeping the primary message short helps keep it to the point, + and lets clients lay out screen space on the assumption that one line is + enough for error messages. Detail and hint messages can be relegated to a + verbose mode, or perhaps a pop-up error-details window. Also, details and + hints would normally be suppressed from the server log to save + space. Reference to implementation details is best avoided since users + aren't expected to know the details. + </para> + + </simplesect> + + <simplesect> + <title>Formatting</title> + + <para> + Don't put any specific assumptions about formatting into the message + texts. Expect clients and the server log to wrap lines to fit their own + needs. In long messages, newline characters (\n) can be used to indicate + suggested paragraph breaks. Don't end a message with a newline. Don't + use tabs or other formatting characters. (In error context displays, + newlines are automatically added to separate levels of context such as + function calls.) + </para> + + <para> + Rationale: Messages are not necessarily displayed on terminal-type + displays. In GUI displays or browsers these formatting instructions are + at best ignored. + </para> + + </simplesect> + + <simplesect> + <title>Quotation Marks</title> + + <para> + English text should use double quotes when quoting is appropriate. + Text in other languages should consistently use one kind of quotes that is + consistent with publishing customs and computer output of other programs. 
+ </para> + + <para> + Rationale: The choice of double quotes over single quotes is somewhat + arbitrary, but tends to be the preferred use. Some have suggested + choosing the kind of quotes depending on the type of object according to + SQL conventions (namely, strings single quoted, identifiers double + quoted). But this is a language-internal technical issue that many users + aren't even familiar with, it won't scale to other kinds of quoted terms, + it doesn't translate to other languages, and it's pretty pointless, too. + </para> + + </simplesect> + + <simplesect> + <title>Use of Quotes</title> + + <para> + Always use quotes to delimit file names, user-supplied identifiers, and + other variables that might contain words. Do not use them to mark up + variables that will not contain words (for example, operator names). + </para> + + <para> + There are functions in the backend that will double-quote their own output + as needed (for example, <function>format_type_be()</function>). Do not put + additional quotes around the output of such functions. + </para> + + <para> + Rationale: Objects can have names that create ambiguity when embedded in a + message. Be consistent about denoting where a plugged-in name starts and + ends. But don't clutter messages with unnecessary or duplicate quote + marks. + </para> + + </simplesect> + + <simplesect> + <title>Grammar and Punctuation</title> + + <para> + The rules are different for primary error messages and for detail/hint + messages: + </para> + + <para> + Primary error messages: Do not capitalize the first letter. Do not end a + message with a period. Do not even think about ending a message with an + exclamation point. + </para> + + <para> + Detail and hint messages: Use complete sentences, and end each with + a period. Capitalize the first word of sentences. Put two spaces after + the period if another sentence follows (for English text; might be + inappropriate in other languages). 
+ </para> + + <para> + Error context strings: Do not capitalize the first letter and do + not end the string with a period. Context strings should normally + not be complete sentences. + </para> + + <para> + Rationale: Avoiding punctuation makes it easier for client applications to + embed the message into a variety of grammatical contexts. Often, primary + messages are not grammatically complete sentences anyway. (And if they're + long enough to be more than one sentence, they should be split into + primary and detail parts.) However, detail and hint messages are longer + and might need to include multiple sentences. For consistency, they should + follow complete-sentence style even when there's only one sentence. + </para> + + </simplesect> + + <simplesect> + <title>Upper Case vs. Lower Case</title> + + <para> + Use lower case for message wording, including the first letter of a + primary error message. Use upper case for SQL commands and key words if + they appear in the message. + </para> + + <para> + Rationale: It's easier to make everything look more consistent this + way, since some messages are complete sentences and some not. + </para> + + </simplesect> + + <simplesect> + <title>Avoid Passive Voice</title> + + <para> + Use the active voice. Use complete sentences when there is an acting + subject (<quote>A could not do B</quote>). Use telegram style without + subject if the subject would be the program itself; do not use + <quote>I</quote> for the program. + </para> + + <para> + Rationale: The program is not human. Don't pretend otherwise. + </para> + + </simplesect> + + <simplesect> + <title>Present vs. Past Tense</title> + + <para> + Use past tense if an attempt to do something failed, but could perhaps + succeed next time (perhaps after fixing some problem). Use present tense + if the failure is certainly permanent. 
+ </para> + + <para> + There is a nontrivial semantic difference between sentences of the form: +<programlisting> +could not open file "%s": %m +</programlisting> +and: +<programlisting> +cannot open file "%s" +</programlisting> + The first one means that the attempt to open the file failed. The + message should give a reason, such as <quote>disk full</quote> or + <quote>file doesn't exist</quote>. The past tense is appropriate because + next time the disk might not be full anymore or the file in question might + exist. + </para> + + <para> + The second form indicates that the functionality of opening the named file + does not exist at all in the program, or that it's conceptually + impossible. The present tense is appropriate because the condition will + persist indefinitely. + </para> + + <para> + Rationale: Granted, the average user will not be able to draw great + conclusions merely from the tense of the message, but since the language + provides us with a grammar we should use it correctly. + </para> + + </simplesect> + + <simplesect> + <title>Type of the Object</title> + + <para> + When citing the name of an object, state what kind of object it is. + </para> + + <para> + Rationale: Otherwise no one will know what <quote>foo.bar.baz</quote> + refers to. + </para> + + </simplesect> + + <simplesect> + <title>Brackets</title> + + <para> + Square brackets are only to be used (1) in command synopses to denote + optional arguments, or (2) to denote an array subscript. + </para> + + <para> + Rationale: Anything else does not correspond to widely-known customary + usage and will confuse people. 
+ </para> + + </simplesect> + + <simplesect> + <title>Assembling Error Messages</title> + + <para> + When a message includes text that is generated elsewhere, embed it in + this style: +<programlisting> +could not open file %s: %m +</programlisting> + </para> + + <para> + Rationale: It would be difficult to account for all possible error codes + to paste this into a single smooth sentence, so some sort of punctuation + is needed. Putting the embedded text in parentheses has also been + suggested, but it's unnatural if the embedded text is likely to be the + most important part of the message, as is often the case. + </para> + + </simplesect> + + <simplesect> + <title>Reasons for Errors</title> + + <para> + Messages should always state the reason why an error occurred. + For example: +<programlisting> +BAD: could not open file %s +BETTER: could not open file %s (I/O failure) +</programlisting> + If no reason is known you better fix the code. + </para> + + </simplesect> + + <simplesect> + <title>Function Names</title> + + <para> + Don't include the name of the reporting routine in the error text. We have + other mechanisms for finding that out when needed, and for most users it's + not helpful information. If the error text doesn't make as much sense + without the function name, reword it. +<programlisting> +BAD: pg_strtoint32: error in "z": cannot parse "z" +BETTER: invalid input syntax for type integer: "z" +</programlisting> + </para> + + <para> + Avoid mentioning called function names, either; instead say what the code + was trying to do: +<programlisting> +BAD: open() failed: %m +BETTER: could not open file %s: %m +</programlisting> + If it really seems necessary, mention the system call in the detail + message. (In some cases, providing the actual values passed to the + system call might be appropriate information for the detail message.) + </para> + + <para> + Rationale: Users don't know what all those functions do. 
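As a rough illustration outside the server, the same wording rule can be applied in plain C. This is a minimal sketch under stated assumptions, not PostgreSQL code: format_open_error is a hypothetical helper, and strerror stands in for the %m format code that ereport provides. The file name is quoted, as recommended under "Use of Quotes".

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of PostgreSQL): build a message that says
 * what the code was trying to do and why it failed, instead of naming the
 * function that failed.  "err" is the errno value saved right after the
 * failing call.  Returns the snprintf result.
 */
static int
format_open_error(char *buf, size_t buflen, const char *path, int err)
{
    return snprintf(buf, buflen, "could not open file \"%s\": %s",
                    path, strerror(err));
}
```

A caller would save errno immediately after the failed open and pass that saved value, so that later library calls cannot overwrite the reason before the message is built.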
+ </para> + + </simplesect> + + <simplesect> + <title>Tricky Words to Avoid</title> + + <formalpara> + <title>Unable</title> + <para> + <quote>Unable</quote> is nearly the passive voice. Better use + <quote>cannot</quote> or <quote>could not</quote>, as appropriate. + </para> + </formalpara> + + <formalpara> + <title>Bad</title> + <para> + Error messages like <quote>bad result</quote> are really hard to interpret + intelligently. It's better to write why the result is <quote>bad</quote>, + e.g., <quote>invalid format</quote>. + </para> + </formalpara> + + <formalpara> + <title>Illegal</title> + <para> + <quote>Illegal</quote> stands for a violation of the law, the rest is + <quote>invalid</quote>. Better yet, say why it's invalid. + </para> + </formalpara> + + <formalpara> + <title>Unknown</title> + <para> + Try to avoid <quote>unknown</quote>. Consider <quote>error: unknown + response</quote>. If you don't know what the response is, how do you know + it's erroneous? <quote>Unrecognized</quote> is often a better choice. + Also, be sure to include the value being complained of. +<programlisting> +BAD: unknown node type +BETTER: unrecognized node type: 42 +</programlisting> + </para> + </formalpara> + + <formalpara> + <title>Find vs. Exists</title> + <para> + If the program uses a nontrivial algorithm to locate a resource (e.g., a + path search) and that algorithm fails, it is fair to say that the program + couldn't <quote>find</quote> the resource. If, on the other hand, the + expected location of the resource is known but the program cannot access + it there then say that the resource doesn't <quote>exist</quote>. Using + <quote>find</quote> in this case sounds weak and confuses the issue. + </para> + </formalpara> + + <formalpara> + <title>May vs. Can vs. Might</title> + <para> + <quote>May</quote> suggests permission (e.g., "You may borrow my rake."), + and has little use in documentation or error messages. 
+ <quote>Can</quote> suggests ability (e.g., "I can lift that log."), + and <quote>might</quote> suggests possibility (e.g., "It might rain + today."). Using the proper word clarifies meaning and assists + translation. + </para> + </formalpara> + + <formalpara> + <title>Contractions</title> + <para> + Avoid contractions, like <quote>can't</quote>; use + <quote>cannot</quote> instead. + </para> + </formalpara> + + <formalpara> + <title>Non-negative</title> + <para> + Avoid <quote>non-negative</quote> as it is ambiguous + about whether it accepts zero. It's better to use + <quote>greater than zero</quote> or + <quote>greater than or equal to zero</quote>. + </para> + </formalpara> + + </simplesect> + + <simplesect> + <title>Proper Spelling</title> + + <para> + Spell out words in full. For instance, avoid: + <itemizedlist> + <listitem> + <para> + spec + </para> + </listitem> + <listitem> + <para> + stats + </para> + </listitem> + <listitem> + <para> + parens + </para> + </listitem> + <listitem> + <para> + auth + </para> + </listitem> + <listitem> + <para> + xact + </para> + </listitem> + </itemizedlist> + </para> + + <para> + Rationale: This will improve consistency. + </para> + + </simplesect> + + <simplesect> + <title>Localization</title> + + <para> + Keep in mind that error message texts need to be translated into other + languages. Follow the guidelines in <xref linkend="nls-guidelines"/> + to avoid making life difficult for translators. + </para> + </simplesect> + + </sect1> + + <sect1 id="source-conventions"> + <title>Miscellaneous Coding Conventions</title> + + <simplesect> + <title>C Standard</title> + <para> + Code in <productname>PostgreSQL</productname> should only rely on language + features available in the C99 standard. That means a conforming + C99 compiler has to be able to compile postgres, at least aside + from a few platform dependent pieces. 
+ </para> + <para> + A few features included in the C99 standard are, at this time, not + permitted to be used in core <productname>PostgreSQL</productname> + code. This currently includes variable-length arrays, intermingled + declarations and code, <literal>//</literal> comments, and universal + character names. Reasons for that include portability and historical + practices. + </para> + <para> + Features from later revisions of the C standard or compiler-specific + features can be used if a fallback is provided. + </para> + <para> + For example, <literal>_Static_assert()</literal> and + <literal>__builtin_constant_p</literal> are currently used, even though + they are from a newer revision of the C standard and a + <productname>GCC</productname> extension, respectively. If they are not + available, we fall back, respectively, to a C99-compatible replacement that + performs the same checks but emits rather cryptic messages, and to not + using <literal>__builtin_constant_p</literal> at all. + </para> + </simplesect> + + <simplesect> + <title>Function-Like Macros and Inline Functions</title> + <para> + Both macros with arguments and <literal>static inline</literal> + functions may be used. The latter are preferable if there are + multiple-evaluation hazards when written as a macro, as is the + case with, e.g., +<programlisting> +#define Max(x, y) ((x) > (y) ? (x) : (y)) +</programlisting> + or when the macro would be very long. In other cases it is only + possible, or at least easier, to use macros, for example because + expressions of various types need to be passed to the macro. + </para> + <para> + When the definition of an inline function references symbols + (i.e., variables, functions) that are only available as part of the + backend, the function must not be visible when included from frontend + code.
+<programlisting> +#ifndef FRONTEND +static inline MemoryContext +MemoryContextSwitchTo(MemoryContext context) +{ + MemoryContext old = CurrentMemoryContext; + + CurrentMemoryContext = context; + return old; +} +#endif /* FRONTEND */ +</programlisting> + In this example, <literal>CurrentMemoryContext</literal>, which is only + available in the backend, is referenced, and the function is thus + hidden with <literal>#ifndef FRONTEND</literal>. This rule + exists because some compilers emit references to symbols + contained in inline functions even if the function is not used. + </para> + </simplesect> + + <simplesect> + <title>Writing Signal Handlers</title> + <para> + Code has to be written very carefully to be suitable to run inside a + signal handler. The fundamental problem is that, unless + blocked, a signal handler can interrupt code at any time. If code + inside the signal handler uses the same state as code outside, + chaos might ensue. As an example, consider what happens if a signal + handler tries to acquire a lock that's already held in the + interrupted code. + </para> + <para> + Barring special arrangements, code in signal handlers may only + call async-signal-safe functions (as defined in POSIX) and access + variables of type <literal>volatile sig_atomic_t</literal>. A few + functions in <command>postgres</command> are also deemed signal safe, most importantly + <function>SetLatch()</function>. + </para> + <para> + In most cases, signal handlers should do nothing more than note + that a signal has arrived and wake up code running outside of + the handler using a latch. An example of such a handler is the + following: +<programlisting> +static void +handle_sighup(SIGNAL_ARGS) +{ + int save_errno = errno; + + got_SIGHUP = true; + SetLatch(MyLatch); + + errno = save_errno; +} +</programlisting> + <varname>errno</varname> is saved and restored because + <function>SetLatch()</function> might change it.
If that were not done, + interrupted code that's currently inspecting <varname>errno</varname> might see the wrong + value. + </para> + </simplesect> + + <simplesect> + <title>Calling Function Pointers</title> + + <para> + For clarity, it is preferred to explicitly dereference a function pointer + when calling the pointed-to function if the pointer is a simple variable, + for example: +<programlisting> +(*emit_log_hook) (edata); +</programlisting> + (even though <literal>emit_log_hook(edata)</literal> would also work). + When the function pointer is part of a structure, the extra + punctuation can and usually should be omitted, for example: +<programlisting> +paramInfo->paramFetch(paramInfo, paramId); +</programlisting> + </para> + </simplesect> + </sect1> + </chapter>