summaryrefslogtreecommitdiffstats
path: root/doc/src/sgml/sources.sgml
blob: e0a6f775b6e89269f658b00ef1cc25cea998ded0 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
1001
1002
1003
1004
1005
1006
1007
1008
1009
1010
1011
1012
1013
1014
1015
1016
1017
1018
1019
1020
1021
1022
1023
1024
1025
1026
1027
1028
1029
1030
1031
1032
1033
1034
1035
<!-- doc/src/sgml/sources.sgml -->

 <chapter id="source">
  <title>PostgreSQL Coding Conventions</title>

  <sect1 id="source-format">
   <title>Formatting</title>

   <para>
    Source code formatting uses 4 column tab spacing, with
    tabs preserved (i.e., tabs are not expanded to spaces).
    Each logical indentation level is one additional tab stop.
   </para>

   <para>
    Layout rules (brace positioning, etc.) follow BSD conventions.  In
    particular, curly braces for the controlled blocks of <literal>if</literal>,
    <literal>while</literal>, <literal>switch</literal>, etc. go on their own lines.
   </para>

   <para>
    Limit line lengths so that the code is readable in an 80-column window.
    (This doesn't mean that you must never go past 80 columns.  For instance,
    breaking a long error message string in arbitrary places just to keep the
    code within 80 columns is probably not a net gain in readability.)
   </para>

   <para>
    To maintain a consistent coding style, do not use C++ style comments
    (<literal>//</literal> comments).  <application>pgindent</application>
    will replace them with <literal>/* ... */</literal>.
   </para>

   <para>
    The preferred style for multi-line comment blocks is
<programlisting>
/*
 * comment text begins here
 * and continues here
 */
</programlisting>
    Note that comment blocks that begin in column 1 will be preserved as-is
    by <application>pgindent</application>, but it will re-flow indented comment blocks
    as though they were plain text.  If you want to preserve the line breaks
    in an indented block, add dashes like this:
<programlisting>
    /*----------
     * comment text begins here
     * and continues here
     *----------
     */
</programlisting>
   </para>

   <para>
    While submitted patches do not absolutely have to follow these formatting
    rules, it's a good idea to do so.  Your code will get run through
    <application>pgindent</application> before the next release, so there's no point in
    making it look nice under some other set of formatting conventions.
    A good rule of thumb for patches is <quote>make the new code look like
    the existing code around it</quote>.
   </para>

   <para>
    The <filename>src/tools/editors</filename> directory contains sample settings
    files that can be used with the <productname>Emacs</productname>,
    <productname>xemacs</productname> or <productname>vim</productname>
    editors to help ensure that they format code according to these
    conventions.
   </para>

   <para>
    If you'd like to run <application>pgindent</application> locally
    to help make your code match project style, see
    the <filename>src/tools/pgindent</filename> directory.
   </para>

   <para>
    The text browsing tools <application>more</application> and
    <application>less</application> can be invoked as:
<programlisting>
more -x4
less -x4
</programlisting>
    to make them show tabs appropriately.
   </para>
  </sect1>

  <sect1 id="error-message-reporting">
   <title>Reporting Errors Within the Server</title>

   <indexterm>
    <primary>ereport</primary>
   </indexterm>
   <indexterm>
    <primary>elog</primary>
   </indexterm>

   <para>
    Error, warning, and log messages generated within the server code
    should be created using <function>ereport</function>, or its older cousin
    <function>elog</function>.  The use of this function is complex enough to
    require some explanation.
   </para>

   <para>
    There are two required elements for every message: a severity level
    (ranging from <literal>DEBUG</literal> to <literal>PANIC</literal>) and a primary
    message text.  In addition there are optional elements, the most
    common of which is an error identifier code that follows the SQL spec's
    SQLSTATE conventions.
    <function>ereport</function> itself is just a shell macro that exists
    mainly for the syntactic convenience of making message generation
    look like a single function call in the C source code.  The only parameter
    accepted directly by <function>ereport</function> is the severity level.
    The primary message text and any optional message elements are
    generated by calling auxiliary functions, such as <function>errmsg</function>,
    within the <function>ereport</function> call.
   </para>

   <para>
    A typical call to <function>ereport</function> might look like this:
<programlisting>
ereport(ERROR,
        errcode(ERRCODE_DIVISION_BY_ZERO),
        errmsg("division by zero"));
</programlisting>
    This specifies error severity level <literal>ERROR</literal> (a run-of-the-mill
    error).  The <function>errcode</function> call specifies the SQLSTATE error code
    using a macro defined in <filename>src/include/utils/errcodes.h</filename>.  The
    <function>errmsg</function> call provides the primary message text.
   </para>

   <para>
    You will also frequently see this older style, with an extra set of
    parentheses surrounding the auxiliary function calls:
<programlisting>
ereport(ERROR,
        (errcode(ERRCODE_DIVISION_BY_ZERO),
         errmsg("division by zero")));
</programlisting>
    The extra parentheses were required
    before <productname>PostgreSQL</productname> version 12, but are now
    optional.
   </para>

   <para>
    Here is a more complex example:
<programlisting>
ereport(ERROR,
        errcode(ERRCODE_AMBIGUOUS_FUNCTION),
        errmsg("function %s is not unique",
               func_signature_string(funcname, nargs,
                                     NIL, actual_arg_types)),
        errhint("Unable to choose a best candidate function. "
                "You might need to add explicit typecasts."));
</programlisting>
    This illustrates the use of format codes to embed run-time values into
    a message text.  Also, an optional <quote>hint</quote> message is provided.
    The auxiliary function calls can be written in any order, but
    conventionally <function>errcode</function>
    and <function>errmsg</function> appear first.
   </para>

   <para>
    If the severity level is <literal>ERROR</literal> or higher,
    <function>ereport</function> aborts execution of the current query
    and does not return to the caller. If the severity level is
    lower than <literal>ERROR</literal>, <function>ereport</function> returns normally.
   </para>

   <para>
    The available auxiliary routines for <function>ereport</function> are:
  <itemizedlist>
   <listitem>
    <para>
     <function>errcode(sqlerrcode)</function> specifies the SQLSTATE error identifier
     code for the condition.  If this routine is not called, the error
     identifier defaults to
     <literal>ERRCODE_INTERNAL_ERROR</literal> when the error severity level is
     <literal>ERROR</literal> or higher, <literal>ERRCODE_WARNING</literal> when the
     error level is <literal>WARNING</literal>, otherwise (for <literal>NOTICE</literal>
     and below) <literal>ERRCODE_SUCCESSFUL_COMPLETION</literal>.
     While these defaults are often convenient, always think whether they
     are appropriate before omitting the <function>errcode()</function> call.
    </para>
   </listitem>
   <listitem>
    <para>
     <function>errmsg(const char *msg, ...)</function> specifies the primary error
     message text, and possibly run-time values to insert into it.  Insertions
     are specified by <function>sprintf</function>-style format codes.  In addition to
     the standard format codes accepted by <function>sprintf</function>, the format
     code <literal>%m</literal> can be used to insert the error message returned
     by <function>strerror</function> for the current value of <literal>errno</literal>.
     <footnote>
      <para>
       That is, the value that was current when the <function>ereport</function> call
       was reached; changes of <literal>errno</literal> within the auxiliary reporting
       routines will not affect it.  That would not be true if you were to
       write <literal>strerror(errno)</literal> explicitly in <function>errmsg</function>'s
       parameter list; accordingly, do not do so.
      </para>
     </footnote>
     <literal>%m</literal> does not require any
     corresponding entry in the parameter list for <function>errmsg</function>.
     Note that the message string will be run through <function>gettext</function>
     for possible localization before format codes are processed.
    </para>
   </listitem>
   <listitem>
    <para>
     <function>errmsg_internal(const char *msg, ...)</function> is the same as
     <function>errmsg</function>, except that the message string will not be
     translated nor included in the internationalization message dictionary.
     This should be used for <quote>cannot happen</quote> cases that are probably
     not worth expending translation effort on.
    </para>
   </listitem>
   <listitem>
    <para>
     <function>errmsg_plural(const char *fmt_singular, const char *fmt_plural,
     unsigned long n, ...)</function> is like <function>errmsg</function>, but with
     support for various plural forms of the message.
     <replaceable>fmt_singular</replaceable> is the English singular format,
     <replaceable>fmt_plural</replaceable> is the English plural format,
     <replaceable>n</replaceable> is the integer value that determines which plural
     form is needed, and the remaining arguments are formatted according
     to the selected format string.  For more information see
     <xref linkend="nls-guidelines"/>.
    </para>
   </listitem>
   <listitem>
    <para>
     <function>errdetail(const char *msg, ...)</function> supplies an optional
     <quote>detail</quote> message; this is to be used when there is additional
     information that seems inappropriate to put in the primary message.
     The message string is processed in just the same way as for
     <function>errmsg</function>.
    </para>
   </listitem>
   <listitem>
    <para>
     <function>errdetail_internal(const char *msg, ...)</function> is the same
     as <function>errdetail</function>, except that the message string will not be
     translated nor included in the internationalization message dictionary.
     This should be used for detail messages that are not worth expending
     translation effort on, for instance because they are too technical to be
     useful to most users.
    </para>
   </listitem>
   <listitem>
    <para>
     <function>errdetail_plural(const char *fmt_singular, const char *fmt_plural,
     unsigned long n, ...)</function> is like <function>errdetail</function>, but with
     support for various plural forms of the message.
     For more information see <xref linkend="nls-guidelines"/>.
    </para>
   </listitem>
   <listitem>
    <para>
     <function>errdetail_log(const char *msg, ...)</function> is the same as
     <function>errdetail</function> except that this string goes only to the server
     log, never to the client.  If both <function>errdetail</function> (or one of
     its equivalents above) and
     <function>errdetail_log</function> are used then one string goes to the client
     and the other to the log.  This is useful for error details that are
     too security-sensitive or too bulky to include in the report
     sent to the client.
    </para>
   </listitem>
   <listitem>
    <para>
     <function>errdetail_log_plural(const char *fmt_singular, const char
     *fmt_plural, unsigned long n, ...)</function> is like
     <function>errdetail_log</function>, but with support for various plural forms of
     the message.
     For more information see <xref linkend="nls-guidelines"/>.
    </para>
   </listitem>
   <listitem>
    <para>
     <function>errhint(const char *msg, ...)</function> supplies an optional
     <quote>hint</quote> message; this is to be used when offering suggestions
     about how to fix the problem, as opposed to factual details about
     what went wrong.
     The message string is processed in just the same way as for
     <function>errmsg</function>.
    </para>
   </listitem>
   <listitem>
    <para>
     <function>errhint_plural(const char *fmt_singular, const char *fmt_plural,
     unsigned long n, ...)</function> is like <function>errhint</function>, but with
     support for various plural forms of the message.
     For more information see <xref linkend="nls-guidelines"/>.
    </para>
   </listitem>
   <listitem>
    <para>
     <function>errcontext(const char *msg, ...)</function> is not normally called
     directly from an <function>ereport</function> message site; rather it is used
     in <literal>error_context_stack</literal> callback functions to provide
     information about the context in which an error occurred, such as the
     current location in a PL function.
     The message string is processed in just the same way as for
     <function>errmsg</function>.  Unlike the other auxiliary functions, this can
     be called more than once per <function>ereport</function> call; the successive
     strings thus supplied are concatenated with separating newlines.
    </para>
   </listitem>
   <listitem>
    <para>
     <function>errposition(int cursorpos)</function> specifies the textual location
     of an error within a query string.  Currently it is only useful for
     errors detected in the lexical and syntactic analysis phases of
     query processing.
    </para>
   </listitem>
   <listitem>
    <para>
     <function>errtable(Relation rel)</function> specifies a relation whose
     name and schema name should be included as auxiliary fields in the error
     report.
    </para>
   </listitem>
   <listitem>
    <para>
     <function>errtablecol(Relation rel, int attnum)</function> specifies
     a column whose name, table name, and schema name should be included as
     auxiliary fields in the error report.
    </para>
   </listitem>
   <listitem>
    <para>
     <function>errtableconstraint(Relation rel, const char *conname)</function>
     specifies a table constraint whose name, table name, and schema name
     should be included as auxiliary fields in the error report.  Indexes
     should be considered to be constraints for this purpose, whether or
     not they have an associated <structname>pg_constraint</structname> entry.  Be
     careful to pass the underlying heap relation, not the index itself, as
     <literal>rel</literal>.
    </para>
   </listitem>
   <listitem>
    <para>
     <function>errdatatype(Oid datatypeOid)</function> specifies a data
     type whose name and schema name should be included as auxiliary fields
     in the error report.
    </para>
   </listitem>
   <listitem>
    <para>
     <function>errdomainconstraint(Oid datatypeOid, const char *conname)</function>
     specifies a domain constraint whose name, domain name, and schema name
     should be included as auxiliary fields in the error report.
    </para>
   </listitem>
   <listitem>
    <para>
     <function>errcode_for_file_access()</function> is a convenience function that
     selects an appropriate SQLSTATE error identifier for a failure in a
     file-access-related system call.  It uses the saved
     <literal>errno</literal> to determine which error code to generate.
     Usually this should be used in combination with <literal>%m</literal> in the
     primary error message text.
    </para>
   </listitem>
   <listitem>
    <para>
     <function>errcode_for_socket_access()</function> is a convenience function that
     selects an appropriate SQLSTATE error identifier for a failure in a
     socket-related system call.
    </para>
   </listitem>
   <listitem>
    <para>
     <function>errhidestmt(bool hide_stmt)</function> can be called to specify
     suppression of the <literal>STATEMENT:</literal> portion of a message in the
     postmaster log.  Generally this is appropriate if the message text
     includes the current statement already.
    </para>
   </listitem>
   <listitem>
    <para>
     <function>errhidecontext(bool hide_ctx)</function> can be called to
     specify suppression of the <literal>CONTEXT:</literal> portion of a message in
     the postmaster log.  This should only be used for verbose debugging
     messages where the repeated inclusion of context would bloat the log
     too much.
    </para>
   </listitem>
  </itemizedlist>
   </para>

   <note>
    <para>
     At most one of the functions <function>errtable</function>,
     <function>errtablecol</function>, <function>errtableconstraint</function>,
     <function>errdatatype</function>, or <function>errdomainconstraint</function> should
     be used in an <function>ereport</function> call.  These functions exist to
     allow applications to extract the name of a database object associated
     with the error condition without having to examine the
     potentially-localized error message text.
     These functions should be used in error reports for which it's likely
     that applications would wish to have automatic error handling.  As of
     <productname>PostgreSQL</productname> 9.3, complete coverage exists only for
     errors in SQLSTATE class 23 (integrity constraint violation), but this
     is likely to be expanded in future.
    </para>
   </note>

   <para>
    There is an older function <function>elog</function> that is still heavily used.
    An <function>elog</function> call:
<programlisting>
elog(level, "format string", ...);
</programlisting>
    is exactly equivalent to:
<programlisting>
ereport(level, errmsg_internal("format string", ...));
</programlisting>
    Notice that the SQLSTATE error code is always defaulted, and the message
    string is not subject to translation.
    Therefore, <function>elog</function> should be used only for internal errors and
    low-level debug logging.  Any message that is likely to be of interest to
    ordinary users should go through <function>ereport</function>.  Nonetheless,
    there are enough internal <quote>cannot happen</quote> error checks in the
    system that <function>elog</function> is still widely used; it is preferred for
    those messages for its notational simplicity.
   </para>

   <para>
    Advice about writing good error messages can be found in
    <xref linkend="error-style-guide"/>.
   </para>
  </sect1>

  <sect1 id="error-style-guide">
   <title>Error Message Style Guide</title>

   <para>
    This style guide is offered in the hope of maintaining a consistent,
    user-friendly style throughout all the messages generated by
    <productname>PostgreSQL</productname>.
   </para>

  <simplesect>
   <title>What Goes Where</title>

   <para>
    The primary message should be short, factual, and avoid reference to
    implementation details such as specific function names.
    <quote>Short</quote> means <quote>should fit on one line under normal
    conditions</quote>.  Use a detail message if needed to keep the primary
    message short, or if you feel a need to mention implementation details
    such as the particular system call that failed. Both primary and detail
    messages should be factual.  Use a hint message for suggestions about what
    to do to fix the problem, especially if the suggestion might not always be
    applicable.
   </para>

   <para>
    For example, instead of:
<programlisting>
IpcMemoryCreate: shmget(key=%d, size=%u, 0%o) failed: %m
(plus a long addendum that is basically a hint)
</programlisting>
    write:
<programlisting>
Primary:    could not create shared memory segment: %m
Detail:     Failed syscall was shmget(key=%d, size=%u, 0%o).
Hint:       the addendum
</programlisting>
   </para>

   <para>
    Rationale: keeping the primary message short helps keep it to the point,
    and lets clients lay out screen space on the assumption that one line is
    enough for error messages.  Detail and hint messages can be relegated to a
    verbose mode, or perhaps a pop-up error-details window.  Also, details and
    hints would normally be suppressed from the server log to save
    space. Reference to implementation details is best avoided since users
    aren't expected to know the details.
   </para>

  </simplesect>

  <simplesect>
   <title>Formatting</title>

   <para>
    Don't put any specific assumptions about formatting into the message
    texts.  Expect clients and the server log to wrap lines to fit their own
    needs.  In long messages, newline characters (\n) can be used to indicate
    suggested paragraph breaks.  Don't end a message with a newline.  Don't
    use tabs or other formatting characters.  (In error context displays,
    newlines are automatically added to separate levels of context such as
    function calls.)
   </para>

   <para>
    Rationale: Messages are not necessarily displayed on terminal-type
    displays.  In GUI displays or browsers these formatting instructions are
    at best ignored.
   </para>

  </simplesect>

  <simplesect>
   <title>Quotation Marks</title>

   <para>
    English text should use double quotes when quoting is appropriate.
    Text in other languages should consistently use one kind of quotes that is
    consistent with publishing customs and computer output of other programs.
   </para>

   <para>
    Rationale: The choice of double quotes over single quotes is somewhat
    arbitrary, but tends to be the preferred use.  Some have suggested
    choosing the kind of quotes depending on the type of object according to
    SQL conventions (namely, strings single quoted, identifiers double
    quoted).  But this is a language-internal technical issue that many users
    aren't even familiar with, it won't scale to other kinds of quoted terms,
    it doesn't translate to other languages, and it's pretty pointless, too.
   </para>

  </simplesect>

  <simplesect>
   <title>Use of Quotes</title>

   <para>
    Always use quotes to delimit file names, user-supplied identifiers, and
    other variables that might contain words.  Do not use them to mark up
    variables that will not contain words (for example, operator names).
   </para>

   <para>
    There are functions in the backend that will double-quote their own output
    as needed (for example, <function>format_type_be()</function>).  Do not put
    additional quotes around the output of such functions.
   </para>

   <para>
    Rationale: Objects can have names that create ambiguity when embedded in a
    message.  Be consistent about denoting where a plugged-in name starts and
    ends.  But don't clutter messages with unnecessary or duplicate quote
    marks.
   </para>

  </simplesect>

  <simplesect>
   <title>Grammar and Punctuation</title>

   <para>
    The rules are different for primary error messages and for detail/hint
    messages:
   </para>

   <para>
    Primary error messages: Do not capitalize the first letter.  Do not end a
    message with a period.  Do not even think about ending a message with an
    exclamation point.
   </para>

   <para>
    Detail and hint messages: Use complete sentences, and end each with
    a period.  Capitalize the first word of sentences.  Put two spaces after
    the period if another sentence follows (for English text; might be
    inappropriate in other languages).
   </para>

   <para>
    Error context strings: Do not capitalize the first letter and do
    not end the string with a period.  Context strings should normally
    not be complete sentences.
   </para>

   <para>
    Rationale: Avoiding punctuation makes it easier for client applications to
    embed the message into a variety of grammatical contexts.  Often, primary
    messages are not grammatically complete sentences anyway.  (And if they're
    long enough to be more than one sentence, they should be split into
    primary and detail parts.)  However, detail and hint messages are longer
    and might need to include multiple sentences.  For consistency, they should
    follow complete-sentence style even when there's only one sentence.
   </para>

  </simplesect>

  <simplesect>
   <title>Upper Case vs. Lower Case</title>

   <para>
    Use lower case for message wording, including the first letter of a
    primary error message.  Use upper case for SQL commands and key words if
    they appear in the message.
   </para>

   <para>
    Rationale: It's easier to make everything look more consistent this
    way, since some messages are complete sentences and some not.
   </para>

  </simplesect>

  <simplesect>
   <title>Avoid Passive Voice</title>

   <para>
    Use the active voice.  Use complete sentences when there is an acting
    subject (<quote>A could not do B</quote>).  Use telegram style without
    subject if the subject would be the program itself; do not use
    <quote>I</quote> for the program.
   </para>

   <para>
    Rationale: The program is not human.  Don't pretend otherwise.
   </para>

  </simplesect>

  <simplesect>
   <title>Present vs. Past Tense</title>

   <para>
    Use past tense if an attempt to do something failed, but could perhaps
    succeed next time (perhaps after fixing some problem).  Use present tense
    if the failure is certainly permanent.
   </para>

   <para>
    There is a nontrivial semantic difference between sentences of the form:
<programlisting>
could not open file "%s": %m
</programlisting>
and:
<programlisting>
cannot open file "%s"
</programlisting>
    The first one means that the attempt to open the file failed.  The
    message should give a reason, such as <quote>disk full</quote> or
    <quote>file doesn't exist</quote>.  The past tense is appropriate because
    next time the disk might not be full anymore or the file in question might
    exist.
   </para>

   <para>
    The second form indicates that the functionality of opening the named file
    does not exist at all in the program, or that it's conceptually
    impossible.  The present tense is appropriate because the condition will
    persist indefinitely.
   </para>

   <para>
    Rationale: Granted, the average user will not be able to draw great
    conclusions merely from the tense of the message, but since the language
    provides us with a grammar we should use it correctly.
   </para>

  </simplesect>

  <simplesect>
   <title>Type of the Object</title>

   <para>
    When citing the name of an object, state what kind of object it is.
   </para>

   <para>
    Rationale: Otherwise no one will know what <quote>foo.bar.baz</quote>
    refers to.
   </para>

  </simplesect>

  <simplesect>
   <title>Brackets</title>

   <para>
    Square brackets are only to be used (1) in command synopses to denote
    optional arguments, or (2) to denote an array subscript.
   </para>

   <para>
    Rationale: Anything else does not correspond to widely-known customary
    usage and will confuse people.
   </para>

  </simplesect>

  <simplesect>
   <title>Assembling Error Messages</title>

   <para>
   When a message includes text that is generated elsewhere, embed it in
   this style:
<programlisting>
could not open file %s: %m
</programlisting>
   </para>

   <para>
    Rationale: It would be difficult to account for all possible error codes
    to paste this into a single smooth sentence, so some sort of punctuation
    is needed.  Putting the embedded text in parentheses has also been
    suggested, but it's unnatural if the embedded text is likely to be the
    most important part of the message, as is often the case.
   </para>

  </simplesect>

  <simplesect>
   <title>Reasons for Errors</title>

   <para>
    Messages should always state the reason why an error occurred.
    For example:
<programlisting>
BAD:    could not open file %s
BETTER: could not open file %s (I/O failure)
</programlisting>
    If no reason is known you better fix the code.
   </para>

  </simplesect>

  <simplesect>
   <title>Function Names</title>

   <para>
    Don't include the name of the reporting routine in the error text. We have
    other mechanisms for finding that out when needed, and for most users it's
    not helpful information.  If the error text doesn't make as much sense
    without the function name, reword it.
<programlisting>
BAD:    pg_strtoint32: error in "z": cannot parse "z"
BETTER: invalid input syntax for type integer: "z"
</programlisting>
   </para>

   <para>
    Avoid mentioning called function names, either; instead say what the code
    was trying to do:
<programlisting>
BAD:    open() failed: %m
BETTER: could not open file %s: %m
</programlisting>
    If it really seems necessary, mention the system call in the detail
    message.  (In some cases, providing the actual values passed to the
    system call might be appropriate information for the detail message.)
   </para>

   <para>
    Rationale: Users don't know what all those functions do.
   </para>

  </simplesect>

  <simplesect>
   <title>Tricky Words to Avoid</title>

  <formalpara>
    <title>Unable</title>
   <para>
    <quote>Unable</quote> is nearly the passive voice.  Better use
    <quote>cannot</quote> or <quote>could not</quote>, as appropriate.
   </para>
  </formalpara>

  <formalpara>
    <title>Bad</title>
   <para>
    Error messages like <quote>bad result</quote> are really hard to interpret
    intelligently.  It's better to write why the result is <quote>bad</quote>,
    e.g., <quote>invalid format</quote>.
   </para>
  </formalpara>

  <formalpara>
    <title>Illegal</title>
   <para>
    <quote>Illegal</quote> stands for a violation of the law, the rest is
    <quote>invalid</quote>. Better yet, say why it's invalid.
   </para>
  </formalpara>

  <formalpara>
    <title>Unknown</title>
   <para>
    Try to avoid <quote>unknown</quote>.  Consider <quote>error: unknown
    response</quote>.  If you don't know what the response is, how do you know
    it's erroneous? <quote>Unrecognized</quote> is often a better choice.
    Also, be sure to include the value being complained of.
<programlisting>
BAD:    unknown node type
BETTER: unrecognized node type: 42
</programlisting>
   </para>
  </formalpara>

  <formalpara>
    <title>Find vs. Exists</title>
   <para>
    If the program uses a nontrivial algorithm to locate a resource (e.g., a
    path search) and that algorithm fails, it is fair to say that the program
    couldn't <quote>find</quote> the resource.  If, on the other hand, the
    expected location of the resource is known but the program cannot access
    it there then say that the resource doesn't <quote>exist</quote>.  Using
    <quote>find</quote> in this case sounds weak and confuses the issue.
   </para>
  </formalpara>

  <formalpara>
    <title>May vs. Can vs. Might</title>
   <para>
    <quote>May</quote> suggests permission (e.g., "You may borrow my rake."),
    and has little use in documentation or error messages.
    <quote>Can</quote> suggests ability (e.g., "I can lift that log."),
    and <quote>might</quote> suggests possibility (e.g., "It might rain
    today.").  Using the proper word clarifies meaning and assists
    translation.
   </para>
  </formalpara>

  <formalpara>
    <title>Contractions</title>
   <para>
    Avoid contractions, like <quote>can't</quote>;  use
    <quote>cannot</quote> instead.
   </para>
  </formalpara>

  <formalpara>
    <title>Non-negative</title>
   <para>
    Avoid <quote>non-negative</quote> as it is ambiguous
    about whether it accepts zero.  It's better to use
    <quote>greater than zero</quote> or
    <quote>greater than or equal to zero</quote>.
   </para>
  </formalpara>

  </simplesect>

  <simplesect>
   <title>Proper Spelling</title>

   <para>
    Spell out words in full.  For instance, avoid:
  <itemizedlist>
   <listitem>
    <para>
     spec
    </para>
   </listitem>
   <listitem>
    <para>
     stats
    </para>
   </listitem>
   <listitem>
    <para>
     parens
    </para>
   </listitem>
   <listitem>
    <para>
     auth
    </para>
   </listitem>
   <listitem>
    <para>
     xact
    </para>
   </listitem>
  </itemizedlist>
   </para>

   <para>
    Rationale: This will improve consistency.
   </para>

  </simplesect>

  <simplesect>
   <title>Localization</title>

   <para>
    Keep in mind that error message texts need to be translated into other
    languages.  Follow the guidelines in <xref linkend="nls-guidelines"/>
    to avoid making life difficult for translators.
   </para>
  </simplesect>

  </sect1>

  <sect1 id="source-conventions">
   <title>Miscellaneous Coding Conventions</title>

   <simplesect>
    <title>C Standard</title>
    <para>
     Code in <productname>PostgreSQL</productname> should only rely on language
     features available in the C99 standard. That means a conforming
     C99 compiler has to be able to compile postgres, at least aside
     from a few platform dependent pieces.
    </para>
    <para>
     A few features included in the C99 standard are, at this time, not
     permitted to be used in core <productname>PostgreSQL</productname>
     code. This currently includes variable length arrays, intermingled
     declarations and code, <literal>//</literal> comments, universal
     character names. Reasons for that include portability and historical
     practices.
    </para>
    <para>
     Features from later revisions of the C standard or compiler specific
     features can be used, if a fallback is provided.
    </para>
    <para>
     For example <literal>_Static_assert()</literal> and
     <literal>__builtin_constant_p</literal> are currently used, even though
     they are from newer revisions of the C standard and a
     <productname>GCC</productname> extension respectively. If not available
     we respectively fall back to using a C99 compatible replacement that
     performs the same checks, but emits rather cryptic messages and do not
     use <literal>__builtin_constant_p</literal>.
    </para>
   </simplesect>

   <simplesect>
    <title>Function-Like Macros and Inline Functions</title>
    <para>
     Both macros with arguments and <literal>static inline</literal>
     functions may be used. The latter are preferable if there are
     multiple-evaluation hazards when written as a macro, as e.g., the
     case with
<programlisting>
#define Max(x, y)       ((x) > (y) ? (x) : (y))
</programlisting>
     or when the macro would be very long. In other cases it's only
     possible to use macros, or at least easier.  For example because
     expressions of various types need to be passed to the macro.
    </para>
    <para>
     When the definition of an inline function references symbols
     (i.e., variables, functions) that are only available as part of the
     backend, the function may not be visible when included from frontend
     code.
<programlisting>
#ifndef FRONTEND
static inline MemoryContext
MemoryContextSwitchTo(MemoryContext context)
{
    MemoryContext old = CurrentMemoryContext;

    CurrentMemoryContext = context;
    return old;
}
#endif   /* FRONTEND */
</programlisting>
     In this example <literal>CurrentMemoryContext</literal>, which is only
     available in the backend, is referenced and the function thus
     hidden with a <literal>#ifndef FRONTEND</literal>. This rule
     exists because some compilers emit references to symbols
     contained in inline functions even if the function is not used.
    </para>
   </simplesect>

   <simplesect>
    <title>Writing Signal Handlers</title>
    <para>
     To be suitable to run inside a signal handler code has to be
     written very carefully. The fundamental problem is that, unless
     blocked, a signal handler can interrupt code at any time. If code
     inside the signal handler uses the same state as code outside
     chaos may ensue. As an example consider what happens if a signal
     handler tries to acquire a lock that's already held in the
     interrupted code.
    </para>
    <para>
     Barring special arrangements code in signal handlers may only
     call async-signal safe functions (as defined in POSIX) and access
     variables of type <literal>volatile sig_atomic_t</literal>. A few
     functions in <command>postgres</command> are also deemed signal safe, importantly
     <function>SetLatch()</function>.
    </para>
    <para>
     In most cases signal handlers should do nothing more than note
     that a signal has arrived, and wake up code running outside of
     the handler using a latch. An example of such a handler is the
     following:
<programlisting>
static void
handle_sighup(SIGNAL_ARGS)
{
    int         save_errno = errno;

    got_SIGHUP = true;
    SetLatch(MyLatch);

    errno = save_errno;
}
</programlisting>
     <varname>errno</varname> is saved and restored because
     <function>SetLatch()</function> might change it. If that were not done
     interrupted code that's currently inspecting <varname>errno</varname> might see the wrong
     value.
    </para>
   </simplesect>

   <simplesect>
    <title>Calling Function Pointers</title>

    <para>
     For clarity, it is preferred to explicitly dereference a function pointer
     when calling the pointed-to function if the pointer is a simple variable,
     for example:
<programlisting>
(*emit_log_hook) (edata);
</programlisting>
     (even though <literal>emit_log_hook(edata)</literal> would also work).
     When the function pointer is part of a structure, then the extra
     punctuation can and usually should be omitted, for example:
<programlisting>
paramInfo->paramFetch(paramInfo, paramId);
</programlisting>
    </para>
   </simplesect>
  </sect1>
 </chapter>