summaryrefslogtreecommitdiffstats
path: root/doc/src/sgml/html/collation.html
blob: 5336258484ba012743f90f62577a51b79c328488 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>24.2. Collation Support</title><link rel="stylesheet" type="text/css" href="stylesheet.css" /><link rev="made" href="pgsql-docs@lists.postgresql.org" /><meta name="generator" content="DocBook XSL Stylesheets Vsnapshot" /><link rel="prev" href="locale.html" title="24.1. Locale Support" /><link rel="next" href="multibyte.html" title="24.3. Character Set Support" /></head><body id="docContent" class="container-fluid col-10"><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="5" align="center">24.2. Collation Support</th></tr><tr><td width="10%" align="left"><a accesskey="p" href="locale.html" title="24.1. Locale Support">Prev</a> </td><td width="10%" align="left"><a accesskey="u" href="charset.html" title="Chapter 24. Localization">Up</a></td><th width="60%" align="center">Chapter 24. Localization</th><td width="10%" align="right"><a accesskey="h" href="index.html" title="PostgreSQL 16.2 Documentation">Home</a></td><td width="10%" align="right"> <a accesskey="n" href="multibyte.html" title="24.3. Character Set Support">Next</a></td></tr></table><hr /></div><div class="sect1" id="COLLATION"><div class="titlepage"><div><div><h2 class="title" style="clear: both">24.2. Collation Support <a href="#COLLATION" class="id_link">#</a></h2></div></div></div><div class="toc"><dl class="toc"><dt><span class="sect2"><a href="collation.html#COLLATION-CONCEPTS">24.2.1. Concepts</a></span></dt><dt><span class="sect2"><a href="collation.html#COLLATION-MANAGING">24.2.2. Managing Collations</a></span></dt><dt><span class="sect2"><a href="collation.html#ICU-CUSTOM-COLLATIONS">24.2.3. ICU Custom Collations</a></span></dt></dl></div><a id="id-1.6.11.4.2" class="indexterm"></a><p>
   The collation feature allows specifying the sort order and character
   classification behavior of data per-column, or even per-operation.
   This alleviates the restriction that the
   <code class="symbol">LC_COLLATE</code> and <code class="symbol">LC_CTYPE</code> settings
   of a database cannot be changed after its creation.
  </p><div class="sect2" id="COLLATION-CONCEPTS"><div class="titlepage"><div><div><h3 class="title">24.2.1. Concepts <a href="#COLLATION-CONCEPTS" class="id_link">#</a></h3></div></div></div><p>
    Conceptually, every expression of a collatable data type has a
    collation.  (The built-in collatable data types are
    <code class="type">text</code>, <code class="type">varchar</code>, and <code class="type">char</code>.
    User-defined base types can also be marked collatable, and of course
    a <a class="glossterm" href="glossary.html#GLOSSARY-DOMAIN"><em class="glossterm"><a class="glossterm" href="glossary.html#GLOSSARY-DOMAIN" title="Domain">domain</a></em></a> over a
    collatable data type is collatable.)  If the
    expression is a column reference, the collation of the expression is the
    defined collation of the column.  If the expression is a constant, the
    collation is the default collation of the data type of the
    constant.  The collation of a more complex expression is derived
    from the collations of its inputs, as described below.
   </p><p>
    The collation of an expression can be the <span class="quote"><span class="quote">default</span></span>
    collation, which means the locale settings defined for the
    database.  It is also possible for an expression's collation to be
    indeterminate.  In such cases, ordering operations and other
    operations that need to know the collation will fail.
   </p><p>
    When the database system has to perform an ordering or a character
    classification, it uses the collation of the input expression.  This
    happens, for example, with <code class="literal">ORDER BY</code> clauses
    and function or operator calls such as <code class="literal">&lt;</code>.
    The collation to apply for an <code class="literal">ORDER BY</code> clause
    is simply the collation of the sort key.  The collation to apply for a
    function or operator call is derived from the arguments, as described
    below.  In addition to comparison operators, collations are taken into
    account by functions that convert between lower and upper case
    letters, such as <code class="function">lower</code>, <code class="function">upper</code>, and
    <code class="function">initcap</code>; by pattern matching operators; and by
    <code class="function">to_char</code> and related functions.
   </p><p>
    For a function or operator call, the collation that is derived by
    examining the argument collations is used at run time for performing
    the specified operation.  If the result of the function or operator
    call is of a collatable data type, the collation is also used at parse
    time as the defined collation of the function or operator expression,
    in case there is a surrounding expression that requires knowledge of
    its collation.
   </p><p>
    The <em class="firstterm">collation derivation</em> of an expression can be
    implicit or explicit.  This distinction affects how collations are
    combined when multiple different collations appear in an
    expression.  An explicit collation derivation occurs when a
    <code class="literal">COLLATE</code> clause is used; all other collation
    derivations are implicit.  When multiple collations need to be
    combined, for example in a function call, the following rules are
    used:

    </p><div class="orderedlist"><ol class="orderedlist" type="1"><li class="listitem"><p>
       If any input expression has an explicit collation derivation, then
       all explicitly derived collations among the input expressions must be
       the same, otherwise an error is raised.  If any explicitly
       derived collation is present, that is the result of the
       collation combination.
      </p></li><li class="listitem"><p>
       Otherwise, all input expressions must have the same implicit
       collation derivation or the default collation.  If any non-default
       collation is present, that is the result of the collation combination.
       Otherwise, the result is the default collation.
      </p></li><li class="listitem"><p>
       If there are conflicting non-default implicit collations among the
       input expressions, then the combination is deemed to have indeterminate
       collation.  This is not an error condition unless the particular
       function being invoked requires knowledge of the collation it should
       apply.  If it does, an error will be raised at run-time.
      </p></li></ol></div><p>

    For example, consider this table definition:
</p><pre class="programlisting">
CREATE TABLE test1 (
    a text COLLATE "de_DE",
    b text COLLATE "es_ES",
    ...
);
</pre><p>

    Then in
</p><pre class="programlisting">
SELECT a &lt; 'foo' FROM test1;
</pre><p>
    the <code class="literal">&lt;</code> comparison is performed according to
    <code class="literal">de_DE</code> rules, because the expression combines an
    implicitly derived collation with the default collation.  But in
</p><pre class="programlisting">
SELECT a &lt; ('foo' COLLATE "fr_FR") FROM test1;
</pre><p>
    the comparison is performed using <code class="literal">fr_FR</code> rules,
    because the explicit collation derivation overrides the implicit one.
    Furthermore, given
</p><pre class="programlisting">
SELECT a &lt; b FROM test1;
</pre><p>
    the parser cannot determine which collation to apply, since the
    <code class="structfield">a</code> and <code class="structfield">b</code> columns have conflicting
    implicit collations.  Since the <code class="literal">&lt;</code> operator
    does need to know which collation to use, this will result in an
    error.  The error can be resolved by attaching an explicit collation
    specifier to either input expression, thus:
</p><pre class="programlisting">
SELECT a &lt; b COLLATE "de_DE" FROM test1;
</pre><p>
    or equivalently
</p><pre class="programlisting">
SELECT a COLLATE "de_DE" &lt; b FROM test1;
</pre><p>
    On the other hand, the structurally similar case
</p><pre class="programlisting">
SELECT a || b FROM test1;
</pre><p>
    does not result in an error, because the <code class="literal">||</code> operator
    does not care about collations: its result is the same regardless
    of the collation.
   </p><p>
    The collation assigned to a function or operator's combined input
    expressions is also considered to apply to the function or operator's
    result, if the function or operator delivers a result of a collatable
    data type.  So, in
</p><pre class="programlisting">
SELECT * FROM test1 ORDER BY a || 'foo';
</pre><p>
    the ordering will be done according to <code class="literal">de_DE</code> rules.
    But this query:
</p><pre class="programlisting">
SELECT * FROM test1 ORDER BY a || b;
</pre><p>
    results in an error, because even though the <code class="literal">||</code> operator
    doesn't need to know a collation, the <code class="literal">ORDER BY</code> clause does.
    As before, the conflict can be resolved with an explicit collation
    specifier:
</p><pre class="programlisting">
SELECT * FROM test1 ORDER BY a || b COLLATE "fr_FR";
</pre><p>
   </p></div><div class="sect2" id="COLLATION-MANAGING"><div class="titlepage"><div><div><h3 class="title">24.2.2. Managing Collations <a href="#COLLATION-MANAGING" class="id_link">#</a></h3></div></div></div><p>
    A collation is an SQL schema object that maps an SQL name to locales
    provided by libraries installed in the operating system.  A collation
    definition has a <em class="firstterm">provider</em> that specifies which
    library supplies the locale data.  One standard provider name
    is <code class="literal">libc</code>, which uses the locales provided by the
    operating system C library.  These are the locales used by most tools
    provided by the operating system.  Another provider
    is <code class="literal">icu</code>, which uses the external
    ICU<a id="id-1.6.11.4.5.2.4" class="indexterm"></a> library.  ICU locales can only be
    used if support for ICU was configured when PostgreSQL was built.
   </p><p>
    A collation object provided by <code class="literal">libc</code> maps to a
    combination of <code class="symbol">LC_COLLATE</code> and <code class="symbol">LC_CTYPE</code>
    settings, as accepted by the <code class="literal">setlocale()</code> system library call.  (As
    the name would suggest, the main purpose of a collation is to set
    <code class="symbol">LC_COLLATE</code>, which controls the sort order.  But
    it is rarely necessary in practice to have an
    <code class="symbol">LC_CTYPE</code> setting that is different from
    <code class="symbol">LC_COLLATE</code>, so it is more convenient to collect
    these under one concept than to create another infrastructure for
    setting <code class="symbol">LC_CTYPE</code> per expression.)  Also,
    a <code class="literal">libc</code> collation
    is tied to a character set encoding (see <a class="xref" href="multibyte.html" title="24.3. Character Set Support">Section 24.3</a>).
    The same collation name may exist for different encodings.
   </p><p>
    A collation object provided by <code class="literal">icu</code> maps to a named
    collator provided by the ICU library.  ICU does not support
    separate <span class="quote"><span class="quote">collate</span></span> and <span class="quote"><span class="quote">ctype</span></span> settings, so
    they are always the same.  Also, ICU collations are independent of the
    encoding, so there is always only one ICU collation of a given name in
    a database.
   </p><div class="sect3" id="COLLATION-MANAGING-STANDARD"><div class="titlepage"><div><div><h4 class="title">24.2.2.1. Standard Collations <a href="#COLLATION-MANAGING-STANDARD" class="id_link">#</a></h4></div></div></div><p>
    On all platforms, the collations named <code class="literal">default</code>,
    <code class="literal">C</code>, and <code class="literal">POSIX</code> are available.  Additional
    collations may be available depending on operating system support.
    The <code class="literal">default</code> collation selects the <code class="symbol">LC_COLLATE</code>
    and <code class="symbol">LC_CTYPE</code> values specified at database creation time.
    The <code class="literal">C</code> and <code class="literal">POSIX</code> collations both specify
    <span class="quote"><span class="quote">traditional C</span></span> behavior, in which only the ASCII letters
    <span class="quote"><span class="quote"><code class="literal">A</code></span></span> through <span class="quote"><span class="quote"><code class="literal">Z</code></span></span>
    are treated as letters, and sorting is done strictly by character
    code byte values.
   </p><div class="note"><h3 class="title">Note</h3><p>
     The <code class="literal">C</code> and <code class="literal">POSIX</code> locales may behave
     differently depending on the database encoding.
    </p></div><p>
    Additionally, two SQL standard collation names are available:

    </p><div class="variablelist"><dl class="variablelist"><dt><span class="term"><code class="literal">unicode</code></span></dt><dd><p>
        This collation sorts using the Unicode Collation Algorithm with the
        Default Unicode Collation Element Table.  It is available in all
        encodings.  ICU support is required to use this collation.  (This
        collation has the same behavior as the ICU root locale; see <a class="xref" href="collation.html#COLLATION-MANAGING-PREDEFINED-ICU-UND-X-ICU"><code class="literal">und-x-icu</code> (for <span class="quote"><span class="quote">undefined</span></span>)</a>.)
       </p></dd><dt><span class="term"><code class="literal">ucs_basic</code></span></dt><dd><p>
        This collation sorts by Unicode code point.  It is only available for
        encoding <code class="literal">UTF8</code>.  (This collation has the same
        behavior as the libc locale specification <code class="literal">C</code> in
        <code class="literal">UTF8</code> encoding.)
       </p></dd></dl></div><p>
   </p></div><div class="sect3" id="COLLATION-MANAGING-PREDEFINED"><div class="titlepage"><div><div><h4 class="title">24.2.2.2. Predefined Collations <a href="#COLLATION-MANAGING-PREDEFINED" class="id_link">#</a></h4></div></div></div><p>
    If the operating system provides support for using multiple locales
    within a single program (<code class="function">newlocale</code> and related functions),
    or if support for ICU is configured,
    then when a database cluster is initialized, <code class="command">initdb</code>
    populates the system catalog <code class="literal">pg_collation</code> with
    collations based on all the locales it finds in the operating
    system at the time.
   </p><p>
    To inspect the currently available locales, use the query <code class="literal">SELECT
    * FROM pg_collation</code>, or the command <code class="command">\dOS+</code>
    in <span class="application">psql</span>.
   </p><div class="sect4" id="COLLATION-MANAGING-PREDEFINED-LIBC"><div class="titlepage"><div><div><h5 class="title">24.2.2.2.1. libc Collations <a href="#COLLATION-MANAGING-PREDEFINED-LIBC" class="id_link">#</a></h5></div></div></div><p>
    For example, the operating system might
    provide a locale named <code class="literal">de_DE.utf8</code>.
    <code class="command">initdb</code> would then create a collation named
    <code class="literal">de_DE.utf8</code> for encoding <code class="literal">UTF8</code>
    that has both <code class="symbol">LC_COLLATE</code> and
    <code class="symbol">LC_CTYPE</code> set to <code class="literal">de_DE.utf8</code>.
    It will also create a collation with the <code class="literal">.utf8</code>
    tag stripped off the name.  So you could also use the collation
    under the name <code class="literal">de_DE</code>, which is less cumbersome
    to write and makes the name less encoding-dependent.  Note that,
    nevertheless, the initial set of collation names is
    platform-dependent.
   </p><p>
    The default set of collations provided by <code class="literal">libc</code> map
    directly to the locales installed in the operating system, which can be
    listed using the command <code class="literal">locale -a</code>.  In case
    a <code class="literal">libc</code> collation is needed that has different values
    for <code class="symbol">LC_COLLATE</code> and <code class="symbol">LC_CTYPE</code>, or if new
    locales are installed in the operating system after the database system
    was initialized, then a new collation may be created using
    the <a class="xref" href="sql-createcollation.html" title="CREATE COLLATION"><span class="refentrytitle">CREATE COLLATION</span></a> command.
    New operating system locales can also be imported en masse using
    the <a class="link" href="functions-admin.html#FUNCTIONS-ADMIN-COLLATION" title="Table 9.98. Collation Management Functions"><code class="function">pg_import_system_collations()</code></a> function.
   </p><p>
    Within any particular database, only collations that use that
    database's encoding are of interest.  Other entries in
    <code class="literal">pg_collation</code> are ignored.  Thus, a stripped collation
    name such as <code class="literal">de_DE</code> can be considered unique
    within a given database even though it would not be unique globally.
    Use of the stripped collation names is recommended, since it will
    make one fewer thing you need to change if you decide to change to
    another database encoding.  Note however that the <code class="literal">default</code>,
    <code class="literal">C</code>, and <code class="literal">POSIX</code> collations can be used regardless of
    the database encoding.
   </p><p>
    <span class="productname">PostgreSQL</span> considers distinct collation
    objects to be incompatible even when they have identical properties.
    Thus for example,
</p><pre class="programlisting">
SELECT a COLLATE "C" &lt; b COLLATE "POSIX" FROM test1;
</pre><p>
    will draw an error even though the <code class="literal">C</code> and <code class="literal">POSIX</code>
    collations have identical behaviors.  Mixing stripped and non-stripped
    collation names is therefore not recommended.
   </p></div><div class="sect4" id="COLLATION-MANAGING-PREDEFINED-ICU"><div class="titlepage"><div><div><h5 class="title">24.2.2.2.2. ICU Collations <a href="#COLLATION-MANAGING-PREDEFINED-ICU" class="id_link">#</a></h5></div></div></div><p>
    With ICU, it is not sensible to enumerate all possible locale names.  ICU
    uses a particular naming system for locales, but there are many more ways
    to name a locale than there are actually distinct locales.
    <code class="command">initdb</code> uses the ICU APIs to extract a set of distinct
    locales to populate the initial set of collations.  Collations provided by
    ICU are created in the SQL environment with names in BCP 47 language tag
    format, with a <span class="quote"><span class="quote">private use</span></span>
    extension <code class="literal">-x-icu</code> appended, to distinguish them from
    libc locales.
   </p><p>
    Here are some example collations that might be created:

    </p><div class="variablelist"><dl class="variablelist"><dt id="COLLATION-MANAGING-PREDEFINED-ICU-DE-X-ICU"><span class="term"><code class="literal">de-x-icu</code></span> <a href="#COLLATION-MANAGING-PREDEFINED-ICU-DE-X-ICU" class="id_link">#</a></dt><dd><p>German collation, default variant</p></dd><dt id="COLLATION-MANAGING-PREDEFINED-ICU-DE-AT-X-ICU"><span class="term"><code class="literal">de-AT-x-icu</code></span> <a href="#COLLATION-MANAGING-PREDEFINED-ICU-DE-AT-X-ICU" class="id_link">#</a></dt><dd><p>German collation for Austria, default variant</p><p>
        (There are also, say, <code class="literal">de-DE-x-icu</code>
        or <code class="literal">de-CH-x-icu</code>, but as of this writing, they are
        equivalent to <code class="literal">de-x-icu</code>.)
       </p></dd><dt id="COLLATION-MANAGING-PREDEFINED-ICU-UND-X-ICU"><span class="term"><code class="literal">und-x-icu</code> (for <span class="quote"><span class="quote">undefined</span></span>)</span> <a href="#COLLATION-MANAGING-PREDEFINED-ICU-UND-X-ICU" class="id_link">#</a></dt><dd><p>
        ICU <span class="quote"><span class="quote">root</span></span> collation.  Use this to get a reasonable
        language-agnostic sort order.
       </p></dd></dl></div><p>
   </p><p>
    Some (less frequently used) encodings are not supported by ICU.  When the
    database encoding is one of these, ICU collation entries
    in <code class="literal">pg_collation</code> are ignored.  Attempting to use one
    will draw an error along the lines of <span class="quote"><span class="quote">collation "de-x-icu" for
    encoding "WIN874" does not exist</span></span>.
   </p></div></div><div class="sect3" id="COLLATION-CREATE"><div class="titlepage"><div><div><h4 class="title">24.2.2.3. Creating New Collation Objects <a href="#COLLATION-CREATE" class="id_link">#</a></h4></div></div></div><p>
    If the standard and predefined collations are not sufficient, users can
    create their own collation objects using the SQL
    command <a class="xref" href="sql-createcollation.html" title="CREATE COLLATION"><span class="refentrytitle">CREATE COLLATION</span></a>.
   </p><p>
    The standard and predefined collations are in the
    schema <code class="literal">pg_catalog</code>, like all predefined objects.
    User-defined collations should be created in user schemas.  This also
    ensures that they are saved by <code class="command">pg_dump</code>.
   </p><div class="sect4" id="COLLATION-MANAGING-CREATE-LIBC"><div class="titlepage"><div><div><h5 class="title">24.2.2.3.1. libc Collations <a href="#COLLATION-MANAGING-CREATE-LIBC" class="id_link">#</a></h5></div></div></div><p>
     New libc collations can be created like this:
</p><pre class="programlisting">
CREATE COLLATION german (provider = libc, locale = 'de_DE');
</pre><p>
     The exact values that are acceptable for the <code class="literal">locale</code>
     clause in this command depend on the operating system.  On Unix-like
     systems, the command <code class="literal">locale -a</code> will show a list.
    </p><p>
     Since the predefined libc collations already include all collations
     defined in the operating system when the database instance is
     initialized, it is not often necessary to manually create new ones.
     Reasons might be if a different naming system is desired (in which case
     see also <a class="xref" href="collation.html#COLLATION-COPY" title="24.2.2.3.3. Copying Collations">Section 24.2.2.3.3</a>) or if the operating system has
     been upgraded to provide new locale definitions (in which case see
     also <a class="link" href="functions-admin.html#FUNCTIONS-ADMIN-COLLATION" title="Table 9.98. Collation Management Functions"><code class="function">pg_import_system_collations()</code></a>).
    </p></div><div class="sect4" id="COLLATION-MANAGING-CREATE-ICU"><div class="titlepage"><div><div><h5 class="title">24.2.2.3.2. ICU Collations <a href="#COLLATION-MANAGING-CREATE-ICU" class="id_link">#</a></h5></div></div></div><p>
     ICU collations can be created like:

</p><pre class="programlisting">
CREATE COLLATION german (provider = icu, locale = 'de-DE');
</pre><p>

     ICU locales are specified as a BCP 47 <a class="link" href="locale.html#ICU-LANGUAGE-TAG" title="24.1.5.3. Language Tag">Language Tag</a>, but can also accept most
     libc-style locale names. If possible, libc-style locale names are
     transformed into language tags.
    </p><p>
     New ICU collations can customize collation behavior extensively by
     including collation attributes in the language tag. See <a class="xref" href="collation.html#ICU-CUSTOM-COLLATIONS" title="24.2.3. ICU Custom Collations">Section 24.2.3</a> for details and examples.
    </p></div><div class="sect4" id="COLLATION-COPY"><div class="titlepage"><div><div><h5 class="title">24.2.2.3.3. Copying Collations <a href="#COLLATION-COPY" class="id_link">#</a></h5></div></div></div><p>
    The command <a class="xref" href="sql-createcollation.html" title="CREATE COLLATION"><span class="refentrytitle">CREATE COLLATION</span></a> can also be used to
    create a new collation from an existing collation, which can be useful to
    be able to use operating-system-independent collation names in
    applications, create compatibility names, or use an ICU-provided collation
    under a more readable name.  For example:
</p><pre class="programlisting">
CREATE COLLATION german FROM "de_DE";
CREATE COLLATION french FROM "fr-x-icu";
</pre><p>
   </p></div></div><div class="sect3" id="COLLATION-NONDETERMINISTIC"><div class="titlepage"><div><div><h4 class="title">24.2.2.4. Nondeterministic Collations <a href="#COLLATION-NONDETERMINISTIC" class="id_link">#</a></h4></div></div></div><p>
     A collation is either <em class="firstterm">deterministic</em> or
     <em class="firstterm">nondeterministic</em>.  A deterministic collation uses
     deterministic comparisons, which means that it considers strings to be
     equal only if they consist of the same byte sequence.  Nondeterministic
     comparison may determine strings to be equal even if they consist of
     different bytes.  Typical situations include case-insensitive comparison,
     accent-insensitive comparison, as well as comparison of strings in
     different Unicode normal forms.  It is up to the collation provider to
     actually implement such insensitive comparisons; the deterministic flag
     only determines whether ties are to be broken using bytewise comparison.
     See also <a class="ulink" href="https://www.unicode.org/reports/tr10" target="_top">Unicode Technical
     Standard 10</a> for more information on the terminology.
    </p><p>
     To create a nondeterministic collation, specify the property
     <code class="literal">deterministic = false</code> to <code class="command">CREATE
     COLLATION</code>, for example:
</p><pre class="programlisting">
CREATE COLLATION ndcoll (provider = icu, locale = 'und', deterministic = false);
</pre><p>
     This example would use the standard Unicode collation in a
     nondeterministic way.  In particular, this would allow strings in
     different normal forms to be compared correctly.  More interesting
     examples make use of the ICU customization facilities explained above.
     For example:
</p><pre class="programlisting">
CREATE COLLATION case_insensitive (provider = icu, locale = 'und-u-ks-level2', deterministic = false);
CREATE COLLATION ignore_accents (provider = icu, locale = 'und-u-ks-level1-kc-true', deterministic = false);
</pre><p>
    </p><p>
     All standard and predefined collations are deterministic, all
     user-defined collations are deterministic by default.  While
     nondeterministic collations give a more <span class="quote"><span class="quote">correct</span></span> behavior,
     especially when considering the full power of Unicode and its many
     special cases, they also have some drawbacks.  Foremost, their use leads
     to a performance penalty.  Note, in particular, that B-tree cannot use
     deduplication with indexes that use a nondeterministic collation.  Also,
     certain operations are not possible with nondeterministic collations,
     such as pattern matching operations.  Therefore, they should be used
     only in cases where they are specifically wanted.
    </p><div class="tip"><h3 class="title">Tip</h3><p>
      To deal with text in different Unicode normalization forms, it is also
      an option to use the functions/expressions
      <code class="function">normalize</code> and <code class="literal">is normalized</code> to
      preprocess or check the strings, instead of using nondeterministic
      collations.  There are different trade-offs for each approach.
     </p></div></div></div><div class="sect2" id="ICU-CUSTOM-COLLATIONS"><div class="titlepage"><div><div><h3 class="title">24.2.3. ICU Custom Collations <a href="#ICU-CUSTOM-COLLATIONS" class="id_link">#</a></h3></div></div></div><p>
    ICU allows extensive control over collation behavior by defining new
    collations with collation settings as a part of the language tag. These
    settings can modify the collation order to suit a variety of needs. For
    instance:

</p><pre class="programlisting">
-- ignore differences in accents and case
CREATE COLLATION ignore_accent_case (provider = icu, deterministic = false, locale = 'und-u-ks-level1');
SELECT 'Å' = 'A' COLLATE ignore_accent_case; -- true
SELECT 'z' = 'Z' COLLATE ignore_accent_case; -- true

-- upper case letters sort before lower case.
CREATE COLLATION upper_first (provider = icu, locale = 'und-u-kf-upper');
SELECT 'B' &lt; 'b' COLLATE upper_first; -- true

-- treat digits numerically and ignore punctuation
CREATE COLLATION num_ignore_punct (provider = icu, deterministic = false, locale = 'und-u-ka-shifted-kn');
SELECT 'id-45' &lt; 'id-123' COLLATE num_ignore_punct; -- true
SELECT 'w;x*y-z' = 'wxyz' COLLATE num_ignore_punct; -- true
</pre><p>

    Many of the available options are described in <a class="xref" href="collation.html#ICU-COLLATION-SETTINGS" title="24.2.3.2. Collation Settings for an ICU Locale">Section 24.2.3.2</a>, or see <a class="xref" href="collation.html#ICU-EXTERNAL-REFERENCES" title="24.2.3.5. External References for ICU">Section 24.2.3.5</a> for more details.
   </p><div class="sect3" id="ICU-COLLATION-COMPARISON-LEVELS"><div class="titlepage"><div><div><h4 class="title">24.2.3.1. ICU Comparison Levels <a href="#ICU-COLLATION-COMPARISON-LEVELS" class="id_link">#</a></h4></div></div></div><p>
     Comparison of two strings (collation) in ICU is determined by a
     multi-level process, where textual features are grouped into
     "levels". Treatment of each level is controlled by the <a class="link" href="collation.html#ICU-COLLATION-SETTINGS-TABLE" title="Table 24.2. ICU Collation Settings">collation settings</a>. Higher
     levels correspond to finer textual features.
    </p><p>
     <a class="xref" href="collation.html#ICU-COLLATION-LEVELS" title="Table 24.1. ICU Collation Levels">Table 24.1</a> shows which textual feature
     differences are considered significant when determining equality at the
     given level. The Unicode character <code class="literal">U+2063</code> is an
     invisible separator, and as seen in the table, is ignored for at all
     levels of comparison less than <code class="literal">identic</code>.
    </p><div class="table" id="ICU-COLLATION-LEVELS"><p class="title"><strong>Table 24.1. ICU Collation Levels</strong></p><div class="table-contents"><table class="table" summary="ICU Collation Levels" border="1"><colgroup><col class="col1" /><col class="col2" /><col class="col3" /><col class="col4" /><col class="col5" /><col class="col6" /><col class="col7" /><col class="col8" /></colgroup><thead><tr><th>Level</th><th>Description</th><th><code class="literal">'f' = 'f'</code></th><th><code class="literal">'ab' = U&amp;'a\2063b'</code></th><th><code class="literal">'x-y' = 'x_y'</code></th><th><code class="literal">'g' = 'G'</code></th><th><code class="literal">'n' = 'ñ'</code></th><th><code class="literal">'y' = 'z'</code></th></tr></thead><tbody><tr><td>level1</td><td>Base Character</td><td><code class="literal">true</code></td><td><code class="literal">true</code></td><td><code class="literal">true</code></td><td><code class="literal">true</code></td><td><code class="literal">true</code></td><td><code class="literal">false</code></td></tr><tr><td>level2</td><td>Accents</td><td><code class="literal">true</code></td><td><code class="literal">true</code></td><td><code class="literal">true</code></td><td><code class="literal">true</code></td><td><code class="literal">false</code></td><td><code class="literal">false</code></td></tr><tr><td>level3</td><td>Case/Variants</td><td><code class="literal">true</code></td><td><code class="literal">true</code></td><td><code class="literal">true</code></td><td><code class="literal">false</code></td><td><code class="literal">false</code></td><td><code class="literal">false</code></td></tr><tr><td>level4</td><td>Punctuation</td><td><code class="literal">true</code></td><td><code class="literal">true</code></td><td><code class="literal">false</code></td><td><code class="literal">false</code></td><td><code class="literal">false</code></td><td><code class="literal">false</code></td></tr><tr><td>identic</td><td>All</td><td><code class="literal">true</code></td><td><code class="literal">false</code></td><td><code class="literal">false</code></td><td><code class="literal">false</code></td><td><code class="literal">false</code></td><td><code class="literal">false</code></td></tr></tbody></table></div></div><br class="table-break" /><p>
     At every level, even with full normalization off, basic normalization is
     performed. For example, <code class="literal">'á'</code> may be composed of the
     code points <code class="literal">U&amp;'\0061\0301'</code> or the single code
     point <code class="literal">U&amp;'\00E1'</code>, and those sequences will be
     considered equal even at the <code class="literal">identic</code> level. To treat
     any difference in code point representation as distinct, use a collation
     created with <code class="symbol">deterministic</code> set to
     <code class="literal">true</code>.
    </p><div class="sect4" id="ICU-COLLATION-LEVEL-EXAMPLES"><div class="titlepage"><div><div><h5 class="title">24.2.3.1.1. Collation Level Examples <a href="#ICU-COLLATION-LEVEL-EXAMPLES" class="id_link">#</a></h5></div></div></div><pre class="programlisting">
CREATE COLLATION level3 (provider = icu, deterministic = false, locale = 'und-u-ka-shifted-ks-level3');
CREATE COLLATION level4 (provider = icu, deterministic = false, locale = 'und-u-ka-shifted-ks-level4');
CREATE COLLATION identic (provider = icu, deterministic = false, locale = 'und-u-ka-shifted-ks-identic');

-- invisible separator ignored at all levels except identic
SELECT 'ab' = U&amp;'a\2063b' COLLATE level4; -- true
SELECT 'ab' = U&amp;'a\2063b' COLLATE identic; -- false

-- punctuation ignored at level3 but not at level 4
SELECT 'x-y' = 'x_y' COLLATE level3; -- true
SELECT 'x-y' = 'x_y' COLLATE level4; -- false
</pre></div></div><div class="sect3" id="ICU-COLLATION-SETTINGS"><div class="titlepage"><div><div><h4 class="title">24.2.3.2. Collation Settings for an ICU Locale <a href="#ICU-COLLATION-SETTINGS" class="id_link">#</a></h4></div></div></div><p>
     <a class="xref" href="collation.html#ICU-COLLATION-SETTINGS-TABLE" title="Table 24.2. ICU Collation Settings">Table 24.2</a> shows the available
     collation settings, which can be used as part of a language tag to
     customize a collation.
    </p><div class="table" id="ICU-COLLATION-SETTINGS-TABLE"><p class="title"><strong>Table 24.2. ICU Collation Settings</strong></p><div class="table-contents"><table class="table" summary="ICU Collation Settings" border="1"><colgroup><col class="col1" /><col class="col2" /><col class="col3" /><col class="col4" /></colgroup><thead><tr><th>Key</th><th>Values</th><th>Default</th><th>Description</th></tr></thead><tbody><tr><td><code class="literal">co</code></td><td><code class="literal">emoji</code>, <code class="literal">phonebk</code>, <code class="literal">standard</code>, <em class="replaceable"><code>...</code></em></td><td><code class="literal">standard</code></td><td>
          Collation type. See <a class="xref" href="collation.html#ICU-EXTERNAL-REFERENCES" title="24.2.3.5. External References for ICU">Section 24.2.3.5</a> for additional options and details.
         </td></tr><tr><td><code class="literal">ka</code></td><td><code class="literal">noignore</code>, <code class="literal">shifted</code></td><td><code class="literal">noignore</code></td><td>
          If set to <code class="literal">shifted</code>, causes some characters
          (e.g. punctuation or space) to be ignored in comparison. Key
          <code class="literal">ks</code> must be set to <code class="literal">level3</code> or
          lower to take effect. Set key <code class="literal">kv</code> to control which
          character classes are ignored.
         </td></tr><tr><td><code class="literal">kb</code></td><td><code class="literal">true</code>, <code class="literal">false</code></td><td><code class="literal">false</code></td><td>
          Backwards comparison for the level 2 differences. For example,
          locale <code class="literal">und-u-kb</code> sorts <code class="literal">'àe'</code>
          before <code class="literal">'aé'</code>.
         </td></tr><tr><td><code class="literal">kc</code></td><td><code class="literal">true</code>, <code class="literal">false</code></td><td><code class="literal">false</code></td><td>
          <p>
           Separates case into a "level 2.5" that falls between accents and
           other level 3 features.
          </p>
          <p>
           If set to <code class="literal">true</code> and <code class="literal">ks</code> is set
           to <code class="literal">level1</code>, will ignore accents but take case
           into account.
          </p>
         </td></tr><tr><td><code class="literal">kf</code></td><td>
          <code class="literal">upper</code>, <code class="literal">lower</code>,
          <code class="literal">false</code>
         </td><td><code class="literal">false</code></td><td>
          If set to <code class="literal">upper</code>, upper case sorts before lower
          case. If set to <code class="literal">lower</code>, lower case sorts before
          upper case. If set to <code class="literal">false</code>, the sort depends on
          the rules of the locale.
         </td></tr><tr><td><code class="literal">kn</code></td><td><code class="literal">true</code>, <code class="literal">false</code></td><td><code class="literal">false</code></td><td>
          If set to <code class="literal">true</code>, numbers within a string are
          treated as a single numeric value rather than a sequence of
          digits. For example, <code class="literal">'id-45'</code> sorts before
          <code class="literal">'id-123'</code>.
         </td></tr><tr><td><code class="literal">kk</code></td><td><code class="literal">true</code>, <code class="literal">false</code></td><td><code class="literal">false</code></td><td>
          <p>
           Enable full normalization; may affect performance. Basic
           normalization is performed even when set to
           <code class="literal">false</code>. Locales for languages that require full
           normalization typically enable it by default.
          </p>
          <p>
           Full normalization is important in some cases, such as when
           multiple accents are applied to a single character. For example,
           the code point sequences <code class="literal">U&amp;'\0065\0323\0302'</code>
           and <code class="literal">U&amp;'\0065\0302\0323'</code> represent
           an <code class="literal">e</code> with circumflex and dot-below accents
           applied in different orders. With full normalization
           on, these code point sequences are treated as equal; otherwise they
           are unequal.
          </p>
         </td></tr><tr><td><code class="literal">kr</code></td><td>
          <code class="literal">space</code>, <code class="literal">punct</code>,
          <code class="literal">symbol</code>, <code class="literal">currency</code>,
          <code class="literal">digit</code>, <em class="replaceable"><code>script-id</code></em>
         </td><td> </td><td>
          <p>
           Set to one or more of the valid values, or any BCP 47
           <em class="replaceable"><code>script-id</code></em>, e.g. <code class="literal">latn</code>
           ("Latin") or <code class="literal">grek</code> ("Greek"). Multiple values are
           separated by "<code class="literal">-</code>".
          </p>
          <p>
           Redefines the ordering of classes of characters; those characters
           belonging to a class earlier in the list sort before characters
           belonging to a class later in the list. For instance, the value
           <code class="literal">digit-currency-space</code> (as part of a language tag
           like <code class="literal">und-u-kr-digit-currency-space</code>) sorts
           punctuation before digits and spaces.
          </p>
         </td></tr><tr><td><code class="literal">ks</code></td><td><code class="literal">level1</code>, <code class="literal">level2</code>, <code class="literal">level3</code>, <code class="literal">level4</code>, <code class="literal">identic</code></td><td><code class="literal">level3</code></td><td>
          Sensitivity (or "strength") when determining equality, with
          <code class="literal">level1</code> the least sensitive to differences and
          <code class="literal">identic</code> the most sensitive to differences. See
          <a class="xref" href="collation.html#ICU-COLLATION-LEVELS" title="Table 24.1. ICU Collation Levels">Table 24.1</a> for details.
         </td></tr><tr><td><code class="literal">kv</code></td><td>
          <code class="literal">space</code>, <code class="literal">punct</code>,
          <code class="literal">symbol</code>, <code class="literal">currency</code>
         </td><td><code class="literal">punct</code></td><td>
          Classes of characters ignored during comparison at level 3. Setting
          to a later value includes earlier values;
          e.g. <code class="literal">symbol</code> also includes
          <code class="literal">punct</code> and <code class="literal">space</code> in the
          characters to be ignored. Key <code class="literal">ka</code> must be set to
          <code class="literal">shifted</code> and key <code class="literal">ks</code> must be set
          to <code class="literal">level3</code> or lower to take effect.
         </td></tr></tbody></table></div></div><br class="table-break" /><p>
     Defaults may depend on locale. The above table is not meant to be
     complete. See <a class="xref" href="collation.html#ICU-EXTERNAL-REFERENCES" title="24.2.3.5. External References for ICU">Section 24.2.3.5</a> for additional
     options and details.
    </p><div class="note"><h3 class="title">Note</h3><p>
      For many collation settings, you must create the collation with
      <code class="option">deterministic</code> set to <code class="literal">false</code> for the
      setting to have the desired effect (see <a class="xref" href="collation.html#COLLATION-NONDETERMINISTIC" title="24.2.2.4. Nondeterministic Collations">Section 24.2.2.4</a>). Additionally, some settings
      only take effect when the key <code class="literal">ka</code> is set to
      <code class="literal">shifted</code> (see <a class="xref" href="collation.html#ICU-COLLATION-SETTINGS-TABLE" title="Table 24.2. ICU Collation Settings">Table 24.2</a>).
     </p></div></div><div class="sect3" id="ICU-LOCALE-EXAMPLES"><div class="titlepage"><div><div><h4 class="title">24.2.3.3. Collation Settings Examples <a href="#ICU-LOCALE-EXAMPLES" class="id_link">#</a></h4></div></div></div><div class="variablelist"><dl class="variablelist"><dt id="COLLATION-MANAGING-CREATE-ICU-DE-U-CO-PHONEBK-X-ICU"><span class="term"><code class="literal">CREATE COLLATION "de-u-co-phonebk-x-icu" (provider = icu, locale = 'de-u-co-phonebk');</code></span> <a href="#COLLATION-MANAGING-CREATE-ICU-DE-U-CO-PHONEBK-X-ICU" class="id_link">#</a></dt><dd><p>German collation with phone book collation type</p></dd><dt id="COLLATION-MANAGING-CREATE-ICU-UND-U-CO-EMOJI-X-ICU"><span class="term"><code class="literal">CREATE COLLATION "und-u-co-emoji-x-icu" (provider = icu, locale = 'und-u-co-emoji');</code></span> <a href="#COLLATION-MANAGING-CREATE-ICU-UND-U-CO-EMOJI-X-ICU" class="id_link">#</a></dt><dd><p>
         Root collation with Emoji collation type, per Unicode Technical Standard #51
        </p></dd><dt id="COLLATION-MANAGING-CREATE-ICU-EN-U-KR-GREK-LATN"><span class="term"><code class="literal">CREATE COLLATION latinlast (provider = icu, locale = 'en-u-kr-grek-latn');</code></span> <a href="#COLLATION-MANAGING-CREATE-ICU-EN-U-KR-GREK-LATN" class="id_link">#</a></dt><dd><p>
         Sort Greek letters before Latin ones.  (The default is Latin before Greek.)
        </p></dd><dt id="COLLATION-MANAGING-CREATE-ICU-EN-U-KF-UPPER"><span class="term"><code class="literal">CREATE COLLATION upperfirst (provider = icu, locale = 'en-u-kf-upper');</code></span> <a href="#COLLATION-MANAGING-CREATE-ICU-EN-U-KF-UPPER" class="id_link">#</a></dt><dd><p>
         Sort upper-case letters before lower-case letters.  (The default is
         lower-case letters first.)
        </p></dd><dt id="COLLATION-MANAGING-CREATE-ICU-EN-U-KF-UPPER-KR-GREK-LATN"><span class="term"><code class="literal">CREATE COLLATION special (provider = icu, locale = 'en-u-kf-upper-kr-grek-latn');</code></span> <a href="#COLLATION-MANAGING-CREATE-ICU-EN-U-KF-UPPER-KR-GREK-LATN" class="id_link">#</a></dt><dd><p>
         Combines both of the above options.
        </p></dd></dl></div></div><div class="sect3" id="ICU-TAILORING-RULES"><div class="titlepage"><div><div><h4 class="title">24.2.3.4. ICU Tailoring Rules <a href="#ICU-TAILORING-RULES" class="id_link">#</a></h4></div></div></div><p>
     If the options provided by the collation settings shown above are not
     sufficient, the order of collation elements can be changed with tailoring
     rules, whose syntax is detailed at <a class="ulink" href="https://unicode-org.github.io/icu/userguide/collation/customization/" target="_top">https://unicode-org.github.io/icu/userguide/collation/customization/</a>.
    </p><p>
     This small example creates a collation based on the root locale with a
     tailoring rule:
</p><pre class="programlisting">
CREATE COLLATION custom (provider = icu, locale = 'und', rules = '&amp;V &lt;&lt; w &lt;&lt;&lt; W');
</pre><p>
     With this rule, the letter <span class="quote"><span class="quote">W</span></span> is sorted after
     <span class="quote"><span class="quote">V</span></span>, but is treated as a secondary difference similar to an
     accent.  Rules like this are contained in the locale definitions of some
     languages.  (Of course, if a locale definition already contains the
     desired rules, then they don't need to be specified again explicitly.)
    </p><p>
     Here is a more complex example.  The following statement sets up a
     collation named <code class="literal">ebcdic</code> with rules to sort US-ASCII
     characters in the order of the EBCDIC encoding.

</p><pre class="programlisting">
CREATE COLLATION ebcdic (provider = icu, locale = 'und',
rules = $$
&amp; ' ' &lt; '.' &lt; '&lt;' &lt; '(' &lt; '+' &lt; \|
&lt; '&amp;' &lt; '!' &lt; '$' &lt; '*' &lt; ')' &lt; ';'
&lt; '-' &lt; '/' &lt; ',' &lt; '%' &lt; '_' &lt; '&gt;' &lt; '?'
&lt; '`' &lt; ':' &lt; '#' &lt; '@' &lt; \' &lt; '=' &lt; '"'
&lt;*a-r &lt; '~' &lt;*s-z &lt; '^' &lt; '[' &lt; ']'
&lt; '{' &lt;*A-I &lt; '}' &lt;*J-R &lt; '\' &lt;*S-Z &lt;*0-9
$$);

SELECT c
FROM (VALUES ('a'), ('b'), ('A'), ('B'), ('1'), ('2'), ('!'), ('^')) AS x(c)
ORDER BY c COLLATE ebcdic;
 c
---
 !
 a
 b
 ^
 A
 B
 1
 2
</pre><p>
    </p></div><div class="sect3" id="ICU-EXTERNAL-REFERENCES"><div class="titlepage"><div><div><h4 class="title">24.2.3.5. External References for ICU <a href="#ICU-EXTERNAL-REFERENCES" class="id_link">#</a></h4></div></div></div><p>
     This section (<a class="xref" href="collation.html#ICU-CUSTOM-COLLATIONS" title="24.2.3. ICU Custom Collations">Section 24.2.3</a>) is only a brief
     overview of ICU behavior and language tags. Refer to the following
     documents for technical details, additional options, and new behavior:
    </p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p>
       <a class="ulink" href="https://www.unicode.org/reports/tr35/tr35-collation.html" target="_top">Unicode Technical Standard #35</a>
      </p></li><li class="listitem"><p>
       <a class="ulink" href="https://tools.ietf.org/html/bcp47" target="_top">BCP 47</a>
      </p></li><li class="listitem"><p>
       <a class="ulink" href="https://github.com/unicode-org/cldr/blob/master/common/bcp47/collation.xml" target="_top">CLDR repository</a>
      </p></li><li class="listitem"><p>
       <a class="ulink" href="https://unicode-org.github.io/icu/userguide/locale/" target="_top">https://unicode-org.github.io/icu/userguide/locale/</a>
      </p></li><li class="listitem"><p>
       <a class="ulink" href="https://unicode-org.github.io/icu/userguide/collation/" target="_top">https://unicode-org.github.io/icu/userguide/collation/</a>
      </p></li></ul></div></div></div></div><div class="navfooter"><hr /><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="locale.html" title="24.1. Locale Support">Prev</a> </td><td width="20%" align="center"><a accesskey="u" href="charset.html" title="Chapter 24. Localization">Up</a></td><td width="40%" align="right"> <a accesskey="n" href="multibyte.html" title="24.3. Character Set Support">Next</a></td></tr><tr><td width="40%" align="left" valign="top">24.1. Locale Support </td><td width="20%" align="center"><a accesskey="h" href="index.html" title="PostgreSQL 16.2 Documentation">Home</a></td><td width="40%" align="right" valign="top"> 24.3. Character Set Support</td></tr></table></div></body></html>