.. role:: js(code)
   :language: javascript

=========================
UI Internationalization
=========================

There are many types of data that need to be formatted into a locale-specific format,
or require locale-specific API operations.

Gecko provides a rich set of locale-aware APIs for operations such as:

* date and time formatting
* number formatting
* searching
* sorting
* plural rules
* calendar and locale information

.. note::

  Most of the APIs are backed by the Unicode projects `CLDR`_ and `ICU`_ and are
  focused on enabling front-end code internationalization, which means the majority of
  the APIs are primarily available in JavaScript, with C++ and Rust having only a small
  subset of them exposed.

JavaScript Internationalization API
===================================

Data internationalization APIs are formalized in the JavaScript standard `ECMA 402`_.
These APIs are supported by all major JS environments.

The MDN article on the `Intl API`_ is the best reference for its current state.
Mozilla has excellent support for the API and relies on it for the majority
of its needs. However, when working on Firefox UI, the :js:`Services.intl` wrapper
should be used instead.

Services.intl
=============

:js:`Services.intl` is an extension of the JS Intl API which should be used whenever
working on Gecko application UI that runs with chrome privileges.

The API provides the same objects and methods as :js:`Intl.*`, but fine-tunes them
to the Gecko app user preferences, including matching OS Preferences and
other locale choices that the JS Intl API exposed to web content cannot take into account.

For example, here is locale-aware date formatting
using the regular :js:`Intl.DateTimeFormat`:

.. code-block:: javascript

    let dtf = new Intl.DateTimeFormat(navigator.languages, {
      year: "numeric",
      month: "long",
      day: "numeric"
    });
    let value = dtf.format(new Date());

It will do a good job of formatting the date for the user's locale, but it can
only use the customization bits that are exposed to the Web, based on
the locale the user broadcasts to the Web and any additional settings.

That ignores other bits of information that could inform the formatting.

A public API such as :js:`Intl.*` cannot look into the Operating System for
regional preferences. It also respects settings such as `Resist Fingerprinting`
by masking its timezone and locale settings.

This is a fair tradeoff when dealing with the Web Content, but in most cases, the
privileged UI of the Gecko application should be able to access all of those
additional bits and not be affected by the anti-fingerprinting masking.

`mozIntl` is a simple wrapper which in its simplest form works exactly like `Intl`. It is
exposed on the :js:`Services.intl` object and can be used just like the regular `Intl` API:

.. code-block:: javascript

    let dtf = new Services.intl.DateTimeFormat(undefined, {
      year: "numeric",
      month: "long",
      day: "numeric"
    });
    let value = dtf.format(new Date());

The difference is that this API will now use the set of locales as defined for
Gecko, and will also respect additional regional preferences that Gecko
will fetch from the Operating System.

For those reasons, when dealing with Gecko application UI, it is always recommended
to use the :js:`Services.intl` wrapper.

Additional APIs
================

On top of wrapping the `Intl` API, `mozIntl` provides a number of features
in the form of additional options to existing APIs as well as completely new APIs.

Many of those extensions are in the process of being standardized, but are
already available to Gecko developers for internal use.

Below is the list of current extensions:

mozIntl.DateTimeFormat
----------------------

`DateTimeFormat` in `mozIntl` gets additional options that provide greater
simplicity and consistency to the API.

* :js:`timeStyle` and :js:`dateStyle` can take values :js:`short`, :js:`medium`,
  :js:`long` and :js:`full`.
  These options can replace the manual listing of tokens like :js:`year`, :js:`day`, :js:`hour` etc.
  and will compose the most natural date or time format of a given style for the selected
  locale.

Using :js:`timeStyle` and :js:`dateStyle` is highly recommended over listing the tokens,
because different locales may use different default styles for displaying the same tokens.

An additional benefit is that using those styles allows `mozIntl` to look into
Operating System patterns, which gives users the ability to customize those
patterns to their liking.

Example use:

.. code-block:: javascript

    let dtf = new Services.intl.DateTimeFormat(undefined, {
      timeStyle: "short",
      dateStyle: "short"
    });
    let value = dtf.format(new Date());

This will select the best locale to match the current Gecko application locale,
then potentially check for Operating System regional preferences customizations,
produce the correct pattern for short date+time style and format the date into it.
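
To illustrate why styles are preferred over token lists, here is a minimal sketch
that uses the standard :js:`Intl.DateTimeFormat`, which in current engines also accepts
:js:`dateStyle` and :js:`timeStyle`. The sample date and the two locales are arbitrary
choices for the illustration, and the sketch does not show the Operating System
lookup that `mozIntl` adds. Whether `mozIntl` handles a style mixed with tokens the
same way as the standard API is not verified here.

```javascript
// Arbitrary sample value, built in local time.
const when = new Date(2020, 0, 2, 15, 30);
const options = { dateStyle: "short", timeStyle: "short" };

// The same options produce the natural short pattern of each locale.
const enUS = new Intl.DateTimeFormat("en-US", options).format(when);
const de = new Intl.DateTimeFormat("de", options).format(when);
console.log(enUS);
console.log(de);

// The standard API rejects a style combined with individual tokens.
let mixingRejected = false;
try {
  new Intl.DateTimeFormat("en-US", { dateStyle: "short", year: "numeric" });
} catch (e) {
  mixingRejected = e instanceof TypeError;
}
console.log(mixingRejected);
```

The two formatted strings differ in field order and separators even though the
calling code is identical, which is the behavior the styles are meant to provide.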


mozIntl.getCalendarInfo(locale)
-------------------------------

The API will return the following calendar information for a given locale code:

* firstDayOfWeek
    an integer in the range 1=Monday to 7=Sunday indicating the day
    considered the first day of the week in calendars, e.g. 7 for en-US,
    1 for en-GB, 7 for bn-IN
* minDays
    an integer in the range of 1 to 7 indicating the minimum number
    of days required in the first week of the year, e.g. 1 for en-US, 4 for de
* weekend
    an array with values in the range 1=Monday to 7=Sunday indicating the days
    of the week considered as part of the weekend, e.g. [6, 7] for en-US and en-GB,
    [7] for bn-IN (note that "weekend" is *not* necessarily two days)

Those bits of information should be especially useful for any UI that works
with calendar data.

Example:

.. code-block:: javascript

    // omitting the `locale` argument will make the API return data for the
    // current Gecko application UI locale.
    let {
      firstDayOfWeek,  // 1
      minDays,         // 4
      weekend,         // [6, 7]
      calendar,        // "gregory"
      locale,          // "pl"
    } = Services.intl.getCalendarInfo();
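
As a sketch of how a calendar UI might consume these values, the helper below
rotates the week so that it starts on :js:`firstDayOfWeek` and flags weekend days.
:js:`orderWeekdays` is a hypothetical helper written for this illustration and is not
part of `mozIntl`. The input values are the en-US numbers quoted above, passed in
by hand so the snippet does not depend on :js:`Services.intl`.

```javascript
// Hypothetical helper, not part of mozIntl.
// Days use the numbering described above: 1 = Monday ... 7 = Sunday.
function orderWeekdays(firstDayOfWeek, weekend) {
  const days = [];
  for (let offset = 0; offset < 7; offset++) {
    // Map 1..7 onto 0..6, rotate by the offset, then map back to 1..7.
    const day = ((firstDayOfWeek - 1 + offset) % 7) + 1;
    days.push({ day, isWeekend: weekend.includes(day) });
  }
  return days;
}

// en-US values from the list above: week starts on Sunday, weekend is Sat + Sun.
const columns = orderWeekdays(7, [6, 7]);
console.log(columns.map(c => c.day)); // [7, 1, 2, 3, 4, 5, 6]
```

In privileged code the two arguments would come from the object returned by
:js:`Services.intl.getCalendarInfo()`.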


mozIntl.DisplayNames(locales, options)
-----------------------------------------

:js:`DisplayNames` API is useful to retrieve various terms available in the
internationalization API. :js:`mozIntl.DisplayNames` extends the standard
`Intl.DisplayNames`_ to additionally provide localization for date-time types.

The API takes a locale fallback chain list, and an options object which can contain
two keys:

* :js:`style` which can take values :js:`narrow`, :js:`short`, :js:`abbreviated`, :js:`long`
* :js:`type` which can take values :js:`language`, :js:`script`, :js:`region`,
  :js:`currency`, :js:`weekday`, :js:`month`, :js:`quarter`, :js:`dayPeriod`,
  :js:`dateTimeField`

Example:

.. code-block:: javascript

    // The expected values in the comments assume the resolved locale is "pl".
    let dateTimeFieldDisplayNames = new Services.intl.DisplayNames(undefined, {
      type: "dateTimeField",
    });
    dateTimeFieldDisplayNames.resolvedOptions().locale; // "pl"
    dateTimeFieldDisplayNames.of("year"); // "rok"

    let monthDisplayNames = new Services.intl.DisplayNames(undefined, {
      type: "month", style: "long",
    });
    monthDisplayNames.of(1); // "styczeń"

    let weekdaysDisplayNames = new Services.intl.DisplayNames(undefined, {
      type: "weekday", style: "short",
    });
    weekdaysDisplayNames.of(1); // "pon"

    let dayPeriodsDisplayNames = new Services.intl.DisplayNames(undefined, {
      type: "dayPeriod", style: "narrow",
    });
    dayPeriodsDisplayNames.of("am"); // "AM"
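
For comparison, the following sketch uses only the standard `Intl.DisplayNames`_
that `mozIntl` extends, so it covers the :js:`region` and :js:`language` types but none
of the date-time types. The explicit :js:`"en"` locale and the code lists are
arbitrary choices made for the illustration.

```javascript
// Standard Intl.DisplayNames: one instance per type.
const regionNames = new Intl.DisplayNames(["en"], { type: "region" });
const languageNames = new Intl.DisplayNames(["en"], { type: "language" });

// Map a list of codes to their display names.
const regions = ["US", "CA", "MX"].map(code => regionNames.of(code));
const languages = ["fr", "de", "en"].map(code => languageNames.of(code));

console.log(regions);   // ["United States", "Canada", "Mexico"]
console.log(languages); // ["French", "German", "English"]
```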


mozIntl.RelativeTimeFormat(locales, options)
--------------------------------------------

API which can be used to format an interval or a date into a textual
representation of a relative time, such as **5 minutes ago** or **in 2 days**.

This API is in the process of standardization and in its raw form will not handle
any calculations to select the best unit. It is intended to just offer a way
to format a value.

The `mozIntl` wrapper extends the functionality by providing those calculations and
allowing the user to get the current best textual representation of the delta.

Example:

.. code-block:: javascript

    let rtf = new Services.intl.RelativeTimeFormat(undefined, {
      style: "long", // "narrow" | "short" | "long" (default)
      numeric: "auto", // "always" | "auto" (default)
    });

    let now = Date.now();
    rtf.formatBestUnit(new Date(now - 3 * 1000 * 60)); // "3 minutes ago"

The `numeric` option is set to `auto` by default, which means that when possible
the formatter will use special textual terms like *yesterday*, *last year*, and so on.

Those values require specific calculations that the raw `Intl.*` API cannot provide.
For example, *yesterday* requires the algorithm to know not only the time delta,
but also what time of the day `now` is. 15 hours ago may be *yesterday* if it
is 10am, but will still be *today* if it is 11pm.

For that reason the future `Intl.RelativeTimeFormat` will use *always* as default,
since terms such as *15 hours ago* are independent of the current time.
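
The *yesterday* example can be sketched with the standard
:js:`Intl.RelativeTimeFormat`. This is not the algorithm that `mozIntl` uses. The
helpers :js:`startOfDay` and :js:`describeDelta` are hypothetical names for this
illustration, and only days and hours are considered. The point is that the day
delta is computed between calendar days, so the same 15 hour difference gives
different results depending on what time `now` is.

```javascript
const MS_PER_HOUR = 60 * 60 * 1000;
const MS_PER_DAY = 24 * MS_PER_HOUR;

// Hypothetical helper: local midnight of the given date.
function startOfDay(date) {
  return new Date(date.getFullYear(), date.getMonth(), date.getDate());
}

// Hypothetical helper: prefer a calendar-day term, fall back to hours.
function describeDelta(date, now) {
  const rtf = new Intl.RelativeTimeFormat("en", { numeric: "auto" });
  // Rounding absorbs the 23 or 25 hour days around DST changes.
  const dayDelta = Math.round((startOfDay(date) - startOfDay(now)) / MS_PER_DAY);
  if (dayDelta !== 0) {
    return rtf.format(dayDelta, "day");
  }
  return rtf.format(Math.round((date - now) / MS_PER_HOUR), "hour");
}

// 15 hours before 10:00 falls on the previous calendar day.
const morning = new Date(2020, 0, 2, 10, 0);
console.log(describeDelta(new Date(morning - 15 * MS_PER_HOUR), morning)); // "yesterday"

// 15 hours before 23:00 is still the same calendar day.
const night = new Date(2020, 0, 2, 23, 0);
console.log(describeDelta(new Date(night - 15 * MS_PER_HOUR), night)); // "15 hours ago"
```

A real implementation would also need to choose among seconds, minutes, months
and years, and to decide when a calendar term is too coarse, for example for a
date two hours ago that happens to fall before midnight.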

.. note::

  In the current form, the API should be only used to format standalone values.
  Without additional capitalization rules, it cannot be freely used in sentences.

mozIntl.getLanguageDisplayNames(locales, langCodes)
---------------------------------------------------

API which returns a list of language names formatted for display.

Example:

.. code-block:: javascript

  let langs = Services.intl.getLanguageDisplayNames(["pl"], ["fr", "de", "en"]);
  // langs: ["Francuski", "Niemiecki", "Angielski"]


mozIntl.getRegionDisplayNames(locales, regionCodes)
---------------------------------------------------

API which returns a list of region names formatted for display.

Example:

.. code-block:: javascript

  let regs = Services.intl.getRegionDisplayNames(["pl"], ["US", "CA", "MX"]);
  // regs: ["Stany Zjednoczone", "Kanada", "Meksyk"]

mozIntl.getLocaleDisplayNames(locales, localeCodes)
---------------------------------------------------

API which returns a list of locale names formatted for display.

Example:

.. code-block:: javascript

  let locs = Services.intl.getLocaleDisplayNames(["pl"], ["sr-RU", "es-MX", "fr-CA"]);
  // locs: ["Serbski (Rosja)", "Hiszpański (Meksyk)", "Francuski (Kanada)"]

mozIntl.getAvailableLocaleDisplayNames(type)
---------------------------------------------------

API which returns a list of locale display name codes available for a
given type.
Available types are: "language", "region".

Example:

.. code-block:: javascript

  let codes = Services.intl.getAvailableLocaleDisplayNames("region");
  // codes: ["au", "ae", "af", ...]

Best Practices
==============

The most important best practice when dealing with data internationalization is to
perform it as close to the actual UI as possible; right before the UI is displayed.

The reason for this practice is that internationalized data is considered *"opaque"*,
which means that no code should ever attempt to operate on it. Late resolution also
increases the chance that the data will be formatted in the current locale
selection and not formatted and cached prematurely.

It's very important to not attempt to search, concatenate or in any other way
alter the output of the API. Once it gets formatted, the only thing to do with
the output should be to present it to the user.
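
A minimal sketch of this practice, using the standard :js:`Intl.DateTimeFormat` so it
stays self-contained: the model keeps raw :js:`Date` values, all sorting happens on
the raw data, and formatting happens only when a row is rendered. The
:js:`downloads` list and the :js:`renderRow` helper are hypothetical and exist only for
this illustration. In privileged UI code the formatter would come from
:js:`Services.intl`.

```javascript
// Hypothetical model data: raw values only, no pre-formatted strings.
const downloads = [
  { name: "a.zip", finished: new Date(2020, 0, 2, 10, 30) },
  { name: "b.zip", finished: new Date(2020, 0, 1, 9, 0) },
];

// Operate on the raw values, never on formatted output.
downloads.sort((a, b) => a.finished - b.finished);

// Hypothetical render step: format right before display, in the locale
// that is current at that moment.
function renderRow(item, locales) {
  const dtf = new Intl.DateTimeFormat(locales, {
    dateStyle: "short",
    timeStyle: "short",
  });
  return { name: item.name, label: dtf.format(item.finished) };
}

const rows = downloads.map(item => renderRow(item, "en-US"));
console.log(rows.map(row => row.name)); // ["b.zip", "a.zip"]
```

The :js:`label` strings are treated as opaque: they are handed to the UI and not
parsed, compared or concatenated afterwards.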

Testing
-------

The above is also important in the context of testing. It is a common mistake to
attempt to write tests that verify the output of the UI with internationalized data.

The underlying data set used to create the formatted version of the data will
change over time, due both to dataset improvements and to changes in language
and regional preferences.
That means that tests that attempt to verify the exact output will require
a significantly higher level of maintenance and will remain brittle.

Most of the APIs provide a special method, such as :js:`resolvedOptions`, which should be used
instead to verify that the formatter is configured as expected.
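
As a sketch of such a test, the checks below use the standard
:js:`Intl.DateTimeFormat` and assert on the resolved configuration and on the kinds
of parts produced, rather than on the formatted string. The locale, options and
sample date are arbitrary choices for this illustration.

```javascript
const dtf = new Intl.DateTimeFormat("en-US", {
  dateStyle: "long",
  timeStyle: "short",
});

// Verify the configuration, not the locale data.
const resolved = dtf.resolvedOptions();
console.log(resolved.locale);    // "en-US"
console.log(resolved.dateStyle); // "long"
console.log(resolved.timeStyle); // "short"

// formatToParts exposes which fields are present without pinning down
// their exact text, order or separators.
const partTypes = dtf
  .formatToParts(new Date(2020, 0, 2, 15, 30))
  .map(part => part.type);
console.log(partTypes.includes("year") && partTypes.includes("hour")); // true
```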

Future extensions
=================

If you find yourself in need of additional internationalization APIs that are not currently
supported, you can check whether an API proposal is already in the works
and file a bug in the component `Core::Internationalization`_ to request it.

.. _ECMA 402: https://tc39.github.io/ecma402/
.. _Intl API: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Intl
.. _CLDR: http://cldr.unicode.org/
.. _ICU: http://site.icu-project.org/
.. _Core::Internationalization: https://bugzilla.mozilla.org/enter_bug.cgi?product=Core&component=Internationalization
.. _Intl.DisplayNames: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Intl/DisplayNames