author | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-04-19 00:47:55 +0000 |
---|---|---|
committer | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-04-19 00:47:55 +0000 |
commit | 26a029d407be480d791972afb5975cf62c9360a6 (patch) | |
tree | f435a8308119effd964b339f76abb83a57c29483 /toolkit/components/translations/docs | |
parent | Initial commit. (diff) | |
Adding upstream version 124.0.1.
upstream/124.0.1
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'toolkit/components/translations/docs')
-rw-r--r-- | toolkit/components/translations/docs/img/about-translations.png | bin | 0 -> 138768 bytes |
-rw-r--r-- | toolkit/components/translations/docs/index.md | 17 |
-rw-r--r-- | toolkit/components/translations/docs/resources/01_overview.md | 145 |
-rw-r--r-- | toolkit/components/translations/docs/resources/02_contributing.md | 157 |
-rw-r--r-- | toolkit/components/translations/docs/resources/03_bergamot.md | 119 |
5 files changed, 438 insertions, 0 deletions
diff --git a/toolkit/components/translations/docs/img/about-translations.png b/toolkit/components/translations/docs/img/about-translations.png
Binary files differ
new file mode 100644
index 0000000000..51c4d12042
--- /dev/null
+++ b/toolkit/components/translations/docs/img/about-translations.png
diff --git a/toolkit/components/translations/docs/index.md b/toolkit/components/translations/docs/index.md
new file mode 100644
index 0000000000..7a13d5cc1c
--- /dev/null
+++ b/toolkit/components/translations/docs/index.md
@@ -0,0 +1,17 @@
+# Firefox Translations
+
+Firefox Translations is a project initiative to give Firefox the ability to translate web content from one language to another language as first-class functionality in the browser.
+
+This project is based initially on the [Firefox Translations WebExtension].
+
+## Resources
+
+```{toctree}
+:titlesonly:
+:maxdepth: 1
+:glob:
+
+resources/*
+```
+
+[Firefox Translations WebExtension]: https://github.com/mozilla/firefox-translations
diff --git a/toolkit/components/translations/docs/resources/01_overview.md b/toolkit/components/translations/docs/resources/01_overview.md
new file mode 100644
index 0000000000..632b3002df
--- /dev/null
+++ b/toolkit/components/translations/docs/resources/01_overview.md
@@ -0,0 +1,145 @@
+# Overview
+
+The following is a high-level overview of the technologies associated with Firefox Translations.
+
+- [Supported Platforms](#supported-platforms)
+- [Language Translation](#language-translation)
+  - [Technology](#technology)
+  - [Models](#models)
+  - [Pivot Translations](#pivot-translations)
+- [Language Identification](#language-identification)
+  - [Technology](#technology-1)
+  - [Models](#models-1)
+- [Remote Settings](#remote-settings)
+  - [Enabling Firefox Translations](#enabling-firefox-translations)
+  - [Translating Web Pages](#translating-web-pages)
+  - [about:translations](#abouttranslations)
+
+---
+## Supported Platforms
+
+<input type="checkbox" style="pointer-events: none;" checked><b>Desktop</b><br>
+<input type="checkbox" style="pointer-events: none;" checked><b>Android</b><br>
+<input type="checkbox" style="pointer-events: none;"><b>iOS</b><br>
+
+```{note}
+- Firefox Translations is available only on devices that support [SSE4.1] due to required [SIMD] calculations in [WASM].
+```
+
+
+---
+## Language Translation
+
+Firefox Translations utilizes trained machine-learning models that run locally on client
+architecture to translate web content from one language to another.
+
+### Technology
+
+Firefox Translations utilizes a [WASM] version of the [Bergamot] library to translate from
+one language to another. [Bergamot] is powered by [Marian].
+
+### Models
+
+[Bergamot] translation models are single-direction, one-to-one models trained to translate from one language
+to one other language (e.g. **`en ⟶ es`**). When Firefox Translations determines a source language and a target language,
+it utilizes a model specific to this language pair to translate from one to the other.
+
+### Pivot Translations
+
+In the event that there is no model to translate directly from a source language to a target language,
+Firefox Translations will attempt to satisfy a transitive translation path and will perform a multi-step
+translation from the source language to the target language.
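As a rough illustration of the pivot rule described above, the path lookup could be sketched as follows. This is a minimal Python sketch, not Firefox code: the function name `resolve_translation_path` and the representation of available models as a set of `(source, target)` pairs are assumptions made for illustration. It assumes a single pivot language (`en`) and at most one pivot step.

```python
from typing import Optional

# Assumption for this sketch: English is the only pivot language.
PIVOT_LANGUAGE = "en"


def resolve_translation_path(
    source: str, target: str, models: set[tuple[str, str]]
) -> Optional[list[tuple[str, str]]]:
    """Return the model pairs to apply in order, or None if no path exists.

    `models` is a hypothetical set of available one-directional models,
    each written as a (source, target) language pair.
    """
    # Prefer a direct model when one is available.
    if (source, target) in models:
        return [(source, target)]

    # Otherwise try exactly one pivot step through the pivot language.
    first = (source, PIVOT_LANGUAGE)
    second = (PIVOT_LANGUAGE, target)
    if first in models and second in models:
        return [first, second]

    # No direct model and no single-pivot path.
    return None


available = {("es", "en"), ("en", "fr"), ("en", "es")}
print(resolve_translation_path("es", "fr", available))
```

With the models listed in `available`, the lookup for `es` to `fr` falls through the direct check and returns the two-step path via `en`.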
+
+
+
+```{admonition} Example
+> **_No direct translation model exists_**<br>
+> <input type="checkbox" style="pointer-events: none;"><b>`es ⟶ fr`</b><br>
+>
+> **_Transitive dependency satisfied_**<br>
+> <input type="checkbox" style="pointer-events: none;" checked><b>`es ⟶ en`</b><br>
+> <input type="checkbox" style="pointer-events: none;" checked><b>`en ⟶ fr`</b><br>
+>
+> **_Pivot translation_**<br>
+> <input type="checkbox" style="pointer-events: none;" checked><b>`es ⟶ en ⟶ fr`</b><br>
+
+In this example, no direct model exists for **`es ⟶ fr`**, but a transitive dependency is satisfied by the two
+models for **`es ⟶ en`** and **`en ⟶ fr`**. Firefox Translations will pivot on the **`en`** language by first
+translating from **`es`** to **`en`** and then from **`en`** to **`fr`**.
+```
+```{note}
+- Firefox Translations will not pivot more than once.
+- At present, only **`en`** is used as a pivot language.
+```
+
+---
+## Language Identification
+
+Firefox Translations utilizes trained machine-learning models that run locally on client
+architecture to identify content as being written in a detected language.
+
+### Technology
+
+Firefox Translations utilizes a [CLD2] language detector to identify in which language content is written.
+
+### Models
+
+No models are currently used for language identification, since [CLD2] exists in the Firefox source tree.
+
+---
+## Remote Settings
+
+Remote Settings is not currently used for language identification, since [CLD2] exists in the Firefox source tree.
+
+---
+## Using Firefox Translations
+
+The following documentation describes a high-level overview of using Firefox Translations.
+
+```{note}
+- Firefox Translations is actively under development and is currently available only in [Firefox Nightly].
+```
+
+### Enabling Firefox Translations
+
+Firefox Translations functionality can be enabled by modifying the [translations preferences] in **`about:config`**.
+
+These configurations are likely to change as the project develops, which is why this documentation links to them
+in the source code rather than defining them.
+
+At a time when the preferences are more stable, they can be documented here more clearly.
+
+### Translating Web Pages
+
+Once Firefox Translations is enabled, Firefox will analyze each web page to determine if it is translatable
+via the available translations models.
+
+If the web page is translatable, then a translations icon will appear in the URL bar of the browser, allowing
+the user to initiate the available translation process.
+
+### about:translations
+
+When Firefox Translations is enabled, a page called **`about:translations`** becomes available in the browser.
+
+This is a test page where the user can select a source language and a target language, type content into
+the source-language text box, and see the translated text in the target-language text box.
+
+```{note}
+**`about:translations`** is a developer-focused UI that is useful for testing the state, performance, and quality of the language models in an interactive environment. It is fairly unpolished and not intended to be shipped as a product at this time.
+
+It is, however, useful and fun, so it is documented here.
+```
+
+![](../img/about-translations.png)
+
+
+<!-- Hyperlinks -->
+[Bergamot]: https://browser.mt/
+[CLD2]: https://github.com/CLD2Owners/cld2
+[Firefox Nightly]: https://www.mozilla.org/en-US/firefox/channel/desktop/
+[Marian]: https://aclanthology.org/P18-4020/
+[Remote Settings]: https://remote-settings.readthedocs.io/en/latest/
+[SIMD]: https://en.wikipedia.org/wiki/Single_instruction,_multiple_data
+[SSE4.1]: https://en.wikipedia.org/wiki/SSE4#SSE4.1
+[translations preferences]: https://searchfox.org/mozilla-central/search?q=browser.translations&path=all.js&case=true&regexp=false
+[WASM]: https://webassembly.org/
diff --git a/toolkit/components/translations/docs/resources/02_contributing.md b/toolkit/components/translations/docs/resources/02_contributing.md
new file mode 100644
index 0000000000..c093ee550d
--- /dev/null
+++ b/toolkit/components/translations/docs/resources/02_contributing.md
@@ -0,0 +1,157 @@
+# Contributing
+
+The following content goes more in-depth than the [Overview](./01_overview.md) section
+to provide helpful information regarding contributing to Firefox Translations.
+
+- [Source Code](#source-code)
+- [Architecture](#architecture)
+  - [JSActors](#jsactors)
+- [Remote Settings](#remote-settings)
+  - [Admin Dashboards](#admin-dashboards)
+  - [Prod Admin Access](#prod-admin-access)
+  - [Pulling From Different Sources](#pulling-from-different-sources)
+  - [Versioning](#versioning)
+    - [Non-Breaking Changes](#non-breaking-changes)
+    - [Breaking Changes](#breaking-changes)
+- [Language Identification](#language-identification)
+
+---
+## Source Code
+
+The primary source code for Firefox Translations lives in the following directory:
+
+> **[toolkit/components/translations]**
+
+---
+## Architecture
+
+### JSActors
+
+Translations functionality is divided into different classes based on which access privileges are needed.
+Generally, this split is between [Parent] and [Child] versions of [JSWindowActors].
+
+The [Parent] actors have access to privileged content and are responsible for things like downloading models
+from [Remote Settings](#remote-settings), modifying privileged UI components, etc.
+
+The [Child] actors are responsible for interacting with content on the page itself, requesting content from
+the [Parent] actors, and creating [Workers] to carry out tasks.
+
+---
+## Remote Settings
+
+The machine-learning models and [WASM] binaries are all hosted in Remote Settings and are downloaded/cached when needed.
+
+### Admin Dashboards
+
+In order to get access to Firefox Translations content in the Remote Settings admin dashboards, you will need to request
+access in the Remote Settings component on [Bugzilla].
+
+Once you have access to Firefox Translations content in Remote Settings, you will be able to view it in the admin dashboards:
+
+**Dev**<br>
+> [https://settings.dev.mozaws.net/v0/admin](https://settings.dev.mozaws.net/v1/admin)
+
+**Stage**<br>
+> [https://remote-settings.allizom.org/v0/admin](https://settings-writer.stage.mozaws.net/v1/admin)
+
+### Prod Admin Access
+
+In order to access the prod admin dashboard, you must also have access to a VPN that is authorized to view the dashboard.
+To gain access to the VPN, follow [Step 3] on this page in the Remote Settings documentation.
+
+**Prod**<br>
+> [https://remote-settings.mozilla.org/v1/admin](https://settings-writer.prod.mozaws.net/v1/admin)
+
+
+### Pulling From Different Sources
+
+When you are running Firefox, you can choose to pull data from **Dev**, **Stage**, or **Prod** by downloading and installing
+the latest [remote-settings-devtools] Firefox extension.
+
+### Versioning
+
+Firefox Translations uses semantic versioning for all of its records via the **`version`** property.
+
+#### Non-breaking Changes
+
+Firefox Translations code will always retrieve the maximum compatible version of each record from Remote Settings.
+If two records exist with different versions (e.g. **`1.0`** and **`1.1`**), then only the version **`1.1`** record
+will be considered.
+
+This allows us to update and ship new versions of models that are compatible with the current source code and [WASM] runtimes
+in both backward-compatible and forward-compatible ways. These can be released through Remote Settings independent of the
+[Firefox Release Schedule].
+
+#### Breaking Changes
+
+Breaking changes for Firefox Translations are a bit more tricky. These are changes that make older-version records
+incompatible with the current Firefox source code and/or [WASM] runtimes.
+
+While a breaking change will result in a change of the semver number (e.g. **`1.1 ⟶ 2.0`**), this alone is not
+sufficient. Since Firefox Translations always attempts to use the maximum compatible version, only bumping this number
+would result in older versions of Firefox attempting to use a newer-version record that is no longer compatible with the
+Firefox source code or [WASM] runtimes.
+
+To handle these changes, Firefox Translations utilizes Remote Settings [Filter Expressions] to make certain records
+available to only particular releases of Firefox. This will allow Firefox Translations to make different sets of Remote Settings records available to different versions
+of Firefox.
+
+```{admonition} Example
+
+Let's say that Firefox 108 through Firefox 120 are compatible with translations model records in the **`1.*`** major-version range; however, Firefox 121 and onward are compatible with only model records in the **`2.*`** major-version range.
+
+This will allow us to mark the **`1.*`** major-version records with the following filter expression:
+
+**`
+"filter_expression": "env.version|versionCompare('108.a0') >= 0 && env.version|versionCompare('121.a0') < 0"
+`**
+
+This means that these records will only be available in Firefox versions greater than or equal to 108, and less than 121.
+
+Similarly, we will be able to mark all of the **`2.*`** major-version records with this filter expression:
+
+**`
+"filter_expression": "env.version|versionCompare('121.a0') >= 0"
+`**
+
+This means that these records will only be available in Firefox versions greater than or equal to Firefox 121.
+
+```
+
+Tying breaking changes to releases in this way frees up Firefox Translations to make changes as large as entirely
+switching one third-party library for another in the compiled source code, while allowing older versions of Firefox to continue utilizing the old library and allowing newer versions of Firefox to utilize the new library.
+
+---
+## Language Identification
+
+Translations currently uses the [CLD2] language detector.
+
+We have previously experimented with using the [fastText] language detector, but we opted to use [CLD2] due to complications with [fastText] [WASM] runtime performance. The benefit of the [CLD2] language detector is that it already exists in the Firefox source tree. In the future, we would still like to explore moving to a more modern language detector such as [CLD3], or perhaps something else.
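Returning to the Versioning section above, the "maximum compatible version" rule could be sketched roughly as follows. This is a minimal Python sketch under stated assumptions, not the Firefox implementation: it assumes that "compatible" means "same major version as the client supports", that versions are `major.minor` strings, and that records are plain dictionaries with a `version` key. The names `parse_version`, `select_max_compatible`, and `supported_major` are hypothetical.

```python
from typing import Optional


def parse_version(version: str) -> tuple[int, int]:
    """Split a "major.minor" string such as "1.1" into the tuple (1, 1)."""
    major, minor = version.split(".")
    return int(major), int(minor)


def select_max_compatible(records: list[dict], supported_major: int) -> Optional[dict]:
    """Pick the highest-versioned record whose major version matches.

    Assumption: a record is compatible only when its major version equals
    `supported_major`. Returns None when no record is compatible.
    """
    compatible = [
        record
        for record in records
        if parse_version(record["version"])[0] == supported_major
    ]
    # Compare versions numerically so that "1.10" ranks above "1.9".
    return max(
        compatible,
        key=lambda record: parse_version(record["version"]),
        default=None,
    )


# Hypothetical records; only the `version` property is taken from the docs.
example_records = [
    {"name": "example-model", "version": "1.0"},
    {"name": "example-model", "version": "1.1"},
    {"name": "example-model", "version": "2.0"},
]
print(select_max_compatible(example_records, supported_major=1))
```

Under these assumptions, a client that supports major version 1 ignores the `2.0` record even though it is numerically higher. That is also why a version bump alone does not protect older clients that lack such a check, and why the filter expressions above are still needed.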
+
+
+<!-- Hyperlinks -->
+[Bugzilla]: https://bugzilla.mozilla.org/enter_bug.cgi?product=Cloud%20Services&component=Server%3A%20Remote%20Settings
+[Child]: https://searchfox.org/mozilla-central/search?q=TranslationsChild
+[CLD2]: https://github.com/CLD2Owners/cld2
+[CLD3]: https://github.com/google/cld3
+[Download and Install]: https://emscripten.org/docs/getting_started/downloads.html#download-and-install
+[emscripten (2.0.3)]: https://github.com/emscripten-core/emscripten/blob/main/ChangeLog.md#203-09102020
+[emscripten (2.0.18)]: https://github.com/emscripten-core/emscripten/blob/main/ChangeLog.md#2018-04232021
+[emscripten (3.1.35)]: https://github.com/emscripten-core/emscripten/blob/main/ChangeLog.md#3135---040323
+[Environments]: https://remote-settings.readthedocs.io/en/latest/getting-started.html#environments
+[eval()]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/eval
+[fastText]: https://fasttext.cc/
+[Filter Expressions]: https://remote-settings.readthedocs.io/en/latest/target-filters.html#filter-expressions
+[Firefox Release Schedule]: https://wiki.mozilla.org/Release_Management/Calendar
+[generate functions]: https://emscripten.org/docs/api_reference/emscripten.h.html?highlight=dynamic_execution#functions
+[Getting Set Up To Work On The Firefox Codebase]: https://firefox-source-docs.mozilla.org/setup/index.html
+[importScripts()]: https://developer.mozilla.org/en-US/docs/Web/API/WorkerGlobalScope/importScripts
+[JSWindowActors]: https://firefox-source-docs.mozilla.org/dom/ipc/jsactors.html#jswindowactor
+[minify]: https://github.com/tdewolff/minify
+[Parent]: https://searchfox.org/mozilla-central/search?q=TranslationsParent
+[Step 3]: https://remote-settings.readthedocs.io/en/latest/getting-started.html#create-a-new-official-type-of-remote-settings
+[remote-settings-devtools]: https://github.com/mozilla-extensions/remote-settings-devtools/releases
+[Remote Settings]: https://remote-settings.readthedocs.io/en/latest/
+[toolkit/components/translations]: https://searchfox.org/mozilla-central/search?q=toolkit%2Fcomponents%2Ftranslations
+[WASM]: https://webassembly.org/
+[Workers]: https://searchfox.org/mozilla-central/search?q=%2Ftranslations.*worker&path=&case=false&regexp=true
diff --git a/toolkit/components/translations/docs/resources/03_bergamot.md b/toolkit/components/translations/docs/resources/03_bergamot.md
new file mode 100644
index 0000000000..a73eaa8393
--- /dev/null
+++ b/toolkit/components/translations/docs/resources/03_bergamot.md
@@ -0,0 +1,119 @@
+# The Bergamot Translator
+
+The [Bergamot Translator](https://github.com/browsermt/bergamot-translator) is the translations engine used to power Firefox Translations. The project configures a fork of [Marian NMT](https://marian-nmt.github.io/) that enables translations through a Wasm API.
+
+Bergamot adds a few additional pieces of code on top of the Marian code, which include HTML alignments (matching up source and target tags in a translation) and sentence iteration. It provides the [Wasm API](https://github.com/browsermt/bergamot-translator/tree/main/wasm) that Firefox uses in its own translation implementation. The Bergamot Translator uses a forked copy of the Marian NMT package in order to provide support for quantized translation models.
+
+---
+## Building Bergamot
+
+The Wasm and the JS file that integrate with Firefox can be generated using the `build-bergamot.py` script.
+
+```sh
+cd toolkit/components/translations/bergamot-translator
+./build-bergamot.py
+```
+
+A few additional options, along with up-to-date build documentation, are listed by:
+
+```sh
+./build-bergamot.py --help
+```
+
+After building, the Wasm can be loaded locally for testing by uncommenting the lines at the bottom of `toolkit/components/translations/jar.mn`. In addition, debug symbols can be built with the `--debug` option. This is useful for using the Firefox Profiler.
+
+---
+## Uploading to Remote Settings
+
+The Wasm artifact is uploaded and distributed via [Remote Settings](https://remote-settings.readthedocs.io/en/latest/index.html). An upload script is available for updating the Wasm in Remote Settings via:
+
+```sh
+cd toolkit/components/translations/bergamot-translator
+./upload-bergamot.py --help
+```
+
+The help flag will output up-to-date documentation on how to run the script. In order to do a full release:
+
+### Breaking changes
+
+If the Bergamot Translator has a breaking change, then the `BERGAMOT_MAJOR_VERSION` in `toolkit/components/translations/actors/TranslationsParent.sys.mjs` will need to be incremented by one. Any given release of Firefox will pull in minor changes when the records are updated, but major changes will need to ride the release trains.
+
+### Releasing
+
+1. Run the `./build-bergamot.py` script.
+
+1. Bump the `remote_settings.version` in `toolkit/components/translations/bergamot-translator/moz.yaml`.
+   - A minor release would be `"1.0"` ➡️ `"1.1"`.
+   - A major release would be `"1.1"` ➡️ `"2.0"`.
+
+1. Run `./upload-bergamot.py --server prod`.
+   - Follow the instructions for adding the Bearer Token.
+   - By default, new updates use JEXL filters and are filtered to just Nightly and local builds.
+
+1. Request review on the changes.
+   - Log in to the [Mozilla Corporate VPN](https://mozilla-hub.atlassian.net/wiki/spaces/IT/pages/15761733/Mozilla+Corporate+VPN)
+   - Log in to the [Remote Settings admin](https://remote-settings.mozilla.org/v1/admin)
+   - If this is a major change, then the `filter_expression` can be removed, as the change will ride the trains.
+   - Request review on the changes.
+
+1. Verify the changes on Nightly.
+   - Install the [Remote Settings Devtool](https://github.com/mozilla-extensions/remote-settings-devtools/releases).
+   - Open the Remote Settings Devtool.
+   - Switch the environment to `Prod (preview)`.
+   - Clear all local data.
+   - Restart Nightly.
+   - Verify that it is working in Nightly by triggering different translations.
+
+1. Publish to Nightly
+   - Notify release drivers (<release-drivers@mozilla.org>) that a new translation engine release is hitting Nightly (see example emails below). This is optional for a major release, since it will ride the trains.
+   - Have another team member approve the release from Remote Settings.
+
+1. Prepare to publish to Beta / Release
+   - (Do not do this step if it's a major release.)
+   - Wait a few days to verify there are no issues on Nightly.
+   - Log in to the [Remote Settings admin](https://remote-settings.mozilla.org/v1/admin)
+   - Remove the "filter_expression" text from the `bergamot-translator` version.
+   - Request review.
+   - Repeat Step 5 to verify for Beta and Release.
+
+1. Publish to Beta / Release
+   - (Do not do this step if it's a major release.)
+   - Notify release drivers (<release-drivers@mozilla.org>) that a new translation engine release is hitting Beta / Release (see example emails below).
+   - Publish the changes
+   - Monitor for any increased breakage via [telemetry](https://sql.telemetry.mozilla.org/dashboard/translations?p_date=d_last_7_days).
+
+
+### Example Nightly release email
+
+```
+Hello Release Drivers,
+
+The Translations team is releasing a new version of the translations engine via Remote
+Settings. We are releasing a test update on Nightly [Fx123], and plan to follow up on
+[DATE] with a release to both Beta [Fx123] and Release [Fx123] if we've found there are
+no issues. We can roll back the release if any unexpected issues are found.
+
+The plan for this release is available:
+
+https://firefox-source-docs.mozilla.org/toolkit/components/translations/resources/03_bergamot.html#release
+
+Thank you,
+[NAME]
+```
+
+### Example Beta / Release release email
+
+```
+Hello Release Drivers,
+
+The Translations team is moving forward with a release of a new translations engine
+to both Beta [Fx123] and Release [Fx123]. It has been in Nightly [Fx123] with no issues
+found. We can roll back the release if any unexpected issues are found.
+
+The plan for this release is available:
+
+https://firefox-source-docs.mozilla.org/toolkit/components/translations/resources/03_bergamot.html#release
+
+Thank you,
+[NAME]
+```