diff options
Diffstat (limited to 'services/sync/docs/payload-evolution.md')
-rw-r--r-- | services/sync/docs/payload-evolution.md | 168 |
1 files changed, 168 insertions, 0 deletions
diff --git a/services/sync/docs/payload-evolution.md b/services/sync/docs/payload-evolution.md new file mode 100644 index 0000000000..e195ee545d --- /dev/null +++ b/services/sync/docs/payload-evolution.md @@ -0,0 +1,168 @@ +# Handling the evolution of Sync payloads + +(Note that this document has been written in the format of an [application-services ADR](https://github.com/mozilla/application-services/blob/main/docs/adr/0000-use-markdown-architectural-decision-records.md) +but the relelvant teams decided that ultimately the best home for this doc is in mozilla-central) + +* Status: Accepted +* Deciders: sync team, credentials management team +* Date: 2023-03-15 + +Technical Story: +* https://github.com/mozilla/application-services/pull/5434 +* https://docs.google.com/document/d/1ToLOERA5HKzEzRVZNv6Ohv_2wZaujW69pVb1Kef2jNY + +## Context and Problem Statement + +Sync exists on all platforms (Desktop, Android, iOS), all channels (Nightly, Beta, Release, ESR) and is heavily used across all Firefox features. +Whenever there are feature changes or requests that potentially involve schema changes, there are not a lot of good options to ensure sync doesn’t break for any specific client. +Since sync data is synced from all channels, we need to make sure each client can handle the new data and that all channels can support the new schema. +Issues like [credit card failing on android and desktop release channels due to schema change on desktop Nightly](https://bugzilla.mozilla.org/show_bug.cgi?id=1812235) +are examples of such cases we can run into. +This document describes our decision on how we will support payload evolution over time. + +Note that even though this document exists in the application-services repository, it should +be considered to apply to all sync implementations, whether in this repository, in mozilla-central, +or anywhere else. + +## Definitions + +* A "new" Firefox installation is a version of Firefox which has a change to a Sync payload which + is not yet understood by "recent" versions. The most common example would be a Nightly version + of Firefox with a new feature not yet on the release channel. + +* A "recent" Firefox installation is a version older than a "new" version, which does not understand + or have support for new features in "new" versions, but which we still want to support without + breakage and without the user perceiving data-loss. This is typically accepted to mean the + current ESR version or later, but taking into account the slow update when new ESRs are released. + +* An "old" version is any version before what we consider "recent". + + +## Decision Drivers + +* It must be possible to change what data is carried by Sync to meet future product requirements. +* Both desktop and mobile platforms must be considered. +* We must not break "recent" Firefox installations when a "new" Firefox installation syncs, and vice-versa. +* Round-tripping data from a "new" Firefox installation through a "recent" Firefox installation must not discard any of the new data, and vice-versa. +* Some degree of breakage for "old" Firefox installations when "new" or "recent" firefoxes sync + might be considered acceptable if absolutely necessary. +* However, breakage of "new" or "recent" Firefoxes when an "old" version syncs is *not* acceptable. +* Because such evolution should be rare, we do not want to set an up-front policy about locking out + "old" versions just because they might have a problem in the future. That is, we want to avoid + a policy that dictates versions more than (say) 2 years old will break when syncing "just in case" +* Any solution to this must be achievable in a relatively short timeframe as we know of product + asks coming down the line which require this capability. + +## Considered Options + +* A backwards compatible schema policy, consisting of (a) having engines "round trip" data they + do not know about and (b) never changing the semantics of existing data. +* A policy which prevents "recent" clients from syncing, or editing data, or other restrictions. +* A formal schema-driven process. +* Consider the sync payloads frozen and never change them. +* Use separate collections for new data + +## Decision Outcome + +Chosen option: A backwards compatible schema policy because it is very flexible and the only option +meeting the decision drivers. + +## Pros and Cons of the Options + +### A backwards compatible schema policy + +A summary of this option is a policy by which: + +* Every sync engine must arrange to persist any fields from the payload which it + does not understand. The next time that engine needs to upload that record to the storage server, + it must arrange to add all such "unknown" fields back into the payload. + +* Different engines must identify different locations where this might happen. For example, the + `passwords` engine would identify the "root" of the payload, `addresses` and `creditcards` would + identify the `entry` sub-object in the payload, while the history engine would probably identify + *both* the root of the payload and the `visits` array. + +* Fields can not change type, nor be removed for a significant amount of time. This might mean + that "new" clients must support both new fields *and* fields which are considered deprecated + by these "new" clients because they are still used by "recent" versions. + +The pros and cons: + +* Good, because it meets the requirements. + +* Good, because the initial set of work identified is relatively simple to implement (that work + specifically is to support the round-tripping of "unknown" fields, in the hope that by the + time actual schema changes are proposed, this round-trip capability will then be on all "recent" + versions) + +* Bad, because the inability to deprecate or change existing fields means that + some evolution tasks become complicated. For example, consider a hypothetical change where + we wanted to change from "street/city/state" fields into a free-form "address" field. New + Firefox versions would need to populate *both* new and old fields when writing to the server, + and handle the fact that only the old versions might be updated when it sees an incoming + record written by a "recent" or "old" versions of Firefox. However, this should be rare. + +* Bad, because it's not possible to prove a proposed change meets the requirements - the policy + is informal and requires good judgement as changes are proposed. + +### A policy which prevents "recent" clients from syncing, or editing data + +Proposals which fit into this category might have been implemented by (say) adding +a version number to the schema, and if clients did not fully understand the schema it would +either prevent syncing the record, or sync it but not allow editing it, or similar. + +This was rejected because: + +* The user would certainly perceive data-loss if we ignored the incoming data entirely. +* If we still wanted older versions to "partially" see the record (eg, but disallow editing) we'd + still need most of the chosen option anyway - specifically, we could still never + deprecate fields etc. +* The UI/UX of trying to explain to the user why they can't edit a record was deemed impossible + to do in a satisfactory way. +* This would effectively penalize users who chose to use Nightly Firefoxes in any way. Simply + allowing a Nightly to sync would effectively break Release/Mobile Firefox versions. + +### A formal schema-driven process. + +Ideally we could formally describe schemas, but we can't come up with anything here which +works with the constraints of supporting older clients - we simply can't update older released +Firefoxes so they know how to work with the new schemas. We also couldn't come up with a solution +where a schema is downloaded dynamically which also allowed the *semantics* (as opposed to simply +validity) of new fields to be described. + +### Consider the sync payloads frozen and never change them. + +A process where payloads are frozen was rejected because: + +* The most naive approach here would not meet the needs of Firefox in the future. + +* A complicated system where we started creating new payload and new collections + (ie, freezing "old" schemas but then creating "new" schemas only understood by + newer clients) could not be conceived in a way that still met the requirements, + particularly around data-loss for older clients. For example, adding a credit-card + on a Nightly version but having it be completely unavailable on a release firefox + isn't acceptable. + +### Use separate collections for new data + +We could store the new data in a separate collection. For example define a +bookmarks2 collection where each record has the same guid as one in bookmarks alongside any new fields. +Newer clients use both collections to sync. + +The pros and cons: + +* Good, because it allows newer clients to sync new data without affecting recent or older clients +* Bad, because sync writes would lose atomicity without server changes. + We can currently write to a single collection in an atomic way, but don't have a way to write to multiple collections. +* Bad because this number of collections grows each time we want to add fields. +* Bad because it potentially leaks extra information to an attacker that gets access to the encrypted server records. + For example if we added a new collection for a single field, then the attacker could guess if that + field was set or not based on the size of the encrypted record. +* Bad because it's difficult to handle nested data with this approach, + for example adding a field to a history record visit. +* Bad because it has the same issue with dependent data as the chosen solution. + +## Links <!-- optional --> + +* This document was originally [brain-stormed in this google docs document](https://docs.google.com/document/d/1ToLOERA5HKzEzRVZNv6Ohv_2wZaujW69pVb1Kef2jNY), + which may be of interest for historical context, but should not be considered part of this ADR. |