Diffstat (limited to 'intl/icu/source/test/testdata/localeMatcherTest.txt')
-rw-r--r-- | intl/icu/source/test/testdata/localeMatcherTest.txt | 1960 |
1 files changed, 1960 insertions, 0 deletions
diff --git a/intl/icu/source/test/testdata/localeMatcherTest.txt b/intl/icu/source/test/testdata/localeMatcherTest.txt
new file mode 100644
index 0000000000..9d92efd232
--- /dev/null
+++ b/intl/icu/source/test/testdata/localeMatcherTest.txt
@@ -0,0 +1,1960 @@
+# © 2017 and later: Unicode, Inc. and others.
+# License & terms of use: http://www.unicode.org/copyright.html
+#
+# Data-driven test for the language/locale matcher.
+# Format:
+#
+# Everything after "#" is a comment.
+# ** test: This line starts a group of test cases.
+#
+# Lines starting with an '@' sign provide matcher parameters.
+# @supported=<comma-separated supported languages>
+# @default=<default language> # no value = no explicit default
+# @favor=[normal|script] # no value = no explicit setting
+# @threshold=<number 0..100> # no value = no explicit setting
+#
+# A line with ">>" is a getBestMatch() test case:
+# <comma-separated desired languages> >> match | desired | combined
+# - match is the expected best supported language
+# - desired is the expected best desired language
+# - combined is the expected result of combine(match, desired)
+# An expected language can be "null" to check for the matcher returning null.
+# An empty or omitted value is not tested. (Omitted = not even the '|' separator.)
+#
+# ** test: A new test group resets all matcher parameters.
+ +## X + +** test: testParentLocales + +# es-419, es-AR, and es-MX are in a cluster; es is in a different one + +@supported=es-419, es-ES +es-AR >> es-419 +@supported=es-ES, es-419 +es-AR >> es-419 + +@supported=es-419, es +es-AR >> es-419 +@supported=es, es-419 +es-AR >> es-419 + +@supported=es-MX, es +es-AR >> es-MX +@supported=es, es-MX +es-AR >> es-MX + +# en-GB, en-AU, and en-NZ are in a cluster; en in a different one + +@supported=en-GB, en-US +en-AU >> en-GB +@supported=en-US, en-GB +en-AU >> en-GB + +@supported=en-GB, en +en-AU >> en-GB +@supported=en, en-GB +en-AU >> en-GB + +@supported=en-NZ, en-US +en-AU >> en-NZ +@supported=en-US, en-NZ +en-AU >> en-NZ + +@supported=en-NZ, en +en-AU >> en-NZ +@supported=en, en-NZ +en-AU >> en-NZ + +# pt-AU and pt-PT in one cluster; pt-BR in another + +@supported=pt-PT, pt-BR +pt-AO >> pt-PT +@supported=pt-BR, pt-PT +pt-AO >> pt-PT + +@supported=pt-PT, pt +pt-AO >> pt-PT +@supported=pt, pt-PT +pt-AO >> pt-PT + +@supported=zh-MO, zh-TW +zh-HK >> zh-MO +@supported=zh-TW, zh-MO +zh-HK >> zh-MO + +@supported=zh-MO, zh-CN +zh-HK >> zh-MO +@supported=zh-CN, zh-MO +zh-HK >> zh-MO + +@supported=zh-MO, zh +zh-HK >> zh-MO +@supported=zh, zh-MO +zh-HK >> zh-MO + +@favor=script +@supported=es-419, es-ES +es-AR >> es-419 +@supported=es-ES, es-419 +es-AR >> es-419 +@supported=es-419, es +es-AR >> es-419 +@supported=es, es-419 +es-AR >> es-419 +@supported=es-MX, es +es-AR >> es-MX +@supported=es, es-MX +es-AR >> es-MX +@supported=en-GB, en-US +en-AU >> en-GB +@supported=en-US, en-GB +en-AU >> en-GB +@supported=en-GB, en +en-AU >> en-GB +@supported=en, en-GB +en-AU >> en-GB +@supported=en-NZ, en-US +en-AU >> en-NZ +@supported=en-US, en-NZ +en-AU >> en-NZ +@supported=en-NZ, en +en-AU >> en-NZ +@supported=en, en-NZ +en-AU >> en-NZ +@supported=pt-PT, pt-BR +pt-AO >> pt-PT +@supported=pt-BR, pt-PT +pt-AO >> pt-PT +@supported=pt-PT, pt +pt-AO >> pt-PT +@supported=pt, pt-PT +pt-AO >> pt-PT +@supported=zh-MO, zh-TW +zh-HK >> zh-MO 
+@supported=zh-TW, zh-MO +zh-HK >> zh-MO +@supported=zh-MO, zh-CN +zh-HK >> zh-MO +@supported=zh-CN, zh-MO +zh-HK >> zh-MO +@supported=zh-MO, zh +zh-HK >> zh-MO +@supported=zh, zh-MO +zh-HK >> zh-MO + +** test: testChinese + +@supported=zh-CN, zh-TW, iw +zh-Hant-TW >> zh-TW +zh-Hant >> zh-TW +zh-TW >> zh-TW +zh-Hans-CN >> zh-CN +zh-CN >> zh-CN +zh >> zh-CN + +@favor=script +zh-Hant-TW >> zh-TW +zh-Hant >> zh-TW +zh-TW >> zh-TW +zh-Hans-CN >> zh-CN +zh-CN >> zh-CN +zh >> zh-CN + +** test: testenGB + +@supported=fr, en, en-GB, es-419, es-MX, es +en-NZ >> en-GB +es-ES >> es +es-AR >> es-419 +es-MX >> es-MX + +@favor=script +en-NZ >> en-GB +es-ES >> es +es-AR >> es-419 +es-MX >> es-MX + +** test: testFallbacks + +@supported=91, en, hi +sa >> hi + +@favor=script +sa >> hi + +** test: testBasics + +@supported=fr, en-GB, en +en-GB >> en-GB +en >> en +fr >> fr +ja >> fr # return first if no match + +@favor=script +en-GB >> en-GB +en >> en +fr >> fr +ja >> fr + +** test: testFallback + +# check that script fallbacks are handled right + +@supported=zh-CN, zh-TW, iw +zh-Hant >> zh-TW +zh >> zh-CN +zh-Hans-CN >> zh-CN +zh-Hant-HK >> zh-TW +he-IT >> iw + +@favor=script +zh-Hant >> zh-TW +zh >> zh-CN +zh-Hans-CN >> zh-CN +zh-Hant-HK >> zh-TW +he-IT >> iw + +** test: testSpecials + +# check that nearby languages are handled + +@supported=en, fil, ro, nn +tl >> fil +mo >> ro +nb >> nn + +# make sure default works + +ja >> en + +@favor=script +tl >> fil +mo >> ro +nb >> nn +ja >> en + +** test: testRegionalSpecials + +# verify that en-AU is closer to en-GB than to en (which is en-US) + +@supported=en, en-GB, es, es-419 +es-MX >> es-419 +en-AU >> en-GB +es-ES >> es + +@favor=script +es-MX >> es-419 +en-AU >> en-GB +es-ES >> es + +** test: testHK + +# HK and MO are closer to each other for Hant than to TW + +@supported=zh, zh-TW, zh-MO +zh-HK >> zh-MO +@supported=zh, zh-TW, zh-HK +zh-MO >> zh-HK + +@favor=script +@supported=zh, zh-TW, zh-MO +zh-HK >> zh-MO +@supported=zh, zh-TW, 
zh-HK +zh-MO >> zh-HK + +** test: testMatch-matchOnMazimized + +@supported=zh, zh-Hant +und-TW >> zh-Hant # und-TW should be closer to zh-Hant than to zh + +@supported=en-Hant-TW, und-TW +zh-Hant >> und-TW # zh-Hant should be closer to und-TW than to en-Hant-TW +zh >> en-Hant-TW # no match so get first + +@favor=script +@supported=zh, zh-Hant +und-TW >> zh-Hant +@supported=en-Hant-TW, und-TW +zh-Hant >> und-TW +zh >> en-Hant-TW # no match so get first + +** test: testMatchLegacyCode + +@supported=fr, i-klingon, en-Latn-US +en-GB-oed >> en-Latn-US + +@favor=script +en-GB-oed >> en-Latn-US + +** test: testGetBestMatchForList-exactMatch +@supported=fr, en-GB, ja, es-ES, es-MX +ja, de >> ja + +@favor=script +ja, de >> ja + +** test: testGetBestMatchForList-simpleVariantMatch +@supported=fr, en-GB, ja, es-ES, es-MX +de, en-US >> en-GB # Intentionally avoiding a perfect-match or two candidates for variant matches. + +# Fallback. + +de, zh >> fr + +@favor=script +de, en-US >> en-GB +de, zh >> fr + +** test: testGetBestMatchForList-matchOnMaximized +# Check that if the preference is maximized already, it works as well. + +@supported=en, ja +ja-Jpan-JP, en-AU >> ja # Match for ja-Jpan-JP (maximized already) + +# ja-JP matches ja on likely subtags, and it's listed first, thus it wins over the second preference en-GB. + +ja-JP, en-US >> ja # Match for ja-Jpan-JP (maximized already) + +# Check that if the preference is maximized already, it works as well. + +ja-Jpan-JP, en-US >> ja # Match for ja-Jpan-JP (maximized already) + +@favor=script +ja-Jpan-JP, en-AU >> ja +ja-JP, en-US >> ja +ja-Jpan-JP, en-US >> ja + +** test: testGetBestMatchForList-noMatchOnMaximized +# Regression test for http://b/5714572 . +# de maximizes to de-DE. Pick the exact match for the secondary language instead. 
+@supported=en, de, fr, ja +de-CH, fr >> de + +@favor=script +de-CH, fr >> de + +** test: testBestMatchForTraditionalChinese + +# Scenario: An application that only supports Simplified Chinese (and some other languages), +# but does not support Traditional Chinese. zh-Hans-CN could be replaced with zh-CN, zh, or +# zh-Hans, it wouldn't make much of a difference. + +# The script distance (simplified vs. traditional Han) is considered small enough +# to be an acceptable match. The regional difference is considered almost insignificant. + +@supported=fr, zh-Hans-CN, en-US +zh-TW >> fr # no match so get first +zh-Hant >> fr # no match so get first + +# For geopolitical reasons, you might want to avoid a zh-Hant -> zh-Hans match. +# In this case, if zh-TW, zh-HK or a tag starting with zh-Hant is requested, you can +# change your call to getBestMatch to include a 2nd language preference. +# "en" is a better match since its distance to "en-US" is closer than the distance +# from "zh-TW" to "zh-CN" (script distance). + +zh-TW, en >> en-US +zh-Hant-CN, en >> en-US +zh-Hans, en >> zh-Hans-CN + +@favor=script +zh-TW >> fr # no match so get first +zh-Hant >> fr # no match so get first +zh-TW, en >> en-US +zh-Hant-CN, en >> en-US +zh-Hans, en >> zh-Hans-CN + +** test: testUndefined +# When the undefined language doesn't match anything in the list, +# getBestMatch returns the default, as usual. + +@supported=it, fr +und >> it + +# When it *does* occur in the list, bestMatch returns it, as expected. +@supported=it, und +und >> und + +# The unusual part: max("und") = "en-Latn-US", and since matching is based on maximized +# tags, the undefined language would normally match English. But that would produce the +# counterintuitive results that getBestMatch("und", XLocaleMatcher("it,en")) would be "en", and +# getBestMatch("en", XLocaleMatcher("it,und")) would be "und". + +# To avoid that, we change the matcher's definitions of max +# so that max("und")="und". 
That produces the following, more desirable +# results: + +@supported=it, en +und >> it +@supported=it, und +en >> it + +@favor=script +@supported=it, fr +und >> it +@supported=it, und +und >> und +@supported=it, en +und >> it +@supported=it, und +en >> it + +** test: testGetBestMatch-regionDistance + +@supported=es-AR, es +es-MX >> es-AR +@supported=fr, en, en-GB +en-CA >> en +@supported=de-AT, de-DE, de-CH +de >> de-DE + +@favor=script +@supported=es-AR, es +es-MX >> es-AR +@supported=fr, en, en-GB +en-CA >> en +@supported=de-AT, de-DE, de-CH +de >> de-DE + +** test: testAsymmetry + +@supported=mul, nl +af >> nl # af => nl +@supported=mul, af +nl >> mul # but nl !=> af + +@favor=script +@supported=mul, nl +af >> nl +@supported=mul, af +nl >> mul + +** test: testGetBestMatchForList-matchOnMaximized2 + +# ja-JP matches ja on likely subtags, and it's listed first, thus it wins over the second preference en-GB. + +@supported=fr, en-GB, ja, es-ES, es-MX +ja-JP, en-GB >> ja # Match for ja-JP, with likely region subtag + +# Check that if the preference is maximized already, it works as well. + +ja-Jpan-JP, en-GB >> ja # Match for ja-Jpan-JP (maximized already) + +@favor=script +ja-JP, en-GB >> ja +ja-Jpan-JP, en-GB >> ja + +** test: testGetBestMatchForList-closeEnoughMatchOnMaximized + +@supported=en-GB, en, de, fr, ja +de-CH, fr >> de +en-US, ar, nl, de, ja >> en + +@favor=script +de-CH, fr >> de +en-US, ar, nl, de, ja >> en + +** test: testGetBestMatchForPortuguese + +# pt might be supported and not pt-PT + +# European user who prefers Spanish over Brazilian Portuguese as a fallback. + +@supported=pt-PT, pt-BR, es, es-419 +pt-PT, es, pt >> pt-PT +@supported=pt-PT, pt, es, es-419 +pt-PT, es, pt >> pt-PT # pt implicit + +# Brazilian user who prefers South American Spanish over European Portuguese as a fallback. +# The asymmetry between this case and above is because it's "pt-PT" that's missing between the +# matchers as "pt-BR" is a much more common language. 
+ +@supported=pt-PT, pt-BR, es, es-419 +pt, es-419, pt-PT >> pt-BR +pt-PT, es, pt >> pt-PT +@supported=pt-PT, pt, es, es-419 +pt-PT, es, pt >> pt-PT +pt, es-419, pt-PT >> pt + +@supported=pt-BR, es, es-419 +pt, es-419, pt-PT >> pt-BR + +# Code that adds the user's country can get "pt-US" for a user's language. +# That should fall back to "pt-BR". + +@supported=pt-PT, pt-BR, es, es-419 +pt-US, pt-PT >> pt-BR +@supported=pt-PT, pt, es, es-419 +pt-US, pt-PT, pt >> pt # pt-BR implicit + +@favor=script +@supported=pt-PT, pt-BR, es, es-419 +pt-PT, es, pt >> pt-PT +@supported=pt-PT, pt, es, es-419 +pt-PT, es, pt >> pt-PT + +@supported=pt-PT, pt-BR, es, es-419 +pt, es-419, pt-PT >> pt-BR +pt-PT, es, pt >> pt-PT +@supported=pt-PT, pt, es, es-419 +pt-PT, es, pt >> pt-PT +pt, es-419, pt-PT >> pt + +@supported=pt-BR, es, es-419 +pt, es-419, pt-PT >> pt-BR + +@supported=pt-PT, pt-BR, es, es-419 +pt-US, pt-PT >> pt-BR +@supported=pt-PT, pt, es, es-419 +pt-US, pt-PT, pt >> pt + +** test: testVariantWithScriptMatch 1 and 2 + +@supported=fr, en, sv +en-GB >> en +@supported=en, sv +en-GB, sv >> en + +@favor=script +@supported=fr, en, sv +en-GB >> en +@supported=en, sv +en-GB, sv >> en + +** test: testLongLists + +@supported=en, sv +sv >> sv + +@supported=af, am, ar, az, be, bg, bn, bs, ca, cs, cy, da, de, el, en, en-GB, es, es-419, et, eu, fa, fi, fil, fr, ga, gl, gu, hi, hr, hu, hy, id, is, it, iw, ja, ka, kk, km, kn, ko, ky, lo, lt, lv, mk, ml, mn, mr, ms, my, ne, nl, no, pa, pl, pt, pt-PT, ro, ru, si, sk, sl, sq, sr, sr-Latn, sv, sw, ta, te, th, tr, uk, ur, uz, vi, zh-CN, zh-TW, zu +sv >> sv + +@supported=af, af-NA, af-ZA, agq, agq-CM, ak, ak-GH, am, am-ET, ar, ar-001, ar-AE, ar-BH, ar-DJ, ar-DZ, ar-EG, ar-EH, ar-ER, ar-IL, ar-IQ, ar-JO, ar-KM, ar-KW, ar-LB, ar-LY, ar-MA, ar-MR, ar-OM, ar-PS, ar-QA, ar-SA, ar-SD, ar-SO, ar-SS, ar-SY, ar-TD, ar-TN, ar-YE, as, as-IN, asa, asa-TZ, ast, ast-ES, az, az-Cyrl, az-Cyrl-AZ, az-Latn, az-Latn-AZ, bas, bas-CM, be, be-BY, bem, bem-ZM, bez, 
bez-TZ, bg, bg-BG, bm, bm-ML, bn, bn-BD, bn-IN, bo, bo-CN, bo-IN, br, br-FR, brx, brx-IN, bs, bs-Cyrl, bs-Cyrl-BA, bs-Latn, bs-Latn-BA, ca, ca-AD, ca-ES, ca-ES-VALENCIA, ca-FR, ca-IT, ce, ce-RU, cgg, cgg-UG, chr, chr-US, ckb, ckb-IQ, ckb-IR, cs, cs-CZ, cu, cu-RU, cy, cy-GB, da, da-DK, da-GL, dav, dav-KE, de, de-AT, de-BE, de-CH, de-DE, de-LI, de-LU, dje, dje-NE, dsb, dsb-DE, dua, dua-CM, dyo, dyo-SN, dz, dz-BT, ebu, ebu-KE, ee, ee-GH, ee-TG, el, el-CY, el-GR, en, en-001, en-150, en-AG, en-AI, en-AS, en-AT, en-AU, en-BB, en-BE, en-BI, en-BM, en-BS, en-BW, en-BZ, en-CA, en-CC, en-CH, en-CK, en-CM, en-CX, en-CY, en-DE, en-DG, en-DK, en-DM, en-ER, en-FI, en-FJ, en-FK, en-FM, en-GB, en-GD, en-GG, en-GH, en-GI, en-GM, en-GU, en-GY, en-HK, en-IE, en-IL, en-IM, en-IN, en-IO, en-JE, en-JM, en-KE, en-KI, en-KN, en-KY, en-LC, en-LR, en-LS, en-MG, en-MH, en-MO, en-MP, en-MS, en-MT, en-MU, en-MW, en-MY, en-NA, en-NF, en-NG, en-NL, en-NR, en-NU, en-NZ, en-PG, en-PH, en-PK, en-PN, en-PR, en-PW, en-RW, en-SB, en-SC, en-SD, en-SE, en-SG, en-SH, en-SI, en-SL, en-SS, en-SX, en-SZ, en-TC, en-TK, en-TO, en-TT, en-TV, en-TZ, en-UG, en-UM, en-US, en-US-POSIX, en-VC, en-VG, en-VI, en-VU, en-WS, en-ZA, en-ZM, en-ZW, eo, eo-001, es, es-419, es-AR, es-BO, es-CL, es-CO, es-CR, es-CU, es-DO, es-EA, es-EC, es-ES, es-GQ, es-GT, es-HN, es-IC, es-MX, es-NI, es-PA, es-PE, es-PH, es-PR, es-PY, es-SV, es-US, es-UY, es-VE, et, et-EE, eu, eu-ES, ewo, ewo-CM, fa, fa-AF, fa-IR, ff, ff-CM, ff-GN, ff-MR, ff-SN, fi, fi-FI, fil, fil-PH, fo, fo-DK, fo-FO, fr, fr-BE, fr-BF, fr-BI, fr-BJ, fr-BL, fr-CA, fr-CD, fr-CF, fr-CG, fr-CH, fr-CI, fr-CM, fr-DJ, fr-DZ, fr-FR, fr-GA, fr-GF, fr-GN, fr-GP, fr-GQ, fr-HT, fr-KM, fr-LU, fr-MA, fr-MC, fr-MF, fr-MG, fr-ML, fr-MQ, fr-MR, fr-MU, fr-NC, fr-NE, fr-PF, fr-PM, fr-RE, fr-RW, fr-SC, fr-SN, fr-SY, fr-TD, fr-TG, fr-TN, fr-VU, fr-WF, fr-YT, fur, fur-IT, fy, fy-NL, ga, ga-IE, gd, gd-GB, gl, gl-ES, gsw, gsw-CH, gsw-FR, gsw-LI, gu, gu-IN, guz, guz-KE, gv, gv-IM, ha, ha-GH, 
ha-NE, ha-NG, haw, haw-US, he, he-IL, hi, hi-IN, hr, hr-BA, hr-HR, hsb, hsb-DE, hu, hu-HU, hy, hy-AM, id, id-ID, ig, ig-NG, ii, ii-CN, is, is-IS, it, it-CH, it-IT, it-SM, ja, ja-JP, jgo, jgo-CM, jmc, jmc-TZ, ka, ka-GE, kab, kab-DZ, kam, kam-KE, kde, kde-TZ, kea, kea-CV, khq, khq-ML, ki, ki-KE, kk, kk-KZ, kkj, kkj-CM, kl, kl-GL, kln, kln-KE, km, km-KH, kn, kn-IN, ko, ko-KP, ko-KR, kok, kok-IN, ks, ks-IN, ksb, ksb-TZ, ksf, ksf-CM, ksh, ksh-DE, kw, kw-GB, ky, ky-KG, lag, lag-TZ, lb, lb-LU, lg, lg-UG, lkt, lkt-US, ln, ln-AO, ln-CD, ln-CF, ln-CG, lo, lo-LA, lrc, lrc-IQ, lrc-IR, lt, lt-LT, lu, lu-CD, luo, luo-KE, luy, luy-KE, lv, lv-LV, mas, mas-KE, mas-TZ, mer, mer-KE, mfe, mfe-MU, mg, mg-MG, mgh, mgh-MZ, mgo, mgo-CM, mk, mk-MK, ml, ml-IN, mn, mn-MN, mr, mr-IN, ms, ms-BN, ms-MY, ms-SG, mt, mt-MT, mua, mua-CM, my, my-MM, mzn, mzn-IR, naq, naq-NA, nb, nb-NO, nb-SJ, nd, nd-ZW, ne, ne-IN, ne-NP, nl, nl-AW, nl-BE, nl-BQ, nl-CW, nl-NL, nl-SR, nl-SX, nmg, nmg-CM, nn, nn-NO, nnh, nnh-CM, nus, nus-SS, nyn, nyn-UG, om, om-ET, om-KE, or, or-IN, os, os-GE, os-RU, pa, pa-Arab, pa-Arab-PK, pa-Guru, pa-Guru-IN, pl, pl-PL, prg, prg-001, ps, ps-AF, pt, pt-AO, pt-BR, pt-CV, pt-GW, pt-MO, pt-MZ, pt-PT, pt-ST, pt-TL, qu, qu-BO, qu-EC, qu-PE, rm, rm-CH, rn, rn-BI, ro, ro-MD, ro-RO, rof, rof-TZ, root, ru, ru-BY, ru-KG, ru-KZ, ru-MD, ru-RU, ru-UA, rw, rw-RW, rwk, rwk-TZ, sah, sah-RU, saq, saq-KE, sbp, sbp-TZ, se, se-FI, se-NO, se-SE, seh, seh-MZ, ses, ses-ML, sg, sg-CF, shi, shi-Latn, shi-Latn-MA, shi-Tfng, shi-Tfng-MA, si, si-LK, sk, sk-SK, sl, sl-SI, smn, smn-FI, sn, sn-ZW, so, so-DJ, so-ET, so-KE, so-SO, sq, sq-AL, sq-MK, sq-XK, sr, sr-Cyrl, sr-Cyrl-BA, sr-Cyrl-ME, sr-Cyrl-RS, sr-Cyrl-XK, sr-Latn, sr-Latn-BA, sr-Latn-ME, sr-Latn-RS, sr-Latn-XK, sv, sv-AX, sv-FI, sv-SE, sw, sw-CD, sw-KE, sw-TZ, sw-UG, ta, ta-IN, ta-LK, ta-MY, ta-SG, te, te-IN, teo, teo-KE, teo-UG, th, th-TH, ti, ti-ER, ti-ET, tk, tk-TM, to, to-TO, tr, tr-CY, tr-TR, twq, twq-NE, tzm, tzm-MA, ug, ug-CN, uk, uk-UA, ur, ur-IN, 
ur-PK, uz, uz-Arab, uz-Arab-AF, uz-Cyrl, uz-Cyrl-UZ, uz-Latn, uz-Latn-UZ, vai, vai-Latn, vai-Latn-LR, vai-Vaii, vai-Vaii-LR, vi, vi-VN, vo, vo-001, vun, vun-TZ, wae, wae-CH, xog, xog-UG, yav, yav-CM, yi, yi-001, yo, yo-BJ, yo-NG, zgh, zgh-MA, zh, zh-Hans, zh-Hans-CN, zh-Hans-HK, zh-Hans-MO, zh-Hans-SG, zh-Hant, zh-Hant-HK, zh-Hant-MO, zh-Hant-TW, zu, zu-ZA +sv >> sv + +@favor=script +@supported=en, sv +sv >> sv + +@supported=af, am, ar, az, be, bg, bn, bs, ca, cs, cy, da, de, el, en, en-GB, es, es-419, et, eu, fa, fi, fil, fr, ga, gl, gu, hi, hr, hu, hy, id, is, it, iw, ja, ka, kk, km, kn, ko, ky, lo, lt, lv, mk, ml, mn, mr, ms, my, ne, nl, no, pa, pl, pt, pt-PT, ro, ru, si, sk, sl, sq, sr, sr-Latn, sv, sw, ta, te, th, tr, uk, ur, uz, vi, zh-CN, zh-TW, zu +sv >> sv + +@supported=af, af-NA, af-ZA, agq, agq-CM, ak, ak-GH, am, am-ET, ar, ar-001, ar-AE, ar-BH, ar-DJ, ar-DZ, ar-EG, ar-EH, ar-ER, ar-IL, ar-IQ, ar-JO, ar-KM, ar-KW, ar-LB, ar-LY, ar-MA, ar-MR, ar-OM, ar-PS, ar-QA, ar-SA, ar-SD, ar-SO, ar-SS, ar-SY, ar-TD, ar-TN, ar-YE, as, as-IN, asa, asa-TZ, ast, ast-ES, az, az-Cyrl, az-Cyrl-AZ, az-Latn, az-Latn-AZ, bas, bas-CM, be, be-BY, bem, bem-ZM, bez, bez-TZ, bg, bg-BG, bm, bm-ML, bn, bn-BD, bn-IN, bo, bo-CN, bo-IN, br, br-FR, brx, brx-IN, bs, bs-Cyrl, bs-Cyrl-BA, bs-Latn, bs-Latn-BA, ca, ca-AD, ca-ES, ca-ES-VALENCIA, ca-FR, ca-IT, ce, ce-RU, cgg, cgg-UG, chr, chr-US, ckb, ckb-IQ, ckb-IR, cs, cs-CZ, cu, cu-RU, cy, cy-GB, da, da-DK, da-GL, dav, dav-KE, de, de-AT, de-BE, de-CH, de-DE, de-LI, de-LU, dje, dje-NE, dsb, dsb-DE, dua, dua-CM, dyo, dyo-SN, dz, dz-BT, ebu, ebu-KE, ee, ee-GH, ee-TG, el, el-CY, el-GR, en, en-001, en-150, en-AG, en-AI, en-AS, en-AT, en-AU, en-BB, en-BE, en-BI, en-BM, en-BS, en-BW, en-BZ, en-CA, en-CC, en-CH, en-CK, en-CM, en-CX, en-CY, en-DE, en-DG, en-DK, en-DM, en-ER, en-FI, en-FJ, en-FK, en-FM, en-GB, en-GD, en-GG, en-GH, en-GI, en-GM, en-GU, en-GY, en-HK, en-IE, en-IL, en-IM, en-IN, en-IO, en-JE, en-JM, en-KE, en-KI, en-KN, en-KY, en-LC, 
en-LR, en-LS, en-MG, en-MH, en-MO, en-MP, en-MS, en-MT, en-MU, en-MW, en-MY, en-NA, en-NF, en-NG, en-NL, en-NR, en-NU, en-NZ, en-PG, en-PH, en-PK, en-PN, en-PR, en-PW, en-RW, en-SB, en-SC, en-SD, en-SE, en-SG, en-SH, en-SI, en-SL, en-SS, en-SX, en-SZ, en-TC, en-TK, en-TO, en-TT, en-TV, en-TZ, en-UG, en-UM, en-US, en-US-POSIX, en-VC, en-VG, en-VI, en-VU, en-WS, en-ZA, en-ZM, en-ZW, eo, eo-001, es, es-419, es-AR, es-BO, es-CL, es-CO, es-CR, es-CU, es-DO, es-EA, es-EC, es-ES, es-GQ, es-GT, es-HN, es-IC, es-MX, es-NI, es-PA, es-PE, es-PH, es-PR, es-PY, es-SV, es-US, es-UY, es-VE, et, et-EE, eu, eu-ES, ewo, ewo-CM, fa, fa-AF, fa-IR, ff, ff-CM, ff-GN, ff-MR, ff-SN, fi, fi-FI, fil, fil-PH, fo, fo-DK, fo-FO, fr, fr-BE, fr-BF, fr-BI, fr-BJ, fr-BL, fr-CA, fr-CD, fr-CF, fr-CG, fr-CH, fr-CI, fr-CM, fr-DJ, fr-DZ, fr-FR, fr-GA, fr-GF, fr-GN, fr-GP, fr-GQ, fr-HT, fr-KM, fr-LU, fr-MA, fr-MC, fr-MF, fr-MG, fr-ML, fr-MQ, fr-MR, fr-MU, fr-NC, fr-NE, fr-PF, fr-PM, fr-RE, fr-RW, fr-SC, fr-SN, fr-SY, fr-TD, fr-TG, fr-TN, fr-VU, fr-WF, fr-YT, fur, fur-IT, fy, fy-NL, ga, ga-IE, gd, gd-GB, gl, gl-ES, gsw, gsw-CH, gsw-FR, gsw-LI, gu, gu-IN, guz, guz-KE, gv, gv-IM, ha, ha-GH, ha-NE, ha-NG, haw, haw-US, he, he-IL, hi, hi-IN, hr, hr-BA, hr-HR, hsb, hsb-DE, hu, hu-HU, hy, hy-AM, id, id-ID, ig, ig-NG, ii, ii-CN, is, is-IS, it, it-CH, it-IT, it-SM, ja, ja-JP, jgo, jgo-CM, jmc, jmc-TZ, ka, ka-GE, kab, kab-DZ, kam, kam-KE, kde, kde-TZ, kea, kea-CV, khq, khq-ML, ki, ki-KE, kk, kk-KZ, kkj, kkj-CM, kl, kl-GL, kln, kln-KE, km, km-KH, kn, kn-IN, ko, ko-KP, ko-KR, kok, kok-IN, ks, ks-IN, ksb, ksb-TZ, ksf, ksf-CM, ksh, ksh-DE, kw, kw-GB, ky, ky-KG, lag, lag-TZ, lb, lb-LU, lg, lg-UG, lkt, lkt-US, ln, ln-AO, ln-CD, ln-CF, ln-CG, lo, lo-LA, lrc, lrc-IQ, lrc-IR, lt, lt-LT, lu, lu-CD, luo, luo-KE, luy, luy-KE, lv, lv-LV, mas, mas-KE, mas-TZ, mer, mer-KE, mfe, mfe-MU, mg, mg-MG, mgh, mgh-MZ, mgo, mgo-CM, mk, mk-MK, ml, ml-IN, mn, mn-MN, mr, mr-IN, ms, ms-BN, ms-MY, ms-SG, mt, mt-MT, mua, mua-CM, my, my-MM, mzn, 
mzn-IR, naq, naq-NA, nb, nb-NO, nb-SJ, nd, nd-ZW, ne, ne-IN, ne-NP, nl, nl-AW, nl-BE, nl-BQ, nl-CW, nl-NL, nl-SR, nl-SX, nmg, nmg-CM, nn, nn-NO, nnh, nnh-CM, nus, nus-SS, nyn, nyn-UG, om, om-ET, om-KE, or, or-IN, os, os-GE, os-RU, pa, pa-Arab, pa-Arab-PK, pa-Guru, pa-Guru-IN, pl, pl-PL, prg, prg-001, ps, ps-AF, pt, pt-AO, pt-BR, pt-CV, pt-GW, pt-MO, pt-MZ, pt-PT, pt-ST, pt-TL, qu, qu-BO, qu-EC, qu-PE, rm, rm-CH, rn, rn-BI, ro, ro-MD, ro-RO, rof, rof-TZ, root, ru, ru-BY, ru-KG, ru-KZ, ru-MD, ru-RU, ru-UA, rw, rw-RW, rwk, rwk-TZ, sah, sah-RU, saq, saq-KE, sbp, sbp-TZ, se, se-FI, se-NO, se-SE, seh, seh-MZ, ses, ses-ML, sg, sg-CF, shi, shi-Latn, shi-Latn-MA, shi-Tfng, shi-Tfng-MA, si, si-LK, sk, sk-SK, sl, sl-SI, smn, smn-FI, sn, sn-ZW, so, so-DJ, so-ET, so-KE, so-SO, sq, sq-AL, sq-MK, sq-XK, sr, sr-Cyrl, sr-Cyrl-BA, sr-Cyrl-ME, sr-Cyrl-RS, sr-Cyrl-XK, sr-Latn, sr-Latn-BA, sr-Latn-ME, sr-Latn-RS, sr-Latn-XK, sv, sv-AX, sv-FI, sv-SE, sw, sw-CD, sw-KE, sw-TZ, sw-UG, ta, ta-IN, ta-LK, ta-MY, ta-SG, te, te-IN, teo, teo-KE, teo-UG, th, th-TH, ti, ti-ER, ti-ET, tk, tk-TM, to, to-TO, tr, tr-CY, tr-TR, twq, twq-NE, tzm, tzm-MA, ug, ug-CN, uk, uk-UA, ur, ur-IN, ur-PK, uz, uz-Arab, uz-Arab-AF, uz-Cyrl, uz-Cyrl-UZ, uz-Latn, uz-Latn-UZ, vai, vai-Latn, vai-Latn-LR, vai-Vaii, vai-Vaii-LR, vi, vi-VN, vo, vo-001, vun, vun-TZ, wae, wae-CH, xog, xog-UG, yav, yav-CM, yi, yi-001, yo, yo-BJ, yo-NG, zgh, zgh-MA, zh, zh-Hans, zh-Hans-CN, zh-Hans-HK, zh-Hans-MO, zh-Hans-SG, zh-Hant, zh-Hant-HK, zh-Hant-MO, zh-Hant-TW, zu, zu-ZA +sv >> sv + +** test: test8288 + +@supported=it, en +und >> it +und, en >> en + +# examples from +# http://unicode.org/repos/cldr/tags/latest/common/bcp47/ +# http://unicode.org/repos/cldr/tags/latest/common/validity/variant.xml + +@favor=script +und >> it +und, en >> en + +** test: testUnHack + +@supported=en-NZ, en-IT +en-US >> en-NZ + +@favor=script +en-US >> en-NZ + +** test: testEmptySupported => null +en >> null + +# testVariantsAndExtensions + +** test: tests 
the .combine() method + +@supported=und, fr +fr-BE-fonipa >> fr | | fr-BE-fonipa +@supported=und, fr-CA +fr-BE-fonipa >> fr-CA | | fr-BE-fonipa +@supported=und, fr-fonupa +fr-BE-fonipa >> fr-fonupa | | fr-BE-fonipa +@supported=und, no +nn-BE-fonipa >> no | | no-BE-fonipa +@supported=und, en-GB-u-sd-gbsct +en-fonipa-u-nu-Arab-ca-buddhist-t-m0-iso-i0-pinyin >> en-GB-u-sd-gbsct | | en-GB-fonipa-u-nu-Arab-ca-buddhist-t-m0-iso-i0-pinyin + +@supported=en-PSCRACK, de-PSCRACK, fr-PSCRACK, pt-PT-PSCRACK +fr-PSCRACK >> fr-PSCRACK +fr >> en-PSCRACK +de-CH >> en-PSCRACK + +@favor=script +@supported=und, fr +fr-BE-fonipa >> fr +@supported=und, fr-CA +fr-BE-fonipa >> fr-CA +@supported=und, fr-fonupa +fr-BE-fonipa >> fr-fonupa +@supported=und, no +nn-BE-fonipa >> no | | no-BE-fonipa +@supported=und, en-GB-u-sd-gbsct +en-fonipa-u-nu-Arab-ca-buddhist-t-m0-iso-i0-pinyin >> en-GB-u-sd-gbsct | | en-GB-fonipa-u-nu-Arab-ca-buddhist-t-m0-iso-i0-pinyin + +@supported=en-PSCRACK, de-PSCRACK, fr-PSCRACK, pt-PT-PSCRACK +fr-PSCRACK >> fr-PSCRACK +fr >> en-PSCRACK +de-CH >> en-PSCRACK + +** test: testClusters +# we favor es-419 over others in cluster. Clusters: es- {ES, MA, EA} {419, AR, MX} + +@supported=und, es, es-MA, es-MX, es-419 +es-AR >> es-419 +@supported=und, es-MA, es, es-419, es-MX +es-AR >> es-419 +@supported=und, es, es-MA, es-MX, es-419 +es-EA >> es +@supported=und, es-MA, es, es-419, es-MX +es-EA >> es + +# of course, fall back to within cluster + +@supported=und, es, es-MA, es-MX +es-AR >> es-MX +@supported=und, es-MA, es, es-MX +es-AR >> es-MX +@supported=und, es-MA, es-MX, es-419 +es-EA >> es-MA +@supported=und, es-MA, es-419, es-MX +es-EA >> es-MA + +# we favor es-GB over others in cluster. 
Clusters: en- {US, GU, VI} {GB, IN, ZA} + +@supported=und, en, en-GU, en-IN, en-GB +en-ZA >> en-GB +@supported=und, en-GU, en, en-GB, en-IN +en-ZA >> en-GB +@supported=und, en, en-GU, en-IN, en-GB +en-VI >> en +@supported=und, en-GU, en, en-GB, en-IN +en-VI >> en + +# of course, fall back to within cluster + +@supported=und, en, en-GU, en-IN +en-ZA >> en-IN +@supported=und, en-GU, en, en-IN +en-ZA >> en-IN +@supported=und, en-GU, en-IN, en-GB +en-VI >> en-GU +@supported=und, en-GU, en-GB, en-IN +en-VI >> en-GU + +@favor=script +@supported=und, es, es-MA, es-MX, es-419 +es-AR >> es-419 +@supported=und, es-MA, es, es-419, es-MX +es-AR >> es-419 +@supported=und, es, es-MA, es-MX, es-419 +es-EA >> es +@supported=und, es-MA, es, es-419, es-MX +es-EA >> es + +@supported=und, es, es-MA, es-MX +es-AR >> es-MX +@supported=und, es-MA, es, es-MX +es-AR >> es-MX +@supported=und, es-MA, es-MX, es-419 +es-EA >> es-MA +@supported=und, es-MA, es-419, es-MX +es-EA >> es-MA + +@supported=und, en, en-GU, en-IN, en-GB +en-ZA >> en-GB +@supported=und, en-GU, en, en-GB, en-IN +en-ZA >> en-GB +@supported=und, en, en-GU, en-IN, en-GB +en-VI >> en +@supported=und, en-GU, en, en-GB, en-IN +en-VI >> en + +@supported=und, en, en-GU, en-IN +en-ZA >> en-IN +@supported=und, en-GU, en, en-IN +en-ZA >> en-IN +@supported=und, en-GU, en-IN, en-GB +en-VI >> en-GU +@supported=und, en-GU, en-GB, en-IN +en-VI >> en-GU + +** test: testThreshold +@supported=50, und, fr-CA-fonupa +@threshold=60 +fr-BE-fonipa >> fr-CA-fonupa | | fr-BE-fonipa +@supported=und, fr-Cyrl-CA-fonupa +fr-BE-fonipa >> fr-Cyrl-CA-fonupa | | fr-Cyrl-BE-fonipa +@threshold=50 +fr-BE-fonipa >> und + +@favor=script +@supported=50, und, fr-CA-fonupa +@threshold= +fr-BE-fonipa >> fr-CA-fonupa | | fr-BE-fonipa +@supported=und, fr-Cyrl-CA-fonupa +fr-BE-fonipa >> und + +** test: testScriptFirst +@supported=ru, fr +zh, pl >> ru +zh-Cyrl, pl >> ru +@supported=hr, en-Cyrl +sr >> hr +@supported=da, ru, hr +sr >> da + +@favor=script +@supported=ru, 
fr +zh, pl >> fr +zh-Cyrl, pl >> ru +@supported=hr, en-Cyrl +sr >> en-Cyrl +@supported=da, ru, hr +sr >> ru + +## III + +** test: testBasicsWithDefault +@supported=en-GB, en +@default=fr +en-GB >> en-GB +en-US >> en +fr >> fr +ja >> fr + +@favor=script +en-GB >> en-GB +en-US >> en +fr >> en +ja >> fr + +** test: testEmptyWithDefault +@default=en +fr >> en + +** test: testGetBestMatchForList_exactMatch +@supported=fr, en-GB, ja, es-ES, es-MX +ja, de >> ja + +** test: testGetBestMatchForList_simpleVariantMatch +# Intentionally avoiding a perfect-match or two candidates for variant matches. +@supported=fr, en-GB, ja, es-ES, es-MX +de, en-US >> en-GB +# Fall back. +de, zh >> fr + +** test: TestEuHack +@supported=en-NZ, en-IT +en-US >> en-NZ + +** test: TestBasics +@supported=fr, en-GB, en +en-GB >> en-GB +en-US >> en +fr-FR >> fr +ja-JP >> fr +zu >> en +# For a language that doesn't match anything, return the default. +zxx >> fr + +@favor=script +en-GB >> en-GB +en-US >> en +fr-FR >> fr +ja-JP >> fr +zu >> en +zxx >> en + +** test: TestExactMatch +@supported=fr, en-GB, ja, es-ES, es-MX +ja, de >> ja + +** test: TestSimpleVariantMatch +@supported=fr, en-GB, ja, es-ES, es-MX +de, en-US >> en-GB +de, zh >> fr + +** test: TestMatchOnMaximized +# ja-JP matches ja on likely subtags, and it's listed first, thus it wins +# over the secondary preference en-GB. +@supported=fr, en-GB, ja, es-ES, es-MX +ja-JP, en-GB >> ja +# Check that if the preference is maximized already, it works as well. +ja-Jpan-JP, en-GB >> ja +@supported=fr, zh-Hant, en +zh, en >> en + +@favor=script +zh, en >> en + +** test: TestCloseEnoughMatchOnMaximized +@supported=en-GB, en, de, fr, ja +de-CH, fr >> de +en-US, ar, nl, de, ja >> en + +** test: TestGetBestMatchForPortuguese +# 1. a supported set containing an explicit pt: {pt-PT, pt-BR, es, es-419} +# 2. a supported set containing an implicit pt: {pt-PT, pt, es, es-419} +# 3. 
a supported set containing no pt: {pt-BR, es, es-419} +# European user who prefers Spanish over Brazilian Portuguese as a fallback. +@supported=pt-PT, pt-BR, es, es-419 +pt-PT, es, pt >> pt-PT +@supported=pt-PT, pt, es, es-419 +pt-PT, es, pt >> pt-PT +@supported=pt-BR, es, es-419 +pt-PT, es, pt >> pt-BR + +# Brazilian user who prefers South American Spanish over European Portuguese +# as a fallback. The asymmetry between this case and above is because it's +# "pt-PT" that's missing between the matchers. +@supported=pt-PT, pt-BR, es, es-419 +pt, es-419, pt-PT >> pt-BR +@supported=pt-PT, pt, es, es-419 +pt, es-419, pt-PT >> pt +@supported=pt-BR, es, es-419 +pt, es-419, pt-PT >> pt-BR + +# Sometimes we get "pt-US" for a user's language (which CLDR doesn't +# recognize) but we deal with that as a synonym for "pt-BR". +@supported=pt-PT, pt-BR, es, es-419 +pt-US, pt-PT >> pt-BR +@supported=pt-PT, pt, es, es-419 +pt-US, pt-PT >> pt + +@favor=script +@supported=pt-BR, es, es-419 +pt-PT, es, pt >> pt-BR +@supported=pt-PT, pt, es, es-419 +pt-US, pt-PT >> pt + +** test: TestScriptAndRegion +@supported=en-GB, en +en-CA >> en +# fr-CA is a "close enough" match to "fr" to be returned in favor of "en-GB" +@supported=fr, en-GB, en +fr-CA, en-CA >> fr +@supported=zh-Hant, zh-TW +zh-HK >> zh-Hant + +@favor=script +@supported=en-GB, en +en-CA >> en +@supported=fr, en-GB, en +fr-CA, en-CA >> fr +@supported=zh-Hant, zh-TW +zh-HK >> zh-Hant + +** test: TestFallback +@supported=zh-CN, zh-TW, iw +zh-Hant >> zh-TW +zh >> zh-CN +zh-Hans-CN >> zh-CN +zh-Hant-HK >> zh-TW +he-IT >> iw + +** test: TestFallbackWithDefault +# Check that script fallbacks are handled right and that we don't have to +# fall back to the default. 
+@supported=zh-CN, zh-TW, iw +@default=fr +zh-Hant >> zh-TW +zh >> zh-CN +zh-Hans-CN >> zh-CN +zh-Hant-HK >> zh-TW +he-IT >> iw + +@favor=script +zh-Hant >> zh-TW +zh >> zh-CN +zh-Hans-CN >> zh-CN +zh-Hant-HK >> zh-TW +he-IT >> iw + +** test: TestSpecials +# Check that nearby languages are handled. +@supported=en, fil, ro, nn +tl >> fil +mo >> ro +nb >> nn +ja >> en # Make sure default works. + +** test: TestRegionalSpecials +# Verify that en-AU is closer to en-GB than to en (which is en-US). +@supported=en, en-GB, es-ES, es-419 +en-AU >> en-GB +# Following 2 cases test closer/smaller region difference. +es-MX >> es-419 +es-PT >> es-ES + +@favor=script +en-AU >> en-GB +es-MX >> es-419 +es-PT >> es-ES + +** test: TestEmpty +fr >> null + +** test: TestUndefined +# When the undefined language doesn't match anything in the list, +# return the default. +@supported=it, fr +und >> it +# When it *does* occur in the list, return it. +@supported=it, und +und >> und +# The unusual part: +# max("und") = "en-Latn-US", and since matching is based on +# maximized tags, the undefined language would normally match +# English. But that would produce the counterintuitive results +# that BestMatchFor("und", LanguageMatcher("it,en")) would be "en", +# and BestMatchFor("en", LanguageMatcher("it,und")) would be "und". + +# To avoid that, we change the matcher's definitions of max +# (AddLikelySubtagsWithDefaults) so that max("und")="und". 
That +# produces the following, more desirable results: +@supported=it, en +und >> it +@supported=it, und +en >> it + +** test: TestVariantWithScriptMatch +@supported=fr, en, sv +en-GB >> en +en-GB, sv >> en + +@favor=script +en-GB, sv >> en + +** test: Serbian +@supported=und, sr +sr-ME >> sr +@supported=und, sr-ME +sr >> sr-ME +@supported=und, sr-Latn +bs >> und +@supported=und, bs +sr-Latn >> und +@supported=und, sr +bs >> und +@supported=und, bs +sr >> und +@supported=und, sr-Latn +sr >> sr-Latn +@supported=und, sr +sr-Latn >> sr + +@favor=script +sr-ME >> sr +@supported=und, sr-ME +sr >> sr-ME +@supported=und, sr-Latn +bs >> sr-Latn +@supported=und, bs +sr-Latn >> bs +@supported=und, sr +bs >> und +@supported=und, bs +sr >> und +@supported=und, sr-Latn +sr >> sr-Latn +@supported=und, sr +sr-Latn >> sr + +** test: MatchGooglePrivateUseSubtag +@supported=fr, x-bork, en-Latn-US +x-piglatin >> fr +x-bork >> x-bork +@supported=fr, en-GB, x-bork, es-ES, es-419 +x-piglatin >> fr +x-bork >> x-bork + +@favor=script +@supported=fr, x-bork, en-Latn-US +x-piglatin >> x-bork +x-bork >> x-bork +@supported=fr, en-GB, x-bork, es-ES, es-419 +x-piglatin >> x-bork +x-bork >> x-bork + +** test: MatchLegacyCode +@supported=fr, i-klingon, en-Latn-US +en-GB-oed >> en-Latn-US +i-klingon >> tlh + +@favor=script +en-GB-oed >> en-Latn-US +i-klingon >> tlh + +** test: MatchGooglePseudoLocale +# Google pseudo locales using variant subtags. +# (See below for the region code based pseudo locales.) 
+@supported=fr, en-PSACCENT, ar-PSBIDI, en-PSCRACK, zh-Hans-PSCRACK, pt-PT-PSCRACK, pt +de >> fr +en-US >> fr +en >> fr +ar-PSBIDI >> ar-PSBIDI +en-PSACCENT >> en-PSACCENT +en-PSCRACK >> en-PSCRACK +pt-BR >> pt +pt-PT-PSCRACK >> pt-PT-PSCRACK +zh-Hans-PSCRACK >> zh-Hans-PSCRACK + +@favor=script +de >> fr +en-US >> fr +en >> fr +ar-PSBIDI >> ar-PSBIDI +en-PSACCENT >> en-PSACCENT +en-PSCRACK >> en-PSCRACK +pt-BR >> pt +pt-PT-PSCRACK >> pt-PT-PSCRACK +zh-Hans-PSCRACK >> zh-Hans-PSCRACK + +** test: MatchGooglePseudoLocaleWithFallbacks +# Pseudo locales based on the fall back option (XA..XC region codes). +@supported=fr, en-XA, ar-XB, en-XC, zh-Hans-XC, pt +de >> fr +en-US >> fr +en >> fr +ar-XB >> ar-XB +en-XA >> en-XA +en-XC >> en-XC +pt-BR >> pt +zh-Hans-XC >> zh-Hans-XC + +@favor=script +de >> fr +en-US >> fr +en >> fr +ar-XB >> ar-XB +en-XA >> en-XA +en-XC >> en-XC +pt-BR >> pt +zh-Hans-XC >> zh-Hans-XC + +** test: DoNotMatchGooglePseudoLocale +@supported=fr, en-XA, ar-XB, en-PSACCENT, ar-PSBIDI, en-DE, pt, ar-SY, ar-PSCRACK +de >> fr +# We wouldn't want to return pseudo locales when there's a good match for an +# ordinary locale. +# Note: If LanguageMatcher was not aware of PSACCENT, it would consider the +# distance from "en" to "en-PSACCENT" smaller than to "en-DE" (the standard +# variant distance is smaller than a region distance). +en >> en-DE +ar-EG >> ar-SY +pt-BR >> pt +ar-XB >> ar-XB +ar-PSBIDI >> ar-PSBIDI +en-XA >> en-XA +en-PSACCENT >> en-PSACCENT +ar-PSCRACK >> ar-PSCRACK + +@favor=script +de >> en-DE +en >> en-DE +ar-EG >> ar-SY +pt-BR >> pt +ar-XB >> ar-XB +ar-PSBIDI >> ar-PSBIDI +en-XA >> en-XA +en-PSACCENT >> en-PSACCENT +ar-PSCRACK >> ar-PSCRACK + +** test: BestMatchForTraditionalChinese +# Scenario: An application that only supports Simplified Chinese (and some +# other languages), but does not support Traditional Chinese. zh-Hans-CN +# could be replaced with zh-CN, zh, or zh-Hans, it wouldn't make much of a +# difference. 
+# The script distance (simplified vs. traditional Han) is considered small +# enough to be an acceptable match. The regional difference is considered +# almost insignificant. +@supported=fr, zh-Hans-CN, en-US +zh-TW >> fr # no match so get first +zh-Hant >> fr # no match so get first + +# For geopolitical reasons, you might want to avoid a zh-Hant -> zh-Hans +# match. In this case, if zh-TW, zh-HK or a tag starting with zh-Hant is +# requested, you can change your call to getBestMatch to include a 2nd +# language preference. "en" is a better match since its distance to "en-US" +# is closer than the distance from "zh-TW" to "zh-CN" (script distance). +zh-TW, en >> en-US +zh-Hant-CN, en >> en-US +zh-Hans, en >> zh-Hans-CN + +** test: MaxBeforeEquals +# Compare maximized forms of earlier items before testing equality +# of later items. +@supported=en, fr-CA +en-US, fr-CA >> en + +@favor=script +en-US, fr-CA >> en + +** test: SiblingDefaultRegion +@supported=de-AT, de-DE, de-CH +de >> de-DE + +** test: ReturnDefaultInsteadOfNullForEmptyPriorityList +@default=und +de >> und + +** test: ReturnSpecifiedDefaultForNoMatch +@supported=de, en, fr +@default=und +hi >> und + +@favor=script +hi >> und + +** test: MatchedLanguageIgnoresDefault +@supported=de, en, fr +@default=und +fr >> fr + +@favor=script +fr >> fr + +## GenX + +** test: TwoSpanishes +@supported=es, es-MX +@default=und +es-001 >> es +und >> und +ca >> es +gl-ES >> es +es >> es +es-MX >> es-MX +es-002 >> es +es-003 >> es-MX +es-005 >> es-MX +es-019 >> es-MX +es-029 >> es-MX +es-419 >> es-MX +es-142 >> es +es-150 >> es +es-AD >> es +es-AR >> es-MX +es-BO >> es-MX +es-BZ >> es-MX +es-CA >> es-MX +es-CL >> es-MX +es-CO >> es-MX +es-CR >> es-MX +es-CU >> es-MX +es-DO >> es-MX +es-EC >> es-MX +es-ES >> es +es-GI >> es +es-GQ >> es +es-GT >> es-MX +es-HN >> es-MX +es-NI >> es-MX +es-PA >> es-MX +es-PE >> es-MX +es-PH >> es +es-PR >> es-MX +es-PY >> es-MX +es-SV >> es-MX +es-US >> es-MX +es-UY >> es-MX +es-VE >> es-MX 
+ +@favor=script +es-001 >> es +und >> und +ca >> es +gl-ES >> es +es >> es +es-MX >> es-MX +es-002 >> es +es-003 >> es-MX +es-005 >> es-MX +es-019 >> es-MX +es-029 >> es-MX +es-419 >> es-MX +es-142 >> es +es-150 >> es +es-AD >> es +es-AR >> es-MX +es-BO >> es-MX +es-BZ >> es-MX +es-CA >> es-MX +es-CL >> es-MX +es-CO >> es-MX +es-CR >> es-MX +es-CU >> es-MX +es-DO >> es-MX +es-EC >> es-MX +es-ES >> es +es-GI >> es +es-GQ >> es +es-GT >> es-MX +es-HN >> es-MX +es-NI >> es-MX +es-PA >> es-MX +es-PE >> es-MX +es-PH >> es +es-PR >> es-MX +es-PY >> es-MX +es-SV >> es-MX +es-US >> es-MX +es-UY >> es-MX +es-VE >> es-MX + +** test: Three Spanishes +@supported=es, es-419, es-MX +@default=und +es-001 >> es +und >> und +ca >> es +gl-ES >> es +es >> es +es-419 >> es-419 +es-002 >> es +es-003 >> es-419 +es-005 >> es-419 +es-019 >> es-419 +es-029 >> es-419 +es-142 >> es +es-150 >> es +es-AD >> es +es-AR >> es-419 +es-BO >> es-419 +es-BZ >> es-419 +es-CA >> es-419 +es-CL >> es-419 +es-CO >> es-419 +es-CR >> es-419 +es-CU >> es-419 +es-DO >> es-419 +es-EC >> es-419 +es-ES >> es +es-GI >> es +es-GQ >> es +es-GT >> es-419 +es-HN >> es-419 +es-MX >> es-MX +es-NI >> es-419 +es-PA >> es-419 +es-PE >> es-419 +es-PH >> es +es-PR >> es-419 +es-PY >> es-419 +es-SV >> es-419 +es-US >> es-419 +es-UY >> es-419 +es-VE >> es-419 + +@favor=script +es-001 >> es +und >> und +ca >> es +gl-ES >> es +es >> es +es-419 >> es-419 +es-002 >> es +es-003 >> es-419 +es-005 >> es-419 +es-019 >> es-419 +es-029 >> es-419 +es-142 >> es +es-150 >> es +es-AD >> es +es-AR >> es-419 +es-BO >> es-419 +es-BZ >> es-419 +es-CA >> es-419 +es-CL >> es-419 +es-CO >> es-419 +es-CR >> es-419 +es-CU >> es-419 +es-DO >> es-419 +es-EC >> es-419 +es-ES >> es +es-GI >> es +es-GQ >> es +es-GT >> es-419 +es-HN >> es-419 +es-MX >> es-MX +es-NI >> es-419 +es-PA >> es-419 +es-PE >> es-419 +es-PH >> es +es-PR >> es-419 +es-PY >> es-419 +es-SV >> es-419 +es-US >> es-419 +es-UY >> es-419 +es-VE >> es-419 + +** test: Englishes 
+@supported=en-GB, en-US +@default=und +und >> und +ja >> und +fr-CA >> und + +# Great Britain fallback +en-AU >> en-GB +en-BZ >> en-GB +en-IN >> en-GB +en-IE >> en-GB +en-JM >> en-GB +en-NZ >> en-GB +en-PK >> en-GB +en-TT >> en-GB +en-ZA >> en-GB + +# United States fallback +en-CA >> en-US +en-US >> en-US +en >> en-US + +@favor=script +und >> und +ja >> und +fr-CA >> en-US +en-AU >> en-GB +en-BZ >> en-GB +en-CA >> en-US +en-IN >> en-GB +en-IE >> en-GB +en-JM >> en-GB +en-NZ >> en-GB +en-PK >> en-GB +en-TT >> en-GB +en-ZA >> en-GB +en-US >> en-US +en >> en-US + +** test: TestFallback +# manyEnMatcher +@supported=en-GB, en-US, en, en-AU +@default=und +und >> und +ja >> und +fr-CA >> und + +# nonUsMatcher +fr >> und + +# onlyAuMatcher +@supported=en-AU, ja, ca +fr >> und + +# noEnMatcher +@supported=pl, ja, ca +fr >> und + +@favor=script +@supported=en-GB, en-US, en, en-AU +und >> und +ja >> und +fr-CA >> en-US +fr >> en-US +@supported=en-AU, ja, ca +fr >> en-AU +@supported=pl, ja, ca +fr >> pl + +## Go + +** test: basics +@supported=fr, en-GB, en +en-GB >> en-GB +en-US >> en +fr-FR >> fr +ja-JP >> fr + +** test: script fallbacks +@supported=zh-CN, zh-TW, iw +zh-Hant >> zh-TW +zh >> zh-CN +zh-Hans-CN >> zh-CN +zh-Hant-HK >> zh-TW +@default=iw +he-IT >> iw + +@favor=script +he-IT >> iw + +** test: language-specific script fallbacks 1 +@supported=en, sr, nl +sr-Latn >> sr +sh >> en +hr >> en +bs >> en +nl-Cyrl >> en # Mark: Expected value should be en not sr. 
Script difference exceeds threshold, so can't be nl + +@favor=script +sr-Latn >> sr +hr >> en +bs >> en +nl-Cyrl >> sr + +** test: language-specific script fallbacks 2 +@supported=en, sr-Latn +sr >> sr-Latn +sr-Cyrl >> sr-Latn +@default=und +hr >> und + +@favor=script +@default= +sr >> sr-Latn +sr-Cyrl >> sr-Latn +@default=und +hr >> en + +** test: don't match hr to sr-Latn +@supported=en, sr-Latn +hr >> en + +@favor=script +hr >> en + +** test: both deprecated and not +@supported=fil, tl, iw, he +he-IT >> iw +he >> iw +iw >> iw +fil-IT >> fil +fil >> fil +tl >> fil + +@favor=script +he-IT >> iw +he >> iw +iw >> iw +fil-IT >> fil +fil >> fil +tl >> fil + +** test: nearby languages: Nynorsk to Bokmål +@supported=en, nb +nn >> nb + +@favor=script +nn >> nb + +** test: nearby languages: Danish does not match nn +@supported=en, nn +da >> en + +@favor=script +da >> en + +** test: nearby languages: Danish matches no +@supported=en, no +da >> no + +@favor=script +da >> no + +** test: nearby languages: Danish matches nb +@supported=en, nb +da >> nb + +** test: prefer matching languages over language variants. Get en-GB, should get nn? 
+@supported=nn, en-GB +no, en-US >> en-GB +nb, en-US >> en-GB + +@favor=script +no, en-US >> nn +nb, en-US >> nn + +** test: deprecated version is closer than same language with other differences +@supported=nl, he, en-GB +iw, en-US >> he + +@favor=script +iw, en-US >> he + +** test: macro equivalent is closer than same language with other differences +@supported=nl, zh, en-GB, no +cmn, en-US >> zh +nb, en-US >> no + +@favor=script +cmn, en-US >> zh +nb, en-US >> no + +** test: legacy equivalent is closer than same language with other differences +@supported=nl, fil, en-GB +tl, en-US >> fil + +@favor=script +tl, en-US >> fil + +** test: distinguish near equivalents +@supported=en, ro, mo, ro-MD +ro >> ro +mo >> ro # ro=mo for the locale matcher +ro-MD >> ro-MD + +@favor=script +ro >> ro +mo >> ro # ro=mo for the locale matcher +ro-MD >> ro-MD + +** test: maximization of legacy +@supported=sr-Cyrl, sr-Latn, ro, ro-MD +sh >> sr-Latn +mo >> ro + +@favor=script +sh >> sr-Latn +mo >> ro + +** test: empty +fr >> null +en >> null + +** test: private use subtags +@supported=fr, en-GB, x-bork, es-ES, es-419 +x-piglatin >> fr +x-bork >> x-bork + +** test: legacy codes +@supported=fr, i-klingon, en-Latn-US +en-GB-oed >> en-Latn-US +i-klingon >> tlh + + +** test: simple variant match +@supported=fr, en-GB, ja, es-ES, es-MX +de, en-US >> en-GB +de, zh >> fr + +** test: best match for traditional Chinese +@supported=fr, zh-Hans-CN, en-US +zh-TW >> fr # no match so get first +zh-Hant >> fr # no match so get first +zh-TW, en >> en-US +zh-Hant-CN, en >> en-US +zh-Hans, en >> zh-Hans-CN + +** test: return most originally similar among likely-subtags equivalent locales +@supported=af, af-Latn, af-Arab +af >> af +af-ZA >> af +af-Latn-ZA >> af-Latn +af-Latn >> af-Latn + +@favor=script +af >> af +af-ZA >> af +af-Latn-ZA >> af-Latn +af-Latn >> af-Latn + +@supported=nl, nl-NL, nl-BE +@favor= +nl >> nl +nl-Latn >> nl +nl-Latn-NL >> nl-NL +nl-NL >> nl-NL + +@favor=script +nl >> nl +nl-Latn 
>> nl +nl-Latn-NL >> nl-NL +nl-NL >> nl-NL + +@supported=nl, nl-Latn, nl-NL, nl-BE +@favor= +nl >> nl +nl-Latn >> nl-Latn +nl-NL >> nl-NL +nl-Latn-NL >> nl-Latn + +@favor=script +nl >> nl +nl-Latn >> nl-Latn +nl-NL >> nl-NL +nl-Latn-NL >> nl-Latn + +** test: region may replace matched if matched is enclosing +@supported=es-419, es +@default=es-MX +es-MX >> es-419 +@default= +es-SG >> es + +@favor=script +@default=es-MX +es-MX >> es-419 +@default= +es-SG >> es + +** test: region distance Portuguese +@supported=pt, pt-PT +pt-ES >> pt-PT + +@favor=script +pt-ES >> pt-PT + +** test: if no preferred locale specified, pick top language, not regional +@supported=en, fr, fr-CA, fr-CH +fr-US >> fr + +@favor=script +fr-US >> fr + +** test: region distance German +@supported=de-AT, de-DE, de-CH +de >> de-DE + +** test: en-AU is closer to en-GB than to en (which is en-US) +@supported=en, en-GB, es-ES, es-419 +en-AU >> en-GB +@default=es-MX +es-MX >> es-419 +@default= +es-PT >> es-ES + +@favor=script +en-AU >> en-GB +es-MX >> es-419 +@default= +es-PT >> es-ES + +** test: undefined +@supported=it, fr +und >> it + +** test: und does not match en +@supported=it, en +und >> it + +** test: undefined in priority list +@supported=it, und +und >> und +en >> it + +** test: undefined +@supported=it, fr, zh +und-FR >> fr +und-CN >> zh +und-Hans >> zh +und-Hant >> it # no match so get first +und-Latn >> it + +@favor=script +und-FR >> fr +und-CN >> zh +und-Hans >> zh +und-Hant >> it # no match so get first +und-Latn >> it + +** test: match on maximized tag +@supported=fr, en-GB, ja, es-ES, es-MX +ja-JP, en-GB >> ja +ja-Jpan-JP, en-GB >> ja + +** test: pick best maximized tag +@supported=ja, ja-Jpan-US, ja-JP, en, ru +ja-Jpan, ru >> ja +ja-JP, ru >> ja-JP +ja-US, ru >> ja-Jpan-US + +@favor=script +ja-Jpan, ru >> ja +ja-JP, ru >> ja-JP +ja-US, ru >> ja-Jpan-US + +** test: termination: pick best maximized match +@supported=ja, ja-Jpan, ja-JP, en, ru +ja-Jpan-JP, ru >> ja-Jpan +ja-Jpan, ru >> 
ja-Jpan + +@favor=script +ja-Jpan-JP, ru >> ja-Jpan +ja-Jpan, ru >> ja-Jpan + +** test: same language over exact, but distinguish when user is explicit +@supported=fr, en-GB, ja, es-ES, es-MX +ja, de >> ja +@supported=en, de, fr, ja +de-CH, fr >> de +@supported=en-GB, nl +en, nl >> en-GB +en, nl, en-GB >> en-GB + +@favor=script +@supported=fr, en-GB, ja, es-ES, es-MX +ja, de >> ja +@supported=en, de, fr, ja +de-CH, fr >> de +@supported=en-GB, nl +en, nl >> en-GB +en, nl, en-GB >> en-GB + +** test: parent relation preserved +@supported=en, en-US, en-GB, es, es-419, pt, pt-BR, pt-PT, zh, zh-Hant, zh-Hant-HK +en-150 >> en-GB +en-AU >> en-GB +en-BE >> en-GB +en-GG >> en-GB +en-GI >> en-GB +en-HK >> en-GB +en-IE >> en-GB +en-IM >> en-GB +en-IN >> en-GB +en-JE >> en-GB +en-MT >> en-GB +en-NZ >> en-GB +en-PK >> en-GB +en-SG >> en-GB +en-DE >> en-GB +@default=es-AR +es-AR >> es-419 +@default=es-BO +es-BO >> es-419 +@default=es-CL +es-CL >> es-419 +@default=es-CO +es-CO >> es-419 +@default=es-CR +es-CR >> es-419 +@default=es-CU +es-CU >> es-419 +@default=es-DO +es-DO >> es-419 +@default=es-EC +es-EC >> es-419 +@default=es-GT +es-GT >> es-419 +@default=es-HN +es-HN >> es-419 +@default=es-MX +es-MX >> es-419 +@default=es-NI +es-NI >> es-419 +@default=es-PA +es-PA >> es-419 +@default=es-PE +es-PE >> es-419 +@default=es-PR +es-PR >> es-419 +@default= +es-PT >> es +@default=es-PY +es-PY >> es-419 +@default=es-SV +es-SV >> es-419 +@default= +es-US >> es-419 +@default=es-UY +es-UY >> es-419 +@default=es-VE +es-VE >> es-419 +@default= +pt-AO >> pt-PT +pt-CV >> pt-PT +pt-GW >> pt-PT +pt-MO >> pt-PT +pt-MZ >> pt-PT +pt-ST >> pt-PT +pt-TL >> pt-PT + +@favor=script +en-150 >> en-GB +en-AU >> en-GB +en-BE >> en-GB +en-GG >> en-GB +en-GI >> en-GB +en-HK >> en-GB +en-IE >> en-GB +en-IM >> en-GB +en-IN >> en-GB +en-JE >> en-GB +en-MT >> en-GB +en-NZ >> en-GB +en-PK >> en-GB +en-SG >> en-GB +en-DE >> en-GB +@default=es-AR +es-AR >> es-419 +@default=es-BO +es-BO >> es-419 +@default=es-CL 
+es-CL >> es-419 +@default=es-CO +es-CO >> es-419 +@default=es-CR +es-CR >> es-419 +@default=es-CU +es-CU >> es-419 +@default=es-DO +es-DO >> es-419 +@default=es-EC +es-EC >> es-419 +@default=es-GT +es-GT >> es-419 +@default=es-HN +es-HN >> es-419 +@default=es-MX +es-MX >> es-419 +@default=es-NI +es-NI >> es-419 +@default=es-PA +es-PA >> es-419 +@default=es-PE +es-PE >> es-419 +@default=es-PR +es-PR >> es-419 +@default= +es-PT >> es +@default=es-PY +es-PY >> es-419 +@default=es-SV +es-SV >> es-419 +@default= +es-US >> es-419 +@default=es-UY +es-UY >> es-419 +@default=es-VE +es-VE >> es-419 +@default= +pt-AO >> pt-PT +pt-CV >> pt-PT +pt-GW >> pt-PT +pt-MO >> pt-PT +pt-MZ >> pt-PT +pt-ST >> pt-PT +pt-TL >> pt-PT + +** test: preserve extensions +@supported=en, de, sl-NEDIS +@default=de-u-co-phonebk +de-FR-u-co-phonebk >> de +@default=sl-NEDIS-u-cu-eur +sl-NEDIS-u-cu-eur >> sl-NEDIS +sl-u-cu-eur >> sl-NEDIS +sl-HR-NEDIS-u-cu-eur >> sl-NEDIS +@default=de-t-m0-iso-i0-pinyin +de-t-m0-iso-i0-pinyin >> de + +@favor=script +@default=de-u-co-phonebk +de-FR-u-co-phonebk >> de +@default=sl-NEDIS-u-cu-eur +sl-NEDIS-u-cu-eur >> sl-NEDIS +sl-u-cu-eur >> sl-NEDIS +sl-HR-NEDIS-u-cu-eur >> sl-NEDIS +@default=de-t-m0-iso-i0-pinyin +de-t-m0-iso-i0-pinyin >> de + +## ULS + +** test: testEmptyUserLanguagesGetsEmpty_getBestMatches +@supported=de + >> de + +** test: testNoStrongMatchGetsEmpty_getBestMatches +@supported=de +fr >> de + +@favor=script +fr >> de + +** test: testLooseMatchForGeneral_getBestMatches +@supported=es-419 +es-MX >> es-419 + +@favor=script +es-MX >> es-419 + +** test: testLooseMatchForEnglish_getBestMatches +@supported=en, en-GB +en-CA >> en + +@favor=script +en-CA >> en + +** test: testLooseMatchForChinese_getBestMatches +@supported=zh +zh-TW >> zh + +@favor=script +zh-TW >> zh + +## Geo + +** test: testGetBestMatchWithMinMatchScore +@supported=fr-FR, fr, fr-CA, en +@default=und +fr >> fr +@supported=en, fr, fr-CA +fr-FR >> fr # Parent match is chosen. 
+@supported=en, fr-CA +fr-FR >> fr-CA # Sibling match is chosen. +@supported=fr-CA, fr-FR +fr >> fr-FR # Inferred region match is chosen. +fr-SN >> fr-FR +@supported=en, fr-FR +fr >> fr-FR # Child match is chosen. +@supported=de, en, it +fr >> und +@supported=iw, en +iw-Latn >> und +@supported=iw, no +ru >> und +@supported=iw-Latn, iw-Cyrl, iw +ru >> und +@supported=iw, iw-Latn +ru >> und +en >> und +@supported=en, uk +ru >> und +@supported=zh-TW, en +zh-CN >> und # no match +@supported=ja +ru >> und + +@favor=script +@supported=fr-FR, fr, fr-CA, en +fr >> fr +@supported=en, fr, fr-CA +fr-FR >> fr +@supported=en, fr-CA +fr-FR >> fr-CA +@supported=fr-CA, fr-FR +fr >> fr-FR +fr-SN >> fr-FR +@supported=en, fr-FR +fr >> fr-FR +@supported=de, en, it +fr >> en +@supported=iw, en +iw-Latn >> en +@supported=iw, no +ru >> und +@supported=iw-Latn, iw-Cyrl, iw +ru >> iw-Cyrl +@supported=iw, iw-Latn +ru >> und +en >> iw-Latn +@supported=en, uk +ru >> uk +@supported=zh-TW, en +zh-CN >> und # no match +@supported=ja +ru >> und + +** test: favor a more-default locale among equally imperfect matches +@supported=fr-CA, fr-CH, fr-FR, fr-GB +fr-SN >> fr-FR +@supported=sr-Latn, sr-Cyrl, sr-Grek +@threshold=60 +sr-Thai >> sr-Cyrl