From 36d22d82aa202bb199967e9512281e9a53db42c9 Mon Sep 17 00:00:00 2001
From: Daniel Baumann
Date: Sun, 7 Apr 2024 21:33:14 +0200
Subject: Adding upstream version 115.7.0esr.

Signed-off-by: Daniel Baumann
---
 intl/icu/source/test/testdata/collationtest.txt | 2585 +++++++++++++++++++++++
 1 file changed, 2585 insertions(+)
 create mode 100644 intl/icu/source/test/testdata/collationtest.txt

(limited to 'intl/icu/source/test/testdata/collationtest.txt')

diff --git a/intl/icu/source/test/testdata/collationtest.txt b/intl/icu/source/test/testdata/collationtest.txt
new file mode 100644
index 0000000000..abda337e54
--- /dev/null
+++ b/intl/icu/source/test/testdata/collationtest.txt
@@ -0,0 +1,2585 @@
+# Copyright (C) 2016 and later: Unicode, Inc. and others.
+# License & terms of use: http://www.unicode.org/copyright.html
+# Copyright (c) 2012-2015 International Business Machines
+# Corporation and others. All Rights Reserved.
+#
+# This file should be in UTF-8 with a signature byte sequence ("BOM").
+#
+# collationtest.txt: Collation test data.
+#
+# created on: 2012apr13
+# created by: Markus W. Scherer
+
+# A line with "** test: description" is used for verbose and error output.
+
+# A collator can be set with "@ root" or "@ locale language-tag",
+# for example "@ locale de-u-co-phonebk".
+# An old-style locale ID can also be used, for example "@ locale de@collation=phonebook".
+
+# A collator can be built with "@ rules".
+# An "@ rules" line is followed by one or more lines with the tailoring rules.
+
+# A collator can be modified with "% attribute=value".
+
+# "* compare" tests the order (= or <) of the following strings.
+# The relation can be "=" or "<" (the level of the difference is not specified)
+# or "<1", "<2", " 1 CE
+&ae=ch=cH=Ch=CH # 2 chars -> 2 CEs
+&rst=yz=yZ=Yz=YZ # 2 chars -> 3 CEs
+% caseFirst=lower
+* compare
+<1 ae
+= ch
+<3 cH
+<3 Ch
+<3 CH
+<1 rst
+= yz
+<3 yZ
+<3 Yz
+<3 YZ
+<1 w
+<1 x
+= uv
+<3 uV
+= Uv # mixed case on single CE cannot distinguish variations
+<3 UV
+
+** test: tertiary CEs, tertiary, caseLevel=off, caseFirst=lower
+@ rules
+&\u0001<<"a".
+# We need to back up before the identical prefix "1" and compare the full numbers.
+<1 11b
+<1 101a
+
+** test: simple locale data test
+@ locale de
+* compare
+<1 a
+<2 ä
+<1 ae
+<2 æ
+
+@ locale de-u-co-phonebk
+* compare
+<1 a
+<1 ae
+<2 ä
+<2 æ
+
+# The following test cases were moved here from ICU 52's DataDrivenCollationTest.txt.
+
+** test: DataDrivenCollationTest/TestMorePinyin
+# Testing the primary strength.
+@ locale zh
+% strength=primary
+* compare
+< lā
+= lĀ
+= Lā
+= LĀ
+< lān
+= lĀn
+< lē
+= lĒ
+= Lē
+= LĒ
+< lēn
+= lĒn
+
+** test: DataDrivenCollationTest/TestLithuanian
+# Lithuanian sort order.
+@ locale lt
+* compare
+< cz
+< č
+< d
+< iz
+< j
+< sz
+< š
+< t
+< zz
+< ž
+
+** test: DataDrivenCollationTest/TestLatvian
+# Latvian sort order.
+@ locale lv
+* compare
+< cz
+< č
+< d
+< gz
+< ģ
+< h
+< iz
+< j
+< kz
+< ķ
+< l
+< lz
+< ļ
+< m
+< nz
+< ņ
+< o
+< rz
+< ŗ
+< s
+< sz
+< š
+< t
+< zz
+< ž
+
+** test: DataDrivenCollationTest/TestEstonian
+# Estonian sort order.
+@ locale et
+* compare
+< sy
+< š
+< šy
+< z
+< zy
+< ž
+< v
+< va
+< w
+< õ
+< õy
+< ä
+< äy
+< ö
+< öy
+< ü
+< üy
+< x
+
+** test: DataDrivenCollationTest/TestAlbanian
+# Albanian sort order.
+@ locale sq
+* compare
+< cz
+< ç
+< d
+< dz
+< dh
+< e
+< ez
+< ë
+< f
+< gz
+< gj
+< h
+< lz
+< ll
+< m
+< nz
+< nj
+< o
+< rz
+< rr
+< s
+< sz
+< sh
+< t
+< tz
+< th
+< u
+< xz
+< xh
+< y
+< zz
+< zh
+
+** test: DataDrivenCollationTest/TestSimplifiedChineseOrder
+# Sorted file has different order.
+@ root
+# normalization=on turned on & off automatically.
+* compare
+< \u5F20
+< \u5F20\u4E00\u8E3F
+
+** test: DataDrivenCollationTest/TestTibetanNormalizedIterativeCrash
+# This pretty much crashes.
+@ root
+* compare
+< \u0f71\u0f72\u0f80\u0f71\u0f72
+< \u0f80
+
+** test: DataDrivenCollationTest/TestThaiPartialSortKeyProblems
+# These are examples of strings that caused trouble in partial sort key testing.
+@ locale th-TH
+* compare
+< \u0E01\u0E01\u0E38\u0E18\u0E20\u0E31\u0E13\u0E11\u0E4C
+< \u0E01\u0E01\u0E38\u0E2A\u0E31\u0E19\u0E42\u0E18
+* compare
+< \u0E01\u0E07\u0E01\u0E32\u0E23
+< \u0E01\u0E07\u0E42\u0E01\u0E49
+* compare
+< \u0E01\u0E23\u0E19\u0E17\u0E32
+< \u0E01\u0E23\u0E19\u0E19\u0E40\u0E0A\u0E49\u0E32
+* compare
+< \u0E01\u0E23\u0E30\u0E40\u0E08\u0E35\u0E22\u0E27
+< \u0E01\u0E23\u0E30\u0E40\u0E08\u0E35\u0E4A\u0E22\u0E27
+* compare
+< \u0E01\u0E23\u0E23\u0E40\u0E0A\u0E2D
+< \u0E01\u0E23\u0E23\u0E40\u0E0A\u0E49\u0E32
+
+** test: DataDrivenCollationTest/TestJavaStyleRule
+# java.text allows rules to start as '<<