Diffstat (limited to 'third_party/rust/encoding_rs/doc')
40 files changed, 323 insertions, 0 deletions
diff --git a/third_party/rust/encoding_rs/doc/Big5.txt b/third_party/rust/encoding_rs/doc/Big5.txt
new file mode 100644
index 0000000000..61e8fd5801
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/Big5.txt
@@ -0,0 +1,16 @@
+/// This is Big5 with HKSCS with mappings to more recent Unicode assignments
+/// instead of the Private Use Area code points that have been used historically.
+/// It is believed to be able to decode existing Web content in a way that makes
+/// sense.
+///
+/// To avoid form submissions generating data that Web servers don't understand,
+/// the encoder doesn't use the HKSCS byte sequences that precede the unextended
+/// Big5 in the lexical order.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/big5.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/big5-bmp.html)
+///
+/// This encoding is designed to be suited for decoding the Windows code page 950
+/// and its HKSCS patched "951" variant such that the text makes sense, given
+/// assignments that Unicode has made after those encodings used Private Use
+/// Area characters.
diff --git a/third_party/rust/encoding_rs/doc/EUC-JP.txt b/third_party/rust/encoding_rs/doc/EUC-JP.txt
new file mode 100644
index 0000000000..f90a735e52
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/EUC-JP.txt
@@ -0,0 +1,12 @@
+/// This is the legacy Unix encoding for Japanese.
+///
+/// For compatibility with Web servers that don't expect three-byte sequences
+/// in form submissions, the encoder doesn't generate three-byte sequences.
+/// That is, the JIS X 0212 support is decode-only.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/euc-jp.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/euc-jp-bmp.html)
+///
+/// This encoding roughly matches the Windows code page 20932. There are error
+/// handling differences and a handful of 2-byte sequences that decode differently.
+/// Additionally, Windows doesn't support 3-byte sequences.
diff --git a/third_party/rust/encoding_rs/doc/EUC-KR.txt b/third_party/rust/encoding_rs/doc/EUC-KR.txt
new file mode 100644
index 0000000000..ef24c980e0
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/EUC-KR.txt
@@ -0,0 +1,10 @@
+/// This is the Korean encoding for Windows. It extends the Unix legacy encoding
+/// for Korean, based on KS X 1001 (which also formed the base of MacKorean on Mac OS
+/// Classic), with all the characters from the Hangul Syllables block of Unicode.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/euc-kr.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/euc-kr-bmp.html)
+///
+/// This encoding matches the Windows code page 949, except Windows decodes byte 0x80
+/// to U+0080 and some byte sequences that are error per the Encoding Standard to
+/// the question mark or the Private Use Area.
diff --git a/third_party/rust/encoding_rs/doc/GBK.txt b/third_party/rust/encoding_rs/doc/GBK.txt
new file mode 100644
index 0000000000..2faefff45e
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/GBK.txt
@@ -0,0 +1,16 @@
+/// The decoder for this encoding is the same as the decoder for gb18030.
+/// The encoder side of this encoding is GBK with Windows code page 936 euro
+/// sign behavior. GBK extends GB2312-80 to cover the CJK Unified Ideographs
+/// Unicode block as well as a handful of ideographs from the CJK Unified
+/// Ideographs Extension A and CJK Compatibility Ideographs blocks.
+///
+/// Unlike e.g. in the case of ISO-8859-1 and windows-1252, GBK encoder wasn't
+/// unified with the gb18030 encoder in the Encoding Standard out of concern
+/// that servers that expect GBK form submissions might not be able to handle
+/// the four-byte sequences.
+///
+/// [Index visualization for the two-byte sequences](https://encoding.spec.whatwg.org/gb18030.html),
+/// [Visualization of BMP coverage of the two-byte index](https://encoding.spec.whatwg.org/gb18030-bmp.html)
+///
+/// The encoder of this encoding roughly matches the Windows code page 936.
+/// The decoder side is a superset.
diff --git a/third_party/rust/encoding_rs/doc/IBM866.txt b/third_party/rust/encoding_rs/doc/IBM866.txt
new file mode 100644
index 0000000000..871ff42139
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/IBM866.txt
@@ -0,0 +1,8 @@
+/// This is the most notable one of the DOS Cyrillic code pages. It has the same
+/// box drawing characters as code page 437, so it can be used for decoding
+/// DOS-era ASCII + box drawing data.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/ibm866.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/ibm866-bmp.html)
+///
+/// This encoding matches the Windows code page 866.
diff --git a/third_party/rust/encoding_rs/doc/ISO-2022-JP.txt b/third_party/rust/encoding_rs/doc/ISO-2022-JP.txt
new file mode 100644
index 0000000000..65713a1e9f
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/ISO-2022-JP.txt
@@ -0,0 +1,10 @@
+/// This is the primary pre-UTF-8 encoding for Japanese email. It uses the ASCII
+/// byte range to encode non-Basic Latin characters. It's the only encoding
+/// supported by this crate whose encoder is stateful.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/jis0208.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/jis0208-bmp.html)
+///
+/// This encoding roughly matches the Windows code page 50220. Notably, Windows
+/// uses U+30FB in place of the REPLACEMENT CHARACTER and otherwise differs in
+/// error handling.
diff --git a/third_party/rust/encoding_rs/doc/ISO-8859-10.txt b/third_party/rust/encoding_rs/doc/ISO-8859-10.txt
new file mode 100644
index 0000000000..8aca388e7c
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/ISO-8859-10.txt
@@ -0,0 +1,8 @@
+/// This is the Nordic part of the ISO/IEC 8859 encoding family. This encoding
+/// is also known as Latin 6.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/iso-8859-10.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/iso-8859-10-bmp.html)
+///
+/// The Windows code page number for this encoding is 28600, but kernel32.dll
+/// does not support this encoding.
diff --git a/third_party/rust/encoding_rs/doc/ISO-8859-13.txt b/third_party/rust/encoding_rs/doc/ISO-8859-13.txt
new file mode 100644
index 0000000000..20cd549673
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/ISO-8859-13.txt
@@ -0,0 +1,8 @@
+/// This is the Baltic part of the ISO/IEC 8859 encoding family. This encoding
+/// is also known as Latin 7.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/iso-8859-13.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/iso-8859-13-bmp.html)
+///
+/// This encoding matches the Windows code page 28603, except Windows decodes
+/// unassigned code points to the Private Use Area of Unicode.
diff --git a/third_party/rust/encoding_rs/doc/ISO-8859-14.txt b/third_party/rust/encoding_rs/doc/ISO-8859-14.txt
new file mode 100644
index 0000000000..3e4833bf19
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/ISO-8859-14.txt
@@ -0,0 +1,8 @@
+/// This is the Celtic part of the ISO/IEC 8859 encoding family. This encoding
+/// is also known as Latin 8.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/iso-8859-14.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/iso-8859-14-bmp.html)
+///
+/// The Windows code page number for this encoding is 28604, but kernel32.dll
+/// does not support this encoding.
diff --git a/third_party/rust/encoding_rs/doc/ISO-8859-15.txt b/third_party/rust/encoding_rs/doc/ISO-8859-15.txt
new file mode 100644
index 0000000000..922896a882
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/ISO-8859-15.txt
@@ -0,0 +1,7 @@
+/// This is the revised Western European part of the ISO/IEC 8859 encoding
+/// family. This encoding is also known as Latin 9.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/iso-8859-15.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/iso-8859-15-bmp.html)
+///
+/// This encoding matches the Windows code page 28605.
diff --git a/third_party/rust/encoding_rs/doc/ISO-8859-16.txt b/third_party/rust/encoding_rs/doc/ISO-8859-16.txt
new file mode 100644
index 0000000000..d1ae50bf08
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/ISO-8859-16.txt
@@ -0,0 +1,8 @@
+/// This is the South-Eastern European part of the ISO/IEC 8859 encoding
+/// family. This encoding is also known as Latin 10.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/iso-8859-16.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/iso-8859-16-bmp.html)
+///
+/// The Windows code page number for this encoding is 28606, but kernel32.dll
+/// does not support this encoding.
diff --git a/third_party/rust/encoding_rs/doc/ISO-8859-2.txt b/third_party/rust/encoding_rs/doc/ISO-8859-2.txt
new file mode 100644
index 0000000000..298df0916a
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/ISO-8859-2.txt
@@ -0,0 +1,6 @@
+/// This is the Central European part of the ISO/IEC 8859 encoding family. This encoding is also known as Latin 2.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/iso-8859-2.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/iso-8859-2-bmp.html)
+///
+/// This encoding matches the Windows code page 28592.
diff --git a/third_party/rust/encoding_rs/doc/ISO-8859-3.txt b/third_party/rust/encoding_rs/doc/ISO-8859-3.txt
new file mode 100644
index 0000000000..c462ce8f7f
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/ISO-8859-3.txt
@@ -0,0 +1,6 @@
+/// This is the South European part of the ISO/IEC 8859 encoding family. This encoding is also known as Latin 3.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/iso-8859-3.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/iso-8859-3-bmp.html)
+///
+/// This encoding matches the Windows code page 28593.
diff --git a/third_party/rust/encoding_rs/doc/ISO-8859-4.txt b/third_party/rust/encoding_rs/doc/ISO-8859-4.txt
new file mode 100644
index 0000000000..40449c4398
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/ISO-8859-4.txt
@@ -0,0 +1,6 @@
+/// This is the North European part of the ISO/IEC 8859 encoding family. This encoding is also known as Latin 4.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/iso-8859-4.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/iso-8859-4-bmp.html)
+///
+/// This encoding matches the Windows code page 28594.
diff --git a/third_party/rust/encoding_rs/doc/ISO-8859-5.txt b/third_party/rust/encoding_rs/doc/ISO-8859-5.txt
new file mode 100644
index 0000000000..41774ec542
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/ISO-8859-5.txt
@@ -0,0 +1,6 @@
+/// This is the Cyrillic part of the ISO/IEC 8859 encoding family.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/iso-8859-5.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/iso-8859-5-bmp.html)
+///
+/// This encoding matches the Windows code page 28595.
diff --git a/third_party/rust/encoding_rs/doc/ISO-8859-6.txt b/third_party/rust/encoding_rs/doc/ISO-8859-6.txt
new file mode 100644
index 0000000000..4c70c22583
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/ISO-8859-6.txt
@@ -0,0 +1,7 @@
+/// This is the Arabic part of the ISO/IEC 8859 encoding family.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/iso-8859-6.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/iso-8859-6-bmp.html)
+///
+/// This encoding matches the Windows code page 28596, except Windows decodes
+/// unassigned code points to the Private Use Area of Unicode.
diff --git a/third_party/rust/encoding_rs/doc/ISO-8859-7.txt b/third_party/rust/encoding_rs/doc/ISO-8859-7.txt
new file mode 100644
index 0000000000..b78ed38e41
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/ISO-8859-7.txt
@@ -0,0 +1,11 @@
+/// This is the Greek part of the ISO/IEC 8859 encoding family.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/iso-8859-7.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/iso-8859-7-bmp.html)
+///
+/// This encoding roughly matches the Windows code page 28597. Windows decodes
+/// unassigned code points, the currency signs at 0xA4 and 0xA5 as well as
+/// 0xAA, which should be U+037A GREEK YPOGEGRAMMENI, to the Private Use Area
+/// of Unicode. Windows decodes 0xA1 to U+02BD MODIFIER LETTER REVERSED COMMA
+/// instead of U+2018 LEFT SINGLE QUOTATION MARK and 0xA2 to U+02BC MODIFIER
+/// LETTER APOSTROPHE instead of U+2019 RIGHT SINGLE QUOTATION MARK.
diff --git a/third_party/rust/encoding_rs/doc/ISO-8859-8-I.txt b/third_party/rust/encoding_rs/doc/ISO-8859-8-I.txt
new file mode 100644
index 0000000000..b73e572e15
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/ISO-8859-8-I.txt
@@ -0,0 +1,9 @@
+/// This is the Hebrew part of the ISO/IEC 8859 encoding family in logical order.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/iso-8859-8.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/iso-8859-8-bmp.html)
+///
+/// This encoding roughly matches the Windows code page 38598. Windows decodes
+/// 0xAF to OVERLINE instead of MACRON and 0xFE and 0xFD to the Private Use
+/// Area instead of LRM and RLM. Windows decodes unassigned code points to
+/// the private use area.
diff --git a/third_party/rust/encoding_rs/doc/ISO-8859-8.txt b/third_party/rust/encoding_rs/doc/ISO-8859-8.txt
new file mode 100644
index 0000000000..c5600e38fe
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/ISO-8859-8.txt
@@ -0,0 +1,9 @@
+/// This is the Hebrew part of the ISO/IEC 8859 encoding family in visual order.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/iso-8859-8.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/iso-8859-8-bmp.html)
+///
+/// This encoding roughly matches the Windows code page 28598. Windows decodes
+/// 0xAF to OVERLINE instead of MACRON and 0xFE and 0xFD to the Private Use
+/// Area instead of LRM and RLM. Windows decodes unassigned code points to
+/// the private use area.
diff --git a/third_party/rust/encoding_rs/doc/KOI8-R.txt b/third_party/rust/encoding_rs/doc/KOI8-R.txt
new file mode 100644
index 0000000000..46dcfe7659
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/KOI8-R.txt
@@ -0,0 +1,6 @@
+/// This is an encoding for Russian from [RFC 1489](https://tools.ietf.org/html/rfc1489).
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/koi8-r.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/koi8-r-bmp.html)
+///
+/// This encoding matches the Windows code page 20866.
diff --git a/third_party/rust/encoding_rs/doc/KOI8-U.txt b/third_party/rust/encoding_rs/doc/KOI8-U.txt
new file mode 100644
index 0000000000..a263745ef1
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/KOI8-U.txt
@@ -0,0 +1,6 @@
+/// This is an encoding for Ukrainian adapted from KOI8-R.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/koi8-u.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/koi8-u-bmp.html)
+///
+/// This encoding matches the Windows code page 21866.
diff --git a/third_party/rust/encoding_rs/doc/Shift_JIS.txt b/third_party/rust/encoding_rs/doc/Shift_JIS.txt
new file mode 100644
index 0000000000..b982ab5b3e
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/Shift_JIS.txt
@@ -0,0 +1,8 @@
+/// This is the Japanese encoding for Windows.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/shift_jis.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/shift_jis-bmp.html)
+///
+/// This encoding matches the Windows code page 932, except Windows decodes some byte
+/// sequences that are error per the Encoding Standard to the question mark or the
+/// Private Use Area and generally uses U+30FB in place of the REPLACEMENT CHARACTER.
diff --git a/third_party/rust/encoding_rs/doc/UTF-16BE.txt b/third_party/rust/encoding_rs/doc/UTF-16BE.txt
new file mode 100644
index 0000000000..0a7df99a4f
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/UTF-16BE.txt
@@ -0,0 +1,8 @@
+/// This decode-only encoding uses 16-bit code units due to Unicode originally
+/// having been designed as a 16-bit repertoire. In the absence of a byte order
+/// mark the big endian byte order is assumed.
+///
+/// There is no corresponding encoder in this crate or in the Encoding
+/// Standard. The output encoding of this encoding is UTF-8.
+///
+/// This encoding matches the Windows code page 1201.
diff --git a/third_party/rust/encoding_rs/doc/UTF-16LE.txt b/third_party/rust/encoding_rs/doc/UTF-16LE.txt
new file mode 100644
index 0000000000..3a98e8b986
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/UTF-16LE.txt
@@ -0,0 +1,8 @@
+/// This decode-only encoding uses 16-bit code units due to Unicode originally
+/// having been designed as a 16-bit repertoire. In the absence of a byte order
+/// mark the little endian byte order is assumed.
+///
+/// There is no corresponding encoder in this crate or in the Encoding
+/// Standard. The output encoding of this encoding is UTF-8.
+///
+/// This encoding matches the Windows code page 1200.
diff --git a/third_party/rust/encoding_rs/doc/UTF-8.txt b/third_party/rust/encoding_rs/doc/UTF-8.txt
new file mode 100644
index 0000000000..3a93e67dce
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/UTF-8.txt
@@ -0,0 +1,5 @@
+/// This is the encoding that should be used for all new development. It can
+/// represent all of Unicode.
+///
+/// This encoding matches the Windows code page 65001, except Windows differs
+/// in the number of errors generated for some erroneous byte sequences.
diff --git a/third_party/rust/encoding_rs/doc/gb18030.txt b/third_party/rust/encoding_rs/doc/gb18030.txt
new file mode 100644
index 0000000000..572a593d08
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/gb18030.txt
@@ -0,0 +1,9 @@
+/// This encoding matches GB18030-2005 except the two-byte sequence 0xA3 0xA0
+/// maps to U+3000 for compatibility with existing Web content. As a result,
+/// this encoding can represent all of Unicode except for the private-use
+/// character U+E5E5.
+///
+/// [Index visualization for the two-byte sequences](https://encoding.spec.whatwg.org/gb18030.html),
+/// [Visualization of BMP coverage of the two-byte index](https://encoding.spec.whatwg.org/gb18030-bmp.html)
+///
+/// This encoding matches the Windows code page 54936.
diff --git a/third_party/rust/encoding_rs/doc/macintosh.txt b/third_party/rust/encoding_rs/doc/macintosh.txt
new file mode 100644
index 0000000000..d00fece7c8
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/macintosh.txt
@@ -0,0 +1,7 @@
+/// This is the MacRoman encoding from Mac OS Classic.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/macintosh.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/macintosh-bmp.html)
+///
+/// This encoding matches the Windows code page 10000, except Windows decodes
+/// 0xBD to U+2126 OHM SIGN instead of U+03A9 GREEK CAPITAL LETTER OMEGA.
diff --git a/third_party/rust/encoding_rs/doc/replacement.txt b/third_party/rust/encoding_rs/doc/replacement.txt
new file mode 100644
index 0000000000..2398df06d4
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/replacement.txt
@@ -0,0 +1,10 @@
+/// This decode-only encoding decodes all non-zero-length streams to a single
+/// REPLACEMENT CHARACTER. Its purpose is to avoid the use of an
+/// ASCII-compatible fallback encoding (typically windows-1252) for some
+/// encodings that are no longer supported by the Web Platform and that
+/// would be dangerous to treat as ASCII-compatible.
+///
+/// There is no corresponding encoder. The output encoding of this encoding
+/// is UTF-8.
+///
+/// This encoding does not have a Windows code page number.
diff --git a/third_party/rust/encoding_rs/doc/windows-1250.txt b/third_party/rust/encoding_rs/doc/windows-1250.txt
new file mode 100644
index 0000000000..96e38ef4a0
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/windows-1250.txt
@@ -0,0 +1,6 @@
+/// This is the Central European encoding for Windows.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/windows-1250.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/windows-1250-bmp.html)
+///
+/// This encoding matches the Windows code page 1250.
diff --git a/third_party/rust/encoding_rs/doc/windows-1251.txt b/third_party/rust/encoding_rs/doc/windows-1251.txt
new file mode 100644
index 0000000000..9645611e23
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/windows-1251.txt
@@ -0,0 +1,6 @@
+/// This is the Cyrillic encoding for Windows.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/windows-1251.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/windows-1251-bmp.html)
+///
+/// This encoding matches the Windows code page 1251.
diff --git a/third_party/rust/encoding_rs/doc/windows-1252.txt b/third_party/rust/encoding_rs/doc/windows-1252.txt
new file mode 100644
index 0000000000..d613fbe25c
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/windows-1252.txt
@@ -0,0 +1,7 @@
+/// This is the Western encoding for Windows. It is an extension of ISO-8859-1,
+/// which is known as Latin 1.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/windows-1252.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/windows-1252-bmp.html)
+///
+/// This encoding matches the Windows code page 1252.
diff --git a/third_party/rust/encoding_rs/doc/windows-1253.txt b/third_party/rust/encoding_rs/doc/windows-1253.txt
new file mode 100644
index 0000000000..edcacd9037
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/windows-1253.txt
@@ -0,0 +1,8 @@
+/// This is the Greek encoding for Windows. It is mostly an extension of
+/// ISO-8859-7, but U+0386 is mapped to a different byte.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/windows-1253.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/windows-1253-bmp.html)
+///
+/// This encoding matches the Windows code page 1253, except Windows decodes
+/// unassigned code points to the Private Use Area of Unicode.
diff --git a/third_party/rust/encoding_rs/doc/windows-1254.txt b/third_party/rust/encoding_rs/doc/windows-1254.txt
new file mode 100644
index 0000000000..26491a93a4
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/windows-1254.txt
@@ -0,0 +1,7 @@
+/// This is the Turkish encoding for Windows. It is an extension of ISO-8859-9,
+/// which is known as Latin 5.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/windows-1254.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/windows-1254-bmp.html)
+///
+/// This encoding matches the Windows code page 1254.
diff --git a/third_party/rust/encoding_rs/doc/windows-1255.txt b/third_party/rust/encoding_rs/doc/windows-1255.txt
new file mode 100644
index 0000000000..cbcf86dc1c
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/windows-1255.txt
@@ -0,0 +1,8 @@
+/// This is the Hebrew encoding for Windows. It is an extension of ISO-8859-8-I,
+/// except for a currency sign swap.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/windows-1255.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/windows-1255-bmp.html)
+///
+/// This encoding matches the Windows code page 1255, except Windows decodes
+/// unassigned code points to the Private Use Area of Unicode.
diff --git a/third_party/rust/encoding_rs/doc/windows-1256.txt b/third_party/rust/encoding_rs/doc/windows-1256.txt
new file mode 100644
index 0000000000..38bf2ef4e6
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/windows-1256.txt
@@ -0,0 +1,6 @@
+/// This is the Arabic encoding for Windows.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/windows-1256.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/windows-1256-bmp.html)
+///
+/// This encoding matches the Windows code page 1256.
diff --git a/third_party/rust/encoding_rs/doc/windows-1257.txt b/third_party/rust/encoding_rs/doc/windows-1257.txt
new file mode 100644
index 0000000000..fc3fad21d4
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/windows-1257.txt
@@ -0,0 +1,7 @@
+/// This is the Baltic encoding for Windows.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/windows-1257.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/windows-1257-bmp.html)
+///
+/// This encoding matches the Windows code page 1257, except Windows decodes
+/// unassigned code points to the Private Use Area of Unicode.
diff --git a/third_party/rust/encoding_rs/doc/windows-1258.txt b/third_party/rust/encoding_rs/doc/windows-1258.txt
new file mode 100644
index 0000000000..1ae5bbb12c
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/windows-1258.txt
@@ -0,0 +1,11 @@
+/// This is the Vietnamese encoding for Windows.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/windows-1258.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/windows-1258-bmp.html)
+///
+/// This encoding matches the Windows code page 1258 when used in the
+/// non-normalizing mode. Unlike with the other single-byte encodings, the
+/// result of decoding is not necessarily in Normalization Form C. On the
+/// other hand, input in the Normalization Form C is not encoded without
+/// replacement. In general, it's a bad idea to encode to encodings other
+/// than UTF-8, but this encoding is especially hazardous to encode to.
diff --git a/third_party/rust/encoding_rs/doc/windows-874.txt b/third_party/rust/encoding_rs/doc/windows-874.txt
new file mode 100644
index 0000000000..ddbc71143f
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/windows-874.txt
@@ -0,0 +1,7 @@
+/// This is the Thai encoding for Windows. It is an extension of TIS-620 / ISO-8859-11.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/windows-874.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/windows-874-bmp.html)
+///
+/// This encoding matches the Windows code page 874, except Windows decodes
+/// unassigned code points to the Private Use Area of Unicode.
diff --git a/third_party/rust/encoding_rs/doc/x-mac-cyrillic.txt b/third_party/rust/encoding_rs/doc/x-mac-cyrillic.txt
new file mode 100644
index 0000000000..b5519a122c
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/x-mac-cyrillic.txt
@@ -0,0 +1,6 @@
+/// This is the MacUkrainian encoding from Mac OS Classic.
+///
+/// [Index visualization](https://encoding.spec.whatwg.org/x-mac-cyrillic.html),
+/// [Visualization of BMP coverage](https://encoding.spec.whatwg.org/x-mac-cyrillic-bmp.html)
+///
+/// This encoding matches the Windows code page 10017.
diff --git a/third_party/rust/encoding_rs/doc/x-user-defined.txt b/third_party/rust/encoding_rs/doc/x-user-defined.txt
new file mode 100644
index 0000000000..e00ddc662e
--- /dev/null
+++ b/third_party/rust/encoding_rs/doc/x-user-defined.txt
@@ -0,0 +1,6 @@
+/// This encoding offsets the non-ASCII bytes by `0xF700` thereby decoding
+/// them to the Private Use Area of Unicode. It was used for loading binary
+/// data into a JavaScript string using `XMLHttpRequest` before XHR supported
+/// the `"arraybuffer"` response type.
+///
+/// This encoding does not have a Windows code page number.
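
Note on the x-user-defined description at the end of the diff: the offset rule it states can be illustrated with a minimal standalone sketch. This is not code from encoding_rs; `decode_x_user_defined` is a hypothetical helper written only to show the arithmetic, assuming ASCII bytes pass through unchanged and every byte from 0x80 to 0xFF has 0xF700 added to it.

```rust
/// Hypothetical helper, not part of encoding_rs: decodes bytes the way the
/// x-user-defined doc comment describes, assuming ASCII passes through and
/// non-ASCII bytes are offset by 0xF700 into the Private Use Area.
fn decode_x_user_defined(bytes: &[u8]) -> String {
    bytes
        .iter()
        .map(|&b| {
            if b < 0x80 {
                // ASCII bytes map to the same code point.
                b as char
            } else {
                // 0xF700 + 0x80..=0xFF gives U+F780..=U+F7FF, which are all
                // valid Unicode scalar values, so the conversion cannot fail.
                char::from_u32(0xF700 + u32::from(b))
                    .expect("U+F780..=U+F7FF are valid scalar values")
            }
        })
        .collect()
}
```

Under these assumptions, `decode_x_user_defined(&[0x41, 0x80, 0xFF])` yields "A" followed by U+F780 and U+F7FF; subtracting 0xF700 from each Private Use Area code point recovers the original byte, which is what made the encoding usable for carrying binary data in a string.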