Diffstat (limited to '')
-rw-r--r-- third-party/utf8cpp/test_data/utf8samples/Unicode_transcriptions.html | 167
1 files changed, 167 insertions, 0 deletions
diff --git a/third-party/utf8cpp/test_data/utf8samples/Unicode_transcriptions.html b/third-party/utf8cpp/test_data/utf8samples/Unicode_transcriptions.html
new file mode 100644
index 0000000..69b29ff
--- /dev/null
+++ b/third-party/utf8cpp/test_data/utf8samples/Unicode_transcriptions.html
@@ -0,0 +1,167 @@
+? *Unicode Transcriptions* Notes <#Notes>
+
+Glyphs <http://www.macchiato.com/unicode/show.html> | Samples
+<http://www.macchiato.com/unicode/Unicode_transcriptions.html> | Charts
+<http://www.macchiato.com/unicode/charts.html> | UTF
+<http://www.macchiato.com/unicode/convert.html> | Forms
+<http://www-4.ibm.com/software/developer/library/utfencodingforms/> |
+Home <http://www.macchiato.com>.
+<http://member.linkexchange.com/cgi-bin/fc/fastcounter-login?750641>
+
+Name Text Image
+Arabic (Arabic) يونِكود ?
+Arabic (Persian) یونی‌کُد / ?/
+Armenian Յունիկօդ
+Bengali য়ূনিকোড
+Bopomofo ㄊㄨㄥ˅ ㄧˋ ㄇㄚ˅
+ㄨㄢˋ ㄍㄨㄛˊ ㄇㄚ˅
+Braille
+Buhid
+Canadian Aboriginal ᔫᗂᑰᑦ
+Cherokee ᏳᏂᎪᏛ
+Cypriot
+Cyrillic (Russian) Юникод ?
+Deseret (English) ???????
+Devanagari (Hindi) यूनिकोड ?
+Ethiopic ዩኒኮድ
+Georgian უნიკოდი ?
+Gothic
+Greek Γιούνικοντ
+Gujarati યૂનિકોડ
+Gurmukhi ਯੂਨਿਕੋਡ
+Han (Chinese) 统一码 ?
+統一碼 ?
+万国码 ?
+萬國碼 ?
+Hangul 유니코드
+Hanunoo
+Hebrew יוניקוד
+Hebrew (pointed) יוּנִיקוׁד
+Hebrew (Yiddish) יוניקאָד ?
+Hiragana (Japanese) ゆにこおど
+Katakana (Japanese) ユニコード ?
+Kannada ಯೂನಿಕೋಡ್
+Khmer យូនីគោដ
+Lao
+Latin Unicode Unicode
+Latin (IPA <#English_Pronunciation>) ˈjunɪˌkoːd ?
+Latin (Am. Dict. <#American_Dictionary>) Ūnĭcōde̽ ?
+Limbu
+Linear B
+Malayalam യൂനികോഡ്
+Mongolian
+Myanmar
+Ogham ᚔᚒᚅᚔᚉᚑᚇ / /
+Old Italic
+Oriya ୟୂନିକୋଡ
+Osmanya
+Runic (Anglo-Saxon) ᛡᚢᚾᛁᚳᚩᛞ
+Shavian
+Sinhala යණනිකෞද්
+Syriac ܝܘܢܝܩܘܕ
+Tagbanwa
+Tagalog
+Tai Le
+Tamil யூனிகோட்
+Telugu యూనికోడ్
+Thaana
+Thai ยูนืโคด
+Tibetan (Dzongkha) ཨུ་ནི་ཀོཌྲ།
+Ugaritic
+Yi
+
+
+ Notes:
+
+There are different ways to transcribe the word “Unicode”, depending on
+the language and script. In some cases there is only one language that
+customarily uses a given script; in others there are many languages. The
+goal here is to collect at least one transcription for each script, in a
+language customarily written in that script, with more languages if
+possible. If the transcription is the same for multiple languages in a
+script, then a single representative language is used.
+
+Still missing are transcriptions for the items above in RED (in at least
+one language). I would appreciate any other transcriptions, or
+corrections for the ones listed here. Send to mark3@macchiato.com
+<mailto:mark3@macchiato.com>, using the directions below:
+
+ * *Supplying Missing Items*
+ o Most Latin-script languages will follow the spelling, and
+ change the pronunciation. For any that would not, it would
+ be good to have the alternate spelling.
+ o For non-Latin scripts the goal is to match the English
+ pronunciation — /*not*/ spelling. Above is the IPA <#IPA>
+ (in phonemic transcription) that should be matched as
+ closely as possible (without sounding affected in the target
+ language).
+ o Text is best supplied either as UTF-8 text or as code
+ points in hex HTML notation. E.g. either of the following:
+ + "Юникод"
+ + "&#x042E;&#x043D;&#x0438;&#x043A;&#x043E;&#x0434;"
+ + Note: for / supplementary characters/
+ <http://www.unicode.org/glossary/#supplementary_character>,
+ there should be one hex number per code point, not two
+ surrogates
+ <http://www.unicode.org/glossary/#surrogate_code_point>:
+ # &#x10000; /*not*/ &#xD800;&#xDC00;
+ o If you have a good font, I'd also appreciate a GIF. It
+ should be *96 x 24* pixels, with the text centered, in black
+ on white (plus grays if smoothed).
+ * *Other Comments*
+ o Because some browsers won't handle the text, both text and
+ GIF image are supplied. If you can’t read the text columns,
+ see Display Problems
+ <http://www.unicode.org/help/display_problems.html>.
+ o The Chinese versions (inc. Bopomofo) are translations, not
+ transcriptions, since "transcription in Chinese is pretty
+ lame" [J. Becker].
+ o There are other "translations" of Unicode that may be in
+ use, such as the Vietnamese "Thống Nhất Mã".
+ o For sample pages in different languages on the Unicode site,
+ see What is Unicode?
+ <http://www.unicode.org/unicode/standard/WhatIsUnicode.html>
+ o Americans are not generally used to IPA, and find a variety
+ of different systems in their dictionaries. The "Am. Dict."
+ row above leaves the base letters as they are, and uses
+ diacritics for pronunciation.
+ * *Etymology of /Unicode/*
+ o Coined by J. Becker. Not related to previous usages, such as:
+ + A telegraphic code in which one word or set of letters
+ represents a sentence or phrase; a telegram or message
+ in this. (late 19th century, OED)
+ o According to my references, the prefix "uni" is directly
+ from Latin while the word "code" is through French.
+ o The original Indo-European apparently would have been
+ *oino-kau-do ("one strike give"): *kau apparently being
+ related to such English words as: hew, haggle, hoe, hag,
+ hay, hack, caudad, caudal, caudate, caudex, coda, codex,
+ codicil, coward, incus, and Kovač (personal name: "smith").
+ + I will leave the exact derivations to the exegetes,
+ but I like the association with "haggle" myself.
+ * *Contributions*
+ o This draws on contributions or comments from:
+ + Dixon Au
+ + Joe Becker
+ + Maurice Bauhahn
+ + Abel Cheung
+ + Peter Constable
+ + Michael Everson
+ + Christopher John Fynn
+ + Michael Kaplan
+ + George Kiraz
+ + Abdul Malik
+ + Siva Nataraja
+ + Roozbeh Pournader
+ + Jonathan Rosenne
+ + Jungshik Shin
+
+------------------------------------------------------------------------
+
+
+Terms of Use <http://www.macchiato.com/terms_of_use.html>. Last updated:
+MED - 04/20/2003 15:30:33.
+<http://member.linkexchange.com/cgi-bin/fc/fastcounter-login?750641>
+
+
+
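
The note in the file above on supplementary characters asks for one hex number per code point rather than two surrogates. A minimal sketch of that convention in Python follows. The helper names `to_hex_ncr` and `combine_surrogates` are hypothetical, chosen for illustration; they are not part of utf8cpp or of the original page.

```python
def to_hex_ncr(text: str) -> str:
    """Render text as hexadecimal HTML numeric character references.

    Python 3 strings iterate by code point, so a supplementary character
    yields a single reference rather than a surrogate pair.
    """
    return "".join(f"&#x{ord(ch):04X};" for ch in text)


def combine_surrogates(high: int, low: int) -> int:
    """Combine a UTF-16 high/low surrogate pair into the code point it encodes."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid high/low surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


if __name__ == "__main__":
    # The Russian sample from the list above.
    print(to_hex_ncr("Юникод"))
    # The surrogate pair from the note, rewritten as a single code point.
    print(f"&#x{combine_surrogates(0xD800, 0xDC00):X};")
```

If a submission arrives as surrogate pairs, something like `combine_surrogates` could be used to rewrite it before it is added to the table; whether the page author did anything of the sort is not stated in the source.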