path: root/third-party/utf8cpp/test_data/utf8samples/Unicode_transcriptions.html
*Unicode Transcriptions* 	Notes <#Notes>

Glyphs <http://www.macchiato.com/unicode/show.html> | Samples
<http://www.macchiato.com/unicode/Unicode_transcriptions.html> | Charts
<http://www.macchiato.com/unicode/charts.html> | UTF
<http://www.macchiato.com/unicode/convert.html> | Forms
<http://www-4.ibm.com/software/developer/library/utfencodingforms/> |
Home <http://www.macchiato.com>.

Name 	Text 	Image
Arabic (Arabic) 	يونِكود 	?
Arabic (Persian) 	یونی‌کُد 	/ ?/
Armenian 	Յունիկօդ 	
Bengali 	য়ূনিকোড 	
Bopomofo 	ㄊㄨㄥ˅ ㄧˋ ㄇㄚ˅ 	
ㄨㄢˋ ㄍㄨㄛˊ ㄇㄚ˅ 	
Braille 	  	 
Buhid 	  	 
Canadian Aboriginal 	ᔫᗂᑰᑦ 	
Cherokee 	ᏳᏂᎪᏛ 	
Cypriot 	  	 
Cyrillic (Russian) 	Юникод 	?
Deseret (English) 	??????? 	
Devanagari (Hindi) 	यूनिकोड 	?
Ethiopic 	ዩኒኮድ 	
Georgian 	უნიკოდი 	?
Gothic 	  	 
Greek 	Γιούνικοντ 	
Gujarati 	યૂનિકોડ 	
Gurmukhi 	ਯੂਨਿਕੋਡ 	
Han (Chinese) 	统一码 	?
統一碼 	?
万国码 	?
萬國碼 	?
Hangul 	유니코드 	
Hanunoo 	  	 
Hebrew 	יוניקוד 	
Hebrew (pointed) 	יוּנִיקוׁד 	
Hebrew (Yiddish) 	יוניקאָד 	?
Hiragana (Japanese) 	ゆにこおど 	 
Katakana (Japanese) 	ユニコード 	?
Kannada 	ಯೂನಿಕೋಡ್ 	
Khmer 	យូនីគោដ 	
Lao 	  	 
Latin 	Unicode 	Unicode
Latin (IPA <#English_Pronunciation>) 	ˈjunɪˌkoːd 	?
Latin (Am. Dict. <#American_Dictionary>) 	Ūnĭcōde̽ 	?
Limbu 	  	 
Linear B 	  	 
Malayalam 	യൂനികോഡ് 	
Mongolian 	  	
Myanmar 	  	
Ogham 	ᚔᚒᚅᚔᚉᚑᚇ 	/ /
Old Italic 	  	 
Oriya 	ୟୂନିକୋଡ 	
Osmanya 	  	 
Runic (Anglo-Saxon) 	ᛡᚢᚾᛁᚳᚩᛞ 	
Shavian 	  	 
Sinhala 	යණනිකෞද් 	
Syriac 	ܝܘܢܝܩܘܕ 	
Tagbanwa 	  	 
Tagalog 	  	 
Tai Le 	  	 
Tamil 	யூனிகோட் 	
Telugu 	యూనికోడ్ 	
Thaana 	  	
Thai 	ยูนืโคด 	
Tibetan (Dzongkha) 	ཨུ་ནི་ཀོཌྲ། 	
Ugaritic 	  	 
Yi 	  	


      Notes:

There are different ways to transcribe the word “Unicode”, depending on
the language and script. In some cases only one language customarily
uses a given script; in others there are many. The goal here is to
collect at least one transcription for each script, in a language
customarily written in that script, and more languages where possible.
If the transcription is the same for multiple languages in a script,
then a single representative language is used.

Still missing are transcriptions for the items above in RED (in at least
one language). I would appreciate any other transcriptions, or
corrections to the ones listed here. Send them to mark3@macchiato.com
<mailto:mark3@macchiato.com>, following the directions below:

    * *Supplying Missing Items*
          o Most Latin-script languages will follow the spelling, and
            change the pronunciation. For any that would not, it would
            be good to have the alternate spelling.
          o For non-Latin scripts the goal is to match the English
            pronunciation, /*not*/ the spelling. Above is the IPA <#IPA>
            (in phonemic transcription) that should be matched as
            closely as possible (without sounding affected in the target
            language).
          o Text is best supplied either as UTF-8 text or as code points
            in hexadecimal HTML character references, e.g. either of
            the following:
                + "Юникод"
                + "&#x042E;&#x043D;&#x0438;&#x043A;&#x043E;&#x0434;"
                + Note: for / supplementary characters/
                  <http://www.unicode.org/glossary/#supplementary_character>,
                  there should be one hex number per code point, not two
                  surrogates
                  <http://www.unicode.org/glossary/#surrogate_code_point>:
                      # &#x10000; /*not*/ &#xD800;&#xDC00;
          o If you have a good font, I'd also appreciate a GIF. It
            should be *96 x 24* pixels, with the text centered, in black
            on white (plus grays if smoothed).
    * *Other Comments*
          o Because some browsers won't handle the text, both text and
            GIF image are supplied. If you can’t read the text columns,
            see Display Problems
            <http://www.unicode.org/help/display_problems.html>.
          o The Chinese versions (including Bopomofo) are translations,
            not transcriptions, since "transcription in Chinese is
            pretty lame" [J. Becker].
          o There are other "translations" of Unicode that may be in
            use, such as the Vietnamese "Thống Nhất Mã".
          o For sample pages in different languages on the Unicode site,
            see What is Unicode?
            <http://www.unicode.org/unicode/standard/WhatIsUnicode.html>
          o Americans are not generally used to IPA, and find a variety
            of different systems in their dictionaries. This one leaves
            the base letters as they are, and uses diacritics for
            pronunciation.
    * *Etymology of /Unicode/*
          o Coined by J. Becker. Not related to previous usages, such as:
                + A telegraphic code in which one word or set of letters
                  represents a sentence or phrase; a telegram or message
                  in this. (late 19th century, OED)
          o According to my references, the prefix "uni" is directly
            from Latin while the word "code" is through French.
          o The original Indo-European apparently would have been
            *oino-kau-do ("one strike give"): *kau apparently being
            related to such English words as: hew, haggle, hoe, hag,
            hay, hack, caudad, caudal, caudate, caudex, coda, codex,
            codicil, coward, incus, and Kovač (personal name: "smith").
                + I will leave the exact derivations to the exegetes,
                  but I like the association with "haggle" myself.
    * *Contributions*
          o This draws on contributions or comments from:
                + Dixon Au
                + Joe Becker
                + Maurice Bauhahn
                + Abel Cheung
                + Peter Constable
                + Michael Everson
                + Christopher John Fynn
                + Michael Kaplan
                + George Kiraz
                + Abdul Malik
                + Siva Nataraja
                + Roozbeh Pournader
                + Jonathan Rosenne
                + Jungshik Shin

------------------------------------------------------------------------
	

Terms of Use <http://www.macchiato.com/terms_of_use.html>. Last updated:
MED - 04/20/2003 15:30:33.