Diffstat (limited to 'test/iso-8859-1a.html')
-rw-r--r-- | test/iso-8859-1a.html | 275 |
1 files changed, 275 insertions, 0 deletions
diff --git a/test/iso-8859-1a.html b/test/iso-8859-1a.html new file mode 100644 index 0000000..972329d --- /dev/null +++ b/test/iso-8859-1a.html @@ -0,0 +1,275 @@ +<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> +<!-- X-URL: http://www.ramsch.org/martin/uni/fmi-hp/iso8859-1.html --> +<!-- Date: Tue, 28 Dec 2004 20:24:09 GMT --> +<!-- Last-Modified: Mon, 15 May 2000 09:37:37 GMT --> +<HTML> +<HEAD> +<TITLE>Martin Ramsch - iso8859-1 table</TITLE> +<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1"> +<BASE HREF="http://www.ramsch.org/martin/uni/fmi-hp/iso8859-1.html"> +</HEAD> + +<BODY> + +<H1 ALIGN=center>iso8859-1 table, with cp-1252</H1> + +<PRE> +Description Code Entity name +=================================== ============ ============== +quotation mark &#34; --> " &quot; --> " +ampersand &#38; --> & &amp; --> & +less-than sign &#60; --> < &lt; --> < +greater-than sign &#62; --> > &gt; --> > + +Description Char Code Entity name +=================================== ==== ============ ============== +euro sign &128; --> € +undefined &129; -->  +single low-9 quotation mark &130; --> ‚ +latin small letter f with hook &131; --> ƒ +double low-9 quotation mark &132; --> „ +horizontal ellipsis &133; --> … +dagger &134; --> † +double dagger &135; --> ‡ +modifier letter circumflex accent &136; --> ˆ +per mille sign &137; --> ‰ +latin capital letter s with caron &138; --> Š +single left-pointing angle quote mark &139; --> ‹ +latin capital ligature oe &140; --> Œ +undefined &141; -->  +latin capital letter z with caron &142; --> Ž +undefined &143; -->  + +undefined &144; -->  +left single quotation mark &145; --> ‘ +right single quotation mark &146; --> ’ +left double quotation mark &147; --> “ +right double quotation mark &148; --> ” +bullet &149; --> • +en dash &150; --> – +em dash &151; --> — +small tilde &152; --> ˜ +trade mark sign &153; --> ™ +latin small letter s with caron &154; --> š +single right-pointing angle quote mark 
&155; --> › +latin small ligature oe &156; --> œ +undefined &157; -->  +latin small letter z with caron &158; --> ž +latin capital letter y with diaeresis &159; --> Ÿ + +non-breaking space &#160; -->   &nbsp; --> +inverted exclamation &#161; --> ¡ &iexcl; --> ¡ +cent sign &#162; --> ¢ &cent; --> ¢ +pound sterling &#163; --> £ &pound; --> £ +general currency sign &#164; --> ¤ &curren; --> ¤ +yen sign &#165; --> ¥ &yen; --> ¥ +broken vertical bar &#166; --> ¦ &brvbar; --> ¦ + Non-standard &brkbar; --> &brkbar; +section sign &#167; --> § &sect; --> § +umlaut (dieresis) &#168; --> ¨ &uml; --> ¨ + Non-standard &die; --> ¨ +copyright &#169; --> © &copy; --> © +feminine ordinal &#170; --> ª &ordf; --> ª +left angle quote, guillemotleft &#171; --> « &laquo; --> « +not sign &#172; --> ¬ &not; --> ¬ +soft hyphen &#173; --> ­ &shy; --> ­ +registered trademark &#174; --> ® &reg; --> ® +macron accent &#175; --> ¯ &macr; --> ¯ + Non-standard &hibar; --> &hibar; +degree sign &#176; --> ° &deg; --> ° +plus or minus &#177; --> ± &plusmn; --> ± +superscript two &#178; --> ² &sup2; --> ² +superscript three &#179; --> ³ &sup3; --> ³ +acute accent &#180; --> ´ &acute; --> ´ +micro sign &#181; --> µ &micro; --> µ +paragraph sign &#182; --> ¶ &para; --> ¶ +middle dot &#183; --> · &middot; --> · +cedilla &#184; --> ¸ &cedil; --> ¸ +superscript one &#185; --> ¹ &sup1; --> ¹ +masculine ordinal &#186; --> º &ordm; --> º +right angle quote, guillemotright &#187; --> » &raquo; --> » +fraction one-fourth &#188; --> ¼ &frac14; --> ¼ +fraction one-half &#189; --> ½ &frac12; --> ½ +fraction three-fourths &#190; --> ¾ &frac34; --> ¾ +inverted question mark &#191; --> ¿ &iquest; --> ¿ +capital A, grave accent &#192; --> À &Agrave; --> À +capital A, acute accent &#193; --> Á &Aacute; --> Á +capital A, circumflex accent &#194; -->  &Acirc; -->  +capital A, tilde &#195; --> à &Atilde; --> à +capital A, dieresis or umlaut mark &#196; --> Ä &Auml; --> Ä +capital A, ring &#197; --> Å &Aring; --> Å 
+capital AE diphthong (ligature) &#198; --> Æ &AElig; --> Æ +capital C, cedilla &#199; --> Ç &Ccedil; --> Ç +capital E, grave accent &#200; --> È &Egrave; --> È +capital E, acute accent &#201; --> É &Eacute; --> É +capital E, circumflex accent &#202; --> Ê &Ecirc; --> Ê +capital E, dieresis or umlaut mark &#203; --> Ë &Euml; --> Ë +capital I, grave accent &#204; --> Ì &Igrave; --> Ì +capital I, acute accent &#205; --> Í &Iacute; --> Í +capital I, circumflex accent &#206; --> Î &Icirc; --> Î +capital I, dieresis or umlaut mark &#207; --> Ï &Iuml; --> Ï +capital Eth, Icelandic &#208; --> Ð &ETH; --> Ð + Non-standard &Dstrok; --> Đ +capital N, tilde &#209; --> Ñ &Ntilde; --> Ñ +capital O, grave accent &#210; --> Ò &Ograve; --> Ò +capital O, acute accent &#211; --> Ó &Oacute; --> Ó +capital O, circumflex accent &#212; --> Ô &Ocirc; --> Ô +capital O, tilde &#213; --> Õ &Otilde; --> Õ +capital O, dieresis or umlaut mark &#214; --> Ö &Ouml; --> Ö +multiply sign &#215; --> × &times; --> × +capital O, slash &#216; --> Ø &Oslash; --> Ø +capital U, grave accent &#217; --> Ù &Ugrave; --> Ù +capital U, acute accent &#218; --> Ú &Uacute; --> Ú +capital U, circumflex accent &#219; --> Û &Ucirc; --> Û +capital U, dieresis or umlaut mark &#220; --> Ü &Uuml; --> Ü +capital Y, acute accent &#221; --> Ý &Yacute; --> Ý +capital THORN, Icelandic &#222; --> Þ &THORN; --> Þ +small sharp s, German (sz ligature) &#223; --> ß &szlig; --> ß +small a, grave accent &#224; --> à &agrave; --> à +small a, acute accent &#225; --> á &aacute; --> á +small a, circumflex accent &#226; --> â &acirc; --> â +small a, tilde &#227; --> ã &atilde; --> ã +small a, dieresis or umlaut mark &#228; --> ä &auml; --> ä +small a, ring &#229; --> å &aring; --> å +small ae diphthong (ligature) &#230; --> æ &aelig; --> æ +small c, cedilla &#231; --> ç &ccedil; --> ç +small e, grave accent &#232; --> è &egrave; --> è +small e, acute accent &#233; --> é &eacute; --> é +small e, circumflex accent &#234; --> ê &ecirc; --> 
ê +small e, dieresis or umlaut mark &#235; --> ë &euml; --> ë +small i, grave accent &#236; --> ì &igrave; --> ì +small i, acute accent &#237; --> í &iacute; --> í +small i, circumflex accent &#238; --> î &icirc; --> î +small i, dieresis or umlaut mark &#239; --> ï &iuml; --> ï +small eth, Icelandic &#240; --> ð &eth; --> ð +small n, tilde &#241; --> ñ &ntilde; --> ñ +small o, grave accent &#242; --> ò &ograve; --> ò +small o, acute accent &#243; --> ó &oacute; --> ó +small o, circumflex accent &#244; --> ô &ocirc; --> ô +small o, tilde &#245; --> õ &otilde; --> õ +small o, dieresis or umlaut mark &#246; --> ö &ouml; --> ö +division sign &#247; --> ÷ &divide; --> ÷ +small o, slash &#248; --> ø &oslash; --> ø +small u, grave accent &#249; --> ù &ugrave; --> ù +small u, acute accent &#250; --> ú &uacute; --> ú +small u, circumflex accent &#251; --> û &ucirc; --> û +small u, dieresis or umlaut mark &#252; --> ü &uuml; --> ü +small y, acute accent &#253; --> ý &yacute; --> ý +small thorn, Icelandic &#254; --> þ &thorn; --> þ +small y, dieresis or umlaut mark &#255; --> ÿ &yuml; --> ÿ +</PRE> +<!-- removed: second /PRE, a hack for HotJava 1.0 preBeta 1 --> +<HR> + +<STRONG>How to read</STRONG> this table. The columns are +<DL COMPACT> +<DT>1st:<DD>textual <EM>description</EM> of the character +<DT>2nd:<DD>character inserted directly into the HTML page as <EM>one + byte</EM> +<DT>3rd:<DD>character written as <EM>numeric HTML entity</EM>, in the + format:<BR>"how it looks literally" <CODE>--></CODE> + "what your browser does with it" +<DT>4th:<DD>character written as <EM>symbolic HTML entity</EM>, in the + format:<BR>"how it looks literally" <CODE>--></CODE> + "what your browser does with it" +</DL> + +So for example, if you see something like "<CODE>&divide; --> +&divide;</CODE>" in the 4th column, this means your browser +doesn't know about the entity name "divide" and just puts it +literally. 
+ +<P> +<STRONG>This table</STRONG> grew out of an overview of the "ISO +Latin-1 Character Set" overview related to the Hyper-G Text Format +(<A HREF="http://www.hyperwave.de/HTFdoc">HTF</A>). + +The entity names <CODE>&brkbar;</CODE> and <CODE>&Dstrok;</CODE> +seem to be unique to HTF. + +The entity name <CODE>&hibar;</CODE> has been supported by X Mosaic +but seems to be replaced with <CODE>&macr;</CODE>. + +The entity names <CODE>&uml;</CODE> and <CODE>&die;</CODE> should +be equivalent. + +<P><STRONG>The standards stuff:</STRONG> +The +<A HREF="http://www.w3.org/hypertext/WWW/MarkUp/html-spec/">HTML 2.0 Standard</A> +includes a section on +<A HREF="http://www.w3.org/hypertext/WWW/MarkUp/html-spec/html-spec_9.html#SEC99">Character Entity Sets</A> +and an overview on the +<A HREF="http://www.w3.org/hypertext/WWW/MarkUp/html-spec/html-spec_13.html#SEC106">HTML Coded Character Set</A> +(The entity names are derived from <A HREF="http://www.ucc.ie/info/net/isolat1.html">ISO 8879</A>). +<BR> + +Or have a look at the +<A HREF="http://www.w3.org/hypertext/WWW/MarkUp/html3/latin1.html">Latin-1 Character Entities</A> +as listed in an draft for the +<A HREF="http://www.w3.org/hypertext/WWW/MarkUp/html3/CoverPage.html">HTML 3.0 specification</A>. +<BR> + +The +<A HREF="http://www.w3.org/hypertext/WWW/MarkUp/HTMLPlus/htmlplus_59.html">Appendix II</A> +of CERN's +<A HREF="http://www.w3.org/hypertext/WWW/MarkUp/HTMLPlus/htmlplus_1.html">HTML+ Discussion Document</A> +contains a +<A HREF="http://www.w3.org/hypertext/WWW/MarkUp/HTMLPlus/htmlplus_table.ps">table</A> +(in PostScript format) of the proposed character entities for HTML+ and their +corresponding character codes for Unicode and the Adobe Latin-1 & Symbol +character sets. 
+<P> + +<STRONG>Please note</STRONG> that there is nothing wrong with using +characters of ISO Latin-1 above 127: the normal transmission protocol +for the WWW, +<A HREF="http://www.w3.org/pub/WWW/Protocols/rfc1945/rfc1945">HTTP/1.0</A>, +uses the 8bit ISO latin-1 as default encoding. +(Thanks to Roman +Czyborra for pointing this out!) +<P> + +<STRONG>Other information:</STRONG> +<UL> + +<LI><STRONG>Kevin J. Brewer</STRONG> has done two very good pages on the subject: + <UL> + <LI><A HREF="http://www.bbsinc.com/iso8859.html">ASCII - ISO 8859-1 (Latin-1) with HTML 3.0 Entities Table</A> and + <LI><A HREF="http://www.bbsinc.com/iso8879.html">ISO 8879 Entities Gopher Menu</A> + </UL> + +<LI>The excellent overview on the series of + <A HREF="http://czyborra.com/charsets/iso8859.html">ISO 8859 + character sets</A> compiled by Roman Czyborra. + +<LI>Also have a look on Alan Flavell's page of + <A HREF="http://ppewww.ph.gla.ac.uk/%7Eflavell/iso8859/iso8859-pointers.html">pointers + to information about ISO8859</A>. It's written very well! + +<LI>Maybe also of interest to you is the + <A HREF="ftp://ftp.vlsivie.tuwien.ac.at/pub/8bit/FAQ-ISO-8859-1">ISO + 8859-1 FAQ</A> by Michael Gschwind + (<A HREF="mailto:mike@vlsivie.tuwien.ac.at">mike@vlsivie.tuwien.ac.at</A>), + part of his page on + <A HREF="http://www.vlsivie.tuwien.ac.at/mike/i18n.html">Internationalization</A>. + +<LI>For users of X11R5 on SunOS systems: the + <A HREF="Compose.txt">table over the compose combinations</A> + (also coded <A HREF="Compose.html">with entities</A> where possible). + It's taken from the MIT X sources in + <CODE>server/ddx/sun/Compose.list</CODE>. + +<LI>Finally you could have a look at + <A HREF="ftp://ds.internic.net/rfc/rfc1345.txt">RFC 1345: + Character Mnemonics & Character Sets</A> + by K. Simonsen (06/11/92, 103 pages, approx. 240 kbyte). 
+ +</UL> + + +<HR> + +<ADDRESS><A HREF="http://ramsch.home.pages.de/">Martin Ramsch</A>, 16.02.1994, 07.01.1996, 01.07.1996, 1998-10-09, 2000-05-15</ADDRESS> + +</BODY> +</HTML>