Adding upstream version 7.0.6-dfsg.

Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
author: Daniel Baumann <daniel.baumann@progress-linux.org> 2024-04-07 16:49:04 +0000
committer: Daniel Baumann <daniel.baumann@progress-linux.org> 2024-04-07 16:49:04 +0000
commit: 16f504a9dca3fe3b70568f67b7d41241ae485288 (patch)
tree: c60f36ada0496ba928b7161059ba5ab1ab224f9d /src/libs/xpcom18a4/xpcom/string/doc
parent: Initial commit. (diff)
2 files changed, 2552 insertions, 0 deletions
diff --git a/src/libs/xpcom18a4/xpcom/string/doc/README.html b/src/libs/xpcom18a4/xpcom/string/doc/README.html
new file mode 100644
index 00000000..154b7969
--- /dev/null
+++ b/src/libs/xpcom18a4/xpcom/string/doc/README.html
@@ -0,0 +1,44 @@
+<html>
+<!-- ***** BEGIN LICENSE BLOCK *****
+   - Version: MPL 1.1/GPL 2.0/LGPL 2.1
+   -
+   - The contents of this file are subject to the Mozilla Public License Version
+   - 1.1 (the "License"); you may not use this file except in compliance with
+   - the License. You may obtain a copy of the License at
+   - http://www.mozilla.org/MPL/
+   -
+   - Software distributed under the License is distributed on an "AS IS" basis,
+   - WITHOUT WARRANTY OF ANY KIND, either express or implied. See the License
+   - for the specific language governing rights and limitations under the
+   - License.
+   -
+   - The Original Code is Mozilla.
+   -
+   - The Initial Developer of the Original Code is
+   - Netscape Communications.
+   - Portions created by the Initial Developer are Copyright (C) 2001
+   - the Initial Developer. All Rights Reserved.
+   -
+   - Contributor(s):
+   -   Scott Collins <scc@mozilla.org> (original author)
+   -
+   - Alternatively, the contents of this file may be used under the terms of
+   - either of the GNU General Public License Version 2 or later (the "GPL"),
+   - or the GNU Lesser General Public License Version 2.1 or later (the "LGPL"),
+   - in which case the provisions of the GPL or the LGPL are applicable instead
+   - of those above. If you wish to allow use of your version of this file only
+   - under the terms of either the GPL or the LGPL, and not to allow others to
+   - use your version of this file under the terms of the MPL, indicate your
+   - decision by deleting the provisions above and replace them with the notice
+   - and other provisions required by the GPL or the LGPL. If you do not delete
+   - the provisions above, a recipient may use your version of this file under
+   - the terms of any one of the MPL, the GPL or the LGPL.
+   -
+   - ***** END LICENSE BLOCK ***** -->
+<body>
+  <h1><span class="LXRSHORTDESC">documentation aimed at programmers who are clients of the string library</span></h1>
+<p>
+  <span class="LXRLONGDESC"></span>
+</p>
+</body>
+</html>
diff --git a/src/libs/xpcom18a4/xpcom/string/doc/string-guide.html b/src/libs/xpcom18a4/xpcom/string/doc/string-guide.html
new file mode 100644
index 00000000..41dbd217
--- /dev/null
+++ b/src/libs/xpcom18a4/xpcom/string/doc/string-guide.html
@@ -0,0 +1,2508 @@
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
+<html>
+  <head>
+    <title>an incomplete guide to mozilla/string</title>
+
+    <link rel="stylesheet" href="http://www.mozilla.org/projects/string/string-guide.css" title="remote stylesheet" type="text/css">
+    <link rel="alternate stylesheet" href="string-guide.css" title="local stylesheet" type="text/css">
+  </head>
+  <body>
+<!-- ----|---------|---------|---------|---------|---------|---------|---------| -->
+<!-- ...............................................................Front Matter -->
+<h1>an incomplete guide to <a class="exact-uri" href="http://lxr.mozilla.org/seamonkey/source/string/">mozilla/string</a></h1>
+    <h1><font color="red">This document is now deprecated in favor of <a href="http://www.mozilla.org/projects/xpcom/string-guide.html">The new string guide</a>.</font></h1>
+<div class="author-note">
+  <p>by <a href="http://ScottCollins.net/">Scott Collins</a><!-- /p -->
+  <p>last modified 8 April 2001<!-- /p -->
+</div>
+
+<div class="abstract">
+  <p>
+    <h1>Abstract</h1>
+    This document <span class="LXRSHORTDESC">provides
+      an <a href="#users_guide">introduction</a> to the design and use of the string classes in mozilla,
+      <a href="#implementors_guide">detailed information</a> on their implementation and how one may extend them,
+      and <a href="#faq">answers</a> to frequently asked questions about strings</span>.
+  </p>
+</div>
+
+
+
+<h2><a name="contents">contents</a></h2>
+
+<div class="contents">
+  <ul>
+    <li><a href="#users_guide"       >user's guide</a></li>
+    <li><a href="#implementors_guide">implementor's guide</a></li>
+    <li><a href="#faq"               >frequently asked questions</a></li>
+  </ul>
+</div>
+
+<p>
+  Please direct all comments, requests, and contributions to,
+      in order of preference,
+    the tracking bug <a href="http://bugzilla.mozilla.org/show_bug.cgi?id=70076">#70076</a> for this document,
+    the author <a class="exact-uri" href="mailto:scc@mozilla.org?subject=string-guide">scc@mozilla.org</a>, and/or
+    the newsgroup <a class="exact-uri" href="news:netscape.public.mozilla.xpcom">news:netscape.public.mozilla.xpcom</a>
+      (should there be a strings newsgroup?)
+</p>
+
+<div class="author-note">
+  <p>
+    A note to potential editors:
+      don't even <strong>consider</strong> modifying this document with an HTML editor.
+    That would destroy the internal formatting,
+      and make patches unmanageable.
+  </p>
+</div>
+
+
+
+
+<!-- ...............................................................User's Guide -->
+<hr>
+<h1><a name="users_guide">user's guide</a></h1>
+
+<div class="author-note">
+  <p>
+    Strings in mozilla are a world apart from <span class="code">char*</span>s.
+    If you don't know why they are different,
+      this section is the place for you to start.
+    If you're already familiar with the hierarchy of string classes in mozilla,
+      then you might want to skip ahead to the <a href="#implementors_guide">implementor's guide</a>
+      or the <a href="#faq">FAQ</a>.
+  </p>
+</div>
+
+<div class="contents">
+  <ul>
+    <li><a href="#users_guide_introduction">introduction</a></li>
+    <li><a href="#users_guide_how_to"      >using the string classes correctly; using the correct string class</a></li>
+    <li><a href="#users_guide_iterators"   >using string iterators</a></li>
+    <li><a href="#users_guide_summary"     >summary</a></li>
+  </ul>
+</div>
+
+<h2><a name="users_guide_introduction">introduction</a></h2>
+  <h3>what is and what isn't a string?</h3>
+<p>
+  A string is an opaque container holding a linear sequence of characters, possibly of zero length.
+  Understanding the implications of this statement is the foundation for understanding all of mozilla's string classes.
+</p>
+
+  <h3>readable and writable</h3>
+  <h3>dependent strings</h3>
+  <h3>flat strings</h3>
+  <h3>encoding</h3>
+  <h3>sharing</h3>
+
+<h2><a name="users_guide_how_to">using the string classes correctly; using the correct string class</a></h2>
+  <h3>basic string operations</h3>
+    <h4>comparison</h4>
+    <h4>concatenation</h4>
+    <h4>substrings</h4>
+    <h4>find and replace</h4>
+  <h3>conversions</h3>
+    <h4>calling a function that expects a different kind of string</h4>
+    <h4>converting between string classes</h4>
+    <h4>converting between encodings</h4>
+  <h3>selecting the right string class</h3>
+    <h4>user string classes</h4>
+    <h4>selecting the right string class for a parameter</h4>
+    <h4>selecting the right string class for a local variable</h4>
+    <h4>selecting the right string class for a member variable</h4>
+    <h4>selecting the right string class for a return value</h4>
+    <h4>selecting the right string class in IDL</h4>
+  <h3>don'ts</h3>
+
+<h2><a name="users_guide_iterators">using string iterators</a></h2>
+  <h3>what is an iterator?</h3>
+  <h3>reading iterators and writing iterators</h3>
+  <h3>`chunky' iterating for efficiency</h3>
+  <h3><span class="code">copy_string</span>, character sources and sinks</h3>
+  <h3>encoding conversion iterators</h3>
+
+<h2><a name="users_guide_summary">summary</a></h2>
+
+
+<!-- ........................................................Implementor's Guide -->
+<hr>
+<h1><a name="implementors_guide">implementor's guide</a></h1>
+
+<div class="author-note">
+  <p>
+    
+  </p>
+</div>
+
+<div class="contents">
+  <ul>
+    <!-- li></li -->
+  </ul>
+</div>
+
+
+
+<!-- ........................................................................FAQ -->
+<hr>
+<h1><a name="faq">frequently asked questions</a></h1>
+
+<div class="author-note">
+</div>
+
+<div class="contents">
+  <ul>
+<!--
+    <li>
+      I have a wide string, i.e., an instance of a class derived from <span class="code">nsAString</span>
+      <ul>
+        <li>I want a pointer to the characters</li>
+        <li>I want a narrow string</li>
+        <li>I want to <span class="code">printf</span> it</li>
+      </ul>
+    </li>
+    <li>
+      I have a <span class="code">PRUnichar*</span>
+      <ul>
+        <li>I want a wide string</li>
+        <li>I want a narrow string</li>
+        <li>I want to <span class="code">printf</span> it</li>
+      </ul>
+    </li>
+    <li>
+      I have a narrow string, i.e., an instance of a class derived from <span class="code">nsACString</span>
+      <ul>
+        <li>I want a pointer to the characters</li>
+        <li>I want a wide string</li>
+        <li>I want to <span class="code">printf</span> it</li>
+      </ul>
+    </li>
+    <li>
+      I have a <span class="code">char*</span>
+      <ul>
+        <li>I want a wide string</li>
+        <li>I want a narrow string</li>
+      </ul>
+    </li>
+    <li>
+      I have a literal character sequence, e.g., <span class="code">"Hello, World!\n"</span>
+      <ul>
+        <li>I want a wide string</li>
+        <li>I want a narrow string</li>
+      </ul>
+    </li>
+    <li>What's the best way to return a string?</li>
+    <li>How can I get a pointer to the characters in a string?</li>
+    <li>How can I <span class="code">printf</span> a string?</li>
+-->
+  </ul>
+</div>
+
+
+<table class="chart">
+  <tr>
+    <th></th>
+    <th colspan="5">you have some <span class="code">char</span>s</th>
+  </tr>
+  <tr>
+    <th>you want</th>
+    <th><span class="code">'x'</span></th>
+    <th><span class="code">char c</span></th>
+    <th><span class="code">"foo"</span></th>
+    <th><span class="code">char* cp</span></th>
+    <th><span class="code">nsACString&amp; cs</span></th>
+  </tr>
+  <tr>
+    <th class="row-label"><span class="code">char</span></th>
+    <td colspan="2">.</td>
+<!-- "foo"          -->    <td><span class="code">[]</span></td>
+<!-- char* cp       -->    <td><span class="code">[]</span></td>
+<!-- nsACString& cs -->    <td><a href="#faq_how_to_extract_a_character">extract a character</a></td>
+  </tr>
+  <tr>
+    <th class="row-label"><span class="code">PRUnichar</span></th>
+<!-- 'x'            -->    <td><span class="code">PRUnichar('x')</span></td>
+<!-- char c         -->    <td><span class="code">PRUnichar(c)</span></td>
+    <td colspan="3"><a href="#faq_how_to_convert_encoding">convert encoding</a>, <a href="#faq_how_to_extract_a_character">extract a character</a></td>
+  </tr>
+  <tr>
+    <th class="row-label"><span class="code">char*</span></th>
+<!-- 'x'            -->    <td><span class="code">&amp;</span></td>
+<!-- char c         -->    <td><span class="code">&amp;</span></td>
+<!-- "foo"          -->    <td><span class="code">&amp;</span></td>
+<!-- char* cp       -->    <td>.</td>
+<!-- nsACString& cs -->    <td><a href="#faq_how_to_get_a_pointer">get a pointer</a></td>
+  </tr>
+  <tr>
+    <th class="row-label"><span class="code">PRUnichar*</span></th>
+    <td colspan="5"><a href="#faq_how_to_convert_encoding">convert encoding</a>, <a href="#faq_how_to_get_a_pointer">get a pointer</a></td>
+  </tr>
+  <tr>
+    <th class="row-label"><span class="code">nsACString</span></th>
+<!-- 'x'            -->    <td><span class="code">NS_LITERAL_CSTRING("x")</span></td>
+<!-- char c         -->    <td><a href="#faq_how_to_make_a_string">make a string</a></td>
+<!-- "foo"          -->    <td><span class="code">NS_LITERAL_CSTRING("foo")</span></td>
+<!-- char* cp       -->    <td><a href="#faq_how_to_make_a_string">make a string</a></td>
+<!-- nsACString& cs -->    <td>.</td>
+  </tr>
+  <tr>
+    <th class="row-label"><span class="code">nsAString</span></th>
+<!-- 'x'            -->    <td><span class="code">NS_LITERAL_STRING("x")</span></td>
+<!-- char c         -->    <td><a href="#faq_how_to_convert_encoding">convert encoding</a></td>
+<!-- "foo"          -->    <td><span class="code">NS_LITERAL_STRING("foo")</span></td>
+    <td colspan="2"><a href="#faq_how_to_convert_encoding">convert encoding</a></td>
+  </tr>
+  <tr>
+    <th class="row-label">to call <span class="code">printf</span></th>
+    <td colspan="4">.</td>
+<!-- nsACString& cs -->    <td><a href="#faq_how_to_call_printf">call <span class="code">printf</span></a></td>
+  </tr>
+</table>
+
+<table class="chart">
+  <tr>
+    <th></th>
+    <th colspan="3">you have some <span class="code">PRUnichar</span>s</th>
+  </tr>
+  <tr>
+    <th>you want</th>
+    <th><span class="code">PRUnichar w</span></th>
+    <th><span class="code">PRUnichar* wp</span></th>
+    <th><span class="code">nsAString&amp; s</span></th>
+  </tr>
+  <tr>
+    <th class="row-label"><span class="code">char</span></th>
+<!-- PRUnichar w    -->    <td></td>
+<!-- PRUnichar* wp  -->    <td></td>
+<!-- nsAString& s   -->    <td></td>
+  </tr>
+  <tr>
+    <th class="row-label"><span class="code">PRUnichar</span></th>
+<!-- PRUnichar w    -->    <td></td>
+<!-- PRUnichar* wp  -->    <td><span class="code">[]</span></td>
+<!-- nsAString& s   -->    <td><a href="#faq_how_to_extract_a_character">extract a character</a></td>
+  </tr>
+  <tr>
+    <th class="row-label"><span class="code">char*</span></th>
+<!-- PRUnichar w    -->    <td></td>
+<!-- PRUnichar* wp  -->    <td></td>
+<!-- nsAString& s   -->    <td></td>
+  </tr>
+  <tr>
+    <th class="row-label"><span class="code">PRUnichar*</span></th>
+<!-- PRUnichar w    -->    <td><span class="code">&amp;</span></td>
+<!-- PRUnichar* wp  -->    <td></td>
+<!-- nsAString& s   -->    <td><a href="#faq_how_to_get_a_pointer">get a pointer</a></td>
+  </tr>
+  <tr>
+    <th class="row-label"><span class="code">nsACString</span></th>
+<!-- PRUnichar w    -->    <td></td>
+<!-- PRUnichar* wp  -->    <td></td>
+<!-- nsAString& s   -->    <td></td>
+  </tr>
+  <tr>
+    <th class="row-label"><span class="code">nsAString</span></th>
+<!-- PRUnichar w    -->    <td></td>
+<!-- PRUnichar* wp  -->    <td></td>
+<!-- nsAString& s   -->    <td></td>
+  </tr>
+  <tr>
+    <th class="row-label">to call <span class="code">printf</span></th>
+<!-- PRUnichar w    -->    <td></td>
+<!-- PRUnichar* wp  -->    <td></td>
+<!-- nsAString& s   -->    <td><a href="#faq_how_to_call_printf">call <span class="code">printf</span></a></td>
+  </tr>
+</table>
+
+<div class="faq">
+  <dl>
+    <dt>
+      is there any string doc?
+    </dt>
+    <dd>
+      Yes, you're soaking in it!
+    </dd>
+
+
+
+<!-- getting a pointer -->
+    <dt>
+      <a name="faq_how_to_get_a_pointer">I have a string, how do I get a pointer to the characters?</a>
+    </dt>
+    <dd>
+      You want to avoid this situation.
+      In your own interfaces, prefer string types over raw pointers.
+      Any interface that wants to process a string using a single pointer is making two expensive assumptions.
+      First, that the string is stored in one contiguous hunk; and
+        second, that the string is zero-terminated.
+      If this isn't the case,
+        then to get a pointer, storage must be allocated and the entire string must be copied to it and zero-terminated.
+      You may not be able to avoid needing a pointer when interacting with system calls. 
+    </dd>
+    <dd>
+      Some string classes guarantee that they are `flat'.
+      That is, that their data is stored in one contiguous zero-terminated hunk.
+      This <strong>does not</strong> imply that there are no embedded nulls.  Caveat emptor.
+      All strings that explicitly promise flatness
+        inherit from the class <span class="code">nsAFlatString</span>
+          or <span class="code">nsAFlatCString</span>
+        and can produce a constant pointer to their data with the <span class="code">get()</span> member function.
+      Even strings that don't explicitly promise to be flat
+        may happen to be flat.
+      The helper function <span class="code">PromiseFlatString</span> will produce
+        a <span class="code">const</span> dependent string that is guaranteed to be flat.
+      If you use this on a string that already happens to be flat,
+        the result is simply a reference through to that string.
+      Otherwise,
+        <span class="code">PromiseFlatString</span> does the work to allocate, copy, terminate, and manage
+        a temporary flat string.
+      Since the result of <span class="code">PromiseFlatString</span> is a temporary,
+        you must be careful not to get and hold a pointer to its data for longer than the temporary itself lives.
+    </dd>
+    <dd>
+<div class="source-code">
+<pre>
+  /* I have a string, how do I get a pointer to the characters? */
+
+extern void EvilNarrowOSFunction( const char* );    // evil OS routines that want pointers
+extern void EvilWideOSFunction( const PRUnichar* );
+
+void func( const nsAString&amp; aString, const nsACString&amp; aCString )
+  {
+    EvilWideOSFunction( NS_LITERAL_STRING("Hello, World!").<span class="notice">get()</span> );
+      // literal strings are flat already (as are |nsString|s, et al), just use |.get()|
+
+    EvilWideOSFunction( <span class="notice">PromiseFlatString(</span>aString<span class="notice">).get()</span> );
+      // for strings that don't explicitly guarantee flatness, use |PromiseFlatString|
+
+
+      // beware holding the pointer for longer than the life of the promise
+    <span class="warning">const PRUnichar* wp = PromiseFlatString(aString).get(); // BAD! |wp| dangles
+    EvilWideOSFunction(wp);</span>
+
+      // if you really need to use the pointer from |PromiseFlatString| in more than one expression...
+    const nsAFlatString&amp; flat = <span class="notice">PromiseFlatString(</span>aString<span class="notice">)</span>;
+    EvilWideOSFunction(flat.<span class="notice">get()</span>);
+    SomeOtherFunction(flat.<span class="notice">get()</span>);
+
+      // similarly for |char| strings
+    EvilNarrowOSFunction( <span class="notice">PromiseFlatCString(</span>aCString<span class="notice">).get()</span> );
+  }
+</pre>
+</div>
+    </dd>
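+    <dd>
+      The lifetime rule above is ordinary C++ temporary lifetime, so it can be tried out without the mozilla classes.
+      The sketch below is only an analogue, assuming <span class="code">std::string</span> as a stand-in for a flat string;
+        <span class="code">MakeFlat</span>, <span class="code">CountChars</span>, and <span class="code">LifetimeDemo</span>
+        are hypothetical names that do not exist in mozilla/string.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for |PromiseFlatCString|: returns a temporary,
// contiguous, zero-terminated copy of its argument.
std::string MakeFlat(const std::string& aString) {
  return std::string(aString);
}

// Hypothetical stand-in for an OS routine that wants a raw pointer.
std::size_t CountChars(const char* aChars) {
  return std::strlen(aChars);
}

bool LifetimeDemo() {
  const std::string source("Hello, World!");

  // GOOD: the temporary lives until the end of the full expression,
  // so the pointer is valid for the duration of the call.
  bool ok = CountChars(MakeFlat(source).c_str()) == 13;

  // GOOD: binding the temporary to a |const| reference extends its
  // lifetime to that of the reference, so the pointer can be used repeatedly.
  const std::string& flat = MakeFlat(source);
  ok = ok && CountChars(flat.c_str()) == 13;
  ok = ok && flat.c_str()[0] == 'H';

  // BAD (left commented out): the temporary dies at the semicolon,
  // so |p| would dangle before it is ever used.
  // const char* p = MakeFlat(source).c_str();

  return ok;
}
```

+      The reference-binding form mirrors the <span class="code">const nsAFlatString&amp; flat</span> line in the example above.
+    </dd>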
+
+
+
+<!-- extracting a character -->
+    <dt>
+      <a name="faq_how_to_extract_a_character">How do I get a particular character out of a string?</a>
+    </dt>
+    <dd>
+      Flat strings provide <span class="code">operator[]</span> and <span class="code">CharAt()</span>.
+      All strings provide <span class="code">First()</span>, <span class="code">Last()</span>, and access with iterators.
+      <strong>Don't</strong> promise a string flat just to do character indexing.
+      Prefer, instead, to get an iterator and <span class="code">advance</span> it to the position you care about.
+    </dd>
+    <dd>
+<div class="source-code">
+<pre>
+  /* How do I get a particular character out of a string? */
+
+PRUnichar Get5thCharacterOf( const nsAString&amp; aString )
+  {
+    if ( aString.Length() >= 5 )
+      {
+        nsAString::const_iterator iter;
+        aString.BeginReading(iter); // make |iter| point to the beginning of |aString|
+        iter.advance(4);            // the 5th character is 4 positions past the first
+        return *iter;
+      }
+
+    return PRUnichar(0);
+  }
+</pre>
+</div>
+    </dd>
+    <dd>
+      Using iterators isn't as bad as the example above makes it feel.
+      The typical use is for advancing through a string, examining many characters.
+    </dd>
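+    <dd>
+      For readers who want to experiment outside the mozilla tree, here is a rough analogue in standard C++.
+      It assumes <span class="code">std::u16string</span> as a stand-in for a wide string;
+        <span class="code">GetNthCharacterOf</span> and <span class="code">CountOccurrences</span> are hypothetical helpers,
+        not part of mozilla/string.
+      The first shows single-character access (note the zero-based offset), the second the more typical single pass over every character.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>

// Returns the character at one-based position |aPosition|,
// or 0 if the string is too short.
char16_t GetNthCharacterOf(const std::u16string& aString, std::size_t aPosition) {
  if (aPosition == 0 || aString.size() < aPosition) {
    return char16_t(0);
  }
  std::u16string::const_iterator iter = aString.begin();
  // The Nth character sits N-1 positions past the first one.
  std::advance(iter, static_cast<std::ptrdiff_t>(aPosition - 1));
  return *iter;
}

// The more typical use of an iterator: walk the string once,
// examining every character along the way.
std::size_t CountOccurrences(const std::u16string& aString, char16_t aChar) {
  std::size_t count = 0;
  for (std::u16string::const_iterator iter = aString.begin();
       iter != aString.end(); ++iter) {
    if (*iter == aChar) {
      ++count;
    }
  }
  return count;
}
```
+    </dd>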
+
+
+
+<!-- how to convert encoding -->
+    <dt>
+      <a name="faq_how_to_convert_encoding">How do I convert from one encoding to another?</a>
+    </dt>
+    <dd>
+    </dd>
+
+
+
+<!-- how to make a string -->
+    <dt>
+      <a name="faq_how_to_make_a_string">How do I create a string?</a>
+    </dt>
+    <dd>
+    </dd>
+
+
+<!-- how to return a string -->
+    <dt>
+      What is the best way to return a string?
+    </dt>
+    <dd>
+      <p>
+        There are several reasonable ways to produce a string result from a function.
+        If you are already holding the answer as a sharable string,
+          you can simply return that string (pass-by-value).
+        Otherwise,
+          the most efficient and flexible way to return a string is
+          to assign your result into a non-<span class="code">const</span> reference parameter.
+        Don't bother to create a sharable string from scratch with your generated result.
+      </p>
+      <p>
+        Why?
+        The two things you want to minimize in string manipulation are,
+          in order of importance,
+            heap allocation, and
+            moving characters around.
+      </p>
+    </dd>
+    <dd>
+<div class="source-code">
+<pre>
+  /* What is the best way to return a string? */
+
+class foo
+  {
+    public:
+      // ...
+      void GetShortName( nsAString&amp; aResult ) const;
+      nsCommonString GetFullName() const;
+      
+    private:
+      nsCommonString    mFullName;
+
+      const PRUnichar*  mShortName;
+      PRUint32          mShortNameLength;
+      
+  };
+
+nsCommonString
+foo::GetFullName() const
+  {
+    return mFullName;
+  }
+
+void
+foo::GetShortName( nsAString&amp; aResult ) const
+  {
+    aResult = DependentString(mShortName, mShortNameLength);
+  }
+</pre>
+</div>
+    </dd>
+
+
+    <dt>
+      <a name="faq_how_to_call_printf">How do I <span class="code">printf</span> a string, e.g., for debugging?</a>
+    </dt>
+    <dd>
+      If your string is already narrow, you just have to worry about <a href="#faq_how_to_get_a_pointer">making it flat, and then getting a pointer</a>.
+    </dd>
+    <dd>
+      If your string happens to be wide,
+        you'll need to convert it before you can <span class="code">printf</span> something reasonable.
+      If it's just for debugging,
+        you probably wouldn't care if something odd was printed in the case of a Unicode character that didn't have
+        an ASCII equivalent. (If you have a UTF-8 terminal, the result is 
+       perfectly legible and nothing odd is printed.)
+      The simplest thing in this case is to make a temporary conversion using <span class="code">NS_ConvertUTF16toUTF8</span>.
+      The result is conveniently flat already, so getting the pointer is simple.
+      Remember not to hold onto the pointer you get out of this beyond the lifetime of the temporary.
+    </dd>
+    <dd>
+<div class="source-code">
+<pre>
+  /* How do I |printf| a string? */
+
+
+void PrintSomeStrings( const nsAString&amp; aString, const PRUnichar* aKey, const nsACString&amp; aCString )
+  {
+      // |printf|ing a narrow string is easy
+    printf("%s\n", <span class="notice">PromiseFlatCString(</span>aCString<span class="notice">).get()</span>);     // GOOD
+
+      // the simplest way to get a |printf|-able |const char*| out of a string
+    printf("%s\n", <span class="notice">NS_ConvertUTF16toUTF8(</span>aKey<span class="notice">).get()</span>);       // GOOD
+
+      // works just as well with a formal wide string type...
+    printf("%s\n", <span class="notice">NS_ConvertUTF16toUTF8(</span>aString<span class="notice">).get()</span>);
+
+
+      // But don't hold onto the pointer longer than the lifetime of the temporary!
+    <span class="warning">const char* cstring = NS_ConvertUTF16toUTF8(aKey).get(); // BAD! |cstring| is dangling
+    printf("%s\n", cstring);</span>
+  }
+</pre>
+</div>
+    </dd>
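+    <dd>
+      To make concrete what such a temporary conversion does, here is a minimal standard C++ sketch.
+      It is not the mozilla implementation:
+        <span class="code">ConvertUTF16toUTF8</span> and <span class="code">PrintWide</span> are hypothetical names,
+        <span class="code">std::u16string</span> stands in for a wide string,
+        and replacing unpaired surrogates with U+FFFD is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical stand-in for |NS_ConvertUTF16toUTF8|: returns a narrow,
// flat string holding the UTF-8 form of |aSource|.
std::string ConvertUTF16toUTF8(const std::u16string& aSource) {
  std::string result;
  for (std::size_t i = 0; i < aSource.size(); ++i) {
    char32_t cp = aSource[i];
    if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < aSource.size() &&
        aSource[i + 1] >= 0xDC00 && aSource[i + 1] <= 0xDFFF) {
      // A high surrogate followed by a low surrogate: one non-BMP character.
      cp = 0x10000 + ((cp - 0xD800) << 10) + (aSource[i + 1] - 0xDC00);
      ++i;
    } else if (cp >= 0xD800 && cp <= 0xDFFF) {
      cp = 0xFFFD;  // unpaired surrogate (assumption: substitute U+FFFD)
    }

    if (cp < 0x80) {
      result += static_cast<char>(cp);
    } else if (cp < 0x800) {
      result += static_cast<char>(0xC0 | (cp >> 6));
      result += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
      result += static_cast<char>(0xE0 | (cp >> 12));
      result += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
      result += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
      result += static_cast<char>(0xF0 | (cp >> 18));
      result += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
      result += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
      result += static_cast<char>(0x80 | (cp & 0x3F));
    }
  }
  return result;
}

// The temporary returned by the conversion lives until the end of the
// |printf| call, so using its pointer inside that one expression is safe.
void PrintWide(const std::u16string& aString) {
  std::printf("%s\n", ConvertUTF16toUTF8(aString).c_str());
}
```
+    </dd>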
+
+  </dl>
+
+<p>
+  Here are the email answers I have yet to format into the FAQ.
+  Some of the URLs may be out-dated or moved.
+  The messages are in order from oldest to newest.
+</p>
+<p class="editnote">[Note : In June, 2003, these emails were modified
+to better reflect what is stored in 'wide' string
+classes (UTF-16 string instead of UCS-2)  and what        
+related methods do as a part of the patch for <a href=
+"http://bugzilla.mozilla.org/show_bug.cgi?id=183156" 
+title="replace UCS2 in function/class/method names with UTF16">bug 183156</a>.
+Therefore, they're a little different from  the original emails
+written by <a href="http://ScottCollins.net/">Scott Collins</a>]
+</p>
+<hr>
+<pre>
+Date: Thu, 13 Apr 2000 19:41:47 -0400
+</pre>
+
+<p>Encoding Wars
+
+<p>This message is all about strings and the various encodings that might
+be used to interpret their contents, the ramifications of that, and
+where we're heading.  The point of this message is to say what we're
+currently thinking, and get feedback.  I apologize in advance for the
+rambling, and for the fact that this message may accidentally mix
+discussion of how things <strong>are</strong> and how they will be.
+
+<p>There are many different possible encodings.  Three in common use in
+the Mozilla source base are: ASCII, UTF-16, and UTF-8.  In ASCII, every
+<!--the Mozilla source base are: ASCII, UCS2, and UTF8.  In ASCII, every-->
+character fits in 7-bits and is typically stored in an 8-bit byte.  We
+usually represent ASCII strings with <span class="code">nsCString</span>s, <span class="code">nsXPIDLCString</span>s,
+or <span class="code">char</span> string literals.  In UTF-16, characters occupy one 16-bit code unit
+(<a href="http://www.unicode.org/glossary/index.html#BMP_character">
+<abbr title="Basic Multilingual Plane">BMP</abbr> characters</a>)
+or two 16-bit code units
+(<a href="http://www.unicode.org/glossary/index.html#supplementary_character">
+<abbr title="Supplementary Planes: Planes 1 through 16">non-BMP</abbr> characters</a>).
+We usually represent UTF-16 strings as <span class="code">nsString</span>s, etc., i.e., two-byte
+or `wide' strings.  UTF-8 is a multi-byte encoding.  A character might
+occupy one, two, three, or four bytes.  It is easiest to store and
+manipulate such a string within a single-byte or `narrow' string
+implementation.
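+
+<p>The size rules in the paragraph above can be written down directly.  This is
+only an illustrative sketch in standard C++; <span class="code">UTF16UnitsFor</span> and
+<span class="code">UTF8BytesFor</span> are hypothetical helpers, and both assume they are
+handed a valid Unicode code point.

```cpp
#include <cassert>
#include <cstddef>

// Number of 16-bit code units a code point occupies in UTF-16:
// one for BMP characters, two (a surrogate pair) for non-BMP characters.
constexpr std::size_t UTF16UnitsFor(char32_t aCodePoint) {
  return aCodePoint < 0x10000 ? 1 : 2;
}

// Number of bytes a code point occupies in UTF-8: one to four.
constexpr std::size_t UTF8BytesFor(char32_t aCodePoint) {
  return aCodePoint < 0x80    ? 1
       : aCodePoint < 0x800   ? 2
       : aCodePoint < 0x10000 ? 3
                              : 4;
}

// ASCII is the only range where every encoding uses a single unit.
static_assert(UTF16UnitsFor(U'A') == 1 && UTF8BytesFor(U'A') == 1, "ASCII");
static_assert(UTF16UnitsFor(0x1F600) == 2 && UTF8BytesFor(0x1F600) == 4, "non-BMP");
```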
+
+<p>None of our current string implementations know the encoding of the
+data they hold at any given moment.  An <span class="code">nsCString</span> might legitimately
+hold data encoded in ASCII, UTF-8 or even EBCDIC for that matter.
+
+<p>Operations that convert from one encoding to another, or operations
+that are encoding sensitive (e.g., <span class="code">to_upper</span>), rightly belong in
+i18n.  The fact that our current string interfaces automatically and
+implicitly convert between wide and narrow strings is actually the
+source of many errors in two particular categories: (1) unintended
+extra work, (2) mistaken re-encoding, e.g., accidentally `converting'
+a UTF-8 string to UTF-16 by pretending the UTF-8 string is ASCII and then
+padding with <span class="code">'\0'</span>s.
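+
+<p>To see why category (2) is an error rather than a shortcut, here is a small
+standard C++ sketch.  <span class="code">PadBytesToUTF16</span> and <span class="code">IsASCII</span>
+are hypothetical names chosen for illustration; the first imitates the implicit
+pad-with-zeros conversion, the second is the check under which that padding is
+actually correct.

```cpp
#include <cassert>
#include <string>

// The mistake: treat every byte as an ASCII character and zero-extend it
// to 16 bits.  This is only a real conversion for bytes below 0x80.
std::u16string PadBytesToUTF16(const std::string& aBytes) {
  std::u16string result;
  for (unsigned char byte : aBytes) {
    result += static_cast<char16_t>(byte);
  }
  return result;
}

// True when every byte is 7-bit ASCII, i.e., when padding is safe.
bool IsASCII(const std::string& aBytes) {
  for (unsigned char byte : aBytes) {
    if (byte >= 0x80) {
      return false;
    }
  }
  return true;
}
```

+<p>For example, the UTF-8 form of U+00E9 is the two bytes 0xC3 0xA9.  Padding
+them yields the two UTF-16 code units U+00C3 U+00A9 instead of the single
+unit U+00E9, so the text is silently corrupted.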
+
+<p>We've known these were bad for a long time, and have been trying to
+find the right way to fix them.  The current thinking is to just bite
+the bullet and eliminate implicit conversions.  That has interesting
+ramifications.
+
+<div class="source-code">
+<pre>
+void foo( const nsString&amp;  aUTF16string );
+
+foo("hello"); // works!  constructs a temporary |nsString| by
+              // converting the ASCII literal with padding.
+              // Note: this requires an allocation
+</pre>
+</div>
+
+<p>We've always hated this form, though, since it requires a heap
+allocation.  In current code, we recommend
+
+<div class="source-code">
+<pre>
+foo( nsAutoString("hello") );
+</pre>
+</div>
+
+<p>which still copy/converts, but at least it probably doesn't need to do
+a heap allocation.  In the best of all worlds, no conversion, copying,
+or allocation would be necessary.  To do that, you would need to be
+able to directly specify a UTF-16 string, e.g., with the <span class="code">L"hello"</span>
+notation, and wrap that in an interface that just held a pointer. 
+E.g., something like
+
+<div class="source-code">
+<pre>
+void foo( const nsAReadableString&amp;  aUTF16string );
+
+foo( nsLiteralString(L"hello") );
+</pre>
+</div>
+
+<p>There are problems with this example, however.  The <span class="code">L</span> notation
+specifically makes objects that are arrays of <span class="code">wchar_t</span>, which under
+GCC is a 4-byte element.  This leads to incompatibility with JS, and
+the annoyance of possibly bloated storage  (I'm sort of minimizing the
+situation here.  It's worse than I make it sound).  More about tricks
+to get around this in a bit, but first, let me talk about what to do
+in the meantime while we're just getting rid of implicit constructors.
+ Initially to get around this problem (what problem?  The problem that
+<span class="code">foo("hello")</span> stopped compiling on my machine when I threw the
+switch) I made a routine called <span class="code">NS_ConvertToString</span> which looked like
+this
+
+<div class="source-code">
+<pre>
+inline
+nsAutoString
+NS_ConvertToString( const char* anASCIIstring )
+  {
+    nsAutoString aUCS2string;
+    aUCS2string.AssignWithConversion(anASCIIstring);
+    return aUCS2string;
+  }
+</pre>
+</div>
+
+<p>Which lets me write
+
+<div class="source-code">
+<pre>
+foo( NS_ConvertToString("hello") );
+</pre>
+</div>
+
+<p>This was <strong>OK</strong>, but in discussion there were concerns about performance
+on machines that didn't <span class="code">inline</span> well, and issues about naming.  In
+that meeting we came up with an alternate naming strategy that we
+think has room for growth and an implementation more likely to be
+efficient on every platform.  The implementation is to define a new
+class that derives from <span class="code">nsAutoString</span>, but allows construction from a
+<span class="code">char*</span>
+
+<div class="source-code">
+<pre>
+class NS_ConvertASCIItoUTF16 : public nsAutoString
+  {
+    public:
+      NS_ConvertASCIItoUTF16( const char* );
+      // ...
+  };
+</pre>
+</div>
+
+<p>Which gives identical (though renamed) notation for calling <span class="code">foo</span>:
+
+<div class="source-code">
+<pre>
+foo( NS_ConvertASCIItoUTF16("hello") );
+</pre>
+</div>
+
+<p>It looks like a function call to an explicit encoding conversion.  It
+acts like a function call to an explicit encoding conversion.  It <strong>is</strong>
+a function call to an explicit encoding conversion.  We think that
+this naming pattern has room for growth.  In the meeting, we concluded
+that the best representation for encoding conversions is a family of
+functions, and <span class="code">NS_ConvertASCIItoUTF16</span> fits right in.  We think that
+XPCOM probably can't live without the ASCII to UTF-16 conversion (though
+as explicit as possible) but that all others rightly belong in i18n
+land.
+
+<p>You can probably deduce from the clues in <span class="code">NS_ConvertToString</span>, above,
+that constructors weren't the only thing that became explicit. 
+Assignment, appending, comparison, et al, got renamed so that when
+assigning, appending, or comparing to a value in a different encoding
+the `WithConversion' form must be used.  E.g.,
+
+<div class="source-code">
+<pre>
+nsString aUTF16string;
+nsCString anASCIIstring;
+// ...
+
+aUTF16string += anASCIIstring;  // Currently legal, but not for long
+aUTF16string.Append(anASCIIstring); // same
+
+aUTF16string.AppendWithConversion(anASCIIstring); // the new way
+
+if ( aUTF16string == anASCIIstring ) // Sorry, this is going away too
+  // ...
+
+if ( aUTF16string.EqualsWithConversion(anASCIIstring) )
+  // ...
+</pre>
+</div>
+
+<p>Yes, it's long and annoying.  Just like the extra work you were
+implicitly asking to have done, perhaps incorrectly.  There are other
+reasons to rename these functions.  When <span class="code">nsString</span> and <span class="code">nsCString</span>
+defined a ton of, e.g., <span class="code">Append</span>s each, there was no problem, because
+nobody wanted to override <span class="code">Append</span>.  Now, with strings inheriting from
+abstract base classes we immediately run into the problem that
+overriding and overloading don't mix very well in C++.  Because of a
+feature of C++ called name hiding, it is problematic to override only
+a single signature of a name overloaded in a base class.  The base
+<span class="code">nsAWritableString</span> provides several <span class="code">Append</span>s, all for objects of
+(hopefully) the same encoding.  <span class="code">nsString</span> can't easily add a bunch of
+new <span class="code">Append</span>s (the converting ones) without running face first into
+the name hiding problem.  The discussion of the fix for this is mostly
+unrelated to encoding issues, so I'll defer it to another post.
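+
+<p>For readers who have not met name hiding, the following self-contained
+sketch shows the effect.  The three structs are hypothetical miniatures
+built on <span class="code">std::u16string</span>, not the real string classes, and the
+<span class="code">using</span> declaration is simply the standard C++ remedy, not
+necessarily the fix that was eventually chosen here.
+

```cpp
#include <cassert>
#include <string>

// Hypothetical miniature of the hierarchy; not the real XPCOM classes.
struct WritableString
  {
    std::u16string mData;

      // two same-encoding overloads provided by the base class
    void Append( char16_t aChar )               { mData.push_back(aChar); }
    void Append( const std::u16string& aOther ) { mData.append(aOther); }
  };

struct HidingString : WritableString
  {
      // declaring any |Append| here hides *both* base-class overloads
    void Append( const char* aASCII )
      {
        assert(aASCII != 0);
        for ( ; *aASCII; ++aASCII )
          mData.push_back(static_cast<char16_t>(static_cast<unsigned char>(*aASCII)));
      }
  };

struct FixedString : WritableString
  {
    using WritableString::Append;  // re-expose the base-class overloads

    void Append( const char* aASCII )
      {
        assert(aASCII != 0);
        for ( ; *aASCII; ++aASCII )
          mData.push_back(static_cast<char16_t>(static_cast<unsigned char>(*aASCII)));
      }
  };
```

+
+<p>Given a <span class="code">HidingString h</span>, the call <span class="code">h.Append(u'x')</span> fails
+to compile, because the only visible <span class="code">Append</span> takes a
+<span class="code">const char*</span>.  The same call on a <span class="code">FixedString</span> works.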
+
+<p>In hindsight, after the meeting, it seemed clear that all the
+`WithConversion' forms would be better named
+
+<div class="source-code">
+<pre>
+xxxConvertingASCIItoUTF16
+xxxConvertingUTF16toASCII
+</pre>
+</div>
+
+<p>however, the <strong>real</strong> goal (probably) is to move most such conversions
+into i18n.  Just bringing attention to the previously implicit
+conversions is a good first step.  Renaming these conversions as just
+suggested is probably the right thing to do, though it sort of
+validates them, which I'm not sure we really want.  This is a decision
+we need to discuss further.
+
+<p>Now, back to the string literal problem above.  One possible solution
+is to use a macro.  Imagine
+
+<div class="source-code">
+<pre>
+NS_LITERAL_STRING("Hello")
+</pre>
+</div>
+
+<p>which on a machine where the <span class="code">L</span> trick works, turns into
+
+<div class="source-code">
+<pre>
+nsLiteralString(L"Hello")
+</pre>
+</div>
+
+<p>but on a machine where there is trouble, turns into something less
+appealing, but more likely to work, like
+
+<div class="source-code">
+<pre>
+NS_ConvertASCIItoUTF16("Hello")
+</pre>
+</div>
+
+<p>Another solution is to add a compilation step that fixes <span class="code">L</span> strings
+on bad platforms to be non-<span class="code">L</span> strings, but padded with <span class="code">\0</span>s.  E.g.,
+<span class="code">L"Hello"</span> gets preprocessed into <span class="code">"\000H\000e\000l\000l\000o\000"</span>. 
+This solution is more annoying to the developer, whereas the prior
+solution is more annoying at run time.
+
+<p>Before we go to too much trouble on this specific feature, we will
+probably want to do more measurement to see just how much and how
+often we are converting constant literal strings, and why.
+
+
+<p>I'm currently ripping through the tree fixing things to use the
+`WithConversion' forms where appropriate.  I was also converting
+things to use <span class="code">NS_ConvertToString</span> where appropriate; unless I get
+talked out of it, I want to switch midstream to
+<span class="code">NS_ConvertASCIItoUTF16</span>, then go back and fix up the
+<span class="code">NS_ConvertToString</span> instances later.  I've set things up so I can
+check in as I go.  After all these conversions have been done, I'll be
+able to throw the switch (what switch?  NEW_STRING_APIS) which will
+make <span class="code">nsString</span> inherit from <span class="code">nsAWritableString</span>, etc. and allow us to
+start exploiting these other opportunities (e.g., for literal strings,
+shared strings, etc.  See
+<a class="exact-uri" href="http://bugzilla.mozilla.org/show_bug.cgi?id=28221">http://bugzilla.mozilla.org/show_bug.cgi?id=28221</a> for details and
+reasoning.)
+
+<p>I guess I'm expecting comments on:
+
+<ul>
+  <li>how really annoying this whole topic is
+  <li>how bad <span class="code">L"xxx"</span> is
+  <li>whether to move forward with <span class="code">NS_ConvertASCIItoUTF16</span>
+  <li>whether we should move to xxxConvertingASCIItoUTF16 etc. instead
+      of `WithConversion'
+  <li>arguments about where encoding conversions should live
+  <li>arguments about whether going between 1 and 2 byte storage is an
+      encoding conversion
+  <li>questions about stuff I didn't mention or didn't explain well
+  <li>pointing out stuff I'm just plain wrong about, or things I forgot
+  <li>etc
+</ul>
+
+<p>So as not to jumble the discussion, I'll be separately posting other
+requests for comments about specific features of the design of the new
+string hierarchy.
+
+<p>I hope this helps keep everybody filled in on what we're thinking and
+able to point out what we're forgetting or screwing up :-)
+
+
+
+
+
+<hr>
+<pre>
+Date: Wed, 19 Apr 2000 21:12:47 -0400
+Subject: more string info
+</pre>
+
+<p>  <a class="exact-uri" href="news://news.mozilla.org/scc-705460.16423913042000@news.mozilla.org">news://news.mozilla.org/scc-705460.16423913042000@news.mozilla.org</a>
+
+
+
+
+
+<hr>
+<pre>
+Date: Fri, 26 May 2000 15:31:37 -0400
+Subject: Re: Question on ==
+</pre>
+
+<p>I would prefer you compare with <span class="code">Equals</span> (which should really be named
+<span class="code">IsEqualTo</span>) rather than <span class="code">operator==()</span> because of this:
+
+<div class="source-code">
+<pre>
+char* a;
+char* b;
+
+// ...
+
+if ( a == b )
+  // ...
+</pre>
+</div>
+
+<p>Comparing two raw `string' pointers doesn't compare the characters
+they point to, but instead compares the bits of the pointers.  For
+this reason, I may eventually make comparison of a string with a
+pointer using operators just go away.
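+
+<p>A small self-contained illustration of the difference, using only the
+standard library (the two helper names are invented for this sketch):
+

```cpp
#include <cassert>
#include <cstring>

// Compares addresses only: true just when both pointers refer to the
// very same buffer.
inline bool SamePointer( const char* a, const char* b )
  {
    return a == b;
  }

// Compares the zero-terminated character sequences the pointers refer to.
inline bool SameCharacters( const char* a, const char* b )
  {
    assert(a != 0 && b != 0);
    return std::strcmp(a, b) == 0;
  }
```

+
+<p>Two distinct arrays that both hold <span class="code">"hello"</span> satisfy
+<span class="code">SameCharacters</span> but not <span class="code">SamePointer</span>, which is exactly
+the surprise that <span class="code">operator==()</span> on raw pointers invites.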
+
+
+
+
+
+<hr>
+<pre>
+Date: Wed, 14 Jun 2000 14:38:55 -0400
+Subject: Re: Fix to XprtDefs.h
+</pre>
+
+<p>Yes, we're aware that turning off <span class="code">wchar_t</span> support makes <span class="code">wchar_t</span> be
+a synonym for <span class="code">unsigned short</span> under Metrowerks.  We know that the
+current version of VC++ also makes these types equivalent.  In theory,
+though, the types are distinct even when they are the same size and
+shape.  By using real <span class="code">wchar_t</span> support, we are forced to recognize
+the distinction and navigate it appropriately with <span class="code">reinterpret_cast</span>
+(via <span class="code">NS_REINTERPRET_CAST</span>).  The win here is that we aren't caught by
+compiler changes that suddenly make some set of compilers compliant
+and therefore break our code.  We will add an autoconf test that lets
+UNIX compilers opt in to our string scheme when they have an
+appropriately shaped <span class="code">wchar_t</span>.  If these happen to be compliant
+compilers, all will be well.  If they aren't, the casts don't hurt,
+because they are type correct.  We are writing our code to meet the
+standard as we move forward.
+
+<p>The win for us is realized by the following macros
+
+<div class="source-code">
+<pre>
+#ifdef HAVE_CPP_2BYTE_WCHAR_T
+  #define NS_LITERAL_STRING(s)  nsLiteralString(L##s, \
+                      (sizeof(L##s)/sizeof(wchar_t))-1)
+#else
+  #define NS_LITERAL_STRING(s)  NS_ConvertASCIItoUTF16(s, \
+                       sizeof(s)-1)
+#endif
+</pre>
+</div>
+
+<p>An <span class="code">nsLiteralString</span> points directly to the literal characters.  No
+copying, no conversion, and the length calculation happens at compile
+time.  This has turned out to be as large a savings as 15% of code
+space and 8% of data space, net, in our string test harness.  It's
+faster as well, again by eliminating the copying, conversion, and
+length calculation.  We don't know yet what those numbers translate
+into in our real code base, but we have high hopes.
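+
+<p>The compile-time length trick can be tried in isolation.  The sketch below
+is an assumption-laden stand-in, not the real macro:
+<span class="code">LiteralString</span> and <span class="code">MY_LITERAL_STRING</span> are hypothetical
+names, and it uses the C++11 <span class="code">u"..."</span> prefix with
+<span class="code">char16_t</span> in place of <span class="code">L"..."</span> and <span class="code">wchar_t</span>.
+

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for nsLiteralString: a non-owning pointer plus a
// length; nothing is allocated, copied, or converted.
struct LiteralString
  {
    const char16_t* mData;
    std::size_t     mLength;

    LiteralString( const char16_t* aData, std::size_t aLength )
      : mData(aData), mLength(aLength)
      {
        assert(aData != 0);
      }
  };

// Pastes the |u| prefix onto the quoted literal and derives the length
// from the array type, so no strlen-style scan is needed at run time.
#define MY_LITERAL_STRING(s) \
  LiteralString(u##s, (sizeof(u##s) / sizeof(char16_t)) - 1)

// The length really is a constant expression:
static_assert((sizeof(u"Hello") / sizeof(char16_t)) - 1 == 5,
              "literal length is known at compile time");
```

+
+<p>As with the real macro, the argument has to be a quoted literal; anything
+else either fails to paste or measures the wrong thing.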
+
+<p>I don't want to be in the position to ask you to change your code.  I
+don't think it's appropriate for me to do so.  The AIM application
+that is your client is our client as well.  They need to resolve this
+difference between us in whatever way they think best.  That may mean
+asking you if changing your apis is the right thing to do.  Or it may
+mean applying the casts.  Our code-base and yours, Justin, are more
+like cousins.  I don't think you should have to change just to conform
+to us.  You may think my arguments for using real <span class="code">wchar_t</span> have
+merit, and adopt similar usage just because you agree; but I think the
+only obligation you have is to follow the technical solution you think
+is right for your code.
+
+<p>If you decide to make this api change, it will mean shipping a new
+binary (on Mac) for your library to clients who want to switch over to
+the new api (since the name mangling will be different, and therefore,
+the link requirements will change).
+
+<p>Hope this helps,
+
+
+
+
+
+<hr>
+<pre>
+Date: Thu, 15 Jun 2000 19:36:55 -0400
+Subject: Re: Checkin approval for bug 32336
+</pre>
+
+<div class="source-code">
+<pre>
+S.Equals(NS_LITERAL_STRING("bar"), PR_TRUE, 3)
+</pre>
+</div>
+
+<p>doesn't compile because there is no three-parameter form for <span class="code">Equals</span>.
+ For all definitions of <span class="code">Equals</span> on strings, see "nsAReadableString.h":
+
+<p><a class="exact-uri" href="http://lxr.mozilla.org/seamonkey/source/xpcom/ds/nsAReadableString.h">http://lxr.mozilla.org/seamonkey/source/xpcom/ds/nsAReadableString.h</a>
+
+<p>There is an <span class="code">EqualsWithConversion</span> that takes three parameters.
+
+<p>  <a class="exact-uri" href="http://lxr.mozilla.org/seamonkey/source/xpcom/ds/nsString2.h#731">http://lxr.mozilla.org/seamonkey/source/xpcom/ds/nsString2.h#731</a>
+
+<p>It is ``EqualsWithConversion'' because it admits the possibility of an
+encoding specific transformation, in this case to provide
+case-insensitive comparison.  This also wouldn't compile, however,
+since, at the moment, an <span class="code">nsLiteralString</span> doesn't provide an operator
+to produce a <span class="code">const PRUnichar*</span> (though perhaps it should), and it
+doesn't satisfy the other interfaces that match this call, e.g., a
+<span class="code">const nsString&</span>.
+
+<p>Perhaps I need to move case-insensitive comparison up out of
+<span class="code">nsString</span> into a global encoding specific transformations and
+algorithms file (which was on its way anyway, as Waterson knows); this
+use is one bit of evidence to support this.  In the short term, this
+can be fixed (if we think the current behavior is wrong) by providing
+<span class="code">operator const CharT*() const</span> on literal string.
+
+<p>If you can live without case-folding, the earlier form is preferred
+
+<div class="source-code">
+<pre>
+S == NS_LITERAL_STRING("bar")
+</pre>
+</div>
+
+<p>if you can't, then one of the fixes I mentioned is in order.
+
+
+
+
+
+<hr>
+<pre>
+Date: Thu, 15 Jun 2000 19:47:12 -0400
+Subject: Re: [Fwd: how to use nsString ?]
+</pre>
+
+<pre class="email-quote">
+  >I see these same examples time and again in the embedding
+  >samples/docs, but I can't compile them.
+</pre>
+
+<p>Apologies.  Documentation mentioning strings is getting out of date. 
+Here are some specific answers.
+
+
+<pre class="email-quote">
+  >nsString URLString("http://www.mozilla.org");
+</pre>
+
+<p>...is now perhaps best expressed as
+
+<div class="source-code">
+<pre>
+nsString URLString( NS_LITERAL_STRING("http://www.mozilla.org") );
+</pre>
+</div>
+
+<p>since an <span class="code">nsString</span> is a sequence of 2-byte wide characters, and the
+routines that implicitly convert 1-byte sequences (like the literal
+sequence you specified, "http:...") are now gone.
+
+<p>Up until not too long ago, one would have had to say
+
+<div class="source-code">
+<pre>
+nsString URLString;
+URLString.AssignWithConversion("http://www.mozilla.org");
+</pre>
+</div>
+
+<p>The <span class="code">NS_LITERAL_STRING</span> construction is new machinery that has the
+potential to make many operations much more efficient.
+
+<pre class="email-quote">
+  >nsString URLString;
+  >URLString.SetString("www.mozilla.org");
+</pre>
+
+<p><span class="code">SetString</span> was a synonym for <span class="code">Assign</span> or assignment with
+<span class="code">operator=()</span>; it too went away.  The equivalent is the second
+example I gave above, that is, the one with <span class="code">AssignWithConversion</span>. 
+
+<p><span class="code">Assign</span> still exists.  <span class="code">AssignWithConversion</span> takes on that
+functionality for assignments that require encoding transformations
+(e.g., from ASCII to UTF16).  <span class="code">SetString</span> is gone, since it was always
+a synonym for <span class="code">Assign</span>. 
+
+<p>Learn more about the general APIs for strings that we are trying to
+move to by examining
+
+<a class="exact-uri" href="http://lxr.mozilla.org/seamonkey/source/xpcom/ds/nsAReadableString.h">http://lxr.mozilla.org/seamonkey/source/xpcom/ds/nsAReadableString.h</a>
+<a class="exact-uri" href="http://lxr.mozilla.org/seamonkey/source/xpcom/ds/nsAWritableString.h">http://lxr.mozilla.org/seamonkey/source/xpcom/ds/nsAWritableString.h</a>
+
+<p>Hope this helps,
+
+
+
+
+
+<hr>
+<pre>
+Date: Thu, 15 Jun 2000 21:26:51 -0400
+Subject: Re: Checkin approval for bug 32336
+</pre>
+
+<pre class="email-quote">
+  >I *need* the count attribute, because I need to compare only the first 
+  >chars (that's inherent to the logic).
+</pre>
+
+<p>This is what substrings are for.  In that case, you could use
+
+<div class="source-code">
+<pre>
+Substring(S, 0, 3) == NS_LITERAL_STRING("bar")
+</pre>
+</div>
+
+<p>As for case-folding, it's best if you can case-fold everything up
+front, instead of doing it repeatedly.  I'll have to get back to you
+on a general solution to that problem, or what my schedule for getting
+it checked in would be.  I'm sorry, I know that's not what you needed
+to hear.  If the source string is an <span class="code">nsString</span>, you can continue to
+exploit its implementation of these routines, e.g., <span class="code">ToLower</span> all
+up-front.
+
+<p>Hope this helps,
+
+
+
+
+
+<hr>
+<pre>
+Date: Mon, 19 Jun 2000 14:23:47 -0400
+Subject: Re: string fu
+</pre>
+
+<pre class="email-quote">
+  >It seems less convenient to have to first check path.IsEmpty, and
+  >then if false get path.Last and test it.
+</pre>
+
+<p>What would you prefer?  That extracting a character not in the string
+always return <span class="code">CharT(0)</span>?  Can't do it for two reasons: (1) <span class="code">0</span> may be
+a valid character in a particular encoding, so it can't be used in
+general as a ``no character at that position'' marker; and (2) I can't
+control what an individual string implementation does when asked to
+get an out-of-bounds fragment, it's explicitly undefined.  That means
+the result of <span class="code">CharAt</span> is explicitly undefined for indexes outside the
+defined contents of the string.  As a debugging convenience, I have
+made this assert, but it has always been the case that retrieving such
+a character had undefined results ... even in [the old] code.
+
+<p>OK, you might say, well at least let me ask for a character that is
+only off the end by one.  E.g., <span class="code">Last</span> of an empty string.  Reason (1)
+from above still applies.  How bad is it to say, for the case you gave
+
+<div class="source-code">
+<pre>
+PRBool needsDelim = PR_FALSE;
+if ( !path.IsEmpty() )
+  {
+    PRUnichar last = path.Last();
+    needsDelim = !(last == '/' || last == '\\');
+  }
+</pre>
+</div>
+
+<p>In general, you probably want to opt out of a whole lot of work when
+the source string is empty.  It is slightly less convenient, but it
+doesn't tie us to a bunch of implementation specific mojo.
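+
+<p>The same check can be exercised outside the tree.  This is a sketch under
+stated assumptions: it uses <span class="code">std::string</span> in place of our string
+classes, and <span class="code">Last</span> and <span class="code">NeedsDelimiter</span> are hypothetical
+free functions written for the example.
+

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for Last(): it asserts, as the new string code
// does, instead of inventing a value for an empty string.
inline char Last( const std::string& aString )
  {
    assert(!aString.empty());
    return aString[aString.length() - 1];
  }

inline bool NeedsDelimiter( const std::string& aPath )
  {
    if ( aPath.empty() )
      return false;  // nothing to delimit; never touch Last()

    const char last = Last(aPath);
    return !(last == '/' || last == '\\');
  }
```

+
+<p>The empty case is decided before any character is extracted, so no
+`no character here' marker is ever needed.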
+
+
+<pre class="email-quote">
+  >Can we fix GetUnicode in this case?
+</pre>
+
+<p>This is an annoying property of auto strings, e.g., that they always
+have an allocated buffer.  I'm happy to fix this bug, however, be
+aware that <span class="code">GetUnicode</span> and <span class="code">GetBuffer</span> are artifacts of [the old]
+implementation that we don't want to support.  They are not part of
+the abstract interface.  We will keep them no longer than we have to. 
+They don't support our multi-fragment paradigm.  People who require a
+contiguous hunk of characters in the future, and are unwilling to
+switch over to chunky-iterators, may be forced to copy the string to
+their own buffer.  There will be an implementation of narrow character
+string that guarantees contiguous allocation and a zero-terminator,
+much as <span class="code">nsCString</span> does now, for compatibility with platform uses,
+but this won't be the default string class.
+
+
+
+
+
+<hr>
+<pre>
+Date: Mon, 19 Jun 2000 17:22:31 -0400
+</pre>
+
+<p>Clarifying String Semantics
+
+<p>Recently, I added an assert to the string operations that extract
+characters, namely <span class="code">First()</span>, <span class="code">Last()</span>, <span class="code">CharAt()</span>, and
+<span class="code">operator[]()</span>.  This assert fires when any of these routines are used
+to access a character outside the defined contents of the string.  For
+<span class="code">First()</span> and <span class="code">Last()</span> that means whenever they are applied to an
+empty string.  For <span class="code">CharAt()</span> and <span class="code">operator[]()</span>, that means whenever
+they are used to access an index outside the range of
+<span class="code">0</span>..<span class="code">Length()-1</span>.  There have been some complaints; however, the
+result of such an access was always undefined.  What follows is extracted from an email
+exchange between me and Warren on this topic.  I hope it clarifies
+string semantics.
+
+<p>Warren writes:
+<pre class="email-quote">
+  >I hit your funky CharAt assertion tonight in this piece of code:
+
+  >NS_IMETHODIMP
+  >nsIOService::ResolveRelativePath(
+  >    const char *relativePath,
+  >    const char* basePath,
+  >    char **result )
+  >  {
+  >    nsCAutoString name;
+  >    nsCAutoString path(basePath);
+  >    
+  >    PRUnichar last = path.Last();
+  >    PRBool needsDelim = !(last == '/' || last == '\\' || last ==
+  >    '\0');
+  >    ...
+
+  >where basePath is null. It seems less convenient to have to first
+  >check path.IsEmpty, and then if false get path.Last and test it.
+</pre>
+
+<p>I replied:
+<pre class="email-quote">
+  >What would you prefer?  That extracting a character not in the
+  >string always return <span class="code">CharT(0)</span>?  Can't do it for two reasons:
+  >(1) <span class="code">0</span> may be a valid character in a particular encoding, so it
+  >can't be used in general as a ``no character at that position''
+  >marker; and (2) I can't control what an individual string
+  >implementation does when asked to get an out-of-bounds fragment,
+  >it's explicitly undefined.  That means the result of <span class="code">CharAt</span> is
+  >explicitly undefined for indexes outside the defined contents of
+  >the string.  As a debugging convenience, I have made this assert,
+  >but it has always been the case that retrieving such a character
+  >had undefined results ... even in [the old] code.
+
+  >OK, you might say, well at least let me ask for a character that
+  >is only off the end by one.  E.g., <span class="code">Last</span> of an empty string.
+  >Reason (1) from above still applies.  How bad is it to say, for the
+  >case you gave
+
+  >  PRBool needsDelim = PR_FALSE;
+  >  if ( !path.IsEmpty() )
+  >    {
+  >      PRUnichar last = path.Last();
+  >      needsDelim = !(last == '/' || last == '\\');
+  >    }
+
+  >In general, you probably want to opt out of a whole lot of work
+  >when the source string is empty.  It is slightly less convenient,
+  >but it doesn't tie us to a bunch of implementation specific mojo.
+</pre>
+
+<p>Warren also asks:
+<pre class="email-quote">
+  >Here's another issue, perhaps more serious. If I say this:
+
+  >  foo(const PRUnichar* s) {
+  >    nsAutoString str(s);
+  >    bar(str.get());
+  >  }
+
+  >where s is null, bar will get passed a zero-length PRUnichar
+  >sequence instead of null. This makes it so that you can't just
+  >test for the argument == null. You have to nsCRT::strlen(arg) == 0
+  >which is much less efficient. Can we fix GetUnicode in this case?
+</pre>
+
+<p>And I reply:
+<pre class="email-quote">
+  >This is an annoying property of auto strings, e.g., that they
+  >always have an allocated buffer.  I'm happy to fix this bug,
+  >however, be aware that <span class="code">GetUnicode</span> and <span class="code">GetBuffer</span> are artifacts
+  >of [the old] implementation that we don't want to support.  They
+  >are not part of the abstract interface.  We will keep them no
+  >longer than we have to.  They don't support our multi-fragment
+  >paradigm.  People who require a contiguous hunk of characters in
+  >the future, and are unwilling to switch over to chunky-iterators,
+  >may be forced to copy the string to their own buffer.  There will
+  >be an implementation of narrow character string that guarantees
+  >contiguous allocation and a zero-terminator, much as <span class="code">nsCString</span>
+  >does now, for compatibility with platform uses, but this won't be
+  >the default string class.
+</pre>
+
+<p>In a later message, Chris Waterson asks a related question
+<pre class="email-quote">
+  >scc: should we add <span class="code">operator PRUnichar*()</span> to
+  >NS_ConvertASCIItoUTF16?
+</pre>
+
+<p>And I reply:
+<pre class="email-quote">
+  >It seems reasonable.  A lot more reasonable than forcing people to
+  >call <span class="code">GetUnicode()</span>.  I alluded to platform specific classes in an
+  >earlier message to warren that you were cc'd on, Chris.  I imagine
+  >that the <span class="code">...Convert...</span> routines would be required to produce
+  >contiguous allocation 0-terminated strings (though the as yet
+  >unimplemented <span class="code">...Copy...</span> forms, of course, wouldn't).  So <span class="code">operator
+  >const PRUnichar*() const</span> makes perfect sense to me here.
+</pre>
+
+<p>Hope this makes sense,
+
+
+
+
+<hr>
+<pre>
+Date: Tue, 20 Jun 2000 04:05:31 -0400
+Subject: Re: NS_LITERAL_STRING is broken
+</pre>
+
+<p>The behavior you describe sounds exactly like when you say
+
+<div class="source-code">
+<pre>
+const char* foobar = "foobar";
+
+... NS_LITERAL_STRING(foobar).get() ...
+</pre>
+</div>
+
+<p>because in this case, the thing passed in is a <span class="code">const char*</span>. 
+<span class="code">NS_LITERAL_STRING</span> is not meant to be used in this way.  It is only
+meant to be used around a <span class="code">"</span> delimited string.  The type of such is
+<span class="code">const char[N]</span> where N is the number of characters in the string + 1
+for the zero terminator it helpfully adds.  <span class="code">sizeof</span> such a type is
+<span class="code">N</span>.
+
+<p>Are you sure you had the actual string as an argument, as in your
+example to me?  Or could the actual code have been like my sample,
+above?
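+
+<p>The difference is easy to reproduce on its own.  In this sketch (both
+function names are hypothetical, and the template is a modern alternative
+to the macro, not what <span class="code">NS_LITERAL_STRING</span> does), the first
+function accepts only real arrays, while the second shows what a
+<span class="code">sizeof</span>-based length turns into once the array has decayed to a
+pointer.
+

```cpp
#include <cassert>
#include <cstddef>

// Deduces N from the array type, so it accepts only real arrays such as
// quoted literals; passing a |const char*| fails to compile.
template <std::size_t N>
inline std::size_t LiteralLength( const char (&)[N] )
  {
    return N - 1;  // N counts the zero terminator
  }

// What a sizeof-based macro effectively computes when handed a pointer.
inline std::size_t LengthFromPointerSize( const char* aPointer )
  {
    assert(aPointer != 0);
    return sizeof(aPointer) - 1;  // size of the pointer itself, not the text
  }
```

+
+<p><span class="code">LiteralLength("foobar")</span> yields 6, whereas
+<span class="code">LengthFromPointerSize</span> yields one less than the size of a pointer
+no matter what the pointer refers to.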
+
+
+
+
+
+<hr>
+<pre>
+Date: Thu, 29 Jun 2000 13:35:10 -0400
+Subject: Re: a fix
+</pre>
+
+<pre class="email-quote">
+  > +       if (Length() == 0) { return nsnull; }
+</pre>
+
+
+<p>Dave,
+
+<p>please read
+
+  <a class="exact-uri" href="news://news.mozilla.org/scc-314ABF.14261619062000@news.mozilla.org">news://news.mozilla.org/scc-314ABF.14261619062000@news.mozilla.org</a>
+
+<p>It's just plain wrong to let people try to index into a string outside
+its defined contents.  I can't just return <span class="code">'\0'</span> or <span class="code">PRUnichar('\0')</span>
+there as that <strong>could</strong> be a legal value to have somewhere in your
+string for some encodings ... and the encoding is not specified.  So
+your patch has the basic problem of defeating my plan to stop people
+from doing this bad thing.
+
+<p>The second problem with your patch is that you use the symbolic
+constant <span class="code">nsnull</span>, which is ostensibly a pointer value; <span class="code">Last</span> returns
+a character.  <span class="code">nsnull</span> is not appropriate for that purpose.  In fact,
+C++ gurus pretty much eschew the use of symbolic constants for <span class="code">0</span>. 
+<span class="code">NULL</span> is to be avoided.  <span class="code">nsnull</span> is wrong-headed in that it presumes
+we could have some <strong>other</strong> application specific value for <span class="code">NULL</span>.  We
+can't, it would never work.  It's just wasted brain-print.  Always use
+<span class="code">0</span> for these situations, and if you want to communicate the fact that
+something is a pointer type, either use a comment or a
+(construction-style) cast, like so (graded examples from worst to
+best):
+
+<ul>
+  <li>F: FindChildByNameWithHint("Chuck", nsnull);
+
+  <li>D: FindChildByNameWithHint("Chuck", NULL);
+
+  <li>C: FindChildByNameWithHint("Chuck", /* Child* */ 0);
+
+  <li>B: typedef Child* Child_ptr;
+     FindChildByNameWithHint("Chuck", Child_ptr(0));
+
+  <li>A: FindChildByNameWithHint("Chuck", 0);
+</ul>
+
+<p>Don't let this discourage you; keep up the good work :-)
+
+
+
+
+
+<hr>
+<pre>
+Date: Tue, 8 Aug 2000 23:47:16 -0400
+Subject: Re: nsWritingIterator?
+</pre>
+
+<pre class="email-quote">
+  >Can you give me any pointers to examples, or docs, or just some
+  >general advice?
+</pre>
+
+  <a class="exact-uri" href="http://ScottCollins.net/Journal/discussion/string_iterators.html">http://ScottCollins.net/Journal/discussion/string_iterators.html</a>
+
+<p>does this help?
+
+<p>I can personally walk you through any specific scenario you need.
+
+
+
+
+
+<hr>
+<pre>
+Date: Wed, 9 Aug 2000 02:35:03 -0400
+Subject: Re: nsWritingIterator?
+</pre>
+
+<p>You got it right... it's <span class="code">nsWritingIterator&lt;CharT&gt;</span> for whichever
+character type you care about, either <span class="code">char</span> or <span class="code">PRUnichar</span>.  You
+<strong>can</strong> use this iterator like a character pointer ... that is, you can
+dereference it, assign into its dereference, etc.  It is more
+efficient, though, to directly address a particular range of
+characters around where it points by asking it for its actual
+character pointer with <span class="code">get</span>, and knowing that there are
+<span class="code">size_forward()</span> characters available ahead of that pointer and
+<span class="code">size_backward()</span> characters available behind it.  After examining
+those characters by hand, you can advance the iterator beyond the
+characters you have examined (and possibly into the next chunk, should
+one exist) by adding into it (with +=) the count of the characters you
+have processed.
+
+<p>Here are three examples of running through a string and modifying some
+of the characters in it.  All use <span class="code">nsWritingIterator</span>s.
+
+
+<div class="source-code">
+<pre>
+  // inefficient, but works in a pinch:
+  //  iterators can hide all details of chunks by acting like
+  //  a raw character pointer
+
+nsWritingIterator&lt;PRUnichar&gt; s = S.BeginWriting();
+nsWritingIterator&lt;PRUnichar&gt; done_with_string = S.EndWriting();
+
+  // for each character in the string |S|
+while ( s != done_with_string )
+  {
+      // if the character is lower case, capitalize it
+    if ( 'a' &lt;= *s &amp;&amp; *s &lt;= 'z' )
+      *s = *s - 'a' + 'A';
+
+      // step to the next character; without this the loop never terminates
+    ++s;
+  }
+
+
+
+
+  // efficient
+  //  iterators provide a mechanism by which you can process
+  //  a chunk-at-a-time
+
+nsWritingIterator&lt;PRUnichar&gt; iter = S.BeginWriting();
+nsWritingIterator&lt;PRUnichar&gt; done_with_string = S.EndWriting();
+
+  // for each chunk of the string
+while ( iter != done_with_string )
+  {
+    size_t N = iter.size_forward();  // # of chars in this chunk
+    PRUnichar* s = iter.get();
+    PRUnichar* done_with_chunk = s + N;
+
+      // for each character in this chunk
+    for ( ; s &lt; done_with_chunk; ++s )
+      {
+         // if the character is lower case, capitalize it
+       if ( 'a' &lt;= *s &amp;&amp; *s &lt;= 'z' )
+          *s = *s - 'a' + 'A';
+      } 
+
+      // advance the iterator past characters
+      //  we examined (and into the next chunk, if any)
+    iter += N;
+  }
+
+
+
+  // elegant
+  //  pull your transformation into a `sink', and |copy_string|
+  //  will efficiently pump any kind of string into it
+
+struct Capitalize
+  {
+      // inline
+    PRUint32
+    write( PRUnichar* s, PRUint32 N )
+        // processes one chunk, called repeatedly by |copy_string|
+      {
+        PRUnichar* done_with_chunk = s + N;
+
+         // for each character in this chunk
+        for ( ; s &lt; done_with_chunk; ++s )
+          {
+              // if the character is lower case, capitalize it
+            if ( 'a' &lt;= *s &amp;&amp; *s &lt;= 'z' )
+              *s = *s - 'a' + 'A';
+          }
+
+          // report how many characters of this chunk were consumed
+        return N;
+      }
+  };
+
+copy_string(S.BeginWriting(), S.EndWriting(), Capitalize());
+</pre>
+</div>
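+
+<p>For experimenting with the sink idea outside the tree, here is a
+self-contained approximation.  Everything in it is an assumption made for
+the sketch: <span class="code">ChunkedString</span> models a multi-fragment string as a
+vector of <span class="code">std::u16string</span> chunks, and <span class="code">pump_chunks</span> is a
+hypothetical stand-in for <span class="code">copy_string</span>, not its real signature.
+

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical multi-fragment string: each element is one contiguous chunk.
typedef std::vector<std::u16string> ChunkedString;

struct Capitalize
  {
      // processes one chunk in place; returns the number of characters consumed
    std::size_t
    write( char16_t* s, std::size_t N )
      {
        char16_t* done_with_chunk = s + N;

          // for each character in this chunk
        for ( ; s < done_with_chunk; ++s )
          if ( u'a' <= *s && *s <= u'z' )
            *s = static_cast<char16_t>(*s - u'a' + u'A');

        return N;
      }
  };

// Stand-in for |copy_string|: hands each chunk to the sink in turn.
template <class SinkT>
inline std::size_t
pump_chunks( ChunkedString& aString, SinkT& aSink )
  {
    std::size_t total = 0;
    for ( std::size_t i = 0; i < aString.size(); ++i )
      {
        std::u16string& chunk = aString[i];
        if ( chunk.empty() )
          continue;  // nothing to hand over

        const std::size_t consumed = aSink.write(&chunk[0], chunk.length());
        assert(consumed == chunk.length());
        total += consumed;
      }
    return total;
  }
```

+
+<p>The sink never learns where chunk boundaries fall; it only ever sees one
+contiguous run of characters at a time, which is the point of the design.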
+
+
+
+<p>Does this show it better?
+
+
+
+
+
+<hr>
+<pre>
+Date: Thu, 17 Aug 2000 18:23:22 -0400
+</pre>
+
+<pre class="email-quote">
+  >I tried looking at the string header files but they
+  >are awfully complicated.
+</pre>
+
+<p>I'll explain things in a little <strong>more</strong> detail than you need, so
+that some of the stuff you see in these headers will make more sense. 
+I'll also answer your questions out of order.
+
+<p>First: the string hierarchy looks like this
+
+<a class="exact-uri" href="http://ScottCollins.net/Journal/discussion/string_hierarchy.gif">http://ScottCollins.net/Journal/discussion/string_hierarchy.gif</a>
+
+<p>The two most important headers are:
+
+<a class="exact-uri" href="http://lxr.mozilla.org/seamonkey/source/xpcom/ds/nsAReadableString.h">http://lxr.mozilla.org/seamonkey/source/xpcom/ds/nsAReadableString.h</a>
+<a class="exact-uri" href="http://lxr.mozilla.org/seamonkey/source/xpcom/ds/nsAWritableString.h">http://lxr.mozilla.org/seamonkey/source/xpcom/ds/nsAWritableString.h</a>
+
+<p>These abstract classes, <span class="code">nsAReadable[C]String</span>, and
+<span class="code">nsAWritable[C]String</span> are typically what you will want to use in the
+interfaces of new code.  If you write a piece of code that takes a
+string for input, consider, e.g.,
+
+<div class="source-code">
+<pre>
+void consumes_a_string( const nsAReadableString&amp;  aInput );
+</pre>
+</div>
+
+<p>If you write a piece of code that modifies a string, consider
+
+<div class="source-code">
+<pre>
+void modifies_a_string( nsAWritableString&amp;  aResult );
+</pre>
+</div>
+
+
+<p>When creating your own classes, member strings will typically be
+<span class="code">nsString</span>s.  When you can't avoid creating a short string that you
+need only temporarily during a function, you will typically use
+<span class="code">nsAutoString</span>.  When someone passes you a raw pointer, or a raw
+pointer and a length, representing a buffer of characters that you may
+examine, but won't own, you can treat it like a string by wrapping it
+in an <span class="code">nsLiteralString</span>, e.g.,
+
+<div class="source-code">
+<pre>
+void
+reads_a_buffer( const PRUnichar* aInput, PRUint32 aInputLength )
+  {
+    nsLiteralString input(aInput, aInputLength);
+      // doesn't allocate or copy
+
+    // ...
+  }
+</pre>
+</div>
+
+<p>You will use <span class="code">nsLiteralString</span> around quoted constant strings as well,
+though typically through the <span class="code">NS_LITERAL_STRING</span> macro, to avoid doing
+a length calculation at run time.
+
+<div class="source-code">
+<pre>
+NS_LITERAL_STRING("x")
+</pre>
+</div>
+
+<p>expands to
+
+<div class="source-code">
+<pre>
+nsLiteralString(L"x", (sizeof(L"x")/sizeof(PRUnichar) - 1))
+</pre>
+</div>
+
+<p>if <span class="code">L</span> notation works as needed on your platform.
+
+<p>Those are the basics.  Now on to your questions:
+
+
+<pre class="email-quote">
+  >For example this won't compile. [...]
+
+  >str1 += L"abc " + str2 + L"def";
+</pre>
+
+
+<p><span class="code">L"abc "</span> makes an object that is a <span class="code">const wchar_t[5]</span>, and none of
+the string code knows about <span class="code">wchar_t</span>.  The main reason is that
+<span class="code">wchar_t</span> is not necessarily the right size (it can be 4 bytes under
+gcc).  If you wrap these constant expressions in <span class="code">NS_LITERAL_STRING</span>,
+as described above, you should get the right thing, e.g., 
+
+<div class="source-code">
+<pre>
+str1 += NS_LITERAL_STRING("abc ") + str2 + NS_LITERAL_STRING("def");
+</pre>
+</div>
+
+
+<pre class="email-quote">
+  >Another one is:
+  >function(const PRUnichar *foo);
+  >call function(L"abc " + str2);
+
+  >It won't create a temporary nsString.
+</pre>
+
+<p>This one, I have a quick and easy explanation for.  If <span class="code">function</span> was
+declared like this
+
+<div class="source-code">
+<pre>
+function( const nsAReadableString&amp;  )
+</pre>
+</div>
+
+<p>then, no problem, since an <span class="code">nsPromiseConcatenation</span> (which was the
+result of adding those two things together) <strong>is</strong> a readable string. 
+No other objects need to be created; no copying needs to be performed.
+
+<p>In all cases, we want the creation of <span class="code">nsString</span>s et al. to be
+<span class="code">explicit</span>, since creation is unbelievably expensive, requiring heap
+allocation, locks, copying, etc.
+
+<p>I hope this answers both your posts,
+
+
+
+
+
+<hr>
+<pre>
+Date: Thu, 17 Aug 2000 20:57:08 -0400
+Subject: re our conversation
+</pre>
+
+  return ToNewUnicode( nsLiteralCString(buffer) );
+
+
+
+
+
+
+<hr>
+<pre>
+Date: Fri, 18 Aug 2000 02:52:45 -0400
+Subject: Re: More questions and new string API
+</pre>
+
+<pre class="email-quote">
+  >1) How do I return a static string?
+
+  >const nsAReadableString&amp;  foo() {return NS_LITERAL_STRING("x");}
+  >errors on taking the address of a temporary variable.
+</pre>
+
+<p>Unfortunately, <span class="code">NS_LITERAL_STRING</span>'s definition is not particularly
+amenable to this use.  Instead, you would have to say something like
+this:
+
+<div class="source-code">
+<pre>
+const nsAReadableString&amp;
+foo()
+  {
+#ifdef HAVE_CPP_2BYTE_WCHAR_T
+    static nsLiteralString static_foo(L"x", 1);
+#else
+    static nsLiteralString static_foo;
+    static PRBool initialized = PR_FALSE;
+    if ( !initialized )
+      {
+        static_foo.AssignWithConversion("x", 1);
+        initialized = PR_TRUE;
+      }
+#endif
+    return static_foo;
+  }
+</pre>
+</div>
+
+
+<pre class="email-quote">
+  >2) I'm using these with the STL library in an XPCOM component.
+  >What type should I use with map?  This doesn't work...
+
+  >typedef map&lt;const nsAReadableString&amp;, myType*&gt; mapStringMyType;
+  >mapStringMyType foo;
+  >foo.find(nsAReadableString);  - I want to find on a ReadableString
+</pre>
+
+<p>I don't know what errors you are getting; but it probably doesn't work
+because a reference isn't an assignable type.  This is just a guess. 
+You may need to use
+
+<div class="source-code">
+<pre>
+map&lt;const nsAReadableString*, myType*&gt;
+</pre>
+</div>
+
+<p>If you actually want the map to manage ownership of the keys, then
+you'll want to use a concrete type, e.g.,
+
+<div class="source-code">
+<pre>
+map&lt;nsString, myType*&gt;
+</pre>
+</div>
+
+<p>or perhaps
+
+<div class="source-code">
+<pre>
+map&lt;nsSharedStringPtr, myType*&gt;
+</pre>
+</div>
+
+<p>Or maybe there's something else wrong.  Send me the error messages. 
+If you end up using a pointer, then of course you'll have to supply a
+comparison function to the <span class="code">map</span> template.  You won't be satisfied
+with the default comparison of pointers :-)  Sorry I couldn't answer
+this one more completely.
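+
+<p>A minimal sketch of the pointer-keyed map with a comparison function, in
+standard C++.  It assumes <span class="code">std::string</span> as a stand-in for the string
+class; <span class="code">MyType</span>, <span class="code">CompareStringContents</span> and <span class="code">find_by_contents</span>
+are hypothetical names used only for illustration.

```cpp
#include <map>
#include <string>

// Hypothetical stand-ins: std::string plays the role of the string class,
// and MyType the role of |myType| from the question.
struct MyType { int id; };

// Orders keys by the characters they point at, not by their addresses.
struct CompareStringContents
  {
    bool operator()( const std::string* aLeft, const std::string* aRight ) const
      {
        return *aLeft < *aRight;
      }
  };

typedef std::map<const std::string*, MyType*, CompareStringContents> MapStringMyType;

// Returns the value stored under a key with the same contents as |aKey|,
// or a null pointer if there is none.  The map does not own its keys.
MyType*
find_by_contents( const MapStringMyType& aMap, const std::string& aKey )
  {
    MapStringMyType::const_iterator iter = aMap.find(&aKey);
    return iter != aMap.end() ? iter->second : nullptr;
  }
```

+<p>Because the comparison looks through the pointers, a lookup with a
+different string object that holds the same characters finds the entry.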
+
+
+<pre class="email-quote">
+  >3) How do I get a raw PRUnichar pointer out of nsAReadableString
+  >when I need to call something that wants 'unsigned short *'?
+</pre>
+
+<p>The problem with this scenario is that an <span class="code">nsAReadableString</span> doesn't
+promise that all its data is contiguous, nor that it is
+zero-terminated, which is what I suspect you want in this case.  If
+the function you want to call can take {pointer, length} tuples, and
+can consume the string in hunks without zero termination ... then you
+can use <span class="code">copy_string</span> to pump the string into your function, see
+
+  <a class="exact-uri" href="http://ScottCollins.net/Journal/discussion/string_iterators.html">http://ScottCollins.net/Journal/discussion/string_iterators.html</a>
+
+<p>If not, and you absolutely have to have a contiguous zero-terminated
+buffer, then there is a new facility (part of the DOMAPI branch) that
+does what you need.  It's not checked in on the trunk; it should
+be in early next week.  It is <span class="code">nsPromiseFlatString</span>.  This class
+promises a contiguous zero-terminated buffer; and has an <span class="code">operator
+PRUnichar*</span> to produce a pointer to that buffer automatically.  If the
+underlying class <strong>is</strong> one that happens to be a single fragment and
+zero-terminated, then, like <span class="code">nsPromiseSubstring</span> and
+<span class="code">nsPromiseConcatenation</span>, this class merely holds a reference into the
+original data.  If, however, the underlying string is multi-fragment
+or not zero-terminated, then <span class="code">nsPromiseFlatString</span> allocates a
+contiguous buffer of appropriate size and copies the fragmented string
+data to it.  So given
+
+<div class="source-code">
+<pre>
+void ReadBuffer( PRUnichar* );
+</pre>
+</div>
+
+<p>you can call this as efficiently as possible with an arbitrary string
+like so
+
+<div class="source-code">
+<pre>
+ReadBuffer( nsPromiseFlatString(aString) );
+</pre>
+</div>
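+
+<p>The borrow-or-copy behavior described above can be sketched in standard
+C++.  This is not the real <span class="code">nsPromiseFlatString</span>; it assumes a
+hypothetical fragmented string modeled as a vector of <span class="code">std::string</span>
+hunks, and <span class="code">FlatPromise</span> is an invented name.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical fragmented string: each element is one contiguous hunk.
typedef std::vector<std::string> Fragments;

// Promises a contiguous zero-terminated buffer.  A single-fragment source
// is borrowed; anything else is copied into one owned buffer.
class FlatPromise
  {
    public:
      explicit FlatPromise( const Fragments& aSource )
          : mBorrowed(aSource.size() == 1 ? &aSource[0] : nullptr)
        {
          if ( !mBorrowed )
            for ( std::size_t i = 0; i < aSource.size(); ++i )
              mOwned += aSource[i];
        }

      // True when the promise had to allocate and copy.
      bool copied() const { return mBorrowed == nullptr; }

      const char* get() const
        {
          return mBorrowed ? mBorrowed->c_str() : mOwned.c_str();
        }

    private:
      const std::string* mBorrowed;   // not owned; must outlive the promise
      std::string        mOwned;
  };
```

+<p>As with the real class, the borrowed case is only safe while the
+source string outlives the promise.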
+
+
+<p>If the function you are calling needs to take ownership of the buffer
+you hand it, then you will probably call <span class="code">ToNewUnicode</span> like so
+
+<div class="source-code">
+<pre>
+void ConsumeBuffer( PRUnichar* );
+
+ConsumeBuffer( ToNewUnicode(aString) );
+</pre>
+</div>
+
+<p>The global function <span class="code">ToNewUnicode</span> is declared in "nsReadableUtils.h",
+and was only recently added to the build.  It is currently being used
+in the DOMAPI branch.  It is part of the build, but the file
+"dlldeps.c" in XPCOM may need to be modified to ensure it is exported
+on your platform if you are building the tip.
+
+<p>Needless to say, you want to avoid functions that require bare
+pointers for several reasons: (a) they typically assume
+zero-termination, which is not guaranteed by the normal encodings; (b)
+they require contiguous allocation, which may not be possible; (c)
+they scan for the end of the string, at linear cost (if the encoding
+makes it possible at all), when the length could be known in advance. 
+If you have to do it, the above mechanisms work, but be aware of the
+cost and the potential need to copy.
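+
+<p>By contrast, a routine that takes {pointer, length} pairs needs neither
+zero-termination nor contiguous storage.  The sketch below assumes a
+hypothetical fragmented string modeled as a vector of
+<span class="code">std::u16string</span> hunks; <span class="code">count_in_hunk</span> and
+<span class="code">count_in_fragments</span> are invented names.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Counts occurrences of |aChar| in one hunk.  The hunk is described by a
// {pointer, length} pair, so it does not need to be zero-terminated.
std::size_t
count_in_hunk( const char16_t* aData, std::size_t aLength, char16_t aChar )
  {
    std::size_t count = 0;
    for ( std::size_t i = 0; i < aLength; ++i )
      if ( aData[i] == aChar )
        ++count;
    return count;
  }

// Hypothetical multi-fragment string: each element is one contiguous hunk.
typedef std::vector<std::u16string> FragmentedString;

// Feeds the string to |count_in_hunk| one hunk at a time; no flat copy
// of the whole string is ever made.
std::size_t
count_in_fragments( const FragmentedString& aString, char16_t aChar )
  {
    std::size_t total = 0;
    for ( std::size_t i = 0; i < aString.size(); ++i )
      total += count_in_hunk(aString[i].data(), aString[i].size(), aChar);
    return total;
  }
```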
+
+
+<pre class="email-quote">
+  >4) How do I declare a local variable to hold a nsAReadableString?
+  >and a member variable?
+</pre>
+
+<p><span class="code">nsAReadableString</span> is an abstract type.  So you can't have a concrete
+instance of it.  All strings in the hierarchy are readable strings. 
+If you just want a reference to a readable string, you can say, e.g.,
+
+<div class="source-code">
+<pre>
+struct foo
+  {
+    const nsAReadableString&amp;  mString;
+    // ...
+
+    foo( const nsAReadableString&amp;  aString ) : mString(aString) { }
+  };
+</pre>
+</div>
+
+<p>...similarly with pointers; but I suspect you are looking for
+something more concrete.  An <span class="code">nsString</span> is an <span class="code">nsAReadableString</span>, and
+is the typical thing you want as a member variable.  An <span class="code">nsAutoString</span>
+is also an <span class="code">nsAReadableString</span> and is typically what you would use for
+a short (in length) temporary (in lifetime) local variable, as I
+mentioned in my previous post.
+
+
+<pre class="email-quote">
+  >5) If I call a function that returns a PRUnichar* and I want to
+  >use it as a nsAReadableString should I wrap it in a
+  >nsLiteralString?
+</pre>
+
+<p>Yes, though remember, an <span class="code">nsLiteralString</span> assumes the lifetime of the
+underlying data is under someone else's control.  If the called
+function gives you a buffer that you need to <span class="code">delete</span>, you will have
+to manage that yourself.  Currently, people often use <span class="code">nsXPIDLString</span>
+to handle that.  XPIDL strings are <strong>not</strong> part of the hierarchy.  They
+are only used as a sort of string-<span class="code">auto_ptr</span>.  However, I'm
+integrating their functionality into <span class="code">nsString</span>.  There is no problem
+in wrapping the same pointer in both as two separate local variables,
+one to give you the readable interface, and one to manage the
+lifetime.
+
+<p>If it's OK with you, I'd like to post this reply (including your
+quoted questions) to n.p.m.xpcom and also put a copy near the string
+iterator discussion I provided a link to above, so that other people
+with similar questions can see these answers.
+
+<p>Hope this helps,
+
+
+
+
+
+<hr>
+<pre>
+Date: Sun, 3 Sep 2000 03:52:17 -0400
+</pre>
+
+<p>In article &lt;8nu9m2$eo14@secnews.netscape.com&gt;, "Jon Smirl"
+&lt;jonsmirl@mediaone.com&gt; wrote:
+
+<pre class="email-quote">
+  >I have the new strings up and running in my app. They work as
+  >advertised and I haven't found any bugs. Thanks for the good job in
+  >designing and implementing them.  Here is a summary of issues I've
+  >encountered so far...
+</pre>
+
+<p>Thanks, and I appreciate your comments and insights.
+
+
+<pre class="email-quote">
+  >1) Should there be a nsSegmentedString derived from nsString instead
+  >of building segment support into nsString? None of my strings are
+  >segmented, but I keep executing code that supports it.
+  >nsPromiseFlatString would be trivial in the non-segmented case.
+</pre>
+
+<p>The general case is that a string does not promise to have contiguous
+data.  A specific case is that, for some implementations, it does. 
+You couldn't do it the other way around, because a segmented string
+couldn't satisfy all the promises of a flat string.  However, through
+the use of chunky iterators, operating on strings that happen to be
+flat is very efficient.  In fact, <span class="code">nsPromiseFlatString</span> is trivial in
+the non-segmented case.  In addition, I'll be adding an abstract flat
+class into the hierarchy, which will present additional interface ...
+in your local routines where you actually have declared a concrete
+string instance that happens to be flat, the compiler will give you
+the benefit of using the flat specific routines (e.g., a substring
+object over a flat string is simpler than the general purpose
+substring).  I need to be cautious about this, though, since I don't
+automatically want people propagating the flat type through their
+interfaces.  That would put us in the same boat we're in right now ...
+where routines only work on a specific kind of string, which denies
+other parts of the code the opportunity to use an implementation
+beneficial to its specific needs, and typically for no good reason.
+
+<pre class="email-quote">
+  >2) Should nsAWritableString have a way to get the buffer and then
+  >return it?  I need to get the buffer to pass it to OS calls. I'm
+  >doing this now by passing around nsStrings instead of the interface.
+  >If I just use the interface I incur an extra copy since I have to
+  >use a temporary buffer.
+</pre>
+
+<p>A specific string implementation could promise this, but in general, a
+writable could not.  After all, a writable doesn't even guarantee
+contiguous storage.  To some degree, this is what
+<span class="code">nsPromiseFlatString</span> is for.  However, this is a readable promise
+only.  It will also be the case that <span class="code">ns[C]String</span>s, in the very near
+future will be able to just assume ownership of an arbitrary buffer
+allocated on the free store with the XPCOM allocators ... getting one
+to give up its buffer, on the other hand, presents some problems.  Do
+you have a lot of places where the system writes into your string
+buffer space?  Or do you have a lot of system routines that return you
+new buffers?  I can imagine using <span class="code">nsPromiseFlatString</span> for this, but
+what happens when the OS alters the underlying data?  If the promise
+had generated that flat data on behalf of a multi-fragment string,
+should it now put the changes back?  It's possible to do, I just want
+to know if it's correct to allow this situation to happen.
+
+
+
+<pre class="email-quote">
+  >3) There needs to be a NS_LITERAL_CHAR() to go along with
+  >NS_LITERAL_STRING().
+</pre>
+
+<p>OK.
+
+
+
+<pre class="email-quote">
+  >Having NS_LITERAL_STRING() all over the code clutters
+  >it up and makes it hard to tell what the code is doing, could we
+  >have a standard short alias for this?
+</pre>
+
+<p>Yes, I'll try to think of something ... perhaps <span class="code">NS_LSTR</span>?
+
+
+<pre class="email-quote">
+  >4) nsLiteralString should support n.ToInteger(&amp;error);
+</pre>
+
+<p><span class="code">ToInteger</span> is actually a bad interface.  It's only good if your
+entire string is the number; this encourages you to edit your string
+until it is one, or perhaps copy the numeric part to another string. 
+Better if you just <span class="code">sscanf</span> a string (don't know if I can provide
+that in the general case, but I'm thinking about it), or else use
+regular C++ extractors (which wouldn't be too hard for me to
+provide), or else I could give you a <span class="code">ToInteger</span> that works on a pair
+of iterators, extracting the integer from the digits between them. 
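+
+<p>A rough sketch of that last option, in standard C++.  This is not an
+existing routine: <span class="code">extract_integer</span> is a hypothetical name, and the
+sketch assumes unsigned decimal digits only, with no overflow checking.

```cpp
#include <string>

// Hypothetical helper, not the real |ToInteger|: converts the decimal
// digits in [aFirst, aLast) to a number.  Returns false, and leaves
// |aResult| untouched, if the range is empty or holds a non-digit.
// No sign handling and no overflow checking in this sketch.
template <class Iterator>
bool
extract_integer( Iterator aFirst, Iterator aLast, long& aResult )
  {
    if ( aFirst == aLast )
      return false;

    long value = 0;
    for ( ; aFirst != aLast; ++aFirst )
      {
        if ( *aFirst < '0' || *aFirst > '9' )
          return false;
        value = value * 10 + (*aFirst - '0');
      }

    aResult = value;
    return true;
  }
```

+<p>The caller positions the two iterators around the digits, so the
+string itself never needs to be edited or copied first.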
+
+<pre class="email-quote">
+  >5) There should be a global define for an interface to a readonly
+  >empty string.
+</pre>
+
+<p>Yes, there will be.
+
+
+<pre class="email-quote">
+  >6) Something is wrong with concatenation....
+</pre>
+
+<p>Hopefully I've fixed this now.
+
+
+
+<pre class="email-quote">
+  >8) A forward definition is missing in the h files
+</pre>
+
+<p>I'll check it out.
+
+
+
+<p>My understanding is that you have already found the answers to your
+other questions.
+
+<p>I hope this helps,
+
+
+
+
+<hr>
+<pre>
+Date: Wed, 20 Sep 2000 17:32:13 -0400
+Subject: Re: how to free an nsString::ToNewCString
+</pre>
+
+<pre class="email-quote">
+  >What's the current approved way to free an nsString::ToNewCString? 
+</pre>
+
+<p><span class="code">nsMemory::Free</span>
+
+
+
+
+
+<hr>
+
+<p>You use several <span class="code">NS_ConvertASCIItoUTF16("...").get()</span>; these should be
+
+<div class="source-code">
+<pre>
+NS_LITERAL_STRING("...").get()
+</pre>
+</div>
+
+<p>Don't do this to the very first case, where you aren't wrapping an actual literal string.
+The first instance should exploit <span class="code">NS_LITERAL_STRING</span> technology as well,
+around the initial declarations of the strings ... probably want to do this with
+<span class="code">NS_NAMED_LITERAL_STRING</span>.
+
+
+
+<hr>
+<pre>
+Date: Thu, 12 Oct 2000 00:57:28 -0400
+Subject: string answers
+</pre>
+
+<div class="source-code">
+<pre>
+nsresult
+DoSomething( nsAWritableString&amp;  answer )
+  {
+    nsresult rv = NS_OK;  // initialized so the |else| path returns a defined value
+
+    nsXPIDLString registry_data;
+    Fetch("key", getter_Shares(registry_data));
+
+    nsLiteralString path(not_my_string);
+
+    PRInt32 first_colon = path.FindChar(PRUnichar(':'));
+    if ( first_colon != -1 )
+      {
+        // convert ... extract path from |path|
+        nsCOMPtr&lt;nsILocalFile&gt; localFile( do_CreateInstance(CID, &amp;rv) );
+        if ( localFile )
+          {
+            localFile->SetPersistentDescriptor(NS_ConvertUTF16toUTF8(path));
+
+            nsXPIDLString converted_path;
+            localFile->GetUnicodePath(getter_Copies(converted_path));
+            answer = converted_path.get();
+          }
+      }
+    else
+      {
+        answer = path;
+      }
+
+
+    return rv;
+  }
+</pre>
+</div>
+
+
+
+
+
+<hr>
+<pre>
+Date: Thu, 12 Oct 2000 02:03:49 -0400
+Subject: Re: and the answer is ...
+</pre>
+
+<p>You can see from the line of code that you're on, that this should
+have been fine.  <span class="code">nsMemory::Alloc</span> would be asked to allocate a 1 byte
+object.  But it failed trying to allocate that.  Which suggests that
+the allocator was busy and non-reentrant and the debugger tried to
+misuse it.  Yes?
+
+<p>Of course, this doesn't solve your problem.  Perhaps we need to go
+back to the idea of a function that returns a pointer to the first
+hunk of the string.
+
+<div class="source-code">
+<pre>
+const char*
+debug_string( const nsAReadableCString&amp; aCString )
+  {
+    nsReadingIterator&lt;char&gt; iter;
+    aCString.BeginReading(iter);
+    return aCString.IsEmpty() ? "" : iter.get();
+  }
+</pre>
+</div>
+
+<p>This code should work regardless of what the allocator is doing.  The
+downsides are (a) it only returns the first hunk of the string, in the
+case of a multi-fragment string; and (b) that hunk <strong>might</strong> not be
+zero-terminated.
+
+<p>Hope this helps,
+
+
+
+
+
+<hr>
+<pre>
+Date: Thu, 12 Oct 2000 08:30:32 -0400
+Subject: Re: Self healing the cache :-)
+</pre>
+
+<p>At 3:04 PM -0400 10/11/00, Mike Shaver wrote:
+<pre class="email-quote">
+  >NS_LITERAL_STRING(NS_XPCOM_SHUTDOWN_OBSERVER_ID);
+</pre>
+
+<p>Macro ugliness makes <span class="code">NS_LITERAL_STRING</span> inappropriate for use over
+other macros.  In other words:
+
+<div class="source-code">
+<pre>
+NS_LITERAL_STRING("foo")
+</pre>
+</div>
+
+<p>is <strong>good</strong>.
+
+<div class="source-code">
+<pre>
+#define FOO "foo"
+NS_LITERAL_STRING(FOO)
+</pre>
+</div>
+
+<p>is <strong>bad</strong>.  Why?  Because it turns into
+
+<div class="source-code">
+<pre>
+nsLiteralString(LFOO, sizeof(LFOO)...
+</pre>
+</div>
+
+<p>and there is no <span class="code">LFOO</span>.  Sorry.  If you have to do this to a
+macro-ized string, do the magic by hand, e.g.,
+
+<div class="source-code">
+<pre>
+nsLiteralString(FOO, sizeof(FOO)/sizeof(PRUnichar) - 1)
+</pre>
+</div>
+
+<p>or else if you don't care that <span class="code">nsLiteralString</span> will scan for the
+length, just say
+
+<div class="source-code">
+<pre>
+nsLiteralString(FOO)
+</pre>
+</div>
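+
+<p>For reference, the <span class="code">LFOO</span> failure is ordinary preprocessor
+behavior, and the usual general-purpose idiom for it is a second macro
+level.  The sketch below is plain C++ and says nothing about how
+<span class="code">NS_LITERAL_STRING</span> itself is defined; <span class="code">PASTE_WIDE</span>,
+<span class="code">EXPAND_THEN_WIDE</span>, <span class="code">kFooLength</span> and <span class="code">wide_foo</span> are
+hypothetical names.  It still relies on <span class="code">L</span> notation, so it does
+nothing about the size of <span class="code">wchar_t</span>.

```cpp
#include <cstddef>

#define FOO "foo"

// One-level pasting: the argument is pasted before it is expanded, so
// PASTE_WIDE(FOO) would become the undefined identifier |LFOO|.
#define PASTE_WIDE(s) L##s

// Two-level pasting: the outer macro expands its argument first, so the
// inner macro sees "foo" and produces L"foo".
#define EXPAND_THEN_WIDE(s) PASTE_WIDE(s)

// Length in characters, excluding the terminating zero, computed by the
// compiler rather than by scanning.
static const std::size_t kFooLength =
    sizeof(EXPAND_THEN_WIDE(FOO)) / sizeof(wchar_t) - 1;

const wchar_t*
wide_foo()
  {
    return EXPAND_THEN_WIDE(FOO);
  }
```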
+
+<p>Hope this helps,
+
+
+
+
+
+<hr>
+<pre>
+Date: Thu, 12 Oct 2000 08:36:14 -0400
+Subject: Re: Self healing the cache :-)
+</pre>
+
+<p>Actually, I'm not even sure you can do it by hand, since you didn't
+
+<div class="source-code">
+<pre>
+#define FOO L"foo"
+</pre>
+</div>
+
+<p>and <strong>can't</strong> do that cross-platform.  The other way around this is to
+define a global instead of a macro, that is, instead of saying
+
+<div class="source-code">
+<pre>
+#define FOO "foo"
+</pre>
+</div>
+
+<p>at the top of your file, say
+
+<div class="source-code">
+<pre>
+NS_NAMED_LITERAL_STRING(FOO, "foo")
+</pre>
+</div>
+
+<p>or else, if the macro was used only in one spot ... perhaps you could
+just eliminate the macro in favor of <span class="code">NS_NAMED_LITERAL</span> in situ.
+
+<p>Arghh.  In this case, you may be stuck with the extra work of
+<span class="code">AssignWithConversion</span>.
+
+
+
+
+
+<hr>
+<pre>
+Date: Sun, 3 Dec 2000 16:38:07 -0400
+Subject: Re: another copy_string question
+</pre>
+
+<pre class="email-quote">
+  >Is there a way to tell, inside the write() sink, if one is in the
+  >final hunk?  I need to do some special processing at the end.
+</pre>
+
+<p>No, there isn't.  But you could move such special processing into the
+destructor of the sink.  Remember, the sink is passed by reference, so
+you can exactly control its lifetime.
+
+<div class="source-code">
+<pre>
+{
+  MySink sink;
+  nsReadingIterator&lt;PRUnichar&gt; sourceStart = aStr.BeginReading();
+  nsReadingIterator&lt;PRUnichar&gt; sourceEnd = aStr.EndReading();
+  copy_string(sourceStart, sourceEnd, sink);
+    // |sink| destructor executed here
+}
+</pre>
+</div>
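+
+<p>A self-contained sketch of the same idea in standard C++, where the
+sink's destructor does the end-of-string processing.  It assumes hunks
+modeled as a vector of <span class="code">std::string</span>; <span class="code">BracketingSink</span>,
+<span class="code">copy_hunks</span> (a stand-in for <span class="code">copy_string</span>) and
+<span class="code">bracket</span> are hypothetical names.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical sink: |write| is called once per hunk; the destructor does
// the end-of-string processing, here appending a closing bracket.
class BracketingSink
  {
    public:
      explicit BracketingSink( std::string& aOutput )
          : mOutput(aOutput)
        {
          mOutput += '[';
        }

      ~BracketingSink() { mOutput += ']'; }

      std::size_t
      write( const char* aData, std::size_t aLength )
        {
          mOutput.append(aData, aLength);
          return aLength;
        }

    private:
      std::string& mOutput;
  };

// Stand-in for |copy_string|: feeds each hunk to the sink, which is
// passed by reference so the caller controls its lifetime.
template <class Sink>
void
copy_hunks( const std::vector<std::string>& aHunks, Sink& aSink )
  {
    for ( std::size_t i = 0; i < aHunks.size(); ++i )
      aSink.write(aHunks[i].data(), aHunks[i].size());
  }

std::string
bracket( const std::vector<std::string>& aHunks )
  {
    std::string result;
    {
      BracketingSink sink(result);
      copy_hunks(aHunks, sink);
    }   // |sink| destructor runs here and appends the final ']'
    return result;
  }
```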
+
+<p>Hope this helps,
+
+
+
+
+
+<hr>
+<pre>
+Date: Fri, 15 Dec 2000 20:02:08 -0400
+Subject: fragment of code
+</pre>
+
+<div class="source-code">
+<pre>
+nsPromiseFlatString flatKey(aReadable);
+
+flatKey.get()
+</pre>
+</div>
+
+
+
+
+
+
+<hr>
+<pre>
+Date: Tue, 16 Jan 2001 16:47:37 -0400
+Subject: Re: a few string questions...
+</pre>
+
+<pre class="email-quote">
+  >I've accumulated a few questions I've been wanting to ask you, mostly
+  >about string stuff.  Nothing urgent, but I want to ask them before I
+  >forget.  So here goes...:
+
+  >1) Is it acceptable to use nsLiteralCString or nsLiteralString on
+  >something that's not a literal?  This can be useful in some places,
+  >for example, to convert a char* to PRUnichar*:
+
+  >PRUnichar* new = ToNewUnicode(nsLiteralCString(myCharPtr));
+</pre>
+
+<p>This is explicitly allowed.  That's why I'm proposing to change the
+names of those classes to <span class="code">nsLocal[C]String</span>.
+
+
+<pre class="email-quote">
+  >2) Should nsString2x.h and nsString2x.cpp go away?  They look like a
+  >never-completed rewrite or something...
+</pre>
+
+<p>Yes.  They should go away.  They are uncompleted [old] bullshit,
+exactly as you diagnosed.
+
+<p>I'll look into the other two questions.
+
+
+
+
+
+<hr>
+<pre>
+Date: Thu, 1 Feb 2001 15:12:41 -0400
+Subject: Re: [Fwd: bad string, bad string]
+</pre>
+
+<p>We've been removing implicit conversion operators because they
+<strong>always</strong> lead to trouble.  Usually they make it harder to pick the
+right function when overloading is involved and in the past they have
+led to huge performance suckage because we ended up doing conversions
+when we didn't need to because the implicit operator made us pick the
+wrong function.
+
+<p>It's borderline when the class implements something that is <strong>so</strong>
+close, as with a guaranteed flat string or an <span class="code">nsCOMPtr</span> ... but the
+general recommendation is to avoid implicit conversions.
+
+<p>See bug #53057.
+
+
+
+
+
+<hr>
+<pre>
+Date: Tue, 6 Feb 2001 18:52:23 -0400
+Subject: seeking review for bug #57087
+</pre>
+
+<p>  bug:
+    <a class="exact-uri" href="http://bugzilla.mozilla.org/show_bug.cgi?id=57087">http://bugzilla.mozilla.org/show_bug.cgi?id=57087</a>
+
+  patch:
+    <a class="exact-uri" href="http://bugzilla.mozilla.org/showattachment.cgi?attach_id=24576">http://bugzilla.mozilla.org/showattachment.cgi?attach_id=24576</a>
+
+<p>This patch is supposed to add the ability to define very long literal
+strings more easily by breaking lines, e.g.,
+
+<div class="source-code">
+<pre>
+NS_MULTILINE_LITERAL( NS_L("This is the start of a very long line")
+                      NS_L(" which actually continues across")
+                      NS_L(" a couple more.") )
+</pre>
+</div>
+
+<p>The main danger in this scheme is callers who omit the inner <span class="code">NS_L</span>
+wrapping.  Though I believe this will be caught at compile time as the
+wrong type initializer.
+
+<p>Seeking input from everybody, and waterson in particular.
+
+
+
+
+
+<hr>
+<pre>
+Date: Wed, 14 Feb 2001 16:09:10 -0400
+Subject: Re: Question...
+</pre>
+
+<p>There are some utilities in "xpcom/ds/nsReadableUtils.h".  In
+particular, if you want to get back a new heap-allocated ASCII string
+with the minimal work, you would say
+
+<div class="source-code">
+<pre>
+PRUnichar* sourceChars = ...;
+
+char* destChars = ToNewCString(nsLiteralString(sourceChars));
+</pre>
+</div>
+
+
+<p>It's more efficient if you happen to already know the length.  If you
+don't, don't bother counting, that's what I'll do in the constructor
+for <span class="code">nsLiteralString</span>.  If you do, then call like this
+
+<div class="source-code">
+<pre>
+destChars = ToNewCString( nsLiteralString(sourceChars, length) );
+</pre>
+</div>
+
+<p>Other routines in that file will help you if, for instance, you wanted
+to translate into a buffer you had already allocated.
+
+<p>Hope this helps,
+
+
+
+
+
+<hr>
+<pre>
+Date: Fri, 23 Feb 2001 03:12:58 -0400
+Subject: string snippet
+</pre>
+
+<div class="source-code">
+<pre>
+nsCString aInput;
+
+
+
+nsReadingIterator&lt;char&gt; search_start;
+aInput.BeginReading(search_start);
+
+nsReadingIterator&lt;char&gt; search_end;
+aInput.EndReading(search_end);
+
+if ( FindCharInReadable(':', search_start, search_end) )
+  {
+    ++search_start;
+    return ToNewCString( Substring(aInput, search_start, search_end) );
+  }
+</pre>
+</div>
+
+
+
+
+
+
+<hr>
+<pre>
+Date: Wed, 7 Mar 2001 19:44:08 -0400
+Subject: string help
+</pre>
+
+<p>Here you go, Mike:
+
+  <a class="exact-uri" href="http://scottcollins.net/journal/discussion/mjudge-scratch.cpp">http://scottcollins.net/journal/discussion/mjudge-scratch.cpp</a>
+
+
+
+
+
+
+<hr>
+<pre>
+Date: Fri, 9 Mar 2001 20:56:07 -0400
+Subject: Re: string assertions
+</pre>
+
+<p>If you get an iterator into a string and you advance it all the way to
+the end of the string, and then <strong>keep</strong> trying to advance it, you hit
+this assert.  This could happen, for example if you tried to copy 10
+characters out of a 9 character string.  I've tried to make this
+impossible to get to.  As far as I know, all my routines trim requests
+in advance of manipulating iterators.  When you see this, you should
+get the stack.  That will take you right to the bad spot.
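+
+<p>"Trimming the request in advance" can be sketched like this in
+standard C++.  It assumes <span class="code">std::string</span> as a stand-in for the
+string class, and <span class="code">copy_trimmed</span> is a hypothetical name, not one
+of the actual routines.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Copies at most |aCount| characters starting at |aStart|.  Both values
// are trimmed to the string's bounds first, so no iterator is ever
// advanced past the end of |aSource|.
std::string
copy_trimmed( const std::string& aSource, std::size_t aStart, std::size_t aCount )
  {
    const std::size_t start = std::min(aStart, aSource.size());
    const std::size_t count = std::min(aCount, aSource.size() - start);
    return std::string(aSource.begin() + start, aSource.begin() + start + count);
  }
```

+<p>Asking for 10 characters out of a 9 character string simply yields the
+9 that exist.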
+
+
+
+
+
+<hr>
+<pre>
+Date: Sat, 31 Mar 2001 11:04:03 -0400
+Subject: Re: Sun bustage and string advice
+</pre>
+
+<p>You do know you are comparing two pointers now?  It seems unlikely
+those two pointers would ever be the same pointer.  You probably want
+to say something like
+
+<div class="source-code">
+<pre>
+NS_LITERAL_STRING("foo").Equals(aTopic) // or
+
+NS_LITERAL_STRING("foo") == nsLiteralString(aTopic)
+</pre>
+</div>
+
+<p>...so that you compare the <strong>contents</strong> of two strings.  Right now,
+you're just testing to see if two pointers both point to the same
+location in memory.  A lot of people make this mistake.  I would like
+to make it obvious to people that comparing two pointers does not
+compare strings.  Can you tell me what gave you that impression so
+that I can figure out how to better educate people not to do this?  By
+the way, it's not that I don't <strong>want</strong> to make this compare two
+strings; it's that in C++, you can't override operations for built-in
+types.  And pointers are built-in types.  So I can't make
+<span class="code">operator==(const PRUnichar*, const PRUnichar*)</span> do anything different
+than it already does, which is the same thing it does for any other
+pointer.
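+
+<p>The difference can be seen in a small standard C++ sketch.  It assumes
+<span class="code">char16_t</span> as a stand-in for <span class="code">PRUnichar</span>; <span class="code">same_pointer</span>
+and <span class="code">same_contents</span> are hypothetical names.

```cpp
#include <cstddef>
#include <string>

// Compares addresses only, which is what |==| on two pointers does.
bool
same_pointer( const char16_t* aLeft, const char16_t* aRight )
  {
    return aLeft == aRight;
  }

// Compares the characters of two zero-terminated buffers.
bool
same_contents( const char16_t* aLeft, const char16_t* aRight )
  {
    typedef std::char_traits<char16_t> traits;
    const std::size_t leftLength = traits::length(aLeft);
    return leftLength == traits::length(aRight)
        && traits::compare(aLeft, aRight, leftLength) == 0;
  }
```

+<p>Two separate buffers that both hold "foo" have the same contents but
+are never the same pointer.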
+
+
+
+
+
+
+</div>
+
+
+
+<!-- .................................................................End Matter -->
+
+
+
+  </body>
+</html>