diff --git a/docs/nspr/using_io_timeouts_and_interrupts_on_nt.rst b/docs/nspr/using_io_timeouts_and_interrupts_on_nt.rst
new file mode 100644
index 0000000000..9aaca06a1b
--- /dev/null
+++ b/docs/nspr/using_io_timeouts_and_interrupts_on_nt.rst
@@ -0,0 +1,131 @@
+This technical memo is a cautionary note on using Netscape Portable
+Runtime's (NSPR) IO timeouts and interrupts on Windows NT 3.51 and 4.0.
+Because of a limitation in the current implementation of NSPR IO on NT,
+programs must observe the following guideline:
+
+If a thread calls an NSPR IO function on a file descriptor and the IO
+function fails with ``PR_IO_TIMEOUT_ERROR`` or
+``PR_PENDING_INTERRUPT_ERROR``, the file descriptor must be closed
+before the thread exits.
+
+This memo explains the problem that this guideline works around and
+discusses the guideline's limitations.
+
+.. _NSPR_IO_on_NT:
+
+NSPR IO on NT
+-------------
+
+The IO model of NSPR 2.0 is synchronous and blocking. A thread calling
+an IO function is blocked until the IO operation finishes, either due to
+a successful IO completion or an error. If the IO operation cannot
+complete before the specified timeout, the IO function returns with
+``PR_IO_TIMEOUT_ERROR``. If the thread gets interrupted by another
+thread's ``PR_Interrupt()`` call, the IO function returns with
+``PR_PENDING_INTERRUPT_ERROR``.
+
+On Windows NT, NSPR IO is implemented using NT's *overlapped* (also
+called *asynchronous*) *IO*. When a thread calls an IO function, the
+thread issues an overlapped IO request using the overlapped buffer in
+its ``PRThread`` structure. Then the thread is put to sleep. In the
+meantime, there are dedicated internal threads (called the *idle
+threads*) monitoring the IO completion port for completed IO requests.
+If a completed IO request appears at the IO completion port, an idle
+thread fetches it and wakes up the thread that issued the IO request
+earlier. This is the normal way the thread is awakened.
+
+.. _IO_Timeout_and_Interrupt:
+
+IO Timeout and Interrupt
+------------------------
+
+However, NSPR may wake up the thread in two other situations:
+
+- if the overlapped IO request is not completed before the specified
+  timeout. (Note that we can't specify a timeout on overlapped IO
+  requests, so timeouts are all handled at the NSPR level.) In this
+  case, the error is ``PR_IO_TIMEOUT_ERROR``.
+- if the thread gets interrupted by another thread's
+  ``PR_Interrupt()`` call. In this case, the error is
+  ``PR_PENDING_INTERRUPT_ERROR``.
+
+These two errors are generated by the NSPR layer, so the OS is unaware
+of what is going on and the overlapped IO request is still in progress.
+The OS still has a pointer to the overlapped buffer in the thread's
+``PRThread`` structure. If the thread subsequently exits and its
+``PRThread`` structure gets deleted, that pointer refers to freed
+memory, which the OS may still write to when the request completes.
+
+.. _Canceling_Overlapped_IO_by_Closing_the_File_Descriptor:
+
+Canceling Overlapped IO by Closing the File Descriptor
+------------------------------------------------------
+
+Therefore, we need to cancel the outstanding overlapped IO request
+before the thread exits. NT's ``CancelIo()`` function would be
+ideal for this purpose. Unfortunately, ``CancelIo()`` is not
+available on NT 3.51, so we can't go this route as long as we are
+supporting NT 3.51. The only reliable way to cancel an outstanding
+overlapped IO request that works on both NT 3.51 and 4.0 is to close the
+file descriptor, hence the rule of thumb stated at the beginning of this
+memo.
+
+.. _Limitations:
+
+Limitations
+-----------
+
+This seemingly harsh way to force the completion of outstanding
+overlapped IO requests has the following limitations:
+
+- It is difficult for threads to share a file descriptor. For example,
+  suppose thread A and thread B call ``PR_Accept()`` on the same
+  socket, and they time out at the same time. Following the rule of
+  thumb, both threads would close the socket. The first
+  ``PR_Close()`` would succeed, but the second ``PR_Close()``
+  would free memory that has already been freed. A solution that may
+  work is to use a lock so that only one thread uses the socket at a time.
+- Once there is a timeout or interrupt error, the file descriptor is no
+  longer usable. If the file descriptor is intended to be used for
+  the lifetime of the process (a log file, for example), this is
+  not acceptable. A possible solution is to add a
+  ``PR_DisableInterrupt()`` function to turn off interrupts when
+  accessing such file descriptors.
+
+..
+
+ A related known bug is that timeouts and interrupts don't work for
+ ``PR_Connect()`` on NT. This bug is due to a different
+ limitation in our NT implementation.
+
+.. _Conclusions:
+
+Conclusions
+-----------
+
+As long as we need to support NT 3.51, we need to program under the
+guideline that after an IO timeout or interrupt error, the thread must
+make sure the file descriptor is closed before it exits. Programs should
+also take care when sharing file descriptors and when using IO timeouts or
+interrupts on files that need to stay open for the lifetime of the process.
+
+When we stop supporting NT 3.51, we can look into using NT 4's
+``CancelIo()`` function to cancel outstanding overlapped IO
+requests when we get IO timeout or interrupt errors. If
+``CancelIo()`` really works as advertised, that should
+fundamentally solve this problem.
+
+If these limitations of IO timeouts and interrupts are not acceptable
+for your programs, you can consider using the Win95 version of
+NSPR. The Win95 version runs without trouble on NT, but you would lose
+the better performance provided by NT fibers and asynchronous IO.
+
+|
+
+.. _Original_Document_Information:
+
+Original Document Information
+-----------------------------
+
+- Author: larryh@netscape.com
+- Last Updated Date: December 1, 2004
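
Note on the patch above: the guideline stated at the top of the memo (after a timeout or interrupt error, close the file descriptor before the thread exits) can be sketched as a small piece of control flow. This is only an illustration under stated assumptions, not NSPR code. Every `Demo*`/`demo_*` name, along with `must_close_before_exit` and `worker_body`, is a hypothetical stand-in invented so that the example builds without the NSPR headers. The two error values model `PR_IO_TIMEOUT_ERROR` and `PR_PENDING_INTERRUPT_ERROR`, and `demo_close` models `PR_Close()`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an NSPR file descriptor. */
typedef struct DemoFd {
    int is_open;                  /* 1 while the descriptor is open */
} DemoFd;

/* Hypothetical stand-ins for the error codes discussed in the memo. */
typedef enum DemoError {
    DEMO_NO_ERROR,
    DEMO_IO_TIMEOUT_ERROR,        /* models PR_IO_TIMEOUT_ERROR */
    DEMO_PENDING_INTERRUPT_ERROR, /* models PR_PENDING_INTERRUPT_ERROR */
    DEMO_OTHER_ERROR              /* any other IO failure */
} DemoError;

/* Models a blocking IO call: returns 0 on success and -1 on failure,
 * reporting the simulated error through *err. */
int demo_io(DemoFd *fd, DemoError simulated, DemoError *err)
{
    assert(fd != NULL && err != NULL);
    *err = simulated;
    return simulated == DEMO_NO_ERROR ? 0 : -1;
}

/* Models closing the descriptor (PR_Close() in the memo). */
void demo_close(DemoFd *fd)
{
    assert(fd != NULL);
    fd->is_open = 0;
}

/* The memo's rule: only these two errors leave an overlapped request
 * outstanding, so only they force a close before the thread exits. */
int must_close_before_exit(DemoError err)
{
    return err == DEMO_IO_TIMEOUT_ERROR ||
           err == DEMO_PENDING_INTERRUPT_ERROR;
}

/* Body of a worker thread: one IO call, then cleanup according to the
 * rule. Returns 1 if this function closed the descriptor, else 0. */
int worker_body(DemoFd *fd, DemoError simulated)
{
    DemoError err;

    if (demo_io(fd, simulated, &err) < 0 && must_close_before_exit(err)) {
        demo_close(fd);           /* close before the thread exits */
        return 1;
    }
    return 0;
}
```

In a real program the error would presumably be read with `PR_GetError()` right after the failing IO call, and a descriptor shared between threads would additionally need the lock mentioned under Limitations so that only one thread closes it.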