<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml-stylesheet type="text/xsl" href="stylesheet.xsl"?>

<doc>
    <title>Introduction</title>



    <!-- ************************************************************************* -->

    <body>
        





        <!--   ****************************   OVERVIEW SECTION  ****************************    -->



        <h1>Overview</h1>




         <p>
            Dlib is a general-purpose, cross-platform, open source software library written in the C++ programming
            language. Its design is heavily influenced by ideas from design by contract and component-based 
            software engineering. It is therefore, first and foremost, a collection of independent
            software components, each accompanied by extensive documentation and thorough debugging modes. 
         </p>

      
        <p>
           <a href='mailto:davis@dlib.net'>Davis King</a> has been the primary
           author of dlib since development began in 2002.  In that time
           dlib has grown to include a wide variety of tools.  In particular,
           it now contains software components for dealing with networking,
           threads, graphical interfaces, complex data structures, linear
           algebra, statistical machine learning, image processing, data
           mining, XML and text parsing, numerical optimization, Bayesian
           networks, and numerous other tasks.  In
           recent years, much of the development has been focused on creating 
           a broad set of statistical machine learning tools.  However, dlib
           remains a general purpose library and <a href="howto_contribute.html">welcomes contributions</a> of high
           quality software components useful in any domain.
        </p>

        <p>
         Core to the development philosophy of dlib is a dedication to
         portability and ease of use.  Therefore, all code in dlib is designed
         to be as portable as possible and to work without requiring the user to
         configure or install anything.  To help achieve this, all platform
         specific code is confined inside the API wrappers.  Everything else is
         either layered on top of those wrappers or is written in pure ISO
         standard C++.  Currently the library is known to work on OS X, MS
         Windows, Linux, Solaris, the BSDs, and HP-UX.  It should work on any
         POSIX platform, but I haven't had the opportunity to test it on any
         others.  If you have access to other platforms and would like to help
         extend this list, then let me know.
        </p>
        <p>
         The rest of this page explains everything you need to know to get started using the library.  It 
         explains where to find the documentation for each object/function and how to interpret
         what you find there.   For help compiling with dlib, see the <a href="compile.html">how to compile</a>
        page.  If you are having trouble finding a particular object's documentation, you may
        be able to locate it by consulting the <a href="term_index.html">index</a>.</p>
        <p>
         The library is also covered by the very liberal Boost Software License
         so feel free to use it any way you like.  However, if you use dlib in
         your research then please cite <a href="faq.html#HowcanIcitedlib">its Journal of Machine Learning Research paper</a> when
         publishing.
        </p>
        <p>
           Finally, I must give some credit to the <a
           href="http://www.cse.ohio-state.edu/~weide/rsrg/index.html">Reusable
           Software Research Group</a> at Ohio State since they taught me many
           of the software engineering techniques used in the creation of this library.  
        </p>


        <!--   ****************************   NOTATION SECTION  ****************************    -->


        <h1>Notation</h1>
        <p>
        For the most part I try to document my code in a way that any C++ programmer would understand, 
        but for the sake of brevity I use some of the following uncommon notation.
        </p>

        <ul>
            <li/><b> kernel, extension, and abstract </b>
            <ul>
                Each component of the library has a specification which defines its core behavior and interface.  This 
                specification defines what is called the component's kernel.  Additionally, each component may have any number of 
                extensions.  An extension is essentially a specification for something that layers functionality on top of the 
                kernel of a component.  
                <br/>
                <br/>  In the naming of files I use the word abstract to indicate that a file
                contains a specification of a kernel component or extension rather than an actual implementation.
            </ul>




            <br/><li/><b>/*! comments like this !*/</b>
            <ul>
                These are "formal comments."  They generally appear after a function prototype, where they contain
                the requires/ensures clauses, or at the top of a class, where they tell you general things about the class. 
            </ul>




            <br/><li/> <b> requires/ensures/throws </b>
            <ul>
                These words appear in the formal comment following function prototypes and have the following meanings.
                <br/><br/><u>requires</u>:   This defines a list of requirements for calling the function.  These requirements
                MUST be met or a call to the function has undefined results.   (Note that when the checking/debugging modes
                are enabled on an object, a broken requires clause makes it throw the dlib::fatal_error exception with
                fatal_error::type == EBROKEN_ASSERT rather than causing "undefined results.")

               <br/><br/><u>ensures</u>:    This defines what the function does.  It is a list of conditions that will be 
                true after the function finishes executing.  Note that if an exception is thrown then nothing in the 
                ensures clause is guaranteed to be true.

               <br/><br/><u>throws</u>:     This defines what exceptions may be thrown by this function.  It generally
                tells you why the exception might be thrown.  It also tells you what the function does in this event: 
                Does it have no effect at all? Does it corrupt any objects?  etc.  

                <br/>
                <br/>
                Sometimes these blocks do not appear in the formal comment.  The meanings in these cases are as follows:
                <br/><u>missing requires</u>:   There are no requirements; you may put anything in the function arguments.
                <br/><u>missing ensures</u>:    This means that the effects of the function are unspecified.  This is often used
                for callbacks where the client programmer implements the actual function. 
                <br/><u>missing throws</u>:     This doesn't mean anything.  A function without a throws block
                might throw exceptions or it might not.
                                    
                <br/>
                <br/>
                So in summary, the requires clause must always be satisfied, the ensures clause tells you what the 
                function does when it does <i>not</i> throw or return an error, and the throws clause tells you what happens when the function
                <i>does</i> throw.

            </ul>



            <br/><li/> <anchor>meaning_of_hash</anchor> <b> meaning of # symbol </b>
            <ul>
                I use this as a prefix on identifiers to make reference to the value of the identifier "after"
                some event has occurred.  
                <br/><br/>
                The most common place I use this notation is inside the formal comment following a function prototype.  
                When an identifier in a requires/ensures/throws block is prefixed with the # symbol, it refers to the value of
                the identifier after the function has finished.  All unprefixed references to an identifier 
                refer to its value before the function was called.
                <br/><br/>
                An example will make it clear.


                <code_box><font color='#3333FF'>int</font> <b>funct</b><font face="Lucida Console">(</font>
<font color='#3333FF'>    int</font>&amp; something
<font face="Lucida Console">);</font>
<font color='#009900'>/*!
    requires
        - something &gt; 4     
    ensures
        - #some_other_function() == 9
        - #something == something + 1
        - returns something
!*/</font>
</code_box>

                This says that funct() requires that "something" be greater than 4, that funct() will increment "something" 
                by 1, and funct() returns the original value of something.  It also says that
                <i>after</i> the call to funct() ends a call to some_other_function() will return the value 9.

            </ul>
            
            <br/><li/> <anchor>CONVENTION</anchor> <b> CONVENTION </b>
            <ul>
                This is a section of the formal comment which appears at the top of classes which are 
                actual implementations (as opposed to specifications).  This section of the comment contains
                a list of invariants that tell you what the member variables are used for.  It also relates 
                the state of the member variables to the class interface.
                <br/>
                <br/>
                For example, you might see a line in this section that says "my_size == size()".  This just means
                that the member variable my_size always contains the value returned by the size() function.
            </ul>




            <br/><li/> <b> "initial value for its type" </b>
            <ul>
                I frequently say that after a function executes some variable or argument will have an 
                initial value for its type.  This makes sense for objects with a user-defined constructor,
                but not for anything else.   Therefore, the initial value of a type with no user-defined
                constructor is undefined. 
            </ul>

        </ul>
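        <p>
        The notation above can be pictured with a small sketch.  It repeats the funct() specification from the
        example and adds one possible body that satisfies it, followed by a class carrying a CONVENTION comment.
        The function body, the other_state variable, and the small_counter class are assumptions made up for
        this illustration; they are not taken from dlib.
        </p>
<code_box><![CDATA[
```cpp
#include <cassert>

// Hypothetical state that some_other_function() reports; invented for this sketch.
static int other_state = 0;

int some_other_function() { return other_state; }

int funct(
    int& something
)
/*!
    requires
        - something > 4
    ensures
        - #some_other_function() == 9
        - #something == something + 1
        - returns something
!*/
{
    // Checking aid for this sketch only.  Without it, breaking the requires
    // clause simply has undefined results, as the text describes.
    assert(something > 4);

    const int original = something;  // the un-prefixed "something" in the spec
    something = original + 1;        // establishes #something == something + 1
    other_state = 9;                 // establishes #some_other_function() == 9
    return original;                 // "returns something": the value before the call
}

class small_counter
{
    /*!
        CONVENTION
            - my_size == size()
    !*/
public:
    small_counter() : my_size(0) {}

    unsigned long size() const { return my_size; }

    void add() { ++my_size; }
    /*!
        ensures
            - #size() == size() + 1
    !*/

private:
    unsigned long my_size;  // always equal to the value returned by size()
};
```
]]></code_box>
        <p>
        Calling funct() on a variable holding 7 would, under this sketch, return 7 and leave the variable holding 8,
        which is exactly what the un-prefixed and #-prefixed names in the ensures clause describe.
        </p>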
        



        <!--   ****************************   ORGANIZATION SECTION  ****************************    -->

        <h1>Organization</h1>

        <p>
        The library can be thought of as a collection of components.  Each component always consists of
        at least two separate files, a specification file and an implementation file.  The specification
        files are the ones that end with _abstract.h.  These specification files don't actually 
        contain any code, and they even have preprocessor directives that prevent any of their contents from 
        being included.  Their purpose is purely to document a component's interface in a file that isn't
        cluttered with implementation details the user shouldn't need to know about.  
        </p>
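        <p>
        As a rough sketch of how preprocessor directives can keep a specification file's contents away from the
        compiler, consider the fragment below.  The macro EXAMPLE_STACK_ABSTRACT_ and the example_stack class are
        hypothetical names chosen for this illustration, and the exact directives in dlib's own headers may differ.
        The specification and the implementation are shown in one listing only so that it stands alone; in the
        library they would live in separate files.
        </p>
<code_box><![CDATA[
```cpp
#include <cassert>

// ---- what a specification ("_abstract.h") file might look like ----
// The macro is undefined immediately before it is tested, so everything
// between #ifdef and #endif is skipped by the preprocessor.
#undef EXAMPLE_STACK_ABSTRACT_
#ifdef EXAMPLE_STACK_ABSTRACT_

    class example_stack
    {
    public:
        void push(int item);
        /*!
            ensures
                - #size() == size() + 1
        !*/

        unsigned long size() const;
        /*!
            ensures
                - returns the number of items pushed so far
        !*/
    };

#endif // EXAMPLE_STACK_ABSTRACT_

// ---- what an implementation file might look like ----
// Defining example_stack here is not a redefinition, because the compiler
// never saw the documented declaration above.
class example_stack
{
public:
    example_stack() : count(0) {}
    void push(int) { ++count; }  // this toy version only counts the items
    unsigned long size() const { return count; }
private:
    unsigned long count;
};

// Small usage helper: pushes two items and reports the resulting size.
unsigned long size_after_two_pushes()
{
    example_stack s;
    s.push(1);
    s.push(2);
    return s.size();
}
```
]]></code_box>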
        
        <p>
           The next important concept in dlib organization is multi-implementation components.  That is, 
           some components provide more than one implementation of what is defined in their specification.  
           When you use these components you have to identify them with names like <tt>dlib::component::kernel_1a</tt>.
           Often these components will have just a debugging and non-debugging implementation.  However, many components
           provide a large number of alternate implementations.  For example, the <a href="compression.html#entropy_encoder_model">entropy_encoder_model</a>
           has 32 different implementations you can choose from.  
        </p>

        <ul>

            <li/> <b>File organization for multi-implementation components</b>
            <ul>
                Each component gets its own folder and one file in the root of the directory tree.
                <br/><br/>
                I will use the <a href="containers.html#queue">queue</a> object as a typical example and
                explain what each of its files contains.
                Below is the directory structure and all the files related to the queue component.  

                <br/><br/>
                <ul><li/> <b> file tree </b>
                    <ul>
                        <li/> dlib/
                        <ul>
                            <li/> queue.h
                            <li/> queue/
                            <ul>
                                <li/> queue_kernel_abstract.h  
                                <li/> queue_kernel_1.h
                                <li/> queue_kernel_2.h
                                <li/> queue_kernel_c.h
                                <li/> queue_sort_abstract.h
                                <li/> queue_sort_1.h
                            </ul>
                        </ul>
                    </ul>
   

                    <br/>

                    <li/> <a href="dlib/queue.h.html">queue.h</a> 
                    <ul> This file does not contain any executable code.  All it does is define the typedefs such as 
                    kernel_1a, kernel_1a_c, etc. for the queue object.  See the <a href="#creating_objects">Creating Objects</a>
                    section to learn what these typedefs are for.
                    </ul>
                    
                    <li/> <a href="dlib/queue/queue_kernel_abstract.h.html"> queue_kernel_abstract.h </a>
                    <ul> 
                    This file does not contain any code.  It even has preprocessor directives that prevent
                    any of its contents from being included.   
                    <br/>
                    <br/>
                    The purpose of this file is to define exactly what a queue object does and what its 
                    interface is.   
                    </ul>

                    <li/> <a href="dlib/queue/queue_sort_abstract.h.html"> queue_sort_abstract.h </a>
                    <ul> 
                    This file also doesn't contain any code.  Its only purpose is to define the sort
                    extension to queue objects.  
                    </ul>

                    <li/> <a href="dlib/queue/queue_kernel_1.h.html"> queue_kernel_1.h </a>
                    <ul> 
                    This file contains an implementation of the queue kernel specification found 
                    in queue_kernel_abstract.h 
                    </ul>

                    <li/> <a href="dlib/queue/queue_kernel_2.h.html"> queue_kernel_2.h </a>
                    <ul> 
                    This file contains another implementation of the queue kernel specification found 
                    in queue_kernel_abstract.h 
                    </ul>

                    <li/> <a href="dlib/queue/queue_sort_1.h.html"> queue_sort_1.h </a>
                    <ul> 
                    This file contains an implementation of the queue sort extension specification found 
                    in queue_sort_abstract.h 
                    </ul>

                    <li/> <a href="dlib/queue/queue_kernel_c.h.html"> queue_kernel_c.h </a>
                    <ul> 
                    This file contains a templated class which wraps any implementation of the queue kernel
                    specification.  It is used during debugging to check that the requires clauses are never
                    violated.
                    </ul>
                </ul>
            </ul>
        </ul>





        <!--   ****************************   CREATING OBJECTS SECTION  ****************************    -->


    
    
        <anchor>creating_objects</anchor>
        <h1>Creating Objects</h1>

        <p>
        To create many of the objects in this library you need to choose which kernel implementation you would like and 
        whether you want the checking version or any extensions.   
        </p>
        <p>
        To make this easy there are header files which define typedefs for each of these combinations.  For 
        example, to create a queue of ints using queue kernel implementation 1 you would type
        <tt>dlib::queue&lt;int&gt;::kernel_1a my_queue;</tt>.   Or to get the debugging/checking version you 
        would type <tt>dlib::queue&lt;int&gt;::kernel_1a_c my_queue;</tt>.
        </p>
        <p>
        There can be a lot of different typedefs for each component.  You can find a list of them
        in the section for the component in question.  For the queue component they can be found
        <a href="containers.html#queue">here</a>.
        </p>
        <p>
        None of the above applies to the single-implementation components, that is, anything that doesn't have an "implementations"
        section in its documentation.  These tools are designed to have only one implementation and thus do not follow the
        above naming convention.  For example, to create a 
        <a href="other.html#logger">logger</a> object you would simply type <tt>dlib::logger mylog("name");</tt>. 
        For the purposes of object creation the API components also appear to be single-implementation.  That is, there is no 
        need to specify which implementation you want since it is automatically determined by which platform you compile under.
        Note also that there are no explicit checking versions of these components.  However, there are 
        <a href="metaprogramming.html#DLIB_ASSERT">DLIB_ASSERT</a> statements that perform checking and you can 
        enable them by #defining DEBUG or ENABLE_ASSERTS. 
        </p>
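        <p>
        The typedef-based selection described in this section can be imitated in a few lines of standalone C++.
        The sketch below is not dlib's code: toy_queue, toy_queue_kernel_1, and toy_queue_kernel_c are hypothetical
        names, and std::logic_error merely stands in for the exception a real checking version would throw.  It
        only shows how nested typedefs can let one name pick both an implementation and its checking wrapper.
        </p>
<code_box><![CDATA[
```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <stdexcept>

// Hypothetical kernel implementation.  It does not check its requires clause.
template <typename T>
class toy_queue_kernel_1
{
public:
    typedef T type;

    void enqueue(const T& item) { items.push_back(item); }

    // requires: size() != 0
    T dequeue()
    {
        T front = items.front();
        items.pop_front();
        return front;
    }

    std::size_t size() const { return items.size(); }

private:
    std::deque<T> items;
};

// Hypothetical checking wrapper.  It layers a requires-clause check on top of
// whichever kernel it is given and otherwise forwards to that kernel.
template <typename kernel>
class toy_queue_kernel_c : public kernel
{
public:
    typename kernel::type dequeue()
    {
        if (this->size() == 0)
            throw std::logic_error("requires clause broken: size() != 0");
        return kernel::dequeue();
    }
};

// The selector: its nested typedefs name the available combinations.
template <typename T>
struct toy_queue
{
    typedef toy_queue_kernel_1<T>           kernel_1a;
    typedef toy_queue_kernel_c<kernel_1a>   kernel_1a_c;
};

// Usage helper: the checking version reports misuse instead of misbehaving.
bool checked_version_reports_misuse()
{
    toy_queue<int>::kernel_1a_c my_queue;
    try
    {
        my_queue.dequeue();  // breaks the requires clause on purpose
    }
    catch (const std::logic_error&)
    {
        return true;
    }
    return false;
}
```
]]></code_box>
        <p>
        Switching between the plain and checking versions then means changing only the typedef name in the
        declaration, which is the convenience the naming scheme is meant to provide.
        </p>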


        <!--   ****************************   ASSUMPTIONS SECTION  ****************************    -->


        <h1>Assumptions</h1>
        There are some restrictions on the behavior of certain objects or functions.  
        Rather than replicating these restrictions all over the place in my documentation they
        are listed here.  

        <ul>

        <li/> <b> global swap() </b>
        <ul>
        It is assumed that this function does not throw.  Undefined behavior results if it does.
        Note that std::swap() does not throw for any intrinsic type or for std::string.  
        </ul>



        <br/><li/> <b> operator&lt;() </b>
        <ul>
        It is assumed that this operator (or std::less or any similar functor supplied by you to the library) 
        does not throw.  Undefined behavior results if it does.  
        </ul>



        <br/><li/> <b> dlib::general_hash </b>
        <ul>
        It is assumed that general_hash does not throw.  Undefined behavior results if it does.
        This is actually noted in the general_hash specification file, but I'm also listing it here for good measure.

        </ul>



        </ul>
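        <p>
        For a user-defined type, the first two assumptions might be met along the following lines.  The example
        namespace and the record type are hypothetical, and the noexcept keyword requires C++11 or later; it is
        used here only to state the no-throw promise in code and is not something the text above requires.
        </p>
<code_box><![CDATA[
```cpp
#include <cassert>
#include <string>
#include <utility>

namespace example  // hypothetical namespace for this sketch
{
    struct record
    {
        int id;
        std::string name;
    };

    // A global swap() for record, found through argument-dependent lookup.
    // It only swaps an int and a std::string, neither of which throws.
    inline void swap(record& a, record& b) noexcept
    {
        using std::swap;
        swap(a.id, b.id);
        swap(a.name, b.name);
    }

    // An ordering that only compares integers, so it cannot throw either.
    inline bool operator<(const record& a, const record& b) noexcept
    {
        return a.id < b.id;
    }
}

// Usage helper: swaps two records and checks both the values and the ordering.
bool swap_exchanges_values()
{
    example::record a{1, "first"};
    example::record b{2, "second"};
    swap(a, b);  // resolves to example::swap via argument-dependent lookup
    return a.id == 2 && a.name == "second"
        && b.id == 1 && b.name == "first"
        && b < a;
}
```
]]></code_box>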





        <!--   ****************************   THREAD SAFETY SECTION  ****************************    -->


         <anchor>thread_safety</anchor>
        <h1>Thread Safety</h1>
          
        <p>
        In the library there are three kinds of objects with regard to threading:  
        <ul>
           <li>Objects which are completely thread safe.  This means that any pattern of access from 
           multiple threads is safe.</li>
           <li>Objects which are safe to use if no two threads touch the same instance, but require access
           to a particular instance to be serialized via a mutex if it is shared among threads. </li>
           <li>Objects which share some kind of global resource or are reference counted.  This kind of object is
               extremely thread unfriendly and can only be used in a threaded program with great care. </li>
        </ul>
        </p>

        <p>
        How do you know which components/objects are thread safe and which aren't?   The rule is that if
        the specification for the component doesn't mention threading or thread safety then 
        it is ok to use as long as you serialize access to shared instances.   If the component might have
        some global resources or be reference counted then its specification will tell you this. 
        Lastly, if the component is completely thread safe then its specification will say so.
        </p>
        <p>
           Also note that global functions in dlib are always thread safe. 
        </p>
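        <p>
        Serializing access to a shared instance, as required for the second kind of object, could look roughly
        like the sketch below.  It uses the C++11 standard library (std::thread, std::mutex) rather than dlib's
        own threading tools, and the tally type is a hypothetical stand-in for any object that is safe on its
        own but not when shared.
        </p>
<code_box><![CDATA[
```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical object of the second kind: fine when each thread has its own
// instance, but unsafe if several threads modify one instance at once.
struct tally
{
    long count = 0;
    void add_one() { ++count; }
};

// Starts num_threads workers that all update one shared tally.  Every access
// to the shared instance happens while holding the same mutex.
long count_with_serialized_access(int num_threads, int increments_per_thread)
{
    tally shared;
    std::mutex shared_mutex;
    std::vector<std::thread> workers;

    for (int t = 0; t < num_threads; ++t)
    {
        workers.emplace_back([&shared, &shared_mutex, increments_per_thread]() {
            for (int i = 0; i < increments_per_thread; ++i)
            {
                // The lock is released automatically at the end of each iteration.
                std::lock_guard<std::mutex> lock(shared_mutex);
                shared.add_one();
            }
        });
    }

    for (std::thread& worker : workers)
        worker.join();

    return shared.count;  // all workers have finished, so no lock is needed
}
```
]]></code_box>
        <p>
        Because each increment is performed under the lock, no updates are lost, so the final count equals the
        number of threads multiplied by the increments per thread.
        </p>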

    
    </body>



    <!-- ************************************************************************* -->

</doc>