For reference, the code that generates this error is: dlib/test_for_odr_violations.h and dlib/test_for_odr_violations.cpp.
If you think you have found some kind of bug or problem in dlib then feel free to submit a dlib issue on GitHub. But include the version of dlib you are using, what you tried, what happened, what you expected to happen instead, etc.
On the other hand, if you haven't found a bug or problem in dlib, but are instead looking for machine learning/computer vision/programming help, then post your question to Stack Overflow with the dlib tag.
First, note that you need a version of Visual Studio with decent C++11 support. This means you need Visual Studio 2015 or newer.
There are instructions on the How to Compile page. If you do not understand the instructions in the "Compiling on Windows Using Visual Studio" section or are getting errors then follow the instructions in the "Compiling on Any Operating System Using CMake" section. In particular, install CMake and then type these exact commands from within the root of the dlib distribution:
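The standard out-of-source CMake build sequence is shown below; check the How to Compile page for the current exact commands, since they may change between dlib releases:

    mkdir build
    cd build
    cmake ..
    cmake --build . --config Release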
You should try kernel ridge regression instead since it also doesn't take any parameters but is always very fast.
For example, you could reduce the amount of data by saying this:
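A minimal sketch of that idea, using dlib's randomly_subsample(); the data and counts here are made up purely for illustration:

    #include <dlib/statistics.h>
    #include <vector>
    #include <iostream>

    int main()
    {
        // Stand-in for a large training set.
        std::vector<double> samples;
        for (int i = 0; i < 100000; ++i)
            samples.push_back(i);

        // Keep only 1000 randomly chosen samples so training stays fast.
        samples = dlib::randomly_subsample(samples, 1000);

        std::cout << "samples left: " << samples.size() << std::endl;
    }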
Picking the right kernel all comes down to understanding your data, and obviously this is highly dependent on your problem.
One thing that's sometimes useful is to plot each feature against the target value. You can get an idea of what your overall feature space looks like and maybe tell if a linear kernel is the right solution. But this still hides important information from you. For example, imagine you have two diagonal lines which are very close together and are both the same length. Suppose one line is of the +1 class and the other is the -1 class. Each feature (the x or y coordinate values) by itself tells you almost nothing about which class a point belongs to but together they tell you everything you need to know.
On the other hand, if you know something about the data you are working with then you can also try to generate your own features. For example, if your data is a bunch of images and you know that one of your classes contains a lot of lines then you can make a feature that attempts to measure the number of lines in an image using a Hough transform or Sobel edge filter or whatever. Generally, try to think up features that should be highly correlated with your target value. A good way to do this is to try to actually hand code N solutions to the problem using whatever you know about your data or domain. If you do a good job then you will have N really great features, and a linear or RBF kernel will probably do very well when using them.
Or you can just try a whole bunch of kernels, kernel parameters, and training algorithm options while using cross validation. I.e. when in doubt, use brute force :) There is an example of that kind of thing in the model selection example program.
So you need to pick the gamma value so that it is scaled reasonably to your data. A good rule of thumb (i.e. not the optimal gamma, just a heuristic guess) is the following:
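A sketch of that heuristic, using dlib's compute_mean_squared_distance() and randomly_subsample(); the sample data here is made up purely for illustration:

    #include <dlib/statistics.h>
    #include <dlib/matrix.h>
    #include <vector>
    #include <iostream>

    int main()
    {
        typedef dlib::matrix<double,2,1> sample_type;

        // Stand-in for your real training samples.
        std::vector<sample_type> samples;
        for (int i = 0; i < 5000; ++i)
        {
            sample_type s;
            s = i%100, (i*7)%100;
            samples.push_back(s);
        }

        // Heuristic: scale gamma by the mean squared distance between a
        // random subset of the training samples.
        const double gamma = 3.0/dlib::compute_mean_squared_distance(
                                     dlib::randomly_subsample(samples, 2000));
        std::cout << "gamma: " << gamma << std::endl;
    }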
Sometimes annotating all the objects in each image is too onerous, or there are ambiguous objects you don't care about. In these cases you should annotate the objects you don't care about with ignore boxes so that the MMOD loss knows to ignore them. You can do this with dlib's imglab tool by selecting a box and pressing i.

Moreover, there are two ways the code treats ignore boxes. When a detector generates a detection it compares it against any ignore boxes and discards the detection if the boxes "overlap". Deciding whether they overlap is based on either their intersection over union (IoU) or basic percent coverage of one box by another. You have to think about which mode you want when you annotate things and configure the training code appropriately. The default behavior is to measure overlap with intersection over union. However, if you want to simply mask out large parts of an image, IoU is the wrong measure: small boxes contained entirely within the large ignored region would have a small IoU with it and thus not "overlap" the ignore region. In this case you should change the settings to use percent coverage before training. The available configuration options are discussed in detail in dlib's documentation.
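For instance, here is a hedged sketch of switching the MMOD loss to a percent-coverage test for ignore boxes; the thresholds are illustrative, and in real code you would build the mmod_options from your training boxes:

    #include <dlib/dnn.h>

    int main()
    {
        dlib::mmod_options options;
        // Count a detection as "overlapping" an ignore box when the ignore
        // region covers at least 95% of it, rather than requiring high IoU.
        options.overlaps_ignore = dlib::test_box_overlap(
            0.5,    // IoU threshold
            0.95);  // percent-covered threshold
    }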
Here are some examples of bad datasets:
Another way you can mess this up is when using the random_cropper to jitter your training data, which is common when training a CNN or other deep model. In general, the random_cropper produces images that are more or less centered on your objects of interest, and it scales the images so each object has some user-specified minimum size. That's all fine. But what can happen is you train a model that gets 0 training error, yet when you run it on new images it doesn't detect any objects. Why is that? It's probably because all the objects in your original images, the ones you give to the random_cropper, are really small: smaller than the minimum size you told the cropper to enforce. So your testing images end up looking very different from your training images. Moreover, object detectors generally have some minimum size they scan for, and objects smaller than that will never be found. A related issue is that your uncropped images might show objects at the very border of the image, but the random_cropper centers objects in its crops, padding with zeros if necessary. Again, make your testing images look like the training images: pad the edges of your images with zeros if needed.
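For reference, here is a hedged sketch of configuring the cropper; the numbers are illustrative, and the point is simply that the minimum object size you set here should match what your detector will actually see at test time:

    #include <dlib/image_transforms/random_cropper.h>

    int main()
    {
        dlib::random_cropper cropper;
        cropper.set_chip_dims(200, 200);      // size of each training crop
        cropper.set_min_object_size(40, 20);  // min longest/shortest side, in pixels
    }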
For example, a HOG detector isn't going to be able to learn to detect human faces that are upright as well as faces rotated 90 degrees. If you wanted to deal with that you would be best off training 2 detectors. One for upright faces and another for 90 degree rotated faces. You can efficiently run multiple HOG detectors at once using the evaluate_detectors function, so it's not a huge deal to do this. Dlib's imglab tool also has a --cluster option that will help you split a training dataset into clusters that can be detected by a single HOG detector. You will still need to manually review and clean the dataset after applying --cluster, but it makes the process of splitting a dataset into coherent poses, from the point of view of HOG, a lot easier.
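Here is a hedged sketch of running several HOG detectors in one pass with evaluate_detectors(); the .svm and image filenames are placeholders for files you would supply:

    #include <dlib/image_processing.h>
    #include <dlib/image_io.h>
    #include <vector>

    int main()
    {
        typedef dlib::scan_fhog_pyramid<dlib::pyramid_down<6>> image_scanner_type;

        // Two detectors trained on different poses.
        dlib::object_detector<image_scanner_type> upright, rotated;
        dlib::deserialize("upright_faces.svm") >> upright;
        dlib::deserialize("rotated_faces.svm") >> rotated;

        std::vector<dlib::object_detector<image_scanner_type>> detectors;
        detectors.push_back(upright);
        detectors.push_back(rotated);

        dlib::array2d<unsigned char> img;
        dlib::load_image(img, "test_image.jpg");

        // HOG features are extracted once and shared by all the detectors.
        std::vector<dlib::rectangle> dets = dlib::evaluate_detectors(detectors, img);
    }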
A related issue arises because HOG is a rigid template: the boxes in your training data all need to have essentially the same aspect ratio. For instance, a single HOG filter can't possibly detect objects that are both 100x50 pixels and 50x100 pixels. To handle both you would need to split your dataset into two parts, objects with a 2:1 aspect ratio and objects with a 1:2 aspect ratio, and then train two separate HOG detectors, one for each aspect ratio.
However, it should be emphasized that even using multiple HOG detectors will only get you so far. So at some point you should consider using a CNN based detection method since CNNs can generally deal with arbitrary rotations, poses, and deformations with one unified detector.
To make this even more complicated, Visual Studio 2017 had regressions in its C++11 support, and all versions of Visual Studio 2017 prior to December 2017 would simply hang if you tried to compile the DNN examples. Happily, the newest versions of Visual Studio 2017 have good C++11 support and compile the DNN code without issue. So make sure your Visual Studio is fully updated.
Finally, note that you should give the -T host=x64 cmake option when generating a Visual Studio project. If you don't, you will get the default Visual Studio toolchain, which runs the compiler in 32bit mode, restricting it to 2GB of RAM and leading to compiler crashes when it runs out of memory. This isn't the 1990s anymore, so you should run your compiler in 64bit mode and let it use your computer's RAM. Giving -T host=x64 lets Visual Studio use as much RAM as it needs.
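For example, an invocation along these lines; the generator name depends on your Visual Studio version:

    cmake -G "Visual Studio 15 2017 Win64" -T host=x64 ..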
Here is an example of one problem it addresses. Since dlib exposes the entire network architecture to the C++ type system we can get automatic serialization of networks. Without this, we would have to resort to the kind of hacky global layer registry used in other tools that compose networks entirely at runtime.
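As a hedged sketch of what that buys you (this tiny network is made up purely for illustration):

    #include <dlib/dnn.h>

    // The whole architecture is spelled out in the type, so
    // serialize()/deserialize() know exactly what to write and read,
    // with no runtime layer registry.
    using net_type = dlib::loss_multiclass_log<
                         dlib::fc<10,
                         dlib::relu<dlib::fc<84,
                         dlib::input<dlib::matrix<float>>>>>>;

    int main()
    {
        net_type net;
        dlib::serialize("net.dat") << net;

        net_type net2;
        dlib::deserialize("net.dat") >> net2;  // the type drives the reading
    }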
Another nice feature is that we get to use C++11 alias template statements to create network sub-blocks, which we can then use to easily define very large networks. There are examples of this in this example program. It should also be pointed out that it takes days or even weeks to train one network. So it isn't as if you will be writing a program that loops over large numbers of networks and trains them all. This makes the time needed to recompile a program to change the network irrelevant compared to the entire training time. Moreover, there are plenty of compile time constructs in C++ you can use to enumerate network architectures (e.g. loop over filter widths) if you really wanted to do so.
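A hedged sketch of such a sub-block (fc_block is a name made up here, not part of dlib):

    #include <dlib/dnn.h>

    // An alias template defining a reusable network building block.
    template <unsigned long N, typename SUBNET>
    using fc_block = dlib::relu<dlib::fc<N, SUBNET>>;

    // Stacking the block keeps large architectures compact to write down.
    using net_type = dlib::loss_multiclass_log<
                         dlib::fc<10,
                         fc_block<84, fc_block<120,
                         dlib::input<dlib::matrix<float>>>>>>;

    int main()
    {
        net_type net;  // the architecture is fully fixed at compile time
    }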
All that said, if you think you have found a compelling use case that isn't supported by the current API, feel free to post a GitHub issue.