diff options
Diffstat (limited to 'doc/wiki/Design.Code.txt')
-rw-r--r-- | doc/wiki/Design.Code.txt | 203 |
1 files changed, 203 insertions, 0 deletions
diff --git a/doc/wiki/Design.Code.txt b/doc/wiki/Design.Code.txt new file mode 100644 index 0000000..5fbddba --- /dev/null +++ b/doc/wiki/Design.Code.txt @@ -0,0 +1,203 @@ +Code Design +=========== + +Generally Dovecot follows Linux kernel coding style +[https://www.kernel.org/doc/Documentation/CodingStyle]. + +Most of the coding style design is about getting as many compiler warnings and +errors as possible. Dovecot already has some patched-Clang-specific features to +get more warnings, and will likely have more in future. Issues found by static +analyzers should also be fixed in some way, e.g. adding extra asserts. + +Runtime errors should be caught by asserts or NULL pointer dereferences, which +cleanly crash the program instead of it continuing and possibly corrupting data +or causing other bad things. Dovecot's master process restarts the crashed +processes anyway. Obviously the master process should be very careful to avoid +crashing, but even then in some OSes a crashed Dovecot master gets restarted by +the init process. + +Dovecot's bottlenecks are primarily disk I/O and secondarily memory usage, so +using extra CPU for asserts and other extra checks costs practically nothing. + +Types +----- + + * Unsigned types are used whenever the value isn't allowed to be negative. + This makes it easier to do "value too large" checks when you don't also have + to check for negative values. Also in arithmetic it's better to have the + value wrap (and hopefully checked later!) than cause undefined behavior with + a signed integer overflow. + * 'char *' always points to a NUL-terminated string, 'unsigned char *' + doesn't. + * 'size_t' should generally be used when pointing to a large memory area, + especially for 'mmap()'. Since 'size_t' can be slower to access than + 'unsigned int' (or at least use memory), it's fine to use 'unsigned int' + when it's "very unlikely" that the size ever goes beyond 4 GB + (e.g.'string_t'). + * 'uoff_t' is used for file offsets/sizes. This is usually 64bit, even with + 32bit machines. + * 'uint32_t' vs. 'unsigned int': Use 'uint32_t' when the type really should be + 32bit, but don't spend too much energy trying to avoid mixing it with + 'unsigned int', since they are going to be the same types probably for the + rest of Dovecot's life.. + * 'uint8_t' vs. 'unsigned char': I doubt Dovecot will ever be compiled + anywhere where these differ from one another, but for readability use + 'uint8_t' for binary data and 'unsigned char' for text data. + +Function parameters +------------------- + + * Try to avoid using/allowing NULL pointers. For example for a public API + instead of having 'foo(struct bar *bar)' where bar can be NULL, you could + have both 'foo(void)' and 'foo_with_bar(struct bar *bar)'. Of course don't + try too hard if it makes the API otherwise ugly. + * Dovecot v2.2+ marks all such parameters with ATTR_NULL. These can be + verified with a patched clang. Unfortunately the API isn't very nice, it + would be better to mark each parameter separately with ATTR_NULL instead + of having a numbered list. This will likely change in future. + * Also unless you know that a parameter can be NULL, don't bother wasting + code on checking if it is. Ideally those are noticed by the patched clang + check, but even if not, it's not that bad to crash on NULL pointer + dereference. + * '_r' suffix in parameter is used for values that are returned (e.g. + 'foo(const char **error_r)'). + * Use const for pointers pretty much whenever possible. + * Some day in future I'd like compiler to give warnings if function parameters + are modified by the function itself. There's probably a lot of code that + does this currently, but try avoiding it in future. + +Function return values +---------------------- + + * 'int' type is commonly used for return values. Usually this is used to mean + one of: + * -1 = error, 0 = ok + * -1 = error, 0 = unfinished (e.g. non-blocking call), 1 = finished + * unfortunately there are also other uses. I've been wondering about using + different types for these some day, such as "err" and "err3", but haven't + figured out an easy enough to use design, especially one where compiler + verifies that types aren't accidentally mixed. + * 'bool' shouldn't be used as return value for ok/error, use int 0/-1 instead. + (Old code has some of these. Sometimes it's also not quite clear if + something is an "error" or not, so this rule isn't perfect.) + * Functions returning pointers shouldn't use NULL pointer to mean an error, + only for things like "not found". For example instead of 'if ((ctx = init()) + == NULL)' use 'if (init(&ctx) < 0)' + * Usually when calling a function, either save/check the return value or + explicitly say that you don't care about it by saying '(void)func();' + * Some functions where the return value can nearly always be ignored it's + annoying to add the '(void)' prefix. You can avoid these by adding + ATTR_NOWARN_UNUSED_RESULT to such function's prototype. A patched clang + can be used to find missing return value checks. + * If a system call fails unexpectedly, always log an error about it, possibly + even kill the process if it's vital for correct functionality + (e.g.'gettimeofday()'). '%m' in Dovecot's 'printf()'-style functions expands + to 'strerror(errno)'. + +Boolean expressions +------------------- + +Try to use boolean expressions the way they work in Java. C doesn't require +this, but I think it makes the code easier to understand and reduces bugs in +some cases (e.g.'if (!foo())' when thinking foo() returns bool/FALSE, but +actually returns int/-1 on error). We've a clang patch +[http://dovecot.org/patches/clang/] to give warnings in these cases. As +expected, it found quite a lot of bugs (some real bugs and a lot of "it just +accidentally worked" +[https://github.com/dovecot/core/commit/d9a7e950a9cd21f2b4a90ec7759fca9e8fcc7995]). + + * 'bool x' and 'bool x:1' are the boolean types + * TRUE and FALSE are the only valid explicit boolean values (not 0 or 1, and + currently also not true/false although that could be changed) + * !=, ==, <, >, etc. comparisons create a boolean + * if, for, while, etc. require a boolean + +So: + + * 'if (!ptr)' -> 'if (ptr == NULL)' + * 'if (!num)' -> 'if (num == 0)' + * 'if (flags & FLAG_FOO)' -> 'if ((flags & FLAG_FOO) != 0)' + +Memory +------ + +Memory is always allocated through one of Dovecot's wrappers, e.g. 'i_malloc()' +or 'i_new()'. All of Dovecot's memory allocations always succeed or kill the +process. There's no point in writing a lot of code to check for memory +allocation failures that happen just about never. The only reason some memory +allocations fail in Dovecot currently is because a process VSZ limit is +reached, which usually indicates either a memory leak or trying to access a +mailbox that is too large. In either of these cases it's better to just +completely restart the process than try to limp along without getting anything +useful done anymore. + +Memory allocations can be assumed to be zero-initialized. All of the memory +allocation functions do it, except 't_malloc()' and 't_buffer_get()', which you +should almost never use directly anyway. The code currently also assumes that +pointers in zero-initialized memory area are NULL, which isn't guaranteed by +ANSI-C, but practically Dovecot isn't going to be run in systems where it's not +true and you're not going to remember to NULL-initialize all of your pointers +anyway without compiler/runtime failure. + +When using a struct, always zero-initialize it with 'memset()' instead of +setting each field separately to 0. It's too easy to cause bugs by adding a new +field to the struct and forgetting to initialize it. + +Double-frees in C are bad. Better to crash with NULL pointer dereference than +possibly allow an attacker to exploit heap corruption and run executable code. +Most of the pointers are set to NULL when they are freed: + + * 'i_free_and_null()' is a macro to free the memory and set the pointer to + NULL. + * 'i_free()' also currently sets the pointer to NULL, but another thought was + to set this pointing to some other invalid pointer. But arguably this should + be defined to be exactly the same as 'i_free_and_null()'. + * Most deinit() functions take a pointer-to-pointer parameter and set the + original one to NULL. There's no need to explicitly set the same pointer to + NULL afterwards. + +Buffers +------- + +Use dynamically growing strings/buffers wherever necessary instead of a static +sized buffer, where on larger input the function fails or truncates the data. +It's of course not good to allow users to infinitely grow memory usage, so +there should be some limits added, but it shouldn't fail even if the limit is +set to infinite. + +Avoid explicitly calculating memory usage for allocations. If you do, mark it +with '/* @UNSAFE */' comment unless it's the calculation is "so obvious that +you see it's correct at the first glance". If in doubt, just mark it UNSAFE. +The idea is that anyone can easily grep for these and verify their correctness. + +Type safety +----------- + + * Try to avoid using void pointers. + * Try to avoid casting types to other types, especially if the cast isn't + necessary to avoid a compiler warning. + +It's better if compiler can give a warning when something is accidentally used +wrong. + +Callback functions +------------------ + +Callback functions make the code more difficult to follow, especially when a +callback calls another callback, or when using function pointers to jump to +different callbacks depending on state. Of course with asynchronous C code it's +pretty much impossible to avoid callbacks. Still, try to avoid them where +possible to keep the code readable. lib-fs/fs-api.h is an example API which +supports async operations but with a single common "do more now" callback +rather than every single operation having its own callback parameter. This +makes it similar to async network IO with read()/write() EAGAIN handling. + +Often callback functions can be avoided by creating iterator functions instead. +For example instead of 'parse(callback, context' use 'ctx = parse_init(); while +(parse_next(ctx)) { .. } parse_deinit(&ctx);' + +Dovecot has some helper macros to make callbacks' context parameters type-safe. +In v2.2+ see 'CALLBACK_TYPECHECK()' macro and for example 'io_add()' for +example usage. + +(This file was created from the wiki on 2019-06-19 12:42) |