summaryrefslogtreecommitdiffstats
path: root/doc/wiki/Design.Code.txt
diff options
context:
space:
mode:
authorDaniel Baumann <daniel.baumann@progress-linux.org>2024-04-28 09:51:24 +0000
committerDaniel Baumann <daniel.baumann@progress-linux.org>2024-04-28 09:51:24 +0000
commitf7548d6d28c313cf80e6f3ef89aed16a19815df1 (patch)
treea3f6f2a3f247293bee59ecd28e8cd8ceb6ca064a /doc/wiki/Design.Code.txt
parentInitial commit. (diff)
downloaddovecot-f7548d6d28c313cf80e6f3ef89aed16a19815df1.tar.xz
dovecot-f7548d6d28c313cf80e6f3ef89aed16a19815df1.zip
Adding upstream version 1:2.3.19.1+dfsg1.upstream/1%2.3.19.1+dfsg1upstream
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'doc/wiki/Design.Code.txt')
-rw-r--r--doc/wiki/Design.Code.txt203
1 files changed, 203 insertions, 0 deletions
diff --git a/doc/wiki/Design.Code.txt b/doc/wiki/Design.Code.txt
new file mode 100644
index 0000000..5fbddba
--- /dev/null
+++ b/doc/wiki/Design.Code.txt
@@ -0,0 +1,203 @@
+Code Design
+===========
+
+Generally Dovecot follows Linux kernel coding style
+[https://www.kernel.org/doc/Documentation/CodingStyle].
+
+Most of the coding style design is about getting as many compiler warnings and
+errors as possible. Dovecot already has some patched-Clang-specific features to
+get more warnings, and will likely have more in future. Issues found by static
+analyzers should also be fixed in some way, e.g. adding extra asserts.
+
+Runtime errors should be caught by asserts or NULL pointer dereferences, which
+cleanly crash the program instead of it continuing and possibly corrupting data
+or causing other bad things. Dovecot's master process restarts the crashed
+processes anyway. Obviously the master process should be very careful to avoid
+crashing, but even then in some OSes a crashed Dovecot master gets restarted by
+the init process.
+
+Dovecot's bottlenecks are primarily disk I/O and secondarily memory usage, so
+using extra CPU for asserts and other extra checks costs practically nothing.
+
+Types
+-----
+
+ * Unsigned types are used whenever the value isn't allowed to be negative.
+ This makes it easier to do "value too large" checks when you don't also have
+ to check for negative values. Also in arithmetic it's better to have the
+ value wrap (and hopefully checked later!) than cause undefined behavior with
+ a signed integer overflow.
+ * 'char *' always points to a NUL-terminated string, 'unsigned char *'
+ doesn't.
+ * 'size_t' should generally be used when pointing to a large memory area,
+ especially for 'mmap()'. Since 'size_t' can be slower to access than
+ 'unsigned int' (or at least use memory), it's fine to use 'unsigned int'
+ when it's "very unlikely" that the size ever goes beyond 4 GB
+ (e.g.'string_t').
+ * 'uoff_t' is used for file offsets/sizes. This is usually 64bit, even with
+ 32bit machines.
+ * 'uint32_t' vs. 'unsigned int': Use 'uint32_t' when the type really should be
+ 32bit, but don't spend too much energy trying to avoid mixing it with
+ 'unsigned int', since they are going to be the same types probably for the
+ rest of Dovecot's life..
+ * 'uint8_t' vs. 'unsigned char': I doubt Dovecot will ever be compiled
+ anywhere where these differ from one another, but for readability use
+ 'uint8_t' for binary data and 'unsigned char' for text data.
+
+Function parameters
+-------------------
+
+ * Try to avoid using/allowing NULL pointers. For example for a public API
+ instead of having 'foo(struct bar *bar)' where bar can be NULL, you could
+ have both 'foo(void)' and 'foo_with_bar(struct bar *bar)'. Of course don't
+ try too hard if it makes the API otherwise ugly.
+ * Dovecot v2.2+ marks all such parameters with ATTR_NULL. These can be
+ verified with a patched clang. Unfortunately the API isn't very nice, it
+ would be better to mark each parameter separately with ATTR_NULL instead
+ of having a numbered list. This will likely change in future.
+ * Also unless you know that a parameter can be NULL, don't bother wasting
+ code on checking if it is. Ideally those are noticed by the patched clang
+ check, but even if not, it's not that bad to crash on NULL pointer
+ dereference.
+ * '_r' suffix in parameter is used for values that are returned (e.g.
+ 'foo(const char **error_r)').
+ * Use const for pointers pretty much whenever possible.
+ * Some day in future I'd like compiler to give warnings if function parameters
+ are modified by the function itself. There's probably a lot of code that
+ does this currently, but try avoiding it in future.
+
+Function return values
+----------------------
+
+ * 'int' type is commonly used for return values. Usually this is used to mean
+ one of:
+ * -1 = error, 0 = ok
+ * -1 = error, 0 = unfinished (e.g. non-blocking call), 1 = finished
+ * unfortunately there are also other uses. I've been wondering about using
+ different types for these some day, such as "err" and "err3", but haven't
+ figured out an easy enough to use design, especially one where compiler
+ verifies that types aren't accidentally mixed.
+ * 'bool' shouldn't be used as return value for ok/error, use int 0/-1 instead.
+ (Old code has some of these. Sometimes it's also not quite clear if
+ something is an "error" or not, so this rule isn't perfect.)
+ * Functions returning pointers shouldn't use NULL pointer to mean an error,
+ only for things like "not found". For example instead of 'if ((ctx = init())
+ == NULL)' use 'if (init(&ctx) < 0)'
+ * Usually when calling a function, either save/check the return value or
+ explicitly say that you don't care about it by saying '(void)func();'
+ * Some functions where the return value can nearly always be ignored it's
+ annoying to add the '(void)' prefix. You can avoid these by adding
+ ATTR_NOWARN_UNUSED_RESULT to such function's prototype. A patched clang
+ can be used to find missing return value checks.
+ * If a system call fails unexpectedly, always log an error about it, possibly
+ even kill the process if it's vital for correct functionality
+ (e.g.'gettimeofday()'). '%m' in Dovecot's 'printf()'-style functions expands
+ to 'strerror(errno)'.
+
+Boolean expressions
+-------------------
+
+Try to use boolean expressions the way they work in Java. C doesn't require
+this, but I think it makes the code easier to understand and reduces bugs in
+some cases (e.g.'if (!foo())' when thinking foo() returns bool/FALSE, but
+actually returns int/-1 on error). We've a clang patch
+[http://dovecot.org/patches/clang/] to give warnings in these cases. As
+expected, it found quite a lot of bugs (some real bugs and a lot of "it just
+accidentally worked"
+[https://github.com/dovecot/core/commit/d9a7e950a9cd21f2b4a90ec7759fca9e8fcc7995]).
+
+ * 'bool x' and 'bool x:1' are the boolean types
+ * TRUE and FALSE are the only valid explicit boolean values (not 0 or 1, and
+ currently also not true/false although that could be changed)
+ * !=, ==, <, >, etc. comparisons create a boolean
+ * if, for, while, etc. require a boolean
+
+So:
+
+ * 'if (!ptr)' -> 'if (ptr == NULL)'
+ * 'if (!num)' -> 'if (num == 0)'
+ * 'if (flags & FLAG_FOO)' -> 'if ((flags & FLAG_FOO) != 0)'
+
+Memory
+------
+
+Memory is always allocated through one of Dovecot's wrappers, e.g. 'i_malloc()'
+or 'i_new()'. All of Dovecot's memory allocations always succeed or kill the
+process. There's no point in writing a lot of code to check for memory
+allocation failures that happen just about never. The only reason some memory
+allocations fail in Dovecot currently is because a process VSZ limit is
+reached, which usually indicates either a memory leak or trying to access a
+mailbox that is too large. In either of these cases it's better to just
+completely restart the process than try to limp along without getting anything
+useful done anymore.
+
+Memory allocations can be assumed to be zero-initialized. All of the memory
+allocation functions do it, except 't_malloc()' and 't_buffer_get()', which you
+should almost never use directly anyway. The code currently also assumes that
+pointers in zero-initialized memory area are NULL, which isn't guaranteed by
+ANSI-C, but practically Dovecot isn't going to be run in systems where it's not
+true and you're not going to remember to NULL-initialize all of your pointers
+anyway without compiler/runtime failure.
+
+When using a struct, always zero-initialize it with 'memset()' instead of
+setting each field separately to 0. It's too easy to cause bugs by adding a new
+field to the struct and forgetting to initialize it.
+
+Double-frees in C are bad. Better to crash with NULL pointer dereference than
+possibly allow an attacker to exploit heap corruption and run executable code.
+Most of the pointers are set to NULL when they are freed:
+
+ * 'i_free_and_null()' is a macro to free the memory and set the pointer to
+ NULL.
+ * 'i_free()' also currently sets the pointer to NULL, but another thought was
+ to set this pointing to some other invalid pointer. But arguably this should
+ be defined to be exactly the same as 'i_free_and_null()'.
+ * Most deinit() functions take a pointer-to-pointer parameter and set the
+ original one to NULL. There's no need to explicitly set the same pointer to
+ NULL afterwards.
+
+Buffers
+-------
+
+Use dynamically growing strings/buffers wherever necessary instead of a static
+sized buffer, where on larger input the function fails or truncates the data.
+It's of course not good to allow users to infinitely grow memory usage, so
+there should be some limits added, but it shouldn't fail even if the limit is
+set to infinite.
+
+Avoid explicitly calculating memory usage for allocations. If you do, mark it
+with '/* @UNSAFE */' comment unless it's the calculation is "so obvious that
+you see it's correct at the first glance". If in doubt, just mark it UNSAFE.
+The idea is that anyone can easily grep for these and verify their correctness.
+
+Type safety
+-----------
+
+ * Try to avoid using void pointers.
+ * Try to avoid casting types to other types, especially if the cast isn't
+ necessary to avoid a compiler warning.
+
+It's better if compiler can give a warning when something is accidentally used
+wrong.
+
+Callback functions
+------------------
+
+Callback functions make the code more difficult to follow, especially when a
+callback calls another callback, or when using function pointers to jump to
+different callbacks depending on state. Of course with asynchronous C code it's
+pretty much impossible to avoid callbacks. Still, try to avoid them where
+possible to keep the code readable. lib-fs/fs-api.h is an example API which
+supports async operations but with a single common "do more now" callback
+rather than every single operation having its own callback parameter. This
+makes it similar to async network IO with read()/write() EAGAIN handling.
+
+Often callback functions can be avoided by creating iterator functions instead.
+For example instead of 'parse(callback, context' use 'ctx = parse_init(); while
+(parse_next(ctx)) { .. } parse_deinit(&ctx);'
+
+Dovecot has some helper macros to make callbacks' context parameters type-safe.
+In v2.2+ see 'CALLBACK_TYPECHECK()' macro and for example 'io_add()' for
+example usage.
+
+(This file was created from the wiki on 2019-06-19 12:42)