style(3) | Library Functions Manual | style(3) |
style
— C language
style guide for Darwin low-level userspace projects
This style's primary objective is to be as friendly to the code review process as possible. Therefore, the style aims to ensure that code changes to the project produce diffs that are
As a secondary objective, this style also aims to make code as clear as possible for an uninitiated programmer reading it. "Clever" syntactic shortcuts are actively discouraged in favor of code that is easy to understand, even if it is less concise. Coincidentally, this practice also tends to lend itself to generating more readable diffs.
Like any style, consistent adherence across a project is a virtue in and of itself, resulting in less distraction for the reader. However, these guidelines should be taken as exactly that: guidelines. No style can be completely adhered to all the time. When you have convinced yourself that a deviation from the style is called for, just make sure it is for the greater good and maintains the style's aims of minimizing diffs and code complexity.
Scrolling vertically has always been easier than scrolling horizontally. Computer mouse manufacturers went so far as to dedicate hardware to the task of scrolling vertically when they came up with scroll wheels. Even on modern trackpads, while it is possible to scroll horizontally, it is far easier to scroll vertically. You just flick upwards. Do not be afraid to introduce extra lines of code if it will result in clearer, more human-parseable diffs when those lines are changed.
Scrolling horizontally is typically awkward, imprecise, and does not lend itself well toward reading on computers or even in print. (Academic journals frequently publish with two narrow columns per page to make reading easier, for example.) Lines should be wrapped consciously; this should not be left to the editor. A soft-wrapping scheme that looks good in your editor may not look good in someone else's editor (or with a different configuration for the same editor).
Just as natural language comments are difficult to read in one, long line, so too are lines of code. Both natural languages and programming languages deserve conscious, deliberate wrapping to improve readability.
Wrap at a column width narrow enough to accommodate side-by-side patch review. 80 is more likely to accommodate this, but 120 might be fine too. Pick a reasonable column width and stick to it. Think about the lines you are wrapping. If you have to wrap a line, do so in a way that is clear, and be willing to make changes to accommodate that (e.g. don't be afraid to declare a variable separately if having the declaration and assignment on the same line causes it to wrap in an unclear way).
Indentation's sole purpose is to indicate scope. You should not use indentation to align characters on two lines of source code (beyond, of course, aligning the first characters of each line if they are both in the same scope).
Given this aspect of the style, it does not particularly matter whether the author chooses spaces or tabs for indentation, and therefore the style makes no prescription (other than to pick one and stick with it).
This style also has another implication: tabs and spaces should never appear in sequence. Each line will be a series of tabs followed by the first character of code. Tabs will never appear after the initial indentation of the line.
Always think of future maintainers and document your thought process for them. Remember, a "future maintainer" might be you after you've forgotten why you did something. For non-trivial changes, you should not rely on linking to a ticket-tracking system to provide context and explanation for a piece of code. You should strive to ensure the reader of your code does not have to context-switch out of reading it in order to understand it.
This is not to say that linking to external resources in code is bad, but if a change's purpose can be reasonably expressed without interrupting how the code reads and flows, just do it. You don't need to publish a whitepaper in comments, but don't just give a link or ticket number with no context.
It is actually very difficult to construct a hash comparison scheme that humans can use without error consistently, and there have been successful social engineering attacks on getting humans to read two hashes that are "close enough" as identical. This means that humans need a lot of help telling the difference between two lines of text.
For any expression, divide it up into fundamental atoms (variable declarations, conditionals, etc.) and then assign each of those atoms to its own line of code. If you do this, when you change a single atom, it is immediately obvious that only that atom changed and nothing else did. The more atoms share lines of code, the more likely it is that changes to them will generate complex diffs that humans will have difficulty understanding.
Assume people will be reading your code in a tool that you do not control and cannot influence. Try viewing your code in such a tool and make sure that it is understandable. If you follow the guidelines of this style, your code may appear different in another viewer (in terms of how many columns are required to display a single line), but its structure will appear identical.
80 columns opens the door for a three-way, side-by-side comparison, but it could be impractical for a number of reasons. 120 columns should provide a nice balance, but really all that matters is that you pick a width and stick to it.
When indenting a continuation line, indent over by two additional tabs. This visually separates the indented line from the next line, which may itself be indented. If there is an operator in the middle of the line, the operator should not be wrapped to the continuation line.
Good
if (some_condition && some_other_condition && yet_another_condition) { exit(0); }
Bad
if (some_condition && some_other_condition && yet_another_condition) { exit(0); } if (some_condition && some_other_condition && yet_another_condition) { exit(0); }
Notice on the good example that the
exit(0)
line is made obviously distinct from the
indented conditions above it. It's very clear on quick visual inspection
that it's not a part of the conditional checks. The
&&
is left on the first line because, when
reviewing a patch to this area, it will be immediately clear to the reviewer
that that line continues to the next one.
Indentation is used only for indicating scope, so no consideration is given to visual alignment of equal signs, colons, etc. across multiple lines.
Good
typedef enum { THING0 = 0, THING1 = 1, THING_THAT_IS_REALLY_LONG = 2, } thing_t;
Bad
enum { THING0 = 0, THING1 = 1, THING_THAT_IS_REALLY_LONG = 2, };
This creates bloated diffs. If you have to re-align a ton of lines after you've added something longer, you get a bunch of whitespace diffs. So for variable declarations, enumerations, assignments, etc. just keep every line independent.
There is one exception to this rule, and that is if you choose to define a flagset in terms of its raw hexadecimal values and wish to align them. In this case, it is a significant benefit to have these values aligned, and you may do so with spaces.
Example
typedef enum { F_INIT = 0x00, F_FOO = 0x01, F_BARBAZ = 0x02, F_CAD = 0x04, F_FAD = 0x08, F_FUD = 0x10, F_FLAME = 0x20, F_FOOD = 0x40, } flag_t;
Use blank lines to separate logical chunks of code. Do not use more than one.
C99 has named initializers for structures. Prefer those to initializing members one-by-one later on. Both structures and arrays should be initialized in the same style, with each element of the initializer being on its own line. This is so that when an element is added to or removed from the initialization list, that change gets its own line of diff.
The exception to this is the string literal.
Good
struct my_struct baz = { .ms_foo = 1, .ms_bar = NULL, }; char *strings[] = { "string", "other string", };
struct my_struct baz = { 1, NULL }; struct my_struct baz = { 1, NULL }; struct my_struct baz = { .ms_foo = 1, .ms_bar = NULL, };
The last element of an initializer list should be followed by a comma. This is so that when you add a new element to that list, it's a one-line diff rather rather than a two-line diff (one line of diff to add the , to the previous-last element, and another line of diff to add the new-last element).
Good
enum { THING0, THING1, }; struct my_point p = { .x = 1, .y = 0, .z = 1, };
Bad
enum { THING0, THING1, }; enum { THING0, THING1 }; struct my_point p = { .x = 1, .y = 0, .z = 1 };
Note that, if your project requires ANSI C compliance, you should disregard this guideline, as it will not work under C89.
The clang(1) compiler supports extensions to the C language which allow for function name overloading. Name overloading generally leads to code which is difficult to read and introspect and should be avoided.
Any struct
which is shared or exported
should have a common prefix for each member. This helps avoid collisions
with preprocessor macros.
Good
struct foobar { int64_t fb_baz; char *fb_string; };
Bad
struct foobar { int64_t baz; char *string; };
A type is indicated with _t
at the end of
the typedef
, whether the type refers to a
struct
, union
,
enum
, etc. All types are indicated this way. Types
are in all lower-case letters.
Good
typedef uint64_t handle_t; typedef enum foo foo_t; typedef union bar bar_t;
Bad
typedef uint64_t Handle; typedef enum foo foo_e; typedef union bar bar_u;
Avoid integer types whose names do not indicate size, such as
int
or long
. Instead, use
the types from stdint.h
(e.g.
int64_t
, uint32_t
, etc.),
which explicitly indicate size. You may use size-ambiguous integer types if
an API requires it.
The sizeof()
operator can take both types
and variables as arguments. Where possible and relevant, always pass a
variable. This ensures that if the variable's type changes, the proper size
is used automatically.
Good
uuid_t ident; memcpy(ident, buff, sizeof(ident));
Bad
uuid_t ident; memcpy(ident, buff, sizeof(uuid_t));
IMPORTANT:
When applied to a char *
,
sizeof()
will return the width of a pointer,
not the size of the string literal it points to, so take
care to only use strlen(3) for such cases.
Relatedly, when applied to an array variable that is a parameter
in a function's parameter list, sizeof()
will return
the width of a pointer, not the size of the type.
Good
char *string = "the quick brown fox"; size_t len = strlen(string); void foo(uuid_t u) { uuid_t u2; memcpy(u2, u, sizeof(uuid_t)); }
Bad
char *string = "the quick brown fox"; size_t len = sizeof(string) - 1; void foo(uuid_t u) { uuid_t u2; // sizeof(u) == sizeof(void *) in this context. memcpy(u2, u, sizeof(u)); }
In C, an empty function parameter list means that
any set of
parameters is acceptable. In virtually all cases where you do this, you mean
to have a parameter list of void
.
Good
void foo(void) { do_a_thing_without_arguments(); }
Bad
void foo() { do_a_thing_without_arguments(); }
Preprocessor definitions are written in all-caps. Macros which are function-like may be lower-case provided they do not double-evaluate their arguments. Function-like macros that do double-evaluate their arguments should be in all-caps.
Good
#define FOO 1 #define halt() abort() // Does not double-evaluate _a and _b such that max(i++, j) is safe. #define max(_a, _b) ({ \ typeof(_a) a1 = (_a); \ typeof(_b) b1 = (_b); \ (a1 < b1 ? b1 : a1); \ }) // Double-evaluates _a and _b, so MAX(i++, j) is not safe. #define MAX(_a, _b) ((_a) < (_b) ? (_b) : (_a))
Bad
#define kFoo 1 // Double-evaluates _a and _b. #define max(_a, _b) ((_a) < (_b) ? (_b) : (_a))
Where possible, you should prefer inline functions to preprocessor macros, or split a macro into a preprocessor piece and an inline function piece.
Example
static inline void _debug_uint64_impl(const char *name, uint64_t val) { fprintf(stderr, "%s = %llu\n", name, val); } #define debug_uint64(_v) do { \ _debug_uint64_impl(#_v, _v); \ } while (0)
In this example, the preprocessor is used to do something that only the preprocessor can do: stringify the input variable's name. But once that work is done, the actual logging of the value is done by an inline function. This keeps the code much more readable.
Preprocessor macro expansion can run afoul of naming collisions
with other variables that are in the same scope as the macro being expanded.
To help avoid such collisions, parameters to preprocessor macros should have
a prefix, suffix or both. Another good option is to use a
_[A-Z]
prefix, since it is reserved by the C
standard and will not collide with preprocessor evaluation.
Example
#define foo2(_x_) ((_x_) * 2) #define foo4(_x) ((_x) * 4) #define foo8(_X) ((_X) * 8)
When passing a parameter to a preprocessor macro, it should always be referred to within parentheses to force evaluation. The exception is for parameters intended to be string literals.
Good
#define add2(__x) ((__x) + 2) #define println(__fmt, ...) printf(__fmt "\n", ## __VA_ARGS__)
Bad
#define add2(__x) x + 2
Preprocessor directives do not have scope, and therefore they always start at column zero.
Good
if (do_loop) { for (i = 0; i < 10; i++) { #if CONFIG_FOOBAR foobar(i); #else foobaz(i); #endif } }
Bad
if (do_loop) { for (i = 0; i < 10; i++) { #if CONFIG_FOOBAR foobar(i); #else foobaz(i); #endif } }
Do not hard-code the size of a string. Use either
sizeof(str) - 1
or
strlen(str)
. In the latter case,
clang(1) is smart enough to recognize when a constant
string is being passed to strlen(3) and replace the
function call with the string's actual length.
Good
char str[] = "string"; frob_string(str, sizeof(str) - 1); frob_string(str, strlen(str));
Bad
char str[] = "string"; frob_string(str, 6);
If you control all call sites for a function, then there is no
point to validating the inputs to that function. If none of your call sites
pass NULL
, to a pointer parameter, for example, then
the a NULL
input indicates that there is state
corruption in your address space. You may think that it's good to catch such
corruption, but NULL
is just
one possible
invalid pointer value. What if the invalid input is
0x1
? What if it is 0x2
?
Should you also check for those?
This kind of input checking complicates code. Because it indicates state corruption, the only sensible thing to do in that situation would be to abort. But the operating system has mechanisms in place to detect the reference of an invalid resource, such as virtual memory and use-after-free detection. There is no point to you duplicating these mechanisms.
Of course, you should always validate inputs when they come from untrusted external sources (such as a file or IPC message), but if the inputs only ever comes from your program, you should trust them.
Good
static foo_t * get_item(foo_t *arr, size_t idx) { return &arr[idx]; } int only_call_site(foo_t *f) { foo_t *arr = calloc(10, sizeof(arr[0])); if (!arr) { return errno; } *f = get_item(arr, 0); return 0; }
Bad
static foo_t * get_item(foo_t *arr, size_t idx) { if (!arr) { // No point to this check since we'll abort immediately below when we // try to dereference `arr`. The crash report will have more than enough // information to diagnose the NULL pointer dereference if it ever // happens. abort(); } return &arr[idx]; } int only_call_site(foo_t *f) { foo_t *arr = calloc(10, sizeof(arr[0])); if (!arr) { return errno; } *f = get_item(arr, 0); return 0; }
The C language provides precious few compile-time validation
mechanisms, and so in many cases it is not possible to fully describe to the
compiler the range of expected inputs for an API. So your API should
validate input from its caller and abort on invalid input. Returning an
error in such a case is pointless, since the caller probably isn't checking
the return code anyway. The only sure way to get the programmer's attention
is to abort the calling process with a helpful message. The
os_crash
routine allows you to supply such a message
that the crash reporter on Darwin will display in its crash report.
Good
uint8_t foo_a_bar(uint8_t number) { if (number > (UINT8_MAX / 2)) { os_crash("number given to foo_a_bar() too large"); } return (number * 2); }
Bad
int foo_a_bar(uint8_t number, uint8_t *new_number) { if (number > (UINT8_MAX / 2)) { return EINVAL; } *new_number = (number * 2); return 0; }
Some POSIX routines have return values that indicate whether you
should check errno
, and others just return the error
directly. While POSIX generally documents what does what pretty well, there
are lots of SPIs scattered around the system that use both conventions and
aren't documented at all, leaving you to spelunk through the implementation
to find out what's what.
To avoid confusion, do not re-use the same variable for the return
codes from these functions. If an API returns a code indicating that you
should check errno
, name it
ret
or similar. If it returns the error directly,
name it error
or similar and make it of type
errno_t
. This makes it very clear to the person
reading the code that you did the work to find out how the API worked. By
naming the variable you store the return value in appropriately, a reader of
your code (possibly Future You) can immediately know what's going on.
If you are making new API or SPI that returns an error code, make
it return errno_t
and do not use the global
errno
for communicating error information.
Good
#include <sys/types.h> errno_t error = posix_spawn(NULL, "ls", NULL, NULL, argv, envp); switch (error) { case 0: // Handle success. break; case EACCES: // Handle "permission denied". break; } int ret = reboot(RB_AUTOBOOT); if (ret == -1) { switch (errno) { case EPERM: // Handle "permission denied". break; case EBUSY: // Handle "reboot already in progress". break; } }
Bad
int ret = posix_spawn(NULL, "ls", NULL, NULL, argv, envp); switch (error) { case 0: // Handle success. break; case EACCES: // Handle "permission denied". break; } int error = reboot(RB_AUTOBOOT); if (error == -1) { switch (errno) { case EPERM: // Handle "permission denied". break; case EBUSY: // Handle "reboot already in progress". break; } }
Breaking up a single complex if
statement
into multiple distinct checks is both more readable and makes it possible to
be more granular about handling failure cases. It also leads to smaller
diffs if one of those conditions turns out to require special handling.
Complex if
statements are often associated
with input validation and just returning an error code (usually
EINVAL
) if any input is invalid. While deciding
which error to return in which case is more of an art than a science, that
doesn't mean you should just give up and return a single error every time
there isn't an immediately obvious fit to the case you've encountered.
Ideally, every case where your routine may fail should
be represented by a distinct error code, but this is often not practical.
Still, you should attempt to distinguish each
likely failure
case with its own error code. The POSIX error space is fairly rich, and
error descriptions are brief enough that they can be liberally interpreted.
For example, ESRCH
can be used to apply to any
situation where a resource could not be located, not just conditions where
there is literally "No such process".
This isn't to say that you should never have compound conditions
in an if
statement, but the groupings should almost
always be small, and the grouped checks should be highly likely to require
change as a group when change is needed.
Good
if (foo->f_int > 10 || foo->f_int < 5) return ERANGE; } if (!foo->f_uaddr) { return EFAULT; } if (foo->f_has_idx && foo->f_idx > 100) { return ERANGE; } if (foo->f_state != FS_INITIALIZED) { return EBUSY; }
Bad
if (foo->f_int > 10 || foo->f_int < 5 || !foo->f_uaddr || (foo->f_has_idx && foo->f_idx > 100) || foo->f_state != FS_INITIALIZED) { return EINVAL; }
See intro(2),
<sys/errno.h>
, and
<os/error.h>
for the error codes supported on
Darwin.
NULL
is valid input to
free(3). It's part of the API contract. Armed with this
knowledge, you can do things like avoid conditional memory calls, which are
always weird.
Good
char buff[1024]; char *ptr = buff; char *what2free = NULL; if (condition) { ptr = malloc(8); what2free = ptr; } free(what2free);
Bad
char buff[1024]; char *ptr = buff; bool did_malloc = false; if (condition) { ptr = malloc(8); did_malloc = true; } if (did_malloc) { free(ptr); }
Any non-exported symbols should be prefixed with a
_
. Thus any static
functions, project-local interfaces, etc. should have this prefix. Exported
symbols (API or SPI) should not have such a prefix.
Good
static const char *_thing = "thing"; static void _foo(void); void _project_local_interface(void);
static const char *thing = "thing"; static void foo(void); void project_local_interface(void);
Global variables should have a sensible prefix, preferably related
to the project name -- e.g. globals in the libxpc(3)
project are prefixed with xpc_
.
You may also consider declaring a global structure which contains all of your project's shared, unexported global state. This makes it very clear when code is referencing that state. Also, if your project is a library at the libSystem layer, this is required if you are ever to adopt os_alloc_once(3).
Example
typedef struct _foobar_globals { uint64_t fg_global_int; char *fg_global_string; } foobar_globals_t; foobar_globals_t _g; foobar_globals_t *g = &_g;
Sometimes projects must create bespoke SPIs for one particular
caller, and these SPIs are not considered suitable for general use. Append a
suffix to these SPIs to indicate their bespokeness and the intended caller
with _4caller
. For example, if you add an SPI
specifically for IOKit, your suffix would likely be
_4IOKit
.
#ifdef
is to check if a token is
defined at
all to anything. #if
is to check the token's
value. The C standard specifies that when a token is undefined,
#if
will evaluate it as 0
.
When checking for features, it's almost always more appropriate to use
#if
since the lack of a feature could still be
communicated by setting the token's value to 0
,
which would pass the #ifdef
check.
If you're on Darwin, libplatform
defines a
lot of nice macros for compiler attributes. Use them to decorate your
functions. This gives the compiler lots more information so it can do fancy
optimizations. Things like OS_NONNULL
let the
compiler know that a parameter should never be NULL
.
OS_WARN_RESULT
is great for enforcing that a caller
always check the return value of a function.
OS_MALLOC
lets the compiler know that the
function returns a heap allocation, and
OS_OBJECT_RETURNS_RETAINED
lets ARC know that the
function returns an object with a reference that the caller is responsible
for releasing.
You can avoid having to decorate all your pointer
parameters by using OS_ASSUME_NONNULL_BEGIN
and
OS_ASSUME_NONNULL_END
and specifically annotating
variables which can
be NULL
with the _Nullable
keyword. Either approach is acceptable.
Generally, use these attributes on functions which will have
callers who cannot view the implementation. Putting many of these attributes
on (for example) an inline function is harmless, but the compiler can reason
about things like OS_NONNULL
and infer it when it
can view the implementation at all call sites.
So as a rule of thumb, if it's in a header, decorate it
appropriately. These attributes can also serve as nice implicit
documentation around API and SPI. For example, if you have a decoration of
OS_NONNULL1
, you don't have to spell out in the
HeaderDoc that you can't pass NULL
for that
parameter; it'll be right there in the declaration, and the compiler will
catch attempts to do so.
In C, make the definition of a function findable and
distinguishable from its declaration (if any) through regular expressions.
This way, you can find the implementation of foo
by
doing a regex search for ^foo
, and you won't get the
declaration as a result.
Good
static int foo(int bar); int foo(int bar) { return bar; }
Bad
static int foo(int bar); int foo(int bar) { return bar; }
This has the additional benefit of allowing you to change the name/parameter list of a function independently of the return type. A diff of either will not be confused with the rest of the function signature.
Make them look nice. Include the appropriate decorations
(including an explicit export attribute such as
OS_EXPORT
so it's very, very clear that it's
intended to be API), availability attributes, and HeaderDoc. Put this stuff
before the function.
Example
/*! * @function foo * Returns `bar` and ignores another parameter. * * @param bar * The value to return. * * @param baz * The value to ignore. * * @result * The value of `bar`. This routine cannot fail. */ __API_AVAILABLE(macos(10.14), ios(12.0), tvos(12.0), watchos(5.0)) OS_EXPORT OS_WARN_RESULT OS_NONNULL2 int foo(int bar, char *baz);
In general, use C++/C99-style comments. But there may be good reasons to use the classic C-style comments, such as for HeaderDoc, which requires them, e.g.
/*! * Documentation */
Long, top-level comments may also use classic C-style comments.
C++/C99-style comments may directly follow code on the same line only if they are extremely brief. Otherwise, in general, comments and code should not share a line.
Also, do not get cute with /* */
comments
and embed them within code.
Good
// Comment on what the loop does. for (i = 0; i < cnt; i++) { // some code... } /* * A top-level or very long comment. */ int ret = esoteric_spi(); // returns -1 on failure, does not set errno
Bad
//Comment int ret = esoteric_spi(); // This SPI returns -1 on failure but does not set // errno, so here is a comment explaining that that really should be above // the line of code rather than immediately following it. foo(arg1, /* first argument */, arg2 /* second argument */);
case
and switch
belong at the same column indent because indentation indicates scope, and
due to case fall-through, all cases are in the same scope -- one lower than
the previous. (Unless you scope them explicitly with braces, but you should
avoid doing that if at all possible.)
Good
switch (thing) { case THING1: exit(0); break; case THING2: exit(1); break; default: __builtin_unreachable(); }
Bad
switch (thing) { case THING1: { exit(0); break; } case THING2: { exit(1); break; } default: __builtin_unreachable(); } switch (thing) { case THING1: exit(0); break; case THING2: exit(1); break; default: { __builtin_unreachable(); } }
If you're declaring an enum
, you should
typedef
it so the compiler can reason about valid
values and know the width of the enum
type if
possible. The OS_ENUM
macro provides the correct
behavior for C, C++, and Objective-C.
If you pre-declare a variable before using it, initialize it to a sane value. If this value is something like the return value of the function, initialize it to a value which indicates failure of the operation. You should always do this even if there are no code paths which fail to initialize the variable later. It's just good practice, and it gives the person reading your code an indication of what ranges of values the variable is expected to hold.
Good
int result = -1; if (success) { result = 0; }
Bad
int result; if (success) { result = 0; }
Any error variable should always be initialized to a non-success condition. In general, consider success as something that your code must explicitly declare and that the absence of such a declaration indicates failure.
Good
int error = -1; if (is_root()) { error = 0; } else { error = EPERM; }
Bad
int error = 0; if (!is_root()) { error = EPERM; }
Note that you may omit an initializer for a complex
struct
type (such as the stat(2)
struct
) but then it is incumbent upon you to ensure
that that variable is not used uninitialized except to populate it. For many
struct
types, you can initialize them with
{0}
. This will not work for structures with nested
structures though. For those you can use bzero(3) or
similar.
goto
has gotten a bad rap, but it's
probably the best way in C to do lots of sequential error handling. You
don't have
to use goto
if you don't want to, but if you do,
just keep a a couple things in mind.
-Wsometimes-uninitialized
. With this
warning, clang(1) will catch cases where a variable may
be used uninitialized because a goto
skipped the
initialization.goto
as a looping construct. The C
language has a few different control statements for looping and iteration.
Use one of those; it's not the 70's anymore.These guidelines make it simple to use
goto
effectively while avoiding the most common
pitfalls.
Sometimes you have to pass a parameter to a function to trigger some sort of behavior. Avoid using a magic Boolean for these cases. Instead, use an invariant that describes the behavior you are triggering.
Good
replace_spaces(string, REPLACE_TABS_TOO); replace_spaces(string, REPLACE_ONLY_SPACES);
Bad
replace_spaces(string, true); replace_spaces(string, false);
If you find yourself creating many such Boolean values for function parameters, you should seriously considering defining a set of flags and passing that as one parameter instead.
In general, avoid code that looks crunched together, especially around operators. Specifically:
Good
i++; j = i + k; k += condition ? i : j;
Bad
i ++; j=i+k k+=condition?i:j;
Don't use the ternary operator to choose between complex or long
expressions. Reserve it for very trivial cases that are highly unlikely to
change. In general if you've found yourself putting the expressions in your
usage of ternary operator on multiple lines, you should just be using an
if
statement.
Good
i += condition ? j : k;
Bad
i += (i < j && j > k || i == j) ? foo(bar, baz, 0, NULL) : frob(bar, 0, NULL, baz);
Good
if (condition) { do_thing(); }
Bad
if(condition) { do_thing (); } if ( condition ) { do_thing ( argument ); }
Worse
while( condition) { do_thing( ); }
Always, always, always use braces for your control statements. Lack of braces can and has led to serious security issues that were missed during code review, and putting the braces there from the start means that adding new statements to that clause does not require you to also add the braces.
The clause should be indented on the next line with no blank lines in between.
Good
if (condition) { do_thing(); } while (condition) { do_thing(); }
Bad
if (condition) do_thing(); if (condition) do_thing(); while (condition) do_thing(); while (condition) { do_thing(); }
Even trivial uses of braceless if
statements are problematic. Consider the following:
Bad
if (error) i++, i++;
This is admittedly contrived, but it would be likely to escape
code review because it's very easy to miss that the first line ends with a ,
rather than a ;. Braces in if
statements are
sensitive enough to security that the best policy is to simply always use
them, without exception.
Specific rules for braces:
else
goes between two braces on the same line.struct
,
union
, enum
, etc. goes on
the same line as the declaration.Good
if (condition) { do_thing(); } else { do_other_thing(); } void function(void) { return; } struct my_struct { uint32_t thing; }; for (cur; cur; cur = cur->next) { }
Bad
if (condition) { do_thing(); } else { do_other_thing(); } if (condition) { do_thing(); } else do_other_thing(); void function(void) { return; } struct my_struct { uint32_t thing; }; for (cur; cur; cur = cur->next)
Worse
if (condition) { do_thing(); } void function(void) { return; }
This style was largely derived from the style that evolved through the launchd(8), libdispatch(3), and libxpc(3) projects.
12 January, 2018 | Darwin |