6. White Space
o, world!\n",'/'/'/'));}read(j,i,p){write(j/p+p,i---j,i/i);}
- Dishonorable mention, Obfuscated C Code Contest, 1984. The author requested anonymity.
Use vertical and horizontal white space generously. Indentation and spacing should reflect the block structure of the code. For example, there should be at least two blank lines between the end of one function and the comment for the next.
A long conditional expression should be split across several lines.
if (foo->next==NULL && totalcount<needed && needed<=MAX_ALLOT
    && server_active(current_input)) { ...
might be better as
if (foo->next == NULL
    && totalcount < needed && needed <= MAX_ALLOT
    && server_active(current_input))
{
    ...
Similarly, elaborate for loops should be split onto separate lines.
for (curr = *listp, trail = listp;
    curr != NULL;
    trail = &(curr->next), curr = curr->next )
{
    ...
Other complex expressions, particularly those using the ternary ?: operator, are best split across several lines, too.
c = (a == b)
    ? d + f(a)
    : f(b) - d;
Keywords that are followed by expressions in parentheses should be separated from the left parenthesis by a blank (the sizeof operator is an exception). Blanks should also appear after the commas in argument lists to help separate the arguments visually. Macro definitions with arguments, on the other hand, must not have a blank between the name and the left parenthesis; otherwise the C preprocessor will not recognize the argument list.
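The macro rule is easy to get wrong, so here is a minimal sketch of it. The names SQUARE, BROKEN_SQUARE, and sum_of_squares are illustrative only and do not come from the text; the function uses an ANSI prototype rather than the older definition style shown elsewhere in this guide.

```c
/* Function-like macro: no blank between the name and '('. */
#define SQUARE(x) ((x) * (x))

/*
 * With a blank after the name, the preprocessor would instead see an
 * object-like macro whose replacement text is "(x) ((x) * (x))":
 *
 *     #define BROKEN_SQUARE (x) ((x) * (x))
 *
 * BROKEN_SQUARE(3) would then expand to "(x) ((x) * (x))(3)",
 * which is not the intended call.
 */

/* Keyword followed by a blank; arguments separated by ", ". */
int
sum_of_squares(int a, int b)
{
    if (a < 0 || b < 0) {
        return -1;
    }
    return SQUARE(a) + SQUARE(b);
}
```

Called as sum_of_squares(3, 4), the sketch yields 25. The sizeof exception mentioned above would be written sizeof(int), with no blank.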
7. Example
/*
 * Determine if the sky is blue by checking that it isn't night.
 * CAVEAT: Only sometimes right. May return TRUE when the answer
 * is FALSE. Consider clouds, eclipses, short days.
 * NOTE: Uses 'hour' from ''. Returns 'int' for
 * compatibility with the old version.
 */
int /* true or false */
skyblue()
{
    extern int hour; /* current hour of the day */

    return (hour >= MORNING && hour <= EVENING);
}
/*
 * Find the last element in the linked list
 * pointed to by nodep and return a pointer to it.
 * Return NULL if there is no last element.
 */
node_t *
tail(nodep)
node_t *nodep; /* pointer to head of list */
{
    register node_t *np; /* advances to NULL */
    register node_t *lp; /* follows one behind np */

    if (nodep == NULL)
        return (NULL);
    for (np = lp = nodep; np != NULL; lp = np, np = np->next)
        ; /* VOID */
    return (lp);
}
8. Simple statements
There should be only one statement per line unless multiple statements are particularly closely related.
case FOO: oogle (zork); boogle (zork); break;
case BAR: oogle (bork); boogle (zork); break;
case BAZ: oogle (gork); boogle (bork); break;
The empty body of the for or while loop statement should be placed separately on a line and commented, so that it can be clearly seen that the empty body is intentional and not missing code.
while (*dest++ = *src++)
    ; /* VOID */
Do not rely on the implicit test for non-zero; that is,
if (f() != FAIL)
is better than
if (f())
even though FAIL may have the value 0, which C considers to be false. An explicit test will help you out later when somebody decides that a failure return should be -1 instead of 0. Explicit comparison should be used even if the comparison value will never change; for example
if (!(bufsize % sizeof(int)))
It should be written as
if ((bufsize % sizeof(int)) == 0)
This reflects the numeric (not boolean) nature of the test. A frequent trouble spot is using strcmp to test for string equality, where the result should never be left to the implicit test. The preferred approach is to define a macro STREQ.
#define STREQ(a, b) (strcmp((a), (b)) == 0)
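A short sketch of why the macro helps: strcmp returns 0 when the strings match, so an implicit test reads backwards. The function parse_reply is a hypothetical caller written for this illustration.

```c
#include <string.h>

#define STREQ(a, b) (strcmp((a), (b)) == 0)

/* Returns 1 for "yes", 0 for "no", and -1 for anything else. */
int
parse_reply(const char *reply)
{
    /*
     * A bare "if (strcmp(reply, ...))" would be true when the strings
     * DIFFER, because strcmp returns 0 on equality.
     */
    if (STREQ(reply, "yes")) {
        return 1;
    } else if (STREQ(reply, "no")) {
        return 0;
    } else {
        return -1;
    }
}
```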
The implicit non-zero test is often acceptable for predicates or other expressions that meet the following restrictions:
0 means false, and everything else is true.
The name makes the meaning of a "true" return absolutely obvious.
Call a predicate isvalid or valid, not checkvalid.
It is common practice to declare a boolean type "bool" in a global header file. The special names improve readability immensely.
typedef int bool;
#define FALSE 0
#define TRUE 1
or
typedef enum { NO=0, YES } bool;
Even with these declarations, do not check a boolean value for equality with 1 (TRUE, YES, etc.); instead test for inequality with 0 (FALSE, NO, etc.). Most functions are guaranteed to return 0 if false, but only non-zero if true. Thus,
if (func() == TRUE) { ...
must be written
if (func() != FALSE) { ...
It is even better, where possible, to rename the function or variable, or to rewrite the expression, so that the meaning is obvious without a comparison to true or false (e.g., rename to isvalid()).
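The difference between the two comparisons shows up as soon as a predicate returns a non-zero value other than 1. In this sketch has_flag, check_fragile, and check_robust are hypothetical names chosen for the illustration.

```c
#define FALSE 0
#define TRUE  1

/*
 * Hypothetical predicate: returns the matching bit (8 when set),
 * not 1. Callers may rely only on "non-zero means true".
 */
int
has_flag(unsigned flags)
{
    return (int)(flags & 0x08u);
}

/* Fragile: compares the result with TRUE (1). */
int
check_fragile(unsigned flags)
{
    return has_flag(flags) == TRUE;
}

/* Robust: compares the result with FALSE (0). */
int
check_robust(unsigned flags)
{
    return has_flag(flags) != FALSE;
}
```

With the flag set, check_fragile reports false because 8 is not equal to 1, while check_robust reports true.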
There is a time and a place for embedded assignment statements. In some constructs there is no better way to accomplish the result without making the code bulkier and less readable.
while ((c = getchar()) != EOF) {
    process the character
}
The ++ and -- operators count as assignment statements. So, for many purposes, do functions with side effects. Embedded assignment statements can also be used to improve run-time performance. However, one should weigh the increased speed against the decreased maintainability that results when embedded assignments are used in artificial places. For example,
a = b + c;
d = a + r;
should not be replaced by
d = (a = b + c) + r;
even though the latter may save one cycle. In the long run the time difference between the two will decrease as optimizers mature, while the difference in ease of maintenance will increase as the human memory of what the second form is doing begins to fade.
Goto statements should be used sparingly, as in any well-structured code. The main place where they can be usefully employed is to break out of several levels of switch, for, and while nesting, although the need to do such a thing may indicate that the inner constructs should be broken out into a separate function with a success/failure return code.
for (...) {
    while (...) {
        ...
        if (disaster)
            goto error;
    }
}
...
error:
    clean up the mess
When a goto is necessary, the accompanying label should be alone on a line, and the code that follows it should be indented one level deeper than the label. The goto should be commented (possibly in the block header) as to its utility and purpose. Continue should be used sparingly and near the top of the loop. Break is less troublesome.
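Both options mentioned above, the goto out of nested loops and the inner loop split into its own function, can be sketched side by side. The grid dimensions and the names check_grid_goto, row_ok, and check_grid_split are assumptions made for this example.

```c
#include <stddef.h>

#define ROWS 3
#define COLS 4

/*
 * Version with goto: leaves both loops as soon as a negative
 * entry is seen. Returns 0 on success, -1 on failure.
 */
int
check_grid_goto(int grid[ROWS][COLS])
{
    size_t r, c;

    for (r = 0; r < ROWS; r++) {
        for (c = 0; c < COLS; c++) {
            if (grid[r][c] < 0)
                goto error;     /* leave both loops at once */
        }
    }
    return 0;

error:
    return -1;
}

/* Alternative: the inner loop becomes a function with a status. */
static int
row_ok(const int row[COLS])
{
    size_t c;

    for (c = 0; c < COLS; c++) {
        if (row[c] < 0)
            return 0;
    }
    return 1;
}

int
check_grid_split(int grid[ROWS][COLS])
{
    size_t r;

    for (r = 0; r < ROWS; r++) {
        if (!row_ok(grid[r]))
            return -1;
    }
    return 0;
}
```

Both versions return the same result for any grid; the second avoids the goto at the cost of one extra function.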
Parameters to non-prototyped functions sometimes need to be promoted explicitly with a cast. If, for example, a function expects a 32-bit long and is passed a 16-bit int instead, the stack can get misaligned. The problem occurs with pointer, integral, and floating-point values.
9. Compound statements
A compound statement is a list of statements enclosed in braces. There are many common ways of formatting the braces. Be consistent with your local standard if you have one, or pick a style and use it consistently. When editing someone else's code, always use the style used in that code.
control {
    statement;
    statement;
}
The style above is called "K&R style" and is preferred if you do not already have a favorite. In K&R style, the else part of an if-else statement and the while part of a do-while statement appear on the same line as the closing brace. In most other styles, the braces are always alone on a line.
When a block of code has several labels, each label is placed on a separate line. The fall-through feature of the C switch statement (that is, when there is no break between a code segment and the next case statement) must be commented for future maintenance. A lint-style comment/directive is best.
switch (expr) {
case ABC:
case DEF:
    statement;
    break;
case UVW:
    statement;
    /*FALLTHROUGH*/
case XYZ:
    statement;
    break;
}
Here, the last break is technically unnecessary, but it is required because it prevents a fall-through error if another case is later added after the last one. The default case, if used, should be last and does not require a break if it is last.
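A compilable version of the same layout makes the fall-through visible in the results. The function steps_for and the step values are made up for this sketch.

```c
enum { ABC, DEF, UVW, XYZ };

/* Hypothetical dispatcher: returns a tally of the code that ran. */
int
steps_for(int expr)
{
    int steps = 0;

    switch (expr) {
    case ABC:
    case DEF:
        steps += 1;
        break;
    case UVW:
        steps += 10;
        /*FALLTHROUGH*/
    case XYZ:
        steps += 100;
        break;
    default:
        steps = -1;
    }
    return steps;
}
```

UVW yields 110 because it runs its own statement and then falls into XYZ; XYZ alone yields 100.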
Whenever an if-else statement has a compound statement for either the if or the else section, the statements of both sections should be enclosed in braces (called fully bracketed syntax).
if (expr) {
    statement;
} else {
    statement;
    statement;
}
Braces are also essential in if-if-else sequences with no second else, such as the following, which will be parsed incorrectly if the brace after (ex1) and its mate are omitted:
if (ex1) {
    if (ex2) {
        funca();
    }
} else {
    funcb();
}
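The mis-parse can be observed directly. In this sketch unbraced and braced are hypothetical functions that differ only in their braces; many compilers will warn about the first one, which is the point.

```c
/*
 * Without braces the else binds to the nearest if (ex2), despite
 * what the indentation suggests. Returns 'a', 'b', or '-'.
 */
char
unbraced(int ex1, int ex2)
{
    char result = '-';

    if (ex1)
        if (ex2)
            result = 'a';
    else                    /* actually pairs with if (ex2) */
        result = 'b';
    return result;
}

/* Fully bracketed: the else pairs with if (ex1) as intended. */
char
braced(int ex1, int ex2)
{
    char result = '-';

    if (ex1) {
        if (ex2) {
            result = 'a';
        }
    } else {
        result = 'b';
    }
    return result;
}
```

The two functions disagree whenever ex2 is false: braced(0, 0) gives 'b', but unbraced(0, 0) gives '-'.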
An if-else with else if should be written with the else conditions left-justified.
if (STREQ (reply, "yes")) {
    statements for yes
    ...
} else if (STREQ (reply, "no")) {
    ...
} else if (STREQ (reply, "maybe")) {
    ...
} else {
    statements for default
    ...
}
This format looks like a generalized switch statement, and the indentation reflects a choice of exactly one of several alternatives rather than a nesting of statements.
Do-while loops should always have braces around the body.
The following code is very dangerous:
#ifdef CIRCUIT
# define CLOSE_CIRCUIT(circno) { close_circ(circno); }
#else
# define CLOSE_CIRCUIT(circno)
#endif
...
if (expr)
    statement;
else
    CLOSE_CIRCUIT(x)
++i;
Note that on systems where CIRCUIT is not defined, the statement ++i; is executed only when expr is false! This example points out both the value of naming macros in capitals and the value of making code fully bracketed.
Sometimes an if causes an unconditional control transfer via break, continue, goto, or return. The else should then be implicit and the code should not be indented.
if (level > limit)
    return (OVERFLOW);
normal();
return (level);
The flattened indentation tells the reader that the boolean test is invariant over the rest of the enclosing block.
10. Operators
Unary operators should not be separated from their single operand. Generally, all binary operators except '.' and '->' should be separated from their operands by blanks. Some judgement is called for in the case of complex expressions, which may be clearer if the inner operators are not surrounded by blanks and the outer ones are.
If you think an expression will be hard to read, consider breaking it across lines. Splitting at the lowest-precedence operator near the break is best. Since C has some unexpected precedence rules, expressions involving mixed operators should be parenthesized. Too many parentheses, however, can make a line harder to read, because humans are not good at matching parentheses.
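One of the unexpected precedence rules is that == binds more tightly than &. The sketch below uses a hypothetical MASK and two hypothetical functions to show the effect; compilers commonly warn about the second one.

```c
#define MASK 0x0Fu

/* Intended test: are the low four bits of `flags' all clear? */
int
low_bits_clear(unsigned flags)
{
    return (flags & MASK) == 0;
}

/*
 * Same text without the parentheses. Because == binds tighter
 * than &, this is flags & (MASK == 0), i.e. flags & 0: always 0.
 */
int
low_bits_clear_wrong(unsigned flags)
{
    return flags & MASK == 0;
}
```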
The binary comma operator can be useful, but it should generally be avoided. It is most useful for providing multiple initializations or operations, as in for statements. Complex expressions, for instance those with nested ternary ?: operators, can be confusing and should be avoided if possible. There are some macros, such as getchar, where both the ternary operator and the comma operator are useful. The logical expression operand before the ?: should be parenthesized, and both return values must be of the same type.
11. Naming Conventions
Individual projects will no doubt have their own naming conventions. There are some general rules, however.
1). Names with leading or trailing underscores are reserved for system purposes and should not be used for any user-created names. Most systems use them for names that the user should not have to know. If you must have your own private identifiers, begin them with a letter or two identifying the package to which they belong.
2). #define constants should be in all capitals.
3). Enum constants should be capitalized or in all capitals.
4). Function, typedef, and variable names, as well as struct, union, and enum tag names, should be in lower case.
5). Many "macro functions" are in all capitals. Some macros (such as getchar and putchar) are in lower case since they may also exist as functions. Lower-case macro names are acceptable only if the macro behaves like a function call, that is, it evaluates its parameters exactly once and does not assign values to named parameters. Sometimes it is impossible to write a macro that behaves like a function even though its arguments are evaluated exactly once.
6). Avoid names that differ only in case, like foo and Foo. Similarly, avoid foobar and foo_bar. The potential for confusion is considerable.
7). Similarly, avoid names that look like each other. On many terminals and printers, 'I', '1' and 'l' look quite similar. A variable named l is particularly bad because it looks so much like the constant 1.
In general, global names (including enums) should have a common prefix identifying the module they belong to. Globals may alternatively be grouped in a global structure. Typedef names often have "_t" appended to them.
Avoid names that might conflict with names in the various standard libraries. Some systems include more library code than you want. Also, your program may be extended someday.
12. Constants
Numerical constants should not be coded directly into source files. The #define feature of the C preprocessor should be used to give constants meaningful names. Symbolic constants make the code easier to read. Defining the value in one place also makes large programs easier to administer, since the value can be changed uniformly by changing only the define. The enumeration data type is a better way to declare variables that take on only a discrete set of values, since the compiler can often perform additional type checking. At the very least, any directly coded numerical constant must have a comment explaining the derivation of the value.
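A small sketch of both forms, assuming a made-up line-width limit and a made-up color type; none of these names come from the text.

```c
#define MAX_LINE 132    /* widest line the (hypothetical) printer accepts */

/* Enum for a discrete set of values; the first constant marks an error. */
typedef enum { COLOR_ERR, COLOR_RED, COLOR_GREEN, COLOR_BLUE } color_t;

/* Returns 1 if a line of `len' characters fits, 0 otherwise. */
int
line_fits(int len)
{
    return (len >= 0 && len <= MAX_LINE);
}

/* Returns a printable name for `c'. */
const char *
color_name(color_t c)
{
    switch (c) {
    case COLOR_RED:
        return "red";
    case COLOR_GREEN:
        return "green";
    case COLOR_BLUE:
        return "blue";
    case COLOR_ERR:
    default:
        return "unknown";
    }
}
```

Changing the limit means editing one #define, and a compiler can warn when a switch over color_t misses a value.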
Constants should be defined consistently with their use; for example, use 540.0 for a float instead of 540 with an implicit float conversion. There are some cases where the constants 0 and 1 may appear as themselves instead of as defines. For example, if a for loop indexes through an array, then
for (i = 0; i < ARYBOUND; i++)
is reasonable, while the code
door_t *front_door = opens(door[i], 7);
if (front_door == 0)
    error("can't open %s\n", door[i]);
is not. In the last example, front_door is a pointer. When a value is a pointer, it should be compared with NULL instead of 0. NULL is defined in the standard I/O library header file (stdio.h), and on newer systems in stdlib.h as well. Even simple values like 1 or 0 are often better expressed using defines like TRUE and FALSE (sometimes YES and NO read better).
Simple character constants should be written as character literals rather than numbers. Non-text characters are discouraged because they are not portable. If non-text characters are necessary, particularly in strings, they should be written as an escape of three octal digits (for example, '\007') rather than one. Even so, such usage should be considered machine-dependent and treated accordingly.
13. Macros
Complex expressions can be used as macro parameters, and operator-precedence problems can arise unless every occurrence of a parameter in the macro definition is enclosed in parentheses. There is little that can be done about the problems caused by side effects in parameters except to avoid side effects in expressions (a good idea anyway) and, when possible, to write macros that evaluate their parameters exactly once. There are times when it is impossible to write a macro that acts exactly like a function.
Some macros also exist as functions (e.g., getc and fgetc). The macro should be used in implementing the function, so that changes to the macro are automatically reflected in the function. Care is needed when interchanging macros and functions, since function parameters are passed by value while macro parameters are passed by name substitution. Carefree use of macros requires that they be declared carefully.
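Both hazards, precedence and repeated evaluation, can be shown in a few lines. DOUBLE_BAD, DOUBLE, MAX_OF, max_int, and the call counter are hypothetical names used only for this sketch.

```c
/* Parameter left unparenthesized: precedence can bite. */
#define DOUBLE_BAD(x)  (x * 2)
/* Every parameter occurrence and the whole body parenthesized. */
#define DOUBLE(x)      ((x) * 2)

/* Evaluates the larger argument twice: unsafe with side effects. */
#define MAX_OF(a, b)   (((a) > (b)) ? (a) : (b))

/* A function evaluates each argument exactly once. */
int
max_int(int a, int b)
{
    return (a > b) ? a : b;
}

/* Counts calls so the double evaluation can be observed. */
static int calls;

int
next_value(void)
{
    calls++;
    return 5;
}

int
call_count(void)
{
    return calls;
}
```

DOUBLE_BAD(1 + 2) expands to (1 + 2 * 2), and MAX_OF(next_value(), 3) calls next_value twice, whereas max_int(next_value(), 3) calls it once.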
Macros should avoid using globals, since the global name may be hidden by a local declaration. Macros that change named parameters (rather than the storage they point to) or that may be used as the left-hand side of an assignment should say so in their comments. Macros that take no parameters but reference variables, are long, or are aliases for function calls should be given an empty parameter list, for example:
#define OFF_A() (a_global+OFFSET)
#define BORK() (zork())
#define SP3() if (b) { int x; av = f (&x); bv += x; }
Macros save function call/return overhead, but when a macro gets long, the cost of the call/return becomes negligible, and a function should be used instead.
In some cases it is appropriate to make the compiler ensure that a macro is terminated with a semicolon.
if (x==3)
    SP3();
else
    BORK();
If the semicolon after the call to SP3 is omitted, the else will (silently!) become associated with the if in the SP3 macro. With the semicolon, the else does not match any if! The macro SP3 can be written safely as
#define SP3() \
    do { if (b) { int x; av = f (&x); bv += x; }} while (0)
Writing out the enclosing do-while by hand is awkward, and some compilers and tools may complain that there is a constant in the while condition. A macro for declaring statements may make programming easier:
#ifdef lint
static int ZERO;
#else
# define ZERO 0
#endif
#define STMT( stuff ) do { stuff } while (ZERO)
Declare SP3 with
#define SP3() \
    STMT( if (b) { int x; av = f (&x); bv += x; } )
Using STMT will help prevent small typos from silently changing programs.
Except for type casts, sizeof, and hacks such as the above, macros should contain keywords only if the entire macro is surrounded by braces.
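The do-while wrapper can be tried out with a simplified variant of the SP3 problem. Here the unsafe macro is a bare if with no braces, so the else attaches to it even when the semicolon is present. The macros ADD_IF_POS_BAD and ADD_IF_POS, the variable total, and both functions are assumptions made for this sketch.

```c
/* Hypothetical state touched by the macros. */
static int total;

/* Unsafe: a bare if; a following else can attach to it. */
#define ADD_IF_POS_BAD(v)   if ((v) > 0) total += (v)

/* Safe: behaves as a single statement and requires the semicolon. */
#define ADD_IF_POS(v)       do { if ((v) > 0) total += (v); } while (0)

int
tally_bad(int use, int v)
{
    total = 0;
    if (use)
        ADD_IF_POS_BAD(v);
    else
        total = -1;     /* silently pairs with the macro's if */
    return total;
}

int
tally(int use, int v)
{
    total = 0;
    if (use)
        ADD_IF_POS(v);
    else
        total = -1;
    return total;
}
```

With use equal to 0, tally returns -1 as the layout suggests, while tally_bad returns 0 because its else belongs to the if hidden inside the macro.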
14. Conditional Compilation
Conditional compilation is useful for things like machine dependencies, debugging, and setting certain options at compile time. Beware of conditional compilation, however: the various controls can easily combine in unforeseen ways. If you #ifdef machine dependencies, make sure that when no machine is specified the result is an error, not a default machine (use #error and indent it so that it works with older compilers). If you #ifdef optimizations, the default should be the unoptimized code rather than an uncompilable program. Be sure to test the unoptimized code.
Note that the text inside an #ifdef section may be scanned (processed) by the compiler even if the #ifdef is false. Thus, even if the #ifdef'd part of the file never gets compiled (for example, #ifdef COMMENT), it cannot be arbitrary text.
Put #ifdefs in header files instead of source files when possible. Use the #ifdefs to define macros that can be used uniformly in the code. For instance, a header file for checking memory allocation might look like this (omitting definitions for REALLOC and FREE):
#ifdef DEBUG
extern void *mm_malloc();
# define MALLOC(size) (mm_malloc(size))
#else
extern void *malloc();
# define MALLOC(size) (malloc(size))
#endif
Conditional compilation should generally be done on a feature-by-feature basis. Machine or operating system dependencies should be avoided in most cases.
#ifdef BSD4
long t = time ((long *)NULL);
#endif
The preceding code is poor for two reasons: there may be 4BSD systems for which there is a better choice, and there may be non-4BSD systems for which the above is the best code. Instead, define a suitable feature macro, such as TIME_LONG or TIME_STRUCT, in a configuration file.
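A feature-based version might be sketched as follows. The meaning given to each macro here is an assumption for illustration (the text names the macros but does not define them), the function seconds_now is hypothetical, and the #define at the top stands in for the configuration file so that the fragment compiles on its own. The final branch shows the indented #error for the case where nothing was selected.

```c
#include <time.h>

/*
 * Normally set in a configuration header; defined here only so the
 * sketch compiles. Exactly one feature switch should be defined.
 */
#define TIME_LONG

#if defined(TIME_LONG)
/* Assumption: the result of time() fits in a long. */
long
seconds_now(void)
{
    return (long)time(NULL);
}
#elif defined(TIME_STRUCT)
/* Assumption: a C11 library that provides timespec_get(). */
long
seconds_now(void)
{
    struct timespec ts;

    if (timespec_get(&ts, TIME_UTC) == 0)
        return -1L;
    return (long)ts.tv_sec;
}
#else
 #error "define TIME_LONG or TIME_STRUCT in the configuration file"
#endif
```

The source code calls seconds_now() everywhere and never mentions a machine or operating system name.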
15. Debugging
"C code. C code run. Run, code, run... Please run!!!" -- Barbara Tongue
If you use enums, the first enum constant should have a non-zero value, or the first constant should indicate an error.
enum { STATE_ERR, STATE_START, STATE_NORMAL, STATE_END } state_t;
enum { VAL_NEW=1, VAL_NORMAL, VAL_DYING, VAL_DEAD } value_t;
Uninitialized values will then often "catch themselves".
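A minimal sketch of the idea, using a typedef form of the first enum and hypothetical helper functions: a variable with static storage starts at zero, which is the error constant, so code that forgets to set the state is detectable.

```c
typedef enum { STATE_ERR, STATE_START, STATE_NORMAL, STATE_END } state_t;

/* Static storage is zero-initialized, so this starts as STATE_ERR. */
static state_t current_state;

/* Returns 1 if the state was set explicitly, 0 if it is still STATE_ERR. */
int
state_is_set(void)
{
    return current_state != STATE_ERR;
}

/* Moves the (hypothetical) machine into its first valid state. */
void
state_begin(void)
{
    current_state = STATE_START;
}
```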
Check for error return values, even from functions that "can't" fail. Consider that close() and fclose() can and do fail, even when all prior file operations have succeeded. Write your own functions so that they test for errors and return error values or abort the program in a well-defined way. Include a lot of debugging and error-checking code and leave most of it in the finished product. Check even for "impossible" errors.
Use the assert facility to insist that each function is being passed well-defined values and that intermediate results are well-formed.
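For instance, a function might assert its preconditions on entry and check an intermediate result before using it. The function mean below is a hypothetical example; note that assert is compiled out when NDEBUG is defined, so it complements rather than replaces explicit error returns.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Returns the mean of `n' values. The asserts insist on
 * well-defined arguments and a well-formed intermediate result.
 */
double
mean(const double *values, size_t n)
{
    double sum = 0.0;
    size_t i;

    assert(values != NULL);
    assert(n > 0);

    for (i = 0; i < n; i++)
        sum += values[i];

    assert(sum == sum);     /* fails if the sum became NaN */
    return sum / (double)n;
}
```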
Build in the debugging code using as few #ifdefs as possible. For instance, if mm_malloc is a debugging memory allocator, then MALLOC will select the appropriate allocator, avoid littering the code with #ifdefs, and make clear the difference between allocation calls that are being debugged and extra memory that is allocated only during debugging.
#ifdef DEBUG
# define MALLOC(size) (mm_malloc(size))
#else
# define MALLOC(size) (malloc(size))
#endif
Check bounds even on things that "can't" overflow. A function that writes to variable-sized storage should take an argument maxsize that is the size of the destination. If there are times when the size of the destination is unknown, some "magic" value of maxsize should mean "no bounds checking". When a bounds check fails, make sure that the function does something useful, such as aborting the program or returning an error status.
/*
 * INPUT: A null-terminated source string `src' to copy from and
 * a `dest' string to copy to. `maxsize' is the size of `dest'
 * or UINT_MAX if the size is not known. `src' and `dest' must
 * both be shorter than UINT_MAX, and `src' must be no longer than
 * `dest'.
 * OUTPUT: The address of `dest' or NULL if the copy fails.
 * `dest' is modified even when the copy fails.
 */
char *
copy (dest, maxsize, src)
char *dest, *src;
unsigned maxsize;
{
    char *dp = dest;

    while (maxsize-- > 0)
        if ((*dp++ = *src++) == '\0')
            return (dest);
    return (NULL);
}
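For readers working with a current compiler, the same function can be rendered with an ANSI prototype, together with a caller that checks the error return. This is a sketch of one possible rendering; set_name is a hypothetical caller that does not appear in the text.

```c
#include <stddef.h>

/* ANSI-prototype rendering of the K&R-style copy() above. */
char *
copy(char *dest, unsigned maxsize, const char *src)
{
    char *dp = dest;

    while (maxsize-- > 0) {
        if ((*dp++ = *src++) == '\0')
            return dest;
    }
    return NULL;
}

/* Hypothetical caller that checks the error return. */
int
set_name(char *name, unsigned size, const char *value)
{
    if (copy(name, size, value) == NULL)
        return -1;      /* value did not fit; name is not terminated */
    return 0;
}
```

Copying "hello" into an eight-byte buffer succeeds, while a three-byte limit makes copy return NULL and set_name return -1.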
In all, remember that a program that produces wrong answers twice as fast is infinitely slower. The same is true of programs that crash occasionally or clobber valid data.