C
Standardizing style with the aim of aiding inspection and removing pitfalls
Human review remains one of the best ways to ensure a body of source code specifies the program that the programmer set out to write. Testing is usually a practical necessity, but a programmer’s ultimate goal should be a program that humans can read and understand. Source files that follow a single style are easier to review and audit. The pedansee utility parses source files in C and indicates whether they comply with one such style.
Pedansee’s default style strives for consistency, and it also removes pitfalls that lead to bugs. For example, C does not require braces around one-line blocks. However, leaving them off sometimes leads to the situation where a programmer later indents a statement following a block and expects it to be included in the block.
It is possible to integrate pedansee with a project that builds using the GNU autotools. First, add the following to the project’s configure.ac:
AC_PATH_PROG(PEDANSEE, pedansee)
AM_CONDITIONAL(HAVE_PEDANSEE, test -n "$PEDANSEE")
Next, add this to the Makefile.am responsible for compiling the project’s source code (replace proj_1_0):
check:
if HAVE_PEDANSEE
set -e; for i in $(proj_1_0_la_SOURCES); do $(PEDANSEE) $$i -- -x c $(DEFS) $(proj_1_0_la_CFLAGS); done
endif
With this, running pedansee is a matter of executing make check.
The tool indent will reformat C source code according to a configurable style. Flyn Computing’s preferred use is indent -linux foo.c. This follows Linus Torvalds’s preferred style.
Catching bugs with assertions
Assertions can help catch bugs that result from poor composition or otherwise misused interfaces, but you should not use them to catch runtime conditions from which a program could otherwise recover. In general, it is best to catch bugs as early as possible during the course of writing a program; the following definition of C_ASSERT will catch, at compile time, bugs involving constants:
#define C_ASSERT(e) typedef char __C_ASSERT__[(e)?1:-1] __attribute__((unused))
Catching bugs with run-time unit testing
Neither proofs of correctness nor code audits alone can keep up with the complexity found in most programs, so run-time testing remains necessary. Unit testing exercises the components of a program independently, and it is best performed as a program is written. For example, the programmer specifies a C function, then writes a series of tests to ensure the function implements the specification, and only then writes the function itself and runs the tests. This makes the programmer his own adversary: precisely when he best has in mind the purpose of a function, he writes the tests for that function. Only after the tests exist, ready to provide evidence of the function’s correctness, does he write the function itself.
The check framework aids in writing unit tests for programs specified in C. Check can use fork/exec to allow a series of tests to continue running even if one test terminates due to a memory error, although this feature can be turned off to facilitate running the tests in a debugger. Check also supports environment variables that select a subset of tests to run.
It is possible to integrate check with a project that builds using the GNU autotools. First, add the following to the project’s configure.ac:
PKG_CHECK_MODULES([CHECK], [check >= 0.9.4],have_check=yes,have_check=no)
AM_CONDITIONAL(HAVE_CHECK, test x"$have_check" = "xyes")
Next, add this to the Makefile.am responsible for compiling the project’s tests (replace … or omit the statement entirely):
if HAVE_CHECK
noinst_PROGRAMS += unit-test
endif
if HAVE_CHECK
unit_test_SOURCES = unit-test.c
unit_test_LDADD = ...
endif
Note that this example assumes that the project ships a library, namely libproj.
Each source file ought to contain tests. For example, this checks that the function x produces the string bar when passed foo:
#ifdef HAVE_CHECK
START_TEST(x_test)
{
ck_assert_str_eq(x("foo"), "bar");
}
END_TEST
#endif
Refer to check’s documentation for a description of the API used to write such tests.
Writing the framework code necessary for test execution can be tedious, as it involves bundling tests into suites and maintaining a main function. Some projects include a script that generates this source; see libdmapsharing’s generate-test-suites, for example.
Measuring code coverage using gcov
Unit tests ought to maximally cover the body of source code that makes up a program. Although it is impossible to test all but the simplest programs across all possible inputs, testing should at least try to execute each possible branch in a program. GCC’s gcov can help measure such branch coverage.
To use gcov to measure your coverage, compile your program using -fprofile-arcs -ftest-coverage. With the GNU autotools, you can produce a configure script that activates these flags when passed --enable-coverage. To do this, use the following pattern in configure.ac:
AC_ARG_ENABLE(debug, [AS_HELP_STRING([--enable-debug],[enable debugging build])])
AC_ARG_ENABLE(coverage, [AS_HELP_STRING([--enable-coverage],[enable code-coverage build])])
if test "x$enable_debug" = "xyes"; then
CFLAGS="$CFLAGS -g"
elif test "x$enable_coverage" = "xyes"; then
CFLAGS="$CFLAGS -fprofile-arcs -ftest-coverage"
else
CFLAGS="$CFLAGS -O2"
fi
AC_PROG_CC
Run your program after building it with gcov support. The result is an instrumented execution that will produce files containing the details of the execution. To view these details, run:
$ gcov foo.c
where foo.c is a source file. This will provide a summary along with a detailed report in foo.c.gcov. The report marks each line with the number of times that line executed. A line preceded by ##### did not execute, and thus indicates insufficient test coverage.
Some projects require an argument that points gcov to the directory containing the project’s object files (i.e., .libs). This is the case when the project includes library code:
$ gcov foo.c -o .libs/libfoo_1_0-foo.gcno
Catching memory errors with Valgrind
Valgrind helps find memory errors in programs, such as buffer overflows and memory leaks, and thus it might find bugs missed even when unit tests provide full coverage. To use Valgrind, compile your program with debugging symbols and without optimization. Then run the following (replace program -options …):
$ valgrind --leak-check=full --num-callers=100 program -options ...
The configure.ac fragment described in the gcov section above also provides support for a --enable-debug flag.
Catching memory errors with GCC
Always use GCC’s -Wall and -Wextra options.
GCC supports a -fsanitize=address option that instruments a program to catch memory errors, including out-of-bounds memory accesses and use of memory after it has been freed. Simply invoke the option when compiling, and run the resulting program. As a dynamic analyzer, this will only catch errors that manifest while running. Refer to GCC’s documentation for other -fsanitize options.
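A hypothetical out-of-bounds write illustrates this; the program runs cleanly without the option but aborts with a report under it (assuming libasan is available on the system):

```shell
cat > oob.c <<'EOF'
#include <stdlib.h>
int main(void)
{
    int *a = malloc(4 * sizeof(int));
    a[4] = 1;          /* writes one element past the end */
    free(a);
    return 0;
}
EOF
gcc -g -fsanitize=address oob.c -o oob
./oob || echo "address sanitizer caught the overflow"
```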
GCC also supports a -fanalyzer option that invokes a static analyzer on the program GCC is compiling.
Catching programming errors with American Fuzzy Lop
Fuzzing provides randomized input patterns to a program in an attempt to cause the program to crash and thereby expose a bug. This technique might help find bugs missed by the techniques above.
For a program to be tested by American Fuzzy Lop (AFL), the program must read its input from standard input or from a file named on its command line (AFL substitutes the input file for @@ in the command). If neither is the case, then you will need to write a wrapper program to facilitate testing.
To use AFL, compile your program using the following pattern:
$ afl-gcc program.c -o program
Next, craft a series of input patterns that will guide AFL as it later produces its random inputs. The AFL documentation describes how to do this, but placing the following in fuzz_testcase_dir/0 will cause AFL to produce character inputs:
b
Finally, run the program with:
$ afl-fuzz -i fuzz_testcase_dir -o fuzz_findings_dir ./program
This will run AFL. AFL provides a real-time display and, when run as described, places crash-producing inputs in fuzz_findings_dir/crashes/.
Static linking with the GNU linker
When you statically link using the GNU linker, ld adds library symbols referenced by your code to the program it outputs. Ld adds these symbols with source-file granularity; that is, if you require the function foo, then ld will include foo along with any other symbols defined in the same source file as foo. If you want to produce small programs, then it might make sense to write your libraries such that each source file contains a single externally visible function; this minimizes the amount of code included in your program.
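A throwaway two-member archive demonstrates the granularity; all file and symbol names here are hypothetical:

```shell
# foo and bar live in separate source files, so separate archive members.
cat > foo.c <<'EOF'
int foo(void) { return 1; }
EOF
cat > bar.c <<'EOF'
int bar(void) { return 2; }
EOF
gcc -c foo.c bar.c
ar rcs libdemo.a foo.o bar.o

# The program references only foo.
cat > main.c <<'EOF'
int foo(void);
int main(void) { return foo(); }
EOF
gcc main.c -L. -ldemo -o main

# bar's archive member was never pulled in.
nm main | grep -q ' T bar$' || echo "bar not linked"
```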
Ld only includes symbols that you have not already defined, which allows you to override library functions. Use this with care: if you redefine foo but also reference bar, and the library defines both in the same source file, then ld must pull in that file to resolve bar. Doing so drags in the library’s foo as well, and the result is a symbol conflict between your foo and the library’s foo.