C
Standardizing style with the aim of aiding inspection and removing pitfalls
Human review remains one of the best ways to ensure a body of source code indeed specifies the program which the programmer set out to write. Testing is usually a practical necessity, but a programmer’s ultimate goal should be a program which humans can read and understand. Source files which follow a single style are easier to review and audit. The pedansee utility parses source files in C and indicates whether they comply with one such style.
Pedansee’s default style strives for consistency, and it also removes pitfalls which lead to bugs. For example, C does not require braces around one-line blocks. However, omitting them invites a later mistake: a programmer adds an indented statement after the block and expects it to be part of the block, when it in fact lies outside it.
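The brace pitfall described above can be sketched as follows. The function names are hypothetical and exist only for illustration. In the first function, the second increment was added later and indented to match, yet it always runs. Note that recent GCC versions may warn about this pattern through -Wmisleading-indentation.

```c
/* Hypothetical example: counts the log entries written for one request. */
int log_without_braces(int verbose)
{
    int entries = 0;
    if (verbose)
        entries++;      /* the original one-line block */
        entries++;      /* added later: indented, but NOT part of the if */
    return entries;
}

/* The same logic with braces: both statements belong to the block. */
int log_with_braces(int verbose)
{
    int entries = 0;
    if (verbose) {
        entries++;
        entries++;
    }
    return entries;
}
```

With verbose set to 0, the first function still writes one entry, whereas the braced version writes none.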
It is possible to integrate pedansee with a project that builds using the GNU autotools.
First, add the following to the project’s configure.ac:
AC_PATH_PROG(PEDANSEE, pedansee)
AM_CONDITIONAL(HAVE_PEDANSEE, test -n "$PEDANSEE")
Next, add this to the Makefile.am responsible for compiling the project’s source code (replace proj_1_0):
check:
if HAVE_PEDANSEE
	set -e; for i in $(proj_1_0_la_SOURCES); do $(PEDANSEE) $$i -- -x c $(DEFS) $(proj_1_0_la_CFLAGS); done
endif
With this, running pedansee is a matter of executing make check.
Catching bugs with assertions
Assertions can help catch bugs which result from composing interfaces incorrectly or otherwise misusing them, but you should not use them to catch run-time conditions from which a program could otherwise recover. In general, it is best to catch bugs as early as possible while writing a program; the following definition of C_ASSERT catches bugs involving constant expressions at compile time:
#define C_ASSERT(e) typedef char __C_ASSERT__[(e)?1:-1] __attribute__((unused))
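As a minimal sketch of the distinction drawn above, the following uses C_ASSERT for a compile-time property, assert for interface misuse, and an ordinary error return for a recoverable run-time condition. The record_header structure and the parse_header function are hypothetical names invented for this illustration. On compilers which support C11, the standard _Static_assert serves the same purpose as C_ASSERT.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define C_ASSERT(e) typedef char __C_ASSERT__[(e)?1:-1] __attribute__((unused))

/* Hypothetical on-disk header whose layout must remain 8 bytes. */
struct record_header {
    uint32_t magic;
    uint32_t length;
};

/* Compilation fails here if someone changes the size of the structure. */
C_ASSERT(sizeof(struct record_header) == 8);

/* Returns 0 on success or -1 if the input is too short. */
int parse_header(const unsigned char *buf, size_t len,
                 struct record_header *out)
{
    /* Passing NULL is a bug in the caller, so assert. */
    assert(buf != NULL);
    assert(out != NULL);

    /* Short input is a run-time condition the caller can recover from,
       so report it instead of asserting. */
    if (len < sizeof *out) {
        return -1;
    }

    memcpy(out, buf, sizeof *out);
    return 0;
}
```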
Catching bugs with run-time unit testing
Neither proofs of correctness nor code audits alone scale to the complexity found in most programs. Thus run-time testing remains necessary. Unit testing tests the components of a program independently, and it is best performed as a program is written. For example, the programmer specifies a C function, and then he writes a series of tests to ensure the C function implements the specification. Finally, he writes the function itself and runs the tests. This makes the programmer an adversary of himself. Precisely when he best has in mind the purpose of a function, he writes the tests for that function. Only after the tests exist and can provide evidence of the correctness of the function does he write the function itself.
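The order of work described above might look like the following sketch, which uses plain assert rather than any test framework. The function count_words and its specification are hypothetical and were chosen only to keep the example short.

```c
#include <assert.h>
#include <ctype.h>

/* Step 1: the specification. Returns the number of whitespace-separated
   words in s. */
int count_words(const char *s);

/* Step 2: the tests, written before the function exists. */
void test_count_words(void)
{
    assert(count_words("") == 0);
    assert(count_words("one") == 1);
    assert(count_words("  two  words ") == 2);
}

/* Step 3: the function itself, written last. */
int count_words(const char *s)
{
    int count = 0;
    int in_word = 0;

    for (; *s != '\0'; s++) {
        if (isspace((unsigned char) *s)) {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;
            count++;
        }
    }
    return count;
}
```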
The check framework aids in writing unit tests for programs specified in C. Check can use fork/exec to allow a series of tests to run even if one terminates due to a memory error, although this feature can be turned off (for example, by setting the environment variable CK_FORK to no) to facilitate running the tests in a debugger. Check also supports the environment variables CK_RUN_SUITE and CK_RUN_CASE, which result in running a subset of tests.
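The value of running each test in its own process can be illustrated with a small POSIX sketch. This is not check's implementation; run_isolated, passing_test, and crashing_test are hypothetical names, and raise(SIGSEGV) merely stands in for a real memory error.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

typedef void (*test_fn)(void);

/* Runs test in a child process. Returns 0 if the child exited normally
   with status 0, 1 if it failed or crashed, and -1 if the process could
   not be created or awaited. */
int run_isolated(test_fn test)
{
    pid_t pid = fork();
    int status;

    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        test();         /* a crash here kills only the child */
        _exit(0);
    }
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : 1;
}

void passing_test(void)
{
}

void crashing_test(void)
{
    raise(SIGSEGV);     /* simulates termination due to a memory error */
}
```

Because the crash occurs in the child, the parent survives to record the failure and move on to the next test.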
It is possible to integrate check with a project that builds using the GNU autotools.
First, add the following to the project’s configure.ac:
PKG_CHECK_MODULES([CHECK], [check >= 0.9.4],have_check=yes,have_check=no)
AM_CONDITIONAL(HAVE_CHECK, test x"$have_check" = "xyes")
Next, add this to the Makefile.am responsible for compiling the project’s tests (replace … or omit the statement entirely):
if HAVE_CHECK
noinst_PROGRAMS += unit-test
endif
if HAVE_CHECK
unit_test_SOURCES = unit-test.c
unit_test_LDADD = ...
endif
Note that this example assumes that the project ships a library, namely libproj.
Each source file ought to contain tests. For example, this checks that the function x produces the string bar when passed foo:
#ifdef HAVE_CHECK
START_TEST(x_test)
{
    ck_assert_str_eq(x("foo"), "bar");
}
END_TEST
#endif
Refer to check’s documentation for a description of the API used to write such tests.
Writing the framework code necessary for test execution can be tedious, as it involves bundling tests into suites and maintaining a main function. Some projects include a script which generates this source; see libdmapsharing’s generate-test-suites, for example.
Measuring code coverage using gcov
Unit tests ought to cover as much as possible of the source code which makes up a program. Although it is impossible to test other than the simplest programs across all possible inputs, testing should at least try to execute each possible branch in a program. GCC’s gcov helps measure progress toward this goal by reporting which lines and branches the tests executed.
To use gcov to measure your test coverage, compile your program using -fprofile-arcs -ftest-coverage. With the GNU autotools, you can build a configure script which activates these flags when passed --enable-coverage. To do this, use the following pattern in configure.ac:
AC_ARG_ENABLE(debug, [AS_HELP_STRING([--enable-debug],[enable debugging build])])
AC_ARG_ENABLE(coverage, [AS_HELP_STRING([--enable-coverage],[enable code-coverage build])])
if test "x$enable_debug" = "xyes"; then
    CFLAGS="$CFLAGS -g"
elif test "x$enable_coverage" = "xyes"; then
    CFLAGS="$CFLAGS -fprofile-arcs -ftest-coverage"
else
    CFLAGS="$CFLAGS -O2"
fi
AC_PROG_CC
Run your program after building it with gcov support. The instrumented program records the details of its execution and writes them to data files. To view these details, run:
$ gcov foo.c
where foo.c is a source file. This will provide a summary along with a detailed report in foo.c.gcov. The report marks lines with an integer representing how many times that line executed. A line preceded by ##### did not execute, and thus indicates insufficient test coverage.
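To make the meaning of an unexecuted line concrete, consider the following hypothetical function, clamp_to_byte, which is not taken from any project mentioned here. If the tests call it only with a negative value and with a value in range, then the line which returns 255 never runs, and gcov's report would mark that line with #####. Adding a test which passes a value above 255 would close the gap.

```c
/* Hypothetical function under test: limits value to the range 0..255. */
int clamp_to_byte(int value)
{
    if (value < 0) {
        return 0;
    }
    if (value > 255) {
        return 255;     /* runs only when a test passes value > 255 */
    }
    return value;
}
```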
Some projects require an argument which points gcov to the directory which contains the project’s object files (i.e., .libs). This is the case when the project includes library code:
$ gcov foo.c -o .libs/libfoo_1_0-foo.gcno
Catching memory errors with Valgrind
Valgrind helps find memory errors in programs, such as buffer overflows and memory leaks, and thus it might help find bugs missed even when unit tests execute every branch.
To use Valgrind, compile your program to include debugging symbols and without optimization. Then run the following (replace program -options …):
$ valgrind --leak-check=full --num-callers=100 program -options ...
The configure.ac fragment described in the gcov section above also provides support for a --enable-debug flag, which adds -g to CFLAGS and leaves out -O2.
Catching programming errors with American Fuzzy Lop
Fuzzing provides randomized input patterns to a program in an attempt to cause the program to crash and thereby expose a bug. This technique might help find bugs missed by the techniques above.
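For a program to be tested by American Fuzzy Lop (AFL), the program must read its input from standard input or from a file named on its command line (afl-fuzz replaces @@ in the target's command line with the name of the input file). If this is not the case, then you will need to write a wrapper program to facilitate testing.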
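A wrapper of the kind mentioned above could be built around the following sketch. All names here are hypothetical: count_b stands in for whatever code you actually want to fuzz, read_all collects an entire stream into memory, and fuzz_one ties the two together. A wrapper's main function would simply return fuzz_one(stdin).

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical function under test: counts the 'b' bytes in data. */
size_t count_b(const unsigned char *data, size_t len)
{
    size_t i, n = 0;

    for (i = 0; i < len; i++) {
        if (data[i] == 'b') {
            n++;
        }
    }
    return n;
}

/* Reads all of in into a heap buffer and stores its size in *len.
   Returns NULL on failure; the caller frees the result. */
unsigned char *read_all(FILE *in, size_t *len)
{
    size_t cap = 4096, used = 0;
    unsigned char *buf = malloc(cap);
    unsigned char *tmp;

    if (buf == NULL) {
        return NULL;
    }
    for (;;) {
        used += fread(buf + used, 1, cap - used, in);
        if (used < cap) {
            break;      /* short read: end of input or an error */
        }
        cap *= 2;
        tmp = realloc(buf, cap);
        if (tmp == NULL) {
            free(buf);
            return NULL;
        }
        buf = tmp;
    }
    if (ferror(in)) {
        free(buf);
        return NULL;
    }
    *len = used;
    return buf;
}

/* Feeds one input stream to the function under test. */
int fuzz_one(FILE *in)
{
    size_t len;
    unsigned char *data = read_all(in, &len);

    if (data == NULL) {
        return 1;
    }
    count_b(data, len);
    free(data);
    return 0;
}
```

Keeping the reading logic separate from the function under test means the same wrapper can be reused for other entry points.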
To use AFL, compile your program using the following pattern:
$ afl-gcc program.c -o program
Next, craft a series of input patterns which will guide AFL as it later produces its random inputs. The AFL documentation describes how to do this, but placing the following in fuzz_testcase_dir/0 will cause AFL to produce character inputs:
b
Finally, run the program with:
$ afl-fuzz -i fuzz_testcase_dir -o fuzz_findings_dir ./program
This will run AFL. AFL provides a real-time display and, when run as described, places crash-producing inputs in fuzz_findings_dir/crashes/.
Static linking with the GNU linker
When you statically link using the GNU linker, ld adds library symbols referenced by your code to the program it outputs. The linker adds these symbols using source-file granularity (strictly speaking, one object file at a time, and each object file typically corresponds to one source file); that is, if you require the function foo, then ld will include foo along with any other symbols defined in the same source file as foo.
If you want to produce small programs, then it might make sense to write your libraries such that each source file contains a single externally visible function; this will minimize the amount of code included in your program.
The linker only includes symbols which you have not already defined, allowing you to override library functions. This must be used with care. Suppose you redefine foo but not bar, your program also references bar, and the library defines both in the same source file. To satisfy the reference to bar, ld will include that file's object code, which brings in both foo and bar. The result is a symbol conflict arising from two definitions: your foo and the library's foo.