Robust programming in C (an oxymoron?)

Standardizing style with the aim of aiding inspection and removing pitfalls

Human review remains one of the best ways to ensure a body of source code indeed specifies the program which the programmer set out to write. Testing is usually a practical necessity, but a programmer’s ultimate goal should be a program which humans can read and understand. Source files which follow a single style are easier to review and audit. The pedansee utility parses source files in C and indicates whether they comply with one such style.

Pedansee’s default style strives for consistency, and it also removes pitfalls which lead to bugs. For example, C does not require braces around one-line blocks. However, leaving them off sometimes leads to the situation where a programmer later indents a statement following a block and expects it to be included in the block.
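
The pitfall can be reproduced in a few lines. The following is only an illustrative sketch; the function names unbraced and braced are invented for this example and do not come from pedansee. Recent versions of GCC flag the first function with -Wmisleading-indentation, but the code still compiles.

```c
/* Illustrative sketch; the function names are hypothetical. */

/* Without braces, only the first statement belongs to the if. */
int unbraced(int flag)
{
    int n = 0;
    if (flag)
        n += 1;
        n += 10;    /* indented as though guarded, but always executes */
    return n;
}

/* With braces, the block contains exactly what the indentation suggests. */
int braced(int flag)
{
    int n = 0;
    if (flag) {
        n += 1;
        n += 10;
    }
    return n;
}
```

Calling both functions with a flag of 0 exposes the difference: the unbraced version still adds 10, while the braced version leaves n untouched.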

It is possible to integrate pedansee with a project that builds using the GNU autotools. First, add the following to the project’s configure.ac:

AC_PATH_PROG(PEDANSEE, pedansee)
AM_CONDITIONAL(HAVE_PEDANSEE, test -n "$PEDANSEE")

Next, add this to the Makefile.am responsible for compiling the project’s source code (replace proj_1_0):

check:
if HAVE_PEDANSEE
	set -e; for i in $(proj_1_0_la_SOURCES); do $(PEDANSEE) $$i -- -x c $(DEFS) $(proj_1_0_la_CFLAGS); done
endif

With this, running pedansee is a matter of executing make check.

Catching bugs with assertions

Assertions can help catch bugs which result from poorly composed or otherwise misused interfaces, but you should not use them to handle run-time conditions from which a program could otherwise recover. In general, it is best to catch bugs as early as possible during the course of writing a program; the following definition of C_ASSERT will catch bugs involving constants at compile time:

#define C_ASSERT(e) typedef char __C_ASSERT__[(e)?1:-1] __attribute__((unused))
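
As a rough sketch of how these pieces might fit together, the following uses C_ASSERT on a constant, assert on a misused interface, and an ordinary return value for a recoverable condition. TABLE_SIZE, table_slot, and table_alloc are hypothetical names invented for this example, and the sketch assumes GCC or a compatible compiler because C_ASSERT relies on __attribute__.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define C_ASSERT(e) typedef char __C_ASSERT__[(e)?1:-1] __attribute__((unused))

#define TABLE_SIZE 16    /* hypothetical constant */

/* Compile-time check: the masking in table_slot requires a power of two.
 * Changing TABLE_SIZE to 12 would stop the build with a negative array size. */
C_ASSERT((TABLE_SIZE & (TABLE_SIZE - 1)) == 0);

/* Passing NULL misuses the interface, which is a bug, so assert on it. */
size_t table_slot(const char *key)
{
    assert(key != NULL);
    size_t h = 0;
    while (*key != '\0') {
        h = h * 31 + (unsigned char) *key++;
    }
    return h & (TABLE_SIZE - 1);
}

/* Running out of memory is a run-time condition, so report it instead:
 * the caller receives NULL and can recover. */
int *table_alloc(void)
{
    return calloc(TABLE_SIZE, sizeof(int));
}
```

The distinction is that a failed assertion always points at a mistake in the source code, whereas a NULL return from table_alloc describes the environment in which the program happens to run.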

Catching bugs with run-time unit testing

Neither proving the correctness of a program nor relying entirely on code audits scales to the complexity found in most programs, so run-time testing remains necessary. Unit testing tests the components of a program independently, and it is best performed as a program is written. For example, the programmer specifies a C function, and then he writes a series of tests to ensure the C function implements the specification. Finally, he writes the function itself and runs the tests. This makes the programmer an adversary of himself: he writes the tests for a function precisely when he best has its purpose in mind, and only after the tests exist and can provide evidence of the function’s correctness does he write the function itself.
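
The ordering described above can be sketched with nothing more than assert from the standard library. The function count_words and its specification are hypothetical and exist only to illustrate the sequence of specification, tests, and implementation.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Step 1: the specification. count_words returns the number of maximal
 * runs of non-whitespace characters in s (hypothetical function). */
size_t count_words(const char *s);

/* Step 2: the tests, written against the specification alone. */
void test_count_words(void)
{
    assert(count_words("") == 0);
    assert(count_words("   ") == 0);
    assert(count_words("one") == 1);
    assert(count_words("  two  words ") == 2);
}

/* Step 3: the implementation, written last. */
size_t count_words(const char *s)
{
    size_t n = 0;
    int in_word = 0;

    for (; *s != '\0'; s++) {
        if (isspace((unsigned char) *s)) {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;
            n++;
        }
    }
    return n;
}
```

Calling test_count_words from a small main function runs the tests; a failed assert aborts the program and names the offending line.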

The check framework aids in writing unit tests for programs written in C. Check runs each test in a forked child process, which allows a series of tests to continue even if one terminates due to a memory error; setting the environment variable CK_FORK to no turns this off to facilitate running the tests in a debugger. Recent versions of check also support environment variables, such as CK_RUN_SUITE and CK_RUN_CASE, which restrict a run to a subset of tests.

It is possible to integrate check with a project that builds using the GNU autotools. First, add the following to the project’s configure.ac:

PKG_CHECK_MODULES([CHECK], [check >= 0.9.4],have_check=yes,have_check=no)
AM_CONDITIONAL(HAVE_CHECK, test x"$have_check" = "xyes")

Next, add this to the Makefile.am responsible for compiling the project’s tests (replace … or omit the statement entirely):

if HAVE_CHECK
noinst_PROGRAMS += unit-test
endif

if HAVE_CHECK
unit_test_SOURCES = unit-test.c
unit_test_LDADD = ...
endif

Note that this example assumes that the project ships a library, namely libproj.

Each source file ought to contain tests. For example, this checks that the function x produces the string bar when passed foo:

#ifdef HAVE_CHECK

START_TEST(x_test)
{
    ck_assert_str_eq(x("foo"), "bar");
}
END_TEST

#endif

Refer to check’s documentation for a description of the API used to write such tests.

Writing the framework code necessary for test execution can be tedious, as it involves bundling tests into suites and maintaining a main function. Some projects include a script which generates this source; see libdmapsharing’s generate-test-suites, for example.
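
To give a sense of the bookkeeping involved, here is a framework-independent sketch of a test registry. It deliberately does not use check’s API; the types and names (test_fn, test_entry, string_suite, run_suite, run_string_suite) are invented for this illustration. A generator script would typically emit the table so that nobody has to maintain it by hand.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical types; a test returns 0 on success and nonzero on failure. */
typedef int (*test_fn)(void);

struct test_entry {
    const char *name;
    test_fn fn;
};

static int test_strlen_empty(void)
{
    return strlen("") == 0 ? 0 : 1;
}

static int test_strlen_word(void)
{
    return strlen("foo") == 3 ? 0 : 1;
}

/* The part which is tedious to maintain: every new test needs an entry. */
static const struct test_entry string_suite[] = {
    { "strlen_empty", test_strlen_empty },
    { "strlen_word",  test_strlen_word  },
};

/* Run every test in a suite and return the number of failures. */
int run_suite(const struct test_entry *suite, size_t count)
{
    int failed = 0;

    for (size_t i = 0; i < count; i++) {
        if (suite[i].fn() != 0) {
            fprintf(stderr, "FAIL %s\n", suite[i].name);
            failed++;
        }
    }
    return failed;
}

int run_string_suite(void)
{
    return run_suite(string_suite, sizeof string_suite / sizeof string_suite[0]);
}
```

A main function would call run_string_suite and use the failure count to choose its exit status.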

Measuring code coverage using gcov

Unit tests ought to maximally cover the body of source code which makes up a program. Although it is impossible to test other than the simplest programs across all possible inputs, testing should at least try to execute each possible branch in a program. GCC’s gcov can help measure how close the tests come to this goal.

To use gcov to measure your coverage, compile your program using -fprofile-arcs -ftest-coverage. Using the GNU autotools, you can build a configure script which activates these flags when passed --enable-coverage. To do this, use the following pattern in configure.ac:

AC_ARG_ENABLE([debug], [AS_HELP_STRING([--enable-debug], [enable debugging build])])
AC_ARG_ENABLE([coverage], [AS_HELP_STRING([--enable-coverage], [enable code-coverage build])])
if test "x$enable_debug" = "xyes"; then
    CFLAGS="$CFLAGS -g"
elif test "x$enable_coverage" = "xyes"; then
    CFLAGS="$CFLAGS -fprofile-arcs -ftest-coverage"
else
    CFLAGS="$CFLAGS -O2"
fi

AC_PROG_CC

Run your program after building it with gcov support. The instrumented execution writes a .gcda data file alongside each object file, and these files record the details of the run. To view these details, run:

$ gcov foo.c

where foo.c is a source file. This will provide a summary along with a detailed report in foo.c.gcov. The report marks lines with an integer representing how many times that line executed. A line preceded by ##### did not execute, and thus indicates insufficient test coverage.
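
As a small illustration of the kind of gap such a report exposes, consider the sketch below. The function classify and the two test functions are hypothetical. If only classify_partial_test runs, the report should mark the return statements for the negative and zero cases with #####; running classify_full_test as well should leave no such lines.

```c
#include <string.h>

/* Hypothetical function with three branches. */
const char *classify(int n)
{
    if (n < 0) {
        return "negative";
    }
    if (n == 0) {
        return "zero";
    }
    return "positive";
}

/* Exercises only one branch; returns 1 on success. */
int classify_partial_test(void)
{
    return strcmp(classify(5), "positive") == 0;
}

/* Exercises every branch; returns 1 on success. */
int classify_full_test(void)
{
    return strcmp(classify(-3), "negative") == 0
        && strcmp(classify(0), "zero") == 0
        && strcmp(classify(5), "positive") == 0;
}
```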

Some projects require an argument which points gcov to the directory which contains the project’s object files (i.e., .libs). This is the case when the project includes library code:

$ gcov foo.c -o .libs/libfoo_1_0-foo.gcno

Catching memory errors with Valgrind

Valgrind helps find memory errors such as buffer overflows and memory leaks in programs, and thus it might find bugs missed even when unit tests provide full coverage. To use Valgrind, compile your program with debugging symbols and without optimization. Then run the following (replace program -options …):

$ valgrind --leak-check=full --num-callers=100 program -options ...

The configure.ac fragment described in the gcov section above also provides an --enable-debug flag, which builds with debugging symbols and without optimization.
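
The following sketch shows a leak of the sort that tests can pass over silently, because the leaking function returns the correct answer. The names leaky_length and fixed_length are hypothetical. Run under the command above, a program which calls leaky_length should cause Valgrind to report the copied string as definitely lost, whereas fixed_length should produce no such report.

```c
#include <stdlib.h>
#include <string.h>

/* Returns the right answer but never frees its working copy. */
size_t leaky_length(const char *s)
{
    char *copy = malloc(strlen(s) + 1);
    if (copy == NULL) {
        return 0;
    }
    strcpy(copy, s);
    return strlen(copy);    /* BUG: copy is never freed */
}

/* Same result, but the working copy is released before returning. */
size_t fixed_length(const char *s)
{
    char *copy = malloc(strlen(s) + 1);
    if (copy == NULL) {
        return 0;
    }
    strcpy(copy, s);
    size_t n = strlen(copy);
    free(copy);
    return n;
}
```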

Catching programming errors with American Fuzzy Lop

Fuzzing provides randomized input patterns to a program in an attempt to cause the program to crash and thereby expose a bug. This technique might help find bugs missed by the techniques above.

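By default, American Fuzzy Lop (AFL) delivers each test input to the program through standard input. AFL can instead pass the name of an input file on the program’s command line, which is written as @@ in the afl-fuzz invocation. If your program accepts input in neither way, then you will need to write a wrapper program to facilitate testing.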
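
A wrapper of this kind can be quite small. The sketch below assumes a hypothetical function under test, line_length, which normally receives a buffer from elsewhere in the program; read_all and the WRAPPER_MAIN macro are likewise invented for this example. Compiling with -DWRAPPER_MAIN produces a program which reads standard input and hands the bytes to the function.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical function under test: the number of bytes which precede
 * the first newline in buf, or len if buf contains no newline. */
size_t line_length(const char *buf, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++) {
        if (buf[i] == '\n') {
            break;
        }
    }
    return i;
}

/* Read at most cap bytes from in and return the number of bytes read. */
size_t read_all(FILE *in, char *buf, size_t cap)
{
    size_t total = 0;
    size_t n;

    while (total < cap && (n = fread(buf + total, 1, cap - total, in)) > 0) {
        total += n;
    }
    return total;
}

#ifdef WRAPPER_MAIN
/* The wrapper itself: feed standard input to the function under test. */
int main(void)
{
    static char buf[4096];
    size_t len = read_all(stdin, buf, sizeof buf);

    printf("%zu\n", line_length(buf, len));
    return 0;
}
#endif
```

The function under test takes an explicit length rather than relying on a terminating NUL, because fuzzed input may contain arbitrary bytes.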

To use AFL, compile your program using the following pattern:

$ afl-gcc program.c -o program

Next, craft a series of input patterns which will guide AFL as it later produces its random inputs. The AFL documentation describes how to do this, but placing the following in fuzz_testcase_dir/0 will cause AFL to produce character inputs:

b

Finally, run the program with:

$ afl-fuzz -i fuzz_testcase_dir -o fuzz_findings_dir ./program

This will run AFL. AFL provides a real-time display and, when run as described, places crash-producing inputs in fuzz_findings_dir/crashes/.

Static linking with the GNU linker

When you statically link using the GNU linker, ld adds the library symbols referenced by your code to the program it outputs. Ld adds these symbols at object-file granularity, which usually amounts to source-file granularity; that is, if you require the function foo, then ld will include foo along with every other symbol defined in the same source file as foo. If you want to produce small programs, then it might make sense to write your libraries such that each source file contains a single externally visible function; this will minimize the amount of code included in your program.
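
A library source file laid out this way might look like the sketch below. The file name foo.c, the helper clamp, and the behavior of foo are all hypothetical. The helper is declared static, so foo is the only symbol the file exports, and a program which never references foo pulls in none of this code.

```c
/* foo.c (hypothetical library source file) */
#include <stddef.h>

/* Internal helper: static, so it is not visible outside of this file. */
static size_t clamp(size_t v, size_t max)
{
    if (v > max) {
        return max;
    }
    return v;
}

/* The single externally visible function defined in this file. */
size_t foo(size_t v)
{
    return clamp(v * 2, 100);
}
```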

Ld only pulls in library definitions for symbols which you have not already defined, allowing you to override library functions. Use this with care: if you redefine foo but not bar, and the library defines both in the same source file, then you will get a symbol conflict. Ld must include the library’s object file to satisfy bar, that file also defines foo, and so the program ends up with two definitions of foo: yours and the library’s.

Email: www@flyn.org — ✉ 6110 Campfire Court; Columbia, Maryland 21045; USA