Dead Code Detection in Legacy C/C++ Project

Dead code detection in legacy C/C++ project

You could use a code coverage analysis tool for this and look for unused spots in your code.

A popular tool for the gcc toolchain is gcov, together with the graphical frontend lcov (http://ltp.sourceforge.net/coverage/lcov.php).

If you use gcc, you can compile with gcov support, which is enabled by the '--coverage' flag. Next, run your application or run your test suite with this gcov enabled build.

Basically gcc will emit some extra files during compilation and the application will also emit some coverage data while running. You have to collect all of these (.gcdo and .gcda files). I'm not going in full detail here, but you probably need to set two environment variables to collect the coverage data in a sane way: GCOV_PREFIX and GCOV_PREFIX_STRIP...

After the run, you can put all the coverage data together and run it through the lcov toolsuite. Merging of all the coverage files from different test runs is also possible, albeit a bit involved.

Anyhow, you end up with a nice set of webpages showing some coverage information, pointing out the pieces of code that have no coverage and hence, were not used.

Off course, you need to double check if the portions of code are not used in any situation and a lot depends on how good your tests exercise the codebase. But at least, this will give an idea about possible dead-code candidates...

Finding dead code in a large C++ legacy application

You'll want to use a static analysis tool

  • StackOverflow: What open source C++ static analysis tools are available?
  • Wikipedia: List of tools for static code analysis

The main gotcha I've run into is that you have to be careful that any libraries aren't used from somewhere that you don't control/have. If you delete a function from a class that gets used by referencing a library in your project you can break something that you didn't know used the code.

Automated Dead code detection in native C++ application on Windows?

Ask the linker to remove unreferenced objects (/OPT:REF). If you use function-level linking, and verbose linker output, the linker output will list every function it can prove is unused. This list may be far from complete, but you already have the tools needed.

Dead code identification (C++)

You'll want something along the lines of QA-C++ (http://www.programmingresearch.com/QACPP_MAIN.html), also see http://en.wikipedia.org/wiki/List_of_tools_for_static_code_analysis for similar products.

You're looking for a static code analysis tool that detects unreachable code; many coding guidelines (such as MISRA-C++, if I'm not mistaken) require that no unreachable code exists. An analysis tool geared specifically to enforce such a guideline would be your best bet.

And you'll like be able to find other uses for the tool as well.

Find unused functions in a C project by static analysis

When GCC says "will never be executed", it means it. You may have a bug that, in fact, does make that dead code. For example, something like:

if (a = 42) {
// some code
} else {
// warning: unreachable code
}

Without seeing the code it's not possible to be specific, of course.

Note that if there is a macro at line 666, it's possible GCC refers to a part of that macro as well.

Dynamic dead code elimination tools for complex C++ projects

You might want to look at the code coverage tools that are used in testing. The idea of these tools is that they instrument the code and after running the set of tests you know what lines of code were executed at least once and what lines were never executed. After that you can improve tests.

The same thing can be used to identify dead code in case if you have diverse enough execution environment.

How can I know which parts in the code are never used?

There are two varieties of unused code:

  • the local one, that is, in some functions some paths or variables are unused (or used but in no meaningful way, like written but never read)
  • the global one: functions that are never called, global objects that are never accessed

For the first kind, a good compiler can help:

  • -Wunused (GCC, Clang) should warn about unused variables, Clang unused analyzer has even been incremented to warn about variables that are never read (even though used).
  • -Wunreachable-code (older GCC, removed in 2010) should warn about local blocks that are never accessed (it happens with early returns or conditions that always evaluate to true)
  • there is no option I know of to warn about unused catch blocks, because the compiler generally cannot prove that no exception will be thrown.

For the second kind, it's much more difficult. Statically it requires whole program analysis, and even though link time optimization may actually remove dead code, in practice the program has been so much transformed at the time it is performed that it is near impossible to convey meaningful information to the user.

There are therefore two approaches:

  • The theoretic one is to use a static analyzer. A piece of software that will examine the whole code at once in great detail and find all the flow paths. In practice I don't know any that would work here.
  • The pragmatic one is to use an heuristic: use a code coverage tool (in the GNU chain it's gcov. Note that specific flags should be passed during compilation for it to work properly). You run the code coverage tool with a good set of varied inputs (your unit-tests or non-regression tests), the dead code is necessarily within the unreached code... and so you can start from here.

If you are extremely interested in the subject, and have the time and inclination to actually work out a tool by yourself, I would suggest using the Clang libraries to build such a tool.

  1. Use the Clang library to get an AST (abstract syntax tree)
  2. Perform a mark-and-sweep analysis from the entry points onward

Because Clang will parse the code for you, and perform overload resolution, you won't have to deal with the C++ languages rules, and you'll be able to concentrate on the problem at hand.

However this kind of technique cannot identify the virtual overrides that are unused, since they could be called by third-party code you cannot reason about.

What do you do with unused code in your legacy applications?

  1. Mark it obsolete. http://msdn.microsoft.com/en-us/library/aa664623%28VS.71%29.aspx
  2. Next cycle, comment the code out & make appropriate remarks. Set a region around the code block so that it collapses nicely & doesn't eat up a lot of visual space.
  3. Next cycle, delete the code.

This way you don't surprise your teammates too badly if you need to deprecate some code that you're assigned to but which they need to make calls to. But allows you to make those needed changes!



Related Topics



Leave a reply



Submit