Military Embedded Systems

Static analysis exposes latent defects in legacy software

February 11, 2013

Paul Anderson

GrammaTech

Static analysis tools can help find concurrency defects and other latent bugs before legacy code is ported to a new platform.

When migrating from a legacy software-based system to new technology, it is important to reuse as much code as possible. Even if that code has been tested thoroughly and has proved reliable in practice in the old system, it might still contain latent bugs. Those bugs might never have been triggered in the legacy system because of very specific properties of that system, such as the toolchain used to compile the code, the processor architecture, or the host operating system. When the code is ported to a new system where those properties differ, the latent defects can manifest as harmful bugs. The good news is that advanced static analysis tools can flush out these latent defects before they cause trouble.

Updating the system, revealing coding flaws

One of the most important motivations for migrating legacy systems is to take advantage of advances in hardware technology since the original system was first deployed. The most common benefit is increased performance from adopting a newer and faster processor. This is also the single most significant change from the perspective of the code: the new processor can have a different bit width or endianness, and the number of available cores can differ. When porting code from the old platform to the new one, much of the recoding effort goes into adapting the code to those differences.
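
To make those differences concrete, here is a minimal, illustrative C sketch (not code from any particular system) of two classic latent assumptions, byte order and integer width, that compile cleanly on a new processor yet behave differently there:

```c
/* Illustrative sketch: two latent porting assumptions. */
#include <stddef.h>
#include <stdint.h>

/* Byte order: the first byte of a multi-byte integer differs between
   little-endian and big-endian machines. Code that parses raw buffers
   this way works on one architecture and silently breaks on the other. */
int first_byte_of(uint32_t v) {
    unsigned char *p = (unsigned char *)&v;
    return p[0]; /* 0x78 on little-endian, 0x12 on big-endian, for 0x12345678 */
}

/* Bit width: code written assuming sizeof(long) == 4 breaks on an LP64
   platform, where long is 8 bytes. */
size_t long_width(void) {
    return sizeof(long);
}
```

Neither function contains a syntax error or a portability warning a compiler is required to emit; the defect is latent until the hardware changes underneath it.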

Compilers, toolchains, and latent bugs

Beyond the processor itself, there are many less obvious differences, and these subtle nuances can be easy to overlook. Take, for example, the toolchain used to compile the code. On the face of it, this should not make much of a difference. After all, if the code was written to be ANSI C compliant, and if both compilers claim to support ANSI C, then surely the code will have the same semantics when compiled by either compiler? Unfortunately not. The C and C++ standards are riddled with constructs whose behavior is implementation-defined or unspecified, meaning that the standard does not dictate exactly how to compile them; the choice is left to the compiler writer. Some of these, such as the order in which operands are evaluated, are well known to programmers, but others are very subtle. A latent bug might be harmless on the legacy system because its compiler happens to compile the construct one way, but hazardous on the new system because the new compiler makes a different choice.
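
As a small illustration (a sketch, not code from any real system), the following fragment can legitimately produce different results under two conforming compilers, because C leaves the evaluation order of the two operands unspecified:

```c
/* Illustrative sketch: unspecified evaluation order in C. */
static int counter;

/* Helper with a side effect: whichever call runs first returns 1. */
static int next_value(void) {
    return ++counter;
}

/* The standard does not fix which operand of '+' is evaluated first,
   so a conforming compiler may return 12 or 21. Both are "correct";
   code that depends on one of them contains a latent bug. */
int unspecified_order_demo(void) {
    counter = 0;
    return next_value() * 10 + next_value();
}
```

On the legacy toolchain the program may have always seen one ordering; the new toolchain is free to pick the other, and only then does the defect surface.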

Compilers are programs too, of course, and are themselves not free of defects. A recent study found code-generation defects in every C compiler tested[1]. The treatment of the volatile keyword is especially prone to compiler errors. Because volatile is frequently used to read sensor data, it is of vital importance in embedded safety-critical software, and a miscompilation can cause changes to sensor values to be silently ignored. A program's correct functioning might even come to rely on such flaws.
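
The following hedged sketch shows why volatile matters for sensor reads. In a real system the register would sit at a fixed memory-mapped address (something like `(volatile uint32_t *)0x4000A000`); here a plain global named `sensor_data_reg`, a made-up name, stands in so the sketch is runnable:

```c
/* Illustrative sketch: volatile and memory-mapped sensor registers.
   'sensor_data_reg' is a hypothetical stand-in for a hardware register. */
#include <stdint.h>

volatile uint32_t sensor_data_reg = 0;

/* volatile forces a fresh load from the register on every call. Without
   it, an optimizing compiler may cache the first value, so a polling
   loop never observes the hardware update. A compiler that mishandles
   volatile reintroduces exactly that failure, silently. */
uint32_t read_sensor(void) {
    return sensor_data_reg;
}
```

A typical use is a busy-wait loop such as `while ((read_sensor() & 0x1u) == 0) { }`, which only terminates if each iteration really re-reads the hardware.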

Another danger zone: Standard libraries

Another subtle software migration difference that can turn latent defects hazardous involves the standard libraries that interface to the operating system. One might hope that such libraries would behave consistently across platforms, but this is rarely so. The most significant difference is in error handling: a new platform can have completely different failure modes than the legacy platform, and the code might need to change to handle them. Worse still, a recent study found that library error cases are often badly documented[2].
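
One defensive pattern is to test a call's return value rather than a specific error code, since the errno a platform reports for a given root cause can differ between operating systems. The sketch below illustrates this; `open_config` is a hypothetical helper, not an API from the article:

```c
/* Illustrative sketch: portable error handling around a library call.
   'open_config' is a hypothetical helper name. */
#include <errno.h>
#include <stdio.h>

/* Returns 0 on success, -1 on any failure. */
int open_config(const char *path, FILE **out) {
    errno = 0;
    FILE *f = fopen(path, "r");
    if (f == NULL) {
        /* Report whatever errno the platform chose instead of testing
           for a hard-coded value such as ENOENT vs. EACCES, which can
           vary across operating systems for the same underlying cause. */
        fprintf(stderr, "open_config(%s): errno %d\n", path, errno);
        return -1;
    }
    *out = f;
    return 0;
}
```

Code ported from a platform where only one failure mode ever occurred often checks for exactly that mode, and misses the new platform's others.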

Static analysis wins out, complements traditional testing

Clearly, any legacy migration project must include a significant amount of time for testing the software in its new incarnation. However, test results are only as good as the test inputs. If a test case fails to exercise a path on which a bug occurs, that defect might go undetected. Generating new test cases is expensive too. Hence, a sensible tactic to flush out these latent defects is to use advanced static analysis tools as part of the legacy transformation effort. Such tools are capable of finding defects such as those described herein, including those that depend on platform subtleties. They are especially good at finding concurrency defects such as data races that are exceedingly difficult to find using traditional testing methodologies. They are also good at finding instances of code that, while not definitively wrong, is highly correlated with errors or is particularly risky when ported to a different environment.
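
A data race of the kind mentioned above can be sketched in a few lines. This is an illustration using POSIX threads, not code from any real system; two threads increment a shared counter with no lock, which testing may pass for years on a single-core legacy platform and which fails intermittently on a multicore one:

```c
/* Illustrative sketch: a classic data race on a shared counter. */
#include <pthread.h>

static long counter; /* shared and unprotected: this is the race */

static void *worker(void *arg) {
    (void)arg;
    for (int i = 0; i < 100000; i++)
        counter++; /* read-modify-write, not atomic */
    return NULL;
}

/* Runs two racing threads; with the race, the result is typically
   less than the expected 200000 because increments are lost. Guarding
   counter++ with a mutex (or using a C11 atomic) makes it exact. */
long race_demo(void) {
    pthread_t t1, t2;
    counter = 0;
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return counter;
}
```

Because the failure depends on thread scheduling, no fixed set of test inputs reliably exposes it; a static analyzer, by contrast, can flag the unguarded shared access directly from the source.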

References

[1] Finding and Understanding Bugs in C Compilers, Xuejun Yang, Yang Chen, Eric Eide, John Regehr, 2011. www.cs.utah.edu/~regehr/papers/pldi11-preprint.pdf

[2] Expect the Unexpected: Error Code Mismatches Between Documentation and the Real World, Cindy Rubio-González, Ben Liblit, 2010. www.eecs.berkeley.edu/~rubio/includes/paste10.pdf

Paul Anderson is VP of Engineering at GrammaTech and can be contacted at [email protected].
