Military Embedded Systems

Hunting vulns by adding swarms of bugs to source code

July 29, 2016

Sally Cole, Senior Editor, Military Embedded Systems

In a new twist on hunting vulnerabilities within source code, researchers are adding huge swarms of bugs to test the limits of bug-finding tools.

Software that sniffs out potentially dangerous bugs within computer programs tends to be extremely expensive, yet there has been no solid way to determine how many bugs go undetected. In response, a group of researchers from New York University (NYU), MIT Lincoln Laboratory, and Northeastern University developed a radically different approach to the problem: rather than attempting to find and remediate bugs, they intentionally add hundreds of thousands of them to the source code, so that the number of bugs within a program is known exactly.

The group’s approach is called “LAVA,” short for Large-Scale Automated Vulnerability Addition, and tests the limits of bug-finding tools. It does this by inserting known quantities of novel vulnerabilities that are synthetic yet possess many of the same attributes as computer bugs in the wild.

This automated system was able to generate hundreds of thousands of unstudied, highly realistic vulnerabilities that are inexpensive to produce, span the execution lifetime of a program, are embedded within normal control and data flow, and manifest only for a small fraction of inputs, so that ordinary runs of the program are unaffected.
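To make the mechanism concrete, the following is a minimal, hypothetical sketch in C of the style of bug such a system injects; the function names, the stashed global, and the magic constant are illustrative assumptions, not taken from LAVA's actual output. Otherwise-unused input bytes are recorded early in execution, then compared against a trigger value at a later attack point, so only inputs containing the trigger corrupt memory:

    /* Hypothetical LAVA-style injected bug (all names illustrative). */
    #include <stdint.h>
    #include <string.h>

    static uint32_t lava_val;                  /* dead input bytes, stashed early */

    void stash_input(const char *buf) {        /* called while parsing input */
        memcpy(&lava_val, buf, sizeof lava_val);   /* record 4 unused bytes */
    }

    void attack_point(char *dst, const char *src, size_t n) {
        /* Trigger: the copy length is inflated only when the stashed bytes
           equal the magic constant (hex for the ASCII string "lava"),
           turning a benign copy into a buffer overflow. */
        size_t len = n + (lava_val == 0x6c617661u ? lava_val : 0);
        memcpy(dst, src, len);
    }

Because the trigger value comes from the input itself, each injected bug is paired with a specific crashing input that can be kept as ground truth when scoring a bug finder.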

The researchers needed to create novel bugs – and lots of them – to explore the strengths and weaknesses of bug-finding software. Reusing previously identified vulnerabilities would have skewed the results, since current bug finders are already built to catch them. (Figure 1.)

 

Figure 1: Image courtesy of NYU.

Just how big a problem is finding bugs within source code? Big. The group discovered that many popular bug finders detected only two percent of the vulnerabilities created by LAVA.

“Although the two percent detection rate is definitely surprising – generally when I’ve asked people they expected something more in the range of 10 to 20 percent – I think the right way to think about it is not that these tools are very bad, but that finding bugs within code is inherently an extremely difficult problem,” explains Brendan Dolan-Gavitt, an assistant professor of computer science and engineering at NYU’s Tandon School of Engineering, and co-creator of LAVA. “We’ve seen cases where a bug might go undetected within some incredibly widely used software for decades.”

The motivation behind LAVA came from realizing that “people creating bug-finding tools didn’t really have a good way to tell how effective they were,” Dolan-Gavitt says. “Someone would create a new tool and test it out, and if it found a few bugs that other systems hadn’t caught, it was declared a success. This isn’t a very rigorous way of evaluating the effectiveness of a detector. And standardized test suites were both rare and extremely expensive to create.”

Tim Leek, who initially came up with the idea for LAVA at MIT Lincoln Laboratory, was involved in the creation of one of these test suites, and it took about six months to come up with a corpus of just 14 well-annotated and well-understood bugs. “So this really demonstrates that we need a better way of producing these test corpora,” notes Dolan-Gavitt.

How can the U.S. military tap LAVA? “The military often uses commercial and open-source bug-finding software to evaluate new software systems. When choosing one of these bug-finding systems, they need to do due diligence by evaluating their effectiveness, and that’s exactly what LAVA is designed to do,” Dolan-Gavitt says. “Traditional ways of evaluating bug detectors, such as manually adding bugs or using databases of historical bugs, are very costly or too easy to game. I think LAVA could help bring down the costs of doing such evaluations, which will also force vendors to up their game.”

Another area being explored for LAVA is cybersecurity training. “Right now, one of the primary methods of training cybersecurity experts in both the military and industry is through ‘Capture the Flag’ (CTF) competitions,” he adds. “Basically, the competition organizers set up some deliberately weak systems and software, then people compete to see who can break in. But these competitions are a lot of work to put together, so LAVA might be able to help by adding new exploitable bugs for a CTF.”

The group plans to launch its own open competition this summer to give developers and other researchers a chance to request a LAVA-bugged version of a piece of software, attempt to find the bugs, and receive a score based on their accuracy.

It’s important to note that plenty of other bug-finding tools exist that the group hasn’t been able to evaluate yet, both due to lack of time and the high cost of obtaining them. “We hope that when we open things up to a general competition, vendors will want to take part and do their own public evaluations,” says Dolan-Gavitt.

For more information, visit: www.ieee-security.org/TC/SP2016/papers/0824a110.pdf.