Concolic testing: Difference between revisions

Content deleted Content added

Inline

Revision as of 05:33, 1 September 2015

Concolic testing (a portmanteau of concrete and symbolic) is a hybrid software verification technique that performs symbolic execution, a classical technique that treats program variables as symbolic variables, along a concrete execution (testing on particular inputs) path. Symbolic execution is used in conjunction with an automated theorem prover or constraint solver based on constraint logic programming to generate new concrete inputs (test cases) with the aim of maximizing code coverage. Its main focus is finding bugs in real-world software, rather than demonstrating program correctness.

A description and discussion of the concept was introduced in DART by Patrice Godefroid, Nils Klurland, and Koushik Sen.^[1] The paper "CUTE: A concolic unit testing engine for C",^[2] by Koushik Sen, Darko Marinov, and Gul Agha, further extended the idea to data structures, and first coined the term concolic testing. Another tool, called EGT (renamed to EXE and later improved and renamed to KLEE), based on similar ideas was independently developed by Cristian Cadar and Dawson Engler in 2005, and published in 2005 and 2006.^[3] PathCrawler^[4] ^[5] first proposed to perform symbolic execution along a concrete execution path, but unlike concolic testing PathCrawler does not simplify complex symbolic constraints using concrete values. These tools (DART and CUTE, EXE) applied concolic testing to unit testing of C programs and concolic testing was originally conceived as a white box improvement upon established random testing methodologies. The technique was later generalized to testing multithreaded Java programs with jCUTE,^[6] and unit testing programs from their executable codes (tool OSMOSE).^[7] It was also combined with fuzz testing and extended to detect exploitable security issues in large-scale x86 binaries by Microsoft Research's SAGE.^[8]^[9]

The concolic approach is also applicable to model checking. In a concolic model checker, the model checker traverses states of the model representing the software being checked, while storing both a concrete state and a symbolic state. The symbolic state is used for checking properties on the software, while the concrete state is used to avoid reaching unreachable state. One such tool is ExpliSAT by Sharon Barner, Cindy Eisner, Ziv Glazberg, Daniel Kroening and Ishai Rabinovitz^[10]

Birth of concolic testing

Implementation of traditional symbolic execution based testing requires the implementation of a full-fledged symbolic interpreter for a programming language. Concolic testing implementor noticed that implementation of a full-fledged symbolic execution can be avoided if symbolic execution can be piggy-backed with the normal execution of a program through instrumentation. This idea of simplifying implementation of symbolic execution gave birth to concolic testing.

Development of SMT Solvers

An important reason for the rise of concolic testing (and more generally, symbolic-execution based analysis of programs) in the decade since it was introduced in 2005 is the dramatic improvement in the efficiency and expressive power of SMT Solvers. The key technical developments that lead to the rapid development of SMT solvers include combination of theories, lazy solving, DPLL(T) and the huge improvements in the speed of SAT solvers. SMT solvers that are particularly tuned for concolic testing include Z3, STP, Z3str2, and Boolector.

Example

Consider the following simple example, written in C:

void f(int x, int y) {
    int z = 2*y;
    if (x == 100000) {
        if (x < z) {
            assert(0); /* error */
        }
    }
}

Execution path tree for this example. Three tests are generated corresponding to the three leaf nodes in the tree, and three execution paths in the program.

Simple random testing, trying random values of x and y, would require an impractically large number of tests to reproduce the failure.

We begin with an arbitrary choice for x and y, for example x = y = 1. In the concrete execution, line 2 sets z to 2, and the test in line 3 fails since 1 ≠ 100000. Concurrently, the symbolic execution follows the same path but treats x and y as symbolic variables. It sets z to the expression 2y and notes that, because the test in line 3 failed, x ≠ 100000. This inequality is called a path condition and must be true for all executions following the same execution path as the current one.

Since we'd like the program to follow a different execution path on the next run, we take the last path condition encountered, x ≠ 100000, and negate it, giving x = 100000. An automated theorem prover is then invoked to find values for the input variables x and y given the complete set of symbolic variable values and path conditions constructed during symbolic execution. In this case, a valid response from the theorem prover might be x = 100000, y = 0.

Running the program on this input allows it to reach the inner branch on line 4, which is not taken since 100000 (x) is not less than 0 (z = 2y). The path conditions are x = 100000 and x ≥ z. The latter is negated, giving x < z. The theorem prover then looks for x, y satisfying x = 100000, x < z, and z = 2y; for example, x = 100000, y = 50001. This input reaches the error.

Algorithm

Essentially, a concolic testing algorithm operates as follows:

Classify a particular set of variables as input variables. These variables will be treated as symbolic variables during symbolic execution. All other variables will be treated as concrete values.
Instrument the program so that each operation which may affect a symbolic variable value or a path condition is logged to a trace file, as well as any error that occurs.
Choose an arbitrary input to begin with.
Execute the program.
Symbolically re-execute the program on the trace, generating a set of symbolic constraints (including path conditions).
Negate the last path condition not already negated in order to visit a new execution path. If there is no such path condition, the algorithm terminates.
Invoke an automated theorem prover to generate a new input. If there is no input satisfying the constraints, return to step 6 to try the next execution path.
Return to step 4.

There are a few complications to the above procedure:

The algorithm performs a depth-first search over an implicit tree of possible execution paths. In practice programs may have very large or infinite path trees — a common example is testing data structures that have an unbounded size or length. To prevent spending too much time on one small area of the program, the search may be depth-limited (bounded).
Symbolic execution and automated theorem provers have limitations on the classes of constraints they can represent and solve. For example, a theorem prover based on linear arithmetic will be unable to cope with the nonlinear path condition xy = 6. Any time that such constraints arise, the symbolic execution may substitute the current concrete value of one of the variables to simplify the problem. An important part of the design of a concolic testing system is selecting a symbolic representation precise enough to represent the constraints of interest.

Commercial success

Symbolic-execution based analysis and testing, in general, has witnessed a dramatic level of interest from the industry. Perhaps the most famous commercial tool that uses dynamic symbolic execution (aka concolic testing) is the SAGE tool from Microsoft. The KLEE and S2E tools (both of which are open-source tools, and use the STP constraint solver) are widely used in many companies including HP Fortify, NVIDIA, and IBM. Increasingly these technologies are being used by many security companies and hackers alike to find security vulnerabilities.

Limitations

Concolic testing has a number of limitations:

If the program exhibits nondeterministic behavior, it may follow a different path than the intended one. This can lead to nontermination of the search and poor coverage.
Even in a deterministic program, a number of factors may lead to poor coverage, including imprecise symbolic representations, incomplete theorem proving, and failure to search the most fruitful portion of a large or infinite path tree.
Programs which thoroughly mix the state of their variables, such as cryptographic primitives, generate very large symbolic representations that cannot be solved in practice. For example, the condition "if (md5_hash(input) == 0xdeadbeef)" requires the theorem prover to invert MD5, which is an open problem.

Tools

pathcrawler-online.com is a restricted version of the current PathCrawler tool which is publicly available as an online test-case server for evaluation and education purposes.
CUTE and jCUTE are available as binaries under a research-use only license by Urbana-Champaign for C and for Java.
CREST is an open-source solution for C comparable to CUTE (modified BSD license).
KLEE is an open source solution built on-top of the LLVM infrastructure (UIUC license).
CATG is an open-source solution for Java (BSD license).
Jalangi is an open-source concolic testing and symbolic execution tool for JavaScript. Jalangi supports integers and strings.
Microsoft Pex, developed at Microsoft Rise, is publicly available as a Microsoft Visual Studio 2010 Power Tool for the NET Framework.
Triton is an open-source Pin-based concolic execution framework for x86-64 binaries.
CutEr is an open-source concolic testing tool for the Erlang functional programming language.

Many tools, notably DART and SAGE, have not been made available to the public at large. Note however that for instance SAGE is "used daily" for internal security testing at Microsoft.^[11]

References

^ Patrice Godefroid; Nils Klarlund; Koushik Sen (2005). "DART: Directed Automated Random Testing" (PDF). Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation. New York, NY: ACM. pp. 213–223. ISSN 0362-1340. Retrieved 2009-11-09. {{cite conference}}: Unknown parameter |booktitle= ignored (|book-title= suggested) (help)
^ Koushik Sen; Darko Marinov; Gul Agha (2005). "CUTE: a concolic unit testing engine for C" (PDF). Proceedings of the 10th European software engineering conference held jointly with 13th ACM SIGSOFT international symposium on Foundations of software engineering. New York, NY: ACM. pp. 263–272. ISBN 1-59593-014-0. Retrieved 2009-11-09. {{cite conference}}: Unknown parameter |booktitle= ignored (|book-title= suggested) (help)
^ Cristian Cadar; Vijay Ganesh; Peter Pawloski; David L. Dill; Dawson Engler (2006). "EXE: Automatically Generating Inputs of Death" (PDF). Proceedings of the 13th International Conference on Computer and Communications Security (CCS 2006). Alexandria, VA, USA: ACM. {{cite conference}}: Unknown parameter |booktitle= ignored (|book-title= suggested) (help)
^ Nicky Williams; Bruno Marre; Patricia Mouy (2004). "On-the-Fly Generation of K-Path Tests for C Functions". Proceedings of the 19th IEEE International Conference on Automated Software Engineering (ASE 2004), 20–25 September 2004, Linz, Austria. IEEE Computer Society. pp. 290–293. ISBN 0-7695-2131-2. {{cite conference}}: Unknown parameter |booktitle= ignored (|book-title= suggested) (help)
^ Nicky Williams; Bruno Marre; Patricia Mouy; Muriel Roger (2005). "PathCrawler: Automatic Generation of Path Tests by Combining Static and Dynamic Analysis". Dependable Computing - EDCC-5, 5th European Dependable Computing Conference, Budapest, Hungary, April 20–22, 2005, Proceedings. Springer. pp. 281–292. ISBN 3-540-25723-3. {{cite conference}}: Unknown parameter |booktitle= ignored (|book-title= suggested) (help)
^ Koushik Sen; Gul Agha (August 2006). "CUTE and jCUTE : Concolic Unit Testing and Explicit Path Model-Checking Tools". Computer Aided Verification: 18th International Conference, CAV 2006, Seattle, WA, USA, August 17–20, 2006, Proceedings. Springer. pp. 419–423. ISBN 978-3-540-37406-0. Retrieved 2009-11-09. {{cite conference}}: Unknown parameter |booktitle= ignored (|book-title= suggested) (help)
^ Sébastien Bardin; Philippe Herrmann (April 2008). "Structural Testing of Executables" (PDF). Proceedings of the 1st IEEE International Conference on Software Testing, Verification, and Validation (ICST 2008), Lillehammer, Norway. IEEE Computer Society. pp. 22–31. ISBN 978-0-7695-3127-4. {{cite conference}}: Unknown parameter |booktitle= ignored (|book-title= suggested) (help),
^ Patrice Godefroid; Michael Y. Levin; David Molnar (2007). Automated Whitebox Fuzz Testing (PDF) (Technical report). Microsoft Research. TR-2007-58.
^ Patrice Godefroid (2007). "Random testing for security: blackbox vs. whitebox fuzzing" (PDF). Proceedings of the 2nd international workshop on Random testing: co-located with the 22nd IEEE/ACM International Conference on Automated Software Engineering (ASE 2007). New York, NY: ACM. p. 1. ISBN 978-1-59593-881-7. Retrieved 2009-11-09. {{cite conference}}: Unknown parameter |booktitle= ignored (|book-title= suggested) (help)
^ Sharon Barner, Cindy Eisner, Ziv Glazberg, Daniel Kroening, Ishai Rabinovitz: ExpliSAT: Guiding SAT-Based Software Verification with Explicit States. Haifa Verification Conference 2006: 138-154
^ SAGE team (2009). "Microsoft PowerPoint - SAGE-in-one-slide" (PDF). Microsoft Research. Retrieved 2009-11-10.

[1] Patrice Godefroid; Nils Klarlund; Koushik Sen (2005). "DART: Directed Automated Random Testing" (PDF). Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation. New York, NY: ACM. pp. 213–223. ISSN 0362-1340. Retrieved 2009-11-09. {{cite conference}}: Unknown parameter |booktitle= ignored (|book-title= suggested) (help)

[cute-2] Koushik Sen; Darko Marinov; Gul Agha (2005). "CUTE: a concolic unit testing engine for C" (PDF). Proceedings of the 10th European software engineering conference held jointly with 13th ACM SIGSOFT international symposium on Foundations of software engineering. New York, NY: ACM. pp. 263–272. ISBN 1-59593-014-0. Retrieved 2009-11-09. {{cite conference}}: Unknown parameter |booktitle= ignored (|book-title= suggested) (help)

[3] Cristian Cadar; Vijay Ganesh; Peter Pawloski; David L. Dill; Dawson Engler (2006). "EXE: Automatically Generating Inputs of Death" (PDF). Proceedings of the 13th International Conference on Computer and Communications Security (CCS 2006). Alexandria, VA, USA: ACM. {{cite conference}}: Unknown parameter |booktitle= ignored (|book-title= suggested) (help)

[4] Nicky Williams; Bruno Marre; Patricia Mouy (2004). "On-the-Fly Generation of K-Path Tests for C Functions". Proceedings of the 19th IEEE International Conference on Automated Software Engineering (ASE 2004), 20–25 September 2004, Linz, Austria. IEEE Computer Society. pp. 290–293. ISBN 0-7695-2131-2. {{cite conference}}: Unknown parameter |booktitle= ignored (|book-title= suggested) (help)

[5] Nicky Williams; Bruno Marre; Patricia Mouy; Muriel Roger (2005). "PathCrawler: Automatic Generation of Path Tests by Combining Static and Dynamic Analysis". Dependable Computing - EDCC-5, 5th European Dependable Computing Conference, Budapest, Hungary, April 20–22, 2005, Proceedings. Springer. pp. 281–292. ISBN 3-540-25723-3. {{cite conference}}: Unknown parameter |booktitle= ignored (|book-title= suggested) (help)

[6] Koushik Sen; Gul Agha (August 2006). "CUTE and jCUTE : Concolic Unit Testing and Explicit Path Model-Checking Tools". Computer Aided Verification: 18th International Conference, CAV 2006, Seattle, WA, USA, August 17–20, 2006, Proceedings. Springer. pp. 419–423. ISBN 978-3-540-37406-0. Retrieved 2009-11-09. {{cite conference}}: Unknown parameter |booktitle= ignored (|book-title= suggested) (help)

[7] Sébastien Bardin; Philippe Herrmann (April 2008). "Structural Testing of Executables" (PDF). Proceedings of the 1st IEEE International Conference on Software Testing, Verification, and Validation (ICST 2008), Lillehammer, Norway. IEEE Computer Society. pp. 22–31. ISBN 978-0-7695-3127-4. {{cite conference}}: Unknown parameter |booktitle= ignored (|book-title= suggested) (help),

[8] Patrice Godefroid; Michael Y. Levin; David Molnar (2007). Automated Whitebox Fuzz Testing (PDF) (Technical report). Microsoft Research. TR-2007-58.

[9] Patrice Godefroid (2007). "Random testing for security: blackbox vs. whitebox fuzzing" (PDF). Proceedings of the 2nd international workshop on Random testing: co-located with the 22nd IEEE/ACM International Conference on Automated Software Engineering (ASE 2007). New York, NY: ACM. p. 1. ISBN 978-1-59593-881-7. Retrieved 2009-11-09. {{cite conference}}: Unknown parameter |booktitle= ignored (|book-title= suggested) (help)

[10] Sharon Barner, Cindy Eisner, Ziv Glazberg, Daniel Kroening, Ishai Rabinovitz: ExpliSAT: Guiding SAT-Based Software Verification with Explicit States. Haifa Verification Conference 2006: 138-154

[11] SAGE team (2009). "Microsoft PowerPoint - SAGE-in-one-slide" (PDF). Microsoft Research. Retrieved 2009-11-10.

[1]

[2]

[3]

[4]

[5]

[6]

[7]

[8]

[9]

[10]

[11]

@@ Line 2: / Line 2: @@
 A description and discussion of the concept was introduced in DART by Patrice Godefroid, Nils Klurland, and Koushik Sen.<ref>{{cite conference
-| first = Patrice
+| author1 = Patrice Godefroid
+| author2 = Nils Klarlund
-| last = Godefroid
-|author2=Nils Klarlund |author3=Koushik Sen
+| author3 = Koushik Sen
 | title = DART: Directed Automated Random Testing
 | booktitle = Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation
@@ Line 13: / Line 13: @@
 | url = http://cm.bell-labs.com/who/god/public_psfiles/pldi2005.pdf
 | accessdate = 2009-11-09
-|issn=0362-1340
+| issn=0362-1340
 }}
 </ref> The paper "CUTE: A concolic unit testing engine for C",<ref name="cute">{{cite conference
-| first = Koushik
+| author1 = Koushik Sen
+| author2 = Darko Marinov
-| last = Sen
+| author3 = Gul Agha
-| authorlink = Koushik Sen (computer scientist)
-|author2=Darko Marinov |author3=Gul Agha
 | title = CUTE: a concolic unit testing engine for C
 | booktitle = Proceedings of the 10th European software engineering conference held jointly with 13th ACM SIGSOFT international symposium on Foundations of software engineering
@@ Line 30: / Line 29: @@
 | isbn = 1-59593-014-0
 }}</ref> by Koushik Sen, Darko Marinov, and Gul Agha, further extended the idea to data structures, and first coined the term ''concolic testing''.  Another tool, called EGT (renamed to EXE and later improved and renamed to [http://klee.github.io/ KLEE]), based on similar ideas was independently developed by Cristian Cadar and Dawson Engler in 2005, and published in 2005 and 2006.<ref>{{cite conference
+| author1 = Cristian Cadar
-| first = Engler
-| last = Dawson
+| author2 = Vijay Ganesh
+| author3 = Peter Pawloski
-| authorlink = Dawson Engler (computer scientist) |author2=Cristian Cadar |author3=Vijay Ganesh |author4=Peter Pawloski |author5=David L. Dill |author6=Dawson Engler
+| author4 = David L. Dill
+| author5 = Dawson Engler
 | title = EXE: Automatically Generating Inputs of Death
 | booktitle = Proceedings of the 13th International Conference on Computer and Communications Security (CCS 2006)
@@ Line 40: / Line 41: @@
 | url = http://www.stanford.edu/~engler/exe-ccs-06.pdf
 }}</ref> [http://pathcrawler-online.com PathCrawler]<ref>{{cite conference
-| first = Nicky
+| author1 = Nicky Williams
-| last = Williams
+| author2 = Bruno Marre
-|author2=Bruno Marre |author3=Patricia Mouy
+| author3 = Patricia Mouy
- | title = On-the-Fly Generation of K-Path Tests for C Functions
+| title = On-the-Fly Generation of K-Path Tests for C Functions
 | booktitle = Proceedings of the 19th IEEE International Conference on Automated Software Engineering (ASE 2004), 20–25 September 2004, Linz, Austria
 | pages = 290&ndash;293
@@ Line 50: / Line 51: @@
 | isbn = 0-7695-2131-2
 }}</ref> <ref>{{cite conference
-| first = Nicky
+| author1 = Nicky Williams
-| last = Williams
+| author2 = Bruno Marre
-|author2=Bruno Marre |author3=Patricia Mouy |author4=Muriel Roger
+| author3 = Patricia Mouy
+| author4 = Muriel Roger
- | title = PathCrawler: Automatic Generation of Path Tests by Combining Static and Dynamic Analysis
+| title = PathCrawler: Automatic Generation of Path Tests by Combining Static and Dynamic Analysis
 | booktitle = Dependable Computing - EDCC-5, 5th European Dependable Computing Conference, Budapest, Hungary, April 20–22, 2005, Proceedings
 | pages = 281&ndash;292
@@ Line 61: / Line 63: @@
 }}</ref> first proposed to perform symbolic execution along a concrete execution path, but unlike concolic testing PathCrawler does not simplify complex symbolic constraints using concrete values.  These tools (DART and CUTE, EXE) applied concolic testing to unit testing of [[C programming language|C]] programs and concolic testing was originally conceived as a [[white box (software engineering)|white box]] improvement upon established [[random testing]] methodologies. The technique was later generalized to testing multithreaded [[Java programming language|Java]] programs with jCUTE,<ref>
 {{cite conference
-| first = Koushik
+| author1 = Koushik Sen
-| last = Sen
+| author2 =Gul Agha
-| authorlink = Koushik Sen (computer scientist)
-|author2=Gul Agha
 | title = CUTE and jCUTE : Concolic Unit Testing and Explicit Path Model-Checking Tools
 | booktitle = Computer Aided Verification: 18th International Conference, CAV 2006, Seattle, WA, USA, August 17–20, 2006, Proceedings
@@ Line 76: / Line 76: @@
 </ref> and unit testing programs from their executable codes (tool OSMOSE).<ref>
 {{cite conference
-| first = Sébastien
+| author1 = Sébastien Bardin
+| author2 = Philippe Herrmann
-| last = Bardin
-|author2=Philippe Herrmann
 | title = Structural Testing of Executables
-| booktitle = Proceedings of the 1st IEEE International Conference on Software Testing, Verification, and
+| booktitle = Proceedings of the 1st IEEE International Conference on Software Testing, Verification, and Validation (ICST 2008), Lillehammer, Norway.
-Validation (ICST 2008), Lillehammer, Norway.
 | pages = 22&ndash;31
 | publisher = IEEE Computer Society
@@ Line 89: / Line 87: @@
 }},
 </ref> It was also combined with [[fuzz testing]] and extended to detect exploitable security issues in large-scale [[x86]] binaries by [[Microsoft Research]]'s SAGE.<ref>{{cite techreport
-| first=Patrice
+| author1 = Patrice Godefroid
+| author2 = Michael Y. Levin
-| last=Godefroid
-|author2=Michael Y. Levin |author3=David Molnar
+| author3 = David Molnar
- | title=Automated Whitebox Fuzz Testing
+| title = Automated Whitebox Fuzz Testing
-| number=TR-2007-58
+| number = TR-2007-58
-| institution=Microsoft Research
+| institution = Microsoft Research
-| year=2007
+| year = 2007
-| url=http://research.microsoft.com/en-us/projects/atg/ndss2008.pdf
+| url = http://research.microsoft.com/en-us/projects/atg/ndss2008.pdf
 }}
 </ref><ref>
 {{cite conference
-| first = Patrice
+| author1 = Patrice Godefroid
-| last = Godefroid
-| authorlink = Patrice Godefroid
 | title = Random testing for security: blackbox vs. whitebox fuzzing
 | booktitle = Proceedings of the 2nd international workshop on Random testing: co-located with the 22nd IEEE/ACM International Conference on Automated Software Engineering (ASE 2007)
@@ Line 188: / Line 184: @@
 * [https://github.com/aggelgian/cuter/ CutEr] is an open-source concolic testing tool for the Erlang functional programming language.
-Many tools, notably DART and SAGE, have not been made available to the public at large. Note however that for instance SAGE is "used daily" for internal security testing at Microsoft.<ref>{{cite web |url=http://research.microsoft.com/en-us/um/people/pg/public_psfiles/sage-in-one-slide.pdf |title=Microsoft PowerPoint - SAGE-in-one-slide |author=SAGE team |year=2009 |publisher=Microsoft Research |accessdate=2009-11-10 }}</ref>
+Many tools, notably DART and SAGE, have not been made available to the public at large. Note however that for instance SAGE is "used daily" for internal security testing at Microsoft.<ref>{{cite web
+| url = http://research.microsoft.com/en-us/um/people/pg/public_psfiles/sage-in-one-slide.pdf
+| title = Microsoft PowerPoint - SAGE-in-one-slide
+| author = SAGE team
+| year = 2009
+| publisher = Microsoft Research
+| accessdate = 2009-11-10
+}}</ref>
 == References ==