Fuzz testing
Encyclopedia
Fuzz testing or fuzzing is a software testing
Software testing
Software testing is an investigation conducted to provide stakeholders with information about the quality of the product or service under test. Software testing can also provide an objective, independent view of the software to allow the business to appreciate and understand the risks of software...

 technique, often automated or semi-automated, that involves providing invalid, unexpected, or random data to the inputs of a computer program
Computer program
A computer program is a sequence of instructions written to perform a specified task with a computer. A computer requires programs to function, typically executing the program's instructions in a central processor. The program has an executable form that the computer can use directly to execute...

. The program is then monitored for exceptions such as crashes
Crash (computing)
A crash in computing is a condition where a computer or a program, either an application or part of the operating system, ceases to function properly, often exiting after encountering errors. Often the offending program may appear to freeze or hang until a crash reporting service documents...

 or failing built-in code assertion
Assertion (computing)
In computer programming, an assertion is a predicate placed in a program to indicate that the developer thinks that the predicate is always true at that place.For example, the following code contains two assertions:...

s. Fuzzing is commonly used to test for security problems in software or computer systems.

The term first originates from a class project at the University of Wisconsin 1988 although similar techniques have been used in the field of quality assurance, where they are referred to as robustness testing
Robustness testing
Robustness testing is any quality assurance methodology focused on testing the robustness of software. Robustness testing has also been used to describe the process of verifying the robustness Robustness testing is any quality assurance methodology focused on testing the robustness of software....

, syntax testing or negative testing.

There are two forms of fuzzing program; mutation-based and generation-based, which can be employed as white-, grey- or black-box testing. File format
File format
A file format is a particular way that information is encoded for storage in a computer file.Since a disk drive, or indeed any computer storage, can store only bits, the computer must have some way of converting information to 0s and 1s and vice-versa. There are different kinds of formats for...

s and network protocols are the most common targets of testing, but any type of program input can be fuzzed. Interesting inputs include environment variable
Environment variable
Environment variables are a set of dynamic named values that can affect the way running processes will behave on a computer.They can be said in some sense to create the operating environment in which a process runs...

s, keyboard and mouse event
Event (computing)
In computing an event is an action that is usually initiated outside the scope of a program and that is handled by a piece of code inside the program. Typically events are handled synchronous with the program flow, that is, the program has one or more dedicated places where events are handled...

s, and sequences of API
Application programming interface
An application programming interface is a source code based specification intended to be used as an interface by software components to communicate with each other...

 calls. Even items not normally considered "input" can be fuzzed, such as the contents of database
Database
A database is an organized collection of data for one or more purposes, usually in digital form. The data are typically organized to model relevant aspects of reality , in a way that supports processes requiring this information...

s, shared memory
Shared memory
In computing, shared memory is memory that may be simultaneously accessed by multiple programs with an intent to provide communication among them or avoid redundant copies. Depending on context, programs may run on a single processor or on multiple separate processors...

, or the precise interleaving of thread
Thread (computer science)
In computer science, a thread of execution is the smallest unit of processing that can be scheduled by an operating system. The implementation of threads and processes differs from one operating system to another, but in most cases, a thread is contained inside a process...

s.

For the purpose of security, input that crosses a trust boundary
Trust boundary
Trust boundary is a term in computer science and security used to describe a boundary where program data or execution changes its level of "trust". The term refers to any distinct boundary within which a system trusts all sub-systems . An example of an execution trust boundary would be where an...

 is often the most interesting. For example, it is more important to fuzz code that handles the upload of a file by any user than it is to fuzz the code that parses a configuration file that is accessible only to a privileged user.

History

The term "fuzz" or "fuzzing" originates from a 1988 class project at the University of Wisconsin, taught by Professor Barton Miller. The assignment was titled "Operating System Utility Program Reliability - The Fuzz Generator". The project developed a basic command-line fuzzer to test the reliability of Unix
Unix
Unix is a multitasking, multi-user computer operating system originally developed in 1969 by a group of AT&T employees at Bell Labs, including Ken Thompson, Dennis Ritchie, Brian Kernighan, Douglas McIlroy, and Joe Ossanna...

 programs by bombarding them with random data until they crashed. The test was repeated in 1995, expanded to include testing of GUI-based tools (X Windows), network protocols, and system library APIs. Follow-on work included testing command- and GUI-based applications on both Windows and MacOS X.

One of the earliest examples of fuzzing dates from before 1983. "The Monkey" was a Macintosh
Macintosh
The Macintosh , or Mac, is a series of several lines of personal computers designed, developed, and marketed by Apple Inc. The first Macintosh was introduced by Apple's then-chairman Steve Jobs on January 24, 1984; it was the first commercially successful personal computer to feature a mouse and a...

 application developed by Steve Capps
Steve Capps
Steve Capps is a computer programmer and engineer who is best known for his work on the Apple Inc. Macintosh computer and Newton OS during the 1980s and 1990s. He started working at the Xerox Corporation while still a computer science student at the Rochester Institute of Technology...

 prior to 1983. It used journaling hooks to feed random events into Mac programs, and was used to test for bugs in MacPaint
MacPaint
MacPaint was a bitmap-based graphics painting software program developed by Apple Computer and released with the original Macintosh personal computer on January 22, 1984. It was sold separately for US$195 with its word processor counterpart, MacWrite. MacPaint was notable because it could generate...

.

Uses

Fuzz testing is often employed as a black-box testing methodology in large software projects where a budget exists to develop test tools. Fuzz testing is one of the techniques that offers a high benefit-to-cost ratio.

The technique can only provide a random sample of the system's behavior, and in many cases passing a fuzz test may only demonstrate that a piece of software can handle exceptions without crashing, rather than behaving correctly. This means fuzz testing is an assurance of overall quality, rather than a bug-finding tool, and not a substitute for exhaustive testing or formal methods
Formal methods
In computer science and software engineering, formal methods are a particular kind of mathematically-based techniques for the specification, development and verification of software and hardware systems...

.

As a gross measurement of reliability, fuzzing can suggest which parts of a program should get special attention, in the form of a code audit
Code audit
A software code audit is a comprehensive analysis of source code in a programming project with the intent of discovering bugs, security breaches or violations of programming conventions. It is an integral part of the defensive programming paradigm, which attempts to reduce errors before the...

, application of static analysis
Static code analysis
Static program analysis is the analysis of computer software that is performed without actually executing programs built from that software In most cases the analysis is performed on some version of the source code and in the other cases some form of the object code...

, or partial rewrite
Rewrite (programming)
A rewrite in computer programming is the act or result of re-implementing a large portion of existing functionality without re-use of its source code. When the rewrite is not using existing code at all, it is common to speak of a rewrite from scratch...

s.

Types of bugs

As well as testing for outright crashes, fuzz testing is used to find bugs such as assertion failures and memory leak
Memory leak
A memory leak, in computer science , occurs when a computer program consumes memory but is unable to release it back to the operating system. In object-oriented programming, a memory leak happens when an object is stored in memory but cannot be accessed by the running code...

s (when coupled with a memory debugger
Memory debugger
A memory debugger is a programming tool for finding memory leaks and buffer overflows. These are due to bugs related to the allocation and deallocation of dynamic memory. Programs written in languages that have garbage collection, such as managed code, might also need memory debuggers, e.g...

). The methodology is useful against large applications, where any bug affecting memory safety
Memory safety
Memory safety is a concern in software development that aims to avoid software bugs that cause security vulnerabilities dealing with random-access memory access, such as buffer overflows and dangling pointers....

 is likely to be a severe vulnerability
Vulnerability (computing)
In computer security, a vulnerability is a weakness which allows an attacker to reduce a system's information assurance.Vulnerability is the intersection of three elements: a system susceptibility or flaw, attacker access to the flaw, and attacker capability to exploit the flaw...

.

Since fuzzing often generates invalid input it is used for testing error-handling routines, which are important for software that does not control its input. Simple fuzzing can be thought of as a way to automate negative test
Negative test
The subject Negative Test can relate to:*A negative medical test, in which the target parameter that was evaluated was not present.*In software testing, a test designed to determine the response of the system outside of what is defined. It is designed to determine if the system doesn't crash with...

ing.

Fuzzing can also find some types of "correctness" bugs. For example, it can be used to find incorrect-serialization
Serialization
In computer science, in the context of data storage and transmission, serialization is the process of converting a data structure or object state into a format that can be stored and "resurrected" later in the same or another computer environment...

 bugs by complaining whenever a program's serializer emits something that the same program's parser rejects. It can also find unintentional differences between two versions of a program or between two implementations of the same specification.

Techniques

Fuzzing programs fall into two different categories. Mutation based fuzzers mutate existing data samples to create test data while generation based fuzzers define new test data based on models of the input.

The simplest form of fuzzing technique is sending a stream of random bits to software, either as command line options, randomly mutated protocol packets, or as events. This technique of random inputs still continues to be a powerful tool to find bugs in command-line applications, network protocols, and GUI-based applications and services. Another common technique that is easy to implement is mutating existing input (e.g. files from a test suite
Test suite
In software development, a test suite, less commonly known as a validation suite, is a collection of test cases that are intended to be used to test a software program to show that it has some specified set of behaviours. A test suite often contains detailed instructions or goals for each...

) by flipping bits at random or moving blocks of the file around. However, the most successful fuzzers have detailed understanding of the format or protocol being tested.

The understanding can be based on a specification. A specification-based fuzzer involves writing the entire array of specifications into the tool, and then using model-based test generation techniques in walking through the specifications and adding anomalies in the data contents, structures, messages, and sequences. This "smart fuzzing" technique is also known as robustness testing, syntax testing, grammar testing, and (input) fault injection.
The protocol awareness can also be created heuristically from examples using a tool such as Sequitur. These fuzzers can generate test case
Test case
A test case in software engineering is a set of conditions or variables under which a tester will determine whether an application or software system is working correctly or not. The mechanism for determining whether a software program or system has passed or failed such a test is known as a test...

s from scratch, or they can mutate examples from test suite
Test suite
In software development, a test suite, less commonly known as a validation suite, is a collection of test cases that are intended to be used to test a software program to show that it has some specified set of behaviours. A test suite often contains detailed instructions or goals for each...

s or real life. They can concentrate on valid or invalid input, with mostly-valid input tending to trigger the "deepest" error cases.

There are two limitations of protocol-based fuzzing based on protocol implementations of published specifications: 1) Testing cannot proceed until the specification is relatively mature, since a specification is a prerequisite for writing such a fuzzer; and 2) Many useful protocols are proprietary, or involve proprietary extensions to published protocols. If fuzzing is based only on published specifications, test coverage for new or proprietary protocols will be limited or nonexistent.

Fuzz testing can be combined with other testing techniques. White-box fuzzing uses symbolic execution
Symbolic execution
In computer science, symbolic execution refers to the analysis of programs by tracking symbolic rather than actual values, a case of abstract interpretation. The field of symbolic simulation applies the same concept to hardware...

 and constraint solving. Evolutionary fuzzing leverages feedback from code coverage
Code coverage
Code coverage is a measure used in software testing. It describes the degree to which the source code of a program has been tested. It is a form of testing that inspects the code directly and is therefore a form of white box testing....

, effectively automating the approach of exploratory testing.

Reproduction and isolation

Test case reduction is the process of extracting minimal test case
Test case
A test case in software engineering is a set of conditions or variables under which a tester will determine whether an application or software system is working correctly or not. The mechanism for determining whether a software program or system has passed or failed such a test is known as a test...

s from an initial test case. Test case reduction may be done manually, or using software tools, and usually involves a divide-and-conquer strategy where parts of the test are removed one by one until only the essential core of the test case remains.

So as to be able to reproduce errors, fuzzing software will often record the input data it produces, usually before applying it to the software. If the computer crashes outright, the test data is preserved. If the fuzz stream is pseudo-random number
Pseudorandomness
A pseudorandom process is a process that appears to be random but is not. Pseudorandom sequences typically exhibit statistical randomness while being generated by an entirely deterministic causal process...

-generated, the seed value can be stored to reproduce the fuzz attempt. Once a bug is found, some fuzzing software will help to build a test case
Test case
A test case in software engineering is a set of conditions or variables under which a tester will determine whether an application or software system is working correctly or not. The mechanism for determining whether a software program or system has passed or failed such a test is known as a test...

, which is used for debugging, using test case reduction tools such as Delta or Lithium.

Advantages and disadvantages

The main problem with fuzzing to find program faults is that it generally only finds very simple faults. The computational complexity of the software testing problem is of exponential order (, ) and every fuzzer takes shortcuts to find something interesting in a timeframe that a human cares about. A primitive fuzzer may have poor code coverage
Code coverage
Code coverage is a measure used in software testing. It describes the degree to which the source code of a program has been tested. It is a form of testing that inspects the code directly and is therefore a form of white box testing....

; for example, if the input includes a checksum
Checksum
A checksum or hash sum is a fixed-size datum computed from an arbitrary block of digital data for the purpose of detecting accidental errors that may have been introduced during its transmission or storage. The integrity of the data can be checked at any later time by recomputing the checksum and...

 which is not properly updated to match other random changes, only the checksum validation code will be verified. Code coverage tools are often used to estimate how "well" a fuzzer works, but these are only guidelines to fuzzer quality. Every fuzzer can be expected to find a different set of bugs.

On the other hand, bugs found using fuzz testing are sometimes severe, exploitable bugs that could be used by a real attacker. This has become more common as fuzz testing has become more widely known, as the same techniques and tools are now used by attackers to exploit deployed software. This is a major advantage over binary or source auditing, or even fuzzing's close cousin, fault injection
Fault injection
In software testing, fault injection is a technique for improving the coverage of a test by introducing faults to test code paths, in particular error handling code paths, that might otherwise rarely be followed. It is often used with stress testing and is widely considered to be an important part...

, which often relies on artificial fault conditions that are difficult or impossible to exploit.

The randomness of inputs used in fuzzing is often seen as a disadvantage, as catching a boundary value condition with random inputs is highly unlikely.

Fuzz testing enhances software security and software safety
Safety engineering
Safety engineering is an applied science strongly related to systems engineering / industrial engineering and the subset System Safety Engineering...

because it often finds odd oversights and defects which human testers would fail to find, and even careful human test designers would fail to create tests for.

Further reading

  • ISBN 978-1-59693-214-2, Fuzzing for Software Security Testing and Quality Assurance, Ari Takanen, Jared D. DeMott, Charles Miller

External links

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK