Defensive programming
Defensive programming is a form of defensive design intended to ensure the continuing function of a piece of software in spite of unforeseeable usage of said software. The idea can be viewed as reducing or eliminating the prospect of Murphy's Law having effect. Defensive programming techniques are used especially when a piece of software could be misused mischievously or inadvertently to catastrophic effect.
Defensive programming is an approach to improve software and source code, in terms of:
- General quality - Reducing the number of software bugs and problems.
- Making the source code comprehensible - the source code should be readable and understandable so it is approved in a code audit.
- Making the software behave in a predictable manner despite unexpected inputs or user actions.
Secure programming
Defensive programming is sometimes referred to as secure programming by computer scientists who state this approach minimizes bugs. Software bugs can potentially be used by a cracker for a code injection, denial-of-service attack or other attack.
A difference between defensive programming and normal practices is that few assumptions are made by the programmer, who attempts to handle all possible error states. In short, the programmer never assumes a particular function call or library will work as advertised, and so handles it in the code. An example follows:
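A minimal sketch in C of the kind of function meant here, assuming a fixed buffer of 1000 characters as in the discussion that follows. The name risky_copy and its return value are hypothetical, chosen only for illustration. The function guards against a null pointer but silently assumes that the input always fits.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical example: copies `input` into a fixed-size local buffer and
 * returns the length of the copy. Nothing checks that `input` fits, so any
 * input longer than 1000 characters writes past the end of `str`
 * (undefined behavior, typically a crash or an exploitable overflow). */
size_t risky_copy(const char *input)
{
    char str[1000 + 1];    /* 1000 characters plus the terminating NUL */

    assert(input != NULL); /* the only assumption that is checked */
    strcpy(str, input);    /* unbounded copy: this is the bug */
    return strlen(str);
}
```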
A function that copies its input into a fixed 1000-character buffer without checking the length has undefined behavior, and will typically crash, when the input is over 1000 characters. Many mainstream programmers may not feel that this is a problem, supposing that no user will enter such a long input. A programmer practicing defensive programming would not allow the bug, because if the application contains a known bug, Murphy's Law dictates that the bug will occur in use. This particular bug demonstrates a vulnerability which enables buffer overflow exploits. Here is a solution to this example:
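A minimal sketch in C of one possible fix, assuming the same fixed buffer of 1000 characters. The name safer_copy is hypothetical. The copy is bounded by the size of the destination, and the buffer is terminated explicitly because strncpy does not write a terminating NUL when it truncates.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical example: copies at most 1000 characters of `input` into a
 * fixed-size local buffer and returns the length of the (possibly
 * truncated) copy. The copy can never write past the end of `str`. */
size_t safer_copy(const char *input)
{
    char str[1000 + 1];                   /* 1000 characters plus NUL */

    assert(input != NULL);                /* precondition */
    strncpy(str, input, sizeof(str) - 1); /* bounded by the destination size */
    str[sizeof(str) - 1] = '\0';          /* strncpy may leave str unterminated */
    return strlen(str);
}
```

Truncation keeps the program running but silently changes the data. Depending on the application, rejecting over-long input with an error may be the safer design.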
Reduce source code complexity
Never make code more complex than necessary. Complexity breeds bugs, including security problems. This goal can conflict with the goal of writing programs that can recover from any error and handle any user input. Handling all unexpected occurrences in a program requires the programmer to add extra code, which may also contain bugs.
Source code reviews
A source code review is where someone other than the original author performs a code audit. A do-it-yourself security audit is insufficient: the review must be made by a non-author, just as when writing a book, it must be proofread by someone other than the author.
Simply making the code available for others to read (see Free software or Open Source Definition) is insufficient: there is no guarantee that the code will ever be looked at, let alone that it will be rigorously reviewed.
Software testing
Software testing should be performed to make sure the program handles unexpected input appropriately.
Testing tools can capture the keystrokes associated with normal operations. The captured keystroke strings can then be copied and edited to try out many permutations and combinations of input, and extended for later tests after any modification. Proponents of key logging state that programmers who use this method should make sure that the people whose keystrokes are being captured are aware of this, and for what purpose, to avoid accusations of privacy violation.
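Testing for unexpected input can be as simple as a table of hostile values. The following is a minimal sketch in C, not taken from any particular tool: parse_port is a hypothetical function under test, and the accepted range of 1 to 65535 is an assumption made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical function under test: parses a decimal port number.
 * Returns the port (1..65535), or -1 for anything else. */
int parse_port(const char *text)
{
    char *end = NULL;
    long value;

    if (text == NULL || text[0] < '0' || text[0] > '9') {
        return -1;             /* missing, empty, signed or padded input */
    }
    errno = 0;
    value = strtol(text, &end, 10);
    if (errno == ERANGE || *end != '\0') {
        return -1;             /* overflow or trailing characters */
    }
    if (value < 1 || value > 65535) {
        return -1;             /* not a usable port number */
    }
    return (int)value;
}

/* Feeds the parser inputs that a well-behaved user would never type. */
void test_parse_port_unexpected_input(void)
{
    assert(parse_port("8080") == 8080);
    assert(parse_port(NULL) == -1);
    assert(parse_port("") == -1);
    assert(parse_port("80abc") == -1);
    assert(parse_port("-1") == -1);
    assert(parse_port("0") == -1);
    assert(parse_port("65536") == -1);
    assert(parse_port("99999999999999999999999999") == -1);
}
```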
Intelligent source code reuse
If existing code is tested and known to work, reusing it may reduce the chance of bugs being introduced. However, reusing code is not always a good practice, particularly when business logic is involved. Reuse in this case may cause serious business process bugs.
Legacy problems
Before reusing old source code, libraries, APIs, configurations and so forth, it must be considered if the old work is valid for reuse, or if it is likely to be prone to legacy problems.
Legacy problems are problems inherent when old designs are expected to work with today's requirements, especially when the old designs were not developed or tested with those requirements in mind.
Many software products have experienced problems with old legacy source code, for example:
- Legacy code may not have been designed under a defensive programming initiative, and might therefore be of much lower quality than newly designed source code.
- Legacy code may have been written and tested under conditions which no longer apply. The old quality assurance tests may have no validity any more.
  - Example 1: legacy code may have been designed for ASCII input but now the input is UTF-8.
  - Example 2: legacy code may have been compiled and tested on 32-bit architectures, but when compiled on 64-bit architectures new arithmetic problems may occur (e.g. invalid signedness tests, invalid type casts, etc.).
  - Example 3: legacy code may have been targeted for offline machines, but becomes vulnerable once network connectivity is added.
- Legacy code is not written with new problems in mind. For example, source code written about 1990 is likely to be prone to many code injection vulnerabilities, because most such problems were not widely understood at that time.
Notable examples of the legacy problem:
- BIND 9, presented by Paul Vixie and David Conrad as "BINDv9 is a complete rewrite", "Security was a key consideration in design", naming security, robustness, scalability and new protocols as key concerns for rewriting old legacy code.
- Microsoft Windows suffered from "the" Windows Metafile vulnerability and other exploits related to the WMF format. Microsoft Security Response Center describes the WMF features as "Around 1990, WMF support was added... This was a different time in the security landscape... were all completely trusted", not being developed under the security initiatives at Microsoft.
- Oracle is combating legacy problems, such as old source code written without addressing concerns of SQL injection and privilege escalation, resulting in many security vulnerabilities which have taken time to fix and have also generated incomplete fixes. This has given rise to heavy criticism from security experts such as David Litchfield, Alexander Kornbrust and Cesar Cerrudo. An additional criticism is that default installations (largely a legacy from old versions) are not aligned with Oracle's own security recommendations, such as the Oracle Database Security Checklist, which is hard to amend as many applications require the less secure legacy settings to function correctly.
Canonicalization
Crackers are likely to invent new kinds of representations of incorrect data. For example, if you checked if a requested file is not "/etc/passwd", a cracker might pass another variant of this file name, like "/etc/./passwd".
To avoid bugs due to non-canonical input, employ canonicalization APIs.
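On POSIX systems, realpath is one such canonicalization API. The following is a minimal sketch in C, assuming a POSIX.1-2008 environment where realpath allocates the result when its second argument is NULL. The helper name same_canonical_path is hypothetical.

```c
#define _XOPEN_SOURCE 700   /* expose realpath() on POSIX systems */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: compares two file names by their canonical form
 * instead of by their spelling. Returns 1 if both resolve to the same
 * path, 0 if they resolve to different paths, and -1 if either cannot be
 * resolved. A defensive caller treats -1 as "deny". */
int same_canonical_path(const char *requested, const char *reference)
{
    char *a;
    char *b;
    int result = -1;

    assert(requested != NULL && reference != NULL);  /* preconditions */
    a = realpath(requested, NULL);  /* resolves ".", ".." and symlinks */
    b = realpath(reference, NULL);
    if (a != NULL && b != NULL) {
        result = (strcmp(a, b) == 0) ? 1 : 0;
    }
    free(a);                        /* free(NULL) is a no-op */
    free(b);
    return result;
}
```

With such a helper, the check is made against the canonical name, so a variant spelling such as "/etc/./passwd" is compared as the file it actually names. Note that realpath only resolves paths that exist.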
Low tolerance against "potential" bugs
Assume that code constructs that appear to be problem prone (similar to known vulnerabilities, etc.) are bugs and potential security flaws. The basic rule of thumb is: "I'm not aware of all types of security exploits. I must protect against those I do know of and then I must be proactive!".
Other techniques
- One of the most common problems is unchecked use of constant-size structures and functions for dynamic-size data (the buffer overflow problem). This is especially common for string data in C. C library functions like gets should never be used since the maximum size of the input buffer is not passed as an argument (gets was removed from the language in the C11 standard). C library functions like scanf can be used safely, but require the programmer to take care in selecting safe format strings, for example by giving every string conversion an explicit maximum field width.
- Encrypt/authenticate all important data transmitted over networks. Do not attempt to implement your own encryption scheme, but use a proven one instead.
- All data is important until proven otherwise.
- All data is tainted until proven otherwise.
- All code is insecure until proven otherwise.
- You cannot prove the security of any code in userland, or, more canonically: "never trust the client".
- If data is to be checked for correctness, verify that it is correct, not that it is incorrect.
- Design by Contract
  - Design by contract uses preconditions, postconditions and invariants to ensure that provided data (and the state of the program as a whole) is sanitized. This allows code to document its assumptions and make them safely. This may involve checking arguments to a function or method for validity before executing the body of the function. After the body of a function, doing a check of object state (in object-oriented programming languages) or other held data and the return value before exits (break/return/throw/error code) is also wise.
- Assertions
  - Within functions, you may want to check that you are not referencing something that is not valid (e.g., null) and that array lengths are valid before referencing elements, especially on all temporary/local instantiations. A good heuristic is to not trust the libraries you did not write either. So any time you call them, check what you get back from them. It often helps to create a small library of "asserting" and "checking" functions to do this, along with a logger, so you can trace your path and reduce the need for extensive debugging cycles in the first place. With the advent of logging libraries and aspect-oriented programming, many of the tedious aspects of defensive programming are mitigated.
- Prefer exceptions to return codes
  - Generally speaking, it is preferable to throw intelligible exception messages that enforce part of your API contract and guide the client programmer, rather than returning values that a client programmer is likely to be unprepared for. This minimizes complaints and increases the robustness and security of your software.
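Several of the techniques listed above can be combined in one small routine. The following is a minimal sketch in C: it reads a line with fgets instead of gets, checks what the library returns, states its preconditions and a postcondition as assertions, and validates the data against what is allowed. The function names and the username rule (1 to 32 ASCII letters, digits or underscores) are hypothetical assumptions made for the example.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Allowlist check: verifies that the data is correct instead of hunting
 * for known bad input. Accepts 1..32 ASCII letters, digits or '_'. */
int is_valid_username(const char *name)
{
    size_t len, i;

    assert(name != NULL);                        /* precondition */
    len = strlen(name);
    if (len < 1 || len > 32) {
        return 0;
    }
    for (i = 0; i < len; i++) {
        char c = name[i];
        int ok = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
                 (c >= '0' && c <= '9') || c == '_';
        if (!ok) {
            return 0;
        }
    }
    return 1;
}

/* Reads one line from `in` into `buf` (capacity `size`) with fgets, which,
 * unlike gets, is told how large the buffer is. Returns 0 on success and
 * -1 on end of file, read error, or a line that may not fit in `buf`. */
int read_line(FILE *in, char *buf, size_t size)
{
    size_t len;
    int c;

    assert(in != NULL && buf != NULL);           /* preconditions */
    assert(size >= 2 && size <= INT_MAX);
    if (fgets(buf, (int)size, in) == NULL) {
        return -1;                               /* check the library's result */
    }
    len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';                     /* strip the newline */
    } else if (!feof(in)) {
        /* No newline and no end of file: the line did not fit.
         * Discard the rest of it and reject the input. */
        while ((c = fgetc(in)) != '\n' && c != EOF) {
            /* skip */
        }
        return -1;
    }
    assert(strlen(buf) < size);                  /* postcondition */
    return 0;
}
```

A caller would typically accept a line only when read_line returns 0 and is_valid_username returns 1, and treat every other outcome as an error to report.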
Further reading
- William R. Cheswick and Steven M. Bellovin, Firewalls and Internet Security: Repelling the Wily Hacker ISBN 0-201-63357-4 http://www.wilyhacker.com/
External links
- "Rules for Defensive C Programming(Mirror1)" by Dinu P. Madau 1999
- [ftp://ftp.akaedu.org/%E5%B5%8C%E5%85%A5%E5%BC%8F%E7%9B%B8%E5%85%B3%E4%B9%A6%E7%B1%8D%E8%B5%84%E6%BA%90_Books/%E5%A4%A7%E9%87%8F%E5%B5%8C%E5%85%A5%E5%BC%8F%E4%B9%A6%E7%B1%8D_upload_by_BlackorWhite/EmbeddedSystemProgromming%20CD%20Libray/files/99/9912/f-madau.pdf "Rules for Defensive C Programming(Mirror2)"] by Dinu P. Madau 1999
- "Secure Programming for Linux and Unix HOWTO" by David A. Wheeler
- "Proactive Debugging" article by Jack Ganssle 2001-02-26
- "Defensive programming" article by Rob Manderson 2004-08-06
- "The art of defensive programming: Or how to write code that will be easy to maintain" article by Jonathan West
- "The Art of Software Security Assessment" by Mark Dowd, John McDonald, and Justin Schuh
- CERT Secure Coding Standards
- Art of defensive programming in Java
- Defensive Programming in the RKBExplorer
- Solid Software by Shari Lawrence Pfleeger, Les Hatton and Charles C. Howell, has a section on Defensive Programming.