DWARF
DWARF is a widely used, standardized debugging data format. DWARF was originally designed along with the Executable and Linkable Format (ELF), although it is independent of object file formats. The name is a medieval fantasy complement to "ELF" that has no official meaning, although the backronym "Debugging With Attributed Record Formats" was later proposed.
History
The first version of DWARF proved to use excessive amounts of storage, and it was superseded by an incompatible successor, DWARF-2, which added various encoding schemes to reduce data size. DWARF was not immediately successful; for instance, when Sun Microsystems adopted ELF as part of their move to Solaris, they opted to continue using stabs, in an embedding known as "stabs-in-elf". Linux followed suit, and DWARF-2 did not become the default until the late 1990s. DWARF version 3, which was released in January 2006, adds (among other things) support for C++ namespaces, Fortran 90 allocatable data and additional compiler optimization techniques.
Version 4 of DWARF, which offers "improved data compression, better description of optimized code, and support for new language features in C++", was published in 2010.
Structure
DWARF uses a data structure called a Debugging Information Entry (DIE) to represent each variable, type, procedure, etc. A DIE has a tag (e.g., DW_TAG_variable, DW_TAG_pointer_type, DW_TAG_subprogram) and attributes (key-value pairs). A DIE can have nested (child) DIEs, forming a tree structure. A DIE attribute can refer to another DIE anywhere in the tree; for instance, a DIE representing a variable would have a DW_AT_type entry pointing to the DIE describing the variable's type.
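The tree-plus-references shape described above can be sketched with a small in-memory model. This is only an illustration, not the binary encoding that DWARF actually uses: the DIE class, its add_child and walk methods, and the type_name helper are hypothetical names invented for this example. DW_TAG_compile_unit, DW_TAG_base_type and DW_AT_name are standard DWARF names that the text above does not mention.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Optional


@dataclass
class DIE:
    """Toy in-memory stand-in for a Debugging Information Entry."""
    tag: str
    attributes: Dict[str, object] = field(default_factory=dict)
    children: List["DIE"] = field(default_factory=list)

    def add_child(self, child: "DIE") -> "DIE":
        """Nest a child DIE under this one and return the child."""
        self.children.append(child)
        return child

    def walk(self) -> Iterator["DIE"]:
        """Yield this DIE and all of its descendants, depth first."""
        yield self
        for child in self.children:
            yield from child.walk()


def type_name(die: DIE) -> Optional[str]:
    """Follow a DW_AT_type reference and describe the type it points to."""
    target = die.attributes.get("DW_AT_type")
    if not isinstance(target, DIE):
        return None  # this DIE carries no type reference
    if target.tag == "DW_TAG_pointer_type":
        pointee = type_name(target)
        return f"{pointee} *" if pointee else "void *"
    return target.attributes.get("DW_AT_name")


# A compilation unit holding a base type, a pointer to it, and one variable.
cu = DIE("DW_TAG_compile_unit", {"DW_AT_name": "example.c"})
int_type = cu.add_child(DIE("DW_TAG_base_type", {"DW_AT_name": "int"}))
int_ptr = cu.add_child(DIE("DW_TAG_pointer_type", {"DW_AT_type": int_type}))
counter = cu.add_child(
    DIE("DW_TAG_variable", {"DW_AT_name": "counter", "DW_AT_type": int_ptr})
)

print(type_name(counter))  # prints "int *"
```

The variable's DW_AT_type attribute points at a sibling DIE rather than a child, which is the cross-reference the paragraph above describes: nesting gives the tree, and attributes give arbitrary links within it.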
To save space, two large tables needed by symbolic debuggers are represented as byte-coded instructions for simple, special-purpose finite state machines. The Line Number Table, which maps code locations to source code locations and vice versa, also specifies which instructions are part of function prologues and epilogues. The Call Frame Information table allows debuggers to locate frames on the call stack.
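The idea of storing a table as a program for a small state machine can be sketched as follows. This is a deliberately simplified toy, not the real DWARF line number program: the opcode names and numbers (OP_ADVANCE_PC, OP_ADVANCE_LINE, OP_EMIT_ROW), the single-byte operands, and both functions are assumptions made for illustration, loosely modeled on the "advance" and "copy" style of operation. The real format has more registers and opcodes and encodes operands differently.

```python
from typing import List, Optional, Tuple

# Hypothetical opcode numbers, chosen for this illustration only.
OP_ADVANCE_PC = 0x01    # operand: bytes to add to the address register
OP_ADVANCE_LINE = 0x02  # operand: lines to add to the line register
OP_EMIT_ROW = 0x03      # no operand: append (address, line) to the table

Row = Tuple[int, int]


def run_line_program(program: bytes, start_address: int = 0) -> List[Row]:
    """Execute the byte-coded program and return the expanded table."""
    address, line = start_address, 1  # state machine registers
    rows: List[Row] = []
    i = 0
    while i < len(program):
        op = program[i]
        i += 1
        if op == OP_ADVANCE_PC:
            address += program[i]
            i += 1
        elif op == OP_ADVANCE_LINE:
            line += program[i]
            i += 1
        elif op == OP_EMIT_ROW:
            rows.append((address, line))
        else:
            raise ValueError(f"unknown opcode {op:#x}")
    return rows


def line_for_address(rows: List[Row], address: int) -> Optional[int]:
    """Return the source line of the last row at or below the address."""
    found = None
    for row_address, row_line in rows:
        if row_address > address:
            break
        found = row_line
    return found


program = bytes([
    OP_EMIT_ROW,                                        # 0x1000 -> line 1
    OP_ADVANCE_PC, 4, OP_ADVANCE_LINE, 1, OP_EMIT_ROW,  # 0x1004 -> line 2
    OP_ADVANCE_PC, 8, OP_ADVANCE_LINE, 3, OP_EMIT_ROW,  # 0x100c -> line 5
])
rows = run_line_program(program, start_address=0x1000)
print(rows)
print(line_for_address(rows, 0x1008))
```

Each row is stored as a small delta from the previous one rather than as a full address and line pair, which is where the space saving comes from; the debugger regenerates the table by running the program.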
Further reading
Michael Eager, chair of the DWARF Standards Committee, has written an introduction to debugging formats and DWARF 3, Introduction to the DWARF Debugging Format.
External links
- Libdwarf, a C library intended to simplify reading (and writing) applications using DWARF2 and DWARF3.