Thunk (object-oriented programming) - AbsoluteAstronomy.com

Some compilers for object-oriented languages such as C++ generate functions called thunks as an optimization of virtual function calls in the presence of multiple or virtual inheritance. Consider the C++ code:

struct A {
    int value;
    virtual int access() { return this->value; }
};
struct B {
    int value;
    virtual int access() { return this->value; }
};
struct C : public A, public B {
    int better_value;
    virtual int access() { return this->better_value; }
};

int use(B *b)
{
    return b->access();
}

... C c; use(&c); ...
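The fragment above can be made concrete with a small self-contained sketch. The member initializers (1, 2, 3) and the helper offset_of_B_in_C are assumptions added for illustration and are not part of the original example. The helper measures how far the B subobject sits from the start of a C object, which is the adjustment the rest of this article is concerned with.

```cpp
#include <cassert>
#include <cstddef>

struct A {
    int value = 1;                      // illustrative initial value
    virtual int access() { return this->value; }
};
struct B {
    int value = 2;                      // illustrative initial value
    virtual int access() { return this->value; }
};
struct C : public A, public B {
    int better_value = 3;               // illustrative initial value
    virtual int access() { return this->better_value; }
};

int use(B *b)
{
    return b->access();                 // virtual call through a B pointer
}

// Hypothetical helper: byte distance from the start of a C object to its
// B subobject. The value is implementation-defined.
std::ptrdiff_t offset_of_B_in_C(C &c)
{
    B *b = &c;                          // implicit upcast adjusts the pointer
    assert(static_cast<C *>(b) == &c);  // the downcast undoes the adjustment
    return reinterpret_cast<char *>(b) - reinterpret_cast<char *>(&c);
}
```

Calling use(&c) on a C object returns better_value, because C::access overrides B::access. On common ABIs the A subobject is placed first, so offset_of_B_in_C is nonzero, although the exact value depends on the compiler and platform.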

Since the function B::access is virtual, the call to b->access() requires a vtable dispatch. In naïve implementations, the dispatch will consist of five steps:

1. The object pointed to by b holds a pointer to the vtable. Load that pointer into a register.
2. The vtable entry for B::access is at some known offset in the vtable for B; find that entry E.
3. E contains a pointer to a function (in this case, C::access). Load that function pointer.
4. Since C::access expects a this pointer to an instance of C, but b points to the B subobject inside it, we must decrement b by the offset of B in C (in this example the size of C::A::value plus the size of C::A's vtable pointer, plus any alignment padding). Since this offset is not known to the function use at compile time, it must also be loaded from E.
5. Finally, call C::access with the adjusted value of b.
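The five steps can be imitated with a hand-rolled object model. This is only a sketch: the names Entry, BPart, CObj, vtable_B, vtable_B_in_C and naive_call_access are hypothetical, and real compilers lay out objects and vtables according to their own ABI rather than like this. Each vtable entry here stores both a function pointer and the this-adjustment, as in the naïve scheme.

```cpp
#include <cassert>
#include <cstddef>

// One vtable entry in the naive scheme: a method plus a this-adjustment.
struct Entry {
    int (*fn)(void *self);   // method body; expects the complete object
    std::ptrdiff_t delta;    // byte offset to add to the incoming pointer
};

struct BPart {               // stands in for a B (sub)object
    const Entry *vtable;
    int value;
};

struct CObj {                // stands in for C: an A-like part, then the B part
    const Entry *a_vtable;
    int a_value;
    BPart b;
    int better_value;
};

int B_access(void *self) { return static_cast<BPart *>(self)->value; }
int C_access(void *self) { return static_cast<CObj *>(self)->better_value; }

const Entry vtable_B[] = {{B_access, 0}};
const Entry vtable_B_in_C[] = {
    {C_access, -static_cast<std::ptrdiff_t>(offsetof(CObj, b))}};

int naive_call_access(BPart *b)
{
    const Entry *vt = b->vtable;                              // step 1
    const Entry *e = &vt[0];                                  // step 2
    int (*fn)(void *) = e->fn;                                // step 3
    assert(fn != nullptr);
    void *adjusted = reinterpret_cast<char *>(b) + e->delta;  // step 4
    return fn(adjusted);                                      // step 5
}
```

For a standalone BPart the stored delta is zero, yet step 4 still loads and adds it on every call. That is the cost the thunk technique removes.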

The fourth step, in which an offset (a negative offset in this example) is loaded from E and added to b, can be completely eliminated by the compiler, thus speeding up every virtual method call, if the compiler generates a wrapper function like this, and places its address in the vtable entry E:

int thunk_for_C_access_in_B(B *b)
{
    C *adjusted_b = static_cast<C *>(b); /* decrements b by the appropriate offset,
                                            so that it points to a C object */
    return adjusted_b->C::access();      /* a tail call to the original method */
}

Then the steps for b->access() become:

1. The object pointed to by b holds a pointer to the vtable. Load that pointer into a register.
2. The vtable entry for B::access is at some known offset in the vtable for B; find that entry E.
3. E contains a pointer to a function (in this case, thunk_for_C_access_in_B). Load that function pointer W.
4. Call W with the value of b.

If the object has dynamic type B, then W = B::access, and we have saved two instructions (an expensive memory load and a cheap addition). If it has dynamic type C, then W = thunk_for_C_access_in_B, and we have added one instruction (a cheap unconditional branch at the end of thunk_for_C_access_in_B).
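The thunk-based dispatch described above can be sketched in a hand-rolled object model. The names Method, BPart, CObj, vtable_B, vtable_B_in_C and thunk_call_access are hypothetical, and the layout is an assumption for illustration, not what any particular compiler emits. The vtable now holds bare function pointers, and the adjustment lives inside the thunk.

```cpp
#include <cassert>
#include <cstddef>

using Method = int (*)(void *self);   // a vtable entry is just a function pointer

struct BPart {               // stands in for a B (sub)object
    const Method *vtable;
    int value;
};

struct CObj {                // stands in for C: an A-like part, then the B part
    const Method *a_vtable;
    int a_value;
    BPart b;
    int better_value;
};

int B_access(void *self) { return static_cast<BPart *>(self)->value; }
int C_access(void *self) { return static_cast<CObj *>(self)->better_value; }

// The thunk: adjust the incoming B-part pointer by a constant known when
// the thunk is generated, then forward to the real method.
int thunk_for_C_access_in_B(void *self)
{
    void *adjusted = static_cast<char *>(self) - offsetof(CObj, b);
    return C_access(adjusted);
}

const Method vtable_B[] = {B_access};
const Method vtable_B_in_C[] = {thunk_for_C_access_in_B};

int thunk_call_access(BPart *b)
{
    const Method *vt = b->vtable;   // step 1
    Method w = vt[0];               // steps 2 and 3
    assert(w != nullptr);
    return w(b);                    // step 4: no adjustment at the call site
}
```

The call site never touches an offset. Only objects whose B part sits at a nonzero offset pay for the adjustment, and they pay it inside the thunk.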

Since the particular pattern of multiple inheritance in class C is rare in practice, we will generally save more instructions than we add. At the same time, we no longer need to store an offset for each entry E in the vtable, and so we have halved the size of every vtable in the program.
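The claim about vtable size can be checked with a rough size comparison. The entry types and the method count kMethods below are hypothetical. The factor of two holds only where a function pointer and std::ptrdiff_t have the same size, which is an assumption that is true on common platforms but not guaranteed.

```cpp
#include <cstddef>

using Method = int (*)(void *self);

// Naive scheme: every entry carries a this-adjustment next to the pointer.
struct EntryWithOffset {
    Method fn;
    std::ptrdiff_t delta;
};

// Thunk scheme: every entry is just the function pointer.
struct EntryThunkOnly {
    Method fn;
};

constexpr std::size_t kMethods = 8;   // hypothetical number of virtual methods

constexpr std::size_t naive_vtable_bytes()
{
    return kMethods * sizeof(EntryWithOffset);
}

constexpr std::size_t thunk_vtable_bytes()
{
    return kMethods * sizeof(EntryThunkOnly);
}
```

Real vtables also hold other data, such as run-time type information, so the saving in an actual program may differ from this idealized ratio.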

The term "thunk" for these compiler-generated functions can be seen as borrowing from the functional-programming sense of "thunk", a nullary function that delays the evaluation of an expression. It could have been described simply as a compiler-generated wrapper function, but the term "thunk" for these functions is now established.

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.