Malloc
Encyclopedia
C dynamic memory allocation refers to performing dynamic memory allocation in the C
via a group of functions in the C standard library
, namely
The C++
programming language includes these functions for backwards compatibility; its use in C++ has been largely superseded by operators
Many different implementations of the actual memory allocation mechanism, used by
, automatically
, or dynamically. Static-duration variables are allocated in main (fixed) memory and persist for the lifetime of the program; automatic-duration variables are allocated on the stack
and come and go as functions are called and return. For static-duration and, before C99
(which allows variable-length automatic arrays), automatic-duration variables, the size of the allocation is required to be compile-time constant. If the required size is not known until run-time (for example, if data of arbitrary size is being read from the user or from a disk file), then using fixed-size data objects is inadequate.
The lifetime of allocated memory is also a concern. Neither static- nor automatic-duration memory is adequate for all situations. Automatic-allocated data cannot persist across multiple function calls, while static data persists for the life of the program whether it is needed or not. In many situations the programmer requires greater flexibility in managing the lifetime of allocated memory.
These limitations are avoided by using dynamic memory allocation in which memory is more explicitly (but more flexibly) managed, typically, by allocating it from the heap, an area of memory structured for this purpose. In C, the library function
Some platforms provide library calls which allow run-time dynamic allocation from the C stack rather than the heap (e.g. Unix
CRTL's
standard, which added support for variable-length array
s of block scope having sizes determined at runtime.
However, if one wishes to allocate a similar array dynamically, the following code could be used:
Type safety
One may "cast" (see type conversion
) this pointer to a specific type:
There are advantages and disadvantages to performing such a cast.
Common errors
The improper use of dynamic memory allocation can frequently be a source of bugs.
Most common errors are as follows:
Implementations
The implementation of memory management depends greatly upon operating system and architecture. Some operating systems supply an allocator for malloc, while others supply functions to control certain regions of data. The same dynamic memory allocator is often used to implement both
. Hence, it is referred to below as the allocator rather than
architectures is commonly done using the heap, or data segment
. The allocator will usually expand and contract the heap to fulfill allocation requests.
The heap method suffers from a few inherent flaws, stemming entirely from fragmentation
. Like any method of memory allocation, the heap will become fragmented; that is, there will be sections of used and unused memory in the allocated space on the heap. A good allocator will attempt to find an unused area of already allocated memory to use before resorting to expanding the heap. The major problem with this method is that the heap has only two significant attributes: base, or the beginning of the heap in virtual memory space; and length, or its size. The heap requires enough system memory to fill its entire length, and its base can never change. Thus, any large areas of unused memory are wasted. The heap can get "stuck" in this position if a small used segment exists at the end of the heap, which could waste any magnitude of address space, from a few megabytes to a few hundred.
is the author of a memory allocator called [ftp://g.oswego.edu/pub/misc/ dlmalloc] ("Doug Lea's Malloc").
Memory on the heap is allocated as "chunks", an 8-byte aligned
data structure
which contains a header and usable memory. Allocated memory contains an 8 or 16 byte overhead for the size of the chunk and usage flags. Unallocated chunks also store pointers to other free chunks in the usable space area, making the minimum chunk size 24 bytes.
Unallocated memory is grouped into "bins
" of similar sizes, implemented by using a double-linked list of chunks (with pointers stored in the unallocated space inside the chunk).
For requests below 256 bytes (a "smallbin" request), a simple two power best fit allocator is used. If there are no free blocks in that bin, a block from the next highest bin is split in two.
For requests of 256 bytes or above but below the mmap threshold, recent versions of dlmalloc use an in-place bitwise trie algorithm. If there is no free space left to satisfy the request, dlmalloc tries to increase the size of the heap, usually via brk
system call.
For requests above the mmap threshold (a "largebin" request), the memory is always allocated using the mmap
system call. The threshold is usually 256 KB The mmap method averts problems with huge buffers trapping a small allocation at the end after their expiration, but always allocates an entire page
of memory, which on many architectures is 4096 bytes in size.
7.0 and NetBSD
5.0, the old
's implementation of the
using
and gap page features implemented as part of OpenBSD's
, and to detect use-after-free bugs—as a large memory allocation is completely unmapped after it is freed, further use causes a segmentation fault
and termination of the program.
is an allocator whose goal is scalable memory allocation performance. Like OpenBSD's allocator, Hoard uses
can be used. Tcmalloc has garbage-collection for local storage of dead threads. The TCmalloc is considered to be more than twice as fast as glibc's ptmalloc for multithreaded programs.
, or the memory allocation function might be called from interrupt context. This necessitates a
subsystem of the operating system kernel.
Allocation size limits
The largest possible memory block
See also
External links
C (programming language)
C is a general-purpose computer programming language developed between 1969 and 1973 by Dennis Ritchie at the Bell Telephone Laboratories for use with the Unix operating system....
via a group of functions in the C standard library
C standard library
The C Standard Library is the standard library for the programming language C, as specified in the ANSI C standard.. It was developed at the same time as the C POSIX library, which is basically a superset of it...
, namely
malloc
, realloc
, calloc
and free
.The C++
C++
C++ is a statically typed, free-form, multi-paradigm, compiled, general-purpose programming language. It is regarded as an intermediate-level language, as it comprises a combination of both high-level and low-level language features. It was developed by Bjarne Stroustrup starting in 1979 at Bell...
programming language includes these functions for backwards compatibility; its use in C++ has been largely superseded by operators
new
and new[]
.Many different implementations of the actual memory allocation mechanism, used by
malloc
, are available. Their performance varies in both execution time and required memory.Rationale
The C programming language manages memory staticallyStatic memory allocation
Static memory allocation refers to the process of allocating memory at compile-time before the associated program is executed, unlike dynamic memory allocation or automatic memory allocation where memory is allocated as required at run-time....
, automatically
Automatic memory allocation
In computer programming, an automatic variable is a lexically-scoped variable which is allocated and de-allocated automatically when program flow enters and leaves the variable's scope...
, or dynamically. Static-duration variables are allocated in main (fixed) memory and persist for the lifetime of the program; automatic-duration variables are allocated on the stack
Call stack
In computer science, a call stack is a stack data structure that stores information about the active subroutines of a computer program. This kind of stack is also known as an execution stack, control stack, run-time stack, or machine stack, and is often shortened to just "the stack"...
and come and go as functions are called and return. For static-duration and, before C99
C99
C99 is a modern dialect of the C programming language. It extends the previous version with new linguistic and library features, and helps implementations make better use of available computer hardware and compiler technology.-History:...
(which allows variable-length automatic arrays), automatic-duration variables, the size of the allocation is required to be compile-time constant. If the required size is not known until run-time (for example, if data of arbitrary size is being read from the user or from a disk file), then using fixed-size data objects is inadequate.
The lifetime of allocated memory is also a concern. Neither static- nor automatic-duration memory is adequate for all situations. Automatic-allocated data cannot persist across multiple function calls, while static data persists for the life of the program whether it is needed or not. In many situations the programmer requires greater flexibility in managing the lifetime of allocated memory.
These limitations are avoided by using dynamic memory allocation in which memory is more explicitly (but more flexibly) managed, typically, by allocating it from the heap, an area of memory structured for this purpose. In C, the library function
malloc
is used to allocate a block of memory on the heap. The program accesses this block of memory via a pointer that malloc
returns. When the memory is no longer needed, the pointer is passed to free
which deallocates the memory so that it can be used for other purposes.Some platforms provide library calls which allow run-time dynamic allocation from the C stack rather than the heap (e.g. Unix
alloca
, Microsoft WindowsMicrosoft Windows
Microsoft Windows is a series of operating systems produced by Microsoft.Microsoft introduced an operating environment named Windows on November 20, 1985 as an add-on to MS-DOS in response to the growing interest in graphical user interfaces . Microsoft Windows came to dominate the world's personal...
CRTL's
malloca
). This memory is automatically freed when the calling function ends. The need for this is lessened by changes in the C99C99
C99 is a modern dialect of the C programming language. It extends the previous version with new linguistic and library features, and helps implementations make better use of available computer hardware and compiler technology.-History:...
standard, which added support for variable-length array
Variable-length array
In programming, a variable-length array is an array data structure of automatic storage duration whose length is determined at run time ....
s of block scope having sizes determined at runtime.
Overview of functions
The C dynamic memory allocation functions are defined instdlib.h
header (cstdlib
header in C++).malloc
- allocates the specified number of bytesrealloc
- increases the size of the specified block of memory. Reallocates it if neededcalloc
- allocates the specified number of bytes and initializes them to zerofree
- releases the specified block of memory back to the systemUsage example
The standard method of creating an array of 10 int objects:However, if one wishes to allocate a similar array dynamically, the following code could be used:
malloc
returns a null pointer to indicate that no memory is available, or that some other error occurred which prevented memory being allocated.Type safety
malloc
returns a void pointer (void *
), which indicates that it is a pointer to a region of unknown data type. The lack of a specific pointer type returned from malloc
is type-unsafe behaviour: malloc
allocates based on byte count but not on type. This distinguishes it from the C++ new operator that returns a pointer whose type relies on the operand. (see C Type Safety).One may "cast" (see type conversion
Type conversion
In computer science, type conversion, typecasting, and coercion are different ways of, implicitly or explicitly, changing an entity of one data type into another. This is done to take advantage of certain features of type hierarchies or type representations...
) this pointer to a specific type:
There are advantages and disadvantages to performing such a cast.
Advantages to casting
- C++ does require the cast. Including the cast allows a program (or a header file included in a program) to be both valid C and valid C++.
- The cast allows for older versions of
malloc
that originally returned achar *
.
Disadvantages to casting
- Under the ANSI C standard, the cast is redundant.
- Adding the cast may mask failure to include the header
stdlib.h
, in which the prototype formalloc
is found. In the absence of a prototype formalloc
, the standard requires that the C compiler assumemalloc
returns anint
. If there is no cast, a warning is issued when this integer is assigned to the pointer; however, with the cast, this warning is not produced, hiding a bug. On certain architectures and data models (such as LP64 on 64-bit systems, wherelong
and pointers are 64-bit andint
is 32-bit), this error can actually result in undefined behaviour, as the implicitly declaredmalloc
returns a 32-bit value whereas the actually defined function returns a 64-bit value. Depending on calling conventions and memory layout, this may result in stack smashing. This issue is not present in modern compilers, as they uniformly produce warnings that an undeclared function has been used, so a warning will still appear. For example, gcc's default behaviour is to show a warning that reads "incompatible implicit declaration of built-in function" regardless of whether the cast is present or not. - If type of pointer is changed, using cast requires one to fix all code lines where malloc was called
Common errors
The improper use of dynamic memory allocation can frequently be a source of bugs.
Most common errors are as follows:
- Allocation failure. Memory allocation is not guaranteed to succeed. If there's no check for successful allocation implemented, this usually leads to a crash of the program or the entire system.
- Memory leaks. Failure to deallocate memory using
free
leads to buildup of memory that is non-reusable memory, which is no longer used by the program. This wastes memory resources and can lead to allocation failures when these resources are exhausted. - Logical errors. All allocations must follow the same pattern: allocation using
malloc
, usage to store data, deallocation usingfree
. Failures to adhere to this pattern, such as memory usage after call to afree
or before call tomalloc
, callingfree
twice, etc., usually leads to a crash of the program.
Implementations
The implementation of memory management depends greatly upon operating system and architecture. Some operating systems supply an allocator for malloc, while others supply functions to control certain regions of data. The same dynamic memory allocator is often used to implement both
malloc
and the operator new
in C++C++
C++ is a statically typed, free-form, multi-paradigm, compiled, general-purpose programming language. It is regarded as an intermediate-level language, as it comprises a combination of both high-level and low-level language features. It was developed by Bjarne Stroustrup starting in 1979 at Bell...
. Hence, it is referred to below as the allocator rather than
malloc
.Heap-based
Implementation of the allocator on IA-32IA-32
IA-32 , also known as x86-32, i386 or x86, is the CISC instruction-set architecture of Intel's most commercially successful microprocessors, and was first implemented in the Intel 80386 as a 32-bit extension of x86 architecture...
architectures is commonly done using the heap, or data segment
Data segment
A data segment is a portion of virtual address space of a program, which contains the global variables and static variables that are initialized by the programmer...
. The allocator will usually expand and contract the heap to fulfill allocation requests.
The heap method suffers from a few inherent flaws, stemming entirely from fragmentation
Fragmentation (computer)
In computer storage, fragmentation is a phenomenon in which storage space is used inefficiently, reducing storage capacity and in most cases reducing the performance. The term is also used to denote the wasted space itself....
. Like any method of memory allocation, the heap will become fragmented; that is, there will be sections of used and unused memory in the allocated space on the heap. A good allocator will attempt to find an unused area of already allocated memory to use before resorting to expanding the heap. The major problem with this method is that the heap has only two significant attributes: base, or the beginning of the heap in virtual memory space; and length, or its size. The heap requires enough system memory to fill its entire length, and its base can never change. Thus, any large areas of unused memory are wasted. The heap can get "stuck" in this position if a small used segment exists at the end of the heap, which could waste any magnitude of address space, from a few megabytes to a few hundred.
dlmalloc
Doug LeaDoug Lea
Doug Lea is a professor of computer science at State University of New York at Oswego where he specializes in concurrent programming and the design of concurrent data structures. He was on the Executive Committee of the Java Community Process and chaired JSR 166, which added concurrency utilities...
is the author of a memory allocator called [ftp://g.oswego.edu/pub/misc/ dlmalloc] ("Doug Lea's Malloc").
Memory on the heap is allocated as "chunks", an 8-byte aligned
Data structure alignment
Data structure alignment is the way data is arranged and accessed in computer memory. It consists of two separate but related issues: data alignment and data structure padding. When a modern computer reads from or writes to a memory address, it will do this in word sized chunks...
data structure
Data structure
In computer science, a data structure is a particular way of storing and organizing data in a computer so that it can be used efficiently.Different kinds of data structures are suited to different kinds of applications, and some are highly specialized to specific tasks...
which contains a header and usable memory. Allocated memory contains an 8 or 16 byte overhead for the size of the chunk and usage flags. Unallocated chunks also store pointers to other free chunks in the usable space area, making the minimum chunk size 24 bytes.
Unallocated memory is grouped into "bins
Bin (computational geometry)
In computational geometry, the bin data structure allows efficient region queries, i.e., if there are some axis-aligned rectangles on a 2D plane, answer the question Given a query rectangle, return all rectangles intersecting it. kd-tree is another data structure that can answer this question...
" of similar sizes, implemented by using a double-linked list of chunks (with pointers stored in the unallocated space inside the chunk).
For requests below 256 bytes (a "smallbin" request), a simple two power best fit allocator is used. If there are no free blocks in that bin, a block from the next highest bin is split in two.
For requests of 256 bytes or above but below the mmap threshold, recent versions of dlmalloc use an in-place bitwise trie algorithm. If there is no free space left to satisfy the request, dlmalloc tries to increase the size of the heap, usually via brk
BRK
The 65xx family of microprocessors, consisting of the MOS Technology 6502 and its derivatives, the WDC 65C02, WDC 65C802 and WDC 65C816, all handle interrupts in a similar fashion. There are three hardware interrupt signals common to all 65xx processors and one software interrupt, the BRK...
system call.
For requests above the mmap threshold (a "largebin" request), the memory is always allocated using the mmap
Mmap
In computing, mmap is a POSIX-compliant Unix system call that maps files or devices into memory. It is a method of memory-mapped file I/O. It naturally implements demand paging, because initially file contents are not entirely read from disk and do not use physical RAM at all...
system call. The threshold is usually 256 KB The mmap method averts problems with huge buffers trapping a small allocation at the end after their expiration, but always allocates an entire page
Page (computing)
A page, memory page, or virtual page is a fixed-length contiguous block of virtual memory that is the smallest unit of data for the following:* memory allocation performed by the operating system for a program; and...
of memory, which on many architectures is 4096 bytes in size.
FreeBSD's and NetBSD's jemalloc
Since FreeBSDFreeBSD
FreeBSD is a free Unix-like operating system descended from AT&T UNIX via BSD UNIX. Although for legal reasons FreeBSD cannot be called “UNIX”, as the direct descendant of BSD UNIX , FreeBSD’s internals and system APIs are UNIX-compliant...
7.0 and NetBSD
NetBSD
NetBSD is a freely available open source version of the Berkeley Software Distribution Unix operating system. It was the second open source BSD descendant to be formally released, after 386BSD, and continues to be actively developed. The NetBSD project is primarily focused on high quality design,...
5.0, the old
malloc
implementation (phkmalloc) was replaced by jemalloc, written by Jason Evans. The main reason for this was a lack of scalability of phkmalloc in terms of multithreading. In order to avoid lock contention, jemalloc uses separate "arenas" for each CPU. Experiments measuring number of allocations per second in multithreading application have shown that this makes it scale linearly with the number of threads, while for both phkmalloc and dlmalloc performance was inversely proportional to the number of threads.OpenBSD's malloc
OpenBSDOpenBSD
OpenBSD is a Unix-like computer operating system descended from Berkeley Software Distribution , a Unix derivative developed at the University of California, Berkeley. It was forked from NetBSD by project leader Theo de Raadt in late 1995...
's implementation of the
malloc
function makes use of mmap
. For requests greater in size than one page, the entire allocation is retrieved using mmap
; smaller sizes are assigned from memory pools maintained by malloc
within a number of "bucket pages," also allocated with mmap
. On a call to free
, memory is released and unmapped from the process address spaceAddress space
In computing, an address space defines a range of discrete addresses, each of which may correspond to a network host, peripheral device, disk sector, a memory cell or other logical or physical entity.- Overview :...
using
munmap
. This system is designed to improve security by taking advantage of the address space layout randomizationAddress space layout randomization
Address space layout randomization is a computer security method which involves randomly arranging the positions of key data areas, usually including the base of the executable and position of libraries, heap, and stack, in a process's address space.- Benefits :Address space randomization hinders...
and gap page features implemented as part of OpenBSD's
mmap
system callSystem call
In computing, a system call is how a program requests a service from an operating system's kernel. This may include hardware related services , creating and executing new processes, and communicating with integral kernel services...
, and to detect use-after-free bugs—as a large memory allocation is completely unmapped after it is freed, further use causes a segmentation fault
Segmentation fault
A segmentation fault , bus error or access violation is generally an attempt to access memory that the CPU cannot physically address. It occurs when the hardware notifies an operating system about a memory access violation. The OS kernel then sends a signal to the process which caused the exception...
and termination of the program.
Hoard's malloc
The Hoard memory allocatorHoard memory allocator
The Hoard memory allocator, or Hoard, is a memory allocator for Linux, Solaris, Microsoft Windows and other operating systems. Hoard is designed to be efficient when used by multithreaded applications on multiprocessor computers...
is an allocator whose goal is scalable memory allocation performance. Like OpenBSD's allocator, Hoard uses
mmap
exclusively, but manages memory in chunks of 64 kilobytes called superblocks. Hoard's heap is logically divided into a single global heap and a number of per-processor heaps. In addition, there is a thread-local cache that can hold a limited number of superblocks. By allocating only from superblocks on the local per-thread or per-processor heap, and moving mostly-empty superblocks to the global heap so they can be reused by other processors, Hoard keeps fragmentation low while achieving near linear scalability with the number of threads.Thread-caching malloc (tcmalloc)
Every thread has local storage for small allocations. For large allocations mmap or sbrkSbrk
brk and sbrk are basic memory management system calls used in Unix and Unix-like operating systems to control the amount of memory allocated to the data segment of the process. These calls are typically made from a higher-level memory management library such as malloc...
can be used. Tcmalloc has garbage-collection for local storage of dead threads. The TCmalloc is considered to be more than twice as fast as glibc's ptmalloc for multithreaded programs.
In-kernel
Operating system kernels need to allocate memory just as application programs do. The implementation ofmalloc
within a kernel often differs significantly from the implementations used by C libraries, however. For example, memory buffers might need to conform to special restrictions imposed by DMADirect memory access
Direct memory access is a feature of modern computers that allows certain hardware subsystems within the computer to access system memory independently of the central processing unit ....
, or the memory allocation function might be called from interrupt context. This necessitates a
malloc
implementation tightly integrated with the virtual memoryVirtual memory
In computing, virtual memory is a memory management technique developed for multitasking kernels. This technique virtualizes a computer architecture's various forms of computer data storage , allowing a program to be designed as though there is only one kind of memory, "virtual" memory, which...
subsystem of the operating system kernel.
Allocation size limits
The largest possible memory block
malloc
can allocate depends on the host system, particularly the size of physical memory and the operating system implementation. Theoretically, the largest number should be the maximum value that can be held in a size tSize tsize_t is an unsigned data type defined by several C and C++ standards that is defined in stddef.h. It can be further imported by inclusion of stdlib.h as this file internally sub includes stddef.h....
type, which is an implementation-dependent unsigned integer representing the size of an area of memory. The maximum value is , or the constant SIZE_MAX
in the C99 standard.See also
- Buffer overflowBuffer overflowIn computer security and programming, a buffer overflow, or buffer overrun, is an anomaly where a program, while writing data to a buffer, overruns the buffer's boundary and overwrites adjacent memory. This is a special case of violation of memory safety....
- Memory debuggerMemory debuggerA memory debugger is a programming tool for finding memory leaks and buffer overflows. These are due to bugs related to the allocation and deallocation of dynamic memory. Programs written in languages that have garbage collection, such as managed code, might also need memory debuggers, e.g...
mprotect
new
(C++)
- Page size
- Variable-length arrayVariable-length arrayIn programming, a variable-length array is an array data structure of automatic storage duration whose length is determined at run time ....
External links
Buffer overflow
In computer security and programming, a buffer overflow, or buffer overrun, is an anomaly where a program, while writing data to a buffer, overruns the buffer's boundary and overwrites adjacent memory. This is a special case of violation of memory safety....
Memory debugger
A memory debugger is a programming tool for finding memory leaks and buffer overflows. These are due to bugs related to the allocation and deallocation of dynamic memory. Programs written in languages that have garbage collection, such as managed code, might also need memory debuggers, e.g...
mprotect
new
(C++)Variable-length array
In programming, a variable-length array is an array data structure of automatic storage duration whose length is determined at run time ....
- Definition of malloc in IEEE Std 1003.1 standard
- The design of the basis of the glibc allocator, by Doug LeaDoug LeaDoug Lea is a professor of computer science at State University of New York at Oswego where he specializes in concurrent programming and the design of concurrent data structures. He was on the Executive Committee of the Java Community Process and chaired JSR 166, which added concurrency utilities...
- The ptmalloc homepage, by Wolfram Gloger
- The Hoard homepage, by Emery Berger
- The nedmalloc homepage, by Niall Douglas
- The jemalloc homepage, by Jason Evans
- TCMalloc homepage, a high-performance malloc developed by Google
- Simple Memory Allocation Algorithms on OSDEV Community
- Hoard: A Scalable Memory Allocator for Multithreaded Applications by Emery Berger
- Scalable Lock-Free Dynamic Memory Allocation by Maged M. Michael
- Inside memory management - The choices, tradeoffs, and implementations of dynamic allocation by Jonathan Bartlett
- Memory Reduction (GNOME) wiki page with lots of information about fixing malloc
- C99 standard draft, including TC1/TC2/TC3
- Some useful references about C
- ISO/IEC 9899 – Programming languages – C