Memory segment
Encyclopedia
x86 memory segmentation refers to the implementation of memory segmentation on the x86 architecture
. Memory is divided into portions that may be addressed by a single index register
without changing a 16-bit segment selector. In real mode
or V86 mode
, a segment is always 64 kilobyte
s in size (using 16-bit offsets). In protected mode
, a segment can have variable length. Segments can overlap.
).
Because of the way the segment address and offset are added, a single linear address can be mapped to 4096 distinct segment:offset pairs. For example, the linear address 08124h can have the segmented addresses 06EFh:1234h, 0812h:0004h, 0000h:8124h, etc. This could be confusing to programmers accustomed to unique addressing schemes, but it can also be used to advantage, for example when addressing multiply-nested data structures. While real mode segments are always 64 KiB long, the practical effect is only that no segment can be longer than 64 KiB, rather than that every segment must be 64 KiB long. Because there is no protection or privilege limitation in real mode, even if a segment could be defined to be smaller than 64 KiB, it would still be entirely up to the programs to coordinate and keep within the bounds of their segments, as any program can always access any memory (since it can arbitrarily set segment selectors to change segment addresses, with absolutely no supervision). Therefore, real mode can just as well be imagined as having a variable length for each segment, in the range 1 to 65536 bytes, that is just not enforced by the CPU.
(Note that the leading zeros of the linear address, segmented addresses, and the segment and offset fields, which are usually neglected, were shown here for clarity).
The effective 20-bit address space
of real mode limits the addressable memory to 220 bytes, or 1,048,576 bytes. This derived directly from the hardware design of the Intel 8086 (and, subsequently, the closely related 8088), which had exactly 20 address pins. (Both were packaged in 40-pin DIP packages; even with only 20 address lines, the address and data buses were multiplexed to fit all the address and data lines within the limited pin count.)
In 16-bit real mode, enabling applications to make use of multiple memory segments (in order to access more memory than available in any one 64K-segment) is quite complex, but was viewed as a necessary evil for all but the smallest tools (which could do with less memory). The root of the problem is that no appropriate address-arithmetic instructions suitable for flat addressing of the entire memory range are available. Flat addressing is possible by applying multiple instructions, which however leads to slower programs.
's protected mode
extends the processor's address space to 224 bytes (16 megabytes), but not by adjusting the shift value. Instead, the 16-bit segment registers now contain an index into a table of segment descriptors containing 24-bit base addresses to which the offset is added. To support old software, the processor starts up in "real mode", a mode in which it uses the segmented addressing model of the 8086. There is a small difference though: the resulting physical address is no longer truncated to 20 bits, so real mode
pointers (but not 8086 pointers) can now refer to addresses between 100000h and 10FFEFh. This roughly 64-kilobyte region of memory was known as the High Memory Area (HMA), and later versions of MS-DOS
could use it to increase the available "conventional" memory (i.e. within the first MB). With the addition of the HMA, the total address space is approximately 1.06MB. Though the 80286 does not truncate real-mode addresses to 20 bits, a system containing an 80286 can do so with hardware external to the processor, by gating off the 21st address line, the A20 line. The IBM PC AT provided the hardware to do this (for full backward compatibility with software for the original IBM PC and PC-XT models), and so all subsequent "AT-class" PC clones did as well.
The protected mode segmentation system, present in the 80286 and later x86 CPUs, can be used to enforce separation of unprivileged processes, but most 32-bit operating systems uses the paging
mechanism introduced with the 80386 for this purpose instead. Such systems set all segment registers to point to a segment descriptor
with offset=0 and limit=232, giving an application full access to a 32-bit flat virtual address space through any segment register. By this method, normal application code does not have to deal with segment registers at all. This was possible because the 80386
widened the general purpose registers (i.e. the offset registers) to 32 bits. Naturally, the base addresses in the descriptors were also widened to 32 bits.
(RPL), a 1-bit Table Indicator (TI), and a 13-bit index.
The processor accesses the 64-bit segment descriptor
structure in the Global Descriptor Table
if TI is 0 or in the Local Descriptor Table
if TI is 1. It then performs the privilege check:
DPL < max (CPL,RPL)
where CPL is the current privilege level (found in the lower 2 bits of the CS register), RPL is the requested privilege level from the segment selector, and DPL is the descriptor privilege level of the segment (found in the descriptor). All privilege levels are integers in the range 0..3, where the lowest number corresponds to the highest privilege.
If the inequality is true, the processor generates a general protection fault
(GP). Otherwise, address translation continues. The processor then takes the 32-bit or 16-bit offset and compares it against the segment limit specified in the segment descriptor. If it is larger, a GP fault is generated. Otherwise, the processor adds the 24-bit segment base, specified in descriptor, to the offset, creating a linear physical address.
The privilege check is done only when the segment register is loaded, because segment descriptors are cached in hidden parts of the segment registers. (Is this true on the 80286, or only on the 80386 and above?)
and later, protected mode retains the segmentation mechanism of 80286 protected mode, but a paging
unit has been added as a second layer of address translation between the segmentation unit and the physical bus. Also, importantly, address offsets are 32-bit (instead of 16-bit), and the segment base in each segment descriptor is also 32-bit (instead of 24-bit). The general operation of the segmentation unit is othewise unchanged. The paging unit may be enabled or disabled; if disabled, operation is the same as on the 80286. If the paging unit is enabled, addresses in a segment are now virtual addresses, rather than physical addresses as they were on the 80286. That is, the address that is derived by adding the segment starting address from a segment descriptor to an offset is a virtual (or logical) address; the segment starting address is a virtual address, and the offset is also in the virtual address space. Programs issue logical (46-bit) addresses which go through the segmentation unit to be checked and translated into linear 32-bit addresses, before being sent to the enabled paging unit which ultimately translates them into physical addresses (which are also 32-bit on the 386
, but can be larger on newer processors which support Physical Address Extension
).
The 80386 also introduced two new general-purpose data segment registers, FS and GS, to the original set of four segment registers (CS, DS, ES, and SS).
architecture does not use segmentation in long mode (64-bit mode). Four of the segment registers: CS, SS, DS, and ES are forced to 0, and the limit to 264. However, in x86 versions of Microsoft Windows, the FS segment points to a small data structure, different for each thread
, which contains information about exception handling, thread-local variables, and other per-thread state. The x86-64 architecture supports this technique by allowing a nonzero base address for FS & GS.
movl $42, %fs:(%eax) ; Equivalent to M[fs:eax]<-42) in RTL
However, segment registers are usually used implicitly.
Segmentation cannot be turned off on x86-32 processors (this is true for 64-bit mode as well, but beyond the scope of discussion), so many 32-bit operating systems simulate a flat memory model
by setting all segments' bases to 0 in order to make segmentation neutral to programs. For instance, the Linux
kernel sets up only 4 general purpose segments:
* __KERNEL_CS (Kernel code segment, base=0, limit=4GB, DPL=0)
* __KERNEL_DS (Kernel data segment, base=0, limit=4GB, DPL=0)
* __USER_CS (User code segment, base=0, limit=4GB, DPL=3)
* __USER_DS (User data segment, base=0, limit=4GB, DPL=3)
Since the base is set to 0 in all cases and the limit 4 GiB, the segmentation unit does not affect the addresses the program issues before they arrive at the paging
unit. (This, of course, refers to 80386 and later processors, as the earlier x86 processors do not have a paging unit.)
Current Linux also uses GS to point to thread-local storage
.
Segments can be defined to be either code, data, or system segments. Additional permission bits are present to make segments read only, read/write, execute, etc.
Note that, in protected mode, code may always modify all segment registers except CS (the code segment
). This is because the current privilege level (CPL) of the processor is stored in the lower 2 bits of the CS register. The only way to raise the processor privilege level (and reload CS) is through the lcall (far call) and int (interrupt) instructions. Similarly, the only way to lower the privilege level (and reload CS) is through lret (far return) and iret (interrupt return) instructions. In real mode, code may also modify the CS register by making a far jump (or using an undocumented POP CS instruction on the 8086 or 8088)). Of course, in real mode, there are no privilege levels; all programs have absolute unchecked access to all of memory and all CPU instructions.
For more information about segmentation, see the IA-32
manuals freely available on the AMD or Intel websites.
X86 architecture
The term x86 refers to a family of instruction set architectures based on the Intel 8086 CPU. The 8086 was launched in 1978 as a fully 16-bit extension of Intel's 8-bit based 8080 microprocessor and also introduced segmentation to overcome the 16-bit addressing barrier of such designs...
. Memory is divided into portions that may be addressed by a single index register
Index register
An index registerCommonly known as a B-line in early British computers. in a computer's CPU is a processor register used for modifying operand addresses during the run of a program, typically for doing vector/array operations...
without changing a 16-bit segment selector. In real mode
Real mode
Real mode, also called real address mode, is an operating mode of 80286 and later x86-compatible CPUs. Real mode is characterized by a 20 bit segmented memory address space and unlimited direct software access to all memory, I/O addresses and peripheral hardware...
or V86 mode
Virtual 8086 mode
In the 80386 microprocessor and later, virtual 8086 mode allows the execution of real mode applications that are incapable of running directly in protected mode while the processor is running a protected mode operating system.VM86 mode uses a segmentation scheme identical to that of real mode In...
, a segment is always 64 kilobyte
Kilobyte
The kilobyte is a multiple of the unit byte for digital information. Although the prefix kilo- means 1000, the term kilobyte and symbol KB have historically been used to refer to either 1024 bytes or 1000 bytes, dependent upon context, in the fields of computer science and information...
s in size (using 16-bit offsets). In protected mode
Protected mode
In computing, protected mode, also called protected virtual address mode, is an operational mode of x86-compatible central processing units...
, a segment can have variable length. Segments can overlap.
Real mode
In real mode, the 16-bit segment selector is interpreted as the most significant 16 bits of a linear 20-bit address, called a segment address, of which the remaining four least significant bits are all zeros. The segment address is always added with a 16-bit offset to yield a linear address. For instance, the segmented address 06EFh:1234h has a segment selector of 06EFh, representing a segment address of 06EF0h, to which we add the offset, yielding the linear address 06EF0h + 1234h = 08124h (cf. hexadecimalHexadecimal
In mathematics and computer science, hexadecimal is a positional numeral system with a radix, or base, of 16. It uses sixteen distinct symbols, most often the symbols 0–9 to represent values zero to nine, and A, B, C, D, E, F to represent values ten to fifteen...
).
Because of the way the segment address and offset are added, a single linear address can be mapped to 4096 distinct segment:offset pairs. For example, the linear address 08124h can have the segmented addresses 06EFh:1234h, 0812h:0004h, 0000h:8124h, etc. This could be confusing to programmers accustomed to unique addressing schemes, but it can also be used to advantage, for example when addressing multiply-nested data structures. While real mode segments are always 64 KiB long, the practical effect is only that no segment can be longer than 64 KiB, rather than that every segment must be 64 KiB long. Because there is no protection or privilege limitation in real mode, even if a segment could be defined to be smaller than 64 KiB, it would still be entirely up to the programs to coordinate and keep within the bounds of their segments, as any program can always access any memory (since it can arbitrarily set segment selectors to change segment addresses, with absolutely no supervision). Therefore, real mode can just as well be imagined as having a variable length for each segment, in the range 1 to 65536 bytes, that is just not enforced by the CPU.
(Note that the leading zeros of the linear address, segmented addresses, and the segment and offset fields, which are usually neglected, were shown here for clarity).
The effective 20-bit address space
Address space
In computing, an address space defines a range of discrete addresses, each of which may correspond to a network host, peripheral device, disk sector, a memory cell or other logical or physical entity.- Overview :...
of real mode limits the addressable memory to 220 bytes, or 1,048,576 bytes. This derived directly from the hardware design of the Intel 8086 (and, subsequently, the closely related 8088), which had exactly 20 address pins. (Both were packaged in 40-pin DIP packages; even with only 20 address lines, the address and data buses were multiplexed to fit all the address and data lines within the limited pin count.)
In 16-bit real mode, enabling applications to make use of multiple memory segments (in order to access more memory than available in any one 64K-segment) is quite complex, but was viewed as a necessary evil for all but the smallest tools (which could do with less memory). The root of the problem is that no appropriate address-arithmetic instructions suitable for flat addressing of the entire memory range are available. Flat addressing is possible by applying multiple instructions, which however leads to slower programs.
Protected mode
80286 protected mode
The 80286Intel 80286
The Intel 80286 , introduced on 1 February 1982, was a 16-bit x86 microprocessor with 134,000 transistors. Like its contemporary simpler cousin, the 80186, it could correctly execute most software written for the earlier Intel 8086 and 8088...
's protected mode
Protected mode
In computing, protected mode, also called protected virtual address mode, is an operational mode of x86-compatible central processing units...
extends the processor's address space to 224 bytes (16 megabytes), but not by adjusting the shift value. Instead, the 16-bit segment registers now contain an index into a table of segment descriptors containing 24-bit base addresses to which the offset is added. To support old software, the processor starts up in "real mode", a mode in which it uses the segmented addressing model of the 8086. There is a small difference though: the resulting physical address is no longer truncated to 20 bits, so real mode
Real mode
Real mode, also called real address mode, is an operating mode of 80286 and later x86-compatible CPUs. Real mode is characterized by a 20 bit segmented memory address space and unlimited direct software access to all memory, I/O addresses and peripheral hardware...
pointers (but not 8086 pointers) can now refer to addresses between 100000h and 10FFEFh. This roughly 64-kilobyte region of memory was known as the High Memory Area (HMA), and later versions of MS-DOS
MS-DOS
MS-DOS is an operating system for x86-based personal computers. It was the most commonly used member of the DOS family of operating systems, and was the main operating system for IBM PC compatible personal computers during the 1980s to the mid 1990s, until it was gradually superseded by operating...
could use it to increase the available "conventional" memory (i.e. within the first MB). With the addition of the HMA, the total address space is approximately 1.06MB. Though the 80286 does not truncate real-mode addresses to 20 bits, a system containing an 80286 can do so with hardware external to the processor, by gating off the 21st address line, the A20 line. The IBM PC AT provided the hardware to do this (for full backward compatibility with software for the original IBM PC and PC-XT models), and so all subsequent "AT-class" PC clones did as well.
The protected mode segmentation system, present in the 80286 and later x86 CPUs, can be used to enforce separation of unprivileged processes, but most 32-bit operating systems uses the paging
Virtual memory
In computing, virtual memory is a memory management technique developed for multitasking kernels. This technique virtualizes a computer architecture's various forms of computer data storage , allowing a program to be designed as though there is only one kind of memory, "virtual" memory, which...
mechanism introduced with the 80386 for this purpose instead. Such systems set all segment registers to point to a segment descriptor
Segment descriptor
In memory addressing for Intel x86 computer architectures, segment descriptors are a part of the segmentation unit, used for translating a logical address to a linear address...
with offset=0 and limit=232, giving an application full access to a 32-bit flat virtual address space through any segment register. By this method, normal application code does not have to deal with segment registers at all. This was possible because the 80386
Intel 80386
The Intel 80386, also known as the i386, or just 386, was a 32-bit microprocessor introduced by Intel in 1985. The first versions had 275,000 transistors and were used as the central processing unit of many workstations and high-end personal computers of the time...
widened the general purpose registers (i.e. the offset registers) to 32 bits. Naturally, the base addresses in the descriptors were also widened to 32 bits.
Detailed Segmentation Unit Workflow
A logical address consists of a 16-bit segment selector (supplying 13+1 address bits) and a 16-bit offset. The segment selector must be located in one of the segment registers. That selector consists of a 2-bit Requested Privilege LevelPrivilege level
A privilege level in the x86 instruction set controls the access of the program currently running on the processor to resources such as memory regions, I/O ports, and special instructions. There are 4 privilege levels ranging from 0 which is the most privileged, to 3 which is least privileged...
(RPL), a 1-bit Table Indicator (TI), and a 13-bit index.
The processor accesses the 64-bit segment descriptor
Segment descriptor
In memory addressing for Intel x86 computer architectures, segment descriptors are a part of the segmentation unit, used for translating a logical address to a linear address...
structure in the Global Descriptor Table
Global Descriptor Table
The Global Descriptor Table or GDT is a data structure used by Intel x86-family processors starting with the 80286 in order to define the characteristics of the various memory areas used during program execution, including the base address, the size and access privileges like executability and...
if TI is 0 or in the Local Descriptor Table
Local Descriptor Table
The Local Descriptor Table is a memory table used in the x86 architecture in protected mode and containing memory segment descriptors: start in linear memory, size, executability, writability, access privilege, actual presence in memory, etc....
if TI is 1. It then performs the privilege check:
DPL < max (CPL,RPL)
where CPL is the current privilege level (found in the lower 2 bits of the CS register), RPL is the requested privilege level from the segment selector, and DPL is the descriptor privilege level of the segment (found in the descriptor). All privilege levels are integers in the range 0..3, where the lowest number corresponds to the highest privilege.
If the inequality is true, the processor generates a general protection fault
General protection fault
A general protection fault in the Intel x86 and AMD x86-64 architectures, and other unrelated architectures, is a fault that can encompass several cases in which protection mechanisms within the processor architecture are violated by any of the programs that are running, either the kernel or a...
(GP). Otherwise, address translation continues. The processor then takes the 32-bit or 16-bit offset and compares it against the segment limit specified in the segment descriptor. If it is larger, a GP fault is generated. Otherwise, the processor adds the 24-bit segment base, specified in descriptor, to the offset, creating a linear physical address.
The privilege check is done only when the segment register is loaded, because segment descriptors are cached in hidden parts of the segment registers. (Is this true on the 80286, or only on the 80386 and above?)
80386 protected mode
In the 386Intel 80386
The Intel 80386, also known as the i386, or just 386, was a 32-bit microprocessor introduced by Intel in 1985. The first versions had 275,000 transistors and were used as the central processing unit of many workstations and high-end personal computers of the time...
and later, protected mode retains the segmentation mechanism of 80286 protected mode, but a paging
Paging
In computer operating systems, paging is one of the memory-management schemes by which a computer can store and retrieve data from secondary storage for use in main memory. In the paging memory-management scheme, the operating system retrieves data from secondary storage in same-size blocks called...
unit has been added as a second layer of address translation between the segmentation unit and the physical bus. Also, importantly, address offsets are 32-bit (instead of 16-bit), and the segment base in each segment descriptor is also 32-bit (instead of 24-bit). The general operation of the segmentation unit is othewise unchanged. The paging unit may be enabled or disabled; if disabled, operation is the same as on the 80286. If the paging unit is enabled, addresses in a segment are now virtual addresses, rather than physical addresses as they were on the 80286. That is, the address that is derived by adding the segment starting address from a segment descriptor to an offset is a virtual (or logical) address; the segment starting address is a virtual address, and the offset is also in the virtual address space. Programs issue logical (46-bit) addresses which go through the segmentation unit to be checked and translated into linear 32-bit addresses, before being sent to the enabled paging unit which ultimately translates them into physical addresses (which are also 32-bit on the 386
Intel 80386
The Intel 80386, also known as the i386, or just 386, was a 32-bit microprocessor introduced by Intel in 1985. The first versions had 275,000 transistors and were used as the central processing unit of many workstations and high-end personal computers of the time...
, but can be larger on newer processors which support Physical Address Extension
Physical Address Extension
In computing, Physical Address Extension is a feature to allow x86 processors to access a physical address space larger than 4 gigabytes....
).
The 80386 also introduced two new general-purpose data segment registers, FS and GS, to the original set of four segment registers (CS, DS, ES, and SS).
Later developments
The x86-64X86-64
x86-64 is an extension of the x86 instruction set. It supports vastly larger virtual and physical address spaces than are possible on x86, thereby allowing programmers to conveniently work with much larger data sets. x86-64 also provides 64-bit general purpose registers and numerous other...
architecture does not use segmentation in long mode (64-bit mode). Four of the segment registers: CS, SS, DS, and ES are forced to 0, and the limit to 264. However, in x86 versions of Microsoft Windows, the FS segment points to a small data structure, different for each thread
Thread (computer science)
In computer science, a thread of execution is the smallest unit of processing that can be scheduled by an operating system. The implementation of threads and processes differs from one operating system to another, but in most cases, a thread is contained inside a process...
, which contains information about exception handling, thread-local variables, and other per-thread state. The x86-64 architecture supports this technique by allowing a nonzero base address for FS & GS.
Practices
Logical addresses can be explicitly specified in x86 assembler language, e.g. (AT&T syntax):movl $42, %fs:(%eax) ; Equivalent to M[fs:eax]<-42) in RTL
Register Transfer Language
In computer science, register transfer language is a term used to describe a kind of intermediate representation that is very close to assembly language, such as that which is used in a compiler. Academic papers and textbooks also often use a form of RTL as an architecture-neutral assembly language...
However, segment registers are usually used implicitly.
- All CPU instructions are implicitly fetched from the code segmentCode segmentIn computing, a code segment, also known as a text segment or simply as text, is one of the sections of a program in an object file or in memory, which contains executable instructions....
specified by the segment selector held in the CS register.
- Most memory references come from the data segmentData segmentA data segment is a portion of virtual address space of a program, which contains the global variables and static variables that are initialized by the programmer...
specified by the segment selector held in the DS register. These may also come from the extra segment specified by the segment selector held in the ES register, if a segment-override prefix precedes the instruction that makes the memory reference. Most, but not all, instructions that use DS by default will accept an ES override prefix.
- Processor stackStack-based memory allocationStacks in computing architectures are regions of memory where data is added or removed in a last-in-first-out manner.In most modern computer systems, each thread has a reserved region of memory referred to as its stack. When a function executes, it may add some of its state data to the top of the...
references, either implicitly (e.g. push and pop instructions) or explicitly (memory accesses using the ESP or (E)BP registers) use the stack segment specified by the segment selector held in the SS register.
- String instructions (e.g. stos, movs) also use the extra segment specified by the segment selector held in the ES register.
Segmentation cannot be turned off on x86-32 processors (this is true for 64-bit mode as well, but beyond the scope of discussion), so many 32-bit operating systems simulate a flat memory model
Flat memory model
Flat memory model or linear memory model refers to a memory addressing paradigm in low-level software design such that the CPU can directly address all of the available memory locations without having to resort to any sort of memory segmentation or paging schemes.Memory management and...
by setting all segments' bases to 0 in order to make segmentation neutral to programs. For instance, the Linux
Linux
Linux is a Unix-like computer operating system assembled under the model of free and open source software development and distribution. The defining component of any Linux system is the Linux kernel, an operating system kernel first released October 5, 1991 by Linus Torvalds...
kernel sets up only 4 general purpose segments:
* __KERNEL_CS (Kernel code segment, base=0, limit=4GB, DPL=0)
* __KERNEL_DS (Kernel data segment, base=0, limit=4GB, DPL=0)
* __USER_CS (User code segment, base=0, limit=4GB, DPL=3)
* __USER_DS (User data segment, base=0, limit=4GB, DPL=3)
Since the base is set to 0 in all cases and the limit 4 GiB, the segmentation unit does not affect the addresses the program issues before they arrive at the paging
Paging
In computer operating systems, paging is one of the memory-management schemes by which a computer can store and retrieve data from secondary storage for use in main memory. In the paging memory-management scheme, the operating system retrieves data from secondary storage in same-size blocks called...
unit. (This, of course, refers to 80386 and later processors, as the earlier x86 processors do not have a paging unit.)
Current Linux also uses GS to point to thread-local storage
Thread-local storage
Thread-local storage is a computer programming method that uses static or global memory local to a thread.This is sometimes needed because normally all threads in a process share the same address space, which is sometimes undesirable...
.
Segments can be defined to be either code, data, or system segments. Additional permission bits are present to make segments read only, read/write, execute, etc.
Note that, in protected mode, code may always modify all segment registers except CS (the code segment
Code segment
In computing, a code segment, also known as a text segment or simply as text, is one of the sections of a program in an object file or in memory, which contains executable instructions....
). This is because the current privilege level (CPL) of the processor is stored in the lower 2 bits of the CS register. The only way to raise the processor privilege level (and reload CS) is through the lcall (far call) and int (interrupt) instructions. Similarly, the only way to lower the privilege level (and reload CS) is through lret (far return) and iret (interrupt return) instructions. In real mode, code may also modify the CS register by making a far jump (or using an undocumented POP CS instruction on the 8086 or 8088)). Of course, in real mode, there are no privilege levels; all programs have absolute unchecked access to all of memory and all CPU instructions.
For more information about segmentation, see the IA-32
IA-32
IA-32 , also known as x86-32, i386 or x86, is the CISC instruction-set architecture of Intel's most commercially successful microprocessors, and was first implemented in the Intel 80386 as a 32-bit extension of x86 architecture...
manuals freely available on the AMD or Intel websites.