Fat binary
A fat binary is a computer program with code native to multiple instruction sets, which can consequently be run on multiple processor types. The usual method of implementation is to include a version of the machine code for each instruction set, preceded by code compatible with all operating systems which executes a jump to the appropriate section. This results in a file larger than a normal one-architecture binary, thus the name.

The use of fat binaries is not common in operating system software; there are several alternatives to solve the same problem, such as the use of an installer program to choose an architecture-specific binary at install time, distributing software in source code form and compiling it in-place, or the use of a virtual machine and just-in-time compilation.

Apple's Fat Binary

A fat binary scheme smoothed the Apple Macintosh's transition, beginning in 1994, from 68k microprocessors to PowerPC microprocessors. Many applications for the old platform ran transparently on the new platform under an evolving emulation scheme, but emulated code generally runs slower than native code. Applications released as "fat binary" took up more storage space, but they ran at full speed on either platform. They achieved this by packaging both a 68000-compiled version and a PowerPC-compiled version of the same program into their executable files. The older 68k code (CFM-68K or classic 68K) continued to be stored in the resource fork, while the newer PowerPC code was contained in the data fork, in PEF format.

Fat binaries were larger than programs supporting only the PowerPC or only the 68k, which led to the creation of a number of utilities that would strip out the unneeded version. In the era of small hard drives, when 80 MB was a common drive size, these utilities were sometimes useful, as program code generally made up a large percentage of overall drive usage.

NeXTSTEP Multi-Architecture Binaries

Fat binaries were a feature of NeXT's NeXTSTEP/OPENSTEP operating system, starting with NeXTSTEP 3.1; in NeXTSTEP, they were called "Multi-Architecture Binaries". Multi-Architecture Binaries were originally intended to allow software to be compiled to run both on NeXT's Motorola 68k-based hardware and on Intel IA-32-based PCs running NeXTSTEP, with a single binary file for both platforms. The format was later used to allow OPENSTEP applications to run on PCs and the various RISC platforms OPENSTEP supported.

Multi-Architecture Binary files are in a special archive format, in which a single file stores one or more Mach-O subfiles, one for each architecture supported by the Multi-Architecture Binary. Every Multi-Architecture Binary starts with a structure (struct fat_header) containing two unsigned integers. The first integer (magic) is used as a magic number to identify the file as a fat binary. The second integer (nfat_arch) defines how many Mach-O files the archive contains (how many instances of the same program for different architectures). After this header come nfat_arch fat_arch structures (struct fat_arch). Each of these defines the offset (from the start of the file) at which to find the Mach-O file, its alignment, its size, and the CPU type and subtype that the Mach-O binary is targeted at.
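
The header layout described above can be sketched in Python. This is a minimal illustration, not a full Mach-O reader. The text names the fields but not their encoding; the magic value 0xCAFEBABE, the big-endian byte order, the 32-bit field widths, and the field order (cputype, cpusubtype, offset, size, align) are taken from Apple's published fat.h header rather than from this article, and the storage of the alignment as a power-of-two exponent is likewise an assumption drawn from that header. The function names and the demo header are hypothetical.

```python
import struct

FAT_MAGIC = 0xCAFEBABE                # value of `magic` in a big-endian fat header
FAT_HEADER = struct.Struct(">II")     # magic, nfat_arch
FAT_ARCH = struct.Struct(">iiIII")    # cputype, cpusubtype, offset, size, align


def parse_fat_header(data: bytes) -> list:
    """Return one dict per fat_arch entry found at the start of `data`."""
    magic, nfat_arch = FAT_HEADER.unpack_from(data, 0)
    if magic != FAT_MAGIC:
        raise ValueError("not a Multi-Architecture Binary")
    archs = []
    for i in range(nfat_arch):
        # fat_arch entries follow the header back to back.
        pos = FAT_HEADER.size + i * FAT_ARCH.size
        cputype, cpusubtype, offset, size, align = FAT_ARCH.unpack_from(data, pos)
        archs.append({
            "cputype": cputype,
            "cpusubtype": cpusubtype,
            "offset": offset,          # measured from the start of the file
            "size": size,
            "alignment": 1 << align,   # assumed stored as a power-of-two exponent
        })
    return archs


def build_demo_header() -> bytes:
    """Hypothetical two-slice header, used only to exercise the parser."""
    return (FAT_HEADER.pack(FAT_MAGIC, 2)
            + FAT_ARCH.pack(6, 1, 4096, 100, 12)    # example CPU type 6
            + FAT_ARCH.pack(7, 3, 8192, 200, 12))   # example CPU type 7


if __name__ == "__main__":
    for arch in parse_fat_header(build_demo_header()):
        print(arch)
```

A loader would pick the entry whose CPU type and subtype match the running machine and then read `size` bytes starting at `offset`; that selection step is left out here.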

The version of the GNU Compiler Collection shipped with the Developer Tools was able to cross-compile source code for the different architectures on which NeXTSTEP was able to run. For example, it was possible to choose the target architectures with multiple '-arch' options (each taking an architecture as its argument). This was a convenient way to distribute a program for NeXTSTEP running on different architectures.

It was also possible to create libraries (e.g. using libtool) with different targeted object files.

Mach-O and Mac OS X

Apple Computer acquired NeXT in 1996 and continued to work with the OPENSTEP code. Mach-O became the native object file format in Apple's free Darwin operating system (2000) and Apple's Mac OS X (2001), and NeXT's Multi-Architecture Binaries continued to be supported by the operating system. Under Mac OS X, Multi-Architecture Binaries can be used to support multiple variants of an architecture, for instance to have different versions of 32-bit code optimized for the PowerPC G3, PowerPC G4, and PowerPC 970 generations of processors. They can also be used to support multiple architectures, such as 32-bit and 64-bit PowerPC or, as mentioned in the next section, PowerPC and x86.

Apple's Universal binary

In 2005, Apple announced another transition, from PowerPC processors to Intel x86 processors. Apple promotes the distribution of new applications that support both PowerPC and x86 natively by using executable files in Multi-Architecture Binary format. Apple calls such programs "Universal applications" and calls the file format "Universal binary," perhaps to distinguish this new transition from the previous transition and other uses of Multi-Architecture Binary format.

Universal binary format is not necessary for forward migration of pre-existing native PowerPC applications; for this role, Apple supplies Rosetta, a PPC emulator. However, Rosetta has a fairly steep performance overhead, so developers are encouraged to offer both PPC and Intel code, using Universal binaries. The obvious cost of a Universal binary is that every installed executable file is larger, but in the years since the release of the PPC, hard drive space has greatly outstripped executable size. While a Universal binary might be double the size of a single-platform version of the same application, resources generally dwarf the code, so the size of the code becomes a minor issue. In fact, a Universal binary version of an application is often smaller than two separate single-architecture versions (one each for PowerPC and Intel) taken together, because the separate versions would each have to carry their own copy of the resources. Nevertheless, Mac OS X does include the lipo and ditto command-line applications, which can remove versions from the Multi-Architecture Binary image.

Apple includes support in the Xcode development environment for delivering applications in both 32-bit and 64-bit versions. This is useful on both the Intel and PowerPC platforms, each of which has shipped with both 32-bit and 64-bit versions of the CPUs. Universal binaries created with this in mind can contain up to four versions of the executable code (32-bit PowerPC, 32-bit x86, 64-bit PowerPC, and 64-bit x86).

FatELF: Universal Binaries for Linux

FatELF is a fat binary implementation for Linux and other Unix-like operating systems. Technically, FatELF is an extension of the ELF binary format. In addition to the CPU architecture abstraction (byte order, word size, CPU instruction set, etc.), it offers the advantage of binaries with support for multiple kernel ABIs and versions.

FatELF has several use cases, according to its developers:
  • Distributions no longer need to have separate downloads for various platforms.
  • Separate /lib, /lib32 and /lib64 trees are no longer required in the OS directory structure.
  • The correct binary and libraries are centrally chosen by the system instead of by shell scripts.
  • If the ELF ABI changes someday, legacy users can still be supported.
  • Distribution of web browser plug-ins that work out of the box with multiple platforms.
  • Distribution of one application file that works across Linux and BSD OS variants, without a platform compatibility layer on them.
  • One hard drive partition can be booted on different machines with different CPU architectures, for development and experimentation. Same root file system, different kernel and CPU architecture.
  • Applications provided by network share or USB sticks will work on multiple systems. This is also helpful for creating portable applications and cloud computing images for heterogeneous systems.


A proof-of-concept Ubuntu 9.04 image is available (a VM image of Ubuntu 9.04 with fat binary support). To date, FatELF has not been integrated into the mainline kernel.

Combined COM-style binaries for CP/M-80 and DOS

CP/M-80 executables for the Intel 8080 processor use the same .COM file extension that DOS-compatible operating systems use for Intel 8086 binaries. In both cases programs are loaded at offset +100h and executed by jumping to the first byte in the file. As the opcodes of the two processor families are not compatible, attempting to start a program under the wrong operating system leads to incorrect and unpredictable behaviour.

To avoid this, methods have been devised to build fat binaries which contain both a CP/M-80 and a DOS program, preceded by initial code which is interpreted correctly by both operating systems. This is done either by combining two fully functional programs, each built for its corresponding environment, or by adding stubs which cause the program to exit gracefully if started on the wrong processor. For this to work, the first few instructions in the .COM file have to be valid code for both 8086 and 8080 processors, and they must cause the two processors to branch to different locations within the code. For example, the utilities in the MYZ80 emulator start with EBh, 52h, EBh. An 8086 sees this as a jump and reads its next instruction from offset +154h, whereas an 8080 or compatible processor goes straight through and reads its next instruction from +103h.
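
The two readings of the EBh, 52h, EBh prefix can be checked with a little arithmetic. The sketch below is illustrative only: it assumes the standard encodings in which EBh followed by a displacement byte is a short relative JMP on the 8086, while on the 8080 EBh is the one-byte XCHG instruction and 52h is the one-byte MOV D,D. It models nothing beyond these three bytes, and the function names are hypothetical.

```python
# The only 8080 opcodes this sketch needs; both are one-byte instructions.
I8080_ONE_BYTE = {0xEB: "XCHG", 0x52: "MOV D,D"}


def x86_short_jump_target(code: bytes, load_addr: int = 0x100) -> int:
    """Address an 8086 reaches after a short JMP (EBh disp8) at `load_addr`."""
    if code[0] != 0xEB:
        raise ValueError("first byte is not a short JMP")
    # The displacement is a signed 8-bit value relative to the next instruction.
    disp = code[1] - 0x100 if code[1] >= 0x80 else code[1]
    return (load_addr + 2 + disp) & 0xFFFF


def i8080_fall_through(code: bytes, load_addr: int = 0x100):
    """Step an 8080 through the known one-byte opcodes at the start of `code`."""
    addr = load_addr
    executed = []
    for byte in code:
        if byte not in I8080_ONE_BYTE:
            break
        executed.append(I8080_ONE_BYTE[byte])
        addr += 1
    return addr, executed


PREFIX = bytes([0xEB, 0x52, 0xEB])

if __name__ == "__main__":
    print(hex(x86_short_jump_target(PREFIX)))   # 8086 view
    print(i8080_fall_through(PREFIX))           # 8080 view
```

Under these assumptions the 8086 lands at 100h + 2 + 52h = 154h, while the 8080 executes XCHG, MOV D,D, XCHG (which together leave the registers unchanged) and continues at 103h, matching the offsets given above.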

Some CP/M-80 3.0 .COM files may have one or more RSX overlays attached to them by GENCOM. If so, they start with an extra 256-byte header. To indicate this, the first byte in the header is set to C9h, which works both as a signature identifying this type of COM file to the CP/M 3.0 executable loader and as a RET instruction for 8080-compatible processors, which leads to safe operation if the file is executed under older versions of CP/M-80.

C9h is never appropriate as the first byte of a program for any x86 processor (it has different meanings for different generations, but is never a meaningful first byte); the executable loader in some versions of DOS rejects COM files that start with C9h, avoiding incorrect operation.
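
As a rough sketch of the first-byte check described above, the following Python helpers look only at the leading byte of a .COM image. The function names are hypothetical, and a real loader would presumably validate more than this one byte; the 256-byte header size and the C9h signature are the only details taken from the text.

```python
RSX_SIGNATURE = 0xC9      # first byte of a GENCOM-style header, per the text
RSX_HEADER_SIZE = 256     # size of that header in bytes, per the text


def com_program_offset(data: bytes) -> int:
    """File offset of the program proper: 256 if the header is present, else 0."""
    if data and data[0] == RSX_SIGNATURE:
        return RSX_HEADER_SIZE
    return 0


def dos_should_reject(data: bytes) -> bool:
    """Mimic a DOS loader that refuses .COM files starting with C9h."""
    return bool(data) and data[0] == RSX_SIGNATURE


if __name__ == "__main__":
    with_header = bytes([RSX_SIGNATURE]) + bytes(RSX_HEADER_SIZE - 1) + b"\x00"
    print(com_program_offset(with_header), dos_should_reject(with_header))
```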

Combined COM and SYS files

DOS device drivers start with a file header whose first four bytes are FFFFFFFFh by convention, although this is not a requirement. This value is fixed up dynamically by the operating system when the driver loads (typically in the DOS BIOS when it executes DEVICE statements in CONFIG.SYS). Since DOS does not refuse to load files with a .COM extension through DEVICE and does not test for FFFFFFFFh, it is possible to combine a COM program and a device driver in the same file by placing a jump instruction to the entry point of the embedded COM program within the first four bytes of the file (three bytes are usually sufficient). If the embedded program and the device driver sections share a common portion of code or data, the code must cope with being loaded at offset +100h as a .COM-style program and at 0h as a device driver.
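
The three-byte jump mentioned above can be sketched as follows. This assumes the standard 8086 near JMP encoding (opcode E9h followed by a 16-bit little-endian displacement relative to the next instruction); the entry-point offset 200h and the function names are hypothetical and chosen only for illustration.

```python
import struct

COM_LOAD_OFFSET = 0x100   # where a .COM-style program is loaded, per the text


def near_jump(target_file_offset: int, jump_file_offset: int = 0) -> bytes:
    """Encode an 8086 near JMP (E9h rel16) between two offsets in the file."""
    # The displacement is relative to the end of the 3-byte instruction.
    rel = (target_file_offset - (jump_file_offset + 3)) & 0xFFFF
    return b"\xE9" + struct.pack("<H", rel)


def near_jump_target(code: bytes, load_offset: int) -> int:
    """Address reached when the jump at the start of `code` runs at `load_offset`."""
    if code[0] != 0xE9:
        raise ValueError("first byte is not a near JMP")
    (rel,) = struct.unpack_from("<H", code, 1)
    return (load_offset + 3 + rel) & 0xFFFF


if __name__ == "__main__":
    jump = near_jump(0x200)   # hypothetical COM entry point at file offset 200h
    print(jump.hex(), hex(near_jump_target(jump, COM_LOAD_OFFSET)))
```

Because the displacement is relative, the jump itself reaches the same place in the file whichever way the file is loaded. Absolute addresses inside any shared code or data are what differ by 100h between the two cases.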

Crash-protected system files

Under DOS, some files have file extensions which do not reflect their actual file type. For example, COUNTRY.SYS is not a DOS device driver, but a binary NLS database file for use with the CONFIG.SYS COUNTRY statement. The PC DOS and DR-DOS system files IBMBIO.COM and IBMDOS.COM are special binary images, not COM-style programs. Trying to load COUNTRY.SYS with a DEVICE statement or executing IBMBIO.COM at the command prompt will cause unpredictable results.

It is sometimes possible to avoid this by utilizing techniques similar to those described above. For example, under DR-DOS 7.02 or higher, if these files are called inappropriately, embedded stubs will just display some file version information and exit gracefully.

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 