Sizeof - AbsoluteAstronomy.com

In the programming languages C and C++, the unary operator sizeof is used to calculate the size of a datatype, in bytes. A byte in this context is the same as an unsigned char, and may be larger than the standard 8 bits, although that is uncommon in modern implementations. sizeof yields the size of the type of the expression or parenthesized type name that it precedes, as a value of type size_t. sizeof can be applied to any completely defined datatype, be it a primitive type such as the integer and floating-point types defined in the language, a pointer to a memory address, or a compound datatype (union, struct, or C++ class) defined by the programmer.

Need for `sizeof`

In many programs, there are situations where it is useful to know the size of a particular datatype (one of the most common examples is dynamic memory allocation using the library function malloc). Though for any given implementation of C or C++ the size of a particular datatype is constant, the sizes of even primitive types in C and C++ are implementation-defined (that is, not precisely defined by the standard). This can cause problems when trying to allocate a block of memory of the appropriate size. For example, say a programmer wants to allocate a block of memory big enough to hold ten variables of type int. Because our hypothetical programmer doesn't know the exact size of type int, the programmer doesn't know how many bytes to ask malloc for. Therefore, it is necessary to use sizeof:

/*pointer to type int, used to reference our allocated data*/
int * pointer = malloc(sizeof(int) * 10);

In the preceding code, the programmer instructs malloc to allocate and return a pointer to memory. The size of the block allocated is equal to the number of bytes a single object of type int takes up, multiplied by 10, ensuring enough space for all 10 ints.

It is generally not safe for a programmer to assume they know the size of any datatype. For example, even though most implementations of C and C++ on 32-bit systems define type int to be 4 bytes, many programmers recommend always using sizeof, as the size of an int could change when code is ported to a different system, breaking the code. In addition, it is frequently very difficult to predict the sizes of compound datatypes such as a struct or union due to structure "padding" (see Implementation below). Another reason for using sizeof is readability, as this avoids magic numbers.

Use

The sizeof operator is used to determine the amount of space any data element or datatype occupies in memory. To use sizeof, the keyword "sizeof" is followed by a type name, variable, or expression. If a type name is used, it always needs to be enclosed in parentheses, whereas variable names and expressions can be specified with or without parentheses. A sizeof expression evaluates to an unsigned value of type size_t equal to the size in bytes of the "argument" datatype, variable, or expression (with datatypes, sizeof evaluates to the size of the datatype; for variables and expressions it evaluates to the size of the type of the variable or expression). For example, since sizeof(char) is defined to be 1 and assuming ints are 4 bytes long, the following code will print 1,4:

/* the following code illustrates the use of sizeof
* with variables and expressions (no parentheses needed),
* and with type names (parentheses needed)
*/

char c;

printf("%zu,%zu\n", sizeof c, sizeof(int));

The value of a sizeof expression is always non-negative, as the C standard specifies that the type of such an expression is size_t, defined to be an unsigned integer type. The z length modifier (as in %zu, available since C99) should be used to print it, because the width of size_t can differ between architectures.

Using `sizeof` with arrays

When sizeof is applied to an array, the result is the size in bytes of the array in memory. The following program uses sizeof to determine the size of an array, avoiding a buffer overflow when copying characters:

#include <stdio.h>
#include <string.h>

int main(int argc, char **argv)
{
    char buffer[10]; /* Array of 10 chars */

    /* strncpy needs a source string, so require one argument. */
    if (argc < 2) {
        fprintf(stderr, "usage: program text\n");
        return 1;
    }

    /* Only copy 9 characters from argv[1] into buffer.
     * sizeof(char) is defined to be 1, so the number of
     * elements in buffer is equal to its size in bytes.
     */
    strncpy(buffer, argv[1], sizeof(buffer) - sizeof(char));

    /* Set the last element of the buffer to the null character */
    buffer[sizeof(buffer) - 1] = '\0';

    return 0;
}

Here, sizeof buffer is equivalent to 10*sizeof(char), or 10.

C99 adds support for flexible array members to structures. This form of array declaration is allowed as the last element in structures only, and differs from normal arrays in that no length is specified to the compiler:

#include <stdio.h>

struct flexarray
{
char val;
char array[]; /* Flexible array member; must be last element of struct */
};

int main(int argc, char **argv)
{
printf("sizeof(struct flexarray) = %zu\n", sizeof(struct flexarray));

return 0;
}

In this case the sizeof operator yields the size of the structure, including any padding, but without any storage allocated for the array. For the above example, most platforms produce the following output:



sizeof(struct flexarray) = 1

For structures containing flexible array members, sizeof is therefore usually equal to offsetof(s, array), where s is the structure name and array is the flexible array member, although an implementation is permitted to add trailing padding that makes the size larger than that offset.



C99 also allows variable length arrays where the length is specified at runtime. In such cases the sizeof operator is evaluated in part at runtime to determine the storage occupied by the array.





#include <stddef.h>

size_t flexsize(int n)
{
   char b[n+3];      /* Variable length array */
   return sizeof b;  /* Execution time sizeof */
}

int main(void)
{
  size_t size;

  size = flexsize(10); /* flexsize returns 13 */

  return 0;
}



`sizeof` and incomplete types

sizeof can only be applied to "completely" defined types. With arrays, this means that the dimensions of the array must be present in its declaration, and that the type of the elements must be completely defined. For structs and unions, this means that there must be a member list of completely defined types. For example, consider the following two source files:





/* file1.c */
int arr[10];
struct x {int one; int two;};
/* more code */

/* file2.c */
extern int arr[];
struct x;
/* more code */





Both files are perfectly legal C, and code in file1.c can apply sizeof to arr and struct x. However, it is illegal for code in file2.c to do this, because the declarations in file2.c are not complete. In the case of arr, the code does not specify the dimension of the array; without this information, the compiler has no way of knowing how many elements are in the array, and cannot calculate the array's overall size. Likewise, the compiler cannot calculate the size of struct x because it does not know what members it is made up of, and therefore cannot work out the sizes of the structure's members or the padding between them. If the programmer provided the size of the array in its declaration in file2.c, or completed the definition of struct x by supplying a member list, sizeof could then be applied to arr and struct x in that source file.

`sizeof...` and variadic template packs

C++11 introduced variadic templates; the keyword sizeof followed by an ellipsis (sizeof...) yields the number of elements in a parameter pack.







#include <iostream>

template <typename... Args>
void print_size(Args... args)
{
  std::cout << sizeof...(args) << std::endl;
}

int main()
{
  print_size(); // outputs 0
  print_size("Is the answer", 42, true); // outputs 3
}



Implementation

It is the responsibility of the compiler's author to implement the sizeof operator in a way specific and correct for a given implementation of the language. The sizeof operator must take into account the implementation's underlying memory layout to obtain the sizes of various datatypes. sizeof is usually a compile-time operator, which means that during compilation, sizeof and its operand get replaced by the result value. This is evident in the assembly language code produced by a C or C++ compiler. For this reason, sizeof qualifies as an operator, even though its use sometimes looks like a function call. Applying sizeof to variable length arrays, introduced in C99, is an exception to this rule.

Structure padding

To calculate the sizes of user-defined types, the compiler takes into account any alignment space needed for complex user-defined data structures. This is why the size of a structure in C can be greater than the sum of the sizes of its members. For example, on many systems, the following code will print 8:





struct student{
  char grade; /* char is 1 byte long */
  int age; /* int is 4 bytes long */
};

printf("%zu", sizeof (struct student));





The reason for this is that most compilers, by default, align each member of a structure to that member's own alignment boundary, and align the structure as a whole to the boundary required by its most strictly aligned member. By this logic, the structure student gets aligned on a word boundary and the variable age within the structure is aligned with the next word address. The compiler accomplishes this by inserting "padding" space between two members, or at the end of the structure, to satisfy the alignment requirements. On a system where the code above prints 8, three bytes of padding follow grade so that age starts on a word boundary. (Most processors can fetch an aligned word faster than they can fetch a word value that straddles multiple words in memory, and some don't support the operation at all.)

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
