Array programming

In computer science, **array programming languages** (also known as **vector** or **multidimensional** languages) generalize operations on scalars to apply transparently to vectors, matrices, and higher-dimensional arrays.

Array programming primitives concisely express broad ideas about data manipulation. The level of conciseness can be dramatic in certain cases: it is not uncommon to find array programming language one-liners whose Java equivalents run to more than a couple of pages of code.

APL, designed by Ken Iverson, was the first programming language to provide array programming capabilities. The name APL refers to the title of his seminal book "A Programming Language" and not to arrays per se. Iverson's contribution to rigor and clarity was probably more important than the simple extension of dimensions to functions.

## Concepts

The fundamental idea behind array programming is that operations apply at once to an entire set of values. This makes it a high-level programming model, as it allows the programmer to think about and operate on whole aggregates of data without having to resort to explicit loops of individual scalar operations.
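As a minimal sketch of this idea, assuming NumPy as the array library and using made-up sample temperatures (the names `celsius`, `fahrenheit_loop`, and `fahrenheit` are illustrative only), the same conversion can be written element by element or as a single whole-array expression:

```python
import numpy as np

# Hypothetical sample data: temperatures in degrees Celsius.
celsius = np.array([-40.0, 0.0, 37.0, 100.0])

# Scalar style: visit each element with an explicit loop.
fahrenheit_loop = np.empty_like(celsius)
for i in range(celsius.size):
    fahrenheit_loop[i] = celsius[i] * 9.0 / 5.0 + 32.0

# Array style: one expression applied to the whole aggregate at once.
fahrenheit = celsius * 9.0 / 5.0 + 32.0
```

Both forms should produce the same values; the array form simply leaves the iteration to the library.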

Iverson described the rationale behind array programming (actually referring to APL) as follows:

The basis behind array programming and thinking is to find and exploit the properties of data where individual elements are similar and/or adjacent. Unlike object orientation, which implicitly breaks down data into its constituent parts (or scalar quantities), array orientation looks to group data and apply uniform handling.

Function rank is an important concept in array programming languages in general, by analogy to tensor rank in mathematics: functions that operate on data may be classified by the number of dimensions they act on. Ordinary multiplication, for example, is a scalar-rank function because it operates on zero-dimensional data (individual numbers). The cross product operation is an example of a vector-rank function because it operates on vectors, not scalars. Matrix multiplication is an example of a rank-2 function, because it operates on 2-dimensional objects (matrices). Collapse operators reduce the dimensionality of an input data array by one or more dimensions. For example, summing over elements collapses the input array by one dimension.
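The rank distinctions above can be illustrated with a short sketch, assuming NumPy and small made-up inputs. The variable names are illustrative, and NumPy's `ndim` is used here as a stand-in for the number of dimensions a value has:

```python
import numpy as np

a = np.array([1.0, 0.0, 0.0])            # a vector (1 dimension)
b = np.array([0.0, 1.0, 0.0])            # another vector
m = np.array([[1.0, 2.0], [3.0, 4.0]])   # a matrix (2 dimensions)

# Scalar rank: multiplication is defined on individual numbers.
product = np.multiply(2.0, 3.0)

# Vector rank: the cross product consumes whole vectors.
c = np.cross(a, b)

# Rank 2: matrix multiplication consumes whole matrices.
mm = m @ m

# Collapse: summing along one axis removes that dimension.
col_sums = m.sum(axis=0)   # 2-D input, 1-D result
total = m.sum()            # summing over every axis leaves a scalar
```

Here `col_sums` has one dimension fewer than `m`, which is the one-dimension collapse described above, while `total` collapses both dimensions.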

## Uses

Array programming is very well suited to implicit parallelization, a topic of much current research. Further, Intel and compatible CPUs developed and produced after 1997 contained various instruction set extensions, starting from MMX and continuing through SSSE3 and 3DNow!, which include rudimentary SIMD array capabilities. Array processing is distinct from parallel processing in that one physical processor performs operations on a group of items simultaneously, while parallel processing aims to split a larger problem into smaller ones (MIMD) to be solved piecemeal by numerous processors. Processors with two or more cores are increasingly common.

## Languages

The canonical examples of array programming languages are APL, J, and Fortran 90. Others include: A+, IDL, K, Q, Mathematica, MATLAB, MOLSF, NumPy, GNU Octave, PDL, R, S-Lang, SAC, Nial and ZPL.

## Examples

In scalar languages like FORTRAN 77, C, Pascal, etc., operations apply only to single values, so *a*+*b* expresses the addition of two numbers. In such languages adding two arrays requires indexing and looping: