J (programming language)
The J programming language, developed in the early 1990s by Kenneth E. Iverson and Roger Hui, is a synthesis of APL (also by Iverson) and the FP and FL function-level languages created by John Backus.
To avoid repeating the APL special character problem, J requires only the basic ASCII character set, using digraphs formed with the dot or colon characters to extend the meaning of the basic characters available. Additionally, to keep parsing and the language simple, and to compensate for the lack of character variation in ASCII, many characters that might need to be balanced in other languages (such as [] {} "" `` or <>) are treated by J as standalone tokens or (with digraphs) as part of a multi-character token.
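As a brief illustration of the digraph scheme, the following sketch shows an interactive session (input indented three spaces, results flush left). It assumes a stock J interpreter and uses only primitives; the point is that adding a dot or colon to a base character gives a different primitive, and that brackets are ordinary verbs rather than paired delimiters.

```j
   2 + 3       NB. + is addition
5
   1 +. 0      NB. +. is logical or
1
   +: 3        NB. +: with one argument doubles
6
   <. 3.7      NB. <. with one argument is floor
3
   <: 3        NB. <: with one argument decrements
2
   3 [ 4       NB. [ is a verb returning its left argument
3
   3 ] 4       NB. ] is a verb returning its right argument
4
```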
J is a very terse array programming language, and is most suited to mathematical and statistical programming, especially when performing operations on matrices. It has also been used in Extreme Programming and network performance analysis.
Like the original FP/FL languages, J supports function-level programming (also known as higher-order functional programming) via its tacit programming features. Note that function-level programming is not the same as functional programming.
Unlike most languages that support object-oriented programming, J's flexible hierarchical namespace scheme (where every name exists in a particular locale) can be effectively used as a framework for both class-based and prototype-based object-oriented programming.
J is not a von Neumann programming language; however, it is possible to use the von Neumann programming style.
Since March 2011, J has been free and open source software under the GPLv3 license. Source may also be purchased for commercial use under a negotiated license.
Examples
J permits point-free style and function composition. Thus, its programs can be very terse and are prone to code obfuscation.
The hello world program in J is:
   'Hello, world!'
This implementation of hello world reflects the traditional use of J: programs are entered into a J interpreter session, and the results of expressions are displayed. It is also possible to arrange for J scripts to be executed as standalone programs, but the mechanisms for associating a script with the interpreter are system dependent. Here is how this might look on a UNIX system:
#!/bin/jc
echo 'Hello, world!'
exit ''
Historically, APL used / to indicate the fold, so +/1 2 3 was equivalent to 1+2+3. Meanwhile, division was represented with the classic mathematical division symbol (the obelus, ÷), which was implemented by overstriking a minus sign and a colon (on both EBCDIC and ASCII paper terminals). Because ASCII in general does not support overstrikes in a device-independent way, and does not include a division symbol per se, J uses % to represent division, as a visual approximation or reminder. (This illustrates something of the mnemonic character of J's tokens, and some of the quandaries imposed by the use of ASCII.)
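The two notations described above carry over to J directly. A minimal session sketch, assuming a stock interpreter (input indented, results flush left):

```j
   +/ 1 2 3      NB. insert + between the items: 1+2+3
6
   */ 1 2 3 4    NB. the same adverb applied to multiplication
24
   6 % 4         NB. % with two arguments is division
1.5
   % 4           NB. % with one argument is reciprocal
0.25
```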
The following is a J program to calculate the average of a list of numbers:
   avg=: +/ % #
and this is a test execution of the program:
   avg 1 2 3 4
2.5
# counts the number of items in the array, +/ sums the items of the array, and % divides the sum by the number of items. Note that avg is defined above using a train of three verbs (+/, %, and #) known as a fork. Specifically, (V0 V1 V2) Ny is the same as (V0 Ny) V1 (V2 Ny), where V0, V1, and V2 denote verbs and Ny denotes a noun: the argument is passed to both outer verbs, and the middle verb combines their results.
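The fork rule can be checked by writing both sides out in a session. This is only a sketch; the second fork (maximum minus minimum) is an extra illustration that does not appear in the original text.

```j
   (+/ % #) 1 2 3 4                NB. the fork applied directly
2.5
   (+/ 1 2 3 4) % (# 1 2 3 4)      NB. the same computation spelled out
2.5
   (>./ - <./) 3 1 4 1 5           NB. another fork: maximum minus minimum
4
```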
Some examples of using avg:
   v=: ?. 20 $ 100     NB. a random vector
   v
46 55 79 52 54 39 60 57 60 94 46 78 13 18 51 92 78 60 90 62
   avg v
59.2
   4 avg\ v            NB. moving average on periods of size 4
58 60 56 51.25 52.5 54 67.75 64.25 69.5 57.75 38.75 40 43.5 59.75 70.25 80 72.5
   m=: ?. 4 5 $ 50     NB. a random matrix
   m
46  5 29  2  4
39 10  7 10 44
46 28 13 18  1
42 28 10 40 12
   avg"1 m             NB. apply avg to each rank 1 subarray (each row) of m
17.2 22 21.2 26.4
Rank is a crucial concept in J. Its significance in J is similar to the significance of "select" in SQL and of "while" in C.
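To make the role of rank concrete, here is a small session sketch (assuming a stock interpreter). The same verb +/ gives different results depending on the rank at which it is applied:

```j
   i. 2 3            NB. a 2 by 3 table
0 1 2
3 4 5
   +/ i. 2 3         NB. applied to the whole table: the rows are added together
3 5 7
   +/"1 i. 2 3       NB. applied at rank 1: sum within each row
3 12
```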
Here is an implementation of quicksort, from the J Dictionary:
sel=: adverb def 'u # ['

quicksort=: verb define
  if. 1 >: #y do. y
  else.
    (quicksort y <sel e),(y =sel e),quicksort y >sel e=.y{~?#y
  end.
)
The following is an implementation of quicksort demonstrating tacit programming. Tacit programming involves composing functions together and not referring explicitly to any variables. J's support for forks and hooks dictates rules on how arguments applied to this function will be applied to its component functions.
   quicksort=: (($:@(<#[) , (=#[) , $:@(>#[)) ({~ ?@#)) ^: (1<#)
The following expression exhibits pi with n digits and demonstrates the extended precision capabilities of J:
   n=: 50              NB. set n as the number of digits required
   <.@o. 10x^n         NB. extended precision 10 to the nth * pi
314159265358979323846264338327950288419716939937510
Data types and structures
J supports three simple types:
- Numeric
- Literal (Character)
- Boxed
Of these, numeric has the most variants.
One of J's numeric types is the bit. There are two bit values: 0 and 1. Bits can also be formed into lists. For example, 1 0 1 0 1 1 0 0 is a list of eight bits. Syntactically, the J parser treats that list as a single word (space characters are recognized as word-forming characters when they are between what would otherwise be numeric words). Lists of arbitrary length are supported.
Furthermore, J supports all the usual binary operations on these lists, such as and, or, exclusive or, rotate, shift, and not. For example:
   1 0 0 1 0 0 1 0 +. 0 1 0 1 1 0 1 0     NB. or
1 1 0 1 1 0 1 0
   3 |. 1 0 1 1 0 0 1 1 1 1 1             NB. rotate
1 0 0 1 1 1 1 1 1 0 1
Note that J also supports higher-dimensional arrays of bits: they can be formed into two-dimensional, three-dimensional, and larger arrays. The above operations perform equally well on these arrays.
Other numeric types include integer (3, 42), floating point (3.14, 8.8e22), complex (0j1, 2.5j3e88), extended precision integer (12345678901234567890x), and (extended precision) rational fraction (1r2, 3r4). As with bits, these can be formed into lists or arbitrarily dimensioned arrays. As with bits, operations are performed on all numbers in an array.
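The difference between these numeric types shows up in a short session. This is a sketch assuming a stock interpreter with the default print precision of six significant digits:

```j
   2 ^ 100           NB. ordinary arguments give a floating point result
1.26765e30
   2x ^ 100          NB. an extended precision argument keeps every digit
1267650600228229401496703205376
   1r3 + 1r6         NB. rationals stay exact
1r2
   0j1 * 0j1         NB. complex arithmetic: i squared
_1
```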
Lists of bits can be converted to integer using the #. verb. Integers can be converted to lists of bits using the #: verb. (And, when parsing J, . and : are word forming characters. They're never tokens by themselves unless preceded by a space.)
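A minimal session sketch of the two conversions, assuming a stock interpreter:

```j
   #. 1 0 1          NB. bits to integer
5
   #: 5              NB. integer to bits
1 0 1
   #: 10
1 0 1 0
   #. #: 10          NB. round trip
10
```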
J also supports the literal (character) type. Literals are enclosed in quotes, for example 'a' or 'b'. Lists of literals are also supported using the usual convention of putting multiple characters in quotes, such as 'abcdefg'. Typically, individual literals are 8 bits wide (ASCII), but J also supports other literals (Unicode). Numeric and boolean operations are not supported on literals, but collection-oriented operations (such as rotate) are supported.
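As a sketch of collection-oriented operations on literals, the verbs used for numeric lists apply unchanged to a list of characters (session input indented, results flush left):

```j
   3 |. 'abcdefg'    NB. rotate left by three places
defgabc
   |. 'abcdefg'      NB. reverse
gfedcba
   # 'abcdefg'       NB. count the items
7
```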
Finally, there is the boxed data type. Typically, data is put in a box using the < operation (without any left argument; with a left argument, this would be the 'less than' operation). This is analogous to C's & operation (without any left argument). However, where the result of C's & has reference semantics, the result of J's < has value semantics. In other words, < is a function and it produces a result. The result has 0 dimensions, regardless of the structure of the contained data. From the viewpoint of a J programmer, < 'puts the data into a box' and lets the programmer work with an array of boxes (it can be assembled with other boxes, and additional copies can be made of the box). Boxed data is displayed by J somewhat in the way that some SQL interpreters decorate table results from select statements.
   <1 0 0 1 0
+---------+
|1 0 0 1 0|
+---------+
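The properties described above can be checked without relying on how boxes are drawn. In this sketch the name b is chosen only for illustration:

```j
   b=: < 1 0 0 1 0
   # $ b                     NB. number of dimensions of the box itself
0
   > b                       NB. open the box to recover the contents
1 0 0 1 0
   # (<1 2 3) , (<'abc')     NB. two boxes with different contents in one list
2
```

Because each box is a scalar, a numeric list and a literal list can sit side by side in one array even though plain arrays cannot mix those types.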
The only collection type offered by J is the arbitrarily dimensioned array. Most algorithms can be expressed very concisely using operations on these arrays.
J's arrays are homogeneously typed. For example, the list 1 2 3 is a list of integers despite the fact that 1 is a bit. For the most part, these sorts of type issues are transparent to the programmer. Only certain specialized operations reveal differences in type. For example, the list 1.0 0.0 1.0 0.0 would be treated exactly the same, by most operations, as the list 1 0 1 0.
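One of the specialized operations that does reveal the internal type is the foreign conjunction 3!:0, which reports a numeric type code. A short sketch follows; the parentheses keep the argument list from merging with the 0 in 3!:0.

```j
   3!:0 (1 0 1)                   NB. 1 means boolean (bit)
1
   3!:0 (1 2 3)                   NB. 4 means integer
4
   3!:0 (1 2.5 3)                 NB. 8 means floating point
8
   1 0 1 0 -: 1.0 0.0 1.0 0.0     NB. match compares values, not representations
1
```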
J also supports sparse numeric arrays where non-zero values are stored with their indices. This is an efficient mechanism where relatively few values are non-zero.
J also supports objects and classes, but these are an artifact of the way things are named, and are not data types in and of themselves. Instead, boxed literals are used to refer to objects (and classes). J data has value semantics, but objects and classes need reference semantics.
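A minimal script sketch of this naming scheme, using the standard library utilities coclass, cocurrent, and conew. The class name Counter and the names create, inc, n, and c are hypothetical and chosen only for illustration.

```j
coclass 'Counter'          NB. switch to the locale that serves as the class
create=: verb define
  n=: y                    NB. n is assigned in the object's own locale
)
inc=: verb define
  n=: n + 1
)
cocurrent 'base'           NB. back to the default locale

c=: conew 'Counter'        NB. c is a boxed literal naming a new numbered locale
create__c 10               NB. run create in the object's locale
inc__c ''                  NB. result is 11
```

The object is nothing more than a locale, and c holds its name in a box; the suffix __c tells J in which locale to look up and run the verb.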
Another pseudo-type—associated with name, rather than value—is the memory mapped file.
Documentation
J's documentation, unlike that of most other programming languages, is organized as a dictionary, with words in J identified as nouns, verbs, adverbs, conjunctions, and so on. The parts of speech are indicated using markup. Note that verbs have two forms: monads (arguments only on the right) and dyads (arguments on the left and on the right). For example, in '-1' the hyphen is a monad, and in '3-2' the hyphen is a dyad. The monad definition is mostly independent of the dyad definition, regardless of whether the verb is a primitive verb or a derived verb.
Control structures
J provides control structures similar to those of other procedural languages. The controls are:
- assert. T
- break.
- continue.
- for. T do. B end.
- for_xyz. T do. B end.
- goto_name.
- label_name.
- if. T do. B end.
- if. T do. B else. B1 end.
- if. T do. B
- elseif. T1 do. B1
- elseif. T2 do. B2
- end.
- return.
- select. T
- case. T0 do. B0
- fcase. T1 do. B1
- case. T2 do. B2
- end.
- throw.
- try. B catch. B1 catchd. B2 catcht. B3 end.
- while. T do. B end.
- whilst. T do. B end.
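As a sketch of how a few of these control words fit together inside an explicit definition, the script below defines two verbs. The names sumbelow, classify, total, and k are hypothetical; the final branch is written as elseif. 1 do. on the assumption that this spelling is accepted by older interpreters as well as current ones.

```j
sumbelow=: verb define     NB. sum of the non-negative integers below y
  total=. 0
  for_k. i. y do.          NB. k takes each item of i. y in turn
    total=. total + k
  end.
  total
)

classify=: verb define     NB. describe the sign of a single number
  if. y < 0 do. 'negative'
  elseif. y = 0 do. 'zero'
  elseif. 1 do. 'positive'
  end.
)
```

With these definitions loaded, sumbelow 5 gives 10 and classify _3 gives negative.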
See also
- A+ (programming language) - an APL dialect
- K (programming language) - another APL-influenced language
- Q (programming language from Kx Systems) - the language of KDB+ and a new merged version of K and KSQL
External links
- JSoftware - Creators of J, currently gratis for all uses
- J Wiki - Showcase, documentation, articles, etc.
- J Forum Archives - Discussion of the language
- Cliff Reiter - Chaos, fractals, and mathematical symmetries, in J
- Ewart Shaw - Bayesian inference, medical statistics, and numerical methods, using J
- Keith Smillie - Statistical applications of array programming languages, especially J
- John Howland - Research on parallelization of array programming languages, especially J