ATS (programming language)
Encyclopedia
ATS is a programming language whose stated purpose is to support theorem proving in combination with practical programming through the use of advanced type system
s. The performance of ATS has been demonstrated to be comparable to that of the C
and C++
programming languages. By using theorem proving, and strict type checking, the compiler can detect and prove that its implemented functions are not susceptible to bugs such as division by zero
, memory leaks
, buffer overflow
, and other forms of memory corruption
by verifying pointer arithmetic and reference counting
before the program compiles.
programming languages. An earlier language, Dependent ML
, by the same author has been incorporated by the language.
in combination with practical programming. With theorem proving one can prove, for instance, that an implemented function does not produce memory leaks. It also prevents other bugs that might otherwise only be found during testing. It incorporates a system similar to those of proof assistants which usually only aimed at verify mathematical proofs -- except ATS uses this ability to prove that the implementations of its functions operate correctly, and produce the expected output.
As a simple example, in a function using division, the programmer may prove that the divisor will never equal zero, preventing a division by zero
error. Let's say, the divisor 'X' was computed as 5 times the length of list 'A'. One can prove, that in the case of a non-empty list, 'X' is non-zero, since 'X' is the product of two non-zero numbers (5 and the length of 'A'). A more practical example would be proving through reference counting
that the retain count on an allocated block of memory is being counted correctly for each pointer. Then one can know, and quite literally prove, that the object will not be deallocated prematurely, and that memory leaks
will not occur.
The beauty of the ATS system is that since all theorem proving occurs strictly within the compiler, it has NO effect on the speed of the executable program. ATS code is often harder to compile than standard C
code, but once it compiles the programmer can be certain that it is running correctly to exactly the degree specified by their proofs.
In ATS proofs are separate from implementation, so it is possible to implement a function without proving it if the programmer so desires.
val x : @(int, char) = @(15, 'c') // x.0 = 15 ; x.1 = 'c'
val @(a, b) = x // pattern matching binding, a= 15, b='c'
val x = @{first=15, second='c'} // x.first = 15
val @{first=a, second=b} = x // a= 15, b='c'
val @{second=b, ...} = x // with omission, b='c'
prefix ' means indirect or boxed allocation
val x : '(int, char) = '(15, 'c') // x.0 = 15 ; x.1 = 'c'
val '(a, b) = x // a= 15, b='c'
val x = '{first=15, second='c'} // x.first = 15
val '{first=a, second=b} = x // a= 15, b='c'
val '{second=b, ...} = x // b='c'
special
With '|' as separator, some functions return wrapped the result value with an evaluation of predicates
val ( predicate_proofs | values) = myfunct params
[...] existential quantification
(...) parenthetical expression or tuple
(... | ...) (proofs | values)
@(...) flat tuple or variadic function
parameters tuple (see example's printf)
@[byte][BUFLEN] type of an array of BUFLEN values of type byte
@[byte][BUFLEN] array instance
@[byte][BUFLEN](0) array initialized to 0
sortdef nat = {a: int | a >= 0 } // from prelude: ∀ a ∈ int ...
typedef String = [a:nat] string(a) // [..]: ∃ a ∈ nat ...
type (as sort): generic sort for elements with the length of a pointer word, to be used in type parameterized polymorphic functions
// {..}: ∀ a,b ∈ type ...
fun {a,b:type} swap_type_type (xy: @(a, b)): @(b, a) = (xy.1, xy.0)
t@ype: linear version of previous type with abstracted length. It supersets type
viewtype: a domain class like type with a view (memory association)
viewt@ype: linear version of viewtype with abstracted length. It supersets viewtype
view: relation of a Type and a memory location. The infix @ is its most common constructor
fun {a:t@ype} ptr_get0 {l:addr} (pf: a @ l | p: ptr l): @(a @ l | a)
fun {a:t@ype} ptr_set0 {l:addr} (pf: a? @ l | p: ptr l, x: a): @(a @ l | void)
viewdef array_v (a:viewt@ype, n:int, l: addr) = @[a][n] @ l
T? :possibly uninitialized type
predicates:
given r = fact(n) ; r1 = fact(n-1)
in ATS code:
dataprop FACT (int, int) =
| FACTbas (0, 1) // basic case: FACT(0, 1)
| {n:int | n > 0} {r,r1:int} FACTind (n, r) of (FACT (n-1, r1), MUL (n, r1, r)) // inductive case
where FACT (int, int) is a proof type
" proving through the construction dataprop.
The evaluation of fact1(n-1) returns a pair (proof_n_minus_1 | result_of_n_minus_1) which is used in the calculation of fact1(n). The proofs express the predicates of the proposition.
Compilation (compiles through gcc
, without setting up garbage collection
unless explicitly stated with -D_ATS_GCATS )
atscc fact1.dats -o fact1
./fact1 4
compiles and gives the expected result
dataview array_v (a: viewt@ype+, int, addr) =
| {l:addr} array_v_none (a, 0, l)
| {n:nat} {l:addr} array_v_some (a, n+1, l) of (a @ l, array_v (a, n, l+sizeof a))
datatype workday = Mon | Tue | Wed | Thu | Fri
lists
datatype list0 (a:t@ype) = list0_cons (a) of (a, list0 a) | list0_nil (a)
var res: int with pf_res = 1 // introduces pf_res as an alias of view @ (res)
on stack array allocation:
var !p_buf with pf_buf = @[byte][BUFLEN](0) // pf_buf = @[byte][BUFLEN](0) @ p_buf
See val and var declarations
Type system
A type system associates a type with each computed value. By examining the flow of these values, a type system attempts to ensure or prove that no type errors can occur...
s. The performance of ATS has been demonstrated to be comparable to that of the C
C (programming language)
C is a general-purpose computer programming language developed between 1969 and 1973 by Dennis Ritchie at the Bell Telephone Laboratories for use with the Unix operating system....
and C++
C++
C++ is a statically typed, free-form, multi-paradigm, compiled, general-purpose programming language. It is regarded as an intermediate-level language, as it comprises a combination of both high-level and low-level language features. It was developed by Bjarne Stroustrup starting in 1979 at Bell...
programming languages. By using theorem proving, and strict type checking, the compiler can detect and prove that its implemented functions are not susceptible to bugs such as division by zero
Division by zero
In mathematics, division by zero is division where the divisor is zero. Such a division can be formally expressed as a / 0 where a is the dividend . Whether this expression can be assigned a well-defined value depends upon the mathematical setting...
, memory leaks
Memory leak
A memory leak, in computer science , occurs when a computer program consumes memory but is unable to release it back to the operating system. In object-oriented programming, a memory leak happens when an object is stored in memory but cannot be accessed by the running code...
, buffer overflow
Buffer overflow
In computer security and programming, a buffer overflow, or buffer overrun, is an anomaly where a program, while writing data to a buffer, overruns the buffer's boundary and overwrites adjacent memory. This is a special case of violation of memory safety....
, and other forms of memory corruption
Memory corruption
Memory corruption happens when the contents of a memory location are unintentionally modified due to programming errors; this is known as violating memory safety. When the corrupted memory contents are used later in the computer program, it leads either to program crash or to strange and bizarre...
by verifying pointer arithmetic and reference counting
Reference counting
In computer science, reference counting is a technique of storing the number of references, pointers, or handles to a resource such as an object, block of memory, disk space or other resource...
before the program compiles.
History
ATS is derived mostly from the ML and Objective CamlObjective Caml
OCaml , originally known as Objective Caml, is the main implementation of the Caml programming language, created by Xavier Leroy, Jérôme Vouillon, Damien Doligez, Didier Rémy and others in 1996...
programming languages. An earlier language, Dependent ML
Dependent ML
Dependent ML is an experimental functional programming language proposed by Hongwei Xi and Frank Pfenning. Dependent ML extends ML by a restricted notion of dependent types: types may be dependent on static indices of type Nat...
, by the same author has been incorporated by the language.
Theorem proving
The primary focus of ATS is to support theorem provingFormal verification
In the context of hardware and software systems, formal verification is the act of proving or disproving the correctness of intended algorithms underlying a system with respect to a certain formal specification or property, using formal methods of mathematics .- Usage :Formal verification can be...
in combination with practical programming. With theorem proving one can prove, for instance, that an implemented function does not produce memory leaks. It also prevents other bugs that might otherwise only be found during testing. It incorporates a system similar to those of proof assistants which usually only aimed at verify mathematical proofs -- except ATS uses this ability to prove that the implementations of its functions operate correctly, and produce the expected output.
As a simple example, in a function using division, the programmer may prove that the divisor will never equal zero, preventing a division by zero
Division by zero
In mathematics, division by zero is division where the divisor is zero. Such a division can be formally expressed as a / 0 where a is the dividend . Whether this expression can be assigned a well-defined value depends upon the mathematical setting...
error. Let's say, the divisor 'X' was computed as 5 times the length of list 'A'. One can prove, that in the case of a non-empty list, 'X' is non-zero, since 'X' is the product of two non-zero numbers (5 and the length of 'A'). A more practical example would be proving through reference counting
Reference counting
In computer science, reference counting is a technique of storing the number of references, pointers, or handles to a resource such as an object, block of memory, disk space or other resource...
that the retain count on an allocated block of memory is being counted correctly for each pointer. Then one can know, and quite literally prove, that the object will not be deallocated prematurely, and that memory leaks
Memory leak
A memory leak, in computer science , occurs when a computer program consumes memory but is unable to release it back to the operating system. In object-oriented programming, a memory leak happens when an object is stored in memory but cannot be accessed by the running code...
will not occur.
The beauty of the ATS system is that since all theorem proving occurs strictly within the compiler, it has NO effect on the speed of the executable program. ATS code is often harder to compile than standard C
C (programming language)
C is a general-purpose computer programming language developed between 1969 and 1973 by Dennis Ritchie at the Bell Telephone Laboratories for use with the Unix operating system....
code, but once it compiles the programmer can be certain that it is running correctly to exactly the degree specified by their proofs.
In ATS proofs are separate from implementation, so it is possible to implement a function without proving it if the programmer so desires.
Data representation
According to the author (Hongwei Xi), ATS's efficiency is largely due to the way that data is represented in the language and tail-call optimizations (which are generally important for the efficiency of functional programming languages). Data can be stored in a flat or unboxed representation rather than a boxed representation.Basic types
- bool (true, false)
- int (255, 0377, 0xFF), unary minus as ~
- double
- char 'a'
- string "abc"
Tuples and records
prefix @ or none means direct, flat or unboxed allocationval x : @(int, char) = @(15, 'c') // x.0 = 15 ; x.1 = 'c'
val @(a, b) = x // pattern matching binding, a= 15, b='c'
val x = @{first=15, second='c'} // x.first = 15
val @{first=a, second=b} = x // a= 15, b='c'
val @{second=b, ...} = x // with omission, b='c'
prefix ' means indirect or boxed allocation
val x : '(int, char) = '(15, 'c') // x.0 = 15 ; x.1 = 'c'
val '(a, b) = x // a= 15, b='c'
val x = '{first=15, second='c'} // x.first = 15
val '{first=a, second=b} = x // a= 15, b='c'
val '{second=b, ...} = x // b='c'
special
With '|' as separator, some functions return wrapped the result value with an evaluation of predicates
val ( predicate_proofs | values) = myfunct params
Common
{...} universal quantification[...] existential quantification
(...) parenthetical expression or tuple
(... | ...) (proofs | values)
@(...) flat tuple or variadic function
Variadic function
In computer programming, a variadic function is a function of indefinite arity, i.e., one which accepts a variable number of arguments. Support for variadic functions differs widely among programming languages....
parameters tuple (see example's printf)
@[byte][BUFLEN] type of an array of BUFLEN values of type byte
@[byte][BUFLEN] array instance
@[byte][BUFLEN](0) array initialized to 0
Dictionary
sort:domainsortdef nat = {a: int | a >= 0 } // from prelude: ∀ a ∈ int ...
typedef String = [a:nat] string(a) // [..]: ∃ a ∈ nat ...
type (as sort): generic sort for elements with the length of a pointer word, to be used in type parameterized polymorphic functions
// {..}: ∀ a,b ∈ type ...
fun {a,b:type} swap_type_type (xy: @(a, b)): @(b, a) = (xy.1, xy.0)
t@ype: linear version of previous type with abstracted length. It supersets type
viewtype: a domain class like type with a view (memory association)
viewt@ype: linear version of viewtype with abstracted length. It supersets viewtype
view: relation of a Type and a memory location. The infix @ is its most common constructor
- T @ L asserts that there is a view of type T at location L
fun {a:t@ype} ptr_get0 {l:addr} (pf: a @ l | p: ptr l): @(a @ l | a)
fun {a:t@ype} ptr_set0 {l:addr} (pf: a? @ l | p: ptr l, x: a): @(a @ l | void)
- the type of ptr_get0 (T) is ∀ l : addr . ( T @ l | ptr( l ) ) -> ( T @ l | T) // see manual, section 7.1. Safe Memory Access through Pointers
viewdef array_v (a:viewt@ype, n:int, l: addr) = @[a][n] @ l
T? :possibly uninitialized type
propositions
dataprop expresses predicates as algebraic typespredicates:
- FACT( n, r) if and only ifIf and only ifIn logic and related fields such as mathematics and philosophy, if and only if is a biconditional logical connective between statements....
fact(n) = r - MUL( n, m, prod) iffIFFIFF, Iff or iff may refer to:Technology/Science:* Identification friend or foe, an electronic radio-based identification system using transponders...
n * m = prod
given r = fact(n) ; r1 = fact(n-1)
FACT(n, r) =
FACT(0, 1)
| FACT(n, r) iff FACT (n-1, r1) and MUL (n, r1, r) // for n > 0
in ATS code:
dataprop FACT (int, int) =
| FACTbas (0, 1) // basic case: FACT(0, 1)
| {n:int | n > 0} {r,r1:int} FACTind (n, r) of (FACT (n-1, r1), MUL (n, r1, r)) // inductive case
where FACT (int, int) is a proof type
Example
Non tail-recursive factorial with proposition or "TheoremTheorem
In mathematics, a theorem is a statement that has been proven on the basis of previously established statements, such as other theorems, and previously accepted statements, such as axioms...
" proving through the construction dataprop.
The evaluation of fact1(n-1) returns a pair (proof_n_minus_1 | result_of_n_minus_1) which is used in the calculation of fact1(n). The proofs express the predicates of the proposition.
Compilation (compiles through gcc
GNU Compiler Collection
The GNU Compiler Collection is a compiler system produced by the GNU Project supporting various programming languages. GCC is a key component of the GNU toolchain...
, without setting up garbage collection
Garbage collection (computer science)
In computer science, garbage collection is a form of automatic memory management. The garbage collector, or just collector, attempts to reclaim garbage, or memory occupied by objects that are no longer in use by the program...
unless explicitly stated with -D_ATS_GCATS )
atscc fact1.dats -o fact1
./fact1 4
compiles and gives the expected result
pattern matching exhaustivity
as in case+, val+, type+, viewtype+, ...- with suffix '+' the compiler issues an error in case of non exhaustive alternatives
- without suffix the compiler issues a warning
- with '-' as suffix, avoids exhaustivity control
modules
staload "foo.sats" // foo.sats is loaded and then opened into the current namespace
staload F "foo.sats" // to use identifiers qualified as $F.bar
dynload "foo.dats" // loaded dynamically at run-time
dataview
Dataviews are often declared to encode recursively defined relations on linear resources.dataview array_v (a: viewt@ype+, int, addr) =
| {l:addr} array_v_none (a, 0, l)
| {n:nat} {l:addr} array_v_some (a, n+1, l) of (a @ l, array_v (a, n, l+sizeof a))
datatype / dataviewtype
Datatypesdatatype workday = Mon | Tue | Wed | Thu | Fri
lists
datatype list0 (a:t@ype) = list0_cons (a) of (a, list0 a) | list0_nil (a)
dataviewtype
A dataviewtype is similar to a datatype, but it is linear. With a dataviewtype, the programmer is allowed to explicitly free (or deallocate) in a safe manner the memory used for storing constructors associated with the dataviewtype.variables
local variablesvar res: int with pf_res = 1 // introduces pf_res as an alias of view @ (res)
on stack array allocation:
var !p_buf with pf_buf = @[byte][BUFLEN](0) // pf_buf = @[byte][BUFLEN](0) @ p_buf
See val and var declarations
External links
- ATS home page
- Manual Draft. Some examples refer to features or routines not present in the release (Anairiats-0.1.6) (e.g.: print overload for strbuf, and using its array examples gives errmsgs like "use of array subscription is not supported".)
- ATS for ML programmers and other articles