Cyclone programming language
The Cyclone programming language is intended to be a safe dialect of the C language. Cyclone is designed to avoid buffer overflows and other vulnerabilities that are endemic in C programs, without losing the power and convenience of C as a tool for system programming.
Cyclone development was started as a joint project of AT&T Labs Research and Greg Morrisett's group at Cornell in 2001. Version 1.0 was released on May 8, 2006.
Language features
Cyclone attempts to avoid some of the common pitfalls of C, while still maintaining its look and performance. To this end, Cyclone places the following limits on programs:
- NULL checks are inserted to prevent segmentation faults
- Pointer arithmetic is limited
- Pointers must be initialized before use (this is enforced by definite assignment analysis)
- Dangling pointers are prevented through region analysis and limits on free
- Only "safe" casts and unions are allowed
- goto into scopes is disallowed
- switch labels in different scopes are disallowed
- Pointer-returning functions must execute return
- setjmp and longjmp are not supported
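The first limit in the list above, inserted NULL checks, can be pictured in plain C. The following is a minimal sketch and not Cyclone's actual implementation: checked_read is a hypothetical helper name, and reporting the failure through a return code is an assumption made for illustration.

```c
#include <stddef.h>

/* Hypothetical helper: reads *p only after a NULL check, which is
   roughly the kind of check a safe compiler would insert before
   dereferencing a plain `*` pointer.
   Returns 0 on success and -1 if p is NULL. The return-code
   convention is an assumption of this sketch. */
int checked_read(const int *p, int *out)
{
    if (p == NULL)
        return -1;      /* report the error instead of faulting */
    *out = *p;
    return 0;
}
```

In hand-written C, every caller has to remember to test the result. When the compiler inserts the check, it cannot be forgotten.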
To maintain the tool set that C programmers are used to, Cyclone provides the following extensions:
- Never-NULL pointers do not require NULL checks
- "Fat" pointers support pointer arithmetic with run-time bounds checking
- Growable regions support a form of safe manual memory management
- Garbage collection for heap-allocated values
- Tagged unions support type-varying arguments
- Injections help automate the use of tagged unions for programmers
- Polymorphism replaces some uses of void *
- varargs are implemented as fat pointers
- Exceptions replace some uses of setjmp and longjmp
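To illustrate what the tagged unions in the list above provide, here is a rough sketch of the same idea written by hand in plain C. It is not Cyclone syntax; the names value_tag, value, and value_format are hypothetical and chosen only for this example.

```c
#include <stdio.h>

/* Hypothetical tagged union: the tag records which member is valid. */
enum value_tag { VALUE_INT, VALUE_TEXT };

struct value {
    enum value_tag tag;
    union {
        int         i;
        const char *text;
    } as;
};

/* Formats v into buf, reading only the member that the tag marks as
   valid. Returns the snprintf result, or -1 for an unknown tag. */
int value_format(const struct value *v, char *buf, size_t len)
{
    switch (v->tag) {
    case VALUE_INT:
        return snprintf(buf, len, "%d", v->as.i);
    case VALUE_TEXT:
        return snprintf(buf, len, "%s", v->as.text);
    }
    return -1;  /* tag value outside the enum */
}
```

In C nothing stops code from reading the wrong member. A language with built-in tagged unions can enforce the link between tag and member.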
For a better high-level introduction to Cyclone, the reasoning behind Cyclone and the source of these lists, see this paper.
Cyclone looks, in general, much like C, but it should be viewed as a C-like language.
Pointer/reference types
Cyclone implements three kinds of reference (following C terminology these are called pointers):
- * (the normal type)
- @ (the never-NULL pointer), and
- ? (the only type with pointer arithmetic allowed, "fat" pointers).
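The @ and ? kinds listed above have no direct counterpart in C, but their intent can be approximated by hand. The following is a sketch under stated assumptions, not a description of how Cyclone represents pointers internally: struct fat_chars, bounded_strlen, and read_nonnull are hypothetical names introduced only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical C stand-in for a Cyclone `?` ("fat") pointer:
   the address travels together with the number of elements. */
struct fat_chars {
    const char *base;   /* first element, may be NULL */
    size_t      size;   /* number of elements reachable from base */
};

/* Length of the string in s, never reading past s.size elements.
   Returns s.size when no NUL terminator is found inside the bounds. */
size_t bounded_strlen(struct fat_chars s)
{
    size_t i;
    if (s.base == NULL)
        return 0;
    for (i = 0; i < s.size; i++) {
        if (s.base[i] == '\0')
            return i;
    }
    return s.size;
}

/* Emulates a never-NULL `@` parameter with a run-time assertion.
   In Cyclone the property is carried by the pointer type instead. */
int read_nonnull(const int *p)
{
    assert(p != NULL);
    return *p;
}
```

Carrying the size next to the address costs extra storage per pointer, but it lets every access be checked against a known bound.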
The purpose of introducing these new pointer types is to avoid common problems when using pointers. Take for instance a function called foo that takes a pointer to an int. Although the person who wrote the function foo could have inserted NULL checks, let us assume that for performance reasons they did not. Calling foo(NULL); will result in undefined behavior (typically, although not necessarily, a SIGSEGV being sent to the application). To avoid such problems, Cyclone introduces the @ pointer type, which can never be NULL. Thus, the "safe" version of foo would declare its parameter with @ in place of *. This tells the Cyclone compiler that the argument to foo should never be NULL, avoiding the aforementioned undefined behavior. The simple change of * to @ saves the programmer from having to write NULL checks and the operating system from having to trap NULL pointer dereferences.
Because neither * nor @ pointers allow pointer arithmetic, these limits can be a rather large stumbling block for most C programmers, who are used to being able to manipulate their pointers directly with arithmetic. Although pointer arithmetic is convenient, it can lead to buffer overflows and other "off-by-one"-style mistakes. To avoid this, the ? pointer type is delimited by a known bound, the size of the array. Although this adds overhead due to the extra information stored about the pointer, it improves safety and security. Take for instance a simple (and naïve) strlen function, written in C, that assumes that the string being passed in is terminated by NUL ('\0'). However, what would happen if char buf[] = {'h','e','l','l','o','!'}; were passed to this function? This is perfectly legal in C, yet would cause strlen to iterate through memory not necessarily associated with the string s. There are functions, such as strnlen, which can be used to avoid such problems, but these functions are not standard with every implementation of ANSI C. The Cyclone version of strlen is not so different from the C version, except that strlen bounds itself by the length of the array passed to it, thus not going over the actual length.
Each of the kinds of pointer type can be safely cast to each of the others, and arrays and strings are automatically cast to ? by the compiler. (Casting from ? to * invokes a bounds check, and casting from ? to @ invokes both a NULL check and a bounds check. Casting from * to ? results in no checks whatsoever; the resulting ? pointer has a size of 1.)
Dangling pointers and region analysis
Consider a C function, itoa, that returns a pointer to one of its own local variables. This returns an object that is allocated on the stack of the function itoa, which is not available after the function returns. While gcc and other compilers will warn about such code, a variant that first copies the address into a local pointer z and then returns z will typically compile without warnings.
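The listing for itoa is not part of this text, so the sketch below is an assumed reconstruction of its shape rather than the original code. The unsafe variant appears only inside a comment, since calling it is undefined behavior in C. The buffer size, the use of sprintf, and the caller-supplied-buffer rewrite itoa_into are all assumptions made for illustration.

```c
#include <stdio.h>

/* Assumed shape of the unsafe variant, kept in a comment because
   using its result is undefined behavior in C:

       char *itoa(int i)
       {
           char buf[20], *z;
           sprintf(buf, "%d", i);
           z = buf;
           return z;        // points into itoa's dead stack frame
       }
*/

/* Hypothetical safe rewrite: the caller owns the buffer, so the
   returned pointer stays valid after the call. Returns NULL if the
   buffer is missing or too small for the formatted number. */
char *itoa_into(int i, char *buf, size_t len)
{
    int n;
    if (buf == NULL || len == 0)
        return NULL;
    n = snprintf(buf, len, "%d", i);
    if (n < 0 || (size_t)n >= len)
        return NULL;
    return buf;
}
```

The C rewrite relies on the programmer choosing a safe ownership scheme. A compiler that tracks which region each pointer refers to can reject the unsafe variant outright.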
Cyclone does region analysis of each segment of code, preventing dangling pointers, such as the one returned from this version of itoa. All of the local variables in a given scope are considered to be part of the same region, separate from the heap or any other local region. Thus, when analyzing itoa, the compiler would see that z is a pointer into the local stack, and would report an error.
Examples
The best example to start with is the classic Hello world program:
External links
- Cyclone Homepage
- Cyclone 1.0 source code RPM
- Cyclone - Source code repositories
- Cyclone - FAQ
- Cyclone for C programmers