Automata-Based Programming
Encyclopedia
Automata-based programming is a programming paradigm
Programming paradigm
A programming paradigm is a fundamental style of computer programming. Paradigms differ in the concepts and abstractions used to represent the elements of a program and the steps that compose a computation A programming paradigm is a fundamental style of computer programming. (Compare with a...

 in which the program or its part is thought of as a model of a finite state machine
Finite state machine
A finite-state machine or finite-state automaton , or simply a state machine, is a mathematical model used to design computer programs and digital logic circuits. It is conceived as an abstract machine that can be in one of a finite number of states...

 (FSM) or any other (often more complicated) formal automaton (see automata theory
Automata theory
In theoretical computer science, automata theory is the study of abstract machines and the computational problems that can be solved using these machines. These abstract machines are called automata...

). Sometimes a potentially-infinite set of possible states is introduced, and such a set can have a complicated structure, not just an enumeration.

FSM-based programming is generally the same, but, formally speaking, doesn't cover all possible variants as FSM stands for finite state machine
Finite state machine
A finite-state machine or finite-state automaton , or simply a state machine, is a mathematical model used to design computer programs and digital logic circuits. It is conceived as an abstract machine that can be in one of a finite number of states...

 and automata-based programming doesn't necessarily employ FSMs in the strict sense.

The following properties are key indicators for automata-based programming:
  1. The time period of the program's execution is clearly separated down to the steps of the automaton. Each of the steps is effectively an execution of a code section (same for all the steps), which has a single entry point. Such a section can be a function or other routine, or just a cycle body. The step section might be divided down to subsection to be executed depending on different states, although this is not necessary.
  2. Any communication between the steps is only possible via the explicitly noted set of variables named the state. Between any two steps, the program (or its part created using the automata-based technique) can not have implicit components of its state, such as local (stack) variables' values, return addresses, the current instruction pointer etc. That is, the state of the whole program, taken at any two moments of entering the step of the automaton, can only differ in the values of the variables being considered as the state of the automaton.


The whole execution of the automata-based code is a (possibly explicit) cycle of the automaton's steps.

Another reason to use the notion of automata-based programming is that the programmer's style of thinking about the program in this technique is very similar to the style of thinking used to solve math-related tasks using Turing machine
Turing machine
A Turing machine is a theoretical device that manipulates symbols on a strip of tape according to a table of rules. Despite its simplicity, a Turing machine can be adapted to simulate the logic of any computer algorithm, and is particularly useful in explaining the functions of a CPU inside a...

, Markov algorithm
Markov algorithm
In theoretical computer science, a Markov algorithm is a string rewriting system that uses grammar-like rules to operate on strings of symbols. Markov algorithms have been shown to be Turing-complete, which means that they are suitable as a general model of computation and can represent any...

 etc.

Example

Consider a program in C
C (programming language)
C is a general-purpose computer programming language developed between 1969 and 1973 by Dennis Ritchie at the Bell Telephone Laboratories for use with the Unix operating system....

 that reads a text from standard input stream, line by line, and prints the first word of each line. It is clear we need first to read and skip the leading spaces, if any, then read characters of the first word and print them until the word ends, and then read and skip all the remaining characters until the end-of-line character is encountered. Upon reaching the end of line character (regardless of the stage), we restart the algorithm from the beginning, and upon encountering the end of file condition (regardless of the stage), we terminate the program.

Traditional (imperative) program in C

The program which solves the example task in traditional (imperative
Imperative programming
In computer science, imperative programming is a programming paradigm that describes computation in terms of statements that change a program state...

) style can look something like this:
  1. include

int main(void)
{
int c;
do {
c = getchar;
while(c ' ')
c = getchar;
while(c != EOF && c != ' ' && c != '\n') {
putchar(c);
c = getchar;
}
putchar('\n');
while(c != EOF && c != '\n')
c = getchar;
} while(c != EOF);
return 0;
}

Automata-based style program

The same task can be solved by thinking in terms of finite state machines. Note that line parsing has three stages: skipping the leading spaces, printing the word and skipping the trailing characters. Let's call them states before, inside and after. The program may now look like this:
  1. include

int main(void)
{
enum states {
before, inside, after
} state;
int c;
state = before;
while((c = getchar) != EOF) {
switch(state) {
case before:
if(c '\n') {
putchar('\n');
} else
if(c != ' ') {
putchar(c);
state = inside;
}
break;
case inside:
switch(c) {
case ' ': state = after; break;
case '\n':
putchar('\n');
state = before;
break;
default: putchar(c);
}
break;
case after:
if(c '\n') {
putchar('\n');
state = before;
}
}
}
return 0;
}

Although the code now looks longer, it has at least one significant advantage: there's only one reading (that is, call to the getchar function) instruction in the program. Besides that, there's only one loop instead of the four the previous versions had.

In this program, the body of the while loop is the automaton step, and the loop itself is the cycle of the automaton's work.

The program implements (models) the work of a finite state machine shown on the picture. The N denotes the end of line character, the S denotes spaces, and the A stands for all the other characters. The automaton follows exactly one arrow on each step depending on the current state and the encountered character. Some state switches are accompanied with printing the character; such arrows are marked with asterisks.

It is not absolutely necessary to divide the code down to separate handlers for each unique state. Furthermore, in some cases the very notion of the state can be composed of several variables' values, so that it could be impossible to handle each possible state explicitly. In the discussed program it is possible to reduce the code length by noticing that the actions taken in response to the end of line character are the same for all the possible states. The following program is equal to the previous one but is a bit shorter:
  1. include

int main(void)
{
enum states {
before, inside, after
} state;
int c;
state = before;
while((c = getchar) != EOF) {
if(c '\n') {
putchar('\n');
state = before;
} else
switch(state) {
case before:
if(c != ' ') {
putchar(c);
state = inside;
}
break;
case inside:
if(c ' ') {
state = after;
} else {
putchar(c);
}
break;
case after:
break;
}
}
return 0;
}

A separate function for the automation step

The most important property of the previous program is that the automaton step code section is clearly localized. It is possible to demonstrate this property even better if we provide a separate function for it:
  1. include

enum states { before, inside, after };
void step(enum states *state, int c)
{
if(c '\n') {
putchar('\n');
*state = before;
} else
switch(*state) {
case before:
if(c != ' ') {
putchar(c);
*state = inside;
}
break;
case inside:
if(c ' ') {
*state = after;
} else {
putchar(c);
}
break;
case after:
break;
}
}
int main(void)
{
int c;
enum states state = before;
while((c = getchar) != EOF) {
step(&state, c);
}
return 0;
}

This example clearly demonstrates the basic properties of automata-based code:
  1. time periods of automaton step executions may not overlap
  2. the only information passed from the previous step to the next is the explicitly specified automaton state

Explicit state transition table

A finite automaton can be defined by an explicit state transition table
State transition table
In automata theory and sequential logic, a state transition table is a table showing what state a finite semiautomaton or finite state machine will move to, based on the current state and other inputs...

. Generally speaking, an automata-based program code can naturally reflect this approach. In the program below there's an array named the_table, which defines the table. The rows of the table stand for three states, while columns reflect the input characters (first for spaces, second for the end of line character, and the last is for all the other characters).

For every possible combination, the table contains the new state number and the flag, which determines whether the automaton must print the symbol. In a real life task, this could be more complicated; e.g., the table could contain pointers to functions to be called on every possible combination of conditions.
  1. include

enum states { before = 0, inside = 1, after = 2 };
struct branch {
unsigned char new_state:2;
unsigned char should_putchar:1;
};
struct branch the_table[3][3] = {
/* ' ' '\n' others */
/* before */ { {before,0}, {before,1}, {inside,1} },
/* inside */ { {after, 0}, {before,1}, {inside,1} },
/* after */ { {after, 0}, {before,1}, {after, 0} }
};
void step(enum states *state, int c)
{
int idx2 = (c ' ') ? 0 : (c '\n') ? 1 : 2;
struct branch *b = & the_table[*state][idx2];
*state = (enum states)(b->new_state);
if(b->should_putchar) putchar(c);
}
int main(void)
{
int c;
enum states state = before;
while((c = getchar) != EOF)
step(&state, c);
return 0;
}

Automation and Automata

Automata-based programming indeed closely matches the programming needs found in the field of automation
Automation
Automation is the use of control systems and information technologies to reduce the need for human work in the production of goods and services. In the scope of industrialization, automation is a step beyond mechanization...

.

A production cycle is commonly modelled as:
  • A sequence of stages stepping according to input data (from captors).
  • A set of actions performed depending on the current stage.


Various dedicated programming languages allow expressing such a model in more or less sophisticated ways.

Example Program

The example presented above could be expressed according to this view like in the following program. Here pseudo-code uses such conventions:
  • 'set' and 'reset' respectively activate & inactivate a logic variable (here a stage)
  • ':' is assignment, '=' is equality test


SPC : ' '
EOL : '\n'

states : (before, inside, after, end)

setState(c) {
if c=EOF then set end
if before and (c!=SPC and c!=EOL) then set inside
if inside and (c=SPC or c=EOL) then set after
if after and c=EOL then set before
}

doAction(c) {
if inside then write(c)
else if c=EOL then write(c)
}

cycle {
set before
loop {
c : readCharacter
setState(c)
doAction(c)
}
until end
}

The separation of routines expressing cycle progression on one side, and actual action on the other (matching input & output) allows clearer and simpler code.

Automation & Events

In the field of automation, stepping from step to step depends on input data coming from the machine itself. This is represented in the program by reading characters from a text. In reality, those data inform about position, speed, temperature, etc of critical elements of a machine.

Like in GUI
Gui
Gui or guee is a generic term to refer to grilled dishes in Korean cuisine. These most commonly have meat or fish as their primary ingredient, but may in some cases also comprise grilled vegetables or other vegetarian ingredients. The term derives from the verb, "gupda" in Korean, which literally...

 programming, changes in the machine state can thus be considered as events causing the passage from a state to another, until the final one is reached. The combination of possible states can generate a wide variety of events, thus defining a more complex production cycle. As a consequence, cycles are usually far to be simple linear sequences. There are commonly parallel branches running together and alternatives selected according to different events, schematically represented below:

s:stage c:condition

s1
|
|-c2
|
s2
|
----------
| |
|-c31 |-c32
| |
s31 s32
| |
|-c41 |-c42
| |
----------
|
s4

Using object-oriented capabilities

If the implementation language supports object-oriented programming
Object-oriented programming
Object-oriented programming is a programming paradigm using "objects" – data structures consisting of data fields and methods together with their interactions – to design applications and computer programs. Programming techniques may include features such as data abstraction,...

, it is reasonable to encapsulate the automaton into an object, thus hiding implementation details from the outer program. This approach is encompassed by a design pattern called state pattern
State pattern
The state pattern, which closely resembles Strategy Pattern, is a behavioral software design pattern, also known as the objects for states pattern. This pattern is used in computer programming to represent the state of an object. This is a clean way for an object to partially change its type at...

. For example, the object-oriented version of the program mentioned before can look like this, implemented in C++
C++
C++ is a statically typed, free-form, multi-paradigm, compiled, general-purpose programming language. It is regarded as an intermediate-level language, as it comprises a combination of both high-level and low-level language features. It was developed by Bjarne Stroustrup starting in 1979 at Bell...

:
  1. include

class StateMachine {
enum states { before = 0, inside = 1, after = 2 } state;
struct branch {
enum states new_state:2;
int should_putchar:1;
};
static struct branch the_table[3][3];
public:
StateMachine : state(before) {}
void FeedChar(int c) {
int idx2 = (c ' ') ? 0 : (c '\n') ? 1 : 2;
struct branch *b = & the_table[state][idx2];
state = b->new_state;
if(b->should_putchar) putchar(c);
}
};
struct StateMachine::branch StateMachine::the_table[3][3] = {
/* ' ' '\n' others */
/* before */ { {before,0}, {before,1}, {inside,1} },
/* inside */ { {after, 0}, {before,1}, {inside,1} },
/* after */ { {after, 0}, {before,1}, {after, 0} }
};
int main(void)
{
int c;
StateMachine machine;
while((c = getchar) != EOF)
machine.FeedChar(c);
return 0;
}

Note: To minimize changes not directly related to the subject of the article, the input/output functions from the standard library of C
C (programming language)
C is a general-purpose computer programming language developed between 1969 and 1973 by Dennis Ritchie at the Bell Telephone Laboratories for use with the Unix operating system....

 are being used. Note the use of the ternary operator, which could also be implemented as if-else.
Applications
Automata-based programming is widely used in lexical
Lexical analysis
In computer science, lexical analysis is the process of converting a sequence of characters into a sequence of tokens. A program or function which performs lexical analysis is called a lexical analyzer, lexer or scanner...

 and syntactic analyses.

Besides that, thinking in terms of automata (that is, breaking the execution process down to automaton steps and passing information from step to step through the explicit state) is necessary for event-driven programming
Event-driven programming
In computer programming, event-driven programming or event-based programming is a programming paradigm in which the flow of the program is determined by events—i.e., sensor outputs or user actions or messages from other programs or threads.Event-driven programming can also be defined as an...

 as the only alternative to using parallel processes or threads.

The notions of states and state machines are often used in the field of formal specification
Formal specification
In computer science, a formal specification is a mathematical description of software or hardware that may be used to develop an implementation. It describes what the system should do, not how the system should do it...

. For instance, UML
Unified Modeling Language
Unified Modeling Language is a standardized general-purpose modeling language in the field of object-oriented software engineering. The standard is managed, and was created, by the Object Management Group...

-based software architecture development uses state diagrams to specify the behaviour of the program. Also various communication protocols are often specified using the explicit notion of state (see, e.g., RFC 793).

Thinking in terms of automata (steps and states) can also be used to describe semantics of some programming language
Programming language
A programming language is an artificial language designed to communicate instructions to a machine, particularly a computer. Programming languages can be used to create programs that control the behavior of a machine and/or to express algorithms precisely....

s. For example, the execution of a program written in the Refal
Refal
Refal "is a functional programming language oriented toward symbol manipulation", including "string processing, translation, [and] artificial intelligence". It is one of the oldest members of this family, first conceived in 1966 as a theoretical tool with the first implementation appearing in 1968...

 language is described as a sequence of steps of a so-called abstract Refal machine; the state of the machine is a view (an arbitrary Refal expression without variables).

Continuations in the Scheme language require thinking in terms of steps and states, although Scheme itself is in no way automata-related (it is recursive). To make it possible the call/cc feature to work, implementation needs to be able to catch a whole state of the executing program, which is only possible when there's no implicit part in the state. Such a caught state is the very thing called continuation, and it can be considered as the state of a (relatively complicated) automaton. The step of the automaton is deducing the next continuation from the previous one, and the execution process is the cycle of such steps.

Alexander Ollongren in his book explains the so-called Vienna method of programming languages semantics description which is fully based on formal automata.

The STAT system http://www.cs.ucsb.edu/~seclab/projects/stat/index.html is a good example of using the automata-based approach; this system, besides other features, includes an embedded language called STATL which is purely automata-oriented.
History
Automata-based techniques were used widely in the domains where there are algorithms based on automata theory, such as formal language analyses.

One of the early papers on this is by Johnson et al., 1968.

One of the earliest mentions of automata-based programming as a general technique is found in the paper by Peter Naur
Peter Naur
Peter Naur is a Danish pioneer in computer science and Turing award winner. His last name is the N in the BNF notation , used in the description of the syntax for most programming languages...

, 1963. The author calls the technique Turing machine approach, however no real Turing machine
Turing machine
A Turing machine is a theoretical device that manipulates symbols on a strip of tape according to a table of rules. Despite its simplicity, a Turing machine can be adapted to simulate the logic of any computer algorithm, and is particularly useful in explaining the functions of a CPU inside a...

 is given in the paper; instead, the technique based on states and steps is described.
Compared against imperative and procedural programming
The notion of state
State (computer science)
In computer science and automata theory, a state is a unique configuration of information in a program or machine. It is a concept that occasionally extends into some forms of systems programming such as lexers and parsers....

 is not exclusive property of automata-based programming.
Generally speaking, state (or program state
Program state
One of the key concepts in computer programming is the idea of state, essentially a snapshot of the measure of various conditions in the system. Most programming languages require a considerable amount of state information in order to operate properly - information which is generally hidden from...

) appears during execution of any computer program
Computer program
A computer program is a sequence of instructions written to perform a specified task with a computer. A computer requires programs to function, typically executing the program's instructions in a central processor. The program has an executable form that the computer can use directly to execute...

, as a combination of all information that can change during the execution. For instance, a state of a traditional imperative
Imperative programming
In computer science, imperative programming is a programming paradigm that describes computation in terms of statements that change a program state...

 program consists of
  1. values of all variables and the information stored within dynamic memory
  2. values stored in registers
  3. stack contents (including local variables' values and return addresses)
  4. current value of the instruction pointer


These can be divided to the explicit part (such as values stored in variables) and the implicit part (return addresses and the instruction pointer).

Having said this, an automata-based program can be considered as a special case of an imperative program, in which implicit part of the state is minimized. The state of the whole program taken at the two distinct moments of entering the step code section can differ in the automaton state only. This simplifies the analysis of the program.
Object-oriented programming relationship
In the theory of object-oriented programming
Object-oriented programming
Object-oriented programming is a programming paradigm using "objects" – data structures consisting of data fields and methods together with their interactions – to design applications and computer programs. Programming techniques may include features such as data abstraction,...

 an object is said to have an internal state and is capable of receiving messages, responding to them, sending messages to other objects and changing the internal state during message handling. In more practical terminology, to call an object's method is considered the same as to send a message to the object.

Thus, on the one hand, objects from object-oriented programming can be considered as automata (or models of automata) whose state is the combination of internal fields, and one or more methods are considered to be the step. Such methods must not call each other nor themselves, neither directly nor indirectly, otherwise the object can not be considered to be implemented in an automata-based manner.

On the other hand, it is obvious that object is good for implementing a model of an automaton. When the automata-based approach is used within an object-oriented language, an automaton model is usually implemented by a class, the state is represented with internal (private) fields of the class, and the step is implemented as a method; such a method is usually the only non-constant public method of the class (besides constructors and destructors). Other public methods could query the state but don't change it. All the secondary methods (such as particular state handlers) are usually hidden within the private part of the class.
External links
The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK