Generator (computer science)
Encyclopedia
In computer science
, a generator is a special routine
that can be used to control the iteration
behaviour of a loop. A generator is very similar to a function that returns an array, in that a generator has parameters, can be called, and generates a sequence of values. However, instead of building an array containing all the values and returning them all at once, a generator yields the values one at a time, which requires less memory and allows the caller to get started processing the first few values immediately. In short, a generator looks like a function but behaves like an iterator
.
Generators can be implemented in terms of more expressive control flow
constructs, such as coroutine
s or first-class continuation
s.
(1975), were a prominent feature in the string manipulation language Icon (1977) and are now available in Python
, C#, JavaScript
, Ruby
and other languages. In CLU and C#, generators are called iterators, and in Ruby, enumerators.
inside loops. The first time that a generator invocation is reached in a loop, an iterator object
is created that encapsulates the state of the generator routine at its beginning, with arguments bound to the corresponding parameter
s. The generator's body is then executed in the context of that iterator until a special yield action is encountered; at that time, the value provided with the yield action is used as the value of the invocation expression. The next time the same generator invocation is reached in a subsequent iteration, the execution of the generator's body is resumed after the yield action, until yet another yield action is encountered. In addition to the yield action, execution of the generator body can also be terminated by a finish action,
at which time the innermost loop enclosing the generator invocation is terminated.
Because generators compute their yielded values only on demand, they are useful for representing sequences that would be expensive or impossible to compute at once. These include e.g. infinite sequences and live data streams.
In the presence of generators, loop constructs of a language can be reduced into a single loop ... end loop construct; all the usual loop constructs can then be comfortably simulated by using suitable generators in the right way.
In Python, a generator can be thought of as an iterator
that contains a frozen stack frame. Whenever the iterator's
The following extends the example above by using a generator expression to compute squares from the countfrom generator function:
Both of these examples utilise Generics, but this is not required.
It is possible to use multiple
and these sequences are also first-class values:
Some sequences are implemented imperatively (with private state variables) and some are implemented as (possibly infinite) lazy lists. Also, new struct definitions can have a property that specifies how they can be used as sequences.
But more directly, Racket comes with a generator library for a more traditional generator specification. For example,
Note that the Racket core implements powerful continuation features, providing general (re-entrant) continuations that are composable, and also delimited continuations. Using this, the generator library is implemented in Racket.
8.6, the generator mechanism is founded on named coroutine
s.
, with its lazy evaluation
model, everything is a generator - every datum created with a non-strict data constructor is generated on demand. For example,
where
which re-fetches values agreable with a predicate, and stops requesting new values as soon as a non-agreable one is encountered. The shared storage access is used as a universal mediator in Haskell. List comprehensions can be freely used:
and other collections frameworks, typically provide iterators for all collections.)
However, Java
does not have generators built into the language. This means that creating iterators is often much trickier than in languages with built-in generators, especially when the generation logic is complex. Because all state must be saved and restored every time an item is to be yielded from an iterator, it is not possible to store state in local variables or use built-in looping routines, as when generators are available; instead, all of this must be manually simulated, using object fields to hold local state and loop counters.
Even simple iterators built this way tend to be significantly bulkier than those using generators, with a lot of boilerplate code.
also does not natively provide generators, yet various library implementations exist, such as pygen.
Computer science
Computer science or computing science is the study of the theoretical foundations of information and computation and of practical techniques for their implementation and application in computer systems...
, a generator is a special routine
Subroutine
In computer science, a subroutine is a portion of code within a larger program that performs a specific task and is relatively independent of the remaining code....
that can be used to control the iteration
Iteration
Iteration means the act of repeating a process usually with the aim of approaching a desired goal or target or result. Each repetition of the process is also called an "iteration," and the results of one iteration are used as the starting point for the next iteration.-Mathematics:Iteration in...
behaviour of a loop. A generator is very similar to a function that returns an array, in that a generator has parameters, can be called, and generates a sequence of values. However, instead of building an array containing all the values and returning them all at once, a generator yields the values one at a time, which requires less memory and allows the caller to get started processing the first few values immediately. In short, a generator looks like a function but behaves like an iterator
Iterator
In computer programming, an iterator is an object that enables a programmer to traverse a container. Various types of iterators are often provided via a container's interface...
.
Generators can be implemented in terms of more expressive control flow
Control flow
In computer science, control flow refers to the order in which the individual statements, instructions, or function calls of an imperative or a declarative program are executed or evaluated....
constructs, such as coroutine
Coroutine
Coroutines are computer program components that generalize subroutines to allow multiple entry points for suspending and resuming execution at certain locations...
s or first-class continuation
Continuation
In computer science and programming, a continuation is an abstract representation of the control state of a computer program. A continuation reifies the program control state, i.e...
s.
History
Generators first appeared in CLUCLU programming language
CLU is a programming language created at MIT by Barbara Liskov and her students between 1974 and 1975. It was notable for its use of constructors for abstract data types that included the code that operated on them, a key step in the direction of object-oriented programming...
(1975), were a prominent feature in the string manipulation language Icon (1977) and are now available in Python
Python (programming language)
Python is a general-purpose, high-level programming language whose design philosophy emphasizes code readability. Python claims to "[combine] remarkable power with very clear syntax", and its standard library is large and comprehensive...
, C#, JavaScript
JavaScript
JavaScript is a prototype-based scripting language that is dynamic, weakly typed and has first-class functions. It is a multi-paradigm language, supporting object-oriented, imperative, and functional programming styles....
, Ruby
Ruby (programming language)
Ruby is a dynamic, reflective, general-purpose object-oriented programming language that combines syntax inspired by Perl with Smalltalk-like features. Ruby originated in Japan during the mid-1990s and was first developed and designed by Yukihiro "Matz" Matsumoto...
and other languages. In CLU and C#, generators are called iterators, and in Ruby, enumerators.
Uses
Generators are usually invokedExecution (computers)
Execution in computer and software engineering is the process by which a computer or a virtual machine carries out the instructions of a computer program. The instructions in the program trigger sequences of simple actions on the executing machine...
inside loops. The first time that a generator invocation is reached in a loop, an iterator object
Object (computer science)
In computer science, an object is any entity that can be manipulated by the commands of a programming language, such as a value, variable, function, or data structure...
is created that encapsulates the state of the generator routine at its beginning, with arguments bound to the corresponding parameter
Parameter (computer science)
In computer programming, a parameter is a special kind of variable, used in a subroutine to refer to one of the pieces of data provided as input to the subroutine. These pieces of data are called arguments...
s. The generator's body is then executed in the context of that iterator until a special yield action is encountered; at that time, the value provided with the yield action is used as the value of the invocation expression. The next time the same generator invocation is reached in a subsequent iteration, the execution of the generator's body is resumed after the yield action, until yet another yield action is encountered. In addition to the yield action, execution of the generator body can also be terminated by a finish action,
at which time the innermost loop enclosing the generator invocation is terminated.
Because generators compute their yielded values only on demand, they are useful for representing sequences that would be expensive or impossible to compute at once. These include e.g. infinite sequences and live data streams.
In the presence of generators, loop constructs of a language can be reduced into a single loop ... end loop construct; all the usual loop constructs can then be comfortably simulated by using suitable generators in the right way.
Python
An example Python generator:In Python, a generator can be thought of as an iterator
Iterator
In computer programming, an iterator is an object that enables a programmer to traverse a container. Various types of iterators are often provided via a container's interface...
that contains a frozen stack frame. Whenever the iterator's
next
method is called, Python resumes the frozen frame, which executes normally until the next yield
statement is reached. The generator's frame is then frozen again, and the yielded value is returned to the caller.Generator Expressions
Python has a syntax modeled on that of list comprehensions, called a generator expression that aids in the creation of generators.The following extends the example above by using a generator expression to compute squares from the countfrom generator function:
Icon
Every expression (including loops) is a generator. Printing squares from 0 to 20 can be achieved by writingC#
An example C# 2.0 generator (theyield
is available since C# version 2.0):Both of these examples utilise Generics, but this is not required.
It is possible to use multiple
yield return
statements and they are applied in sequence on each iteration:XL
In XL, iterators are the basis of 'for' loops:Racket
Racket provides several related facilities for generators. First, its for-loop forms work with sequences, which are a kind of a producer:and these sequences are also first-class values:
Some sequences are implemented imperatively (with private state variables) and some are implemented as (possibly infinite) lazy lists. Also, new struct definitions can have a property that specifies how they can be used as sequences.
But more directly, Racket comes with a generator library for a more traditional generator specification. For example,
Note that the Racket core implements powerful continuation features, providing general (re-entrant) continuations that are composable, and also delimited continuations. Using this, the generator library is implemented in Racket.
Tcl
In TclTcl
Tcl is a scripting language created by John Ousterhout. Originally "born out of frustration", according to the author, with programmers devising their own languages intended to be embedded into applications, Tcl gained acceptance on its own...
8.6, the generator mechanism is founded on named coroutine
Coroutine
Coroutines are computer program components that generalize subroutines to allow multiple entry points for suspending and resuming execution at certain locations...
s.
Ruby
Ruby supports generators (starting from version 1.9) in the form of the built-in Enumerator class.Haskell
In HaskellHaskell (programming language)
Haskell is a standardized, general-purpose purely functional programming language, with non-strict semantics and strong static typing. It is named after logician Haskell Curry. In Haskell, "a function is a first-class citizen" of the programming language. As a functional programming language, the...
, with its lazy evaluation
Lazy evaluation
In programming language theory, lazy evaluation or call-by-need is an evaluation strategy which delays the evaluation of an expression until the value of this is actually required and which also avoids repeated evaluations...
model, everything is a generator - every datum created with a non-strict data constructor is generated on demand. For example,
where
(:)
is a non-strict list constructor, cons, and $
is just a "called-with" operator, used for parenthesization. This uses the standard adaptor function,which re-fetches values agreable with a predicate, and stops requesting new values as soon as a non-agreable one is encountered. The shared storage access is used as a universal mediator in Haskell. List comprehensions can be freely used:
Java
Java has had a standard interface for implementing iterators since its early days, and since Java 5, the "foreach" construction makes it easy to loop over iterators that provide the java.lang.Iterable interface. (The Java collections frameworkJava collections framework
The Java collections framework is a set of classes and interfaces that implement commonly reusable collection data structures.Although it is a framework, it works in a manner of a library...
and other collections frameworks, typically provide iterators for all collections.)
However, Java
Java (programming language)
Java is a programming language originally developed by James Gosling at Sun Microsystems and released in 1995 as a core component of Sun Microsystems' Java platform. The language derives much of its syntax from C and C++ but has a simpler object model and fewer low-level facilities...
does not have generators built into the language. This means that creating iterators is often much trickier than in languages with built-in generators, especially when the generation logic is complex. Because all state must be saved and restored every time an item is to be yielded from an iterator, it is not possible to store state in local variables or use built-in looping routines, as when generators are available; instead, all of this must be manually simulated, using object fields to hold local state and loop counters.
Even simple iterators built this way tend to be significantly bulkier than those using generators, with a lot of boilerplate code.
Perl
Perl does not natively provide generators, but support is provided by the Coro::Generator module which uses the Coro co-routine framework. Example usage:Lisp
Common LispCommon Lisp
Common Lisp, commonly abbreviated CL, is a dialect of the Lisp programming language, published in ANSI standard document ANSI INCITS 226-1994 , . From the ANSI Common Lisp standard the Common Lisp HyperSpec has been derived for use with web browsers...
also does not natively provide generators, yet various library implementations exist, such as pygen.
See also
- List comprehension for another construct that generates a sequence of values
- IteratorIteratorIn computer programming, an iterator is an object that enables a programmer to traverse a container. Various types of iterators are often provided via a container's interface...
for the concept of producing a list one element at a time - Lazy evaluationLazy evaluationIn programming language theory, lazy evaluation or call-by-need is an evaluation strategy which delays the evaluation of an expression until the value of this is actually required and which also avoids repeated evaluations...
for producing values when needed - CorecursionCorecursionIn computer science, corecursion is a type of operation that is dual to recursion. Corecursion and codata allow total languages to work with infinite data structures such as streams. Corecursion is often used in conjunction with lazy evaluation. Corecursion can produce both finite and infinite data...
for potentially infinite data by recursion instead of yield - CoroutineCoroutineCoroutines are computer program components that generalize subroutines to allow multiple entry points for suspending and resuming execution at certain locations...
for even more generalization from subroutine - ContinuationContinuationIn computer science and programming, a continuation is an abstract representation of the control state of a computer program. A continuation reifies the program control state, i.e...
for generalization of control flow