Metaprogramming
Metaprogramming is the writing of computer programs that write or manipulate other programs (or themselves) as their data, or that do part of the work at compile time that would otherwise be done at runtime. In many cases, this lets programmers get more done in the time it would take to write all the code by hand, or gives programs the flexibility to handle new situations efficiently without recompilation.
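As a rough illustration of a program that treats another program as data and moves work ahead of run time, the sketch below uses Python's standard `ast` module to fold constant arithmetic in a piece of source text before it runs. The class name `FoldConstants`, the helper `fold_source` and the sample assignment are hypothetical, chosen only for this example; `ast.unparse` assumes Python 3.9 or later.

```python
import ast
import operator

# Arithmetic operators this sketch knows how to evaluate ahead of time.
FOLDABLE = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


class FoldConstants(ast.NodeTransformer):
    """Hypothetical transformer: replace `<number> op <number>` with its value."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # fold the operands first (bottom-up)
        fold = FOLDABLE.get(type(node.op))
        left, right = node.left, node.right
        if (
            fold is not None
            and isinstance(left, ast.Constant)
            and isinstance(right, ast.Constant)
            and isinstance(left.value, (int, float))
            and isinstance(right.value, (int, float))
        ):
            folded = ast.Constant(value=fold(left.value, right.value))
            return ast.copy_location(folded, node)
        return node


def fold_source(source):
    """Parse object-language source, transform the tree, and return new source."""
    tree = FoldConstants().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


print(fold_source("seconds_per_day = 60 * 60 * 24"))
```

Here Python plays both roles: it is the metalanguage (the transformer) and the object language (the source string being rewritten).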
The language in which the metaprogram is written is called the metalanguage. The language of the programs that are manipulated is called the object language. The ability of a programming language to be its own metalanguage is called reflection or reflexivity.
Reflection is a valuable language feature for metaprogramming. Having the programming language itself as a first-class data type (as in Lisp, Forth or REBOL) is also very useful. Generic programming invokes a metaprogramming facility within a language, in those languages that support it.
Metaprogramming usually works in one of three ways. The first is to expose the internals of the run-time engine to the programming code through application programming interfaces (APIs). The second is dynamic execution of string expressions that contain programming commands; thus, "programs can write programs". Although both approaches can be used in the same language, most languages tend to lean toward one or the other.
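The first two approaches can be sketched in Python, which happens to support both. The class `Greeter`, the class `Point` and the generated function `double` are hypothetical names used only for illustration; `vars`, `setattr`, `type` and `exec` are standard built-ins.

```python
# Approach 1: the runtime exposes its internals through an API.
class Greeter:
    def hello(self):
        return "hello"


# Inspect the class: list its public attribute names via reflection.
public_methods = [name for name in vars(Greeter) if not name.startswith("_")]

# Modify the class while the program runs by attaching a new method.
setattr(Greeter, "goodbye", lambda self: "goodbye")

# Build a brand-new class from data with the three-argument form of type().
Point = type("Point", (), {"x": 0, "y": 0})

# Approach 2: execute a string that contains program text.
source = "def double(n):\n    return 2 * n\n"
namespace = {}
exec(source, namespace)  # only appropriate for trusted strings
double = namespace["double"]

print(public_methods, Greeter().goodbye(), Point.x, double(21))
```

The string-execution route is the more flexible of the two but also the easier to misuse, since any text passed to `exec` runs with the full privileges of the program.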
The third way is to step outside the language entirely. General-purpose program transformation systems, which accept language descriptions and can carry out arbitrary transformations on those languages, are direct implementations of general metaprogramming. This allows metaprogramming to be applied to virtually any target language, regardless of whether that target language has any metaprogramming abilities of its own.
In statically typed functional languages
- Using dependent types makes it possible to prove that generated code is never invalid.
Staged meta-programming
- MetaML
- MetaOCaml
IBM/360 assembler
The IBM/360 and its derivatives had powerful Assembler macro facilities that were often used to generate complete programs or sections of programs (for different operating systems, for instance). The CICS transaction processing system, for example, was supplied with Assembler macros that generated COBOL statements as a pre-processing step.
Examples
A simple example of a metaprogram is a bash script that performs generative programming. The script (or program) generates a new 993-line program that prints out the numbers 1–992. This only illustrates how to use code to write more code; it is not the most efficient way to print a list of numbers. Nonetheless, a programmer can write and execute such a metaprogram in just a couple of minutes, and will have produced exactly 1000 lines of code in that time, counting the short metaprogram together with the 993 lines it generates.
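The same idea can be sketched in Python rather than bash. This is an analogue written for illustration, not the script the text describes: the function name `generate_program` and the choice to emit Python `print` calls are assumptions. The generated text has one header line plus 992 statements, matching the 993 lines mentioned above.

```python
import contextlib
import io


def generate_program(last=992):
    """Return the source text of a program that prints 1..last, one per line."""
    lines = ["#!/usr/bin/env python3"]  # header line, like a script's shebang
    for i in range(1, last + 1):
        lines.append(f"print({i})")  # one generated statement per number
    return "\n".join(lines) + "\n"


generated = generate_program()
print(len(generated.splitlines()), "lines generated")

# Run the generated program and capture what it prints.
captured = io.StringIO()
with contextlib.redirect_stdout(captured):
    exec(generated)  # the source is our own output, so it is trusted here
numbers = captured.getvalue().split()
print(numbers[0], "...", numbers[-1])
```

In practice the generated text would usually be written to a file and run separately; executing it in place simply keeps the sketch self-contained.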
A quine is a special kind of metaprogram that produces its own source code as its output.
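A compact Python quine is sketched below. To keep the check self-contained, the quine's text is built as a string and then executed, and the captured output is compared with that text. The helper `run_and_capture` is a hypothetical name for this example.

```python
import contextlib
import io

# Template of a classic two-line quine: %r re-inserts the string's own repr.
template = 's = %r\nprint(s %% s)'
quine_source = template % template  # the full text of the quine program


def run_and_capture(source):
    """Execute source in a fresh namespace and return what it printed."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, {})
    return buffer.getvalue()


output = run_and_capture(quine_source)
print(quine_source)
print(output == quine_source + "\n")  # print() appends the final newline
```

The trick is that the program stores a description of itself as data (`s`) and uses that data twice: once as the template and once as the value filled into it.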
Not all metaprogramming involves generative programming. If programs are modifiable at runtime, or if incremental compilation is available (as in C#, Forth, Frink, Groovy, JavaScript, Lisp, Lua, Perl, PHP, Python, REBOL, Ruby, Smalltalk, and Tcl), then techniques can be used to perform metaprogramming without actually generating source code.
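For instance, in Python a decorator can change the behavior of an existing function while the program runs, with no source text generated at any point. The names `count_calls` and `square` are hypothetical, used only for this sketch.

```python
import functools


def count_calls(func):
    """Wrap func so that each call is counted on the wrapper itself."""

    @functools.wraps(func)  # keep the original name and docstring
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)

    wrapper.calls = 0
    return wrapper


@count_calls
def square(n):
    return n * n


results = [square(n) for n in range(4)]
print(results, square.calls, square.__name__)
```

The decorator receives the function object itself, not its text, which is what distinguishes this style from the generative examples.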
Lisp is probably the quintessential language with metaprogramming facilities, both because of its historical precedence and because of the simplicity and power of its metaprogramming. In Lisp metaprogramming, the unquote operator (typically a comma) introduces code that is evaluated at program definition time rather than at run time. The metaprogramming language is thus identical to the host programming language, and existing Lisp routines can be directly reused for metaprogramming, if desired.
This approach has been implemented in other languages by incorporating an interpreter in the program, which works directly with the program's data. There are implementations of this kind for some common high-level languages, such as RemObjects' Pascal Script for Object Pascal.
One style of metaprogramming is to employ domain-specific programming languages (DSLs). A fairly common example of using DSLs involves generative metaprogramming: lex and yacc, two tools used to generate lexical analyzers and parsers, let the user describe the language using regular expressions and context-free grammars, and embed the complex algorithms required to efficiently parse the language.
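The flavor of a lex-style specification can be approximated in Python: the token rules are written as data (pairs of a name and a regular expression) and a scanner is built from them. This is only an analogue of the idea, not a description of how lex or yacc work internally; the rule table `TOKEN_RULES` and the function `make_lexer` are hypothetical.

```python
import re

# Declarative token description: (token name, regular expression).
TOKEN_RULES = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=]"),
    ("SKIP", r"\s+"),
]


def make_lexer(rules):
    """Build a tokenizer function from the rule table."""
    # Combine the rules into one pattern with a named group per token kind.
    master = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in rules))

    def tokenize(text):
        tokens = []
        position = 0
        while position < len(text):
            match = master.match(text, position)
            if match is None:
                raise SyntaxError(f"unexpected character {text[position]!r}")
            if match.lastgroup != "SKIP":  # whitespace is matched but dropped
                tokens.append((match.lastgroup, match.group()))
            position = match.end()
        return tokens

    return tokenize


tokenize = make_lexer(TOKEN_RULES)
print(tokenize("total = price * 3"))
```

The user of such a tool edits only the rule table; the scanning logic is supplied by the generator, which is the division of labor the paragraph above describes.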
Implementations
- ASF+SDF Meta Environment
- DMS Software Reengineering Toolkit
- Intentional Programming
- Joose (JavaScript)
- JetBrains MPS
- Moose (Perl)
- Nemerle
- Stratego/XT
- Template Haskell
See also
- Aspect weaver
- Comparison of code generation tools
- Compile-time reflection
- Inferential programming
- Instruction set simulator
- Interpreted language
- Metaobject
- Partial evaluation
- Self-interpreter
- Self-modifying code
- Source code generation
- Template metaprogramming