Hartmann pipeline
A Hartmann pipeline is an extension of the Unix pipeline concept, providing for more complex paths, multiple input/output streams, and other features. It is an example and extension of pipeline programming.
A Hartmann pipe is a non-procedural representation of a solution of a data processing problem as a dataflow. The error-prone step of translating the dataflow to a traditional procedural programming language is eliminated. Hartmann pipelines may thus be considered as an executable specification language.
The concept was developed by John Poul Hartmann (born 1946), a Danish engineer with IBM. It is available as the software product CMS/TSO Pipelines for a number of IBM platforms. A somewhat backlevel version is included with every level of VM/ESA and z/VM.
Overview
A pipeline consists of a collection of stages, joined together by stage separators. Stages can be written in a variety of languages, and are either filters that process data records or device drivers (sources and sinks) that read data into or write data out of the pipeline. Unlike other implementations of pipeline programming, Hartmann's design has multiple streams in and out of each stage and can interconnect them non-sequentially. Unlike many programming languages, pipelines have a very small amount of notation, limited to stage separators (typically "|"), pipeline separators (typically ";" or "?"), and label separators (":"). Due to common usage, the diskread stage is also known as < and diskwrite as >; however, all stages have names that are English words or make some sense in English.
A simple example that reads a disk file, separates records containing the string "Hello" from those that do not, and writes both sets of records to different disk files can be written as:
(end ;) < input.txt | A: locate /Hello/ | > found.txt ; A: | > notfound.txt
where the < stage reads the input disk file, the two > stages write the output disk files, and the locate stage separates the input stream into two output streams. The primary output of locate (records containing Hello) is passed to the first > stage, and its secondary output (records not containing Hello) is passed through the A: connector to the second > stage. The ; divides the specification into two pipelines. The collection of pipelines is called a pipeline set.
Features
Some of the salient characteristics that distinguish Hartmann pipelines from ordinary Unix pipes are:
- Filters may have multiple inputs and multiple outputs. For example, a selection filter can send the found records down one output pipe and the not found records down another.
- A linear notation for representing pipeline networks.
- An interface that allows REXX programs to act as stages.
- A pacing strategy in the pipeline supervisor that allows, for example, a stream to be split, say by a selection filter, and the records on the output legs to be processed by other filters, then merged by a join filter with the record order preserved in the result stream.
- As implied by the previous item, data streams are (generally) not simply buffered and passed along to the next filter. The filters operate in parallel, with input and output records handled by the pipeline supervisor.
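The split-and-rejoin behaviour described in the list above can be imitated in Python. This is only a rough sketch and not how CMS/TSO Pipelines is implemented: the function names `locate` and `split_process_merge` are hypothetical, and the pacing strategy is modelled very simply by letting a single record travel through its leg and reach the merge point before the next record is read.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

PRIMARY, SECONDARY = 0, 1  # hypothetical labels for the two output streams


def locate(records: Iterable[str], needle: str) -> Iterator[Tuple[int, str]]:
    """Tag each record with the output stream a selection filter would use.

    Records containing `needle` go to the primary output; all other
    records go to the secondary output.
    """
    for record in records:
        yield (PRIMARY if needle in record else SECONDARY, record)


def split_process_merge(
    records: Iterable[str],
    needle: str,
    on_found: Callable[[str], str],
    on_not_found: Callable[[str], str],
) -> List[str]:
    """Split a stream, process each leg, and merge the legs again.

    Assumption: only one record is in flight at a time, so the merged
    output keeps the original record order.
    """
    merged: List[str] = []
    for stream, record in locate(records, needle):
        leg = on_found if stream == PRIMARY else on_not_found
        merged.append(leg(record))
    return merged


records = ["Hello world", "goodbye", "say Hello", "nothing here"]

# The two output streams of the selection filter, collected separately.
found = [r for s, r in locate(records, "Hello") if s == PRIMARY]
not_found = [r for s, r in locate(records, "Hello") if s == SECONDARY]

# Each leg applies its own transformation before the legs are rejoined.
merged = split_process_merge(records, "Hello", str.upper, lambda r: "- " + r)
```

Here `found` and `not_found` correspond to the two legs of a selection filter, while `merged` shows the records from both legs rejoined in their original order.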
Similarity to APL
Programmers familiar with the APL programming language will see some similarities in Hartmann pipelines.
It is obvious that the author was influenced by APL;
some of the filters have names and functions similar to specific APL primitive functions.
Examples include the TAKE filter, which passes a specified number of records, and the DEAL filter, which spreads its input records out across its output streams, in imitation of the APL deal operator.
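The behaviour of these two filters can be approximated in Python. This is a sketch under stated assumptions rather than a description of the real stages: `take` is assumed to pass the first records of the stream, and `deal` is assumed to distribute records round-robin. The text above says only that TAKE passes a specified number of records and that DEAL spreads records across its output streams.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def take(records: Iterable[str], count: int) -> Iterator[str]:
    """Pass on the first `count` records (assumed: counted from the start)."""
    return islice(records, count)


def deal(records: Iterable[str], streams: int) -> List[List[str]]:
    """Spread records across `streams` outputs (assumed: round-robin order)."""
    outputs: List[List[str]] = [[] for _ in range(streams)]
    for index, record in enumerate(records):
        outputs[index % streams].append(record)
    return outputs


records = ["r1", "r2", "r3", "r4", "r5"]
first_three = list(take(records, 3))
dealt = deal(records, 2)
```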