Pipe-calculus

Pipe-calculus is a universal model of computation closely related to process calculus, lambda calculus and formal grammars. One of its distinctive features is the inclusion of syntactic symbols, so it is possible to express program structure and complex data types using only the built-in constructs.

Pure pipe-calculus is untyped. Construction of a type system is in the exploration phase, placing the focus on well-typed communication of programs.

Although the core calculus is sufficiently expressive to be used as a programming language, with added syntax sugar it can be the basis of various typed or untyped programming languages. Its characteristic features like I/O and stream orientation make pipe-calculus a suitable choice even for scripting languages.

Note: Pipe-calculus is a work in progress. This wiki page reflects the current status of the project and details may well change over time.

Introduction

Motivation

Development of pipe-calculus started with the idea that syntax should be a first class element of programming languages as well as formation rules are integral part of logical systems. A language like that is able to interpret complex input without using a parser library. Pipe-calculus is a minimal language with this capability.

On the practical side pipe-calculus is inspired by shell programming, which is largely based on text streams and stream processing tools. Indeed the idea of the pipe combinator comes from shell scripting.

An example

To get a taste of pipe-calculus, consider a program that adds two natural numbers encoded in unary numeral system. Suppose that we send both numbers over a stream as a series of 1s terminated by an end symbol. Our example language consists of the following expressions.

match s expects that the next symbol of the input stream is s. On failure the current fork of the program is aborted.
write s appends the symbol s to the output stream.
run x runs a program given its name.
p ; q runs p then runs q unless p fails.
p | q forks the execution of the program so that one fork runs p, another fork runs q, sharing the same input and output stream.

Start by writing a routine that copies a natural number from the input to the output.

passNat = match 1; write 1; run passNat | match end; write end

Now we can write the program that writes the sum of two numbers to the output.

addNat = match 1; write 1; run addNat | match end; run passNat

The program terminates after matching and writing out the second end symbol. It skips the first end that terminates the first number and copies everything else. Effectively it concatenates two sequences of 1s. If any other symbol occurs in the input stream, the program fails.

This language is obviously too weak. We can write simple filters, but we can't even check the equality of two numbers. In order to do it, we need a method that allows the first number to be accumulated and read back symbol by symbol as we read the second number. Pipe-calculus can be seen as a framework to study the required extensions that turn this tiny language into a practical programming language.

Connection with other fields

Pipe-calculus is presented as a process calculus. Although it significantly differs from typical process calculi, it still lends itself to this style of presentation. Its development is also inspired by higher-order logic programming, automata theory, type theory (up to the lambda-cube) and algebraic effects ^[1].

In certain cases the connection can be made more precise.

Programs written in pipe-calculus can recognize and generate words of formal languages encoded as terms.
Untyped lambda calculus can be translated to a subset of higher-order pipe-calculus (to be specified later).

Variants and computational power

The most basic variant of the calculus that lacks any scoped constructs is called zeroth-order as a reference to zeroth-order logic. Somewhat surprisingly, even the zeroth-order calculus has a limited computational power. In the higher-order calculus there are lexically scoped variables that can bind arbitrary terms. Intermediate variants can be constructed by extending the basic calculus with new constructs. Determining the computational power of specific variants of the calculus is of interest.

It is an observation and also a driving force of development that specific variants of the calculus can recognize specific classes of formal languages in the Chomsky hierarchy. For example it is assumed that the zeroth-order calculus with top level recursion can encode finite-state machines and as a consequence it can recognize regular languages.

Informal description

Terms

[math]\displaystyle{ \def\dot{.} \def\seq{\mathrel{;}} \def\succeed{\textbf{succeed}} \def\alt{\mid} \def\fail{\textbf{fail}} \def\pipe{\rhd} \def\pass{\textbf{pass}} \def\eq{ = ~} }[/math]Pipe-calculus terms are also called processes to emphasize their interactive, behavioral aspect. Each process has a conceptual input and output which are implicit in the rules. Output of one process can be connected to the input of another one. Some terms contain atoms which are unscoped and uninterpreted symbols. In grammatical context one can think of atoms as tokens.

The following terms are available in pipe-calculus.

synchronisation, where
- negative literal prefixing [math]\displaystyle{ a^- \dot s }[/math] is a process that expects an atom [math]\displaystyle{ a }[/math] to be signalled on its input. If the expectation is fulfilled, the process proceeds as [math]\displaystyle{ s }[/math], otherwise it fails.
- positive literal prefixing [math]\displaystyle{ a^+ \dot s }[/math] is a process that signals an atom [math]\displaystyle{ a }[/math] on its output before proceeding as [math]\displaystyle{ s }[/math].
communication, where
- input prefixing [math]\displaystyle{ ?n \dot s }[/math] is a process that receives a term on its input. On success, it proceeds as [math]\displaystyle{ s }[/math] binding the name [math]\displaystyle{ n }[/math] to the received term.
- output prefixing [math]\displaystyle{ !s \dot t }[/math] is a process that sends a term [math]\displaystyle{ s }[/math] to its output before proceeding as [math]\displaystyle{ t }[/math].
sequence [math]\displaystyle{ s \seq t }[/math] is a composite process that executes [math]\displaystyle{ s }[/math] before proceeding as [math]\displaystyle{ t }[/math] unless [math]\displaystyle{ s }[/math] fails. In that case the composite process also fails.
choice [math]\displaystyle{ s \alt t }[/math] is a forking process where one fork executes [math]\displaystyle{ s }[/math] while the other fork executes [math]\displaystyle{ t }[/math]. Execution of the forks overlap in time. If one of the forks fails, the composite process is replaced by the other fork.
pipeline [math]\displaystyle{ s \pipe t }[/math] is a composite process that enforces unidirectional, left to right data flow between its components.
[math]\displaystyle{ \mathsf{succeed} }[/math] is a process that successfully terminated.
[math]\displaystyle{ \mathsf{fail} }[/math] is a process that failed.

Sequence

Choice

failure is local to choice
forks may overlap
but identical forks cannot be distinguished
choice and mathematical case analisys

Exclusive choice

One can make choices mutually exclusive by limiting themeselves to trees of choices where each leaf of the tree is one of the forms [math]\displaystyle{ a^+ . s }[/math] and [math]\displaystyle{ a^- . s }[/math] [and all atoms are different]. This is an example of guarded choice[]. It allows for the encoding of sum types and the corresponding [math]\displaystyle{ case }[/math] expressions in strongly typed functional languages.

The pipe combinator

After discussing sequence and choice, the pipe combinator is the first where interaction occurs between processes. [math]\displaystyle{ s \rhd t }[/math] is a composite process that enforces unidirectional data flow between its components. Input of the composite process is forwarded to the input of [math]\displaystyle{ s }[/math], output of [math]\displaystyle{ s }[/math] is connected to the input of [math]\displaystyle{ t }[/math] and output of [math]\displaystyle{ t }[/math] is forwarded to the output of the composite process, according to the following diagram.

[math]\displaystyle{ \underset{s \rhd t}{\boxed{\to \boxed{s} \to \boxed{t} \to}} }[/math]

Compared to the parallel composition [math]\displaystyle{ \mid }[/math] in π-calculus, [math]\displaystyle{ \rhd }[/math] can be classified as unidirectional parallel composition. [math]\displaystyle{ \rhd }[/math] does not occur in the normal form of closed terms.

If the operands are matching synchronisation prefixes, both sides make progress.

[math]\displaystyle{ a^+ . s \rhd a^- . t = s \rhd t }[/math]

If the left operand is an output prefix and the right operand is an input prefix, communication occurs and both sides make progress. The term sent by the left side is available on the right side.

[math]\displaystyle{ !s . t \rhd ?n . u = t \rhd u[s/n] }[/math]

Any other combination of operands leads to failure of the current fork. The following example fails because synchronisation cannot occur between different atoms.

[math]\displaystyle{ a^+ . s \rhd b^- . t = \mathsf{fail} }[/math]

If any of the arguments is a choice, each fork of the left side is paired with each fork of the right side.

[math]\displaystyle{ r \mid s \rhd t \mid u = (r \rhd t) \mid (r \rhd u) \mid (s \rhd t) \mid (s \rhd u) }[/math]

Pipe and Unix pipelines

A pipeline in Unix shell programming is a good analogy for the pipe combinator. Piped programs execute concurrently. It cannot occur that in [math]\displaystyle{ P1 | P2 }[/math] the output of [math]\displaystyle{ P1 }[/math] directly reaches the output of the pipeline. It is an essential difference though that Unix processes can handle more than one input and output streams, and only the standard input and output streams are passed on through the pipe.

Recognition and interpretation

One can also think of [math]\displaystyle{ s \rhd t }[/math] in a syntactic setting. In this case [math]\displaystyle{ s }[/math] represents input or source, while [math]\displaystyle{ t }[/math] plays the role of a recognizer, parser or interpreter.

Formal Definition

Syntax

[math]\displaystyle{ \def\pos{\mathsf{pos}~} \def\neg{\mathsf{neg}~} \def\seq{\mathrel{;}} \def\succeed{\textbf{succeed}} \def\alt{\mid} \def\fail{\textbf{fail}} \def\pipe{\rhd} \def\eq{ = ~} \def\rule{ ::= ~} \begin{alignat}{3} & \mbox{Atoms}~ & a, b \rule & \mathsf{A} \mid \mathsf{B} \mid \mathsf{'foo} \mid \mathsf{'bar} \mid \mathsf{'+} \mid \mathsf{'*} \mid ... \\ & \mbox{Variables}~ & x, y \rule & \mathsf{a} \mid \mathsf{b} \mid ... \\ & \mbox{Prefixes}~ & p, q \rule & ~ a^+ & & \text{Positive literal (signal)} \\ & & \mid & ~ a^- & & \text{Negative literal (match)} \\ & & \mid & ~ ?x & & \text{Abstraction (input)} \\ & & \mid & ~ !s & & \text{Argument (output)} \\ & \mbox{Terms}~ & s, t \rule & ~ p . s & & \text{Prefix operator} \\ & & \mid & ~ s \seq t & & \text{Sequential composition} \\ & & \mid & ~ s \mathsf{|} t & & \text{Choice} \\ & & \mid & ~ s \pipe t & & \text{Pipe (interpretation)} \\ & & \mid & ~ \succeed & & \text{Successful termination} \\ & & \mid & ~ \fail & & \text{Failure} \\ & & \mid & ~ x & & \text{Variable} \\ & & \mid & ~ (s) & & \text{Grouping} \\ \end{alignat} }[/math]

Operator summary
Symbol	Associativity^[2]	Precedence
[math]\displaystyle{ \seq }[/math]	right	highest
[math]\displaystyle{ \mid }[/math]	right
[math]\displaystyle{ \rhd }[/math]	right	lowest

Equational theory

Pipe-calculus can be described as a process algebra ^[3] ^[4] . The following identities are satisfied by all [math]\displaystyle{ s, t, u }[/math] pipe-calculus terms.

[math]\displaystyle{ \def\seq{\mathrel{;}} \def\succeed{\textbf{succeed}} \def\alt{\mid} \def\fail{\textbf{fail}} \def\pipe{\rhd} \def\eq{ = ~} }[/math]Sequential composition is associative with left neutral element [math]\displaystyle{ \succeed }[/math] and left absorbing element [math]\displaystyle{ \fail }[/math].

[math]\displaystyle{ \begin{align} (s \seq t) \seq u \eq & s \seq (t \seq u) \\ \succeed \seq t \eq & t \\ \fail \seq t \eq & \fail \\ \end{align} }[/math]

Choice is associative, commutative and idempotent with neutral element [math]\displaystyle{ \fail }[/math] and absorbing element [math]\displaystyle{ \succeed }[/math].

[math]\displaystyle{ \begin{align} (s \alt t) \alt u \eq & s \alt (t \alt u) \\ s \alt t \eq & t \alt s \\ s \alt s \eq & s \\ \succeed \alt t \eq & \succeed \\ \fail \alt t \eq & t \\ \end{align} }[/math]

Sequential composition is right distributive over choice.

[math]\displaystyle{ \begin{align} (s \alt t) \seq u \eq & s \seq u \alt t \seq u \end{align} }[/math]

Pipe has left neutral element [math]\displaystyle{ \fail }[/math] and right absorbing element [math]\displaystyle{ \succeed }[/math].

[math]\displaystyle{ \begin{align} \fail \pipe t \eq & t \\ s \pipe \succeed \eq & \succeed \\ \end{align} }[/math]

Todo: pipe is also associative for well behaving operands. A process whose forks are all of the same prefixed form are certainly well behaving.

Normal forms

Normal form of pipe-calculus terms is defined by a term rewriting system^[4]. A term is in normal form if no rewrite rule applies to it.

[math]\displaystyle{ \def\seq{\mathrel{;}} \def\succeed{\textbf{succeed}} \def\alt{\mid} \def\fail{\textbf{fail}} \def\pipe{\rhd} \def\in{\,?} \def\out{!} \def\rew{\mathrel{\Rightarrow}\,} }[/math] [math]\displaystyle{ \begin{alignat}{1} (s \seq t) \seq u \rew & s \seq (t \seq u) \\ (s \alt t) \seq u \rew & s \seq u \alt t \seq u \\ \succeed \seq s \rew & s \\ \fail \seq s \rew & \fail \\ \\ (s \alt t) \alt u \rew & s \alt (t \alt u) \\ a^+ . s \alt a^+ \seq t \rew & a^+ . (s \alt t) \\ a^- . s \alt a^- \seq t \rew & a^- . (s \alt t) \\ \fail \alt s \rew & s \\ s \alt \fail \rew & s \\ \succeed \alt s \rew & \succeed \\ s \alt \succeed \rew & \succeed \\ \end{alignat} }[/math]

While all other rules are syntax directed, rules for [math]\displaystyle{ \pipe }[/math] have to be attempted in the listed order for the following reasons.

The processes [math]\displaystyle{ a^- . s \pipe a^+ . t }[/math] and [math]\displaystyle{ \in x . s \pipe \out t . u }[/math] behave differently depending on the order of rule application, even if their normal forms are identical.
The process [math]\displaystyle{ r \alt s \pipe t \alt u }[/math] produces subresults in different order depending of the order of rule application. Explicit application order avoids nondeterminism in both cases.
We need a default rule that fails in order to ensure that [math]\displaystyle{ \pipe }[/math] never occurs in normal forms^[5].

[math]\displaystyle{ \begin{alignat}{2} a^+ . s \pipe a^- . t \rew & s \pipe t \\ \out s . t \pipe \in x . u \rew & t \pipe u[s/x] \\ r \alt s \pipe t \alt u \rew & (r \pipe t \alt u) \alt (s \pipe t \alt u) \\ s \alt t \pipe u \rew & (s \pipe u) \alt (t \pipe u) \\ s \pipe t \alt u \rew & (s \pipe t) \alt (s \pipe u) \\ \fail \pipe t \rew & t \\ s \pipe \succeed \rew & \succeed \\ s \pipe t \rew & \fail & \\ \end{alignat} }[/math]

Notice that finding a matching rewriting rule does not require checking the equality of terms, only of atoms.

Extensions

Recursion (μ-recursion, top level recursion, regular grammars)
Stack (context-free grammars)
Guarded choice
Ordered choice (PEG, backtracking, local context)
Intersection

Connection with process calculus

Process calculi (or process algebras) is an approach for formally modeling concurrent systems. Process refers to the behavior of a system. The word algebra denotes an algebraic approach in talking about behavior, while the word calculus emphasizes the computational aspect of processes. ^[3] There are lots of implementations of this approach and none of them can be pointed to as the primary one.

A process calculus can be characterized by the atomic units of behavior (actions, events) and formers of compound processes (operators, combinators). Implementations may include several of both categories and they are often open for extensions.

It is a recurrent pattern in the literature that processes can be viewed in two ways: one can combine component processes and examine the evolution of the resulting process. In this setting component processes may communicate with each other and the compound process may come to a rest without displaying external behavior. A single process can also be examined by placing it into a (not strictly specified) environment which is able to interact with the process.

Pipe-calculus fits into this framework if we make the following observations.

Prefixes are atomic units of behavior (actions).
Prefixes with sequential composition and choice form basic process algebra (BPA). ^[4]
The pipe operation can be seen as a restricted, unidirectional parallel composition.

There are also notable differences.

The most outstanding difference is the lack of channels. Pipe-calculus processes only have a conceptual input and output, however one can simulate channels with synchronisation where atoms stand for channel names.
Communication primitives in mainstream process calculi often inseparably unite two conceptual steps: the selection of a communication channel and the act of message sending. For example Π-calculus ^[6] defines input prefixing [math]\displaystyle{ c\left(x\right).P }[/math] and output prefixing [math]\displaystyle{ \overline{c} \langle y \rangle.P }[/math] to denote receiving a name via channel [math]\displaystyle{ c }[/math] and sending a name via channel [math]\displaystyle{ c }[/math] respectively, before proceeding as [math]\displaystyle{ P }[/math]. Although one can introduce as convention synchronization where the communicated message is ignored, and broadcast where a group of processes are receiving messages on the same channel. In pipe-calculus these steps are separated by design.
In pipe-calculus there is no counterpart of name hiding that prevents interference between two groups of processes using the same channel name for communication. Since there is no general, bidirectional parallel composition, we don't need a hiding primitive. The pipe operation naturally restricts communication to a pipeline.
In pipe-calculus, if a process fails to synchronize, it is aborted. This is different from the usual implementation where a process that fails to synchronize does not evolve, it "keeps waiting" for a message, opening up the possibility for asynchronous communication.

Notes

↑ Matija Pretnar: An Introduction to Algebraic Effects and Handlers
↑ Notice that left or right associativity is a grammatical property of a binary operator, which differs from the mathematical associativity of the respective operation. Even associative operations may be represented by left or right associative operators. See Operator associativity.
↑ ^3.0 ^3.1 J.C.M. Baeten: A brief history of process algebra., Theoretical Computer Science 335 (2005) 131 – 146
↑ ^4.0 ^4.1 ^4.2 Wan Fokkink: Introduction to Process Algebra, Springer-Verlag
↑ The default case allows for embedding synchronisation failure in the object language, so it can be captured by [math]\displaystyle{ \mid }[/math].
↑ Joachim Parrow: An Introduction to the π-Calculus, chapter to appear in Handbook of Process Algebra, ed. Bergstra, Ponse and Smolka, Elsevier

[1] Matija Pretnar: An Introduction to Algebraic Effects and Handlers

[2] Notice that left or right associativity is a grammatical property of a binary operator, which differs from the mathematical associativity of the respective operation. Even associative operations may be represented by left or right associative operators. See Operator associativity.

[Baeten-3] 3.0 ^3.1 J.C.M. Baeten: A brief history of process algebra., Theoretical Computer Science 335 (2005) 131 – 146

[Fokkink-4] 4.0 ^4.1 ^4.2 Wan Fokkink: Introduction to Process Algebra, Springer-Verlag

[5] The default case allows for embedding synchronisation failure in the object language, so it can be captured by [math]\displaystyle{ \mid }[/math].

[6] Joachim Parrow: An Introduction to the π-Calculus, chapter to appear in Handbook of Process Algebra, ed. Bergstra, Ponse and Smolka, Elsevier

[1]

[2]

[3]

[4]

[5]

[6]