On Coinduction and Quantum Lambda Calculi

Given the ubiquity of linear resources in quantum computation, program equivalence in linear contexts, where programs are used or executed once, is more important than in the classical setting. We introduce a linear contextual equivalence and two notions of bisimilarity, one state-based and one distribution-based, as proof techniques for reasoning about higher-order quantum programs. Both notions of bisimilarity are sound with respect to the linear contextual equivalence, but only the distribution-based one turns out to be complete. The completeness proof relies on a characterisation of the bisimilarity as a testing equivalence.


Introduction
Over the past two decades, the theory of quantum computing has attracted considerable research effort. Benefiting from the superposition of quantum states, quantum computing may provide remarkable speedup over its classical analogue [32,16,17]. As a consequence, a wealth of models and programming languages for describing quantum computation have been introduced. For many reasons, the functional paradigm fits very well in the picture. One successful attempt in this direction is QUIPPER [15], an expressive functional higher-order language that can be used to program a diverse set of non-trivial quantum algorithms and can generate quantum gate representations using trillions of gates. The group led by Svore introduced LIQUi|⟩ as a modular software architecture designed to control quantum hardware [34]: it enables easy programming, compilation, and simulation of quantum algorithms and circuits. In spite of the success of language design, the semantic foundation of quantum programming languages is not well established. In a series of papers, Selinger and co-authors try to find a denotational semantics for higher-order quantum computation [28,29,30,31]. In the most recent one [24], they propose a denotational model that is adequate with respect to an operational semantics. However, full abstraction for a higher-order language with both classical and quantum resources still remains an open problem.
In quantum mechanics, a fundamental principle is the no-cloning theorem of quantum resources. From a type-theoretic point of view, quantum resources are linear and can be described by linear types in quantum programming languages. How to define appropriate program equivalences for this kind of language is an interesting problem. Some preliminary results in this direction have been obtained by omitting quantum effects and only considering nondeterminism and linearity in a functional language [8]. For that restricted setting, a notion of linear contextual equivalence is introduced and shown to be nicely related to trace equivalence. Linear contextual equivalence is a special form of contextual equivalence [23] in which the observable behaviour of programs is tested by executing them (at most) once.
In the current work we investigate the operational semantics of the typed quantum λ-calculus proposed in [24]. We aim to develop coinductive proof techniques for linear contextual equivalence (written $\simeq$) of quantum programs. We first define a labelled transition system for the quantum λ-calculus. It is in fact a probabilistic labelled transition system (pLTS), because probability distributions arise naturally when quantum systems are measured. In the underlying pLTS, we consider two notions of probabilistic bisimilarity: one (written $\sim_s$) is state-based, because it is directly defined over states and then lifted to distributions; the other ($\sim_d$) is distribution-based, as it is a relation between distributions. The relation $\sim_s$ is essentially the probabilistic bisimilarity originally defined by Larsen and Skou [22], representing a branching-time semantics. In contrast, the relation $\sim_d$ is strictly coarser. It is in the style of [13,18], representing a linear-time semantics. Both $\sim_s$ and $\sim_d$ are sound proof techniques for $\simeq$; establishing this requires proving that they are congruence relations. We show the congruence property by adapting Howe's method to the quantum setting. We also find that $\sim_d$ provides a complete proof technique for $\simeq$. In order to prove full abstraction, we first characterise $\sim_d$ as a testing equivalence $=_T$ given by a simple testing language. Since all the tests in the testing language can be simulated by linear contexts, we obtain that $\simeq\ \subseteq\ =_T$, which implies that $\simeq\ \subseteq\ \sim_d$. To some extent this is a generalisation of the aforementioned coincidence result obtained in [8], because the distribution-based probabilistic bisimilarity $\sim_d$ captures a notion of trace equivalence in the probabilistic setting very intuitively.

Other Related Work
Reasoning about program equality in higher-order languages is challenging. Most of the time, equivalence between two programs requires them to exhibit the same observable behaviour under any context. To alleviate the burden of dealing with all the contexts, a useful way out is to develop operational methods for proving program equivalence. For example, Abramsky's applicative bisimulation [1] has attracted a lot of attention, not only in the classical setting [14,19,25,26], but also in the probabilistic setting [4,2]. In [4] a notion of probabilistic applicative bisimulation is shown to be a sound technique for proving contextual equivalence. However, completeness fails and can only be recovered when pure, deterministic λ-terms are considered and a coupled logical bisimulation is used in place of applicative bisimulation. In [2] a probabilistic call-by-value λ-calculus is considered, where a probabilistic applicative bisimilarity is shown to be a sound and complete proof technique for contextual equivalence. Recently, the third author and Rioli have studied applicative bisimulation in a purely linear quantum λ-calculus, obtaining a soundness result [3]. Following this line of research, our work is carried out in a quantum setting and uses a distribution-based bisimilarity to characterise linear contextual equivalence. We also examine a state-based bisimilarity, which corresponds to the probabilistic applicative bisimilarity discussed above. The characterisation of state-based probabilistic bisimilarity by a set of tests, shown with an involved proof in [33] and with a tradition dating back to [22], is essential for the completeness proof of [2]. For distribution-based bisimilarity, however, a much simpler characterisation exists.
Variants of probabilistic bisimulation have already been used to compare the behaviour of quantum processes [21,11,5,12,10]. However, as far as we know, the current work is the first to explore operational techniques based on probabilistic bisimulation to reason about contextual equivalence of fully-featured higher-order quantum programs.

Structure of the Paper
In Section 2, we introduce the syntax, the reduction semantics, and the labelled transition semantics of a quantum λ-calculus. A linear contextual equivalence and two bisimilarities are defined. In Section 3 we show that the two notions of bisimilarity are congruence relations included in linear contextual equivalence. The distribution-based bisimilarity is shown to coincide with a testing equivalence in Section 4. By exploiting this result, we show that the distribution-based bisimilarity is complete with respect to linear contextual equivalence. Finally, we conclude in Section 5.

A Quantum λ-Calculus
Following [24] we introduce the syntax and operational semantics of a quantum λ-calculus.

Syntax
In this typed language, types are given by the following grammar:

$$A, B, C ::= \mathsf{qubit} \mid A \multimap B \mid\ !(A \multimap B) \mid 1 \mid A \otimes B \mid A \oplus B \mid A^l$$

Here $A \multimap B$ is the usual linear function type and $!(A \multimap B)$ is a non-linear function type. For an arbitrary type $A$, the type $!A$ can be simulated by $!(1 \multimap A)$. The qubit type is used to classify terms that represent qubit information. Tensor and sum types are standard. We use the notation $A^{\otimes n}$ for $A$ tensored $n$ times. The type $A^l$ denotes finite lists of type $A$.
Terms, ranged over by $M, N, P$, are built up from constants and variables, using constructs that include variables, abstractions and applications, and tensor products and projections. Most of the language constructs are standard. The tensor product and tensor projection are related to linearity. The quantum operators are used to prepare quantum systems. The constant $U$ ranges over a set of elementary unitary transformations on quantum bits. Two typical examples are the Hadamard gate $H$ and the controlled-not gate $N_c$.

Variables appearing in the λ-binder, the let-binder, the match-binder, and the letrec-binder are bound variables. We write $\mathrm{fv}(M)$ for the set of free variables in term $M$. We will not distinguish α-equivalent terms, that is, terms that are syntactically identical up to renaming of bound variables. If $M$ and $N$ are terms and $x$ is a variable, then $M\{N/x\}$ denotes the term resulting from substituting $N$ for all free occurrences of $x$ in $M$. More generally, given a list $N_1, \ldots, N_n$ of terms and a list $x_1, \ldots, x_n$ of distinct variables, we write $M\{N_1/x_1, \ldots, N_n/x_n\}$, or simply $M\{\tilde{N}/\tilde{x}\}$, for the result of simultaneously substituting each $N_i$ for free occurrences in $M$ of the corresponding variable $x_i$.
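Values are terms of a special form, in which $c$ ranges over the set of constants $\{\mathsf{skip}, \mathsf{split}^A, \mathsf{meas}, \mathsf{new}, U\}$. As syntactic sugar we write $\mathsf{bit} = 1 \oplus 1$, $\mathsf{tt} = \mathsf{in}_r\,\mathsf{skip}$, and $\mathsf{ff} = \mathsf{in}_l\,\mathsf{skip}$.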
A typing assertion takes the form $\Delta \vdash M : A$, where $\Delta$ is a finite partial function from variables to types, $M$ is a term, and $A$ is a type. We write $\mathrm{dom}(\Delta)$ for the domain of $\Delta$. We call $\Delta$ exponential (resp. linear) whenever $\Delta(x)$ is (resp. is not) a $!$-type for each $x \in \mathrm{dom}(\Delta)$. We write $!\Delta$ for a context that is exponential. The type assignment relation consists of all typing assertions that can be derived from the axioms and rules in Figure 1, where the contexts $\Delta$ and $\Delta'$ are assumed to be linear. The notation $\Delta, x : A$ denotes the partial function which properly extends $\Delta$ by mapping $x$ to $A$, so it is assumed that $x \notin \mathrm{dom}(\Delta)$. Similarly, in the notation $\Delta, \Delta'$ the domains of $\Delta$ and $\Delta'$ are assumed to be disjoint. We write $\mathrm{Prog}(A) = \{M \mid \emptyset \vdash M : A\}$ for the set of all closed programs of type $A$.
For any typing assertion $\Delta \vdash M : A$, it is not difficult to see that $\mathrm{fv}(M) \subseteq \mathrm{dom}(\Delta)$. Let $\mathrm{fqv}(M)$ be the set of free variables of qubit type in term $M$, and let $\mathrm{fcv}(M)$ collect the free variables of all other types. Thus we have $\mathrm{fv}(M) = \mathrm{fcv}(M) \cup \mathrm{fqv}(M)$ for any term $M$. We often separate the free quantum variables in $M$ from the type environment $\Delta$ and write $\Delta \vdash M : A$ where $\Delta, x_1 : \mathsf{qubit}, \ldots,$ …

We will write $\Delta \vdash M \mathrel{R} N : A$ to mean that $(\Delta, M, A)\ R\ \bullet\ (\Delta, N, A)$. Sometimes we omit the type information if it is not important and simply write $\Delta \vdash M \mathrel{R} N$.

The Reduction Semantics
The reduction semantics is defined in terms of an abstract machine simulating the behaviour of the QRAM model [20].

Definition 1.
A quantum closure is a triple $[q, l, M]$ where $q$ is a normalized vector of $\mathbb{C}^{2^n}$, for some integer $n \ge 0$, called the quantum state; $M$ is a term, not necessarily closed; and $l$ is a linking function, that is, an injective map from $\mathrm{fqv}(M)$ to the set $\{1, \ldots, n\}$. We write $\mathrm{dom}(l)$ for the domain of $l$. The notation $l \uplus m$ stands for the union of two linking functions $l$ and $m$ (viewed as two sets of pairs) if their domains are disjoint; otherwise it is undefined. A closure $[q, l, M]$ is total if $l$ is surjective, and thus a bijection. In that case we write $l$ as $x_1, \ldots, x_n$ if $\mathrm{dom}(l) = \{x_1, \ldots, x_n\}$ and $l(x_i) = i$ for all $i \in \{1, \ldots, n\}$. A quantum closure $C = [q, l, M]$ is well typed and has type $A$ in $\Delta$ whenever $\mathrm{dom}(l) = \{x_1, \ldots, x_m\}$ and $\Delta, x_1 : \mathsf{qubit}, \ldots, x_m : \mathsf{qubit} \vdash M : A$. In this case we write $\Delta \vdash C : A$. The notion of α-equivalence extends naturally to quantum closures. So, e.g., the two states $[q, y, \lambda x^A.y]$ and $[q, z, \lambda x^A.z]$ are deemed equivalent. With a slight abuse of language, we call a closure $[q, l, V]$ a value when the term $V$ is a value.

Most often we will work with closed quantum … where $E$ is any evaluation context generated by the grammar … In the two reduction rules for $\mathsf{new}$, the quantum state $q$ has size $n$, and $x$ is a fresh variable.
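The conditions of Definition 1 can be made concrete with a small sketch. The class below is a hypothetical encoding, not part of the calculus: the state is a tuple of amplitudes, the linking function is a dictionary from variable names to qubit indices, and the term is left opaque as a string.

```python
from dataclasses import dataclass
from math import isclose


@dataclass
class Closure:
    """Hypothetical encoding of a quantum closure [q, l, M]."""
    q: tuple    # amplitudes of a vector in C^(2^n)
    l: dict     # linking function: quantum variable -> qubit index in 1..n
    term: str   # the term M, kept opaque in this sketch

    def num_qubits(self) -> int:
        # For len(q) == 2**n, the bit length of len(q) is n + 1.
        return len(self.q).bit_length() - 1

    def is_well_formed(self) -> bool:
        n = self.num_qubits()
        size_ok = len(self.q) == 2 ** n
        normalized = isclose(sum(abs(a) ** 2 for a in self.q), 1.0)
        indices = list(self.l.values())
        injective = len(set(indices)) == len(indices)
        in_range = all(1 <= i <= n for i in indices)
        return size_ok and normalized and injective and in_range

    def is_total(self) -> bool:
        # Total means the linking function is surjective onto {1, ..., n}.
        return set(self.l.values()) == set(range(1, self.num_qubits() + 1))


# A two-qubit entangled state in which only the first qubit is referenced.
bell = (2 ** -0.5, 0.0, 0.0, 2 ** -0.5)
partial = Closure(bell, {"x": 1}, "x")
full = Closure(bell, {"x": 1, "y": 2}, "x (x) y")
```

Here `partial` is a well-formed closure that is not total, whereas `full` is total because its linking function is a bijection onto {1, 2}.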
In the rule for unitary transformations, the quantum state $r$ is obtained by applying the $k$-ary unitary gate $U$ to the qubits $l(x_1), \ldots, l(x_k)$. In other words, $r = (\sigma \circ (U \otimes \mathrm{id}) \circ \sigma^{-1})(q)$, where $\sigma$ is the action on $\mathbb{C}^{2^n}$ of any permutation over $\{1, \ldots, n\}$ such that $\sigma(i) = l(x_i)$ whenever $i \le k$. In the rules for measurements, we assume that if $q_0$ and $q_1$ are normalized quantum states of the form $\sum_j \alpha_j |\phi_j\rangle \otimes |0\rangle \otimes |\varphi_j\rangle$ and $\sum_j \beta_j |\phi_j\rangle \otimes |1\rangle \otimes |\varphi_j\rangle$, then $r_0$ and $r_1$ are …

Figure 2 Small-Step Axioms.
The reduction semantics defined above employs a call-by-value evaluation strategy. For any $\mu = \sum_i p_i \cdot [q_i, l_i, M_i]$, let $\mathrm{env}(\mu) = \sum_i p_i \cdot \mathrm{tr}_{\mathrm{fqv}(M_i)}\, q_i q_i^\dagger$ be the reduced quantum state of the qubits not referred to by the terms $M_i$. In particular, if each $[q_i, l_i, M_i]$ in the support of $\mu$ is a total quantum closure, we then have $\mathrm{env}(\mu) = |\mu|$.
In order to investigate the long-term behaviour of a Markov chain, we introduce the notion of extreme derivative from [7]. We first need to lift the relation $\to$ to be a transition relation between subdistributions.

Definition 4 (Extreme derivative). Suppose we have subdistributions $\mu$, $\mu_n^{\to}$, $\mu_n^{\times}$ for $n \ge 0$ with the following properties:

$$\mu = \mu_0^{\to} + \mu_0^{\times}, \qquad \mu_k^{\to} \to \mu_{k+1}^{\to} + \mu_{k+1}^{\times} \quad (k \ge 0),$$

and each $\mu_k^{\times}$ is stable in the sense that $C \not\to$ for all $C \in \lceil \mu_k^{\times} \rceil$. Then we call $\rho := \sum_{k=0}^{\infty} \mu_k^{\times}$ an extreme derivative of $\mu$, and write $\mu \Rightarrow \rho$.

Extreme derivatives can also be defined by a big-step semantics, given by a binary relation $\Downarrow$ between quantum closures and value distributions. Some of its rules can be found in Figure 3 (the others are similar). In the rules, $\varepsilon$ stands for the empty subdistribution, and the notation $|\mu|$ stands for the size of the subdistribution $\mu$, i.e. $\sum_{C \in \lceil \mu \rceil} \mu(C)$. Finally, term constructors are, with abuse of notation, applied to quantum closures and subdistributions in the natural way, e.g., $\mathsf{in}_l(\sum_{k \in K} p_k \cdot [q, l, M])$ stands for $\sum_{k \in K} p_k \cdot [q, l, \mathsf{in}_l M]$.
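The construction of an extreme derivative can be illustrated on a finite Markov chain. The sketch below is only an approximation under stated assumptions: the chain is a hypothetical dictionary from states to successor distributions, a state without successors is treated as stable, and the infinite sum of stable parts is truncated after a fixed number of rounds (the function name and the `rounds` parameter are illustrative).

```python
from fractions import Fraction


def extreme_derivative(chain, start, rounds=200):
    """Approximate the extreme derivative of the point distribution on `start`.

    `chain` maps a state to {successor: probability}; a state that is absent
    or maps to an empty dict is stable, i.e. it cannot reduce further.
    """
    moving = {start: Fraction(1)}  # the part that may still reduce
    stable = {}                    # running sum of the stable parts
    for _ in range(rounds):
        if not moving:
            break
        nxt = {}
        for state, p in moving.items():
            successors = chain.get(state, {})
            if not successors:
                # Stable mass is accumulated and never moves again.
                stable[state] = stable.get(state, Fraction(0)) + p
            else:
                for target, r in successors.items():
                    nxt[target] = nxt.get(target, Fraction(0)) + p * r
        moving = nxt
    return stable


# C reduces to a value V or to a diverging closure D with equal probability.
diverging = {
    "C": {"V": Fraction(1, 2), "D": Fraction(1, 2)},
    "D": {"D": Fraction(1)},
}
result = extreme_derivative(diverging, "C")

# C either produces V or retries, each with probability 1/2.
retrying = {"C": {"V": Fraction(1, 2), "C": Fraction(1, 2)}}
limit = extreme_derivative(retrying, "C")
```

In the first chain the result has total mass 1/2, which illustrates how divergence yields a proper subdistribution. In the second chain the mass on `V` approaches 1 as the number of rounds grows.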

Lemma 5. $[\![C]\!] = \sup\{\mu \mid C \Downarrow \mu\}$, where the supremum of subdistributions is computed componentwise.
Following [8] we would like to give an alternative characterisation of linear contextual equivalence. Intuitively, as usual, a context is a term with a unique hole, and a linear context is a context where programs under examination will be evaluated and used exactly once. We are interested in closing contexts.
Figure 4 Labelled Transition Rules for Quantum Closures.

A Probabilistic Labelled Transition System
In [14], Gordon explicitly defines a labelled transition system in order to illustrate the bisimulation technique in PCF. We follow this idea to define a probabilistic labelled transition system for the quantum λ-calculus, upon which we can define probabilistic bisimulations.
Transition rules are listed in Figure 4: we make the typing of terms explicit in the rules, as the type system plays an important role in defining the operational semantics of typed terms. In a closure such as $[q, \{x \mapsto 1\}, x]$, the quantum variable $x$ may refer to the first qubit of an entangled quantum state $q$.
The last rule in Figure 4 says that term reductions are considered as internal transitions that are abstracted away; external transitions are labelled by actions. Intuitively, external transitions represent the way terms interact with environments (or contexts). For instance, a λ-abstraction can "consume" a term (by being applied to it), which is supplied by the environment as an argument, and this forms a β-reduction. The rule for $\mathsf{skip}$ says that what it can provide to the environment is its own value, and after that it cannot provide any information, hence no further external transitions can occur. We represent this by a transition, labelled by the value of the constant, into a non-terminating program $\Omega$ of appropriate type.
The set of quantum closures $Cl$ together with the transition rules in Figure 4 yields a probabilistic labelled transition system (pLTS). It is in fact a reactive system, in the sense that no two outgoing transitions leaving a quantum closure are labelled by the same action. Below we recall Larsen and Skou's probabilistic bisimulation [22]. We first review a way of lifting a binary relation $R$ over a set $S$ to a relation $R^\dagger$ over the set of subdistributions on $S$, given in [9].

Definition 8. Let $S$, $T$ be two countable sets and $R \subseteq S \times T$ be a binary relation. The lifted relation $R^\dagger$ relates subdistributions $\mu$ over $S$ and $\nu$ over $T$ just when $\mu(X) \le \nu(R(X))$ for every $X \subseteq S$. Here we write $R(X)$ for the set $\{t \in T \mid \exists s \in X.\ s \mathrel{R} t\}$ and $\mu(X)$ is the accumulated probability $\sum_{s \in X} \mu(s)$.
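For finite subdistributions, a lifting of this kind can be checked by brute force. The sketch below assumes the characterisation in terms of $R(X)$ and $\mu(X)$, namely that $\mu(X) \le \nu(R(X))$ must hold for every set $X$ of states; the function name and the dictionary encoding of subdistributions are illustrative choices.

```python
from fractions import Fraction
from itertools import combinations


def lifted(rel, mu, nu):
    """Check mu(X) <= nu(R(X)) for every non-empty X inside the support of mu.

    Subsets of the support suffice: states of probability zero add nothing
    to mu(X) and can only enlarge the image R(X).
    """
    support = [s for s, p in mu.items() if p > 0]
    for size in range(1, len(support) + 1):
        for xs in combinations(support, size):
            image = {t for (s, t) in rel if s in xs}          # R(X)
            mass_x = sum(mu[s] for s in xs)                    # mu(X)
            mass_image = sum(nu.get(t, 0) for t in image)      # nu(R(X))
            if mass_x > mass_image:
                return False
    return True


half = Fraction(1, 2)
mu = {"s1": Fraction(1)}
nu = {"t1": half, "t2": half}

both = lifted({("s1", "t1"), ("s1", "t2")}, mu, nu)
only_one = lifted({("s1", "t1")}, mu, nu)
```

With `s1` related to both `t1` and `t2` the check succeeds, while relating `s1` to `t1` alone fails because `t1` carries only half of the required mass. The enumeration is exponential in the size of the support, so this is suitable for small examples only.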
The definition by Larsen and Skou, when instantiated on the pLTS of closed quantum closures, looks as follows.

Definition 9. A probabilistic simulation is a preorder $R$ on closed quantum closures such that whenever $(C, D) \in R$ we have that: … A probabilistic bisimulation is a relation $R$ such that both $R$ and $R^{-1}$ are probabilistic simulations. Let $\precsim$ and $\sim_s$ be the largest probabilistic simulation and bisimulation, called similarity and bisimilarity, respectively.

Bisimilarity and similarity are relations on closed quantum closures, but can be generalized to open closures as follows. Suppose $\Delta \vdash M, N : A$. We write … for any $q$ and $l$ such that $[q, l, M]$ and $[q, l, N]$ are both typable quantum closures.
For reactive pLTSs, the kernel of probabilistic similarity is probabilistic bisimilarity. That is, $\sim_s\ =\ \precsim \cap \precsim^{-1}$. This of course also applies to the specific pLTS we are working with. The probabilistic bisimilarity defined above is a binary relation between states, and is thus sometimes called state-based bisimilarity. Alternatively, it is possible to directly define a (sub)distribution-based bisimilarity by comparing actions emitted from subdistributions. In order to do so, we first define a transition relation between subdistributions.
]] for any q and l such that [q, l, M ] and [q, l, N ] are quantum closures.
It is not difficult to see that $s \sim_s t$ implies $s \sim_d t$, but not the other way around, as witnessed by the following example.

C O N C U R ' 1 5
But then the condition $s_1 \mathrel{\sim_s^\dagger} \mu$ is invalid, because there is no way to split $s_1$ into two different states such that they are bisimilar to $t_1$ and $t_2$ respectively. In the quantum λ-calculus this distinction between state-based and distribution-based bisimulations also exists. For example, the quantum closures $[\emptyset, \emptyset, (\lambda xy.\mathsf{meas}(H(\mathsf{new}\ \mathsf{ff})))x]$ and $[\ldots, x_1 x_2, (\lambda xy.\mathsf{meas}\ x_1)(\mathsf{meas}\ x_2)]$ exhibit similar behaviour to the states $s$ and $t$, respectively, as depicted in Figure 6.

Congruence
In this section we show that both $\sim_s$ and $\sim_d$ are congruence relations. The proof for $\sim_s$ is more complicated, so we take it as an example and give the details. The case for $\sim_d$ follows the same schema. The basic idea is to make use of Howe's method [19,26]: starting from an initial relation $R$, one defines a precongruence candidate $R^H$, which is a precongruence by construction, and then shows that $R^H$ coincides with $R$.

Definition 13.
Let $R$ be a typed relation on quantum closures. Its compatible refinement $\widehat{R}$ is defined by some natural rules, a selection of which is in Figure 7. A relation $R$ is a precongruence iff it contains its own compatible refinement, that is, $\widehat{R} \subseteq R$. Let a congruence be an equivalence relation that is a precongruence.
Let $R$ be a typed relation on quantum closures. The typed relation $R^H$ is defined by the rules in Figure 8. Note that if $R$ is reflexive then $R \subseteq R^H$, and $R^H$ is a precongruence. Therefore, in order to show that $R$ is a precongruence (or congruence if $R$ is also symmetric), it suffices to establish $R^H \subseteq R$, because we then have the coincidence of $R$ with $R^H$. In order to show $(\sim_s)^H \subseteq\ \sim_s$, we need the following two technical lemmas.
Consequently, we can establish the coincidence of similarity $\precsim$ with $\precsim^H$, from which it is easy to show that $\sim_s$ is a congruence. Similar arguments apply to $\sim_d$.
Theorem 16. Both $\sim_s$ and $\sim_d$ are included in linear contextual equivalence $\simeq$.

Completeness of Distribution-Based Bisimilarity
In this section we show that distribution-based bisimilarity is complete for linear contextual equivalence. The basic idea is to first characterise bisimilarity by a very simple testing framework. Let $T$ be the set of tests of the two forms $\omega$ and $a \cdot t$, where $\omega$ is used to indicate success and $a$ ranges over the set of all possible labels in the transition rules in Figure 4. In other words, the testing language is given by the grammar $t ::= \omega \mid a \cdot t$.
Below we define the function $\Pr$ that calculates the probability of passing a test for a distribution of states in a reactive pLTS. If $\mu$ is a point distribution $\overline{s}$, we will write $\Pr(s, t)$ for $\Pr(\mu, t)$. We define a testing equivalence $=_T$ by letting $\mu =_T \nu$ iff $\forall t \in T : \Pr(\mu, t) = \Pr(\nu, t)$. It turns out that the tests in $T$ are sufficient to characterise $\sim_d$ as far as reactive pLTSs are concerned.
Theorem 17. Let $\mu$ and $\nu$ be two distributions in a reactive pLTS. Then $\mu \sim_d \nu$ if and only if $\mu =_T \nu$.
Following [2], we turn each test into a corresponding context. That is, for a given test $t$ and a given type $A$, there exists a linear context $C_t^A$ such that for all terms $M$ of type $A$, the success probability of $t$ applied to any total quantum closure $[q, l, M]$ is exactly the convergence probability of $[q, l, C_t^A[M]]$.

Lemma 18.
Let $A$ be a type and $t$ a test. There is a context $C_t^A$ such that $\emptyset \vdash C_t^A(\emptyset; A) : \mathsf{bit}$ and for every $M$ with $\emptyset \vdash M : A$, we have $\Pr([q, l, M], t) = |[\![[q, l, C_t^A[M]]]\!]|$, where $[q, l, M]$ and $[q, l, C_t^A[M]]$ are quantum closures for any $q$ and $l$.
As a consequence of the previous lemma, we can show that the distribution-based bisimilarity $\sim_d$ is complete with respect to linear contextual equivalence $\simeq$.

Concluding Remarks
We have presented two notions of bisimilarity for reasoning about equivalence of higher-order quantum programs in linear contexts, based on an appropriate labelled transition system for specifying the operational behaviour of programs. Both bisimilarities are sound with respect to the linear contextual equivalence, but only the distribution-based one turns out to be complete. Since linear resources are widely used in quantum computation, we believe that linear contextual equivalence will be a useful notion of behavioural equivalence for quantum programs. The coinductive proof techniques developed in the current work can help to reason about quantum programs.
In the future, it would be interesting to seek a denotational model that is fully abstract with respect to the linear contextual equivalence. As recently shown in [35], Fock spaces can be useful to interpret quantum computation and they are close to the categorical semantics studied in [24], so it seems promising to start from there.

Lemma 2 (Totality). Let $C$ and $D$ be two quantum closures and $C \to_p D$. If $C$ is total then so is $D$.

Lemma 3 (Type safety). Let $C = [q, l, M]$ be a closed quantum closure. Then either $M$ is a value or the total probability of all one-step reductions from $C$ is 1.

By Lemma 3 we see that the reduction semantics induces a Markov chain $(Cl, \to)$, where $Cl$ is the set of all closed quantum closures and $\to\ \subseteq Cl \times \mathcal{D}(Cl)$ is the transition relation with $C \to \mu$ satisfying $\mu(D) = p$ iff $C \to_p D$ for some $p > 0$. Here $\mathcal{D}(Cl)$ stands for all probability subdistributions over $Cl$ and $\mu$ is a full distribution over all successor quantum closures of $C$.

The sum $\rho = \sum_{k=0}^{\infty} \mu_k^{\times}$ is called an extreme derivative of $\mu$, and we write $\mu \Rightarrow \rho$. Let $C$ be a quantum closure in the Markov chain $(Cl, \to)$. The extreme derivative of the point distribution on $C$ that assigns probability 1 to $C$, written $\overline{C}$, is unique, and we use it for the denotation of $C$, indicated as $[\![C]\!]$. So we always have $\overline{C} \Rightarrow [\![C]\!]$. Note that in the presence of divergence $[\![C]\!]$ may be a proper subdistribution.

Definition 6. A linear context (or simply a context) is a term with a hole, written $C(\Delta; A)$, such that $C[M]$ is a closed program whenever the hole is filled in by a term $M$, where $\Delta \vdash M : A$, and the hole lies in linear position.

Definition 7. Following [2], we take the observable behaviour of a quantum closure $C$ to be its probability of convergence $|[\![C]\!]|$. The linear contextual preorder is the typed relation $\sqsubseteq$ defined as follows: $\Delta \vdash M \sqsubseteq N : A$ if for every linear context $C$, quantum state $q$ and linking function $l$ such that $\emptyset \vdash C(\Delta; A) : B$, and both $[q, l, C[M]]$ and $[q, l, C[N]]$ are total quantum closures, it holds that $|[\![[q, l, C[M]]]\!]| \le |[\![[q, l, C[N]]]\!]|$. Linear contextual equivalence is the typed relation $\simeq$ obtained by letting $\Delta \vdash M \simeq N : A$ just when $\Delta \vdash M \sqsubseteq N : A$ and $\Delta \vdash N \sqsubseteq M : A$.
A transition takes the form $C \xrightarrow{a} \mu$, where $C$ is a quantum closure, $\mu$ is a subdistribution over quantum closures, and $a$ is an action. If $\mu$ is a point distribution $\overline{D}$, we simply write the transition as $C \xrightarrow{a} D$. Note that non-total quantum closures are needed here to specify the operational semantics: we cannot work only with total quantum closures due to entanglement. For example, we should allow for a non-total quantum closure like $[\frac{|00\rangle + |11\rangle}{\sqrt{2}}, \{x \mapsto 1\}, x]$.

Definition 10. We write $\mu \xrightarrow{a} \rho$ if $\rho = \sum_{s \in \lceil \mu \rceil} \mu(s) \cdot \mu_s$, where $\mu_s$ is determined as follows: either $s \xrightarrow{a} \mu_s$, or there is no $\nu$ with $s \xrightarrow{a} \nu$, and in this case we set $\mu_s = \varepsilon$.

Note that this is a weaker notion of transition relation between subdistributions, compared with that defined on Page 432. If $\mu \xrightarrow{a} \mu'$ then some (not necessarily all) states in the support of $\mu$ can perform action $a$. For example, consider the two states $s_2$ and $s_3$ in Figure 5. Since $s_2 \xrightarrow{c} s_4$ and $s_3$ cannot perform action $c$, the distribution $\mu = \frac{1}{2}\overline{s_2} + \frac{1}{2}\overline{s_3}$ can make the transition $\mu \xrightarrow{c} \frac{1}{2}\overline{s_4}$ to reach the subdistribution $\frac{1}{2}\overline{s_4}$.

Let $\mu$ be a subdistribution over a reactive pLTS. After performing any action, it can reach a unique subdistribution. That is, if $\mu \xrightarrow{a} \nu$ and $\mu \xrightarrow{a} \rho$ then $\nu = \rho$. Given a subdistribution $\mu$, we denote by $[\![\mu]\!]$ the subdistribution $\sum_{C \in \lceil \mu \rceil} \mu(C) \cdot [\![C]\!]$.

Definition 11. A distribution-based bisimulation is a binary relation $R$ on subdistributions such that $\mu \mathrel{R} \nu$ implies: $\mathrm{env}(\mu) = \mathrm{env}(\nu)$; $[\![\mu]\!] \mathrel{R} [\![\nu]\!]$; and if $\mu$ and $\nu$ are value distributions and $\mu \xrightarrow{a} \rho$, then $\nu \xrightarrow{a} \xi$ for some $\xi$ with $\rho \mathrel{R} \xi$, and vice versa. Let $\sim_d$ be the largest distribution-based bisimulation. Suppose $\emptyset \vdash M, N : A$. We write …

Figure 5 $s \not\sim_s t$.

Figure 6 Two Quantum Closures not Related by $\sim_s$.

… $(s_2, t_3), (s_3, t_4), (s_4, t_5)\}$ and check that $R$ is a distribution-based bisimulation. Therefore, we have $s \sim_d t$. Note that the point distribution at state $s_1$ is related to the distribution $\frac{1}{2}\overline{t_1} + \frac{1}{2}\overline{t_2}$. However, we have $s \not\sim_s t$ because after performing action $a$ the state $s$ evolves into the point distribution $\overline{s_1}$, and the only candidate transition from $t$ to match this is $t$ …