A Complete Normal-Form Bisimilarity for Algebraic Effects and Handlers

Abstract


Introduction
Algebraic effects with handlers [22,3] have become a popular technique for programming with computational effects such as exceptions, mutable state, or nondeterminism. Their strength lies in their modularity, as it is possible to easily combine several effects thanks to the separation between syntax and semantics. Indeed, effects themselves are just syntactic constructs which do not carry any meaning; their semantics is given by the handlers, which come into play when an interpretation of an effect is needed for the computation to go through.
As an informal example, borrowed from [8], consider the reader effect ask, which returns a hidden value when triggered. An effect is used as a labeled operation, e.g., as in do ask () + do ask () + 2, and its meaning is given by a handler, as in handle do ask () + do ask () + 2 {ask: x,k → k 5; ret y → y}. The handler specifies how it interprets the ask effect by the expression x,k → k 5, where x stands for the value the effect operation is applied to (which is not used in this example), and k for its continuation or resumption, i.e., the rest of the computation, which includes the handler itself. Here, the handler simply passes 5 to the continuation, so that do ask () + do ask () + 2 eventually reduces to 12. Once the expression inside the handler is a value, it is passed to the return clause ret y → y, which in our case simply returns the result. Any expression can be used in an effect handler, including one making use of the continuation several times or not at all; for example, in handle do ask () + do ask () + 2 {ask: x,k → 13; ret y → y} the handler throws away the continuation when called the first time and returns 13, which is then the final result of the computation. Multiple effects can be used in an expression, which are then interpreted by a single handler, or by successive handlers enclosing the expression.
The order of the handlers then specifies the semantics of all the effects combined.
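The two reader handlers above can be replayed on a small executable model. The sketch below is only an illustration under our own assumptions: computations are encoded as trees of pending operations (a free-monad-style encoding), and the names Pure, Op, bind, do, and handle are hypothetical helpers, not constructs of the calculus studied in this paper.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Pure:
    """A finished computation carrying its value."""
    value: Any


@dataclass
class Op:
    """A pending effect `do label arg`, with the rest of the computation in k."""
    label: str
    arg: Any
    k: Callable[[Any], Any]  # value -> computation


def bind(m, f):
    """Sequence m with f, pushing f under any pending operation."""
    if isinstance(m, Pure):
        return f(m.value)
    return Op(m.label, m.arg, lambda v: bind(m.k(v), f))


def do(label, arg=None):
    """Trigger an effect; its continuation just returns the received value."""
    return Op(label, arg, Pure)


def handle(m, handlers, ret):
    """Deep handler: the resumption passed to a clause re-installs the handler."""
    if isinstance(m, Pure):
        return ret(m.value)
    resume = lambda v: handle(m.k(v), handlers, ret)
    if m.label in handlers:
        return handlers[m.label](m.arg, resume)
    # Unhandled label: forward it outwards, keeping the handler in the continuation.
    return Op(m.label, m.arg, resume)


def program():
    # do ask () + do ask () + 2
    return bind(do("ask"), lambda a:
           bind(do("ask"), lambda b:
           Pure(a + b + 2)))


# {ask: x,k -> k 5; ret y -> y}: resume twice with 5.
resumed = handle(program(), {"ask": lambda x, k: k(5)}, Pure)
# {ask: x,k -> 13; ret y -> y}: discard the continuation on first use.
aborted = handle(program(), {"ask": lambda x, k: Pure(13)}, Pure)
```

In this model, resumed holds the value 12 and aborted holds 13, matching the informal reductions described above.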
While handlers make combining multiple effects programmer friendly, reasoning about the behavior of programs with effects and handlers appears to be inherently challenging, mainly due to the non-local transfer of control involved in effect handling. When it comes to the issue of program equivalence, the standard notion considered in calculi modeling programming languages, typically based on λ-calculi, is contextual equivalence [20], which requires program phrases to behave the same when plugged in any context. The quantification over all contexts makes this relation hard to use in practice, so one usually looks for more tractable characterizations of contextual equivalence, either in the form of logical relations [24] or coinductively defined bisimilarities [1,17,27].
In the presence of algebraic effects and handlers, the situation is even more interesting, because we have to take into account the possibility that the testing context may interpret any non-handled effects the two programs being tested might use. There exist some works on formal techniques for reasoning about program equivalence in calculi with algebraic effects, but they either do not include handlers in the language [16,15,14] or are directed by a type structure of the calculus [8] (we discuss related work in detail in Section 4). None of them, however, focuses on the control structure of a full calculus of algebraic effects and handlers (where effects are interpreted dynamically, unlike, e.g., in [14]) and in isolation from other concepts such as types. Algebraic effects are intimately related to delimited-control operators [12,21], for which bisimulation theories have been studied extensively [4], yet they differ in a very essential way, as we argue in this work.
In this paper, we show that it is possible to characterize contextual equivalence in an untyped calculus with algebraic effects and handlers with one of the simplest notions of equivalence, namely normal-form (or open) bisimilarity [25,17]. In a normal-form bisimilarity proof one compares open terms by reducing them to normal forms, which are then decomposed into bisimilar subterms. In a language with algebraic effects, we have to consider extra normal forms: programs with effects that have not been handled. More importantly, we have to observe how a context may handle an effect and its continuation. To this end, we introduce an extended calculus where contexts can be abstractly represented with context variables, a concept we used in our previous work on normal-form bisimulations for abortive continuations [7]. Such variables can be observed and discriminated upon by the bisimilarity that is defined for the extended calculus. Extending the calculus is a critical step in obtaining sound and complete bisimilarity, but it should be seen just as a tool for studying the plain calculus. When restricted to the plain calculus, the bisimilarity relates exactly those terms that are equivalent w.r.t. the contextual equivalence in the plain calculus.
In many calculi, the decomposition of normal forms as done in normal-form bisimilarity is usually too fine-grained and distinguishes programs that are in fact contextually equivalent [17]. The result of this paper shows that handlers contain sufficient discriminating power for normal-form bisimilarity to be complete w.r.t. contextual equivalence.
The rest of this paper is organized as follows. In Section 2, we present the syntax, semantics, and contextual equivalence of the plain calculus λ eff, the minimal calculus with effects and handlers we consider for our study. In Section 3, we define the normal-form bisimilarity for the extended calculus and prove its soundness and completeness. We also define up-to techniques, proof techniques meant to simplify equivalence proofs, and we illustrate how the bisimilarity and these techniques can be used on examples. Additionally, we pinpoint the difference between algebraic effects and delimited-control operators and how it affects the definition of a normal-form bisimulation. In Section 4, we discuss related work, and we conclude in Section 5. The appendix contains the soundness and completeness proof sketches.

The Calculus λ eff
Syntax. The calculus λ eff, whose syntax is given in Figure 1, extends the λ-calculus with labeled effects do l e and handlers handle e {H; r}, where H is a list of effect handlers l i : x i ,k i → e i and r is a return clause ret x → e. The order of the list is irrelevant, but we assume the labels l 1 . . . l n to be pairwise distinct. In a handler x i ,k i → e i, the variable x i represents the argument of the effect, while k i stands for its continuation (or resumption).
We write lbl(e) for the set of effect labels l that label do expressions in e.
Reduction semantics. We fix a call-by-value, left-to-right reduction strategy for λ eff by defining the syntax of evaluation contexts, ranged over by E.
When writing expressions, we sometimes decorate a context with a label it does not handle, i.e., writing E l if l ∉ hl(E). Typically, we write E l [do l v] for an expression where the effect l cannot be handled by E.
The reduction semantics of λ eff is given by the following rules.
We write →* for the reflexive and transitive closure of →. In the third rule, we see that the effect do l v is interpreted by the first enclosing handler, as E = handle E l {H; r} and E does not handle l. The handler has access not only to the argument v of the effect, but also to its continuation, represented as a function λz.E[z]. Note that the handler itself is part of the captured continuation, meaning that it can handle further effects when the continuation is resumed.
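For reference, the evaluation contexts, the handled-label function hl, and the three reduction rules discussed above can be written along the following lines. This is a sketch assuming the standard presentation of call-by-value deep handlers; the hole symbol, the primed context E', and the exact side conditions are our notation and may differ from the original figures.

```latex
\begin{align*}
E &::= \square \mid E\,e \mid v\,E \mid \mathsf{do}\ l\ E \mid \mathsf{handle}\ E\ \{H; r\}
\\[4pt]
\mathsf{hl}(\square) &= \emptyset
&
\mathsf{hl}(E\,e) = \mathsf{hl}(v\,E) = \mathsf{hl}(\mathsf{do}\ l\ E) &= \mathsf{hl}(E)
\\
\mathsf{hl}(\mathsf{handle}\ E\ \{H; r\}) &= \mathsf{hl}(E) \cup \{\, l \mid (l : x,k \to e) \in H \,\}
\\[4pt]
E[(\lambda x.e)\ v] &\to E[e\{v/x\}]
\\
E[\mathsf{handle}\ v\ \{H; \mathsf{ret}\ x \to e\}] &\to E[e\{v/x\}]
\\
E[\mathsf{handle}\ E'[\mathsf{do}\ l\ v]\ \{H; r\}] &\to
E[e\{v/x\}\{\lambda z.\,\mathsf{handle}\ E'[z]\ \{H; r\}/k\}]
\\
&\qquad \text{if } l \notin \mathsf{hl}(E') \text{ and } (l : x,k \to e) \in H
\end{align*}
```

The third rule makes the deep-handler reading explicit: the function substituted for k wraps E' in the same handler, so further effects raised after resumption are handled again.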

Normal-Form Bisimilarity
We first informally introduce our notion of normal-form bisimilarity, before giving its definition and discussing its soundness and completeness. We also explain why, in spite of the relationship between handlers and multi-prompted delimited continuations, it is more difficult to define a complete normal-form bisimilarity for the latter than for the former.

Informal Presentation
A simple way to compare the handlers' behaviors is to plug the contexts with a control-stuck term do l x for a fresh x and for any l (handled by the contexts). However, such a testing term is not strong enough, as it would relate a handler which throws away the continuation to one that does not.
We need to account for the fact that control-stuck terms may be surrounded with a context, without introducing a quantification over these contexts, which would go against the principles behind normal-form bisimulation. We do so by extending the syntax of the calculus with context variables, a construct we introduced in previous works to track the whereabouts of contexts captured by control operators [7,4]. In a control-stuck term α l [do l x], the context variable α l stands for a context which does not handle l, and its presence allows us to distinguish between two such contexts E 1 and E 2.
Adding context variables to λ eff generates new normal forms, including those of the shape E[α l [v]], where the computation is stuck because we do not know which context α l stands for. The bisimulation deals with these normal forms in a very regular way, simply asking to reduce to a normal form of the same shape with related contexts and values. In the end, the definition we obtain (Definition 5) follows the usual pattern of normal-form bisimulation, the only subtlety being in how to compare contexts, and yet the resulting bisimilarity is sound and complete w.r.t. the contextual equivalence of the extended calculus. More importantly, the restriction of the bisimilarity to plain calculus terms yields the contextual equivalence for the plain calculus.

Extended Calculus
As explained in the previous section, we extend the syntax of λ eff with context variables in order to observe how contexts are captured when effects are triggered. We assume a set CVar of context variables, ranged over by α and β. Similar to evaluation contexts, we decorate these variables with an effect they do not handle: the variable α l is a context variable standing for a context which does not handle l. In particular, when considering a control-stuck term, the context variable is always decorated with an effect label. We extend the syntax of expressions and evaluation contexts accordingly.
We write cv(e) for the set of context variables occurring in e. We adapt the definition of hl so that hl(α l [E]) = (Lbl \ {l}) ∪ hl(E), as α l stands for a context not handling l but which may potentially handle any other label. While the reduction rules themselves are the same, the semantics of the extended calculus is still affected by the change in the grammar of evaluation contexts. In particular, it admits more normal forms than the plain λ eff. We refer to normal forms of the shape E[α l [v]], for some E, α l, and v, as context-stuck terms. Given an expression e, a context variable α l, and a context E l, we define the context substitution of E l for α l in e: occurrences of α l are replaced with E l, and the substitution is recursively propagated to the sub-expressions in the other cases.
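The adapted definition of hl can be made concrete on a small model of evaluation contexts. The sketch below is illustrative only: the class names, the finite label universe LBL standing in for Lbl, and the clauses for the plain-calculus contexts are our assumptions; only the clause for context variables is taken directly from the text.

```python
from dataclasses import dataclass
from typing import FrozenSet, Union

# Hypothetical finite universe of labels standing in for Lbl.
LBL = frozenset({"ask", "flip", "get", "put"})


@dataclass(frozen=True)
class Hole:
    """The empty context."""


@dataclass(frozen=True)
class Frame:
    """Any of E e, v E, or do l E: these frames handle nothing themselves."""
    inner: "Ctx"


@dataclass(frozen=True)
class Handle:
    """handle E {H; r}, keeping only the labels handled by H."""
    inner: "Ctx"
    labels: FrozenSet[str]


@dataclass(frozen=True)
class CVar:
    """alpha^l [E]: a context variable decorated with a label it does not handle."""
    name: str
    label: str
    inner: "Ctx"


Ctx = Union[Hole, Frame, Handle, CVar]


def hl(ctx: Ctx) -> FrozenSet[str]:
    """Labels that the context may handle."""
    if isinstance(ctx, Hole):
        return frozenset()
    if isinstance(ctx, Frame):
        return hl(ctx.inner)
    if isinstance(ctx, Handle):
        return ctx.labels | hl(ctx.inner)
    if isinstance(ctx, CVar):
        # hl(alpha^l [E]) = (Lbl \ {l}) U hl(E)
        return (LBL - {ctx.label}) | hl(ctx.inner)
    raise TypeError(f"not a context: {ctx!r}")


def may_decorate(ctx: Ctx, label: str) -> bool:
    """True when writing E^l is allowed, i.e. when l is not in hl(E)."""
    return label not in hl(ctx)
```

For instance, a lone context variable decorated with ask may be decorated with ask again but with no other label, while wrapping a handler for ask inside it makes every label potentially handled.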

Definition
We define the bisimulation for the extended calculus using the notion of diacritical progress we developed in a previous work [2,6], which distinguishes between active and passive clauses.
Roughly, passive clauses are between simulation states which should be considered equal, while active clauses are between states where actual progress is taking place.This distinction does not change the notions of bisimulation or bisimilarity, but it simplifies the soundness proof of the bisimilarity.It also allows for the definition of powerful up-to techniques, functions on relations meant to simplify bisimilarity proofs.For normal-form bisimilarity, our framework enables up-to techniques which respect η-expansion [7], a necessary condition to reach completeness.
Given a relation R on expressions, we extend it to values and evaluation contexts in the following way.
The ·^v extension compares values by simply applying them to a fresh variable; such a test, compliant with η-expansion [7], is valid because λ-abstractions are the only values of our language. As explained in Section 3.1, we consider two extensions for evaluation contexts, depending on how these are used: ·^r is used when we know the contexts are plugged only with values (resumptions), while ·^c assumes that they can be filled with any expression, including an effectful one. As a result, ·^c compares how the contexts deal with the effects they may handle (the ones in hl(E 1 ) ∪ hl(E 2 )), by testing them with an expression α l [do l x] built using a fresh context variable α l which can be observed during the bisimulation game. We define progress, bisimulation and bisimilarity using these extensions.

Definition 5. A relation R progresses to S, T, written R ↣ S, T, if R ⊆ S, S ⊆ T, and e 1 R e 2 implies:
if e 1 → e 1', then there exists e 2' such that e 2 →* e 2' and e 1' T e 2';
if e 1 = v 1, then there exists v 2 such that e 2 →* v 2 and v 1 S^v v 2;
if e 1 is any other normal form, then e 2 reduces to a normal form of the same kind whose subterms are pairwise related to those of e 1, in S for context-stuck terms and in T otherwise;
and the symmetric of the above conditions on e 2.
A normal-form bisimulation is a relation R such that R ↣ R, R, and normal-form bisimilarity ≈ is the union of all normal-form bisimulations.
As pointed out before, the clauses dealing with normal forms are very similar, simply requiring e 2 to reduce to a normal form of the same kind, and then decomposing these normal forms into pairwise related subterms.We just have to be careful in using • r only for the contexts used as resumptions.
We progress towards S in the value and context-stuck term clauses and T in the others; the former are passive while the latter are active.Our framework prevents some up-to techniques from being applied after a passive transition.For values, we want to forbid the application of bisimulation up to context as it would be unsound: we could deduce that v 1 x and v 2 x are equivalent for all v 1 and v 2 just by building a candidate relation containing v 1 and v 2 .Similarly, for context-stuck terms, we prevent the application of bisimulation up to substitution of context variables, as we could also relate any v 1 and v 2 from a candidate containing α l [v 1 ] and α l [v 2 ] by replacing the context variable with x.
Example 6. We consider the handler of Example 1 for the reader effect, where we generalize the hidden value 5 to a given variable z; we call the resulting context E 1. Alternatively, the reader effect can be interpreted by a handler obtained from the standard handler for mutable state; we call the resulting context E 2. The context E 2 applies the handler to the current value of the state and lets the handling code of the operation(s) access it through a λ-abstraction. (We would obtain a standard handler for mutable state by adding the clause set: x,k → λy.k y x handling the operation set, which sets the value of the state.) We show that these two handlers for the reader effect are equivalent by establishing the equivalence between the contexts E 1 and E 2.
D. Biernacki and S. Lenglet and P. Polesiuk
Testing with α l [do l x] and defining E 2' = handle □ {l: x,k → λy.k y y; ret x → λy.x}, we obtain two context-stuck terms, for which we need to relate identical variables and the contexts E 1 and E 2 we want to equate in the first place. In the end, we can easily build a bisimulation R such that E 1 R^c E 2.
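The behavior of the two reader handlers can also be compared by running them on sample programs. The sketch below assumes a free-monad-style encoding of computations; the helpers Pure, Op, bind, do, and handle, the function names, and the extra label tick are hypothetical and chosen for illustration. Agreement on a few programs only illustrates the equivalence; it does not replace the bisimulation proof.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Pure:
    value: Any


@dataclass
class Op:
    label: str
    arg: Any
    k: Callable[[Any], Any]  # value -> computation


def bind(m, f):
    if isinstance(m, Pure):
        return f(m.value)
    return Op(m.label, m.arg, lambda v: bind(m.k(v), f))


def do(label, arg=None):
    return Op(label, arg, Pure)


def handle(m, handlers, ret):
    """Deep handler: resumptions re-install the handler."""
    if isinstance(m, Pure):
        return ret(m.value)
    resume = lambda v: handle(m.k(v), handlers, ret)
    if m.label in handlers:
        return handlers[m.label](m.arg, resume)
    return Op(m.label, m.arg, resume)


def reader_direct(comp, z):
    # handle comp {ask: x,k -> k z; ret x -> x}
    return handle(comp, {"ask": lambda x, k: k(z)}, Pure)


def reader_state_passing(comp, z):
    # (handle comp {ask: x,k -> \y. k y y; ret x -> \y. x}) z
    handlers = {"ask": lambda x, k: Pure(lambda y: bind(k(y), lambda f: f(y)))}
    ret = lambda x: Pure(lambda y: Pure(x))
    return bind(handle(comp, handlers, ret), lambda f: f(z))


def sample():
    # do ask () + do ask () + 2
    return bind(do("ask"), lambda a:
           bind(do("ask"), lambda b:
           Pure(a + b + 2)))


def with_other_effect():
    # Uses a label that neither reader handler interprets.
    return bind(do("ask"), lambda a:
           bind(do("tick"), lambda _:
           Pure(a)))


def run_ticks(comp):
    # Outer handler giving tick a trivial meaning, so results can be compared.
    return handle(comp, {"tick": lambda x, k: k(None)}, Pure)
```

Both interpretations return the same value on sample for any z, and they still agree once the unhandled tick effect of with_other_effect is interpreted by an enclosing handler.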

Soundness and Up-to Techniques
In our framework [6], as in the works we extend [18,23], proving that the bisimilarity is compatible (preserved by contexts) amounts to showing that a form of bisimulation up to context is valid, as explained after Lemma 10. We slightly reformulate our most recent work [6] to make it simpler, yet expressive enough to be applied to λ eff.
In what follows, we use s, f, g to range over monotone functions on relations, i.e., functions such that R ⊆ S implies f(R) ⊆ f(S) for any R, S. We extend ∪ to functions so that for all R, (f ∪ g)(R) = f(R) ∪ g(R). We define an ordering ⊑ on functions so that f ⊑ g if for all R, f(R) ⊆ g(R), which is itself extended pointwise to pairs of functions.
As pointed out before, because of the distinction between passive and active clauses, not all up-to techniques can be applied in all clauses.In fact, we decompose an up-to technique into a pair of functions (s, f), where s can be used in passive clauses while f cannot.
In an up-to technique (s, f), s is said to be strong while f is said to be weak. Instead of proving directly that a pair is an up-to technique, we consider a sufficient criterion based on respectfulness² and the largest respectful pair, called the diacritical companion (u, w): if a pair (s, f) is below the companion, then it is an up-to technique.
The diacritical companion is defined using notions of evolution on monotone functions, which can be seen as the higher-order counterpart of progress on relations. We decompose diacritical progress R ↣ S, T into passive progress R ↣_p S and active progress R ↣_a T to define different kinds of evolution.
Definition 8. Let f, g be monotone functions.
f passively evolves to g, written f ⇝_p g, if for all R, S, R ↣_p S implies f(R) ↣_p g(S);
f actively evolves to g, written f ⇝_a g, if for all R, S, R ↣_a S implies f(R) ↣_a g(S);
f restrictively evolves to g, written f ⇝_p|a g, if for all R, S, R ↣_p R and R ↣_a S imply f(R) ↣_a g(S).
Passive and active evolutions express the idea that f becomes g in respectively passive and active clauses.Restricted evolution allows a relation R to do some administrative step (passive progress) before doing some active progress, as long as we stay in R. For λ eff , it means that we can reduce a term to a value before doing some active progress with it.
² Our previous work [6] is built on the notion of compatibility, but the notion of progress we use in this paper makes Definition 9 correspond to respectfulness instead. See [26,23,6] for a discussion on the difference between the two notions.
In words, the bisimulations of diacritical evolution are exactly respectful pairs, and its bisimilarity is the diacritical companion.Among other properties, we can show that any pair below the companion (including the companion itself) is an up-to technique.
Lemma 10. The following hold:
The second inequality implies that any strong function can also be used as a weak one, justifying why such a function is said to be "strong", as it can be applied without restriction in any clause. The last equality states that the weak companion preserves bisimilarity, so for any f ⊑ w, we also have f(≈) ⊆ ≈. In particular, if f is a form of bisimulation up to context, showing that it is below w is enough to deduce that ≈ is compatible.
The remaining question is how to prove that a given pair (s, f) is below the companion.
In this paper, we use a degenerate but sufficient version of a theorem in our previous work [6, Theorem 4.12]. Let id be the identity on relations. We define S(s) and W(s, f) inductively: the function S(s) is the smallest function built from s, id, and u stable by composition and union, while W(s, f) is the smallest function built from s, f, id, and w stable by composition and union. Including u and w in their definition means that any function already proved strong (respectively weak) is below S(s) (respectively W(s, f)).
Theorem 11. Let (s, f) be monotone functions. If the evolutions of s and f stay below suitable combinations of S(s) and W(s, f), then (s, f) is below the companion.
The idea of the theorem is to see how s and f evolve and prove that the results of their evolutions are below what is on the right of the arrows. Any combination of weak functions can be obtained after an active or restricted evolution, but only strong functions can be used after a passive one, except that f can be used once. This constraint on f makes the soundness proofs of the most interesting up-to techniques of λ eff more difficult (cf. Appendix A).
We define the up-to functions we consider for λ eff in Figure 2. The first four are usual and can be found in many variants of the λ-calculus [7,4]. The function red is the usual bisimulation up to reduction, where expressions can be related after some reduction steps.
Figure 2 Up-to techniques.
The remaining functions are more specific to λ eff. The function cvar plugs related terms into any context variable. This variable can then be replaced with contexts using either csubst or rsubst, depending on whether the contexts behave as resumptions or not. In the case of rsubst, the contexts should be related with ·^r, and the context variable should be in resumption position, a condition we check with the predicate resum, defined in Figure 2.
Roughly, resum(α l , e) means that α l is about to be captured (i.e., plugged with an effect do l v) or has already been captured, and is therefore plugged with a value.
The functions cvar, csubst, and rsubst can be used to define a more conventional bisimulation up to evaluation context, similar to the one of the plain λ-calculus [7].
We simply plug e 1 and e 2 into a fresh context variable which is then replaced with E 1 and E 2 .
The functions we define are strong, except for csubst and rsubst.
The proofs for the strong techniques are simple or as in the plain λ-calculus [7]; we sketch the proof for csubst and rsubst in the appendix. It is not surprising that these two functions are only weak, as substituting context variables after a passive transition would be unsound (cf. the discussion after Definition 5).
We prove that two effects, ask and flip, commute.
Sketch. We show that the relation R given by three rules is a bisimulation up-to.
The pair of the first rule is straightforward to check, as each expression evaluates to v. For the second rule, the interesting cases are when l is an effect handled by the two expressions. If l = ask, they evaluate to context-stuck terms, for which we can easily check the bisimulation requirements.
If l = flip, then the expressions of the second rule also reduce to context-stuck terms. To compare these context-stuck terms, we plug the two contexts with a fresh variable and a fresh control-stuck term. When plugged with a fresh variable, we obtain expressions for which we can again easily check the bisimulation clause. With control-stuck terms, we obtain expressions related by the third rule defining R. Checking bisimulation for the third rule is done by a similar case analysis on l and concludes the proof.

Completeness
In this section we show that normal-form bisimilarity is complete: contextually equivalent expressions are bisimilar. The main lemma of this section establishes that ≡ E is a bisimulation, which, by Lemma 18, implies completeness of ≈ w.r.t. ≡.
Case: e 1 is a context-stuck term E 1 [α l [v 1 ]] and e 1 ≡ E e 2. We need to show that there exist E 2 and v 2 such that: (1) e 2 reduces to a context-stuck term E 2 [α l [v 2 ]], (2) v 1 and v 2 are related, and (3) E 1 and E 2 are related.
To prove (1), we take a fresh label l', and we define a substitution σ which maps every β l ∈ cv(e 1 ) ∪ cv(e 2 ) with β l ≠ α l to a handler context built from H l, and α l to a context triggering do l', where H l = l 1 : x,k → Ω; . . . ; l n : x,k → Ω and {l 1 , . . . , l n } = lbl(e 1 ) ∪ lbl(e 2 ) − {l'}, and we consider a context E = handle □ {l': x,k → x; ret x → Ω}. It is easy to see that E[e 1 ]σ ⇓ v and that if e 2 evaluates to a normal form which is not a context-stuck term of the expected shape for some E 2 and v 2, then E[e 2 ]σ either diverges or reduces to a control-stuck term (the latter case occurs when e 2 itself reduces to a control-stuck term E 2 [do l v 2 ]).
To prove (2), we take a fresh variable z, a context E and a closing substitution σ, and we assume that E[v 1 z]σ terminates; we show that so does E[v 2 z]σ. To this end we take fresh labels l', get and put (the latter two to encode a binary state as an algebraic effect), and we define σ' to be equal to σ everywhere, except for α l, along with a context E b, where b ∈ {true, false}. The idea is to use α l, the single synchronization point of e 1 and e 2 available, in such a way that the first time α l is used, E true [e i ]σ' reduces to an expression behaving like E[v i z]σ. To ensure this, we make sure that any subsequent uses of α l (it could occur in v i or E) actually mean σ(α l ). But when the state is set to false, the λ-abstraction in σ'(α l ) behaves like the identity, and filling the hole of σ'(α l ) with a value v simply passes v to σ(α l ). Filling it with a control-stuck term E l [do l v] allows σ(α l ) to eventually handle the effect, capturing a context equivalent to (λz.z) E l. In the end, the encoding behaves as intended up to a few additional reduction steps.
To prove (3), we have to show that the contexts are related both (a) when plugged with a fresh variable z, and (b) when plugged with a control-stuck term, for any l and fresh α l and z. Assuming we compare expressions using E and σ in both cases, we proceed as in (2), with a suitably adapted substitution in each of (a) and (b). The remaining cases are proved similarly and can be found in Appendix B.
Corollary 20. For any expressions e 1 and e 2 in the plain calculus, if e 1 ≡ e 2, then e 1 ≈ e 2.
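The completeness argument relies on encoding a binary state with the effects get and put, so that the first use of a synchronization point can behave differently from later ones. The sketch below illustrates only that "first use is special" pattern, under our own assumptions: the free-monad-style helpers (Pure, Op, bind, do, handle), the function names, and the state-passing handler are hypothetical and are not the contexts used in the proof.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Pure:
    value: Any


@dataclass
class Op:
    label: str
    arg: Any
    k: Callable[[Any], Any]  # value -> computation


def bind(m, f):
    if isinstance(m, Pure):
        return f(m.value)
    return Op(m.label, m.arg, lambda v: bind(m.k(v), f))


def do(label, arg=None):
    return Op(label, arg, Pure)


def handle(m, handlers, ret):
    """Deep handler: resumptions re-install the handler."""
    if isinstance(m, Pure):
        return ret(m.value)
    resume = lambda v: handle(m.k(v), handlers, ret)
    if m.label in handlers:
        return handlers[m.label](m.arg, resume)
    return Op(m.label, m.arg, resume)


def run_state(comp, initial):
    """State-passing handler for get/put, applied to the initial state."""
    handlers = {
        # get: resume with the current state s, and keep s
        "get": lambda x, k: Pure(lambda s: bind(k(s), lambda f: f(s))),
        # put: resume with a unit value, and continue with the new state x
        "put": lambda x, k: Pure(lambda s: bind(k(None), lambda f: f(x))),
    }
    ret = lambda v: Pure(lambda s: Pure(v))
    return bind(handle(comp, handlers, ret), lambda f: f(initial))


def once():
    """Behaves specially only while the flag is still true."""
    def branch(flag):
        if flag:
            return bind(do("put", False), lambda _: Pure("first"))
        return Pure("later")
    return bind(do("get"), branch)


def three_uses():
    return bind(once(), lambda a:
           bind(once(), lambda b:
           bind(once(), lambda c:
           Pure((a, b, c)))))
```

Running three_uses with the flag initially true yields one special step followed by ordinary ones; starting with the flag at false, every use is ordinary.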

Comparison with Multi-Prompted Delimited Continuations
Algebraic effects and handlers studied in the untyped setting, as in this work, diverge from their categorical origins [22], and can be considered a new form of delimited control [10,11].
As a matter of fact, there exist mutual encodings of algebraic effects and (deep) handlers over a single operation and the control operator shift0 [28], both in an untyped [12] and polymorphically typed settings [21]. These encodings are not fully abstract and therefore they do not guarantee that a behavioral theory, such as the one presented in this work, would carry over to the corresponding calculus of delimited continuations. Given that we allow for multi-labeled algebraic operations, the corresponding calculus in our case would be a generalization of shift0 to its multi-prompted version shift0 l, whose main reduction rule captures the context up to the nearest enclosing prompt labeled l. We can observe that, in contrast to the calculus of algebraic effects, the party responsible for handling the effect is the same as the one that actually performs the effect: it is not the prompt that handles it, but the expression e in the body of the operator. The reversal of the roles makes algebraic effects considerably more programmer-friendly, but it also simplifies the theory, compared to the one for classical delimited-control operators. In particular, the techniques we propose in this work appear not to be sufficient for constructing a normal-form bisimulation theory for multi-prompted shift0.
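For comparison with the handler rule, the main reduction rule of multi-prompted shift0 can be sketched as follows. The notation is our assumption: ⟨·⟩_l denotes a prompt labeled l, E' is a context with no prompt labeled l around its hole, and the rule follows the usual formulation of shift0, in which the delimiter is removed and the captured continuation keeps its own copy of the prompt.

```latex
\[
E[\langle E'[\mathsf{shift0}_l\ k.\ e] \rangle_l]
\;\to\;
E[e\{\lambda z.\,\langle E'[z] \rangle_l / k\}]
\]
```

In this rule, the body e that consumes the continuation k is supplied at the point where the control effect is performed, whereas in the handler rule it is supplied by the enclosing handler clause.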
The main obstacle is encountered when we relate evaluation contexts, say E 1 and E 2. The requirement that E 1 [z] and E 2 [z] (for a fresh z) be related is uncontroversial. However, how should we test E 1 and E 2 for control effects? We need a notion of an abstract control-stuck term and we do not know how to represent it in this calculus. We could introduce a syntactic category of control-stuck-term variables for this purpose, but this would lead nowhere: plugging E 1 and E 2 with such a variable would immediately result in control-stuck terms; there simply is no code that could test the contexts.
One could try to decompose the contexts E 1 and E 2 into some corresponding sub-contexts and relate those, following the approach that works for single-prompted control operators shift and reset for which there exists a sound normal-form bisimilarity [4]. Whether this could lead to a complete theory is not clear and requires further study. As for single-prompted control operators, be it shift or shift0, reaching completeness seems a tall order: notice that the completeness proof of Section 3.5 hinges on the existence of fresh effect labels (prompts).

Related Work
Up to now, most works studying the behavioral theory of a calculus with generic algebraic effects were not considering handlers, but interpretations of effects instead, usually in a monad. In such a setting, the behavior of an effect is therefore given for all programs once and for all by the interpretation. In contrast, with handlers, the behavior of an effect may change between programs or during the execution of a program, as it depends on how it is handled. The calculus we consider is therefore more expressive than those of the works we list below, with a more discriminative contextual equivalence. It explains why we can reach completeness with a syntactic equivalence such as normal-form bisimilarity while previous works do not achieve completeness with more elaborate equivalences such as applicative bisimilarity. As a matter of fact, the completeness proof presented in this paper relies on an encoding of state and resembles the completeness proof we developed for higher-order state in a previous work [5]. The definition of the normal-form bisimilarity for state, unlike the one presented in this work, did not require any extensions of the calculus.
Some recent works interpret effects in a monad and use relators which express how interpreted terms should be compared in the monad. Relators allow one to develop the behavioral theory of a calculus with effects in a very abstract setting: e.g., one can get for free that the bisimilarity is a congruence provided that a relator exists for the interpretation monad.
Relators have been studied for applicative bisimilarity in call-by-value [15] or call-by-name [16], and for normal-form bisimilarity in call-by-value [14]. As pointed out by the authors in [16], "there is however little hope to prove a generic full-abstraction result [w.r.t. contextual equivalence] in such a setting, although for certain notions of an effect, full abstraction is already known to hold." However, completeness can be obtained in some cases, as in an untyped call-by-name calculus with deterministic effects [16].
The other path to completeness in typed languages is through logic or logical relations.

Conclusion
We present a sound and complete normal-form bisimilarity for a calculus with effects and handlers. The crucial point is to accurately observe how evaluation contexts may handle effects. First, we distinguish resumptions, which are plugged only with values, from regular contexts, which may be plugged with any expressions, including effectful ones. We then test the latter contexts using control-stuck terms where the continuation is represented by a context variable, which allows us to track how the captured continuation is used.
It would also be worthwhile to investigate whether the results presented in this paper carry over to shallow handlers. Finally, there exist a number of type-and-effect systems for algebraic effects of varying complexity [8,9,21], and one can wonder how features such as effect polymorphism along with effect coercions would influence the theory of this paper.

A Soundness Proof Sketch
We only discuss the case of csubst and rsubst, as the others are proved as in the plain λ-calculus [7]. In particular, we use the fact that subst is strong.
Lemma 21. subst ⊑ u
We want to prove that csubst and rsubst are weak, but to circumvent the constraint that they cannot be composed twice in a passive clause, we combine csubst and rsubst in a single function ssubst doing simultaneous substitutions.

Lemma 22. ssubst ⊑ w
Proof. Let R ↣ R, S, and e 1 σ 1 ssubst(R) e 2 σ 2 with e 1 R e 2 and σ 1 R^c σ 2. We proceed by case analysis on the behavior of e 1. The cases where e 1 reduces, is a value, or is an open-stuck term are simple.
Suppose e 1 is a control-stuck term. Any context variable surrounding the hole of E 1 l can only be of the form α l i, meaning that E 1 l σ 1 still does not handle l, and the resulting terms are control-stuck. We progress to ssubst, so we can conclude.
Suppose e 1 is a context-stuck term with α l ∈ dom(σ 1 ) (the case where the variable is not in the domain is easily handled). There exist E 2 and v 2 such that e 2 reduces to a matching context-stuck term. We have two special cases to consider, σ 1 (α l ) = β l [□] and σ 1 (α l ) = □; in the other cases, σ 1 (α l )[v 1 ] is doing something active and we can conclude using Lemma 21.
If σ 1 (α l ) = □, we have x R σ 2 (α l )[x], from which we deduce the existence of a matching value on the right-hand side, which is what we need. Then, for any l and fresh γ l and y, if E 1 [γ l [do l y]] reduces to some e 1' (the case where l is not handled is not interesting), then there exists e 2' such that E 2 [γ l [do l y]] →* e 2' and e 1' S e 2', which is enough to conclude.
Suppose e 1 is a normal form with l ≠ l' and α l ∈ dom(σ 1 ); then there exist E 2 ,

B Completeness Proof Sketch
The proof proceeds as described in Section 3.5: given e 1 ≡ E e 2 , we check that for each behavior of e 1 , e 2 is able to match.If e 1 is a normal form, we verify that (1) e 2 evaluates to a normal form of the same kind, and the normal forms can be decomposed into related sub-parts.
For each case, we give the substitution σ and the context E enforcing (1). Checking that related sub-parts are contextually equivalent relies in most cases on an encoding of a mutable state using handlers, as in Section 3.5. In all the subcases below, we assume the labels get and put to be fresh, and given a boolean b and a context E , we define
We define E in each subcase where the encoding is needed.
Case: e 1 → e 1 . Because the reduction is deterministic, we still have e 1 ≡ E e 2 .
Hence, there exists v 2 such that e 2 → * v 2 ; we check that
Let x be a fresh variable, E a context, and σ a closing substitution such that E[v 1 x]σ ⇓ v . Then E[e 2 x]σ → * E[v 2 x]σ and since e 1 ≡ E e 2 , we also have E[v 2 x]σ ⇓ v .
Case: To check (1), take σ as follows: we check that (2) Assuming we use a fresh variable x and E, σ as testing arguments, we conclude in the former case by considering E = handle {l: z,k → E[z x]; ret z → z} and σ as discriminating arguments.
We prove (3) assuming x fresh and E, σ as testing arguments. Let l , l be fresh labels; we define where E = handle {l: z,k → if do get () then (do put false; do l k) else k (do l z); ret z → z} and E l is E where all the occurrences of l are replaced by l . When l is handled first, we create the discriminating term; subsequent handlings are performed by E through l . Renaming l into a fresh l in E is necessary to bypass the handler for l in E . The discriminating arguments are E true and σ.
Case: ]. Described in detail in Section 3.5.
Case: To check (1), take σ as follows: 4) In each case, we assume x and l to be fresh and the testing arguments to be E and σ.
The discriminating arguments for (2) are σ , defined to be equal to σ everywhere, except

Figure 1 Syntax of λ eff

FSCD 2020 3:4 A Complete Normal-Form Bisimilarity for Algebraic Effects and Handlers

We write E[e] for the plugging of the expression e into the context E, and e{v/x} for the usual capture-avoiding substitution of x by v in e. Given a context E, we define the set of effects it handles, written hl(E), as follows.
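As a reading aid, plugging and hl(E) can be sketched in Python over a simplified context grammar. This is only an illustration: the three context formers below (application on either side of the hole, and a handler frame carrying a single label) are an assumption about the shape of contexts, and the class and function names are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Hole:
    """The empty context."""

@dataclass(frozen=True)
class AppFun:
    """E e: the hole is in function position."""
    ctx: object
    arg: object

@dataclass(frozen=True)
class AppArg:
    """v E: the hole is in argument position."""
    fun: object
    ctx: object

@dataclass(frozen=True)
class Handle:
    """handle E {l: ...; ret ...}: only the label is kept in this sketch."""
    ctx: object
    label: str

def plug(ctx, e):
    """E[e]: fill the hole of ctx with e, producing a term as a nested tuple."""
    if isinstance(ctx, Hole):
        return e
    if isinstance(ctx, AppFun):
        return ("app", plug(ctx.ctx, e), ctx.arg)
    if isinstance(ctx, AppArg):
        return ("app", ctx.fun, plug(ctx.ctx, e))
    return ("handle", plug(ctx.ctx, e), ctx.label)

def hl(ctx):
    """Labels of the handler frames that surround the hole of ctx."""
    if isinstance(ctx, Hole):
        return frozenset()
    if isinstance(ctx, Handle):
        return hl(ctx.ctx) | {ctx.label}
    return hl(ctx.ctx)

# handle ((handle [] {l1: ...}) x) {l2: ...}
example = Handle(AppFun(Handle(Hole(), "l1"), "x"), "l2")
handled = hl(example)
term = plug(example, "e")
```

In this sketch, an effect do l v plugged into example is handled by the context exactly when l belongs to handled.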

Lemma 4. An open expression e is a normal form in the extended calculus iff e is a value, or e = E[x v] for some E, x, and v, or e = E l [do l v] for some E, l, and v, or e = E[α l [v]]


Figure 2 Up-to functions for λ eff

Lemma 19. ≡ E is a bisimulation.
Proof. The proof consists of a case-by-case verification of the conditions stated in Definition 5 for the candidate relation ≡ E . Here we present one of the most representative cases that, in our opinion, best illustrates the power of the calculus and the techniques used in the remaining cases.

structure is considerably more involved since in the absence of control operators, to reach completeness, we had to explicitly handle deferred diverging terms and impose a stack-like discipline on the way evaluation contexts are tested.
handled. Extending the calculus with context variables introduces new normal forms which are compared by the bisimilarity in a very simple and regular way. The fact that such a simple notion of normal-form bisimilarity is complete shows the discriminating power of handlers. A consequence is that the examples of equivalent programs we provide are quite simple, as more complex effectful expressions are easily distinguished by handlers.
There are several directions for future work. As pointed out in Section 3.6, it remains an open question how to define complete normal-form bisimulations in the calculus of multiprompted delimited-control operators corresponding to deep handlers studied in this work.

for α l : σ (α l ) = σ(α l )[handle {l: z,k → if do get () then (do put false; do l z) else k (do l x); ret z → z}], and E true assuming E = handle {l : z,k → E[z x]; ret z → z}.
For (3), we prove E 1 l [x] ≡ E E 2 l [x] as in (2), except that we take an extra fresh l and define σ (α l ) = σ(α l )[handle {l: z,k → if do get () then (do put false; do l k) else k (do l x); ret z → z}] and E = handle {l : z,k → E l [z x]; ret z → z} where E l is the context E where the occurrences of l are replaced with l .
Proving (4) requires (a) E 1 [x] ≡ E E 2 [x] and (b) E 1 [α l [do l z]] ≡ E E 2 [α l [do l x]] for any l and fresh α l . Assuming the same testing arguments, both cases are proved as in (2), except that in (a) we take E = handle {l : z,k → E[k x]; ret z → z} and in (b) we take E = handle {l : z,k → E[k (α l [do l x])]; ret z → z}.

If a handler obtains a value (second rule), there are no more effects to handle and the value is passed to the return clause. The semantics is deterministic, as it can be shown that an expression is either a normal form or can be uniquely decomposed into a redex and an evaluation context.
Example 1. Let us consider the example from the introduction:

D. Biernacki and S. Lenglet and P. Polesiuk

Normal forms and contextual equivalence. When considering open expressions, normal forms can be of the following kinds.
Two expressions e 1 and e 2 are contextually equivalent, written e 1 ≡ e 2 , if for all contexts C, such that C[e 1 ] and C[e 2 ] are closed, we have C[e 1 ] ⇓ v iff C[e 2 ] ⇓ v .

Dealing with control-stuck terms follows the same logic as for open-stuck terms: E 1 l [do l v 1 ] is related to e 2 if e 2 reduces to a control-stuck term with related values and contexts.
Indeed, successive "identity handlers" h = x,k → k x should be related if they handle the same effects, even in a different order: the context handle handle {l 2 : h; ret x → x} {l 1 : h; ret x → x} is expected to be equivalent to the context handle handle {l 1 : h; ret x → x} {l 2 : h; ret x → x}.
stuck terms. The latter differ from control-stuck terms of the form E l [do l v], because α l may be replaced by a context handling l , so even if E 1 does not handle l we cannot consider E 1 [α l [E 2 l ]] as a context not handling l .
A context variable cannot be bound, therefore an open term may contain context variables or free expression variables.In contrast, an expression or context is closed if it does not have any context variable or free expression variable.

are weak, as they essentially behave as bisimulation up to context, which is also weak in the plain λ-calculus. As explained in Section 3.3, they cannot be used in the passive clauses, i.e., when relating values or context-stuck terms.
Let e 1 and e 2 be expressions of the plain calculus. If e 1 ≈ e 2 , then e 1 ≡ e 2 . Indeed, if e 1 ≈ e 2 , then for all contexts C, C[e 1 ] ≈ C[e 2 ] because ≈ is compatible. If C[e 1 ] ⇓ v , then C[e 2 ] ⇓ v simply by definition of the bisimilarity.
Example 15. Dal Lago and Gavazzo [14] propose an example where two fixed-point combinators are signaling each β-reduction with a tick effect; we modify it so that the two expressions are equivalent with handlers (but the tick effect is now arbitrary). Let
e 1 = λy.do tick (∆ y ∆ y )
∆ y = λx.(do tick y) λz.do tick (x x z)
e 2 = Θ Θ
Θ = λx.λy.do tick ((do tick y) λz.do tick (x x y z))
We prove that these expressions are bisimilar using up-to techniques, by building a candidate relation R incrementally, starting from e 1 and e 2 .
Proof. The term e 1 is a value, and e 2 → λy.do tick ((do tick y) λz.do tick (Θ Θ y z)), so we need to relate the bodies of the λ-abstractions. We have a reduction do tick (∆ y ∆ y ) → do tick ((do tick y) λz.do tick (∆ y ∆ y z)); the resulting term is control-stuck, which we relate to do tick ((do tick y) λz.do tick (Θ Θ y z)), which is also control-stuck. The arguments of the effect are the same, and we need to relate the two contexts do tick (□ λz.do tick (∆ y ∆ y z)) and do tick (□ λz.do tick (Θ Θ y z)). Plugging them with a fresh variable, we obtain two open-stuck terms, meaning that we need to relate the two identical contexts do tick □ and the values λz.do tick (∆ y ∆ y z) and λz.do tick (Θ Θ y z). These last two values are related up to lambda and evaluation context if R contains ∆ y ∆ y and Θ Θ y, and the bisimulation proof for these two expressions is the same as for e 1 and e 2 . In the end, taking R = {(e 1 , e 2 ), (∆ y ∆ y , Θ Θ y)}, we can show that R is a bisimulation up to refl, red, lam, and up to context, i.e., up to cvar and csubst. Note that we are allowed to use the latter weak technique when comparing open-stuck terms, as it is an active clause.
Johann et al. [13] propose a contextual equivalence and a logical relation characterizing it in a call-by-name calculus with effects. Their framework deals with different effects in a uniform way but with some limitations: for instance, nondeterminism, local store, or the combination of effects cannot be accounted for. Simpson and Voorneveld [29] present a modal logic for a call-by-value calculus which coincides with Dal Lago et al.'s applicative bisimilarity [15], but not with contextual equivalence, as demonstrated later [19]. Matache and Staton improve on these results by defining a logic for a calculus in continuation-passing style that coincides with both applicative bisimilarity and contextual equivalence [19]. Finally, Biernacki et al. [8] define a step-indexed logical relation for a call-by-value calculus with effects and handlers; to the best of our knowledge, it is the only previous work with handlers.

such that e 2 → * E 2 [α l [E 2 l [do l v 2 ]]]; we check that (2) v 1 ≡ v E v 2 , (3) E 1 l ≡ r E E 2 l, and (