Useful Open Call-by-Need

This paper studies useful sharing, a sophisticated optimization for lambda-calculi, in the context of call-by-need evaluation in the presence of open terms. Useful sharing turns out to be harder in call-by-need than in call-by-name or call-by-value, because call-by-need evaluates inside environments, making it harder to specify when a substitution step is useful. We isolate the key concepts involved and prove the correctness and the completeness of useful sharing in this setting.


Introduction
Despite decades of research on how to best evaluate λ-terms, the topic is still actively studied and recent years have actually seen a surge in new results and sophisticated techniques. This paper is an attempt at harmonizing two of them, namely, strong call-by-need and useful sharing, under the influence of a third recently identified setting, open call-by-value. To describe our results, we have to first outline each of these approaches.
In the untyped, effect-free setting of the λ-calculus, call-by-need (CbNeed) can be seen as borrowing the best aspects of call-by-value (CbV), of which it takes efficiency, and of call-by-name (CbN), of which it retains the better terminating behavior, as stressed in particular by Accattoli et al. [17]. In contrast to CbN and CbV, however, CbNeed cannot easily be managed at the small-step level of the usual operational semantics of the λ-calculus, based on β-reduction and meta-level substitution. Its fine dynamics, indeed, requires a decomposition of the substitution process acting on single variable occurrences at a time - what we refer to as micro-step (operational) semantics - and enriching λ-terms with some form of first-class sharing. While Wadsworth's original presentation is quite difficult to manage, over the years presentations of CbNeed have improved considerably ([45, 48, 19, 31]), up to obtaining neat definitions, such as the one by Accattoli et al. [6] (2014) in the linear substitution calculus (shortened to LSC), which led to elegant proofs of its correctness with respect to CbN, as done by Kesner [41] (2016), and of its relationship with neededness from a rewriting point of view, by Kesner et al. [43] (2018).
Strong Call-by-Need. Being motivated by functional languages, CbNeed is usually studied considering two restrictions with respect to the ordinary λ-calculus: 1) terms are closed, and 2) abstraction bodies are not evaluated. Let us call this setting Closed CbNeed. Extensions of CbNeed removing both these restrictions have been considered, obtaining what we shall refer to as Strong CbNeed. In his PhD thesis [27] (1999), Barras designs and implements an abstract machine for Strong CbNeed, which has then been used in the kernel of the Coq proof assistant to decide the convertibility of terms. Balabonski et al. [23] (2017) give instead the first formal operational semantics of Strong CbNeed, proving it correct with respect to Strong CbN - see also Barenbaum et al. [25], where the semantics of [23] is extended towards Barras's work; Biernacka and Charatonik [28], where it is studied via an abstract machine; Balabonski et al. [24], where it has recently been revisited and partially formalized.
CbNeed and the Strong Barrier. The definition of Strong CbNeed in [23] builds over the simple one in the LSC, and yet is very sophisticated and far from obvious. This is an instance of a more general fact concerning implementation techniques: dealing with the strong setting is orders of magnitude more difficult than with the closed setting; it is not just a matter of adapting a few definitions. New complex issues show up, requiring new techniques and concepts - let us refer to this fact as the strong barrier. Another instance is the fact that Lévy's optimality [47] is far more complex in the strong case than in the weak one [29,22].
For neededness, the tool to break the strong barrier is a complex notion of needed evaluation context, parametrized and defined by mutual induction with their sets of needed variables. Specifying the positions in a term where needed redexes take place is very subtle.
Reasonable Cost Models and the Strong Barrier. Another sophisticated form of sharing for λ-calculi arose recently in the study of whether the λ-calculus admits reasonable evaluation strategies, that is, strategies whose number of β steps is a reasonable time cost model (i.e., a measure of time complexity) for λ-terms. The number of function calls (that is, β-steps) is the cost model often used in practice for functional programs - this is done for instance by Charguéraud and Pottier in [32]. A time cost model is reasonable when it is polynomially equivalent to the one of Turing machines, which is the requirement for good time cost models. For the λ-calculus, the theory justifying the practice of taking the number of function calls as a time cost model is far from trivial. It is an active research topic, see Accattoli [5].
The first result about λ-calculus reasonable strategies is due to Blelloch and Greiner [30] (1995), and concerns Closed CbV. The 2000s have seen similar results for Closed CbN and Closed CbNeed by Sands, Gustavson, and Moran [51] and Dal Lago and Martini [34,33]. These cases are based on simulating the λ-calculus via simple forms of sharing such as those at work in abstract machines. The same kind of sharing can also be represented in the LSC, as shown by Accattoli et al. [6]. The strong case seemed elusive and was suspected not to be reasonable, because of Asperti and Mairson's result that Lévy's optimal (strong) strategy is not reasonable [21] - the elusiveness was just another instance of the strong barrier.
Useful Sharing. In 2014, Accattoli and Dal Lago managed to break the barrier, proving that Strong CbN (also known as leftmost-outermost evaluation, or normal order) is a reasonable strategy [14]. The proof rests on a simulation of Strong CbN in a refinement of the LSC with a new further level of sharing, deemed useful sharing. They also show useful sharing to be mandatory for breaking the strong barrier for reasonability.
Useful sharing amounts to doing minimal unsharing work, namely only when it contributes to creating β-steps, while avoiding unfolding the sharing (i.e., substituting) when it only makes the term grow in size. Similarly to CbNeed, the specification of useful sharing can take place only at the micro-step level. Note that the replacement of a variable x in t with u can create a β-redex only if u is (or shall reduce to) an abstraction and there is an applied occurrence of x in t (that is, t = T⟨xs⟩ for some context T). Therefore, restricting to useful substitutions - that is, adopting useful sharing - amounts to two optimizations of the substitution/unfolding process:
1. Never substitute normal applications: one must avoid substitutions of terms which are not - and shall not reduce to - abstractions, such as, say, yz, because their substitution cannot create β-redexes. Indeed, T⟨(yz)s⟩ has a β-redex if and only if T⟨xs⟩ does.
2. Substituting abstractions on-demand: when the term to substitute is an abstraction, one needs to be sure that the variable occurrence to replace is applied, because, for instance, replacing x with I in yx (obtaining yI) is useless, as no β-redexes are created.
The first optimization is easy to specify, because it concerns the shape of the terms to substitute, that is, what to substitute - it has a small-step nature. The second one instead is very delicate, as it also concerns where to substitute. It depends on single variable occurrences and thus it is inherently micro-step - note that x has both a useful and a useless occurrence in xx. Similarly to Strong CbNeed, the difficulty is specifying useful evaluation contexts.
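The what/where distinction can be made concrete with a small Python sketch. It is only an illustration under stated assumptions: the Var/Lam/App encoding and the names occurrences and is_useful are hypothetical, the check on what is substituted is purely syntactic (it ignores terms that shall reduce to an abstraction), and the check on where looks only at whether the occurrence is the left subterm of an application.

```python
from dataclasses import dataclass

# Hypothetical encoding of lambda-terms, introduced only for this illustration.
@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Lam:
    param: str
    body: object

@dataclass(frozen=True)
class App:
    fun: object
    arg: object

def occurrences(t, x, applied=False):
    """Yield one boolean per free occurrence of x in t, left to right:
    True when that occurrence is the left subterm of an application."""
    if isinstance(t, Var):
        if t.name == x:
            yield applied
    elif isinstance(t, Lam):
        if t.param != x:  # occurrences bound by this abstraction are not free
            yield from occurrences(t.body, x, False)
    else:
        yield from occurrences(t.fun, x, True)
        yield from occurrences(t.arg, x, False)

def is_useful(u, occurrence_is_applied):
    """Syntactic approximation: replacing an occurrence by u is useful
    only if u is an abstraction (what) and the occurrence is applied (where)."""
    return isinstance(u, Lam) and occurrence_is_applied

I = Lam("z", Var("z"))
xx = App(Var("x"), Var("x"))
# In xx, the first occurrence of x is applied and the second is not.
print(list(occurrences(xx, "x")))
```

Under this reading, the same variable x in xx yields one useful and one useless replacement, which is why the second optimization cannot be stated at the small-step level.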
Strong CbNeed and Useful Sharing. Given the similar micro-step traits of CbNeed and useful sharing, and their similar difficulties, it is natural to wonder whether they can be combined. The operational semantics of Strong CbNeed in [23] has the easy useful optimization hardcoded, as it substitutes only abstractions. However, it ignores the delicate second optimization, and its number of β steps is therefore not a reasonable cost model. Concretely, this means that the practice of counting function calls does not reflect the cost of Balabonski et al.'s operational semantics for Strong CbNeed. Since Strong CbNeed is used in the implementation of Coq, this issue has both theoretical and practical relevance.
The aim of this paper is to start adapting useful sharing to call-by-need, developing reasonable operational semantics for CbNeed beyond the closed setting, and continuing a research line about CbNeed started by Accattoli and Barras [7,8]. To explain our approach, we first need to overview a recent new perspective on the strong barrier.
Opening the Strong Barrier. The theory of the λ-calculus has mainly been developed in CbN. Historically, Barendregt stressed the importance of head evaluation (which does not evaluate arguments) for a meaningful representation of partial recursive functions - this is the leading theme of his famous book [26]. A decade later, Abramsky and Ong stressed the relevance of weak head evaluation (which does not evaluate abstraction bodies either) to model functional programming languages [1]. Therefore, the usual incremental way to understand strong evaluation is to start with the closed CbN case (i.e., weak head evaluation and closed terms), then turn to the head case (head evaluation and open terms), and finally add evaluation into arguments obtaining the strong case (and leftmost-outermost evaluation). This is for instance the progression that has been followed by Accattoli and Dal Lago to obtain a reasonable time cost model for Strong CbN [13,14].
In a line of work by Accattoli and co-authors [9,15,16,12] aimed at developing a theory of CbV beyond the usual closed case, it became clear that there is an alternative and better route to the strong setting. They also show that useful sharing factors through the open setting, rather than through the head one: the two useful optimizations are irrelevant in the head case, while they make sense in the open one, where they can be studied without facing the whole of the strong barrier.
The strong setting can be seen as the iteration of the open one under abstraction (but not as the iteration of the closed one, because diving into abstractions forces one to deal with open terms). This view is adopted by Grégoire and Leroy in the design of the second strong abstract machine at work in Coq [38]. Useful sharing for the strong case then amounts to understanding how open useful sharing and the iteration interact, which is subtle and yet is an orthogonal problem. Studying the open case first is the progression followed recently by Accattoli and co-authors to prove that Strong CbV is reasonable for time [9,16,11].
This Paper. According to the decomposition of the strong barrier, here we study, as a first step, useful sharing for CbNeed in the open setting. Let us stress that, because of the barrier, it is not practicable to directly study the strong setting - this is also how the studies for CbN and CbV, which are simpler than CbNeed, have been carried out in the literature.
An interesting aspect of useful sharing is that, while the underlying principle is the same, its CbN and CbV incarnations look very different, as the two strategies provide different invariants, leading to different realizations of the required optimizations.It is then interesting to explore useful sharing in CbNeed, which can be seen as a merge of CbN and CbV.
Difficulties. It turns out that useful sharing is considerably more difficult to specify in CbNeed than in CbN or CbV. Useful sharing requires knowing, for every variable replacement, both what is being substituted (is it an abstraction?) and where (is the variable to replace applied?). Evaluating only needed arguments, and only once, means that CbNeed evaluation moves deeply into a partially evaluated environment, making it hard to keep track of both the what and the where of variable replacements. In particular, a variable might not be applied in the environment but at the same time be meant to replace an applied variable - thus being applied up to sharing - making the identification of applied variables a major difficulty.
The definition of useful rewriting steps is always involved. In CbN and CbV, they can nonetheless be specified compactly via the concept of unfolding, that is, iterated meta-level substitutions [14,9]. These definitions can be called semantical, as they define useful micro steps via side conditions of a small-step nature. They are also somewhat ineffective, because they require further work to be made operational. Unfortunately, it is unclear how to give a semantic definition of usefulness in CbNeed. In particular, defining useful CbNeed evaluation contexts seems to require the unfolding of contexts, which is tricky, given that in CbNeed the context hole might be shared, thus risking being duplicated by the unfolding.
Outcome. Despite these difficulties, we succeed in designing an operational semantics for Open CbNeed with useful sharing, and proving that it validates the expected properties.
We proceed in three incremental steps. First, we provide a new split presentation of Closed CbNeed tuned for the study of useful sharing developed later on. Second, we extend it to the open setting, essentially mimicking Balabonski et al.'s approach [23], but limiting it to the open fragment. The real novelty is the third step, providing the refinement into a useful open CbNeed calculus, of which we prove the good properties. The crucial and sophisticated concept is the one of useful (CbNeed) evaluation contexts, which isolate where useful needed substitutions can be triggered. They are parametrized and defined by mutual induction with the notions of both applied and unapplied variables, similarly to how needed evaluation contexts are parametrized and mutually dependent with needed variables. The isolation of these concepts and the proof of their properties are our main contribution.
Our definition of useful step is operational rather than semantical, as we give a direct (and unfortunately involved) definition of useful evaluation contexts, it being unclear how to give a semantic definition based on unfoldings in CbNeed. On the positive side, ours is the first fully operational definition of usefulness in the literature. Previous work (in CbN and CbV) has either adopted semantical ones [14,9], or has given abstract machines realizing the useful optimizations, but avoiding defining a useful calculus on purpose [3,16,11].
Among the properties that we prove, two can be seen as capturing the correctness and the completeness of useful sharing with respect to Open CbNeed:
Correctness: useful substitution steps are eventually followed by a β step, the one that they contribute to create. That is, our useful steps correctly capture the intended semantics, as no steps irrelevant for β redexes are mistakenly considered as useful.
Completeness: normal forms in Useful Open CbNeed unfold to normal forms in Open CbNeed (the unfolding of normal forms is easy to deal with). That is, useful steps do not stop too soon: no steps contributing to β redexes are mistakenly considered as useless.
Sketched Complexity Analysis. The third essential property for useful sharing, and its reason to be, is reasonability: the useful calculus can be implemented within a polynomial (or even linear) overhead in the number of β-steps. We sketch the complexity analysis at the end of the paper. A formal proof requires introducing an abstract machine implementing the calculus. We have developed the machine, but left it to a forthcoming paper for lack of space.
Intersection Types in the Background. Because of the inherent difficulties mentioned above, our calculus is involved, even very involved. To remove the suspicion that it is an ad-hoc calculus, we paired it with a characterization of its key properties via intersection types, used as a validation tool with a denotational flavor, refining the type-based studies in [41,23,17]. In such a typing system, the delicate notions of useful evaluation contexts, and applied and unapplied variables have natural counterparts, and type derivations can be used to measure both evaluation lengths and the size of normal forms exactly. Such a companion study - omitted for lack of space - is in Leberle's PhD thesis [46].
Proofs. We adopt a meticulous approach, developing proofs in full detail, almost at the level of a formalization in a proof assistant. The many technical details, mostly of a tedious nature, are in the technical report [18]. This paper explains the relevant concepts.

The Need For Useful Sharing
Here we show a paradigmatic case of a size-exploding family - that is, a family of terms whose size grows exponentially with the number of β-steps - motivating the key optimization of useful sharing for open and strong evaluation. Actually, there are two paradigmatic cases of size explosion and, accordingly, two optimizations characterizing useful sharing. The first optimization amounts to forbidding the substitution of normal applications, and it is hardcoded into CbNeed evaluation, which by definition substitutes only values. Therefore, we omit discussing the first case of size explosion - more details can be found in [14,11].
CSL 2022

Size Explosion.
The example of size explosion we are concerned with is due to Accattoli [2] and is based on the following families of terms, the t_i, and results, the u_i (where I := λz.z):
▶ Proposition 1 (Closed and strategy-independent size explosion, [2]). Let n > 0. Then
The Useful Optimization. It is easily seen that all the terms substituted along the evaluation of the family are abstractions, namely the identity I and instances of u_i, and that none of these abstractions ever becomes the abstraction (on the left) of a β-redex - that is, their substitution does not create, or it is not useful for, β-redexes. These abstractions are however duplicated and nested inside each other, being responsible for the exponential growth of the term size. Useful sharing is about avoiding such useless duplications. If evaluation is weak, and substitution is micro-step (i.e., one variable occurrence at a time, when in evaluation position, in a formalism with sharing), then the family does not cause an explosion. The replaced variables indeed are all instances of x in some t_i which are under abstraction, and which are then never replaced in micro-step weak evaluation. With micro-step strong evaluation, however, these replacements do happen, and the size explodes. When evaluated with Balabonski et al.'s Strong CbNeed [23], this family takes a number of micro-steps exponential in the number of β steps, showing that - however efficient Strong CbNeed may be - the number of β steps does not reasonably measure its evaluation time.
To tame this problem, one needs to avoid useless substitutions, resting on an optimization sometimes called substituting abstractions on-demand, which is tricky. It requires abstractions to be substituted only on applied variable occurrences: note that the explosion is caused by replacements of variables (namely the instances of x) which are not applied, and that thus do not create β-redexes. For instance, the optimization should allow us to substitute I on y in yx, because it is useful, that is, it creates a β-redex, while it should forbid substituting it on x because it is useless for β-redexes. Note that this optimization makes sense only when one switches to micro-step evaluation, that is, at the level of machines, because in xx there are both a useful and a useless occurrence of x. The implementation of substituting abstractions on-demand is very subtle, also because by not performing useless substitutions, it breaks invariants of the usual open/strong evaluation process.
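The yx and xx examples can be checked mechanically with a short Python sketch that performs one micro-step replacement (a single occurrence) and counts the resulting β-redexes. The term encoding and the helpers replace_nth and beta_redexes are assumptions made for this illustration; the replacement is naive and presumes that the substituted term is closed (as I is), so that no variable capture can occur.

```python
from dataclasses import dataclass

# Hypothetical encoding of lambda-terms, introduced only for this illustration.
@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Lam:
    param: str
    body: object

@dataclass(frozen=True)
class App:
    fun: object
    arg: object

def replace_nth(t, x, u, n):
    """Replace the n-th (0-based, left-to-right) free occurrence of x in t by u.
    Returns (term, k); k is -1 once the replacement has been performed.
    Assumes u is closed, so capture-avoidance is not needed."""
    if n < 0:
        return t, n
    if isinstance(t, Var):
        if t.name == x:
            return (u, -1) if n == 0 else (t, n - 1)
        return t, n
    if isinstance(t, Lam):
        if t.param == x:
            return t, n
        body, n = replace_nth(t.body, x, u, n)
        return Lam(t.param, body), n
    fun, n = replace_nth(t.fun, x, u, n)
    arg, n = replace_nth(t.arg, x, u, n)
    return App(fun, arg), n

def beta_redexes(t):
    """Count subterms of the shape (λx.s) r anywhere in t."""
    if isinstance(t, Var):
        return 0
    if isinstance(t, Lam):
        return beta_redexes(t.body)
    here = 1 if isinstance(t.fun, Lam) else 0
    return here + beta_redexes(t.fun) + beta_redexes(t.arg)

I = Lam("z", Var("z"))
yx = App(Var("y"), Var("x"))
xx = App(Var("x"), Var("x"))

useful, _ = replace_nth(yx, "y", I, 0)    # I x: a redex is created
useless, _ = replace_nth(yx, "x", I, 0)   # y I: no redex is created
print(beta_redexes(useful), beta_redexes(useless))
```

Replacing the two occurrences of x in xx one at a time gives the same contrast within a single term, which is the reason the optimization only makes sense at the micro-step level.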
As shown by Accattoli and Guerrieri [16], in an open (but not strong) setting, substituting abstractions on-demand is not mandatory for reasonability. They also show, however, that it makes nonetheless sense to study it because it is mandatory for obtaining efficient implementations, as it reduces the complexity of the overhead from quadratic to linear with respect to the size of the initial term. On the other hand, the optimization is mandatory in strong settings, and it is easier to first study it in the open setting, because the iteration under abstraction (required to handle the strong case) introduces new complex subtleties.

The Split Presentation of Closed Call-by-Need

Split Evaluation Rules.
In contrast to most λ-calculi, we do not define the root cases of the rules and then extend them by a closure by evaluation contexts. We rather define them directly at the global level. Adopting global rules is not mandatory, and yet it shall be convenient for dealing with the useful calculus - we use them here too for uniformity.

Closed CbNeed evaluation rules
The names of the rules are due to the link between the LSC and linear logic, see Accattoli [4]. Note that we use both plugging of terms and programs, to ease up notations. While the intended behavior is - we hope - clear, specifying these steps via evaluation contexts requires some care and a few definitions. Essentially, we need to understand when evaluation can pass to the next argument, and thus characterize when terms are normal. This is easy for terms but becomes tricky for programs.
Evaluation Places and Needed Variables. The grammars of the language are the same as for split Closed CbNeed, but defining the open evaluation contexts is quite subtler. In Closed CbNeed there is only one place of the term where evaluation can take place, the hereditary head context H*. In the open setting the situation is more general: there is one active evaluation place plus potentially many passive ones, which are those places where evaluation already passed and ended. On some of these passive places, evaluation ended on a free variable (occurrence). We refer to these free variables as needed (definition below; see footnote 1), as they shall end up in the normal form, given that at least one of their occurrences has already been evaluated and cannot be erased. For instance, in p := (x(yI), [z←x][y←I]) the active place is y, the first occurrence of x is a needed occurrence, while the second one is not.
Normal Terms. In Open CbNeed normal forms are not simply answers (i.e., abstractions together with an environment), as free variables induce a richer structure. We shall later characterize the subtle inductive structure of normal programs. For now, we need predicates characterizing normal terms (as shall be shown later), as they are used to define evaluation contexts. The definition and the terminology are borrowed from Open CbV [9,15], where normal terms are called fireballs and are defined by mutual induction with inert terms:
Later on, we shall often need to refer to inert terms that are not variables, which is why we introduce now a dedicated notation. We shall sometimes write inert(t) (resp., abs(t)) to express that t is an inert term (resp., an abstraction).
1 Needed variables are intended to be considered only for normal terms (or normal programs, or normal parts of a context), and yet the definition is given here for every term (in particular, for every application, instead of only for inert applications i f). The reason for our lax definition is that the technical development requires at times to consider the needed variables of a term that is not yet known to be normal. The lax definition goes against the intuition of neededness, as one of the reviewers understandably complained, suggesting to call these variables frozen, following Balabonski et al. [23]. We preferred to keep needed because they are similar to but different from the frozen variables in [23], see the end of this section.
[Figure: Open evaluation contexts and their needed variables]
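Since the defining clauses of fireballs and needed variables are not reproduced above, the following Python sketch gives one possible reading of them. Everything in it is an assumption made for illustration: the term encoding, the representation of a program as a term plus a list of explicit substitutions read left to right, the fireball grammar as recalled from the Open CbV literature (inert terms are variables or inert terms applied to fireballs), and the clauses for needed variables, which are chosen to be lax on applications and to agree with the example p := (x(yI), [z←x][y←I]).

```python
from dataclasses import dataclass

# Hypothetical encoding of lambda-terms, introduced only for this illustration.
@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Lam:
    param: str
    body: object

@dataclass(frozen=True)
class App:
    fun: object
    arg: object

def is_inert(t):
    """Assumed grammar: i ::= x | i f."""
    if isinstance(t, Var):
        return True
    if isinstance(t, App):
        return is_inert(t.fun) and is_fireball(t.arg)
    return False

def is_fireball(t):
    """Assumed grammar: f ::= abstraction | i."""
    return isinstance(t, Lam) or is_inert(t)

def nv_term(t):
    """Assumed lax clauses: defined on every term, not only on normal ones."""
    if isinstance(t, Var):
        return {t.name}
    if isinstance(t, Lam):
        return set()
    return nv_term(t.fun) | nv_term(t.arg)

def nv_program(t, env):
    """env lists the explicit substitutions [x1<-u1]...[xn<-un], left to right.
    A substitution contributes only if its variable is needed so far."""
    needed = nv_term(t)
    for x, u in env:
        if x in needed:
            needed = (needed - {x}) | nv_term(u)
    return needed

I = Lam("z", Var("z"))
p_body = App(Var("x"), App(Var("y"), I))
p_env = [("z", Var("x")), ("y", I)]
# Only the first occurrence of x counts: z is never needed, so [z<-x] is skipped.
print(nv_program(p_body, p_env))
```

The design choice worth noting is that an explicit substitution is inspected only when its bound variable is already needed, which is what makes the second occurrence of x in p irrelevant.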

▶ Lemma 2 (Unique parameterization of open evaluation contexts).
Let P ∈ E_V and P ∈ E_W. Then V = W.
Open Evaluation Rules. The definition of the evaluation rules mimics exactly the one for the split closed case. Given an Open CbNeed evaluation context P ∈ E_V, we have:
We shall say that p reduces to q in the Open CbNeed evaluation strategy, and write p →_ond q, whenever p →_om q or p →_oe q.
Normal Programs. Normal programs mimic normal terms and are of two kinds, inert or abstractions. The definition however now depends on needed variables and cannot be given as a simple grammar. The two predicates inert and abs are defined in Fig. 2. Finally, predicate onorm is defined as the union of inert and abs, that is, onorm(p) if inert(p) or abs(p). The intended meaning is that it characterizes programs in Open CbNeed-normal form.

▶ Proposition 4 (Syntactic characterization of Open CbNeed-normal forms). Let p be a program. Then p is in →_ond-normal form if and only if onorm(p).
The proofs of Prop. 3 and 4 (in [18]) are subtler and longer than one might expect, because of the fact that evaluation contexts and needed variables are mutually defined.

Relationship with Balabonski et al.
With respect to the definition of Strong CbNeed in [23], we follow essentially the same approach up to two differences, not counting the obvious fact that we are open and not strong.First, we use a split calculus, while they do not, because they do not study useful sharing.
Second, they have a similar but different parametrization of evaluation contexts. They are more liberal, as their sets of frozen variables used as parameters are supersets of our needed variables, but they also parametrize reduction steps, which we avoid. Our 'tighter' choice is related to the fine study of intersection types for Open CbNeed, which can be found in Leberle's PhD thesis [46], and it is also essential for the refinement required by the useful extension of Sect. 5. In [24], a reformulation of [23] using a deductive system (parametrized by frozen variables) rather than evaluation contexts is used - it could also be used here.

Useful Open Call-by-Need
Roughly, useful sharing is an optimization of micro-step substitutions, that is, of exponential steps. The idea is that there are substitution steps that are useful to create β/multiplicative redexes and steps that are useless. For instance (the underline stresses the created β-redex):
The main idea is that useful steps replace applied variable occurrences, while useless steps replace unapplied occurrences. The definition of the useful calculus then shall refine the open one by replacing the set of needed variables with two sets, one for applied and one for unapplied variable occurrences. Note a subtlety: variables can have both applied and unapplied needed occurrences, as x in xx. Therefore, usefulness is a concept that can be properly expressed only when considering replacements of single variable occurrences.
Usefulness unfortunately is not so simple. Consider the following step replacing z with I:
(1) (xy, [x←z][z←I]) →_oe (xy, [x←I][z←I])
Is it useful or useless? It does not create a multiplicative redex - therefore it looks useless - but without it we cannot perform the next step (xy, [x←I][z←I]) →_oe (Iy, [x←I][z←I]) replacing x with I, which is certainly useful - thus step (1) has to be useful.
We then have to refine the defining principle for usefulness: useful steps replace hereditarily applied variable occurrences, that is, occurrences that are applied, or that are by themselves (i.e., not in an application) and that are meant to replace a hereditarily applied occurrence.
Handling hereditarily applied variables is specific to CbNeed, and makes defining Useful Open CbNeed quite painful. The key point is the global character of the hereditary notion, which requires checking the evaluation context leading to the variable occurrence and is thus not of a local nature. We believe that hereditarily applied variables, nonetheless, are an unavoidable ingredient of usefulness in a CbNeed scenario, and not an ad-hoc point of our study. This opinion is backed by the fact that such a convoluted mechanism is modeled very naturally at the level of intersection types, as it is shown in Leberle's PhD Thesis [46]. Important: from now on, we ease the language saying applied to mean hereditarily applied.
Applied and Unapplied Variables. We now define, for terms, programs, and term contexts, the sets of applied and unapplied variables a(•) and u(•), which are subsets of needed variables nv(•). We shall prove that nv(t) = a(t) ∪ u(t) (i.e., the two sets cover nv(t) exactly). As already pointed out, applied and unapplied variables, however, are not a partition of needed variables, that is, in general a(t) ∩ u(t) ≠ ∅, as a variable can have both applied and unapplied (needed) occurrences, as x in xx. The same holds also for programs and term contexts.

The sets of applied variables of terms, programs, and term contexts are defined in Fig. 3; explanations follow. Having in mind that we want to define a(p) in such a way that it satisfies a(p) ⊆ nv(p), note that condition x ∈ nv(t, e) ∧ x ∈ a(t, e) ∧ u = y ∈ Var in the definition of a(t, e[x←u]) would more simply be x ∈ a(t, e) ∧ u = y ∈ Var. However, we have not proved yet that a(p) ⊆ nv(p), which is why the definition is given in this more general form.
We give some examples. As expected, y is an applied variable of y z and z (y z). It is also applied in p := (z x, [x←y z]), even if x is not applied in (z x, ϵ). Thus, Useful Open CbNeed evaluation shall be defined so as to include exponential steps such as (z x, [x←y z][y←v]) →_oe (z x, [x←v z][y←v]), which are useful. Note that y is not applied in (x, [z←yx]), because applied variables have to be needed variables, and y is not needed. Another example: if p := (xt, [x←y]), then y ∈ a(p) (and also z ∈ a((xt, [x←y][y←z]))). Useful Open CbNeed, then, shall retain the following two exponential steps of the open case, since the sequence is supposed to continue with a →_m step, contracting the redex given by vt:
The sets of unapplied variables of terms, programs, and term contexts are defined in Fig. 3. Once again, in the second clause defining u(t, e[x←u]) the side condition x ∈ nv(t, e) can be replaced by x ∈ a(t, e), after Lemma 5 below is proved.
We give some examples. A consequence of the definition is that, as for applied variables, y is not unapplied in (xx, [z←xy]) because it is not needed. As is probably expected, y is unapplied in (zx, [z←xy]), even if xy is meant to replace z, which is applied in zx. Perhaps counter-intuitively, instead, our definitions imply both y ∈ a(p) and y ∈ u(p) for p := (xx, [x←y]), that is, the unique occurrence of y is both applied and unapplied in p.
1. Terms: nv(t) = u(t) ∪ a(t) for every term t.
Finally, the derived concept of useless variable shall also be used.
▶ Definition 6 (Useless variables).Given a term t, we define the set of useless variables as ul (t) := u(t)\a(t).The set of useless variables of a program p is defined analogously.
Useless variables are crucial in differentiating Useful Open CbNeed from Open CbNeed. We shall prove that if p is a useful open normal form and x ∈ ul(p), then p@[x←v] is also a useful open normal form (while it is not an open normal form). The notion of useless variables is intuitively simple but technically complex. Some examples. First, note that ul(x x, ϵ) = ∅. The example can be extended to a hereditary setting, noting that ul(y, [y←x x]) = ∅. However, the reasoning takes into account only needed occurrences, that is, note that x ∈ ul(z x, [y←x x]), as the occurrence of x that is applied to an argument is not needed.
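The examples given so far for a(•), u(•) and ul(•) can be replayed with the Python sketch below. It should be read with care: the clauses are a reconstruction chosen to agree with those examples and with the hereditary condition quoted above (if x is applied and the substituted term is a variable y, then y is applied); they are not claimed to coincide with the definitions in Fig. 3. The term encoding, the list representation of environments, and all function names are assumptions made for this illustration.

```python
from dataclasses import dataclass

# Hypothetical encoding of lambda-terms, introduced only for this illustration.
@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Lam:
    param: str
    body: object

@dataclass(frozen=True)
class App:
    fun: object
    arg: object

def a_term(t):
    """Applied variables of a term: variables in head position of an application."""
    if isinstance(t, App):
        head = {t.fun.name} if isinstance(t.fun, Var) else a_term(t.fun)
        return head | a_term(t.arg)
    return set()

def u_term(t):
    """Unapplied variables of a term: needed occurrences that are not applied."""
    if isinstance(t, Var):
        return {t.name}
    if isinstance(t, Lam):
        return set()
    left = set() if isinstance(t.fun, Var) else u_term(t.fun)
    return left | u_term(t.arg)

def au_program(t, env):
    """Return (applied, unapplied) for the program (t, env), reading env left to right.
    Reconstructed clauses: an explicit substitution counts only if its variable is needed."""
    a, u = a_term(t), u_term(t)
    for x, s in env:
        if x not in a | u:
            continue
        x_applied, x_unapplied = x in a, x in u
        a, u = (a - {x}) | a_term(s), u - {x}
        if isinstance(s, Var):
            # A lone variable inherits the status of the variable it replaces.
            if x_applied:
                a.add(s.name)
            if x_unapplied:
                u.add(s.name)
        else:
            u |= u_term(s)
    return a, u

def ul_program(t, env):
    """Useless variables, following Definition 6: unapplied but not applied."""
    a, u = au_program(t, env)
    return u - a

x, y, z = Var("x"), Var("y"), Var("z")
print(ul_program(App(x, x), []))                  # x is both applied and unapplied
print(ul_program(App(z, x), [("y", App(x, x))]))  # y is not needed, so x stays useless
```

With these clauses, the union of the two returned sets plays the role of the needed variables, in line with the statement nv(t) = a(t) ∪ u(t) for terms.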
Evaluation Contexts. The definition of evaluation contexts is particularly subtle in the useful case. First of all, their set E_{U,A} is indexed by two sets of variables (rather than one as in the open case), the applied A and the unapplied U variables of the context, defined by mutual induction with the contexts themselves. The second key point is that there are two different kinds of evaluation contexts, a permissive one for multiplicative redexes, whose set is noted E_{U,A}, and a restrictive one for exponential redexes, noted E^@_{U,A} and implementing the fact that the variable occurrence to be replaced has to be in an applied position. The asymmetry is unavoidable, because useful sharing concerns only exponential steps.

[...] which is now generalized into 3 rules, depending on the kind of term contained in the ES.
▶ Definition 7 (Applicative term contexts). A term context H shall be called an applicative term context if it is derived using the grammar H@, J@, I@ ::= [...]
▶ Definition 8 (Exponential evaluation contexts). We shall say that an evaluation context P is an exponential evaluation context if it is derived with the rules in Fig. 5.
Applicative term contexts serve as the base case of exponential evaluation contexts, now given by two refinements of the multiplicative case:
1. The base case E_AX1 is akin to the base case M_AX for multiplicative contexts, except that it requires the term context to be applicative.
2. The plugging-based rule M_HER splits in two. A first rule E_AX2 simply plugs an applicative context H@ into a multiplicative evaluation context; note that this rule gives another base case for exponential evaluation contexts. A second rule E_NA handles the special case of the global applicative constraint.
Let us see the differences between rules E_NA and E_AX2 with two examples. Their side conditions (x ∉ (U ∪ A) and x ∉ A) are explained at page 17 of the technical report [18].
E_NA: consider the program p := (x t, [x←z]), where z is in applied position due to the global applicative constraint, as it substitutes x, which is applied to t. We may derive an exponential evaluation context P that isolates z, that is, such that P⟨z⟩ = p, as follows: [...], and so p = P⟨z⟩ as expected. In this case, we are extending an exponential context, which is already applied.
E_AX2: consider p := (x, [x←z t]), for which z is an applied variable because it is itself applied, while its ES binds the needed but unapplied variable x. Let us derive an exponential evaluation context P focusing on z in such a way that P⟨z⟩ = p as follows: [...], and so p = P⟨z⟩ as expected. Here the context (⟨•⟩, ϵ) in the hypothesis is multiplicative and it becomes exponential once extended with an ES containing an applicative term context.
The next proposition guarantees that exponential contexts are a restriction of multiplicative contexts, that is, that the introduced variations over the deduction rules do not add contexts that were not already available before.
▶ Proposition 9 (Exponential contexts are multiplicative). Let P ∈ E@_{U,A}. Then P ∈ E_{V,B}, for some V ⊆ U and B ⊆ A.
Let us repeat that, instead, multiplicative contexts are not in general exponential contexts, because they are not required to be applicative. For instance, P := (x⟨•⟩, [x←yy]) ∈ E_{{y},{y}} is a multiplicative context but not an exponential one.
Evaluation Rules. The reduction rules for the Useful Open CbNeed strategy are:
Useful Open CbNeed evaluation rules
Useful multiplicative: P⟨(λx.t)u⟩ →um P⟨t, [x←u]⟩ if P ∈ E_{U,A}
Useful exponential: P⟨x⟩ →ue P⟨v⟩ if P ∈ E@_{U,A} and P(x) = v
Moreover, we shall say that p reduces in the Useful Open CbNeed strategy to q, and write p →und q, if p →um q or p →ue q.

Determinism. The first property of useful evaluation that we consider is determinism, which is proved similarly to the open case, except for some further technicalities due to the existence of two sets of variables parametrizing evaluation contexts.
Usefulness. We prove two properties ensuring that the defined reduction captures useful sharing. The first one is a correctness property, stating that useful exponential steps are eventually followed by a multiplicative step, that is, no useless exponential steps are possible.
▶ Proposition 11 (Usefulness of exponential steps). Let p = P⟨x⟩ →ue P⟨v⟩ = q with P ∈ E@_{U,A} and P(x) = v. Then there exist a program r and a reduction sequence d : q →ue^k →um r such that:
1. the evaluation context of each →ue step in d is in E@_{U,A}, and the one of →um is in E_{U,A};
2. k ≥ 0 is the number of E_NA rules in the derivation of P ∈ E@_{U,A}.
Completeness amounts to proving that useful normal forms, when unshared, give an Open CbNeed normal term. The point is that useful substitutions, if erroneously designed, might stop too soon, on programs that still contain, up to unsharing, some redexes. Completeness is developed in the following paragraph about useful normal forms.
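Unsharing a program can be illustrated with a short Python sketch. It assumes the usual reading of unfolding, namely that (t, ϵ) unfolds to t and that (t, e[x←u]) unfolds to the unfolding of (t, e) followed by the meta-level substitution {x←u}; the paper's own definition is the authoritative one. The tuple encoding and the names subst and unfold are hypothetical, and the substitution is capture-avoiding only under the stated naming assumption.

```python
# Hypothetical encoding: ("var", x) | ("lam", x, body) | ("app", t, u).
def var(x): return ("var", x)
def lam(x, body): return ("lam", x, body)
def app(t, u): return ("app", t, u)

def subst(t, x, s):
    """Meta-level substitution t{x<-s}.

    Assumption: bound names of t are distinct from the free variables of s,
    so no renaming is needed to avoid capture.
    """
    if t[0] == "var":
        return s if t[1] == x else t
    if t[0] == "lam":
        if t[1] == x:
            return t  # x is shadowed by the binder
        return ("lam", t[1], subst(t[2], x, s))
    return ("app", subst(t[1], x, s), subst(t[2], x, s))

def unfold(t, env):
    """Turn every ES of env (leftmost first) into a meta-level substitution."""
    for x, s in env:
        t = subst(t, x, s)
    return t

I = lam("w", var("w"))
x, y, z = var("x"), var("y"), var("z")
print(unfold(x, [("x", y)]))                    # a variable
print(unfold(x, [("x", y), ("y", I)]))          # the value I
print(unfold(x, [("x", y), ("y", app(z, z))]))  # the inert term z z
```

Processing the environment from left to right matches the binding structure of programs, where an ES binds the variables occurring to its left.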
Useful Normal Forms. We are now going to develop an inductive description of useful normal forms, that is, programs that are →und-normal. The key property guiding the characterization of a useful normal program p is that if the sharing in p is unfolded (by turning ES into meta-level substitutions and obtaining a term) it produces a normal term of the open system, where the unfolding operation is defined as follows:
Unfolding of programs: (t, ϵ) [...]
The characterization rests on 3 predicates, defined in Fig. 6, for programs unfolding to variables (genVar_x(p)), values (uabs(p)), and non-variable inert terms (uinert(p)). Programs satisfying genVar_x(p) are called generalized variables of (hereditary) head variable x; we also write genVar_#(p) to state that there exists x ∈ Var such that genVar_x(p). Programs satisfying uabs(p) (resp. uinert(p)), instead, are useful abstractions (resp. useful inerts). The predicate unorm(p) holds for programs satisfying any of the three described predicates, which we shall show to be exactly the programs that are normal in Useful Open CbNeed. Generalized variables play a special role, because they can be extended to unfold to values or non-variable inert terms, by appending an appropriate ES to their environment with rule A_GV or I_GV. For instance, a useful normal program such as (x, [x←y]) unfolds to a variable but its useful normal extension (x, [x←y][y←I]) unfolds to the value I, while (x, [x←y][y←zz]) unfolds to the non-variable inert term zz.
▶ Proposition 12 (Disjointness and unfolding of useful predicates). For every program p, at most one of the following holds: genVar_#(p), uabs(p), or uinert(p). Moreover:
1. If genVar_x(p) then p unfolds to x.
2. If uabs(p) then p unfolds to a value.
3. If uinert(p) then p unfolds to a non-variable inert term.
While the concepts in the characterization of useful normal programs are relatively simple and natural, the proof of the next proposition is long and tedious, because of the complex shape of useful evaluation contexts and of their parametrization; see the technical report [18].
▶ Proposition 13 (Syntactic characterization of Useful Open CbNeed-normal forms). Let p be a program. Then p is in →und-normal form if and only if unorm(p).

Figure 1 Needed variables for term contexts and the derivation rules for open evaluation contexts.
Figure 2 Predicates for Open CbNeed normal programs.
Figure 3 Applied and unapplied variables for terms, programs, and term contexts.
Figure 4 Derivation rules for multiplicative evaluation contexts.
Figure 5 Derivation rules for exponential evaluation contexts.

The Need to Split. The linear substitution calculus (LSC) provides a simple and elegant setting for studying CbNeed, as shown repeatedly by Accattoli, Kesner and co-authors [6,41,10,23,43,17,42]. The LSC extends the λ-calculus with explicit substitutions (shortened to ES), noted t[x←u], which are a compact notation for let x = u in t. Capture-avoiding meta-level substitution is noted t{x←u}. To model the useful optimization explained above, we shall need to substitute abstractions only on applied variables. Now, in the LSC, ES can appear everywhere in the term, for instance there are terms such as t := x[x←I]u. Note that in t it is hard to say whether the replacement of x with I is useful by looking only at the scope of the ES (which is the left of the [•←•] construct): the subtlety is that the replacement is indeed useful, because the variable is applied and I is an abstraction, but the application it is involved in is outside the scope of the ES. To avoid this complication, we give a presentation of CbNeed where ES are separated from the term they act upon, and cannot be nested into each other, similarly to what happens in abstract machines. The split presentation is not mandatory to study useful sharing, but it is quite convenient.
In this section we give an unusual split presentation of Closed CbNeed that shall be the starting point for our study of the open and the useful open cases of the next sections.
Split Grammars. In the split syntax, a term is an ordinary λ-term (without ESs), and a program is a term together with, in a separate place, a list of ESs, called environment. Given a countable set of variables Var, the syntax of Closed CbNeed is given by:
Values v, w ::= λx.t
Environments e, e′ ::= ϵ | e[x←t]
Terms t, u, s ::= x ∈ Var | v | tu
Programs p, q ::= (t, e)
Note that the body of a λ-abstraction is a term and not a program. Of course, extending the framework to strong evaluation, which is left for future work, requires allowing programs under λ-abstractions. Note also that variables are not values. This is standard in works dealing with implementations or efficiency, as excluding them brings a speed-up, as shown by Accattoli and Sacerdoti Coen [10]. In e[x←t] and (u, e[x←t]) the variable x is bound in e and u. Terms and programs are identified modulo α-renaming. Environments are concatenated by simple juxtaposition. We also define the environment look-up operation as follows: set e(x) := t if e = e′[x←t]e′′ and x is not bound in e′, and e(x) := ⊥ otherwise.
Split Contexts and Plugging. Micro-step CbN and CbV evaluation have easy split presentations, because their evaluation contexts may be seen as term contexts, using the environment only for look-up, see for instance [8]. CbNeed evaluation contexts, instead, need to enter into the environment. Typically, in a program such as (xy, [x←z][z←t]), whose head variable x has been found, CbNeed evaluation has to enter inside [x←•], finding another (hereditary) head variable z, and in turn enter inside [z←•] and evaluate t. The subtlety is that the evaluation of t can create new ESs, which should be added to the program without breaking its structure, that is, outside the ES which is being evaluated ([z←•] in the example). The trick to make it work is using an unusual notion of context plugging. Before defining evaluation contexts we simply discuss split contexts, which are used throughout the paper.
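The look-up operation and the way CbNeed evaluation walks into the environment can be sketched in Python as follows. This is a minimal illustration, not the paper's formal definition of evaluation contexts: the tuple encoding, the names lookup, head_var, and head_chain, and the use of None for ⊥ are hypothetical, and the sketch assumes that the content of an ES only refers to variables bound further to the right.

```python
# Hypothetical encoding: ("var", x) | ("lam", x, body) | ("app", t, u).
# An environment is a list of (x, t) pairs, one per ES [x<-t], leftmost first.
def var(x): return ("var", x)
def lam(x, body): return ("lam", x, body)
def app(t, u): return ("app", t, u)

def lookup(env, x):
    """e(x): content of the leftmost ES binding x, or None (standing for bottom)."""
    for y, t in env:
        if y == x:
            return t
    return None

def head_var(t):
    """Head variable of a term, or None if the head is an abstraction."""
    while t[0] == "app":
        t = t[1]
    return t[1] if t[0] == "var" else None

def head_chain(t, env):
    """Variables met while following hereditary head variables into env.

    Assumption: the content of an ES only mentions variables bound to its
    right, so the search resumes after the ES that was just entered.
    """
    chain, start = [], 0
    x = head_var(t)
    while x is not None:
        chain.append(x)
        for i in range(start, len(env)):
            if env[i][0] == x:
                x, start = head_var(env[i][1]), i + 1
                break
        else:
            break  # x is free in the program: nothing more to enter
    return chain

I = lam("w", var("w"))
env = [("x", var("z")), ("z", I)]
print(lookup(env, "x"), lookup(env, "y"))
# In (x y, [x<-z][z<-I]) evaluation enters [x<-.] and then [z<-.].
print(head_chain(app(var("x"), var("y")), env))
```

The chain stops either at a free variable or at an ES whose content has an abstraction as head, which is where evaluation of the content would take over.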

4 Open Call-by-Need
We now shift to Open CbNeed, an evaluation strategy extending Closed CbNeed and allowing reduction to act on possibly open programs. Roughly, the strategy iterates CbNeed evaluation on the arguments of the head variable, when the normal form of ordinary CbNeed evaluation is not an abstraction, which can happen when terms are not necessarily closed. Various aspects of Closed CbNeed become subtler in Open CbNeed, namely the definition of evaluation contexts and the structure of normal forms, together with the new notion of needed variables. Essentially, we are giving an alternative presentation of the open fragment of Balabonski et al.'s Strong Call-by-Need, with which we compare at the end of the section. [...] for appropriate generalizations of →m and →e. Of course, we retain and extend to arguments the hereditary character of the reduction rules, therefore having also steps such as:
(x((λz.z)I), [y←t]) →m (xz, [z←I][y←t]) and (xz, [z←I][y←t]) →e (xI, [z←I][y←t])
(yx, [x←y((λz.z)I)]) →m (yx, [x←yz][z←I]), and (yx, [x←yz][z←I]) →e (yx, [x←yI][z←I])

[...] 4. Their set is noted E_{U,A}. The refinement is needed even if useful sharing concerns only exponential steps: a multiplicative context such as ((yx)⟨•⟩, [x←v]) ∈ E_{∅,{y}} indeed is not an open context, because it contains a useless substitution step that in Open CbNeed would be fired before evaluating the hole. The definition of multiplicative contexts follows the one for Open CbNeed contexts (M_AX, M_GC, and M_HER are essentially as before) except for rule: [...]
That is, given P ∈ E_{U,A} and x ∈ (U ∪ A), the constraints to extend P with an ES [x←t] are:
Rule M_I: there are no constraints if t is a non-variable inert term i+. Note that M_I and M_GC together imply that we can always append ESs containing inert terms to multiplicative contexts, without altering the Useful Open CbNeed order of reduction.
Rule M_VAR: this rule covers the case where t is a variable y. It is used to handle the global applicative constraint: in such a case, if the evaluation context is P@[x←y], then y has to be added to the applied and/or unapplied variables of the context, according to the role played by x in P, which is realized via the function upd defined as follows:
upd(S, x, y) := S if x ∉ S, and upd(S, x, y) := (S \ {x}) ∪ {y} if x ∈ S.
Rule M_U: it covers the case where t is a value v, requiring that x is not applied, that is, x ∉ A. Such an extension would have re-activated x in the plain open case, and created a (useless) exponential redex, but here it shall not be the case. Note that it means that P@[x←t] is a multiplicative context only if x ∈ (U \ A), i.e. if x is a useless variable of P.
Exponential Contexts. Exponential contexts are even more involved, because they have to select only applicative variable occurrences and the applicative constraint is of a global nature. First, we need a notion of applicative term context, where the hole is applied.
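The function upd used by rule M_VAR is simple enough to be spelled out in Python. The sketch below follows the two-case definition given in the text; the helper extend_with_var, which applies upd to both index sets of a context at once, is a hypothetical name introduced only for illustration.

```python
def upd(S, x, y):
    """upd(S, x, y): replace x by y in S when x belongs to S, else leave S unchanged."""
    return (S - {x}) | {y} if x in S else set(S)

def extend_with_var(U, A, x, y):
    """Index sets of a context after appending the ES [x<-y] (hypothetical helper).

    y inherits exactly the roles (unapplied and/or applied) that x played.
    """
    return upd(U, x, y), upd(A, x, y)

# x only unapplied: y becomes unapplied, the applied set is untouched.
print(extend_with_var({"x"}, {"z"}, "x", "y"))
# x both unapplied and applied: y becomes both.
print(extend_with_var({"x"}, {"x"}, "x", "y"))
```

Returning a fresh set in both cases keeps the index sets of the original context unchanged, which is convenient when several extensions of the same context are explored.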