The Squirrel Prover and its Logic

Security protocols are widely used today to secure transactions that take place over public channels like the Internet. Common uses include the secure transfer of sensitive information such as credit card numbers, or user authentication on a system. Because of their presence in many widely used applications (e.g. electronic commerce, government-issued ID), developing methods and tools to verify security protocols has become an important research challenge. Such tools help increase our trust in protocols, and hence in the applications that rely on them.


INTRODUCTION
Formal methods have brought various approaches to prove that cryptographic protocols indeed guarantee the expected security properties. An effective approach in this area of research consists in modeling cryptographic messages as first-order terms, together with an equational theory that represents attacker capabilities. This idea, originally proposed in [Dolev and Yao 1981], has been refined over the years, resulting in a variety of so-called symbolic models. These models encompass broad categories of attackers and facilitate the automated verification of protocols. They have led to the development of successful tools such as ProVerif [Blanchet 2001] and Tamarin [Meier et al. 2013]. However, it is important to note that security in a symbolic model does not necessarily imply security in the cryptographers' standard model, called the computational model. In that model, attackers are represented by probabilistic polynomial-time Turing machines (PPTMs), and one proves that a protocol is indistinguishable from an idealized, obviously secure version of it. Verification techniques for the computational model, though crucially needed, often exhibit less flexibility or automation compared to ones for symbolic models. As an illustration, secret keys are faithfully modeled in the computational model as long bitstrings that are drawn uniformly at random, whereas they are modeled using abstract names in symbolic models. In symbolic models, two distinct secret keys are represented by different names, which cannot be equal. However, in the computational model, as in reality, it is possible (although unlikely) that the sampled bitstrings are equal.
In this column, we present a recent logic-based method for verifying cryptographic protocols in the computational model, and some practical aspects of its implementation in the Squirrel tool [Baelde et al. 2021; Baelde et al. 2023]. This system is built on the Computationally Complete Symbolic Attacker (CCSA) approach of [Bana and Comon-Lundh 2012; Bana and Comon-Lundh 2014], which relies on the symbolic setting of logic, but avoids the limitations of the symbolic models mentioned above. Instead of modeling attacker capabilities by rules stating what the adversary can do, the CCSA method relies on the specification of what the attacker cannot do. Starting from the security properties of cryptographic primitives, one derives rules that articulate which pairs of message sequences are indistinguishable. These rules are proved sound w.r.t. the interpretation of terms as PPTMs. Therefore, a proof of a security property using these rules implies security in the computational model under the initial cryptographic assumptions. The CCSA logic was later extended into a meta-logic, which served as the basis for the first version of Squirrel [Baelde et al. 2021], before being generalized to a fully-fledged higher-order logic in [Baelde et al. 2023]. The Squirrel tool is a proof assistant developed in a collaborative effort, which has been successfully used on a variety of case studies [Baelde et al. 2022; Comon et al. 2020; Cremers et al. 2022]. Instructions for installing the system are available on its website, together with a user manual, tutorials, and an in-browser interface for playing with the tool without installing it: https://squirrel-prover.github.io
We first provide, in Section 2, an introduction to the computational model and the CCSA approach, showing in particular how cryptographic assumptions translate into logical rules. We elaborate on this in Section 3 to show how protocols can be modeled in CCSA logic, discussing a subtle issue w.r.t. the intended notion of security. Finally, Section 4 formally defines the higher-order CCSA logic, on which Squirrel is based, and discusses some of its interesting technical features: the distinction between local and global formulas, the subtleties tied to reasoning about probabilistic objects, and the key notion of bi-deduction for reasoning about computational indistinguishability.
Related work. Squirrel is only a recent addition to the list of available systems for verifying cryptographic protocols. We have mentioned above some tools that provide guarantees in symbolic models; we now briefly present the tools providing guarantees in the computational model. Several such systems exist, based on different approaches. The earliest one is CryptoVerif [Blanchet 2008], which mechanizes proofs based on high-level game transformations, following the style of pen-and-paper proofs. The most prominent system today is probably EasyCrypt [Barthe et al. 2011], which is a proof assistant also featuring higher-order logic. It notably embeds a domain-specific probabilistic relational Hoare logic [Barthe et al. 2009], which can capture cryptographic game transformations. This design makes it possible to carry out most cryptographic arguments within EasyCrypt. Other systems are currently being developed, with various goals: notably CryptHOL [Basin et al. 2020], which explores an alternative modeling technique in Isabelle/HOL, and F* [Swamy et al. 2016], which is a general-purpose program verification framework based on refinement types that can be used (via external arguments) to provide computational security guarantees. Those approaches can be compared on several criteria [Barbosa et al. 2021].
Here, we only compare with the closest two tools, and only with respect to three criteria, to highlight differences: modeling, automation, and proof methodology. Regarding modeling, CryptoVerif and Squirrel have a similar level of detail and expressivity, although CryptoVerif supports a larger range of cryptographic assumptions. EasyCrypt is more expressive, with a higher level of detail, at the cost of modeling overhead, and thus better suited for proving properties of primitives. Protocol specifications in CryptoVerif and Squirrel are given in a process algebra, whereas protocols are encoded in EasyCrypt as APIs, which is inconvenient for interactive protocols. Finally, CryptoVerif does not support stateful protocols, while Squirrel and EasyCrypt do.
Overall, the current level of automation of Squirrel sits somewhere between CryptoVerif, which can apply cryptographic arguments automatically, although it often requires hints about which game transformations to use, and EasyCrypt, which does not automatically apply cryptographic arguments.
Finally, the most important difference between Squirrel and the other tools is the associated proof methodology: CryptoVerif relies on game transformations and EasyCrypt performs Hoare-style proofs of programs, while Squirrel reasons over execution traces of protocols.

A BASIC INDISTINGUISHABILITY LOGIC
We introduce in this section the CCSA approach, which is designed to reason on cryptographic protocols, i.e. concurrent programs relying on cryptographic primitives to achieve some functionality while running in a malicious environment. To simplify our exposition, we delay the treatment of general protocols to Section 3, and restrict our presentation to simple sequential cryptographic games for the time being. These games are used by cryptographers to express security properties of cryptographic primitives against attackers modeled as arbitrary polynomial-time computations. A key concept about games is computational indistinguishability: roughly, two games are indistinguishable if any polynomial-time adversary has at best a negligible probability of distinguishing them. In the CCSA approach, we use first-order logic to establish computational indistinguishability: we first model the messages derived during a game's execution as first-order terms, and then reason about computational indistinguishability in first-order logic, using inference rules derived from cryptographic assumptions.
Figure 1. Game G: A secret key sk of length η is randomly sampled. The interactive attacker A is asked to provide a message m which is then encrypted (using sk) and sent back to A. Afterwards, A returns a second message x which is the final output of the game.
A key aspect of the logic is that it abstracts cryptographic arguments as much as possible. Notably, its syntax does not mention the length of keys or the value of probabilities. Further, reduction-based arguments are implicit: they justify the soundness of the proof rules, but the code underlying reductions is never explicitly stated.

The core of the logic
Security in the real world is often conditional, as there is usually a small but non-zero probability that the adversary manages to break the security of a system. Typically, one cannot rule out that a lucky attacker guesses a secret key. To discard such unlikely cases, security is considered up to a negligible probability of attack. A function f : N → [0, 1] is negligible if it is asymptotically smaller than the inverse of any polynomial, and overwhelming if 1 − f is negligible. In what follows, we will thus use a security parameter η ∈ N which can be, e.g., the length of the secret keys, and security must hold with overwhelming probability w.r.t. η.
In the cryptographic literature, security properties are commonly specified using so-called games, in which an adversary tries to mount an attack: the game might represent an adversary attempting to guess a secret value, to forge a signature, etc. Games are usually written in pseudo-code using an imperative style, as shown in Figure 1. The statement $x \xleftarrow{\$} D$ stores the result of a random sampling following distribution D into variable x: e.g. $sk \xleftarrow{\$} \{0, 1\}^{\eta}$ uniformly samples a secret key sk of η bits. Games are parameterized by an abstract interactive attacker, denoted by A. The statement o ← A(i) calls the adversary A on input i, and stores its answer in variable o. The adversary is stateful, retaining information across its invocations. Standard assignments are denoted in the same way: x ← e stores the result of evaluating e into variable x.
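As a rough illustration of this imperative style, the game G described in Figure 1 might be sketched in Python as follows. This is only a sketch under stated assumptions: the toy cipher `toy_enc`, the explicit encryption randomness `r`, and the `EchoAdversary` class are hypothetical stand-ins introduced here, not part of the source, and the cipher is not meant to be secure.

```python
import hashlib
import secrets


def toy_enc(message: bytes, r: bytes, sk: bytes) -> bytes:
    """Hypothetical toy cipher: XOR the message with a pad derived from (sk, r)."""
    pad = hashlib.sha256(sk + r).digest()
    body = bytes(m ^ p for m, p in zip(message[:32], pad))
    return r + body


class EchoAdversary:
    """Hypothetical stateful adversary: remembers the message it chose."""

    def __init__(self):
        self.state = None

    def __call__(self, i=None):
        if i is None:            # first call: choose a message m
            self.state = b"hello"
            return self.state
        return i                 # second call: output what it received


def game_G(eta: int, adversary) -> bytes:
    sk = secrets.token_bytes(eta // 8)   # sk <-$ {0,1}^eta
    r = secrets.token_bytes(16)          # randomness used by the encryption
    m = adversary()                      # m <- A()
    t = toy_enc(m, r, sk)                # t <- enc(m, r, sk)
    x = adversary(t)                     # x <- A(t)
    return x                             # final output of the game


x = game_G(128, EchoAdversary())
```

The adversary object keeps its state between the two calls, which mirrors the statefulness of A in the pseudo-code.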
Describing cryptographic games as probabilistic imperative programs lets cryptographers rely on the reader's intuitive understanding of their semantics, and avoids introducing convoluted execution models. However, this comes at a cost: formally reasoning over such programs can be difficult (one needs to deal with statefulness, loop invariants, probabilistic dependencies, etc.); moreover, encoding complex protocols as imperative programs is possible but not natural, and complicates security proofs.
Cryptographic games in CCSA. In the CCSA approach, we do not explicitly represent games. Instead, we shall only represent the messages computed at different points in a game, using first-order terms. Those are pure, unlike the stateful games, and thus easier to reason about. Our terms are built using:
- honest function symbols, which represent the various primitives used to compute messages (e.g. pairing, encryption);
- attacker function symbols, noted $\mathsf{att}_i$ for i ∈ N, modeling arbitrary computations performed by adversaries;
- name symbols, representing the sources of randomness: essentially, a name is a pointer to a memory cell holding a value sampled at random before the game starts.
Example 2.1. Let us illustrate this approach by modeling the final result of the game of Figure 1 as a single term. The sampling of the secret key is modeled by a name symbol sk. The first call to the attacker A() is modeled by the term $\mathsf{att}_0()$. In the imperative game modeling style, internal functions (e.g. the encryption enc) can be probabilistic, but this is not so in our approach, where we precisely track probabilistic dependencies. Thus, we explicitly specify the randomness used by the encryption function enc, by introducing a name symbol r and using the term $\mathsf{enc}(\mathsf{att}_0(), r, sk)$ to model t in the game. Then, the second call to the adversary can be modeled using a different adversarial function symbol, as $\mathsf{att}_1(\mathsf{enc}(\mathsf{att}_0(), r, sk))$. Importantly, we have modeled two successive calls to the stateful attacker A in the game by two pure functions ($\mathsf{att}_0$ and $\mathsf{att}_1$). This is without loss of generality, as any state computed during the first call to A and used in the second call can be recomputed when modeling the second call as $\mathsf{att}_1$.
Models of the logic. Defining the logic's semantics in order to have a faithful translation from games to terms requires a bit of care. The interpretation of a term is parameterized by the security parameter η ∈ N, as well as two explicit sources of randomness provided by a pair $\rho = (\rho_h, \rho_a)$ of tapes filled with random bits. A model M of our logic must:
- provide for each honest function symbol f its interpretation as a PPTM $M_f$;
- associate to each name n a unique sub-sequence of η bits in $\rho_h$ using a PPTM $M_n$ with access to $\rho_h$, such that distinct names use disjoint parts of $\rho_h$;
- interpret any adversarial function symbol att as a PPTM $M_{\mathsf{att}}$ which can only access the random tape $\rho_a$ (but not $\rho_h$).
For the sake of simplicity, we forced names to be interpreted as uniform and independent random samplings in $\{0, 1\}^{\eta}$. Our models are first-order models with a tailored interpretation domain. Thus, terms can be interpreted in M using the standard semantics of first-order logic. In our specific setting, a term t is interpreted as a function associating, to each value of the security parameter η and random tapes ρ, the interpretation $\llbracket t \rrbracket^{\eta,\rho}_{M}$. More precisely, the function $(\eta, \rho) \mapsto \llbracket t \rrbracket^{\eta,\rho}_{M}$ is computed by a PPTM. For example, the interpretation $\llbracket n \rrbracket^{\eta,\rho}_{M}$ of a name symbol n is computed by $M_n(\eta, \rho_h)$, and an adversarial computation is computed by $\llbracket \mathsf{att}_i(t) \rrbracket^{\eta,\rho}_{M} = M_{\mathsf{att}_i}(\eta, \llbracket t \rrbracket^{\eta,\rho}_{M}, \rho_a)$. Importantly, this interpretation is compositional and makes all probability dependencies explicit, which will facilitate reasoning over cryptographic messages.
Example 2.2. Coming back to G from Figure 1, the value in the variable x at the end of G's execution follows the same probability distribution as $\llbracket \mathsf{att}_1(\mathsf{enc}(\mathsf{att}_0(), r, sk)) \rrbracket^{\eta,\rho}_{M}$ where the tapes in ρ are sampled at random, and where M interprets the encryption and attacker functions as does G.
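To make this interpretation concrete, here is a minimal Python sketch of how a term could be evaluated against a pair of tapes. The tuple encoding of terms, the slot table `NAME_SLOT`, the toy `xor_enc`, and the two attacker machines are all assumptions made for this illustration; they are not Squirrel's actual data structures. Names read disjoint η-bit slices of the honest tape, while attacker symbols only receive the attacker tape.

```python
import random

ETA = 16  # tiny security parameter, for illustration only

# A term is ("name", n), ("att", i, [args]) or ("fun", f, [args]).
NAME_SLOT = {"sk": 0, "r": 1}   # distinct names use disjoint slices of rho_h


def interp(term, rho_h, rho_a, honest, attackers):
    kind = term[0]
    if kind == "name":
        slot = NAME_SLOT[term[1]]
        return tuple(rho_h[slot * ETA:(slot + 1) * ETA])
    args = [interp(a, rho_h, rho_a, honest, attackers) for a in term[2]]
    if kind == "att":
        return attackers[term[1]](args, rho_a)   # attacker sees rho_a only
    return honest[term[1]](*args)                # honest symbols are deterministic


def xor_enc(m, r, sk):
    """Hypothetical toy 'encryption' on bit tuples (not secure)."""
    return tuple(a ^ b ^ c for a, b, c in zip(m, r, sk))


honest = {"enc": xor_enc}
attackers = {
    0: lambda args, rho_a: tuple(rho_a[:ETA]),   # att_0(): an attacker-chosen message
    1: lambda args, rho_a: args[0],              # att_1(c): here it simply echoes c
}

# The term att_1(enc(att_0(), r, sk)) of Example 2.1.
term = ("att", 1, [("fun", "enc", [("att", 0, []), ("name", "r"), ("name", "sk")])])

rng = random.Random(0)
rho_h = [rng.getrandbits(1) for _ in range(2 * ETA)]
rho_a = [rng.getrandbits(1) for _ in range(ETA)]
value = interp(term, rho_h, rho_a, honest, attackers)
```

Once η and the tapes are fixed, the evaluation is a pure function of the term, which is the compositionality property mentioned above.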
Using Turing machines to interpret terms gives us a high level of expressivity. For instance, conditional branching in games can be internalized in the first-order terms using an (if b then t else e) function symbol whose interpretation is forced to be the natural one in all models.
Example 2.3. Consider the game $G_{sk}$, which samples a fresh secret key sk and queries the attacker to obtain a guess m for the key. The game outputs 0 if the attacker correctly guessed the key, and 1 otherwise. The message returned at the end of the game can be represented by the term $\phi_{sk} = (\mathsf{if}\ \mathsf{att}_0() = sk\ \mathsf{then}\ 0\ \mathsf{else}\ 1)$.
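The game of Example 2.3 is small enough to analyze exhaustively for a tiny security parameter. The sketch below is an illustration only: it assumes a deterministic attacker guess, uses hypothetical helper names, and enumerates all keys instead of sampling them.

```python
from fractions import Fraction
from itertools import product


def game_output(guess, sk):
    """Output of the game: 0 if the guess equals the key, 1 otherwise."""
    return 0 if guess == sk else 1


def prob_output_zero(eta, guess):
    """Exact probability that the game outputs 0 over a uniform eta-bit key."""
    keys = list(product((0, 1), repeat=eta))
    hits = sum(1 for sk in keys if game_output(guess, sk) == 0)
    return Fraction(hits, len(keys))


p = prob_output_zero(4, (1, 0, 1, 1))   # one key out of 2^4 matches the guess
```

Because the guess does not depend on the key, exactly one key out of 2^η matches it, so the probability of outputting 0 halves each time η grows by one.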

The computational indistinguishability predicate ∼
One of the most commonly used notions to define cryptographic properties is computational indistinguishability. Roughly, indistinguishability is expressed as a guessing game in which an adversary must figure out which one of two scenarios $G_I$ or $G_R$ it is interacting with. More precisely, two games $G_I$ and $G_R$ are indistinguishable, which we write $G_I \approx G_R$, when any PPTM adversary has at best a negligible probability of guessing whether it is interacting with $G_I$ or $G_R$. Formally, we define the advantage of the attacker A as the probability that it makes the two games behave differently, and we require this advantage to be negligible in the security parameter η. Typically, the game $G_R$ will correspond to an attack against the real primitives, while $G_I$ will represent an attack against an idealized implementation of the primitives for which security is obvious.
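In symbols, this requirement is commonly written as follows. The notation is ours rather than the source's: Adv stands for the advantage, and $G^{\mathcal{A}}(\eta)$ for the bit produced when A interacts with game G at security parameter η.

```latex
\[
  \mathrm{Adv}_{\mathcal{A}}(\eta)
  \;=\;
  \bigl|\, \Pr[\, G_I^{\mathcal{A}}(\eta) = 1 \,] - \Pr[\, G_R^{\mathcal{A}}(\eta) = 1 \,] \,\bigr|,
  \qquad
  G_I \approx G_R
  \;\iff\;
  \forall \mathcal{A} \text{ PPTM},\ \mathrm{Adv}_{\mathcal{A}} \text{ is negligible in } \eta .
\]
```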
Example 2.4. Coming back to Example 2.3, the indistinguishability $G_{sk} \approx (\mathsf{return}\ 1)$ states that the probability that the adversary A guesses the secret key sk is negligible. In that case, a simple probabilistic independence argument can be used to show that the probability that A computes sk is $1/2^{\eta}$. Thus, the indistinguishability holds.
Squirrel's logic relies on a predicate ∼ to represent computational indistinguishability. Formally, for two terms u, v, the predicate u ∼ v holds in a model M when $G_u \approx G_v$, where $G_t$ is the game where the adversary is provided with $\llbracket t \rrbracket^{\eta,\rho}_{M}$ and must produce a bit b, which the game returns. In other words, u ∼ v holds in M when every PPTM distinguisher has a negligible advantage in telling $\llbracket u \rrbracket^{\eta,\rho}_{M}$ apart from $\llbracket v \rrbracket^{\eta,\rho}_{M}$.
Example 2.5. Reusing $G_{sk}$ and $\phi_{sk}$ from Example 2.3, checking $\phi_{sk} \sim 1$ amounts to verifying that $G_{sk} \approx (\mathsf{return}\ 1)$.
In the example above, we have used the indistinguishability predicate on booleans. In that case, the semantics of ∼ can be restated in a simpler way, without a quantification over all distinguishers A: it simply means that the two booleans have the same probability of being 1. The general semantics of ∼ becomes useful, though, to reason on it by decomposing terms: for instance, it allows us to derive f(u) ∼ f(v) from u ∼ v for any u and v of arbitrary types. Indeed, the existence of a PPTM distinguishing between f(u) and f(v) implies the existence of a PPTM distinguishing between u and v: the latter distinguisher is obtained by composing the former with the PPTM computing f.
Logical rules. We now present some rules that can be used to reason over ∼. We intuitively describe some of those rules here, starting with the simpler rules that do not rely on any security assumptions over the primitives.
Example 2.6. The following statements hold:
(1) t ∼ t for any term t; indistinguishability is reflexive.
(2) n ∼ n′ for any names n, n′; two uniform random samplings are indistinguishable.
(3) (n, n) ≁ (n, n′); the attacker sees, in one case, the same value twice, and in the other, two distinct values (with overwhelming probability). Notice that this implies that ∼ does not lift to an arbitrary context: we have seen that n ∼ n′, but this example shows that we do not have C[n] ∼ C[n′] for all contexts C.
(4) … ∼ true for any terms u, v; more generally, any term occurring in an indistinguishability can be rewritten into an equal term.
(5) (n ≠ n′) ∼ true; names, i.e. random samplings, collide with negligible probability.
[Figure 2: inference rules REFL, REWRITE and FRESH. Recoverable side condition: "when t is a ground term in which n does not occur".]
We provide a first set of inference rules in Figure 2. REFL corresponds to the reflexivity of ∼, and REWRITE allows replacing two terms that are equal with overwhelming probability in any context. Finally, FRESH exploits the fact that a term syntactically contains all its probabilistic dependencies: if a name n does not occur in a term t, then n is a uniform random sampling independent of t, and thus t = n can only be true with negligible probability.
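The failure of indistinguishability for a repeated name versus two independent names (item (3) of Example 2.6) can be checked with an explicit distinguisher. The following Python sketch, whose function names are hypothetical, computes the exact advantage of the equality test for a tiny η by enumeration.

```python
from fractions import Fraction
from itertools import product


def distinguisher(pair):
    """Outputs 1 when both components are equal, 0 otherwise."""
    return 1 if pair[0] == pair[1] else 0


def advantage(eta):
    bitstrings = list(product((0, 1), repeat=eta))
    # Left world: (n, n). Right world: (n, n') with n, n' independent and uniform.
    p_left = Fraction(sum(distinguisher((n, n)) for n in bitstrings), len(bitstrings))
    p_right = Fraction(
        sum(distinguisher((n, m)) for n in bitstrings for m in bitstrings),
        len(bitstrings) ** 2,
    )
    return abs(p_left - p_right)


adv = advantage(4)   # 1 - 1/2^4
```

The advantage is 1 − 1/2^η, which tends to 1 instead of being negligible; the same computation shows that two independent names collide with probability 1/2^η.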
Example 2.7. Let us go back to Example 2.5 and the formula $\phi_{sk} \sim 1$. We assume an additional proof rule $\mathrm{SIMPL}_{\equiv}$ that allows replacing a term with another one equal to it with probability one, which we note ≡, and where for instance v ≡ (if false then u else v). We can prove our goal with a derivation that uses $\mathrm{SIMPL}_{\equiv}$ twice: once with the previously mentioned equality over a false conditional, and once using the fact that $((\mathsf{att}_0() = sk) = \mathsf{false}) \equiv (\mathsf{att}_0() \neq sk)$.
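One possible shape for such a derivation is sketched below, read from the goal downwards. It is reconstructed from the surrounding description only: it assumes that $\phi_{sk}$ is the conditional term returning 0 when the attacker's guess equals sk and 1 otherwise, that REWRITE has a premise of the form (b = b′) ∼ true, and that FRESH concludes (t ≠ n) ∼ true. The exact rules of Figure 2 may differ.

```latex
\begin{align*}
  & \phi_{sk} \sim 1 \\
  \Leftarrow\;& (\mathsf{if}\ \mathsf{att}_0() = sk\ \mathsf{then}\ 0\ \mathsf{else}\ 1)
      \sim (\mathsf{if}\ \mathsf{false}\ \mathsf{then}\ 0\ \mathsf{else}\ 1)
      && \mathrm{SIMPL}_{\equiv} \\
  \Leftarrow\;& ((\mathsf{att}_0() = sk) = \mathsf{false}) \sim \mathsf{true}
      && \text{REWRITE, then REFL on the remaining goal} \\
  \Leftarrow\;& (\mathsf{att}_0() \neq sk) \sim \mathsf{true}
      && \mathrm{SIMPL}_{\equiv} \\
  & \text{closed by FRESH, since } sk \text{ does not occur in } \mathsf{att}_0().
\end{align*}
```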

Unforgeability of hash functions: EUF-CMA
To reason over cryptographic games, a final piece of the puzzle is missing: assumptions over cryptographic primitives. For each primitive (a cipher, a signature, a hash, etc.), there is a usual set of cryptographic assumptions made over it. Such assumptions are expressed as an indistinguishability between two games, and cryptographic proofs take the form of reduction-based arguments. To prove that the security of the primitive, for instance expressed as $P_R \approx P_I$, implies the security of some game $G_R \approx G_I$, we prove the contrapositive: assuming that there is a distinguisher against $G_R \approx G_I$, we build a distinguisher against $P_R \approx P_I$. In our logic, we hide those reduction-based arguments behind dedicated rules, one for each cryptographic assumption.
Consider the example of a keyed hash function h(x, sk) and of the EUF-CMA assumption. Intuitively, a keyed hash function is a function that takes a message and a key, and produces a short unpredictable value. More precisely, if a key sk is secret and randomly sampled, then an attacker has a negligible probability of guessing the hash h(m, sk) of any m, unless the attacker was directly given this value. Such a hash function is called unforgeable, and this is formalized in the $\text{EUF-CMA}_h$ cryptographic game, in which the attacker is given access to a hashing oracle O(x). This game asks the attacker to return a pair of values t, m such that t = h(m, sk). The attacker is allowed to interact with the hashing oracle O, but it must return an m that was never queried to O. This is enforced by storing in a set L all oracle inputs.
We can express the security of the hash function by assuming that for all attackers A, $\text{EUF-CMA}_h \approx (\mathsf{return}\ \mathsf{false})$. Our goal is now to derive an inference rule which is sound under this assumption.
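The unforgeability game described above can be sketched in Python as follows. This is an illustration under assumptions: HMAC-SHA256 from the standard library is used as a stand-in for the keyed hash h, and the two adversaries are hypothetical examples introduced here.

```python
import hashlib
import hmac
import secrets


def h(message: bytes, sk: bytes) -> bytes:
    """Keyed hash; HMAC-SHA256 is an assumed instantiation."""
    return hmac.new(sk, message, hashlib.sha256).digest()


def euf_cma_game(adversary, key_bytes: int = 32) -> bool:
    sk = secrets.token_bytes(key_bytes)   # secret key, never given to the adversary
    queried = set()                       # the log L of oracle inputs

    def oracle(x: bytes) -> bytes:
        queried.add(x)
        return h(x, sk)

    t, m = adversary(oracle)
    # The adversary wins if t is a valid hash of a message never sent to the oracle.
    return m not in queried and hmac.compare_digest(t, h(m, sk))


def replay_adversary(oracle):
    """Returns a hash obtained from the oracle: valid, but m was queried."""
    return oracle(b"0"), b"0"


def guessing_adversary(oracle):
    """Queries a different message, then guesses the hash of 0 blindly."""
    oracle(b"1")
    return secrets.token_bytes(32), b"0"


replay_wins = euf_cma_game(replay_adversary)
guess_wins = euf_cma_game(guessing_adversary)
```

The replay adversary loses because its message is in the log, and the guessing adversary loses except with probability 2^-256; the assumption states that every efficient adversary loses except with negligible probability.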
As a warm-up, we restrict ourselves to models where the interpretation of h satisfies the EUF-CMA assumption, and analyze the validity of a few statements:
- $(\mathsf{h}(0, sk) = \mathsf{att}_0()) \sim \mathsf{false}$ is valid. Otherwise, we would have a model (and thus an attacker) able to output with non-negligible probability the value of h(0, sk) without having access to any other hash values. This attacker would trivially break $\text{EUF-CMA}_h$, by directly outputting $\langle 0, \mathsf{h}(0, sk) \rangle$.
- $(\mathsf{h}(0, sk) = \mathsf{att}_1(sk)) \sim \mathsf{false}$ is not valid, because the secret key is leaked to the attacker. As such, the computation of $\mathsf{att}_1(sk)$ cannot be seen as a computation made by an adversary $A^{O}$ in the $\text{EUF-CMA}_h$ game, and the assumption does not apply.
- $(\mathsf{h}(0, sk) = \mathsf{att}_1(\mathsf{h}(1, sk))) \sim \mathsf{false}$ is valid. Otherwise, we would have a model and thus an attacker breaking $\text{EUF-CMA}_h$: the attacker would compute h(1, sk) with a call to oracle O(1), after which L = {1}, and return $\mathsf{att}_1(\mathsf{h}(1, sk))$, which is the hash of 0 ∉ L.
To generalize this, we need to answer the following question: under which conditions over t and m is (h(m, sk) = t) ∼ false a valid formula? We answer this by providing sufficient conditions under which the terms t and m can be produced by some attacker interacting with the $\text{EUF-CMA}_h$ game in such a way that m is never queried to O.
The first main condition is that all syntactic occurrences of sk in t and m must be as a key to h (i.e. all occurrences are of the form h(u, sk) for some u). Otherwise, the terms cannot be simulated, as the adversary A against $\text{EUF-CMA}_h$ does not know the key.
Once we know that sk is only used in key position, it is easy to syntactically track in t and m what will be computed using queries to O: it is precisely the set of subterms of the form h(x, sk), i.e. $S^{\mathsf{h}}_{sk}(t, m) = \{\mathsf{h}(u, sk) \mid \mathsf{h}(u, sk) \text{ subterm of } t \text{ or } m\}$. Using this set, we can then define a formula, inside our logic, expressing that all previously seen hash values are distinct from m. Those two conditions are sufficient to obtain a rule enabling us to use EUF-CMA to establish indistinguishabilities in the logic. This rule has a logical premise, which becomes a proof obligation when the rule is applied to a proof goal, and a syntactic side condition automatically checked by Squirrel.
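Based only on the two conditions just described, such a rule could plausibly take the following shape. This is a sketch, not the source's exact statement: in particular, reading "distinct from m" as an inequality between m and each hashed message u is our assumption.

```latex
\[
  \frac{\Bigl(\bigwedge_{\mathsf{h}(u,\,sk) \,\in\, S^{\mathsf{h}}_{sk}(t,\,m)} u \neq m\Bigr) \sim \mathsf{true}}
       {(\mathsf{h}(m, sk) = t) \sim \mathsf{false}}
  \qquad
  \text{when } sk \text{ occurs in } t, m \text{ only in key position of } \mathsf{h}
\]
```

On the warm-up statements above, this shape behaves as expected: with t = att_1(h(1, sk)) and m = 0, the set contains only h(1, sk) and the premise reduces to (1 ≠ 0) ∼ true, whereas t = att_1(sk) is rejected by the side condition.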

MODELING INTERACTIVE PROTOCOLS WITH RECURSIVE FUNCTIONS
Now that we have introduced the core CCSA framework, let us turn to how cryptographic protocols are modeled. Compared to the simple games of the previous section, protocols are concurrent systems, involving several communicating agents. An agent may for instance be a key server, a bank, a user's terminal, etc. When analyzing the security of a protocol, it is desirable to make worst-case assumptions. A typical one is that the adversary has full control over the network: it receives all messages output by protocol agents, and is tasked with feeding inputs to them, with messages resulting from arbitrary computations. We also assume that the adversary schedules the actions of the protocol's agents: it decides when to spawn a new session, when to advance in a session, etc. This can actually be modeled using cryptographic games, representing the agents as oracles to which the adversary has access. Adversarial computations then induce a sequence of interactions with the protocol's agents, i.e. an execution trace of the protocol. Following this style, quantifying over all adversaries implies quantifying over all execution traces. We present this approach in more detail in what follows, illustrating it on a simplified version of the BASIC HASH protocol [Brusò et al. 2010].

An example: BASIC HASH protocol
The BASIC HASH protocol is a simple RFID protocol in which a tag T tries to authenticate itself to a reader R. This is an access control protocol: e.g. the tag may be embedded in an RFID fob, and the reader could guard access to some building. The tag and the reader both share a secret key sk. Whenever the tag wants to authenticate itself to the reader, it samples a random value n, hashes it with sk using an unforgeable keyed hash function as presented in Section 2.3, and sends the pair $\langle n, \mathsf{h}(n, sk) \rangle$ of the name and its hash digest to the reader. The reader can check that a message x it received from the network was generated by the tag by extracting the first component fst(x) of x, re-computing its hash using the secret key sk and checking that this yields the second component snd(x) of its input.
Authentication game. We can express the fact that this protocol provides some form of authentication using the cryptographic game presented in Figure 3. The definitions of T and $R_r$ correspond to the tag and reader agents described above, with a single addition that will be useful to express the security property: before sending its output $x = \langle n, \mathsf{h}(n, sk) \rangle$ to the reader, the tag logs x into a set L. Further, $R_i$ corresponds to an idealized reader, which checks that its input x originates from T by directly verifying that x appears in the log L. Intuitively, the idealized reader looks, across space and time, into the past internal memories of the tag to check whether it indeed generated x. Obviously, $R_i$ is only useful as a trick to model the expected authentication property, and cannot be physically implemented.
The cryptographic game G(R), parameterized by the reader $R \in \{R_r, R_i\}$, samples the secret key sk, initializes the log L, and runs the adversary $A^{T,R}$ by letting it interact with the tag and reader oracles. The adversary must try to guess a bit b indicating whether it is interacting with the real or ideal reader R: its guess b is the final result of the game. The protocol is secure if A has a negligible advantage in guessing the correct bit b, i.e. if the computational indistinguishability $G(R_r) \approx G(R_i)$ holds. Indeed, if an attacker can make $R_r$ accept while breaking authentication (i.e. no corresponding tag produced the message), trying to do the same execution with $R_i$ would fail by construction, making it trivial to distinguish the two worlds.
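The tag and the two readers can be sketched as oracles sharing a key and a log. The Python code below is a minimal illustration, not the source's Figure 3: the class name is hypothetical, HMAC-SHA256 is an assumed instantiation of h, and pairs are encoded as Python tuples.

```python
import hashlib
import hmac
import secrets


def h(message: bytes, sk: bytes) -> bytes:
    """Assumed keyed hash (HMAC-SHA256)."""
    return hmac.new(sk, message, hashlib.sha256).digest()


class BasicHashGame:
    """Oracles T, R_r and R_i sharing a key sk and a log L."""

    def __init__(self):
        self.sk = secrets.token_bytes(32)
        self.log = set()

    def tag(self):
        n = secrets.token_bytes(16)        # fresh random value
        x = (n, h(n, self.sk))             # x = <n, h(n, sk)>
        self.log.add(x)                    # logged only to state the property
        return x

    def real_reader(self, x) -> bool:
        fst, snd = x
        return hmac.compare_digest(h(fst, self.sk), snd)

    def ideal_reader(self, x) -> bool:
        return x in self.log


game = BasicHashGame()
honest = game.tag()
forged = (b"\0" * 16, b"\0" * 32)          # a blind forgery attempt
results = [
    (game.real_reader(honest), game.ideal_reader(honest)),
    (game.real_reader(forged), game.ideal_reader(forged)),
]
```

An adversary distinguishes the two games exactly when it finds an input on which the two readers disagree, which is what unforgeability of h is meant to rule out.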

Modeling protocol executions
During its execution, A can call the tag and reader oracles any number of times and according to any interleaving. We are going to model the interaction of A with the protocol agents T, R along an execution trace tr representing a fixed but arbitrary interleaving. Thus, an execution trace tr is a finite sequence $\tau_1, \ldots, \tau_n$ of timestamps, where the j-th timestamp $\tau_j$ in the trace represents the agent the adversary interacted with at this step. To distinguish multiple interactions with the same agents, actions are indexed by a unique replication index i in some set of indices I. For BASIC HASH, we use τ = T(i) and τ = R(i) for an interaction with respectively the tag and the reader. For convenience, we also force $\tau_1$ to a special value init.
Modeling execution traces. We assume a typed logic, to enable reasoning over different kinds of objects. Models of the logic provide an interpretation domain for each type, where for instance message is interpreted as the set of bitstrings and bool as {0, 1}. Intuitively, the type of all terms used in Section 2 is message or bool, with e.g. the equality function symbol = typed as message → message → bool. The types index and timestamp respectively represent the sets of indices I and timestamps T, and we use function symbols T and R, both of type index → timestamp, to build timestamps from indices, and init : timestamp for the initial timestamp. The execution trace tr is implicitly encoded using two function symbols, happens : timestamp → bool and < : timestamp → timestamp → bool, where happens(τ) states that τ has been scheduled in the execution trace, and τ < τ′ that τ occurred before τ′ in the trace. Each model M of the logic fixes the execution trace by interpreting the types and function symbols above. We require, through an ad hoc restriction on models of the logic, that the interpretation of index in any model is finite and independent of the security parameter η. Further, we assume that any scheduled timestamp uniquely corresponds to either init, some T(i) or some R(i).
Modeling the game's execution. In the logic, the interaction of the adversary A with the oracles T and R along an execution trace is modeled using several mutually recursive functions called macros. A macro is identified by its name m and is evaluated at a particular timestamp τ: informally, m@τ refers to the value of m at point τ in the execution. We will use the following macros: input@τ and output@τ denote the messages input and output by the oracle scheduled at time τ in the execution trace; and frame@τ is roughly the sequence of all messages output by any agent during the execution up to timestamp τ. The frame is crucial to express security properties, as it represents the full list of messages seen by the attacker. We assume a function symbol pred : timestamp → timestamp that maps any scheduled timestamp different from init to its predecessor in the execution trace, and all other timestamps to a fixed constant undef, distinct from all scheduled timestamps.
The macros for the game $G(R_r)$ using the real reader are defined by mutual recursion over timestamps. In these definitions, empty is a symbol of type message representing the empty bitstring, and names are generalized to be of type index → message, so that each n(i) is an independent fresh name. The r subscript indicates that these macros are for the game using the real reader. The macro definitions for the game with the ideal reader $G(R_i)$, subscripted by i, are identical except for the output macro: we only need to introduce a macro L@τ for the value of the log L after the execution of the oracle at τ, and to modify the R(i) case of output accordingly.
Then, the formula $\mathsf{frame}_r@\tau \sim \mathsf{frame}_i@\tau$ expresses the fact that, for any model M defining an execution trace tr and an adversary A, the adversary has a negligible probability of distinguishing the real and ideal versions of the protocol, i.e. $G(R_r) \approx G(R_i)$ holds along this particular execution trace (see Section 3.3 below for a discussion of the security guarantees provided by this formula).
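The mutually recursive structure of these macros can be illustrated with a small Python sketch for the real reader. Everything concrete here is an assumption made for the example: the fixed trace, the constants standing for the names sk and n(i), the attacker function `att` (which replays the latest tag message), and the exact way input is computed from the previous frame.

```python
import hashlib
import hmac

SK = b"k" * 32                                  # stands for the name sk
NONCES = {0: b"nonce-0", 1: b"nonce-1"}         # stand for the names n(i)
TRACE = ["init", ("T", 0), ("R", 0), ("T", 1), ("R", 1)]
EMPTY = b""


def h(message, sk):
    return hmac.new(sk, message, hashlib.sha256).digest()


def pred(tau):
    """Predecessor of a scheduled timestamp other than init."""
    return TRACE[TRACE.index(tau) - 1]


def att(frame):
    """Hypothetical adversary: replays the most recent tag message, if any."""
    pairs = [m for m in frame if isinstance(m, tuple)]
    return pairs[-1] if pairs else (EMPTY, EMPTY)


def input_at(tau):
    # The input is computed by the adversary from what it has seen so far.
    return att(frame_at(pred(tau)))


def output_at(tau):
    if tau == "init":
        return EMPTY
    agent, i = tau
    if agent == "T":
        return (NONCES[i], h(NONCES[i], SK))
    fst, snd = input_at(tau)                    # agent == "R": real reader check
    return h(fst, SK) == snd


def frame_at(tau):
    if tau == "init":
        return [output_at(tau)]
    return frame_at(pred(tau)) + [output_at(tau)]


frame = frame_at(TRACE[-1])
```

Evaluating `frame_at` at the last timestamp unfolds the recursion along the whole trace, which is why cryptographic rules must be able to look inside such unfoldings.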
Cryptographic reasoning with recursive functions. The EUF-CMA reasoning rule presented in Section 2.3 does not directly apply to terms containing recursive definitions: we need to be able to reason over the occurrences of the hash function inside terms, and thus possibly inside an arbitrary recursive unfolding. A first way to fix the EUF-CMA rule is to forbid any occurrence of a hash in all the recursive functions occurring inside a term, but it is of course too limiting: in BASIC HASH, we have for instance a set of hash occurrences of the form h(n(i), sk) in $\mathsf{output}_r$. We need to be able to reason over such a set, for instance using some universal quantification over indices or timestamps. We formally introduce such constructions in Section 4.2, and extend the cryptographic rules in Section 4.4.
Implementation in Squirrel. The encoding of protocols using macros as described above can be generalized to support a large class of security protocols. Given a protocol P, we can use the same definitions for frame and input, and we only need to adapt the definition of output according to the protocol specification, and, if necessary, introduce additional macros for the protocol's mutable variables (such as the macro L for the log L). This method is expressive enough to encode complex protocol behaviors, and allows all relevant information to be expressed as messages that our logic can handle. However, it is arguably not very intuitive for users to manipulate. To help with that, Squirrel allows users to describe protocols as processes in a process algebra (a variant of the applied π-calculus), and automatically translates them into macros (the actual definition of Squirrel macros is slightly more involved than the one presented here; we refer readers to [Baelde et al. 2022] for details).

The issue of asymptotic security
With the modeling style described here for protocols, we always reason on an arbitrary but fixed trace. When proving a property, we show that it holds (up to negligible probability) in all models of the logic, and the model includes the interpretation of the order on timestamps, i.e. the interleaving of actions, as well as the interpretation of all attacker function symbols. In particular, this means that in any given model, the trace and the interpretation of adversarial functions are fixed, and only then do the security parameter η and random tape ρ vary. In other words, when proving a property φ, what we establish can informally be described by the formula: ∀ trace tr. ∀ adversary A. Pr_ρ(φ holds in tr against A) is overwhelming in η.
While such a statement gives good security guarantees, it is weaker than what is commonly expected. In cryptographic games, the adversary is indeed given access to the protocol through oracles it may call at will any (polynomial) number of times. That is, the adversary can choose the trace, rather than having it imposed on it. In particular, the length of the trace may be chosen by the adversary and thus depend on η, while it is fixed in our model. That would correspond to the stronger (informal) formula: [...]
It is possible to overcome this limitation by changing the way we model protocols, to let the adversary adaptively choose the trace during the protocol execution. However, such alternatives have not yet been explored in practice.
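The difference between the two guarantees lies in the order of quantification. The following LaTeX sketch contrasts them; the first line restates the formula given above, while the second is only one plausible rendering of the stronger statement (its exact wording is our assumption).

```latex
% Established by a proof in the logic: the trace is fixed before \eta and \rho vary.
\forall\, \mathit{tr}.\ \forall\, \mathcal{A}.\quad
  \Pr_{\rho}\bigl(\varphi \text{ holds in } \mathit{tr} \text{ against } \mathcal{A}\bigr)
  \text{ is overwhelming in } \eta

% Usual cryptographic expectation (assumed rendering): the adversary picks the trace,
% whose length may therefore depend on \eta.
\forall\, \mathcal{A}.\quad
  \Pr_{\rho}\bigl(\varphi \text{ holds in the trace chosen by } \mathcal{A}\bigr)
  \text{ is overwhelming in } \eta
```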

THE GENERAL LOGIC & PROOF SYSTEM
We have considered so far a typed language with first-order terms and a single indistinguishability predicate, with inference rules that can be expressed as axioms in first-order logic. Except for the addition of recursively defined functions, all this fits within the original CCSA logic of [Bana and Comon-Lundh 2014], which is plain first-order logic. Thus, proving statements in that logic can be done using any proof system for classical first-order logic; only the CCSA axiom schemes would reflect the intended probabilistic semantics of terms and the cryptographic assumptions on primitives.
We have seen, however, that syntactic side conditions, which are needed for cryptographic rules, cannot be easily lifted to terms containing recursive definitions. In addition, the logic appears ill-suited for some security properties. Recall the BASIC HASH example, whose security we defined by using a real and an ideal system. Yet, the intuitive security property should only be about a high-level fact verified by all executions of the real system: whenever a reader accepts, the value it received was produced by an honest tag.
To improve this aspect of the logic, a core idea is to introduce a more specific logic, where special status is given to terms of type bool. We have seen axioms involving indistinguishability, of the form t ∼ true or t ∼ false. For such formulas, using computational indistinguishability for all PPTM distinguishers is overkill, and we can take a much simpler approach: t ∼ true simply means that t is true with overwhelming probability. Further, it is convenient to view t itself as a specific kind of formula: to this effect, we can use function symbols representing propositional connectives, as we did for the if-then-else construction; less obviously, the same can be done for quantifiers. In that context, t will be called a local formula, while the real first-order formulas are called global. It turns out that CCSA proofs tend to reason intensively on local formulas. This calls for a proof system that conveniently handles these like proper formulas. Squirrel comes with one such proof system, which we present in this section.
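For reference, the reading of t ∼ true as "t is true with overwhelming probability" can be spelled out as follows. This is the standard definition of overwhelming and negligible; the semantic bracket notation for the value of t under security parameter η and random tape ρ is an assumption of this sketch (\llbracket and \rrbracket come from the stmaryrd package).

```latex
% t is overwhelmingly true in a model M:
\Pr_{\rho}\Bigl(\llbracket t \rrbracket^{\eta,\rho}_{\mathbb{M}} = 1\Bigr) \;\geq\; 1 - \epsilon(\eta)
\quad\text{for some negligible function } \epsilon,

% where \epsilon is negligible when it is eventually smaller than any inverse polynomial:
\forall c > 0.\ \exists \eta_0.\ \forall \eta \geq \eta_0.\quad \epsilon(\eta) \leq \eta^{-c}.
```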

Higher-order terms with a relaxed semantics
Before properly introducing local and global formulas and their proof systems, we must come back to the semantics of terms. Indeed, since [Baelde et al. 2023], the logic behind Squirrel features higher-order terms. There have been several motivations for this significant generalization of the original CCSA logic. First, Squirrel developments predominantly consist of proofs of local formulas, and higher-order features at this level help structure these proofs in a modular way, and re-use them. Second, the constraints of the original CCSA logic, namely that all terms are interpreted as PPTMs, make sense for modeling cryptographic protocols and adversaries but are sometimes too restrictive when writing proofs. For instance, it can be useful to talk about the discrete logarithm over finite groups, even though this operation is (hopefully) not PTIME. It can also be useful to quantify over infinite types in local formulas, e.g. to specify properties of primitives over messages: anticipating what follows, we will be able to write local formulas such as ∀x, y : message. fst(⟨x, y⟩) = x.
The terms of our logic are thus simply-typed λ-calculus terms, additionally featuring recursive definitions. In the tool, some limited forms of polymorphism are even available, but we leave them unaccounted for in the theory. Each type τ is interpreted in a model M as a family of sets ⟦τ⟧^η_M indexed by the security parameter η, with [...]. We require that ⟦bool⟧^η_M = {0, 1} and ⟦message⟧^η_M = {0, 1}* for all M and η. The potential dependency on η is however crucial, e.g. to faithfully model the finite groups used in Diffie-Hellman key exchanges.
Terms are no longer interpreted as PPTMs, but more generally as discrete random variables, i.e. functions from random tapes to the desired domain. More precisely, a model fixes a set of finite tapes T_{M,η} for each η, and we let RV_M(τ) be the set of all η-indexed sequences of functions from T_{M,η} to ⟦τ⟧^η_M. Then, given M, η and ρ, any term t of type τ is interpreted as [...] where M(x) is the interpretation of x in M, and M[x/a] is the model modified to interpret x as a random variable X such that X(η, ρ) = a. Note that the interpretation of a term only depends on the interpretation of its subterms for the same values of η and ρ, and thus the value of M[x/a](x) is only relevant at η, ρ. We refer the reader to [Baelde et al. 2023] for the full details, including the interpretation of recursive definitions when they satisfy a suitable well-foundedness condition, as well as how to check that some terms do correspond to PPTM computations in order to apply cryptographic assumptions.
While it could seem more natural to have infinite tapes, rather than finite but arbitrarily long ones, the restriction to finite tapes has technical motivations. It ensures that all functions computed on tapes actually correspond to random variables, i.e. that they are measurable. This restriction of our logic (and of the original CCSA logic) is not limiting when it comes to modeling cryptographic protocols and attackers, which all run in PTIME. It is however crucial that the length of tapes can grow with η. The restriction to finite tapes is standard in formal cryptographic reasoning: it is present, e.g., in approaches based on Hoare logics [Barthe et al. 2011; Petcher and Morrisett 2015; Basin et al. 2020].

Local and global formulas
Squirrel's logic is structured around two kinds of formulas: we call global formulas the formulas of first-order logic built around the ∼ predicate, and local formulas the terms of type bool. To clearly distinguish the two, we add tildes to logical connectives and quantifiers at the global level. We also denote formulas with different letters: φ and ψ for local formulas and F and G for global ones.

Atomic global formulas include indistinguishabilities, but also atoms of the form [φ], which can be understood as φ ∼ true. The syntax above is open-ended, as more predicates will be used: in addition to custom predicates introduced by the user, e.g. to model verifications on messages, we will make use of several extra predicates later in this section. Note, however, that we will not use equality at the global level.
Indistinguishability atoms u_1, …, u_m ∼ v_1, …, v_m can only be formed when u_i and v_i have the same type for all i. Moreover, this type must be of order at most 1, i.e. of the form τ_1 → … → τ_n where all τ_k are base atomic types. Further, we assume that elements of base types can be encoded as bitstrings, which allows interpreting these atoms as in Section 2.2, with a generalization: we now consider a distinguisher that can access the (semantics of) terms as oracles, which can be called on arbitrary inputs to obtain new data.
As said before, local formulas are just boolean terms. As such, the syntax above should be understood as syntactic sugar: for instance, φ ∧ ψ stands for (∧) φ ψ where (∧) is a constant function of type bool → bool → bool. We also view local quantifiers in this way: ∀(x : τ). φ is a notation for ∀_τ (λ(x : τ). φ) where ∀_τ is a constant of type (τ → bool) → bool. We require that all models interpret these logical constants as expected. In particular, ⟦∀x. φ⟧^{η,ρ}_M = 1 iff ⟦φ⟧^{η,ρ}_{M[x/a]} = 1 for all a ∈ ⟦τ⟧^η_M.
Example 4.1. Consider once again the BASIC HASH protocol of Figure 3. Instead of expressing its security by using an equivalence, we can also state it with a local formula over the previously defined macros input_r and output_r: [...]
Asking that this formula holds with overwhelming probability properly expresses authentication for any trace: whenever some reader R(i) accepts, its input value must have been produced in the past by an honest tag T(j).
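As an illustration of what such an authentication statement can look like as a local formula, the following LaTeX sketch gives one plausible shape. It is not the formula of the example: the predicate accept_r is a hypothetical shorthand for the acceptance test performed by R(i), and the use of happens and of the order < on timestamps is assumed.

```latex
% Hypothetical shape of an authentication property (assumed, for illustration only):
\forall i : \mathsf{index}.\;
  \mathsf{happens}(\mathsf{R}(i)) \wedge \mathsf{accept}_r@\mathsf{R}(i)
  \;\Rightarrow\;
  \exists j : \mathsf{index}.\;
    \mathsf{T}(j) < \mathsf{R}(i) \;\wedge\;
    \mathsf{input}_r@\mathsf{R}(i) = \mathsf{output}_r@\mathsf{T}(j)
```

All connectives and quantifiers here are local ones, so the whole statement is a single term of type bool.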
As usual in first-order logic, a global formula can be satisfied (or not) in a model: we write M |= F when this is the case. A global formula is valid when it is satisfied in all models. We write F |= G when every model that satisfies F also satisfies G.
Example 4.2. We have [φ ⇒ ψ] |= [φ] ⇒̃ [ψ]: if φ ⇒ ψ and φ are overwhelmingly true, then so must be ψ. However, the converse does not hold: [φ] ⇒̃ [ψ] can be satisfied in a model where φ is not overwhelmingly true, but true for half of the tapes; hence ψ could be always false, and [φ ⇒ ψ] does not hold.
Example 4.3. We have [φ] ∨̃ [ψ] |= [φ ∨ ψ] but not the converse: there might be models where only φ holds for half of the random tapes and only ψ holds for the other half, hence φ ∨ ψ is overwhelmingly true (even exactly true) while neither φ nor ψ is.
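The gap between connectives at the two levels can be checked on a toy model. The sketch below is only an illustration under simplifying assumptions: tapes are single bits, probabilities do not depend on η (so "overwhelmingly true" collapses to "true with probability 1"), and the names prob, holds, phi, psi and never are hypothetical.

```python
from fractions import Fraction
from itertools import product

def prob(pred, tape_len=1):
    """Probability that pred(rho) holds for a uniformly random bit tape rho."""
    tapes = list(product((0, 1), repeat=tape_len))
    return Fraction(sum(1 for rho in tapes if pred(rho)), len(tapes))

def holds(pred):
    """Toy stand-in for [phi]: with probabilities independent of eta,
    overwhelmingly true means true with probability exactly 1."""
    return prob(pred) == 1

phi = lambda rho: rho[0] == 0      # true on half of the tapes
psi = lambda rho: rho[0] == 1      # true on the other half
never = lambda rho: False          # always false

# Disjunction: the local version holds, the global one does not.
local_or = holds(lambda rho: phi(rho) or psi(rho))
global_or = holds(phi) or holds(psi)

# Implication towards an always-false formula: the global version holds
# vacuously (phi is not overwhelmingly true), the local one does not.
global_imp = (not holds(phi)) or holds(never)
local_imp = holds(lambda rho: (not phi(rho)) or never(rho))
```

Here local_or and global_imp evaluate to True while global_or and local_imp evaluate to False, matching the two one-way entailments.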
In each model, global formulas are either true or false, but their quantifiers range over random variables, i.e. elements of RV_M(τ) for some τ. In contrast, local formulas are probabilistic and interpreted in RV_M(bool), but their quantifiers range over individuals, i.e. elements a ∈ ⟦τ⟧^η_M. Despite this essential difference, we can relate quantifiers at the two levels, as shown in [Baelde et al. 2023, Proposition 2].
PROPOSITION 4.4. For all M, we have M |= [∀(x : τ). φ] iff M |= ∀̃(x : τ). [φ], and similarly for the existential quantifiers.

Proof system for global and local reasoning
Squirrel's proof system is given in a natural deduction style (though the tool's tactics are sometimes closer to sequent calculus rules) and it deals with two kinds of sequents. Global sequents are of the form E; Θ ⊢ F where F is a global formula, Θ is a set of global formulas, and E is a list of typed variable declarations containing at least all free variables of Θ and F. Such a sequent can be read as the global formula ∀̃E. (∧̃Θ ⇒̃ F), and it is valid when the associated formula is valid. Local sequents are of the form E; Θ; Γ ⊢ φ where φ is a local formula, Γ is a set of local formulas, and the other components are as before, with E declaring all free variables of Θ, Γ and φ. The formula associated to a local sequent is ∀̃E. (∧̃Θ ⇒̃ [∧Γ ⇒ φ]).
It is important to note, in the meaning of local sequents, that the second implication happens at the local level. As a result, the global hypotheses of Θ and the local hypotheses of Γ take a different meaning, illustrated in the following selected rules: [...]
The first rule is the usual axiom rule for the global logic. The second and third rules are local and global versions of the axiom rule for the local logic. The last rule allows proving a global sequent whose conclusion is [φ] from the corresponding local sequent.
The local equality predicate gives rise to the ability to rewrite at the local level, which is in fact a generalization of the REWRITE rule from Figure 2: [...]
In the tool, a powerful rewrite tactic is provided, which builds on the previous rule as well as more basic rules, to perform various kinds of rewriting. The tactic can be invoked on a goal E; Θ; Γ ⊢ φ to selectively replace occurrences of u by v inside Θ, Γ and φ when u = v is part of the hypotheses (or can be simply derived from them). If u = v is a local hypothesis, however, rewriting will only be possible in Γ and φ; rewriting in Θ requires a global hypothesis [u = v], and is only possible at specific occurrences in Θ.
The previous result relating global and local quantifiers justifies that our sequents only feature a top-level environment E corresponding to a global quantification. The rules for universal quantifiers in our proof system are as follows, assuming that x does not occur in E, and that t is a well-typed term of type τ in environment E: [...]
Let us mention one last articulation between the global and local logics: [...]
In the tool, this rule is made available through a convenient rewrite equiv tactic, typically used to establish local formulas when they essentially derive from indistinguishability assumptions.
Example 4.5. A common assumption over hash functions, stronger than EUF-CMA, states that they can be seen as pseudo-random functions (PRF): for a fresh secret key sk, an attacker cannot distinguish between x ↦ h(x, sk) and a fully random function. It notably yields in our logic that h(0, sk), h(1, sk) ∼ n, m. Applying on both sides of the equivalence the function λx, y : message. x = y, we can obtain the validity of the indistinguishability formula (h(0, sk) = h(1, sk)) ∼ (n = m). With this global formula and the rewrite equiv tactic, we can reduce the proof of the local sequent E; Θ; [...]
A more complete description of our proof system is given in [Baelde et al. 2023]. However, it is important to point out that the system presented in that paper (or any earlier one) is only a collection of correct rules: it does not come with any completeness or cut-elimination result. Ongoing investigations are starting to provide such proof-theoretical results, though in a propositional fragment that allows viewing our two-level logic in the richer framework of modal logic.

Cryptographic reasoning with recursive functions
It is natural to lift cryptographic reasoning to either local or global reasoning depending on the nature of the cryptographic assumption. In the case of EUF-CMA from Section 2.3, it could become a local rule of the form: [...] with the side condition that sk only occurs as h(x, sk) in t and m.
Yet, now that terms contain recursive functions, the meaning of forbidding occurrences of sk is unclear: forbidden computations (e.g. a leak of a secret key) might occur deep inside the recursive functions. Further, sk may be an indexed key k(i), and occurrences with distinct indices k(j) for j ≠ i should be ignored. Checking for such indirect occurrences requires a deeper analysis of the terms involved.
Example 4.6. Let k(i) be a name symbol indexed by an integer i and representing a key, and consider the following recursive function: [...] where we assume to have function symbols for the standard operations over integers. Basically, keys(i) computes the list [k(i), k(i − 1), …, k(1)] (encoded as nested pairs). Consider some index terms i_0, j_0 and assume that j_0 < i_0. To prove h(att(keys(j_0)), k(i_0)) = t with EUF-CMA, we need to check, among other things, that k(i_0) does not occur in keys(j_0). The term keys(j_0) cannot be evaluated without providing a concrete value for j_0. Moreover, if we attempted to directly check the condition on the body of keys(j_0), we would find that it contains k(i) (for a bound i), which can a priori be equal to k(i_0). To conclude here, we need to exploit the hypothesis that j_0 < i_0 and details on how the recursive function keys is defined. Thus, we need a more involved check exploiting additional information that cannot be directly obtained from a superficial scrutiny of the terms.
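For concreteness, a recursive definition matching the informal description of keys (the list k(i), k(i − 1), …, k(1) encoded as nested pairs) could be written as below. This is a sketch consistent with that description rather than the definition of the example; in particular, the base value empty is our assumption.

```latex
% Assumed definition; the base case value is a guess.
\mathsf{keys}(i) \;=\;
  \mathsf{if}\ i = 0\ \mathsf{then}\ \mathsf{empty}\
  \mathsf{else}\ \langle\, \mathsf{k}(i),\ \mathsf{keys}(i - 1) \,\rangle
```

With such a definition, the body mentions k(i) for the bound variable i, and nothing in the body alone relates i to a particular index such as i_0.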
Generalized subterms. We therefore introduce a generalized notion of subterm, to compute an over-approximation of all the subterms that may occur during the evaluation of a recursive function, for any model and random tapes. In addition, to allow exploiting semantic information (e.g. the fact that for any k(i) in keys(j), we have i ≠ 0), we gather together with each subterm a local formula expressing the condition under which that subterm may be computed. A generalized subterm, or occurrence, σ in a term t is a tuple (V, φ, s): s is, roughly, a subterm of t (expanding recursive definitions), V is a set of typed variables bound above s in t, and φ is the condition under which s is evaluated in t, gathered from all if conditions. We use the set of these occurrences, denoted ST(t), in cryptographic rules to replace the previous conditions on syntactic subterms. For instance, a condition expressing that k(i_0) does not occur in t now instead requires that for each (V, φ, k(i)) ∈ ST(t), the formula ∀V. φ ⇒ i_0 ≠ i holds. Taking all this into account, the EUF-CMA rule finally becomes: [...] where ST_h(t, m) is ST(t, m), from which we excluded occurrences of k in subterms of the form h(·, k(·)). An additional premise (omitted here) checks that t and m are computable in polynomial time.
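The idea of pairing each subterm with the condition under which it is evaluated can be sketched on a tiny term language. The code below is a simplified illustration, not Squirrel's implementation: the classes Name, App and Ite are hypothetical, conditions are kept as opaque strings, and bound variables (the V component) and recursive definitions are left out.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Name:
    """An indexed name such as k(i) or n(i)."""
    sym: str
    index: str

@dataclass(frozen=True)
class App:
    """Application of a function symbol to arguments."""
    fn: str
    args: tuple

@dataclass(frozen=True)
class Ite:
    """if cond then then_ else else_, with the condition kept as a string."""
    cond: str
    then_: object
    else_: object

def occurrences(t, cond=()):
    """Yield (conditions, name) for every Name in t, where conditions lists
    the guards of the enclosing if-then-else branches."""
    if isinstance(t, Name):
        yield (cond, t)
    elif isinstance(t, App):
        for arg in t.args:
            yield from occurrences(arg, cond)
    elif isinstance(t, Ite):
        yield from occurrences(t.then_, cond + (t.cond,))
        yield from occurrences(t.else_, cond + ("not(" + t.cond + ")",))

# if i = j then h(n(i), k(j)) else k(i)
term = Ite("i = j", App("h", (Name("n", "i"), Name("k", "j"))), Name("k", "i"))
found = list(occurrences(term))
```

Each element of found records a name together with the guards under which it is reached, which is the information a side condition such as "∀V. φ ⇒ i_0 ≠ i" needs.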
Making the verification effective. As presented here, the set of occurrences in a term may be infinite: thus, we cannot generate one premise for each occurrence. To alleviate this issue, we use heuristics that help Squirrel construct a proof obligation for the user that over-approximates this infinite set in a way that makes it expressible as a single formula. We illustrate how this can be done on an example.
Example 4.7. Continuing Example 4.6, consider arbitrary indices i_0, j_0 such that j_0 < i_0. Recall that keys(j_0) computes the list of keys [k(j_0), …, k(1)]. In the style described above, a condition for a rule may require us to consider all elements (V, φ, k(i)) of ST(att(keys(j_0))) (for any i) to show that i cannot be equal to i_0. There are countably many such elements, all of the form (∅, j_0 ≠ 0 ∧ j_0 − 1 ≠ 0 ∧ ⋯ ∧ j_0 − ℓ ≠ 0, k(j_0 − ℓ)) for ℓ ∈ ℕ. They can all be subsumed by using instead the occurrence (i, 0 < i ∧ i ≤ j_0, k(i)). The corresponding proof obligation is ∀i : index. 0 < i ∧ i ≤ j_0 ⇒ i ≠ i_0, which can indeed be proved under the hypothesis that j_0 < i_0.
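The subsumption step can be checked on concrete values. The following sketch assumes that keys unfolds by testing its argument against 0 and then recursing on the predecessor (an assumption about its definition); the function names are hypothetical and indices are modeled as natural numbers.

```python
def key_occurrences(j0):
    """Indices i such that k(i) is actually evaluated when unfolding keys(j0),
    assuming keys(j) = if j = 0 then <base> else <k(j), keys(j - 1)>."""
    found, j = [], j0
    while j != 0:          # accumulated guard: j0 != 0, j0 - 1 != 0, ...
        found.append(j)    # occurrence k(j0 - l) at unfolding depth l
        j -= 1
    return found

def subsumed(j0):
    """Every concrete occurrence satisfies the single condition 0 < i <= j0."""
    return all(0 < i <= j0 for i in key_occurrences(j0))

def obligation_holds(i0, j0):
    """The proof obligation: for all i, 0 < i and i <= j0 implies i != i0.
    Checked here by enumeration over the finite range of candidates."""
    return all(i != i0 for i in range(1, j0 + 1))
```

For instance, obligation_holds(5, 3) is true since 3 < 5, while obligation_holds(2, 3) is false because k(2) does occur in keys(3).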

Reasoning over traces: induction and case analysis
We now clarify the interpretation of traces in this logic, and what kind of reasoning we can perform over them. Following the modeling approach outlined in Section 3, recall that we may use A : index → timestamp to represent an indexed action. Assuming for simplicity that we only have this action A, we restrict our attention to models M where [...] where N is a finite subset of ⟦index⟧_M and undef is a special value used to interpret all timestamps that do not happen. Then, the constants init and A are interpreted naturally, independently of η, ρ: [...]
The following axiom is sound in this class of models:
∀x : timestamp. happens(x) ⇒ x = init ∨ ∃i : index. x = A(i)
It can effectively be used to reason by case analysis on timestamps, but only at the level of local formulas. At the global level, it is tempting to postulate a stronger axiom:
∀̃x : timestamp. [happens(x)] ⇒̃ [x = init] ∨̃ ∃̃i : index. [x = A(i)]
However, this is too strong, and unsound in our class of models: consider for instance a random variable over timestamps that is init for some random tapes and some A(n) for other tapes. To avoid this problem, we crucially need to quantify over random variables that are actually constant, i.e. which depend neither on η nor on ρ. Assuming a global predicate const(·) with exactly this meaning, we can now write the following, which is sound in our class of models: [...]
In practice, such axioms are used in Squirrel proofs through the case tactic. This tactic, when given a timestamp, will perform a case analysis for local or global goals. However, when the goal is a global sequent, it will generate an additional subgoal to verify that the timestamp is constant, corresponding to the hypothesis in our axiom above. Typically, the constancy assumptions come from the target security property. For instance, the equivalence of protocols modeled in Section 3.2 would more precisely be stated as ∀̃τ. const(τ) ⇒̃ frame_r@τ ∼ frame_i@τ.
Case analysis over timestamps is not always sufficient to prove a goal, and induction is sometimes needed. At the local level, induction is supported by the following axiom, which is obviously valid in our intended class of models: [...]

Ongoing and Future Work. As explained in Section 3, the Squirrel tool allows cryptographic protocols to be described in a process algebra, whereas the translation from protocols to the logic relies on the definition of mutually recursive functions modeling the protocol observables depending on the execution trace. To ease formal proofs, this translation does not follow the granularity of elementary execution steps in the process algebra. Instead, it groups together elementary steps in blocks following specific patterns. Whereas the soundness of the translation is easy to obtain for a simple class of protocols, some work remains to be done for more complex protocols.
We also aim to improve the automation of the prover. One option is to discharge parts of the proof that do not rely on any cryptographic assumption to SMT solvers. Another option is to rely on type systems: techniques based on typing have mainly been used in symbolic models, but we are currently working to develop such a system to obtain computational guarantees in the CCSA framework.
We aim to extend the scope of the post-quantum work initiated in [Cremers et al. 2022] and to analyze hybrid post-quantum protocols that rely on a combination of asymmetric post-quantum algorithms with well-known and well-studied pre-quantum asymmetric cryptography, e.g. based on the discrete logarithm problem.
Lastly, as mentioned in Section 3, the current notion of security in Squirrel differs from the standard one in the computational model, as the number of sessions we consider is arbitrary but fixed, in the sense that it does not depend on the security parameter. We plan to overcome this limitation, notably by leveraging the composition result outlined in [Comon et al. 2020]. Furthermore, we intend to expand on our methodology to go beyond asymptotic guarantees and provide concrete security bounds.