Weakly-Unambiguous Parikh Automata and Their Link to Holonomic Series

We investigate the connection between properties of formal languages and properties of their generating series, with a focus on the class of holonomic power series. We first prove a strong version of a conjecture by Castiglione and Massazza: weakly-unambiguous Parikh automata are equivalent to unambiguous two-way reversal bounded counter machines, and their multivariate generating series are holonomic. We then show that the converse is not true: we construct a language whose generating series is algebraic (thus holonomic), but which is inherently weakly-ambiguous as a Parikh automata language. Finally, we prove an effective decidability result for the inclusion problem for weakly-unambiguous Parikh automata, and provide an upper bound on its complexity.

2012 ACM Subject Classification: Theory of computation → Formal languages and automata theory; Mathematics of computing → Generating functions


Introduction
This article investigates the link between holonomic (or D-finite) power series and formal languages. We consider the classical setting in which this connection is established via the generating series $L(x) = \sum_{n\geq 0} \ell_n x^n$ counting the number $\ell_n$ of words of length $n$ in a given language $L$. On the languages side, the Chomsky-Schützenberger hierarchy [10] groups languages in classes of increasing complexity: regular, context-free, context-sensitive and recursively enumerable. For power series, a similar hierarchy exists, consisting of the rational, algebraic and holonomic series. The first two levels of each hierarchy share a strong connection, as the generating series of a regular (resp. unambiguous context-free) language is a rational (resp. algebraic) power series.
This connection has borne fruit both in formal language theory and in combinatorics. In combinatorics, finite automata and unambiguous grammars are routinely used to establish rationality and algebraicity of particular power series. In formal languages, this connection was (implicitly) used to give polynomial-time algorithms for the inclusion and universality tests for unambiguous finite automata [35]. In [15], Flajolet uses the connection between unambiguous context-free grammars and algebraic series to prove the inherent ambiguity of certain context-free languages, solving several conjectures with this tool. Using analytic criteria on the series (for instance, the existence of infinitely many singularities), he establishes that the series of these context-free languages is not algebraic. Hence these languages cannot be described by unambiguous context-free grammars and are therefore inherently ambiguous.
In this article we propose to extend the connection to holonomic series. Holonomic series enjoy non-trivial closure properties whose algorithmic counterparts are actively studied in computer algebra. Our aim is to show that these advances can be leveraged to obtain non-trivial results in the formal languages and verification worlds. These results are particularly noteworthy as the notion of unambiguity in automata theory is not fully understood [11]. The work of extending the connection was already initiated by Massazza in [27], where he introduces two families of languages, named RCM and linearly constrained languages (LCL), whose generating series are holonomic. These classes are, however, not captured by well-known models of automata, and this limits their appeal. Recently, Castiglione and Massazza addressed this issue and conjectured that RCM contains the languages accepted by deterministic one-way reversal bounded machines (RBCM for short) [8]; Massazza proved the result for two subclasses of one-way deterministic RBCM [28,29]. This conjecture hints that the class RCM is related to models of automata such as RBCM, which are used in program verification.
Our first contribution is to prove a stronger version of this conjecture. We show that RCM and LCL respectively correspond to the languages accepted by weakly-unambiguous versions of Parikh automata (PA, for short) [24] and pushdown Parikh automata. Intuitively, Parikh automata are finite non-deterministic word automata enriched with the ability to test semilinear constraints between the number of occurrences of each transition in the run. In terms of RBCM, these classes correspond to unambiguous two-way RBCM and unambiguous one-way RBCM enriched with a stack. Parikh automata are also commonly used in program verification. In view of the literature, these results might seem expected but they still require a careful adaptation of the standard techniques in the absence of a stack and become even more involved when a stack is added.
After having established the relevance of the classes of languages under study, we provide two consequences of the holonomicity of their associated generating series.
The first consequence follows Flajolet's approach mentioned previously and gives criteria to establish the inherent weak-ambiguity for languages accepted by PA or pushdown PA, by proving that their generating series are not holonomic. These criteria are sufficient but not necessary; this is not surprising as the inherent weak-ambiguity is undecidable for languages accepted by PA. Yet, the resulting method captures non-trivial examples with quite short and elegant proofs. In contrast, we give an example of an inherently weakly-ambiguous PA language having a holonomic series (and therefore not amenable to the analytic method) for which we prove inherent weak-ambiguity by hand. The proof is quite involved but shows the inherent ambiguity of this language for a much larger class of automata (i.e., PA whose semilinear sets are replaced by arbitrary recursive sets).

The second consequence is of an algorithmic nature. We focus on the inclusion problem for weakly-unambiguous PA, whose decidability can be deduced from Castiglione and Massazza's work [8]. Here our contribution is an effective decidability result: we derive a concrete bound $B$, depending on the size of the representation of the two PAs, such that the inclusion holds if and only if the languages are included when considering words up to the length $B$. This bound $B$ is obtained by a careful analysis of the proofs establishing the closure properties of holonomic series (in several variables), notably under Hadamard product and specialization. We do this by controlling various parameters (order, size of the polynomial coefficients, ...) of the resulting partial differential equations.

Primer on holonomic power series in several variables
In this section, we introduce power series in several variables and the classes of rational, algebraic and holonomic power series. We recall the connection with regular and context-free languages via the notions of generating series in one or several variables.
Let $\mathbb{Q}[x_1, \dots, x_k]$ be the ring of polynomials in the variables $x_1, \dots, x_k$ with coefficients in $\mathbb{Q}$ and let $\mathbb{Q}(x_1, \dots, x_k)$ be the associated field of rational functions.
The generating series of a sequence $(f_n)_{n\in\mathbb{N}}$ is the (formal) power series in the variable $x$ defined by $F(x) = \sum_{n\in\mathbb{N}} f_n x^n$. More generally, the generating series of a sequence $a(n_1, \dots, n_k)$ is a multivariate (formal) power series in the variables $x_1, \dots, x_k$ defined by
$$A(x_1, \dots, x_k) = \sum_{(n_1,\dots,n_k)\in\mathbb{N}^k} a(n_1, \dots, n_k)\, x_1^{n_1} \cdots x_k^{n_k}.$$
In this article, we only consider power series whose coefficients belong to the field $\mathbb{Q}$. The set of such $k$-variate power series is denoted $\mathbb{Q}[[x_1, \dots, x_k]]$. Power series are naturally equipped with a sum and a product which generalize those of polynomials, for which $\mathbb{Q}[[x_1, \dots, x_k]]$ is a ring. We use the bracket notation for the coefficient extraction: $[x_1^{n_1} \cdots x_k^{n_k}]A = a(n_1, \dots, n_k)$.

The generating series of a language $L$ over the alphabet $\Sigma = \{a_1, \dots, a_k\}$ is the univariate power series $L(x) = \sum_{w\in L} x^{|w|} = \sum_{n\in\mathbb{N}} \ell_n x^n$, where $\ell_n$ counts the number of words of length $n$ in $L$. Similarly the multivariate generating series of $L$ is defined by
$$L(x_{a_1}, \dots, x_{a_k}) = \sum_{(n_1,\dots,n_k)\in\mathbb{N}^k} \ell(n_1, \dots, n_k)\, x_{a_1}^{n_1} \cdots x_{a_k}^{n_k},$$
where $\ell(n_1, \dots, n_k)$ denotes the number of words $w$ in $L$ such that $|w|_{a_1} = n_1$, $|w|_{a_2} = n_2$, ..., and $|w|_{a_k} = n_k$, and $|w|_a$ denotes the number of occurrences of $a \in \Sigma$ in $w$. This way, we create one dimension per letter, so that each letter $a \in \Sigma$ has a corresponding variable $x_a$.
Observe that the univariate generating series of a language is exactly L(x, . . ., x), obtained by setting each variable to x in its multivariate generating series.
Example 1. The generating series of the language $P$ of well-nested parentheses defined by the grammar $S \to aSbS \mid \varepsilon$ is $P(x_a, x_b) = \frac{1-\sqrt{1-4x_ax_b}}{2x_ax_b}$ and its counting series is $P(x) = \frac{1-\sqrt{1-4x^2}}{2x^2}$ (see footnote 4). Indeed the productions of the grammar translate to the equation $P(x_a, x_b) = x_a x_b P(x_a, x_b)^2 + 1$. This equation admits only one power series solution, namely $\frac{1-\sqrt{1-4x_ax_b}}{2x_ax_b}$.

Footnote 4. As for the inverse, the square root of a power series with nonzero constant term can be defined using the usual Taylor formula.

ICALP 2020

A power series $A(x_1, \dots, x_k) = \sum_{n_1,\dots,n_k} a(n_1, \dots, n_k)\, x_1^{n_1} \cdots x_k^{n_k}$ is rational if it satisfies an equation of the form $P(x_1, \dots, x_k)\, A(x_1, \dots, x_k) = Q(x_1, \dots, x_k)$, with $P, Q \in \mathbb{Q}[x_1, \dots, x_k]$ and $P \neq 0$. The generating series (both univariate and multivariate) of regular languages (i.e., languages accepted by a finite state automaton) are rational power series [4]. It is well-known that the generating series can be effectively computed from a deterministic automaton accepting the language (see for instance [17, §I.4.2] for a detailed proof). For example, the multivariate generating series of the regular language $(abc)^*$ is $\frac{1}{1-x_ax_bx_c}$. Its univariate generating series is $\frac{1}{1-x^3}$.

The connection between rational languages and rational power series is not tight. For instance, the generating series of the non-regular context-free language $\{a^n b^n : n \geq 0\}$ is $\frac{1}{1-x_ax_b}$, which is rational. In fact, it has the same generating series as the regular language $(ab)^*$. Also there exist rational power series with coefficients in $\mathbb{N}$ which are not the generating series of any rational language. This is the case for a series with denominator $1+x-5x^2-125x^3$, as shown in [4].

A power series $A(x_1, \dots, x_k)$ is algebraic if there exists a non-zero polynomial $P \in \mathbb{Q}[x_1, \dots, x_k, Y]$ such that $P(x_1, \dots, x_k, A(x_1, \dots, x_k)) = 0$. All rational series are algebraic.

Example 2. The series $P(x_a, x_b)$ of Example 1 is algebraic but not rational: it satisfies $x_a x_b P^2 - P + 1 = 0$ and there is no similar algebraic equation of degree 1.
The reader is referred to [34,17] for a detailed account on rational and algebraic series.
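The equation of Example 1 can be checked on the first coefficients. The following is a minimal sketch, assuming the grammar generates exactly the well-nested words over {a, b}; the helper names (`is_well_nested`, `dyck_counts`, `satisfies_equation`) are illustrative and not part of the article. It enumerates words by brute force, so it is only meant for small sizes.

```python
from itertools import product
from math import comb


def is_well_nested(word):
    """True if `word` over {a, b} is well-nested (a opens, b closes)."""
    depth = 0
    for letter in word:
        depth += 1 if letter == "a" else -1
        if depth < 0:
            return False
    return depth == 0


def dyck_counts(max_pairs):
    """p[n] = number of well-nested words with n letters a and n letters b."""
    return [
        sum(is_well_nested(w) for w in product("ab", repeat=2 * n))
        for n in range(max_pairs + 1)
    ]


def satisfies_equation(p):
    """Check P = t * P^2 + 1 coefficient-wise, where t stands for x_a * x_b."""
    if p[0] != 1:
        return False
    for n in range(1, len(p)):
        # Coefficient of t^(n-1) in P^2.
        square_coeff = sum(p[i] * p[n - 1 - i] for i in range(n))
        if p[n] != square_coeff:
            return False
    return True


p = dyck_counts(6)
print(p, satisfies_equation(p))
```

The coefficients obtained this way are the Catalan numbers, which are also the Taylor coefficients of the closed form with the square root.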
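In the same manner that rational series satisfy linear equations and algebraic series satisfy polynomial equations, holonomic series satisfy linear differential equations with polynomial coefficients. To give a precise definition, we need to introduce the formal partial differentiation of power series. The differential operator $\partial_{x_i}$ with respect to the variable $x_i$ is defined by
$$\partial_{x_i} A(x_1, \dots, x_k) = \sum_{(n_1,\dots,n_k)\in\mathbb{N}^k} n_i\, a(n_1, \dots, n_k)\, x_1^{n_1} \cdots x_i^{n_i-1} \cdots x_k^{n_k}.$$
The composed operator $\partial^j_{x_i}$ is inductively defined for $j \geq 1$ by $\partial^1_{x_i} = \partial_{x_i}$ and $\partial^{j+1}_{x_i} = \partial_{x_i} \circ \partial^j_{x_i}$.

Definition 3 (see [33,26]). A power series $A(x_1, \dots, x_k)$ is holonomic if, for every variable $x_i$, it satisfies a linear differential equation $P_{i,r_i}\, \partial^{r_i}_{x_i} A + \cdots + P_{i,1}\, \partial_{x_i} A + P_{i,0}\, A = 0$ whose coefficients $P_{i,j}$ are polynomials of $\mathbb{Q}[x_1, \dots, x_k]$, not all zero.

In the sequel, except for Section 5, we rely on the closure properties of the holonomic series and will not need to go back to Definition 3.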

Example 4. A simple example of holonomic series is
For a more involved example, consider the language $L_3$ containing the words having the same number of occurrences of a's, b's and c's. This language is classically not context-free. Moreover there are $\binom{3n}{n,n,n}$ words of length $3n$ in $L_3$

and the power series

and satisfies the partial differential equation:

Holonomic series are an extension of the hierarchy we presented, as stated in the following proposition (see [12] for a proof, and [3] for bounds, algorithms and historical remarks).

Footnote 5. The equivalence of these notions is proved by deep results of Bernšteĭn [2] and Kashiwara [21,36].

Proposition 5. Multivariate algebraic power series are holonomic.
In the univariate case, a power series $A(x) = \sum_n a_n x^n$ is holonomic if and only if its coefficients satisfy a linear recurrence of the form $p_r(n)\, a_{n+r} + \cdots + p_0(n)\, a_n = 0$, where every $p_i$ is a polynomial with rational coefficients [33, Th. 1.2].
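As an illustration of this characterization, the sketch below checks such recurrences on two classical sequences. The recurrences used here are standard identities supplied for illustration and are not taken from the article: $(n+2)\,c_{n+1} - (4n+2)\,c_n = 0$ for the Catalan numbers, and $(n+1)^2\, t_{n+1} - 3(3n+1)(3n+2)\, t_n = 0$ for the multinomial coefficients $t_n = \binom{3n}{n,n,n}$. The function name `satisfies_recurrence` is hypothetical.

```python
from math import comb, factorial


def satisfies_recurrence(seq, polys):
    """Check p_r(n) * seq[n+r] + ... + p_0(n) * seq[n] == 0 on all available n.

    `polys` lists the coefficient polynomials p_0, ..., p_r as Python callables.
    """
    r = len(polys) - 1
    return all(
        sum(p(n) * seq[n + i] for i, p in enumerate(polys)) == 0
        for n in range(len(seq) - r)
    )


# Catalan numbers: (n + 2) c_{n+1} - (4n + 2) c_n = 0.
catalan = [comb(2 * n, n) // (n + 1) for n in range(30)]
catalan_rec = [lambda n: -(4 * n + 2), lambda n: n + 2]

# Multinomials (3n)! / (n!)^3: (n + 1)^2 t_{n+1} - 3 (3n + 1)(3n + 2) t_n = 0.
trinomial = [factorial(3 * n) // factorial(n) ** 3 for n in range(30)]
trinomial_rec = [lambda n: -3 * (3 * n + 1) * (3 * n + 2), lambda n: (n + 1) ** 2]

print(satisfies_recurrence(catalan, catalan_rec))
print(satisfies_recurrence(trinomial, trinomial_rec))
```

A finite check of this kind only tests a candidate recurrence on an initial segment; it does not prove holonomicity.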
We now focus on these closure properties.
Holonomic series are also closed under substitution by algebraic series as long as the resulting series is well-defined.
A sufficient condition for the substitution to be valid is that $G_i(0, \dots, 0) = 0$ for all $i$ (see [33, Th. 2.7]). For the case . ., y .

The Hadamard product is the coefficient-wise multiplication of power series. If the series $A(x_1, \dots, x_k)$ and $B(x_1, \dots, x_k)$ are the generating series of the sequences $a(n_1, \dots, n_k)$ and $b(n_1, \dots, n_k)$, the Hadamard product $A \odot B$ of $A$ and $B$ is the power series defined by
$$A \odot B\,(x_1, \dots, x_k) = \sum_{(n_1,\dots,n_k)\in\mathbb{N}^k} a(n_1, \dots, n_k)\, b(n_1, \dots, n_k)\, x_1^{n_1} \cdots x_k^{n_k}.$$
Observe that the support of $A \odot B$ is the intersection of the supports of $A$ and $B$.
Example 9. The generating series of the language $L_3$ of Example 4, which is not context-free, can be expressed using the Hadamard product: since , which is not algebraic.

One of our main technical contributions is to provide bounds on the sizes of the polynomials in the differential equations of the holonomic representation of the Hadamard product of two rational series $\frac{P_1}{Q_1}$ and $\frac{P_2}{Q_2}$: we prove that their maxdegree is at most $(kM)^{O(k)}$ and that the logarithm of their largest coefficient is at most $(kM)^{O(k^2)}(1 + \log S_\infty)$, where $M$ (resp. $S_\infty$) is the maxdegree plus one (resp. largest coefficient) in $P_1$, $Q_1$, $P_2$ and $Q_2$.
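To make the Hadamard product concrete, here is a small sketch on truncated series stored as dictionaries from exponent vectors to coefficients. It assumes, for illustration only, one natural decomposition of the series of $L_3$: the series $\frac{1}{1-x_a-x_b-x_c}$ of all words over {a, b, c}, multiplied coefficient-wise by the series $\frac{1}{1-x_ax_bx_c}$ whose support is the diagonal. All function names are hypothetical.

```python
from itertools import product
from math import factorial


def hadamard(f, g):
    """Coefficient-wise product of two truncated series (dict: exponents -> coeff)."""
    return {e: f[e] * g[e] for e in f.keys() & g.keys()}


def all_words_series(max_len):
    """Coefficients of 1 / (1 - x_a - x_b - x_c) up to total degree max_len."""
    series = {}
    for i, j, k in product(range(max_len + 1), repeat=3):
        if i + j + k <= max_len:
            series[(i, j, k)] = factorial(i + j + k) // (
                factorial(i) * factorial(j) * factorial(k)
            )
    return series


def diagonal_series(max_len):
    """Coefficients of 1 / (1 - x_a x_b x_c) up to total degree max_len."""
    return {(n, n, n): 1 for n in range(max_len // 3 + 1)}


def brute_force_l3(max_len):
    """Count words with as many a's as b's and c's, by exhaustive enumeration."""
    counts = {}
    for length in range(0, max_len + 1, 3):
        n = length // 3
        counts[(n, n, n)] = sum(
            1
            for w in product("abc", repeat=length)
            if w.count("a") == w.count("b") == w.count("c")
        )
    return counts


l3 = hadamard(all_words_series(9), diagonal_series(9))
print(sorted(l3.items()))
```

The dictionary returned by `hadamard` only contains exponents present in both arguments, which mirrors the remark that the support of a Hadamard product is the intersection of the supports.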

Weakly-unambiguous Parikh automata
In this section, we introduce weakly-unambiguous Parikh automata and show that their multivariate generating series are holonomic. We establish that they accept the same languages as unambiguous two-way reversal bounded counter machines [19]. Finally, we prove that the class of accepted languages coincides with Massazza's RCM class [27,8].

Parikh automata (PA for short) were introduced in [23,24]. Informally, a PA is a finite automaton whose transitions are labeled by pairs $(a, v)$ where $a$ is a letter of the input alphabet and $v$ is a vector in $\mathbb{N}^d$. A run $q_0 \xrightarrow{a_1, v_1} q_1 \cdots \xrightarrow{a_n, v_n} q_n$ computes the word $a_1 \cdots a_n$ and the vector $v_1 + \cdots + v_n$, where the sum is done component-wise. The acceptance condition is given by a set of final states and a semilinear set of vectors. A run is accepting if it reaches a final state and if its vector belongs to the semilinear set.
The PA depicted above, equipped with the semilinear constraint $\{(n_1, n_2, n_1 + n_2) : n_1, n_2 \geq 0\}$, accepts the set of words $w$ over $\{a, b, c\}$ that start and end with $a$ and that are such that $|w|_a + |w|_b = |w|_c$.
In [13,20], it is shown that every semilinear set admits an unambiguous presentation. A presentation $c + P^*$ with $P = \{p_1, \dots, p_k\}$ of a linear set $L$ is unambiguous if for all $x \in L$, the $\lambda_i$'s such that $x = c + \lambda_1 p_1 + \cdots + \lambda_k p_k$ are unique. An unambiguous presentation of a semilinear set is given by a disjoint union of unambiguous linear sets. A bound on the size of the equivalent unambiguous presentation is given in [9].
Semilinear sets are ubiquitous in theoretical computer science and admit numerous characterizations. They are the rational subsets of the commutative monoid $(\mathbb{N}^d, +)$, the unambiguous rational subsets of $(\mathbb{N}^d, +)$ [13,20], the Parikh images of context-free languages [30], the sets definable in Presburger arithmetic [31], and the sets defined by boolean combinations of linear inequalities, equalities and equalities modulo constants.
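For a semilinear set $C \subseteq \mathbb{N}^d$, we consider its characteristic generating series $C(x_1, \dots, x_d) = \sum_{(i_1,\dots,i_d)\in C} x_1^{i_1} \cdots x_d^{i_d}$. It is well-known [13,20] that this power series is rational (see footnote 8).

Footnote 8. Writing $x^v$ for $x_1^{v_1} \cdots x_d^{v_d}$, the characteristic series of an unambiguous linear set $c + P^* \subseteq \mathbb{N}^d$ with $P = \{p_1, \dots, p_r\}$ is $x^c \prod_{i=1}^{r} (1 - x^{p_i})^{-1}$ and hence is rational. As an unambiguous semilinear set is the disjoint union of unambiguous linear sets, its characteristic series is the sum of their series and it is therefore rational.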

Weakly-unambiguous PAs and their generating series
We now introduce PA and their weakly-unambiguous variant. We discuss the relationship with the class of unambiguous PA introduced by Cadilhac et al. in [7] and the closure properties of this class.
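A Parikh automaton of dimension $d \geq 1$ is a tuple $A = (\Sigma, Q, q_I, F, C, \Delta)$ where $\Sigma$ is the alphabet, $Q$ is the set of states, $q_I \in Q$ is the initial state, $F \subseteq Q$ is the set of final states, $C \subseteq \mathbb{N}^d$ is the semilinear constraint set and $\Delta \subseteq Q \times (\Sigma \times \mathbb{N}^d) \times Q$ is the finite transition relation. A run of the automaton is a sequence $q_0 \xrightarrow{a_1, v_1} q_1 \cdots \xrightarrow{a_n, v_n} q_n$ where each $(q_{i-1}, (a_i, v_i), q_i)$ belongs to $\Delta$. The run is labeled by the word $w = a_1 \cdots a_n$. It is accepting if $q_0 = q_I$, if the state $q_n$ is final and if the vector $v_1 + \cdots + v_n$ belongs to $C$. The word $w$ is then said to be accepted by $A$. The language accepted by $A$ is denoted by $L(A)$.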
To define a notion of size for a PA, we assume that the constraint set is given by an unambiguous presentation $\biguplus_{i=1}^{p} (c_i + P_i^*)$. We denote by $|A| := |Q| + |\Delta| + p + \sum_i |P_i|$ and by $\|A\|_\infty$ the maximum coordinate of a vector appearing in $\Delta$, the $c_i$'s and the $P_i$'s.

Definition 10. A Parikh automaton is said to be weakly-unambiguous if for every word there is at most one accepting run.
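The definitions above can be made concrete with a small simulator. The sketch below uses a hypothetical PA of dimension 3 over {a, b, c}, chosen here for illustration (it is an assumption, not the automaton of the article's figure): it accepts words that start and end with a and satisfy $|w|_a + |w|_b = |w|_c$. The constraint set is given as a Python predicate rather than as a semilinear presentation. The code counts accepting runs and tests Definition 10 on all words up to a given length.

```python
from itertools import product

# Hypothetical PA: transitions are (source, letter, vector, target).
TRANSITIONS = [
    ("q0", "a", (1, 0, 0), "q1"),
    ("q1", "a", (1, 0, 0), "q1"),
    ("q1", "b", (0, 1, 0), "q1"),
    ("q1", "c", (0, 0, 1), "q1"),
    ("q1", "a", (1, 0, 0), "q2"),  # guess that this a is the last letter
]
INITIAL = "q0"
FINALS = {"q2"}


def constraint(v):
    """Membership in {(n1, n2, n1 + n2)}: an assumed encoding of the constraint set."""
    return v[0] + v[1] == v[2]


def accepting_runs(word):
    """Number of accepting runs of the PA on `word`."""
    # configs maps (state, accumulated vector) to the number of runs reaching it.
    configs = {(INITIAL, (0, 0, 0)): 1}
    for letter in word:
        nxt = {}
        for (state, vec), count in configs.items():
            for src, a, v, dst in TRANSITIONS:
                if src == state and a == letter:
                    key = (dst, tuple(x + y for x, y in zip(vec, v)))
                    nxt[key] = nxt.get(key, 0) + count
        configs = nxt
    return sum(
        count
        for (state, vec), count in configs.items()
        if state in FINALS and constraint(vec)
    )


def is_weakly_unambiguous_up_to(max_len, alphabet="abc"):
    """Check that every word of length <= max_len has at most one accepting run."""
    return all(
        accepting_runs(w) <= 1
        for n in range(max_len + 1)
        for w in product(alphabet, repeat=n)
    )


print(accepting_runs("acca"), is_weakly_unambiguous_up_to(6))
```

Such a bounded test can refute weak-unambiguity but cannot establish it; it is only meant to illustrate what is being counted.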
A language is inherently weakly-ambiguous if it cannot be accepted by any weakly-unambiguous PA. The language $S$ (defined in Section 4.1) is an example (see footnote 9) of a language accepted by a non-deterministic PA which is inherently weakly-ambiguous.
Remark 11. We consider here the standard notion of unambiguity for finite state machines. However we decided to use the name weakly-unambiguous to avoid the confusion with the class of unambiguous PA which appears in the literature. This class was introduced by Cadilhac et al. in [7] for constraint automata, a model equivalent to PA, and was later defined directly on PAs. This notion of unambiguity is more restrictive than ours: they call a Parikh automaton unambiguous if the underlying automaton on letters, where the vectors have been erased, is unambiguous. Clearly such automata are weakly-unambiguous. However the converse is not true. Consider the language L = {c^n w : over the alphabet {a, b, c}. Using results from [7], one can show that it is not recognized by any unambiguous Parikh automaton. However, it is accepted by the weakly-unambiguous automaton depicted in Fig. 1 below with the semilinear {(n 1 , n 2 , n 3 ) :

The lack of expressivity of unambiguous PAs is counter-balanced by their closure under boolean operations, which is explained by their link with a class of deterministic PA [7,14]. It was pointed out to us by a reviewer that the class of weakly-unambiguous PA is briefly considered, under the name OneCA, in Cadilhac's PhD thesis [5, p. 117], where only basic properties are established, in particular the strict inclusion of unambiguous PA in this class.
Using a standard product construction where the vectors are concatenated and using the concatenation of the constraints, it is easy to show that weakly-unambiguous PA are closed under intersection. In [8], the authors claim that the class (see footnote 10) is closed under union. However their construction has an irrecoverable flaw and we do not know if weakly-unambiguous PA are closed under union or under complementation.

Footnote 9. Using the equivalences between weakly-unambiguous PA and RCM established in Proposition 13 and PA and RBCM [23,6], it also gives an example of a language accepted by a RBCM with a non-holonomic generating series (strengthening Theorem 12 of [8]) and a witness for the strict inclusion of RCM in RBCM announced in Theorem 11 of [8]. Remark that their proof of this theorem only shows that there exists no recursive translation from RBCM to RCM.

Footnote 10. Actually, their claim is for the class RCM, which we will show to be equivalent in Section 3.4.

We now give a very short proof of the fact that weakly-unambiguous languages in PA have holonomic generating series. The idea of the proof can be traced back to [25]. A similar proof was given in [27] for languages in the class RCM but using the closure under algebraic substitutions instead of specialization (see Remark 25).
Our approach brings to light a different multivariate power series associated with a weakly-unambiguous PA $A$ of dimension $d$. The multivariate weighted generating series $G(x, y_1, \dots, y_d)$ of $A$ is such that for all indices $(n, i_1, \dots, i_d)$, $[x^n y_1^{i_1} \cdots y_d^{i_d}]G$ counts the number of words of length $n$ accepted by $A$ with a run labeled by the vector $(i_1, \dots, i_d)$.

Proposition 12. The generating series of the language recognized by a weakly-unambiguous Parikh automaton is holonomic.
Proof. Let $A$ be a weakly-unambiguous PA with a constraint set $C \subseteq \mathbb{N}^d$. We first prove that its weighted series $G(x, y_1, \dots, y_d)$ is holonomic. As holonomic series are closed under Hadamard product (see Theorem 8), it suffices to express $G$ as the Hadamard product of two rational series $\overline{A}$ and $\overline{C}$ in the variables $x, y_1, \dots, y_d$.

The first series $\overline{A}(x, y_1, \dots, y_d)$ is such that for all $n, i_1, \dots, i_d \geq 0$, $[x^n y_1^{i_1} \cdots y_d^{i_d}]\overline{A}$ counts the number of runs of $A$ starting in $q_I$, ending in a final state and labeled with a word of length $n$ and the vector $(i_1, \dots, i_d)$. Note that we do not require that $(i_1, \dots, i_d)$ belongs to $C$. As this series simply counts the number of runs in an automaton, its rationality is proved via the standard translation of the automaton into a linear system of equations.

For the second series, we take $\overline{C}(x, y_1, \dots, y_d) := \frac{1}{1-x}\, C(y_1, \dots, y_d)$ where $C(y_1, \dots, y_d)$ is the support series of the constraint set $C$, which is rational (see [13,20]). A direct computation yields that for all $n, i_1, \dots, i_d \geq 0$, $[x^n y_1^{i_1} \cdots y_d^{i_d}]\overline{C}$ is equal to 1 if $(i_1, \dots, i_d)$ belongs to $C$ and 0 otherwise. The Hadamard product of $\overline{A}$ and $\overline{C}$ counts the number of runs accepting a word of length $n$ with the vector $(i_1, \dots, i_d)$. As $A$ is weakly-unambiguous, this quantity is equal to the number of words of length $n$ accepted with this vector. Hence $G = \overline{A} \odot \overline{C}$.

The univariate series $A(x)$ of $A$ is equal to $G(x, 1, \dots, 1)$. Indeed, for all $n \geq 0$, $[x^n]G(x, 1, \dots, 1)$ is the sum over all vectors $i \in \mathbb{N}^d$ of the coefficients $[x^n y_1^{i_1} \cdots y_d^{i_d}]G(x, y_1, \dots, y_d)$, that is, of the number of words of length $n$ accepted with the vector $i$. As $A$ is weakly-unambiguous, each word is accepted with at most one vector and this sum is therefore equal to the total number of accepted words of length $n$. Thanks to Proposition 7, $A(x) = G(x, 1, \dots, 1)$ is holonomic.
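The counting argument of this proof can be replayed numerically. The sketch below is an illustration under assumptions: it uses a hypothetical weakly-unambiguous PA over {a, b, c} (words starting and ending with a such that $|w|_a + |w|_b = |w|_c$), a Python predicate in place of the semilinear set, and hypothetical function names. It computes the coefficients of the run-counting series, filters them by the constraint (the Hadamard product with a 0/1 series), sums over the vectors (the specialization at 1) and compares with a direct count of accepted words.

```python
from itertools import product

# Hypothetical PA of dimension 3: (source, letter, vector, target).
TRANSITIONS = [
    ("q0", "a", (1, 0, 0), "q1"),
    ("q1", "a", (1, 0, 0), "q1"),
    ("q1", "b", (0, 1, 0), "q1"),
    ("q1", "c", (0, 0, 1), "q1"),
    ("q1", "a", (1, 0, 0), "q2"),
]
INITIAL, FINALS = "q0", {"q2"}


def in_constraint(v):
    """Assumed constraint set {(n1, n2, n1 + n2)}."""
    return v[0] + v[1] == v[2]


def run_counts(length):
    """Coefficients [x^length y^v] of the series counting runs from q_I to a final state."""
    configs = {(INITIAL, (0, 0, 0)): 1}
    for _ in range(length):
        nxt = {}
        for (state, vec), count in configs.items():
            for src, _letter, v, dst in TRANSITIONS:
                if src == state:
                    key = (dst, tuple(x + y for x, y in zip(vec, v)))
                    nxt[key] = nxt.get(key, 0) + count
        configs = nxt
    counts = {}
    for (state, vec), count in configs.items():
        if state in FINALS:
            counts[vec] = counts.get(vec, 0) + count
    return counts


def weighted_coefficients(length):
    """Hadamard product with the 0/1 series of the constraint: keep vectors in C."""
    return {v: c for v, c in run_counts(length).items() if in_constraint(v)}


def univariate_coefficient(length):
    """Specialization y_1 = ... = y_d = 1: sum over all vectors."""
    return sum(weighted_coefficients(length).values())


def accepted_words(length):
    """Direct count from the description of the language, independent of the PA."""
    return sum(
        1
        for w in product("abc", repeat=length)
        if length >= 2
        and w[0] == "a"
        and w[-1] == "a"
        and w.count("a") + w.count("b") == w.count("c")
    )


print([univariate_coefficient(n) for n in range(9)])
```

The two counts agree only because each accepted word has exactly one accepting run; for a weakly-ambiguous automaton the first quantity would overcount.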

Equivalence with unambiguous reversal bounded counter machines
A k-counter machine [19] is informally a Turing machine with one read-only tape that contains the input word, and $k$ counters. Reading a letter $a$ on the input tape, in a state $q$, the machine can check which of its counters are zero, increment or decrement its counters, change its state, and move its read head one step to the left or right, or stay on its current position. Note that the machine does not have access to the exact value of its counters. A k-counter machine is said to be $(m, n)$-reversal bounded if its reading head can change direction between left and right at most $m$ times, and if every counter can alternate between incrementing and decrementing at most $n$ times. Finally, a reversal bounded counter machine (RBCM) is a k-counter machine which is $(m, n)$-reversal bounded for some $m$ and $n$. A RBCM is unambiguous if for every word there is at most one accepting computation.
RBCM are known to recognize the same languages as Parikh automata (see [24,23,6]). This equality does not hold anymore for their deterministic versions [6, Prop. 3.14]. However, the proof of the equivalence for the general case can be slightly modified to preserve unambiguity.

Proposition 13. The classes of languages accepted by unambiguous RBCM and by weakly-unambiguous PA coincide.
Proof sketch. Unambiguous RBCMs are shown to be equivalent to one-way unambiguous RBCMs. In turn these are shown to be equivalent to weakly-unambiguous PA with ε-transitions, which in turn are equivalent to weakly-unambiguous PA. This ε-removal step needs to be adapted to preserve weak-unambiguity.

Equivalence with RCM
A language $L$ over $\Sigma$ belongs to RCM if there exist a regular language $R$ over $\Gamma = \{a_1, \dots, a_d\}$, a semilinear set $C \subseteq \mathbb{N}^d$ and a length-preserving morphism $\mu : \Gamma^* \to \Sigma^*$, injective on $R \cap [C]$, such that $L = \mu(R \cap [C])$, where $[C]$ denotes the set of words $w \in \Gamma^*$ such that $(|w|_{a_1}, \dots, |w|_{a_d}) \in C$.

Theorem 14. RCM is the class of languages accepted by weakly-unambiguous PA.

Proof sketch. Every language in RCM can be accepted by a weakly-unambiguous PA that guesses the underlying word over $\Gamma$: the weak-unambiguity is guaranteed by the injectivity of the morphism. Conversely a language accepted by a weakly-unambiguous PA is in RCM by taking for $R$ the set of runs of the PA and translating the constraint: the injectivity of the morphism is guaranteed by the weak-unambiguity of the PA.

In [8], the authors conjectured that the class RCM contains the one-way deterministic RBCM. From Theorem 14 and Proposition 13 we get a stronger result:

Corollary 15. The languages in RCM are the languages accepted by unambiguous RBCM.

Weakly-unambiguous pushdown Parikh automata
A pushdown Parikh automaton (pushdown PA for short) is a PA where the finite automaton is replaced by a pushdown automaton. A weakly-unambiguous pushdown PA has at most one accepting run for each word. Most results obtained previously can be adapted for weakly-unambiguous pushdown PA. However, unsurprisingly, the class of languages accepted by weakly-unambiguous pushdown PA is not closed under union and intersection. This can be shown using the inherent weak-ambiguity of the language $D$ proved in Section 4.1. The closure under complementation is left open.

Proposition 16. The generating series of a weakly-unambiguous pushdown PA is holonomic.
Proof. The proof is almost identical to the proof of Proposition 12. The only difference is that the series $\overline{A}$ is algebraic and not rational. Indeed it counts the number of runs in a pushdown automaton and the language of runs is a deterministic context-free language even if the pushdown automaton is not deterministic.
Remark that using the same techniques, we can prove that the generating series of weakly-unambiguous Parikh tree automata are holonomic. As we proved that all these series are also generating series of PAs, we do not elaborate on this model in this extended abstract.
RBCMs can be extended with a pushdown storage to obtain a RBCM with a stack [19].
Theorem 17. Weakly-unambiguous pushdown PA are equivalent to unambiguous one-way RBCM with a stack.

Proof sketch. We first establish that unambiguous one-way RBCM with a stack are equivalent to weakly-unambiguous pushdown PA with ε-transitions. Contrary to the PA case, the removal of ε-transitions is quite involved and uses weighted context-free grammars.
The class LCL of [27] is defined as RCM is, except that the regular language is replaced by an unambiguous context-free language. Similarly to the PA case, one can prove:

Proposition 18. LCL is the set of languages accepted by weakly-unambiguous pushdown PA.

Examples of inherently weakly-ambiguous languages
There is a polynomial-time algorithm to decide whether a given PA is weakly-unambiguous. But inherent weak-ambiguity is undecidable, as a direct application of a general theorem from [18]. This emphasizes that inherent weak-ambiguity is a difficult problem in general.

Two examples using an analytic criterion
Following an idea from Flajolet [15] for context-free languages, the link between weakly-unambiguous PA and holonomic series yields sufficient criteria to establish inherent weak-ambiguity, of analytic flavor: the contraposition of Proposition 12 indicates that if $L$ is recognized by a PA but its generating series is not holonomic, then $L$ is inherently weakly-ambiguous. Hence, any criterion of non-holonomicity can be used to establish the inherent weak-ambiguity. Many such criteria can be obtained when considering the generating series as analytic functions (of complex variables). See [15,16] for several examples. For the presentation of this method in this extended abstract, we only rely on the following property:

Proposition 19 ([33]). A holonomic function in one variable has finitely many singularities.
Our first example is the language $D$, defined over the alphabet $\{a, b\}$ as follows: This language is recognized by a weakly-ambiguous Parikh automaton, which guesses the correct $j$, and then verifies that $n_{j+1} \neq 2n_j$. Let $\overline{D} = ab(a^*b)^* \setminus D$, and suppose by contradiction that $D$ can be recognized by a weakly-unambiguous PA. Then its generating series should be holonomic by Proposition 12. Since the generating series of ), it should be holonomic too. Looking closely at the form of the words of $\overline{D}$, we get that its generating series is It is not holonomic as $x\overline{D}(x, 1) + x$ has infinitely many singularities, see [15, p. 296-297].
Our second example is Shamir's language $S = \{a^n b v_1 a^n v_2 : n \geq 1,\ v_1, v_2 \in \{a, b\}^*\}$. One can easily design a PA recognizing $S$, where one coordinate stands for the length of the first run of a's and the other one for the second run of a's, the automaton guessing when the second run starts. Flajolet proved that $S$ is inherently ambiguous as a context-free language, since its generating series $S(z) = \frac{z(1-z)}{1-2z} \sum_{n\geq 1} \frac{z^{2n}}{1-2z+z^{n+1}}$ has an infinite number of singularities [15, p. 296-297]. This also yields its inherent weak-ambiguity as a PA language.
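The first coefficients of the series of $S$ can be cross-checked by brute force. The sketch below assumes the expression $S(z) = \frac{z(1-z)}{1-2z} \sum_{n\geq 1} \frac{z^{2n}}{1-2z+z^{n+1}}$ and uses the observation that a word belongs to $S$ exactly when it starts with $a^n b$ for some $n \geq 1$ and the factor $a^n$ occurs after that $b$. The truncated power series helpers (`mul`, `inverse`) are illustrative and written from scratch.

```python
from itertools import product


def in_shamir(word):
    """Membership in S = { a^n b v1 a^n v2 : n >= 1 }."""
    n = 0
    while n < len(word) and word[n] == "a":
        n += 1
    # The leading block of a's must be non-empty and followed by a b ...
    if n == 0 or n == len(word):
        return False
    # ... and a^n must occur as a factor after that b.
    return "a" * n in word[n + 1:]


def brute_force_counts(order):
    return [
        sum(in_shamir("".join(w)) for w in product("ab", repeat=m))
        for m in range(order + 1)
    ]


def mul(f, g, order):
    """Product of two truncated power series given as coefficient lists."""
    h = [0] * (order + 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            if i + j <= order:
                h[i + j] += fi * gj
    return h


def inverse(f, order):
    """Inverse of a power series with constant term 1, truncated at `order`."""
    inv = [0] * (order + 1)
    inv[0] = 1
    for m in range(1, order + 1):
        inv[m] = -sum(f[k] * inv[m - k] for k in range(1, min(m, len(f) - 1) + 1))
    return inv


def shamir_series(order):
    """Coefficients of z(1-z)/(1-2z) * sum_{n>=1} z^(2n) / (1 - 2z + z^(n+1))."""
    total = [0] * (order + 1)
    n = 1
    while 2 * n + 1 <= order:
        denom = [0] * (n + 2)
        denom[0], denom[1] = 1, -2
        denom[n + 1] += 1  # 1 - 2z + z^(n+1)
        term = inverse(denom, order)
        for i in range(order + 1 - 2 * n):
            total[i + 2 * n] += term[i]
        n += 1
    prefactor = mul([0, 1, -1], inverse([1, -2], order), order)
    return mul(prefactor, total, order)


print(shamir_series(10))
```

Agreement on an initial segment is only a sanity check of the formula; the inherent ambiguity argument relies on the singularities of the full series.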

Limit of the method: an example using pumping techniques
As already mentioned, the analytic method presented is not always sufficient to prove inherent ambiguity. In this section, we develop an example where it does not apply. We consider the language $L_{even}$ of sequences of encoded numbers (where $\underline{n} = a^n b$ as in Section 4.1) having two consecutive equal values, the first one being at an odd position. This language is accepted by a non-deterministic PA but is also deterministic context-free. This means that its generating series is algebraic and hence holonomic. This puts it out of the reach of our analytic method.
In this section we establish the following result:

Theorem 20. The language $L_{even}$ is inherently weakly-ambiguous as a PA language.
The remainder of this section is devoted to sketching the proof of this theorem. By contradiction, we suppose that $L_{even}$ is recognized by a weakly-unambiguous PA $A$.
An a-piece $\omega$ of $A$ is a non-empty simple path of a-edges in $A$, starting and ending at the same state: the states of the path are pairwise distinct, except for its extremities. The origin of $\omega$ is its starting (and ending) state. Let $\Pi(A)$ be the (finite) set of a-pieces in $A$.
We see a run in A as a sequence of transitions forming a path in A. An a-subpath of a run R in A is a maximal consecutive subsequence of R whose transitions are all labeled by a's, that is, it cannot be extended further to the left nor to the right in R using a's.
Let $R$ be an accepting run in $A$. One can show that every a-subpath $S$ of $R$ can be decomposed as $S = w_1 \sigma_1^{s_1} w_2 \sigma_2^{s_2} \cdots w_f \sigma_f^{s_f} w_{f+1}$, where the $\sigma_i$'s are a-pieces of $\Pi(A)$, the $s_i$'s are positive integers, and the $w_i$'s are paths not using twice the same state. Moreover, this decomposition is unique if we add the condition that if $w_i = \varepsilon$, then $\sigma_i \neq \sigma_{i-1}$ and the only state in common in $w_i$ and $\sigma_i$ is the origin of $\sigma_i$. This is done by repeatedly following the path until a state $q$ is met twice, factorizing this segment of the form $w\sigma$, where $\sigma$ is an a-piece of origin $q$. We call this decomposition the canonical form of $S$, and the signature of $S$ is the tuple $(w_1, \sigma_1, w_2, \dots, w_f, \sigma_f, w_{f+1})$, i.e., we dropped the $s_i$'s of the canonical form. From the weak-unambiguity of $A$ we can prove that there are at most $c$ distinct possible signatures, where $c$ only depends on $A$, and that $f$ is always at most $|Q_A|$.
Ramsey's Theorem [32] guarantees that there exists an integer $r$ such that any complete undirected graph with at least $r$ vertices, whose edges are colored using $c^2$ different colors, admits a monochromatic triangle. We fix two positive integers $n$ and $k$ sufficiently large, which will be chosen later on, depending on $A$ only. For $\ell \in \{1, \dots, r\}$, let $w_\ell$ be the word $w_\ell = \underline{n_1}\, \underline{n_2} \cdots \underline{n_{2r}}$, where $n_i = n$ for odd $i$, $n_{2i} = n + k$ if $i \neq \ell$, and $n_{2\ell} = n$. Each $w_\ell$ is in $L_{even}$, with a match at position $2\ell$ only. By weak-unambiguity, each $w_\ell$ has a unique accepting run $R_\ell$ in $A$, each such run having $2r$ a-subpaths by construction. For $i \neq j$ in $\{1, \dots, r\}$, let $\lambda_{ij}$ be the signature of the $2j$-th a-subpath of $R_i$, which is a path of length $n + k$ by definition. The complete undirected graph of vertex set $\{1, \dots, r\}$ where each edge $ij$, with $i < j$, is colored by the pair $(\lambda_{ij}, \lambda_{ji})$ admits a monochromatic triangle of vertices $\alpha < \beta < \gamma$. In particular, $\lambda_{\alpha\beta} = \lambda_{\alpha\gamma}$ and $\lambda_{\gamma\alpha} = \lambda_{\gamma\beta}$.
We choose $k = \mathrm{lcm}(\{|\sigma| : \sigma \in \Pi(A)\})$, and n sufficiently large so that any a-subpath of an accepting run contains an a-piece $\sigma$ repeated at least k + 1 times. This is possible as the $w_i$'s have bounded length, and there are at most $|Q_A| + 1$ of them. Hence, the $2\gamma$-th a-subpath of $w_\alpha$ contains an a-piece $\sigma$ that is repeated more than s times, where $s = k/|\sigma|$. As $\lambda_{\alpha\beta} = \lambda_{\alpha\gamma}$, the piece $\sigma$ is also in the $2\beta$-th a-subpath of $w_\alpha$. If we alter the accepting path $R_\alpha$ into $R'_\alpha$ by looping s more times in $\sigma$ in the a-subpath at position $2\beta$ and s fewer times in $\sigma$ at position $2\gamma$, we obtain a run for the word $w$ that coincides with $w_\alpha$ except that $n_{2\beta} = n + 2k$ and $n_{2\gamma} = n$. This run is accepting as the PA computes the same vector as for $w_\alpha$, by commutativity of vector addition. Moreover, the signatures remain unchanged, as there are sufficiently many repetitions of $\sigma$ at position $2\gamma$ in $w_\alpha$. Similarly, as $\lambda_{\gamma\alpha} = \lambda_{\gamma\beta}$, we can alter the accepting path $R_\gamma$ into an accepting path $R'_\gamma$ with the same signatures as $R_\gamma$ for the same word $w$, by removing $s' = k/|\sigma'|$ iterations of an a-piece $\sigma'$ at position $2\alpha$ and adding them at position $2\beta$.
We have built two paths $R'_\alpha$ and $R'_\gamma$ that both accept the same word $w$. Therefore, as A is weakly-unambiguous, they are equal. As the signatures have not changed, this implies that the signature at position $2\alpha$ in $R_\alpha$ is $\lambda_{\gamma\alpha}$, which is equal to $\lambda_{\gamma\beta}$ (monochromaticity), which is equal to $\lambda_{\alpha\beta}$ ($R'_\alpha$ and $R'_\gamma$ have the same signatures). This is a contradiction, as we could remove one a-piece at position $2\alpha$ in $w_\alpha$ and add it at position $2\beta$, while computing the same vector with the same starting and ending states: but the resulting word is not in $L_{\mathrm{even}}$.
Remark 21. The proof relies on manipulations of paths in the automaton, and we only use the commutativity of addition for the vector part. Thus, it still holds if we consider automata where we use a recursively enumerable set instead of a semilinear set for acceptance.

Algorithmic consequence of holonomicity
Generating series of languages have already been used to obtain efficient algorithms on unambiguous models of automata. For instance, they were used by Stearns and Hunt as a basic tool to obtain bounds on the length of a word witnessing the non-inclusion between two unambiguous word automata [35]. More precisely, the proof in [35] relies on the recurrence equation satisfied by the coefficients of the generating series (which is guaranteed to exist by holonomicity in one variable). In the rational case, this recurrence relation can be derived from the automaton and does not require advanced results on holonomic series.

In this section, our aim is to obtain a similar bound for the inclusion problem for weakly-unambiguous Parikh automata. The inclusion problem for RCM (and hence for weakly-unambiguous PA) is shown to be decidable in [8] but no complexity bound is provided. Note that this problem is known to be undecidable for non-deterministic PA [19]. We follow the same approach as for the rational case [35] and for RCM [8]. In stark contrast with the rational case, it is necessary to closely inspect holonomic closure properties in order to give concrete bounds.

Fix A and B two weakly-unambiguous PA. We can construct a weakly-unambiguous PA C accepting L(A) ∩ L(B). We rely on the key fact that the series D(x) = A(x) − C(x) counts the number of words of length n in L(A) \ L(B). In particular, L(A) ⊆ L(B) if and only if D(x) = 0.
As D(x) is the difference of two holonomic series, it is holonomic. As equality between holonomic series is decidable, [8] concludes that the problem is decidable. But without further analysis, no complexity upper-bound can be derived. The coefficients of $D(x) = \sum_{n \ge 0} d_n x^n$ satisfy a recurrence equation of the form $p_0(n)\, d_n = \sum_{k=1}^{r} p_k(n)\, d_{n-k}$, where the $p_k$ are polynomials. This equation fully determines $d_n$ in terms of its r previous values $d_{n-1}, \ldots, d_{n-r}$, provided that $p_0(n) \neq 0$. In particular, if the r previous values are all equal to 0, then $d_n = 0$. Consequently, if $d_n$ is equal to 0 for all $n \le r + R$, where R denotes the largest positive root of $p_0$ (which is bounded from above by its ∞-norm, as a polynomial on Z), then D(x) = 0 (see footnote 15).
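To see why the roots of $p_0$ enter the bound, here is a small worked instance. It is an illustration chosen for this purpose and not one of the series built in this section: we take $D(x) = x^{1000}$ and extract the coefficient of $x^n$ from a differential equation it satisfies.

```latex
\begin{align*}
  D(x) = x^{1000}
    &\;\Longrightarrow\; 1000\,D(x) - x\,\partial_x D(x) = 0, \\
  [x^n]\bigl(1000\,D(x) - x\,\partial_x D(x)\bigr) = 0
    &\;\Longrightarrow\; (1000 - n)\,d_n = 0 \quad (n \ge 0).
\end{align*}
% Here the recurrence has order r = 0 and p_0(n) = 1000 - n,
% whose largest positive root is R = 1000.
% The recurrence forces d_n = 0 only when n \neq 1000, so one must inspect
% d_n for every n \le r + R = 1000 before concluding that D(x) = 0.
```

In this instance the order alone would suggest that no coefficient needs to be inspected, whereas the only non-zero coefficient sits exactly at the root of $p_0$.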
Taking W := r + R, we have that if L(A) ⊈ L(B) then there exists a word witnessing this non-inclusion of length at most W. We now aim at computing an upper-bound on W in terms of the size of the inputs A and B.
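Once some bound W is available, the test reduces to comparing the numbers of accepted words of each length up to W. The following is a minimal sketch of that counting step, under assumptions of our own: a PA is encoded as a hypothetical tuple `(transitions, init, finals, accept, dim)`, the semilinear acceptance set is modeled by an arbitrary Python predicate `accept`, and the names `count_runs`, `product` and `first_witness_length` are illustrative. It counts accepting runs layer by layer, which coincides with the number of accepted words only when the automaton is weakly-unambiguous.

```python
from collections import Counter

def count_runs(pa, n_max):
    """Return [c_0, ..., c_n_max], where c_n is the number of accepting runs
    on words of length n (equal to the number of accepted words when the
    automaton is weakly-unambiguous)."""
    transitions, init, finals, accept, dim = pa
    # layer maps a configuration (state, vector) to its number of runs
    layer = Counter({(init, (0,) * dim): 1})
    counts = []
    for _ in range(n_max + 1):
        counts.append(sum(m for (q, v), m in layer.items()
                          if q in finals and accept(v)))
        nxt = Counter()
        for (q, v), m in layer.items():
            for (p, _letter, delta, q2) in transitions:
                if p == q:
                    nxt[(q2, tuple(x + y for x, y in zip(v, delta)))] += m
        layer = nxt
    return counts

def product(pa1, pa2):
    """Synchronized product: runs both automata on the same letters and
    concatenates their vectors, so that it accepts the intersection."""
    t1, i1, f1, acc1, d1 = pa1
    t2, i2, f2, acc2, d2 = pa2
    trans = [((p1, p2), a, v1 + v2, (q1, q2))
             for (p1, a, v1, q1) in t1
             for (p2, b, v2, q2) in t2 if a == b]
    finals = {(x, y) for x in f1 for y in f2}
    return (trans, (i1, i2), finals,
            lambda v: acc1(v[:d1]) and acc2(v[d1:]), d1 + d2)

def first_witness_length(pa_a, pa_b, bound):
    """Smallest n <= bound with d_n = a_n - c_n != 0, or None if there is none."""
    a = count_runs(pa_a, bound)
    c = count_runs(product(pa_a, pa_b), bound)
    for n, (a_n, c_n) in enumerate(zip(a, c)):
        if a_n != c_n:
            return n
    return None

# Two deterministic (hence weakly-unambiguous) PA over a*b*, counting a's and b's.
AB = [(0, "a", (1, 0), 0), (0, "b", (0, 1), 1), (1, "b", (0, 1), 1)]
EQ = (AB, 0, {0, 1}, lambda v: v[0] == v[1], 2)   # a^n b^n
LE = (AB, 0, {0, 1}, lambda v: v[0] <= v[1], 2)   # a^n b^m with n <= m
```

With these toy automata, `first_witness_length(EQ, LE, 8)` finds no difference up to length 8, while `first_witness_length(LE, EQ, 8)` detects the word `b`. The bound 8 is arbitrary here; a conclusive answer requires a bound W as discussed above.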
For this, we first bound the order of the linear recurrence satisfied by A(x) and C(x), as well as the degrees and norm of the polynomials involved.This is stated in Proposition 22, whose proof follows the one of Proposition 12, while establishing such bounds for the multivariate Hadamard product and the specialization to 1.
Finally, we transfer these bounds to the series D(x) of L(A) \ L(B) using the analysis of [22] for the sum of holonomic series in one variable. Using the bound of Theorem 23, the inclusion problem can be solved in triply exponential time by a naive counting of all words up to the bound. Using dynamic programming to compute the number of accepted words, we can decide inclusion in doubly exponential time.

Remark 25. In [8], the authors propose a different construction to prove the holonomicity of the generating series of languages in RCM. This proof uses the closure of holonomic series under Hadamard product and algebraic substitution $x_1 = x_2 = \cdots = x_n = x$. It is natural to wonder if this approach would lead to better bounds in Proposition 22 (using the equivalence between weakly-unambiguous PA and RCM). It turns out that the operation $x_1 = x_2 = \cdots = x_n = x$ is more complicated than it seems at first glance. Indeed, to our knowledge, no proof of the closure under algebraic substitution explains what happens if, during the substitution process, the equations become trivial. This issue can be overcome by doing the substitution step by step: $x_2 = x_1$, then $x_3 = x_1$, etc. However, this naive approach would produce worse bounds.

Footnote 15. The proof of Theorem 7 in [8] wrongly suggests that we can take the order of the differential equation for D as a bound on the length of a witness for non-inclusion. In general, this is not the case. For instance, consider $D(x) = x^{1000}$, which satisfies the first-order differential equation $1000\,D(x) - x\,\partial_x D(x) = 0$. It is clear that the coefficients $D_0 = 0$ and $D_1 = 0$ are not enough to decide that D is not zero.

Perspectives
The bounds obtained in Section 5 are derived directly from constructions given in the proofs of the closure properties. In particular, we did not use any information on the special form of our series. These bounds can certainly be improved using more advanced tools from computer algebra. Also, it seems that the complexity of the closure under algebraic substitution deserves more investigation, as discussed in Remark 25.
A more ambitious perspective is to find larger classes of automata whose generating series are holonomic. This would certainly require new ideas, as for instance any holonomic power series with coefficients in {0, 1} is known to be the characteristic series of some semilinear set [1].
and the symmetric ones for the other variables $x_b$ and $x_c$.

$\frac{1}{1 - x_a x_b x_c}$ is the support series of the subset $\{(n, n, n) : n \in \mathbb{N}\}$, and since $\frac{1}{1 - (x_a + x_b + x_c)}$ is the multivariate series of all the words on $\{a, b, c\}$, we have
If we fix an alphabet $\Gamma = \{a_1, \ldots, a_d\}$ with the ordering $a_1 < \cdots < a_d$ on the letters, we can associate with every semilinear set C of dimension d the language $[C] = \{w \in \Gamma^* : (|w|_{a_1}, \ldots, |w|_{a_d}) \in C\}$ of words whose numbers of occurrences of each letter satisfy the constraint expressed by C. For instance, if we take the semilinear set $C_0 = \{(n, m, n, m) : n, m \ge 0\}$ and the alphabet $\{a, b, c, d\}$ ordered by $a < b < c < d$, $[C_0]$ consists of all words having as many a's as c's and as many b's as d's.
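The definition of $[C]$ can be made concrete with a short membership test. This is only a sketch under our own conventions: the constraint set is passed as a Python predicate rather than as a semilinear set, and the names `parikh_vector`, `in_bracket` and `c0` are illustrative.

```python
def parikh_vector(word, alphabet):
    """Numbers of occurrences of each letter, in the order given by alphabet."""
    return tuple(word.count(letter) for letter in alphabet)

def in_bracket(word, alphabet, constraint):
    """Membership in [C]: the word is over the alphabet and its vector of
    letter counts satisfies the constraint (a predicate standing in for C)."""
    return set(word) <= set(alphabet) and constraint(parikh_vector(word, alphabet))

def c0(v):
    """Predicate for C_0 = {(n, m, n, m) : n, m >= 0}."""
    n, m, n2, m2 = v
    return n == n2 and m == m2
```

For instance, `in_bracket("cadb", "abcd", c0)` holds, since the order of the letters in the word is irrelevant, while `in_bracket("aabc", "abcd", c0)` does not.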
$C \subseteq \mathbb{N}^d$ and a length preserving morphism $\mu : \Gamma^* \to \Sigma^*$, that is injective over $R \cap [C]$, so that $L = \mu(R \cap [C])$. For example, $L_{abab} = \{a^n b^m a^n b^m : n, m \in \mathbb{N}\}$ can be shown to be in RCM by taking $\Gamma = \{a, b, c, d\}$, $\Sigma = \{a, b\}$, $\mu(a) = \mu(c) = a$, $\mu(b) = \mu(d) = b$, $R = a^* b^* c^* d^*$ and the semilinear set $C_0$ defined in the previous paragraph.

Theorem 14. L ∈ RCM iff L is recognized by a weakly-unambiguous Parikh automaton.
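The $L_{abab}$ example can be checked by brute force on short words. The sketch below is an illustration only: it enumerates the words of $R \cap [C_0]$ up to a given length, applies $\mu$, and lets one verify on this finite sample that $\mu$ is injective and that the images are exactly the words $a^n b^m a^n b^m$. The helper names are ours.

```python
import re
from itertools import product

R = re.compile(r"a*b*c*d*")            # the regular language R
MU = str.maketrans("abcd", "abab")     # mu(a) = mu(c) = a, mu(b) = mu(d) = b

def in_R_cap_C0(w):
    """Membership in R ∩ [C_0]: shape a*b*c*d*, with #a = #c and #b = #d."""
    return (R.fullmatch(w) is not None
            and w.count("a") == w.count("c")
            and w.count("b") == w.count("d"))

def preimages_and_images(max_len):
    """Words of R ∩ [C_0] of length at most max_len, and their images by mu."""
    pre = [w for n in range(max_len + 1)
           for w in map("".join, product("abcd", repeat=n))
           if in_R_cap_C0(w)]
    return pre, [w.translate(MU) for w in pre]

def l_abab(max_len):
    """Words a^n b^m a^n b^m of length at most max_len."""
    return {"a" * n + "b" * m + "a" * n + "b" * m
            for n in range(max_len + 1) for m in range(max_len + 1)
            if 2 * (n + m) <= max_len}
```

Comparing `set(images)` with `l_abab(max_len)` and checking that `images` has no repetition confirms the claim up to the chosen length; it is of course not a proof for all lengths.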

Theorem 23. Given two weakly-unambiguous PA A and B of respective dimensions $d_A$ and $d_B$, if L(A) is not included in L(B) then there exists a word in L(A) \ L(B) of length at most $2^{2^{O(d^2 \log(dM))}}$, where $d = d_A + d_B$ and $M = |A|\,|B|\,\|A\|_\infty \|B\|_\infty$.

Corollary 24. Given two weakly-unambiguous PA A and B of dimensions $d_A$ and $d_B$, we can decide if L(A) is included in L(B) in time $2^{2^{O(d^2 \log(dM))}}$, where $d = d_A + d_B$ and $M = |A|\,|B|\,\|A\|_\infty \|B\|_\infty$.