A Note on Fixed- and Discrete-Time Estimation via the DREM Method

A simple fixed-time converging estimation algorithm is presented for a linear regression using the dynamic regressor extension and mixing method within a discrete-time setting, with a persistently excited regressor and bounded measurement noises. The solution is based on Kreisselmeier's filters, and it is computationally simpler than the existing analogs.


I. INTRODUCTION
In this note, we consider the parameter estimation problem for a discrete-time linear regression equation (LRE)

$$y_k = \varphi_k^\top \theta + v_k, \quad k \in \mathbb{N}, \tag{1}$$

where $y_k, v_k \in \mathbb{R}$ are the output and the bounded measurement noise, and $\varphi_k, \theta = (\theta_1 \dots \theta_n)^\top \in \mathbb{R}^n$ are the bounded regressor and the unknown parameter vectors. The goal is to estimate $\theta$ given the online measurements $y_k$ and $\varphi_k$.
The parameter estimation problem for dynamic or static systems is an essential issue in many scientific disciplines, and many mature approaches to solving it are available [1], [2]. Besides various batch methods based on post-processing of collected measurements, two popular iterative parameter estimation methods are the recursive least-squares and gradient-descent estimators [3]. A fundamental result in parameter estimation states that if the regressor $\varphi$ is persistently exciting (PE), that is, exciting uniformly in time (see Section III for a formal definition of this property), then the recursive least-squares and gradient-descent estimators converge exponentially, are input-to-state stable (ISS), and filter the noise $v_k$.
Numerous recent studies address possible relaxation of the PE requirement in both continuous- and discrete-time settings. These results include composite [4] and concurrent [5] learning, directional forgetting [6], fixed-time estimation [7], and the recent regularization-based results [8], [9]. However, it is widely understood that if the regressor $\varphi$ is not exciting uniformly in time, but only on a particular interval (interval excitation), then the additive noise cannot be efficiently filtered, and the PE-relaxation methods typically consider the noise-free scenario, $v \equiv 0$. In this note, due to the presence of the noise $v_k$ in (1), we focus on excitation that is uniform in time, i.e., on the case when the regressor $\varphi$ is PE.
Standard iterative estimators also have some known shortcomings. Among them, it is necessary to mention the difficulty of accelerating the convergence process (this property is mainly determined by the level of excitation in the regressor) and the non-monotonicity of the convergence [10]-[12]. The latter means that despite a decreasing norm of the parameter estimation error vector, the errors of the individual parameters may exhibit complex oscillatory behavior before settling in a vicinity of the true value, which may postpone real-time use of the obtained estimates. That is why the solution to these drawbacks recently proposed by the dynamical regressor extension and mixing (DREM) method [13] has quickly gained popularity [14]-[17].
The DREM procedure consists of two steps: extension and mixing. In the first step, the model (1) is extended to get a new linear regression with a square $n \times n$ regressor matrix. Then, at the mixing step, the single extended LRE for a vector of unknown parameters is transformed into a set of $n$ scalar LREs, one for each element of the vector $\theta$. Finally, an iterative estimator, e.g., the gradient-descent one, is applied to each of the new scalar LREs separately, yielding element-wise improved transients with simple and transparent estimator tuning rules.
However, in this note, we argue that the use of iterative estimators in the DREM procedure is more a tradition than a need. We show that a proper choice of an LRE extension at the first step of the DREM procedure ensures that the novel regressor is strictly positive, allowing for point-wise (algebraic) estimation. A point-wise estimator can be of limited practical interest due to its noise sensitivity, even if it provides fixed-time convergence (i.e., with a convergence time independent of initial conditions). To this end, we study the worst-case bound on the parameter estimation error and propose tuning guidelines minimizing this bound.
To summarize, the contribution of this paper is as follows:
• we show that a particular choice of the LRE extension ensures strict positiveness of the novel regressor;
• we analyze noise propagation for the point-wise (algebraic) estimator and derive the worst-case bound on the estimation error.
Finally, we address the argument that standard iterative estimators provide a good trade-off between the transient time and the steady-state noise attenuation (filtering). We show that we obtain faster transients with a comparable steady-state performance by putting the point-wise estimator in a serial connection with a (non-)linear filter. Together, these contributions support our claim that under the PE condition, the DREM procedure can function as a fixed-time estimator without being followed by gradient-type algorithms.
The rest of the paper is organized as follows. In Section II we briefly describe the DREM procedure. In Section III we discuss how the initial excitation can be preserved by the DREM procedure. In Section IV we present the main result of the note, and Section V contains simulations illustrating our results. Finally, Section VI contains some concluding remarks.

Notation
• The sets of nonnegative real and integer numbers are denoted by $\mathbb{R}_+$ and $\mathbb{N}$, respectively. Also, $\mathbb{N}^* := \mathbb{N} \setminus \{0\}$.
• The set of real $n \times m$ matrices is denoted by $\mathbb{R}^{n \times m}$. The $n \times n$ identity matrix is denoted by $I_n$.
• For a vector $x \in \mathbb{R}^n$, $\|x\|$ denotes its Euclidean norm.
• The rounding function to the greatest integer not exceeding $s \in \mathbb{R}$ is denoted by $\lfloor s \rfloor = \mathrm{floor}(s)$.

II. DYNAMICAL REGRESSOR EXTENSION AND MIXING
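The original DREM procedure in [13] consists in finding $n$ stable causal filters $H_i(z)$, where $i \in \{1, \dots, n\}$ and $z$ is the time shift operator (i.e., $z^h y_k = y_{k-h}$ for any $k, h \in \mathbb{N}$, $k \ge h$), whose auxiliary role is also to filter the noise. For $i \in \{1, \dots, n\}$, denote

$$Y_{k,i} := H_i(z)\, y_k, \quad \Phi_{k,i} := H_i(z)\, \varphi_k^\top, \quad V_{k,i} := H_i(z)\, v_k. \tag{2}$$

That leads to a new extended regression

$$Y_k = \Phi_k \theta + V_k + \epsilon_k, \tag{3}$$

where $Y_k, V_k \in \mathbb{R}^n$ and $\Phi_k \in \mathbb{R}^{n \times n}$ stack the signals in (2), and $\epsilon_k \in \mathbb{R}^n$ is an exponentially decaying term coming from the initialization of the filters. Finally, for

$$\mathcal{Y}_k := \mathrm{adj}(\Phi_k)\, Y_k, \quad \phi_k := \det(\Phi_k), \quad \mathcal{V}_k := \mathrm{adj}(\Phi_k)(V_k + \epsilon_k), \tag{4}$$

this yields the element-wise scalar linear regression:

$$\mathcal{Y}_{k,i} = \phi_k \theta_i + \mathcal{V}_{k,i}, \quad i \in \{1, \dots, n\}, \tag{5}$$

where $\mathcal{Y}_{k,i} \in \mathbb{R}$ and $\phi_k \in \mathbb{R}$ are known signals, $\theta_i \in \mathbb{R}$ is the unknown constant parameter to be estimated, and $\mathcal{V}_{k,i} \in \mathbb{R}$ is an unknown bounded measurement distortion. Note that the estimation of each $\theta_i$ is now explicitly independent of the other components of $\theta$, and the interconnection is hidden in the signals defined in (4). The quantity $\mathcal{Y}_k - \phi_k \hat\theta_k = \phi_k \tilde\theta_k + \mathcal{V}_k$ serves as the measured regression error, where $\tilde\theta_k := \theta - \hat\theta_k$ is the parameter estimation error. It remains now to design an algorithm to compute $\hat\theta_k$, which is a simpler problem than (1), since in (5) the estimation of each $\theta_i$ can be treated separately. Moreover, in scalar linear regression, the convergence of $\hat\theta_k$ is element-wise monotone and can be accelerated by tuning the estimation procedures [13].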
One of the main issues of the DREM methodology is the excitation of the common scalar regressor $\phi_k$ and its relation to the excitation of $\varphi_k$ in the original problem statement, which is determined by the choice of the filters $H_i$ in (2). Let us present a solution to this problem.

III. KREISSELMEIER'S REGRESSOR EXTENSION
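Let us show how the DREM procedure can keep the excitation level of the original regression for (5). To this end, let us characterize the admissible excitation levels [10]: a bounded signal $\varphi : \mathbb{N} \to \mathbb{R}^n$ is said to be $(\ell, \mu)$-persistently excited (PE) if there exist $\ell \in \mathbb{N}^*$ and $\mu > 0$ such that

$$\sum_{j=k}^{k+\ell-1} \varphi_j \varphi_j^\top \ge \mu I_n \quad \text{for all } k \in \mathbb{N}.$$

It is said to be intervally excited (IE) if the above inequality is satisfied for $k = 0$ only.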
It is worth noting that in the noise-free case, interval excitation can be related to the conventional identifiability/observability condition for $\theta$ in (1).
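To keep the excitation, we proceed as follows. Instead of applying filters $H_i$ directly to (1), consider the auxiliary regression problem

$$\varphi_k y_k = \varphi_k \varphi_k^\top \theta + \varphi_k v_k$$

and the filter $H(z) = \frac{1-\lambda}{1-\lambda z}$, where $\lambda \in (0, 1)$. Then for $Y_k := H(z)\, \varphi_k y_k$, $\Phi_k := H(z)\, \varphi_k \varphi_k^\top$, and $V_k := H(z)\, \varphi_k v_k$, we obtain the extended LRE (3). The state-space implementation of this extension is known as Kreisselmeier's dynamic regressor extension and is given by

$$\Phi_k = \lambda \Phi_{k-1} + (1-\lambda)\, \varphi_k \varphi_k^\top, \quad Y_k = \lambda Y_{k-1} + (1-\lambda)\, \varphi_k y_k. \tag{6}$$

After the mixing step, it yields the decoupled linear regression equation (5). Specifically, the choice $\Phi_0 = 0$ and $Y_0 = 0$ guarantees that $\epsilon_k \equiv 0$ in (3) for all $k \in \mathbb{N}$.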
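To make the extension and mixing steps concrete, the following minimal Python sketch runs a first-order filter of the form $\Phi_k = \lambda \Phi_{k-1} + (1-\lambda)\varphi_k\varphi_k^\top$, $Y_k = \lambda Y_{k-1} + (1-\lambda)\varphi_k y_k$ with zero initial conditions and then mixes with the adjugate matrix. The filter form, the function name `kreisselmeier_drem`, the two-parameter regressor, and all numerical values are illustrative assumptions, not taken from the note.

```python
def kreisselmeier_drem(phis, ys, lam):
    """Extension (filtering of phi*phi^T and phi*y) followed by mixing, for n = 2."""
    Phi = [[0.0, 0.0], [0.0, 0.0]]  # zero initial condition for the matrix state
    Y = [0.0, 0.0]                  # zero initial condition for the vector state
    results = []
    for phi, y in zip(phis, ys):
        # Extension step: exponentially weighted accumulation.
        for i in range(2):
            Y[i] = lam * Y[i] + (1.0 - lam) * phi[i] * y
            for j in range(2):
                Phi[i][j] = lam * Phi[i][j] + (1.0 - lam) * phi[i] * phi[j]
        # Mixing step: multiply by the adjugate, so the new regressor is det(Phi).
        det = Phi[0][0] * Phi[1][1] - Phi[0][1] * Phi[1][0]
        adj = [[Phi[1][1], -Phi[0][1]], [-Phi[1][0], Phi[0][0]]]
        mixed = [adj[i][0] * Y[0] + adj[i][1] * Y[1] for i in range(2)]
        results.append((det, mixed))
    return results


theta = (2.0, -1.0)                                     # hypothetical true parameters
phis = [(1.0, (-1.0) ** k) for k in range(1, 7)]        # exciting over any 2 samples
ys = [p[0] * theta[0] + p[1] * theta[1] for p in phis]  # noise-free outputs

results = kreisselmeier_drem(phis, ys, lam=0.5)
det, mixed = results[-1]
estimate = [m / det for m in mixed]  # point-wise (algebraic) estimate
```

In this noise-free toy case the determinant is zero after the first sample and positive from the second sample on, at which point the quotient recovers the assumed parameters.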
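The next lemma shows that the proposed extension keeps the excitation properties of the original regressor $\varphi$.

Lemma 1. Let $\varphi$ in (1) be $(\ell, \mu)$-PE. Then the regressor $\phi_k$ obtained from (4), (6) satisfies $\phi_k \ge \alpha := \left(\mu (1-\lambda) \lambda^{\ell-1}\right)^n$ for all $k \ge \ell$.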
Remark 1. If $\varphi$ in (1) is IE, then $\phi$ resulting from (4), (6) cannot be identically zero, and if $\varphi$ is PE, then $\phi$ is strictly positive for all $k \ge \ell$. Thus we further consider only those steps $k \in \mathbb{N}$ when $\phi_k \neq 0$. Indeed, otherwise the linear regression (5) reads $\mathcal{Y}_k = \mathcal{V}_k$ and there is no reason to update the estimate for such samples, i.e., $\hat\theta_k = \hat\theta_{k-1}$ for $\phi_k = 0$ (recall that the signal $\phi$ is known).

IV. NOISE PROPAGATION
In the noise-free case, after applying the DREM method and obtaining the decoupled linear regression (5), the estimation problem has a trivial point-wise solution

$$\hat\theta_{k,i} = \frac{\mathcal{Y}_{k,i}}{\phi_k}, \quad i \in \{1, \dots, n\}, \tag{8}$$

which can be unconditionally applied as soon as $\phi_k \neq 0$ for some $k \in \mathbb{N}$. Such a condition is always verified with an IE $\varphi$. In the presence of noise, the noise propagation can be studied using the established properties of Kreisselmeier's filters (6), and the following simple result, which constitutes the main outcome of this note, can be obtained:

Theorem 1. Let the regressor signal $\varphi$ in (1) be $(\ell, \mu)$-PE and let Kreisselmeier's dynamic extension filters (6) be used in (2). Then the estimate $\hat\theta_k$ can be computed by (4), (8) for all $k \ge \ell$, and

$$|\theta_i - \hat\theta_{k,i}| \le \alpha^{-1} |\mathcal{V}_{k,i}|, \quad i \in \{1, \dots, n\}.$$

Proof: According to Lemma 1, for all $k \ge \ell$, $\phi_k \ge \alpha = \left(\mu (1-\lambda) \lambda^{\ell-1}\right)^n$, hence the division in (8) is well defined. By definition,

$$\hat\theta_{k,i} = \frac{\mathcal{Y}_{k,i}}{\phi_k} = \theta_i + \frac{\mathcal{V}_{k,i}}{\phi_k},$$

giving the required estimates by the properties of $\phi_k$.

Thus, the DREM-based algorithm (6), (4), (8) guarantees a fixed-time convergence of the estimate $\hat\theta_k$ to the true value $\theta$ (the convergence time $\ell \ge n$ is independent of initial conditions), while the estimation error is proportional to the amplitude of the noise $v$.
Remark 2. According to the estimates given in Theorem 1, the choice $\lambda^* = \frac{\ell-1}{\ell}$ minimizes the parameter estimation error gain $\alpha^{-1}$ with respect to the noise. Increasing the value of $\lambda$ can improve the asymptotic precision of the proposed algorithm.
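The optimal value in Remark 2 can be checked with a short calculation, sketched here under the assumption that the lower bound has the form $\alpha = \left(\mu(1-\lambda)\lambda^{\ell-1}\right)^n$ with $\ell \ge 2$; the auxiliary function $g$ is introduced only for this derivation. Since $\alpha$ is increasing in $g(\lambda) := (1-\lambda)\lambda^{\ell-1}$, minimizing $\alpha^{-1}$ amounts to maximizing $g$ on $(0, 1)$:

$$g'(\lambda) = (\ell-1)(1-\lambda)\lambda^{\ell-2} - \lambda^{\ell-1} = \lambda^{\ell-2}\left[(\ell-1) - \ell\lambda\right].$$

The derivative is positive for $\lambda < \frac{\ell-1}{\ell}$ and negative for $\lambda > \frac{\ell-1}{\ell}$, so the maximum is attained at

$$\lambda^* = \frac{\ell-1}{\ell}, \qquad g(\lambda^*) = \frac{1}{\ell}\left(\frac{\ell-1}{\ell}\right)^{\ell-1}.$$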
Several final comments are in order:
1) Roughly speaking, the DREM method provides a direct measurement of the unknown parameter vector $\theta$ for $k \ge \ell$:

$$\frac{\mathcal{Y}_k}{\phi_k} = \theta + \frac{\mathcal{V}_k}{\phi_k}.$$

Note also that (4), (6), (8) implicitly realize exponentially weighted least-squares estimation with a forgetting factor $\lambda$.
2) Any additional noise filtering can be applied to $\mathcal{Y}_k$ to attenuate the influence of $\mathcal{V}_k$. In the existing literature, gradient algorithms are frequently used for (5) [7], [16], [19], [20]; they serve merely for implicit noise canceling and are probably not the best choice for such a purpose. Indeed, the conventional selection is the least-squares algorithm:

$$\hat\theta_k = \hat\theta_{k-1} + \gamma_k \phi_k \left(\mathcal{Y}_k - \phi_k \hat\theta_{k-1}\right).$$

For $\gamma_k = \phi_k^{-2}$, it is reduced to (8), while for $\gamma_k = \varrho \phi_k^{-2}$ with $\varrho \in (0, 1)$, to a low-pass filtering of (8). In general, many more options are available, such as the moving-average filter or nonlinear designs that cannot be directly applied to the linear regression (5), e.g., the median filter. See Section V for an illustrative comparison of selected methods.
3) Note that Kreisselmeier's dynamic extension (6) already filters the noise $v_k$; alternatively, other noise filters can be applied beforehand to obtain (1).
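The two reductions mentioned in comment 2) can be verified directly. The following is a sketch assuming the scalar update has the usual gradient form $\hat\theta_k = \hat\theta_{k-1} + \gamma_k \phi_k (\mathcal{Y}_k - \phi_k \hat\theta_{k-1})$ for one component of (5), with $\phi_k \neq 0$. Substituting $\gamma_k = \varrho \phi_k^{-2}$ gives

$$\hat\theta_k = \hat\theta_{k-1} + \varrho \left(\frac{\mathcal{Y}_k}{\phi_k} - \hat\theta_{k-1}\right) = (1-\varrho)\,\hat\theta_{k-1} + \varrho\, \frac{\mathcal{Y}_k}{\phi_k},$$

which is a first-order low-pass filter driven by the point-wise estimate $\mathcal{Y}_k / \phi_k$. Setting $\varrho = 1$, i.e., $\gamma_k = \phi_k^{-2}$, removes the memory term and leaves $\hat\theta_k = \mathcal{Y}_k / \phi_k$.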
V. SIMULATIONS

A. General description
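We consider the standard example where we estimate parameters of a sinusoidal signal:

$$y_k = A_0 + A \sin(\omega k + \psi) + v_k, \tag{9}$$

which can be written as (1) with

$$\varphi_k = \begin{pmatrix} 1 & \sin(\omega k) & \cos(\omega k) \end{pmatrix}^\top, \quad \theta = \begin{pmatrix} A_0 & A\cos\psi & A\sin\psi \end{pmatrix}^\top,$$

where $\omega = \frac{2\pi}{T}$, $T \in \mathbb{N}$, $T \ge 3$, is the known period, and $A_0 \in \mathbb{R}$, $A \in \mathbb{R}_+$, $\psi \in [0, 2\pi)$ are unknown constant parameters. For conciseness of the simulations, we focus below on estimating the parameter $\theta_1 = A_0$ only.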
The DREM procedure (6) with zero initial conditions yields the element-wise scalar linear regression (5), where we are interested in the equation corresponding to $i = 1$ only, namely

$$\mathcal{Y}_{k,1} = \phi_k \theta_1 + \mathcal{V}_{k,1}. \tag{10}$$

Note that the DREM procedure allows for estimating a single parameter; otherwise, the whole vector $\theta$ has to be estimated. It is straightforward to show that the regressor $\varphi$ is $(\ell, \mu)$-PE with $\ell = T$ and $\mu = \frac{T}{2}$, and Lemma 1 applies. The signal $\phi$ is thus also PE and it is strictly separated from zero for $k \ge T$. Then the direct estimation (8) is applied in the modified form (11), denoted by $\hat\theta^{\mathrm{dir}}_k$, where $\varepsilon_\phi > 0$ is a small constant introduced to ensure the feasibility of implementing the direct estimation for the initial transients of (6).
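The excitation level claimed above can be checked numerically. The sketch below assumes the regressor of a biased sinusoid has the form $(1, \sin(\omega k), \cos(\omega k))^\top$ and that PE is measured by the sum of outer products over a window of $T$ samples; the helper names `regressor` and `gram` and the start index are illustrative choices.

```python
import math

T = 117                      # period, taken equal to the window length
omega = 2.0 * math.pi / T


def regressor(k):
    # Assumed regressor for a biased sinusoid: [1, sin(omega k), cos(omega k)].
    return (1.0, math.sin(omega * k), math.cos(omega * k))


def gram(start, length):
    """Sum of phi_j phi_j^T over j = start, ..., start + length - 1."""
    G = [[0.0] * 3 for _ in range(3)]
    for j in range(start, start + length):
        phi = regressor(j)
        for a in range(3):
            for b in range(3):
                G[a][b] += phi[a] * phi[b]
    return G


G = gram(start=5, length=T)  # the start index is arbitrary
diagonal = [G[i][i] for i in range(3)]
max_off_diagonal = max(abs(G[a][b]) for a in range(3) for b in range(3) if a != b)
```

Over one full period the matrix is, up to rounding, diagonal with entries $T$, $T/2$, and $T/2$, so its smallest eigenvalue is $T/2$, in agreement with the stated value of $\mu$.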
In the sequel, we consider various filtering techniques applied to the direct estimate (11) and compare with estimators applied to the element-wise linear regression model (10).The results illustrate that fixed-time direct estimation followed by filtering performs better than estimators with asymptotic convergence.

B. Used approaches
By abuse of notation, we use $\mathcal{Y}_k$, $\theta$, and $\hat\theta_k$ to denote $\mathcal{Y}_{k,1}$, $\theta_1$, and $\hat\theta_{k,1}$, respectively, since we are interested only in $\theta_1$.
For the element-wise linear regression (10), we consider the following two estimators:
• the gradient estimator (12), where $\gamma_0 > 0$ is the tuning coefficient;
• the least-squares estimator with forgetting (13), where $p_0 > 0$ and $\lambda_{\mathrm{ls}} \in (0, 1)$ is the forgetting factor.
For the direct estimate, we consider the following filtering methods applied to $\hat\theta^{\mathrm{dir}}_k$ given in (11):
• the low-pass filtering (14), where $\varrho \in (0, 1)$ is the tuning coefficient;
• the moving-average filtering (15), where $N = \min\{k, N_{\mathrm{ma}}\}$, and $N_{\mathrm{ma}}$ is the length of the averaging window;
• the median filter (16), where $N = \min\{k, N_{\mathrm{med}}\}$, and $N_{\mathrm{med}}$ is the number of samples used in the median computation;
• the Kalman filter (17), where $Q \ge 0$ and $R_0 > 0$ are the process and observation noise covariances. When $Q \neq 0$, the case of time-varying parameters can be treated.
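The three simplest filters in the list above can be sketched in a few lines of Python. The recursions below are common textbook forms written under the stated window rule $N = \min\{k, \cdot\}$; they are assumptions and may differ in detail from (14)-(16). The function names, the initial value `init`, and the sample data are hypothetical.

```python
from collections import deque
from statistics import median


def low_pass(direct, rho, init=0.0):
    """First-order low-pass: out_k = (1 - rho) * out_{k-1} + rho * direct_k."""
    out, prev = [], init
    for d in direct:
        prev = (1.0 - rho) * prev + rho * d
        out.append(prev)
    return out


def moving_average(direct, n_ma):
    """Mean of the last min(k, n_ma) direct estimates."""
    window, out = deque(maxlen=n_ma), []
    for d in direct:
        window.append(d)
        out.append(sum(window) / len(window))
    return out


def median_filter(direct, n_med):
    """Median of the last min(k, n_med) direct estimates."""
    window, out = deque(maxlen=n_med), []
    for d in direct:
        window.append(d)
        out.append(median(window))
    return out


direct = [1.0, 1.0, 9.0, 1.0, 1.0]  # constant value with one outlier
lp = low_pass(direct, rho=0.5)
ma = moving_average(direct, n_ma=3)
med = median_filter(direct, n_med=3)
```

On this toy sequence the median filter rejects the single outlier completely, whereas the two linear filters spread it over several samples, which illustrates why a nonlinear filter can be attractive for the direct estimate.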

C. Simulation results
Consider the signal (9), where $A_0 = -3$, $A = 5$, $\psi = \frac{\pi}{3}$, and $T = 117$. The measurement noise $v$ is chosen as a uniform random variable, $v \sim U(-1, 1)$. Following Remark 2, the value $\lambda$ in (6) is chosen as $\lambda = \frac{T-1}{T}$ and the initial conditions in (6) are zero. Then the lower bound $\alpha$ defined in Lemma 1 is $\alpha = 0.0063$ and the asymptotic lower bound of $\phi$ is $0.025$.
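The two numerical bounds quoted above can be reproduced with a short computation. This sketch assumes $n = 3$, $\ell = T$, $\mu = T/2$, and $\alpha = \left(\mu(1-\lambda)\lambda^{\ell-1}\right)^n$ as in the proof of Theorem 1. The asymptotic expression, obtained by summing the geometric series over all past windows of length $T$, is an assumption about how the second figure arises.

```python
T = 117            # period, also used as the PE window length
n = 3              # number of unknown parameters
mu = T / 2.0       # excitation level stated in the text
lam = (T - 1) / T  # forgetting factor suggested by Remark 2

# Lower bound on the smallest eigenvalue of the filtered matrix for k >= T.
gain = mu * (1.0 - lam) * lam ** (T - 1)
alpha = gain ** n

# Assumed asymptotic bound: every past window of length T contributes,
# each weighted by an extra factor lam**T.
asymptotic = (gain / (1.0 - lam ** T)) ** n
```

With these assumptions, `alpha` evaluates to about 0.0063 and `asymptotic` to about 0.025, consistent with the values reported in the text.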
Fig. 1 depicts the estimation error transients of the considered methods in the ideal noise-free scenario. Denote the step number when the value $\phi_k$ overcomes the threshold $\varepsilon_\phi$ as $T_{\mathrm{fix}}$, where $T_{\mathrm{fix}} = 14$ in the considered example. The direct estimate (11) converges in the fixed time $T_{\mathrm{fix}}$. The median filter also has the fixed-time convergence property with the convergence time $\min\left\{2T_{\mathrm{fix}},\, T_{\mathrm{fix}} + \frac{N_{\mathrm{med}}}{2}\right\}$. The moving-average filter converges in the fixed time $T_{\mathrm{fix}} + N_{\mathrm{ma}} - 1$, and the other estimates converge asymptotically.
Fig. 2 depicts the estimation error transients of the considered methods for noisy measurements. The direct estimate exhibits sensitivity to the noise in the initial transient when the value of $\phi$ is small; this sensitivity can be reduced by increasing $\varepsilon_\phi$. Nevertheless, all considered methods perform well in alleviating the noise, and filtering of the direct estimate converges to a vicinity of the true value faster than the estimators applied to the linear regression model.
The steady-state performance of the considered methods is summarized in Table I, where the mean squared error (MSE) and the mean absolute error (MAE) are given; these values are computed over $10^6$ samples after the transients.
provides parameter estimation with an improvement in the noise attenuation, and it should be considered as a reasonable alternative to the standard parameter estimation approaches applied to the element-wise scalar linear regression (5).

VI. CONCLUSION
This work has established that the DREM procedure provides a direct parameter estimation by transforming the initial vector linear regression into one with unknown parameters directly measured with noise.We have also proved that Kreisselmeier's filters preserve the initial excitation of the regressor in the DREM method.Noise filters can be effectively coupled with DREM.In the future, we shall investigate other estimation algorithms with accelerated convergence rates in discrete time, especially when it is impossible to obtain a sign-definite regressor after DREM or to improve robustness to measurement noise.

Fig. 1. Estimation error transients in the noise-free case: the direct estimation $\hat\theta^{\mathrm{dir}}$ (11), the gradient estimator $\hat\theta^{\mathrm{gr}}$ (12), the least-squares estimator $\hat\theta^{\mathrm{ls}}$ (13), the low-pass filtering $\hat\theta^{\mathrm{low}}$ (14), the moving-average filtering $\hat\theta^{\mathrm{ma}}$ (15), the median filtering $\hat\theta^{\mathrm{med}}$ (16), the Kalman filter $\hat\theta^{\mathrm{kal}}$ (17). The dashed line corresponds to 5% of the initial error value; the estimators are tuned to have approximately equal transient time. For the median filter, the number of samples is the same as for the moving average.

TABLE I
MSE AND MAE OF THE SELECTED METHODS.