A Physics-Informed Neural Network Method to Approximate Homogeneous Lyapunov Functions

This paper applies the physics-informed neural network approach to approximate globally-defined Lyapunov functions and stabilizing controls for homogeneous dynamical systems. The advantage of this class of systems is that all analysis and design can be made locally on a suitably defined unit sphere, which corresponds perfectly to the applicability conditions of neural networks.


I. INTRODUCTION
Control theory and engineering heavily rely on the notion of stability of nonlinear dynamic systems.The literature contains various techniques to solve this issue, primarily based on the Lyapunov theory.The existence of a Lyapunov function is a necessary and sufficient condition for a large class of dynamical systems to be asymptotically stable.As no universal technique exists for designing such a function, determining a Lyapunov function candidate is not straightforward.The vast bulk of known procedures involves methods to either analytically design a Lyapunov function or attempt to build numeric approximations.Analytical methods are usually constrained to particular canonical forms of the system's model, while most currently used numeric techniques perform well in systems with a small number of state variables.However, these methods have a high computational cost when used in large-scale systems like recurrent neural networks, which might have numerous state variables.
Artificial neural networks (ANNs) became a trendy tool to deal with Lyapunov functions.A classical result, the universal approximation theorem, is of great help in this issue.According to this theorem, a neural network with at least one hidden layer may arbitrarily approximate any smooth function.The set of functions that neural networks represent with one hidden layer is then dense in the set of continuous functions [2], [9].A stronger result is proved in these latter, namely that every continuous function with support in a compact set may be approximated with any desired degree of accuracy using standard multilayer feedforward networks with sufficient hidden units and any number of hidden layers.In [13], a direct approach for constructing a Lyapunov function modeled by a neural network is proposed, with the weights calculated by a genetic algorithm.For a training procedure to calculate approximations of Lyapunov functions for nonlinear ordinary differential equations, the author in [8] proposes a deep neural network architecture.If the systems admit a compositional Lyapunov function, it is shown that the number of neurons required to estimate a Lyapunov function with a fixed accuracy increases only polynomially in the state dimension.A control Lyapunov function is constructed in [12], using a neural network, which renders the closed-loop system stable after training.The authors in [10] provide the LyaNet technique for training ODEs that can give adequate prediction performance, quicker inference dynamics convergence, and more significant adversarial robustness.In [15], a deep neural network architecture is used to determine a numerical control Lyapunov function for nonlinear dynamics and an estimate of the region of the attraction, guaranteeing its stability.
The study in [3] describes a numerical method for solving partial differential equations with application neural networkbased functions, employing an unconstrained minimization problem.However, it was in [14] that the Physics Informed Neural Network (PINN) method was first described.The authors utilized it for training a fully connected ANN endowed with a loss function that encodes physics laws represented by nonlinear PDEs.The paper [1] proposes PINN approaches combined with a coupled-automatic-numerical differentiation framework to enable robust and efficient training.Numerous other studies have attempted to apply, extend, or provide theoretical support for this approach.In particular, PINN was used in [8] to approximate a Lyapunov function, looking for small-gain theory applications in a large-scale system, where it was also observed that the approach could be applied only for local stability analysis, in the general case, and that there is no guarantee in a neighborhood of the origin.
An important problem, rarely discussed in the litera-ture, is the principal existence of a Lyapunov function with the chosen structure and regularity for the studied class of systems.According to the remarkable results in [7], under an appropriate nonlinear change of variables, global asymptotic stability (GAS) of the origin for a family of ordinary differential equations is equivalent to global exponential stability (GES) with a quadratic Lyapunov function.The change of variables can then be incorporated into an approximation of a Lyapunov function in the original coordinates.An important point is a sort of relaxation by taking the change of variables diffeomorphism only smooth out of the origin.For an arbitrary nonlinear system, this characteristic can be an insurmountable obstacle for a global approximation.Nevertheless, this obstruction does not hold for homogeneous systems, for which a Lyapunov function can also be chosen as homogeneous.So, here, we aim at approximating a Lyapunov function on a compact set, not including the origin, and then obtain a global expression by dilations.

II. PRELIMINARIES
Consider a continuous-time system: where x : R + → R n represents the state function and f : R n → R n is a continuous vector field, locally Lipschitz continuous on R n \ {0}, with f (0) = 0. We suppose that the system (1) with initial condition x 0 ∈ R n has a solution x(t, x 0 ) defined for all t ∈ R + (the system is forward complete, see [11] for basic definitions of properties of dynamical systems).In this note, we will sometimes omit the time dependency and write simply x.So, by x ∈ R n , we mean x(t) ∈ R n , for t ∈ R + .
A. Stability notions: GAS to GES, and the existence of Lyapunov functions We refer to [7] for the concepts and results in this subsection.Let O ⊂ R n be an open set in the domain of attraction of the origin for (1).
, and the derivative of V in the direction of the vector field f satisfies DV (x)f (x) = −g(x) ≤ 0, for some function g : R n → R + , with g(x) > 0 for all x ∈ O \ {0}.In the case O = R n and V (x) → +∞ when x → +∞, we call V a global Lyapunov function.
The existence of a (global) Lyapunov function for ( 1) is a necessary and sufficient condition for 0 ∈ R n to be a (globally) asymptotically stable equilibrium, see for instance [11].
Using KL functions, we may also write that x = 0 is globally asymptotically stable if there exists a class KLfunction β such that ∥x(t, x 0 )∥ ≤ β(∥x 0 ∥, t), for all t ∈ R + and all x 0 ∈ R n .Also, x = 0 is globally exponentially stable, if there exist c ≥ 1 and α > 0 such that for all t ∈ R + and all x 0 ∈ R n .A significant result in [7], Theorem 2, claims that the system (1), supposed to be GAS, can be transformed into a GES system: Theorem 1 ( [7]): Let n ̸ = 4,5 and consider any system (1) on R n , which is GAS.Then (1) can be transformed into a system that is GES.In particular, in (2), one can choose c = 1 and α = 1.
The proposed transformation is realized through what the authors in [7] call a change of variables, that is, a homeomorphism T : R n → R n satisfying T (0) = 0, T C 1 on R n , and a diffeomorphism on R n \ {0}.These transformations are infinitely differentiable everywhere, except at the origin, where they are just continuously differentiable.Their respective inverses are continuous globally and infinitely differentiable away from the origin.In the new coordinates, for y = T (x), the dynamics of (1) takes the form: Remark that the change of variables can be seen as an approximation of a Lyapunov function, as suggested by the following proposition ( [7], Proposition 1): Proposition 1: For n ̸ = 5, let V : R n → R be a proper (i.e., for each β ≥ 0, the set {x ∈ R n | V (x) ≤ β} is bounded), positive definite, C 1 global Lyapunov function of (1).Assume that V is smooth on R n \{0}.Then there exists a change of variables T : R n → R n with DT (0) = 0 such that Remark 1: As mentioned in [7], the main ingredient in the proof of the above proposition is related to the question of whether level sets of Lyapunov functions in R n are diffeomorphic to the unit sphere in R n−1 , based on results from F. W. Wilson [18].In the case of homeomorphisms only, the case n = 4 results from the Poincaré conjecture, see [18], solved in 2003 by G. Perelman, and M. H. Freedman [6] proved the remaining case n = 5.However, the question of the existence of a diffeomorphism (as in the definition of change of variables) remains open.
Therefore, in the original coordinates, we have the expressions: for all x ∈ R n .
B. Homogeneity 1) Definitions of homogeneity: Basic notions on homogeneous systems useful in the sequel are presented in this subsection.For any r i > 0, i ∈ 1, n, and λ > 0, define the vector of weights r = r 1 . . .r n and the dilation matrix . Also denote r min = min i∈1,n {r i } and r max = max i∈1,n {r i }.
Definition 2 ( [5], [20]): holds for all x ∈ R n and any λ > 0. In both cases, the constant ν is called the degree of homogeneity.
Definition 3: A dynamical system as in ( 1) is called rhomogeneous of degree ν if this property is satisfied for the vector field f , in the sense of Definition 2.
Definition 4: For x ∈ R n and ϖ ≥ r max , the rhomogeneous norm is defined as follows Notice that, for all x ∈ R n , its Euclidean norm ∥x∥ is related to the homogeneous one: σ r (∥x∥ r ) ≤ ∥x∥ ≤ σr (∥x∥ r ) for some σ r , σr ∈ K ∞ [5].
In the following, due to this norm equivalence, the stability analysis with respect to the norm ∥ • ∥ can be replaced with the analysis for the norm ∥ • ∥ r .The homogeneous norm has an important property: it is r-homogeneous of degree 1, that is ∥Λ r (λ)x∥ r = λ∥x∥ r for all x ∈ R n and λ > 0.
2) Lyapunov functions: One of the main features of homogeneous systems related to the Lyapunov function method is described in the following theorem: Theorem 2 ( [17], [19]): For an r-homogeneous system (1) of degree ν, if the origin is GAS, then there exists an r-homogeneous function V : R n → R + of degree µ > −ν, C 1 on R n , smooth on R n \{0}, and such that for all x ∈ R n : for some 0 < a ≤ b, c > 0 and d > 0.
We describe next an interesting robustness result for homogeneous Lyapunov functions established in [5]: Proposition 2: For a locally Lipschitz continuous and rhomogeneous function V : R n → R + of degree µ > −ν, assume that the estimates (4), ( 5

C. ANNs & PINNs
1) Artificial neural networks: An ANN is a graph-like machine learning model.Each node is an artificial neuron, and each edge connecting nodes is a weight, which registers the degree of connection between neurons.Each neuron may be considered as a function that takes the output of nearby neurons as its input.Neurons are aggregated into L hidden layers, compressed between the input and output layers.
An ANN with more than one hidden layer (L > 1) between the input and output is called a deep neural network.For instance, in a feedforward neural network, the input is processed successively through the layers L, L − 1, . . ., 1.We will use networks with the input vector where θ ∈ R p is the vectors of parameters to be learned.There are N ℓ neurons in the layer numbered ℓ and y ℓ k is the output of the neuron at position k in the layer ℓ, giving a vector At the lowest level, y L+1 = x, and the outputs of the neurons in the highest layer y 1 are used to compute the network's output W (x; θ).More precisely, in the hidden layers, the outputs of the neurons are ⊤ ∈ R N ℓ+1 the weights and b ℓ k ∈ R the biases.The output is obtained by an affine linear combination of the values of the highest layer (ℓ = 1) in the form: where all parameters a k , w 1 k , b 1 k , and c in the expression are gathered in the parameter θ (as well as other weights if L > 1).
2) Approximation of a continuous function by a neural network: In this subsection, we recall a popular method of approximating a continuous function.
The works of [9] and [2] have established that, given enough hidden units, standard multilayer feedforward networks with any number of hidden layers and arbitrary squashing function are capable of approximating, with any desired degree of accuracy, any continuous function of n variables with support in some compact set.In particular, the following result can be formulated: Theorem 3 ( [2]): Let ϕ : I n → R be a continuous function defined on the n-dimensional unit cube I n in R n , and ε > 0.Then, there is a N 1 > 0 such that with σ 1 a sigmoidal function (i.e., a non-decreasing function 3) Application of ANNs for approximating Lyapunov functions: We wish to select θ such that W (x; θ) approximates a Lyapunov function V (x), for all x ∈ K, with K a fixed compact set including the origin for a GAS (at zero) system (1).According to Theorem 3, we can use one single hidden layer, so L = 1 (a deeper ANNs can be used similarly).The formula of W (x; θ) is given by (6).
Following [8], to train the network and approximate a Lyapunov function by W (x; θ), a loss function L : R×R n × R n → R must be defined.Then the training phase consists of finding parameters θ that search for the minimality of where x (i) ∈ K are test points (elements of a learning/test dataset).Unlike conventional ANN learning tasks, the loss function L additionally depends on the test points' values of the derivative DW of W with respect to x.Such a constraint is required to establish if W is a Lyapunov function [8]: for a suitably chosen α ∈ K.The appearance of DW naturally opens a way of interpreting this training problem within a partial differential equation (PDE) framework.In such a case, the positivity of the Lyapunov function can be introduced through the boundary conditions: However, it can be challenging to implement the boundary conditions in the above form numerically, either because the first expression involves a single point or because of the strict positive condition in the second one.We may then replace it with the following to palliate this: for some functions α 1 and α 2 ∈ K. Based on the above observations, in [8] the following loss function is considered: The drawback of this functional L is that its minimization and the obtained solution W depends on the choice of the tuning functions α, α 1 , and α 2 , and it could be the case that, for a chosen set of these functions, there is no Lyapunov function for (1).
4) PINN: Physics-Informed Neural Networks are artificial neural network models that use a priori information from the ODE or PDE system governing the dynamics under study and approximate a solution of this ODE/PDE for given initial conditions and time/space domain.This results in introducing terms into the loss function of the ANN that are somehow proportional to the considered physics/dynamics information.
A typical PINN uses a fully connected ANN architecture to represent the solution V (x, t) of a dynamical system.For some domain Ω ⊂ R n and T > 0, the PINN model generates W (x, t; θ) predicting V (x, t) for x ∈ Ω and t ∈ [0, T ], where θ are properly learned parameters of ANN as before.The accuracy of the PINN output is determined by θ, which is optimized with respect to the PINN loss function during the training.To derive the PINN loss function, we consider V to be described by differential equations of the general form: where D x and D t are differential operators in x and t, B is a suitable boundary functional, and the functions V 0 and g determine the initial and the boundary conditions.The loss function incorporates all information in the system [14]: where ν 1 , ν 2 , ν 3 > 0 are tuning parameters.We can see that the functional L depends on the derivative of W with respect to x and t, which are the inputs of ANN producing W (x, t; θ).Then, the problem of approximating a Lyapunov function for the system (1) by an ANN solved in [8] is an example of exploitation of PINN.The functional L from the previous subsection contains the structure of the PDE describing the properties of a Lyapunov function.

III. PROBLEM STATEMENT
Consider the vector field in (1).We wish to employ an ANN or PINN-based structure with an input vector x = x 1 . . .x n ⊤ ∈ R n and a scalar output W(x; θ) ∈ R to approximate an unknown Lyapunov function of (1) at the origin for x ∈ K, with K being a compact set in R n on which V must be computed, {0} ∈ K. Here, the vector θ ∈ R p represents the network's free parameters that must be learned to get the desired result.
To this end, we will refine the results of [8] by utilizing the result of Proposition 1 and the resulting PDE (3).Therefore, in our case, the ANN has to approximate the change of variables T defining W and DW due to (3).For brevity, let us use one single hidden layer ANN consisting of N 1 neurons without biases (in such a case, we naturally get that the transformation is zero at the origin).Consequently, consider θ = vec w 1 w 2 ∈ R 2N1n , with w 1 ∈ R n×N1 , w 2 ∈ R N1×n , and define a smooth map N θ : R n → R n as: where σ1 : R N1 → R N1 is a function defined as being a smooth activation function.Then such a function N θ (x) should approximate the transformation map T , and the Lyapunov function candidate will be approximated by A similar structure of Lyapunov functions is proposed in [16] for polynomial systems using the sum-of-squares approach.
The function W(x; θ) would be a Lyapunov function candidate, provided that the conditions for positive definiteness are verified, which in such a case can be rewritten as follows: , where I n denotes the identity matrix in R n×n .The remaining condition of (3) on the decay of W(x; θ) along the trajectories of the system (1) can be presented as follows: To define a loss function, we combine the conditions formulated above and write, with respect to the map N θ (x): where, as before, w represents N θ and p corresponds to the derivative DN θ with respect to x.Note that L is not a square loss function, so its terms are not squares of the argument in comparison with the formulation of [8].Therefore, it results in a simpler gradient with respect to θ.According to the results of [7], such a presentation of a Lyapunov function always exists for n ̸ = 4, 5 (see Proposition 1).In addition, we deal with inequalities and not with equality solutions.The functional ( 9) can be equivalently rewritten in the form explicitly dependent on θ: , which is ready for application of the optimization procedures in θ.In such a case, the Lyapunov function candidate ( 8) is a smooth function on R n , while in Proposition 1, the smoothness at the origin is not ensured for the studied class of systems in (1).Therefore, it is a mild limitation of this ANN approximation, which has some consequences.
Following the result of Theorem 3, for the vector of parameters θ properly tuned by a learning algorithm, for any ε > 0 and given compact set K, there is N 1 > 0 such that for some Lyapunov function V for the system (1).However, the function V is vanishing at the origin by Definition 1, while the error function e(x; θ) = V (x) − W(x; θ), being upper-bounded in the norm by ε in the set K, is not necessary vanishing for x = 0. Therefore, W(x; θ) could be signindefinite close to the origin, and there is no theoretical guarantee of having a solution to this drawback.Additional research in this direction has a good perspective.

SYSTEMS
As we have just seen, for the system (1) in its generality, the behavior at the origin of any Lyapunov function approximation can be a significant obstacle to ANN/PINN approximation.Fortunately, this obstruction does not apply to homogeneous systems for which a homogeneous Lyapunov function can be selected, whose design and analysis can be performed on the sphere only, leading to a global solution.
Hence, the idea is to use a compact set that excludes the origin, estimate the Lyapunov function, and then use dilations to produce a global expression.

A. Stability
For homogeneous systems, we consider an approximation V θ of an unknown Lyapunov function V defined globally, and where the ANN output W (x; θ) defined in ( 6) is needed on the set S r (1) only: Such a Lyapunov function V θ is homogeneous by construction.Then, the conditions to be checked can be obtained from the estimates (4), ( 5) and restricted to the set S r (1): which can be transformed into the following loss function: It is easy to see that, in this case, there is no problem with the vanishing of W somewhere on the set of approximation S r (1).Moreover, for homogeneous systems and the chosen form of Lyapunov function approximation (10), there exist theoretical guarantees coming from Theorem 3, contrarily the general case of the system (1) discussed in the previous section (notice that (10) may be not smooth at the origin): Theorem 4: There are a number of hidden units N 1 > 0 and a vector of parameters θ such that ( 10) is an rhomogeneous global Lyapunov function for any degree µ > max{0, −ν} for an r-homogeneous system (1) of degree ν with GAS origin.
Proof: For a properly learned parameters θ (it implies that a sufficiently dense grid has been selected guaranteeing that the behavior of f and W in between is sufficiently small), for ε ∈ (0, 1) sufficiently small, there is a sufficiently big number of neurons in the hidden layer N 1 > 0 such that sup y∈Sr( 1) for some ρ > 0, and (10) is the required homogeneous global Lyapunov function (see Proposition 2).
Comparing this result and its applicability conditions to the corresponding ones from [5], it is easy to conclude that PINN-based approach from Theorem 4 is much simpler.In addition, it allows many extensions to be straightforwardly obtained, as coupling with a stabilizing control design presented below.

B. Stabilization
Consider a continuous-time system: where x : R + → R n represents the state function, F : R n × R m → R n is a smooth vector field with F (0, 0) = 0, and u : R + → R m is the input function.We need to assume the following symmetry relation extending the homogeneity property to the systems with input: Assumption 1: There exist two vectors of weights r = r 1 . . .r n and r = r 1 . . .r m such that for all λ > 0, all x ∈ R n , all u ∈ R m and some ν ≥ −r min .
Hence, for u = 0, we recover the r-homogeneity of F in the standard sense.Now, assume that there exists a stabilizing continuous feedback law admitting a symmetry relation: Assumption 2: There is continuous function u : R n → R m such that u(Λ r (λ)x) = Λ r (λ)u(x) for all λ > 0 and all x ∈ R n , and the closed-loop system ẋ(t) = F (x(t), u(x(t))) (12) admits the origin as a GAS equilibrium.
It is easy to verify that ( 12) is r-homogeneous.Hence, by Theorem 2, there is an r-homogeneous Lyapunov function V (x).Let us try to calculate simultaneously the control u and the Lyapunov function V using PINNs.
As described in [5], homogeneous functions can be approximated on the unit sphere, and the expression (10) can be used to search V (x).Similarly, define U (x; θ ′ ) = Λ r (∥x∥ r )N θ ′ (Λ −r (∥x∥ r )x) , ∀x ∈ R n as an approximation of u(x), where N θ ′ is given in (7) with m-dimensional output of ANN, and N ′ 1 > 0 denotes the number of hidden units.The functional (9)  Its proper minimization allows the vectors of weights θ and θ ′ approximating the respective quantities, V and u, to be found.
Proof: It repeats the main arguments of the proof of Theorem 4 by observing that the system (13) is rhomogeneous of degree ν by construction.

V. CONCLUSION
In this note, we have proposed a method for approximating globally specified Lyapunov functions and stabilizing controllers for homogeneous dynamical systems using a particular type of ANN, physics-informed neural networks.The natural characteristics of Lyapunov functions and their derivatives are perfectly adaptable to employing PINN techniques.The great advantage of working with homogenous systems is that all analysis and design can be done locally on a well-specified unit sphere.This important feature also fully matches the criteria for applying neural networks.Using recent results on partial homogenization of nonholonomic systems [4], the proposed approach can be applied to this class of models.
The notation DV (x)f (x) stands for the directional derivative of a continuously differentiable function V : R n → R with respect to a vector field f : R n → R n evaluated at the point x (then DV (x) is the gradient).•For a set A ⊂ R n , denote its boundary and interior by ∂A and int(A), respectively.• The finite set of the first n positive integers is denoted by 1, n := {1, 2, ..., n}. •