Optimized time-domain control of passive haptic teleoperation systems for multi-DoF interaction

This paper presents a time-domain passivity controller for multi-DoF haptic-enabled teleoperation systems aimed at improving performance in terms of transparency for a given task. By solving an online convex optimization problem, the proposed approach enhances transparency of interaction along specific directions of the environment space which are significant for the task at hand, while guaranteeing system stability. An experimental evaluation of the effectiveness of the proposed design is presented, enrolling twenty participants. We compared the performance of the proposed approach with that of a standard energy-bounding time-domain algorithm during the exploration of a virtual sphere. Results show that, as the communication delay between the local and remote agents grows, the proposed technique better preserves transparency along the directions that are more important for the task at hand.


I. INTRODUCTION
Robotic teleoperation has been central to the technological advances of the past decade. Teleoperated robots, i.e., robots remotely controlled by a human operator, perform over 700,000 surgeries per year [1]; they are currently exploring the surface of Mars, they help our public forces during natural calamities, and they navigate our oceans and sort our waste. Research shows that haptic feedback plays an important role in improving robotic teleoperation (e.g., a 55% and 80% precision increase during needle insertion and instrument positioning, respectively [2]), and it is often considered one of the most promising technologies in the field [2], [3], [4].
The main objective when designing the control of a haptic-enabled teleoperation system is to achieve stability and transparency. Indeed, it is well-known that, in case of communication delays or stiff environments, providing kinesthetic haptic feedback to the user can lead to undesired and abrupt oscillations of the system, which may be very dangerous for the system, the user, and the environment. These potentially dangerous behaviors must be avoided, especially in those fields of application where the safety of the system is a paramount and non-negotiable requirement, e.g., in surgical robotics [5]. At the same time, transparency is also fundamental, as it is what enables the user to receive a faithful representation of the remote environment in terms of matching impedance [6], [7], [8]. Achieving high transparency while guaranteeing stability is indeed a prominent challenge in the field of robotic control.
G. Bianchini, D. Barcelli, and D. Prattichizzo are with the Dept. of Information Engineering and Mathematics, University of Siena (Siena, Italy). email: {giannibi,barcelli,prattichizzo}@diism.unisi.it
C. Pacchierotti is with the CNRS, Univ. Rennes, Inria, IRISA (Rennes, France). email: claudio.pacchierotti@irisa.fr
D. Prattichizzo is also with the Humanoids & Human Centered Mechatronics research line, Istituto Italiano di Tecnologia (Genova, Italy).
Toward this objective, passivity theory is considered an effective tool for ensuring a stable interaction during teleoperation [9], [10]. It has been analyzed and implemented from many different points of view, e.g., [11], [12] analyzed passivity in the time domain considering the energy levels of the different system components, [13] addressed the issue of using a delayed communication channel, [14], [15], [16] presented an energy-bounding approach to guarantee the passivity of the teleoperation loop, and [17], [18] proposed to recover any reduction of force due to the passivity action through ungrounded cutaneous feedback. A notable approach is that of Franken et al. [19], who proposed a two-layer tank-based control scheme. A transparency layer computes the ideal forces to be actuated at both sides of the teleoperation system, while a passivity layer corrects such forces when this is necessary to guarantee the passivity, and thus the stability, of the system. More recently, Ferraguti et al. used tank-based approaches to passively reproduce a time-varying stiffness [20] and address the problem of optimal use of energy in a wave-based teleoperation architecture [21].
This paper presents a task-oriented passivity control approach for enhancing transparency during haptic-enabled teleoperation. It improves upon the tank-based architecture of Franken et al. [19] as, in that work, the passivity layer does not consider and control how the transparency is lost during the needed stabilizing control action. Indeed, given the target task, it may be possible to identify some subsets of the task space that are more important than others for the objectives of the considered task, e.g., during medical palpation, rendering well the stiffness of the tissue might be more important than rendering its rugosity, friction, or stickiness. If it is possible to identify such priority subspaces, a controller can be designed to privilege those components of the transparency, defined as the fidelity of the rendered force, while other components may be altered by a large amount without significantly affecting the overall task performance. The proposed design for the passivity layer aims at preserving the level of transparency along subsets of the environment space that are preponderant for a given task at a given time, while preserving suitable energy bounds in order to guarantee passivity. When controller action is required, a correction to the ideal forces provided by the transparency layer is computed via the solution of a quadratic program, which is characterized by modest computational complexity and is amenable to implementation in real time.

II. PRELIMINARIES
This work exploits the time-domain controller structure in [19] (see Fig. 1), assumed to operate with sampling time $T_s$. The sampled generalized positions [velocities] of the end effectors of the local (also known as master) and remote (also known as slave) devices are denoted by vectors $q_m(k)$ [$\dot{q}_m(k)$] and $q_s(k)$ [$\dot{q}_s(k)$], respectively, where $k$ is the discrete time index. The role of the transparency layer is to compute the ideal force $\tau_{TLs}(k)$ to be actuated at the remote robot and to convey an ideal haptic feedback force $\tau_{TLm}(k)$ to the local one, typically given by a scaled and possibly delayed version of the interaction force $\tau_e(k)$ sensed at the remote side. It is well-known that in the presence of delays or stiff environments, a stable rendering of such forces may not always be achieved even if the teleoperated environment is passive [22].
In [19], as well as in the majority of time-domain control approaches, passivity of the system is exploited as a sufficient condition to ensure stability. In particular, the total (sampled) energy $H_T(k)$ of the system is considered, which in turn can be decomposed as
$$H_T(k) = H_M(k) + H_S(k) + H_C(k),$$
where $H_M(k)$, $H_S(k)$, and $H_C(k)$ are the energy levels relative to the local side, the environment (remote side), and the communication channel, respectively. Hence, passivity of the overall system is guaranteed by any controller that is able to enforce the condition
$$H_T(k) \geq 0. \qquad (1)$$
The role of the passivity layer is to correct the force $\tau_{TL}(k)$ at either side, when this is required to ensure that (1) holds. Since the estimation of the total energy $H_T$ is not viable in the presence of communication delays, in [19] a virtual energy tank $H$ is associated to each side of the teleoperation system. The tank makes up the energy budget available to the controller. According to a given policy, the passivity layer computes a suitable correction to the ideal force $\tau_{TL}(k)$ based on the tank level, and a modified force $\tau_{PL}(k)$ is actuated, in order for the tank level not to drop below zero, therefore preserving passivity. Virtual energy exchange between tanks is also performed using a balancing algorithm, so as to reduce the conservatism inherent to requiring that both tank levels be positive in order to guarantee passivity. In the following, the basic scheme for the passivity layer implementation is recalled without specific reference to either the local or the remote side and using the same notation as in [19].
Let $\bar{k}$ denote the time interval between sampling instants $k-1$ and $k$, and let $H(k)$ indicate the tank level, i.e., the energy budget to perform the control action within $\bar{k}$. Denote with $q(k)$ the sampled generalized device displacement, and let $\tau_r(\bar{k})$ be the actuated force during $\bar{k}$, which is held constant since a zero-order hold is used. The energy loss in tank $H$ during $\bar{k}$ is therefore given by
$$\Delta H(k) = \tau_r(\bar{k})^\top \left( q(k) - q(k-1) \right). \qquad (2)$$
Hence, the updated tank level at time $k$ is given by
$$H(k) - \Delta H(k). \qquad (3)$$
Taking into account a possible virtual energy exchange amounting to $H^{\pm}(k)$, performed according to the exchange protocol, the energy tank level at the end of the time interval $\bar{k}$ becomes
$$H(k+1) = H(k) - \Delta H(k) + H^{\pm}(k). \qquad (4)$$
The value of $H(k+1)$ in (4) represents the available energy budget to perform the actuation task at $k+1$.
The passivity layer then computes a curtailed force $\tau_{PL}(k)$ by saturating the magnitude of each component of $\tau_{TL}(k)$ to a value $\sigma(k)$ depending on $H(k+1)$, given by (5), where $K_\sigma$ is a positive constant. Finally, a certain amount $\tau_{TLC}(k)$ of virtual damping force is added to $\tau_{PL}(k)$ at the local side to prevent total tank depletion. Such force is computed as in (6), where $\alpha > 0$ and $H_d > 0$ are empirically determined reference parameters.
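The tank bookkeeping recalled in (2)-(4) can be illustrated with a minimal Python sketch. It assumes that the energy spent over one sampling interval is the work of the zero-order-held force along the sampled displacement increment; the function and argument names are hypothetical and are not taken from [19].

```python
from typing import Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean inner product of two equally sized vectors."""
    return sum(x * y for x, y in zip(a, b))


def tank_update(h_now: float,
                tau_r: Sequence[float],
                q_now: Sequence[float],
                q_prev: Sequence[float],
                h_exchange: float = 0.0) -> float:
    """Return the tank level available for the next sampling interval.

    h_now      -- tank level at the start of the interval [J]
    tau_r      -- force held constant over the interval (zero-order hold) [N]
    q_now      -- displacement sampled at the end of the interval [m]
    q_prev     -- displacement sampled at the start of the interval [m]
    h_exchange -- virtual energy received (+) or sent (-) through the
                  exchange protocol [J] (assumed to be given)
    """
    dq = [a - b for a, b in zip(q_now, q_prev)]
    energy_spent = dot(tau_r, dq)  # assumed form of the energy loss
    return h_now - energy_spent + h_exchange


# Moving along the actuated force drains the tank.
level = tank_update(0.2, [1.0, 0.0, 2.0], [0.01, 0.0, 0.02], [0.0, 0.0, 0.0])
```

Moving against the actuated force makes the increment negative, so the same function also accounts for energy flowing back into the tank.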

III. PROPOSED DESIGN
In this paper, we propose a design of the passivity layer for a multi-DoF teleoperation system which specializes the framework in [19] in order to preserve transparency for a specific task as much as possible while guaranteeing passivity. As opposed to "blind" saturation of the actuated force components according to the energy budget $H(k+1)$ (see (5)), a real-time optimization procedure is introduced which allows for maximizing fidelity on the rendered force components which are deemed relevant for the given task at the present time instant, while satisfying the passivity condition.
At time $k$, let us consider the estimated energy loss as a function of the prospective actuated force $\tau_r(k+1)$ (yet to be computed), given by
$$\Delta \hat{H}(k+1) = T_s \, \tau_r(k+1)^\top \dot{q}(k). \qquad (7)$$
Therefore, the tank level at time $k+1$ would be given by
$$\hat{H}(k+1) = H(k+1) - T_s \, \tau_r(k+1)^\top \dot{q}(k). \qquad (8)$$
Let $H_{\min}(k) > 0$ be a dynamically varying energy reference, to be interpreted as the amount of energy to be left in the tank after $\tau_r(k+1)$ has been applied during the time interval $\overline{k+1}$, and whose choice is discussed later on. According to the estimate in (8), the following constraint must hold in order to ensure $\hat{H}(k+1) \geq H_{\min}(k)$:
$$T_s \, \tau_r(k+1)^\top \dot{q}(k) \leq H(k+1) - H_{\min}(k). \qquad (9)$$
Clearly, if $T_s \, \tau_{TL}(k)^\top \dot{q}(k) \leq H(k+1) - H_{\min}(k)$, then the ideal force $\tau_r(k+1) = \tau_{TL}(k)$ can be safely actuated. When this is not the case, a corrected force $\tau_r(k+1) = \tau_{PL}(k)$ must be computed to guarantee (9). To this purpose, let $\tau_C(k)$ denote the force correction, i.e.,
$$\tau_C(k) = \tau_{PL}(k) - \tau_{TL}(k).$$
Let $S_i(k)$, $i = 1, \ldots, m$, be a suitable set of subspaces of the task space, which in general depend on the specific task and the current configuration $(q(k), \dot{q}(k))$. A priority value $p_i(k) \geq 0$ (also configuration-dependent) is assigned to each $S_i(k)$. The idea of associating each subspace to a priority index is quite simple: the higher the priority $p_i(k)$, the stricter the requirement that the projection on $S_i(k)$ of the force correction $\tau_C(k)$ be small. Such a projection can be written as
$$P_{S_i}(k) \, \tau_C(k),$$
where the projection matrix $P_{S_i}(k)$ characterizes $S_i(k)$.
It is therefore natural to introduce the following quadratic functional to be minimized under the tank level control condition (9):
$$J(\tau_C(k)) = \sum_{i=1}^{m} p_i(k) \, \left\| P_{S_i}(k) \, \tau_C(k) \right\|^2.$$
In addition to (9), further constraints are in order. Indeed, the passivity layer must not try to achieve a tank energy gain by excessively altering (e.g., inverting) one or more components of the rendered forces, thus resulting in an excessive loss of transparency. To this purpose, it has to be ensured that the sign of all components of $\tau_{PL}(k)$ be the same as the corresponding components of $\tau_{TL}(k)$. To prevent the tank from depleting indefinitely, a slight inversion is allowed, but the corresponding energy gain should not exceed that provided by a damping element with coefficient $\beta$ (note that this plays the role of the virtual damper in (6)). More specifically, let $C^-(k)$ be the set of components $\tau_{TL,j}(k)$ of $\tau_{TL}(k)$ such that $\tau_{TL,j}(k) \, \dot{q}_j(k) \geq 0$, i.e., the components that play in favor of tank depletion according to the energy balance (8). On such components, constraint (12) is enforced. As far as the dynamic update of the threshold level $H_{\min}(k)$ is concerned, many policies can be devised. In this paper, the update rule (13) is adopted, where $H_0 > 0$ represents a reference tank level, and $\eta > 0$ is a tunable gain parameter. Using $H_{\min}(k)$ in (13) as the threshold in (9) allows the system to recover energy when $H(k+1)$ is below the reference level $H_0$, as well as to limit the spending of energy when it is above. In this respect, note that $H(k+1)$ is a quantity that becomes known as soon as $(q(k), \dot{q}(k))$ are sensed and the virtual energy exchange in (4) has taken place. Using $\|\dot{q}(k)\|$ in the error gain of (13) makes it possible to limit the correction forces when velocities are low.
Based on the above observations, the following algorithm is proposed for the implementation of the passivity layer at each time $k$.
Algorithm 1 Passivity layer implementation
1: Given:
Solve the optimization problem (14)
if Problem (14) is feasible then
9:
else
11:

Note that in Algorithm 1, a fallback solution (lines 11 and 12) is considered in case the optimization problem turns out to be infeasible, in order to allow the tank to replenish. This is accomplished by introducing a damping force along the components in $C^-(k)$ and leaving the remaining components untouched.
The optimization problem (14) is a quadratic program, whose global minimum can be efficiently computed by means of interior-point methods with a computational complexity that is quadratic to cubic in the number of variables (i.e., DoFs) involved [23]. On a standard PC, a 6-DoF instance of the problem can be solved in a fraction of a millisecond. In the experimental section, the method is implemented without problems on a 3-DoF system operating at a sampling frequency of 1 kHz.
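To illustrate how the priorities shape the force correction, the following minimal Python sketch solves a simplified instance of the problem in closed form instead of calling a QP solver. It is a sketch under stated assumptions, not the implementation used in this work: only the energy constraint (9) is kept (the sign constraint (12) is ignored), the priority subspaces are assumed to be one-dimensional and spanned by an orthonormal basis, all priorities are assumed strictly positive, the energy-loss estimate is assumed to be $T_s \, \tau^\top \dot{q}(k)$, and all names are hypothetical.

```python
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean inner product of two equally sized vectors."""
    return sum(x * y for x, y in zip(a, b))


def prioritized_correction(tau_tl: Sequence[float],
                           q_dot: Sequence[float],
                           t_s: float,
                           h_next: float,
                           h_min: float,
                           directions: Sequence[Sequence[float]],
                           priorities: Sequence[float]) -> List[float]:
    """Minimize sum_i p_i * (d_i . tau_c)^2 subject to a . tau_c <= b.

    a = t_s * q_dot and b = h_next - h_min - a . tau_tl encode the
    (assumed) energy constraint on the corrected force tau_tl + tau_c.
    `directions` must be an orthonormal basis and `priorities` positive.
    """
    n = len(tau_tl)
    a = [t_s * v for v in q_dot]
    b = h_next - h_min - dot(a, tau_tl)
    if b >= 0.0:
        return [0.0] * n  # the ideal force already respects the budget

    # W^{-1} a, with W = sum_i p_i d_i d_i^T diagonal in the given basis.
    w_inv_a = [0.0] * n
    for d, p in zip(directions, priorities):
        coeff = dot(d, a) / p
        for j in range(n):
            w_inv_a[j] += coeff * d[j]

    denom = dot(a, w_inv_a)
    if denom == 0.0:
        # Zero velocity: no force correction can change the energy estimate.
        raise ValueError("constraint cannot be met at zero velocity")

    scale = b / denom  # the constraint is active at the optimum
    return [scale * v for v in w_inv_a]


# Normal direction (z) has priority 0.5, tangential ones (x, y) 0.1.
basis = [[0.0, 0.0, 1.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
tau_c = prioritized_correction([1.0, 0.0, 5.0], [0.1, 0.0, 0.1], 0.001,
                               0.0001, 0.0002, basis, [0.5, 0.1, 0.1])
```

With equal velocity along the normal and one tangential direction, the correction along the high-priority normal direction is five times smaller than along the tangential one, which is the qualitative behavior the priorities are meant to produce. Because the sign constraint is dropped, this simplified version can invert a low-priority component, which is precisely what the additional constraints of the full problem are designed to limit.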

IV. EXPERIMENTAL EVALUATION
To evaluate the effectiveness of the proposed method, a virtual palpation experiment has been carried out.

A. Experimental setup
The teleoperated side is replaced with a virtual environment composed of a three-dimensional sphere with radius $R = 3$ cm. A two-dimensional representation is reported in Fig. 2. The sphere is modelled as a spring with elastic constant $K_s = 200$ N/m everywhere except on a spherical cone of radius $r = 1$ cm. In such spherical cone, the stiffness is higher, increasing quadratically from $K_s = 200$ N/m at its borders until $K_{ss} = 1000$ N/m at its center $p_{ss}$. The sphere also has surface friction, computed using the standard Coulomb friction model with coefficient $\mu = 0.005$.
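One possible stiffness profile matching this description is sketched below in Python. The exact quadratic law is not specified in the text, so the form used here (zero slope at the border of the stiffer region, peak at its center) is an assumption, as are the function name and the use of the distance from $p_{ss}$ as the argument.

```python
def stiffness(d: float, r: float = 0.01,
              k_s: float = 200.0, k_ss: float = 1000.0) -> float:
    """Stiffness [N/m] at distance d [m] from the center of the stiffer region.

    Outside the region (d >= r) the nominal stiffness k_s applies; inside,
    the stiffness grows quadratically up to k_ss at the center (assumed law).
    """
    if d < 0.0:
        raise ValueError("distance must be non-negative")
    if d >= r:
        return k_s
    return k_s + (k_ss - k_s) * (1.0 - d / r) ** 2


center_value = stiffness(0.0)    # stiffest point
border_value = stiffness(0.01)   # nominal stiffness at the border
```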
The user controls a grounded 3-DoF Omega haptic device, whose position is linked to the one of a proxy in the virtual environment, following the standard god-object interaction model. The control loop operates at a sampling time of 1 ms.
The communication channel is modeled as in Franken et al. [19] and simulated by introducing a time delay T d which affects the forces and virtual energy exchange between sides.

B. Task and conditions
The task consists in interacting with the virtual sphere to find the center of the stiffer spherical cone, $p_{ss}$, as fast and as precisely as possible. It was evaluated considering three different control schemes in four communication delay conditions, yielding a total of twelve experimental conditions.
1) Control schemes: We considered three control schemes.
No passivity controller (NP). The controller directly transmits the forces and velocities between the local and virtual sides, while no passivity constraints are enforced. Unmodified forces $\tau_{TL}$ are actuated at both sides.
Standard energy-based tank level controller (STC). The controller proposed in [19] and recalled in Section II, implemented with parameters $\alpha = 10$, $H_d = 0.2$ J, $K_\sigma = 100$.
Transparency-oriented passivity controller (TOP). The proposed controller, implemented as described in Sec. III. Given the nature of the task, i.e., the palpation of a spherical surface to locate a region which is stiffer, we find it convenient to assign the highest priority to the rendering of the elastic force along the direction perpendicular to the sphere surface ($p_1(k) = 0.5$), while friction forces on the directions tangential to the sphere surface are assigned lower priorities ($p_2(k) = p_3(k) = 0.1$). Reference values of the parameters in the tank level control policy (13) are chosen as $H_0 = 0.2$, $\eta = 0.1$, $\beta = 0.1$. For solving the optimization problem (14) in real time, the qpOASES library has been used [24].
2) Communication delays: We simulated four communication delays, following the examples reported in [25], [26], considering T d = 0, 20, 50, 100 ms.Delay T d is to be considered in addition to the intrinsic delay of the system, which has been measured to be 1.6 ms on average.We refer to the twelve experimental conditions by combining the short names of the control schemes (NP, STC, TOP) with the simulated delay in subscript, e.g., NP 100 refers to the no passivity controller under a delay of 100 ms, TOP 0 refers to the proposed transparency-oriented passivity controller under no delay, and so on.
No visual feedback on the stiffness of the sphere was provided to the users, who were only able to see the undeformed sphere and the position of the haptic proxy on its surface.

C. Participants
Twenty participants took part in our experiment, including six women and fourteen men (age 26-35 years old). Users performed three repetitions of the interaction task per control scheme per communication delay, yielding a total of 3 repetitions × 3 control schemes × 4 communication delays = 36 trials per participant. Trials were randomized to avoid any learning effect. The experiment lasted around 20 minutes.
No information about which one was the proposed technique was provided to the subjects. Users sat comfortably in front of a computer screen showing the test environment. They used the Omega haptic device with their dominant hand to interact with the sphere and look for the stiffest point on its surface.
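The trial structure described above can be sketched in a few lines of Python. The text only states that trials were randomized; the full shuffle of all 36 trials used here is an assumption (a blocked randomization would be equally compatible with the description), and the function name is hypothetical.

```python
import itertools
import random
from typing import List, Optional, Tuple


def trial_schedule(seed: Optional[int] = None) -> List[Tuple[str, int]]:
    """Return a shuffled list of (control scheme, delay in ms) trials."""
    schemes = ["NP", "STC", "TOP"]
    delays_ms = [0, 20, 50, 100]
    repetitions = 3
    trials = [condition
              for condition in itertools.product(schemes, delays_ms)
              for _ in range(repetitions)]
    rng = random.Random(seed)  # a fixed seed gives a reproducible order
    rng.shuffle(trials)
    return trials


schedule = trial_schedule(seed=1)
```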

D. Results
To validate the effectiveness and viability of the proposed control method, we registered (a) the error in detecting the center of the spherical cone (the stiffer point on the sphere), (b) the task completion time, (c) the force correction applied at the user's side along the direction perpendicular to the sphere surface, and (d) the force correction applied at the user's side along the direction tangential to the sphere surface. To compare the four metrics between the twelve experimental conditions (see Sec. IV-B), we ran two-way repeated-measures ANOVA tests on the data. Control schemes (NP, STC, TOP) and communication delays (0 ms, 20 ms, 50 ms, 100 ms) were treated as within-subject factors. All data passed the Shapiro-Wilk normality test. A Greenhouse-Geisser correction was used when the assumption of sphericity was violated. Results of post hoc analysis with Bonferroni adjustments are reported in Table I (only significant p values are shown).

TABLE I
Force correction (transparency) normal to the sphere surface
  Simple main effect of the control scheme
    0 ms:    No significant differences to report
    20 ms:   STC vs. TOP   p < 0.001
    50 ms:   STC vs. TOP   p < 0.001
    100 ms:  STC vs. TOP   p < 0.001
  Simple main effect of the communication delay
    STC:  0 vs. 20     p < 0.001
          0 vs. 50     p < 0.001
          0 vs. 100    p < 0.001
          20 vs. 50    p < 0.001
          20 vs. 100   p < 0.001
          50 vs. 100   p < 0.001
    TOP:  0 vs. 20     p < 0.001
          0 vs. 50     p < 0.001
          0 vs. 100    p < 0.001
          20 vs. 50    p < 0.001
          20 vs. 100   p < 0.001
          50 vs. 100   p = 0.032
Force correction (transparency) tangential to the sphere surface
  Simple main effect of the control scheme
    0 ms:    No significant differences to report
    20 ms:   STC vs. TOP   p = 0.002
    50 ms:   STC vs. TOP   p < 0.001
    100 ms:  STC vs. TOP   p = 0.001
  Simple main effect of the communication delay
    STC:  0 vs. 20     p = 0.001
          0 vs. 50     p < 0.001
          0 vs. 100    p < 0.001
          20 vs. 100   p < 0.001
          50 vs. 100   p = 0.002
    TOP:  0 vs. 20     p < 0.001
          0 vs. 50     p < 0.001
          0 vs. 100    p < 0.001
          20 vs. 50    p = 0.002
          20 vs. 100   p = 0.001
          50 vs. 100   p = 0.042
Participants' perceived effectiveness
  Main effect of the control scheme
    NP vs. STC    p < 0.001
    NP vs. TOP    p < 0.001
    STC vs. TOP   p = 0.018
Figure 3a shows the error in locating the stiffest point. It is calculated as $\|q(\mathrm{end}) - p_{ss}\|$, where $q(\mathrm{end})$ is the final position of the Omega interface, when the user indicated to have found the target stiffest point, and $p_{ss}$ is the actual stiffest point (see Sec. IV-A). The two-way repeated-measures ANOVA revealed a statistically significant two-way interaction between control scheme and communication delay ($F_{6,114} = 7.721$, $p < 0.001$, $\alpha = 0.05$). When an interaction effect is present, the interpretation of the main effects might be incomplete or misleading. In this case, it is better to determine the difference between trials for each control scheme and communication delay, called simple main effects. Analyzing the simple main effects for the control scheme variable (one-way repeated-measures ANOVAs), we found statistically significant differences between the considered control schemes for 20 ms ($F_{2,38} = 4.551$, $p = 0.017$), 50 ms ($F_{2,38} = 7.891$, $p = 0.001$), and 100 ms ($F_{2,38} = 32.992$, $p < 0.001$) communication delays. Analyzing the simple main effects for the communication delay variable (one-way repeated-measures ANOVAs), we found statistically significant differences between the considered delays for NP ($F_{3,57} = 44.021$, $p < 0.001$), STC ($F_{3,57} = 22.278$, $p < 0.001$), and TOP ($F_{3,57} = 6.653$, $p = 0.001$) control schemes.
Figure 3b shows the task completion time. It is calculated as the time elapsed between when the user touches the sphere for the first time and when he or she indicates to have found the target stiffest point. Once again, the two-way repeated-measures ANOVA revealed a statistically significant two-way interaction between control scheme and communication delay ($F_{3.748,71.213} = 9.410$, $p < 0.001$, $\alpha = 0.05$). Analyzing the simple main effects for the control scheme variable (one-way repeated-measures ANOVAs), we found statistically significant differences between the considered control schemes for 20 ms ($F_{2,38} = 5.100$, $p = 0.011$), 50 ms ($F_{2,38} = 8.872$, $p = 0.001$), and 100 ms ($F_{2,38} = 27.406$, $p < 0.001$) communication delays. Analyzing the simple main effects for the communication delay variable (one-way repeated-measures ANOVAs), we found statistically significant differences between the considered delays for NP ($F_{3,57} = 47.646$, $p < 0.001$), STC ($F_{3,57} = 12.177$, $p < 0.001$), and TOP ($F_{3,57} = 5.038$, $p = 0.004$) control schemes.
Figure 3c shows the RMS of the force correction enforced by the controller along the axis perpendicular to the sphere. This axis carries the information on the stiffness of the surface and is prioritized in the proposed TOP controller. It is calculated as the average norm of the difference $\tau_C$ between the applied ($\tau_{PL}$) and ideal ($\tau_{TL}$) forces along this axis. In this analysis we only compare STC and TOP controllers, as NP does not apply any correction to the force, i.e., for NP, this difference $\tau_C$ is always zero. Once again, the two-way repeated-measures ANOVA revealed a statistically significant two-way interaction between control scheme and communication delay ($F_{1.989,37.796} = 33.945$, $p < 0.001$, $\alpha = 0.05$). Analyzing the simple main effects for the control scheme variable (paired-samples t-tests, STC vs. TOP only), we found statistically significant differences between the considered control schemes for 20 ms ($t(19) = 4.529$, $p < 0.001$), 50 ms ($t(19) = 4.458$, $p < 0.001$), and 100 ms ($t(19) = 7.284$, $p < 0.001$) communication delays. Analyzing the simple main effects for the communication delay variable (one-way repeated-measures ANOVAs), we found statistically significant differences between the considered delays for both STC ($F_{1.876,35.648} = 165.339$, $p < 0.001$) and TOP ($F_{1.735,32.969} = 38.182$, $p < 0.001$) control schemes.
Figure 3d shows the RMS of the force correction enforced by the controller along the two axes tangential to the sphere. These axes carry no information on the stiffness of the surface and are given low priority in the proposed TOP controller. It is calculated as the average norm of the difference $\tau_C$ between the applied ($\tau_{PL}$) and ideal ($\tau_{TL}$) forces along these axes. As before, we only compare STC and TOP controllers. Once again, the two-way repeated-measures ANOVA revealed a statistically significant two-way interaction between control scheme and communication delay ($F_{3,57} = 7.976$, $p < 0.001$, $\alpha = 0.05$). Analyzing the simple main effects for the control scheme variable (paired-samples t-tests, STC vs. TOP only), we found statistically significant differences between the considered control schemes for 20 ms ($t(19) = -3.522$, $p = 0.002$), 50 ms ($t(19) = -6.059$, $p < 0.001$), and 100 ms ($t(19) = -3.939$, $p = 0.001$) communication delays. Analyzing the simple main effects for the communication delay variable (one-way repeated-measures ANOVAs), we found statistically significant differences between the considered delays for both STC ($F_{1.952,37.084} = 32.514$, $p < 0.001$) and TOP ($F_{1.522,28.917} = 32.752$, $p < 0.001$) control schemes.
In addition to the quantitative evaluation reported above, we also measured (e) our subjects' experience. Immediately after the experiment, participants were asked to rate, on a slider going from 1 to 11, "how easy was to detect the stiffest point on the sphere?", for the three considered control schemes. A score of 1 meant "very difficult" and a score of 11 meant "very easy." Figure 3e shows their ratings. In this case, we ran a one-way repeated-measures ANOVA on the data. Control scheme (NP, STC, TOP) was treated as the within-subject factor. The test revealed a statistically significant difference between control schemes ($F_{1.419,26.959} = 125.044$, $p < 0.001$, $\alpha = 0.05$). Results of post hoc analysis with Bonferroni adjustments are again reported in Table I (only significant p values are shown).

V. DISCUSSION AND CONCLUSIONS
The experimental results in the previous section show that, when no delay is introduced, all three control schemes (NP, STC, TOP) enable an accurate and fast completion of the task. However, as the introduced delay grows, an increasing force/transparency reduction is required to preserve the stability of the system, showing significant differences among the three control schemes. As expected, since it enforces no passivity control, scheme NP shows severely degraded performance in all metrics as the delay increases, with large oscillations preventing users from correctly completing the task. On the other hand, both STC and TOP guarantee the stability of the system at all times, but they do it in different ways. STC reduces the force at the user's side along all axes in a very similar way, while TOP corrects the forces less along the prioritized axis, i.e., the one perpendicular to the surface of the sphere, and more along the other axes. This different approach leads to contrasting results, as the most salient information for this task is the stiffness along the axis perpendicular to the sphere. By guaranteeing higher transparency where it counts, the proposed TOP control scheme yields better task performance in all considered metrics. Of course, this behavior is more evident with higher delays, but it is already significant with a delay of 20 ms. A possible direction of improvement is to dynamically compute subspaces and priority indexes from representative runs of the considered task and real-time environment impedance estimation. Moreover, future work will focus on improving the performance of the proposed approach as well as on running real-world experiments using robotic manipulators.

Fig. 2: Two-dimensional representation of the (three-dimensional) sphere composing our environment. It presents a stiffness $K_s = 200$ N/m everywhere except on a small spherical cone of radius $r$. In this volume, the stiffness increases quadratically from $K_s$ at its borders, so as to provide a smooth transition between the two parts of the sphere, until $K_{ss} = 1000$ N/m at its center, $p_{ss}$. Users are asked to explore the sphere to find $p_{ss}$, i.e., the stiffest point of the sphere.

Fig. 3: Mean and 95% confidence interval of (a) the error in detecting the center of the spherical cone (the stiffer point on the sphere), (b) the task completion time, (c) the force correction applied at the user's side along the direction perpendicular to the sphere surface, (d) the force correction applied at the user's side along the direction tangential to the sphere surface, and (e) how effective subjects considered each control scheme. The statistical analysis is reported in Sec. IV-D and Tab. I.

TABLE I: Statistical analysis (post hoc analysis after corresponding repeated-measures tests)