Bayesian filtering for model predictive control of stochastic gene expression in single cells

This study describes a method for controlling the production of protein in individual cells using stochastic models of gene expression. By combining modern microscopy platforms with optogenetic gene expression, experimentalists are able to accurately apply light to individual cells, which can induce protein production. Here we use a finite state projection based stochastic model of gene expression, along with Bayesian state estimation to control protein copy numbers within individual cells. We compare this method to previous methods that use population based approaches. We also demonstrate the ability of this control strategy to ameliorate discrepancies between the predictions of a deterministic model and stochastic switching system.


Introduction
Modern microscopy platforms are revolutionizing the quality and quantity of biological data. Synthetic biology, and in particular optogenetics provide novel ways to quantify and perturb biological systems at the single-cell level. These platforms allow for the first time online control of protein production in single cells [1][2][3][4][5]. However, protein production in single cells is stochastic [6], and novel methods must be developed to use models to control single cell gene expression. When a model is available, model predictive control (MPC) is known to have many benefits. From a control perspective, controlling the protein expression in individual cells should reduce the error of each cell with respect to the target. Until now, most attempts to control gene expression have been at the population level [3,7,8]. The distinction is shown in figure 1.
To date, few studies have attempted to perform MPC on multiple individual cells simultaneously using modern platforms [2,5]. Of these two studies, only [5] uses a stochastic model of the process to control gene expression in individual bacteria. However, their model uses moment-based approximations of the stochastic dynamics, and it is known that such approximations are not always appropriate [9].
In this study, we introduce a new method to perform single-cell control of optogenetic production of protein. This method uses a simple, but stochastic, reaction-based model to control the stochastic gene expression within each cell in the population. For each cell, we estimate the intracellular protein levels based on fluorescence measurements and the stochastic model of the process using a Bayesian filter. Once this intracellular probability distribution has been estimated, we use receding horizon MPC, along with the stochastic model, to drive intracellular protein values to pre-determined target values.
We compare the single-cell approach to the population based approach. In the single-cell approach, each cell has its own controller, while in the population approach, one controller is used for all cells. We evaluate these approaches computationally for two different models of optogenetically controlled gene expression. The first is a model of a light-driven gene expression system using the EL222 transcription factor [10], which has been used experimentally in [2,11,12]. The second is a simplified single-species nonlinear model of a self-regulated gene.

Modeling gene expression
Gene expression is inherently stochastic [6], due to the random diffusion of the various molecules involved in the process of transcription and translation. In general, the chemical master equation (CME) governs the time evolution of an infinite state continuous time Markov chain with generator matrix A(u(t), θ). We consider the matrix A as a function of control parameter u(t), and of kinetic rate parameters θ. The ith discrete state in the Markov chain is a node on the lattice, x i = {ζ 1 , ζ 2 , . . . , ζ n } for the n species in the chemical system, where each ζ i is an integer molecule count. The vector p collects the probabilities of each state or configuration as p = [p(x 1 ), p(x 2 ), . . .)] T . We denote the total set of states as X . Note that this vector is often infinite in its dimension. The CME is the set of ODEs which describes the time evolution of these probabilities, i.e.
We use the finite state projection (FSP) approach to solve the CME [13][14][15]. The FSP approach splits this infinite state Markov chain into two distinct sets, J and J ′ , and collects the (infinite) set of states J ′ into a single absorbing sink state g(t), d dt .
The sink state g(t) is the error in the FSP solution, which monotonically increases with t [15]. As such, the FSP dimension (i.e. the cardinality of J ) can be modified to decrease the error to a prescribed value for any finite time. Solving the FSP can be extremely computationally expensive in general, as the number of elements in the matrix A grows exponentially with the number of species considered in the model, and therefore one typically does not include more than three or four species in a given model [16]. For convenience, for the remainder of the work we will refer to A J J as A, and p J as p.
In this study, we are considering controllers that are piecewise-constant, and therefore A is constant over fixed time intervals. Furthermore, we are considering a control which has either an on or off state, i.e. there is either light shining on a given cell or not. When A is constant, the solution at t f given p(t 0 ) is . We can pre-compute the infinitesimal generator for both the on and off state of the controller, where we define the 'on' generator matrix as A on and the 'off ' generator matrix as A off . Given a sequence of input u(t) the FSP solution can be iteratively evaluated on the constant time intervals, Equation (4) gives the forward map of the probability distribution p(t) to p(t + ∆t), which is key for the Bayesian filter discussed next. For the remainder of the manuscript, we consider a system which is measured every ∆t time units, for a total of N t measurements.

Bayesian filtering for single cells using the FSP
In general, the probability distribution of molecule counts modeled by the CME is non-Gaussian; the domain is discrete and low-copy effects abound [9]. Therefore, state estimation techniques that assume symmetric, Gaussian variability such as the Kalman filter are not appropriate, and instead we turn to the more general Bayesian filtering [17].
The Bayesian filter in this work serves as a way to filter out observation noise when we have arbitrary state distributions which come from the solution of the CME solved with the FSP. A Bayesian filtering approach considers a prediction step and an estimation step, assuming an initial distribution p(t 0 ). We start with the distribution of the estimated state of the cell at the previous time p((k − 1)∆t) = [p 0 , p 1 , . . . , p N ]. This distribution can be propagated from (k − 1)∆t to k∆t according to u(t) and equation (4).
To obtain an update rule for the distribution of the estimated state at measurement time points, we need to specify a model that describes how the state is mapped to the observable outputs in experiments. Because measurements of fluorescently labeled biomolecules are intrinsically noisy due to the photophysics of fluorescent molecules and detection of photons, the fluorescent measurement of a single cell is described probabilistically. Each molecule emits a random number of photons, some of which are detected by the camera. The distribution of the number of photons emitted by a single fluorescent protein is typically taken to be Poisson, but in the limit of many emissions is approximately Gaussian, i.e. v k ∼ N (µ FP , σ 2 FP ). For n i such molecules in a cell, the probability of measuring fluorescence z k is given by as the sum of Gaussian random variables is a Gaussian random variable. Together, equations (4) and (5) can be used to iteratively estimate p(k∆t) based on the measurement z k . The FSP-based Bayesian filtering is detailed in algorithm 1. The algorithm is initialized by defining an initial distribution of molecules p(t 0 ). For each measurement period ∆t, the posterior estimate of the probability distributionp((k − 1)∆t) from the previous measurement is propagated to the current time k∆t. The appropriate generator A on or A off is used depending on the prescribed actuation during the time interval. Finally, the measurement at time k∆t, z k is incorporated to the model according to equation (5) and Bayes' rule. The computation of the normalizing factor Z in line 10 of algorithm 1 is not feasible in general. This is because computation of the marginal likelihood requires the the integration over continuous variables, except for cases such as the Gaussian and other simple parametric likelihoods and conjugate priors. However, for the FSPbased approach, the marginalization is simply a sum over the discrete and finite state space, This fact also enables easy computation when dealing with hidden species; the marginalization can be taken over the unobserved variables. Algorithm 1. FSP-based Bayesian filtering.
To evaluate our approach, we will compare the FSP-based Bayesian filtering algorithms described above to the more commonly used population based Kalman filter.

Kalman filtering for population state estimation
To contrast single-cell control, we introduce here the notion of a population model. In the population model, individual cells are not measured, and the fluorescence measurements come from averaging the fluorescence across the entire population of cells. We use the Linear Noise Approximation for stochastic chemical kinetics to approximate the dynamics of the mean and variance in the CME in equation (1) [13]. The LNA is valid when the number of interacting molecules and system volume are sufficiently large. These dynamics are given by The equations for ϕ and ρ can be found in [18]. The mean molecule number for the population is approximated byx(t). The variance aboutx(t) is approximated by Σ(t) for each single cell, and therefore the variance of the population mean is simply given by P = Σ(t)/N c , where N c is the number of cells that are averaged. Under the LNA, we consider the state and process noise to follow a normal distribution N (x, P). Therefore, we can use the LNA to construct a Kalman filter to iteratively estimate the population's mean molecule number using population-averaged fluorescence measurements, y k . The previous state and process noise estimate at the (k − 1)th time, are used as the initial conditions in equation (7) and propagated forward ∆t to find thex k|k−1 andP k|k−1 .
Finally, because not all species are observed, we define an observability matrix, C which linearly combines the state variables to an observable state. Therefore, we can write the filtering equations where K k is the Kalman gain, where R is the additive observation noise which comes from the measurement process.

Receding horizon MPC
In the L 2 cost in equation (12), the error is computed with respect to the meanx c to be consistent with the assumption that population model makes; namely that the population meanx is Gaussian distributed. However, it is well known that individual cells' molecule counts often follow more complex distribution and approaches that rely on averaging and the central limit theorem are not valid [9]. Our goal is to find the control that moves this complex distribution towards a target molecule count. We chose the expected absolute deviation, as it allows the cost function to account for intrinsic asymmetries and non-Gaussian features inp: The summation over j is taken sum over all values of the observable molecular abundance in the FSP analysis. We perform receding horizon MPC, in which we aim to optimize the cost function J over the finite time horizon H using the current state estimate, i.e. to find the light input sequence u(t) which minimizes J over [k∆t, k∆t + H], i.e.
For these systems, the photo stimulation is taken either to be on or off between k∆t and (k + 1)∆t, therefore there are 2 H/∆t possible light combinations over the horizon H. We optimized by exhaustively evaluating equation (14) for each possible light sequence. Other tree-based searches or 'greedy algorithms' may be useful in optimizing finding controllers without exhaustively searching through all possible combinations.

Control of a light-driven expression system
Optogenetically controlled protein production is now possible in bacteria, yeast, and higher eukaryotes [19].
The general idea is that light at specific wavelengths can be used to activate transcription factors that 'turn on' protein production within cells [10,20]. Optogenetic systems often react faster than traditional chemically induced protein production, and downstream responses can be manipulated using frequency and/or amplitude modulation of the input light stimulation [11].
Here, we work specifically with the EL222 optogenetic system. In this system, EL222 molecules are created and degraded at all times within the cell. Under blue light photo stimulation, EL222 molecules dimerize and then become activated transcription factors, which bind to the pEL222 promoter, activating the production a protein. We model the system of light-induced protein production using the following set of biochemical reactions, describing the constitutive transcription and translation of EL222 molecules, which we denote X and the production of fluorescent protein, Y. The control parameter, u(t), represents a step function that follows from the light either being on or off. We assume a delay τ between the time the light is applied and fluorescent protein is produced according to the model B is a random variable which describes a 'burst' of EL222 expression, and is geometrically distributed [21]. This model is valid when mRNA are short-lived compared to protein.
Under this assumption, each individual translation event can be ignored, and replaced with a burst of B protein. The mean burst size b is given by the number of translation events per mRNA, which may be derived from the ratio of the degradation rate of mRNA to the translation rate. B is a geometrically distributed random variable, corresponding to the number of protein made by an mRNA before it was degraded. Using this bursting model of gene expression, we have the following set of propensities for the system, where x i and y i indicate the integer counts of each molecular species. The infinitesimal generator can therefore be written We considered both a constant target and a dynamic target. Individual cells were simulated using a modified form of the stochastic simulation algorithm [22] which allowed us to dynamically update the simulation by applying the control u(t). For the population measurement we computed the average fluorescence by taking the molecular abundance of the fluorescent protein s i from the Gillespie simulation for the ith cell at time k, (17) where (µ FP T k − y k )) 2 (18) and the single-cell error where the total number of time points is indicated by Nt. In contrast to equations (12) and (13), the purpose of these errors is to evaluate the quality of the fluorescent measurements realized by each control strategy. We compared the model-based controllers at the population and single-cell level to naive bang-bang controllers. For all controllers, we set ∆t = 6 min, and used H = 4 for the model-based controllers. The naive controller stimulates the cells at k(∆t + 1) if the current measurement is below the target fluorescence (y k < µ FP T k for population, f i k < µ FP T k for single-cells), and does not stimulate cells if  As is apparent in figures 2(A) and (C), the population mean under population control reaches the target faster than under single-cell control. On the other hand, after arriving to the target the singlecell control achieves a lower error. We analyzed this trend as a function of population size in figure 3. Time-to-target is the first time at which the population mean reaches the target. Steady state error is the mean squared error of the population mean after the target has first been reached. For the LNA population control, 50 independent cell populations were simulated at each population size. For each FSP population control, populations at each size were sampled 50 times from a set of 200 independently-controlled cells without replacement to construct standard deviations of the errors for different population sizes. For the steady state error, we show that single-cell control results in marginally lower population errors on average. However, this trend is reversed for time-totarget; across all population sizes, the single-cell control leads to longer time-to-target.
Next, single-cell controller performance as a function of measurement time ∆t was quantified. Only FSP-based and simple bang-bang single-cell controllers have been considered thus far; now we evaluate the LNA-based population control (with population size 1) as another single-cell controller against the FSP-based controller. Figure 4 compares the LNAbased single-cell control with the FSP-based one. We considered three measurement times; ∆t = [12,24,48] min. When ∆t is small, the LNA and FSP give similar A B [min] LN A FSP solutions, and therefore a Gaussian approximation leads to accurate control. However, when ∆t is larger, mismatch between the Gaussian approximation and the true probability distribution (predicted by the FSP) leads to innacurate control. Single cell and population errors are shown in figure 4(B).
Computationally, the single-cell based control is significantly more expensive than the population approach. It requires solving equation (4) 2 H/∆t times for each cell within ∆t (i.e. between measurement updates), and therefore the main bottleneck is the computation of equation (4), which can be efficiently evaluated using Krylov subspace methods [23]. Algorithm 1 scales with t FSP × 2 H/∆t × Nt, in which the time to solve the FSP from k∆t to (k + 1)∆t is denoted t FSP . The FSP solution cost mostly depends on the computational expense of an orthogonalization step in the Krylov subspace methods (see [24] for a detailed discussion), and is therefore problem dependent. As long as the time to evaluate over the specified number of horizon steps, t FSP × 2 H/∆t is less than the measurement times ∆t, single-cell control using the FSP is feasible for at least one cell. Yet, for each cell the computation is 'embarrassingly parallel' , and one can scale up the number of cells that are controlled arbitrarily with the number of processors that are available. In the example presented here, the FSP with 2400 states can be solved in ≈0.05 s on a single CPU. This amounts to about 3 s for each cell with H = 6∆t. Using realistic experimental time scales of ∆t ≈ 6 min [5,12,25], one could sequentially evaluate control for 120 cells on a single processor.

Restoring deterministic behavior for a self-regulated gene
In this example, we start with a simple self regulated gene where the function f (x) is a Hill function which we assume can be affected by the control by modulating the basal production rate k b , As shown in figure 5, under certain parameter settings, the system is bistable. When the same system is simulated stochastically the trajectories bounce between the two stable equilibria ( figure 5(B)), a phenomenon which is not seen in the deterministic setting. We then used the single-cell control strategy to prevent state switching by supplying the deterministic model behavior (dashed lines, figure 5(C)) as the target for cells that either start with 0 protein molecules (shown in purple) or 100 protein molecules (shown in orange). The controller is effectively able to prevent stochastic switching and stabilize the equilibria of the system, shown in figure 5(A), by anticipating when cells are likely to pass the unstable region and switch from low to high expression levels. (B) In the deterministic system (dark orange and purple lines), the system is bistable, and the final state could be either a low or high value depending on the initial condition. However, when simulated with the SSA, state switching is observed and fluorescence switches between the high state and low state (light purple and orange lines). (C) Using the single-cell control, the state switching can be eliminated, restoring the deterministic behavior of the system.

Discussion
The advent of optogenetic control of single gene expression systems promises precise manipulation of key biological processes, yet many challenges remain [5,11,12,25]. We have compared single-cell and population control (figures 2-4). We show that population control tends to have a faster time to target, but has a higher errors for single-cells and the population mean (figures 2 and 3, table 1). While the improvement of single-cell control is marginal for ∆t = 6 min, further numerical experiments between FSP and LNA single-cell controllers in figure 4 suggest that model mismatch could lead to larger discrepancies at longer ∆t. This indicates that non-Gaussianity plays a key role in MPC of cell populations at both the single-cell and population level.
Using an accurate single-cell model can be important for performing single-cell control, particularly if the time between measurements is large and Gaussian approximations are not valid, as shown in figure 4(B), right panel. This model mismatch can even affect the ability to accurately control the population mean ( figure 4). While we have limited our work to analyzing how these controllers perform at different ∆t, one may also consider changing the prediction horizon, H. The horizon will be target dependent; for example, the optimal H would likely be different for the time-varying target compared to the constant target.
While we have presented FSP-based approaches and demonstrated their feasability and practical importance, other approaches may also yield precise control. One example is particle filtering, which typically use a sequential importance sampling scheme to draw samples from the posterior state estimatep. For single-cell control, this amounts to managing a set of particles for each cell independently, and could become computationally burdensome to achieve similar accuracy as the FSP approach presented here. However, if accuracy is not a concern, such a sampling scheme may allow one to consider more species than is possible using the approach in this work. It would also be interesting to investigate the use of Gillepie's algorithm [22] as a way to generate/propagate the particles through time. Recently, reinforcement learning has showed promise for controlling biological systems for which we do not have accurate models [25].
Future MPC approaches could stream real-time measurements to HPC resources, enabling more complex analyses such as model ensembles, online learning, and online experiment design. This could allow one to consider mismatch between the model performing the control and the system under study. Along these same lines, one could devise automated schemes to switch between computationally cheap approaches, such as the LNA, and computationally intense approaches such as the FSP.

Data availability statement
The data that support the findings of this study are openly available. URL: https://github.com/zachfox/ FSP-control.