A novel data-driven bilinear subspace identification approach.
Hua, Yang ; Shaoyuan, Li
INTRODUCTION
Being an important class of non-linear systems, bilinear systems
are useful for non-linear plant modelling and control in many
circumstances (Lakshminarayanan et al., 2001; Mastronardi et al., 2001).
The attention on the research of bilinear systems is increasing now,
mainly because such systems are more general to approximate a large
class of non-linear systems. At the same time, the simplicity of the
structure and similarity to linear systems make them more attractive in
many control applications (Bloemen et al., 2002). Therefore, research on
the identification of bilinear model is now increasing such as subspace identification methods (Favoreel et al., 1999).
Subspace identification (Van Overschee and De Moor, 1996; Qin et
al., 2005; Palanthandalam-Madapusi et al., 2005) has been widely
accepted for the identification of linear multivariable systems due to
its numerical computation advantages and applicability to MIMO systems.
However, it is noticed that a major drawback of the subspace bilinear
identification is the enormous dimension of the data matrices involved,
especially as the row number grows row number grows exponentially with
the order of the system.
To overcome the problem encountered in the bilinear subspace
identification methods, subset-selection techniques were used (Chen and
Maciejowski, 2000) to approximate the estimation of the model, where
only the most important rows in the data matrices are used. However,
this method is computationally expensive, because every row of the huge
dimension data matrices needs to be processed. Another approach based on
the kernel method was recently developed in Verdult and Verhaegen
(2005). Its computational complicity relies only on the computation of
the kernel matrix, which is a square matrix where the dimension equals
to the number of the samples and will never depend on the exponential
growing number of rows. However, the state-space model obtained through
this method is only an approximation to its represented system even in
the case of noise-free.
In order to make it more applicable to the modelling and control of
real industry processes, a new bilinear predictor model is identified in
this study. Based on the displacement structure theory (Kailath and
Sayed, 1995; Mastronardi et al., 2001), the method deals with the curse
of huge dimensionality and therefore reduces the computation cost. The
paper is organized as follows: in the next section, bilinear subspace
identification methods are briefly reviewed and the exponential increase
of the row number in the data matrix is illustrated; an enhanced
efficient bilinear subspace identification algorithm is proposed in the
third section; in order to demonstrate the enhanced efficiency, the
proposed method is simulated with the CSTR system in the fourth section;
followed by a conclusion section.
PROBLEM DESCRIPTION
This is a preliminary section to set up the necessary basis for
developing a new identification procedure. First of all, a discrete time MIMO bilinear model in state-space form is introduced to build up a
framework for the problem analysis and new algorithm development.
Actually, the model is well representative of many industrial processes
such as CSTR systems. Conventional identification procedure of bilinear
subspace method and formation of data matrices are explained to show the
data explosion problems, therefore the necessity to improve the
computational efficiency is brought forward to the next section.
Consider a discrete time MIMO bilinear model in state-space form
(Favoreel et al., 1999; Verdult and Verhaegen, 2005)
[x.sub.k+1] = [Ax.sub.k] + [Nu.sub.k] [cross product] [x.sub.k] +
[Bu.sub.k] + [v.sub.k] [y.sub.k] = [Cx.sub.k] [Du.sub.k] + [e.sub.k] (1)
where [x.sub.k] [member of] [R.sup.n], [u.sub.k] [member of]
[R.sup.m], and [y.sub.k] [member of] [R.sup.l] are the state, input and
output vectors, respectively, [v.sub.k] [member of] [R.sup.n] and
[e.sub.k] [member of] [R.sup.l] are the process and output immeasurable
noises, respectively, with zero mean, stationary white noise sequences,
which are assumed to be uncorrelated with the inputs; k = 1, 2, ... are
time indices. The matrix N = [[N.sub.1] [N.sub.2] ... [N.sub.m] [member
of] [R.sup.nxnm] characterizes the bilinearity of the model. The
Kronecker product [cross product] of two vectors a [member of] [R.sup.p]
and b [member of] [R.sup.q] are defined as a [cross product] b =
[[a.sub.1][b.sup.T] ... [a.sub.p][b.sup.T]].sup.T] [member of]
[R.sup.pq]. The Khatri-Rao product [dot circle] of two matrices A =
[[a.sub.1] [a.sub.2] ... [a.sub.j]] [member of] [R.sub.pxj] and B =
[[b.sub.1] [b.sub.2] ... [b.sub.j]] [member of] [R.sup.qxj] are defined
as the column-wise Kronecker product A [dot circle] B = [[a.sub.1]
[cross product] [b.sub.1] [a.sub.2] [cross product] [b.sub.2] ...
[a.sub.j] [cross product] [b.sub.j]], which will be used later. Such
bilinear dynamics can be found in many mass balance processes, such as
CSTR systems.
The following matrices are defined and used in bilinear subspace
identification (Verdult and Verhaegen, 2005):
[X.sub.k] = ([x.sub.k] [x.sub.k+1] ... [x.sub.k+N-1]) [member of]
[R.sup.nxN] (2)
[U.sub.k] = ([u.sub.k] [u.sub.k+1] ... [u.sub.k+N-1]) [member of]
[R.sup.mxn] (3)
[Y.sub.k] = ([y.sub.k] [y.sub.k+1] ... [y.sub.k+N-1]) [member of]
[R.sup.lxN] (4)
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (5)
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (6)
with the initial condition [MATHEMATICAL EXPRESSION NOT
REPRODUCIBLE IN ASCII] and [Y.sub.j|j] = [Y.sub.j] [member of]
[R.sup.lxN]. The matrix [U.sub.k+j/j] is defined similar to
[Y.sub.k+j/j].
According to the previous researches on subspace identification
(Van Overschee and De Moor, 1996; Favoreel et al., 1999; Verdult and
Verhaegen, 2005; Qin et al., 2005), the data equations are formulated
with the available measurements in a structured way. These equations
play a crucial role in subspace identification.
[X.sub.k+j = [[DELTA].sup.x.sub.k][X.sub.k+j-1|j] +
[[DELTA].sup.u.sub.k][U.sub.k+j-1|j]+V (7)
[[gamma].sub.k+j|j] = [H.sup.x.sub.k][X.sub.k+j-1|j] +
[H.sup.u.sub.k][U.sub.k+j-1|j]+[G.sub.k][U.sub.k+j] + E (8)
where
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],
and [G.sub.k] = [[[D.sup.T] 0].sup.T]. In the above
equations,[[DELTA].sup.x.sub.k], [[DELTA].sup.u.sub.k] [H.sup.x.sub.k]
and [H.sup.u.sub.k] are unknown since they depend on system matrices,
and the term V and E take noise into account. Given the measurements of
the system inputs and outputs, the data Hankel matrices are constructed
as follows:
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]
As mentioned in Verdult and Verhaegen (2005), a challenging problem
of the subspace methods for bilinear systems is the fact that the row
number of the data matrices H = [[[W.sup.T.sub.p] [U.sup.T.sub.f]
[Y.sup.T.sub.f].sup.T] grows exponentially with the block size j. In
some researches, QR fraction of matrix H is first performed to obtain
the subspace matrices. If k is selected equal to j, the row number M of
the data matrix H is given as:
M = dim (H)= jl + ([d.sup.2.sub.j)+[d.sub.j])(l/m+1)+[d.sub.j]+m
(9) where [d.sub.j] = [(m + 1).sup.j] - 1. It is requested that the size
is no smaller than the order, i.e., j [greater than or equal to] n (Van
Overschee and De Moor, 1996). Even for the considerably low-order
system, i.e., the small value of n, the computation complexity may
become heavily burdened. This limits practical applications of the
bilinear subspace identification. To illustrate this expression, Table 1
is given with the total row numbers of the matrix where j is selected as
j = n + 1.
COMPUTATIONAL EFFICIENT BILINEAR SUBSPACE IDENTIFICATION ALGORITHM
In this section, we propose a new bilinear subspace method with the
identification of predictor model. The prediction expressions of the
output [Y.sub.f] can be written as:
[[??].sub.f] [L.sub.w][W.sub.p] [L.sub.u] [U.sub.f] (10)
where the model identification involves finding the optimal
prediction of the future outputs [Y.sub.f] using this predictor. The
equation extrapolate future input-output behaviours from past
input-output data, which are in bilinear form because there are products
of control inputs in the construction of the data matrices [W.sub.p] and
[U.sub.f] . The problem of obtaining the optimal prediction of [Y.sub.f]
can be obtained through solving the following least-square problem:
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (11)
where [L.sub.w] and [L.sub.u] are the subspace predictor matrices.
The prediction [Y.sub.f] can be found by the orthogonal projection of
the row space of [Y.sub.f] onto the row space spanned by [W.sub.p] and
[U.sub.f]:
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (12)
In previous studies (Palanthandalam-Madapusi et al., 2005; Verdult
and Verhaegen, 2005), the QR fraction of [H.sup.T] is implemented once
the data matrix H is constructed. The row and column dimension of the
matrix H are M and N, respectively, where N is the same as the sample
number and M grows exponentially with the block size j, as declared
before. The computation complexity is O([M.sup.2]N ) for the QR
factorization of [H.sup.T]. It is well known that the R factor of the RQ
factorization of the matrix [H.sup.T] is equivalent to the Cholesky
factor L of the matrix [PHI] = [HH.sup.T] (Mastronardi et al., 2001). A
new algorithm with enhanced computational efficiency is pursued based on
this observation.
The displacement structure theory (Kailath and Sayed, 1995;
Mastronardi et al., 2001) is dominantly used here to resolve Projection
(12). To simplify the following discussion, we transform all the Hankel
matrices in Projection (12) to Toeplitz matrices by reversing the order
of their columns (Chun et al., 1987). To relieve the computation
complexity, the matrix [PHI] instead of H is proposed in the numerical
computation, where [PHI] = [HH.sup.T]. It is a symmetrical
positive-semi-definite with an order n. Then the displacement
(Mastronardi et al., 2001) of [PHI] is defined as follows:
[[nabla].sub.z] [PHI] = [PHI] - [Z.sup.T][PHI]Z (13)
where the displacement operator Z is the generalized shift matrix
with ones along the first subdiagonal and zeros elsewhere.
It is assumed that the rank of [[nabla].sub.z] is [alpha] and
usually satisfied with [alpha] <<n. Let the matrix [[nabla].sub.z]
[PHI] have p positive eigenvalues and q = [alpha] - p negative
eigenvalues, then it has a following factorization :
[[nabla].sub.z] [PHI] [G.sub.T][SIGMA]G (14)
where
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]
The matrix G is called as the generator of [PHI] and the number
[alpha] is the length of the generator with respect to Z.
Lemma 1.
For an arbitrary matrix [PHI] [member] [R.sup.MxM], the matrix
??can be reconstructed using the generator G as follows:
[PHI] = [n-1.summation over (i=0)]
[([Z.sup.i]).sup.T][G.sup.T][SIGMA][GZ.sup.i] (15)
As shown in Chun et al. (1987), the generator of the product of two
Hankle or Toeplitz matrices can be found by inspection. The generator of
the matrix [PHI] can be selected as follows:
G = [[g.sub.1], [g.sub.2], ..., g[alpha]]
[g.sub.1] = [H.sup.T]a/[(ba).sup.1/2], [g.sub.2] = [a.sup.(i)],
[g.sub.3] = [g.sup.(1).sub.1], [g.sub.4] = [c.sup.s] (16)
where a is the first column, [b.sup.T] denotes the first row, and
[c.sup.T] denotes the last row of the matrix [H.sup.T], [a.sup.(1)]
denotes that the vector is the same as a except the first element is 0,
[c.sup.s] is the vector as c shifts down by one position with
[c.sup.s.sub.1] = 0.
Lemma 2.
Let [PHI] be a positive-semi-definite matrix with [MATHEMATICAL
EXPRESSION NOT REPRODUCIBLE IN ASCII]. Its proper generator is given as
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII], and the first row
of the triangular factorization is constructed by [[??].sub.1]. Then the
matrix [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] has null
first column and row, whose generator is given by [MATHEMATICAL
EXPRESSION NOT REPRODUCIBLE IN ASCII]. The proof can be found in Chun et
al.(1987).
The generalized Schur algorithm just consists of a recursive use of
lemma 2, where the generator must be selected in advance. The input of
this algorithm is the proper generator of [PHI] with respect to Z, and
the output is the partial triangular factor. Therefore, this generalized
Schur algorithm allows reconstructing the Cholesky factor L of the
matrix [PHI], where [PHI] = [LL.sup.T].
Using the generator given in Equation (16), the Cholesky factor of
the matrix [PHI] is obtained after performing M steps of partial
factorization using the generalized Schur algorithm. This implements the
same procedure to the QR factorization. The overall computation
complexity of the Cholesky factorization is O(MN), which is much smaller
than O([M.sup.2]N), especially when the row number M of the matrix H is
large. The comparison of flops of the Cholesky factorization and QR
fraction is given in Table 2. The system is selected to be m = 1, l = 2
for different block size.
Once the Cholesky factor L is available:
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (17)
the terms Lw and Lu can be computed as follows. Let:
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (18)
where [dagger] denotes the Moore-Penrose pseudo-inverse. The
subspace predictor matrices [L.sub.w] and [L.sub.u] can be represented
using Matlab notations as follows:
[L.sub.w] = L(:, 1 : [n.sub.w]) (19)
[L.sub.u] = L(:, [n.sub.w] + 1: end) (20)
where [n.sub.w] is the total number of rows in [W.sub.p].
SIMULATION OF A CSTR SYSTEM
The main objective of this simulation study is to illustrate the
efficiency of the proposed bilinear subspace identification method with
its application to a non-linear plant CSTR. A single irreversible,
first-order, exothermic reaction, from A to B is assumed to occur in the
reactor. The mass and energy balance are described by the following two
non-linear ordinary differential equations (Ge et al., 1999; Wu, 2001):
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (21)
where [C.sub.a] is the effluent concentration of the component A, T
is the reactor temperature, q is the volumetric feed flow rate and
[Q.sub.c] is the coolant flow rate. The parameters of the CSTR system
are taken as Ge et al. (1999). The CSTR system is modelled as a
one-input two-output system with the input [Q.sub.c] and outputs
[C.sub.a] , T.
The open-loop responses for +10% and -30% step changes in
[Q.sub.c], reported in Figure 1, demonstrate that the reactor exhibits
highly non-linear behaviours in this operation regime. For a better
approximation of the CSTR system, a bilinear subspace predictor model is
identified in this simulation applying the proposed method.
The output data were collected using a random Gaussian signal with
the 20% increase and a decrease of [Q.sub.c0] = 103.41 as input. We use
1100 data samples in this simulation, where 1000 data samples are used
for the model identification and the other 100 data samples are used for
the validation of the prediction ability. The non-linear CSTR is
approximated using bilinear predicator model with sample time 0.1 min.
The block size j is chosen as 6, and the sample number N = 1000.
[FIGURE 1 OMITTED]
Using the input and output data set, construct the data matrix H =
[[[W.sup.T.sub.p][U.sup.T.sub.f][Y.sup.T.sub.f]].sup.T] and compute the
symmetric matrix [PHI] = [H.sup.T]H. Then the Projection (12) is
calculated using Cholesky factorization of the matrix [PHI] based on the
displacement structure. As pointed out before, this procedure composes
most of the computational cost and is computational efficient. The
comparison results with QR fraction are given in Table 2. The
approximate bilinear predictor model of the non-linear CSTR systems is
estimated in the form of Equation (11) without any prior knowledge,
including the order of the CSTR system.
For the validation of the identified model, the responses of the
predictor model and the true CSTR system to the 200 input data are shown
in Figure 2 with solid lines and dashed lines, respectively. The first
100 samples are the data used in the model identification and the last
100 samples are fresh data, which are used for the validation of
regression and predictive performance of the identified bilinear model,
respectively. It is implied that the bilinear model fits the sample data
well and can approximate the CSTR system with good predictive ability.
The variance-account-for (VAF) is used as a measure of the quality
of the output signal generated by the identified model, which is defined
as:
VAF% = [1- var(y-[y.sub.est)]/var(y)]x100%
where y denotes the real output signals, yest is the estimated
output signals, and var(*) denotes the variance of a quasi-stationary
signal. The VAF results of the identification model with the validation
of 100 old data and 100 fresh data are shown in Table 3.
It is shown that the proposed approach implements data-driven
identification of bilinear subspace predictor model with high
performance. The computational reduction shown in Table 2 is also
considerably evident. These improvements may induce more applications of
bilinear subspace identification.
[FIGURE 2 OMITTED]
CONCLUSIONS
The main contribution of the study can be summarized with a proper
integration of various algorithms into a computationally efficient
procedure, therefore, removing the barrier for using subspace
identification method for bilinear systems. The significant reduction in
computation of projection matrices provides a new field for subspace
identification and control of systems.
ACKNOWLEDGEMENTS
This work was supported by the National Natural Science Foundation
of China under Grant 60474051 and the program for New Century Excellent
Talents at the University of China (NCET). The authors are grateful to
the anonymous reviewers for their helpful comments and constructive
suggestions with regard to this paper.
Manuscript received January 16, 2006; revised manuscript received
April 6, 2006; accepted for publication July 16, 2006.
NOMENCLATURE
A/B orthogonal projection
[A/[sub.c]B oblique projection
q process flow rate (l/min)
[T.sub.cf] inlet coolant temperature (k)
V reactor volume (l)
-[DELTA]H heat of reaction (cal/mol)
[k.sub.0] reaction rate constant (l/min)
[C.sub.p], [C.sub.pc] specific heats (cal/g/k)
E/R activation energy (K)
[rho], [[rho].sub.c] liquid densities (g/l)
[T.sub.f] feed temperature (K)
[h.sub.A] heat transfer coefficient (cal/min/K)
REFERENCES
Bloemen, H. H. J., M. Cannon and B. Kouvaritakis, "An
Interpolation Strategy for Discrete-Time Bilinear MPC Problems,"
IEEE Trans. Auto. Control 45(5), 775-778 (2002).
Chen, H. and J. Maciejowski, "An Improved Subspace
Identification Method for Bilinear Systems," in "Proceedings
of the 39th IEEE Conference on Decision and Control," Sydney,
Australia (2000), pp. 1573-1578.
Chun, J., T. Kailath and H. Lev-Ari, "Fast Parallel Algorithms
for QR and Triangular Factorization," SIAM J. Sci. Stat. Comput.
8(6), 899-913 (1987).
Favoreel, W., B. De Moor and P. Van Overschee, "Subspace
Identification of Bilinear Systems Subject to White Inputs," IEEE
Trans. Auto. Control 44(6), 1157-1165 (1999).
Ge, S. S., C. C. Hang and T. Zhang, "Nonlinear Adaptive
Control using Neural Networks and Its Application to CSTR Systems,"
J. Process Control 9(4), 313-323 (1999).
Kailath, T. and A. H. Sayed, "Displacement Structure: Theory
and Applications," SIAM Review 37, 297-386 (1995).
Khatri, C. G. and C. R. Rao, "Solutions to Some Function
Equations and their Applications to Characterization of Probability
Distributions," Indian J. Stat. 30, 167-180 (1968).
Lakshminarayanan, S., P. Mhatre and R. Gudi, "Identification
of Bilinear Models for Chemical Processes using Canonical Variate
Analysis," Ind. Eng. Chem. Res. 40(20), 4415-4427 (2001).
Lightbody, G., and G. W. Irwin, "Direct Neural Model Reference
Adaptive Control," IEEE Proc., Control Theory Appl. 142(1), 31-43
(1995).
Mastronardi, N., D. Kressner, V. Sima, P. Van Dooren and S. Van
Huffel, "A Fast Algorithm for Subspace State-Space System
Identification Via Exploitation of the Displacement Structure," J.
Comput. Applied Math. 132(1), 71-81 (2001).
Palanthandalam-Madapusi, H. J., S. Lacy, J. B. Hoagg and D. S.
Bernstein, "Subspace-Based Identification for Linear and Nonlinear
Systems," in "Proceeding of the American Control
Conference," Portland, OR, U.S.A. (2005), pp. 2320-2334.
Qin, S. J., W. Lin and L. Ljung, "A Novel Subspace
Identification Approach with Enforced Causal Models," Automatica
41(12), 2043-2053 (2005).
Wu, F., "LMI-Based Robust Model Predictive Control and its
Application to an Industrial CSTR Problem," J. Process Control
11(6), 649-659 (2001).
Van Overschee, P. and B. De Moor, "Subspace Identification for
Linear Systems: Theory, Implementation, Applications," Kluwer
Academic Publishers, Dordrecht (1996).
Verdult, V. and M. Verhaegen, "Kernel Methods for Subspace
Identification of Multivariable LPV and Bilinear Systems,"
Automatica 41(9), 1557-1565 (2005).
Hua Yang and Shaoyuan Li *
Institute of Automation, Shanghai Jiao Tong University, Shanghai,
200240, China
* Author to whom correspondence may be addressed. E-mail address:
[email protected]
Table 1. Total number of rows of the matrix H
m = 1 n = 1 n = 2 n = 3 n = 4
1 30 123 500 2021
2 158 1438 13050 117866
Table 2. The consuming number of flops of QR
and Cholesky factorization
QR Cholesky
j = 2 7.04 x [10.sup.8] 7.48 x [10.sup.5]
j = 3 2.65 x [10.sup.10] 3.91 x [10.sup.6]
Table 3. VAF results for the identified model
Old data Fresh data
VAF (92.41% 94.43%) (90.60% 93.28%)