文章基本信息

标题：A novel data-driven bilinear subspace identification approach.
作者：Hua, Yang ; Shaoyuan, Li
期刊名称：Canadian Journal of Chemical Engineering
印刷版ISSN：0008-4034
出版年度：2007
期号：February
语种：English
出版社：Chemical Institute of Canada
摘要：Being an important class of non-linear systems, bilinear systems are useful for non-linear plant modelling and control in many circumstances (Lakshminarayanan et al., 2001; Mastronardi et al., 2001). The attention on the research of bilinear systems is increasing now, mainly because such systems are more general to approximate a large class of non-linear systems. At the same time, the simplicity of the structure and similarity to linear systems make them more attractive in many control applications (Bloemen et al., 2002). Therefore, research on the identification of bilinear model is now increasing such as subspace identification methods (Favoreel et al., 1999).

A novel data-driven bilinear subspace identification approach.

Hua, Yang ; Shaoyuan, Li

INTRODUCTION

Being an important class of non-linear systems, bilinear systems are useful for non-linear plant modelling and control in many circumstances (Lakshminarayanan et al., 2001; Mastronardi et al., 2001). The attention on the research of bilinear systems is increasing now, mainly because such systems are more general to approximate a large class of non-linear systems. At the same time, the simplicity of the structure and similarity to linear systems make them more attractive in many control applications (Bloemen et al., 2002). Therefore, research on the identification of bilinear model is now increasing such as subspace identification methods (Favoreel et al., 1999).

Subspace identification (Van Overschee and De Moor, 1996; Qin et al., 2005; Palanthandalam-Madapusi et al., 2005) has been widely accepted for the identification of linear multivariable systems due to its numerical computation advantages and applicability to MIMO systems. However, it is noticed that a major drawback of the subspace bilinear identification is the enormous dimension of the data matrices involved, especially as the row number grows row number grows exponentially with the order of the system.

To overcome the problem encountered in the bilinear subspace identification methods, subset-selection techniques were used (Chen and Maciejowski, 2000) to approximate the estimation of the model, where only the most important rows in the data matrices are used. However, this method is computationally expensive, because every row of the huge dimension data matrices needs to be processed. Another approach based on the kernel method was recently developed in Verdult and Verhaegen (2005). Its computational complicity relies only on the computation of the kernel matrix, which is a square matrix where the dimension equals to the number of the samples and will never depend on the exponential growing number of rows. However, the state-space model obtained through this method is only an approximation to its represented system even in the case of noise-free.

In order to make it more applicable to the modelling and control of real industry processes, a new bilinear predictor model is identified in this study. Based on the displacement structure theory (Kailath and Sayed, 1995; Mastronardi et al., 2001), the method deals with the curse of huge dimensionality and therefore reduces the computation cost. The paper is organized as follows: in the next section, bilinear subspace identification methods are briefly reviewed and the exponential increase of the row number in the data matrix is illustrated; an enhanced efficient bilinear subspace identification algorithm is proposed in the third section; in order to demonstrate the enhanced efficiency, the proposed method is simulated with the CSTR system in the fourth section; followed by a conclusion section.

PROBLEM DESCRIPTION

This is a preliminary section to set up the necessary basis for developing a new identification procedure. First of all, a discrete time MIMO bilinear model in state-space form is introduced to build up a framework for the problem analysis and new algorithm development. Actually, the model is well representative of many industrial processes such as CSTR systems. Conventional identification procedure of bilinear subspace method and formation of data matrices are explained to show the data explosion problems, therefore the necessity to improve the computational efficiency is brought forward to the next section.

Consider a discrete time MIMO bilinear model in state-space form (Favoreel et al., 1999; Verdult and Verhaegen, 2005)

[x.sub.k+1] = [Ax.sub.k] + [Nu.sub.k] [cross product] [x.sub.k] + [Bu.sub.k] + [v.sub.k] [y.sub.k] = [Cx.sub.k] [Du.sub.k] + [e.sub.k] (1)

where [x.sub.k] [member of] [R.sup.n], [u.sub.k] [member of] [R.sup.m], and [y.sub.k] [member of] [R.sup.l] are the state, input and output vectors, respectively, [v.sub.k] [member of] [R.sup.n] and [e.sub.k] [member of] [R.sup.l] are the process and output immeasurable noises, respectively, with zero mean, stationary white noise sequences, which are assumed to be uncorrelated with the inputs; k = 1, 2, ... are time indices. The matrix N = [[N.sub.1] [N.sub.2] ... [N.sub.m] [member of] [R.sup.nxnm] characterizes the bilinearity of the model. The Kronecker product [cross product] of two vectors a [member of] [R.sup.p] and b [member of] [R.sup.q] are defined as a [cross product] b = [[a.sub.1][b.sup.T] ... [a.sub.p][b.sup.T]].sup.T] [member of] [R.sup.pq]. The Khatri-Rao product [dot circle] of two matrices A = [[a.sub.1] [a.sub.2] ... [a.sub.j]] [member of] [R.sub.pxj] and B = [[b.sub.1] [b.sub.2] ... [b.sub.j]] [member of] [R.sup.qxj] are defined as the column-wise Kronecker product A [dot circle] B = [[a.sub.1] [cross product] [b.sub.1] [a.sub.2] [cross product] [b.sub.2] ... [a.sub.j] [cross product] [b.sub.j]], which will be used later. Such bilinear dynamics can be found in many mass balance processes, such as CSTR systems.

The following matrices are defined and used in bilinear subspace identification (Verdult and Verhaegen, 2005):

[X.sub.k] = ([x.sub.k] [x.sub.k+1] ... [x.sub.k+N-1]) [member of] [R.sup.nxN] (2)

[U.sub.k] = ([u.sub.k] [u.sub.k+1] ... [u.sub.k+N-1]) [member of] [R.sup.mxn] (3)

[Y.sub.k] = ([y.sub.k] [y.sub.k+1] ... [y.sub.k+N-1]) [member of] [R.sup.lxN] (4)

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (5)

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (6)

with the initial condition [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] and [Y.sub.j|j] = [Y.sub.j] [member of] [R.sup.lxN]. The matrix [U.sub.k+j/j] is defined similar to [Y.sub.k+j/j].

According to the previous researches on subspace identification (Van Overschee and De Moor, 1996; Favoreel et al., 1999; Verdult and Verhaegen, 2005; Qin et al., 2005), the data equations are formulated with the available measurements in a structured way. These equations play a crucial role in subspace identification.

[X.sub.k+j = [[DELTA].sup.x.sub.k][X.sub.k+j-1|j] + [[DELTA].sup.u.sub.k][U.sub.k+j-1|j]+V (7)

[[gamma].sub.k+j|j] = [H.sup.x.sub.k][X.sub.k+j-1|j] + [H.sup.u.sub.k][U.sub.k+j-1|j]+[G.sub.k][U.sub.k+j] + E (8)

where

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],

and [G.sub.k] = [[[D.sup.T] 0].sup.T]. In the above equations,[[DELTA].sup.x.sub.k], [[DELTA].sup.u.sub.k] [H.sup.x.sub.k] and [H.sup.u.sub.k] are unknown since they depend on system matrices, and the term V and E take noise into account. Given the measurements of the system inputs and outputs, the data Hankel matrices are constructed as follows:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

As mentioned in Verdult and Verhaegen (2005), a challenging problem of the subspace methods for bilinear systems is the fact that the row number of the data matrices H = [[[W.sup.T.sub.p] [U.sup.T.sub.f] [Y.sup.T.sub.f].sup.T] grows exponentially with the block size j. In some researches, QR fraction of matrix H is first performed to obtain the subspace matrices. If k is selected equal to j, the row number M of the data matrix H is given as:

M = dim (H)= jl + ([d.sup.2.sub.j)+[d.sub.j])(l/m+1)+[d.sub.j]+m (9) where [d.sub.j] = [(m + 1).sup.j] - 1. It is requested that the size is no smaller than the order, i.e., j [greater than or equal to] n (Van Overschee and De Moor, 1996). Even for the considerably low-order system, i.e., the small value of n, the computation complexity may become heavily burdened. This limits practical applications of the bilinear subspace identification. To illustrate this expression, Table 1 is given with the total row numbers of the matrix where j is selected as j = n + 1.

COMPUTATIONAL EFFICIENT BILINEAR SUBSPACE IDENTIFICATION ALGORITHM

In this section, we propose a new bilinear subspace method with the identification of predictor model. The prediction expressions of the output [Y.sub.f] can be written as:

[[??].sub.f] [L.sub.w][W.sub.p] [L.sub.u] [U.sub.f] (10)

where the model identification involves finding the optimal prediction of the future outputs [Y.sub.f] using this predictor. The equation extrapolate future input-output behaviours from past input-output data, which are in bilinear form because there are products of control inputs in the construction of the data matrices [W.sub.p] and [U.sub.f] . The problem of obtaining the optimal prediction of [Y.sub.f] can be obtained through solving the following least-square problem:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (11)

where [L.sub.w] and [L.sub.u] are the subspace predictor matrices. The prediction [Y.sub.f] can be found by the orthogonal projection of the row space of [Y.sub.f] onto the row space spanned by [W.sub.p] and [U.sub.f]:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (12)

In previous studies (Palanthandalam-Madapusi et al., 2005; Verdult and Verhaegen, 2005), the QR fraction of [H.sup.T] is implemented once the data matrix H is constructed. The row and column dimension of the matrix H are M and N, respectively, where N is the same as the sample number and M grows exponentially with the block size j, as declared before. The computation complexity is O([M.sup.2]N ) for the QR factorization of [H.sup.T]. It is well known that the R factor of the RQ factorization of the matrix [H.sup.T] is equivalent to the Cholesky factor L of the matrix [PHI] = [HH.sup.T] (Mastronardi et al., 2001). A new algorithm with enhanced computational efficiency is pursued based on this observation.

The displacement structure theory (Kailath and Sayed, 1995; Mastronardi et al., 2001) is dominantly used here to resolve Projection (12). To simplify the following discussion, we transform all the Hankel matrices in Projection (12) to Toeplitz matrices by reversing the order of their columns (Chun et al., 1987). To relieve the computation complexity, the matrix [PHI] instead of H is proposed in the numerical computation, where [PHI] = [HH.sup.T]. It is a symmetrical positive-semi-definite with an order n. Then the displacement (Mastronardi et al., 2001) of [PHI] is defined as follows:

[[nabla].sub.z] [PHI] = [PHI] - [Z.sup.T][PHI]Z (13)

where the displacement operator Z is the generalized shift matrix with ones along the first subdiagonal and zeros elsewhere.

It is assumed that the rank of [[nabla].sub.z] is [alpha] and usually satisfied with [alpha] <<n. Let the matrix [[nabla].sub.z] [PHI] have p positive eigenvalues and q = [alpha] - p negative eigenvalues, then it has a following factorization :

[[nabla].sub.z] [PHI] [G.sub.T][SIGMA]G (14)

where

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

The matrix G is called as the generator of [PHI] and the number [alpha] is the length of the generator with respect to Z.

Lemma 1.

For an arbitrary matrix [PHI] [member] [R.sup.MxM], the matrix ??can be reconstructed using the generator G as follows:

[PHI] = [n-1.summation over (i=0)] [([Z.sup.i]).sup.T][G.sup.T][SIGMA][GZ.sup.i] (15)

As shown in Chun et al. (1987), the generator of the product of two Hankle or Toeplitz matrices can be found by inspection. The generator of the matrix [PHI] can be selected as follows:

G = [[g.sub.1], [g.sub.2], ..., g[alpha]]

[g.sub.1] = [H.sup.T]a/[(ba).sup.1/2], [g.sub.2] = [a.sup.(i)], [g.sub.3] = [g.sup.(1).sub.1], [g.sub.4] = [c.sup.s] (16)

where a is the first column, [b.sup.T] denotes the first row, and [c.sup.T] denotes the last row of the matrix [H.sup.T], [a.sup.(1)] denotes that the vector is the same as a except the first element is 0, [c.sup.s] is the vector as c shifts down by one position with [c.sup.s.sub.1] = 0.

Lemma 2.

Let [PHI] be a positive-semi-definite matrix with [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]. Its proper generator is given as [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII], and the first row of the triangular factorization is constructed by [[??].sub.1]. Then the matrix [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] has null first column and row, whose generator is given by [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]. The proof can be found in Chun et al.(1987).

The generalized Schur algorithm just consists of a recursive use of lemma 2, where the generator must be selected in advance. The input of this algorithm is the proper generator of [PHI] with respect to Z, and the output is the partial triangular factor. Therefore, this generalized Schur algorithm allows reconstructing the Cholesky factor L of the matrix [PHI], where [PHI] = [LL.sup.T].

Using the generator given in Equation (16), the Cholesky factor of the matrix [PHI] is obtained after performing M steps of partial factorization using the generalized Schur algorithm. This implements the same procedure to the QR factorization. The overall computation complexity of the Cholesky factorization is O(MN), which is much smaller than O([M.sup.2]N), especially when the row number M of the matrix H is large. The comparison of flops of the Cholesky factorization and QR fraction is given in Table 2. The system is selected to be m = 1, l = 2 for different block size.

Once the Cholesky factor L is available:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (17)

the terms Lw and Lu can be computed as follows. Let:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (18)

where [dagger] denotes the Moore-Penrose pseudo-inverse. The subspace predictor matrices [L.sub.w] and [L.sub.u] can be represented using Matlab notations as follows:

[L.sub.w] = L(:, 1 : [n.sub.w]) (19)

[L.sub.u] = L(:, [n.sub.w] + 1: end) (20)

where [n.sub.w] is the total number of rows in [W.sub.p].

SIMULATION OF A CSTR SYSTEM

The main objective of this simulation study is to illustrate the efficiency of the proposed bilinear subspace identification method with its application to a non-linear plant CSTR. A single irreversible, first-order, exothermic reaction, from A to B is assumed to occur in the reactor. The mass and energy balance are described by the following two non-linear ordinary differential equations (Ge et al., 1999; Wu, 2001):

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (21)

where [C.sub.a] is the effluent concentration of the component A, T is the reactor temperature, q is the volumetric feed flow rate and [Q.sub.c] is the coolant flow rate. The parameters of the CSTR system are taken as Ge et al. (1999). The CSTR system is modelled as a one-input two-output system with the input [Q.sub.c] and outputs [C.sub.a] , T.

The open-loop responses for +10% and -30% step changes in [Q.sub.c], reported in Figure 1, demonstrate that the reactor exhibits highly non-linear behaviours in this operation regime. For a better approximation of the CSTR system, a bilinear subspace predictor model is identified in this simulation applying the proposed method.

The output data were collected using a random Gaussian signal with the 20% increase and a decrease of [Q.sub.c0] = 103.41 as input. We use 1100 data samples in this simulation, where 1000 data samples are used for the model identification and the other 100 data samples are used for the validation of the prediction ability. The non-linear CSTR is approximated using bilinear predicator model with sample time 0.1 min. The block size j is chosen as 6, and the sample number N = 1000.

[FIGURE 1 OMITTED]

Using the input and output data set, construct the data matrix H = [[[W.sup.T.sub.p][U.sup.T.sub.f][Y.sup.T.sub.f]].sup.T] and compute the symmetric matrix [PHI] = [H.sup.T]H. Then the Projection (12) is calculated using Cholesky factorization of the matrix [PHI] based on the displacement structure. As pointed out before, this procedure composes most of the computational cost and is computational efficient. The comparison results with QR fraction are given in Table 2. The approximate bilinear predictor model of the non-linear CSTR systems is estimated in the form of Equation (11) without any prior knowledge, including the order of the CSTR system.

For the validation of the identified model, the responses of the predictor model and the true CSTR system to the 200 input data are shown in Figure 2 with solid lines and dashed lines, respectively. The first 100 samples are the data used in the model identification and the last 100 samples are fresh data, which are used for the validation of regression and predictive performance of the identified bilinear model, respectively. It is implied that the bilinear model fits the sample data well and can approximate the CSTR system with good predictive ability.

The variance-account-for (VAF) is used as a measure of the quality of the output signal generated by the identified model, which is defined as:

VAF% = [1- var(y-[y.sub.est)]/var(y)]x100%

where y denotes the real output signals, yest is the estimated output signals, and var(*) denotes the variance of a quasi-stationary signal. The VAF results of the identification model with the validation of 100 old data and 100 fresh data are shown in Table 3.

It is shown that the proposed approach implements data-driven identification of bilinear subspace predictor model with high performance. The computational reduction shown in Table 2 is also considerably evident. These improvements may induce more applications of bilinear subspace identification.

[FIGURE 2 OMITTED]

CONCLUSIONS

The main contribution of the study can be summarized with a proper integration of various algorithms into a computationally efficient procedure, therefore, removing the barrier for using subspace identification method for bilinear systems. The significant reduction in computation of projection matrices provides a new field for subspace identification and control of systems.

ACKNOWLEDGEMENTS

This work was supported by the National Natural Science Foundation of China under Grant 60474051 and the program for New Century Excellent Talents at the University of China (NCET). The authors are grateful to the anonymous reviewers for their helpful comments and constructive suggestions with regard to this paper.

Manuscript received January 16, 2006; revised manuscript received April 6, 2006; accepted for publication July 16, 2006.

NOMENCLATURE

A/B orthogonal projection
[A/[sub.c]B oblique projection
q process flow rate (l/min)
[T.sub.cf] inlet coolant temperature (k)
V reactor volume (l)
-[DELTA]H heat of reaction (cal/mol)
[k.sub.0] reaction rate constant (l/min)
[C.sub.p], [C.sub.pc] specific heats (cal/g/k)
E/R activation energy (K)
[rho], [[rho].sub.c] liquid densities (g/l)
[T.sub.f] feed temperature (K)
[h.sub.A] heat transfer coefficient (cal/min/K)

REFERENCES

Bloemen, H. H. J., M. Cannon and B. Kouvaritakis, "An Interpolation Strategy for Discrete-Time Bilinear MPC Problems," IEEE Trans. Auto. Control 45(5), 775-778 (2002).

Chen, H. and J. Maciejowski, "An Improved Subspace Identification Method for Bilinear Systems," in "Proceedings of the 39th IEEE Conference on Decision and Control," Sydney, Australia (2000), pp. 1573-1578.

Chun, J., T. Kailath and H. Lev-Ari, "Fast Parallel Algorithms for QR and Triangular Factorization," SIAM J. Sci. Stat. Comput. 8(6), 899-913 (1987).

Favoreel, W., B. De Moor and P. Van Overschee, "Subspace Identification of Bilinear Systems Subject to White Inputs," IEEE Trans. Auto. Control 44(6), 1157-1165 (1999).

Ge, S. S., C. C. Hang and T. Zhang, "Nonlinear Adaptive Control using Neural Networks and Its Application to CSTR Systems," J. Process Control 9(4), 313-323 (1999).

Kailath, T. and A. H. Sayed, "Displacement Structure: Theory and Applications," SIAM Review 37, 297-386 (1995).

Khatri, C. G. and C. R. Rao, "Solutions to Some Function Equations and their Applications to Characterization of Probability Distributions," Indian J. Stat. 30, 167-180 (1968).

Lakshminarayanan, S., P. Mhatre and R. Gudi, "Identification of Bilinear Models for Chemical Processes using Canonical Variate Analysis," Ind. Eng. Chem. Res. 40(20), 4415-4427 (2001).

Lightbody, G., and G. W. Irwin, "Direct Neural Model Reference Adaptive Control," IEEE Proc., Control Theory Appl. 142(1), 31-43 (1995).

Mastronardi, N., D. Kressner, V. Sima, P. Van Dooren and S. Van Huffel, "A Fast Algorithm for Subspace State-Space System Identification Via Exploitation of the Displacement Structure," J. Comput. Applied Math. 132(1), 71-81 (2001).

Palanthandalam-Madapusi, H. J., S. Lacy, J. B. Hoagg and D. S. Bernstein, "Subspace-Based Identification for Linear and Nonlinear Systems," in "Proceeding of the American Control Conference," Portland, OR, U.S.A. (2005), pp. 2320-2334.

Qin, S. J., W. Lin and L. Ljung, "A Novel Subspace Identification Approach with Enforced Causal Models," Automatica 41(12), 2043-2053 (2005).

Wu, F., "LMI-Based Robust Model Predictive Control and its Application to an Industrial CSTR Problem," J. Process Control 11(6), 649-659 (2001).

Van Overschee, P. and B. De Moor, "Subspace Identification for Linear Systems: Theory, Implementation, Applications," Kluwer Academic Publishers, Dordrecht (1996).

Verdult, V. and M. Verhaegen, "Kernel Methods for Subspace Identification of Multivariable LPV and Bilinear Systems," Automatica 41(9), 1557-1565 (2005).

Hua Yang and Shaoyuan Li *

Institute of Automation, Shanghai Jiao Tong University, Shanghai, 200240, China

* Author to whom correspondence may be addressed. E-mail address: [email protected]

Table 1. Total number of rows of the matrix H

m = 1 n = 1 n = 2 n = 3 n = 4

1 30 123 500 2021

2 158 1438 13050 117866

Table 2. The consuming number of flops of QR
and Cholesky factorization

 QR Cholesky

j = 2 7.04 x [10.sup.8] 7.48 x [10.sup.5]

j = 3 2.65 x [10.sup.10] 3.91 x [10.sup.6]

Table 3. VAF results for the identified model

 Old data Fresh data

VAF (92.41% 94.43%) (90.60% 93.28%)