lauantai 5. syyskuuta 2015

Inverted pendulum - simulation and control






Check out the SIMULATION! Swipe the screen horizontally to move the pendulum.


Introduction 

Inverted pendulum is the classic problem in control theory and therefore well suited for an exercise combining mathematical modelling, simulation, control theory and web technologies for presentation purposes.

Just about all suitable controllers have been applied to inverted pendulum problem since it has become the challenge all new controllers must be able to conquer. Model Predictive Control (MPC) uses a mathematical model of the phenomenon to predict how the system is going to evolve and what control input should be applied to achieve a desired end state. MPC Controller computes an optimal control sequence and applies the first control input, then computes the optimal control sequence again and applies the first control input again and so forth. You are encouraged to take a look at the freely available excellent lecture notes.

In all projects key to success is rigorous scoping and prioritization. This project consist of making simulation model and linear model predictive controller in c++ and use Node.js addon to glue those to web frontend. Simulation is visualized by drawing inverted pendulum in 3d using three.js WebGL library.

Code is available at the bitbucket and a running example is available here. In the example you can whack the pendulum using a mouse (press left button down, move mouse horizontally and release the left button) or a sweep in a mobile device and see how the controller stabilizes the system.

Simulation model

The first step is to create a reasonably accurate simulation model of the inverted pendulum. In principal this could be done using standard Newtonian physics you learned at school, F=ma and so forth, but handling constraints is a bit tricky in Newtonian formalism. For an example inverted pendulum should not fall through the ground nor rise above the ground. Ensuring this in Newtonian formalism requires a careful consideration of all relevant forces that cancel out in the actual simulation keeping the inverted pendulum on the ground.

However, it is possible to handle constraints using Lagrangian mechanics that use generalized coordinates and forces. A little bit confusing sentence if your background is not in the physics or control theory but it will become very clear by the end of this article.

In Lagrangian mechanics you need to express kinetic and potential energy using generalized coordinates. In this case kinetic energy is
\begin{equation} K = \frac{1}{2}M\dot{x}^2 + \frac{1}{2}m\big[ (\dot{x} - L\dot{\varphi}\sin{\varphi})^2 + (\dot{y} + L\dot{\varphi}\cos{\varphi})^2 \big] \end{equation}
and potential energy is \begin{equation} V = mgL\sin{\varphi} \end{equation} Lagrangian is \begin{equation} \begin{split} L = &K - V \\ = & \frac{1}{2}(M+m)\dot{x}^2 + \frac{1}{2}mL^{2}\dot{\varphi}^{2} - L\sin{\varphi}\dot{x}\dot{\varphi} - mgL\sin{\varphi} \\ \end{split} \end{equation} since $\dot{y}=0$. Lagrangian can be used to derive differential equation of the system using the generalized equations of motion
\begin{equation} \label{equations_of_motion} \frac{d}{dt} \frac{\partial L}{\partial \dot{q_{i}}} - \frac{\partial L}{\partial q_{i}}= F_{i}, \end{equation} where $F_{i}$ is generalized force. Let's apply generalized equations of motion (\ref{equations_of_motion}) for $x$ coordinate
\begin{equation} \begin{split} \frac{\partial L}{\partial \dot{x}} = & \: (M+m)\dot{x} - mL\dot{\varphi}\sin{\varphi} \\ \frac{dL}{dx} = & \: 0\\ \frac{d}{dt}\bigg(\frac{\partial L}{\partial \dot{x}}\bigg) = & \: (M+m)\ddot{x} - mL\ddot{\varphi}\sin{\varphi} - mL\dot{\varphi}^{2}\cos{\varphi}\\ \end{split} \end{equation} and therefore \begin{equation} (m+M)\ddot{x} - mL\ddot{\varphi}\sin{\varphi} - mL\dot{\varphi}^{2}\cos{\varphi} = F_{lin}, \end{equation} where $F_{lin}$ is in this case just a combination control force ($F_{c}$) pushing cart left and right and linear friction defined as \begin{equation} F_{lin} = F_{c} - \mu_{x}'x, \end{equation} where $\mu_{x}'$ is linear friction coefficient including physical properties of the ground and the cart. Let's next consider coordinate $\varphi$ \begin{equation} \begin{split} \frac{\partial L}{\partial \dot{\varphi}} = & \: mL^{2}\dot{\varphi} - mL\dot{x}\sin{\varphi} \\ \frac{dL}{d\varphi} = & \: -mL\dot{x}\dot{\varphi}\cos{\varphi} - mgL\cos{\varphi}\\ \frac{d}{dt}\bigg(\frac{\partial L}{\partial \dot{\varphi}}\bigg) = & \: mL^{2}\ddot{\varphi} - mL\ddot{x}\sin{\varphi} -mL\dot{x}\dot{\varphi}\cos{\varphi}\\ \end{split} \end{equation} and therefore \begin{equation} L\ddot{\varphi}-\ddot{x}\sin{\varphi} + g\cos{\varphi} = \frac{F_{\varphi}}{mL} = F_{rot}, \end{equation} where $F_{rot}$ is scaled generalized force defined as friction \begin{equation} F_{rot} = F_{rotnoise} - \mu_{\varphi}'\dot{\varphi}, \end{equation} where $\mu_{\varphi}'$ is rotational friction coefficient including hinge's physical properties and $F_{rotnoise}$ is rotational disturbance which can be intentional, say user input, or noise i.e. due to wind.

Next we need to uncouple $\ddot{x}$ and $\ddot{\varphi}$ which is easily done by solving them from the systems of equations we just derived. After a little bit of arithmetic one gets \begin{equation} \frac{d}{dt} \left[ {\begin{array}{c} \dot{\varphi}\\ \varphi\\ \dot{x}\\ x \end{array} } \right] = \left[ {\begin{array}{c} \frac{F_{lin}\sin{\varphi}+mL\dot{\varphi}^{2}\cos{\varphi}\sin{\varphi} - (m+M)g\cos{\varphi} + (m+M)F_{rot}}{L(m+M-m\sin^{2}{\varphi})}\\ \dot{\varphi}\\ \frac{F_{lin} + F_{rot}m\sin{\varphi} -mg\sin{\varphi}\cos{\varphi}+mL\dot{\varphi}^{2}\cos{\varphi}}{m + M - m\sin^{2}{\varphi}}\\ \dot{x} \end{array} } \right] \label{statePropagationEq} \end{equation}
These equations can be solved trivially using Runge-Kutta method. The friction terms serve dual purpose. Firstly, they are physically justified. Secondly, they make the system to bleed out energy. Due to numerical inaccuracies and approximations Runge-Kutta solvers tend to gather some extra energy in system like this so bleeding out some energy helps to keep the system stable.

Model predictive control

Model predictive control comes in two variants. Linear model predictive control considers only linear systems whereas nonlinear model predictive control considers nonlinear systems. Linear model predictive control leads to convex optimization problem that can be solved easily whereas optimization problem in nonlinear model predictive control may be riddled with troubles such as local optima and is generally hard to solve. One of the great benefits of model predictive control is that you can explicitly limit the maximum and minimum values of the control input. That is, optimal sequence on control inputs is different based on physical limits of the actuators. In addition you can also set limits to allowed states (say, speed of the cart). 

Linear model predictive control problem tries to find sequence on control inputs (items of vector u) such that \begin{equation} \begin{split} J(k) = & \sum_{i=1}^{N}\big[ x^{T}(k+i|k)Qx(k+i|k) + u^{T}(k+i|k)Ru(k+i|k) \big]\\ &+ x^{T}(k+N|k)\bar{Q}x(k+N|k) \end{split} \label{Jk} \end{equation} is minimized. $N$ is the length of the prediction horizon and matrices $Q$,$R$ and $\bar{Q}$ can be used to give different terms and variables a proper weight in the optimization problem. These matrices offer some degrees of freedom to the control problem and can be used to tune the controller. In numerical computations matrices are preferred and an equivalent form of ($\ref{Jk}$) can be used \begin{equation} J(k) = \mathbf{u}^{T}(k)H\mathbf{u}(k) + 2x^{T}F^{T}\mathbf{u}(k) + x^{T}(k)Gx(k) \label{Jkmatrix} \end{equation} The first term is control input cost punishing excessive control inputs. The second term is cost due to system state changes by the control input and the last term is cost due to free propagation of the system as it doesn't depend on the control input. Matrices are \begin{equation} \begin{split} H = & C^{T}\tilde{Q}C + \tilde{R} \\ G = & M^{T}\tilde{Q}M + Q \\ F = & C^{T}\tilde{Q}M, \\ \end{split} \end{equation} where \begin{equation} M = \left[ {\begin{array}{c} A\\ A^{2} \\ \vdots \\ A^{N} \end{array} } \right] \end{equation} \begin{equation} C = \left[ {\begin{array}{cccc} B & 0 & \cdots & 0\\ AB & B & \cdots & 0\\ \vdots& \vdots & \ddots & \\ A^{N-1}B & A^{N-2}B & \cdots & B \end{array} } \right] \end{equation} \begin{equation} \tilde{Q} = \left[ {\begin{array}{cccc} Q & 0 & \cdots & 0\\ 0 & \ddots & & \vdots\\ \vdots& & Q & 0\\ 0 & \cdots & 0 & \bar{Q} \end{array} } \right] \end{equation} and \begin{equation} \tilde{R} = \left[ {\begin{array}{cccc} R & 0 & \cdots & 0\\ 0 & \ddots & & \vdots\\ \vdots& & R & 0\\ 0 & \cdots & 0 & R \end{array} } \right] \end{equation} Many solvers require derivative of cost function with respect to optimized parameter. Taking derivative of ($\ref{Jkmatrix}$) with respect to $x$ leads to \begin{equation} \frac{dJ(k)}{du} = 2\mathbf{u}^{T}(k)H + 2x^{T}F^{T} \end{equation} For details of derivation see.

There are two glaring issues here. Firstly, differential equations of inverted pendulum are nonlinear while ($\ref{Jkmatrix}$) applies only to linear systems. Secondly, MPC control problem is presented in terms of difference equations instead of differential equations. Let’s tackle these issues one by one. Inverted pendulum can be considered linear as long as it is almost in the upright position. Applying linearization formula

\begin{equation} \bar{f}(\bar{x}) = \bar{f}(\bar{x_{0}}) + \nabla \bar{f}|_{x_{0}}(\bar{x}-\bar{x}_{0}) \end{equation}

to ($\ref{statePropagationEq}$) at $ x_{0}=[0, \pi/2, 0, 0]^{T}$ leads to a system of linear differential equations
\begin{equation} \dot{x} = Ax(t) + Bu(t), \end{equation} where \begin{equation} A = \left[ {\begin{array}{cccc} -\frac{\mu_{\varphi}(m+M)}{LM} & \frac{(m+M)g}{LM} & -\frac{\mu_{x}}{LM} & 0\\ 1 & 0 & 0 & 0\\ -\frac{\mu_{\varphi}m}{M} & \frac{mg}{M} & -\frac{\mu_{x}}{M} & 0\\ 0 & 0 & 1 & 0 \end{array} } \right] \end{equation} and \begin{equation} B = \left[ {\begin{array}{c} \frac{1}{LM}\\ 0 \\ \frac{1}{M}\\ 0 \end{array} } \right] \end{equation} $x(t)$ is state vector (note that this is not x coordinate) and $u(t)$ is control input. In this case only available control input is to push cart left or right so $u(t)$ is just $F_{c}$. Also, it's assumed that $F_{rotnoise}$ is zero centered and it's ignored in the prediction model. Let’s now transform these to difference equations using formulas

\begin{equation} A_{discrete} = e^{At} \end{equation} and \begin{equation} B_{discrete} = \bigg( \int_{0}^{T}e^{At}dt \bigg)B \end{equation}

The first matrix can be evaluated as power series since
\begin{equation} e^{X} = \sum_{0}^{\infty}\frac{1}{k!}X^{k} \end{equation}
However the second expression is trickier since matrix A is singular in this case and you cannot use trivial integration formula similar to scalar case. Lets derive a power series form also for it 

\begin{equation} \begin{split} \int_{0}^{T}e^{At}dt = & \int_{0}^{T} \sum_{0}^{\infty}\frac{1}{k!}(At)^{k}dt \\ = & \sum_{0}^{\infty}\frac{1}{k!}A^{k}\int_{0}^{T}t^{k}dt \\ = & \sum_{0}^{\infty}\frac{1}{k!}A^{k}\bigg/_{0}^{T}\frac{1}{k+1}t^{k+1} \\ = & \sum_{0}^{\infty}\frac{1}{k!}A^{k}\frac{1}{k+1}T^{k+1} \\ = & T\sum_{0}^{\infty}\frac{1}{(k+1)!}\big(AT\big)^k \end{split} \end{equation}
Solving optimization problems is fascinating but quite involved topic so let’s scope implementation of constrained optimization problem solver out by using an existing library. dlib library offers a wide range of optimizers as well as classes for matrix arithmetics. 

Software stack 

In one sense software engineering is all about stacking libraries on top of each other to create a fully functional system. More one can use ready-made libraries and components more quickly one can finish the project. Hence the name software stack and importance of getting it right.


Browser frontend uses three.js for 3d visualization of inverted pendulum. Three.js is an easy-to-use abstraction of WebGL interface. Visualization of the simulation needs to be near real time so websocket is used for communication between the browser frontend and server backend that is implemented using NodeJS with Express.js . Actual simulation is implemented using standard c++ threads which don't share a lot of state. One thread runs actual simulation and keeps up the simulation state while the other thread runs the MPC controller.

As shown above MPC controller is essentially just a real time optimization problem solver whose input is the system state and the output is the control value needed to solve control problem. In actual controller one usually also has the estimation block but here it is assumed that controller is able to measure system state directly. Noise is added to controller's state measurement to make simulation more interesting.

Actual optimization problem is solved using a dlib library which is a templated multipurpose optimization and machine learning library implemented in c++.  Simulation is glued to frontend using Node.js addon that exposes V8 objects to the javascript.

Gluing dlib to to Node.js was fairly easy. By default Real Time Type Information RTTI and exceptions are turned off in Node.js. Both of these are needed by dlib but you can easily turn these on in the binding.gyp file.

Conclusions

Model predictive control allows a fine tuned and efficient control of complex system while taking into account also physical limitations of actuators. Web technologies allow you to present your work to in an interactive manner without forcing the reader to install a multitude of arcane libraries and navigating through the maze of issues due to slightly varying system environments and setups.

Control theory is cool by itself and in addition mathematical foundations of many branches of data science have a lot of in common with the tools used in model predictive control.

Ei kommentteja:

Lähetä kommentti