This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.

This material is based upon work supported by the National Science Foundation under Grant No. 0245291. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

- Douglas Hittle, ME faculty member and director of SEAL
- Peter Young, ECE faculty member
- Chuck Anderson, CS faculty member
- David Hodgson, ME graduate student
- Mike Buehner, ECE graduate student

- Keith Bush, CS graduate student
- Nagabhushan.K.N, ME graduate student
- Jilin Tu, CS graduate student
- Matt Kretchmar
- Mike Anderson, ECE graduate student
- Chris Delnero, ME graduate student
- Susan Cavender, ME staff member

- National Science Foundation, ECS-0245291, 5/1/03--4/30/06, $399,999, D. Hittle, P. Young, and C. Anderson, Robust Learning Control for Building Energy Systems.
- National Science Foundation, CMS-9804747, 9/15/98--9/14/01, $746,717, D. Hittle, P. Young, and C. Anderson, Robust Learning Control for Heating, Ventilating, and Air-Conditioning Systems.
- National Science Foundation, CMS-9732986, 5/98 - 4/02, $200,000, Peter M. Young, Robust Learning Control with Application to Intelligent Building Systems.
- National Science Foundation, CMS-9401249, 1/95--12/96, $133,196, D. Hittle and C. Anderson, Neural Networks for Control of Heating and Air-Conditioning Systems.

Our approach is to add to the robust control framework a family of reinforcement learning algorithms that are based on artificial neural networks and designed to optimize the performance of a controller through experience with the actual system. We have shown through analysis and experiments that stability can be guaranteed even as the agent learns. This is some of the first work we know of to address K. Narendra's call for new approaches: "It is precisely in problems where the system has to adapt to large uncertainty that controllers based on neural networks will be needed in practical applications. For such problems, new concepts and methods based on stability theory will have to be explored." (Narendra, 1990)
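As a rough illustration of learning while stability stays guaranteed, the sketch below restricts a learned gain offset to an interval in which a toy closed loop is stable for every plant in an uncertainty range. It is not the analysis or the neural network agent used in this project: the scalar plant, the bounds `A_LO` and `A_HI`, the nominal gain `K0`, the margin, and the random-perturbation search (a stand-in for a reinforcement learning algorithm) are all assumptions chosen for brevity.

```python
import random

# Hypothetical scalar plant x[t+1] = a*x[t] + B*u[t]; the true a is only
# known to lie in [A_LO, A_HI]. All values here are illustrative assumptions.
A_LO, A_HI, B = 0.9, 1.3, 1.0
K0 = 0.6       # nominal fixed feedback gain; control law is u = -(K0 + delta) * x
MARGIN = 0.05  # safety margin kept away from the stability boundary


def stable_interval():
    """Offsets delta with |a - B*(K0 + delta)| < 1 for every a in [A_LO, A_HI] (B > 0)."""
    lo = (A_HI - 1.0) / B - K0 + MARGIN
    hi = (A_LO + 1.0) / B - K0 - MARGIN
    return lo, hi


def episode_cost(a_true, delta, x0=1.0, steps=50, r=0.1):
    """Quadratic cost of one run on the actual plant with gain K0 + delta."""
    x, cost = x0, 0.0
    for _ in range(steps):
        u = -(K0 + delta) * x
        cost += x * x + r * u * u
        x = a_true * x + B * u
    return cost


def learn(a_true, iterations=200, step=0.1, seed=0):
    """Random-perturbation search over delta, confined to the stable interval."""
    rng = random.Random(seed)
    lo, hi = stable_interval()
    delta = min(max(0.0, lo), hi)
    best = episode_cost(a_true, delta)
    history = [delta]  # every offset that was actually applied to the plant
    for _ in range(iterations):
        candidate = delta + rng.gauss(0.0, step)
        # Project into the stable interval BEFORE the candidate is tried,
        # so no unstable controller is ever run during learning.
        candidate = min(max(candidate, lo), hi)
        cost = episode_cost(a_true, candidate)
        if cost < best:
            delta, best = candidate, cost
        history.append(candidate)
    return delta, best, history


if __name__ == "__main__":
    delta, best, _ = learn(a_true=1.2)
    print(f"learned offset {delta:.3f}, episode cost {best:.3f}")
```

In this toy case the stable interval follows from a one-line pole condition. For a neural network controller in the loop, the corresponding bounds would have to come from a robust stability analysis instead.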

The objectives of our current work are the following:

- Verify our multi-input, multi-output (MIMO) robust reinforcement learning approach on our experimental heating, ventilating, and air-conditioning (HVAC) system,
- Develop new analysis tools and algorithms to include learned, dynamic models based on recurrent neural networks within our robust reinforcement learning theory and techniques,
- Design and test more advanced robust reinforcement learning algorithms with reinforcement as a function of control performance plus robustness,
- Evaluate our new techniques with experiments on the experimental HVAC system,
- Disseminate our theoretical and experimental results and algorithms through conference and journal publications and by assisting colleagues at other institutions in conducting their own tests of our methods,
- Incorporate the results of our research in our teaching.

The significance of this work to the robust control field is the combination of reinforcement learning and robust control to overcome the conservative behavior of robust controllers.

We are an interdisciplinary team consisting of a specialist in robust control from the Electrical & Computer Engineering Department, a specialist in reinforcement learning for neural networks from the Department of Computer Science, and a specialist in design, modeling, and control of HVAC systems from the Mechanical Engineering Department. This interdisciplinary approach will further advance the state of the art in the theory of robust reinforcement learning control design, demonstrate these new methods on an experimental HVAC system, and provide much-needed improved methods for controlling HVAC systems in buildings. Eventual widespread implementation of these schemes in buildings around the world will reduce energy consumption, improve comfort, and extend equipment life.

- Anderson M.L., Buehner M.R., Young P.M., Hittle D.C., Anderson C., Tu J., and Hodgson, D., MIMO Robust Control for Heating, Ventilating, and Air Conditioning (HVAC) Systems, IEEE Transactions on Control Systems Technology, vol. 16, no. 3, pp 475-483, May 2008.

- Anderson C.W., Young P.M., Buehner M.R., Knight J.N., Bush K.A., and Hittle D.C., Robust Reinforcement Learning Control using Integral Quadratic Constraints for Recurrent Neural Networks, IEEE Transactions on Neural Networks: Special Issue on Neural Networks for Feedback Control Systems, vol. 18, no. 4, pp. 993-1002, July 2007.
- Buehner M.R., Anderson C.W., Young P.M., Bush K.A., and Hittle D.C., Improving Performance using Robust Recurrent Reinforcement Learning Control, in Proceedings of the European Control Conference 2007, Kos, Greece, pp. 1676-1681, July 2007.
- Anderson M.L., Buehner M.R., Young P.M., Hittle D.C., Anderson C., Tu J., and Hodgson, D., An Experimental System for Advanced Heating, Ventilating, and Air Conditioning (HVAC) Control, Energy and Buildings, vol 39, no. 2, pp. 136-147, Feb. 2007.

- Buehner M.R. and Young P.M., A Tighter Bound for the Echo State Property, IEEE Transactions on Neural Networks, vol 17, no. 3, pp. 820-824, May 2006.

- Bush, K. and Tsendjav, B., Improving the Richness of Echo State Features Using Next Ascent Local Search, in Proceedings of the Artificial Neural Networks In Engineering Conference, St. Louis, MO, pp. 227-232, Nov. 2005.
- Bush, K. and Anderson, C.W., Modeling Reward Functions for Incomplete State Representations via Echo State Networks, in Proceedings of the International Joint Conference on Neural Networks, Montreal, pp. 2295-3000, Aug. 2005.

- Anderson, C.W., Kretchmar, R.M., Young, P.M., and Hittle, D.C., Robust Reinforcement Learning Using Integral-Quadratic Constraints, in Learning and Approximate Dynamic Programming, ed. by Si, J., Barto, A., Powell, W., and Wunsch, D., John Wiley & Sons, Chapter 13, pages 337-358, 2004.
- Anderson, C.W., Hittle, D.C., Kretchmar, R.M., and Young, P.M., Robust Reinforcement Learning for Heating, Ventilation, and Air Conditioning Control of Buildings, in Learning and Approximate Dynamic Programming, ed. by Si, J., Barto, A., Powell, W., and Wunsch, D., John Wiley & Sons, Chapter 20, pages 517-534, 2004.
- Delnero, C.C., Dreisigmeyer, D., Hittle, D.C., Young, P.M., Anderson, C.W., and Anderson, M.L., Exact Solution of the Governing PDE of a Hot Water to Air Finned Tube Cross Flow Heat Exchanger, International Journal of Heating, Ventilating, Air-Conditioning and Refrigerating Research, vol. 10, 2004.

- Anderson, M.L., Young, P.M., Hittle, D.C., Anderson, C.W., Tu, J., and Hodgson, D. MIMO Robust Control for Heating, Ventilating and Air Conditioning (HVAC) Systems, in 41st IEEE Conference on Decision and Control, Las Vegas, Dec. 10-13, pp. 167-172, 2002.

- Tu, Jilin, Continuous Reinforcement Learning for Feedback Control Systems, M.S. Thesis, Department of Computer Science, Colorado State University, Fort Collins, CO, 2001.
- Anderson, M.L., MIMO robust control for heating, ventilating, and air-conditioning (HVAC) systems. M.S. Thesis, Department of Electrical and Computer Engineering, Colorado State University, Fort Collins, CO, 2001.
- Delnero, C.C. Neural Networks and PI Control Using Steady State Prediction Applied to a Heating Coil. M.S. Thesis, Department of Mechanical Engineering, Colorado State University, Fort Collins, CO, 2001.
- Delnero C.C., Dreisigmeyer D., Hittle D.C., and Young P.M., Partial Differential Equation Modeling of a Heating, Ventilating, and Air Conditioning (HVAC) System, submitted to Research Journal of ASHRAE, 2002.
- Kretchmar, R.M., Young P.M., Anderson C., Hittle D., Anderson M., Tu J., and Delnero C., Robust Reinforcement Learning Control, in American Control Conference, pp. 902-907, 2001.
- Kretchmar, R.M., Young P.M., Anderson C., Hittle D., Anderson M., Delnero C., and Tu J., Robust Reinforcement Learning Control with Static and Dynamic Stability, International Journal of Robust and Nonlinear Control, vol. 11, pp. 1469-1500, 2001.
- Delnero, C.C., Hittle, D.C., Young, P.M., Anderson, C.W., and Anderson, M.L., Neural Networks and PI Control using Steady State Prediction Applied to a Heating Coil, in Proceedings of CLIMA2000, pp. 58-71, 2001.

- Kretchmar, R.M., A Synthesis of Reinforcement Learning and Robust Control Theory, Ph.D. Dissertation, Department of Computer Science, Colorado State University, Fort Collins, CO, 2000.

- Kretchmar, R. M., and Anderson, C. W., Using Temporal Neighborhoods to Adapt Function Approximators in Reinforcement Learning. In Proceedings of International Work-Conference on Artificial and Natural Neural Networks, June 2-4, Alicante, Spain, Springer-Verlag, 1999.

- Kretchmar, R. M., and Anderson, C. W., Comparison of CMACs and Radial Basis Functions for Local Function Approximators in Reinforcement Learning, ICNN'97, International Conference on Neural Networks, pp. 834-837, 1997.
- Anderson, C. W., Hittle, D., Katz, A. and Kretchmar, R., Synthesis of Reinforcement Learning, Neural Networks, and PI Control Applied to a Simulated Heating Coil. Journal of Artificial Intelligence in Engineering, Vol. 11, #4, pp. 423-431, 1997.

- Anderson, C. W., Hittle, D., Katz, A. and Kretchmar, R. Reinforcement Learning, Neural Networks and PI Control Applied to a Heating Coil. Solving Engineering Problems with Neural Networks: Proceedings of the International Conference on Engineering Applications of Neural Networks (EANN-96), ed. by Bulsari, A.B., Kallio, S., and Tsaptsinos, D., Systems Engineering Association, PL 34, FIN-20111 Turku 11, Finland, pp. 135-142, 1996.

- Hittle, D., Anderson, C., Young, P.M., Delnero, C., and Anderson, M.L., A combined proportional plus integral (PI) and neural network (NN) controller, NSF# 01-035, Patent filed 2001. Provisional Application 60/318,044 filed Sept. 08, 2001.
- Young, P.M., Anderson, C.W., Hittle, D.C., Kretchmar, R.M, Control System and Technique Employing Reinforcement Learning Having Stability and Learning Phases, Patent No. US 6,665,651 B2, Date of Patent: Dec. 16, 2003

- Reinforcement Learning Control with Robust Stability, Poster at the Second Annual Intermountain/Southwest Conference on Industrial and Interdisciplinary Mathematics, Colorado State University, Feb 28 - March 1, 2003.
- Robust Reinforcement Learning with Static and Dynamic Stability, Anderson, invited presentation at the NSF Workshop on Learning and Approximate Dynamic Programming, Playacar, Mexico, April 8-10, 2002.
- Robust Learning Control with Application to HVAC Systems, Hittle, Young, and Anderson, project status presented June, 2001, to the National Science Foundation (also available as PowerPoint slides)
- "Synthesis of Robust Control and Reinforcement Learning", Anderson, presented to the Department of Systems Engineering, The Australian National University, Canberra, Australia, November 18, 1999.

This is a photograph of the physical heating system we have constructed for running experiments. It is located at the Solar Energy Applications Lab at Colorado State University, Fort Collins, CO.

Here is a close-up of the control hardware, including the PC that drives the system.

- Two technical reports by Megretski and Rantzer on Integral Quadratic Constraints:
  - System Analysis via Integral Quadratic Constraints, Part I (Compressed Postscript version), Technical report, TFRT--7531, Dept. of Automatic Control, Lund Institute of Technology, April 1995
  - System Analysis via Integral Quadratic Constraints, Part II (Compressed Postscript version), Technical report, TFRT--7559, Dept. of Automatic Control, Lund Institute of Technology, September 1997

- The IQC-beta toolbox for Matlab and manual, available from Chung-Yao Kao
- Lecture Notes on Integral Quadratic Constraints, by Ulf Jonsson, 2000.
- Analysis of Feedback Systems: Theory and Computation, by U. Jonsson, R. Sepulchre, and J.-C. Willems

- Convex Optimization, by Boyd and Vandenberghe, Cambridge University Press, 2004.

- Linear Matrix Inequalities in Control, by Carsten Scherer and Siep Weiland, October 2000.
- Robust Stability and Performance Analysis of Uncertain Systems Using Linear Matrix Inequalities, by Balakrishnan and Kashyap.

- Feedback Control Theory, by Doyle, Francis, and Tannenbaum, 1992. This is now out of print, but the title is a link to an on-line version.

- Reinforcement Learning: An Introduction, by Richard Sutton and Andrew Barto, MIT Press, 1998.
- Reinforcement Learning Using Neural Networks, with Applications to Motor Control, dissertation by Remi Coulom that nicely presents continuous state, action, and time reinforcement learning.

- Continuous State Space Q-Learning for Control of Nonlinear Systems, by Stephan H.G. ten Hagen, 2001 Dissertation.
- Neural Q-Learning, by ten Hagen, shorter 18 pager.
- Q-Learning for systems with continuous state and action spaces, by ten Hagen and Krose, 8 pager, 2000.