AI Safety

As autonomous systems take on more consequential decision-making responsibility, ensuring that their behaviour remains aligned with human values becomes correspondingly critical.

ARAAC researchers work on the safety and alignment of autonomous agents, from human-aligned reinforcement learning and the safe transfer of knowledge between environments, to fairness, trust, and the longer-term existential safety of advanced AI.

Key Researchers

Professor Peter Vamplew

Federation University Australia

Peter is co-founder and co-leader of ARAAC, and a senior member of the Future of Life Institute’s AI Existential Safety Community. He has played a leading role in establishing multi-objective reinforcement learning (MORL) as a sub-field of reinforcement learning explicitly designed for problems with multiple conflicting objectives, which describes most real-world problems.

Professor Richard Dazeley

Deakin University

Richard is the Leader of the Machine Intelligence Lab at Deakin University (Geelong) and the Deputy Head of School. He is a leading researcher in the human alignment of autonomous agents through safe, ethical, explainable and interactive methods utilising multi-objective reinforcement learning (MORL), and is a senior member of the AI Existential Safety Community.

Dr Bahar Nakisa

Deakin University

Dr Bahareh Nakisa is a Lecturer in Applied AI and the course director of Applied AI at the School of Information Technology, Deakin University. Bahar’s expertise spans applied AI, deep learning, computer vision, affective computing, and human-aligned AI in autonomous systems.

Associate Professor Cameron Foale

Federation University Australia

Cameron has an interest in building usable, fair, transparent, and scalable connected eHealth systems, and applying AI techniques to time-series data.

Scott Johnson

Deakin University

Scott is currently studying for his Honours degree at Deakin University, with a focus on the transfer of safety knowledge between environments using multi-objective reinforcement learning (MORL). He has worked as a research assistant on several ML projects for both Deakin University and Federation University.

ARAAC Publications

M. Hussonnois, T. G. Karimpanal, S. Rana (2025). Human-Aligned Skill Discovery: Balancing Behaviour Exploration and Alignment. International Conference on Autonomous Agents and Multiagent Systems (AAMAS).
H. Harland, R. Dazeley, H. Senaratne, P. Vamplew, F. Cruz, B. Nakisa (2025). AI apology: a critical review of apology in AI systems. Artificial Intelligence Review.
P. Vamplew, C. Foale, C. F. Hayes, P. Mannion, E. Howley, R. Dazeley, S. Johnson, J. Källström, G. Ramos, R. Rădulescu, W. Röpke, D. M. Roijers (2024). Utility-Based Reinforcement Learning: Unifying Single-objective and Multi-objective Reinforcement Learning. International Conference on Autonomous Agents and Multiagent Systems (AAMAS).
F. Cruz, T. G. Karimpanal, M. A. Solis, P. Barros, R. Dazeley (2023). Human-aligned reinforcement learning for autonomous agents and robots. Neural Computing and Applications.
H. Harland, R. Dazeley, B. Nakisa, F. Cruz, P. Vamplew (2023). AI apology: interactive multi-objective reinforcement learning for human-aligned AI. Neural Computing and Applications.
P. Vamplew, B. J. Smith, J. Källström, G. Ramos, R. Rădulescu, D. M. Roijers, C. F. Hayes, F. Heintz, P. Mannion, P. J. K. Libin, R. Dazeley, C. Foale (2022). Scalar reward is not enough: a response to Silver, Singh, Precup and Sutton (2021). Autonomous Agents and Multi-Agent Systems (JAAMAS).
P. Vamplew, C. Foale, R. Dazeley, A. Bignold (2021). Potential-based multiobjective reinforcement learning approaches to low-impact agents for AI safety. Engineering Applications of Artificial Intelligence.
T. G. Karimpanal, S. Rana, S. Gupta, T. Tran, S. Venkatesh (2020). Learning Transferable Domain Priors for Safe Exploration in Reinforcement Learning. International Joint Conference on Neural Networks (IJCNN).
P. Vamplew, R. Dazeley, C. Foale, S. Firmin, J. Mummery (2018). Human-aligned artificial intelligence is a multiobjective problem. Ethics and Information Technology.