AI Safety
As autonomous systems take on more consequential decision-making, ensuring that their behaviour remains aligned with human values becomes correspondingly critical.
ARAAC researchers work on the safety and alignment of autonomous agents, from human-aligned reinforcement learning and the safe transfer of knowledge between environments to fairness, trust, and the longer-term existential safety of advanced AI.
Key Researchers
Dr Bahar Nakisa
Deakin University
Dr Bahareh Nakisa is a Lecturer in Applied AI and the course director of Applied AI at the School of Information Technology, Deakin University. Bahar’s expertise spans applied AI, deep learning, computer vision, affective computing, and human-aligned AI in autonomous systems.
Associate Professor Cameron Foale
Federation University Australia
Cameron has an interest in building usable, fair, transparent, and scalable connected eHealth systems, and applying AI techniques to time-series data.
Ethan Watkins (EJ)
ARAAC
EJ is a chemist by training but has pivoted his career towards AI safety research to ensure that advances in AI result in human flourishing. He is particularly interested in Reinforcement Learning and is excited to explore the potential of multi-objective approaches to train agents that are better aligned with human goals. He is currently working with ARAAC researchers as an intern.
Dr Mahdi Kazemi Moghaddam
Deakin University
Mahdi was a Research Fellow in Reinforcement Learning at Deakin University from 2022 to 2023. Mahdi is interested in (deep) reinforcement learning, focusing on both single-agent and multi-agent systems. He strives to contribute to the advancement of responsible AI solutions by developing methods that prioritise fairness and trust without compromising efficiency.
Professor Peter Vamplew
Federation University Australia
Peter is co-founder and co-leader of ARAAC, and a senior member of the Future of Life Institute’s Existential AI Safety Research Community. He has played a leading role in establishing multi-objective reinforcement learning (MORL) as a sub-field of reinforcement learning, explicitly designed for problems with multiple conflicting objectives (which describes most real-world problems).
Professor Richard Dazeley
Deakin University
Richard is the Leader of the Machine Intelligence Lab at Deakin University (Geelong) and the Deputy Head of School. He is a leading researcher in the human alignment of autonomous agents through safe, ethical, explainable and interactive methods utilising multi-objective reinforcement learning (MORL), and is a senior member of the AI Existential Safety Community.
Scott Johnson
Deakin University
Scott is currently studying for his Honours degree at Deakin University, with a focus on the transfer of safety knowledge between environments using Multi-Objective Reinforcement Learning. He has worked as a research assistant on several ML projects for both Deakin University and Federation University.
ARAAC Publications
- (2025). Human-Aligned Skill Discovery: Balancing Behaviour Exploration and Alignment. International Conference on Autonomous Agents and Multiagent Systems (AAMAS).
- (2025). AI apology: a critical review of apology in AI systems. Artificial Intelligence Review.
- (2024). Utility-Based Reinforcement Learning: Unifying Single-objective and Multi-objective Reinforcement Learning. International Conference on Autonomous Agents and Multiagent Systems (AAMAS).
- (2023). Human-aligned reinforcement learning for autonomous agents and robots. Neural Computing and Applications.
- (2023). AI apology: interactive multi-objective reinforcement learning for human-aligned AI. Neural Computing and Applications.
- (2022). Scalar reward is not enough: a response to Silver, Singh, Precup and Sutton (2021). Autonomous Agents and Multi-Agent Systems (JAAMAS).
- (2021). Potential-based multiobjective reinforcement learning approaches to low-impact agents for AI safety. Engineering Applications of Artificial Intelligence.
- (2020). Learning Transferable Domain Priors for Safe Exploration in Reinforcement Learning. International Joint Conference on Neural Networks (IJCNN).
- (2018). Human-aligned artificial intelligence is a multiobjective problem. Ethics and Information Technology.