In this group project, we developed a reinforcement learning agent for the Atari-style game Knights-Archers-Zombies (KAZ). Using the PettingZoo KAZ environment and the RLlib framework, we trained the agent with the Proximal Policy Optimisation (PPO) algorithm. A key component of our approach is manual feature engineering: transforming raw environment states into more meaningful feature vectors for the model to learn from.
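To give a flavour of what this kind of feature engineering can look like, here is a minimal sketch. It is illustrative only and is not the feature set used in the project. It assumes a vector-style observation of shape (N+1, 5) in which row 0 describes the agent itself and each remaining row describes one other entity as `[distance, rel_x, rel_y, heading_x, heading_y]`, with all-zero rows marking empty slots; that layout, the function name `engineer_features`, and the chosen features are assumptions and should be checked against the environment documentation.

```python
import numpy as np


def engineer_features(obs: np.ndarray) -> np.ndarray:
    """Condense an (N+1, 5) observation into a fixed-size feature vector.

    Assumed layout (hypothetical, verify against the environment docs):
      row 0   -> the agent itself: [unused, pos_x, pos_y, heading_x, heading_y]
      rows 1+ -> other entities:   [distance, rel_x, rel_y, heading_x, heading_y]
      all-zero rows -> empty slots
    """
    agent_row = obs[0]
    others = obs[1:]

    # Keep only rows that actually describe an entity.
    present = others[np.any(others != 0.0, axis=1)]

    if present.shape[0] == 0:
        # Nothing on screen: pad the "nearest entity" features with zeros.
        nearest = np.zeros(3, dtype=np.float32)
    else:
        # Distance and relative position of the closest entity.
        nearest = present[np.argmin(present[:, 0]), :3]

    entity_count = np.float32(present.shape[0])

    # 4 agent features + 3 nearest-entity features + 1 count = 8 features.
    return np.concatenate([agent_row[1:], nearest, [entity_count]]).astype(np.float32)


if __name__ == "__main__":
    demo = np.zeros((4, 5))
    demo[0] = [0.0, 0.5, 0.5, 0.0, -1.0]   # the agent
    demo[1] = [0.4, 0.3, 0.2, 0.0, 1.0]    # a distant entity
    demo[2] = [0.1, 0.05, -0.08, 0.0, 1.0]  # the nearest entity
    print(engineer_features(demo))
```

The point of a function like this is that the policy network receives a short, fixed-length vector whose entries have a stable meaning, rather than a variable set of rows it has to learn to interpret on its own.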

Our agent was trained on an AWS EC2 instance, which let us scale up training and reach convergence sooner. I personally developed the RL training visualisation plot shown below, which proved particularly useful when experimenting with the agent's feature vectors. As in my DevOps internship, this early visibility allowed us to catch issues quickly and save both time and limited computational resources.
