Details
Reinforcement Learning for Laparoscopic Lesion Resection
Year: 2026
Term: Winter
Student Name: Justice Joakim-Walters
Supervisor: Matthew Holden
Abstract: Minimally invasive surgical procedures require precise toolpath planning to achieve complete lesion resection while minimizing damage to surrounding healthy tissue. This thesis presents a reinforcement learning-based approach to autonomous surgical toolpath generation for laparoscopic liver lesion resection. The task is formulated as a sequential decision-making problem within a simulated surgical environment developed in CoppeliaSim, where full patient anatomies are reconstructed from computed tomography (CT) scans obtained from the CT-ORG and LiTS datasets. The agent operates in Cartesian space, with actions mapped to joint velocities using a Jacobian-based inverse kinematics formulation, allowing the policy to control end-effector position directly rather than indirectly through its time derivative (velocity). A multi-component reward function is designed to capture the key objectives of the task: lesion removal, boundary adherence, motion smoothness, and procedural efficiency. To improve training stability and reduce exploration requirements, behaviour cloning is used to initialize the policy from programmatically generated trajectories that approximate structured surgical strategies. Experimental results demonstrate that the reinforcement learning agent consistently achieves successful lesion resection without access to true expert demonstrations. The learned policies exhibit efficient and adaptive toolpaths, often outperforming the structured initialization in execution time and tissue preservation. While the resulting trajectories show greater variability than the deterministic baselines, this variability enables the discovery of alternative strategies that can improve performance under certain conditions. Overall, this work demonstrates that reinforcement learning can effectively learn surgical toolpath strategies from interaction with a simulated environment, with minimal reliance on expert data.
The proposed framework provides a scalable approach to surgical automation and highlights the potential of reinforcement learning for complex, precision-critical tasks in minimally invasive surgery.
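
Illustrative sketch (not taken from the thesis): the abstract describes Cartesian-space actions being mapped to joint velocities through a Jacobian-based inverse kinematics formulation. The thesis's exact formulation is not given here, so the example below assumes a damped least-squares mapping, one common Jacobian-based choice, and uses a hypothetical two-link planar arm in place of the surgical manipulator. The function names, link lengths, and damping value are assumptions made for illustration only.

```python
import math


def planar_jacobian(q1, q2, l1=1.0, l2=1.0):
    """Jacobian of the end-effector (x, y) position with respect to the
    joint angles of a hypothetical two-link planar arm."""
    s1, c1 = math.sin(q1), math.cos(q1)
    s12, c12 = math.sin(q1 + q2), math.cos(q1 + q2)
    return [
        [-l1 * s1 - l2 * s12, -l2 * s12],
        [l1 * c1 + l2 * c12, l2 * c12],
    ]


def cartesian_to_joint_velocity(jac, v, damping=1e-3):
    """Map a desired Cartesian velocity v = (vx, vy) to joint velocities
    using damped least squares: dq = J^T (J J^T + damping^2 I)^-1 v."""
    (a, b), (c, d) = jac
    # M = J J^T + damping^2 I, a symmetric positive-definite 2x2 matrix.
    m00 = a * a + b * b + damping ** 2
    m01 = a * c + b * d
    m11 = c * c + d * d + damping ** 2
    det = m00 * m11 - m01 * m01
    # Solve M y = v with the closed-form 2x2 inverse.
    y0 = (m11 * v[0] - m01 * v[1]) / det
    y1 = (-m01 * v[0] + m00 * v[1]) / det
    # dq = J^T y
    return [a * y0 + c * y1, b * y0 + d * y1]


if __name__ == "__main__":
    # Assumed joint configuration and policy action (a Cartesian velocity).
    jac = planar_jacobian(0.3, 0.8)
    action = [0.1, -0.05]
    print(cartesian_to_joint_velocity(jac, action))
```

The damping term keeps the joint velocities bounded near kinematic singularities (for example, a fully extended arm), at the cost of a small tracking error away from them. A policy that outputs Cartesian commands never has to learn the arm's kinematics itself.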