ESTRO 2021 Abstract Book



Prompt-gamma imaging (PGI)-based range verification has been used in the first pencil-beam scanning (PBS) proton therapy treatments, and its potential clinical benefit is under systematic investigation in a clinical study at our institution. Manual interpretation of the detected spot-wise range-shift information is time-consuming and highly complex, and is therefore not feasible in broad routine application. Here, we present an approach to automatically detect and classify treatment deviations in realistically simulated PGI data for head and neck cancer treatments using convolutional neural networks (CNNs).

Materials and Methods For 12 patients and an anthropomorphic head phantom, PBS treatment plans were generated, and one field per plan was assumed to be monitored with the IBA slit camera. In total, 386 scenarios representing relevant or non-relevant treatment deviations were simulated on planning and control CTs and manually classified into 7 classes: non-relevant changes (NRC) and relevant changes triggering treatment intervention due to range prediction errors (±RPE), setup errors in beam direction (±SE), anatomical changes (AC), or a combination of such errors (CE). After filtering for PBS spots with reliable PGI information, the 3D spatial maps of PGI-determined range deviations (reference vs. change scenario) were converted to 16×16×16 voxel grids. Three complexity levels of simulated PGI data were investigated: (A) optimal PGI data, (B) realistic PGI data with simulated Poisson noise based on the locally delivered proton number, and (C) realistic PGI data with an additional positioning uncertainty of the slit camera following an experimentally determined distribution. For each complexity level, 3D-CNNs (6 convolutional and 2 downsampling layers) were trained on a subset of 8 patients and the phantom dataset using patient-specific leave-one-out cross-validation and tested on an independent test cohort of 4 patients.
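The patient-specific leave-one-out cross-validation described above can be sketched as follows. This is a minimal illustration, not the study's implementation: the function name `leave_one_patient_out`, the subject and scenario identifiers, and the choice to keep the phantom in every training fold are assumptions made for this example, since the abstract does not specify how the phantom data were handled.

```python
from typing import Dict, Iterator, List, Tuple


def leave_one_patient_out(
    scenarios: Dict[str, List[str]],
    always_train: Tuple[str, ...] = ("phantom",),
) -> Iterator[Tuple[str, List[str], List[str]]]:
    """Yield (held_out_id, train_scenarios, validation_scenarios) per fold.

    `scenarios` maps a subject ID to the identifiers of its simulated scenarios.
    Subjects listed in `always_train` are never held out (assumption: the
    phantom stays in every training fold).
    """
    candidates = [subject for subject in scenarios if subject not in always_train]
    for held_out in candidates:
        # All scenarios of every other subject form the training set, so no
        # scenario of the held-out patient leaks into training.
        train = [
            scenario
            for subject, items in scenarios.items()
            if subject != held_out
            for scenario in items
        ]
        yield held_out, train, list(scenarios[held_out])


# Hypothetical training cohort: 8 patients plus the phantom, 2 toy scenarios each.
cohort = {f"patient_{i}": [f"p{i}_s{j}" for j in range(2)] for i in range(1, 9)}
cohort["phantom"] = ["ph_s0", "ph_s1"]

folds = list(leave_one_patient_out(cohort))
for held_out, train, val in folds:
    print(held_out, len(train), len(val))
```

Splitting by patient rather than by scenario keeps all scenarios derived from one patient's anatomy on the same side of the split. Under the assumptions above, the scheme yields one trained model per held-out patient.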

Results On the test data, the CNN ensemble achieved an accuracy of 0.81, 0.77, and 0.76 for the complexity levels (A), (B), and (C), respectively. Similarly, for the task of solely differentiating relevant from non-relevant changes, the binary accuracy was 0.97, 0.95, and 0.93. The trained ensemble provided fast (<1 s) predictions and detected treatment deviations in the most realistic scenario (C) with a sensitivity of 0.97 and a specificity of 0.82. Misclassifications of the AC class were likely due to its PGI characteristics being similar to those of the CE class.
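The relation between the 7-class accuracy and the binary metrics reported above can be illustrated with a short sketch that collapses the class labels into relevant vs. non-relevant and computes sensitivity, specificity, and binary accuracy. The function `binary_metrics` and the toy label lists are hypothetical and made up for this example; they are not data from the study.

```python
from typing import Dict, Sequence

NON_RELEVANT = "NRC"  # the only non-relevant class among the 7 classes


def binary_metrics(true_labels: Sequence[str], predicted_labels: Sequence[str]) -> Dict[str, float]:
    """Collapse multi-class labels to relevant/non-relevant and score them."""
    tp = tn = fp = fn = 0
    for true, pred in zip(true_labels, predicted_labels):
        true_relevant = true != NON_RELEVANT
        pred_relevant = pred != NON_RELEVANT
        if true_relevant and pred_relevant:
            tp += 1  # relevant change detected (even if the subclass is wrong)
        elif true_relevant:
            fn += 1  # relevant change missed
        elif pred_relevant:
            fp += 1  # false alarm on a non-relevant change
        else:
            tn += 1  # non-relevant change correctly left alone
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one relevant and one non-relevant case")
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }


# Toy labels for illustration only (not study data).
toy_true = ["NRC", "NRC", "+RPE", "-SE", "AC", "CE", "AC", "NRC"]
toy_pred = ["NRC", "+SE", "+RPE", "-SE", "CE", "CE", "NRC", "NRC"]

metrics = binary_metrics(toy_true, toy_pred)
multiclass_accuracy = sum(t == p for t, p in zip(toy_true, toy_pred)) / len(toy_true)
print(metrics, multiclass_accuracy)
```

In this toy example, an AC case predicted as CE counts as an error for the 7-class accuracy but as a correct detection in the binary task, which is one reason the binary accuracy can exceed the multi-class accuracy.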
