ESTRO 2020 Abstract book


PH-0604 The impact of training sample size on deep learning based organ auto segmentation for head neck

Abstract withdrawn

PH-0605 External validation of deep learning-based contouring of head and neck organs at risk

E. Brunenberg 1, I. Steinseifer 1, S. Van den Bosch 1, H. Kaanders 1, C. Brouwer 2, R. Monshouwer 1

1 Radboud university medical center, Radiation Oncology, Nijmegen, The Netherlands; 2 University of Groningen, University Medical Center Groningen, Radiation Oncology, Groningen, The Netherlands

Purpose or Objective
Automatic delineation of head and neck (HN) organs at risk (OARs) using deep learning has been shown to provide reasonably accurate contours [1]. However, in order to develop generic solutions, it is necessary to validate these results on data sets from other institutes. For this study, we evaluated a deep learning contouring (DLC) model generated in another institute on a test set of our own data.

Material and Methods
The DLC model was trained on data (CT plus clinical contours delineated by expert radiation oncologists) from 549 HN cancer patients, as described in [1]. The model delineates 22 OARs, comprising (salivary) glands, upper digestive system, central nervous system, bone and vessel structures. We generated a validation cohort of 58 HN cancer patients with contours delineated manually, also by expert radiation oncologists and according to the same consensus guidelines [2], and ran this cohort through the DLC model (WorkflowBox 2.0.1, Mirada Medical). The accuracy of the DLC model was assessed using the Dice similarity coefficient (DSC) and compared to [1]. The DSC was calculated on the contours as a whole and also on four separate regions along the superoinferior direction, in order to obtain more spatial information on model performance. For this binned version, a volume-normalized DSC was used.

Results
We focused on fourteen ROIs that were delineated consistently in our validation cohort, divided into three groups: (salivary) glands, aerodigestive tract, and other (brainstem and mandible).
Figure 1 shows the global DSC values. Table 1 gives an overview of all median DSC values, alongside the previously reported data [1]. For the salivary glands, thyroid, brainstem and mandible, the DLC model performed well, and results were comparable to [1]. DSC was substantially lower for the digestive system ROIs, except for the oral cavity. The binned DSC results indicate that, for most ROIs, the DLC model performs worse at the caudal and cranial boundaries than in the middle of the organs.
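As an illustration of the metric used above, the following is a minimal sketch of a global DSC and a DSC binned into four regions along one axis, computed on binary voxel masks. It is not the authors' implementation: the function names `dice` and `binned_dice`, the assumption that axis 0 is the superoinferior direction, and the choice to bin over the slice range covered by either contour are all hypothetical. The abstract does not define its volume normalization, so the binned version here is a plain per-bin DSC.

```python
import numpy as np


def dice(a, b):
    """Dice similarity coefficient 2|A n B| / (|A| + |B|) of two binary masks."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    denom = a.sum() + b.sum()
    if denom == 0:
        return float("nan")  # both masks empty: DSC is undefined
    return 2.0 * np.logical_and(a, b).sum() / denom


def binned_dice(manual, auto, n_bins=4, axis=0):
    """Per-bin DSC along `axis` (assumed here to be superoinferior).

    Bins split the slice range covered by either contour into `n_bins`
    roughly equal parts. This is an assumed binning scheme, not
    necessarily the one used in the abstract.
    """
    manual = np.asarray(manual, dtype=bool)
    auto = np.asarray(auto, dtype=bool)
    other_axes = tuple(i for i in range(manual.ndim) if i != axis)
    # Indices of slices along `axis` where at least one contour is present.
    occupied = np.flatnonzero(np.logical_or(manual, auto).any(axis=other_axes))
    if occupied.size == 0:
        return [float("nan")] * n_bins
    lo, hi = occupied[0], occupied[-1] + 1
    edges = np.linspace(lo, hi, n_bins + 1).round().astype(int)
    scores = []
    for start, stop in zip(edges[:-1], edges[1:]):
        idx = np.arange(start, stop)
        scores.append(dice(np.take(manual, idx, axis=axis),
                           np.take(auto, idx, axis=axis)))
    return scores


# Toy example: the automatic contour misses the first two slices.
manual = np.zeros((8, 4, 4), dtype=bool)
manual[0:8, 1:3, 1:3] = True
auto = np.zeros((8, 4, 4), dtype=bool)
auto[2:8, 1:3, 1:3] = True

global_dsc = dice(manual, auto)         # 48 / 56, about 0.857
region_dsc = binned_dice(manual, auto)  # [0.0, 1.0, 1.0, 1.0]
```

In the toy example the global DSC stays fairly high, while the binned scores localize the disagreement to one end of the structure, which is the kind of spatial information the binned analysis is meant to expose.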

Conclusion
We demonstrated that deep learning models can, to a large degree, be used to perform GTV delineation in HNSCC, and that including more imaging modalities improves the prediction results. PET is critical to our deep learning model, whereas adding MR images did not noticeably improve predictions. Further improvements in segmentation results are expected from more data with accurate multimodal labeling and from a network structure that is more robust for small object segmentation.
