Research
Holger R. Roth, Chen Shen, Hirohisa Oda, Takaaki Sugino, Masahiro Oda, Yuichiro Hayashi, Kazunari Misawa, Kensaku Mori
Recent advances in deep learning, like 3D fully convolutional networks (FCNs), have improved the state-of-the-art in dense semantic segmentation of medical images. However, most network architectures require severely downsampling or cropping the images to meet the memory limitations of today's GPU cards while still considering enough context in the images for accurate segmentation. In this work, we propose a novel approach that utilizes auto-context to perform semantic segmentation at higher resolutions in a multi-scale pyramid of stacked 3D FCNs. We train and validate our models on a dataset of manually annotated abdominal organs and vessels from 377 clinical CT images used in gastric surgery, and achieve promising results with close to 90% Dice score on average. For additional evaluation, we perform separate testing on datasets from different sources and achieve competitive results, illustrating the robustness of the model and approach.

21st International Conference on Medical Image Computing and Computer Assisted Intervention - MICCAI 2018, September 16-20, Granada, Spain
Pancreas segmentation in computed tomography imaging has been historically difficult for automated methods because of the large shape and size variations between patients. In this work, we describe a custom-built 3D fully convolutional network (FCN) that can process a 3D image including the whole pancreas and produce an automatic segmentation. We investigate two variations of the 3D FCN architecture: one with concatenation and one with summation skip connections to the decoder part of the network. We evaluate our methods on a dataset from a clinical trial with gastric cancer patients, including 147 contrast-enhanced abdominal CT scans acquired in the portal venous phase. Using the summation architecture, we achieve an average Dice score of 89.7% ± 3.8% (range 79.8% to 94.8%) in testing, a new state-of-the-art performance in pancreas segmentation on this dataset.
(paper)
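The Dice scores quoted in these abstracts measure the overlap between a predicted and a reference mask. The following is a minimal sketch of that metric and of a "mean ± std. dev. (range)" summary, not the evaluation code used in the papers. The function names and the flat 0/1 mask representation are assumptions made for illustration.

```python
from statistics import mean, stdev


def dice_score(pred, truth):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Convention assumed here: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


def summarize(scores):
    """Return (mean, sample std. dev., min, max) of per-case Dice scores, in percent."""
    pct = [100.0 * s for s in scores]
    return mean(pct), stdev(pct), min(pct), max(pct)
```

A per-patient list of `dice_score` values passed to `summarize` yields the four numbers that a report such as "89.7% ± 3.8% (range 79.8% to 94.8%)" is built from.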
Recent advances in 3D fully convolutional networks (FCN) have made it feasible to produce dense voxel-wise predictions of volumetric images. In this work, we show that a multi-class 3D FCN trained on manually labeled CT scans of several anatomical structures (ranging from the large organs to thin vessels) can achieve competitive segmentation results, while avoiding the need for handcrafting features or training class-specific models. To this end, we propose a two-stage, coarse-to-fine approach that will first use a 3D FCN to roughly define a candidate region, which will then be used as input to a second 3D FCN. This reduces the number of voxels the second FCN has to classify to ~10% and allows it to focus on more detailed segmentation of the organs and vessels. We utilize training and validation sets consisting of 331 clinical CT images and test our models on a completely unseen data collection acquired at a different hospital that includes 150 CT scans, targeting three anatomical organs (liver, spleen, and pancreas). In challenging organs such as the pancreas, our cascaded approach improves the mean Dice score from 68.5 to 82.2%, achieving the highest reported average score on this dataset. We compare with a 2D FCN method on a separate dataset of 240 CT scans with 18 classes and achieve a significantly higher performance in small organs and vessels. Furthermore, we explore fine-tuning our models to different datasets. Our experiments illustrate the promise and robustness of current 3D FCN based semantic segmentation of medical images, achieving state-of-the-art results. Our code and trained models are available for download: github.com/holgerroth/3Dunet_abdomen_cascade.

Holger R. Roth, Hirohisa Oda, Xiangrong Zhou, Natsuki Shimizu, Ying Yang, Yuichiro Hayashi, Masahiro Oda, Michitaka Fujiwara, Kazunari Misawa, Kensaku Mori
Computerized Medical Imaging and Graphics (2018)
(paper (arXiv), code with trained models, presentation)
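The two-stage, coarse-to-fine idea in the abstract above can be illustrated with a small sketch: take the first stage's rough mask, derive a candidate region from it, and hand only that region to the second stage. The sketch below simplifies the candidate region to an axis-aligned bounding box, which is an assumption and not necessarily how the published code defines it. The function names and the `margin` parameter are hypothetical.

```python
def bounding_box(mask, margin=0):
    """Return ((z0, z1), (y0, y1), (x0, x1)), end-exclusive, around nonzero voxels.

    `mask` is a nested list indexed as mask[z][y][x]. `margin` (a hypothetical
    parameter) pads the box by that many voxels, clipped to the volume.
    """
    shape = (len(mask), len(mask[0]), len(mask[0][0]))
    coords = [(z, y, x)
              for z in range(shape[0])
              for y in range(shape[1])
              for x in range(shape[2])
              if mask[z][y][x]]
    if not coords:
        return None  # the first stage found no candidate voxels
    box = []
    for axis in range(3):
        lo = max(min(c[axis] for c in coords) - margin, 0)
        hi = min(max(c[axis] for c in coords) + margin + 1, shape[axis])
        box.append((lo, hi))
    return tuple(box)


def crop(volume, box):
    """Cut the candidate region out of a volume indexed as volume[z][y][x]."""
    (z0, z1), (y0, y1), (x0, x1) = box
    return [[row[x0:x1] for row in plane[y0:y1]] for plane in volume[z0:z1]]


def voxel_fraction(box, shape):
    """Fraction of the original voxels that the second stage still has to classify."""
    kept, total = 1, 1
    for (lo, hi), size in zip(box, shape):
        kept *= hi - lo
        total *= size
    return kept / total
```

`voxel_fraction` gives a quick check of how much the first stage reduces the workload of the second. The abstract reports a reduction to roughly 10% for its own candidate regions.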
Accurate automatic organ segmentation is an important yet challenging problem for medical image analysis. The pancreas is an abdominal organ with very high anatomical variability. This inhibits traditional segmentation methods from achieving high accuracies, especially compared to other organs such as the liver, heart or kidneys. In this paper, we present a holistic learning approach that integrates semantic mid-level cues of deeply-learned organ interior and boundary maps via robust spatial aggregation using random forest. Our method generates boundary preserving pixel-wise class labels for pancreas segmentation. Quantitative evaluation is performed on CT scans of 82 patients in 4-fold cross-validation. We achieve a (mean ± std. dev.) Dice Similarity Coefficient of 78.01% ± 8.2% in testing which significantly outperforms the previous state-of-the-art approach of 71.8% ± 10.7% under the same evaluation criterion.
Holger R. Roth, Le Lu, Amal Farag, Andrew Sohn, Ronald M. Summers
MICCAI (Medical Image Computing and Computer-Assisted Interventions), Athens, Greece, 2016
(paper (extended version), poster, data)
Improving Computer-aided Detection using Convolutional Neural Networks and Random View Aggregation
Automated computer-aided detection (CADe) in medical imaging has been an important tool in clinical practice and research. State-of-the-art methods often show high sensitivities but typically at the cost of high false-positive (FP) rates per patient. We design a two-tiered coarse-to-fine cascade framework to first operate a highly sensitive candidate generation system at a maximum sensitivity of ~100% but with high FP level (~50 per patient), leveraging the existing CAD systems. Regions or volumes of interest (ROI or VOI) for lesion candidates are generated in this step and function as input for a second tier, which is our focus. In this stage, we generate N 2D or 2.5D views, via sampling through scale transformations, random translations and rotations with respect to each ROI centroid coordinates. These random views are used to train deep Convolutional Neural Network (ConvNet) classifiers. In testing, the trained ConvNets are employed to assign class (e.g., lesion, pathology) probabilities for a new set of N random views that are then averaged at each ROI to compute a final per-candidate classification probability. This second tier behaves as a highly selective process to reject difficult false positives while preserving high sensitivities in three different data sets: 59 patients for sclerotic metastases detection, 176 for lymph node detection, and 1,186 patients for colonic polyp detection. Experimental results show the ability of ConvNets to generalize well to different medical imaging CADe applications and scale elegantly to varying sizes of data sets. Our proposed method improves the CADe performance markedly in all cases. CADe sensitivities improved from 57% to 70%, from 43% to 77% and from 58% to 75% at 3 FPs per patient for sclerotic metastases, lymph nodes and colonic polyps, respectively.
Holger R. Roth, Le Lu, Jiamin Liu, Jianhua Yao, Ari Seff, Kevin Cherry, Lauren Kim, Ronald M. Summers
(paper)
Related software/code: LymphNodeRFCNNPipeline
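The random view aggregation step described in the abstract above (sample N random views per candidate, classify each, average the probabilities) can be sketched as follows. This is an illustration under stated assumptions rather than the LymphNodeRFCNNPipeline code. `classify_view` is a hypothetical stand-in for a trained ConvNet applied to a resampled view, and the scale, shift, and angle ranges are made-up defaults.

```python
import random


def sample_view_params(n_views, rng, max_shift=3.0, scales=(1.0, 1.5, 2.0)):
    """Draw N random (scale, translation, rotation angle) tuples for one ROI.

    The ranges are illustrative assumptions, not values from the paper.
    """
    views = []
    for _ in range(n_views):
        scale = rng.choice(scales)
        shift = tuple(rng.uniform(-max_shift, max_shift) for _ in range(3))
        angle = rng.uniform(0.0, 360.0)
        views.append((scale, shift, angle))
    return views


def aggregate_candidate(centroid, classify_view, n_views=100, seed=0):
    """Average the per-view probabilities into one per-candidate probability.

    `classify_view(centroid, params)` is a hypothetical callable that resamples
    the view defined by `params` around `centroid` and returns a probability.
    """
    rng = random.Random(seed)
    params = sample_view_params(n_views, rng)
    probabilities = [classify_view(centroid, p) for p in params]
    return sum(probabilities) / len(probabilities)
```

Averaging over many perturbed views acts like test-time augmentation: a single unlucky view has little influence on the final per-candidate decision.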
Automatic organ segmentation is an important prerequisite for many computer-aided diagnosis systems. The high anatomical variability of organs in the abdomen, such as the pancreas, prevents many segmentation methods from achieving the accuracies reached for other organs such as the liver, heart or kidneys. Recently, the availability of large annotated training sets and the accessibility of affordable parallel computing resources via GPUs have made it feasible for "deep learning" methods such as convolutional networks (ConvNets) to succeed in image classification tasks. These methods have the advantage that the classification features are learned directly from the imaging data. We present a fully-automated bottom-up method for pancreas segmentation in computed tomography (CT) images of the abdomen. The method is based on hierarchical coarse-to-fine classification of local image regions (superpixels). Superpixels are extracted from the abdominal region using Simple Linear Iterative Clustering (SLIC). An initial probability response map is generated, using patch-level confidences and a two-level cascade of random forest classifiers, from which superpixel regions with probabilities larger than 0.5 are retained. These retained superpixels serve as a highly sensitive initial input of the pancreas and its surroundings to a ConvNet that samples a bounding box around each superpixel at different scales (and random non-rigid deformations at training time) in order to assign a more distinct probability of each superpixel region being pancreas or not. We evaluate our method on CT images of 82 patients (60 for training, 2 for validation, and 20 for testing). Using ConvNets we achieve average Dice scores of 68% ± 10% (range, 43-80%) in testing. This shows promise for accurate pancreas segmentation using a deep learning approach and compares favorably to state-of-the-art methods.
IEEE Transactions on Medical Imaging, 28 September 2015, Volume:PP, Issue: 99
Follow-up work presented at MICCAI 2015:
DeepOrgan: Multi-level Deep Convolutional Networks for Automated Pancreas Segmentation
Holger R. Roth, Le Lu, Amal Farag, Hoo-Chang Shin, Jiamin Liu, Evrim Turkbey, Ronald M. Summers
Comments: To be presented at MICCAI 2015 - 18th International Conference on Medical Image Computing and Computer Assisted Intervention, Munich, Germany
(paper, poster, presentation)
Slice-based classification of anatomy in medical images using deep convolutional nets
Automated classification of human anatomy is an important prerequisite for many computer-aided diagnosis systems. The spatial complexity and variability of anatomy throughout the human body makes classification difficult. "Deep learning" methods such as convolutional networks (ConvNets) outperform other state-of-the-art methods in image classification tasks. In this work, we present a method for organ- or body-part-specific anatomical classification of medical images acquired using computed tomography (CT) with ConvNets. We train a ConvNet using 4,298 separate axial 2D key-images to learn 5 anatomical classes. Key-images were mined from a hospital PACS archive, using a set of 1,675 patients. We show that a data augmentation approach can help to enrich the data set and improve classification performance. Using ConvNets and data augmentation, we achieve an anatomy-specific classification error of 5.9% and an average area-under-the-curve (AUC) value of 0.998 in testing. We demonstrate that deep learning can be used to train very reliable and accurate classifiers that could initialize further computer-aided diagnosis.
Software/code: CNNSliceClassifier
Automated Lymph Node (LN) detection is an important clinical diagnostic task but very challenging due to the low contrast of surrounding structures in Computed Tomography (CT) and to their varying sizes, poses, shapes and sparsely distributed locations. State-of-the-art studies show the performance range of 52.9% sensitivity at 3.1 false-positives per volume (FP/vol.), or 60.9% at 6.1 FP/vol. for mediastinal LN, by one-shot boosting on 3D HAAR features. In this paper, we first operate a preliminary candidate generation stage, towards 100% sensitivity at the cost of high FP levels (40 per patient), to harvest volumes of interest (VOI). Our 2.5D approach consequently decomposes any 3D VOI by resampling 2D reformatted orthogonal views N times, via scale, random translations, and rotations with respect to the VOI centroid coordinates. These random views are then used to train a deep Convolutional Neural Network (CNN) classifier. In testing, the CNN is employed to assign LN probabilities for all N random views that can be simply averaged (as a set) to compute the final classification probability per VOI. We validate the approach on two datasets: 90 CT volumes with 388 mediastinal LNs and 86 patients with 595 abdominal LNs. We achieve sensitivities of 70%/83% at 3 FP/vol. and 84%/90% at 6 FP/vol. in mediastinum and abdomen respectively, which drastically improves over the previous state-of-the-art work.
Related software/code: LymphNodeRFCNNPipeline
Annotated CT data: here
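The 2.5D decomposition mentioned in the abstract above represents a 3D volume of interest by 2D orthogonal views through a common point. The sketch below extracts the three axis-aligned slices through the center of a VOI. It omits the random scaling, translation, and rotation that the abstract describes, and the nested-list layout `voi[z][y][x]` and function name are assumptions for illustration.

```python
def orthogonal_views(voi, center=None):
    """Return the (axial, coronal, sagittal) slices of a VOI through `center`.

    `voi` is a nested list indexed as voi[z][y][x]. `center` defaults to the
    middle voxel. No resampling is done, so this is the un-augmented case only.
    """
    depth, height, width = len(voi), len(voi[0]), len(voi[0][0])
    if center is None:
        center = (depth // 2, height // 2, width // 2)
    cz, cy, cx = center
    axial = [list(row) for row in voi[cz]]                 # fixed z: y-x plane
    coronal = [list(voi[z][cy]) for z in range(depth)]     # fixed y: z-x plane
    sagittal = [[voi[z][y][cx] for y in range(height)]     # fixed x: z-y plane
                for z in range(depth)]
    return axial, coronal, sagittal
```

For a cubic VOI the three slices have the same size, so they can be stacked as the three channels of one 2D input image, which is one common way to feed such views to a 2D CNN.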
Automated detection of sclerotic metastases (bone lesions) in Computed Tomography (CT) images has potential to be an important tool in clinical practice and research. State-of-the-art methods show performance of 79% sensitivity or true-positive (TP) rate, at 10 false-positives (FP) per volume. We design a two-tiered coarse-to-fine cascade framework to first operate a highly sensitive candidate generation system at a maximum sensitivity of ~92% but with high FP level (~50 per patient). Regions of interest (ROI) for lesion candidates are generated in this step and function as input for the second tier. In the second tier we generate N 2D views, via scale, random translations, and rotations with respect to each ROI centroid coordinates. These random views are used to train a deep Convolutional Neural Network (CNN) classifier. In testing, the CNN is employed to assign individual probabilities for a new set of N random views that are averaged at each ROI to compute a final per-candidate classification probability. This second tier behaves as a highly selective process to reject difficult false positives while preserving high sensitivities. We validate the approach on CT images of 59 patients (49 with sclerotic metastases and 10 normal controls). The proposed method reduces the number of FP/vol. from 4 to 1.2, 7 to 3, and 12 to 9.5 at sensitivity rates of 60%, 70%, and 80% respectively in testing. The Area-Under-the-Curve (AUC) is 0.834. The results show marked improvement upon previous work.
Related software/code: LymphNodeRFCNNPipeline
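Results such as "FP/vol. at a sensitivity of 60%" come from sweeping a threshold over the per-candidate probabilities. The sketch below computes one such operating point under a simplifying assumption that every true-positive candidate corresponds to a distinct lesion. It is not the evaluation code behind the reported numbers, and the function names are hypothetical.

```python
def froc_point(candidates, n_lesions, n_volumes, threshold):
    """Return (sensitivity, FP per volume) when keeping candidates >= threshold.

    `candidates` is a list of (probability, is_true_lesion) pairs. Assumes each
    true-positive candidate marks a different lesion (no duplicate hits).
    """
    kept = [(p, hit) for p, hit in candidates if p >= threshold]
    true_pos = sum(1 for _, hit in kept if hit)
    false_pos = len(kept) - true_pos
    return true_pos / n_lesions, false_pos / n_volumes


def fp_rate_at_sensitivity(candidates, n_lesions, n_volumes, target):
    """Lowest FP/vol. at which the sensitivity reaches `target`, or None."""
    # Lowering the threshold can only add false positives, so the first
    # threshold (scanning from high to low) that reaches the target is the best.
    for threshold in sorted({p for p, _ in candidates}, reverse=True):
        sens, fps = froc_point(candidates, n_lesions, n_volumes, threshold)
        if sens >= target:
            return fps
    return None
```

`n_lesions` counts all reference lesions, including any that the candidate generation stage missed, which is why a target sensitivity can be unreachable.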
Our research focuses on establishing the spatial correspondence between inner colon surfaces extracted from CT colonography. This will help the radiologist to detect colorectal cancer more quickly and accurately.
Polyp in a supine (left) and prone (right) CT scan.
The principle of colon surface registration using a cylindrical representation, where the colour scale indicates the shape index on the colon surfaces.
Each year, more than 630,000 people worldwide die from colorectal cancer. Cancer in the colon can develop from colorectal lesions, mostly small polyps. Computed tomography colonography (CTC), or virtual colonoscopy, is now established as a standard screening tool for these colorectal lesions in the USA, Europe and Japan. If a lesion is detected early enough, it can be safely treated by removing the polyp using an endoscope.
The routine is to acquire two CT images in prone and supine positions of the patient in order to increase the radiologist’s confidence in classifying a tissue as a polyp. However, this changing of the positions of the patient introduces large deformations of the colon. Its changes in inflation and orientation make it difficult and tedious for even the most experienced radiologist to establish correspondence manually. This can lead to significant delays and errors in the diagnosis.
An automatic method for establishing spatial correspondence between the prone and supine CTC views has the potential to improve the accuracy and confidence of the radiologist during the screening process and to save time in the diagnosis. Furthermore, its result could be incorporated into computer-aided detection (CAD) algorithms in order to improve their robustness and accuracy.
We develop a cylindrical registration approach for the inner colon surfaces extracted from CTC. We map the prone and supine surfaces into a cylindrical domain and establish spatial correspondence non-rigidly using a cylindrical B-spline registration.
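One property that distinguishes a cylindrical domain from a flat image is that the angular coordinate wraps around. The sketch below illustrates this with a 1D uniform cubic B-spline whose control points are indexed periodically, so that the interpolated displacement at angle 0 equals that at 2π. It is only an illustration of the periodic boundary idea under these assumptions. The actual registration works on a 2D cylindrical parameterization and optimizes the control points, which is not shown, and the function names are hypothetical.

```python
import math


def cubic_bspline_weights(u):
    """Uniform cubic B-spline basis weights for a local coordinate u in [0, 1)."""
    return (
        (1 - u) ** 3 / 6.0,
        (3 * u**3 - 6 * u**2 + 4) / 6.0,
        (-3 * u**3 + 3 * u**2 + 3 * u + 1) / 6.0,
        u**3 / 6.0,
    )


def periodic_displacement(theta, control_points):
    """Interpolate a displacement at angle `theta` (radians) around the cylinder.

    Control points are spaced uniformly over [0, 2*pi) and indexed modulo their
    count, so the spline is continuous across the 0 / 2*pi seam.
    """
    n = len(control_points)
    position = (theta % (2 * math.pi)) / (2 * math.pi) * n
    cell = int(position) % n          # index of the control-point interval
    u = position - int(position)      # local coordinate inside that interval
    weights = cubic_bspline_weights(u)
    return sum(w * control_points[(cell - 1 + k) % n]
               for k, w in enumerate(weights))
```

Because the four basis weights always sum to one, a constant set of control points produces a constant displacement, which is a convenient sanity check for this kind of interpolation.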
Project site & Data with registration reference standard: http://cmic.cs.ucl.ac.uk/CTC/
Related publications
Registration of the endoluminal surfaces of the colon derived from prone and supine CT colonography (2011)
Holger R. Roth, Jamie R. McClelland, Darren J. Boone, Marc Modat, M. Jorge Cardoso, Thomas E. Hampshire, Mingxing Hu, Shonit Punwani, Sebastien Ourselin, Greg G. Slabaugh, Steve Halligan and David J. Hawkes. (paper)
Roth, HR (2013) Registration of prone and supine CT colonography images and its clinical application. Doctoral thesis, UCL (University College London). (thesis)
This work was featured in two AuntMinnie.com articles:
Surface registration unites prone and supine CTC data by Eric Barnes, September 15, 2011
Prone-supine matching algorithm shows high accuracy in CTC by Eric Barnes, March 11, 2013
Visualization based on colon registration (by T. Hampshire)