Weakly-Supervised Convolutional Neural Networks for Multimodal Image Registration

Hu, Yipeng; Modat, Marc; Gibson, Eli; Li, Wenqi; Ghavami, Nooshin; Bonmati, Ester; Wang, Guotai; Bandula, Steven; Moore, Caroline M.; Emberton, Mark; Ourselin, Sébastien; Noble, J. Alison; Barratt, Dean C.; Vercauteren, Tom

doi:10.1016/j.media.2018.07.002

arXiv:1807.03361 (cs)

[Submitted on 9 Jul 2018]

Abstract: One of the fundamental challenges in supervised learning for multimodal image registration is the lack of ground truth for voxel-level spatial correspondence. This work describes a method to infer voxel-level transformation from higher-level correspondence information contained in anatomical labels. We argue that such labels are more reliable and practical to obtain for reference sets of image pairs than voxel-level correspondence. Typical anatomical labels of interest may include solid organs, vessels, ducts, structure boundaries and other subject-specific ad hoc landmarks. The proposed end-to-end convolutional neural network approach aims to predict displacement fields to align multiple labelled corresponding structures for individual image pairs during training, while only unlabelled image pairs are used as the network input for inference. We highlight the versatility of the proposed strategy, which can utilise diverse types of anatomical labels for training; these labels need not be identifiable over all training image pairs. At inference, the resulting 3D deformable image registration algorithm runs in real time and is fully automated, without requiring any anatomical labels or initialisation. Several network architecture variants are compared for registering T2-weighted magnetic resonance images and 3D transrectal ultrasound images from prostate cancer patients. A median target registration error of 3.6 mm on landmark centroids and a median Dice of 0.87 on prostate glands are achieved from cross-validation experiments, in which 108 pairs of multimodal images from 76 patients were tested with high-quality anatomical labels.
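The two evaluation measures named in the abstract, Dice on a segmented structure and target registration error (TRE) on landmark centroids, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`dice`, `centroid`, `tre`, `shift`, `cube`), the representation of a binary label as a set of voxel indices, and the uniform integer shift standing in for a predicted displacement field are all assumptions made for illustration only.

```python
import math

def dice(a, b):
    """Dice overlap between two binary labels given as sets of (x, y, z) voxel indices."""
    if not a and not b:
        return 1.0  # assumed convention: two empty labels overlap perfectly
    return 2.0 * len(a & b) / (len(a) + len(b))

def centroid(voxels, spacing=(1.0, 1.0, 1.0)):
    """Centroid of a label in millimetres, given the voxel spacing per axis."""
    n = len(voxels)
    return tuple(spacing[d] * sum(v[d] for v in voxels) / n for d in range(3))

def tre(warped_moving, fixed, spacing=(1.0, 1.0, 1.0)):
    """Distance between the centroids of a warped moving label and its fixed counterpart."""
    return math.dist(centroid(warped_moving, spacing), centroid(fixed, spacing))

def shift(voxels, offset):
    """Toy stand-in for warping: translate every voxel by one integer offset."""
    return {tuple(v[d] + offset[d] for d in range(3)) for v in voxels}

def cube(origin, size):
    """Hypothetical helper: a solid cubic label with the given corner and edge length."""
    ox, oy, oz = origin
    return {(ox + i, oy + j, oz + k)
            for i in range(size) for j in range(size) for k in range(size)}

# A fixed-image label and a moving-image label that is offset by two voxels in x.
fixed_label = cube((4, 4, 4), 4)
moving_label = cube((2, 4, 4), 4)

# Before alignment the labels half overlap and their centroids are 2 mm apart.
print(dice(moving_label, fixed_label), tre(moving_label, fixed_label))

# A displacement that undoes the offset brings both measures to their ideal values.
warped_label = shift(moving_label, (2, 0, 0))
print(dice(warped_label, fixed_label), tre(warped_label, fixed_label))
```

In this toy case the labels are only needed to score the alignment. That mirrors the weak supervision described above, where labels drive training and evaluation while inference takes unlabelled image pairs alone.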

Comments:	Accepted manuscript in Medical Image Analysis
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)
Cite as:	arXiv:1807.03361 [cs.CV]
	(or arXiv:1807.03361v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.1807.03361
Related DOI:	https://doi.org/10.1016/j.media.2018.07.002

Submission history

From: Yipeng Hu
[v1] Mon, 9 Jul 2018 19:53:16 UTC (2,904 KB)
