Neighbourhood Watch: Referring Expression Comprehension via Language-guided Graph Attention Networks

Wang, Peng; Wu, Qi; Cao, Jiewei; Shen, Chunhua; Gao, Lianli; Hengel, Anton van den

arXiv:1812.04794 (cs)

[Submitted on 12 Dec 2018]

Abstract: The task in referring expression comprehension is to localise the object instance in an image that is described by a natural-language referring expression. As a language-to-vision matching task, the key to this problem is learning a discriminative object feature that can adapt to the expression used. To avoid ambiguity, an expression tends to describe not only the properties of the referent itself but also its relationships to its neighbourhood. To capture and exploit this information, we propose a graph-based, language-guided attention mechanism. Composed of a node attention component and an edge attention component, the proposed graph attention mechanism explicitly represents inter-object relationships and object properties with a flexibility and power impossible with competing approaches. Furthermore, the mechanism makes the comprehension decision visualisable and explainable. Experiments on three referring expression comprehension datasets show the advantage of the proposed approach.
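
The abstract names two components, node attention and edge attention, both guided by the expression. The sketch below is a toy illustration of that general idea only and is not the paper's formulation: the dot-product scoring, the separate `node_query` and `edge_query` vectors, the additive combination in `score_objects`, and all feature values are assumptions made for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def node_attention(node_feats, query):
    """Attention over objects: how well each node matches the query."""
    return softmax([dot(f, query) for f in node_feats])


def edge_attention(edge_feats, query):
    """Attention over each node's outgoing edges.

    edge_feats[i] maps a neighbour index j to the feature of edge i -> j.
    Returns one dict {j: weight} per node; weights sum to 1 when the
    node has at least one edge.
    """
    attended = []
    for edges in edge_feats:
        if not edges:
            attended.append({})
            continue
        neighbours = sorted(edges)
        weights = softmax([dot(edges[j], query) for j in neighbours])
        attended.append(dict(zip(neighbours, weights)))
    return attended


def score_objects(node_feats, edge_feats, node_query, edge_query):
    """Add each node's own match to the attention-pooled match of its edges."""
    e_att = edge_attention(edge_feats, edge_query)
    scores = []
    for i, weights in enumerate(e_att):
        own = dot(node_feats[i], node_query)
        context = sum(w * dot(edge_feats[i][j], edge_query)
                      for j, w in weights.items())
        scores.append(own + context)
    return scores


# Hypothetical scene for "the man left of the dog": two men and a dog.
node_feats = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]   # [is_man, is_dog]
edge_feats = [
    {2: [1.0, 0.0]},                    # object 0 is left of the dog
    {2: [0.0, 1.0]},                    # object 1 is right of the dog
    {0: [0.0, 1.0], 1: [1.0, 0.0]},     # the dog's relations to the men
]
node_query = [1.0, 0.0]   # stands in for the word "man"
edge_query = [1.0, 0.0]   # stands in for the phrase "left of"

node_att = node_attention(node_feats, node_query)
scores = score_objects(node_feats, edge_feats, node_query, edge_query)
best = max(range(len(scores)), key=scores.__getitem__)
print(node_att, scores, best)
```

In this toy scene the node attention alone cannot separate the two men, since both match "man" equally; the edge term breaks the tie in favour of object 0, which is the kind of neighbourhood information the abstract describes. The attention weights themselves can be inspected directly, which loosely mirrors the visualisability claim.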

Subjects: Computer Vision and Pattern Recognition (cs.CV)
Cite as: arXiv:1812.04794 [cs.CV] (or arXiv:1812.04794v1 [cs.CV] for this version)
DOI: https://doi.org/10.48550/arXiv.1812.04794

Submission history

From: Peng Wang
[v1] Wed, 12 Dec 2018 03:30:41 UTC (9,804 KB)
