Multi-step Reasoning via Recurrent Dual Attention for Visual Dialog

Gan, Zhe; Cheng, Yu; Kholy, Ahmed El; Li, Linjie; Liu, Jingjing; Gao, Jianfeng

arXiv:1902.00579 (cs)

[Submitted on 1 Feb 2019 (v1), last revised 4 Jun 2019 (this version, v2)]

Authors: Zhe Gan, Yu Cheng, Ahmed El Kholy, Linjie Li, Jingjing Liu, Jianfeng Gao

Abstract: This paper presents a new model for visual dialog, the Recurrent Dual Attention Network (ReDAN), which uses multi-step reasoning to answer a series of questions about an image. In each question-answering turn of a dialog, ReDAN infers the answer progressively through multiple reasoning steps. In each step, the semantic representation of the question is updated based on the image and the previous dialog history, and the recurrently refined representation is used for further reasoning in the subsequent step. On the VisDial v1.0 dataset, ReDAN achieves a new state-of-the-art NDCG score of 64.47%. Visualization of the reasoning process further demonstrates that ReDAN can locate context-relevant visual and textual clues via iterative refinement, which can lead to the correct answer step by step.
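
The recurrent refinement described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the dot-product attention, the `tanh` update rule, the feature shapes, and the names `attend` and `multi_step_reasoning` are all assumptions chosen for illustration, and the actual model uses learned parameters. The sketch only shows the general pattern of a query attending to image features and dialog-history features, then being updated and reused in the next step.

```python
import numpy as np


def softmax(x):
    # Numerically stable softmax over a 1-D array.
    e = np.exp(x - np.max(x))
    return e / e.sum()


def attend(query, memory):
    # Hypothetical dot-product attention (an assumption, not the paper's form):
    # score each row of `memory` against `query`, return the weighted sum.
    weights = softmax(memory @ query)
    return weights @ memory, weights


def multi_step_reasoning(question, image_feats, history_feats, num_steps=3):
    # question: (d,), image_feats: (n_regions, d), history_feats: (n_turns, d)
    query = question
    trace = []
    for _ in range(num_steps):
        # Attend to image regions using the current query.
        visual, v_weights = attend(query, image_feats)
        # Attend to dialog history, conditioned on the query and visual context.
        textual, h_weights = attend(query + visual, history_feats)
        # Assumed update rule: refine the query from both attended contexts.
        query = np.tanh(question + visual + textual)
        trace.append((v_weights, h_weights))
    return query, trace


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    q = rng.normal(size=8)            # toy question embedding
    img = rng.normal(size=(5, 8))     # toy features for 5 image regions
    hist = rng.normal(size=(3, 8))    # toy features for 3 dialog turns
    refined, trace = multi_step_reasoning(q, img, hist, num_steps=3)
    print(refined.shape, len(trace))
```

The per-step attention weights kept in `trace` correspond loosely to the kind of visualization the abstract mentions, where one can inspect which image regions and dialog turns receive weight at each reasoning step.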

Comments:	Accepted to ACL 2019
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
Cite as:	arXiv:1902.00579 [cs.CV]
	(or arXiv:1902.00579v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.1902.00579

Submission history

From: Zhe Gan
[v1] Fri, 1 Feb 2019 22:48:26 UTC (2,294 KB)
[v2] Tue, 4 Jun 2019 05:54:02 UTC (2,788 KB)
