Revisiting Character-Based Neural Machine Translation with Capacity and Compression

Cherry, Colin; Foster, George; Bapna, Ankur; Firat, Orhan; Macherey, Wolfgang


arXiv:1808.09943 (cs)

[Submitted on 29 Aug 2018]



Abstract: Translating characters instead of words or word-fragments has the potential to simplify the processing pipeline for neural machine translation (NMT), and improve results by eliminating hyper-parameters and manual feature engineering. However, it results in longer sequences in which each symbol contains less information, creating both modeling and computational challenges. In this paper, we show that the modeling problem can be solved by standard sequence-to-sequence architectures of sufficient depth, and that deep models operating at the character level outperform identical models operating over word fragments. This result implies that alternative architectures for handling character input are better viewed as methods for reducing computation time than as improved ways of modeling longer sequences. From this perspective, we evaluate several techniques for character-level NMT, verify that they do not match the performance of our deep character baseline model, and evaluate the performance versus computation time tradeoffs they offer. Within this framework, we also perform the first evaluation for NMT of conditional computation over time, in which the model learns which timesteps can be skipped, rather than having them be dictated by a fixed schedule specified before training begins.
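As a rough illustration of the abstract's point that character input yields longer sequences, the sketch below compares one sentence split into characters with the same sentence split into word fragments. The sentence and the fragment segmentation are hand-written assumptions for illustration only; they are not taken from the paper and are not the output of any real subword model.

```python
# Toy comparison: one sentence as characters vs. as word fragments.
# The fragment segmentation is hypothetical and hand-written; it is not
# produced by any real BPE or wordpiece model.

sentence = "translating characters simplifies the pipeline"

# Character level: every character, including spaces, is one symbol.
char_tokens = list(sentence)

# Hypothetical word-fragment segmentation ("@@" marks a non-final fragment).
fragment_tokens = [
    "trans@@", "lat@@", "ing",
    "char@@", "acter@@", "s",
    "simpl@@", "ifies",
    "the",
    "pipe@@", "line",
]


def detokenize(fragments):
    """Rejoin fragments into the original sentence by removing '@@ ' joins."""
    return " ".join(fragments).replace("@@ ", "")


def length_ratio(chars, fragments):
    """How many times longer the character sequence is than the fragment one."""
    return len(chars) / len(fragments)


if __name__ == "__main__":
    # Sanity check: the toy segmentation must reproduce the sentence exactly.
    assert detokenize(fragment_tokens) == sentence
    print("character symbols:", len(char_tokens))
    print("fragment symbols: ", len(fragment_tokens))
    print("length ratio:     ", round(length_ratio(char_tokens, fragment_tokens), 2))
```

In this toy case the character sequence is several times longer than the fragment sequence, which is the kind of length increase behind the computational cost the abstract describes.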

Comments:	To appear at EMNLP 2018
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:1808.09943 [cs.CL]
	(or arXiv:1808.09943v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.1808.09943

Submission history

From: Colin Cherry
[v1] Wed, 29 Aug 2018 17:46:50 UTC (321 KB)
