The Power of Randomization: Distributed Submodular Maximization on Massive Datasets

Barbosa, Rafael da Ponte; Ene, Alina; Nguyen, Huy L.; Ward, Justin

arXiv:1502.02606 (cs)

[Submitted on 9 Feb 2015 (v1), last revised 22 Apr 2015 (this version, v2)]

Abstract: A wide variety of problems in machine learning, including exemplar clustering, document summarization, and sensor placement, can be cast as constrained submodular maximization problems. Unfortunately, the resulting submodular optimization problems are often too large to be solved on a single machine. We develop a simple distributed algorithm that is embarrassingly parallel and achieves provable, constant-factor, worst-case approximation guarantees. In our experiments, we demonstrate its efficiency on large problems with different kinds of constraints, with objective values always close to what is achievable in the centralized setting.
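
The abstract does not spell out the algorithm, so the following is only an illustrative sketch of one common way a randomized, two-round distributed greedy scheme can be organized. It assumes a cardinality constraint (pick at most k items) and a toy coverage objective. The names `coverage`, `greedy`, and `distributed_greedy` are hypothetical, the "machines" are simulated in a single process, and nothing here should be read as the authors' exact procedure or as carrying their guarantees.

```python
import random


def coverage(sets, chosen):
    """Toy submodular objective: number of distinct elements covered."""
    covered = set()
    for i in chosen:
        covered |= sets[i]
    return len(covered)


def greedy(sets, candidates, k):
    """Standard greedy: add the candidate with the largest marginal gain, up to k times."""
    chosen = []
    remaining = list(candidates)
    for _ in range(min(k, len(remaining))):
        base = coverage(sets, chosen)
        best = max(remaining, key=lambda i: coverage(sets, chosen + [i]) - base)
        chosen.append(best)
        remaining.remove(best)
    return chosen


def distributed_greedy(sets, k, machines, seed=0):
    """Simulated two-round scheme (an assumption, not taken from the abstract)."""
    rng = random.Random(seed)
    # Round 1: assign each item to a machine uniformly at random.
    parts = [[] for _ in range(machines)]
    for i in range(len(sets)):
        parts[rng.randrange(machines)].append(i)
    # Each machine runs greedy on its own part; these runs are independent,
    # which is what makes the scheme embarrassingly parallel.
    local = [greedy(sets, part, k) for part in parts]
    # Round 2: one machine runs greedy on the union of the local solutions.
    union = sorted({i for solution in local for i in solution})
    final = greedy(sets, union, k)
    # Return the best solution seen in either round.
    return max(local + [final], key=lambda s: coverage(sets, s))
```

With a single machine the sketch reduces to the centralized greedy algorithm, which gives a quick sanity check. With several machines, each local run only sees its own random part of the data, so no single process needs to hold the full dataset.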

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Distributed, Parallel, and Cluster Computing (cs.DC)
Cite as: arXiv:1502.02606 [cs.LG] (or arXiv:1502.02606v2 [cs.LG] for this version)
DOI: https://doi.org/10.48550/arXiv.1502.02606

Submission history

From: Huy Nguyen
[v1] Mon, 9 Feb 2015 19:04:43 UTC (1,448 KB)
[v2] Wed, 22 Apr 2015 17:49:22 UTC (1,448 KB)
