arXiv
Trackbacks

Trackbacks indicate external web sites that link to articles in arXiv.org. Trackbacks do not reflect the opinion of arXiv.org and may not reflect the opinions of that article's authors.

Trackback guide

By sending a trackback, you can notify arXiv.org that you have created a web page that references a paper. Popular blogging software supports trackback: you can send us a trackback about this paper by giving your software the following trackback URL:

https://arxiv.org/trackback/{arXiv_id}

Some blogging software supports trackback autodiscovery; in this case, your software will automatically send a trackback as soon as you create a link to our abstract page. See our trackback help page for more information.
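For software without built-in trackback support, the ping can be assembled by hand. The following is a minimal sketch, not arXiv's documented client: it assumes the conventional TrackBack ping format (an HTTP POST with form-encoded `url`, `title`, `excerpt`, and `blog_name` fields), and the function name `build_trackback_ping` and the example page URL are hypothetical. It only builds the request and does not send it.

```python
from urllib.parse import urlencode
from urllib.request import Request

# URL pattern from the trackback guide above: https://arxiv.org/trackback/{arXiv_id}
TRACKBACK_BASE = "https://arxiv.org/trackback/"


def build_trackback_ping(arxiv_id, page_url, title="", excerpt="", blog_name=""):
    """Build (but do not send) a trackback ping for one arXiv paper.

    Assumption: field names follow the conventional TrackBack ping format.
    Which optional fields arXiv actually uses is not stated on this page.
    """
    fields = {"url": page_url}  # the page that references the paper
    optional = {"title": title, "excerpt": excerpt, "blog_name": blog_name}
    # Only include optional fields that were actually supplied.
    fields.update({key: value for key, value in optional.items() if value})
    body = urlencode(fields).encode("utf-8")
    return Request(
        TRACKBACK_BASE + arxiv_id,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded; charset=utf-8"},
        method="POST",
    )


# Hypothetical referencing page; replace with the real URL of your post.
ping = build_trackback_ping(
    "2201.11990", "https://example.com/post", title="Notes"
)
print(ping.get_method(), ping.full_url)
print(ping.data.decode("utf-8"))
```

Passing the returned object to `urllib.request.urlopen` would send the ping; that step is left out here so the sketch can be run without touching the network.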

Trackbacks for 2201.11990

PaLM: Efficiently Training Massive Language Models

[ Towards Data Science - Medium@ INVALID-URL ] trackback posted Mon, 19 Jun 2023 15:43:46 UTC

Modern LLMs: MT-NLG, Chinchilla, Gopher and More

[ Towards Data Science - Medium@ INVALID-URL ] trackback posted Fri, 23 Dec 2022 19:34:45 UTC

I Used My Voice to Interact With OpenAI GPT-3

[ Towards Data Science - Medium@ INVALID-URL ] trackback posted Wed, 14 Sep 2022 13:52:44 UTC

Metadata for 2201.11990

[Submitted on 28 Jan 2022 (v1), last revised 4 Feb 2022 (this version, v3)]

Title: Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model

Authors: Shaden Smith, Mostofa Patwary, Brandon Norick, Patrick LeGresley, Samyam Rajbhandari, Jared Casper, Zhun Liu, Shrimai Prabhumoye, George Zerveas, Vijay Korthikanti, Elton Zhang, Rewon Child, Reza Yazdani Aminabadi, Julie Bernauer, Xia Song, Mohammad Shoeybi, Yuxiong He, Michael Houston, Saurabh Tiwary, Bryan Catanzaro, et al. (10 additional authors not shown)
Comments: Shaden Smith and Mostofa Patwary contributed equally
Subjects: Computation and Language (cs.CL)
Cite as: arXiv:2201.11990 [cs.CL]
  (or arXiv:2201.11990v3 [cs.CL] for this version)