10 Large Language Models You Should Know

Large Language Models (LLMs) are machine learning models with a very large number of parameters, trained for natural language processing tasks. Training them requires enormous computing power and vast amounts of text data. This article presents ten promising and popular LLMs, each with its paper, a code link, and a brief usage example.

  1. GPT-3 (Generative Pre-trained Transformer 3)

    Paper: (https://arxiv.org/pdf/2005.14165v4.pdf)

    Code: (https://github.com/openai/gpt-3)

    Developer: OpenAI

    Parameters: 175 billion
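
    Example: GPT-3's weights are not public; the linked repository holds dataset details rather than model code, and the model is served through the OpenAI API. A minimal sketch, assuming the openai>=1.0 Python client and access to the GPT-3-family base model davinci-002 (model availability depends on your account):

      # Complete a prompt with a GPT-3-family model via the OpenAI API.
      # Assumes: pip install openai, and an API key in OPENAI_API_KEY.
      from openai import OpenAI

      client = OpenAI()  # reads OPENAI_API_KEY from the environment

      response = client.completions.create(
          model="davinci-002",  # GPT-3-family base model (assumed available)
          prompt="Large Language Models are",
          max_tokens=40,
          temperature=0.7,
      )
      print(response.choices[0].text)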

  2. PaLM (Pathways Language Model)

    Paper: (https://arxiv.org/pdf/2204.02311v5.pdf)

    Code: (https://github.com/lucidrains/PaLM-pytorch) (unofficial implementation)

    Developer: Google

    Parameters: 540 billion
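
    Example: Google has not released PaLM's weights; the linked repository is an unofficial PyTorch reimplementation of the architecture. A minimal sketch, assuming the constructor documented in that repository's README (pip install PaLM-pytorch):

      import torch
      from palm_pytorch import PaLM

      # A small, untrained PaLM-style decoder for illustration only.
      model = PaLM(
          num_tokens=20000,  # vocabulary size
          dim=512,           # model width
          depth=12,          # transformer blocks
          heads=8,           # attention heads
          dim_head=64,       # per-head dimension
      )

      tokens = torch.randint(0, 20000, (1, 1024))  # dummy token ids
      logits = model(tokens)                       # shape: (1, 1024, 20000)
      print(logits.shape)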

  3. Megatron-Turing NLG

    Paper: (https://arxiv.org/pdf/2201.11990v3.pdf)

    Code: (https://github.com/microsoft/DeepSpeed)

    Developer: Microsoft and NVIDIA

    Parameters: 530 billion
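
    Example: Megatron-Turing NLG's weights are not public; the linked DeepSpeed repository is the training library used (together with Megatron-LM) to build it. A minimal sketch of DeepSpeed's documented initialize entry point on a toy model, normally run under the deepspeed launcher:

      import torch
      import deepspeed

      model = torch.nn.Linear(1024, 1024)  # stand-in for a real transformer

      ds_config = {
          "train_batch_size": 8,
          "fp16": {"enabled": True},
          "zero_optimization": {"stage": 2},  # partition optimizer state + grads
          "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
      }

      # Wraps the model for distributed, memory-efficient training.
      engine, optimizer, _, _ = deepspeed.initialize(
          model=model,
          model_parameters=model.parameters(),
          config=ds_config,
      )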

  4. LaMDA (Language Models for Dialog Applications)

    Paper: (https://arxiv.org/pdf/2201.08239v3.pdf)

    Code: (https://github.com/conceptofmind/LaMDA-rlhf-pytorch) (unofficial implementation; LaMDA's weights are not public)

    Developer: Google

    Parameters: 137 billion

  5. BLOOM (BigScience Large Open-science Open-access Multilingual Language Model)

    Paper: (https://arxiv.org/pdf/2211.05100v2.pdf)

    Code: (https://github.com/bigscience-workshop/bigscience)

    Developer: BigScience (a research collaboration coordinated by Hugging Face)

    Parameters: 176 billion
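
    Example: BLOOM's weights are openly released on the Hugging Face Hub. A minimal generation sketch using the small 560M variant (the full 176B model requires multi-GPU hardware), assuming pip install transformers torch:

      from transformers import AutoModelForCausalLM, AutoTokenizer

      tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-560m")
      model = AutoModelForCausalLM.from_pretrained("bigscience/bloom-560m")

      inputs = tokenizer("Large Language Models are", return_tensors="pt")
      outputs = model.generate(**inputs, max_new_tokens=30)
      print(tokenizer.decode(outputs[0], skip_special_tokens=True))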

  6. T5 (Text-To-Text Transfer Transformer)

    Paper: (https://arxiv.org/pdf/1910.10683v3.pdf)

    Code: (https://github.com/google-research/t5x)

    Developer: Google

    Parameters: 11 billion (largest variant)
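
    Example: T5 casts every task as text-to-text, with a task prefix selecting the behavior. A minimal sketch with the public t5-small checkpoint, assuming pip install transformers torch sentencepiece:

      from transformers import T5ForConditionalGeneration, T5Tokenizer

      tokenizer = T5Tokenizer.from_pretrained("t5-small")
      model = T5ForConditionalGeneration.from_pretrained("t5-small")

      # The "translate English to German:" prefix selects the translation task.
      inputs = tokenizer(
          "translate English to German: The house is wonderful.",
          return_tensors="pt",
      )
      outputs = model.generate(**inputs, max_new_tokens=20)
      print(tokenizer.decode(outputs[0], skip_special_tokens=True))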

  7. XLNet

    Paper: (https://arxiv.org/pdf/1906.08237v2.pdf)

    Code: (https://github.com/zihangdai/xlnet)

    Developer: Carnegie Mellon University and Google

    Parameters: 340 million (XLNet-Large)
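
    Example: XLNet is a permutation-based autoregressive model, so it can generate text left to right. A minimal sketch via the transformers text-generation pipeline, assuming pip install transformers torch sentencepiece:

      from transformers import pipeline

      generator = pipeline("text-generation", model="xlnet-base-cased")
      result = generator("Large Language Models are", max_new_tokens=20)
      print(result[0]["generated_text"])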

  8. ELECTRA

    Paper: (https://arxiv.org/pdf/2003.10555v1.pdf)

    Code: (https://github.com/google-research/electra)

    Developer: Google

    Parameters: 335 million (ELECTRA-Large)
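
    Example: ELECTRA pretrains a discriminator to detect tokens replaced by a small generator, rather than predicting masked tokens. A minimal sketch with the public small discriminator checkpoint, assuming pip install transformers torch:

      import torch
      from transformers import ElectraForPreTraining, ElectraTokenizerFast

      name = "google/electra-small-discriminator"
      tokenizer = ElectraTokenizerFast.from_pretrained(name)
      model = ElectraForPreTraining.from_pretrained(name)

      # 'fake' replaces the original word 'jumps'.
      sentence = "The quick brown fox fake over the lazy dog"
      inputs = tokenizer(sentence, return_tensors="pt")
      logits = model(**inputs).logits

      # Positive logits mark tokens the discriminator judges as replaced.
      flags = (logits > 0).int().squeeze().tolist()
      print(list(zip(tokenizer.tokenize(sentence), flags[1:-1])))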

  9. RoBERTa (Robustly Optimized BERT Pretraining Approach)

    Paper: (https://arxiv.org/pdf/1907.11692v1.pdf)

    Code: (https://github.com/facebookresearch/fairseq/tree/main/examples/roberta)

    Developer: Facebook AI and the University of Washington

    Parameters: 355 million (RoBERTa-Large)
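
    Example: RoBERTa keeps BERT's architecture but trains longer, on more data, with dynamic masking. A minimal masked-token sketch via the fill-mask pipeline (note RoBERTa's mask token is <mask>, not BERT's [MASK]), assuming pip install transformers torch:

      from transformers import pipeline

      unmasker = pipeline("fill-mask", model="roberta-base")
      for pred in unmasker("Large Language Models are <mask> networks."):
          print(pred["token_str"], round(pred["score"], 3))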

  10. BERT (Bidirectional Encoder Representations from Transformers)

    Paper: (https://arxiv.org/pdf/1810.04805v2.pdf)

    Code: (https://github.com/google-research/bert)

    Developer: Google

    Parameters: 340 million (BERT-Large)
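
    Example: BERT is typically used as an encoder whose contextual embeddings feed downstream tasks such as classification. A minimal feature-extraction sketch, assuming pip install transformers torch:

      import torch
      from transformers import AutoModel, AutoTokenizer

      tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
      model = AutoModel.from_pretrained("bert-base-uncased")

      inputs = tokenizer("Large Language Models are neural networks.",
                         return_tensors="pt")
      with torch.no_grad():
          hidden = model(**inputs).last_hidden_state  # (1, seq_len, 768)
      print(hidden.shape)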

© 2024 | Eneotu Joe