10 Large Language Models You Should Know
Large Language Models (LLMs) are machine learning models with a very large number of parameters, trained for natural language processing tasks. Training them requires substantial computing power and massive text corpora. This article presents ten of the most popular and promising LLMs.
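All of the models below share a core training idea: predict tokens from surrounding context. As a rough illustration of that language-modeling objective (a toy count-based sketch, nothing like how these neural models are actually implemented), consider:

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """Count how often each word follows another: a toy stand-in
    for the next-token prediction objective used by LLMs."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def predict_next(counts, word):
    """Return the most frequent continuation seen in training."""
    if word not in counts:
        return None
    return counts[word].most_common(1)[0][0]

corpus = ["the cat sat", "the cat ran", "the dog sat"]
model = train_bigram(corpus)
print(predict_next(model, "the"))  # "cat" (seen twice after "the")
```

Real LLMs replace these counts with billions of learned parameters, which is exactly what the parameter figures below measure.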
-
GPT-3 (Generative Pre-trained Transformer 3)
Paper: (https://arxiv.org/pdf/2005.14165v4.pdf)
Code: (https://github.com/openai/gpt-3)
Developer: OpenAI
Parameters: 175 billion
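The GPT-3 paper ("Language Models are Few-Shot Learners") popularized in-context learning: instead of fine-tuning, you prepend a few worked examples to the prompt. A minimal sketch of assembling such a prompt (the exact template is an assumption for illustration, not the paper's canonical format):

```python
def build_few_shot_prompt(examples, query, instruction=""):
    """Assemble a few-shot prompt: an optional instruction, a handful
    of input->output demonstrations, then the new query left open
    for the model to complete."""
    lines = [instruction] if instruction else []
    for inp, out in examples:
        lines.append(f"Input: {inp}\nOutput: {out}")
    lines.append(f"Input: {query}\nOutput:")
    return "\n\n".join(lines)

prompt = build_few_shot_prompt(
    examples=[("cheese", "fromage"), ("house", "maison")],
    query="cat",
    instruction="Translate English to French.",
)
print(prompt)
```

The model then continues the text after the final "Output:", having inferred the task from the demonstrations alone.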
-
PaLM (Pathways Language Model)
Paper: (https://arxiv.org/pdf/2204.02311v5.pdf)
Code (unofficial implementation): (https://github.com/lucidrains/PaLM-pytorch)
Developer: Google
Parameters: 540 billion
-
Megatron-Turing NLG
Paper: (https://arxiv.org/pdf/2201.11990v3.pdf)
Code: (https://github.com/microsoft/DeepSpeed)
Developer: Microsoft and NVIDIA
Parameters: 530 billion
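A model of this size cannot fit on a single accelerator; Megatron-style training splits individual weight matrices across devices (tensor parallelism, which the DeepSpeed/Megatron-LM stack implements). A toy NumPy sketch of the idea, splitting one layer's weight matrix by columns across two simulated "devices":

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))   # batch of activations
W = rng.standard_normal((8, 6))   # full weight matrix of one layer

# Column-parallel split: each "device" holds half of W's output columns.
W_dev0, W_dev1 = np.hsplit(W, 2)

# Each device computes its partial output independently...
y0 = x @ W_dev0
y1 = x @ W_dev1

# ...then an all-gather concatenates the shards into the full output.
y_parallel = np.concatenate([y0, y1], axis=1)

assert np.allclose(y_parallel, x @ W)  # matches the single-device result
```

In production the two halves live on different GPUs and the concatenation is a collective communication step, but the arithmetic is the same.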
-
LaMDA (Language Models for Dialog Applications)
Paper: (https://arxiv.org/pdf/2201.08239v3.pdf)
Code (unofficial implementation): (https://github.com/conceptofmind/LaMDA-rlhf-pytorch)
Developer: Google
Parameters: 137 billion
-
BLOOM
Paper: (https://arxiv.org/pdf/2211.05100v2.pdf)
Code: (https://github.com/bigscience-workshop/bigscience)
Developer: BigScience workshop, coordinated by Hugging Face
Parameters: 176 billion
-
T5 (Text-To-Text Transfer Transformer)
Paper: (https://arxiv.org/pdf/1910.10683v3.pdf)
Code: (https://github.com/google-research/t5x)
Developer: Google
Parameters: 11 billion (largest variant)
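T5's central idea is casting every NLP task as text-to-text: a short prefix on the input string identifies the task, and the target is always a string. A sketch of that framing (the prefixes follow the paper's convention; the helper function itself is hypothetical):

```python
def to_text_to_text(task, text):
    """Format a task in T5's unified text-to-text style."""
    if task == "translate_en_de":
        return f"translate English to German: {text}"
    if task == "summarize":
        return f"summarize: {text}"
    if task == "cola":  # grammatical-acceptability classification
        return f"cola sentence: {text}"
    raise ValueError(f"unknown task: {task}")

print(to_text_to_text("translate_en_de", "That is good."))
# translate English to German: That is good.
```

Because inputs and outputs are always plain strings, one model, one loss, and one decoding procedure cover translation, summarization, and classification alike.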
-
XLNet
Paper: (https://arxiv.org/pdf/1906.08237v2.pdf)
Code: (https://github.com/zihangdai/xlnet)
Developer: Carnegie Mellon University and Google
Parameters: 340 million
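XLNet replaces BERT-style masking with permutation language modeling: it maximizes the sequence likelihood under randomly sampled factorization orders, so each token learns to be predicted from varying contexts. A toy sketch of sampling one factorization order and listing the context each token then sees (illustration only; assumes unique tokens):

```python
import random

def factorization_contexts(tokens, seed=0):
    """Sample one permutation of positions; each token is predicted
    from the tokens that precede it *in the sampled order*, not in
    the original left-to-right order."""
    order = list(range(len(tokens)))
    random.Random(seed).shuffle(order)
    contexts = {}
    for i, pos in enumerate(order):
        contexts[tokens[pos]] = [tokens[p] for p in order[:i]]
    return order, contexts

order, ctx = factorization_contexts(["New", "York", "is", "a", "city"])
for tok, seen in ctx.items():
    print(f"predict {tok!r} from {seen}")
```

Averaged over many sampled orders, every token ends up conditioned on both left and right context without ever corrupting the input with [MASK] symbols.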
-
ELECTRA
Paper: (https://arxiv.org/pdf/2003.10555v1.pdf)
Code: (https://github.com/google-research/electra)
Developer: Google
Parameters: 335 million
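ELECTRA's pre-training task is replaced-token detection: a small generator corrupts some input tokens, and the main model is trained as a discriminator to label every token as original or replaced, so every position yields a training signal. A toy sketch of building such examples (uniform random replacement stands in for ELECTRA's learned generator):

```python
import random

def corrupt(tokens, vocab, replace_prob=0.3, seed=0):
    """Randomly swap tokens and record per-token labels:
    0 = original, 1 = replaced (the discriminator's targets)."""
    rng = random.Random(seed)
    corrupted, labels = [], []
    for tok in tokens:
        if rng.random() < replace_prob:
            corrupted.append(rng.choice([w for w in vocab if w != tok]))
            labels.append(1)
        else:
            corrupted.append(tok)
            labels.append(0)
    return corrupted, labels

vocab = ["the", "cat", "sat", "dog", "ran"]
corrupted, labels = corrupt(["the", "cat", "sat"], vocab)
print(list(zip(corrupted, labels)))
```

Compared with masked language modeling, which learns only from the ~15% of masked positions, this all-positions objective is why ELECTRA is notably sample-efficient.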
-
RoBERTa (Robustly Optimized BERT Pretraining Approach)
Paper: (https://arxiv.org/pdf/1907.11692v1.pdf)
Code: (https://github.com/facebookresearch/fairseq/tree/main/examples/roberta)
Developer: Facebook AI and the University of Washington
Parameters: 355 million
-
BERT (Bidirectional Encoder Representations from Transformers)
Paper: (https://arxiv.org/pdf/1810.04805v2.pdf)
Code: (https://github.com/google-research/bert)
Developer: Google
Parameters: 340 million
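BERT is pre-trained with masked language modeling: roughly 15% of input tokens are hidden behind a [MASK] symbol and the model predicts the originals from bidirectional context. A toy sketch of that corruption step (simplified: real BERT sometimes keeps or randomly replaces a chosen token instead of always masking it):

```python
import random

MASK = "[MASK]"

def mask_tokens(tokens, mask_prob=0.15, seed=0):
    """Hide ~15% of tokens; return the corrupted input and the
    prediction targets (position -> original token)."""
    rng = random.Random(seed)
    masked, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            masked.append(MASK)
            targets[i] = tok
        else:
            masked.append(tok)
    return masked, targets

sentence = "the quick brown fox jumps over the lazy dog".split()
masked, targets = mask_tokens(sentence)
print(masked)
print(targets)  # what the model must reconstruct at each masked position
```

This objective is what the "Bidirectional" in BERT's name refers to: unlike a left-to-right language model, the prediction at each masked position can attend to tokens on both sides.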