The Pre-trained Language Model: What Can GPT-3, Switch Transformer, and Enlightenment 2.0 Do?

5 min read · Oct 25, 2021

In just two years, the parameter count of the largest pre-trained language models has grown roughly tenfold, a sign of how fierce the competition in this field has become. What is a pre-trained language model, and what can these models do?

In May 2020, OpenAI released GPT-3, a pre-trained model with 175 billion parameters. It can not only write articles, answer questions, and translate, but also hold multi-turn conversations, write code, and perform mathematical calculations. As one of the “star” models in the field of artificial intelligence in 2020, GPT-3 pushed the popularity of ultra-large-scale pre-trained models to a new high.

In January 2021, less than a year after the advent of GPT-3, Google launched the Switch Transformer model, which raised the parameter count from GPT-3's 175 billion to 1.6 trillion, making it the first trillion-parameter language model.

On June 1, 2021, less than half a year after Switch Transformer came out, the Beijing Zhiyuan Conference kicked off as scheduled at the Conference Center of Zhongguancun National…

Written by Jarvis+