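What Is the GLM General Language Model? A 2026 Technical Overview and Usage Guide

Q: What is the core innovation of the GLM model?

GLM's core innovation is its autoregressive blank infilling pretraining objective: it randomly masks spans of text and predicts them autoregressively, which unifies understanding and generation tasks. It also introduces 2D positional encoding and span shuffling to make the objective more flexible.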

GLM (General Language Model) is a general language model pretrained with an autoregressive blank infilling objective, and it can be finetuned on a variety of natural language understanding and generation tasks. The objective randomly masks contiguous spans of text and then predicts those spans autoregressively, combining the strengths of autoencoding and autoregressive pretraining.

For a detailed description of GLM, please refer to our paper:

GLM: General Language Model Pretraining with Autoregressive Blank Infilling (ACL 2022)
Zhengxiao Du*, Yujie Qian*, Xiao Liu, Ming Ding, Jiezhong Qiu, Zhilin Yang, Jie Tang (*: equal contribution)

News: We have released ChatGLM-6B, an open pretrained language model with 6 billion parameters, based on the GLM framework and optimized for Chinese QA and dialogue.

Core Concepts and Model Architecture

The core innovation of GLM lies in its autoregressive blank infilling pretraining objective. Unlike unidirectional autoregressive models (e.g., GPT) or encoder-only models (e.g., BERT), GLM randomly replaces spans of the input text with [MASK] tokens and then predicts the masked spans autoregressively (left to right within each span). This unifies understanding and generation tasks under a single objective.

To achieve this, GLM introduces 2D positional encoding and span shuffling. The 2D positional encoding gives every token two position ids: one for its position in the corrupted text (for tokens inside a masked span, this is the position of the corresponding [MASK] token) and one for its position within the masked span. Span shuffling randomly permutes the order in which the masked spans are predicted, so the model learns to fill spans in any order.
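
The input layout described above can be illustrated with a small, framework-free sketch. It is only an approximation of the idea, not the repository's data pipeline: the function name `build_blank_infilling_input`, the `[S]` start token, and the 0-based position ids are assumptions made for this example.

```python
import random

MASK, START = "[MASK]", "[S]"  # token names are assumptions for this sketch


def build_blank_infilling_input(tokens, spans, order=None):
    """Build a GLM-style input from `tokens` and masked `spans`.

    `spans` is a sorted list of non-overlapping (start, end) index pairs,
    end exclusive. `order` is the order in which spans are predicted; when
    omitted, the spans are shuffled at random.
    """
    part_a, mask_positions, cursor = [], [], 0
    for start, end in spans:
        part_a.extend(tokens[cursor:start])   # keep the visible context
        mask_positions.append(len(part_a))    # where this span's [MASK] sits
        part_a.append(MASK)                   # one [MASK] per span
        cursor = end
    part_a.extend(tokens[cursor:])

    if order is None:
        order = list(range(len(spans)))
        random.shuffle(order)                 # span shuffling

    sequence = list(part_a)
    position_1 = list(range(len(part_a)))     # position in the corrupted text
    position_2 = [0] * len(part_a)            # 0 means "not inside a span"
    for index in order:
        start, end = spans[index]
        span_tokens = [START] + tokens[start:end]
        sequence.extend(span_tokens)
        # Every token of a span shares the position of its [MASK] ...
        position_1.extend([mask_positions[index]] * len(span_tokens))
        # ... and counts 1, 2, 3, ... inside the span.
        position_2.extend(range(1, len(span_tokens) + 1))
    return sequence, position_1, position_2


tokens = ["x1", "x2", "x3", "x4", "x5", "x6"]
sequence, position_1, position_2 = build_blank_infilling_input(
    tokens, spans=[(2, 3), (4, 6)], order=[1, 0])
print(sequence)    # ['x1', 'x2', '[MASK]', 'x4', '[MASK]', '[S]', 'x5', 'x6', '[S]', 'x3']
print(position_1)  # [0, 1, 2, 3, 4, 4, 4, 4, 2, 2]
print(position_2)  # [0, 0, 0, 0, 0, 1, 2, 3, 1, 2]
```

Passing an explicit `order` keeps the example deterministic; leaving it out exercises the random span permutation.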

Pretrained Models Overview

The pretrained models used in the paper can be downloaded from OneDrive or Tsinghua-Cloud.

The following table lists the main pretrained models:

Name	Params	Language	Corpus	Objective	File	Config
GLM-Base	110M	English	Wiki+Book	Token	glm-base-blank.tar.bz2	model_blocklm_base.sh
GLM-Large	335M	English	Wiki+Book	Token	glm-large-blank.tar.bz2	model_blocklm_large.sh
GLM-Large-Chinese	335M	Chinese	WuDaoCorpora	Token+Sent+Doc	glm-large-chinese.tar.bz2	model_blocklm_large_chinese.sh
GLM-Doc	335M	English	Wiki+Book	Token+Doc	glm-large-generation.tar.bz2	model_blocklm_large_generation.sh
GLM-410M	410M	English	Wiki+Book	Token+Doc	glm-1.25-generation.tar.bz2	model_blocklm_1.25_generation.sh
GLM-515M	515M	English	Wiki+Book	Token+Doc	glm-1.5-generation.tar.bz2	model_blocklm_1.5_generation.sh
GLM-RoBERTa	335M	English	RoBERTa	Token	glm-roberta-large-blank.tar.bz2	model_blocklm_roberta_large.sh
GLM-2B	2B	English	Pile	Token+Sent+Doc	glm-2b.tar.bz2	model_blocklm_2B.sh
GLM-10B	10B	English	Pile	Token+Sent+Doc	Download	model_blocklm_10B.sh
GLM-10B-Chinese	10B	Chinese	WuDaoCorpora	Token+Sent+Doc	Download	model_blocklm_10B_chinese.sh

Note: After downloading a file, extract the archive into a local folder and set CHECKPOINT_PATH in the corresponding scripts to that folder path.
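
As a minimal sketch of the extraction step, the standard-library `tarfile` module can unpack a `.tar.bz2` archive. The helper name `extract_checkpoint` and the `checkpoints` folder are hypothetical; the layout inside each archive is not specified here, so check the extracted folder before pointing CHECKPOINT_PATH at it.

```python
import tarfile
from pathlib import Path


def extract_checkpoint(archive_path, dest_dir):
    """Extract a .tar.bz2 archive into dest_dir and return that folder.

    Only extract archives obtained from a source you trust.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, mode="r:bz2") as archive:
        archive.extractall(dest)
    return dest


# Example (assumes the archive has already been downloaded):
# folder = extract_checkpoint("glm-large-blank.tar.bz2", "checkpoints")
# print(folder.resolve())  # use this path as CHECKPOINT_PATH
```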

Performance

GLM performs strongly across multiple benchmarks, which supports its use as a general-purpose backbone model.

SuperGLUE

SuperGLUE is a natural language understanding benchmark that includes tasks such as COPA, WSC, and RTE. Results on the SuperGLUE dev set with single-model, single-task finetuning are shown below. GLM-10B scores higher than DeBERTa-XXLarge-v2 on COPA, BoolQ, and ReCoRD, and is within 0.4 points on RTE.

Model	COPA	WSC	RTE	WiC	CB	MultiRC	BoolQ	ReCoRD
GLM-10B	98.0	95.2	93.1	75.7	98.7/98.2	88.1/63.3	88.7	94.4/94.0
DeBERTa-XXLarge-v2	97.0	-	93.5	-	-	87.8/63.6	88.3	94.1/93.7

Sequence-to-Sequence Generation

On text summarization tasks, GLM-10B also achieves competitive results.

CNN/Daily Mail (test set, no additional data used):

Model	ROUGE-1	ROUGE-2	ROUGE-L
GLM-10B	44.7	21.4	41.4
T5-11B	43.5	21.6	40.7
PEGASUS-Large	44.2	21.5	41.4
BART-Large	44.2	21.3	40.9

XSum (test set, no additional data used):

Model	ROUGE-1	ROUGE-2	ROUGE-L
GLM-10B	48.9	25.7	40.4
PEGASUS-Large	47.2	24.6	39.3
BART-Large	45.1	22.3	37.3

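Language Modeling

In zero-shot language modeling evaluation, bidirectional GLM-10B reaches the highest LAMBADA accuracy in the table, while Megatron-LM (8.3B) and Turing-NLG report lower Wikitext103 perplexity.

Model	LAMBADA (accuracy)	Wikitext103 (perplexity)
GLM-10B (bidirectional)	72.35	11.33
GLM-10B (unidirectional)	67.18	12.22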
GPT-2	52.66	17.48
Megatron-LM (8.3B)	66.51	10.81
Turing-NLG	67.98	10.21
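
For readers unfamiliar with the perplexity column, the metric is the exponential of the average negative log-likelihood per token, so lower is better. The sketch below shows only that definition; the function name is hypothetical, and the exact tokenization and normalization behind the numbers in the table may differ.

```python
import math


def perplexity(token_log_probs):
    """Perplexity from natural-log probabilities of each target token."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    average_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(average_nll)


# A model that assigns probability 0.25 to every token has perplexity 4
# (up to floating-point rounding).
print(perplexity([math.log(0.25)] * 4))
```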

Quick Start

Using GLM via the Hugging Face Hub

You can access GLM models via the Hugging Face Hub. Please install transformers>=4.23.1 and browse the available GLM models on the Hub.

Generation

The following code shows how to use GLM-10B for blank-filling generation:

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
tokenizer = AutoTokenizer.from_pretrained("THUDM/glm-10b", trust_remote_code=True)
model = AutoModelForSeq2SeqLM.from_pretrained("THUDM/glm-10b", trust_remote_code=True)
model = model.half().cuda()
model.eval()

# Inference: fill the [MASK] span in the prompt
inputs = tokenizer("Ng is an adjunct professor at [MASK] (formerly associate professor and Director of its Stanford AI Lab or SAIL ). Also a pioneer in online education, Ng co-founded Coursera and deeplearning.ai.", return_tensors="pt")
inputs = tokenizer.build_inputs_for_generation(inputs, max_gen_length=512)
inputs = inputs.to('cuda')
outputs = model.generate(**inputs, max_length=512, eos_token_id=tokenizer.eop_token_id)
print(tokenizer.decode(outputs[0].tolist()))

# Training: compute the loss of the target text for each [MASK]
inputs = tokenizer(
    ["Tsinghua University is located in [MASK].", "One minus one equals zero, is it correct? Answer: [MASK]"],
    return_tensors="pt", padding=True)
inputs = tokenizer.build_inputs_for_generation(inputs, targets=["Beijing", "No"], max_gen_length=8, padding=False)
inputs = inputs.to('cuda')
outputs = model(**inputs)
loss = outputs.loss
logits = outputs.logits

Classification

The following code shows how to use GLM for multiple-choice tasks:

from transformers import AutoTokenizer, AutoModelForMultipleChoice
tokenizer = AutoTokenizer.from_pretrained("THUDM/glm-10b", trust_remote_code=True)
model = AutoModelForMultipleChoice.from_pretrained("THUDM/glm-10b", trust_remote_code=True)
model = model.half().cuda()
model.eval()

inputs = tokenizer(["Tsinghua University is located in [MASK].",
                    "One minus one equals zero, is it correct? Answer: [MASK]"], return_tensors="pt", padding=True)
choices = [["Beijing", "Shanghai"], ["Yes", "No"]]
inputs = tokenizer.build_inputs_for_multiple_choice(inputs, choices)
inputs = inputs.to('cuda')
outputs = model(**inputs)
logits = outputs.logits
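
To turn the scores into answers, one option is to pick the highest-scoring choice per example. The sketch below assumes the logits have shape (batch size, number of choices), which is the usual layout for multiple-choice outputs in transformers but is not confirmed here for GLM; the helper `pick_choices` is hypothetical and works on plain Python lists.

```python
def pick_choices(logits, choices):
    """Return the highest-scoring choice for each example.

    `logits` is a list of per-example score lists; `choices` is the matching
    list of candidate strings for each example.
    """
    predictions = []
    for scores, options in zip(logits, choices):
        best = max(range(len(options)), key=lambda i: scores[i])
        predictions.append(options[best])
    return predictions


# With a real model this could be called as (assumed shape, see above):
# pick_choices(outputs.logits.float().tolist(), choices)
example_logits = [[2.0, -1.0], [0.1, 0.3]]  # made-up scores for illustration
example_choices = [["Beijing", "Shanghai"], ["Yes", "No"]]
print(pick_choices(example_logits, example_choices))
```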

Tip: You can also convert finetuned checkpoints to the Transformers format with scripts/convert_glm_checkpoint_to_transformers.py.

Manual Installation and Environment Setup

Clone the repo:

git clone https://github.com/THUDM/GLM
cd GLM

Install dependencies:

Please first install PyTorch (we use 1.7.0) and apex, and then install the other dependencies with pip install -r requirements.txt.

Model parallelism (optional, for large models):

If you encounter a CUDA out of memory error, your GPU memory is too limited for the model. You can try model parallelism, which divides the model parameters across multiple GPUs. Taking two-way model parallelism as an example, first run change_mp.py to divide the checkpoint:
```
python change_mp.py path_to_the_checkpoint 2
```
Then update the checkpoint path in the model config file (e.g. config_tasks/model_blocklm_10B.sh) and change MP_SIZE in the script (e.g. scripts/ds_finetune_superglue.sh) to 2.
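
The idea of dividing parameters across devices can be shown with a toy example. This is not how change_mp.py is implemented; it is a simplified illustration, with hypothetical function names, of splitting one weight matrix by columns into shards and checking that the sharded computation reproduces the full result.

```python
def split_columns(weight, mp_size):
    """Split a matrix (list of rows) into mp_size contiguous column shards."""
    n_cols = len(weight[0])
    if n_cols % mp_size != 0:
        raise ValueError("column count must be divisible by mp_size")
    width = n_cols // mp_size
    return [[row[rank * width:(rank + 1) * width] for row in weight]
            for rank in range(mp_size)]


def matvec(x, weight):
    """Compute x @ weight for a vector x and a matrix given as a list of rows."""
    return [sum(x[i] * weight[i][j] for i in range(len(x)))
            for j in range(len(weight[0]))]


def parallel_matvec(x, shards):
    """Each shard computes its slice of the output; slices are concatenated."""
    output = []
    for shard in shards:  # on real hardware each shard lives on its own GPU
        output.extend(matvec(x, shard))
    return output


weight = [[1, 2, 3, 4],
          [5, 6, 7, 8]]
x = [1, 1]
shards = split_columns(weight, mp_size=2)
print(matvec(x, weight))           # full computation
print(parallel_matvec(x, shards))  # same values, computed shard by shard
```

Each shard holds half of the columns, so each device stores half of this matrix's parameters.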

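Frequently Asked Questions (FAQ)

What is the core innovation of the GLM model?

GLM's core innovation is its autoregressive blank infilling pretraining objective: it randomly masks contiguous spans of text and predicts them autoregressively, combining the strengths of autoencoding and autoregressive pretraining and unifying understanding and generation tasks. It also introduces 2D positional encoding, which encodes both a token's position in the text and its position within a masked span, and span shuffling for added flexibility.

Which GLM models are trained specifically for Chinese?

GLM provides dedicated Chinese models, GLM-Large-Chinese (335M parameters) and GLM-10B-Chinese (10B parameters). Both are trained on WuDaoCorpora and support Chinese natural language tasks.

How do I get started with GLM quickly?

You can use GLM models directly through the Hugging Face Hub, or download the pretrained model files from the OneDrive or Tsinghua-Cloud links provided with the paper. The open-source ChatGLM-6B (6 billion parameters) is based on the GLM framework and optimized for Chinese QA and dialogue.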