FlagAI: Train and Deploy Large Models Faster and More Easily

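FlagAI is a fast, easy-to-use, and extensible toolkit for training, fine-tuning, and deploying large-scale models. It currently focuses on NLP models and tasks, with support for other modalities planned. FlagAI is compatible with many pretrained models, including WuDao GLM, BERT, RoBERTa, GPT2, T5, and Huggingface Transformers models. It provides APIs to download and use these models quickly, fine-tune them on a range of datasets, and share them with the community. In this article, we review FlagAI and walk through installing it, with examples.

FlagAI also includes a prompt-learning toolkit for few-shot tasks. The supported models can be applied to Chinese or English text for tasks such as text classification, information extraction, question answering, summarization, and text generation.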

Prerequisites

Python version >= 3.8
PyTorch version >= 1.8.0
Training or testing models on GPUs also requires CUDA and NCCL.
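
One way to confirm that an environment meets these minimums is to compare version tuples. The sketch below is an illustration only and is not part of FlagAI; the helper names parse_version and meets_minimum are hypothetical, and the PyTorch checks run only if torch is importable.

```python
import importlib.util
import re
import sys


def parse_version(version: str) -> tuple:
    """Turn '1.13.1+cu117' into (1, 13, 1), ignoring local/build suffixes."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    # A missing patch component (e.g. '2.0') is treated as 0.
    return tuple(int(part) if part is not None else 0 for part in match.groups())


def meets_minimum(version: str, minimum: tuple) -> bool:
    """Return True if `version` is at least `minimum`."""
    return parse_version(version) >= minimum


if __name__ == "__main__":
    print("Python >= 3.8:", sys.version_info >= (3, 8))
    if importlib.util.find_spec("torch") is not None:
        import torch
        import torch.distributed as dist

        print("PyTorch >= 1.8.0:", meets_minimum(torch.__version__, (1, 8, 0)))
        print("CUDA available:", torch.cuda.is_available())
        print("NCCL available:", dist.is_available() and dist.is_nccl_available())
    else:
        print("PyTorch is not installed")
```

Running the script prints one line per requirement, so a missing or outdated dependency is visible before any model is downloaded.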

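Why use FlagAI?

If you are looking for a powerful, easy-to-use toolkit for large-scale language models, FlagAI is a good option. Some of its key features are described below.

Download models quickly via API

FlagAI provides an API that lets you easily download pretrained models and fine-tune them on a wide range of datasets collected from the SuperGLUE and CLUE benchmarks for Chinese and English text. Because you do not have to train a model from scratch, you can save a great deal of time and effort.
FlagAI currently supports more than 30 mainstream models, including the language model Aquila, the multilingual text and image representation model AltCLIP, the text-to-image generation model AltDiffusion, WuDao GLM (up to 10 billion parameters), EVA-CLIP, OPT, BERT, RoBERTa, GPT2, T5, ALM, and Huggingface Transformers models.
This gives you a range of models to choose from, so you can find the one that best fits your needs.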

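Parallel training in fewer than 10 lines of code

FlagAI is backed by four of the most popular data/model parallel libraries: PyTorch, Deepspeed, Megatron-LM, and BMTrain.
These integrations let users parallelize their training or testing process with fewer than 10 lines of code, which can speed up training considerably.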
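
As background on what data parallelism involves, the sketch below shows one common way to split a dataset's indices across workers so that each worker processes a disjoint shard. It is a library-independent illustration of the general idea, not FlagAI's implementation; the function name shard_indices is hypothetical.

```python
def shard_indices(num_samples: int, world_size: int, rank: int) -> list:
    """Return the sample indices assigned to worker `rank` out of `world_size`.

    Indices are dealt out round-robin, so shard sizes differ by at most one.
    """
    if not 0 <= rank < world_size:
        raise ValueError("rank must satisfy 0 <= rank < world_size")
    return list(range(rank, num_samples, world_size))


if __name__ == "__main__":
    # Show which of 10 samples each of 4 workers would process.
    for rank in range(4):
        print(rank, shard_indices(10, world_size=4, rank=rank))
```

Each worker then computes gradients on its own shard, and the parallel library takes care of combining them across workers.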

Convenient few-shot learning toolkit

FlagAI also provides a prompt-learning toolkit for few-shot tasks.
This is useful when you have only a small amount of training data.

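Particularly good at Chinese tasks

FlagAI is particularly well suited to Chinese tasks, because many of the models it supports were pretrained on large Chinese text corpora.
If you are working on a Chinese-language project, FlagAI is a strong candidate.

Overall, FlagAI is a powerful toolkit that helps you build and deploy AI models quickly and easily. It is a good choice for new and experienced users alike.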

How to install FlagAI?

Use the following command to install FlagAI with pip:

pip install -U flagai

[Optional] To install FlagAI from source for local development, follow these steps:

git clone https://github.com/BAAI-Open/FlagAI.git
cd FlagAI
python setup.py install

[Optional] Install NVIDIA's apex for faster training

git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./

[Optional] Install DeepSpeed for the ZeRO optimizers

git clone https://github.com/microsoft/DeepSpeed
cd DeepSpeed
DS_BUILD_CPU_ADAM=1 DS_BUILD_AIO=1 DS_BUILD_UTILS=1 pip install -e .
ds_report # check the DeepSpeed status

[Optional] Install BMTrain (>= 0.2.2) for BMTrain training

git clone https://github.com/OpenBMB/BMTrain
cd BMTrain
python setup.py install

[Optional] Install BMInf for low-resource inference

pip install bminf

[Optional] For Flash Attention, install flash-attn (>= 1.0.2)

pip install flash-attn
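
Because each of these extras is optional, it can help to see which ones are present in the current environment. The sketch below looks up each package by its import name without importing it. It is an illustration and not a FlagAI utility; the names OPTIONAL_PACKAGES, is_installed, and optional_package_report are hypothetical.

```python
import importlib.util

# Import names for the optional packages listed above.
OPTIONAL_PACKAGES = {
    "apex": "apex",
    "DeepSpeed": "deepspeed",
    "BMTrain": "bmtrain",
    "BMInf": "bminf",
    "Flash Attention": "flash_attn",
}


def is_installed(module_name: str) -> bool:
    """Return True if `module_name` can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


def optional_package_report() -> dict:
    """Map each optional package label to whether it is installed."""
    return {label: is_installed(module) for label, module in OPTIONAL_PACKAGES.items()}


if __name__ == "__main__":
    for label, present in optional_package_report().items():
        print(f"{label:<16} {'installed' if present else 'missing'}")
```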

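How to use the AutoLoader class?

FlagAI provides a variety of models trained for different tasks. You can load these models with AutoLoader and use them to generate predictions.

FlagAI toolkit examples

The sample code below shows how to use the toolkit's AutoLoader class to load a model and tokenizer quickly.

Model and tokenizer

The AutoLoader class loads a model and its tokenizer in a single step, which simplifies setup. Here is an example of how to use it: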

from flagai.auto_model.auto_loader import AutoLoader


# Specify the task and model names, then load the model and tokenizer
auto_loader = AutoLoader(
    task_name="title-generation",
    model_name="BERT-base-en"
)
model = auto_loader.get_model()
tokenizer = auto_loader.get_tokenizer()

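This example targets the title-generation task, but you can load models for other tasks by changing task_name. The model and tokenizer can then be used to fine-tune or test the model.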

Predictor

The Predictor class runs predictions for a variety of tasks, as the following example shows.

from flagai.model.predictor.predictor import Predictor
predictor = Predictor(model, tokenizer)
test_data = [
    "Four minutes after the red card, Emerson Royal nodded a corner into the path of the unmarked Kane at the far post, who nudged the ball in for his 12th goal in 17 North London derby appearances. Arteta's misery was compounded two minutes after half-time when Kane held the ball up in front of goal and teed up Son to smash a shot beyond a crowd of defenders to make it 3-0. The goal moved the South Korea talisman a goal behind Premier League top scorer Mohamed Salah on 21 for the season, and he looked perturbed when he was hauled off with 18 minutes remaining, receiving words of consolation from Pierre-Emile Hojbjerg. Once his frustrations have eased, Son and Spurs will look ahead to two final games in which they only need a point more than Arsenal to finish fourth.",
]

for text in test_data:
    print(
        predictor.predict_generate_beamsearch(text,
                                              out_max_length=50,
                                              beam_size=3))

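This example is a seq2seq task: the model maps an input sequence to an output sequence, here an article to a title. The predict_generate_beamsearch function generates the output for a given input using beam search, where beam_size sets how many candidate sequences are kept at each step and out_max_length caps the length of the output.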
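
To make the roles of beam_size and out_max_length concrete, here is a minimal, self-contained beam search over a hand-written toy distribution. It is a sketch of the general technique, not FlagAI's implementation: the names beam_search and toy_model and the "<eos>" marker are assumptions made for illustration, and there is no length normalization.

```python
import math


def toy_model(prefix):
    """A hand-written next-token distribution (hypothetical, for illustration)."""
    table = {
        (): {"a": 0.6, "b": 0.4},
        ("a",): {"x": 0.5, "y": 0.5},
        ("b",): {"z": 0.9, "w": 0.1},
    }
    # Any prefix not in the table ends the sequence with certainty.
    probs = table.get(prefix, {"<eos>": 1.0})
    return {token: math.log(p) for token, p in probs.items()}


def beam_search(next_log_probs, beam_size, out_max_length, eos="<eos>"):
    """Return the highest-scoring token sequence found by beam search."""
    beams = [((), 0.0)]  # (token tuple, cumulative log-probability)
    finished = []
    for _ in range(out_max_length):
        candidates = []
        for tokens, score in beams:
            for token, log_p in next_log_probs(tokens).items():
                candidates.append((tokens + (token,), score + log_p))
        # Keep only the `beam_size` best partial sequences.
        candidates.sort(key=lambda item: item[1], reverse=True)
        beams = []
        for tokens, score in candidates[:beam_size]:
            if tokens[-1] == eos:
                finished.append((tokens, score))
            else:
                beams.append((tokens, score))
        if not beams:
            break
    # Fall back to unfinished beams if nothing reached `eos` in time.
    best_tokens, _ = max(finished or beams, key=lambda item: item[1])
    return [token for token in best_tokens if token != eos]


if __name__ == "__main__":
    print(beam_search(toy_model, beam_size=1, out_max_length=5))
    print(beam_search(toy_model, beam_size=2, out_max_length=5))
```

With beam_size=1 the search is greedy and commits to "a" because it has the higher first-step probability. With beam_size=2 it also keeps "b" alive, and the sequence "b", "z" (probability 0.36) ends up beating "a", "x" (probability 0.30).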

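NER (Named Entity Recognition)

Named entity recognition (NER) is the task of identifying named entities in text, such as people, organizations, and locations.
Title generation is the task of producing a title for a given piece of text.
Question answering is the task of answering questions about a given piece of text.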
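
The NER example in this section uses the labels "O", "B-LOC", "I-LOC", and so on, which follow the BIO tagging convention: "B-" marks the first token of an entity, "I-" a continuation, and "O" a token outside any entity. The sketch below shows how a BIO tag sequence can be decoded into entity spans. It is an illustration of the convention only; the function name decode_bio is hypothetical, and the decoding that predict_ner performs internally may differ.

```python
def decode_bio(tags):
    """Convert a BIO tag sequence into (start, end, label) spans, end inclusive."""
    spans = []
    start, label = None, None
    for i, tag in enumerate(tags):
        # An "I-" tag with the same label extends the current entity.
        if tag.startswith("I-") and label == tag[2:]:
            continue
        # Otherwise the current entity (if any) ended at position i - 1.
        if label is not None:
            spans.append((start, i - 1, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
        # "O" and stray "I-" tags with no matching "B-" start nothing.
    if label is not None:
        spans.append((start, len(tags) - 1, label))
    return spans


if __name__ == "__main__":
    tags = ["B-PER", "I-PER", "O", "B-LOC", "O", "B-ORG", "I-ORG", "I-ORG"]
    print(decode_bio(tags))
```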

from flagai.auto_model.auto_loader import AutoLoader
from flagai.model.predictor.predictor import Predictor

task_name = "ner"
model_name = "RoBERTa-base-ch"
target = ["O", "B-LOC", "I-LOC", "B-ORG", "I-ORG", "B-PER", "I-PER"]
maxlen = 256

auto_loader = AutoLoader(task_name,
                         model_name=model_name,
                         load_pretrain_params=True,
                         class_num=len(target))

model = auto_loader.get_model()
tokenizer = auto_loader.get_tokenizer()

predictor = Predictor(model, tokenizer)

test_data = [
    'On June 15, the Caocao Gaoling Cultural Relics Team of the Henan Provincial Institute of Cultural Relics and Archaeology publicly issued a statement admitting: "It has never been said that the unearthed beads belonged to the owner of the tomb',
    "On April 8, the Beijing Winter Olympics and Winter Paralympics Summary and Commendation Conference was grandly held in the Great Hall of the People. General Secretary Xi Jinping attended the conference and delivered an important speech. In his speech, the General Secretary fully affirmed the outstanding achievements of the Beijing Winter Olympics and Winter Paralympics, comprehensively reviewed the extraordinary seven-year preparation process, summed up in depth the valuable experience gained in preparing for and hosting the Beijing Winter Olympics and Winter Paralympics, profoundly explained the spirit of the Beijing Winter Olympics, and put forward clear requirements for making good use of the Winter Olympics legacy to promote high-quality development.",
    "On the 8th local time, the European Commission stated that the governments of EU member states have frozen a total of about 30 billion euros in assets related to Russian oligarchs and other sanctioned Russian personnel.",
    "Under this state of handicap, Betfair's Asian trading data shows that Bologna is hot. From the perspective of European betting, it is also hot for the home team. Palermo has lost two games in a row,"
]
for t in test_data:
    entities = predictor.predict_ner(t, target, maxlen=maxlen)
    result = {}
    for e in entities:
        if e[2] not in result:
            result[e[2]] = [t[e[0]:e[1] + 1]]
        else:
            result[e[2]].append(t[e[0]:e[1] + 1])
    print(f"result is {result}")
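
The grouping loop above can be written more compactly with collections.defaultdict. The sketch below assumes, based on how the loop slices t[e[0]:e[1] + 1], that each entity is a (start, end, label) tuple with an inclusive end index. The function name group_entities and the sample spans are hypothetical and written by hand, not real model output.

```python
from collections import defaultdict


def group_entities(text, entities):
    """Group entity surface strings by label.

    `entities` holds (start, end, label) tuples with an inclusive `end`.
    """
    grouped = defaultdict(list)
    for start, end, label in entities:
        grouped[label].append(text[start:end + 1])
    return dict(grouped)


if __name__ == "__main__":
    text = "Kane plays for Spurs in London"
    # Hypothetical spans written by hand for illustration.
    entities = [(0, 3, "PER"), (15, 19, "ORG"), (24, 29, "LOC")]
    print(group_entities(text, entities))
```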