Text-to-Image Generation with Stable Diffusion 3

This article shows how to build a text-to-image application in JupyterLab for development and debugging.

Application overview

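The application downloads two models from Hugging Face: Stable-Diffusion and Helsinki-NLP. The former generates images from text. The latter translates Chinese prompts into English, and the translated prompt is then passed to the diffusion model to draw the image, which makes the application convenient for Chinese-speaking users.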

Application location

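Code walkthrough

In a Jupyter notebook the code can be executed cell by cell. To keep things organized, the code is grouped into sections, generally environment initialization followed by feature implementation.

Environment initialization

The following code block installs the required packages. It only needs to be run once.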

# install needed software
!pip install diffusers -i https://pypi.tuna.tsinghua.edu.cn/simple
!pip install transformers -i https://pypi.tuna.tsinghua.edu.cn/simple
!pip install accelerate -i https://pypi.tuna.tsinghua.edu.cn/simple
!pip install -U huggingface_hub -i https://pypi.tuna.tsinghua.edu.cn/simple
!pip install sentencepiece -i https://pypi.tuna.tsinghua.edu.cn/simple

The leading ! tells the Jupyter notebook to run the line as a shell command.
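
Before moving on, it can help to confirm that the packages really are importable from the notebook kernel, since `!pip` may install into a different environment than the one the kernel uses. The following is a minimal sketch using only the standard library; the helper name `missing_packages` is hypothetical and not part of any of the libraries above.

```python
import importlib.util


def missing_packages(names):
    """Return the subset of module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names of the packages installed above.
required = ["diffusers", "transformers", "accelerate", "huggingface_hub", "sentencepiece"]
print("missing:", missing_packages(required))
```

An empty list means every package was found. If something is reported missing, re-running the install with the `%pip` magic instead of `!pip` is one option to try, because it targets the kernel's own environment.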

Import the dependencies:

# import packages
import os

import torch
import huggingface_hub
from diffusers import StableDiffusion3Pipeline
from torch.cuda.amp import autocast
from transformers import pipeline

Feature implementation

Configure the Hugging Face mirror, which is needed to download the diffusion model online.

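# config env variables
# Note: huggingface_hub typically reads HF_ENDPOINT when it is first imported, so if the
# mirror is not picked up, set this variable before the import cell and restart the kernel.
os.environ['HF_ENDPOINT'] = 'https://hf-mirror.com'
token = 'your_huggingface_token'  # placeholder: replace with your own access token
huggingface_hub.login(token)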
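
Hard-coding the token in a notebook makes it easy to leak when the notebook is shared. As an alternative, the sketch below sets the mirror and reads the token from an environment variable. The helper name `configure_hf` is hypothetical, and the sketch assumes the token was exported as `HF_TOKEN` before JupyterLab was started.

```python
import os


def configure_hf(endpoint="https://hf-mirror.com", token_var="HF_TOKEN"):
    """Point Hugging Face downloads at a mirror and fetch the token from the environment.

    Returns the token string, or None if the variable is unset.
    """
    os.environ["HF_ENDPOINT"] = endpoint
    return os.environ.get(token_var)


token = configure_hf()
if token is None:
    print("HF_TOKEN is not set; downloads that require authentication may fail.")
# Otherwise, log in as in the cell above: huggingface_hub.login(token)
```

Running this cell before `huggingface_hub` is imported gives the mirror setting the best chance of taking effect.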

Download and load the diffusion model online

# diffusion preparation
pipe = StableDiffusion3Pipeline.from_pretrained("stabilityai/stable-diffusion-3-medium-diffusers", torch_dtype=torch.float16)
pipe = pipe.to("cuda")

Define the drawing function

def draw_pic(content, file_name):
    # make sure the output directory exists before saving
    os.makedirs("./imgs", exist_ok=True)
    with autocast():
        image = pipe(
            content,
            negative_prompt='',
            num_inference_steps=28,
            guidance_scale=7.0,
        ).images[0]
    # see it via https://filebrowser.poc1-be9e3e9b62c8.ing.zw1.paratera.com/
    image.save(f"./imgs/{file_name}.jpg")
    image.show()  # note: displaying is slow; viewing the file in file-browser is faster
Draw from an English prompt

# drawn by English
content = "a boy playing football on a playground"
file_name = "boy"
draw_pic(content, file_name)

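After the cell runs successfully, boy.jpg appears in the imgs directory in the left navigation pane. A progress line like the following is printed below the cell, and the image is displayed shortly afterwards.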

100%|██████████| 28/28 [00:09<00:00,  2.85it/s]

Download the translation model

This model cannot be installed online through Hugging Face, so first download it offline with hfd.sh. See here for how to install the script.
```
hfd.sh Helsinki-NLP/opus-mt-zh-en --tool aria2c --local-dir /workspace/bds/model/Helsinki-NLP/opus-mt-zh-en
```
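
Because the download happens outside the notebook, a quick check that the target directory looks usable can save a confusing error later. The sketch below only tests for a `config.json` file, which is an assumption about what a complete download contains at minimum; `has_model_config` is a hypothetical helper name.

```python
from pathlib import Path


def has_model_config(model_dir):
    """Return True if model_dir contains a config.json file."""
    return (Path(model_dir) / "config.json").is_file()


model_dir = "/workspace/bds/model/Helsinki-NLP/opus-mt-zh-en"
if not has_model_config(model_dir):
    print(f"No config.json under {model_dir}; the download may be incomplete.")
```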

Load the translation model

# drawn by Chinese
# preparation
translator = pipeline("translation", model="/workspace/bds/model/Helsinki-NLP/opus-mt-zh-en", device="cuda")

Draw from a Chinese prompt

content = "宇航员在月球上跳舞"
file_name = "space-man"
# translate the Chinese prompt into English before drawing
result = translator(content)[0]['translation_text']
print(result)
draw_pic(result, file_name)
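
The translate-then-draw steps can be wrapped so that English prompts skip translation and Chinese prompts are translated automatically. The following is a minimal sketch, not the application's own code: `contains_chinese` and `draw_prompt` are hypothetical names, and detecting Chinese by looking for characters in the basic CJK ideograph range is a simplifying assumption.

```python
import re

# Basic CJK Unified Ideographs block; enough to spot typical Chinese prompts.
_CJK = re.compile(r"[\u4e00-\u9fff]")


def contains_chinese(text):
    """Return True if text has at least one CJK ideograph."""
    return bool(_CJK.search(text))


def draw_prompt(content, file_name, translate, draw):
    """Translate content to English only when it contains Chinese, then draw it.

    translate: callable taking a string and returning an English string.
    draw: callable taking (prompt, file_name), for example draw_pic.
    Returns the prompt that was actually passed to draw.
    """
    prompt = translate(content) if contains_chinese(content) else content
    draw(prompt, file_name)
    return prompt
```

With the objects defined earlier, a call might look like `draw_prompt("宇航员在月球上跳舞", "space-man", lambda s: translator(s)[0]["translation_text"], draw_pic)`. Passing the translator and the drawing function as arguments keeps the helper easy to try out with stand-ins when no GPU is available.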

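Results

Improvements

Hugging Face cannot be reached directly from within mainland China, which is especially troublesome when obtaining a token, so a later version plans to load the models through Alibaba's ModelScope community (魔搭社区) instead.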
