电子科技 ›› 2023, Vol. 36 ›› Issue (7): 32-38.doi: 10.16180/j.cnki.issn1007-7820.2023.07.005


Lightweight Generative Adversarial Networks Based on Multi-Scale Gradient

SUN Hong, ZHAO Yingzhi   

  1. School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
  • Received: 2022-01-10 Online: 2023-07-15 Published: 2023-06-21
  • About the authors: SUN Hong (1964-), female, PhD, associate professor. Research interests: pattern recognition and intelligent systems, big data and cloud computing, control science and engineering. | ZHAO Yingzhi (1996-), male, master's degree candidate. Research interests: image generation, image super-resolution reconstruction.
  • Supported by:
    National Natural Science Foundation of China (61472256); National Natural Science Foundation of China (61170277); National Natural Science Foundation of China (61703277)

Lightweight Generative Adversarial Networks Based on Multi-Scale Gradient

SUN Hong,ZHAO Yingzhi   

  1. School of Optical-Electrical and Computer Engineering,University of Shanghai for Science and Technology,Shanghai 200093,China
  • Received:2022-01-10 Online:2023-07-15 Published:2023-06-21
  • Supported by:
    National Natural Science Foundation of China(61472256);National Natural Science Foundation of China(61170277);National Natural Science Foundation of China(61703277)

Abstract:

As research on generative adversarial networks advances, the computational cost of network models has increased sharply, training instability persists, and the quality of generated images still needs improvement. To address these problems, this paper proposes a lightweight generative adversarial network model that introduces a multi-scale gradient structure to resolve unstable training. By fusing the ideas of the self-attention mechanism and dynamic convolution, and by using a cyclic module and an image enhancement module, the model's learning ability is improved while keeping the number of parameters small. Experimental results show that the proposed algorithm achieves an IS (Inception Score) of 2.75 and an FID (Fréchet Inception Distance) of 70.1 on the CelebA dataset, and an IS of 2.61 and an FID of 73.2 on the LSUN dataset, improving on classical models such as SAGAN and DCGAN and verifying the feasibility and performance of the algorithm.
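The reported IS values can be understood from the metric's definition rather than from the paper itself: the Inception Score is the exponentiated mean KL divergence between each image's class posterior and the marginal class distribution. A minimal sketch (the `probs` array is a hypothetical input, assumed to come from a pretrained Inception classifier applied to generated images):

```python
import numpy as np

def inception_score(probs, eps=1e-12):
    """Inception Score from an N x K array of class probabilities p(y|x).

    IS = exp( E_x[ KL( p(y|x) || p(y) ) ] ), where p(y) is the marginal
    over the generated samples. Higher is better: it rewards confident
    per-image predictions (sharpness) and a spread-out marginal (diversity).
    """
    p_y = probs.mean(axis=0, keepdims=True)          # marginal p(y), shape 1 x K
    kl = np.sum(probs * (np.log(probs + eps) - np.log(p_y + eps)), axis=1)
    return float(np.exp(kl.mean()))
```

Two sanity checks: if every image gets the uniform posterior, IS is 1 (its minimum); if the classifier is perfectly confident and the classes are balanced over K classes, IS approaches K.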

Keywords: multi-scale gradient, dynamic convolution, cyclic block, self-attention mechanism, attention sparsification, convolutional network, deep learning, image generation, generative adversarial network

Abstract:

With the advancement of generative adversarial network research, the computational cost of network models has increased sharply, training instability persists, and the quality of the generated images still needs to be improved. To solve these problems, a lightweight generative adversarial network is proposed, which introduces a multi-scale gradient structure to address unstable training. By combining the ideas of the self-attention mechanism and dynamic convolution, a cyclic module and an image enhancement module are used to improve the learning ability of the model while keeping the number of parameters small. Experimental results show that the proposed algorithm achieves an Inception Score of 2.75 and an FID of 70.1 on the CelebA data set, and an Inception Score of 2.61 and an FID of 73.2 on the LSUN data set, outperforming classical models such as SAGAN and DCGAN and verifying the feasibility and performance of the proposed algorithm.
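The FID figures quoted above compare Gaussians fitted to Inception features of real and generated images. As a minimal sketch of the standard formula (not code from the paper; `feat_real` and `feat_fake` are hypothetical N x D feature arrays assumed to come from a pretrained Inception network):

```python
import numpy as np
from scipy.linalg import sqrtm

def frechet_inception_distance(feat_real, feat_fake):
    """FID between two feature sets, each an N x D array.

    FID = ||mu_r - mu_f||^2 + Tr( C_r + C_f - 2 (C_r C_f)^(1/2) ),
    the Fréchet distance between the two fitted Gaussians. Lower is better.
    """
    mu_r, mu_f = feat_real.mean(axis=0), feat_fake.mean(axis=0)
    c_r = np.cov(feat_real, rowvar=False)
    c_f = np.cov(feat_fake, rowvar=False)
    covmean = sqrtm(c_r @ c_f)
    if np.iscomplexobj(covmean):
        # sqrtm can return tiny imaginary parts from numerical noise
        covmean = covmean.real
    diff = mu_r - mu_f
    return float(diff @ diff + np.trace(c_r + c_f - 2.0 * covmean))
```

Identical feature sets give an FID of (numerically) zero, and shifting one set's mean inflates the score through the squared mean-difference term, which is why FID is sensitive to both fidelity and mode coverage.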

Key words: multi-scale gradient, dynamic convolution, cyclic block, self-attention mechanism, sparse attention, convolutional neural networks, deep learning, image generation, generative adversarial network

CLC number: 

  • TP391