A deep learning method for electrical impedance tomography optimized with multi-scale attention fusion and a vision Transformer
CLC classification:

R318 TH701

Funding:

Supported by the National Natural Science Foundation of China (62072335, 62071328, 61872269, 61903273)


Optimized learning method for electrical impedance tomography with multi-scale attention fusion and vision transformer
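    Abstract (translated from the Chinese original):

    Electrical impedance tomography (EIT) is a visual, non-invasive imaging technique with broad application potential in industrial and biomedical engineering. Because its inverse problem is highly nonlinear and ill-posed, numerical imaging methods are limited in spatial resolution. With multiphase media distributions in particular, existing EIT techniques produce boundary distortion and conductivity errors that degrade the final imaging accuracy. This paper proposes MAT-UNet, a U-shaped deep imaging method based on a convolutional attention mechanism. It combines the convolutional block attention module (CBAM) with the U-Net structure, embedding CBAM in the feature extraction and fusion stages to strengthen the model's attention focusing and feature representation. The skip connections introduce the squeeze-and-excitation (SE) attention mechanism together with a vision Transformer (ViT) to optimize global feature learning, and a multi-head cross-attention (MHCA) module fuses multi-scale information between the encoder and decoder. MAT-UNet obtains its optimal model parameters by training on a large amount of simulation data and is validated experimentally on diverse complex shapes and lung simulation models. Quantitative evaluation gives a root mean square error (RMSE) of 2.3156 and a structural similarity index (SSIM) of 0.9437 for the reconstructed images, and the visualized results agree well with the true distribution and boundaries. The experiments show that the proposed MAT-UNet model has good robustness and generalization ability. Compared with traditional purely convolutional structures, integrating the Transformer structure yields more accurate EIT image reconstruction and has considerable potential and value in non-destructive measurement and detection applications.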
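    The two attention ideas named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, tensor shapes, reduction ratio `r`, random weights, and the choice of decoder tokens as queries with encoder tokens as keys and values are all assumptions made for illustration, and only a single attention head is shown.

```python
import numpy as np

def se_channel_attention(x, w1, b1, w2, b2):
    """Squeeze-and-excitation over a feature map x of shape (C, H, W)."""
    # Squeeze: global average pooling yields one descriptor per channel.
    z = x.mean(axis=(1, 2))                        # (C,)
    # Excitation: bottleneck MLP with ReLU, then a sigmoid gate in (0, 1).
    h = np.maximum(w1 @ z + b1, 0.0)               # (C // r,)
    gates = 1.0 / (1.0 + np.exp(-(w2 @ h + b2)))   # (C,)
    # Scale: reweight every channel by its gate.
    return x * gates[:, None, None], gates

def cross_attention(q_tokens, kv_tokens, wq, wk, wv):
    """Single-head scaled dot-product cross-attention.

    Assumption: queries come from one feature stream (here the decoder)
    and keys/values from the other (here the encoder).
    """
    q = q_tokens @ wq                              # (Nq, d)
    k = kv_tokens @ wk                             # (Nk, d)
    v = kv_tokens @ wv                             # (Nk, d)
    scores = q @ k.T / np.sqrt(q.shape[-1])        # (Nq, Nk)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

# Hypothetical sizes, chosen only to make the example runnable.
rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 2
x = rng.normal(size=(C, H, W))
w1 = rng.normal(size=(C // r, C)) * 0.1
b1 = np.zeros(C // r)
w2 = rng.normal(size=(C, C // r)) * 0.1
b2 = np.zeros(C)
y, gates = se_channel_attention(x, w1, b1, w2, b2)

d = 8
dec_tokens = rng.normal(size=(16, C))   # 16 decoder positions, C features each
enc_tokens = rng.normal(size=(64, C))   # 64 encoder positions, C features each
wq, wk, wv = (rng.normal(size=(C, d)) * 0.1 for _ in range(3))
fused, attn = cross_attention(dec_tokens, enc_tokens, wq, wk, wv)
```

    In this sketch the SE gate rescales whole channels, while cross-attention lets every decoder position draw on all encoder positions, which is one way the encoder-decoder fusion described above could work.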

    Abstract:

    Electrical impedance tomography (EIT) offers visualization and non-invasiveness and is therefore broadly applied in industrial and biomedical fields. However, because the inverse problem is highly nonlinear and ill-posed, numerical imaging methods face spatial resolution limitations. These limitations are especially evident in multiphase media distributions, where current EIT technology suffers from boundary distortions and conductivity errors that reduce the final imaging accuracy. To address these issues, this paper introduces a learning-based model for EIT reconstruction, referred to as MAT-UNet, which is mainly composed of a U-shaped backbone and an optimized multi-head attention block. MAT-UNet integrates the convolutional block attention module (CBAM) into the encoders for feature extraction to construct spatial and channel latent features. In addition, a Squeeze-and-Excitation Vision Transformer (SE-ViT) is introduced in the skip connection between encoder and decoder to optimize global feature learning, and the Multi-Head Cross-Attention (MHCA) module facilitates multi-scale information fusion between the encoder and decoder. MAT-UNet is trained on extensive simulation data to obtain optimal model parameters and is experimentally validated on diverse complex shapes and lung simulation models. The quantitative evaluation metrics indicate that this method achieves a Root Mean Square Error (RMSE) of 2.3156 and a Structural Similarity Index (SSIM) of 0.9437 in reconstructed images. The visualized results closely match the true distribution and boundaries. Experimental outcomes demonstrate that the proposed MAT-UNet model exhibits robust performance and generalization capability. Compared with traditional purely convolutional structures, integrating the Transformer structure provides more accurate EIT image reconstruction, presenting significant potential and value in non-destructive measurement and detection applications.
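    For readers unfamiliar with the two reported metrics, the sketch below shows how RMSE and SSIM can be computed for a reconstructed image. It rests on assumptions: the abstract does not state the normalization or value range behind the reported RMSE, and the SSIM here uses global image statistics, whereas common implementations average over local windows. The function names, the `data_range` parameter, and the toy images are illustrative only.

```python
import numpy as np

def rmse(pred, truth):
    """Root mean square error between two images of equal shape."""
    pred = np.asarray(pred, dtype=float)
    truth = np.asarray(truth, dtype=float)
    return float(np.sqrt(np.mean((pred - truth) ** 2)))

def global_ssim(pred, truth, data_range=1.0):
    """SSIM from whole-image statistics (a single window covering the image)."""
    x = np.asarray(pred, dtype=float)
    y = np.asarray(truth, dtype=float)
    # Standard stabilizing constants, scaled by the assumed value range.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    num = (2 * mx * my + c1) * (2 * cov + c2)
    den = (mx ** 2 + my ** 2 + c1) * (vx + vy + c2)
    return float(num / den)

# Toy conductivity map: one square inclusion on a uniform background.
truth = np.zeros((8, 8))
truth[2:5, 2:5] = 1.0
# Hypothetical reconstruction with a uniform offset error of 0.1.
pred = truth + 0.1

err = rmse(pred, truth)
sim = global_ssim(pred, truth)
```

    Lower RMSE and SSIM closer to 1 both indicate a reconstruction nearer to the true distribution; an identical image gives an RMSE of 0 and an SSIM of 1.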

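Cite this article

Wang Qi, Zhang Tao, Xu Chaowei, Lu Mengfan, Wang Zichen. Optimized learning method for electrical impedance tomography with multi-scale attention fusion and vision transformer[J]. Chinese Journal of Scientific Instrument (仪器仪表学报), 2024, 45(7): 52-63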

History
  • Published online: 2024-10-24