Abstract: Owing to its visualization capability and non-invasiveness, electrical impedance tomography (EIT) is broadly applied in industrial and biomedical fields. However, because the inverse problem is highly nonlinear and ill-posed, numerical imaging methods face limited spatial resolution. These limitations are especially evident for multiphase media distributions, where current EIT techniques suffer from boundary distortions and conductivity errors that degrade the final imaging accuracy. To address these issues, this paper introduces a learning-based model for EIT reconstruction, referred to as MAT-UNet, which mainly consists of a U-shaped backbone and an optimized multi-head attention block. MAT-UNet integrates the convolutional block attention module (CBAM) in the encoders for feature extraction to construct spatial and channel latent features. In addition, a Squeeze-and-Excitation Vision Transformer (SE-ViT) is introduced in the skip connections between the encoder and decoder to optimize global feature learning, and a Multi-Head Cross-Attention (MHCA) module facilitates multi-scale information fusion between the encoder and decoder. MAT-UNet is trained on extensive simulation data to obtain optimal model parameters and is experimentally validated on diverse complex shapes and lung simulation models. Quantitative evaluation shows that the method achieves a Root Mean Square Error (RMSE) of 2.3156 and a Structural Similarity Index (SSIM) of 0.9437 on the reconstructed images, and the visualized results closely match the true distributions and boundaries. Experimental outcomes demonstrate that the proposed MAT-UNet model exhibits robust performance and generalization capability.
Compared with traditional convolution-only structures, integrating the Transformer structure yields more accurate EIT image reconstruction and shows significant potential and value in non-destructive measurement and detection applications.
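
For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how RMSE and a simplified SSIM could be computed between a ground-truth and a reconstructed conductivity image. It is illustrative only and is not the evaluation code used in this work: the function names `rmse` and `global_ssim`, the flattened-list image representation, and the `data_range` parameter are assumptions introduced here. The SSIM variant shown uses a single window over the whole image, whereas the standard SSIM averages the same expression over local sliding windows, and the scaling behind the reported RMSE value is not specified in the abstract.

```python
import math


def rmse(truth, recon):
    """Root mean square error between two equally sized flat pixel sequences."""
    if len(truth) != len(recon):
        raise ValueError("inputs must have the same length")
    squared_error = sum((t - r) ** 2 for t, r in zip(truth, recon))
    return math.sqrt(squared_error / len(truth))


def global_ssim(truth, recon, data_range=1.0):
    """Single-window SSIM over the whole image.

    Simplification (assumption): the standard SSIM averages this expression
    over local windows; here the statistics are taken over all pixels at once.
    """
    if len(truth) != len(recon):
        raise ValueError("inputs must have the same length")
    n = len(truth)
    mu_t = sum(truth) / n
    mu_r = sum(recon) / n
    var_t = sum((t - mu_t) ** 2 for t in truth) / n
    var_r = sum((r - mu_r) ** 2 for r in recon) / n
    cov = sum((t - mu_t) * (r - mu_r) for t, r in zip(truth, recon)) / n
    # Stabilizing constants with the commonly used factors 0.01 and 0.03.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    numerator = (2 * mu_t * mu_r + c1) * (2 * cov + c2)
    denominator = (mu_t ** 2 + mu_r ** 2 + c1) * (var_t + var_r + c2)
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical 2x2 conductivity images, flattened row by row.
    truth = [0.0, 0.0, 1.0, 1.0]
    recon = [0.0, 0.1, 0.9, 1.0]
    print("RMSE:", rmse(truth, recon))
    print("SSIM (global):", global_ssim(truth, recon))
```

A lower RMSE indicates smaller pixel-wise conductivity error, while an SSIM closer to 1 indicates that the reconstructed structure and boundaries more closely match the true distribution.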