Fusion information enhanced method based on Transformer for 3D object detection

CLC Number: TH741 TP391.4


    Abstract:

    To address the misalignment that arises when current 3D object detection methods fuse data from different modalities, a Transformer-based fusion information enhancement method is proposed; it mitigates both the disruption of correlations between the data and the loss of data. First, a region proposal network with a Transformer-based dual fusion feature module is designed, which uses a deformable attention mechanism to fuse the extracted lidar point cloud features and image features into dual-domain features and to generate pre-selected boxes. Then, the boxes are refined by a feature information enhancement module, which uses a depth completion mechanism to supplement dense depth and semantic feature information. Finally, a multimodal feature cross-attention module is designed, which uses a dynamic cross-attention mechanism to capture correlations between the modalities and thereby align and fuse their feature information effectively. Experimental results on the KITTI, nuScenes, and Waymo datasets demonstrate the effectiveness of the method. Extensive ablation experiments confirm the effectiveness and efficiency of each module in the algorithm. Experiments on a real vehicle platform show that the algorithm is robust in complex real-world environments.
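
    To make the cross-attention idea in the abstract concrete, the following is a minimal sketch of plain scaled dot-product cross attention in NumPy, with lidar features as queries and image features as keys and values. It is not the paper's dynamic cross-attention module, whose details the abstract does not give; the function name `cross_attention`, the projection matrices `w_q`, `w_k`, `w_v`, the residual connection, and all feature sizes are assumptions chosen for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(lidar_feats, image_feats, w_q, w_k, w_v):
    """Standard cross attention (illustrative, not the paper's module).

    lidar_feats: (N, d) array of hypothetical lidar features (queries)
    image_feats: (M, d) array of hypothetical image features (keys/values)
    w_q, w_k, w_v: (d, d) projection matrices (assumed, randomly set below)
    """
    q = lidar_feats @ w_q                      # (N, d) queries
    k = image_feats @ w_k                      # (M, d) keys
    v = image_feats @ w_v                      # (M, d) values
    scores = q @ k.T / np.sqrt(q.shape[-1])    # (N, M) similarity scores
    weights = softmax(scores, axis=-1)         # each row sums to 1
    # Residual connection: lidar features enriched with attended image cues.
    fused = lidar_feats + weights @ v          # (N, d)
    return fused, weights


rng = np.random.default_rng(0)
lidar_feats = rng.standard_normal((5, 16))     # 5 lidar tokens, 16 channels
image_feats = rng.standard_normal((8, 16))     # 8 image tokens, 16 channels
w_q, w_k, w_v = (rng.standard_normal((16, 16)) * 0.1 for _ in range(3))

fused, weights = cross_attention(lidar_feats, image_feats, w_q, w_k, w_v)
print(fused.shape, weights.shape)
```

    Each row of `weights` indicates how strongly one lidar token attends to each image token, which is one simple way to express a learned correspondence between the two modalities.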

History
  • Online: February 27, 2024